Article

MissPred: A Robust Two-Stage Radar Echo Extrapolation Algorithm for Incomplete Sequences

1 Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters (CIC-FEMD)/Key Laboratory of Meteorological Disaster, Ministry of Education (KLME)/Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, B-DAT, Nanjing University of Information Science and Technology, Nanjing 210044, China
2 Qingdao Ecological and Agricultural Meteorological Center, Qingdao 266003, China
3 Guizhou Meteorological Observatory, Guiyang 550001, China
4 Department of Computer Science, University of Reading, Whiteknights, Reading RG6 6DH, UK
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(12), 2066; https://doi.org/10.3390/rs17122066
Submission received: 20 May 2025 / Revised: 7 June 2025 / Accepted: 15 June 2025 / Published: 16 June 2025

Abstract

Radar echo extrapolation from real-world data is a fundamental problem in meteorological forecasting. Existing extrapolation models typically assume complete radar echo sequences, but in practice, data loss frequently occurs due to equipment failures and communication disruptions. Traditional solutions handle missing values through an interpolation-then-prediction pipeline, but this approach suffers from a major limitation: interpolating the missing data and then extrapolating introduces cumulative error. To address this issue, we propose MissPred, a radar echo extrapolation model specifically designed for missing data patterns. MissPred employs a dual encoder–decoder architecture. Training proceeds in two serial stages, interpolation followed by extrapolation, and the two tasks share encoder parameters to avoid cumulative error. Furthermore, a missing spatiotemporal feature fusion (MSTF) module is designed to extract fine-grained, complete spatiotemporal features. Finally, adversarial training is incorporated to enhance the realism of the prediction results. To evaluate the proposed model, case studies are conducted on a real radar dataset covering missing rates from 10% to 50%. The experimental results show that MissPred outperforms baseline models that rely on prior interpolation of missing data and maintains stable robustness across missing rates.

1. Introduction

Extreme weather events, such as heavy rainfall, typhoons, tornadoes, and thunderstorms, have been increasing in both frequency and intensity globally in recent years, posing significant threats to human life and property. Accurate prediction of these extreme weather phenomena is crucial in the face of the growing challenge of climate change. Real-time prediction of the motion trend and intensity of radar echoes helps to analyze the development of convective systems [1,2]. The traditional forecasting approach is based on numerical weather prediction (NWP) models, which simulate weather changes through a set of atmospheric physics equations. Although NWP has obvious advantages in providing high-precision weather forecasts and large-scale convective system simulations, its high computational cost and long forecast turnaround time mean that efficient, real-time methods still need to be investigated [3,4,5,6].
In recent years, radar echo extrapolation methods incorporating deep learning have been able to provide more accurate and real-time predictions of strong convection by learning the spatial and temporal characteristics of radar echo sequences, and they have demonstrated great potential in short-term warning. Deep learning-based spatiotemporal prediction techniques can usually be classified into four categories: convolutional neural networks (CNNs), recurrent neural networks (RNNs), vision transformers (ViTs), and hybrid CNN-RNN models [7]. CNNs can effectively capture the spatiotemporal patterns of radar echo sequences through powerful local feature extraction [8,9]. RNN-based models are able to learn the trend of radar echo evolution over time when modeling time-series data [10,11,12]. ViTs can capture the complex spatiotemporal dynamics in radar echo sequences by using the attention mechanism to learn global contextual relationships [13,14]. Hybrid CNN-RNN models are able to simultaneously capture the spatial features of the radar echoes and their temporal evolution patterns [15,16,17,18,19]. For instance, Liu et al. proposed the TEDR method, which combines optical flow techniques and a two-path spatiotemporal attention network with distributional correction to address the ambiguity problem in radar echo extrapolation [20]. Niu et al. proposed the FsrGAN framework, which integrates multisource satellite and radar data via a two-stage generative adversarial network (GAN) to address the limitations of single-source radar echo extrapolation in capturing complex precipitation dynamics and predicting convective initiation [21]. Wang et al. developed RainHCNet, leveraging spatiotemporal attention mechanisms to address the challenge of preserving fine-scale precipitation structures in radar echo extrapolation under limited observational data [22]. Wang et al. proposed a patch-wise (PW) mechanism to enhance sparse radar echo extrapolation in precipitation nowcasting, dividing radar echo frames into patches and using a local attention mechanism, multiscale convolutions, and a convolutional block attention module to improve prediction accuracy for low-frequency heavy rainfall and localized precipitation events [23]. Wang et al. proposed a two-stage radar echo extrapolation model combining a 3D convolutional neural network and a conditional generative adversarial network (CGAN), aiming to address the issues of blurry predictions, underestimated echo intensity, and missing small-scale details in radar extrapolation, particularly for intense echoes and convective systems [24].
However, in practical applications, data may be missing in some frames of the radar echo sequence due to problems such as equipment failure, signal interference, or data transmission errors, which poses a serious challenge to the prediction accuracy and reliability of radar echo extrapolation models. Traditionally, the treatment of missing data has relied on linear interpolation or other interpolation techniques. Although such methods can fill in missing frames in simple cases, they ignore the inherent nonlinear dynamics of meteorological data, leading to lower prediction accuracy. Several studies have proposed deep learning-based data interpolation for application to downstream tasks [25,26,27]. Si et al. proposed a dynamic residual convolution network and investigated how to reconstruct radar reflectivity data from geostationary satellite data using dynamic convolution and residual convolution modules [28]. Yu et al. introduced the STR-UNet method, based on the U-Net architecture, to reconstruct radar echoes in oceanic regions using satellite data. They also analyzed feature importance across different land surface types using DeepLIFT, aiming to enhance the monitoring capability for strong convective weather phenomena [29]. Gong et al. developed the DSA-UNet model, which integrates dilated convolution and self-attention mechanisms. Their work focused on improving the accuracy of radar data reconstruction in regions with missing data, particularly for extreme and local-scale radar echo patterns [30]. The aforementioned approaches primarily address single-task data reconstruction, often relying on interpolation methods to handle missing data in specific regions of the radar frame. However, limited research has been conducted on radar echo extrapolation tasks that predict directly from missing data while enhancing model robustness. Notably, if the reconstruction outputs of deep learning models are cascaded as inputs to subsequent prediction models, cumulative error is an inevitable concern.
To address the above challenges, this paper proposes a two-stage deep learning model with a dual encoder–decoder structure for radar echo extrapolation in the data-missing mode. The model uses a pretraining strategy that equips the encoder with an implicit interpolation capability. The original and differential sequences are first passed through a dual-branch encoder to fully exploit the spatiotemporal information of the original sequences and the variation trends of the differential sequences. In the skip connection, a missing spatiotemporal fusion module is designed to fuse information for missing data, recovering the lost spatiotemporal features and improving extrapolation accuracy. In the extrapolation stage, the extrapolation decoder is trained in combination with the pretrained interpolation encoder. In this process, we introduce a discriminator and use adversarial training to enhance the accuracy and plausibility of the model's predictions of radar echoes at future moments. The method effectively overcomes the shortcomings of traditional interpolation methods and provides a solution for the radar echo extrapolation task in the missing mode.
The contributions of this work are summarized as follows.
  • A two-stage training strategy is proposed that allows for reliable extrapolation without introducing cumulative errors in the model cascade when the input sequence contains missing frames.
  • In order to recover the missing spatiotemporal information of the input sequence, this paper proposes a parallel structure consisting of a raw sequence encoder and a differential encoder. The raw sequence encoder extracts the spatiotemporal characteristics of the sequence, while the differential encoder extracts the echo variation characteristics between frames.
  • This paper presents a novel dual-path adaptive fusion module that has been specifically designed for missing data scenarios. The module features branch-specific channel attention, which enables the dynamic reweighting of complementary features from both encoders. These features are then concatenated and integrated through dual-pooling, with the aim of achieving robust spatial fusion.
The remainder of this paper is organized as follows: Section 2 introduces the dataset used in this paper. Section 3 demonstrates the MissPred model, implementation details, and evaluation metrics. The experimental results are presented in Section 4. Finally, Section 5 provides a summary and discussion.

2. Data

The data used in this study were derived from the national radar data provided by the Meteorological Observation Center of the China Meteorological Administration (CMA), covering the period from October 2022 to July 2024. The study area mainly includes Shandong Province, with geographic boundaries defined by longitudes 114.88–120°E and latitudes 34–39.12°N, as shown in Figure 1. The spatial resolution of the dataset is 1 km, and the temporal resolution is 6 min. To enhance training efficiency, the image size was down-sampled from 512 × 512 pixels to 256 × 256 pixels. The experiment employs radar sequences from the past hour to predict the subsequent hour. Specifically, the model uses 10 frames of historical data to forecast the subsequent 10 frames.
Radar echo extrapolation plays a crucial role in the early warning of strong convective weather events. For this study, radar reflectivity data corresponding to precipitation events were selected. Data samples characterized by high echo intensities and extensive echo regions were prioritized to strengthen the model’s ability to capture both the spatial distribution and intensity variations of the precipitation systems. Notably, no additional preprocessing, such as clustering, was performed on the data to preserve the robustness of the model.
To simulate the missing pattern, a random subset of frames (from 1 to 5) in each input sequence was set to zero, corresponding to a 10% to 50% data missing rate. The complete dataset comprises 15,000 samples, with 4000 samples reserved for the test dataset. The spatial extent of the test dataset also covers the aforementioned Shandong region, and the timeframe is June–July 2024. For the two-stage training task, the training samples were split between the reconstruction and extrapolation tasks in a 7:3 ratio, as the reconstruction task is considered simpler in nature.
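As a concrete illustration of this masking procedure, the sketch below zeroes a random subset of frames in a sequence; the function and variable names are ours and illustrative, not taken from any released code:

```python
import numpy as np

def mask_random_frames(sequence, missing_rate, rng=None):
    """Simulate data loss by zeroing a random subset of frames.

    sequence: (T, H, W) array of radar frames (T = 10 in our setting).
    missing_rate: fraction of frames to drop (0.1-0.5 in the experiments).
    """
    if rng is None:
        rng = np.random.default_rng()
    seq = sequence.copy()
    n_missing = max(1, int(round(missing_rate * len(seq))))  # 1-5 frames for T = 10
    missing_idx = rng.choice(len(seq), size=n_missing, replace=False)
    seq[missing_idx] = 0.0  # missing frames are replaced with zero frames
    return seq, missing_idx
```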

3. Method

In this section, we provide a detailed description of our task and MissPred, with the model architecture illustrated in Figure 2. The proposed architecture is inspired by concepts of error correction and data reconstruction, leveraging techniques such as forward error correction and differential redundancy from the communication domain. Sequences containing missing frames are interpolated by fusing their raw and differential data. For this purpose, we designed a dual-encoder, dual-decoder radar echo extrapolation model.
In the first stage, the model is pretrained on a missing data reconstruction task so that the encoder acquires an implicit interpolation capability. The pretrained interpolation encoder is then paired with the extrapolation decoder, and the extrapolation decoder is trained. However, due to the inherent prediction errors in deep learning models, using the output of the reconstruction decoder as input for subsequent steps would cause significant distortion, adversely impacting the extrapolation task. To mitigate this, we freeze the parameters of the interpolation encoder and apply it directly to the extrapolation decoder.
In addition, we introduce adversarial training to improve the realism of the model's predicted sequences. Overall, MissPred is a two-stage model that performs interpolation and extrapolation of radar echo sequences in the missing mode through knowledge transfer.

3.1. Description of Tasks

Radar echo extrapolation with a missing pattern aims to predict future sequences based on partially missing echo sequences, which demands high model robustness. The input can be represented as $X \in \mathbb{R}^{I \times M \times N}$, and the goal is to predict the future sequence $\hat{X} \in \mathbb{R}^{K \times M \times N}$ as accurately as possible:

$$\hat{x}_{t+1}, \ldots, \hat{x}_{t+K} = F(x_{t-I+1}, \ldots, x_t) \tag{1}$$

where $x_{t-I+1}, \ldots, x_t$ represents the input sequence, in which missing frames are replaced with zero frames, and $\hat{x}_{t+1}, \ldots, \hat{x}_{t+K}$ denotes the future sequence predicted by the model.

3.2. Pretrain Interpolation Encoder

In the classical U-Net architecture, the spatiotemporal features of the radar echo sequence are primarily extracted by the convolutional layers in the encoder. These features are subsequently passed to the decoder for extrapolation through layer-by-layer down-sampling and skip connection. Consequently, the features output from each encoder layer must contain enough information for the decoder to perform effective extrapolation. However, for sequences with missing frames, enabling the encoder to handle missing data interpolation is a key challenge. To address this problem, we propose a two-stage training strategy for the encoder and decoder. In the first stage, the encoder is pretrained on a data reconstruction task, and its parameters are then frozen to facilitate knowledge transfer to the interpolation task [31].
For missing data recovery, we introduce a differential fusion approach. In this method, a differential sequence is generated by applying a differential operation to the input sequence. It can capture the motion trends of the radar echo and the variations in echo intensity. Additionally, the differential frames intuitively reveal the missing data pattern. Specifically, a differential frame that is exactly zero indicates that two consecutive frames are both missing; a differential frame with small values indicates that neither of the original frames is missing; a differential frame equal to the negative of the preceding frame ($D_t = -x_t$) indicates that the latter frame is missing; and a differential frame equal to the following frame ($D_t = x_{t+1}$), typically with large values, indicates that the preceding frame is missing. By fusing the characteristics of both the original sequence and the differential sequence, we can effectively represent the complete sequence information. The differential operation is mathematically defined as follows.
$$D_t = x_{t+1} - x_t \tag{2}$$

where $D_t$ denotes the differential image between frame $t$ and frame $t+1$. Note that the differential operation is applied to the entire input sequence, not just to missing frame pairs.
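For illustration, the differential sequence can be computed with a single tensor operation; the (B, T, H, W) layout and the PyTorch framework are assumptions of this sketch:

```python
import torch

def differential_sequence(x):
    """Compute D_t = x_{t+1} - x_t over the entire input sequence.

    x: tensor of shape (B, T, H, W); returns a tensor of shape (B, T-1, H, W).
    A zero frame in x shows up as D_t = x_{t+1} (preceding frame missing)
    or D_t = -x_t (latter frame missing), which exposes the missing pattern.
    """
    return x[:, 1:] - x[:, :-1]
```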
To implement this approach, we designed a dual-encoder structure. First, the original input sequence and the differential sequence are each processed by a 1 × 1 convolution layer for channel adjustment. Feature extraction is then performed by two encoders of identical structure, one processing the original sequence and the other the differential sequence. During convolution, missing information is partially interpolated from the valid data in the original and differential sequences. The architecture of each encoder follows the classical U-Net design, consisting of two consecutive Convolution-BatchNorm-Activation layers followed by down-sampling through max pooling. This process reduces the resolution of the feature maps and extracts high-level spatiotemporal features. The Batch Normalization operation is expressed as follows.
$$Z_i = \gamma \cdot \frac{X_i - \mu_{batch}}{\sqrt{\sigma_{batch}^2 + \epsilon}} + \beta \tag{3}$$

where $X_i$ is the $i$-th sample and $Z_i$ is the output of the Batch Normalization layer. $\mu_{batch}$ and $\sigma_{batch}^2$ represent the mean and variance of the mini-batch, respectively. $\gamma$ and $\beta$ are trainable parameters used to scale and shift the normalized result, and $\epsilon$ is a small constant added to prevent division by zero.
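A minimal PyTorch sketch of one encoder stage described above (two Convolution-BatchNorm-Activation layers followed by max pooling) is shown below; the 3 × 3 kernels and ReLU activation are illustrative choices:

```python
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One encoder stage: (Conv-BN-ReLU) x 2, then 2 x 2 max pooling."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(2)  # halves the feature-map resolution

    def forward(self, x):
        feat = self.block(x)  # full-resolution features, kept for the skip connection
        return self.pool(feat), feat
```

Both the raw-sequence branch and the differential branch stack such blocks after their respective 1 × 1 channel-adjustment convolutions, with the pre-pooling features feeding the skip connections.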

3.3. Dual-Branch Decoder

In this paper, a two-branch decoder structure is proposed to handle missing patterns in radar echo data and effectively perform sequence extrapolation. The structure is based on the U-Net architecture and combines a pretraining strategy with a stepwise training approach. The reconstruction decoder and the extrapolation decoder operate in two distinct training phases.
Firstly, the reconstruction decoder aims to leverage the expressive power of a CNN to accurately reconstruct the complete input sequence. It reconstructs the full input sequence by performing layer-wise up-sampling and incorporating feature information fused through skip connections. The training objective for this stage is to minimize the reconstruction error, as defined in Equation (4).
$$L_r = \frac{1}{I} \sum_{t=1}^{I} \left( x_{label}^{t} - \hat{x}_t \right)^2 \tag{4}$$

where $I$ is the number of frames in the input sequence, $x_{label}$ is the ground truth label for reconstruction, and $\hat{x}$ is the model's predicted output.
Following the completion of pretraining, the model transitions to the second stage, which focuses on training the extrapolation decoder. During this stage, the encoder parameters are frozen, and only the extrapolation decoder is trained. The training objective is to predict future frames by leveraging the features extracted from the frozen encoder and the fused features from the skip connections.
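In code, this amounts to disabling gradients for the pretrained encoder before building the stage-2 optimizer. A minimal sketch follows, with illustrative module names (`encoder`, `extrap_decoder`) and an assumed learning rate:

```python
import torch

def build_stage2_optimizer(encoder, extrap_decoder, lr=1e-4):
    """Freeze the pretrained interpolation encoder (including the MSTF
    module) and return an optimizer over the extrapolation decoder only."""
    for p in encoder.parameters():
        p.requires_grad = False
    encoder.eval()  # also fixes the BatchNorm running statistics
    return torch.optim.Adam(extrap_decoder.parameters(), lr=lr)
```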

3.4. Missing Spatiotemporal Fusion Block

In the radar echo extrapolation task under missing patterns, the continuity of spatiotemporal information is disrupted due to the absence of certain frames in the input sequence. As shown in Figure 3, for the original input sequence, the missing frames $X_2$, $X_4$, and $X_5$ hinder the model's ability to capture the complete spatiotemporal variation features. In the differential sequence, a differential frame in the normal mode, such as $D_6$, exhibits small variance and low overall contrast. However, in the missing mode, the differential frames $D_1$, $D_2$, and $D_4$ (with missing neighboring and consecutive frames) exhibit large variance, complicating the model's learning process.
While the convolutional layer in the encoder can partially interpolate the missing information through convolution operations, it struggles to fully address the discontinuity in spatiotemporal information. To overcome this problem, we designed the MSTF module. This module is intended to enhance the model’s robustness under a missing data pattern and improve extrapolation performance by fusing the spatiotemporal features of both the original and differential sequences. The structure of the MSTF module is illustrated in Figure 4.
The MSTF module takes the spatiotemporal feature maps from both the original sequence branch (containing complete spatial–temporal context) and the differential sequence branch (highlighting change trends) as inputs. To focus on the most relevant information across channels, each branch’s features first undergo average pooling, preserving channel information. Subsequently, a lightweight attention mechanism is applied: 1D convolution followed by sigmoid activation generates channel-specific weights for each branch. These weights dynamically recalibrate the importance of each channel in the original feature maps.
The recalibrated features from both branches are then concatenated. To further integrate information and capture diverse perspectives, a global feature extractor applies both average pooling and max pooling across spatial dimensions. Finally, the pooled features are concatenated and fused using a 1 × 1 convolutional layer to produce the output.
Through this cross-branch fusion and adaptive channel weighting, the MSTF module synthesizes a more comprehensive spatiotemporal representation. It retains effective information from both sequences and mitigates the challenges posed by missing frames.
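The sketch below renders this description in PyTorch; the 1D-convolution kernel size, the stride-1 dual pooling that preserves the map resolution, and the output channel count are illustrative choices rather than the exact configuration:

```python
import torch
import torch.nn as nn

class MSTF(nn.Module):
    """Sketch of the missing spatiotemporal fusion block."""

    def __init__(self, channels, k=3):
        super().__init__()
        # Branch-specific lightweight channel attention (1D conv + sigmoid).
        self.att_raw = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2)
        self.att_diff = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2)
        # Dual pooling over spatial dimensions (stride 1 keeps the resolution).
        self.avg = nn.AvgPool2d(3, stride=1, padding=1)
        self.max = nn.MaxPool2d(3, stride=1, padding=1)
        # 1 x 1 convolution fuses the concatenated pooled features.
        self.fuse = nn.Conv2d(4 * channels, channels, kernel_size=1)

    def _channel_weights(self, feat, conv1d):
        w = feat.mean(dim=(2, 3)).unsqueeze(1)  # (B, 1, C): avg-pooled descriptor
        w = torch.sigmoid(conv1d(w))            # (B, 1, C): per-channel weights
        return w.transpose(1, 2).unsqueeze(-1)  # (B, C, 1, 1)

    def forward(self, f_raw, f_diff):
        # Recalibrate each branch's channels, then fuse across branches.
        f_raw = f_raw * self._channel_weights(f_raw, self.att_raw)
        f_diff = f_diff * self._channel_weights(f_diff, self.att_diff)
        cat = torch.cat([f_raw, f_diff], dim=1)                    # (B, 2C, H, W)
        pooled = torch.cat([self.avg(cat), self.max(cat)], dim=1)  # (B, 4C, H, W)
        return self.fuse(pooled)  # fused features for the skip connection
```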

3.5. Adversarial Training

To enhance the realism and accuracy of the predicted results, we adopt the adversarial training approach used in generative adversarial networks [32]. In this framework, the pretrained encoder and the extrapolation decoder jointly function as the generator, aiming to produce radar echo sequences that closely resemble real data. A discriminator is employed to evaluate the authenticity of these generated sequences. The optimization objective is defined as follows.
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{5}$$

where $G$ denotes the generator, $D$ denotes the discriminator, $x$ is a sample from the real data distribution $p_{data}$, and $z$ is the conditional input to the generator, drawn from the distribution $p_z$.

3.6. Training Strategy

MissPred uses a two-stage training strategy, as shown in Algorithm 1. In the first stage, the parameters of the interpolation encoder ($\theta_1$) and the reconstruction decoder ($\theta_2$) are initialized. The labels are the complete sequences corresponding to the input sequences. The reconstruction task is then trained using the loss function in Equation (4) with the Adam optimizer. Once training is completed, the parameters of the encoder (including the MSTF module) are frozen. Subsequently, the extrapolation decoder (with parameters $\theta_3$) and the discriminator (with parameters $\theta_4$) are attached. In this stage, the labels are the ground truth values of the future sequences, and training is carried out with the objective defined by Equation (5).
Algorithm 1 Training scheme.
Input: Missing radar sequence $X_{raw}$, differential radar sequence $X_{diff}$
Output: Predicted future sequences
1: Initialize the encoder parameters $\theta_1$ and decoder parameters $\theta_2$ of the pretrained model
2:    for $epoch = 1$ to $Epoch$ do
3:       for $iter = 1$ to $Iteration$ do
4:          $\hat{X} = \mathrm{PretrainModel}(X_{raw}, X_{diff}, \theta_1, \theta_2)$
5:          $L_r = \frac{1}{I} \sum_{t=1}^{I} (x_{label}^{t} - \hat{x}_t)^2$
6:          Update $\theta_1$ and $\theta_2$
7:       end for
8:    end for
9: Freeze parameters $\theta_1$; initialize the decoder parameters $\theta_3$ and discriminator parameters $\theta_4$
10:    for $epoch = 1$ to $Epoch$ do
11:       for $iter = 1$ to $Iteration$ do
12:          $\hat{Y} = G(X_{raw}, X_{diff}, \theta_1, \theta_3)$
13:          Calculate the discriminator loss $L_{GAN}(D)$
14:          Update $\theta_4$
15:          if $iter \bmod k == 0$ then
16:             Calculate the generator loss $L_{GAN}(G)$
17:             Update $\theta_3$
18:          end if
19:       end for
20:    end for
21: return $G$
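For concreteness, the second stage of Algorithm 1 can be sketched in PyTorch as follows, where `G` is the frozen interpolation encoder plus the trainable extrapolation decoder and `D` is the discriminator; the binary cross-entropy adversarial losses, the auxiliary pixel loss, the learning rate, and the update ratio k are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

def train_stage2(G, D, loader, decoder_params, k=2, lr=1e-4):
    """One epoch of the adversarial stage; only decoder_params are updated in G."""
    opt_g = torch.optim.Adam(decoder_params, lr=lr)
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    for it, (x_raw, x_diff, y_true) in enumerate(loader):
        y_pred = G(x_raw, x_diff)  # predicted future sequence

        # Discriminator step: tell real future sequences from predictions.
        d_real, d_fake = D(y_true), D(y_pred.detach())
        loss_d = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
                 F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
        opt_d.zero_grad()
        loss_d.backward()
        opt_d.step()

        if it % k == 0:
            # Generator step: fool the discriminator while staying close to the truth.
            d_fake = D(y_pred)
            loss_g = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake)) + \
                     F.mse_loss(y_pred, y_true)
            opt_g.zero_grad()
            loss_g.backward()
            opt_g.step()
    return G
```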

3.7. Evaluation Metrics

To evaluate the performance of the comprehensive models in both the missing data reconstruction and extrapolation tasks, the following metrics were employed: Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), Critical Success Index (CSI), and Probability of Detection (POD). These metrics assess the prediction accuracy and quality of the models from different perspectives.
(1) MSE
MSE is the fundamental metric used to measure the difference between the predicted values and the true values. It quantifies the error by calculating the average of the squared differences between the predicted and actual values:

$$MSE = \frac{1}{K} \sum_{i=1}^{K} (Y_i - \hat{x}_i)^2 \tag{6}$$

where $Y_i$ denotes the real frame, $\hat{x}_i$ represents the model's predicted value, and $K$ is the number of predicted frames.
(2) PSNR
The PSNR is a commonly used metric in image quality evaluation, indicating the ratio of signal strength to noise between the reconstructed image and the original image. It is defined as follows:

$$PSNR = 10 \cdot \log_{10}\left(\frac{R^2}{MSE}\right) \tag{7}$$

where $R$ is the maximum value of the data.
(3) SSIM
The SSIM evaluates the structural similarity between two images by considering three factors: brightness, contrast, and structure. This index reflects the model's ability to maintain the structural integrity of the original image during extrapolation. It is calculated as follows:

$$SSIM(\hat{X}, Y) = \frac{(2\mu_{\hat{X}}\mu_Y + c_1)(2\sigma_{\hat{X}Y} + c_2)}{(\mu_{\hat{X}}^2 + \mu_Y^2 + c_1)(\sigma_{\hat{X}}^2 + \sigma_Y^2 + c_2)} \tag{8}$$

where $\mu_{\hat{X}}$ and $\mu_Y$ are the mean values of the predicted and true frames, respectively; $\sigma_{\hat{X}}^2$ and $\sigma_Y^2$ are the corresponding variances; $\sigma_{\hat{X}Y}$ is the covariance between the two frames; and $c_1$ and $c_2$ are small constants to avoid division by zero.
(4) CSI
The CSI evaluates the accuracy of the model in predicting echoes above specific intensity thresholds. It is calculated as follows:

$$CSI = \frac{TP}{TP + FP + FN} \tag{9}$$

where $TP$ is the number of true positives, $FP$ is the number of false positives, and $FN$ is the number of false negatives.
(5) POD
The POD represents the proportion of the region exceeding a specified echo strength threshold that is correctly identified by the model. It is defined as follows:

$$POD = \frac{TP}{TP + FN} \tag{10}$$
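The threshold-based scores can be computed directly from binarized reflectivity fields. The sketch below also includes the PSNR of Equation (7); the maximum reflectivity R = 70 dBZ is an assumed value:

```python
import numpy as np

def csi_pod(pred, truth, threshold):
    """CSI and POD at a given echo-intensity threshold (in dBZ)."""
    tp = np.sum((pred >= threshold) & (truth >= threshold))  # hits
    fp = np.sum((pred >= threshold) & (truth < threshold))   # false alarms
    fn = np.sum((pred < threshold) & (truth >= threshold))   # misses
    csi = tp / (tp + fp + fn + 1e-9)
    pod = tp / (tp + fn + 1e-9)
    return csi, pod

def psnr(pred, truth, data_max=70.0):
    """PSNR from Equation (7); data_max plays the role of R."""
    mse = np.mean((pred - truth) ** 2)
    return 10.0 * np.log10(data_max ** 2 / mse)
```

For example, `csi_pod(pred, truth, 40.0)` yields the scores reported at the 40 dBZ threshold.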

4. Experiments and Analysis

We introduced eight baseline models to demonstrate the effectiveness of our proposed MissPred: ConvLSTM [10], PredRNN [11], 3D-Unet [33], SmaAt-Unet [34], SimVP [7], TAU [35], UCTransNet [36], and DeepLabV3D (3D version of DeepLabV3) [37].

4.1. Quantitative Comparison

In this section, the performance of MissPred is evaluated using a range of image and meteorological assessment metrics. To validate the performance of MissPred, we compared MissPred, which operates directly on missing data, with baseline models in an interpolation-then-prediction pipeline, in order to analyze the effect of cumulative error. The results are shown in Table 1. For the baseline models, all the missing data were completed using linear interpolation. As the table shows, MissPred outperforms the baseline models on all metrics. This is because linear interpolation assumes that the variation between data points is linear, which is often not the case in real radar echo data: the development, evolution, and dissipation of weather systems are usually nonlinear. In addition, linear interpolation smooths out details and rapidly changing features in the original data, which can lead to the loss of information. The cumulative error introduced by interpolation is thus significant.
Meanwhile, in order to quantify the robustness of different models in the face of missing data, we have tabulated the performance of all the models under missing data, as shown in Table 2. The results indicate that MissPred significantly outperforms the baseline models across all metrics, demonstrating its ability to deliver high-quality and accurate predictions for radar echo extrapolation with a missing pattern.
The analysis reveals that MissPred consistently outperforms the baseline models at varying echo strengths (20 dBZ, 30 dBZ, and 40 dBZ). Specifically, MissPred achieves CSI values of 0.8257, 0.6829, and 0.3510, and POD values of 0.8637, 0.7348, and 0.3904, at echo thresholds of 20 dBZ, 30 dBZ, and 40 dBZ, respectively. In contrast, the baseline models such as PredRNN exhibit CSI and POD values of 0.0997 and 0.1734, respectively, at a threshold of 40 dBZ. SimVP yields CSI and POD values of 0.1798 and 0.2179, respectively, at the same threshold. These results suggest that the performance of these state-of-the-art baseline models is significantly compromised in the missing mode. Although they address prediction duration, accuracy, and ambiguity issues, they neglect model robustness, which is critical for deep learning models, particularly when handling real-world data that often contain noise and missing values.
Robustness is a crucial aspect, as real-world data is seldom ideal. The ability to handle imperfect data is indispensable, and MissPred’s superior performance in this regard is noteworthy. Notably, ConvLSTM, a classical model proposed earlier, performs well in missing modes and ranks second only to MissPred. This may be attributed to the fact that radar echo data often exhibit inherent spatiotemporal patterns during different weather processes. ConvLSTM is likely able to leverage these pre-existing patterns, particularly at a lower missing rate, without relying on overly complex model structures.
Figure 5 illustrates the trend of model performance metrics over different prediction times, ranging from 6 min to 60 min. As shown in the figure, all the models exhibit a downward trend in performance as the prediction time increases. This can be attributed to the growing disparity between the predicted and actual future states of the convective system, which leads to error accumulation over time. Despite this, MissPred consistently outperforms all the other models. Notably, at longer prediction times (beyond 42 min), MissPred’s relative advantage becomes increasingly pronounced. This suggests that the spatial and temporal continuity of radar echo sequences is severely disrupted in the missing mode, significantly diminishing prediction accuracy. However, MissPred’s ability to complement missing information enables it to maintain greater robustness and stability compared to the other models.

4.2. Visual Comparison

In this section, three examples from the test dataset are visualized to intuitively assess the predictive effectiveness of the proposed MissPred model at different missing rates (MRs).
(1) MR = 0.2
Figure 6 shows the visualization of the predictions of MissPred and the baseline models at an MR of 0.2. Over the course of the next hour, the cloud undergoes a complex evolution of motion and enhancement. The baseline models show poor performance, particularly in terms of the movement of the echo position in the lower left corner of the echo image, the evolution of the cloud profile, and the tendency for the intensity to increase at the center of the image. At the 6 min extrapolation, all the models are able to predict the position and intensity of the large cloud mass well and capture the cloud contour effectively. This is because, despite two frames of data missing from the input sequence, the low MR still allows the models to infer the general morphology and evolutionary features of the cloud mass from the remaining frames.
However, starting from 18 min, the prediction of echo intensity by 3D U-Net, SmaAt-UNet, UCTransNet, and DeepLabV3_3D diminishes significantly. Although SmaAt-UNet incorporates an attention mechanism to enhance its spatial information processing, its ability to extrapolate over long time series is hindered by intermittent inputs. This disruption compromises the model's capacity to maintain temporal continuity, reducing its robustness in capturing temporal dependencies. UCTransNet, which integrates a Transformer module for capturing global dependencies through self-attention, also struggles under incomplete time-series data. Its performance weakens significantly when the temporal data is incomplete, as its ability to capture global information is compromised. The 3D U-Net and DeepLabV3_3D utilize 3D convolutions to model spatiotemporal dynamics, but the two missing frames disrupt the temporal continuity, leading to a failure to accurately capture the trend in echo intensity changes. In scenarios with more complex variations in echo intensity, these 3D convolution-based models may lose the ability to model dynamic changes, leading to erroneous intensity predictions and inaccurate echo evolution, which ultimately results in ambiguous cloud predictions.
In the later stages of the prediction, all the baseline models lose their ability to extrapolate effectively. Although ConvLSTM excels at modeling spatiotemporal dynamics, it struggles with fine-grained spatial information (e.g., small cloud masses). When the input sequence is missing, ConvLSTM faces challenges in effectively completing the missing information, resulting in poor representation of echo movement and evolution.
(2) MR = 0.5
Figure 7 shows the visualization of MissPred and the baseline models for a process with significant evolution at an MR of 0.5. As can be seen from the figure, almost all the baseline models break down in the radar echo extrapolation task when faced with such a high missing rate. Although ConvLSTM, PredRNN, and SimVP exhibit relatively low ambiguity, their accuracy in predicting echo regions and contours is poor, particularly in forecasting the shape of future echoes. Notably, none of these models is capable of predicting the separation of echoes in the lower part of the echo image. The models 3D U-Net, SmaAt-UNet, TAU, UCTransNet, and DeepLabV3_3D show significant ambiguities in their predictions. All of these models inaccurately predict the evolution and separation of echoes, interpreting them as dissipation. This is reflected in the gradual distortion of echo intensity, where the high-intensity region diminishes almost entirely over time.
ConvLSTM, which combines convolutional layers with LSTM networks, struggles significantly when faced with a large number of missing frames in the input. The temporal dependencies of the LSTM component are not effectively recovered from the missing data, resulting in a marked decline in long-term prediction performance. This is particularly evident in the later stages of the extrapolation, where distortion in the shapes and contours becomes more pronounced. Although PredRNN performs better in terms of ambiguity, it also suffers from shape and contour distortion. This is because the spatiotemporal convolutional structure that PredRNN relies on fails to effectively compensate for the loss of spatiotemporal dependencies caused by missing frames, especially for short-term and local variations, and it therefore fails to accurately predict the contours and shapes of the echoes. SimVP and TAU, both based on purely convolutional encoder–decoder architectures, also face considerable challenges. The large number of missing frames disrupts the temporal continuity and pattern consistency, significantly affecting the robustness of these models. The 3D U-Net, SmaAt-UNet, and DeepLabV3_3D, while capable of processing incomplete inputs, tend to smooth the predicted results, leading to a gradual reduction in echo intensity. This behavior suggests that traditional U-Net architectures may be prone to overfitting to the visible portion of the input, neglecting the overall trend in high-missing-rate scenarios. UCTransNet, which enhances long-range dependency modeling through a self-attention mechanism, shows severe intensity distortion from 12 min onward. This indicates that while the Transformer-based structure can capture long-range temporal dependencies, the self-attention mechanism fails to effectively recover the missing information, leading to inaccuracies in the predictions.
In contrast, MissPred demonstrates a clear advantage in robustness when faced with high missing rates. It effectively predicts the evolution and separation of clouds during the weather process, capturing both the contours and finer details of the echoes with high accuracy. This capability is attributed to its two-branch interpolating encoder, which interpolates historical sequence features for more accurate extrapolation.
(3) MR from 0.1 to 0.5
To visually analyze the effect of missing rates on radar echo extrapolation, Figure 8 compares the prediction results across different missing rates while controlling for missing positions.
The results illustrate a gradual decline in prediction quality as the number of missing frames increases, particularly in the echo contours and intensity within the region marked by the purple box. Specifically, when the input data is complete, the model can fully leverage spatiotemporal information to generate high-quality predictions. Even with one or two missing frames, the interpolating encoder effectively recovers most temporal information, resulting in accurate extrapolations. However, as the number of missing frames increases (e.g., five missing frames), model performance deteriorates significantly. This suggests that despite the inclusion of a well-designed interpolating encoder and a two-branch temporal fusion module, a large number of consecutive missing frames still represents a substantial challenge for the model.
This performance degradation is likely due to the disruption in the model’s ability to capture temporal dependencies when large amounts of spatiotemporal data are missing. As a result, the model exhibits prediction bias in terms of shape, detail, and intensity. This highlights a key limitation of spatiotemporal prediction models: their strong reliance on the continuity of the input data. The loss of spatiotemporal information caused by missing data severely impairs the model’s prediction capability. In particular, when a significant number of frames are missing, the model struggles to accurately recover the evolution of the echoes, particularly in rapidly evolving or detailed regions. Nevertheless, the experimental results demonstrate that MissPred retains a significant advantage in prediction accuracy, even in the presence of multiple missing frames.

4.3. Robustness Verification

To verify the robustness of the model and its ability to maintain features in the face of missing data, we selected a sample from the test set and removed 1–5 frames (i.e., 10–50% missing rates). Inference was then performed on each version separately, and some of the feature maps at the skip connection of the fourth layer were visualized, as shown in Figure 9. It can be seen that, in the face of missing data, the model is still able to generate intermediate features that are highly similar to those obtained with the full input. This result shows that the model has good robustness as well as contextual association capabilities. Even at a high missing rate of 50%, most of the feature maps still have distributions similar to those of the complete sequence.

4.4. Ablation Study

We conducted a series of ablation experiments to evaluate the necessity and effectiveness of specific components of the proposed MissPred model, including the pretraining strategy, the differential sequence branch, the MSTF module, and the discriminator. Specifically, removing the pretraining strategy involved eliminating the reconstruction task and training the model end-to-end, which allowed us to assess the importance of the interpolation encoder. For the removal of the differential sequence branch (Diff-Branch), we used only the original input sequence while maintaining the other components, such as pretraining. Removing the MSTF module involved directly concatenating the outputs of the two encoder branches and connecting them to the corresponding decoder layer after a 1 × 1 convolution. Lastly, removing the discriminator meant training the model without adversarial training, relying solely on the encoder–decoder structure.
A quantitative analysis of these ablation experiments is provided in Table 3. The results highlight the indispensability of each component. The model’s performance was notably degraded after removing the pretrain strategy, as the encoder could no longer effectively interpolate and capture complete spatiotemporal features, leading to insufficient understanding of historical information. When the Diff-Branch was removed, the model could only utilize the original sequence, thereby losing crucial spatiotemporal change information. As a result, the encoder’s ability to perform information completion was compromised. The Diff-Branch is essential for representing different missing patterns, providing valuable auxiliary information for interpolation. The MSTF module plays a critical role in extracting spatiotemporal features from missing sequences. During the pretrain stage, it learns the spatiotemporal dependencies of both the original and differential sequences through the reconstruction task. In the extrapolation stage, MSTF helps mitigate the loss of information caused by missing frames, improving the model’s extrapolation performance under missing patterns. Consequently, the removal of MSTF led to a decline in model performance. Finally, removing the discriminator eliminated the adversarial constraints, causing the generated extrapolation results to become smoother and less precise.

5. Conclusions

In conclusion, this paper proposed MissPred, a radar echo extrapolation model for use in missing modes, which provides new ideas for forecasting tasks in non-ideal situations from the perspective of model robustness. The experimental results on a real-world radar echo dataset validate the model’s effectiveness in handling missing data. Furthermore, ablation experiments validate the importance of each component in the model’s performance. Through analyzing the experimental results and visualization cases, the following conclusions are drawn:
  • The two-stage training strategy can avoid the cumulative error of the cascade structure and improve the prediction accuracy of the radar echoes by sharing the encoder parameters.
  • The differential sequence can recover missing information: by reconstructing the echo evolution between frames, it restores image details at a fine-grained level. The two-branch feature fusion structure effectively improves the encoder's ability to complete missing information.
  • The proposed MSTF module can effectively integrate the spatiotemporal features of the original and differential sequences to enhance the feature extraction capability of the model.
With this training strategy and model architecture, the experimental results on the radar dataset show that the method has higher prediction accuracy and stronger robustness in data-deficient modes compared to the existing methods. However, the two-stage training procedure, the two-branch structure, and the complexity of the model’s components necessitate substantial computational resources. Additionally, radar echoes are influenced by local weather conditions, which exhibit varying features and vertical structures. Future work will focus on developing low-complexity models while examining their predictive performance across different weather scenarios. In addition, different regions are affected by topography and climate, which may lead to radar echo data with different characteristics, and analyzing the generalization ability of the model through more datasets is also our future research direction.

Author Contributions

Conceptualization, Z.Z. and Q.Z.; methodology, Z.Z., C.D. and Q.Z.; validation, Z.Z. and C.D.; formal analysis, L.S., Y.L., Z.Z. and W.Z.; resources, W.Z., L.S. and Q.Z.; data curation, W.Z.; writing—original draft preparation, Z.Z.; writing—review and editing, Z.Z. and Q.Z.; visualization, Z.Z. and C.D.; supervision, Q.Z. and L.S.; project administration, Z.Z. and C.D.; funding acquisition, L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Open Research Project of the Key Laboratory of Lightning of China Meteorological Administration 2024KELL-B013 and Natural Science Foundation of Shandong Province of China under grant ZR2023MD012.

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Fang, W.; Pang, L.; Sheng, V.S.; Wang, Q. STUNNER: Radar echo extrapolation model based on spatiotemporal fusion neural network. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5103714. [Google Scholar] [CrossRef]
  2. Liu, Y.; Wang, J.; Song, Y.; Liang, S.; Xia, M.; Zhang, Q. Lightning nowcasting based on high-density area and extrapolation utilizing long-range lightning location data. Atmos. Res. 2025, 321, 108070. [Google Scholar] [CrossRef]
  3. Pei, Y.; Li, Q.; Zhang, L.; Sun, N.; Jing, J.; Ding, Y. MPFNet: Multi-product Fusion Network for Radar Echo Extrapolation. IEEE Trans. Geosci. Remote Sens. 2024, 62. [Google Scholar] [CrossRef]
  4. Sokol, Z. Assimilation of extrapolated radar reflectivity into a NWP model and its impact on a precipitation forecast at high resolution. Atmos. Res. 2011, 100, 201–212. [Google Scholar] [CrossRef]
  5. Wang, G.; Wong, W.-K.; Hong, Y.; Liu, L.; Dong, J.; Xue, M. Improvement of forecast skill for severe weather by merging radar-based extrapolation and storm-scale NWP corrected forecast. Atmos. Res. 2015, 154, 14–24. [Google Scholar] [CrossRef]
  6. Ridal, M.; Lindskog, M.; Gustafsson, N.; Haase, G. Optimized advection of radar reflectivities. Atmos. Res. 2011, 100, 213–225. [Google Scholar] [CrossRef]
  7. Gao, Z.; Tan, C.; Wu, L.; Li, S.Z. Simvp: Simpler yet better video prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 3170–3180. [Google Scholar]
  8. Geng, H.; Zhao, H.; Shi, Z.; Wu, F.; Geng, L.; Ma, K. MBFE-UNet: A Multi-Branch Feature Extraction UNet with Temporal Cross Attention for Radar Echo Extrapolation. Remote Sens. 2024, 16, 3956. [Google Scholar] [CrossRef]
  9. Li, J.; Li, L.; Zhang, T.; Xing, H.; Shi, Y.; Li, Z.; Wang, C.; Liu, J. Flood forecasting based on radar precipitation nowcasting using U-net and its improved models. J. Hydrol. 2024, 632, 130871. [Google Scholar] [CrossRef]
  10. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.-Y.; Wong, W.-K.; Woo, W.-C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In Proceedings of the Advances in Neural Information Processing Systems 28 (NIPS 2015), Montreal, QC, Canada, 7–12 December 2015; Volume 28. [Google Scholar]
  11. Wang, Y.; Long, M.; Wang, J.; Gao, Z.; Yu, P.S. Predrnn: Recurrent neural networks for predictive learning using spatiotemporal lstms. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
  12. Yang, Z.; Wu, H.; Liu, Q.; Liu, X.; Zhang, Y.; Cao, X. A self-attention integrated spatiotemporal LSTM approach to edge-radar echo extrapolation in the Internet of Radars. ISA Trans. 2023, 132, 155–166. [Google Scholar] [CrossRef]
  13. Chen, S.; Shu, T.; Zhao, H.; Zhong, G.; Chen, X. TempEE: Temporal-Spatial Parallel Transformer for Radar Echo Extrapolation Beyond Autoregression. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5108914. [Google Scholar] [CrossRef]
  14. Xu, L.; Lu, W.; Yu, H.; Yao, F.; Sun, X.; Fu, K. SFTformer: A Spatial-Frequency-Temporal Correlation-Decoupling Transformer for Radar Echo Extrapolation. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4102415. [Google Scholar] [CrossRef]
  15. Guo, S.; Sun, N.; Pei, Y.; Li, Q. 3D-UNet-LSTM: A Deep Learning-Based Radar Echo Extrapolation Model for Convective Nowcasting. Remote Sens. 2023, 15, 1529. [Google Scholar] [CrossRef]
  16. Li, Q.; Jing, J.; Ma, L.; Chen, L.; Guo, S.; Chen, H. A Deep Contrastive Model for Radar Echo Extrapolation. IEEE Geosci. Remote Sens. Lett. 2024, 22, 3500705. [Google Scholar] [CrossRef]
  17. He, G.; Qu, H.; Luo, J.; Cheng, Y.; Wang, J.; Zhang, P. An Long Short-Term Memory Model with Multi-Scale Context Fusion and Attention for Radar Echo Extrapolation. Remote Sens. 2024, 16, 376. [Google Scholar] [CrossRef]
  18. Zheng, C.; Tao, Y.; Zhang, J.; Xun, L.; Li, T.; Yan, Q. TISE-LSTM: A LSTM model for precipitation nowcasting with temporal interactions and spatial extract blocks. Neurocomputing 2024, 590, 127700. [Google Scholar] [CrossRef]
  19. Tan, Y.; Zhang, T.; Li, L.; Li, J. Radar-Based Precipitation Nowcasting Based on Improved U-Net Model. Remote Sens. 2024, 16, 1681. [Google Scholar] [CrossRef]
  20. Liu, J.; Qian, X.; Peng, L.; Lou, D.; Li, Y. TEDR: A spatiotemporal attention radar extrapolation network constrained by optical flow and distribution correction. Atmos. Res. 2024, 311, 107702. [Google Scholar] [CrossRef]
  21. Niu, D.; Li, Y.; Wang, H.; Zang, Z.; Jiang, M.; Chen, X. FsrGAN: A satellite and radar-based fusion prediction network for precipitation nowcasting. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 7002–7013. [Google Scholar] [CrossRef]
  22. Wang, L.; Wang, Z.; Hu, W.; Bai, C. RainHCNet: Hybrid High-Low Frequency and Cross-Scale Network for Precipitation Nowcasting. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 8923–8937. [Google Scholar] [CrossRef]
  23. Wang, Y.; Jiang, H.; Liu, T.; Yao, L.; Zhou, C. A Patch-wise Mechanism for Enhancing Sparse Radar Echo Extrapolation in Precipitation Nowcasting. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 8138–8150. [Google Scholar] [CrossRef]
  24. Wang, C.; Wang, P.; Wang, P.; Xue, B.; Wang, D. Using Conditional Generative Adversarial 3-D Convolutional Neural Network for Precise Radar Extrapolation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 5735–5749. [Google Scholar] [CrossRef]
  25. Vandal, T.J.; Nemani, R.R. Temporal Interpolation of Geostationary Satellite Imagery With Optical Flow. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 3245–3254. [Google Scholar] [CrossRef]
  26. Miller, L.; Pelletier, C.; Webb, G.I. Deep learning for satellite image time-series analysis: A review. IEEE Geosci. Remote Sens. Mag. 2024, 12, 81–124. [Google Scholar] [CrossRef]
  27. Tuheti, A.; Dong, Z.; Li, G.; Deng, S.; Li, Z.; Li, L. Spatiotemporal imputation of missing aerosol optical depth using hybrid machine learning with downscaling. Atmos. Environ. 2025, 343, 120989. [Google Scholar] [CrossRef]
  28. Si, J.; Chen, H.; Han, L. Enhancing Weather Radar Reflectivity Emulation From Geostationary Satellite Data Using Dynamic Residual Convolutional Network. IEEE Trans. Geosci. Remote Sens. 2025, 63, 4201711. [Google Scholar] [CrossRef]
  29. Yu, X.; Lou, X.; Yan, Y.; Yan, Z.; Cheng, W.; Wang, Z.; Zhao, D.; Xia, J. Radar Echo Reconstruction in Oceanic Area via Deep Learning of Satellite Data. Remote Sens. 2023, 15, 3065. [Google Scholar] [CrossRef]
  30. Gong, A.; Chen, H.; Ni, G. Improving the Completion of Weather Radar Missing Data with Deep Learning. Remote Sens. 2023, 15, 4568. [Google Scholar] [CrossRef]
  31. Wen, L.; Gao, L.; Li, X. A New Deep Transfer Learning Based on Sparse Auto-Encoder for Fault Diagnosis. IEEE Trans. Syst. Man Cybern. Syst. 2019, 49, 136–144. [Google Scholar] [CrossRef]
  32. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2019, 63, 139–144. [Google Scholar] [CrossRef]
  33. Çiçek, Ö.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016, Athens, Greece, 17–21 October 2016; Springer: Cham, Switzerland, 2016; pp. 424–432. [Google Scholar]
  34. Trebing, K.; Staṅczyk, T.; Mehrkanoon, S. SmaAt-UNet: Precipitation nowcasting using a small attention-UNet architecture. Pattern Recogn. Lett. 2021, 145, 178–186. [Google Scholar] [CrossRef]
  35. Tan, C.; Gao, Z.; Wu, L.; Xu, Y.; Xia, J.; Li, S.; Li, S.Z. Temporal attention unit: Towards efficient spatiotemporal predictive learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 18770–18782. [Google Scholar]
  36. Wang, H.; Cao, P.; Wang, J.; Zaiane, O.R. Uctransnet: Rethinking the skip connections in u-net from a channel-wise perspective with transformer. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; pp. 2441–2449. [Google Scholar]
  37. Yurtkulu, S.C.; Şahin, Y.H.; Unal, G. Semantic segmentation with extended DeepLabv3 architecture. In Proceedings of the 27th Signal Processing and Communications Applications Conference (SIU), Sivas, Turkey, 24–26 April 2019; pp. 1–4. [Google Scholar]
Figure 1. Radar data coverage area (red rectangle).
Figure 2. Architecture of the proposed MissPred, including the pretraining of the interpolation encoder, missing spatiotemporal fusion block (MSTF), prediction decoder, and adversarial training in the discriminator component.
Figure 3. Schematic representation of differential sequences for different patterns.
Figure 4. MSTF module structure.
Figure 5. Curves of MissPred and baseline models over time on (a) MSE, (b) PSNR, (c) SSIM, (d) CSI 20 dBZ, (e) CSI 30 dBZ, (f) CSI 40 dBZ, (g) POD 20 dBZ, (h) POD 30 dBZ, and (i) POD 40 dBZ metrics.
Figure 6. Visualization comparison with baseline models (missing rate = 0.2).
Figure 7. Visualization comparison with baseline models (missing rate = 0.5).
Figure 8. Comparison of baseline model and MissPred visualization results. The missing rate is from 0.1 to 0.5.
Figure 9. Visualization of feature maps at the fourth-layer skip connection under different missing rates.
Table 1. Cumulative error performance comparison between MissPred and other baseline models.

| Model | MSE ↓ | SSIM ↑ | PSNR ↑ | CSI 20 dBZ ↑ | CSI 30 dBZ ↑ | CSI 40 dBZ ↑ | POD 20 dBZ ↑ | POD 30 dBZ ↑ | POD 40 dBZ ↑ |
|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM | 37.0038 | 0.6981 | 22.7454 | 0.8028 | 0.6152 | 0.3015 | 0.8123 | 0.6847 | 0.3597 |
| PredRNN | 31.3015 | 0.6858 | 23.5585 | 0.7918 | 0.6415 | 0.3118 | 0.8248 | 0.6948 | 0.3699 |
| 3D U-Net | 26.0024 | 0.7088 | 24.3152 | 0.8157 | 0.6582 | 0.3122 | 0.8328 | 0.7092 | 0.3742 |
| SmaAt-UNet | 27.2027 | 0.7061 | 24.1228 | 0.8045 | 0.6248 | 0.3082 | 0.8294 | 0.6882 | 0.3648 |
| SimVP | 24.5026 | 0.7195 | 24.5851 | 0.8029 | 0.6681 | 0.3157 | 0.8311 | 0.7218 | 0.3790 |
| TAU | 27.1768 | 0.7124 | 24.1241 | 0.8048 | 0.6658 | 0.3098 | 0.8324 | 0.7133 | 0.3672 |
| UCTransNet | 23.5015 | 0.7328 | 24.7520 | 0.8158 | 0.6702 | 0.3492 | 0.8510 | 0.7305 | 0.3825 |
| DeepLabV3_3D | 29.5032 | 0.7018 | 23.8285 | 0.7745 | 0.6328 | 0.2983 | 0.8105 | 0.6910 | 0.3548 |
| **MissPred** | **21.2227** | **0.7414** | **25.3985** | **0.8257** | **0.6829** | **0.3510** | **0.8637** | **0.7348** | **0.3904** |

The best scores are marked in bold; ↑ means higher is better, ↓ means lower is better.
Table 2. Robustness performance comparison between MissPred and other baseline models.

| Model | MSE ↓ | SSIM ↑ | PSNR ↑ | CSI 20 dBZ ↑ | CSI 30 dBZ ↑ | CSI 40 dBZ ↑ | POD 20 dBZ ↑ | POD 30 dBZ ↑ | POD 40 dBZ ↑ |
|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM | 46.4901 | 0.6516 | 21.6184 | 0.7291 | 0.5886 | 0.2870 | 0.7900 | 0.6672 | 0.3471 |
| PredRNN | 103.8467 | 0.5434 | 18.3258 | 0.5698 | 0.3576 | 0.0997 | 0.6401 | 0.4628 | 0.1734 |
| 3D U-Net | 61.3635 | 0.5844 | 20.4058 | 0.6742 | 0.4609 | 0.1088 | 0.7477 | 0.5123 | 0.1161 |
| SmaAt-UNet | 79.0693 | 0.5726 | 19.4782 | 0.5805 | 0.2561 | 0.0192 | 0.6175 | 0.2658 | 0.0194 |
| SimVP | 70.7463 | 0.5514 | 19.8457 | 0.6498 | 0.4754 | 0.1798 | 0.7310 | 0.5592 | 0.2179 |
| TAU | 63.7063 | 0.5872 | 20.2200 | 0.6726 | 0.4929 | 0.1278 | 0.7441 | 0.5719 | 0.1543 |
| UCTransNet | 72.4601 | 0.5746 | 19.7851 | 0.6168 | 0.3431 | 0.0608 | 0.6642 | 0.3646 | 0.0631 |
| DeepLabV3_3D | 81.2753 | 0.5433 | 19.3582 | 0.5665 | 0.2340 | 0.0086 | 0.6064 | 0.2454 | 0.0089 |
| **MissPred** | **21.2227** | **0.7414** | **25.3985** | **0.8257** | **0.6829** | **0.3510** | **0.8637** | **0.7348** | **0.3904** |

The best scores are marked in bold; ↑ means higher is better, ↓ means lower is better.
Table 3. Performance comparison of different combinations of components.

| Pretrain | Diff-Branch | MSTF | Discriminator | MSE | SSIM | PSNR | CSI | POD |
|---|---|---|---|---|---|---|---|---|
| × | ✓ | ✓ | ✓ | 53.8457 | 0.6183 | 21.0854 | 0.4872 | 0.5275 |
| ✓ | × | ✓ | ✓ | 41.3586 | 0.6482 | 22.1687 | 0.5296 | 0.5984 |
| ✓ | ✓ | × | ✓ | 33.4896 | 0.6648 | 22.9870 | 0.5358 | 0.6158 |
| ✓ | ✓ | ✓ | × | 24.6859 | 0.7259 | 24.3841 | 0.5782 | 0.6570 |
| ✓ | ✓ | ✓ | ✓ | 21.2227 | 0.7414 | 25.3985 | 0.6199 | 0.6630 |

The last row (all components enabled) corresponds to the full MissPred model.