Article

MRKAN: A Multi-Scale Network for Dual-Polarization Radar Multi-Parameter Extrapolation

1 School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
2 School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
3 School of Internet of Things Engineering, Wuxi University, Wuxi 214105, China
4 School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China
5 Wuxi Meteorological Bureau of Jiangsu Province, Wuxi 214000, China
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(2), 372; https://doi.org/10.3390/rs18020372
Submission received: 27 November 2025 / Revised: 20 January 2026 / Accepted: 21 January 2026 / Published: 22 January 2026

Highlights

What are the main findings?
  • MRKAN substantially improves the joint extrapolation accuracy of Zh, Zdr, and Kdp. It outperforms conventional deep learning models across all major evaluation metrics and shows increased robustness in convective environments.
  • The proposed CSMamba, GIMRBF, MOKAN, and MSFF effectively capture and fuse global, mesoscale, and local nonlinear features. Ablation experiments further confirm their complementary contributions to the overall model performance.
What are the implications of the main findings?
  • The improved prediction of dual-polarization radar parameters shows substantial potential for enhancing short-term convective nowcasting. It also offers more reliable guidance for severe weather monitoring and early-warning operations.
  • The proposed modules provide a methodological reference for future radar-based AI models. They also have broad applicability to other remote-sensing, precipitation estimation, and geophysical prediction tasks.

Abstract

Severe convective weather is marked by abrupt onset, rapid evolution, and substantial destructive potential, posing major threats to economic activities and human safety. To address this challenge, this study proposes MRKAN, a multi-parameter prediction algorithm for dual-polarization radar that integrates Mamba, radial basis functions (RBFs), and the Kolmogorov–Arnold Network (KAN). The method predicts radar reflectivity, differential reflectivity, and the specific differential phase, enabling a refined depiction of the dynamic structure of severe convective systems. MRKAN incorporates four key innovations. First, a Cross-Scan Mamba module is designed to enhance global spatiotemporal dependencies through point-wise modeling across multiple complementary scans. Second, a Multi-Order KAN module is developed that employs multi-order B-spline functions to overcome the linear limitations of convolution kernels and to achieve high-order representations of nonlinear local features. Third, a Gaussian and Inverse Multiquadratic RBF module is constructed to extract mesoscale features using a combination of Gaussian radial basis functions and inverse multiquadric radial basis functions. Finally, a Multi-Scale Feature Fusion module is designed to integrate global, local, and mesoscale information, thereby enhancing multi-scale adaptive modeling capability. Experimental results show that MRKAN significantly outperforms mainstream methods across multiple key metrics and yields a more accurate depiction of the spatiotemporal evolution of severe convective weather.

1. Introduction

In the context of global climate warming, both the frequency and intensity of short-term convective storms have exhibited a significant upward trend [1]. These weather systems often form and evolve rapidly within a short period. The precipitation intensity can reach its peak within tens of minutes, and local accumulated rainfall frequently exceeds several hundred millimeters [2]. Intense rainstorms can easily trigger urban flooding, river overflows, and mountain torrents, posing serious threats to public safety and critical infrastructure [3,4,5]. Therefore, high-spatiotemporal-resolution and rapid forecasting of short-term convective weather evolution is of great practical importance for disaster prevention, loss reduction, and the protection of life.
Traditional nowcasting methods for short-term convective weather primarily rely on linear extrapolation of radar echoes. Commonly used techniques include the centroid tracking method [6,7,8], the cross-correlation method [9,10,11], and the optical flow method [12,13,14]. Among them, the centroid method extrapolates radar echoes by linearly fitting the temporal positions of echo centroids, with advantages including low computational cost and ease of real-time implementation. However, this method assumes that echo structures are symmetric and move steadily. When echoes merge, split, or experience rapid morphological changes, the centroid positions may shift abruptly, resulting in a significant increase in extrapolation errors. The cross-correlation method estimates motion vectors by identifying the displacement corresponding to the maximum spatial correlation between radar reflectivity fields at different times. This approach is more capable of capturing the overall motion patterns of complex echoes. However, when local echo deformation is severe, the method may yield multiple peaks or mismatched correlations, resulting in unstable motion vector estimation. The optical flow method is based on the brightness constancy assumption and derives the motion field by solving the optical flow equation, which enables accurate capture of fine-scale motion and local deformation of radar echoes. However, it requires high data continuity and grid consistency, is sensitive to noise and missing values, and often demands complex preprocessing and regularization steps. Overall, these methods mainly focus on displacement extrapolation and cannot effectively capture the nonlinear interactions among thermodynamic, dynamic, and microphysical processes within convective systems. As the forecast lead time increases, errors accumulate progressively, resulting in a marked decline in prediction reliability for medium- and long-term periods.
In recent years, deep learning has made substantial advances in severe convective weather forecasting. Among these methods, recurrent neural networks (RNNs) have gained significant attention due to their strong ability to model temporal dependencies. ConvLSTM [15] incorporates two-dimensional convolutional operations into the long short-term memory architecture. This model surpasses fully connected LSTM and traditional optical-flow methods in spatial feature extraction and temporal evolution modeling.
However, ConvLSTM tends to produce over-smoothed outputs and lose fine-scale details during long-term prediction, which limits its ability to capture rapid echo variations. To enhance long-sequence forecasting, TrajGRU [16] adaptively learns variable recurrent connection structures through small convolutional subnetworks. This design improves the model’s stability for long-term sequences. PredRNN++ [17] introduces cross-layer and cross-time memory transmission mechanisms together with a dual-memory cascade structure, effectively mitigating gradient vanishing. MIM [18] captures fast- and slow-varying components through parallel non-stationary memory modules, enabling efficient modeling of complex echo evolution. MotionRNN [19] decomposes echo motion into instantaneous components and long-term trends. With parallel MotionGRU structures and a dedicated information-channel design, the model improves its capability to learn multi-scale dynamic motion. Although the above RNN-based methods perform well in short-term forecasting, their frame-wise recursive extrapolation inevitably leads to error accumulation. Consequently, prediction accuracy decreases substantially as the forecast horizon increases.
Compared with recurrent structures, convolutional neural network (CNN) architectures require fewer parameters and support end-to-end feature extraction and multi-scale information fusion. As a result, they have been widely applied to short-term radar-echo prediction. RainNet [20] demonstrates strong performance in 5 min nationwide radar-mosaic forecasting in Germany and achieves high real-time computational efficiency. To improve sensitivity to key echo features, SE-ResUNet [21] integrates a Squeeze-and-Excitation module into the U-Net architecture. This design enables channel-wise recalibration and enhances local-feature representation in urban short-term forecasting. SmaAt-UNet [22] integrates spatial attention with depthwise-separable convolutions. This approach maintains high prediction accuracy while substantially reducing parameters and computational cost. FURENet [23] proposes a multi-parameter dual-polarization radar input model. Compared with single-parameter inputs, it improves the critical success index (CSI) for 30 and 60 min forecasts by 13.2% and 17.4%, respectively.
Deep learning methods based on RNNs and CNNs have been widely applied to radar echo spatiotemporal extrapolation and have enhanced short-term precipitation forecasting. However, studies on dual-polarization radar extrapolation still primarily rely on conventional deep neural network frameworks. These methods typically employ fixed linear mappings and predefined activation functions, limiting their capacity to represent the complex nonlinear interactions among multiple dual-polarization radar parameters. In high-dimensional spatiotemporal prediction scenarios, these modeling paradigms are prone to gradient vanishing and overfitting. They also inadequately capture fine-scale evolutionary features, which constrains the accurate representation of detailed structures and sustained evolution in convective precipitation processes [24,25,26,27].
Moreover, dual-polarization radar observations, including Zh, Zdr, and Kdp, provide detailed microphysical information on precipitation [28]. However, these variables are rarely modeled jointly or exploited synergistically in current extrapolation models. From an atmospheric physics perspective, Zh, Zdr, and Kdp reflect distinct microphysical properties. Zh indicates hydrometeor number concentration and volumetric scattering properties, and primarily characterizes the overall intensity distribution of precipitation systems. Zdr is sensitive to particle shape, orientation, and phase, and is employed to distinguish hydrometeor types and to indicate microphysical evolution processes. Kdp is approximately proportional to liquid water content along the propagation path and serves as an indicator of heavy precipitation and convective core regions. Under severe convective conditions, these parameters exhibit pronounced nonlinear coupling.
Their spatiotemporal evolution jointly reflects interactions among dynamical, thermodynamical, and microphysical processes within convective systems. Extrapolating a single radar parameter is insufficient to preserve the physical consistency of precipitation evolution. In contrast, jointly extrapolating Zh, Zdr, and Kdp constrains the coherent evolution of predictions at the microphysical level, thereby enhancing the physical plausibility and reliability of convective echo evolution representation.
From a meteorological research perspective, dual-polarization radar observations have been widely used to investigate precipitation microphysical structures and their evolution. However, preserving the spatiotemporal consistency of such physical information in short-term extrapolation remains a challenging problem. To address this challenge, this study adopts a data-driven modeling approach. It introduces network architectures with both function approximation capability and state-space characteristics, thereby bridging the gap between traditional statistical extrapolation methods and physical process representation. Although explicit physical constraint equations are not incorporated, the joint exploration of multi-parameter extrapolation and structural continuity preservation offers a novel approach for data-driven modeling of meteorological observations.
Based on these considerations, this paper proposes a dual-polarization radar echo extrapolation model, termed MRKAN. It integrates Mamba, radial basis function networks (RBF), and Kolmogorov–Arnold networks (KANs) to enhance multi-parameter extrapolation performance through temporal modeling and nonlinear function approximation. Compared with existing dual-polarization radar extrapolation methods, MRKAN incorporates the Mamba state-space model to capture long-term temporal dependencies of precipitation systems efficiently. This enables more accurate representation of the persistence and evolution trends in heavy precipitation processes. The RBF and KAN components enhance the model’s ability to represent highly nonlinear mappings among dual-polarization radar variables. They enable adaptive learning of intrinsic physical relationships underlying multi-parameter coupling without relying on complex handcrafted feature design. On this basis, MRKAN synergistically combines the strengths of different network architectures to complement temporal dynamic modeling and high-dimensional nonlinear representation. This design reduces the risk of overfitting and the loss of fine-scale details commonly observed in traditional deep networks for multi-variable dual-polarization radar extrapolation. This approach establishes a novel modeling paradigm for dual-polarization radar spatiotemporal extrapolation and advances high-resolution forecasting of heavy rainfall and severe convective weather.

2. Related Work

The core challenge of dual-polarization radar multi-input–multi-output (MIMO) echo extrapolation lies in modeling meteorological fields that are high-dimensional, multivariate, and highly nonlinear. Although conventional neural network architectures can effectively capture local features, they remain limited in modeling global dependencies and offering physical interpretability. In recent years, a series of novel models featuring strong nonlinear representational capacity and spatiotemporal structural awareness have emerged, including Mamba [29], RBF networks [30], and KAN [31]. These models have exhibited excellent generalization and theoretical interpretability in scientific computing and spatiotemporal modeling, providing new technical pathways for dual-polarization radar echo prediction.

2.1. Mamba

The Mamba network originates from the theoretical framework of the Selective State Space Model (Selective SSM). Its core principle is to dynamically model long-range dependencies in time series through an input-driven parameterization mechanism. Compared with traditional RNNs and Transformers, Mamba preserves the continuous modeling advantages of state space models. It significantly reduces computational complexity through a parallel scanning algorithm, achieving linear time complexity for long-sequence processing. In recent studies, Mamba [32] constructed input-driven dynamic mapping functions combined with a parallel scanning mechanism. This design increases inference throughput by more than a factor of five compared with Transformers of similar scale. DiM [33] first introduced Mamba into diffusion models, enhancing image generation quality through a two-dimensional adaptation strategy. MaskMamba [34], built on the Bi-Mamba-V2 architecture, incorporated inter-layer mixing and conditional injection mechanisms to enable non-autoregressive, high-resolution visual generation. QMamba [35] aggregated distortion-sensitive features through local window scanning and enhanced image quality assessment accuracy by integrating the StylePrompt strategy. MoE-Mamba [36] integrates the mixture-of-experts mechanism into the Selective SSM, accelerating training convergence through sparse activation while preserving linear inference efficiency.
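The state-space recurrence underlying Mamba can be illustrated with a minimal numpy sketch of a discretized linear SSM scan. This is a simplification: in Mamba proper the parameters are input-dependent (selective) and the scan is parallelized, both of which are omitted here.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Sequential scan of a discretized linear state-space model:
        h_t = A @ h_{t-1} + B @ x_t,   y_t = C @ h_t.
    x: (T, d_in); A: (d_state, d_state); B: (d_state, d_in); C: (d_out, d_state).
    Returns the output sequence y of shape (T, d_out).
    """
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A @ h + B @ x_t   # hidden-state update carries long-range context
        ys.append(C @ h)      # read-out at each step
    return np.stack(ys)
```

For a scalar system with decay 0.5 and constant input, the output converges geometrically toward the steady state, which is the long-range-memory behavior the recurrence provides.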

2.2. RBF

The RBF is a feedforward neural network grounded in kernel-based theory and rooted in function approximation and interpolation principles. The model forms local receptive regions in the input space using multiple radial basis units and approximates complex mapping relationships through nonlinear weighted summation. The main advantages of RBF lie in their simple structure, fast convergence, and strong local generalization capability. In methodological developments, U-RBF [37] proposes the Universal RBF Layer, which processes each input dimension independently using parallel Gaussian basis functions. It outperforms MLPs of similar size in low-dimensional regression and reinforcement-learning tasks. NeuRBF [38] improves image and NeRF-scene representation by combining learnable centers and widths with multi-frequency sinusoidal extensions. RESTAD [39] embeds RBF units into the Transformer’s latent space, substantially improving robustness in time-series anomaly detection. Deep RBF [40] improves the stability of multilayer RBF networks through regularization and parameter initialization, achieving strong generalization even under noisy conditions.

2.3. KAN

The KAN is based on the Kolmogorov–Arnold representation theorem, which states that any continuous multivariate function can be expressed as a finite composition of one-dimensional functions. KAN replaces the linear transformations and activation functions in conventional neural networks with learnable one-dimensional spline functions, enabling efficient nonlinear mapping and interpretable structural representation. This mechanism endows the model with enhanced expressive power and numerical stability in tasks such as low-dimensional function approximation, scientific computing, and physical modeling. Recently, KAN [31] introduced end-to-end trainable activation functions based on B-splines, enabling superior performance in function fitting and partial differential equation solving. Chebyshev KAN [41] parameterized edge activation functions using Chebyshev polynomials, effectively reducing redundant parameters and improving numerical stability. SineKAN [42] accelerated inference by employing a reweighted sinusoidal grid structure. KKANs [43] integrate KAN with MLP architectures based on Kurkova’s approximation principle and demonstrate improved performance and interpretability across diverse regression tasks.

3. Materials and Methods

3.1. Overall Network Architecture

Figure 1 presents the overall architecture of the proposed Kolmogorov–Arnold Network with the Mamba and RBF (MRKAN) model. The model adopts an end-to-end encoder–decoder framework. Its core innovation lies in four specialized modules: the Cross-Scan Mamba Module (CSMamba), the Gaussian and Inverse Multiquadratic Radial Basis Functions Module (GIMRBF), the Multi-Order KAN Module (MOKAN), and the Multi-Scale Feature Fusion Module (MSFF).
The model takes three dual-polarization radar parameters as input: reflectivity Zh, differential reflectivity Zdr, and specific differential phase Kdp. For each parameter, ten consecutive temporal frames are selected and concatenated along the channel dimension. This forms an input tensor $X \in \mathbb{R}^{H \times W \times 30}$, where H and W denote the image height and width, and 30 corresponds to the concatenation of three parameters over ten time steps. To balance expressive capacity and computational efficiency, the input tensor undergoes two-stage downsampling, reducing spatial resolution to one-fourth of the original while expanding the channel dimension to 96. The downsampling stage employs a max pooling strategy. The input features are first processed using a two-dimensional max pooling operation with a stride of 2. This operation halves the spatial resolution of the feature maps, enlarges the receptive field, and emphasizes salient response regions. The downsampled features are then processed by a double 3 × 3 convolution module to perform nonlinear mapping and channel transformation. Each convolutional layer is followed by Batch Normalization and a ReLU activation, which improves feature representation and stabilizes training. This downsampling strategy effectively reduces spatial resolution while preserving the essential structural and morphological information in radar echoes. The resulting tensor is subsequently passed through four encoder layers, each designed to simultaneously capture multi-scale spatial features and temporal dependencies, ensuring the richness and completeness of output representations. Specifically, CSMamba employs SSM to efficiently model long-term dependencies and extract global features. GIMRBF, based on RBF, captures intermediate-scale features bridging global and local representations. MOKAN focuses on local interactions among neighboring pixels to extract fine-grained spatial details.
Finally, MSFF fuses these three feature types through a multi-attention weighting strategy to generate high-quality integrated feature maps.
After feature extraction, the decoder progressively upsamples the feature maps to restore spatial resolution and fuses them with encoder features at corresponding scales to refine local details. The upsampling stage restores spatial resolution by integrating interpolation-based upsampling with convolutional operations. The feature maps are first upsampled by a factor of two in the spatial dimensions using bilinear interpolation. A 3 × 3 convolution is then applied to adjust channel representations and refine local details, followed by Batch Normalization and ReLU activation to enhance nonlinear feature representation. This mechanism preserves essential contextual information and enhances the reconstruction of high-resolution predictions in complex scenarios, thereby improving the overall model performance.

3.2. CSMamba

In the VMamba architecture [44], the input feature map is first divided into multiple patches, which are then unfolded into one-dimensional sequences along four distinct scanning paths. Although this strategy preserves global modeling capacity and reduces computational cost, patch division inevitably weakens feature continuity. Furthermore, when scanning reaches the end of a row or column, discontinuities occur during hidden-state updates, impairing contextual perception and prediction accuracy.
To address this limitation, this study proposes CSMamba. Unlike the VSS Block in VMamba, CSMamba removes the patch operation and performs pixel-wise scanning directly across the entire feature map. From a physical perspective, radar echoes represent precipitation fields that evolve continuously in space. Their intensity and morphology typically exhibit smooth pixel-wise propagation, along with localized growth or decay. In patch-based scanning, the continuous echo field is divided into discrete units during serialization. This process disrupts the one-to-one correspondence of pixels across patches in state-space modeling, weakening the model’s capacity to represent physical processes such as advection and diffusion of echoes. In contrast, pixel-wise scanning preserves the direct relationships between spatially neighboring pixels during state updates. This approach enables the evolution of hidden states to more accurately reflect the continuous spatial propagation of precipitation systems, enhancing the physical consistency and stability of spatiotemporal modeling. In addition, to enhance global modeling and avoid local feature interference, the depthwise convolution operation in the original module is removed. The detailed architecture is illustrated in Figure 2. The input features are first normalized using LayerNorm and then processed through a dual-branch structure. In the main branch, the input features first pass through a linear layer and are then scanned along four paths: row-wise from top-left to bottom-right, column-wise from top-left to bottom-right, row-wise from bottom-right to top-left, and column-wise from bottom-right to top-left. Each path processes features through an S6 Block. The outputs are then restored to the original spatial structure, summed pixel by pixel, and normalized using LayerNorm. In the secondary branch, features undergo linear transformation and activation, and are then multiplied with the main branch output to form a new feature map.
Finally, the features are further enhanced through residual connection. A second feature extraction stage follows a similar process but adopts a snake scanning strategy. When the scan reaches the end of a row or column, it continues in the opposite direction along the next line instead of returning to the starting point. This smooth transition at turning points enhances global contextual continuity and information flow.
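The four complementary raster scans and the snake scan can be expressed as index orderings over the H × W grid. The sketch below only generates the pixel orderings; the S6 blocks that consume the resulting sequences are not shown.

```python
import numpy as np

def scan_orders(H, W):
    """Four complementary raster-scan orderings over an H x W grid of pixel ids."""
    idx = np.arange(H * W).reshape(H, W)
    row = idx.reshape(-1)        # row-wise, top-left -> bottom-right
    col = idx.T.reshape(-1)      # column-wise, top-left -> bottom-right
    row_rev = row[::-1]          # row-wise, bottom-right -> top-left
    col_rev = col[::-1]          # column-wise, bottom-right -> top-left
    return row, col, row_rev, col_rev

def snake_order(H, W):
    """Snake (boustrophedon) scan: every other row is traversed in reverse,
    so consecutive sequence elements are always spatially adjacent."""
    idx = np.arange(H * W).reshape(H, W).copy()
    idx[1::2] = idx[1::2, ::-1]  # reverse odd-numbered rows
    return idx.reshape(-1)
```

In the snake ordering, the step from the end of one row to the start of the next moves to a directly adjacent pixel, which is exactly the smooth transition at turning points described above.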

3.3. MOKAN

Conventional convolutional operations with fixed kernel weights are limited in their ability to capture complex nonlinear structures within input features. To enhance the local nonlinear representation capability of the model for multi-parameter extrapolation of dual-polarization radar data, this study proposes the MOKAN. The design is inspired by the Kolmogorov–Arnold theory, as illustrated in Figure 3.
The input feature x is first evenly divided into L subgroups with identical dimensionality. For each subgroup, the module applies quadratic and cubic B-spline basis functions in parallel to perform spline convolution, generating local nonlinear responses $\phi_k^{2}(x)$ and $\phi_k^{3}(x)$, respectively. These responses are multiplied by learnable weights $W_k^{2}$ and $W_k^{3}$, and then summed to form the weighted combinations of quadratic and cubic basis functions. The two results are concatenated along the channel dimension and reorganized according to the original subgroup order. They are then passed through a convolutional layer for dimensionality reduction to match the input channel size. Finally, the output is added to the input through a residual connection to complete nonlinear enhancement. The MOKAN employs both quadratic and cubic B-spline basis functions to improve the representation of diverse local nonlinear features. Dual-polarization radar echoes exhibit significant spatial and meteorological variability in local patterns. Some regions display relatively smooth nonlinear responses, while strong echo areas and their edges often contain higher-order nonlinear structures. A single-order spline basis function cannot adequately capture the diversity of these feature patterns. Therefore, this study employs parallel modeling using quadratic and cubic spline basis functions. This design enables the network to capture both moderately and highly complex local nonlinear responses. Outputs from different-order splines are fused through learnable weights and mapped along the channel dimension. This mechanism enables adaptive adjustment of each spline order’s contribution during training. This design enhances local representation flexibility. It also leverages grouped processing and channel compression to control parameter count and mitigate feature redundancy. The computation can be formulated as follows:
$$\mathrm{MOKAN}(x) = x + \mathrm{GELU}\!\left(\mathrm{Conv}\!\left(\left[\,\textstyle\sum_{k} W_k^{2}\,\phi_k^{2}(x_l)\;\middle|\;\textstyle\sum_{k} W_k^{3}\,\phi_k^{3}(x_l)\,\right]\right)\right)$$
where $[\cdot\,|\,\cdot]$ denotes the concatenation operation along the channel dimension, and $x_l$ denotes the features of the $l$-th group of $x$. $\phi_k^{2}$ and $\phi_k^{3}$ represent the $k$-th quadratic and cubic B-spline basis functions, respectively, performing nonlinear mapping of input features within local receptive fields. $W_k^{2}$ and $W_k^{3}$ are learnable weights corresponding to the $k$-th quadratic and cubic splines, which adaptively regulate the contribution of each spline order. In this implementation, the MOKAN module divides input features along the channel dimension into 4 subgroups. Each subgroup independently performs nonlinear modeling while preserving spatial structure. This grouping strategy balances representation capacity and computational complexity. Multiple parallel groups enhance the model’s ability to capture local nonlinear patterns, while a smaller number of groups controls parameter size and reduces redundancy in high-resolution radar features. The number of groups was set to 4 as a practical choice to balance prediction accuracy and training stability, and this configuration is adopted throughout this study. For convolution and spline kernel design, MOKAN uses Kolmogorov–Arnold convolution to perform nonlinear mapping for each subgroup individually. To enhance modeling of local structures at different spatial scales, the module incorporates two parallel branches with 3 × 3 and 5 × 5 convolution kernels, respectively. The smaller kernel captures fine-grained local variations, whereas the larger kernel captures spatial correlations over a broader neighborhood. Correspondingly, the two branches use quadratic and cubic spline basis functions (orders 2 and 3) to balance nonlinear expressiveness and numerical stability. The spline basis functions are discretized on a uniform grid with five points and a value range of [0, 1], ensuring that the nonlinear mapping adequately covers the input data.
The spline convolution outputs of each subgroup are normalized and passed through a nonlinear activation. They are then concatenated along the channel dimension and fused via a 1 × 1 convolution. Finally, a residual connection adds the fused output to the input features, improving feature representation and stabilizing training.
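Reading the module's spline bases as standard B-splines, the quadratic and cubic basis functions used in the two branches can be evaluated with the Cox-de Boor recursion. This is an illustrative sketch: the knot placement below is hypothetical and not the module's exact five-point grid.

```python
import numpy as np

def bspline_basis(x, grid, order):
    """Cox-de Boor recursion for B-spline basis functions of the given order
    (polynomial degree = order) on a 1-D knot grid.
    x: (N,) evaluation points; grid: (G,) non-decreasing knots.
    Returns an (N, G - order - 1) array of basis values."""
    x = np.asarray(x, dtype=float)[:, None]
    g = np.asarray(grid, dtype=float)[None, :]
    # order-0 bases: indicator functions of the knot intervals
    B = ((x >= g[:, :-1]) & (x < g[:, 1:])).astype(float)
    for k in range(1, order + 1):
        left = (x - g[:, : -(k + 1)]) / (g[:, k:-1] - g[:, : -(k + 1)]) * B[:, :-1]
        right = (g[:, k + 1:] - x) / (g[:, k + 1:] - g[:, 1:-k]) * B[:, 1:]
        B = left + right
    return B
```

On a uniform knot grid, the quadratic (order 2) and cubic (order 3) bases form a partition of unity over the interior of the grid, which keeps the learnable spline activations numerically well conditioned.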

3.4. GIMRBF

Previous studies have primarily focused on global or local feature extraction, while mesoscale features have received relatively little attention. However, mesoscale features play a crucial role in bridging global structures and local details, as well as in characterizing variations in echo position and intensity. To address this limitation, this study proposes GIMRBF, as illustrated in Figure 4.
The module jointly employs Gaussian radial basis functions (GRBFs) and inverse multiquadric radial basis functions (IMRBFs), which complement each other in receptive field range and attenuation behavior. The Gaussian function exhibits a bell-shaped distribution within a moderate distance from its center $c_k$, making it suitable for fitting smooth mesoscale fluctuations in local regions. In contrast, the inverse multiquadric function decays more slowly, allowing it to capture correlations with more distant pixels. By introducing learnable centers $c_k$ and widths $w_k$ within channel subgroups, the module adaptively determines the effective receptive field range, thereby enabling precise mesoscale pattern modeling. Specifically, the input features are first flattened into a two-dimensional tensor along the spatial dimensions and then sequentially processed by LayerNorm and SiLU. The tensor is subsequently divided evenly along the channel dimension into several subgroups, where both types of radial basis functions are applied in parallel:
$$\phi_k^{\mathrm{Gaussian}}(x_l) = \exp\!\left(-\left(\frac{x_l - c_k}{w_k}\right)^{2}\right)$$
$$\phi_k^{\mathrm{InvMulti}}(x_l) = \frac{1}{\sqrt{(x_l - c_k)^{2} + \varepsilon^{2}}}$$
where $c_k$, $w_k$, and $\varepsilon$ are learnable parameters. The outputs from each subgroup are linearly mapped to generate mesoscale features, which are then concatenated along the channel dimension to reconstruct a feature map with the same spatial size as the input. Finally, BatchNorm and ReLU are applied to obtain the mesoscale features. During implementation, the GIMRBF module divides input features along the channel dimension into 4 subgroups. Within each subgroup, multiple radial basis functions, including Gaussian and inverse multiquadric RBFs, are applied concurrently. This grouping configuration maintains the ability to model mesoscale features while controlling parameter count and computational cost; four subgroups were chosen to balance the extraction of mesoscale spatial patterns with training stability. The number of basis functions in each RBF branch is set to eight. The effective receptive field is adaptively adjusted using learnable centers and widths, and the domain is normalized to match the input feature range. This design enables the network to capture mesoscale variations in radar echoes.
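The complementary decay behavior of the two kernels can be seen directly in a scalar numpy sketch; the parameter values used below are hypothetical, chosen only to illustrate the contrast.

```python
import numpy as np

def gaussian_rbf(x, c, w):
    """Bell-shaped response; decays rapidly away from the center c."""
    return np.exp(-(((x - c) / w) ** 2))

def inverse_multiquadric_rbf(x, c, eps):
    """Slowly decaying response; retains sensitivity to distant inputs."""
    return 1.0 / np.sqrt((x - c) ** 2 + eps ** 2)
```

Five width-units from the center, the Gaussian response is numerically negligible while the inverse multiquadric still contributes appreciably, which is why the two are paired to cover both near and far correlations.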

3.5. MSFF

The aforementioned CSMamba, MOKAN, and GIMRBF modules are designed to capture global features, local details, and mesoscale dependencies, respectively. Each module represents the spatiotemporal characteristics of dual-polarization radar echoes at a distinct level. However, a single-scale representation remains insufficient to fully describe the dynamic variations of radar echo fields across multiple spatial scales. To overcome this limitation, MSFF is proposed to enable efficient collaboration and adaptive fusion among multi-level features. The module applies a distinct attention mechanism to the features of each scale: channel attention for global features, self-attention for mesoscale features, and spatial attention for local features. These attention-enhanced features are then fused to produce a unified multi-scale representation. Although introducing multiple attention mechanisms in multi-scale feature fusion can enhance feature representation, without proper constraints, different attention branches may learn similar or redundant patterns, which increases model complexity and the risk of overfitting. To address this issue, MSFF avoids stacking multiple attention mechanisms at the same feature scale. Instead, attention mechanisms are allocated according to the specific representational requirements of global, mesoscale, and local features. This targeted allocation gives each attention branch a clear functional role and structurally reduces redundant learning among attention weights. Furthermore, each attention-enhanced feature is independently modeled before fusion and regularized using normalization, grouped convolution, and residual connections. These measures suppress excessive amplification of local responses and improve the stability and generalization of feature fusion.
In the channel attention, the module first performs both global average pooling and max pooling on the input features in parallel to extract channel-wise statistics. Subsequently, a convolution followed by a Sigmoid activation generates the channel attention weights. These weights are multiplied elementwise with the input features to emphasize important channel responses and suppress redundant information, thereby enhancing global-scale feature representation (Figure 5a).
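The channel-attention branch can be sketched as below. This is one plausible realization of the description, assuming a shared 1 × 1 convolutional bottleneck applied to the two pooled statistics; the class name and reduction ratio are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Sketch: parallel global average and max pooling produce channel
    statistics; a shared convolution plus Sigmoid yields channel weights."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):  # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))  # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))   # global max pooling
        w = self.sigmoid(avg + mx)                        # (B, C, 1, 1) weights
        return x * w                                      # reweight channel responses
```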
In the self-attention, the input features are processed through three convolutional layers to generate the query ( Q ), key ( K ), and value ( V ) matrices. The transposed Q is multiplied by K and normalized using Softmax to obtain the attention weight matrix, which is subsequently multiplied by V to produce self-attention–enhanced features (Figure 5b). In the self-attention branch, the Q , K , and V are not generated through fully connected mapping after one-dimensional flattening. Instead, a 3 × 3 convolution is employed for feature projection. This design preserves the ability of self-attention to model global dependencies. The attention weights are computed over the flattened H × W spatial dimensions, explicitly capturing correlations between arbitrary spatial positions. The 3 × 3 convolution primarily preserves the local spatial structure characteristic of medium-scale radar echoes prior to attention computation. This ensures that the generated Q , K , and V simultaneously encode contextual semantics and local structural priors. Consequently, the characterization of echo boundaries, intensity gradients, and spatial continuity is enhanced. During attention computation, the convolution-projected features are rearranged into a B × C × H W format. The attention weight matrix is then obtained by computing correlations along the spatial dimension. Mathematically, this process is equivalent to the Q K T operation in standard self-attention. Q is transposed solely because of the different tensor organization in order to measure similarity between spatial positions. Consequently, this implementation retains a standard dot-product-based self-attention structure while improving suitability for processing two-dimensional radar feature maps.
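The convolution-projected self-attention described above can be sketched as follows. The 3 × 3 projections, the B × C × HW rearrangement, and the dot-product weights over spatial positions follow the text; the scaling factor and class name are assumptions.

```python
import torch
import torch.nn as nn


class ConvSelfAttention(nn.Module):
    """Sketch: Q, K, V come from 3x3 convolutions rather than flattened
    linear maps; attention is computed between flattened HxW positions."""

    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 3, padding=1)
        self.k = nn.Conv2d(channels, channels, 3, padding=1)
        self.v = nn.Conv2d(channels, channels, 3, padding=1)
        self.scale = channels ** -0.5  # scaling is an assumption

    def forward(self, x):  # x: (B, C, H, W)
        b, c, h, w = x.shape
        q = self.q(x).flatten(2)  # (B, C, HW)
        k = self.k(x).flatten(2)
        v = self.v(x).flatten(2)
        # Q^T K measures similarity between spatial positions: (B, HW, HW)
        attn = torch.softmax(q.transpose(1, 2) @ k * self.scale, dim=-1)
        out = v @ attn.transpose(1, 2)  # weighted sum of values per position
        return out.reshape(b, c, h, w)
```

As the paragraph notes, the transpose of Q here is purely a consequence of the B × C × HW tensor layout; the operation is still the standard $QK^{\mathsf T}$ dot-product attention.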
In the spatial attention, the module performs global average pooling along both the height and width dimensions to capture spatial distribution characteristics. The resulting features are then divided into four subgroups. Each subgroup is processed in parallel using depthwise separable convolutions with kernel sizes of 3, 5, 7, and 9 to extract multi-scale local details. The outputs are concatenated, normalized, and activated using a Sigmoid function to produce the spatial attention map. This attention map is multiplied with the original features to highlight key spatial regions while suppressing background noise (Figure 5c).
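A sketch of the multi-kernel spatial-attention branch follows. The split into four subgroups and the depthwise kernels of size 3/5/7/9 come from the description; the pointwise convolution that collapses channels into a single-channel map, and the use of BatchNorm for the normalization step, are assumptions.

```python
import torch
import torch.nn as nn


class MultiKernelSpatialAttention(nn.Module):
    """Sketch: four channel subgroups filtered by depthwise convolutions
    (kernels 3/5/7/9), recombined into a Sigmoid spatial attention map."""

    def __init__(self, channels: int):
        super().__init__()
        assert channels % 4 == 0
        sub = channels // 4
        self.branches = nn.ModuleList(
            nn.Conv2d(sub, sub, k, padding=k // 2, groups=sub)  # depthwise
            for k in (3, 5, 7, 9)
        )
        self.norm = nn.BatchNorm2d(channels)
        # Pointwise step of the separable convolution -> one-channel map
        self.to_map = nn.Conv2d(channels, 1, kernel_size=1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):  # x: (B, C, H, W)
        chunks = x.chunk(4, dim=1)
        y = torch.cat([b(c) for b, c in zip(self.branches, chunks)], dim=1)
        attn = self.sigmoid(self.to_map(self.norm(y)))  # (B, 1, H, W)
        return x * attn  # highlight key regions, suppress background
```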
During the fusion stage, the three types of attention-enhanced features are concatenated along the channel dimension to integrate semantic information from different scales. LayerNorm is then applied for normalization, followed by a grouped convolution with a kernel size of 3 to extract fused multi-scale features. After GELU activation enhances nonlinearity, the module output is added to the input via a residual connection to alleviate gradient vanishing and maintain stable feature propagation. Finally, the output features are normalized using BatchNorm and then passed through two convolutional layers arranged in an expansion–reduction structure. This design completes full-scale information fusion and generates a highly expressive multi-scale integrated feature representation (Figure 5d).
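The fusion stage can be sketched as follows. The concatenation, LayerNorm, grouped 3 × 3 convolution, GELU, residual connection, and expansion–reduction convolutions follow the description; the group count, expansion ratio, the choice of the residual anchor, and the channel-reducing 1 × 1 convolution are assumptions of this sketch.

```python
import torch
import torch.nn as nn


class MSFFFusion(nn.Module):
    """Sketch of the MSFF fusion stage for three attention-enhanced inputs."""

    def __init__(self, channels: int, groups: int = 4, expand: int = 2):
        super().__init__()
        self.reduce = nn.Conv2d(3 * channels, channels, kernel_size=1)
        self.norm = nn.GroupNorm(1, channels)  # LayerNorm over channels per pixel
        self.mix = nn.Conv2d(channels, channels, 3, padding=1, groups=groups)
        self.act = nn.GELU()
        self.post = nn.Sequential(            # expansion-reduction structure
            nn.BatchNorm2d(channels),
            nn.Conv2d(channels, expand * channels, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(expand * channels, channels, kernel_size=1),
        )

    def forward(self, f_global, f_meso, f_local):  # each (B, C, H, W)
        # Concatenate the three scales, then normalize and mix with
        # a grouped 3x3 convolution and GELU
        fused = self.reduce(torch.cat([f_global, f_meso, f_local], dim=1))
        fused = f_global + self.act(self.mix(self.norm(fused)))  # residual add
        return self.post(fused)
```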

4. Results

4.1. Dual-Polarization Radar Data

This study utilizes the publicly available C-band dual-polarization meteorological radar dataset released by Nanjing University (NJU-CPOL, Nanjing, China; DOI: https://doi.org/10.5281/zenodo.5109403), comprising 258 representative precipitation events recorded from 2014 to 2019. The dataset includes three key observed parameters: radar reflectivity Zh, differential reflectivity Zdr, and specific differential phase Kdp. Radar observations were collected at an altitude of approximately 3 km, with a spatial resolution of about 1 km and a temporal resolution of 6 min. Each radar frame covers a spatial domain of 256 km × 256 km. Prior to sample construction, basic quality control was conducted on the radar observation data. This process primarily involved removing obvious outliers and invalid echoes. Data exceeding physically reasonable ranges were also filtered to reduce the impact of residual noise and clutter on subsequent model training and extrapolation. However, no complex or targeted clutter suppression algorithms were introduced. This approach preserves the simplicity and reproducibility of the data processing workflow while ensuring data reliability. To construct dual-polarization radar echo extrapolation samples, every 20 consecutive radar frames were grouped into a single sequence. The first 10 frames served as inputs, and the subsequent 10 frames were used as prediction targets. Thus, 60 min of observations were used to forecast the subsequent 60 min of echo evolution. The sliding window is set with a temporal stride of 1, such that consecutive samples differ by only one frame. This configuration fully utilizes continuous observation information and enhances the integrity of time series modeling. To reduce sample redundancy and improve training efficiency, only sequences containing at least one precipitation echo were retained. The dataset contains a total of 8129 samples, of which 1625 are reserved as an independent test set. 
The remaining 6504 samples are split into training and validation sets in an 8:2 ratio. During preprocessing, the Zh, Zdr, and Kdp parameters were normalized using Min–Max scaling, linearly mapping their values to the range [0, 1] to accelerate network convergence and enhance model stability.
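The sliding-window sampling and Min–Max scaling above can be sketched as follows. The function names and the toy bounds are illustrative; only the 20-frame window (10 input, 10 target), the stride of 1, and the linear mapping to [0, 1] follow the text.

```python
import numpy as np


def make_sequences(frames: np.ndarray, seq_len: int = 20, n_in: int = 10):
    """Slide a 20-frame window with stride 1 over one event; the first
    10 frames are inputs and the next 10 are prediction targets."""
    inputs, targets = [], []
    for start in range(len(frames) - seq_len + 1):
        window = frames[start:start + seq_len]
        inputs.append(window[:n_in])
        targets.append(window[n_in:])
    return np.stack(inputs), np.stack(targets)


def minmax_normalize(x: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Linearly map physical values (e.g. Zh in dBZ) into [0, 1];
    the per-variable bounds lo/hi are assumed, not from the paper."""
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)


# A toy "event": 25 frames of 4x4 radar data
event = np.random.rand(25, 4, 4) * 70.0
X, Y = make_sequences(event)
print(X.shape, Y.shape)  # (6, 10, 4, 4) (6, 10, 4, 4)
zh_norm = minmax_normalize(event, lo=0.0, hi=70.0)
```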

4.2. Evaluation Metrics

To comprehensively evaluate model performance in dual-polarization radar echo extrapolation, a multidimensional evaluation framework was adopted, comprising both classification and regression metrics. The classification metrics include the Critical Success Index (CSI), Probability of Detection (POD), Equitable Threat Score (ETS), and False Alarm Ratio (FAR), defined as follows:
$$\mathrm{CSI} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}$$
$$\mathrm{POD} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$$
$$\mathrm{ETS} = \frac{\mathrm{TP} - H}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN} - H}$$
$$H = \frac{(\mathrm{TP} + \mathrm{FN})(\mathrm{TP} + \mathrm{FP})}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN} + \mathrm{TN}}$$
$$\mathrm{FAR} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}}$$
Here, TP, FP, FN, and TN represent true positives, false positives, false negatives, and true negatives, respectively. The CSI and POD values range from 0 to 1, where higher values indicate better detection performance. The ETS ranges from −1 to 1, with values closer to 1 indicating higher forecast accuracy. A smaller FAR indicates a lower false alarm rate and greater model robustness.
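The contingency-table scores above are straightforward to compute from thresholded prediction and observation fields; a minimal sketch (function name illustrative):

```python
import numpy as np


def categorical_scores(pred: np.ndarray, obs: np.ndarray, thr: float):
    """CSI, POD, ETS, FAR at a fixed threshold (e.g. 35 dBZ for Zh),
    following the standard contingency-table definitions."""
    p, o = pred >= thr, obs >= thr
    tp = np.sum(p & o)
    fp = np.sum(p & ~o)
    fn = np.sum(~p & o)
    tn = np.sum(~p & ~o)
    n = tp + fp + fn + tn
    h = (tp + fn) * (tp + fp) / n          # hits expected by random chance
    csi = tp / (tp + fp + fn)
    pod = tp / (tp + fn)
    ets = (tp - h) / (tp + fp + fn - h)
    far = fp / (tp + fp)
    return csi, pod, ets, far


# Toy 2x2 fields thresholded at 35 dBZ: tp = fp = fn = tn = 1
pred = np.array([[34.0, 40.0], [36.0, 20.0]])
obs = np.array([[36.0, 41.0], [30.0, 22.0]])
csi, pod, ets, far = categorical_scores(pred, obs, thr=35.0)
# csi = 1/3, pod = 0.5, ets = 0.0, far = 0.5
```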
In addition to classification metrics, regression-based indicators—the Pearson Correlation Coefficient (PCC), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE)—were used to evaluate the correlation and error magnitude between predicted and observed values.
$$\mathrm{PCC} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^{2}}\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^{2}}}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - y_i\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - y_i)^{2}}$$
Here, $x_i$ and $y_i$ represent the predicted and observed values of the $i$-th sample, $\bar{x}$ and $\bar{y}$ denote their means, and $n$ is the total number of samples. PCC measures the linear consistency between predicted and observed fields. MAE reflects the average magnitude of prediction errors, while RMSE is more sensitive to large deviations, highlighting model performance under extreme precipitation conditions.
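These three regression metrics can be computed directly from the definitions above (the function name is illustrative):

```python
import numpy as np


def regression_scores(pred, obs):
    """PCC, MAE and RMSE between predicted and observed fields,
    matching the definitions above."""
    x = np.asarray(pred, dtype=float).ravel()
    y = np.asarray(obs, dtype=float).ravel()
    xm, ym = x - x.mean(), y - y.mean()
    pcc = np.sum(xm * ym) / np.sqrt(np.sum(xm ** 2) * np.sum(ym ** 2))
    mae = np.mean(np.abs(x - y))
    rmse = np.sqrt(np.mean((x - y) ** 2))
    return pcc, mae, rmse


pcc, mae, rmse = regression_scores([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
# mae = 0.25, rmse = 0.5; pcc is close to 1 for this near-linear pair
```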

4.3. Experimental Setup

For multi-parameter joint prediction, the three radar variables were concatenated along the channel dimension to form a multivariate input feature. Considering the high memory cost of RNNs in processing high-resolution data, a non-overlapping spatial partitioning strategy was adopted: the original 256 × 256 input domain was divided into multiple 64 × 64 × 16 tensor blocks, which were then fed into the network independently. The model was trained using the Adam optimizer with an initial learning rate of 1 × 10−4. The learning rate was adaptively reduced by a factor of 0.1 if the validation loss did not decrease significantly for five consecutive epochs, which promotes convergence and suppresses oscillations. All experiments were implemented in the PyTorch v2.2.2 framework and executed on a computing platform equipped with an NVIDIA GeForce RTX 3090 GPU, ensuring computational efficiency and experimental reproducibility [45].
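The training schedule and the non-overlapping partitioning can be sketched in PyTorch as follows. The stand-in model, the `mode="min"` plateau criterion, and the use of `Tensor.unfold` for partitioning are illustrative assumptions; the learning rate, decay factor, patience, and 64 × 64 block size follow the text.

```python
import torch

# Adam at 1e-4 with plateau-based decay (factor 0.1, patience 5 epochs).
# `model` is a trivial stand-in, not the actual MRKAN network.
model = torch.nn.Conv2d(3, 3, 3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5
)
# Per epoch one would call: scheduler.step(val_loss)

# Non-overlapping 64x64 spatial partitioning of a 256x256 field
field = torch.randn(1, 3, 256, 256)
patches = (
    field.unfold(2, 64, 64).unfold(3, 64, 64)  # (1, 3, 4, 4, 64, 64)
    .permute(0, 2, 3, 1, 4, 5)
    .reshape(-1, 3, 64, 64)                    # 16 independent blocks
)
print(patches.shape)  # torch.Size([16, 3, 64, 64])
```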

4.4. Experimental Results and Analysis

Table 1 presents the quantitative evaluation results for the three radar variables: Zh, Zdr, and Kdp. The thresholds for Zh were set to 30, 35, and 40 dBZ, while those for Zdr and Kdp were 0.5 dB and 0.2°/km, respectively. The symbols “↑” and “↓” indicate that a higher or lower value, respectively, corresponds to better performance. The best results under each threshold are highlighted in bold. Overall, the proposed MRKAN model consistently outperforms all comparison methods across the four key indicators: CSI, POD, ETS, and FAR. Specifically, when the Zh threshold is set to 30 dBZ, MRKAN achieves CSI improvements of 33.78%, 17.08%, 17.93%, 21.03%, 13.40%, 20.86%, 16.05%, 10.70% and 7.36% over ConvLSTM, TrajGRU, PredRNN++, MotionRNN, MIM, SmaAt, FureNet, Earthformer [46] and RobustGAN [47], respectively. The corresponding gains in POD are 33.61%, 13.16%, 15.31%, 20.65%, 12.41%, 20.75%, 15.94%, 10.56% and 5.08%, while ETS improves by 35.59%, 18.17%, 18.97%, 22.04%, 14.11%, 21.91%, 16.85%, 11.18% and 7.84%. Meanwhile, FAR decreases by 23.62%, 25.38%, 24.29%, 18.66%, 15.71%, 15.15%, 9.33%, 14.45% and 14.98%, respectively. When the threshold increases to 35 dBZ, MRKAN surpasses the second-best model RobustGAN by 0.0477, 0.0285, and 0.0489 in CSI, POD, and ETS, respectively, while reducing FAR by 0.0472. When the threshold is further increased to 40 dBZ, MRKAN achieves CSI improvements of 0.1934, 0.1153, 0.1077, 0.1371, 0.0914, 0.1203, 0.1051, 0.0937 and 0.0519 over the nine models, while reducing FAR by 25.83%, 25.71%, 22.82%, 21.64%, 17.48%, 20.08%, 20.78%, 6.85% and 19.53%, respectively.
When the Zdr threshold is set to 0.5 dB, MRKAN improves CSI by 32.53%, 12.95%, 17.12%, 27.15%, 15.99%, 17.83%, 9.35%, 10.08% and 6.14% over the nine comparison models; POD by 30.79%, 5.21%, 13.49%, 29.64%, 17.12%, 15.75%, 6.32%, 7.49% and 3.90%; and ETS by 34.44%, 14.28%, 18.21%, 28.30%, 16.70%, 18.93%, 10.24%, 10.65% and 6.61%, while reducing FAR by 20.00%, 22.45%, 18.22%, 10.30%, 8.89%, 15.20%, 19.47%, 13.32% and 10.12%, respectively. When the Kdp threshold is set to 0.2°/km, MRKAN again achieves the best overall performance across all metrics, improving CSI by 38.29%, 22.51%, 19.58%, 38.42%, 25.76%, 22.82%, 39.42%, 13.54% and 9.75% over the nine models. Meanwhile, FAR decreases by 33.54%, 27.52%, 25.98%, 26.37%, 28.96%, 28.09%, 38.80%, 14.42% and 25.81%, respectively. In summary, across various radar parameters and threshold settings, MRKAN consistently demonstrates stable and significant performance advantages, effectively increasing hit rates while reducing false alarms.
Figure 6 illustrates the temporal evolution of CSI, POD, ETS, and FAR for each model over the one-hour forecast horizon. It can be observed that CSI, POD, and ETS for all methods exhibit a downward trend over time, reflecting the accumulated prediction uncertainty inherent in precipitation extrapolation. However, the performance decay curve of MRKAN is noticeably smoother, indicating stronger robustness in long-term forecasting. Throughout the forecast period, MRKAN maintains the highest CSI, POD, and ETS values across all three radar parameters, confirming its superiority in spatiotemporal consistency modeling. Meanwhile, although FAR tends to increase with forecast lead time for all models, MRKAN consistently achieves the lowest FAR values, indicating its ability to enhance detection accuracy while effectively suppressing false alarms.
To further evaluate the performance of different models in dual-polarization radar echo extrapolation, a quantitative comparison from the perspective of image quality was conducted, as summarized in Table 2. The evaluation metrics include the PCC, MAE, and RMSE, which respectively measure structural correlation, pixel-level deviation, and overall reconstruction error between predicted and observed radar images. The overall results indicate that MRKAN achieves the best performance across all three metrics. Its extrapolated radar fields exhibit the highest structural consistency and numerical accuracy with respect to ground truth, reflecting superior spatial fitting and temporal stability. For Zh, MRKAN improves PCC by 9.26%, 4.84%, 6.05%, 6.06%, 4.17%, 5.29%, 2.88%, 2.43% and 1.83% over ConvLSTM, TrajGRU, PredRNN++, MotionRNN, MIM, SmaAt, FureNet, Earthformer and RobustGAN, respectively; reduces MAE by 2.681, 1.8791, 1.8106, 1.9146, 1.4139, 1.5284, 1.4642, 0.5530 and 0.8671; and decreases RMSE by 4.6330, 3.0525, 3.2522, 3.2435, 2.4233, 2.9390, 1.8565, 1.1091 and 1.4182. These results indicate that MRKAN provides a significant advantage in reconstructing radar echo intensity and suppressing prediction errors, enabling more accurate characterization of precipitation structures and intensity gradients. For Zdr, MRKAN again outperforms all comparison models in terms of PCC, MAE, and RMSE. This result confirms the model’s accuracy in capturing subtle dual-polarization signals and high-frequency oscillations. It effectively restores the microstructural details of meteorological targets—such as variations in drop shape and size distribution—thereby improving the fidelity of radar field extrapolation. For Kdp, MRKAN again exhibits outstanding performance. Compared with the MIM, PCC increases by 14.64%, while MAE and RMSE decrease by 0.54% and 5.18%, respectively. 
This further highlights MRKAN’s superior capability in recovering phase information and controlling errors, enabling more accurate representation of electromagnetic propagation and phase accumulation effects within precipitation media. In conclusion, through multi-scale feature collaboration and interpretable nonlinear mapping mechanisms, MRKAN demonstrates remarkable superiority in reconstructing spatial radar structures and modeling temporal consistency in dual-polarization radar echo extrapolation.
To facilitate an intuitive comparison of the extrapolation performance of each model, Figure 7 visualizes the predicted Zh, Zdr, and Kdp fields at four representative time steps. The observations indicate that the three dual-polarization parameters exhibit notable consistency in their spatial and temporal distributions. Taking Zh as an example, the initial strong echoes are concentrated on the right side. Their intensity gradually weakens and expands toward the left, and by T+60 min, the region of maximum reflectivity shifts to the lower-left area. Zdr and Kdp exhibit similar evolution patterns. Overall, MRKAN captures the echo evolution well throughout the period, and its predicted echo intensities remain close to the observations. In contrast, the other methods exhibit varying degrees of underestimation. Specifically, at T+6 min, all methods provide reasonably accurate predictions. However, by T+24 min, ConvLSTM shows clear underestimation, predicting the strong-echo region on the right as substantially weaker. The other methods perform better but still deviate from the observations. By T+60 min, the underestimation becomes more pronounced across all comparison methods. For the large strong-echo region on the left, the methods retain limited predictive capability, but their estimated shapes and intensities differ markedly from the observations. For the scattered strong-echo areas on the right, most methods almost fail to reconstruct them. In contrast, MRKAN shows higher stability and accuracy at all time steps. Even at T+60 min, its predictions remain close to the observations, recovering the strong-echo distribution more completely and maintaining better overall structural consistency. The predictions of Zdr and Kdp follow a similar pattern. At T+6 min, all methods can accurately capture their intensities and structures. 
As the extrapolation horizon increases, the other methods exhibit significant underestimation and increasing blurriness, whereas MRKAN maintains superior performance in intensity estimation, structural fidelity, and spatial consistency. These results demonstrate its robustness and accuracy in complex radar scenarios.

4.5. Ablation Study

To systematically assess the contribution of each functional module in MRKAN to overall performance, a series of ablation experiments were conducted. By gradually removing or replacing model components, seven comparative configurations were constructed to quantify the independent and synergistic effects of each module. In the experiments, module A denotes the CSMamba module, B represents the GIMRBF, C corresponds to the MOKAN, and D refers to the MSFF. All experiments were conducted under identical hardware and software environments, and model performance was evaluated using four metrics: CSI, POD, ETS, and FAR. The thresholds for the parameters correspond to 35 dBZ for Zh, 0.5 dB for Zdr, and 0.2°/km for Kdp. The baseline model was a four-layer VSSM architecture without any enhancement modules.
Table 3 summarizes the quantitative evaluation results for the various ablation configurations. The trends show that CSI, POD, and ETS increase steadily as the modules are introduced incrementally. In most cases, combining multiple modules outperforms using any single module, suggesting that the extracted features are complementary. Although simply stacking multiple feature types provides some benefits, MSFF achieves deeper multi-scale feature fusion and therefore yields superior performance. For Zdr and Kdp, introducing a single module improves accuracy but also increases FAR, whereas combining multiple modules improves accuracy while simultaneously suppressing FAR. When all modules operate together, the model achieves its highest accuracy and lowest FAR, demonstrating the rationality of the overall design and the necessity of each module.
To further verify the contributions and visual effects of each functional module, Figure 8 presents extrapolation results from the baseline model and several representative module combinations. The observations show that strong echoes initially concentrate in the lower-left region and gradually shift rightward over time, while an even stronger echo band persists on the right. The visualizations from the ablation experiments show that the baseline substantially underestimates strong-echo intensity and lacks sufficient spatial continuity and structural detail near boundaries, resulting in blurring and fragmentation. With the progressive introduction of CSMamba, GIMRBF, MOKAN, and MSFF, the model shows notable improvements in echo intensity reconstruction, spatial structure preservation, and local detail depiction. Under the full configuration, the model accurately captures the spatial evolution of the convective system and the dynamic movement of strong echoes, achieving a better balance between spatial continuity and detail fidelity. The Zdr and Kdp visualizations further indicate that a single module may cause overestimation in some high-value regions. Introducing two modules partially mitigates this issue. Using three modules yields predictions that are closer to the observations. With the complete MRKAN, the predictions best match the observations in intensity, morphology, and spatial consistency. These results demonstrate that the hierarchical and collaborative design of multiple modules significantly enhances spatiotemporal feature capture and prediction stability.
Overall, the ablation study systematically verifies the importance of each key component in MRKAN for multi-parameter extrapolation of dual-polarization radar data. The four modules form a progressive and complementary collaborative mechanism. CSMamba models global dependencies, GIMRBF extracts and enhances mesoscale features, MOKAN refines local structures, and MSFF performs multi-scale fusion and information integration. Consequently, MRKAN surpasses the baseline and all of its subsets in prediction accuracy, structural fidelity, robustness, and generalization, demonstrating the effectiveness and scientific soundness of the proposed architecture. Ablation results indicate that MRKAN’s performance improvement arises not from the enhancement of a single module, but from the functional collaboration among multiple modules. CSMamba primarily affects long-term temporal consistency. GIMRBF contributes significantly to medium-scale spatial structures, MOKAN enhances local nonlinear details, and MSFF effectively integrates multi-scale information based on these contributions. The contributions of each module vary across different metrics and radar parameters. This demonstrates that the architecture is not a redundant aggregation, but a synergistic design with clear physical and modeling motivations.

4.6. Complexity Experiment

In applications such as short-term precipitation forecasting, where timeliness is critical, model computational complexity, real-time performance, and predictive accuracy are all essential considerations. To assess whether MRKAN meets real-time forecasting requirements while maintaining high predictive accuracy, the study systematically evaluated its parameter count, floating-point operations (FLOPs), and average inference latency. All models were repeatedly tested under identical experimental conditions, and the results were averaged to reduce system noise and sporadic anomalies, ensuring fair and reliable comparisons. Detailed information is shown in Table 4.
MRKAN contains 148.40 million parameters. The FLOPs are 46.14 GFLOPs, much lower than those of most complex RNN models, and the average inference latency is only 115.77 milliseconds, fully satisfying real-time deployment requirements. These results indicate that by moderately increasing the parameter count to enhance expressiveness, MRKAN achieves high prediction accuracy without compromising efficiency. Compared with conventional RNN methods, MRKAN substantially reduces computational overhead while maintaining high predictive performance. For example, the average inference latencies of TrajGRU, MotionRNN, and MIM are 711.94 ms, 1221.08 ms, and 1675.22 ms, respectively. These latencies render them unsuitable for real-time applications. Although ConvLSTM has lower inference latency, its limited modeling capability leads to insufficient predictive accuracy. In contrast, MRKAN significantly improves predictive accuracy through abundant parameters and multi-level feature modeling. At the same time, it maintains inference efficiency, achieving an optimal balance between computational cost and performance. Compared with the Transformer-based Earthformer, MRKAN demonstrates superior efficiency and lower latency. Earthformer requires 279.32 GFLOPs and has an average inference latency of 330.64 ms, both substantially higher than those of MRKAN. Although RobustGAN is slightly faster in inference, its FLOPs are 54.65 GFLOPs, and its predictive accuracy remains lower than MRKAN.
Overall, by moderately increasing its parameter count to enhance expressiveness while maintaining reasonable computational complexity and inference latency, MRKAN achieves an excellent trade-off between predictive performance and real-time capability. This ensures practicality in high-timeliness precipitation forecasting scenarios. It also enables deployment in resource-constrained environments, demonstrating its application value in operational meteorological systems.

4.7. Generalization Experiments

To systematically evaluate the generalization performance of MRKAN under cross-regional and cross-system conditions, an independent extrapolation test dataset was constructed. The dataset is derived from continuous observations of multiple precipitation events in 2024 over the New York region using an S-band dual-polarization radar (New York City, NY, USA). It contains 1436 samples and strictly follows the preprocessing procedures of the NJU-CPOL dataset to ensure a fair comparison. The radar site is located at 40.865528°N, 72.863917°W, with effective detection ranges of 460 km for reflectivity and 230 km for radial velocity. Level-3 products at a 0.9° elevation angle with a temporal resolution of 6 min were used. The dataset is available from the National Centers for Environmental Information (NCEI). Compared to the C-band NJU-CPOL dataset, which covers only subtropical monsoon inland regions, the New York area has a humid continental climate, with cold, dry winters, warm to hot summers, and precipitation distributed more evenly throughout the year. Additionally, the radar coverage extends from inland to near-coastal areas. Both the electromagnetic band and underlying surface conditions differ significantly from the original experimental scenario, providing a more challenging environment to test model generalization.
Table 5 presents the quantitative evaluation results of the comparison models on this extrapolation dataset. Due to differences in climate, radar band, and geographic environment, the performance of all methods declined compared to the original NJU-CPOL test set. However, MRKAN exhibited a substantially smaller performance degradation and maintained optimal results across key metrics. For example, at a reflectivity threshold of 35 dBZ, MRKAN achieved a CSI of 0.5285, POD of 0.6205, and ETS of 0.5227. These correspond to relative decreases of only 4.48%, 3.48%, and 3.84%, respectively, which are substantially better than most comparison methods. Earthformer exhibited the next smallest CSI decrease at 6.07%. Regarding false alarm control, MRKAN achieved the lowest FAR of 0.2727, indicating strong reliability for early warning. For the prediction of polarization variables Zdr and Kdp, MRKAN similarly demonstrated stable advantages. For Zdr, MRKAN achieved CSI, POD, and ETS values of 0.3850, 0.4934, and 0.3511, all substantially higher than those of other deep learning models. In Kdp prediction, its CSI and ETS decreased only to 0.3626 and 0.3426, respectively. The FAR increased only to 0.4163, achieving a favorable balance between accuracy and false alarm rate.
In summary, the extrapolation experiments using the New York S-band radar fully validate the cross-regional generalization performance of MRKAN. The model not only maintains high forecasting accuracy in the original training scenario but also delivers stable and leading performance under changes in climate, radar band, and surface conditions. This demonstrates its robustness and practical applicability.

4.8. Additional Experiments

To evaluate the stability and reliability of MRKAN under extended forecast horizons, a 2 h extrapolation experiment was conducted based on the NJU-CPOL dataset. This experiment extends the previously described 1 h nowcasting study. Figure 9 systematically illustrates the evolution of key evaluation metrics for Zh, Zdr, and Kdp radar variables as the forecast lead time increases for each method. Overall, as the forecast horizon extends from 66 to 120 min, all comparison models show declines in CSI, POD, and ETS, while FAR increases. This reflects the heightened uncertainty in convective system evolution and the intrinsic challenges of long-term radar echo prediction. Nevertheless, MRKAN consistently exhibits stable and superior performance throughout the forecast period. Specifically, for Zh prediction, MRKAN’s CSI decreases gradually from 0.4519 to 0.3723. The decay rate is significantly smaller than that of other methods, maintaining the highest value at all forecast times. Its POD and ETS also decline more gradually, indicating stronger temporal consistency in detecting and capturing strong echo regions. For Zdr and Kdp prediction, MRKAN consistently ranks among the top performers. It maintains leading CSI, POD, and ETS values while achieving the lowest or near-lowest FAR. This demonstrates effective false alarm suppression under complex microphysical evolution.
To further evaluate the model’s ability to reconstruct spatial structures and capture strong echo evolution under long-term forecasts, Figure 10 presents a side-by-side comparison of observed radar fields and model predictions at four representative time points within the 1 to 2 h forecast window. Observations show that strong precipitation echoes are initially concentrated in the upper-right region and gradually weaken and dissipate over time. Meanwhile, echo intensity in the lower-left and upper-left regions increases, reflecting the spatial reorganization and evolution of the convective system. Against this complex spatiotemporal background, MRKAN accurately reconstructs the overall spatial distribution of echoes. It precisely captures the position, shape, and evolution direction of strong echo cores, and effectively preserves clear intensity gradients in high-reflectivity regions. Consequently, its forecasts closely match observations in both structural continuity and intensity characterization. In contrast, most baseline models tend to overly smooth or underestimate echo intensity as the forecast horizon increases, especially within strong precipitation cores, where structural details and gradients are poorly preserved. Similar results are observed for Zdr and Kdp. MRKAN reasonably reproduces the extrema distributions of differential reflectivity and specific differential phase, and effectively captures their spatial gradient evolution over time. Its forecasts closely match observations in both morphology and magnitude, further demonstrating the model’s sensitivity and representational capability for complex microphysical features.
In summary, the quantitative evaluations in Figure 9 and the qualitative comparisons in Figure 10 demonstrate that MRKAN excels in conventional 1 h nowcasting tasks. It also consistently delivers high-accuracy, low-false-alarm forecasts under the more uncertain 2 h extended forecast scenario.

5. Discussion

This study systematically evaluated the performance of the proposed MRKAN model for dual-polarization radar echo extrapolation under multiple thresholds, forecast horizons, and experimental settings. The discussion below focuses on interpreting the observed performance advantages, analyzing the contributions of individual modules, and examining robustness, computational efficiency, and generalization capability.
The quantitative results across different reflectivity and polarization thresholds demonstrate that MRKAN consistently attains competitive or near-optimal performance, particularly under medium-to-high thresholds. These thresholds correspond to moderate and strong precipitation scenarios, which are more challenging due to their rapid evolution and complex spatial structures. The stable performance of MRKAN under these conditions indicates that the model is capable of accurately capturing both echo intensity variations and spatial organization, even when strong convective systems dominate the radar field. This behavior reflects the robustness and adaptability of the proposed architecture in handling severe weather scenarios.
With increasing forecast lead time, all models exhibit a gradual decline in skill scores, which is an inherent characteristic of precipitation nowcasting caused by error accumulation and growing uncertainty. However, MRKAN shows a noticeably smoother degradation trend, particularly for CSI, POD, and ETS. This gradual decline can be attributed to its joint modeling of long-term temporal dependencies and multi-scale spatial information. Specifically, the CSMamba module alleviates error accumulation arising from increasing forecast lead times through a state-space mechanism, while multi-scale feature fusion reinforces structural consistency along the temporal dimension. Together, these mechanisms help preserve coherent echo evolution patterns and enhance stability in extended forecasts.
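The state-space mechanism credited above can be illustrated with a minimal discrete linear recurrence. This is a didactic sketch of the generic SSM idea only, not the selective, learned parameterization actually used in CSMamba:

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Discrete linear state-space scan:
    x_k = A x_{k-1} + B u_k,  y_k = C x_k.
    Toy illustration of how a state-space model carries information
    across long sequences at constant per-step cost."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k  # state update folds in the new input
        ys.append(C @ x)     # readout from the hidden state
    return np.array(ys)
```

With a stable transition matrix A (spectral radius below 1), the state acts as a smoothly decaying memory of past inputs, which is the qualitative property the discussion associates with reduced error accumulation over longer lead times.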
From a meteorological perspective, the superior image-quality metrics achieved by MRKAN further highlight its advantages. Higher PCC values indicate improved preservation of the overall spatial structure of radar echo fields, while lower MAE and RMSE values reflect more accurate representation of precipitation intensity gradients and local extrema. Accurate reconstruction of spatial structure and intensity gradients is essential for identifying convective core regions, precipitation band boundaries, and their evolution trends. These capabilities are particularly important for short-term hazardous weather monitoring, as they directly affect the reliability of storm tracking and intensity estimation.
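The three image-quality metrics discussed here are standard; a compact NumPy version (our own helper, shown only for reproducibility) is:

```python
import numpy as np

def image_quality_metrics(obs, pred):
    """PCC, MAE, and RMSE between observed and predicted echo fields."""
    o = obs.ravel().astype(float)
    p = pred.ravel().astype(float)
    pcc = np.corrcoef(o, p)[0, 1]        # Pearson correlation coefficient
    mae = np.mean(np.abs(o - p))         # mean absolute error
    rmse = np.sqrt(np.mean((o - p) ** 2))
    return pcc, mae, rmse
```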
The ablation experiments provide insight into the functional roles and complementary nature of the proposed modules. Among them, CSMamba yields the most substantial individual performance improvement, highlighting its effectiveness in modeling global context and long-term dependencies. GIMRBF and MOKAN further enhance performance by strengthening mesoscale feature extraction and refining local nonlinear structures, respectively. Building upon these improvements, the MSFF module enhances multi-feature interaction and fusion across scales, leading to additional gains. The results indicate that performance improvements do not arise from any single component alone but from the collaborative operation of multiple modules, each contributing to different aspects of spatiotemporal modeling.
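To make the role of the radial-basis component concrete, a toy Gaussian RBF layer can be written as follows. This is a generic sketch of RBF feature extraction with hypothetical names, not the actual GIMRBF module:

```python
import numpy as np

class GaussianRBFLayer:
    """Toy Gaussian radial basis function layer: maps inputs to
    similarities against a set of centers, a common way to encode
    locality at an intermediate (mesoscale-like) granularity."""

    def __init__(self, in_dim, num_centers, gamma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = rng.normal(size=(num_centers, in_dim))
        self.gamma = gamma  # controls the receptive width of each basis

    def __call__(self, x):
        # x: (batch, in_dim) -> responses: (batch, num_centers)
        d2 = ((x[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-self.gamma * d2)
```

Each output is maximal (equal to 1) when the input coincides with a center and decays with squared distance, so the layer responds selectively to features near its learned prototypes.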
Although MRKAN contains a relatively large number of parameters, this is a natural consequence of its multi-level feature modeling and enhanced representational capacity. The increased parameter count enables more effective characterization of complex spatiotemporal precipitation patterns. Importantly, this enhanced expressiveness does not result in excessive computational burden. The model maintains reasonable FLOPs and low inference latency, achieving a favorable balance between predictive accuracy and computational efficiency. This balance is critical for real-time precipitation nowcasting applications, where both timeliness and accuracy are essential.
The generalization experiments conducted on an independent S-band radar dataset demonstrate that MRKAN maintains stable and leading performance under changes in climate regime, radar frequency band, and geographic environment. Compared with other models, MRKAN exhibits a smaller performance degradation and consistently achieves lower false alarm rates, indicating strong reliability for early warning applications. Furthermore, the extended 2 h forecast experiments show that MRKAN remains effective under longer prediction horizons, accurately reconstructing spatial distributions, intensity gradients, and evolution trends of radar echoes. These results confirm the model’s ability to capture nonlinear radar signal characteristics and balance spatiotemporal dependencies under increasingly uncertain forecasting conditions.

6. Conclusions

This study presents a multi-parameter joint extrapolation framework, termed MRKAN, designed for dual-polarization radar data. Through the design of feature extraction modules, MRKAN enables joint modeling and high-precision extrapolation of key radar parameters Zh, Zdr, and Kdp. Specifically, MRKAN leverages the advantages of Structured State Space Models for global dependency modeling, Radial Basis Function networks for mesoscale feature extraction, and Kolmogorov–Arnold Networks for local detail representation, thereby improving both spatiotemporal consistency and structural fidelity of precipitation echoes. Experimental results further verify the effectiveness of MRKAN. Across multiple evaluation metrics, including CSI, POD, ETS, and FAR, MRKAN outperforms existing mainstream models, showing superior prediction accuracy, error control, and long-term extrapolation stability. Under strong convective precipitation scenarios, the method effectively suppresses false alarms and improves the depiction of precipitation evolution in key regions, enhancing reliability for short-term nowcasting. It also establishes a technical framework for intelligent meteorological monitoring and severe weather forecasting, laying a solid foundation for future research and applications in related domains.
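For readers unfamiliar with Kolmogorov–Arnold Networks: a KAN layer replaces the fixed linear weights of an MLP with a learnable univariate function on each input–output edge. The toy version below uses Gaussian bumps in place of the usual B-spline basis; it is a hypothetical simplification for illustration, not the paper's MOKAN module:

```python
import numpy as np

class TinyKANLayer:
    """Toy Kolmogorov-Arnold layer: out_j = sum_i phi_ij(x_i), where each
    edge function phi_ij is a learnable combination of Gaussian bumps on
    a fixed grid (standing in for the usual B-spline basis)."""

    def __init__(self, in_dim, out_dim, n_basis=5, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(-2.0, 2.0, n_basis)  # bump centers
        self.coeffs = rng.normal(scale=0.1,
                                 size=(in_dim, out_dim, n_basis))

    def __call__(self, x):
        # x: (batch, in_dim) -> (batch, out_dim)
        bumps = np.exp(-(x[:, :, None] - self.grid) ** 2)  # (b, in, nb)
        return np.einsum('bik,iok->bo', bumps, self.coeffs)
```

Because every edge carries its own nonlinearity, such layers can fit sharp local structure with few units, which is the intuition behind using KAN-style components for local detail representation.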
Although MRKAN achieves significant performance improvements on the experimental dataset, several limitations remain that require further study. First, the model’s performance partially depends on the spatial and temporal resolution of the input radar data. Significant changes in resolution may alter the distribution of feature scales, which can affect extrapolation accuracy. Second, the experiments in this study are primarily based on a specific dual-polarization radar system and observation region. Variations in radar configurations, scanning strategies, or climatic contexts may challenge the model’s generalization capability. Therefore, the robustness of MRKAN in cross-radar or cross-regional applications requires further enhancement.
Future research may proceed along several directions. First, the parameter dimension can be expanded by incorporating additional dual-polarization radar variables to enhance the model’s sensitivity to complex microphysical precipitation processes. Second, attention should be directed to the model’s spatiotemporal generalization, exploring its transferability and adaptability in medium- to long-term forecasting and multi-radar compositing scenarios. Finally, from a model optimization perspective, integrating lightweight feature extraction frameworks can further enhance computational efficiency.

Author Contributions

Conceptualization, J.W. and Y.Z.; methodology, J.W. and L.Z.; software, J.W., Q.L. and H.L.; validation, J.W., L.Z. and H.L.; formal analysis, J.W., H.P. and L.W.; investigation, J.W. and L.Z.; resources, Y.Z. and Q.L.; data curation, H.P. and L.W.; writing—original draft preparation, J.W.; writing—review and editing, J.W.; visualization, J.W., L.Z. and H.L.; supervision, Y.Z. and Q.L.; project administration, Y.Z. and Q.L.; funding acquisition, Y.Z. and Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (grant nos. 42175157, 42475151, and 42305158) and in part by the Jiangsu Graduate Research and Innovation Program (grant no. KYCX25_1647).

Data Availability Statement

This study utilizes the publicly available C-band dual-polarization meteorological radar dataset released by Nanjing University (NJU-CPOL, DOI: https://doi.org/10.5281/zenodo.5109403) and NOAA Next Generation Radar (NEXRAD) Level 3 Products (DOI: https://doi.org/10.25921/ncz0-wn95).

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Overview of MRKAN.
Figure 2. Structure of CSMamba.
Figure 3. Structure of the MOKAN.
Figure 4. Structure of the GIMRBF.
Figure 5. Structure of the MSFF. (a) Channel attention module, (b) self-attention module, (c) spatial attention module, (d) feature fusion module.
Figure 6. Evolution of CSI, POD, ETS, and FAR over the next 1 h for the three dual-polarization radar variables in the test set.
Figure 7. Zh, Zdr, and Kdp prediction results of all methods on the dual-polarization radar test set. The first row shows the ground truth, and the remaining rows show the predictions of each model.
Figure 8. Ablation results on the dual-polarization radar dataset. The first row shows the ground truth, and the remaining rows show the predictions of the ablated models.
Figure 9. CSI, POD, ETS, and FAR over the 1 to 2 h forecast window for the three dual-polarization radar variables of the NJU-CPOL dataset.
Figure 10. Forecast results over the 1 to 2 h window on the NJU-CPOL dataset. The first row depicts the ground truth, and the following rows show the predicted outputs from the different models.
Table 1. Performance analysis of Zh, Zdr, and Kdp on NJU-CPOL. ↑ means higher is better, ↓ means lower is better.

Zh, τ = 30 dBZ
Method | CSI ↑ | POD ↑ | ETS ↑ | FAR ↓
ConvLSTM | 0.4724 | 0.5361 | 0.4546 | 0.2163
TrajGRU | 0.5398 | 0.6330 | 0.5216 | 0.2214
PredRNN++ | 0.5359 | 0.6212 | 0.5181 | 0.2182
MotionRNN | 0.5222 | 0.5937 | 0.5051 | 0.2031
MIM | 0.5573 | 0.6372 | 0.5402 | 0.1960
SmaAt | 0.5229 | 0.5932 | 0.5056 | 0.1947
FureNet | 0.5446 | 0.6178 | 0.5275 | 0.1822
Earthformer | 0.5709 | 0.6479 | 0.5544 | 0.1931
RobustGAN | 0.5887 | 0.6817 | 0.5716 | 0.1943
Ours | 0.6320 | 0.7163 | 0.6164 | 0.1652

Zh, τ = 35 dBZ
Method | CSI ↑ | POD ↑ | ETS ↑ | FAR ↓
ConvLSTM | 0.3732 | 0.4325 | 0.3631 | 0.2738
TrajGRU | 0.4469 | 0.5434 | 0.4358 | 0.2846
PredRNN++ | 0.4486 | 0.5359 | 0.4380 | 0.2789
MotionRNN | 0.4313 | 0.5043 | 0.4211 | 0.2659
MIM | 0.4711 | 0.5553 | 0.4608 | 0.2521
SmaAt | 0.4390 | 0.5165 | 0.4286 | 0.2563
FureNet | 0.4542 | 0.5355 | 0.4439 | 0.2445
Earthformer | 0.4824 | 0.5585 | 0.4725 | 0.2227
RobustGAN | 0.5056 | 0.6144 | 0.4947 | 0.2584
Ours | 0.5533 | 0.6429 | 0.5436 | 0.2112

Zh, τ = 40 dBZ
Method | CSI ↑ | POD ↑ | ETS ↑ | FAR ↓
ConvLSTM | 0.2324 | 0.2677 | 0.2277 | 0.3666
TrajGRU | 0.3105 | 0.3862 | 0.3044 | 0.3660
PredRNN++ | 0.3181 | 0.3804 | 0.3125 | 0.3523
MotionRNN | 0.2887 | 0.3366 | 0.2837 | 0.3470
MIM | 0.3344 | 0.3940 | 0.3290 | 0.3295
SmaAt | 0.3055 | 0.3658 | 0.3002 | 0.3402
FureNet | 0.3207 | 0.3957 | 0.3157 | 0.3432
Earthformer | 0.3321 | 0.3839 | 0.3267 | 0.2919
RobustGAN | 0.3739 | 0.4696 | 0.3676 | 0.3379
Ours | 0.4258 | 0.4996 | 0.4197 | 0.2719

Zdr, τ = 0.5 dB
Method | CSI ↑ | POD ↑ | ETS ↑ | FAR ↓
ConvLSTM | 0.3520 | 0.4478 | 0.3322 | 0.3765
TrajGRU | 0.4130 | 0.5567 | 0.3908 | 0.3884
PredRNN++ | 0.3983 | 0.5161 | 0.3778 | 0.3683
MotionRNN | 0.3669 | 0.4518 | 0.3481 | 0.3358
MIM | 0.4022 | 0.5001 | 0.3827 | 0.3306
SmaAt | 0.3959 | 0.5060 | 0.3755 | 0.3552
FureNet | 0.4266 | 0.5509 | 0.4051 | 0.3740
Earthformer | 0.4238 | 0.5449 | 0.4036 | 0.3475
RobustGAN | 0.4395 | 0.5637 | 0.4189 | 0.3351
Ours | 0.4665 | 0.5857 | 0.4466 | 0.3012

Kdp, τ = 0.2 °/km
Method | CSI ↑ | POD ↑ | ETS ↑ | FAR ↓
ConvLSTM | 0.3184 | 0.4944 | 0.3040 | 0.5445
TrajGRU | 0.3594 | 0.5432 | 0.3449 | 0.4993
PredRNN++ | 0.3682 | 0.5462 | 0.3539 | 0.4889
MotionRNN | 0.3181 | 0.4495 | 0.3115 | 0.4915
MIM | 0.3501 | 0.4635 | 0.3433 | 0.5094
SmaAt | 0.3585 | 0.5474 | 0.3440 | 0.5033
FureNet | 0.3158 | 0.5255 | 0.2991 | 0.5913
Earthformer | 0.3878 | 0.5263 | 0.3758 | 0.4229
RobustGAN | 0.4012 | 0.5377 | 0.3857 | 0.4878
Ours | 0.4403 | 0.5709 | 0.4285 | 0.3619
Table 2. Comparison of the performance of different deep learning models for the strong convective weather forecasting task for Zh, Zdr, and Kdp using PCC, MAE, and RMSE metrics. ↑ means higher is better, ↓ means lower is better.

Zh
Method | PCC ↑ | MAE ↓ | RMSE ↓
ConvLSTM | 0.8360 | 6.7123 | 17.1937
TrajGRU | 0.8712 | 5.9104 | 15.6132
PredRNN++ | 0.8613 | 5.8419 | 15.8129
MotionRNN | 0.8612 | 5.9459 | 15.8042
MIM | 0.8768 | 5.4452 | 14.9840
SmaAt | 0.8675 | 5.5597 | 15.4997
FureNet | 0.8878 | 5.4955 | 14.4172
Earthformer | 0.8917 | 4.5843 | 13.6698
RobustGAN | 0.8970 | 4.8984 | 13.9789
Ours | 0.9134 | 4.0313 | 12.5607

Zdr
Method | PCC ↑ | MAE ↓ | RMSE ↓
ConvLSTM | 0.5129 | 3.4351 | 11.4436
TrajGRU | 0.5554 | 3.6932 | 11.1592
PredRNN++ | 0.5630 | 3.2703 | 10.9063
MotionRNN | 0.5527 | 3.4947 | 10.9854
MIM | 0.5899 | 3.0445 | 10.5077
SmaAt | 0.5574 | 3.3534 | 11.0670
FureNet | 0.5699 | 3.7968 | 10.9697
Earthformer | 0.5716 | 3.2305 | 10.9296
RobustGAN | 0.5950 | 3.2492 | 10.7793
Ours | 0.6299 | 3.0009 | 10.3807

Kdp
Method | PCC ↑ | MAE ↓ | RMSE ↓
ConvLSTM | 0.4995 | 1.4110 | 5.8748
TrajGRU | 0.5699 | 1.3841 | 5.5710
PredRNN++ | 0.5787 | 1.4068 | 5.4559
MotionRNN | 0.5581 | 1.1650 | 4.7771
MIM | 0.6010 | 1.1547 | 4.6997
SmaAt | 0.5677 | 1.4419 | 5.5268
FureNet | 0.5142 | 1.8210 | 5.8282
Earthformer | 0.5996 | 1.4896 | 5.8268
RobustGAN | 0.6396 | 1.6825 | 6.0355
Ours | 0.6890 | 1.1485 | 4.4563
Table 3. Ablation performance analysis on the dual-polarization radar dataset, with the radar reflectivity Zh threshold τ set to 35 dBZ, the differential reflectivity Zdr threshold τ set to 0.5 dB, and the specific differential phase Kdp threshold τ set to 0.2 °/km. Quantitative comparison of all ablation variants in terms of CSI, POD, ETS, and FAR. √ indicates that the module is used, × indicates that the module is not used.

Zh
Method | A | B | C | D | CSI | POD | ETS | FAR
Baseline | × | × | × | × | 0.4435 | 0.5034 | 0.4336 | 0.2809
Ours (w/o B&C&D) | √ | × | × | × | 0.4688 | 0.5309 | 0.4591 | 0.2488
Ours (w/o A&C&D) | × | √ | × | × | 0.4565 | 0.5234 | 0.4356 | 0.2737
Ours (w/o A&B&D) | × | × | √ | × | 0.4612 | 0.5829 | 0.4496 | 0.3204
Ours (w/o C&D) | √ | √ | × | × | 0.4824 | 0.5580 | 0.4723 | 0.2476
Ours (w/o B&D) | √ | × | √ | × | 0.4949 | 0.5817 | 0.4849 | 0.2373
Ours (w/o A&D) | × | √ | √ | × | 0.4779 | 0.5382 | 0.4683 | 0.2246
Ours (w/o D) | √ | √ | √ | × | 0.5184 | 0.5990 | 0.5086 | 0.2202
Ours | √ | √ | √ | √ | 0.5533 | 0.6429 | 0.5436 | 0.2112

Zdr
Method | A | B | C | D | CSI | POD | ETS | FAR
Baseline | × | × | × | × | 0.3931 | 0.4803 | 0.3737 | 0.3067
Ours (w/o B&C&D) | √ | × | × | × | 0.4032 | 0.5063 | 0.3923 | 0.3407
Ours (w/o A&C&D) | × | √ | × | × | 0.4018 | 0.4907 | 0.3897 | 0.3517
Ours (w/o A&B&D) | × | × | √ | × | 0.4144 | 0.5081 | 0.3979 | 0.3795
Ours (w/o C&D) | √ | √ | × | × | 0.4139 | 0.5134 | 0.4045 | 0.3361
Ours (w/o B&D) | √ | × | √ | × | 0.4171 | 0.5109 | 0.3977 | 0.3067
Ours (w/o A&D) | × | √ | √ | × | 0.4131 | 0.5126 | 0.4036 | 0.3370
Ours (w/o D) | √ | √ | √ | × | 0.4419 | 0.5505 | 0.4222 | 0.3133
Ours | √ | √ | √ | √ | 0.4665 | 0.5857 | 0.4466 | 0.3012

Kdp
Method | A | B | C | D | CSI | POD | ETS | FAR
Baseline | × | × | × | × | 0.3750 | 0.5102 | 0.3624 | 0.4226
Ours (w/o B&C&D) | √ | × | × | × | 0.3846 | 0.5293 | 0.3799 | 0.4725
Ours (w/o A&C&D) | × | √ | × | × | 0.3759 | 0.5235 | 0.3654 | 0.4752
Ours (w/o A&B&D) | × | × | √ | × | 0.3857 | 0.5271 | 0.3781 | 0.4848
Ours (w/o C&D) | √ | √ | × | × | 0.4025 | 0.5379 | 0.3925 | 0.4511
Ours (w/o B&D) | √ | × | √ | × | 0.4035 | 0.5470 | 0.3905 | 0.4218
Ours (w/o A&D) | × | √ | √ | × | 0.4022 | 0.5203 | 0.3903 | 0.3757
Ours (w/o D) | √ | √ | √ | × | 0.4188 | 0.5518 | 0.4067 | 0.3847
Ours | √ | √ | √ | √ | 0.4403 | 0.5709 | 0.4285 | 0.3619
Table 4. Floating point operations and average inference latency among different models.

Method | Parameters (M) | Floating Point Operations (GFLOPs) | Average Inference Latency (ms)
ConvLSTM | 10.29 | 163.58 | 38.09
TrajGRU | 68.23 | 321.68 | 711.94
PredRNN++ | 9.39 | 769.07 | 411.36
MotionRNN | 9.30 | 751.76 | 1221.08
MIM | 15.58 | 1275.07 | 1675.22
SmaAt | 4.04 | 10.10 | 12.91
FureNet | 58.08 | 34.10 | 13.01
Earthformer | 5.85 | 279.32 | 330.64
RobustGAN | 8.03 | 54.65 | 81.07
Ours | 148.40 | 46.14 | 115.77
Table 5. Performance analysis conducted on the New York S-band dual-polarization radar dataset, with the Zh threshold τ set to 35 dBZ, the Zdr threshold τ set to 0.5 dB, and the Kdp threshold τ set to 0.2 °/km. Quantitative comparison of all algorithms in terms of CSI, POD, ETS, and FAR. ↑ means higher is better, ↓ means lower is better.

Zh, τ = 35 dBZ
Method | CSI ↑ | POD ↑ | ETS ↑ | FAR ↓
ConvLSTM | 0.3303 | 0.3997 | 0.3251 | 0.3271
TrajGRU | 0.4147 | 0.5245 | 0.4089 | 0.3267
PredRNN++ | 0.3969 | 0.4784 | 0.3916 | 0.2977
MotionRNN | 0.4024 | 0.4746 | 0.3972 | 0.2848
MIM | 0.4125 | 0.5168 | 0.4217 | 0.2833
SmaAt | 0.4051 | 0.4953 | 0.3994 | 0.3334
FureNet | 0.4117 | 0.5065 | 0.4054 | 0.3805
Earthformer | 0.4531 | 0.5279 | 0.4328 | 0.2813
RobustGAN | 0.4692 | 0.5683 | 0.4534 | 0.3132
Ours | 0.5285 | 0.6205 | 0.5227 | 0.2727

Zdr, τ = 0.5 dB
Method | CSI ↑ | POD ↑ | ETS ↑ | FAR ↓
ConvLSTM | 0.2639 | 0.3249 | 0.2367 | 0.3995
TrajGRU | 0.3059 | 0.3779 | 0.2767 | 0.3934
PredRNN++ | 0.3327 | 0.4290 | 0.2995 | 0.4066
MotionRNN | 0.3247 | 0.4036 | 0.2939 | 0.3785
MIM | 0.3216 | 0.4053 | 0.2893 | 0.3792
SmaAt | 0.3256 | 0.4067 | 0.2946 | 0.3753
FureNet | 0.3238 | 0.4289 | 0.2903 | 0.4237
Earthformer | 0.3187 | 0.4362 | 0.3049 | 0.3723
RobustGAN | 0.3365 | 0.4516 | 0.3176 | 0.3698
Ours | 0.3850 | 0.4934 | 0.3511 | 0.3649

Kdp, τ = 0.2 °/km
Method | CSI ↑ | POD ↑ | ETS ↑ | FAR ↓
ConvLSTM | 0.2472 | 0.3542 | 0.2315 | 0.5611
TrajGRU | 0.2591 | 0.3487 | 0.2432 | 0.5153
PredRNN++ | 0.3055 | 0.4573 | 0.2864 | 0.5213
MotionRNN | 0.2792 | 0.3747 | 0.2631 | 0.5015
MIM | 0.2703 | 0.3681 | 0.2612 | 0.5096
SmaAt | 0.2691 | 0.3755 | 0.2529 | 0.5130
FureNet | 0.2910 | 0.4437 | 0.2636 | 0.6068
Earthformer | 0.3136 | 0.4539 | 0.3171 | 0.4742
RobustGAN | 0.3208 | 0.4622 | 0.3265 | 0.4971
Ours | 0.3626 | 0.4888 | 0.3426 | 0.4163
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Wang, J.; Zhang, Y.; Zhu, L.; Liu, Q.; Lin, H.; Peng, H.; Wu, L. MRKAN: A Multi-Scale Network for Dual-Polarization Radar Multi-Parameter Extrapolation. Remote Sens. 2026, 18, 372. https://doi.org/10.3390/rs18020372