TDA-DARKNet: A Deep Learning Model Based on Dual-Polarization Radar Data for Tornado Detection

Zhang, Guoxiu; Zeng, Qiangyu; Zhang, Fugui; Wang, Hao; Yu, Tiantian

doi:10.3390/rs18081124

by Guoxiu Zhang ¹,², Qiangyu Zeng ¹,³,⁴, Fugui Zhang ¹,²,*, Hao Wang ¹,² and Tiantian Yu ¹,²

¹ College of Electronic Engineering, Chengdu University of Information Technology, Chengdu 610225, China

² Key Laboratory of Atmosphere Sounding, China Meteorological Administration, Chengdu 610225, China

³ Key Laboratory of South China Sea Meteorological Disaster Prevention and Mitigation of Hainan Province, Haikou 570203, China

⁴ China Meteorological Administration Tornado Key Laboratory, Beijing 100871, China

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(8), 1124; https://doi.org/10.3390/rs18081124 (registering DOI)

Submission received: 3 March 2026 / Revised: 2 April 2026 / Accepted: 9 April 2026 / Published: 10 April 2026

(This article belongs to the Topic AI for Natural Disasters Detection, Prediction and Modeling)

Highlights

What are the main findings?

We propose a novel tornado identification model, TDA-DARKNet, which combines convolutional neural networks with channel and spatial attention mechanisms to strengthen responses to key variables and spatial regions, while incorporating a Kolmogorov–Arnold Network (KAN) to enhance nonlinear representation capability.
The model enhances tornado detection capability, particularly for weak tornado events.

What are the implications of the main findings?

Cross-regional evaluation using multiple independent tornado events indicates that the model maintains stable detection performance and good regional generalization.
The results suggest that the transition from “handcrafted feature-driven” approaches to “deep representation-driven” frameworks constitutes a key technological pathway for improving tornado detection probability and extending warning lead time.

Abstract

Tornadoes are localized, small-scale severe convective weather phenomena characterized by extreme destructiveness. Tornado detection and warning rely mainly on Doppler weather radar, which identifies and tracks tornadoes by recognizing the tornado vortex signature and supercells in radar data. Artificial intelligence has been applied to tornado recognition in recent years. However, existing monitoring methods, especially those using unsupervised learning algorithms, still offer limited recognition accuracy and warning timeliness, and usually struggle to balance detection accuracy against false alarm rate. To address these issues, a novel tornado detection algorithm, TDA-DARKNet, is proposed. The algorithm integrates a dual attention mechanism, dense residual connections, and a Kolmogorov–Arnold network (KAN). A tornado dataset suitable for deep learning was constructed from radar features including radial velocity, reflectivity, velocity spectrum width, differential reflectivity, and correlation coefficient. TDA-DARKNet was trained and tested on this dataset and evaluated on tornado cases. The experimental results show that TDA-DARKNet improves the detection probability and extends the lead time to a maximum of 42 min in strong tornado situations, while achieving 97.11% accuracy and 95.08% precision, indicating strong overall identification performance. In addition, by directly using radar base data for tornado identification, the algorithm eliminates the need for manual feature engineering, simplifies data processing, reduces complexity, and further enhances detection effectiveness.

Keywords:

tornado detection; dual-polarization radar; attention mechanism; denseblock; Kolmogorov–Arnold Network (KAN)

1. Introduction

Tornadoes are small-scale weather phenomena with strong destructive potential, characterized by limited spatial extent and short lifetimes. Accurate prediction and early warning of tornadoes remain highly challenging, constituting a longstanding frontier in atmospheric science. According to statistical analyses of tornadoes in China, tornadoes typically exhibit wind speeds of 100–140 m·s⁻¹, diameters of less than 2 km, and path lengths ranging from 0 to 25 km [1]. Tornadoes are generally classified into two categories: mesocyclone tornadoes and non-mesocyclone tornadoes, with the latter typically being weaker than the former [2,3]. Tornadoes predominantly occur in the flat terrains of the eastern and certain central regions in China. Jiangsu Province, especially its northern part, exhibits the highest frequency of tornado occurrences nationwide. The occurrence of tornadoes is closely linked to atmospheric conditions, particularly for major tornado outbreaks, which are typically accompanied by specific atmospheric anomalies and wind field conditions. Ćwik et al. [4] analyzed major tornado outbreaks in the United States during May by applying Maximum Covariance Analysis (MCA) to 91 significant tornado events from 1950 to 2019. Their findings provide new insights for understanding tornado outbreak characteristics and early warning in the context of climate change. The mechanisms of tornado formation remain incompletely understood; however, several theoretical models have been proposed to explain this process. Artekha [5], based on the Schwarzschild criterion for convective instability, analyzed the tornado velocity field structure through both analytical and numerical solutions, incorporating the stream function, azimuthal velocity components, and density change equations in cylindrical coordinates. These findings provide theoretical support for improving tornado prediction accuracy.

Weather radar is a crucial tool for tornado detection. Early tornado signature recognition algorithms primarily relied on dynamical features such as radar radial velocity for identification. The National Severe Storms Laboratory (NSSL) used pulse-Doppler radar to detect tornadoes and discovered a distinctive tornado vortex signature (TVS). Zrnić et al. [6] proposed the Mesocyclone Detection Algorithm (MDA), which identifies mesocyclones through the evaluation of spatial scale and azimuthal shear of radial velocity. MDA substantially enhanced the warning capability for tornadoes associated with supercell storms. In 1989, a new tornado detection algorithm (TDA-TVS) was developed to address the low detection rate associated with threshold-based identification of the TVS [7]. TDA-TVS successfully expanded tornado identification from supercell tornadoes to non-supercell tornadoes. It should be noted that TVS detection is strongly dependent on radar characteristics, scanning strategies, and the distance from the radar, which can significantly influence its detection capability. Steven [8] found that the integration of radar reflectivity with near-storm environmental parameters derived from numerical models significantly improves the automated classification of convective storms, particularly in identifying rotating storms and assessing their potential for severe weather. However, the rotational signals of many weak tornadoes are confined to the lower elevation levels of the radar, resulting in a relatively low detection rate.

In the 21st century, substantial progress in tornado detection has been achieved with the upgrade from Doppler radar to dual-polarization radar systems. Dual-polarization radar integrates fundamental Doppler parameters with polarimetric variables, enabling more comprehensive detection and analysis of meteorological phenomena. Ryzhkov et al. [9] conducted a detailed investigation of three tornado events and found that tornado debris in the hook echo region was associated with low correlation coefficient ρ_{HV} (below 0.8) and low differential reflectivity Z_{DR} (less than 0.5 dB). Based on these findings, a tornado detection algorithm utilizing tornado debris signatures (TDSs) was developed. Bluestein et al. [10] further pointed out that, in the identification of TDSs, the correlation coefficient is more effective in tornado detection algorithms due to its insensitivity to signal attenuation. Kumjian [11] identified distinct features such as the Z_{DR} arc, Z_{DR} column, and K_{DP} column based on observations from C-band polarimetric radar. Recognizing that mixed precipitation-debris scenarios may not produce extremely low correlation coefficient values, Ryzhkov [12] revised the TDS criteria by relaxing the correlation coefficient threshold and introducing reflectivity and azimuthal shear constraints, thereby improving operational robustness. Zhang et al. [13] observed and documented a waterspout over the Pearl River estuary in 2020 using a dual-polarization phased array radar (dual-PAR) network. This study confirmed that tornado occurrence is associated with abnormally low values of ρ_{HV}, consistent with known polarimetric signatures. Analysis of the prior studies reveals that the TDS typically appears after tornado touchdown, which limits its effectiveness for early warning. However, it serves as a reliable indicator for confirming that a tornado has occurred.

By integrating effective features from traditional identification methods and introducing novel ones, artificial intelligence has made substantial advancements in tornado detection and warning, greatly enhancing the efficiency of both identification and early warning. Wang [14] innovatively proposed the Neuro-Fuzzy Tornado Detection Algorithm (NFTDA) using a neuro-adaptive system, paving a new path for tornado detection. Hill et al. [15] applied the random forest algorithm to extreme weather warning, and also achieved tornado warning capability. Zeng et al. [16] proposed a tornado detection algorithm based on random forest (TDA-RF), which can effectively identify tornadoes of different intensities. By comprehensively evaluating multiple features, the algorithm demonstrates strong anti-interference capability, enabling tornado warning lead times of up to 18 min based on networked radar. Deep learning has demonstrated strong capability in rapidly identifying various weather patterns but often encounters the problem of data imbalance during the training process. Therefore, Basalyga et al. [17] constructed an image training dataset using tornado data, balanced the training set through image augmentation techniques, and conducted tornado prediction research using an image-based convolutional neural network (CNN) model, further promoting the application of artificial intelligence in the field of tornado warning. A new probabilistic tornado detection algorithm based on azimuthal shear was proposed by Sandmæl et al. [18], which demonstrated superior performance compared to TDA. McGuire [19] successfully predicted the number of tornado outbreak days of different scales based on climate data using three convolutional neural networks: LeNet-5, VGG-16, and ResNet-50. Xie et al. [20] proposed a tornado recognition algorithm based on a multi-task identification network (MTI-Net), introducing a novel multi-head convolutional block backbone network, which effectively improved the tornado recognition hit rate and reduced the false alarm rate, highlighting the effectiveness of MTI-Net in handling small-scale tornado events. A tornado detection algorithm based on XGBoost and dual-polarization radar was proposed by Zeng et al. [21], demonstrating improved performance in tornado identification through the incorporation of dual-polarization parameters.

Recent studies have clearly demonstrated that machine learning and deep learning models have benefited from enhanced computational power, leading to new advancements in tornado detection and early warning, and significantly improving detection and warning efficiency. However, current research in China on tornado structure and evolution remains largely limited to environmental conditions. Tornado identification models often face a trade-off between increasing detection rates and reducing false alarms due to the lack of high-resolution observational data.

Dual-polarization radar data collected in China, including reflectivity, radial velocity, spectrum width, differential reflectivity, correlation coefficient, and specific differential phase, were utilized to construct a tornado dataset for deep learning applications. A novel tornado detection algorithm, termed TDA-DARKNet (Tornado Detection Algorithm-Dual Attention Residual Dense Kolmogorov–Arnold Network), is proposed. TDA-DARKNet integrates a dual attention mechanism, dense residual connections, and a Kolmogorov–Arnold network (KAN). Specifically, TDA-DARKNet combines the nonlinear function approximation capability of KAN with the spatial feature extraction strengths of convolutional neural networks (CNNs). This design enables more effective representation of nonlinear spatial structures in radar observations. In addition, dense residual connections facilitate efficient integration of shallow detail features and deep semantic features through cross-layer feature reuse and fusion. During feature extraction, a channel-wise attention mechanism (CAM) and a spatial attention mechanism (SAM) are employed to achieve adaptive feature weighting. This process emphasizes critical tornado signatures while suppressing background noise.

2. Dataset

2.1. Dual Polarization Weather Radar

Dual-polarization weather radar extends traditional single-polarization radar by incorporating an additional vertical polarization channel, enabling the measurement of additional parameters, including differential reflectivity (Z_{DR}), correlation coefficient (ρ_{HV}), differential phase (ϕ_{DP}), and specific differential phase (K_{DP}). These parameters provide more microphysical information about the precipitation system, which is of great value in identifying the actual tornado and its debris zone on the ground [22]. The differential reflectivity Z_{DR} is the logarithmic ratio of the reflectivity factors of the horizontally and vertically polarized echoes, reflecting the average shape or oblateness of the particles. The definition of Z_{DR} is given in Equation (1). In a normal rainfall environment, the Z_{DR} value of oblate raindrops is generally positive, while the non-meteorological debris lofted by a tornado is usually irregular or nearly spherical, resulting in a significantly lower Z_{DR} value, close to 0 or negative.

Z_{DR} = 10 \log_{10} \left( \frac{Z_{h}}{Z_{v}} \right)

(1)
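
Equation (1) can be illustrated with a minimal sketch. The function name and the numeric inputs below are hypothetical and chosen only to show the sign behaviour described in the text; they are not values from the dataset.

```python
import math


def differential_reflectivity_db(z_h: float, z_v: float) -> float:
    """Z_DR in dB from linear-unit reflectivity factors, following Equation (1)."""
    if z_h <= 0 or z_v <= 0:
        raise ValueError("reflectivity factors must be positive in linear units")
    return 10.0 * math.log10(z_h / z_v)


# Illustrative inputs only (assumed, not measured):
oblate_rain = differential_reflectivity_db(2000.0, 1000.0)  # horizontal return dominates
debris_like = differential_reflectivity_db(1000.0, 1000.0)  # equal returns in both channels
```

A horizontal return twice as strong as the vertical one gives a positive Z_{DR} of about 3 dB, whereas equal returns give 0 dB, which is the regime the text associates with tumbling debris.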

The correlation coefficient ρ_{HV} represents the consistency of the horizontal and vertical polarization echoes of particles in the radar beam, as shown in Equation (2):

ρ_{HV} = \frac{\left| \overline{S_{vv} S_{hh}^{*}} \right|}{\sqrt{\overline{|S_{hh}|^{2}} \; \overline{|S_{vv}|^{2}}}}

(2)

where S_{hh} and S_{vv} represent the in-phase echoes of horizontally and vertically polarized electromagnetic waves, respectively, the asterisk denotes complex conjugation, and the overbar denotes averaging over pulses. When electromagnetic waves continuously illuminate the same resolution volume and the statistical properties of hydrometeors within the volume remain stable, the relative relationship between Z_{h} and Z_{v} across pulses remains consistent, resulting in a high correlation coefficient. Correlation coefficients in pure rain regions often exceed 0.98. However, when tumbling, fragmentation, or the coexistence of hydrometeors with differing physical phases (e.g., mixed-phase particles) occurs within the resolution volume, scattering behavior becomes inconsistent. As a result, the relative relationship between Z_{h} and Z_{v} across pulses varies significantly, and the correlation coefficient decreases. Therefore, ρ_{HV} is one of the most critical parameters for identifying the characteristics of tornado debris.
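
The behaviour of the correlation coefficient can be sketched from a short series of complex pulse samples. This is a simplified illustration that assumes the standard estimator (magnitude of the averaged cross product, normalized by the two averaged powers); the function name and the sample values are hypothetical.

```python
import math


def correlation_coefficient(s_hh, s_vv):
    """Estimate rho_HV from paired complex samples of the two polarization channels."""
    if len(s_hh) != len(s_vv) or not s_hh:
        raise ValueError("need two non-empty sample series of equal length")
    n = len(s_hh)
    cross = sum(v * h.conjugate() for h, v in zip(s_hh, s_vv)) / n
    power_h = sum(abs(h) ** 2 for h in s_hh) / n
    power_v = sum(abs(v) ** 2 for v in s_vv) / n
    return abs(cross) / math.sqrt(power_h * power_v)


# Consistent scatterers: the vertical channel is a fixed multiple of the horizontal one.
h_samples = [1 + 1j, 2 - 1j, -1 + 0.5j]
rho_consistent = correlation_coefficient(h_samples, [0.8 * h for h in h_samples])

# Inconsistent scatterers: the relative phase changes from pulse to pulse.
rho_inconsistent = correlation_coefficient(
    [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j], [1 + 0j, 1j, -1 + 0j, -1j]
)
```

In the first case the pulse-to-pulse relationship between the channels is fixed and the estimate is 1; in the second the cross products cancel and the estimate drops to 0.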

2.2. Radar Base Data Processing

The radar base data used in this study were quality-controlled. During dataset construction, the azimuth indexing was aligned such that the initial radial direction corresponds to geographic North, ensuring consistency between tornado damage survey information (azimuth and distance) and the radar data. This operation is only used for data mapping and does not involve any modification of the original radar measurements.

2.3. Tornado Dataset Construction

There are substantial differences in the intensity, type, and movement paths of tornadoes between China and the United States, with U.S. tornadoes generally exhibiting greater overall intensity. Owing to the relatively late implementation of dual-polarization upgrades to China’s new-generation Doppler radar systems, combined with the low frequency of tornado occurrences, the number of available dual-polarization tornado observation samples remains limited. To satisfy the large-sample requirements of deep learning models, while accounting for differences in environmental conditions and radar observation characteristics between Chinese and U.S. tornadoes, radar data from both countries are employed in this study. Specifically, the TorNet dataset [23], developed by Lincoln Laboratory in collaboration with NVIDIA, is integrated with tornado radar observations from China to provide a robust data foundation for model training.

The TorNet dataset facilitates advances in machine learning and deep learning-based tornado identification and detection by providing high-quality radar data. The dataset consists of full-resolution, multi-polarization weather radar data for events selected from the National Centers for Environmental Information (NCEI) Storm Events Database spanning 2013–2022, ensuring the reliability and accuracy of the data. In total, the dataset includes 203,133 radar samples, of which confirmed tornado cases account for 6.8% (13,813), false-alarm tornado warnings for 31.8% (64,510), and randomly selected non-tornado weather cases for 61.4% (124,766). This class composition, which maintains a realistic class imbalance and includes challenging non-tornadic cases with tornado-like signatures, enables learning algorithms to better distinguish true tornadoes from non-tornadic convective storms and reduce false positives. The statistical distribution of the dataset is summarized in Figure 1. Collectively, these data provide a comprehensive training resource for deep learning-based tornado detection models.

Given that tornado cases in China are primarily concentrated in eastern regions, this study selects historical tornado observations from S-band dual-polarization radars in Jiangsu Province and Guangdong Province. A total of 16 tornado events during the period 2019–2023 were collected. The S-band dual-polarization radar operates in the VCP21 volume coverage pattern, completing a full volume scan every 6 min across multiple elevation angles, including 0.5°, 1.5°, 2.4°, 3.3°, 4.3°, 6.0°, 9.9°, 14.6°, and 19.5°. The radar base data have a radial resolution of 0.25 km, an azimuthal resolution of 1°, and a maximum detection range of 230 km, and are stored in binary format.

The dataset production process is shown in Figure 2. The radar fields were divided into small patches consisting of 8 × 8 range bins (corresponding to a physical area of 2 km × 2 km), due to the small spatial scale of tornadoes and their limited coverage in radar data. A sliding window with a stride of 3 was then applied for sampling to ensure complete coverage of tornado-affected regions. The extracted patches were subsequently filtered and cleaned by removing samples with a high proportion of invalid data, as well as those located near the radar center or at far ranges, where range folding or velocity aliasing effects are significant. Finally, radar variables at the same elevation angle and spatial location, including reflectivity, radial velocity, spectrum width, differential reflectivity, correlation coefficient, and specific differential phase, were reorganized into a data matrix of 6 (data types) × 8 (radials) × 8 (range bins). Positive samples (tornado) were labeled as 1, while negative samples (non-tornado) were labeled as 0.
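
The patch extraction described above can be sketched as follows. The 8 × 8 patch size, the stride of 3, and the six-variable stacking come from the text; the invalid-data threshold, the range-gate limits, the use of `None` for invalid gates, and all function names are assumptions made for illustration. Azimuth wrap-around at 360° is ignored for brevity.

```python
PATCH = 8   # 8 radials x 8 range bins, as stated in the text
STRIDE = 3  # sliding-window stride, as stated in the text


def extract_patches(fields, max_invalid_frac=0.5, min_gate=0, max_gate=None):
    """Slide an 8 x 8 window over six co-located radar fields.

    fields: list of six 2-D lists indexed [azimuth][range]; None marks an invalid gate.
    max_invalid_frac, min_gate and max_gate are hypothetical filter settings.
    Returns a list of ((azimuth_index, range_index), patch) with patch shaped 6 x 8 x 8.
    """
    n_az, n_rng = len(fields[0]), len(fields[0][0])
    r_start = max(min_gate, 0)
    r_stop = n_rng if max_gate is None else min(max_gate, n_rng)
    patches = []
    for a0 in range(0, n_az - PATCH + 1, STRIDE):
        for r0 in range(r_start, r_stop - PATCH + 1, STRIDE):
            patch = [[row[r0:r0 + PATCH] for row in f[a0:a0 + PATCH]] for f in fields]
            values = [v for channel in patch for row in channel for v in row]
            invalid_frac = sum(v is None for v in values) / len(values)
            if invalid_frac <= max_invalid_frac:  # drop patches dominated by invalid data
                patches.append(((a0, r0), patch))
    return patches


# Synthetic example: six fields of 20 radials x 17 range bins filled with zeros.
demo_fields = [[[0.0] * 17 for _ in range(20)] for _ in range(6)]
demo_patches = extract_patches(demo_fields)
```

Setting `min_gate` and `max_gate` is one way to exclude patches near the radar or at far range; the actual cut-offs used in the study are not given in the text.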

A total of 150,177 samples were obtained, consisting of 18,215 positive (tornado) samples and 131,962 negative (non-tornado) samples. To address class imbalance, data augmentation was applied to the positive samples [24]. Specifically, stochastic perturbation-based augmentation methods were employed, including additive Gaussian noise injection, multiplicative amplitude scaling, and small random perturbations. For the negative samples, random undersampling [25] was performed to reduce the number of samples and achieve a more balanced dataset. Consequently, 36,430 positive samples and 36,430 negative samples were retained for model training.
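
A minimal sketch of this balancing step is given below. It combines multiplicative amplitude scaling with additive Gaussian noise for the positive class and random undersampling for the negative class; the noise level, scaling range, seed, and function names are assumptions, not the settings used in the study.

```python
import random


def augment(patch, rng, noise_std=0.01, scale_range=(0.95, 1.05)):
    """Return a perturbed copy of a 6 x 8 x 8 patch (hypothetical parameters)."""
    scale = rng.uniform(*scale_range)  # multiplicative amplitude scaling
    return [
        [[v * scale + rng.gauss(0.0, noise_std) for v in row] for row in channel]
        for channel in patch
    ]


def balance(positives, negatives, target, seed=0):
    """Oversample positives by augmentation and undersample negatives to `target` each."""
    rng = random.Random(seed)
    balanced_pos = list(positives)
    while len(balanced_pos) < target:
        balanced_pos.append(augment(rng.choice(positives), rng))
    balanced_neg = rng.sample(negatives, target)  # random undersampling without replacement
    return balanced_pos, balanced_neg


# Toy data: 3 positive and 20 negative patches, balanced to 6 of each.
pos_demo = [[[[1.0] * 8 for _ in range(8)] for _ in range(6)] for _ in range(3)]
neg_demo = [[[[0.0] * 8 for _ in range(8)] for _ in range(6)] for _ in range(20)]
pos_bal, neg_bal = balance(pos_demo, neg_demo, target=6)
```

In the study the positive class is doubled from 18,215 to 36,430 samples and the negative class is reduced to the same size; the toy call above mirrors that ratio at a small scale.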

3. Method

3.1. TDA-DARKNet

A novel tornado recognition algorithm, termed Tornado Detection Algorithm-Dual Attention Residual Dense Kolmogorov–Arnold Network (TDA-DARKNet), is proposed in this study. TDA-DARKNet integrates dual attention mechanisms, dense residual structures, and a Kolmogorov–Arnold network (KAN) to improve the accuracy and robustness of tornado identification while reducing the false alarm rate. The attention mechanisms guide the model to focus on critical spatial locations and feature channels associated with tornado occurrence. The dense residual structure enhances feature reuse efficiency, alleviates the vanishing gradient problem, and facilitates deeper extraction of hierarchical tornado echo features. Meanwhile, the KAN component strengthens the model’s capability to capture complex nonlinear relationships associated with tornado dynamics.

The architecture of the TDA-DARKNet model is illustrated in Figure 3. The input is a six-channel radar data tensor whose channels correspond to reflectivity, radial velocity, spectrum width, differential reflectivity, correlation coefficient, and specific differential phase. The integrated multi-source tornado dataset, combining the TorNet dataset and the China radar observations, was split into training, validation, and test sets with a ratio of 8:1:1. The training set was used for model optimization, the validation set for monitoring model performance during training and guiding hyperparameter tuning, and the test set for final performance evaluation.
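
The 8:1:1 split can be sketched with shuffled indices. The shuffling, the seed, and the function name are assumptions; the text does not state whether the split was stratified or how it was randomized.

```python
import random


def split_indices(n_samples, ratios=(8, 1, 1), seed=0):
    """Shuffle sample indices and cut them into train/validation/test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
```

Fixing the seed keeps the three subsets identical across runs, which matters when several model variants are compared under the same data split.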

The network backbone is composed of four stages:

Feature Extraction: In this stage, the input radar data patches are first processed by a standard convolutional layer, followed sequentially by batch normalization (BN), a ReLU activation function, and a max-pooling operation, in order to reduce spatial dimensionality and extract preliminary low-level spatial features. This process facilitates the capture of local texture and intensity information associated with tornado occurrence.

Dense Feature Learning: This stage is composed of alternating DenseBlocks and Transition layers. Each DenseBlock consists of multiple DenseLayers and employs dense connectivity, in which each layer receives the outputs of all preceding layers as input, thereby enhancing feature reuse and gradient propagation while improving parameter efficiency [26]. Within each DenseLayer, BN, ReLU activation, Convolution, Dropout, and a dual attention module incorporating both channel and spatial attention are sequentially applied. The dual attention mechanisms embedded within each layer enable adaptive emphasis on critical feature representations closely associated with tornado occurrence, such as rotational signatures or debris-related features. The Transition layers between DenseBlocks perform channel and spatial compression via convolution and adaptive average pooling, which reduce computational cost while preserving salient feature representations, thereby facilitating the modeling of multiscale tornado features, such as low-level rotational structures and hook echoes.

Feature Compression: After the dense feature learning stage, the feature maps are first sequentially processed by BN and ReLU activation function, and are then compressed to a fixed spatial size through adaptive average pooling to obtain a compact global feature representation. Subsequently, the compressed feature maps are flattened into a one-dimensional vector, which serves as high-level semantic features and is fed into the subsequent stage.

Final Classification: The flattened feature vector is fed into a Kolmogorov–Arnold Network (KAN), which replaces the conventional fully connected layer. This module captures the complex nonlinear relationships between radar variables and the probability of tornado occurrence through spline-based basis function representations. Finally, the model outputs a binary classification probability indicating whether a tornado is present in the given sample.
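
The effect of dense connectivity on feature-map width in the second stage can be illustrated with simple channel bookkeeping. The growth rate, layer count, stem width, and compression factor below are hypothetical, since the text does not list the model's hyperparameters.

```python
def dense_block_channels(c_in, growth_rate, num_layers):
    """Channels seen by each DenseLayer, and the channel count leaving the block.

    Each layer receives the concatenation of the block input and all earlier
    layer outputs, so its input width grows by `growth_rate` per layer.
    """
    per_layer_inputs = [c_in + i * growth_rate for i in range(num_layers)]
    return per_layer_inputs, c_in + num_layers * growth_rate


def transition_channels(c_in, compression=0.5):
    """Channel count after a Transition layer that compresses by `compression`."""
    return max(1, int(c_in * compression))


# Hypothetical configuration: 32 stem channels, growth rate 16, 4 layers per block.
layer_inputs, block_out = dense_block_channels(32, 16, 4)
after_transition = transition_channels(block_out)
```

With these assumed settings the four layers see 32, 48, 64, and 80 input channels, the block emits 96, and the Transition layer halves that to 48 before the next block.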

3.2. Kolmogorov–Arnold Network

Kolmogorov–Arnold Network (KAN) is a neural network architecture derived from the Kolmogorov–Arnold representation theorem [27]. The Kolmogorov–Arnold representation theorem was proposed by Vladimir Arnold and Andrey Kolmogorov. It states that a multivariate continuous function defined on a bounded domain can be represented as a composition of finitely many univariate continuous functions and binary addition operations, as shown in Equation (3):

f (X) = f (x_{1}, \dots, x_{n}) = \sum_{q = 1}^{2 n + 1} Φ_{q} (\sum_{p = 1}^{n} Φ_{q, p} (x_{p}))

(3)

where Φ_{q} : \mathbb{R} \to \mathbb{R} and Φ_{q, p} : [0, 1] \to \mathbb{R} are univariate continuous functions, and x_{p} represents the p-th element of vector X.

In KAN, network weights are not represented as single scalar values. Instead, each weight is replaced by a learnable univariate function parameterized with B-spline basis functions, and the nodes perform only summation of their incoming signals. Multilayer perceptrons (MLPs), by contrast, define weights as real-valued parameters and rely on fixed activation functions to introduce nonlinearity [28]. KAN can therefore learn complex functional relationships directly, and it has demonstrated high accuracy and good interpretability in small-sample scientific modeling tasks. At the network level, KAN is structurally similar to an MLP; internally, each connection is a spline. This design enables KAN to effectively learn tornado-related features and to model the extracted features with high precision.

Within the KAN framework, B-spline functions are employed to implement the univariate mapping module. This design preserves the expressive capacity of the network and improves its stability and interpretability in small-sample modeling. Compared with global polynomial fitting, B-splines provide clear advantages in modeling nonlinear relationships and capturing local features [29,30]. The construction of B-spline functions begins with the zeroth-degree (order k = 1) basis function. This function takes a value of 1 within the interval defined by two adjacent knots and 0 outside the interval, thereby exhibiting strict local support, as shown in Equation (4). Based on this, higher-order (k ≥ 2) B-spline basis functions are recursively derived using the Cox–de Boor recursion formula. Each higher-order basis function is expressed as a linear combination of two adjacent lower-order basis functions (Equation (5)). This formulation achieves higher-order continuity and smoothness while preserving locality. Finally, a continuous and smooth function with local control properties is constructed through the weighted summation of control coefficients and basis functions, as shown in Equation (6).

B_{i, 1} (x) = \begin{cases} 1, & t_{i} \leq x < t_{i + 1} \\ 0, & \text{otherwise} \end{cases}

(4)

B_{i, k} (x) = \frac{x - t_{i}}{t_{i + k - 1} - t_{i}} B_{i, k - 1} (x) + \frac{t_{i + k} - x}{t_{i + k} - t_{i + 1}} B_{i + 1, k - 1} (x)

(5)

S (x) = \sum_{i = 0}^{n} c_{i} B_{i, k} (x)

(6)
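
Equations (4)–(6) can be written as a short recursive routine. This is a sketch rather than the implementation used in the model: the knot vector and coefficients are illustrative, the order-1 basis uses a half-open interval so that two adjacent basis functions are never both 1 at a shared knot, and terms with a zero denominator (repeated knots) are treated as zero.

```python
def bspline_basis(i, k, x, knots):
    """Order-k B-spline basis B_{i,k}(x) via the recursion in Equations (4) and (5)."""
    if k == 1:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + k - 1] - knots[i]
    right_den = knots[i + k] - knots[i + 1]
    left = 0.0
    right = 0.0
    if left_den != 0:
        left = (x - knots[i]) / left_den * bspline_basis(i, k - 1, x, knots)
    if right_den != 0:
        right = (knots[i + k] - x) / right_den * bspline_basis(i + 1, k - 1, x, knots)
    return left + right


def spline(x, coeffs, k, knots):
    """Weighted sum of basis functions, S(x) in Equation (6)."""
    return sum(c * bspline_basis(i, k, x, knots) for i, c in enumerate(coeffs))


# Uniform knots with order k = 3 (quadratic) give len(knots) - k = 4 basis functions.
demo_knots = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b_mid = bspline_basis(1, 3, 2.5, demo_knots)
unity = spline(2.5, [1.0, 1.0, 1.0, 1.0], 3, demo_knots)
```

At x = 2.5 only three of the four basis functions are non-zero, which is the local-support property noted in the text, and with all coefficients set to 1 they sum to 1 inside the fully covered interval [2, 4].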

3.3. Channel Attention Mechanism

The Channel Attention Mechanism (CAM) is introduced to enable the model to automatically emphasize more informative feature subspaces. CAM [31] is designed to allocate importance across feature map channels by assigning adaptive weights to each tornado-related channel. This mechanism highlights features that contribute most to recognition while suppressing irrelevant or redundant information, thereby enhancing overall model performance.

As shown in Figure 4, the channel attention mechanism typically consists of the following three steps (taking the input feature X \in \mathbb{R}^{C \times H \times W} as an example):

(1) Channel Feature Compression: Global average pooling (GAP) and global max pooling (GMP) are first applied to compress the spatial dimensions, converting the spatial information of each channel into a single scalar. This process yields a channel-wise descriptor vector, as shown in Equation (7).

z_{c} = \frac{1}{H \times W} \sum_{i = 1}^{H} \sum_{j = 1}^{W} X_{c} (i, j)

(7)

(2) Channel Attention Weight Calculation: The channel descriptor vector is fed into a lightweight fully connected neural network, typically a two-layer perceptron, to capture inter-channel dependencies and compute the attention weights for each channel, as shown in Equation (8).

s = σ (W_{2} \cdot ReLU (W_{1} \cdot z))

(8)

Here, W_{1} and W_{2} are the weight matrices of the two fully connected layers, σ denotes the Sigmoid function, and the output s \in [0, 1]^{C} represents the weights (attention coefficients) of each channel.

(3) Feature Recalibration: Finally, each channel in the original feature map is multiplied by its corresponding weight through channel-wise multiplication, thereby completing the weighted recalibration of channels, as shown in Equation (9).

X_{c}^{'} = s_{c} \cdot X_{c}

(9)
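
The three steps can be traced in a small pure-Python sketch. It follows Equations (7)–(9) literally, so only the average-pooling branch is shown even though the text also mentions max pooling; the weight matrices are tiny hypothetical values, not trained parameters.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def channel_attention(x, w1, w2):
    """x: [C][H][W] feature map; w1: [C_r][C]; w2: [C][C_r]. Returns (weights, output)."""
    # Equation (7): global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Equation (8): two-layer perceptron with ReLU, then Sigmoid.
    hidden = [max(0.0, h) for h in matvec(w1, z)]
    s = [sigmoid(v) for v in matvec(w2, hidden)]
    # Equation (9): rescale every channel by its attention weight.
    out = [[[s_c * v for v in row] for row in ch] for s_c, ch in zip(s, x)]
    return s, out


# Two channels of size 2 x 2 with hypothetical weights.
x_demo = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
s_demo, out_demo = channel_attention(x_demo, w1=[[1.0, 1.0]], w2=[[0.0], [1.0]])
```

Here the first channel receives weight 0.5 and is halved, while the second receives a weight close to 1 and passes through almost unchanged.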

3.4. Spatial Attention Mechanism

The Spatial Attention Mechanism (SAM) is an attention module in neural networks that dynamically assigns importance to spatial positions [30]. SAM aims to guide the model in automatically identifying spatial regions that require greater “attention”, thereby emphasizing key areas associated with tornado occurrence and suppressing irrelevant background, ultimately enhancing the model’s feature representation capability.

As shown in Figure 5, the spatial attention mechanism typically consists of the following three steps (taking the input feature X \in \mathbb{R}^{C \times H \times W} as an example):

(1) Channel Dimension Compression: Global max pooling and global average pooling are first applied along the channel dimension of an input feature X with size C × H × W, producing two feature maps of size 1 × H × W, as shown in Equations (10) and (11).

X_{max} = {MaxPool}_{c} (X)

(10)

X_{avg} = {AvgPool}_{c} (X)

(11)

(2) Spatial Attention Weight Calculation: Subsequently, the two pooled feature maps are concatenated along the channel dimension and passed through a convolutional layer with a kernel size of 7 × 7. Finally, a Sigmoid activation function is applied to generate the spatial attention weight map M_{s}, as shown in Equation (12).

M_{s} = σ (f^{7 \times 7} ([X_{max}; X_{avg}]))

(12)

Here, f^{7 \times 7} (\cdot) denotes the convolution operation with a 7 × 7 kernel, and σ represents the Sigmoid function.

(3) Feature Recalibration: Finally, each spatial element in the original feature map is multiplied element-wise by its corresponding weight, completing the spatial-wise weighted processing, as shown in Equation (13).

X^{'} = M_{s} \otimes X

(13)
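
Equations (10)–(13) can likewise be sketched without a deep learning framework. Zero padding is assumed so that the attention map keeps the input size, and the kernels are hypothetical placeholders rather than learned weights.

```python
import math


def spatial_attention(x, kernel, bias=0.0):
    """x: [C][H][W]; kernel: [2][K][K] applied to the stacked [max; avg] maps."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    # Equations (10) and (11): pool along the channel dimension.
    x_max = [[max(x[ch][i][j] for ch in range(c)) for j in range(w)] for i in range(h)]
    x_avg = [[sum(x[ch][i][j] for ch in range(c)) / c for j in range(w)] for i in range(h)]
    k = len(kernel[0])
    pad = k // 2
    m_s = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = bias
            for m, fmap in enumerate((x_max, x_avg)):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < h and 0 <= jj < w:  # zero padding outside the map
                            acc += kernel[m][di][dj] * fmap[ii][jj]
            m_s[i][j] = 1.0 / (1.0 + math.exp(-acc))  # Equation (12)
    # Equation (13): weight every channel by the same spatial map.
    out = [[[m_s[i][j] * x[ch][i][j] for j in range(w)] for i in range(h)] for ch in range(c)]
    return m_s, out


# Two 8 x 8 channels: one varies with the row index, the other with the column index.
x_demo = [
    [[0.1 * i for _ in range(8)] for i in range(8)],
    [[0.1 * j for j in range(8)] for _ in range(8)],
]
zero_kernel = [[[0.0] * 7 for _ in range(7)] for _ in range(2)]
m_zero, out_zero = spatial_attention(x_demo, zero_kernel)

peak_kernel = [[[0.0] * 7 for _ in range(7)] for _ in range(2)]
peak_kernel[0][3][3] = 1.0  # pass the max-pooled map through unchanged
m_peak, out_peak = spatial_attention(x_demo, peak_kernel)
```

An all-zero kernel yields a uniform weight of 0.5 everywhere, while a kernel that only passes the centre of the max-pooled map makes the weight at each position depend on the strongest channel response there.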

3.5. Model Evaluation

When evaluating binary classification models in deep learning, a binary confusion matrix is commonly used to assess model performance, as shown in Table 1. TP denotes the number of positive samples correctly classified as positive, corresponding to correctly identified tornadoes. FP denotes the number of negative samples incorrectly classified as positive, corresponding to falsely identified tornadoes. FN denotes the number of positive samples incorrectly classified as negative, corresponding to missed tornadoes. TN denotes the number of negative samples correctly classified as negative, corresponding to correctly identified non-tornado cases.

Based on the confusion matrix, several evaluation metrics are used to assess model performance, including accuracy (ACC; Equation (14)), precision (PRE; Equation (15)), recall (Equation (16)), F1-score (Equation (17)), probability of detection (POD; Equation (18)), false alarm ratio (FAR; Equation (19)), critical success index (CSI; Equation (20)), and Matthews correlation coefficient (MCC; Equation (21)). Recall and POD share the same definition.

A C C = \frac{T P + T N}{T P + F N + F P + T N}

(14)

P R E = \frac{T P}{T P + F P}

(15)

R e c a l l = \frac{T P}{T P + F N}

(16)

F_{1 - score} = \frac{2 \cdot P R E \cdot R e c a l l}{P R E + R e c a l l}

(17)

P O D = \frac{T P}{T P + F N}

(18)

F A R = \frac{F P}{T P + F P}

(19)

C S I = \frac{T P}{T P + F N + F P}

(20)

M C C = \frac{T P \times T N - F P \times F N}{\sqrt{(T P + F P) (T P + F N) (T N + F P) (T N + F N)}}

(21)
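As a minimal sketch, the metrics in Equations (14)–(21) can be computed directly from the four confusion-matrix counts. The function name `classification_metrics` and the counts used below are hypothetical and chosen only for illustration; they are not taken from Table 2. The MCC is computed in its standard form, with denominator (TP + FP)(TP + FN)(TN + FP)(TN + FN).

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Return the metrics of Equations (14)-(21) from confusion-matrix counts."""
    pre = tp / (tp + fp)
    pod = tp / (tp + fn)  # identical to Recall, Eq. (16)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": (tp + tn) / (tp + fn + fp + tn),
        "PRE": pre,
        "Recall": pod,
        "F1": 2 * pre * pod / (pre + pod),
        "POD": pod,
        "FAR": fp / (tp + fp),
        "CSI": tp / (tp + fn + fp),
        "MCC": (tp * tn - fp * fn) / mcc_den,
    }


# Hypothetical counts for illustration only (not the paper's results).
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Two identities are worth noting: POD and Recall are the same quantity, and FAR = 1 − PRE, so the seven reported scores carry fewer independent pieces of information than their number suggests.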

To comprehensively evaluate model performance, five common deep learning classification models (ResNet, DenseNet, EfficientNet, ConvNeXt, and MLP-Mixer) were compared with the proposed TDA-DARKNet. The evaluation results on the test set are summarized in Table 2.

The experimental results indicate that TDA-DARKNet outperforms all comparative models in the tornado identification task. The model achieves an ACC of 0.9711, with PRE and F1-score reaching 0.9508 and 0.9717, respectively, demonstrating high accuracy and strong performance balance. Its POD reaches 0.9936, indicating a significantly reduced miss rate, while the FAR is as low as 0.0492, reflecting strong false alarm suppression capability. In addition, the CSI and MCC attain 0.9450 and 0.9431, respectively, further verifying the model’s stability and discriminative ability under complex radar interference conditions. Overall, TDA-DARKNet exhibits superior performance across key classification and warning metrics, demonstrating strong potential for operational application.
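The reported scores can be cross-checked against one another, because FAR, the F1-score, and CSI are all functions of PRE and POD. The short sketch below uses only the PRE and POD values quoted above; the CSI expression is an algebraic rearrangement of Equations (15), (18), and (20), not a formula stated in the paper.

```python
# Values reported for TDA-DARKNet in the text above.
pre, pod = 0.9508, 0.9936

far = 1 - pre                       # FAR = FP / (TP + FP) = 1 - PRE
f1 = 2 * pre * pod / (pre + pod)    # Eq. (17), with Recall = POD
csi = 1 / (1 / pre + 1 / pod - 1)   # 1/PRE + 1/POD - 1 = (TP + FP + FN) / TP

print(f"FAR = {far:.4f}, F1 = {f1:.4f}, CSI = {csi:.4f}")
```

To four decimal places the derived values agree with the reported FAR, F1-score, and CSI, which indicates that the quoted metrics are mutually consistent.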

To evaluate the influence of each module on model performance, ResDenseNet was used as the baseline model, and different variants were constructed by introducing the channel attention module (CA), the spatial attention module (SA), the convolutional block attention module (CBAM), and the KAN module. The effects of individual modules were first compared, followed by an analysis of the performance changes when attention mechanisms were combined with the KAN module. All experiments were conducted under the same data split and training settings to ensure fairness. This design enables a quantitative assessment of the contribution of each module and verifies the effectiveness of the final TDA-DARKNet model.

The ablation study results (Table 3) systematically evaluate the contributions of different modules to the model performance. The baseline demonstrates strong detection capability (POD = 0.9904). However, it suffers from a relatively high false alarm ratio (FAR = 0.0667), indicating limitations in distinguishing strongly rotating but non-tornadic cases. The introduction of CA or SA alone does not lead to consistent performance improvements and even degrades some metrics. For example, incorporating CA increases the FAR to 0.0790, suggesting that a single attention mechanism may introduce redundant features or amplify non-essential echo information. Similarly, the CBAM module (CA + SA) fails to provide noticeable gains, indicating that conventional attention mechanisms alone have limited effectiveness in tornado identification tasks. In contrast, incorporating the KAN module leads to substantial performance improvements (ACC increases from 0.9598 to 0.9679, while FAR decreases to 0.0495), demonstrating its advantage in modeling complex nonlinear relationships. The proposed TDA-DARKNet achieves the best performance across all evaluation metrics, maintaining an extremely high detection rate (POD = 0.9936) while reducing the false alarm ratio (FAR = 0.0492), thereby validating the rationality and effectiveness of the proposed model architecture.

The channel and spatial attention module can adaptively assign importance weights to different features. Specifically, channel attention emphasizes the relative contributions of key meteorological variables (e.g., velocity and reflectivity variables), while spatial attention focuses on critical regions within severe convective systems that exhibit pronounced dynamical or microphysical characteristics (e.g., high reflectivity regions). The Kolmogorov–Arnold Network (KAN) explicitly learns nonlinear functional relationships between input features and outputs. Compared with traditional black-box neural networks, its functional form is more interpretable. This structure enables the model to more clearly characterize the nonlinear response relationships between the key features highlighted by the channel and spatial attention module and tornado occurrence, thereby revealing potential underlying physical connections to some extent.

Considering that performance differences among deep learning models on this dataset are relatively limited, comparisons restricted solely to deep learning architectures may not fully reflect the effectiveness of algorithmic improvements. Therefore, a representative machine learning model, TDA-XGB [21], is further introduced as a baseline to evaluate performance enhancement from a different technical perspective, as shown in Table 4. TDA-XGB is a tornado identification approach based on Extreme Gradient Boosting (XGBoost). Its core principle lies in constructing a high-accuracy strong classifier through an ensemble learning framework that sequentially aggregates multiple weak learners (CART regression trees). The method minimizes an objective function composed of a loss term and a regularization term, thereby improving predictive accuracy while effectively controlling model complexity and reducing the risk of overfitting. The model is trained iteratively using a gradient boosting strategy, where each round takes the residuals of the current model as the learning target to progressively refine the decision boundary and ultimately output the probability of tornado occurrence. In terms of feature construction, TDA-XGB relies on manual feature engineering based on dual-polarization radar data. It extracts statistical descriptors, including the maximum, minimum, and mean values, of reflectivity, radial velocity, spectrum width, differential reflectivity, and correlation coefficient. In addition, dynamic features such as velocity difference, velocity shear, angular momentum, and rotational velocity are derived to construct a multidimensional feature vector for classification.
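The handcrafted feature construction described above can be illustrated with a short sketch. Everything in it is hypothetical: the variable names, the sample values, and the function `handcrafted_features` are invented for the example, and the velocity difference is assumed here to be the maximum minus the minimum radial velocity in the region. It is not the TDA-XGB code.

```python
from statistics import mean

# Hypothetical values inside one candidate region; real inputs would be
# dual-polarization radar gates. Names and numbers are illustrative only.
region = {
    "reflectivity": [48.0, 52.5, 55.0, 50.5],            # dBZ
    "radial_velocity": [-18.0, -6.5, 9.0, 21.5],         # m/s
    "spectrum_width": [4.0, 6.5, 7.0, 5.5],              # m/s
    "differential_reflectivity": [0.4, 0.9, 1.3, 0.6],   # dB
    "correlation_coefficient": [0.82, 0.91, 0.95, 0.88],
}


def handcrafted_features(region):
    """Flatten per-variable max/min/mean into one named feature vector."""
    features = {}
    for name, values in region.items():
        features[f"{name}_max"] = max(values)
        features[f"{name}_min"] = min(values)
        features[f"{name}_mean"] = mean(values)
    # Assumed definition: velocity difference = max minus min radial velocity.
    v = region["radial_velocity"]
    features["velocity_difference"] = max(v) - min(v)
    return features


features = handcrafted_features(region)
print(len(features), features["velocity_difference"])
```

Statistics such as the maximum, minimum, and mean do not depend on where each gate lies within the region, so any rearrangement of the same values produces an identical feature vector. This is one way to see why spatial structure is compressed in a feature-engineered pipeline.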

Overall, machine learning approaches typically rely on manually engineered statistical features or predefined physical criteria. As a result, their capability to characterize the complex spatial structures and weak rotational signatures associated with tornadoes is relatively limited, making them more prone to missed detections in weak tornado cases or during the early developmental stages. In contrast, the proposed TDA-DARKNet demonstrates higher detection sensitivity in the tornado identification task. This improvement can be attributed to its deep feature learning framework, which is capable of extracting more discriminative spatial structural information directly from dual-polarization radar base data and enhancing the response to critical tornado echo regions. Consequently, the model maintains strong identification performance even during the early development phase of tornadoes and under weak-signal conditions. Furthermore, with the incorporation of the Kolmogorov–Arnold Network (KAN), the model achieves a more comprehensive approximation of complex nonlinear relationships, thereby further improving overall recognition robustness.

4. Typical Case Analysis and Generalization Capability Validation

Due to the pronounced case-to-case variability and sudden nature of tornadoes, a reliable model must not only achieve strong performance on evaluation metrics but also maintain stable identification capability across typical events and varying observational conditions. Therefore, this section conducts validation analyses using multiple independent tornado cases from Jiangsu and Guangdong Provinces. Specifically, two representative tornado events are selected to evaluate the ability of TDA-DARKNet to identify tornadoes of different intensities: an EF2 tornado that occurred in Guangzhou, Guangdong, on 27 April 2024, and an EF0 tornado that occurred in Xuzhou, Jiangsu, on 17 September 2024. On this basis, statistical identification results from 19 tornado events are further analyzed to comprehensively assess the overall generalization capability of the proposed model. The tornado damage survey data used in this study were obtained from local meteorological agencies and are derived from field investigations combined with multiple data sources; they are used to determine the occurrence time, path, and touchdown location of each tornado.

4.1. EF2 Tornado in Guangzhou, Guangdong on 27 April 2024

At 15:00 on 27 April 2024, a tornado struck Liangtian Village in Zhongluotan Town, Baiyun District, Guangzhou, Guangdong Province. According to the damage survey, this event had a path length of approximately 8.2 km and an average width of about 700 m. The tornado was classified as an EF2 event, indicating a strong tornado. Eight volume scans from the Z9200 radar between 14:18 and 15:00 were used to evaluate this tornado case. The model identification results are marked by black rectangles on the radar Plan Position Indicator (PPI) images.

The TDA-DARKNet model first detected early tornado signatures at 14:18. As shown in Figure 6, hook echo signatures exhibited a clear evolutionary pattern across all elevation angles in the reflectivity PPI images.

Figure 7 illustrates the identification results of TDA-DARKNet at 15:00. According to damage survey records, the tornado had already touched down by this time. The identified region exhibits high reflectivity, accompanied by pronounced positive-negative radial velocity couplets. Low values of differential reflectivity and correlation coefficient are also observed. These features indicate a typical TDS in the radar PPI images.

In this case, the TDA-XGB model first produced a valid detection at 14:48, providing a lead time of approximately 12 min. In contrast, TDA-DARKNet was able to capture potential tornado development features at an earlier stage, achieving a lead time of up to 42 min. The feature-engineered TDA-XGB relies on regional statistical metrics or threshold-based criteria as input variables. During feature construction, part of the spatial structural information contained in radar echoes is inevitably compressed. As a result, its capability to characterize fine-scale morphological features, such as mesocyclones and hook echoes, is relatively limited, making it difficult to identify tornado signatures during the early formation stage or when the signals are not yet fully developed.
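The lead times quoted for this case follow directly from the first valid detection time of each model and the 15:00 touchdown time. A minimal sketch of that arithmetic is given below; the helper name `lead_time_minutes` is hypothetical.

```python
from datetime import datetime


def lead_time_minutes(first_detection, touchdown, fmt="%H:%M"):
    """Minutes between a model's first valid detection and tornado touchdown."""
    delta = datetime.strptime(touchdown, fmt) - datetime.strptime(first_detection, fmt)
    return int(delta.total_seconds() // 60)


touchdown = "15:00"  # touchdown time from the damage survey
print(lead_time_minutes("14:18", touchdown))  # TDA-DARKNet -> 42
print(lead_time_minutes("14:48", touchdown))  # TDA-XGB -> 12
```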

As shown in Figure 8, we analyzed the tornado positions identified by TDA-DARKNet at the same elevation between 14:18 and 15:00. The tornado movement path identified by the model exhibits an overall west to east-southeast trajectory, which is generally consistent with the post-event damage survey results. This agreement indicates that TDA-DARKNet is capable of accurately capturing the spatiotemporal evolution of the tornado.

4.2. EF0 Tornado in Xuzhou, Jiangsu on 17 September 2024

At 12:21, the TDA-DARKNet model detected tornado signatures at the 1.5° and 2.4° elevation angles, as shown in Figure 9. At the 1.5° elevation, the detected region exhibited high reflectivity. The positive-negative radial velocity couplets reached 27 m·s⁻¹ and −22.5 m·s⁻¹, respectively. These features indicate a typical tornado vortex signature.

Figure 10 shows the identification results of the Z9516 radar at 12:32 (UTC + 8) for the 1.5° and 2.4° elevations. This scan can be regarded as the radar observation closest to the touchdown time. As shown in the figure, the detected regions at the 1.5° elevation exhibit high reflectivity. Hook echo signatures and the corresponding positive-negative velocity couplets are clearly visible.

In this weak tornado case, TDA-XGB failed to provide a valid detection result, whereas TDA-DARKNet successfully identified tornado signatures 9 min in advance. This outcome indicates that the deep learning approach, by enhancing the representation of small-scale spatial structures in radar echoes and capturing complex nonlinear relationships, exhibits greater sensitivity to weak rotational features and early-stage tornado signals. Consequently, it demonstrates clear advantages in terms of tornado detection probability and warning lead time.

4.3. Generalization Capability Validation

To further evaluate the generalization capability of the proposed model under different regional and radar observational conditions, a total of 19 independent tornado events from Jiangsu, Guangdong, and Anhui Provinces were selected for validation, as summarized in Table 5. In the table, the Detection value of 1 indicates that a tornado event was successfully identified by the model, while the False Alarm value of 1 indicates that the model issued a false alarm. When both Detection and False Alarm values are equal to 1, it indicates that the model issued multiple tornado warnings, including at least one correct detection (hit) along with additional false alarms. Considering that model performance is sensitive to input data quality, rigorous quality control and screening procedures were conducted prior to the generalization assessment. Cases with severe missing data, significant ground clutter contamination, or substantial influence from radar beam blockage and detection blind zones were excluded to ensure the reliability and comparability of the statistical results. It should be noted that for cases with relatively poor data quality, the model’s detection performance may still be affected to some extent. This observation further highlights the critical importance of high-quality observational data in deep learning-based tornado identification tasks.
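The Detection and False Alarm flags described above can be aggregated into event-level statistics as sketched below. The rows are hypothetical placeholders written in the Table 5 encoding and are not the paper's data; the field names and the `summarize` helper are likewise assumptions made for the example.

```python
# Hypothetical rows in the Table 5 encoding; these are NOT the paper's data.
events = [
    {"detection": 1, "false_alarm": 0, "lead_min": 12},
    {"detection": 1, "false_alarm": 1, "lead_min": 8},     # hit plus extra false alarms
    {"detection": 0, "false_alarm": 0, "lead_min": None},  # missed event
    {"detection": 1, "false_alarm": 0, "lead_min": 40},
]


def summarize(events):
    """Aggregate per-event flags into detection and lead-time statistics."""
    hits = [e for e in events if e["detection"] == 1]
    leads = [e["lead_min"] for e in hits]
    return {
        "detection_rate": len(hits) / len(events),
        "events_with_false_alarm": sum(e["false_alarm"] for e in events),
        # Lead time is averaged over successfully detected events only.
        "mean_lead_min": sum(leads) / len(leads) if leads else None,
    }


summary = summarize(events)
print(summary)
```

Note that an event with both flags set still counts as a hit; the False Alarm flag records only that at least one additional, incorrect warning accompanied it.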

The statistical results indicate that TDA-DARKNet provides effective detections in the majority of cases, maintaining a relatively high overall detection rate. The number of false alarm cases is limited, suggesting that the model retains good stability and robustness under complex severe convective conditions. In terms of warning lead time, the model achieves varying degrees of advance detection in all successfully identified cases, with an average lead time of approximately 13 min, demonstrating strong potential for operational application. For most events, the lead time is concentrated within the 5–15 min range, while in several intense tornado cases, the advance warning time exceeds 40 min. This result indicates that the model is capable of capturing key structural and dynamical features during the early stages of tornado development.

Regarding intensity distribution, the generalization test samples cover tornadoes ranging from EF0 to EF3. The model not only consistently identifies strong tornado events of EF2 and above, but also exhibits reliable detection capability for weaker EF0–EF1 tornado cases. This finding suggests that the deep learning approach maintains strong feature extraction and discrimination ability even under weak rotational signal conditions. Cross-regional statistical analysis further demonstrates that the model achieves generally consistent detection performance under different radar observational conditions in Jiangsu and Guangdong Provinces, highlighting its regional adaptability and generalization capability.

Overall, the multi-event statistical results confirm that TDA-DARKNet maintains strong detection stability and generalization performance across different intensity levels, regional environments, and radar observational conditions. However, this conclusion still requires further validation across a broader range of regions and more diverse sample conditions. Furthermore, the implementation of rigorous data quality control procedures provides a solid foundation for reliable model performance evaluation and underscores the practical necessity of high-quality radar data support for future operational deployment.

5. Discussion

In the EF0 tornado case that occurred in Xuzhou in September 2024, the TDA-DARKNet model first detected tornado features at 12:21, providing approximately 9 min of lead time. A comparison of the two tornado cases indicates that TDA-DARKNet provides substantially earlier warnings for the EF2 strong tornado (up to 42 min) than for the EF0 weak tornado (approximately 9 min).

This difference is primarily attributed to more pronounced and longer-lasting mesocyclone echo signatures during the early stages of strong tornadoes. These signatures allow the TDA-DARKNet model to capture key features earlier, thereby improving detection accuracy and response timeliness. EF0 tornadoes are generally smaller in spatial scale and shorter in duration. Their structural features are often less distinct and less stable than those of strong tornado systems. As a result, weak tornadoes often lack clear mesocyclone signatures or classic echo structures in radar base data. This limitation increases detection difficulty and reduces early-warning effectiveness.

In the Guangzhou event, tornado debris signature (TDS) features became clearly identifiable approximately two to three volume scans before touchdown. However, in the Xuzhou EF0 case, TDS features were rarely observed throughout the entire tornado life cycle. This result suggests that weak tornadoes typically generate limited near-surface debris, making TDS-based detection challenging. TDS therefore has limited utility for advance tornado warnings, especially for weak tornado events, although it remains a highly reliable indicator for confirming tornado touchdown. Early warning of weak tornadoes should thus rely more heavily on other radar features, particularly strong low-level rotational signatures, such as tornado vortex signatures (TVS) or near-surface mesocyclones.

From the preceding discussion, the following findings can be drawn:

1. Radar remains the primary tool for tornado detection, but it has inherent limitations. Radar can identify potential or ongoing tornadoes by detecting mesocyclones and tornado vortex signatures (TVS) within thunderstorms. However, not all detected vortices result in tornado touchdown. The detection of a mesocyclone does not necessarily indicate that a tornado will reach the ground. Issuing warnings for every mesocyclone would lead to excessive false alarms, potentially triggering a “cry wolf” effect among the public. Moreover, not all tornadoes exhibit clear radar signatures. Some tornadoes, particularly non-supercell tornadoes, have very short lifespans and weak or indistinct rotational signatures. In addition, radar observations are constrained by temporal and spatial resolution, as scans are performed at discrete time intervals. A tornado may form and cause damage between successive scans. The integration of high-resolution phased-array radar systems with environmental field nowcasting represents a key direction for improving tornado detection and early warning.

2. The average lifespan of a tornado is approximately 10–20 min. The process from radar identification to warning issuance must be completed within a very limited time window. In the United States, the average effective tornado warning lead time is approximately 13–15 min. In China, effective warning lead times are often shorter, due to the smaller scale and rapid evolution of tornadoes. Tornado warnings must clearly specify the threatened areas. However, the highly unpredictable nature of tornado paths complicates accurate impact-area forecasting. As a result, warning regions may be overly broad or insufficiently targeted. Effective tornado warning requires the integration of detection and path prediction algorithms, together with rapid information dissemination through mobile emergency alerts and social media platforms, to ensure timely public response.

6. Conclusions

An improved deep learning model, TDA-DARKNet, was proposed for tornado identification based on radar base data. The model integrates dual attention mechanisms, dense residual structures, and the Kolmogorov–Arnold Network (KAN), and demonstrates strong performance in improving detection accuracy, reducing false alarm rates, and enhancing the recognition capability for weak tornadoes. The main findings are summarized as follows:

1. TDA-DARKNet integrates the spatial feature extraction capabilities of convolutional neural networks with both channel and spatial attention mechanisms to strengthen the response to key variables and spatial regions. Additionally, the KAN module enhances its capacity to model complex nonlinear relationships. TDA-DARKNet achieved an accuracy of 97.11% and a probability of detection of 99.36% on the test set, with a false alarm ratio of only 4.92%, which effectively mitigates the trade-off between detection capability and false alarms in tornado detection.

2. The typical case analyses further demonstrate the practical application potential of the proposed model across tornadoes of different intensities. In the strong tornado event on 27 April 2024, TDA-DARKNet achieved a maximum warning lead time of 42 min, and the identified movement path was generally consistent with the post-event damage survey results. In the weak tornado case, the model successfully provided a valid detection approximately 9 min in advance, whereas TDA-XGB failed to identify the event and resulted in a missed detection. These results indicate that the deep learning approach exhibits higher detection sensitivity under weak rotational or non-typical structural conditions.

3. The cross-regional multi-event generalization results further indicate that, across 19 independent tornado cases from Jiangsu, Guangdong, and Anhui Provinces, the model maintains a consistently high overall detection rate and demonstrates stable performance under varying radar observational conditions. Most successfully detected events achieved minute-level advance warnings, with lead times exceeding 40 min in several strong tornado cases. Although a small number of false alarms were observed, the overall false alarm level remains effectively controlled compared with the relatively high false alarm rates commonly encountered in current operational tornado detection systems. These findings suggest that TDA-DARKNet exhibits strong detection stability and a certain degree of regional generalization capability across different regions, intensity levels, and observational conditions.

4. The model performs automatic detection directly based on radar base data, eliminating the need for manual feature engineering and thereby simplifying the data processing workflow. Meanwhile, the incorporation of the KAN module enhances the modeling of complex nonlinear relationships, improving the model’s generalization performance and robustness under limited sample conditions. The results suggest that the transition from “handcrafted feature-driven” approaches to “deep representation-driven” frameworks constitutes a key technological pathway for improving tornado detection probability and extending warning lead time.

Current tornado identification methods primarily rely on radar echo data. However, dependence on a single data source remains a key limitation. Tornadoes are highly variable and complex, and the limited spatial coverage and viewing geometry of individual radars make it difficult to fully capture their three-dimensional evolution. In practice, radar-based detection is not only constrained by spatial resolution but also affected by Earth curvature and atmospheric refraction. At longer ranges, even the lowest elevation scans may fail to capture near-surface information, making it challenging to determine whether a mesocyclone will extend to the ground and produce a tornado. In this context, in addition to long-range S-band radars, the incorporation of X-band radars to fill observational gaps is particularly important, especially as phased-array radars offer significant advantages in both spatial and temporal resolution. In regions such as Jiangsu and Guangzhou, radar networks integrating large-scale radars with X-band phased-array systems have already been deployed. Future efforts should further emphasize multi-source data fusion and collaborative multi-radar networks, promoting the development of intelligent, grid-based, and high-precision tornado identification systems, thereby providing stronger support for operational forecasting.
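The effect of Earth curvature and atmospheric refraction on low-level coverage can be illustrated with the standard 4/3 effective Earth radius approximation for beam-centre height. The sketch below is a textbook calculation rather than part of the proposed method; the function name `beam_height_km`, the 0.5° elevation, and the chosen ranges are assumptions for the example, and real propagation conditions can depart from standard refraction.

```python
import math

EARTH_RADIUS_KM = 6371.0
K_EFF = 4.0 / 3.0  # effective Earth radius factor under standard refraction


def beam_height_km(range_km, elev_deg, antenna_height_km=0.0):
    """Beam-centre height above the radar's reference level (4/3 Earth model)."""
    ae = K_EFF * EARTH_RADIUS_KM
    theta = math.radians(elev_deg)
    return (
        math.sqrt(range_km**2 + ae**2 + 2.0 * range_km * ae * math.sin(theta))
        - ae
        + antenna_height_km
    )


for r in (50, 100, 150):
    print(f"{r} km: {beam_height_km(r, 0.5):.2f} km")
```

Under these assumptions, the centre of a 0.5° beam is already roughly 1.5 km above the radar level at a range of 100 km, so rotation confined to the lowest few hundred metres can go unobserved. This is the gap that closer-range X-band radars are intended to fill.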

Author Contributions

Conceptualization, G.Z., Q.Z. and F.Z.; methodology, G.Z., Q.Z. and F.Z.; software, G.Z., Q.Z. and F.Z.; validation, G.Z., Q.Z., F.Z. and H.W.; formal analysis, G.Z., F.Z. and T.Y.; investigation, Q.Z. and F.Z.; resources, Q.Z. and F.Z.; data curation, Q.Z.; writing—original draft preparation, G.Z.; writing—review and editing, G.Z., Q.Z., F.Z. and H.W.; visualization, G.Z., T.Y. and H.W.; supervision, Q.Z. and F.Z.; project administration, Q.Z. and F.Z.; funding acquisition, Q.Z. and F.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China (U2342216 and 42575154), National Natural Science Foundation of Sichuan Province (2026NSFSC0208), China Meteorological Administration Tornado Key Laboratory (TKL202309), Key Laboratory of South China Sea Meteorological Disaster Prevention and Mitigation of Hainan Province (SCSF202503), Key Laboratory of Intelligent Meteorological Observation Technology, CMA (ZNGC2024QN03), Key Laboratory of Atmosphere Sounding, CMA (2023KLAS09M, 2024KLAS04M).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors gratefully acknowledge the S-band radar tornado information provided by Jiangsu Province and the Foshan Meteorological Bureau. The authors also thank the researchers whose published literature is used and cited in this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Fan, W.; Yu, X. Characteristics of Spatial–Temporal Distribution of Tornadoes in China. Meteorol. Mon. 2015, 41, 793–805.
2. Zheng, Y.; Liu, F.; Zhang, H. Advances in tornado research in China. Meteorol. Mon. 2021, 47, 1319–1335.
3. Yu, X.; Zheng, Y. Research and Operational Progress of Contemporary Severe Convective Weather in China. Acta Meteorol. Sin. 2020, 78, 391–418.
4. Ćwik, P.; Furtado, J.C.; McPherson, R.A.; Taszarek, M. Major May tornado outbreaks in the United States: Novel multiscale atmospheric patterns identified using maximum covariance analysis. Atmos. Res. 2025, 315, 107872.
5. Artekha, S. Some generalizations of the tornado model. Atmos. Res. 2025, 329, 108541.
6. Zrnić, D.; Burgess, D.; Hennington, L. Automatic detection of mesocyclonic shear with Doppler radar. J. Atmos. Ocean. Technol. 1985, 2, 425–438.
7. Wakimoto, R.M.; Wilson, J.W. Non-supercell tornadoes. Mon. Weather Rev. 1989, 117, 1113–1140.
8. Lack, S.A.; Fox, N.I. Development of an automated approach for identifying convective storm type using reflectivity-derived and near-storm environment data. Atmos. Res. 2012, 116, 67–81.
9. Ryzhkov, A.V.; Schuur, T.J.; Burgess, D.W.; Zrnic, D.S. Polarimetric tornado detection. J. Appl. Meteorol. 2005, 44, 557–570.
10. Bluestein, H.B.; French, M.M.; Tanamachi, R.L.; Frasier, S.; Hardwick, K.; Junyent, F.; Pazmany, A.L. Close-range observations of tornadoes in supercells made with a dual-polarization, X-band, mobile Doppler radar. Mon. Weather Rev. 2007, 135, 1522–1543.
11. Kumjian, M.R.; Ryzhkov, A.V. Polarimetric signatures in supercell thunderstorms. J. Appl. Meteorol. Climatol. 2008, 47, 1940–1961.
12. Snyder, J.C.; Ryzhkov, A.V. Automated detection of polarimetric tornadic debris signatures using a hydrometeor classification algorithm. J. Appl. Meteorol. Climatol. 2015, 54, 1861–1870.
13. Zhang, Y.; Bai, L.Q.; Meng, Z.Y.; Chen, B.H.; Tian, C.C.; Fu, P.L. Rapid-scan and polarimetric phased-array radar observations of a tornado in the Pearl River Estuary. J. Trop. Meteorol. 2021, 27, 81–86.
14. Wang, Y.; Yu, T.Y. Novel tornado detection using an adaptive neuro-fuzzy system with S-band polarimetric weather radar. J. Atmos. Ocean. Technol. 2015, 32, 195–208.
15. Hill, A.J.; Herman, G.R.; Schumacher, R.S. Forecasting severe weather with random forests. Mon. Weather Rev. 2020, 148, 2135–2161.
16. Zeng, Q.; Qing, Z.; Zhu, M.; Zhang, F.; Wang, H.; Liu, Y.; Shi, Z.; Yu, Q. Application of random forest algorithm on tornado detection. Remote Sens. 2022, 14, 4909.
17. Basalyga, J.N.; Barajas, C.A.; Gobbert, M.K.; Wang, J. Performance benchmarking of parallel hyperparameter tuning for deep learning based tornado predictions. Big Data Res. 2021, 25, 100212.
18. Sandmæl, T.N.; Smith, B.R.; Reinhart, A.E.; Schick, I.M.; Ake, M.C.; Madden, J.G.; Steeves, R.B.; Williams, S.S.; Elmore, K.L.; Meyer, T.C. The tornado probability algorithm: A probabilistic machine learning tornadic circulation detection algorithm. Weather Forecast. 2023, 38, 445–466.
19. McGuire, M.P.; Moore, T.W. Prediction of tornado days in the United States with deep convolutional neural networks. Comput. Geosci. 2022, 159, 104990.
20. Xie, J.; Zhou, K.; Chen, H.; Han, L.; Guan, L.; Wang, M.; Zheng, Y.; Chen, H.; Mao, J. Multi-task learning for tornado identification using Doppler radar data. Geophys. Res. Lett. 2024, 51, e2024GL108809.
21. Zeng, Q.; Zhang, G.; Huang, S.; Song, W.; He, J.; Wang, H.; Liu, Y. A novel tornado detection algorithm based on XGBoost. Remote Sens. 2025, 17, 167.
22. Zhao, K.; Huang, H.; Wang, M.; Lee, W.C.; Chen, G.; Wen, L.; Wen, J.; Zhang, G.; Xue, M.; Yang, Z.; et al. Recent progress in dual-polarization radar research and applications in China. Adv. Atmos. Sci. 2019, 36, 961–974.
23. Veillette, M.S.; Kurdzo, J.M.; Stepanian, P.M.; Cho, J.Y.; Reis, T.; Samsi, S.; McDonald, J.; Chisler, N. A benchmark dataset for Tornado detection and prediction using full-resolution polarimetric weather radar data. Artif. Intell. Earth Syst. 2025, 4, e240006.
24. Yang, S.; Yang, H.; Shen, F.; Jian, Z. A Review of Image Data Augmentation for Deep Learning. J. Softw. 2025, 36, 78–79.
25. Hasanin, T.; Khoshgoftaar, T. The effects of random undersampling with simulated class imbalance for big data. In 2018 IEEE International Conference on Information Reuse and Integration (IRI); IEEE: New York, NY, USA, 2018; pp. 70–79.
26. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
27. Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov-Arnold networks. arXiv 2024, arXiv:2404.19756.
28. Taud, H.; Mas, J.F. Multilayer perceptron (MLP). In Geomatic Approaches for Modeling Land Change Scenarios; Springer: Berlin/Heidelberg, Germany, 2017; pp. 451–455.
29. Gordon, W.J.; Riesenfeld, R.F. B-spline curves and surfaces. In Computer Aided Geometric Design; Elsevier: Amsterdam, The Netherlands, 1974; pp. 95–126.
30. Strang, G.; Fix, G.J. An Analysis of the Finite Element Method; Wellesley-Cambridge Press: Philadelphia, PA, USA, 2008; p. 402.
31. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.

Figure 1. Types and intensities of tornado samples in the TorNet dataset. (a) Tornado Sample Counts by Type; (b) Distribution of EF-Scale Ratings in Tornado Samples.

Figure 2. Tornado Sample Set Construction.

Figure 3. TDA-DARKNet Architecture.

Figure 4. Channel Attention Mechanism.

Figure 5. Spatial Attention Mechanism.

Figure 6. Detection results of TDA-DARKNet from the Z9200 radar at 14:18 (UTC + 8). The black rectangle is centered on the detected location and represents a square region with a side length of 2 km, while the annotated values indicate the corresponding detection probabilities. Sub-images (a,b) correspond to elevation angles of 0.5° and 1.5°, respectively. Each sub-image consists of four radar variables, including reflectivity, radial velocity, differential reflectivity, and correlation coefficient, arranged in a 2 × 2 layout.

Figure 7. Detection results of TDA-DARKNet from the Z9200 radar at 15:00 (UTC + 8). The black rectangle is centered on the detected location and represents a square region with a side length of 2 km, while the annotated values indicate the corresponding detection probabilities. Sub-images (a,b) correspond to elevation angles of 0.5° and 1.5°, respectively. Each sub-image consists of four radar variables, including reflectivity, radial velocity, differential reflectivity, and correlation coefficient, arranged in a 2 × 2 layout.

Figure 8. Detection-based tornado path of the EF2 tornado in Guangzhou on 27 April 2024 using TDA-DARKNet.

Figure 9. Detection results of TDA-DARKNet from the Z9516 radar at 12:21 (UTC + 8). The black rectangle is centered on the detected location and represents a square region with a side length of 2 km, while the annotated values indicate the corresponding detection probabilities. Sub-images (a,b) correspond to elevation angles of 1.5° and 2.4°, respectively. Each sub-image consists of four radar variables, including reflectivity, radial velocity, differential reflectivity, and correlation coefficient, arranged in a 2 × 2 layout.

Figure 10. Detection results of TDA-DARKNet from the Z9516 radar at 12:32 (UTC + 8). The black rectangle is centered on the detected location and represents a square region with a side length of 2 km, while the annotated values indicate the corresponding detection probabilities. Sub-images (a,b) correspond to elevation angles of 1.5° and 2.4°, respectively. Each sub-image consists of four radar variables, including reflectivity, radial velocity, differential reflectivity, and correlation coefficient, arranged in a 2 × 2 layout.

Table 1. Confusion Matrix for Binary Classification.

Ground Truth	Predicted Positive (Tornado)	Predicted Negative (Non-Tornado)
Positive (Tornado)	$TP$	$FN$
Negative (Non-Tornado)	$FP$	$TN$
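
The scores reported in Tables 2–4 can all be derived from the four cells of this confusion matrix. The sketch below uses the standard definitions, with FAR taken as the false alarm ratio FP/(TP + FP), which matches the tables, where PRE and FAR sum to 1. The function name `detection_metrics` and the example counts are assumptions made for illustration; the counts are hypothetical and are not reported in the paper.

```python
import math


def detection_metrics(tp, fn, fp, tn):
    """Compute ACC, PRE, F1, POD, FAR, CSI and MCC from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive sample,
    one negative sample, and one positive prediction).
    """
    total = tp + fn + fp + tn
    pre = tp / (tp + fp)   # precision
    pod = tp / (tp + fn)   # probability of detection (recall)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": (tp + tn) / total,
        "PRE": pre,
        "F1": 2 * pre * pod / (pre + pod),
        "POD": pod,
        "FAR": fp / (tp + fp),        # false alarm ratio, equal to 1 - PRE
        "CSI": tp / (tp + fp + fn),   # critical success index
        "MCC": (tp * tn - fp * fn) / mcc_den,
    }


# Hypothetical counts for a balanced test set (illustrative, not from the paper).
scores = detection_metrics(tp=618, fn=4, fp=32, tn=590)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because CSI penalizes both misses and false alarms while ignoring true negatives, it is usually the most demanding of these scores for rare-event detection.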

Table 2. Evaluation metric scores of the model on the test set, where “↑” indicates that a higher score is better, and “↓” indicates that a lower score is better.

Model	ACC ↑	PRE ↑	F1-Score ↑	POD ↑	FAR ↓	CSI ↑	MCC ↑
ResNet	0.9550	0.9367	0.9559	0.9759	0.0633	0.9155	0.9108
DenseNet	0.9647	0.9419	0.9655	0.9904	0.0581	0.9333	0.9306
EfficientNet	0.9651	0.9413	0.9606	0.9807	0.0586	0.9242	0.9204
ConvNeXt	0.9606	0.9449	0.9686	0.9935	0.0550	0.9392	0.9369
MLP-Mixer	0.9654	0.9227	0.9501	0.9790	0.0772	0.9049	0.8988
TDA-DARKNet	0.9711	0.9508	0.9717	0.9936	0.0492	0.9450	0.9431

Table 3. Ablation study results of different model configurations, where “↑” indicates that a higher score is better, and “↓” indicates that a lower score is better.

Model	ACC ↑	PRE ↑	F1-Score ↑	POD ↑	FAR ↓	CSI ↑	MCC ↑
ResDenseNet	0.9598	0.9333	0.9610	0.9904	0.0667	0.9249	0.9214
ResDenseNet + CA	0.9542	0.9210	0.9559	0.9936	0.0790	0.9156	0.9113
ResDenseNet + SA	0.9582	0.9425	0.9589	0.9759	0.0575	0.9211	0.9170
ResDenseNet + CBAM	0.9574	0.9344	0.9585	0.9839	0.0656	0.9203	0.9162
ResDenseNet + KAN	0.9679	0.9505	0.9685	0.9871	0.0495	0.9388	0.9364
ResDenseNet + CA + KAN	0.9614	0.9375	0.9624	0.9887	0.0625	0.9276	0.9243
ResDenseNet + SA + KAN	0.9622	0.9376	0.9633	0.9904	0.0624	0.9291	0.9260
TDA-DARKNet	0.9711	0.9508	0.9717	0.9936	0.0492	0.9450	0.9431

Table 4. Evaluation metric scores of TDA-XGB and TDA-DARKNet on the test set, where “↑” indicates that a higher score is better and “↓” indicates that a lower score is better.

Model	ACC ↑	PRE ↑	F1-Score ↑	POD ↑	FAR ↓	CSI ↑	MCC ↑
TDA-XGB	0.8387	0.8222	0.8402	0.8590	0.1778	0.7244	0.6782
TDA-DARKNet	0.9711	0.9508	0.9717	0.9936	0.0492	0.9450	0.9431

Table 5. Statistics of detection results and warning lead times for tornado events in different regions. Detection = 1 indicates successful detection; False Alarm = 1 indicates false alarm; “/” indicates no record.

Date	Time (UTC + 8)	Location	EF Scale	Radar	Detection	False Alarm	Lead Time
20200518	14:06–14:07	Guangdong Jiangmen	EF0	Z9662	0	1	/
20200601	12:50–12:57	Guangdong Guangzhou	/	Z9200	1	0	32 min
20200612	13:49–14:00	Jiangsu Gaoyou	EF2	Z9250	1	0	13 min
20200722	21:48	Jiangsu Suqian	EF3	Z9515	1	1	18 min
20200722	22:40	Jiangsu Yancheng	EF2	Z9515	1	0	25 min
20210513	14:40	Guangdong Zhuhai	/	Z9755	0	0	/
20210728	16:30–18:40	Guangdong Guangzhou	/	Z9200	1	0	6 min
20210728	16:30–18:40	Guangdong Guangzhou	/	Z9758	1	0	6 min
20210730	10:00–11:00	Guangdong Guangzhou	/	Z9200	1	0	6 min
20220619	07:20–07:30	Guangdong Foshan	EF1	Z9758	1	0	10 min
20230610	16:22	Jiangsu Nantong	EF2	Z9513	1	0	15 min
20240427	14:49–15:04	Guangdong Guangzhou	EF2	Z9200	1	0	42 min
20240917	12:30–12:38	Jiangsu Xuzhou	EF0	Z9516	1	0	9 min
20240917	14:07–14:12	Jiangsu Xuzhou	EF1	Z9516	1	1	7 min
20240917	14:33–14:41	Jiangsu Xuzhou	EF1	Z9516	1	0	11 min
20240917	15:31–15:33	Jiangsu Xuzhou	EF0	Z9516	1	0	2 min
20240917	15:47–16:10	Jiangsu Lianyungang	EF0	Z9516	1	0	7 min
20240917	16:02–16:09	Jiangsu Xuzhou	EF0	Z9516	1	0	5 min
20240917	17:07–17:16	Anhui Suzhou	EF1	Z9516	1	0	5 min
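
As a rough row-level summary of Table 5, the sketch below tallies detections, false alarms, and lead times. The tuples are transcribed from the Detection, False Alarm, and Lead Time columns, with `None` standing in for "/". The variable names are assumptions made for illustration, and the aggregates are computed per table row rather than per tornado event (the 28 July 2021 Guangzhou case appears once for each of two radars), so they are not statistics stated by the authors.

```python
from statistics import mean, median

# (detection, false_alarm, lead_time_min) for each row of Table 5, in order.
# None marks rows where the table shows "/" (no lead time recorded).
rows = [
    (0, 1, None), (1, 0, 32), (1, 0, 13), (1, 1, 18), (1, 0, 25),
    (0, 0, None), (1, 0, 6), (1, 0, 6), (1, 0, 6), (1, 0, 10),
    (1, 0, 15), (1, 0, 42), (1, 0, 9), (1, 1, 7), (1, 0, 11),
    (1, 0, 2), (1, 0, 7), (1, 0, 5), (1, 0, 5),
]

n_rows = len(rows)
n_detected = sum(det for det, _, _ in rows)
n_false_alarm = sum(fa for _, fa, _ in rows)
lead_times = [lt for _, _, lt in rows if lt is not None]

print(f"rows: {n_rows}")
print(f"detected: {n_detected} ({n_detected / n_rows:.1%})")
print(f"rows with a false alarm: {n_false_alarm}")
print(f"lead time (min): mean {mean(lead_times):.1f}, "
      f"median {median(lead_times)}, "
      f"range {min(lead_times)}-{max(lead_times)}")
```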

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

