MDE-UNet: A Physically Guided Asymmetric Fusion Network for Multi-Source Meteorological Data Lightning Identification

Chen, Yihua; Han, Yuanpeng; Zhang, Yujian; Liu, Yi; Song, Lin; Wang, Jialei; Wang, Xinjue; Zhang, Qilin

doi:10.3390/rs18071027

Open Access Article

by Yihua Chen ¹, Yuanpeng Han ¹, Yujian Zhang ¹, Yi Liu ¹,²,*, Lin Song ³, Jialei Wang ⁴, Xinjue Wang ⁵ and Qilin Zhang ¹,²

¹ Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, B-DAT, Nanjing University of Information Science and Technology, Nanjing 210044, China

² State Key Laboratory of Climate System Prediction and Risk Management, Key Laboratory of Meteorological Disaster, Ministry of Education, Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science and Technology, Nanjing 210044, China

³ Qingdao Ecological and Agricultural Meteorological Center, Qingdao Meteorological Bureau, Qingdao 266003, China

⁴ Public Meteorological Service Center of Zhuhai, Zhuhai Meteorological Bureau, Zhuhai 519000, China

⁵ Department of Computer Science, University of Reading, Whiteknights, Reading RG6 6DH, UK

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(7), 1027; https://doi.org/10.3390/rs18071027

Submission received: 10 February 2026 / Revised: 18 March 2026 / Accepted: 27 March 2026 / Published: 29 March 2026

(This article belongs to the Special Issue Advancing Remote Sensing Through Large Multimodal Foundation Models: Toward Intelligent Earth Observation)

Highlights

What are the main findings?

A lightning identification network guided by physical priors and asymmetric supervision is proposed for multi-source meteorological data.
A multi-source multi-scale feature fusion module and an asymmetric weighted BCE-DICE loss are designed to enhance feature representation and alleviate class imbalance.
A weighted sliding window MLP decoder and a physics-guided radar enhancement mechanism are presented to suppress noise, reduce false alarms, and boost true positive rates, thereby improving the reliability of lightning identification.

What are the implications of the main findings?

The proposed strategy alleviates modal competition in multi-source heterogeneous data fusion and improves the utilization of meteorological data.
The multi-scale fusion framework provides a new paradigm for physical feature coupling in convective weather monitoring.
The method effectively reduces false alarm rates while improving detection rates, enabling high-precision lightning identification.

Abstract

Utilizing multi-source meteorological data for lightning identification is crucial for monitoring severe convective weather. However, several key challenges persist in this field: dimensional imbalance and modal competition among multi-source heterogeneous data, model training bias caused by the extreme sparsity of lightning samples, and an imbalance between false alarms and missed detections resulting from complex background noise. To address these challenges, this paper proposes a lightning identification network guided by physical priors and constrained by supervision. First, to tackle the issue of modal competition in fusing satellite (high-dimensional) and radar (low-dimensional) data, a physical prior-guided asymmetric radar information enhancement mechanism is introduced. This mechanism uses radar physical features as contextual guidance to selectively enhance the latent weak radar signatures. Second, at the architectural level, a multi-source multi-scale feature fusion module and a weighted sliding window–multilayer perceptron (MLP) enhanced decoding unit are constructed. The former achieves the coupling of multi-scale physical features at a 2 km grid scale through cross-level semantic alignment, building a highly consistent feature field that effectively improves the model’s ability to detect lightning signals. The latter leverages adaptive receptive fields and the nonlinear modeling capability of MLPs to effectively smooth spatially discrete noise, ensuring spatial continuity in the reconstructed results. Finally, to address the model bias caused by severe class imbalance between positive and negative samples—resulting from the extreme sparsity of lightning events—an asymmetrically weighted BCE-DICE loss function is designed. Its “asymmetric” characteristic is implemented by assigning different penalty weights to false-positive and false-negative predictions. 
This loss function balances pixel-level accuracy and inter-class equilibrium while imposing high-weight penalties on false-positive predictions, achieving synergistic optimization of feature enhancement and directional suppression. Experimental results show that the proposed method effectively increases the hit rate while substantially reducing the false alarm rate, enabling efficient utilization of multi-source data and high-precision identification of lightning strike areas.

Keywords:

physical prior guidance; asymmetric supervision constraints; modal imbalance mitigation; lightning identification; multi-source data fusion

1. Introduction

1.1. Research Significance

Lightning, as a natural discharge phenomenon with immense energy, possesses hazardous characteristics, such as high abruptness, high voltage, high current, and significant electromagnetic pulses, causing multi-dimensional damage to human life, infrastructure stability, ecological balance, and socio-economic development. According to statistics on lightning disasters in China from 1997 to 2009, lightning strikes resulted in 5033 fatalities and 4670 injuries during this period, with significant spatiotemporal distribution characteristics: temporally, they occur most frequently in summer, and spatially, they are concentrated in the eastern coastal and southern regions [1]. Additionally, lightning-induced transient overvoltages easily damage electrical equipment, leading to large-scale power outages [2]. Satellite monitoring and statistical models confirm that lightning is a primary cause of wildfires in many areas, especially remote forested regions. In parts of North America, lightning directly triggers 98% of wildfires from May to June each year [3]. Notably, climate change profoundly affects the intensity and frequency of lightning activity. Multiple studies based on CMIP6 climate models predict that the global frequency of lightning will increase significantly in the future [4]. The aforementioned actual and potential risks highlight the necessity of conducting high-precision lightning monitoring and localization.

1.2. Advances in Lightning Detection and Identification Research

Traditional ground-based detection networks (such as the ADTD system) rely on the spatiotemporal localization of electromagnetic signals. Their detection efficiency follows a gamma distribution relative to distance and is significantly affected by terrain. In areas with complex terrain, the localization accuracy of Time-of-Arrival (TOA) technology fluctuates greatly, making it difficult to achieve standardized global coverage [5]. Satellite remote sensing offers a global perspective, but the data quality varies across platforms. CMIP6 climate models show discrepancies by a factor of two in simulating global lightning frequency, and the Lightning Mapping Imager (LMI) on the Fengyun-4A satellite also exhibits systematic biases [4]. Traditional atmospheric electric field monitoring relies on single-characteristic warnings, resulting in high false alarm rates, poor robustness, and low efficiency [6]. Deep learning, with its strong nonlinear feature extraction and multi-source data fusion capabilities, has demonstrated great potential in the field of remote sensing [7,8,9]. It overcomes several limitations of traditional methods and provides a new paradigm for lightning detection and prediction: it can adapt to complex terrain interference and fuse heterogeneous data to build superior models [10,11].

In real-time lightning detection, Transformers face three critical challenges: slow inference speed, high training difficulty and high training and deployment costs. Their self-attention mechanism incurs $O(n^{2})$ time/space complexity, leading to explosive computational overhead with longer input sequences. Unlike CNNs, Transformers lack local receptive fields and weight sharing, failing to meet low-latency real-time deployment requirements. They also exhibit high sensitivity to hyperparameters and training strategies, while lightning detection suffers from scarce, costly annotated data. Additionally, Transformers lack CNNs’ inductive biases and must learn visual prior knowledge from scratch, increasing the training difficulty. Their large parameter size further increases hardware/energy costs compared to performance-equivalent CNNs. While model compression and efficient variants can mitigate these issues, they do not resolve them fundamentally, hindering large-scale deployment in real-time lightning identification. By contrast, CNN and UNet variants leverage local receptive fields, encoder–decoder architectures, and skip connections to effectively capture lightning’s geometric features. These structures enable multi-scale feature extraction and high-resolution reconstruction, distinguishing lightning signals from noise and correcting data biases [12].

Existing research shows that multi-channel brightness temperature data from Himawari satellites can provide high-spatiotemporal resolution support for lightning identification. Individual channels (such as TBB 13, TBB 9, TBB 15) can respectively characterize cloud-top structures, analyze water vapor transport and facilitate radiative correction. Among them, TBB 13 ≤ −52 °C can indicate deep convective cloud tops [13,14].

Inter-channel brightness temperature differences (e.g., TBB 15-TBB 13, composite difference indices) are sensitive to cloud-top ice crystal properties and serve as effective indicators for severe convection and imminent lightning [15]. Moreover, integrating radar echo parameters with satellite brightness temperature difference products enables precise localization and intensity quantification of potential thunderstorm cells [16].

Representative quasi-end-to-end near-real-time detection works exhibit limitations of varying degrees: Qian et al. (2022b) [10] proposed the Lightning-SN model using S-band Doppler radar data, which outperformed traditional/baseline models but lacked tailored diffusion processing for near-real-time detection, limiting practical applicability. This reflects the core challenge of sample sparsity, where lightning pixels (<1%) are easily overwhelmed by background noise. Lee et al. (2024) [17] fused GK2A/AMI satellite data with LINET ground observations for lightning detection over mid-latitude regions (e.g., the Korean Peninsula) via spatiotemporal matching, feature extraction, and two-stage post-processing; despite high hit rates, the method suffered from high false alarm rates and accuracy imbalance issues, restricting practical value. These issues arise from an imbalance between false alarms and false negatives, as the model struggles to distinguish non-convective clutter from real lightning. Lu et al. (2025) [18] developed the MLD-YOLO model (integrating MLCA, LSKA, DyHead) for lightning identification in Ningbo, China, fusing five radar products via PCA-weighted fusion; similar to Lee et al. (2024), it faced hit/false alarm imbalance, and its loose evaluation strategy (13 km × 13 km bounding box for lightning events) yielded overly optimistic results that do not accurately reflect fine-grained localization precision. This imbalance stems from both modal competition (unconstrained fusion suppresses weak features) and imbalance between false alarms and false negatives (exacerbated by sparse samples). Additionally, constrained by YOLO’s inherent characteristics [19,20], MLD-YOLO outputs axis-aligned bounding boxes that cannot accurately describe irregular lightning morphologies/boundaries: discrete/point-like lightning events mix excessive background pixels (ambiguous localization), while continuous/clustered regions lose critical morphological details (e.g., branches) for meteorological analysis. This bounding box-based paradigm also fails to support operational quantitative applications (e.g., lightning density calculation, spatiotemporal evolution analysis, radar echo overlay).

1.3. Main Contributions

To address the above challenges, we propose MDE-UNet, a multi-data fusion model for pixel-level identification of high-risk lightning regions at a 2 km scale that does not rely on lightning location information. The model adaptively integrates multi-source information from radar and satellite through channel attention, performs efficient multi-scale, fine-grained modeling on the decoding path, and finally supplements the preliminary results with physical prior information. It achieves end-to-end real-time mapping from multi-source radar and satellite data to lightning locations.

The main contributions of this paper are as follows.

1. To address the challenges of physical decoupling of multi-scale features and the difficulty in distinguishing real lightning signals from interference, a recognition algorithm based on multi-scale feature aggregation and asymmetric supervision constraints is proposed. This algorithm achieves feature enhancement and targeted suppression by constructing a multi-source, multi-scale fusion module, forming a high-consistency feature field and significantly improving the model’s ability to discriminate real lightning signals from non-lightning interference.
2. To address the problem of suboptimal model recognition performance caused by the highly sparse spatial distribution of lightning activity, an asymmetrically weighted BCE-DICE loss is introduced as a targeted regularization term, which effectively alleviates the severe class imbalance problem while imposing high-weight penalties on false positive identifications.
3. To tackle the identification challenge of the imbalance between false positives and false negatives caused by the susceptibility of lightning activity to local noise interference, an enhanced decoding unit based on a weighted sliding window multilayer perceptron (weighted sliding window MLP) is designed, providing a novel technical approach for the accurate identification of sparse lightning events.
4. To resolve the issue of weak radar modality features being overwhelmed in multi-source heterogeneous meteorological data fusion due to the significant channel dimension imbalance between satellite and radar data, a physics-guided asymmetric radar enhancement mechanism is proposed. This mechanism mitigates the masking effect of high-dimensional satellite data on radar features by strategically strengthening weak radar modality features, thereby enabling more balanced and effective multi-modal data fusion.

2. Data Sources and Preprocessing

We utilized lightning event data, radar data, and satellite data from Guangdong Province for the period from 1 July 2023 to 1 September 2023. The geographic scope is defined as longitude 110.94°E–116.06°E and latitude 21.44°N–26.56°N, as shown in Figure 1. The dataset covers a total of 514,386 lightning events, yielding 4360 samples.

The minimum time interval between consecutive samples is 10 min, and a 2 km spatial resolution is adopted based on the following considerations: this scale is physically consistent with the spatiotemporal characteristics of convective clouds (primary lightning sources with 1–10 km horizontal scales and 2–4 km intense cores [21]) and lightning (over 90% of clusters spanning 1.5–3 km in radius, with peak frequency regions matching reflectivity gradients resolvable at 2 km [22]), effectively capturing critical convective structures, internal processes, and high-density lightning areas. Additionally, it ensures spatial alignment with the highest-resolution satellite datasets, facilitating consistent multi-source data integration.

2.1. Lightning Detection Data

Given the difficulty of directly extrapolating lightning location data, we utilized ground-based lightning location data obtained from a very low-frequency lightning location network (VLF-LLN). Following the methods and parameters proposed by Liu et al. [23], we first characterize the probability distribution of the horizontal reach of individual lightning events using the positive half-axis of a one-dimensional Gaussian function with a standard deviation of 7.554 km. This parameter is derived from fitting the cumulative probability distribution of lightning horizontal extension distances and is chosen so that the probability drops to 3% at a horizontal distance of 20 km, consistent with the statistical observation that lightning channels rarely extend horizontally beyond 20 km [24].
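The relationship between the 7.554 km standard deviation and the 3% value at 20 km can be checked numerically. The sketch below assumes that "probability" refers to an unnormalized half-Gaussian with a peak value of 1 at zero distance; the function name `horizontal_reach_weight` is illustrative and does not come from the original work.

```python
import math

SIGMA_KM = 7.554  # standard deviation quoted in the text


def horizontal_reach_weight(distance_km, sigma_km=SIGMA_KM):
    """Half-Gaussian weight on the positive axis (assumed peak of 1 at 0 km)."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-(distance_km ** 2) / (2.0 * sigma_km ** 2))


# Print the weight at a few distances, including the 20 km reference point.
for d in (0, 5, 10, 20):
    print(f"{d:>2} km -> {horizontal_reach_weight(d):.3f}")
```

Under this assumption, the weight at 20 km evaluates to roughly 0.03, which agrees with the 3% figure stated above.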

Subsequently, a time-decay weighting strategy is adopted to fuse the Gaussian distribution results from multiple time periods. The weight function is a negative exponential decay function, and the weights are normalized to sum to 1, which ensures numerical stability and prevents scale distortion. In the time-weighted fusion process, a mapping variable k scales the time step range [1, n] to [1, 10]. This range is chosen because $\exp(-10) \approx 4 \times 10^{-5}$, so the weights of distant historical data decay to a negligible level. The value of n is set to 3 to incorporate lightning historical information from the past 30 min.
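A minimal sketch of this weighting scheme follows. The text does not give the exact formula, so the sketch assumes a linear mapping of the time step t in [1, n] to k in [1, 10], raw weights of exp(-k), and that t = 1 denotes the most recent 10 min interval. The function name `time_decay_weights` is hypothetical.

```python
import math


def time_decay_weights(n, k_min=1.0, k_max=10.0):
    """Normalized exponential-decay weights for time steps 1..n.

    Assumptions (not stated in the source): k is a linear map of the
    time step onto [k_min, k_max], and step 1 is the most recent one.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [1.0]
    ks = [k_min + (k_max - k_min) * (t - 1) / (n - 1) for t in range(1, n + 1)]
    raw = [math.exp(-k) for k in ks]
    total = sum(raw)
    return [w / total for w in raw]  # weights sum to 1


weights = time_decay_weights(3)  # n = 3 covers the past 30 min
print(weights)
```

With these assumptions the most recent interval dominates the fused field, and the oldest interval contributes only a very small fraction.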

To highlight the core regions of lightning activity while preserving effective information, a peak truncation process is then applied to the fused results, with a maximum threshold set at 25 counts per 10 min. The truncated frequency $P_{clip}$ is categorized as follows: $P_{clip}$ > 25 corresponds to the core area of high-frequency lightning activity, 1 ≤ $P_{clip}$ ≤ 25 represents the secondary high-density lightning area, and $P_{clip}$ < 1 is considered a region with no significant lightning activity. This processing retains the differences in lightning activity intensity while mitigating the impact of extreme outliers on the model training.

Finally, we selected the high-risk areas, namely the lightning high-density and secondary high-density regions, as labels. A binarization process is applied to $P_{clip}$ using a threshold of 1, defined as

$$P_{map} = \begin{cases} 1, & P_{clip} \geq 1 \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

The processing flow is illustrated in Figure 2.
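The truncation and binarization steps can be sketched with NumPy as follows. This is an illustrative sketch only: the function name `make_label` and the toy input values are assumptions, and values above the cap are simply set to the cap of 25 counts per 10 min.

```python
import numpy as np

CLIP_MAX = 25.0         # peak truncation threshold (counts per 10 min)
LABEL_THRESHOLD = 1.0   # binarization threshold from Eq. (1)


def make_label(fused):
    """Clip the fused lightning frequency and binarize it into a label map."""
    fused = np.asarray(fused, dtype=float)
    p_clip = np.minimum(fused, CLIP_MAX)                   # peak truncation
    p_map = (p_clip >= LABEL_THRESHOLD).astype(np.uint8)   # Eq. (1)
    return p_clip, p_map


# Toy fused frequency field (hypothetical values).
fused = np.array([[0.2, 1.0, 3.5],
                  [40.0, 0.99, 25.0]])
p_clip, p_map = make_label(fused)
print(p_clip)
print(p_map)
```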

2.2. Satellite Data

The experimental data are sourced from the Japan Meteorological Agency’s next-generation geostationary meteorological satellite, Himawari-8. Its onboard Advanced Himawari Imager (AHI) features multi-band observation capabilities, including three visible channels, three near-infrared channels, and ten infrared channels (including three water vapor channels).

We selected Band 9, TBB 13, the difference between TBB 15 and TBB 13 (TBB 15-TBB 13), and the composite index (Band 11-TBB 13)-(TBB 13-TBB 15) as the core input parameters.

The synergistic application of Himawari multi-channel brightness temperature data and radar observations provides high spatiotemporal resolution and physical insights for real-time lightning monitoring. Their integration is motivated by the fact that brightness temperature (TBB) and its combinations can reveal the microphysical structure and dynamic processes of convective clouds, while radar data provides the three-dimensional distribution of precipitation particles within clouds. Together, they enable the identification and prediction of high-risk lightning areas.

Single-channel brightness temperature forms the basis for identifying severe convective systems. The infrared window channel TBB 13 (approximately 10.4 μm) directly reflects the cloud-top radiation temperature. Values below −52 °C typically indicate deep convective cloud tops, which is one of the necessary conditions for lightning generation [16]. Band 9 (approximately 6.9 μm) is sensitive to water vapor content and can analyze upper- and mid-tropospheric water vapor transport. TBB 15 (approximately 12.4 μm) is less affected by ozone absorption and is suitable for atmospheric correction. However, single-channel information is limited and insufficient to distinguish cloud tops with similar heights but significantly different internal structures.

The brightness temperature difference (BTD) between channels is key to unlocking microphysical information within clouds. The TBB 15-TBB 13 difference is highly sensitive to the size and phase of ice crystals at the cloud top. Small, dense ice crystals formed under strong updrafts cause this difference to show a relatively large positive value, which is a strong indicator of imminent severe convection and lightning [15]. The composite index (B11-TBB13)-(TBB13-TBB15) is physically grounded in the selective response of different infrared bands to cloud-top microphysical parameters. First-order differences (e.g., B11-TBB13 or TBB13-TBB15) can partially eliminate common-mode components induced by variations in cloud-top height. Second-order differences go a step further and suppress the common radiative effects caused by changes in the water vapor path length through the atmosphere, because water vapor exerts similar influences on these three adjacent bands [25]. In the combined form (B11-TBB13)-(TBB13-TBB15), these common water vapor effects largely cancel, while signals of reduced ice crystal effective radius and increased ice crystal number concentration are amplified. These are the core microphysical characteristics of the charge-separation active region in deep convective clouds [26,27].
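The derived satellite inputs can be computed directly from the three brightness temperature fields. The sketch below assumes brightness temperatures in kelvin and uses hypothetical names (`btd_features`, `b11_k`, `tbb13_k`, `tbb15_k`). It also shows that the composite index is algebraically a second-order difference, B11 + TBB15 − 2·TBB13.

```python
import numpy as np

DEEP_CONVECTION_K = 273.15 - 52.0  # the -52 degC cloud-top threshold in kelvin


def btd_features(b11_k, tbb13_k, tbb15_k):
    """Return the first-order BTD, the composite index and a deep-convection mask."""
    b11_k = np.asarray(b11_k, dtype=float)
    tbb13_k = np.asarray(tbb13_k, dtype=float)
    tbb15_k = np.asarray(tbb15_k, dtype=float)
    btd_15_13 = tbb15_k - tbb13_k
    composite = (b11_k - tbb13_k) - (tbb13_k - tbb15_k)
    deep_mask = tbb13_k <= DEEP_CONVECTION_K
    return btd_15_13, composite, deep_mask


# Hypothetical brightness temperatures for two pixels (kelvin).
b11 = np.array([210.0, 285.0])
t13 = np.array([208.0, 284.0])
t15 = np.array([207.0, 282.5])
btd, comp, mask = btd_features(b11, t13, t15)

# The composite index equals the second-order difference around TBB 13.
second_order = b11 + t15 - 2.0 * t13
print(btd, comp, mask)
```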

2.3. Radar Data

Radar data play an indispensable role in the verification and prediction of lightning activity. The radar reflectivity factor (dBZ) characterizes the backscattering cross-section of precipitation particles and provides three-dimensional distribution information of particles within clouds. When a high-reflectivity core extends above the freezing level, it indicates strong updrafts and intense collisions among ice-phase particles, which are central to the non-inductive charging mechanism and are directly related to lightning occurrence [28]. Combined with satellite brightness temperature difference products, this enables precise localization and intensity assessment of thunderstorm cells. Furthermore, radar echo extrapolation is crucial for severe convective weather warnings. Echo data characterized by high intensity and extensive coverage correspond to more intense convective motions, and strong updrafts transport water vapor and ice crystals to high altitudes, establishing the physical basis for charge separation and lightning triggering. By focusing on such high-value data samples, the model’s ability to capture the spatial distribution features of precipitation systems (such as the morphology, coverage, and movement paths of convective clouds) and intensity variation patterns (such as trends in echo intensity changes and migration characteristics of strong echo cores) can be effectively enhanced. These features exhibit significant correlations with the probability, intensity, and spatial distribution of lightning activity, thereby providing a solid data foundation for real-time lightning identification models [29].

The research data were obtained from the national radar dataset (China Meteorological Administration), with a 1 km spatial and 6 min temporal resolution. Original 512 × 512 pixel radar images were downsampled to 256 × 256 for training efficiency and spatial alignment; a radar echo temporal interpolation method (based on optical flow field and spatiotemporal weight constraints) was adopted to address temporal window mismatch and realize refined reconstruction of sparse radar data.

This method uses adjacent radar echoes as input. Missing or abnormal values are set to 0, followed by Gaussian smoothing (standard deviation = 3 pixels, i.e., 3 km) to suppress noise without damaging convective structure boundaries; this parameter is determined by radar resolution and convective cell scale [30], consistent with related studies [31,32]. Radar echoes are normalized to 0–1 to avoid optical flow distortion, meet the assumptions of the Farnebäck algorithm [33], and improve motion estimation reliability. Farnebäck dense optical flow and bicubic interpolation are then used, with echo intensity weights constraining outputs to 0–70 dBZ (the theoretical upper limit for extreme echoes).
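The value handling around the optical flow step can be sketched as below. The sketch covers only the missing-value fill, the 0–1 normalization, and the final 0–70 dBZ constraint; the Gaussian smoothing and the Farnebäck flow itself are omitted. It assumes that normalization divides by the 70 dBZ upper limit and that negative reflectivity counts as abnormal, neither of which is stated in the text. The function names are hypothetical.

```python
import numpy as np

DBZ_MAX = 70.0  # upper limit for extreme echoes quoted in the text


def normalize_echo(dbz):
    """Fill missing values with 0 and scale reflectivity to [0, 1].

    Assumption: scaling divides by DBZ_MAX; negative values are treated
    as abnormal and mapped to 0.
    """
    dbz = np.nan_to_num(np.asarray(dbz, dtype=float), nan=0.0)
    return np.clip(dbz, 0.0, DBZ_MAX) / DBZ_MAX


def denormalize_echo(unit):
    """Map an interpolated [0, 1] field back to dBZ, constrained to 0-70 dBZ."""
    return np.clip(np.asarray(unit, dtype=float), 0.0, 1.0) * DBZ_MAX


raw = np.array([np.nan, -5.0, 35.0, 80.0])
unit = normalize_echo(raw)
restored = denormalize_echo(unit)
print(unit, restored)
```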

Evaluation indicators included MAE, RMSE, and CSI (20 dBZ threshold, recommended by WMO for convective identification [34]). Interpolation results: MAE 3.0914 dBZ, RMSE 5.5994 dBZ, CSI 0.7482. This performance is sufficient for subsequent research: CSI = 0.7482 exceeds the 0.7 operational usability threshold. For lightning/severe convection identification, spatial structure accuracy (CSI) is more critical than intensity accuracy [32,35], and the algorithm prioritizes strong echo location/existence—crucial for lightning initial screening. The results surpass traditional optical flow methods (CSI < 0.65), approach deep learning model performance, and serve as reliable input for subsequent research [36].
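For reference, the three indicators can be computed as in the following sketch, which uses the standard definitions of MAE, RMSE, and CSI (hits divided by hits plus misses plus false alarms) with the 20 dBZ event threshold. The function name `interpolation_scores` and the toy arrays are illustrative; they are not the data behind the reported scores.

```python
import numpy as np


def interpolation_scores(pred_dbz, obs_dbz, threshold_dbz=20.0):
    """Return (MAE, RMSE, CSI) for an interpolated radar field."""
    pred = np.asarray(pred_dbz, dtype=float)
    obs = np.asarray(obs_dbz, dtype=float)
    err = pred - obs
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))

    pred_evt = pred >= threshold_dbz
    obs_evt = obs >= threshold_dbz
    hits = int(np.sum(pred_evt & obs_evt))
    misses = int(np.sum(~pred_evt & obs_evt))
    false_alarms = int(np.sum(pred_evt & ~obs_evt))
    denom = hits + misses + false_alarms
    csi = hits / denom if denom > 0 else float("nan")
    return mae, rmse, csi


# Toy example with four pixels (hypothetical values).
obs = [10.0, 25.0, 30.0, 15.0]
pred = [12.0, 22.0, 18.0, 21.0]
mae, rmse, csi = interpolation_scores(pred, obs)
print(mae, rmse, csi)
```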

The visualizations of radar and satellite channels are illustrated in Figure 3.

3. Research Methods

Given the highly complex nonlinear relationships between lightning occurrence and various atmospheric physical parameters, which cannot be fully captured by traditional physical formulas or linear models, we adopted a data-driven deep learning approach. By employing the Multi-modal Decoding Enhanced UNet (MDE-UNet), this work establishes a same-size mapping between radar and satellite observations and lightning probability distribution, enabling end-to-end identification of lightning initiation timing.

3.1. Model Architecture

3.1.1. Channel Attention Mechanism Block (CA Block)

To address multi-source data fusion and noise interference in lightning monitoring, we introduce a lightweight channel attention mechanism for feature recalibration, which is designed to adaptively capture and amplify critical channel information while suppressing irrelevant noise channels.

Given an input feature map $X \in \mathbb{R}^{C \times H \times W}$ (where C represents the number of channels, H denotes the height of the feature map, and W is the width of the feature map), we first perform global average pooling to obtain a channel-wise descriptor that aggregates global spatial information for each channel, enabling the adaptive identification of key channel characteristics:

$$s = \frac{1}{H \times W} \sum_{i = 1}^{H} \sum_{j = 1}^{W} X_{:, i, j}, \quad s \in \mathbb{R}^{C} \tag{2}$$

In the above equation, the operator $\sum_{i = 1}^{H} \sum_{j = 1}^{W}$ implements global spatial summation over all spatial positions $(i, j)$ of the feature map, and the scaling factor $\frac{1}{H \times W}$ converts the sum into a mean, so that the magnitude of the channel-wise descriptor s does not depend on the spatial dimensions of the input feature map.

We then learn adaptive channel weights $\alpha \in \mathbb{R}^{C}$ via a two-layer fully connected network, where the weights are automatically optimized to highlight channels carrying discriminative information for lightning monitoring (e.g., deep convection and cloud-top microphysics) and weaken redundant ones:

$$\alpha = \sigma (W_{2} \cdot \mathrm{ReLU} (W_{1} \cdot s + b_{1}) + b_{2}) \tag{3}$$

In the above equation, the parameters are defined as follows:

$W_{1} \in \mathbb{R}^{C / r \times C}$ and $b_{1} \in \mathbb{R}^{C / r}$: weight matrix and bias vector of the first fully connected layer, where r (the reduction ratio, set to 2 in this study) is the dimensionality reduction factor used to reduce computational complexity. A small reduction ratio is adopted to capture finer-grained channel correlations inherent in radar echo features;

$W_{2} \in \mathbb{R}^{C \times C / r}$ and $b_{2} \in \mathbb{R}^{C}$: weight matrix and bias vector of the second fully connected layer, which restores the dimensionality to the original number of channels C;

$\mathrm{ReLU} (\cdot)$: rectified linear unit activation function, introduced to add non-linearity and enhance the model’s ability to learn complex channel correlations;

$\sigma (\cdot)$: sigmoid activation function, which maps the output of the fully connected network to the range $(0, 1)$, ensuring the channel weights $\alpha$ can act as scaling factors to modulate the original feature map.

The core advantage of this design is that the weights are not manually predefined but adaptively learned from the data, enabling the model to dynamically focus on task-critical channels.

Finally, we recalibrate X by element-wise multiplication with $\alpha$, where the adaptive weights directly modulate the contribution of each channel to achieve targeted enhancement of key channel information:

$$\tilde{X} = \alpha \odot X \tag{4}$$

In the above equation, $\tilde{X} \in \mathbb{R}^{C \times H \times W}$ is the recalibrated feature map, and the operator ⊙ denotes element-wise multiplication (Hadamard product) between the channel weight vector $\alpha$ and the original feature map X. Specifically, each element $\alpha_{c}$ (the c-th element of $\alpha$) is multiplied with all spatial elements of the c-th channel of X, thus scaling the entire channel by the learned weight.

This mechanism adaptively captures and reinforces key channels related to deep convection and cloud-top microphysics, suppresses redundant noise channels that interfere with lightning detection, and achieves efficient multi-source data fusion with low computational cost, improving lightning activity identification accuracy. The module structure is shown in Figure 4.
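Equations (2)–(4) can be traced with a small NumPy forward pass. This is a sketch under stated assumptions, not the authors’ implementation: the weights are random rather than trained, the channel count C = 8 is arbitrary, and the function name `channel_attention` is hypothetical. Only the reduction ratio r = 2 is taken from the text.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x, w1, b1, w2, b2):
    """Forward pass of the CA block for a single (C, H, W) feature map."""
    s = x.mean(axis=(1, 2))                   # Eq. (2): global average pooling
    hidden = np.maximum(w1 @ s + b1, 0.0)     # first FC layer + ReLU
    alpha = sigmoid(w2 @ hidden + b2)         # Eq. (3): channel weights in (0, 1)
    x_tilde = alpha[:, None, None] * x        # Eq. (4): per-channel rescaling
    return x_tilde, alpha


rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                       # r = 2 as in the text; C, H, W arbitrary
x = rng.normal(size=(C, H, W))
w1 = 0.1 * rng.normal(size=(C // r, C))       # random stand-ins for learned weights
b1 = np.zeros(C // r)
w2 = 0.1 * rng.normal(size=(C, C // r))
b2 = np.zeros(C)

x_tilde, alpha = channel_attention(x, w1, b1, w2, b2)
print(alpha.round(3))
```

Each channel of the output is the corresponding input channel multiplied by a single scalar, which is why the block adds little computation relative to the convolutions around it.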

3.1.2. Enhanced Decoding Unit Based on Weighted Sliding Window Multilayer Perceptron (WWMLP Block)

To address the core challenge of inversely mapping high-level semantic features to a high-resolution, spatially continuous physical field when reconstructing lightning-affected areas from radar and multi-channel Himawari satellite data (including Band 9, TBB 13, TBB 15, and their derived indices), we propose an enhanced decoding unit based on a weighted sliding window multilayer perceptron (WWMLP Block), placed along the critical path of the decoder. Traditional decoding methods reconstruct the spatial distribution of lightning by upsampling the fused high-level features, which already incorporate radar structural information and satellite cloud-top microphysical states. However, they often produce fragmented reconstructions, blurred boundaries, and numerous spatially isolated false-positive artifacts, leading to poor physical consistency. To solve this problem, the module employs a weighted sliding window mechanism that dynamically aggregates contextual information from neighboring fused features at each spatial position during reconstruction. By jointly considering the satellite brightness temperature gradient features at the current point and the surrounding radar reflectivity features, the module explicitly models and reinforces the inherent spatial continuity and physical correlation of lightning-affected areas in the real atmosphere. This suppresses isolated false signals that deviate from meteorological principles and improves the model’s sensitivity to features that indicate strong lightning.

Simultaneously, the core multilayer perceptron (MLP) within the module—consisting of two fully-connected (FC) layers with the ReLU activation function—leverages its robust nonlinear fitting capability to precisely learn the complex mapping function from aggregated local multi-source feature vectors to the final lightning occurrence probability. This inherently deciphers the deep nonlinear physical relationship between composite indices (e.g., TBB15-TBB13), which reflect cloud-top ice crystallization processes, and lightning activity.

In terms of computational overhead, consider the case where the number of input channels $C_{in}$ (total channels of fused radar-satellite features) is greater than the number of output channels $C_{out}$ (channels of lightning physical field features). The parameter counts of the two blocks are as follows (derived under the assumption of a fair comparison for decoding scenarios):

Total parameters of WWMLP (2-layer MLP with ReLU activation):

9 + C_{in} \times C_{out} + 2 C_{out};

Total parameters of standard double convolution:

9 C_{in} \times C_{out} + 9 C_{out}^{2} + 4 C_{out}.

Based on these formulas, the parameter count of this module is significantly reduced compared to the standard double convolution block. Experiments demonstrate that this decoding unit enables the model to generate spatially smooth, sharply bounded, and physically credible reconstructions of lightning-affected areas. It significantly reduces false alarms while maintaining a high detection rate, achieving precise and robust decoding from multi-source features to a high-quality lightning physical field. The module design is illustrated in Figure 5.
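To give a sense of scale, the two parameter-count formulas above can be evaluated for a concrete channel configuration. The following is a minimal sketch; the function names and the choice of $C_{in} = 128$, $C_{out} = 64$ are illustrative assumptions and are not taken from the actual network configuration.

```python
def wwmlp_params(c_in: int, c_out: int) -> int:
    """Parameter count of the WWMLP block, using the formula stated in the text."""
    return 9 + c_in * c_out + 2 * c_out


def double_conv_params(c_in: int, c_out: int) -> int:
    """Parameter count of the standard double convolution, using the formula in the text."""
    return 9 * c_in * c_out + 9 * c_out ** 2 + 4 * c_out


# Hypothetical decoder stage (assumed values, not from the paper).
c_in, c_out = 128, 64
wwmlp = wwmlp_params(c_in, c_out)
conv = double_conv_params(c_in, c_out)
print(f"WWMLP: {wwmlp}, double conv: {conv}, ratio: {conv / wwmlp:.1f}x")
```

For this assumed configuration the formulas give 8329 parameters for the WWMLP block against 110,848 for the double convolution, roughly a 13-fold reduction. The gap comes mainly from the factor of 9 on the $C_{in} \times C_{out}$ term and from the additional $9 C_{out}^{2}$ term.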

3.1.3. Multi-Scale Feature Fusion Module (Fusion Block)

To address the operational challenges in traditional lightning identification, such as fragmented continuity of convective signals and loss of key microphysical features due to dispersed and poorly coordinated multi-scale reconstruction features, we designed a multi-scale feature fusion module. This module serves as an integration and optimization hub for the outputs of each decoder level. First, interpolation combined with cross-layer skip connections is used to perform spatial dimension alignment and channel concatenation on feature maps of different resolutions and levels, enabling the preliminary fusion of high-level and low-level features at the same scale. Then, a Dual Convolution Block ((Conv2d + Batch Norm + ReLU) $\times 2$) is employed to extract local cross-scale convective organization features and reconstruct vertical structural coherence, effectively addressing operational issues such as incoherent vertical structures in mesoscale convective systems and blurred weak-echo meteorological signals. The channel attention mechanism (CA Block) is then introduced to apply physics-aware weighting based on the multi-scale consistency of feature channels in regions of convective available potential energy release, ice-phase particle scattering enhancement, and charge enrichment. This helps separate deep convective from stratiform precipitation features, suppresses non-convective interference signals, and strengthens key meteorological signals. The design resolves the physical decoupling of multi-source observation features during cross-scale transmission, constructing a multi-scale fusion feature field with clear meteorological consistency. The module design is illustrated in Figure 6.

3.1.4. Radar Information Enhancement Module (RE Block)

Existing research indicates that regions of high radar reflectivity typically correspond to deep convective clouds and strong updrafts. In mature thunderstorms, updrafts lift a significant amount of supercooled water droplets above the freezing level (0 °C). At this altitude, frequent collisions, friction, and fragmentation occur between supercooled droplets and particles such as ice crystals and graupel. According to the classical non-inductive charging theory, under specific temperature and liquid water content conditions (typically within the “mixed-phase region” of −10 °C to −25 °C), collisions between ice crystals and graupel result in charge transfer. This process causes lighter ice crystals to become positively charged and be carried upward by updrafts to the upper part of the cloud, while heavier graupel becomes negatively charged and accumulates in the middle to lower parts, forming dipole or tripole charge structures. This large-scale charge separation is a prerequisite for generating strong electric fields and ultimately triggering lightning discharges. Studies clearly demonstrate a strong correlation between lightning activity and radar reflectivity exceeding 31.29 dBZ at the −20 °C level [37,38,39]. This is precisely because this altitude range is the most active region for charge separation, and high reflectivity reflects the abundance of ice-phase particles in this area. The relationship between radar echoes and lightning distribution is illustrated in Figure 7.

Satellite data alone struggle to capture the microphysical processes during the initial stage of severe convection, and in traditional multimodal fusion weak radar features are easily overshadowed by dominant features [40,41]. To address both problems, we propose a softly physics-guided bidirectional synergistic neural network framework. This framework takes the large-scale lightning-related physical priors embedded in satellite data as global physical guidance to construct a highly complementary physical mapping space. It further introduces an “identification-correction” mechanism to achieve functional decoupling: the satellite-dominant backbone network focuses on learning macroscopic precipitation patterns, while the specially designed Physics-Guided Radar Enhancement Module generates an adaptive residual probability map (denoted as RSI). The RSI is based on the physical relationship between radar reflectivity and precipitation intensity (e.g., the Z-R relationship) [42], together with spatial continuity, temporal evolution patterns, and other physical properties that are implicitly learned within the module. The residual probability map corrects the initial inversion results (denoted as BO) in the form of residual compensation, and the corrected results are passed through a sigmoid layer to constrain them to a reasonable physical range. This design establishes a physical consistency verification mechanism at the output level, allowing radar physical features to serve as additional supervisory signals and guiding the model to focus on weak feature regions overlooked in traditional training. It also reactivates the fine-scale local severe convective structures lost due to the global smoothing characteristic of satellite data, thereby recalling missed detection samples while maintaining a low false-alarm rate. This mechanism is consistent with the findings of He et al. (2025) [43], which indicate that introducing physical guidance can effectively restore smoothed weak echo features and improve the physical consistency of prediction results. The proposed framework avoids the information loss and feature conflict inherent in deep feature fusion, integrates the global contextual information of satellite data with the local physical guidance of radar observations, and achieves efficient synergy and precise alignment of different modal features under physical guidance, providing reliable support for improving the accuracy of lightning identification. The module structure is illustrated in Figure 8.

fused = BO + RSI

(5)
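The residual correction in Equation (5), followed by the sigmoid constraint described in the text, can be sketched as below. This is an illustrative reading only: it assumes that BO and RSI are pre-sigmoid score maps of identical shape, which the text does not state explicitly, and the function names and toy values are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fuse(bo, rsi):
    """Element-wise fused = BO + RSI, then a sigmoid to map scores into [0, 1]."""
    return [
        [sigmoid(b + r) for b, r in zip(bo_row, rsi_row)]
        for bo_row, rsi_row in zip(bo, rsi)
    ]


# Toy 2x2 maps (assumed values): the backbone score at pixel (1, 0) is low,
# and a positive radar residual raises it above 0.5 after the sigmoid.
bo = [[-2.0, 0.5], [-3.0, -0.2]]
rsi = [[0.0, 0.0], [4.0, 0.0]]
fused = fuse(bo, rsi)
print([[round(v, 3) for v in row] for row in fused])
```

Under this reading, a zero residual leaves the backbone output unchanged, so the radar branch only intervenes where it has evidence, which matches the “identification-correction” role described above.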

3.1.5. Overall Model Architecture

To address the core operational challenges in lightning monitoring—such as difficulties in fusing multi-source data, strong noise interference, fragmented reconstruction, and missed detection of local severe convection—we proposed the MDE-UNet model. Based on the U-Net skeleton, the model employs a systematic modular design to achieve synergistic enhancement of multi-source information. At the encoding stage, a Channel Attention Block (CA Block) is introduced to adaptively enhance critical satellite and radar channel features related to deep convection while suppressing redundant noise. At the core of the decoding stage, an innovative weighted sliding window MLP block (WWMLP Block) is designed to ensure spatial continuity and physical consistency in reconstruction results by aggregating neighborhood context, effectively eliminating isolated spurious signals. Crucially, the network leverages the inherent skip connections of the U-Net architecture to directly inject encoder-extracted features—rich in spatial detail—into the corresponding levels of the decoder. This fusion mechanism, which combines “abstract semantic guidance with fine-grained detail supplementation,” effectively mitigates the vanishing gradient problem during network training and provides the decoder with precise spatial positional references for high-resolution feature reconstruction. As a result, it significantly enhances the model’s ability to preserve and restore key spatial details—such as the boundaries and morphology of lightning-prone areas—when inverting lightning probability. Furthermore, a Multi-scale Feature Fusion Block (Fusion Block) integrates satellite and radar features at different resolutions, addressing scale decoupling issues in feature transmission and constructing a highly consistent feature field. 
Finally, a dedicated Radar Enhancement Block (RE Block) performs physics-guided local refinement of satellite-dominated preliminary inversion results, effectively recalling missed detection areas. Through hierarchical encoding, adaptive decoding, and cross-modal deep fusion, the model leverages the complementary advantages of multi-source data. It significantly reduces false alarm rates while improving the spatial continuity, physical credibility, and hit rate of lightning area reconstruction, enhancing the model’s robustness and operational applicability in complex weather scenarios. The model architecture is illustrated in Figure 9.

3.2. Loss Function

Lightning monitoring data are extremely sparse and severely imbalanced: positive samples (lightning areas) are scarce. Under the traditional binary cross-entropy loss (BCELoss), gradients are dominated by negative samples, which biases optimization, raises the false negative rate, and makes it difficult to balance missed detections against false alarms in operational use. To address these issues, we propose an asymmetric weighted BCE-Dice composite loss function.

This loss function is specifically designed to align with the model’s optimization objectives. The Dice loss term (Dice Loss) calculates the overlap between predicted and ground truth regions, effectively mitigating class imbalance while constraining the overall spatial continuity and localization accuracy of the regions. Meanwhile, the improved weighted binary cross-entropy term (weighted BCE) assigns higher penalty weights to sparse positive samples, compelling the model to focus more on the challenging-to-learn features of lightning pixels during training.

The weighted fusion of these two components enables the model to not only perform fine-grained pixel-level probability calibration during optimization but also ensure overall spatial consistency between predicted regions and actual lightning areas, thereby prioritizing the avoidance of high-risk missed detection errors. From the perspective of optimization objectives, this design fundamentally addresses a series of challenges, including model learning failure under sparse samples, fragmented prediction regions, and the difficulty of embedding operational risk preferences. It lays the theoretical foundation for the model to ultimately achieve a balance between high detection rates and low false alarm rates, serving as a key factor in enhancing the physical consistency and operational usability of lightning area reconstruction. The formula is as follows:

Loss = λ_{BCE} \cdot WeightedBCE + λ_{Dice} \cdot DiceLoss

(6)

Here, $λ_{BCE}$ and $λ_{Dice}$ are the weighting coefficients for the weighted BCE loss and Dice loss, respectively, used to balance the contributions of the two loss components.

The Dice loss leverages its focus on the intersection of positive samples to correct gradient bias, compelling the model to emphasize sparse lightning pixels. The formula is as follows:

DiceLoss = 1 - \frac{2 \times \sum_{b = 1}^{B} \sum_{h = 1}^{H} \sum_{w = 1}^{W} (p_{b, h, w} \cdot y_{b, h, w}) + 1.0}{\sum_{b = 1}^{B} \sum_{h = 1}^{H} \sum_{w = 1}^{W} p_{b, h, w} + \sum_{b = 1}^{B} \sum_{h = 1}^{H} \sum_{w = 1}^{W} y_{b, h, w} + 1.0}

(7)

Here, $p_{b, h, w}$ represents the predicted probability of the pixel at position (h, w) in the b-th sample, and $y_{b, h, w}$ denotes the ground truth label of the pixel at position (h, w) in the b-th sample.
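As a quick arithmetic check of Equation (7), the Dice loss can be computed directly from flattened probabilities and labels. The sketch below is framework-free and only mirrors the formula; the function name and the toy inputs are assumptions for illustration.

```python
def dice_loss(probs, labels, smooth=1.0):
    """Equation (7) over flat sequences covering all B*H*W pixels.

    `smooth` is the additive constant 1.0 in the numerator and denominator.
    """
    intersection = sum(p * y for p, y in zip(probs, labels))
    total = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + smooth) / (total + smooth)


# Toy example (assumed values): two lightning pixels, two background pixels.
probs = [0.9, 0.8, 0.1, 0.2]
labels = [1, 1, 0, 0]
print(round(dice_loss(probs, labels), 4))
```

For these inputs the intersection is 1.7 and the two sums add up to 4.0, so the loss is 1 - 4.4 / 5.0 = 0.12. Note that correctly predicted background pixels contribute nothing to the intersection, which is why the Dice term is insensitive to the large number of negative samples.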

Additionally, asymmetric weights for false positives and false negatives are introduced to precisely align with operational cost requirements. By combining pixel-level loss preservation and region-level overlap measurement, this approach addresses the issue of gradient dominance by negative samples while balancing pixel accuracy and regional localization precision, thereby ensuring effective learning of sparse lightning features.

The asymmetric weights $ω_{b, h, w}$ and the weighted BCE loss WeightedBCE are formulated as follows:

ω_{b, h, w} = \begin{cases} ω_{FP}, & p_{b, h, w} > 0.5 \text{ and } y_{b, h, w} = 0 \ (\text{FP}) \\ ω_{FN}, & p_{b, h, w} < 0.5 \text{ and } y_{b, h, w} = 1 \ (\text{FN}) \\ 1, & \text{otherwise} \end{cases}

(8)

WeightedBCE = \frac{1}{B \times H \times W} \sum_{b = 1}^{B} \sum_{h = 1}^{H} \sum_{w = 1}^{W} ω_{b, h, w} \cdot [- y_{b, h, w} log (p_{b, h, w}) - (1 - y_{b, h, w}) log (1 - p_{b, h, w})]

(9)

In the above equations, $ω_{FP}$ denotes the penalty weight for false-positive (false alarm) samples, and $ω_{FN}$ denotes the penalty weight for false-negative (missed detection) samples.
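Equations (6) to (9) can be combined into a small reference implementation for checking the arithmetic. This is a sketch, not the training code: the default values $ω_{FP} = 2$, $ω_{FN} = 1.5$, $λ_{BCE} = 0.7$, and $λ_{Dice} = 0.3$ are one reading of the 4:3 and 7:3 ratios reported in this section, and the clamping constant `eps` is an added assumption to keep the logarithm finite.

```python
import math


def asymmetric_weight(p, y, w_fp=2.0, w_fn=1.5):
    """Equation (8): per-pixel weight depending on the error type at threshold 0.5."""
    if p > 0.5 and y == 0:
        return w_fp  # false positive (false alarm)
    if p < 0.5 and y == 1:
        return w_fn  # false negative (missed detection)
    return 1.0


def weighted_bce(probs, labels, w_fp=2.0, w_fn=1.5, eps=1e-7):
    """Equation (9) over flat sequences; `eps` clamps p away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, labels):
        p_c = min(max(p, eps), 1.0 - eps)
        bce = -(y * math.log(p_c) + (1 - y) * math.log(1.0 - p_c))
        total += asymmetric_weight(p, y, w_fp, w_fn) * bce
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """Equation (7)."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    return 1.0 - (2.0 * intersection + smooth) / (sum(probs) + sum(labels) + smooth)


def composite_loss(probs, labels, lam_bce=0.7, lam_dice=0.3):
    """Equation (6): weighted sum of the two terms."""
    return lam_bce * weighted_bce(probs, labels) + lam_dice * dice_loss(probs, labels)


# Toy batch (assumed values): one hit, one false alarm, one miss, one correct rejection.
probs = [0.9, 0.9, 0.1, 0.1]
labels = [1, 0, 1, 0]
print(round(weighted_bce(probs, labels), 4), round(composite_loss(probs, labels), 4))
```

In a differentiable framework the weights in Equation (8) would normally be treated as constants with respect to the gradient, since they depend on a hard threshold of the prediction; whether the original implementation does so is not stated in the text.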

In this paper, based on UNet, we investigate the effect of the loss function with different ratios of $ω_{FP} / ω_{FN}$ (fixing $ω_{FP} = 2$). Based on the Critical Success Index (CSI), the optimal combination (4:3) is selected through comparison, and the results are shown in Table 1. The results indicate that a higher ratio of $ω_{FP} / ω_{FN}$ makes the loss function more inclined to penalize missed reports, thus leading to an increase in the Probability of Detection (POD); however, the False Alarm Ratio (FAR) also increases accordingly.

In addition, based on UNet, we studied the effects of BCELoss and DICELoss loss functions under different ratios (fixing the sum of the ratios to 1). Based on the Critical Success Index (CSI), the optimal combination (7:3) is selected through comparison, and the results are shown in Table 2.

4. Experimental Process and Results

4.1. Experimental Details

The experiments are conducted on an NVIDIA GeForce RTX 3090 (24 GB GDDR6X) with Python 3.12 and PyTorch 1.10+, using PyTorch’s automatic differentiation and optimizer interface for efficient parameter updates and flexible training control. The dataset is split into training (70%), validation (15%), and test (15%) subsets, with the training set randomly shuffled. This ratio ensures sufficient training samples for better generalization while enabling reliable parameter tuning and performance evaluation. The training settings are as follows: the Adam optimizer with an initial learning rate of $1 \times 10^{-4}$; gradient clipping (L2 norm ≤ 1.0, a commonly used threshold) to avoid gradient explosion; and 30 training epochs in total. The model with the best validation CSI is saved; after training, its weights are loaded for a one-time evaluation on the independent test set, and the core metrics are recorded to reflect generalization capability.

To select an optimal probability threshold for binarizing model outputs (a prerequisite for confusion matrix calculation), we evaluated the Critical Success Index (CSI) across a series of candidate thresholds based on the UNet model test. Table 3 presents the CSI values corresponding to different probability thresholds.

As shown in Table 3, the CSI reaches its maximum value (0.4804) at a probability threshold of 0.40. Therefore, we adopted 0.4 as the threshold to binarize the model outputs for confusion matrix calculation.
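The threshold selection described above amounts to a simple sweep: binarize the probabilities at each candidate threshold, compute the CSI, and keep the threshold with the highest value. The sketch below illustrates the procedure on made-up data; the function names, the `>=` binarization rule, and the toy probabilities are assumptions and do not reproduce Table 3.

```python
def csi_at_threshold(probs, labels, threshold):
    """CSI = TP / (TP + FP + FN) after binarizing with p >= threshold."""
    tp = fp = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def best_threshold(probs, labels, candidates):
    """Return the candidate threshold with the highest CSI (first one wins on ties)."""
    return max(candidates, key=lambda t: csi_at_threshold(probs, labels, t))


# Toy data (assumed values, not from the paper).
toy_probs = [0.95, 0.45, 0.42, 0.35, 0.30, 0.10]
toy_labels = [1, 1, 1, 0, 0, 0]
candidates = [0.3, 0.4, 0.5]
for t in candidates:
    print(t, round(csi_at_threshold(toy_probs, toy_labels, t), 4))
print("selected:", best_threshold(toy_probs, toy_labels, candidates))
```

In this toy case a threshold of 0.3 admits two false alarms and 0.5 misses two lightning pixels, so the intermediate value of 0.4 gives the highest CSI, which mirrors the trade-off that the sweep is meant to resolve.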

To enhance the model’s learning capability and address the extreme sparsity of lightning samples, we forgo traditional data augmentation in favor of a two-stage transfer learning strategy combined with targeted sample-level processing. In the pre-training stage, we utilize Gaussian-smoothed labels binarized at a threshold of 0 to generate a comprehensive set of lightning samples. This approach serves as a weak supervision signal, with the explicit goal of maximizing the coverage of potential lightning-affected areas. By providing a broader learning signal, this strategy helps the model rapidly capture prominent lightning signatures and establish fundamental spatial localization patterns, thereby alleviating the challenges posed by severe spatiotemporal sparsity. Subsequently, the pre-trained weights are fine-tuned using high-risk area labels, which are processed with a more stringent binarization threshold of 1, at a reduced learning rate of $1 \times 10^{-5}$. This stage guides the model to transition from coarse localization to precise boundary delineation, optimizing both recognition accuracy and morphological reconstruction. Collectively, these sample-level strategies, alongside the asymmetric loss function design, provide a robust framework for learning from sparse and imbalanced meteorological data.

4.2. Evaluation Metrics

To evaluate the performance of the lightning identification and monitoring model, it is necessary to select targeted evaluation metrics that align with the characteristics of lightning events (such as their sudden onset and imbalanced sample distribution) and with the practical requirements of monitoring scenarios. These metrics should comprehensively quantify the model’s core performance dimensions, including identification accuracy, missed detection risk, false alarm interference, and pixel-level prediction errors. We selected eight metrics to construct a multidimensional, comprehensive model evaluation system: Probability of Detection (POD), False Alarm Ratio (FAR), Critical Success Index (CSI), Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), Classification Accuracy (Accuracy), Area Under the Receiver Operating Characteristic Curve (AUC-ROC), and True Hit Lightning Count (THLC). The definitions, mathematical expressions, and physical meanings of these metrics are described below.

4.2.1. Probability of Detection (POD)

The Probability of Detection (POD), or hit rate, is a core disaster monitoring metric that quantifies a model’s ability to detect real lightning events. It uses a binary confusion matrix:

TP (True Positive): The ground truth is lightning, and the model predicts lightning (correct identification).
FN (False Negative): The ground truth is lightning, but the model predicts non-lightning (missed detection).
FP (False Positive): The ground truth is non-lightning, but the model predicts lightning (false alarm).
TN (True Negative): The ground truth is non-lightning, and the model predicts non-lightning (correct exclusion).

The formula is:

POD = \frac{TP}{TP + FN}

(10)

The value of Probability of Detection (POD) ranges from 0 to 1. A value closer to 1 indicates higher detection sensitivity of the model to real lightning events, a lower probability of missed detection, and provides reliable technical support for lightning disaster warning.

4.2.2. False Alarm Ratio (FAR)

The False Alarm Ratio (FAR) quantifies the proportion of false alarms in model predictions, defined as the fraction of predicted lightning samples that are actually non-lightning. It is a key metric for evaluating a model’s anti-interference capability and the credibility of monitoring results. The formula is

FAR = \frac{FP}{TP + FP}

(11)

FAR ranges from 0 to 1. A value closer to 0 indicates fewer false alarms, stronger discriminative ability for non-lightning events, and higher credibility of monitoring results; conversely, a higher FAR may lead to a large number of ineffective alarms.

4.2.3. Critical Success Index (CSI)

The Critical Success Index (CSI), also known as the Threat Score (TS), is a core comprehensive metric for evaluating the performance of lightning detection models. Because it excludes true negatives (TN), it is not inflated by the large number of correctly rejected non-lightning pixels, which makes it particularly suitable for scenarios with imbalanced sample distributions. The formula is

CSI = \frac{TP}{TP + FP + FN}

(12)

CSI ranges from 0 to 1, with values closer to 1 indicating better model detection performance. By penalizing both missed detections (FN) and false detections (FP), the CSI provides a more accurate reflection of a model’s practical usability than either the Probability of Detection (POD) or the False Alarm Ratio (FAR) alone, and is thus regarded as a benchmark for assessing a model’s overall performance.
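Equations (10) to (12) can be computed from the same confusion-matrix counts, as the following sketch shows. The helper names, the convention of returning 0.0 when a denominator is zero, and the example counts are assumptions made for illustration.

```python
def confusion_counts(preds, labels):
    """Count TP, FP, FN, TN from binary predictions and binary ground truth."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    return tp, fp, fn, tn


def pod(tp, fn):
    """Equation (10): fraction of real lightning pixels that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def far(tp, fp):
    """Equation (11): fraction of predicted lightning pixels that were false alarms."""
    return fp / (tp + fp) if (tp + fp) else 0.0


def csi(tp, fp, fn):
    """Equation (12): hits divided by hits, false alarms, and misses."""
    return tp / (tp + fp + fn) if (tp + fp + fn) else 0.0


# Hypothetical counts: 60 hits, 20 false alarms, 30 misses.
tp, fp, fn = 60, 20, 30
print(round(pod(tp, fn), 4), round(far(tp, fp), 4), round(csi(tp, fp, fn), 4))
```

With these counts POD is about 0.667 and FAR is 0.25, while CSI is about 0.545, lower than POD because it is penalized by both error types at once.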

4.2.4. Structural Similarity Index (SSIM)

The SSIM (Structural Similarity Index) assesses image similarity in structure, luminance, and contrast, quantifying visual consistency and distortion. As a key metric for image quality and model performance evaluation, it supports model threshold tuning, network architecture optimization, and image preprocessing parameter fine-tuning. The formula is

SSIM (x, y) = \frac{(2 μ_{x} μ_{y} + C_{1}) (2 σ_{x y} + C_{2})}{(μ_{x}^{2} + μ_{y}^{2} + C_{1}) (σ_{x}^{2} + σ_{y}^{2} + C_{2})}

(13)

The parameters are defined as follows: $μ_{x}$ and $μ_{y}$ are the pixel means of the reference image and the evaluated image, respectively, reflecting luminance information; $σ_{x}^{2}$ and $σ_{y}^{2}$ are the pixel variances of the two images, reflecting contrast information; $σ_{xy}$ is the pixel covariance between them, reflecting structural correlation information; and $C_{1}$, $C_{2}$ are very small constants introduced to avoid division by zero.
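Equation (13) can be evaluated directly from the means, variances, and covariance of two images. The sketch below computes a single global SSIM value over flattened images; this is a simplification, since SSIM is commonly computed over local windows and then averaged, and the text does not specify the window configuration. The constants $C_{1} = (0.01 L)^{2}$ and $C_{2} = (0.03 L)^{2}$, with $L$ the data range, follow the widely used convention and are an assumption here.

```python
def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Equation (13) using statistics over the whole (flattened) image."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy flattened images with values in [0, 1] (assumed values).
reference = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
shifted = [0.0, 1.0, 1.0, 0.0, 0.0, 1.0]
print(round(global_ssim(reference, reference), 4))
print(round(global_ssim(reference, shifted), 4))
```

An image compared with itself yields a value of 1, and any disagreement in structure lowers the covariance term and therefore the score.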

4.2.5. Accuracy

Accuracy (Classification Accuracy) is a common metric for binary classification, measuring overall model performance. It is the ratio of correctly predicted samples (both true lightning and true non-lightning cases) to all samples, reflecting the model’s ability to distinguish the two classes. The formula is

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

(14)

The value of Accuracy ranges from 0 to 1. A value closer to 1 indicates better overall classification performance of the model.
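Because lightning pixels are rare, Accuracy should be read together with POD and CSI. The short sketch below uses a hypothetical scene to show how a model that never predicts lightning can still obtain a high Accuracy; the counts are invented for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (14): fraction of all pixels that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def pod(tp, fn):
    """Equation (10), repeated here so the example is self-contained."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical scene: 10,000 pixels, 100 of which contain lightning.
# A degenerate model that predicts "no lightning" everywhere gives:
tp, tn, fp, fn = 0, 9900, 0, 100
print(accuracy(tp, tn, fp, fn))  # 0.99
print(pod(tp, fn))               # 0.0
```

The Accuracy of 0.99 here reflects only the dominance of the negative class, whereas the POD of 0 shows that no lightning was detected at all.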

4.2.6. AUC-ROC (Area Under the Receiver Operating Characteristic Curve)

AUC-ROC (Area Under the Receiver Operating Characteristic Curve) is a widely used metric for evaluating binary classification models, especially suitable for imbalanced datasets. It quantifies the model’s capability to distinguish between positive and negative classes across all classification thresholds.

The AUC value is calculated as the area under the ROC curve, which plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at different threshold settings. TPR and FPR are defined as

TPR = \frac{TP}{TP + FN}, FPR = \frac{FP}{FP + TN}

(15)

where TP (True Positive) is the number of correctly predicted positive samples, FN (False Negative) is the number of positive samples incorrectly predicted as negative, FP (False Positive) is the number of negative samples incorrectly predicted as positive, and TN (True Negative) is the number of correctly predicted negative samples.

The mathematical expression for AUC is

AUC = \int_{0}^{1} TPR (FPR) d (FPR)

(16)

AUC ranges from 0 to 1: a value of 1 represents a perfect classifier, while 0.5 means the model has no discriminative ability (equivalent to random guessing).
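Equations (15) and (16) can be approximated numerically by sweeping the threshold over the distinct predicted scores and integrating the resulting ROC curve with the trapezoidal rule. The following is a minimal sketch; the function names and the toy scores are assumptions, and the code requires at least one positive and one negative sample.

```python
def roc_points(probs, labels):
    """Return (FPR, TPR) points from the strictest to the loosest threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(probs), reverse=True):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc_trapezoid(points):
    """Approximate Equation (16) by summing trapezoids between ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy scores (assumed values): three lightning and three non-lightning pixels.
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(round(auc_trapezoid(roc_points(probs, labels)), 4))
```

For this toy example the area is 8/9, which equals the fraction of (lightning, non-lightning) pixel pairs in which the lightning pixel receives the higher score; this pairwise interpretation is a standard property of the AUC.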

4.2.7. THLC (True Hit Lightning Count)

This metric directly quantifies the model’s ability to identify real lightning events. It counts the number of successfully detected lightning flashes, so it is independent of the number of negative samples and clearly reflects the practical detection performance of the model.

4.3. Results

4.3.1. Ablation Study

To investigate the impact of each module on model performance, we progressively removed individual modules and recorded the corresponding performance metrics, as shown in Table 4. The visualization of identification results for each model (part a), along with the residual heatmaps (part b) (i.e., the discrepancy between the model’s identification results and the ground truth, with red representing false positives, blue representing false negatives, and darker shades indicating a greater degree of such errors), is shown in Figure 10. In the ground truth, red represents high-frequency lightning regions, while green denotes sub-high-frequency regions. Table 5 presents the total number of correctly detected lightning occurrences across various visualized sample scenarios for each model.

Based on both quantitative metrics and visualization results, we analyze the lightning region localization and reconstruction performance of all models under different improvement strategies as follows. MDE-UNet achieves the optimal values in core metrics such as POD, CSI, SSIM, THLC, Accuracy and AUC-ROC, while maintaining a relatively low level of FAR, with its overall performance significantly outperforming the baseline UNet and all ablation variants. Specifically, the THLC metric, representing the total number of truly hit lightning strikes, reaches 88,337, the POD is improved to 0.6671, and the CSI stands at 0.5848, fully demonstrating the model’s superiority in lightning detection, regional localization and morphological reconstruction. Ablation experimental results further indicate that removing any component leads to a significant performance degradation, verifying the crucial role of each module in enhancing model accuracy, suppressing false alarms and improving spatial consistency.

4.3.2. Attention Mechanism, Physical Consistency Test and Result Attribution Analysis

To verify the model’s ability to utilize the physical prior information of each channel, this study calculates the statistical metrics (mean and standard deviation) of physical quantities across all channels for regions with correct and incorrect model predictions. The results are presented in Table 6 and Table 7.

As can be observed, distinct differences exist in the physical quantity distributions among different prediction categories. For the radar channel, the TP region presents a remarkably high mean value, while the TN region shows a much lower mean value, which is consistent with the physical characteristics of the observed target. In contrast, the FN and FP regions exhibit intermediate mean values, indicating ambiguous physical features that lead to classification errors. For the infrared channels and combined physical features, similar differentiable patterns can also be observed across TP, TN, FN, and FP regions. Meanwhile, the standard deviation values reflect the fluctuation of physical quantities within each category. These statistical results demonstrate that the model tends to make correct predictions when the input physical features present clear and typical patterns, whereas the fuzzy or intermediate physical characteristics are more likely to cause misjudgment.

To verify the physical guidance effect of the REBlock, this study conducts a visual analysis of the RSI heat map output by the module, as shown in Figure 11. The results show that the numerical range of radar information changes significantly after being processed by the REBlock, and its spatial distribution is highly similar to that of the original radar echo. As can be seen in the visualization results, regions with strong precipitation characteristics in the radar echo can effectively correspond to lightning occurrence areas.

In addition, we employed the SHAP technique to quantify the contribution of each input channel to the model’s prediction results, as presented in Table 8. It can be seen from the table that the Radar channel and Band9 channel rank the top two in both SHAP value and contribution degree, reaching 25.5% and 25.2% respectively, indicating that radar observation data and Band 9 satellite data are the core features driving the model’s prediction. Meanwhile, the combined feature (TBB11-TBB13)-(TBB13-TBB15) also shows a high contribution degree (20.4%), indicating that the physical features constructed based on brightness temperature differences also play an important role in lightning identification, while the TBB13 and TBB15-TBB13 channels have relatively lower contributions, at 14.7% and 14.2% respectively.

The RSI visualization indicates that the REBlock can explicitly model the physical relationship between radar echo and precipitation, which is the key reason why it can improve the lightning identification hit rate. Meanwhile, the false alarm rate of the model increases slightly due to the influence of the radar echo coverage. Combining quantitative experiments and visualization analysis, the REBlock has more advantages than disadvantages overall in the lightning identification task, and can effectively enhance the model’s ability to perceive and identify lightning regions.

4.3.3. Comparison of Multiple Models and Loss Functions

To comprehensively validate the performance of the proposed MDE-UNet, several mainstream segmentation models, including HRNet, SegNet, and BiSeNet, are adopted as baseline methods for comparative experiments. Quantitative evaluation results in terms of POD, FAR, CSI, SSIM, PSNR, Accuracy, and AUC-ROC are listed in Table 9. Figure 12 shows the radar charts of performance metrics for each model. Figure 13 shows the identification results for each model (part a), along with the residual heatmaps (part b), i.e., the discrepancy between the model’s identification results and the ground truth, with red representing false positives, blue representing false negatives, and darker shades indicating a greater degree of such errors. In the ground truth, red represents high-frequency lightning regions, while green denotes sub-high-frequency regions. Table 10 lists the real hit counts of lightning events for each model on the visualized samples. It can be observed that MDE-UNet achieves the best performance across almost all evaluation metrics. Specifically, MDE-UNet obtains the highest POD, CSI, SSIM, THLC, Accuracy, and AUC-ROC values, along with the lowest FAR value, outperforming HRNet, SegNet, and BiSeNet by a clear margin.

The superior performance of MDE-UNet compared with classic segmentation models demonstrates its effectiveness in the lightning identification task. By integrating multi-modal physical information and dedicated structural designs, MDE-UNet not only improves the accuracy and integrity of lightning region detection but also reduces false alarms effectively. The obvious advantages in both meteorological evaluation metrics (POD, CSI, FAR) and image quality metrics (SSIM, PSNR) confirm that the proposed model can better capture the spatial and physical characteristics of lightning. These results indicate that MDE-UNet is more suitable and reliable for lightning identification from multi-modal remote sensing data than conventional segmentation networks.

Finally, to verify the performance of the asymmetric weighted BCE-Dice composite loss function, this study compares the asymmetric weighted BCE-Dice composite loss, the standard BCE-Dice composite loss, and MSE based on the UNet model, with the results shown in Table 11. The asymmetric weighted BCE-Dice Loss outperforms the others in multiple metrics (POD, CSI, SSIM, THLC, Accuracy, AUC-ROC). Although MSE achieves the lowest False Alarm Ratio (FAR), its low hit rate significantly diminishes the value of this advantage. The standard BCE-Dice Loss yields the highest SSIM value, yet its overall performance is inferior to that of the asymmetric weighted BCE-Dice Loss.

5. Discussion

This study proposes an end-to-end MDE-UNet model for pixel-level lightning identification based on the fusion of radar and satellite data. The results demonstrate that multi-scale feature fusion, asymmetric supervision, and physics-guided enhancement strategies not only significantly alleviate the high false alarm rate and class imbalance issues in real-time lightning detection but also effectively improve the hit rate of lightning events. The overall performance exceeds that of multiple mainstream baseline segmentation models and loss functions. In terms of inference efficiency, the single-sample inference time is only 6.1387 ms, a latency low enough to meet the real-time response requirements of time-critical applications such as lightning monitoring and early warning.

5.1. Modules

The CABlock captures and enhances key channel information while suppressing irrelevant noise channels, helping the model improve the hit rate and effectively reduce the false alarm rate.
The WWMLPBlock effectively suppresses isolated false signals that deviate from meteorological principles by strengthening the inherent spatial continuity and physical correlation of lightning-affected areas in the real atmosphere. It also enhances the model’s ability to identify implicit strong lightning signals, improves the sensitivity of the model to capture potential strong lightning features, and significantly improves various comprehensive indicators of the model.
The FusionBlock addresses the physical decoupling problem of multi-source observation features during cross-scale transmission and constructs a multi-scale fusion feature field with clear meteorological consistency. It not only achieves fine-grained characterization of the full life cycle of convective systems but also effectively reduces the false alarm rate of lightning potential in operational tests while significantly improving the hit rate of lightning activity.
The REBlock explicitly models the physical correlation between radar echoes and precipitation. Areas with strong precipitation features in radar echoes correspond well to lightning occurrence areas, which is the key reason why the REBlock improves the lightning identification hit rate. As a side effect, the false alarm rate increases slightly (Table 4: FAR is 0.1742 with the REBlock versus 0.1671 without it). Comprehensive quantitative experiments and visual analysis show that the benefits of the REBlock outweigh this drawback in the lightning identification task and that it effectively enhances the model’s perception and recognition of lightning areas.

5.2. Loss Function Design

The asymmetric weighted BCE-Dice composite loss achieves effective supervision through the weighted combination of binary cross-entropy (BCE) and Dice loss. Its advantages can be summarized as follows:

1. It alleviates the gradient dominance problem in extremely sparse data scenarios and applies asymmetric optimization to the imbalance between the missed detection rate and the false alarm rate, thus achieving a favorable balance between hit rate and false alarm rate.

2. The Dice term is integrated to enhance the spatial consistency of the prediction results.

3. Compared with the mean squared error loss (MSELoss), it is more compatible with the Bernoulli distribution of binary classification tasks:

It is more sensitive to errors on positive samples, which prevents the loss from being diluted by the large number of negative samples;
The Dice term focuses on the spatial structure of regions, overcoming the defect of MSELoss (point-wise calculation only, ignoring overall morphological features).

Overall, the asymmetric weighted BCE-Dice composite loss shows significant advantages in addressing class imbalance, improving spatial prediction accuracy, and matching operational risk preferences.
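A minimal NumPy sketch of a loss with this structure is given below. It is an illustration under stated assumptions, not the authors’ implementation: the exact formulation is the one in Section 3.2, and here it is assumed that ω_FN scales the BCE term on lightning pixels, ω_FP scales the BCE term on non-lightning pixels, and the two losses are mixed by λ_BCE and λ_DICE. The default values 4:3 and 7:3 are the ratios with the highest CSI in Tables 1 and 2; the function name and the smoothing constant `eps` are hypothetical.

```python
import numpy as np

def asymmetric_bce_dice(prob, target, w_fp=4.0, w_fn=3.0,
                        lam_bce=0.7, lam_dice=0.3, eps=1e-7):
    """Weighted BCE plus Dice loss on a probability map (assumed form)."""
    prob = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)

    # BCE with separate penalties for misses and false alarms.
    miss_term = w_fn * target * np.log(prob)
    false_alarm_term = w_fp * (1.0 - target) * np.log(1.0 - prob)
    bce = -np.mean(miss_term + false_alarm_term)

    # Soft Dice loss: region overlap rather than point-wise error.
    intersection = np.sum(prob * target)
    dice = 1.0 - (2.0 * intersection + eps) / (
        np.sum(prob) + np.sum(target) + eps)

    return lam_bce * bce + lam_dice * dice

# Toy example: one lightning pixel among four (hypothetical values).
target = np.array([0.0, 0.0, 0.0, 1.0])
perfect = asymmetric_bce_dice(target, target)
missed = asymmetric_bce_dice(np.array([0.0, 0.0, 0.0, 0.1]), target)
```

In this sketch, increasing `w_fn` raises the cost of the missed pixel, which is the qualitative behavior reported in Table 1, where POD rises as the ω_FP / ω_FN ratio moves from 4:1 to 4:5.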

5.3. Physical Prior

Statistical analysis of the model’s correctly and incorrectly identified areas (Tables 6 and 7) shows that in correctly identified regions (TP, TN), convective features are distinct and standard deviations are small, indicating stable performance. TN in particular has the smallest standard deviation in every channel, confirming the model’s stable recognition of non-lightning areas and its ability to reject false lightning signals. Incorrectly identified areas (FN, FP) are mainly concentrated in intermediate combinations of remote sensing values, where the channel means are very close to each other and the standard deviations are large, making discrimination difficult. This confusion stems from the physical overlap of different weather systems in remote sensing signals, which is an inherent physical interference rather than a model defect. Additionally, each channel’s data has strong physical interpretability, which is further supported by the SHAP feature importance results.
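Per-category statistics of this kind can be obtained by masking each input channel with the four contingency categories. The sketch below is one way to do it, assuming binary prediction and ground-truth masks on the same grid as the channel; the function name and toy values are hypothetical, and the population standard deviation (NumPy’s default) is an assumption, since the estimator used for Table 7 is not stated here.

```python
import numpy as np

def category_stats(channel, pred, truth):
    """Mean and standard deviation of one channel within TP/TN/FN/FP."""
    channel = np.asarray(channel, dtype=float)
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    masks = {
        "TP": pred & truth,
        "TN": ~pred & ~truth,
        "FN": ~pred & truth,
        "FP": pred & ~truth,
    }
    stats = {}
    for name, mask in masks.items():
        values = channel[mask]
        if values.size:
            stats[name] = (float(values.mean()), float(values.std()))
        else:
            # No pixel falls in this category.
            stats[name] = (float("nan"), float("nan"))
    return stats

# Toy radar values for six pixels (hypothetical, not data from the paper).
radar = np.array([45.0, 40.0, 5.0, 2.0, 30.0, 28.0])
pred = np.array([1, 1, 0, 0, 0, 1])
truth = np.array([1, 1, 0, 0, 1, 0])
stats = category_stats(radar, pred, truth)
```

Applying the same function to every input channel yields one row of means and one row of standard deviations per channel, in the layout of Tables 6 and 7.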

First, the radar echo channel data exhibits clear physical regularity, with TP much higher than TN, consistent with the basic meteorological principle that lightning is often accompanied by severe convective storms. Strong radar echoes correspond to large hydrometeor particles in deep convective clouds. SHAP analysis shows that the radar channel contributes the most to model prediction (25.5%) and plays an important role in lightning discrimination.

Second, Band9 channel data follows physical laws: TP is the lowest and TN the highest among the four categories, showing an obvious negative trend. This corresponds to the mechanism by which lightning occurs near convective cloud tops with extremely low temperatures and a dry–cold upper atmosphere. SHAP analysis shows its contribution rate is 25.2%, nearly equal to the radar channel, indicating strong discriminative power. TBB13 channel results also fit physical expectations: TP is the lowest, and TN is the highest, confirming the close relation between lightning and cold cloud tops. FP is higher than TP but distinct from TN, while FN is close to TP. This suggests that missed detections mainly appear in cold cloud-top convection without lightning, and false alarms may come from ordinary convective cells. The ambiguity arises from the non-strict correspondence between cloud-top temperature and lightning. TBB13 has a SHAP contribution of 14.7%, ranking fourth, with moderate but reliable discrimination ability.

Third, the mean values of the TBB15-TBB13 channel are all negative, consistent with the microphysical characteristics of lightning clouds dominated by ice-phase particles. TP is clearly negative (−4.83), FP is even more negative (−4.97), while FN is the least negative of the four categories (−4.76), that is, relatively warm. This reveals the channel’s limitation: strong ice-phase signatures do not guarantee lightning. For instance, heavy snow clouds may cause false alarms, while early convective systems with insufficient ice crystals lead to missed detections. The small numerical differences between categories indicate weak discriminative ability when the channel is used alone, consistent with its lowest SHAP contribution of 14.2%. Prediction errors stem from the complex nonlinear relationship between cloud microphysics and lightning, which is an inherent physical constraint.

Fourth, the (TBB11-TBB13)-(TBB13-TBB15) channel shows the best statistical separability. TP is significantly lower than the other three categories, reflecting stronger vertical development and more complex mixed-phase structures in lightning clouds. Although FP exists, it is still distinguishable from TN, meaning false-alarm clouds have high but weaker structure complexity than real lightning clouds. FN is close to FP, suggesting missed-alarm systems already resemble lightning clouds in macrostructure but lack key discharge triggers. The small FP deviation indicates the model is robust to this feature. SHAP analysis confirms its strong ability to capture cloud structure complexity, with a contribution rate of 20.4%.
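The contribution percentages quoted in this section are consistent with normalizing each channel’s SHAP value in Table 8 by the sum over all five channels. The short check below illustrates that reading; the normalization rule is inferred from the table rather than stated by the authors, and the variable names are hypothetical.

```python
# SHAP values copied from Table 8.
shap_values = {
    "Radar": 0.033395,
    "Band9": 0.033065,
    "TBB13": 0.019255,
    "TBB15-TBB13": 0.018612,
    "(TBB11-TBB13)-(TBB13-TBB15)": 0.026714,
}

total = sum(shap_values.values())

# Share of each channel in the total, as a percentage with one decimal.
contribution = {name: round(100.0 * value / total, 1)
                for name, value in shap_values.items()}

# Channels ordered from most to least influential.
ranking = sorted(contribution, key=contribution.get, reverse=True)
```

With these inputs the computed shares reproduce the Contribution Degree column of Table 8 (25.5, 25.2, 14.7, 14.2, and 20.4), and TBB13 ranks fourth, as stated above.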

5.4. Limitations

Owing to limitations in data and equipment, this study has the following shortcomings:

Only lightning identification in a single fixed area was studied, and no other physical field features were added. As a result, region-specific characteristics may be learned implicitly by the model, which could limit its spatial generalization ability.
Temporal interpolation was applied to the radar data, which inevitably introduces artifacts. Moreover, the downsampling performed for spatial alignment leads to information loss, which affects the model’s identification performance.
The loss function hyperparameters were determined with a fixed-parameter plus grid search strategy, which cannot guarantee that the selected combination is optimal; better hyperparameter combinations may exist.
The binarization threshold was selected using a grid search, which does not guarantee that the selected threshold is optimal, and there may be better thresholds.
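The threshold selection referred to in the last item can be sketched as a search over candidate probability thresholds that keeps the one with the highest CSI. The example below uses the candidate grid of Table 3; the function names and toy data are hypothetical, and in practice the search would normally be run on a validation split rather than on the test data.

```python
import numpy as np

def csi_score(pred, truth):
    """Critical Success Index for two boolean masks."""
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    denom = tp + fp + fn
    return float(tp / denom) if denom else 0.0

def best_threshold(prob, truth, thresholds):
    """Return (threshold, CSI) with the highest CSI on the grid.
    Ties are resolved in favor of the larger threshold."""
    prob = np.asarray(prob, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    scores = [(csi_score(prob >= t, truth), float(t)) for t in thresholds]
    best_csi, best_t = max(scores)
    return best_t, best_csi

# Candidate thresholds as listed in Table 3.
grid = [0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55]

# Toy probabilities and labels (hypothetical, not data from the paper).
prob = np.array([0.9, 0.42, 0.38, 0.2, 0.1])
truth = np.array([1, 1, 0, 0, 0])
threshold, csi = best_threshold(prob, truth, grid)
```

A grid of this kind only evaluates the listed candidates, which is why the selected threshold is not guaranteed to be optimal.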

5.5. Future Work

Future work may add more physical field features (such as sounding, terrain, and wind field data), design independent encoding modules to extract features from higher-resolution radar data, and introduce physical information from historical frames so that interpolation can be avoided and the high spatio-temporal resolution of the radar data can be fully used. Adaptive loss functions and dynamic threshold strategies will be introduced to optimize the model automatically in different scenarios, and the model will be connected to real-time observation data streams to promote the operational deployment of end-to-end lightning monitoring and early warning systems.

6. Conclusions

To address the key bottlenecks in pixel-level lightning identification (imbalance between the False Alarm Ratio (FAR) and Probability of Detection (POD), class imbalance, poor multi-source fusion, and insufficient physical interpretability), this study proposes an end-to-end MDE-UNet model based on radar and satellite data fusion. The main conclusions are summarized as follows: The MDE-UNet model, integrating multi-scale feature fusion, asymmetric supervision, and physics-guided enhancement strategies, achieves high-precision and low-latency lightning identification. It outperforms mainstream baseline models and loss functions, alleviates core detection problems, and its single-sample inference time (6.1387 ms) meets the real-time requirements of lightning monitoring and early warning.

The designed physics-guided modules (CABlock, WWMLPBlock, FusionBlock, REBlock) and the asymmetric weighted BCE-Dice composite loss function jointly improve model performance and interpretability. They effectively suppress noise, strengthen physical correlations, resolve multi-source feature decoupling, and address gradient dominance under sparse data, in line with practical operational needs.

Statistical analysis confirms the model’s strong physical interpretability, with channel performance consistent with lightning meteorological principles; model misidentifications (FN, FP) result from inherent physical signal overlap rather than model defects, ensuring reliability.

This study clarifies its limitations (single-region research, data preprocessing losses, possibly suboptimal parameters) and proposes future directions (multi-region datasets, more physical features, adaptive strategies, operational deployment). In summary, it provides a new algorithm for multi-source remote sensing-based lightning identification, with scientific and practical value for atmospheric remote sensing and lightning warning.

Author Contributions

Conceptualization, Y.L., L.S., X.W., Q.Z. and Y.C.; methodology, Y.C. and Y.H.; software, Y.C., Y.H., J.W. and Y.Z.; validation, Y.C. and Y.H.; formal analysis, Y.C. and Y.H.; investigation, Y.C. and Y.H.; resources, Y.C. and Y.Z.; data curation, Y.C., Y.L. and Y.Z.; writing—original draft preparation, Y.C.; writing—review and editing, Y.C. and Y.L.; visualization, Y.C.; supervision, Y.L.; project administration, Y.L.; funding acquisition, Y.L. and L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Young Scientists Fund of the National Natural Science Foundation of China, grant number 42405148; the Startup Foundation for Introducing Talent of Nanjing University of Information Science and Technology, grant number 2023r124; the Open Research Project of the Key Laboratory of Lightning, China Meteorological Administration, grant number 2024KELL-B013; the Natural Science Foundation of Shandong Province, China, grant number ZR2023MD012; the Scientific and Technological Research Projects of Guangdong Meteorological Bureau, grant number GRMC2022Z06; and the State Key Laboratory of Climate System Prediction and Risk Management.

Data Availability Statement

The dataset (mainly radar data) used in this study is subject to a confidentiality agreement and therefore cannot currently be made publicly available.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Zhang, W.; Meng, Q.; Ma, M.; Zhang, Y. The lightning casualties and damages in china from 1997 to 2009. Nat. Hazards 2011, 57, 465–476. [Google Scholar] [CrossRef]
Reis, T.P.d.; Raizer, A. Modeling and simulation of distribution networks under lightning transients: A comparative study of accuracy and complexity. Energies 2024, 17, 337. [Google Scholar] [CrossRef]
Song, Y.; Xu, C.; Li, X.; Oppong, F. Lightning-induced wildfires: An overview. Fire 2024, 7, 79. [Google Scholar] [CrossRef]
Guryanov, V.V.; Mikhailov, R.P.; Eliseev, A.V. Present–day and future lightning frequency as simulated by four cmip6 models. Pure Appl. Geophys. 2024, 181, 3351–3374. [Google Scholar] [CrossRef]
Pan, Y.; Zheng, D.; Zhang, Y.; Yao, W.; Zhang, Y.; Fang, X.; Lyu, W.; Zhang, W. Significant location accuracy changes resulting from lightning detection networks deployed on inclined terrains. Remote Sens. 2023, 15, 5733. [Google Scholar] [CrossRef]
Li, X.; Yang, L.; Yin, Q.; Yang, Z.; Zhou, F. A lightning warning method using atmospheric electric field based on EEWT-ASG and morpho. Atmosphere 2023, 14, 1002. [Google Scholar] [CrossRef]
Ni, Y.; Liu, S.; Guo, T.; Xia, M. TiBT-Net: A High-Resolution Remote Sensing Image Change Detection Network Integrating Bi-Temporal Space Enhancement and Token Interaction. Remote Sens. 2026, 18, 805. [Google Scholar] [CrossRef]
Ren, Z.; Weng, L.; Xia, M.; Lin, H. MCINet: Multi-attentive cross-level interaction network for cloud and snow segmentation. J. Appl. Remote Sens. 2026, 20, 021404. [Google Scholar] [CrossRef]
Yin, H.; Wang, J.; Liu, S.; Wang, Y.; Liu, Y.; Guo, T.; Xia, M. MISA-Net: Multi-Scale Interaction and Supervised Attention Network for Remote-Sensing Image Change Detection. Remote Sens. 2026, 18, 376. [Google Scholar] [CrossRef]
Qian, Z.; Wang, D.; Shi, X.; Yao, J.; Hu, L.; Yang, H.; Ni, Y. Lightning identification method based on deep learning. Atmosphere 2022, 13, 2112. [Google Scholar] [CrossRef]
Song, G.; Li, S.; Xing, J. Lightning nowcasting with aerosol-informed machine learning and satellite-enriched dataset. npj Clim. Atmos. Sci. 2023, 6, 126. [Google Scholar] [CrossRef]
Zhang, Y.; Cao, D.; Yang, J.; Lu, F.; Wang, D.; Liu, R.; Zhang, H.; Liu, D.; Chen, Z.; Lyu, H.; et al. A parallax shift effect correction based on cloud top height for FY-4A lightning mapping imager (LMI). Remote Sens. 2023, 15, 4856. [Google Scholar] [CrossRef]
Chung, E.-S.; Sohn, B.-J.; Schmetz, J. CloudSat shedding new light on high-reaching tropical deep convection observed with Meteosat. Geophys. Res. Lett. 2008, 35, 2. [Google Scholar] [CrossRef]
Iwasaki, S.; Shibata, T.; Nakamoto, J.; Okamoto, H.; Ishimoto, H.; Kubota, H. Characteristics of deep convection measured by using the A-train constellation. J. Geophys. Res. Atmos. 2010, 115, D6. [Google Scholar] [CrossRef]
Poland, M.P.; Lopez, T.; Wright, R.; Pavolonis, M.J. Forecasting, detecting, and tracking volcanic eruptions from space. Remote Sens. Earth Syst. Sci. 2020, 3, 55–94. [Google Scholar] [CrossRef]
Jumianti, N.; Marzuki, M.; Yusnaini, H.; Ramadhan, R.; Harjupa, W.; Saufina, E.; Nauval, F.; Risyanto, R.; Sakti, A.D.; Abdillah, M.R.; et al. Prediction of extreme rain in kototabang using himawari-8 satellite based on differences in cloud brightness temperature. Remote Sens. Appl. Soc. Environ. 2024, 33, 101102. [Google Scholar] [CrossRef]
Lee, S.H.; Suh, M.S. Lightning detection using GEO-KOMPSAT-2A/advanced meteorological imager and ground-based lightning observation sensor linet data. Remote Sens. 2024, 16, 4243. [Google Scholar] [CrossRef]
Lu, M.; Dong, T.; Chen, M.; Yu, M.; Liu, H.; He, C.; Zhang, J.; Mao, Y. Lightning identification based on multiple weather radar product data. Int. J. Digit. Earth 2025, 18, 2498604. [Google Scholar] [CrossRef]
Ye, H.; Wang, Y. Residual transformer yolo for detecting multi-scale crowded pedestrian. Appl. Sci. 2023, 13, 12032. [Google Scholar] [CrossRef]
Jiao, L.; Abdullah, M.I. Yolo series algorithms in object detection of unmanned aerial vehicles: A survey. Serv. Oriented Comput. Appl. 2024, 18, 269–298. [Google Scholar] [CrossRef]
Zheng, D.; Fan, P.; Zhang, Y.; Yao, W.; Fang, X.; Ran, R. Deep convective clouds observed by ground-based radar over Naqu, Qinghai–Tibet Plateau. Atmos. Res. 2023, 293, 106930. [Google Scholar] [CrossRef]
Wang, F.; Zhang, Y.; Deng, X.; Liu, H.; Dong, W.; Yao, W. Characteristics of regions with high-density initiation of flashes in mesoscale convective systems. Remote Sens. 2022, 14, 1193. [Google Scholar] [CrossRef]
Liu, Y.; Wang, J.; Song, Y.; Liang, S.; Xia, M.; Zhang, Q. Lightning nowcasting based on high-density area and extrapolation utilizing long-range lightning location data. Atmos. Res. 2025, 321, 108070. [Google Scholar] [CrossRef]
Zhang, Z. Identification method and analysis on lightning flash initiation phase and size. J. Appl. Meteorol. Sci. 2017, 28, 2414. [Google Scholar]
Coy, J.J.; Bell, A.; Yang, P.; Wu, D.L. Sensitivity analyses for the retrievals of ice cloud properties from radiometric and polarimetric measurements in sub-mm/mm and infrared bands. J. Geophys. Res. Atmos. 2020, 125, e2019JD031422. [Google Scholar] [CrossRef]
Zhang, X.; Yin, Y.; Kukulies, J.; Li, Y.; Kuang, X.; He, C.; Lapierre, J.L.; Jiang, D.; Chen, J. Revisiting lightning activity and parameterization using geostationary satellite observations. Remote Sens. 2021, 13, 3866. [Google Scholar] [CrossRef]
Zhang, H.; Deng, Y.; Wang, Y.; Lan, L.; Wen, X.; Fang, C.; Xu, J. Extraction of factors strongly correlated with lightning activity based on remote sensing information. Remote Sens. 2024, 16, 1921. [Google Scholar] [CrossRef]
Sun, J.; Xiao, Y.; Li, Y.; Du, M.; Fu, Z.; Leng, L.; Cai, R.; Wu, H. Lightning activity and microphysical structure characteristics during the convective cell mergers in an extreme mesoscale convective system. Atmos. Res. 2024, 301, 107266. [Google Scholar] [CrossRef]
Zhao, Z.; Duan, C.; Song, L.; Zhang, Q.; Zhu, W.; Liu, Y. Misspred: A robust two-stage radar echo extrapolation algorithm for incomplete sequences. Remote Sens. 2025, 17, 2066. [Google Scholar] [CrossRef]
Yang, Z.; Qi, Y.; Zhang, Z.; Li, D. Can CINRAD Radar With VCP-21 Mode Capture the Accumulated Rainfall Pattern and Intensity of Fast-Moving Storms? IEEE Trans. Geosci. Remote Sens. 2023, 62, 4100813. [Google Scholar] [CrossRef]
Tian, J.; Qiu, Q.; Zhao, X.; Mu, W.; Cui, X.; Hu, C.; Kang, Y.; Tu, Y. Application of variational optical flow forecasting technique based on precipitation spectral decomposition to three case studies of heavy precipitation events during rainy season in Hebei Province. Water 2023, 15, 2204. [Google Scholar] [CrossRef]
Wang, J.; Wang, Z.; Ye, J.; Lai, A.; Ma, H.; Zhang, W. Technical Evaluation of Precipitation Forecast by Blending Weather Radar Based on New Spatial Test Method. Remote Sens. 2023, 15, 3134. [Google Scholar] [CrossRef]
Xun, L.; Yao, Q.; Ji, F.; Li, T.; Zhang, J.; Miao, K.; Yan, Q. PPNet: A more effective method of precipitation prediction. Meteorol. Appl. 2022, 29, e2081. [Google Scholar] [CrossRef]
Wang, L.; Dong, Y.; Zhang, C.; Heng, Z. Extreme and severe convective weather disasters: A dual-polarization radar nowcasting method based on physical constraints and a deep neural network model. Atmos. Res. 2023, 289, 106750. [Google Scholar] [CrossRef]
Li, J.; Shi, Y.; Zhang, T.; Li, Z.; Wang, C.; Liu, J. Radar precipitation nowcasting based on ConvLSTM model in a small watershed in north China. Nat. Hazards 2024, 120, 63–85. [Google Scholar] [CrossRef]
Zhao, Z.; Wang, Z.; Zhao, G.; Zhao, J. A new strong convective precipitation forecasting method based on attention mechanism and spatio-temporal reasoning. Sci. Rep. 2024, 14, 19024. [Google Scholar] [CrossRef] [PubMed]
Li, Z.; Zhang, H.; Hu, J.; Zhang, Y.; Yao, J. Research on the correlation between meteorological radar echo characteristics and lightning warning technology. Atmos. Ocean. Sci. Lett. 2025, 19, 100649. [Google Scholar] [CrossRef]
Dovgalyuk, Y.A.; Veremei, N.E.; Sin’kevich, A.A.; Mikhailovskii, Y.P.; Toropova, M.L.; Popov, V.B.; Lu, J.; Yang, J. Investigation of electrification mechanisms and relationship between the electrical discharge frequency and radar characteristics of the thunderstorm in China. Russ. Meteorol. Hydrol. 2020, 45, 712–719. [Google Scholar] [CrossRef]
Sin’kevich, A.A.; Kurov, A.B.; Mikhailovskii, Y.P.; Toropova, M.L.; Veremei, N.E. Study of thundercloud characteristics in northwest Russia using neural networks. Atmos. Ocean. Opt. 2023, 36, 137–143. [Google Scholar] [CrossRef]
Niu, D.; Li, Y.; Wang, H.; Zang, Z.; Jiang, M.; Chen, X.; Huang, Q. Fsrgan: A satellite and radar-based fusion prediction network for precipitation nowcasting. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 7002–7013. [Google Scholar] [CrossRef]
Pei, Y.; Li, Q.; Wu, Y.; Peng, X.; Guo, S.; Ye, C.; Wang, T. Mafnet: Multimodal asymmetric fusion network for radar echo extrapolation. Remote Sens. 2024, 16, 3597. [Google Scholar] [CrossRef]
Peng, W.; Bao, S.; Yang, K.; Wei, J.; Zhu, X.; Qiao, Z.; Wang, Y.; Li, Q. Radar quantitative precipitation estimation algorithm based on precipitation classification and dynamical zr relationship. Water 2022, 14, 3436. [Google Scholar] [CrossRef]
He, X.; Zhou, Z.; Zhang, W.; Zhao, X.; Chen, H.; Chen, S.; Bai, L. Diffsr: Learning radar reflectivity synthesis via diffusion model from satellite observations. In Proceedings of the ICASSP 2025—2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; IEEE: New York, NY, USA, 2025; pp. 1–5. [Google Scholar]

Figure 1. Schematic diagram of the study area in this paper.

Figure 2. Lightning data processing flowchart.

Figure 3. Visualization of lightning regions and input channels.

Figure 4. CA block structure diagram.

Figure 5. WWMLP block structure diagram.

Figure 6. Fusion block structure diagram.

Figure 7. Lightning point radar echo overlay map.

Figure 8. RE block structure diagram.

Figure 9. A diagram of the model’s overall architecture.

Figure 10. (a) Performance results of each model; (b) Corresponding residual heat maps.

Figure 11. Visualization heatmaps of radar, RSI, and lightning areas.

Figure 12. Radar chart comparing performance of different models.

Figure 13. (a) Performance results of each model; (b) Corresponding residual heat maps.

Table 1. Effect of different $\omega_{FP} / \omega_{FN}$ ratios on POD, FAR and CSI.

$\omega_{FP} / \omega_{FN}$	4:1	4:2	4:3	4:4	4:5
POD	0.4914	0.5426	0.5884	0.5936	0.6259
FAR	0.2431	0.2632	0.2765	0.3006	0.3719
CSI	0.4244	0.4545	0.4804	0.4729	0.4567

Table 2. Effect of different $\lambda_{BCE} : \lambda_{DICE}$ ratios on CSI.

$\lambda_{BCE} : \lambda_{DICE}$	5:5	6:4	7:3	8:2	9:1
CSI	0.4569	0.4725	0.4804	0.4603	0.4447

Table 3. CSI Values at different probability thresholds (based on the UNet baseline).

Probability Threshold	0.25	0.30	0.35	0.40	0.45	0.50	0.55
CSI Value	0.4632	0.4701	0.4729	0.4804	0.4668	0.4594	0.4468

Table 4. Table of performance metrics for each model.

Model	POD ↑	FAR ↓	CSI ↑	SSIM ↑	THLC ↑	Accuracy ↑	AUC-ROC ↑
UNet	0.5884	0.2765	0.4804	0.8386	85,052	0.9894	0.9907
MDE-UNet	0.6671	0.1742	0.5848	0.8685	88,337	0.9921	0.9939
w/o CABlock	0.6596	0.2087	0.5619	0.8597	88,283	0.9914	0.9913
w/o REBlock	0.6062	0.1671	0.5404	0.8667	85,982	0.9915	0.9911
w/o FusionBlock	0.6091	0.1870	0.5343	0.8616	86,299	0.9912	0.9909
w/o WWMLPBlock	0.6175	0.2385	0.5175	0.8481	86,027	0.9902	0.9908

Notes: ↑ indicates higher values correspond to better model performance; ↓ indicates lower values correspond to better model performance. Bold values represent the best results in each column.

Table 5. THLC1.

Row	UNet	w/o CA	w/o RE	w/o Fusion	w/o WWMLP	MDE-UNet
Row 1	449.0	467.0	461.0	456.0	459.0	470.0
Row 2	193.0	181.0	182.0	181.0	185.0	192.0
Row 3	63.0	65.0	63.0	65.0	66.0	68.0

Table 6. Mean values of physical quantities for different categories.

Channel	TP	TN	FN	FP
Radar	43.798956	3.336027	28.425636	28.930815
Band9	254.211831	266.989961	257.406672	261.036003
TBB13	232.18226	237.927041	233.700711	235.747180
TBB15-TBB13	−4.834220	−5.297829	−4.760092	−4.968166
(TBB11-TBB13)-(TBB13-TBB15)	−6.436632	−5.436156	−5.875391	−6.648390

Table 7. Standard deviations of physical quantities for different categories.

Channel	TP	TN	FN	FP
Radar	10.725694	8.882650	17.925308	14.599966
Band9	30.305019	26.620143	29.401424	30.305019
TBB13	13.523064	12.137142	14.264807	14.496424
TBB15-TBB13	2.753154	2.525695	2.750731	2.810131
(TBB11-TBB13)-(TBB13-TBB15)	3.005805	2.782921	3.061888	3.090387

Table 8. SHAP value and degree of contribution of each channel.

Channel	SHAP Value	Contribution Degree (%)
Radar	0.033395	25.5
Band9	0.033065	25.2
TBB13	0.019255	14.7
TBB15-TBB13	0.018612	14.2
(TBB11-TBB13)-(TBB13-TBB15)	0.026714	20.4

Table 9. Performance metrics of different models.

Model	POD	FAR	CSI	SSIM	THLC	Accuracy	AUC-ROC
MDE-UNet	0.6671	0.1742	0.5848	0.8685	88,337	0.9921	0.9939
HRNet	0.6059	0.3290	0.4671	0.8221	87,198	0.9885	0.9880
SegNet	0.5149	0.3200	0.4144	0.8241	79,664	0.9879	0.9771
BiSeNet	0.5591	0.2279	0.4799	0.8322	83,838	0.9899	0.9901

Bold values: The best performance in each column.

Table 10. Comparison of different models.

Method	MDE-UNet	HRNet	SegNet	BiSeNet
Row 1	470.0	455.0	461.0	438.0
Row 2	192.0	186.0	172.0	187.0
Row 3	68.0	67.0	67.0	53.0

Table 11. Performance comparison of different loss functions tested on UNet.

Loss Function	POD	FAR	CSI	SSIM	THLC	Accuracy	AUC-ROC
Asymmetric weighted BCE-Dice Loss	0.5884	0.2765	0.4804	0.8386	85,052	0.9894	0.9901
BCE-Dice Loss	0.4749	0.2233	0.4179	0.8447	68,642	0.9890	0.9899
MSE	0.4268	0.1998	0.3857	0.5401	61,509	0.9887	0.9875

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

