1. Introduction
The frequency of sudden and localized heavy rainfall events is increasing due to climate change [1]. Radar-based nowcasting, which is more accurate than numerical forecasting models at lead times of less than 3 h, is valuable for issuing early flood warnings. Generally, very-short-term rainfall predictions are produced by radar-based extrapolation and advection techniques. The Korea Meteorological Administration currently employs the McGill Algorithm for Precipitation Nowcasting by Lagrangian Extrapolation. However, because the agency has secured access to long-term radar observation data and established sufficient computing resources, radar-based rainfall prediction using deep learning (e.g., recurrent neural networks, convolutional neural networks (CNNs), and convolutional long short-term memory (ConvLSTM)) has recently been expanding. Studies using ConvLSTM have been conducted in South Korea [2,3,4,5,6,7,8]. As mentioned in previous studies, CNN-based nowcasting models tend to outperform extrapolation-based forecasts. However, as the forecast lead time increases, spatial smoothing becomes substantial, making it difficult to predict distinct, high-intensity precipitation features and distorting the small-scale weather phenomena that are important for improving forecast accuracy [3,6,8]. Additionally, existing methods based on deterministic forecasts of rainfall movement and location over the entire precipitation field are limited in applicability because it is difficult to make consistent forecasts that account for spatio-temporal complexity. For this reason, probabilistic forecasts are known to have higher economic and decision-making value than deterministic forecasts [9,10,11]. Ravuri et al. (2021) developed a deep generative model of radar (DGMR) based on generative adversarial networks (GANs) for probabilistic, radar-based very-short-term rainfall prediction. DGMR can also be described as a statistical model that learns the probability distribution of the data and can readily generate samples from the learned distribution. Moreover, DGMR trained on UK Met Office radar data and run at lead times of 5–90 min showed improved accuracy compared with PySTEPS, an existing rainfall prediction model, and U-Net, a CNN-based deep learning model.
In this study, we employed four deep learning models, each with a distinct approach to rainfall prediction: RainNet, which specializes in precipitation prediction; ConvLSTM2D U-Net, which incorporates convolutional layers into traditional LSTM networks; a U-Net-based recursive model, which uses a recursive prediction strategy; and a generative adversarial network, which is designed to generate realistic rainfall patterns. Each model was applied to Korean radar rainfall data provided by the Ministry of Environment and evaluated for very-short-term forecasts of up to one hour. To ensure a balanced comparison, all four models were trained and assessed using the same dataset.
2. Materials and Methods
This study uses four kinds of deep learning-based nowcasting models.
Table 1 provides a summary and comparison of the different network architectures. The details of each model are explained in each section.
2.1. RainNet
In this study, RainNet, a convolutional deep neural network prediction model with a U-Net structure, was used as the base model [8]. RainNet has been applied in Korea to radar data from the Korea Meteorological Administration, and its predictive applicability has been evaluated [6]. The neural network structure used in RainNet is based on U-Net and SegNet, which have encoder–decoder structures with skip connections between branches [13,14]. In this architecture, the encoder progressively reduces the spatial resolution through pooling, with each pooling step followed by convolutional layers. The decoder uses upsampling to gradually restore the learned patterns to a higher spatial resolution, with each upsampling step followed by convolutional layers. To ensure semantic connectivity between features across layers, skip connections link the encoder to the decoder; such connections were proposed to avoid the vanishing-gradient problem [15]. The model uses up to 512 convolutional filters per layer, kernel sizes of 1 × 1 and 3 × 3, and rectified linear unit activation functions for the convolutional layers, following the existing domestic RainNet studies [6,16].
As input, the RainNet model takes four sets of radar-generated gridded rainfall data (Observation at times T − 3, T − 2, T − 1, T), observed up to 30 min in the past at 10-min intervals at prediction time T. It performs a prediction (Predict T + 1) for the next 10 min and learns to minimize the error by comparing it with the observed radar-generated gridded rainfall data (Observation T + 1). Therefore, the pretrained model is optimized for a 10-min forecast.
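The input and target arrangement described above can be illustrated with a short sketch. It assumes the radar fields are already stacked in a NumPy array of shape (time, height, width) at 10-min spacing; the function name `make_samples` and the channels-last layout are choices made for this illustration, not details taken from the study.

```python
import numpy as np

def make_samples(frames, n_inputs=4):
    """Split a (time, height, width) radar sequence into (input, target) pairs.

    Each input stacks the frames at T-3, T-2, T-1 and T along the last axis;
    the matching target is the frame at T+1, one 10-min step ahead.
    """
    frames = np.asarray(frames, dtype=np.float32)
    n_samples = frames.shape[0] - n_inputs
    if n_samples < 1:
        raise ValueError("need at least n_inputs + 1 frames")
    # One window of n_inputs consecutive frames per sample.
    x = np.stack([frames[i:i + n_inputs] for i in range(n_samples)])
    x = np.moveaxis(x, 1, -1)                # (samples, H, W, n_inputs)
    y = frames[n_inputs:, ..., np.newaxis]   # (samples, H, W, 1)
    return x, y

# Toy sequence: 6 frames on a 2 x 2 grid, frame k filled with the value k.
seq = np.arange(6, dtype=np.float32)[:, None, None] * np.ones((2, 2), dtype=np.float32)
x, y = make_samples(seq)
print(x.shape, y.shape)
```

With six frames, the sketch yields two samples: frames 0 to 3 paired with frame 4, and frames 1 to 4 paired with frame 5.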
To train RainNet, mean absolute error (MAE) was used as the loss function. Nadam (Nesterov-accelerated Adaptive Moment Estimation) was used to update the parameters, and the learning rates of the Nadam optimizer, beta_1, and beta_2 were set to 0.0001, 0.9, and 0.999, respectively. The training of the RainNet model was initially configured with 200 epochs and batch size 32, and the loss function was minimized at 26 epochs using early stopping. The RainNet in this study used the Keras framework, and training was performed on Dual GPU (NVIDIA RTX A6000).
2.2. ConvLSTM2D U-Net
The ConvLSTM2D U-Net model integrates the U-Net architecture with the ConvLSTM2D structure to predict rainfall by considering the temporal continuity of radar image data [7]. In this context, the U-Net comprises a contracting path for capturing global image features and an expanding path for precise localization, thereby forming a symmetrical U-shaped network. The ConvLSTM2D structure captures spatiotemporal correlations by using convolutions in both the input-to-state and state-to-state transitions. The model’s architecture is depicted in Figure 1. The rationale for incorporating ConvLSTM2D into the U-Net structure is that a convolutional layer computes its filters over the stacked input frames in much the same way as a dense layer, which may obscure the temporal order of the time series. Furthermore, in a change from the original RainNet, we opted to use Conv2DTranspose instead of an upsampling layer. Conv2DTranspose performs a convolutional operation with a trained filter to enhance resolution, as opposed to traditional upsampling layers, which interpolate lower-resolution data. Additionally, we employed SpatialDropout2D at the locations where RainNet applies dropout. SpatialDropout2D drops entire two-dimensional feature maps, which helps prevent overfitting. The activation function used during training was the exponential linear unit. Notably, a linear bottleneck structure was implemented for the layers with 256 and 512 filters to reduce the number of parameters. This bottleneck structure reduces dimensionality using a 1 × 1 convolution, increases dimensionality using a 3 × 3 convolution, and expands dimensionality once more with a final 1 × 1 convolution layer. This design effectively reduces computational complexity. The ConvLSTM2D U-Net takes as input four radar-generated gridded rainfall fields observed up to 30 min in the past at 10-min intervals and performs a prediction (Predict T + 1) for the next 10 min.
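The parameter saving from the 1 × 1, 3 × 3, 1 × 1 bottleneck can be checked with simple arithmetic. The sketch below is illustrative only: the intermediate channel widths (one quarter and one half of the layer width) are assumptions made here, since the text does not state the widths actually used, and `conv_params` and `bottleneck_params` are hypothetical helper names.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution mapping c_in to c_out channels."""
    return k * k * c_in * c_out + c_out

def bottleneck_params(channels, reduced, expanded):
    """1 x 1 reduction, 3 x 3 expansion, then a final 1 x 1 back to `channels`."""
    return (conv_params(channels, reduced, 1)
            + conv_params(reduced, expanded, 3)
            + conv_params(expanded, channels, 1))

for channels in (256, 512):
    plain = conv_params(channels, channels, 3)
    # Assumed widths: reduce to channels / 4, expand to channels / 2.
    slim = bottleneck_params(channels, channels // 4, channels // 2)
    print(channels, plain, slim)
```

Under these assumed widths, a 512-channel layer drops from 2,359,808 parameters for a single 3 × 3 convolution to 492,416 for the bottleneck; other width choices change the ratio but not the direction of the effect.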
In the optimization of ConvLSTM2D U-Net, the MAE served as the loss function. Parameter updates were conducted via the Adam optimizer, utilizing a learning rate of 1 × 10⁻⁴; the remaining parameters adhered to default settings, as suggested by Kingma and Ba (2015) [17]. The training of the ConvLSTM2D U-Net model was initially configured with 1000 epochs and a batch size of 2, using early stopping to obtain the best model at the 20th epoch.
2.3. Generative Adversarial Network
Additionally, this study employed a nowcasting technique based on a GAN. A GAN comprises two neural networks, a generator and a discriminator, which learn through adversarial competition. GANs learn the probability distribution of the data and can generate samples from the learned distribution. The DGMR used in this study is based on a conditional generative adversarial network (cGAN). A cGAN conditions the generator and discriminator on additional information during training, so that data satisfying specific conditions can be generated. DGMR is conditioned on the rainfall fields observed up to the prediction time and transforms random noise into fields resembling the predicted rainfall field [10]. As shown in Figure 2, DGMR comprises a generator, two discriminators, and their respective blocks, and the learning process of the model can be described as follows.
First, radar rainfall fields from the past 40 min at 10-min intervals serve as context vectors in the generator, which is trained with two loss functions and one weight regularization. The generator takes a context vector and produces an image. Eight frames are randomly selected from this image and used to calculate a loss value when compared to real data. The generator’s role includes transforming the input randomized noise vector into information intended to match patterns in actual images. To achieve the goal of generating images indistinguishable from real radar images, it undergoes adversarial training with a discriminator, which is responsible for evaluating the realism of the generated images. The spatial discriminator, structured as a CNN, focuses on distinguishing between observed and generated radar rainfall fields, thereby ensuring spatial consistency and reducing ambiguous predictions. Meanwhile, with randomized inputs of generated images, the temporal discriminator distinguishes observed from generated radar sequences to ensure temporal consistency and reduce erratic predictions stemming from overfitting or instability.
Additionally, grid-cell regularization was applied to the observed and model-generated mean values to enhance accuracy. This regularization introduces a term penalizing differences between the two, facilitating accurate predictions based on location. Moreover, the generative neural network model is inherently probabilistic and capable of simulating multiple data generations using conditional probability distributions of input radar information. The resulting approach resembles an ensemble technique. Furthermore, DGMR has the advantage of learning from observational data and representing uncertainty across various spatiotemporal scales. However, its performance deteriorates rapidly for convective cell forecasts or forecasts extending beyond 90 min, primarily due to the challenges associated with predicting physical properties related to rainfall development and dissipation [10,18].
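The idea behind the grid-cell regularization term can be sketched as follows. This is a simplified, unweighted version written for illustration: it averages several generated sequences and penalizes the mean absolute difference from the observation at each grid cell. The function name, the array shapes, and the omission of any rainfall-dependent weighting are assumptions of this sketch, not the regularizer as implemented in DGMR.

```python
import numpy as np

def grid_cell_regularizer(samples, observed, lam=20.0):
    """Penalty on the gap between the ensemble mean and the observation.

    samples:  (n_samples, time, height, width) generated rainfall sequences
    observed: (time, height, width) observed rainfall sequence
    lam:      scaling parameter of the regularization term
    """
    ensemble_mean = np.mean(samples, axis=0)   # average over generated samples
    return lam * float(np.mean(np.abs(ensemble_mean - observed)))

# Toy case: two generated samples (all 1 mm/h and all 3 mm/h), observation 1 mm/h.
samples = np.stack([np.full((2, 2, 2), 1.0), np.full((2, 2, 2), 3.0)])
observed = np.full((2, 2, 2), 1.0)
penalty = grid_cell_regularizer(samples, observed)
print(penalty)
```

Because the penalty acts on the mean of several samples rather than on each sample, individual samples remain free to differ from one another, which is consistent with the ensemble-like behavior described above.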
DGMR was trained with up to 5 × 10⁵ generator steps, as suggested by Ravuri et al. (2021). Two discriminator steps were performed for each generator step. The learning rate of the generator was 5 × 10⁻⁵, and the learning rate of the discriminator was 2 × 10⁻⁴. The Adam optimizer was used, and β₁ and β₂ were set to 0.0 and 0.999, respectively. Moreover, the scaling parameter for grid-cell regularization was set to λ = 20. DGMR used the PyTorch framework (https://pytorch.org, accessed on 30 October 2023). Training of DGMR was stopped at epoch 130, when the model was judged to be optimized. Because it is difficult to determine whether a GAN is optimized from the loss alone, we assessed optimization by checking the rainfall-prediction images generated by the trained model and whether mode collapse had occurred. Furthermore, the GAN model was trained for a 60-min lead time to maintain consistency with the other algorithms.
2.4. Recursive RainNet
Recursive RainNet (RainNet-REC) starts from a model pretrained on the existing 10-min forecast and is designed to mitigate the error accumulation and smoothing effects that typically occur during iterative forecasting. This approach uses the U-Net network to implement a recursive prediction strategy (Figure 3) [12]. The forecasting process is as follows. Initially, four radar-generated gridded rainfall fields recorded at 10-min intervals (Observation T − 3, T − 2, T − 1, and T) and observed up to 30 min in the past serve as inputs at the simulation time (T). These inputs are processed through the established RainNet model structure to generate a 10-min forecast of radar rainfall (Output1). Subsequently, the forecast time advances by another 10 min, and Output1 is concatenated with the observed rainfall data (Observation T − 2, T − 1, T) to create the input for the subsequent forecast. This iterative process continues until rainfall forecasts for the next 10–60 min are obtained. To refine the model, each 10-min prediction result (Output1, Output2, ..., Output6) is compared with the observed radar-generated gridded rainfall (Observation T + 1 to T + 6) to calculate errors, and training is conducted to minimize these errors. Consequently, the model is optimized for forecasts from 10 to 60 min.
For recursive RainNet model training, MAE was used as the loss function, and the learning rate, beta_1, and beta_2 of the Nadam optimizer were set to 0.0001, 0.9, and 0.999, respectively. Training was performed for 200 epochs with a batch size of 8, and the loss function value was minimized at 133 epochs.
While RainNet and RainNet-REC fundamentally share identical network architectures, their differences in performance and number of parameters can be attributed to variations in training objectives, prediction strategies, and architectural configurations. Specifically, RainNet is optimized for a short 10-min forecast. In contrast, RainNet-REC essentially stacks 6 RainNet models and optimizes the parameters for each of these 6 models individually, which explains the greater number of total parameters. Each of these stacked models in RainNet-REC employs a more complex recurrent prediction approach spanning up to 60 min.
The temporal dependency of the model during training is a crucial factor affecting prediction accuracy. Furthermore, divergent weight configurations between the two models suggest that RainNet-REC may have navigated a more favorable optimization landscape during training. RainNet, trained on a narrower dataset for 10-min predictions, overfits to its training data, thereby compromising its ability to generalize effectively.
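The recursive forecasting loop of Section 2.4 can be sketched in a few lines. The sketch treats the network as an arbitrary callable that maps a four-frame window to the next frame; `recursive_forecast`, `sequence_mae`, and the stand-in model `latest_plus_one` are hypothetical names used only for this illustration, and the toy model is not a rainfall model.

```python
import numpy as np

def recursive_forecast(model, context, n_steps=6):
    """Roll a one-step model forward n_steps times (10 to 60 min for n_steps=6).

    context: (height, width, 4) array holding frames T-3 .. T on the last axis.
    model:   callable mapping a (height, width, 4) window to a (height, width) frame.
    """
    window = np.asarray(context, dtype=np.float32)
    outputs = []
    for _ in range(n_steps):
        pred = model(window)
        outputs.append(pred)
        # Drop the oldest frame and append the newest prediction.
        window = np.concatenate([window[..., 1:], pred[..., np.newaxis]], axis=-1)
    return np.stack(outputs)                 # (n_steps, height, width)

def sequence_mae(outputs, observations):
    """MAE averaged over all forecast steps, as used for the recursive training loss."""
    return float(np.mean(np.abs(outputs - observations)))

def latest_plus_one(window):
    # Hypothetical stand-in for the trained network: latest frame plus 1.
    return window[..., -1] + 1.0

context = np.zeros((2, 2, 4), dtype=np.float32)
forecast = recursive_forecast(latest_plus_one, context)
loss = sequence_mae(forecast, forecast + 0.5)
print(forecast[:, 0, 0], loss)
```

Each pass feeds the previous prediction back in as the newest input frame, so after the fourth step the window consists entirely of predicted fields; comparing all six outputs with observations, rather than only the first, is what exposes the model to its own accumulated error during training.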
4. Results
In this section, we evaluate the predicted rainfall using four metrics: the critical success index (CSI), MAE, F1 score, and structural similarity index measure (SSIM). The CSI, defined in Equation (1), counts the grid points where both the prediction and the observation exceed a specified rainfall threshold (hits) and divides this count by the number of grid points where rainfall above the threshold was predicted or observed (hits, false alarms, and misses). To calculate the CSI, we employ a rain contingency table, a matrix indicating the presence or absence of predicted and observed rainfall.
The MAE quantifies the disparity between predicted and observed rainfall, as given in Equation (2). The F1 score combines precision and recall by computing their harmonic mean, so maximizing the F1 score implies optimizing precision and recall simultaneously.
SSIM is a perception-based metric that compares two images in terms of luminance, contrast, and structure, making it well suited to evaluating the quality of our precipitation forecast maps.
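For reference, the standard forms of these four metrics are sketched below. They are assumed to correspond to Equations (1) and (2) and to the F1 and SSIM definitions used in this evaluation; the image symbols $x$ and $y$ are notation chosen for this sketch.

```latex
\begin{align}
\mathrm{CSI} &= \frac{TP}{TP + FP + FN} \tag{1}\\
\mathrm{MAE} &= \frac{1}{n}\sum_{i=1}^{n}\left|\mathrm{now}_i - \mathrm{obs}_i\right| \tag{2}\\
F_1 &= \frac{2\,\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}},
\qquad \mathrm{Precision}=\frac{TP}{TP+FP},
\qquad \mathrm{Recall}=\frac{TP}{TP+FN} \notag\\
\mathrm{SSIM}(x,y) &= \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}
{(\mu_x^2+\mu_y^2+C_1)(\sigma_x^2+\sigma_y^2+C_2)} \notag
\end{align}
```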
True Positive (TP) is the number of samples correctly predicted as “positive.” False Positive (FP) is the number of samples wrongly predicted as “positive.” True Negative (TN) is the number of samples correctly predicted as “negative,” and False Negative (FN) is the number of samples wrongly predicted as “negative.” Furthermore, $\mathrm{now}_i$ and $\mathrm{obs}_i$ are the predicted and observed rainfall intensities (mm/h) at location $i$, and $n$ is the number of radar grids. The threshold rainfall intensity was 0.1 mm/h for overall rainfall and 5 mm/h to evaluate the differences among the deep learning models in predicting heavy rainfall. Here, $x$ and $y$ are the two images (QPE and QPF) being compared, and $\mu_x$ and $\mu_y$ are the averages of images $x$ and $y$, respectively. Additionally, $\sigma_x^2$ and $\sigma_y^2$ are the variances of $x$ and $y$, and $\sigma_{xy}$ is the covariance of $x$ and $y$. Finally, $C_1$ and $C_2$ are two constants that stabilize the division when the denominator is weak [20].
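A minimal sketch of how these metrics could be computed for one forecast field is given below. It makes several assumptions that the text does not specify: a grid cell counts as rain when its intensity is at or above the threshold, SSIM is computed from global image statistics rather than local windows, and the stabilizing constants follow the commonly used values (0.01 and 0.03 times the data range, squared). All function names are hypothetical.

```python
import numpy as np

def contingency(pred, obs, threshold):
    """Return (TP, FP, FN, TN) for rain / no-rain at the given threshold (mm/h)."""
    p, o = pred >= threshold, obs >= threshold
    return (int(np.sum(p & o)), int(np.sum(p & ~o)),
            int(np.sum(~p & o)), int(np.sum(~p & ~o)))

def csi(tp, fp, fn):
    total = tp + fp + fn
    return tp / total if total else float("nan")

def f1_score(tp, fp, fn):
    denom = 2 * tp + fp + fn            # equivalent to the harmonic-mean form
    return 2 * tp / denom if denom else float("nan")

def mae(pred, obs):
    return float(np.mean(np.abs(pred - obs)))

def global_ssim(x, y, data_range):
    """SSIM from whole-image statistics (a simplification of windowed SSIM)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2   # assumed constants
    mu_x, mu_y = x.mean(), y.mean()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (x.var() + y.var() + c2)
    return float(num / den)

# Toy 2 x 2 fields in mm/h.
obs = np.array([[0.0, 0.2], [6.0, 8.0]])
pred = np.array([[0.3, 0.0], [5.5, 2.0]])
for threshold in (0.1, 5.0):
    tp, fp, fn, tn = contingency(pred, obs, threshold)
    print(threshold, csi(tp, fp, fn), f1_score(tp, fp, fn))
print(mae(pred, obs), global_ssim(pred, obs, data_range=10.0))
```

In the toy case, the cell with 8 mm/h observed and 2 mm/h predicted counts as a hit at 0.1 mm/h but as a miss at 5 mm/h, which illustrates why scores at the two thresholds can diverge for the same forecast.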
Figure 8 displays the CSI, F1 score, and MAE of each method’s rainfall forecast, evaluated against radar-observed rainfall for each heavy-rainfall case, by forecast lead time and with a threshold rainfall of 0.1 mm/h. As the lead time increases, variations in performance among the different rainfall prediction methods become more apparent. In terms of CSI, RainNet-REC consistently demonstrated superior performance across all heavy-rainfall cases. DGMR also exhibited higher accuracy compared to RainNet and ConvLSTM2D U-Net. F1 scores exhibited minimal variation among the prediction techniques. Notably, DGMR yielded a considerably greater MAE compared to the other predictors. SSIM did not differ noticeably among the models, although RainNet-REC showed the highest value.
Figure 9 presents the rainfall-prediction results for each method using a threshold rainfall of 5 mm/h. The differences from the 0.1 mm/h results are evident, particularly the strong performance of RainNet-REC and DGMR in terms of CSI and F1 score, as they effectively forecasted regions of heavy rainfall. However, DGMR’s MAE was higher than that of the other models due to its tendency to predict continuous increases in rainfall events. At the 5 mm/h threshold, there was no significant difference in SSIM among the techniques. Although DGMR’s rainfall forecasts offer visually convincing renderings, the SSIM of DGMR is lower than those of the other deep learning approaches. This outcome is posited to stem from DGMR’s unique method of generating forecasts, which relies on a probability distribution resembling that of the input data. While this approach minimizes smoothing effects, it results in a forecasted rainfall distribution that deviates to some extent from the observed patterns.
Table 3 and Table 4 present average evaluation results for the prediction accuracy of heavy-rainfall cases with critical rainfall thresholds of 0.1 mm/h and 5 mm/h, respectively.
For the 0.1 mm/h threshold, RainNet’s CSI ranged from a maximum of 0.907 at the 10-min lead time to a minimum of 0.560 at the one-hour lead time. RainNet-REC consistently showed superior results, with the highest CSI being 0.920 and the lowest being 0.762 across the specified lead times. As shown in Table 3, RainNet-REC consistently achieves the highest SSIM values across all lead times from 10 to 60 min. This result indicates that RainNet-REC preserves structural details in the forecasted rainfall patterns well, supporting its effectiveness in mitigating the smoothing effect mentioned earlier.
In the evaluation at the 5 mm/h threshold, RainNet’s CSI varied from a peak of 0.603 at the 10-min lead time to a low of 0.107 at the 60-min lead time. By comparison, RainNet-REC performed considerably better, with a highest CSI of 0.626 and a lowest CSI of 0.408 across the lead times.
The results from four models, namely RainNet, ConvLSTM2D U-Net, DGMR, and RainNet-REC, were analyzed for success in forecasting radar rainfall at lead times ranging from 10 to 60 min. In the short-term forecasting window of 10–30 min, RainNet-REC emerged as the top performer, excelling in CSI and F1 score, while DGMR had the highest MAE, indicating lower accuracy in predicting rainfall amounts. At medium-term lead times (between 40 and 60 min), RainNet-REC continued to lead, although its performance deteriorated slightly with increasing lead time, a trend observed across all models. DGMR consistently exhibited the least precision in rainfall amounts, as evidenced by its consistently high MAE values. In summary, RainNet-REC outperformed the other models across the evaluated metrics and lead times, while ConvLSTM2D U-Net could be considered a viable alternative. RainNet performed well at shorter lead times but faced challenges at longer intervals. DGMR, despite its competitive CSI, had the highest MAE and a lower SSIM at all lead times, making it the less-recommended option when accurate rainfall amounts are required. Therefore, for those seeking a model with superior accuracy and precision across various forecasting times, RainNet-REC is the most advisable choice.
5. Conclusions
This study utilized Korean radar rainfall data and applied various deep learning algorithms for very short-term rainfall predictions, up to 1 h in advance. The algorithms included CNN-based U-Net, ConvLSTM for considering temporal continuity, a recursive model based on U-Net with a recursive strategy, and a GAN-based model. The input radar rainfall was estimated using a conditional merging technique. The study evaluated the prediction performance of each technique for different rainfall events, presenting results based on metrics such as CSI, F1 score, MAE, and SSIM. Two rainfall-intensity thresholds, 0.1 mm/h and 5 mm/h, were used during the evaluation of various models for forecasting radar rainfall across different lead times.
For lower-intensity rainfall (0.1 mm/h), RainNet’s CSI scores varied widely, ranging from 0.907 to 0.560 depending on the forecast lead time. In contrast, RainNet-REC consistently outperformed other models, with CSI scores ranging from 0.762 to 0.920. RainNet-REC also excelled in preserving structural details in rainfall patterns, as indicated by its consistently high SSIM values. Although its performance declined slightly with increasing forecast times, it remained superior to other models such as RainNet, ConvLSTM2D U-Net, and DGMR.
For higher-intensity rainfall (5 mm/h), RainNet-REC outperformed other models like RainNet, ConvLSTM2D U-Net, and DGMR across various lead times. It performed exceptionally well in short-term predictions (10–30 min) and remained robust even in the medium term (40–60 min), though with a slight decline in performance. In these cases, DGMR yielded high CSI values, thereby demonstrating improved rainfall field-prediction capabilities compared to its own lower-intensity performance. RainNet-REC also continued to demonstrate high forecast accuracy, as evidenced by its low MAE values.
Overall, this study offers valuable insights into the effectiveness of deep learning algorithms for very-short-term weather forecasting using Korean radar data. Specifically, the recursive RainNet-REC model achieved high scores in predicting short-term rainfall up to 1 h in advance, highlighting its potential utility in disaster management.