Radar-SR3: A Weather Radar Image Super-Resolution Generation Model Based on SR3

Abstract: To address the heavy resource consumption of current deep learning radar extrapolation models and the lack of detail in their final predictions, a weather radar image super-resolution model based on SR3 (Super-Resolution via Repeated Refinement) is proposed. The model uses a diffusion model to super-resolve weather radar images into high-definition images and optimizes the U-Net denoising network on the basis of SR3 to further improve image quality. The model receives a high-resolution image with added Gaussian noise and concatenates it channel-wise with the low-resolution image for conditional generation. The experimental results show that introducing the diffusion model significantly improves the spatial resolution of weather radar images, providing a new technical means for applications in related fields; at an amplification factor of 8, compared with the image super-resolution model based on the generative adversarial network (SRGAN) and with the bicubic interpolation algorithm, Radar-SR3 increased the peak signal-to-noise ratio (PSNR) by 146% and 52% on average, respectively. With this model, radar extrapolation models can be trained on high-resolution images even under limited computing resources.


Introduction
Weather radar is one of the key tools in meteorology and natural disaster warning and is widely used to monitor and track rain, storms, lightning, and other weather phenomena in the atmosphere [1]. Its role in meteorology cannot be ignored: it provides a wide range of meteorological data with high spatial and temporal resolution, which is of great significance for meteorological operations such as short-term forecasting and small- and medium-scale weather monitoring, and it is the main tool for nowcasting. Timely monitoring of weather changes and effective management of disaster risks are crucial for short-term weather forecasting, and the performance and image quality of weather radar directly impact the accuracy of predictions. Current radar echo extrapolation deep learning models, such as PredRNN [2] and MotionRNN [3], tend to produce blurry images in pursuit of better mean squared error (MSE) scores, which degrades the final prediction results. GAN-LSTM [4] and GAN-rcLSTM [1] incorporate a generative adversarial network (GAN) module into recurrent neural networks to generate radar images with clearer details and more accurate predictions. Extrapolation models currently take high-resolution radar images as input in the hope of obtaining better extrapolation results; however, using high-resolution radar images sharply increases resource consumption. This paper therefore improves the U-Net denoising network in SR3 with a residual attention (RA) module. The Radar-SR3 super-resolution model, incorporating this RA module, achieved an optimal peak signal-to-noise ratio and structural similarity index on a weather radar dataset.

Problem Description
Recently, the application of radar echo extrapolation deep learning models in weather forecasting has made significant progress [1-3,22,23]. Since RNN models have memory units, researchers prefer RNN-based models for radar echo extrapolation. However, the final extrapolation results often suffer from blurring and distortion, and under complex meteorological conditions blurred predictions may lead to misjudgments of potential extreme weather events. As shown in Table 1, with the same batch size and time length, the size of radar images can significantly impact the parameter count of Recurrent Neural Network (RNN)-based extrapolation models. When the width and height of radar images increase by a factor of 8, model parameters grow by 15 times on average. This increase can become a computational bottleneck, slowing both training and inference; in extreme cases, it may render training on some high-resolution weather radar datasets practically infeasible.

Table 1. Parameter counts of RNN-based extrapolation models for two input sizes *.

Model            Parameters (large input)    Parameters (small input)
[24]             249,720,192                 14,380,416
ConvLSTM [25]    96,379,969                  5,547,073
MotionRNN        132,511,361                 8,648,321

* From left to right, the input sizes represent: batch size, time length, number of image channels, image height, image width.

Materials
The meteorological radar dataset consists of time-series radar echo data, whose physical interpretation is the basic reflectivity factor at 3 km altitude. A higher water droplet content in the atmosphere results in higher radar reflectivity [26,27]. The dataset is compiled from a network of S-band meteorological radars in Jiangsu Province, covering April to September of the years 2019 to 2021; the data used are 3 km CAPPI (constant altitude plan position indicator) products. The radar echo data underwent quality control, including clutter suppression and discrete noise filtering. Additionally, data with a low proportion of radar echoes were manually excluded. The data cover the entire area of Jiangsu Province.
The data values range from 0 to 70 dBZ, with a horizontal resolution of 1 km × 1 km, a time resolution of 6 min, and a grid size of 480 × 560 pixels per time step. To facilitate training of deep learning models while preserving image information, padding on both sides and center cropping were applied, resulting in images of 512 × 512 pixels.
To balance the resources, time, training effectiveness, and recognition performance required for deep learning model training, a total of 31,122 samples were selected. These samples were split into training, validation, and test sets in a ratio of 7:2:1.
Considering training time, the 512 × 512-pixel images are downsampled to 128 × 128 and 16 × 16 images. The 128 × 128 images are defined as HR images, and the 16 × 16 images as LR images. An example of the data is visualized in Figure 1.
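The preprocessing above can be sketched as follows. This is a minimal numpy illustration: the paper does not state which downsampling method it uses, so simple block-mean pooling stands in for it, and the exact padding/cropping offsets are assumptions.

```python
import numpy as np

def to_512(frame):
    # Pad the 480-pixel axis to 512 and center-crop the 560-pixel axis to 512.
    h, w = frame.shape
    pad = (512 - h) // 2
    frame = np.pad(frame, ((pad, 512 - h - pad), (0, 0)))
    off = (w - 512) // 2
    return frame[:, off:off + 512]

def downsample(img, factor):
    # Block-mean pooling as a stand-in for the paper's (unspecified) method.
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

frame = np.random.rand(480, 560).astype(np.float32)  # one radar time step
hr = downsample(to_512(frame), 4)    # 128 x 128 HR image
lr = downsample(to_512(frame), 32)   # 16 x 16 LR image
assert hr.shape == (128, 128) and lr.shape == (16, 16)
```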

Denoising Diffusion Probability Model
The denoising diffusion probability model (DDPM) is inspired by non-equilibrium thermodynamics [28]. Noise is added to pixels in a high-dimensional image space, like ink spreading in water, and the process is then reversed to generate images from the noise, producing unexpected combinations of images. The denoising diffusion probability model comprises a deep learning denoising network and a diffusion process, and the diffusion process includes a forward diffusion process and a reverse diffusion process.

Denoising Network Based on U-Net Model
The U-Net model [29] was initially designed for medical image segmentation. It introduces an encoder-decoder [30] architecture, using a U-shaped network structure to capture contextual information. The encoder part of U-Net extracts image features from low-resolution inputs. Since the goal of the SR3 model is to reconstruct low-resolution images into high-resolution images, the decoder part of U-Net in SR3 gradually increases the resolution of the feature maps through deconvolution and up-sampling operations. To preserve image details and structural information, the U-Net in SR3 incorporates skip connections. These connections link feature maps between the encoder and decoder, allowing information to be transmitted across scales, facilitating feature fusion, and ultimately generating high-resolution images. The framework of the U-Net model is shown in Figure 2.

Diffusion Process
The diffusion process includes a forward diffusion process and a reverse diffusion process, using a parameterized Markov chain trained by variational inference to generate samples that match the data distribution after a finite number of steps [22].
In the forward diffusion process, a sample x_0 ∼ q(x) is drawn from the real data distribution, and a series of noise-added samples x_1, x_2, . . ., x_{t−1}, x_t, x_{t+1}, . . ., x_T is obtained by superimposing Gaussian noise over T steps. The recursive formula from the original HR image x_0 to the noised HR image x_t at time step t is:

x_t = √(α_t) x_{t−1} + √(1 − α_t) z, z ∼ N(0, I), (1)

where z is noise drawn from the standard normal distribution and α_t is a weight that decreases as t increases. Let ᾱ_t = α_t α_{t−1} · · · α_2 α_1; then, for any time step t:

x_t = √(ᾱ_t) x_0 + √(1 − ᾱ_t) z. (2)

Because z follows the standard normal distribution, the final x_T also approaches standard normal noise. Since ᾱ_t = α_t α_{t−1} · · · α_1, the overall noise level can be controlled through α_t and the number of time steps T.

The reverse diffusion process uses a U-Net model to learn the image noise to achieve denoising. That is, for the image x_t at time step t, the network predicts the noise: z = UNet(x_t, t), z ∼ N(0, I). In the reconstruction stage, x_{t−1} is recovered from x_t via q(x_{t−1} | x_t). According to Formula (1), the properties of the normal distribution, and the conditional probability formula, with β_t = 1 − α_t:

x_{t−1} = (1 / √(α_t)) (x_t − (β_t / √(1 − ᾱ_t)) z) + σ_t z′, z′ ∼ N(0, I). (7)

Thus, the image x_{t−1} at time step t−1 can be obtained from the image x_t at time step t and the predicted noise z.
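The forward and reverse steps above can be sketched in numpy. The linear noise schedule below is an assumption (the paper does not state its schedule), and only the deterministic mean of the reverse step is computed; sampling would add the σ_t z′ term.

```python
import numpy as np

T = 1000
beta = np.linspace(1e-4, 0.02, T)    # assumed linear noise schedule
alpha = 1.0 - beta                   # per-step weights (decrease as t grows)
alpha_bar = np.cumprod(alpha)        # cumulative product over steps 1..t

def forward(x0, t, z):
    # Formula (2): x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) z
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * z

def reverse_mean(xt, t, z_pred):
    # Mean of the reverse step in Formula (7); sampling adds sigma_t * z'.
    return (xt - beta[t] / np.sqrt(1.0 - alpha_bar[t]) * z_pred) / np.sqrt(alpha[t])

x0 = np.random.rand(16, 16)
z = np.random.randn(16, 16)
xt = forward(x0, 500, z)
x_prev = reverse_mean(xt, 500, z)    # z_pred would come from the U-Net in practice
assert xt.shape == x0.shape and np.isfinite(x_prev).all()
```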

SimAM Attention
The attention mechanism was first proposed by John K. Tsotsos in 1995 [31] for visual images. In 2014, Google DeepMind applied the attention mechanism to image classification in recurrent neural network (RNN) models [32]. The attention mechanism generates a weight vector for each input element, determining which parts significantly impact the model output by calculating the weights of each feature map.
Traditional attention mechanisms include spatial and channel attention [33]. Spatial attention focuses on the importance of different spatial features in the image, while channel attention emphasizes the significance of features across channels. Adding attention mechanisms to deep learning models often improves performance; however, the added parameters increase model complexity, resulting in longer training and inference times.
SimAM [34], in contrast, is an attention mechanism based on established neuroscience theories. It infers spatial and channel weights simultaneously from the current neurons, achieving performance improvements without affecting model complexity. The structure of the SimAM attention mechanism is shown in Figure 3.
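SimAM's parameter-free weighting can be written in a few lines. This numpy sketch follows the published energy formulation; λ is the regularization constant of SimAM, and the feature values here are random placeholders.

```python
import numpy as np

def simam(x, lam=1e-4):
    # x: feature map of shape (C, H, W); weights are inferred, not learned.
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)     # per-channel mean
    d = (x - mu) ** 2                           # squared deviation of each neuron
    v = d.sum(axis=(1, 2), keepdims=True) / n   # per-channel variance estimate
    e_inv = d / (4.0 * (v + lam)) + 0.5         # inverse energy = neuron importance
    return x * (1.0 / (1.0 + np.exp(-e_inv)))   # sigmoid-gated features

feat = np.random.randn(8, 16, 16)
out = simam(feat)
assert out.shape == feat.shape
```

Because the sigmoid gate lies in (0, 1), SimAM re-weights existing features without adding any trainable parameters.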

SR3 Model
The SR3 model formulates the super-resolution problem as a conditional generation problem. Whereas DDPM predicts the noise in the image at each step through the U-Net network to generate a denoised image, the SR3 model first up-samples the original low-resolution image, interpolating it to high resolution, and adds it to the training process. The noise-added image and the interpolated high-resolution image are input together; that is, the number of input channels changes from three in DDPM to six. The denoising U-Net can then perform conditional denoising based on the interpolated high-resolution image. Therefore, compared with DDPM, random denoising becomes a conditional generative process controlled by the low-resolution image. In addition, the denoising U-Net in SR3 no longer derives the noise from the time step t but directly accepts the noise level at the current step, achieving faster inference.
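The six-channel conditioning can be sketched as follows (numpy; nearest-neighbour up-sampling stands in for the interpolation used in SR3, and the channel-first layout is an assumption):

```python
import numpy as np

def sr3_input(lr, hr_noisy, scale=8):
    # Up-sample the LR image to HR size, then concatenate it with the
    # noise-added HR image along the channel axis: 3 + 3 = 6 channels.
    up = lr.repeat(scale, axis=1).repeat(scale, axis=2)
    return np.concatenate([up, hr_noisy], axis=0)

lr = np.random.rand(3, 16, 16)           # low-resolution conditioning image
hr_noisy = np.random.rand(3, 128, 128)   # HR image with Gaussian noise added
x = sr3_input(lr, hr_noisy)
assert x.shape == (6, 128, 128)
```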
Due to the particularity of weather radar images, the original U-Net denoising network in SR3 struggles to capture the global structure of the image and cross-channel dependencies, so its ability to capture global information needs to be improved to achieve a better denoising effect. This paper introduces an attention mechanism on top of the original U-Net denoising network to capture global and channel information, and introduces residual connections to increase the number of model parameters and enhance the denoising ability of the U-Net model.

Radar-SR3 Model
Radar-SR3 replaces the original U-Net denoising network in SR3 with an improved denoising U-Net network. The overall process is shown in Figure 4.

Residual Connection with Attention Mechanism
Residual connections were first proposed in ResNet [35], which won first place in the ImageNet image recognition challenge in 2015. A residual connection adds the input to the output of a nonlinear transformation: the input x is mapped through a function f(x), which is then added to the original input to give the output y = x + f(x). This mitigates the vanishing gradient problem, because deeper network parameters have less impact on the model output, ensuring stability and convergence speed during training. This article uses the Swish activation function [36] in the residual block. Compared with ReLU, Swish is smooth and non-monotonic, and it outperforms ReLU on many deep models.
Building on the residual connection, this article adds attention modules to the U-Net denoising model to capture noise and features across channels and spatial locations, forming the RA (ResNet Block with Attention) module. The candidate attention mechanisms include self-attention [37], CBAM attention [38], and SimAM attention [34]. Image quality and structural similarity indicators were compared, and the mechanism with the highest scores was selected as the attention component of the RA module. The RA module is shown in Figure 5.
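The residual computation y = x + f(x) with Swish can be illustrated directly. This is a sketch only: a dense weight matrix stands in for the block's convolutions, and it omits the attention component of the full RA module.

```python
import numpy as np

def swish(x):
    # Swish: x * sigmoid(x) -- smooth and non-monotonic, unlike ReLU.
    return x / (1.0 + np.exp(-x))

def residual_block(x, w):
    # y = x + f(x): the identity path keeps gradients flowing in deep nets.
    return x + swish(x @ w)

x = np.random.randn(4, 8)
w = 0.1 * np.eye(8)              # illustrative stand-in weights
y = residual_block(x, w)
assert y.shape == x.shape
```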

Improved U-Net Denoising Network
This paper modifies the original U-Net model by replacing its convolutional modules with residual connection modules, enhancing its effectiveness and depth. In the down-sampling and up-sampling layers, the residual connection blocks are replaced with RA modules to better capture noise and features from the original image. The architecture of the U-Net network is illustrated in Figure 6 and consists of an encoding segment and a decoding segment. The encoding segment employs three layers of residual blocks to extract shallow semantic information from the image, and two RA modules to capture deeper image correlations. The decoding segment uses two RA modules to reconstruct the deep semantic features and three layers of residual blocks to restore shallow semantic information.
As depicted in Figure 7, the residual block includes Group Normalization, a Swish activation function, and a two-dimensional convolution with a 3 × 3 kernel and a stride of 1. The down-sampling layer uses a 3 × 3 convolutional kernel, while the up-sampling layer uses a 2 × 2 convolutional kernel.
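The encoder-decoder size bookkeeping can be traced with simple stand-ins (block means for the stride-2 down-sampling, nearest-neighbour for up-sampling). The real stages are learned residual and RA blocks; this sketch only verifies how feature-map sizes flow through five down/up stages with skip connections.

```python
import numpy as np

def down(x):
    # Stand-in for a stride-2 down-sampling convolution: 2x2 block mean.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(x):
    # Stand-in for 2x2 up-sampling: nearest-neighbour repetition.
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.random.rand(128, 128)
skips = []
for _ in range(5):         # encoder: three residual stages, then two RA stages
    skips.append(x)
    x = down(x)
for s in reversed(skips):  # decoder: mirror stages, fused via skip connections
    x = up(x) + s
assert x.shape == (128, 128)
```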

Experimental Setup
The experimental environment is as follows: the CPU is an Intel® Core™ i9-13900K with a frequency of 5.0 GHz, the memory is 32 GB, and the GPU is an NVIDIA GeForce RTX 4090. The software environment includes PyTorch 2.0.1 and CUDA 11.8. The batch size is set to 24, the Adam optimizer [39] is used with an initial learning rate of 1 × 10⁻⁴, and the L1 loss is employed as the loss function.

Atmosphere 2024, 15, x FOR PEER REVIEW

Evaluation Metrics
This paper uses the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) [40] as quantitative metrics to evaluate super-resolution performance. PSNR assesses the consistency between generated images and the ground truth, while SSIM evaluates their structural similarity. The definitions of SSIM and PSNR are given in Formulas (9) and (10).
For images x and y of size m × n, the mean squared error (MSE) between x and y is defined as:

MSE = (1 / (m n)) Σ_i Σ_j (x(i, j) − y(i, j))²,

PSNR = 10 · log₁₀(L² / MSE), (9)

SSIM(x, y) = ((2 µ_x µ_y + c₁)(2 σ_xy + c₂)) / ((µ_x² + µ_y² + c₁)(σ_x² + σ_y² + c₂)), (10)

where µ_x and µ_y represent the averages of x and y, σ²_x and σ²_y represent their variances, and σ_xy is their covariance; c₁ = (k₁L)² and c₂ = (k₂L)², with k₁ = 0.01 and k₂ = 0.03, and L is the range of pixel values (0–255 for three-channel, 8-bit images). SSIM is a number between 0 and 1; the larger it is, the smaller the difference between the output image and the reference image, i.e., the better the image quality. When the two images are identical, SSIM = 1.
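Formulas (9) and (10) can be computed directly. Note that the SSIM below is the global, single-window form for illustration; reference implementations average SSIM over local windows.

```python
import numpy as np

def psnr(x, y, L=255.0):
    # PSNR = 10 * log10(L^2 / MSE), in decibels.
    mse = np.mean((x - y) ** 2)
    return 10.0 * np.log10(L ** 2 / mse)

def ssim(x, y, L=255.0, k1=0.01, k2=0.03):
    # Global SSIM with c1 = (k1 L)^2 and c2 = (k2 L)^2.
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

img = 255.0 * np.random.rand(64, 64)
assert abs(ssim(img, img) - 1.0) < 1e-9          # identical images: SSIM = 1
assert abs(psnr(img, img - 1.0) - 10.0 * np.log10(255.0 ** 2)) < 1e-9  # MSE = 1
```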

Module Selection
To assess the denoising capability of the U-Net network combined with different attention mechanisms, an 8-fold magnification module selection experiment was conducted on the Jiangsu radar dataset for the self-attention, CBAM, and SimAM attention mechanisms. First, the U-Net denoising network without any attention mechanism was used as the baseline. The self-attention, CBAM, and SimAM mechanisms were then individually incorporated into the U-Net residual block, and their effects on the denoising network were compared using PSNR and SSIM. The experimental results are presented in Table 3. The table shows that the RA module combining the SimAM attention block with the residual block achieves the best PSNR of all combinations, with an SSIM only 0.007 lower than that of the self-attention module. Moreover, being parameter-free, SimAM enhances model performance without increasing training time or complexity. While self-attention has the highest SSIM, its PSNR improvement over the baseline is small and it has the most parameters, so it is not the optimal choice. Although CBAM can capture features across channels and spatial dimensions, the PSNR/SSIM results indicate suboptimal performance when it is applied to the U-Net denoising network; concatenating CBAM with SimAM yields only a slight improvement over CBAM alone. Consequently, SimAM is selected as the attention component of the RA module.

Discussion
Weather radar is significant for nowcasting, and radar image super-resolution based on a conditional generative diffusion model can significantly alleviate the poor imaging of extrapolation models caused by various factors. Based on the SR3 super-resolution model, this paper first explores the feasibility of SR3 for weather radar super-resolution. Second, the U-Net denoising network is improved: the convolution block is replaced by a residual connection, and, to address the difficulty of fusing multi-dimensional features, a residual module incorporating an attention mechanism is proposed, comprising a SimAM attention module and multiple residual blocks. Experiments on the past three years of radar observations in Jiangsu show that Radar-SR3 with the improved U-Net denoising model has better image generation capability than the SR3 model and compares favorably with commonly used image super-resolution algorithms on the same dataset. Radar-SR3 still has a drawback: the training time is long. In our experiments, one epoch takes about 30 min, and about 500 epochs are needed to achieve stable super-resolution results. Using Denoising Diffusion Implicit Models (DDIMs) could reduce the inference time.

Conclusions
In follow-up work, radar echo prediction can be carried out on low-resolution radar echo images without introducing a new radar echo extrapolation model, and the extrapolated frames can then be super-resolved with the Radar-SR3 model, yielding radar echo extrapolation images with clearer and richer details.
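The proposed pipeline (extrapolate cheaply at low resolution, then super-resolve each predicted frame) can be sketched as below. The names `extrapolate_lr` and `super_resolve` are hypothetical placeholders standing in for a low-resolution extrapolation model and Radar-SR3, not APIs from this paper's code:

```python
import numpy as np

def forecast_high_res(lr_history, extrapolate_lr, super_resolve, steps=10):
    """Predict future frames at low resolution, then upscale each
    predicted frame to high resolution with a super-resolution model."""
    lr_future = extrapolate_lr(lr_history, steps)        # cheap low-res prediction
    return [super_resolve(frame) for frame in lr_future]

# Stand-in components for illustration only:
persistence = lambda hist, steps: [hist[-1]] * steps     # repeat the last frame
nearest_x8 = lambda f: np.kron(f, np.ones((8, 8)))       # naive 8x upscale

hr_frames = forecast_high_res([np.zeros((16, 16))], persistence, nearest_x8, 3)
```

The point of the split is that the extrapolation model only ever sees 16 × 16 inputs, so its parameter count and memory footprint stay small regardless of the final output resolution.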

3. Radar Echo Image Super-Resolution Model Based on Improved SR3
3.1. Denoising Diffusion Probability Model

Figure 5. Sample of RA module.

3.4.2. Improved U-Net Denoising Network
This paper modifies the original U-Net model by replacing its convolutional modules with residual connection modules, increasing its effectiveness and depth. In the down-sampling and up-sampling layers, the residual connection blocks are replaced with RA modules to better capture noise and features from the original image.
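The parameter-free SimAM weighting used inside the RA module can be sketched in NumPy as below. This is a simplification for illustration: the actual module operates on batched framework tensors inside residual blocks, and `lam` is SimAM's regularization constant:

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention over a (C, H, W) feature map.

    Each activation is gated by a sigmoid of its 'energy', which grows
    with its squared deviation from the channel mean, so distinctive
    activations are emphasized without adding any learnable weights."""
    _, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2
    var = d.sum(axis=(1, 2), keepdims=True) / n      # per-channel variance
    e_inv = d / (4.0 * (var + lam)) + 0.5            # inverse energy
    return x * (1.0 / (1.0 + np.exp(-e_inv)))        # sigmoid gate in (0, 1)
```

Because the gate is computed from the features themselves, the module adds no parameters, which is why it does not increase training time or model complexity.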

Figure 8 shows the super-resolution results of Radar-SR3 at different training stages. Figure 9 shows the relationship between the number of training epochs and PSNR: as the number of training epochs increases, the images generated by the Radar-SR3 model become closer to the ground truth. Figure 10 selects five time steps to show the super-resolution process of Radar-SR3. Table 2 shows the PSNR and SSIM of different models at an amplification factor of 8; compared with the SR3 model, Radar-SR3 improves PSNR by 0.44 while keeping SSIM unchanged.


Figure 9. PSNR changes with training iterations. The abscissa represents the number of iterations, and the ordinate represents the value of PSNR.

Figure 10. Super-resolution sampling process of Radar-SR3 at five selected time steps, from the LR input through intermediate denoising steps to the HR output.


Figure 11 shows an example. First, the LR image is interpolated to HR via the bicubic algorithm, from 16 × 16 to 128 × 128; the interpolated image is then compared in detail with the outputs of SR3, SRGAN, and Radar-SR3. Compared with SR3, the super-resolution reconstruction of Radar-SR3 in high-echo areas is closer to the ground truth, and details in some discontinuous echo areas are richer. Radar-SR3 imaging is also more precise and detailed than bicubic interpolation. Although the SRGAN model can restore high-echo areas well, it produces severely gridded artifacts during generation, so its overall imaging is unusable. The Radar-SR3 model, by contrast, restores details clearly while generating smooth, continuous images that are closer to the authentic images.
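The bicubic baseline in this comparison can be reproduced with a cubic spline zoom, for example via scipy.ndimage (order-3 spline interpolation stands in here for the classical bicubic kernel, so results may differ slightly from a textbook bicubic filter):

```python
import numpy as np
from scipy.ndimage import zoom

def bicubic_upscale(img, factor=8):
    """Upscale a 2-D radar echo image with cubic (order-3) spline
    interpolation, the classical baseline the diffusion model beats."""
    return zoom(img.astype(np.float64), factor, order=3)

lr = np.random.default_rng(0).random((16, 16))
hr = bicubic_upscale(lr, 8)   # 16 x 16 -> 128 x 128
```

Interpolation of this kind can only smooth the existing low-resolution signal, which is why it cannot recover the fine echo structure that the generative model synthesizes.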


Figure 11. Comparison of details. Red frames enlarge image details for comparison.

Figure 12 shows the super-resolution effects of different models. The images generated by the Radar-SR3 model are closer to the ground truth.


Figure 12. Comparison of images generated by different models.

Table 1. Number of parameters of the radar extrapolation model under different input sizes.

Table 2. Comparison of the super-resolution performance of different models.

Table 3. Comparison of metrics and parameter counts of different models.