Recognition of Severe Convective Cloud Based on the Cloud Image Prediction Sequence from FY-4A

Abstract: Severe convective weather is hugely destructive, causing significant loss of life and damage to social and economic infrastructure. Based on the U-Net network with the attention mechanism, the recurrent convolution, and the residual module, a new model named ARRU-Net (Attention Recurrent Residual U-Net) is proposed for the recognition of severe convective clouds using the cloud image prediction sequence from FY-4A data. The characteristic parameters used to recognize severe convective clouds in this study were the brightness temperature value TBB9, the brightness temperature difference values TBB9 − TBB12 and TBB12 − TBB13, and texture features based on spectral characteristics. The method first inputs five satellite cloud images with a time interval of 30 min into the ARRU-Net model and predicts five satellite cloud images for the next 2.5 h. Severe convective clouds are then segmented based on the predicted image sequence. The root-mean-square error (RMSE), peak signal-to-noise ratio (PSNR), and correlation coefficient (R²) of the predicted results were 5.48 K, 35.52 dB, and 0.92, respectively. The experiments showed that the average recognition accuracy and recall of the ARRU-Net model at the next five moments on the test set were 97.62% and 83.34%, respectively.


Introduction
Cloud research is of paramount importance in atmospheric and meteorological science. The recognition of severe convective clouds has always been important in research on meteorological disaster prevention [1][2][3].
The task of satellite cloud image prediction is a spatiotemporal sequence prediction task, which aims to predict the position and shape of cloud clusters and the change of the infrared-channel brightness temperature values over a certain period. Currently, there are three methods to predict satellite cloud images: block matching, optical flow, and artificial intelligence. Jamaly et al. improved the accuracy of cloud motion estimation regarding velocity and direction by utilizing cross-correlation and cross-spectrum analysis methods as matching criteria based on the block-matching principle [4]. Dissawa et al. proposed a method for real-time motion tracking of clouds based on cross-correlation and optical flow and applied it to ground-based whole-sky images [5]. Shakya et al. developed a fractional-order technique for calculating optical flow and applied it to cloud motion estimation in satellite image sequences; in addition, cloud prediction was carried out using optical flow interpolation and extrapolation models based on conventional anisotropic diffusion [6]. In recent years, against the background of big data, more and more artificial intelligence methods have been used to predict cloud images. Son et al. proposed a deep learning model (LSTM-GAN) based on cloud movement prediction in satellite images for PV forecasting [7]. Xu et al. proposed a generative adversarial network-long short-term memory model (GAN-LSTM) for FY-2E satellite cloud image prediction [8]. Bo et al. used a convolutional long short-term memory network to predict cloud position; this method can realize end-to-end prediction without considering the speed and direction of cloud movement [9]. Both traditional methods and existing artificial intelligence methods suffer from low resolution and blurred images in cloud image extrapolation.
One of the main bases for recognizing severe convective clouds using satellite imagery is spectral signatures. Many scholars have proposed brightness temperature thresholds for the identification of severe convective clouds in different research areas, such as 207 K [10], 215 K [11], and 235 K [12]. Mecikalski et al. used the infrared-water vapor brightness temperature difference and the brightness temperature difference between split-window channels as criteria for recognizing severe convective clouds [13]. Jirak et al. studied hundreds of mesoscale convective systems continuously over four months and identified convective clouds in infrared cloud images by using a black-body brightness temperature of less than 245 K as the recognition condition [14]. Sun et al. reduced false detection by introducing the difference information of two channels, improving the effect of the algorithm in recognizing penetrating convective clouds [15]. Mitra et al. created a multi-threshold approach to detect severe convective clouds throughout the day and at night [16].
In addition to spectral characteristics, cloud texture structure can serve as an essential basis for recognition. Welch et al. used a gray-level co-occurrence matrix to extract features from LANDSAT satellite images for cloud classification and achieved good results [17]. Zinner combined the displacement vector field calculated by an image-matching algorithm with spectral images of adjacent times to determine the regions of severe convective development in the cloud map, and images of different bands were used to recognize severe convective clouds at different stages of development [18]. Bedka et al. proposed an infrared-window texture method to identify the overshooting tops of severe convective clouds [19].
This study improves the ability of disaster warning systems to predict severe convective weather, providing more accurate guidance for mitigating the impact of disasters induced by severe convective weather. First, to improve the accuracy of recognizing severe convective clouds, we propose using the ARRU-Net network to predict the following 2.5 h of satellite cloud images and to recognize convective clouds within them. Second, we introduce the attention mechanism, the residual module, and the recurrent convolution to enhance the prediction and recognition capabilities of the model. Third, we integrate the spectral and texture features of severe convective clouds to explore more feature parameters and improve recognition accuracy. This method eliminates cirrus clouds and increases the recognition accuracy of severe convective clouds.
The rest of the paper is organized as follows. The study area, the FY-4A satellite AGRI Imager data, and the data processing are described in Section 2. The configuration of the ARRU-Net model, the severe convective cloud label-making method, and the model performance evaluation method are described in Section 3. The comparison experiments on cloud image prediction and the recognition of severe convective clouds based on the cloud image prediction sequence are described in Section 4. The conclusions are given in Section 5.

Study Area
The study area covers longitude 118.52°E–128.72°E and latitude 25.28°N–35.48°N, including the eastern coast of China and the northwest Pacific. The study area is shown in Figure 1. Because the eastern part of China is connected with the western part of the Pacific Ocean, the cloud systems there change more actively. Introducing information on the cloud systems in the western Pacific into the study area provides a good auxiliary effect for cloud system prediction in the coastal areas of China.


FY-4A Data
There are several meteorological detection sensors onboard FY-4A. Among them, the AGRI Imager can scan an area in minutes, adopts an off-axis three-mirror main optical system, obtains earth cloud images in 14 bands at high frequency, and uses an onboard black body for high-frequency infrared calibration to ensure the accuracy of the observation data. AGRI consists of a total of 14 channels spanning the range from visible to infrared light. These channels cover a wide geographical area, ranging from 80.6°N to 80.6°S and from 24.1°E to 174.7°W [20]. Since the visible light and shortwave infrared bands of the FY-4A satellite cannot be used at night, this paper selected the water vapor and long-wave infrared channel data from the L1 data of the multi-channel imager of the FY-4A satellite for all-weather cloud prediction and severe convective cloud identification. Figure 2 shows an example of cloud data from the 14 channels of the FY-4A satellite AGRI Imager. The observational parameters for each channel are presented in Table 1.


Data Preprocessing
Before using the FY-4A AGRI Imager data for the study, it was necessary to preprocess the data, which mainly included geometric correction, radiometric calibration, and data normalization. After radiometric calibration, the grayscale images were converted into brightness temperature images. The original raster files were projected onto a unified coordinate system through geometric correction. The data were then normalized to eliminate the influence caused by differences in the value range.


Radiometric Calibration
In the L1 data of the FY-4A AGRI Imager, a corresponding calibration table is provided for the image data layer of each channel. Taking the DN value at a particular position in the image data layer as an index, the reflectivity or brightness temperature value at the index position in the calibration table is looked up to realize the radiometric calibration process.
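As a sketch of this lookup-table calibration (the table values and the 12-bit DN depth here are illustrative placeholders, not the actual FY-4A calibration), the DN array can index directly into the calibration table:

```python
import numpy as np

# Illustrative calibration table: maps a 12-bit DN (0..4095) to a
# brightness temperature in kelvin over the 124-325 K range stated
# later in the text. The real FY-4A L1 files ship one table per channel.
cal_table = np.linspace(124.0, 325.0, 4096)

# A toy 2x2 image of raw DN values.
dn = np.array([[0, 2048],
               [4095, 1024]], dtype=np.int64)

# Radiometric calibration: use each DN as an index into the table.
tbb = cal_table[dn]
```

NumPy's fancy indexing vectorizes the lookup over the whole image at once, so no per-pixel loop is needed.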

Geometric Correction
The latitude and longitude range selected in this paper is 118.52°E–128.72°E and 25.28°N–35.48°N. The size of the projected image is determined as follows:

column = (lon_max − lon_min)/res,  row = (lat_max − lat_min)/res,

where column denotes the number of columns of the image after projection, and row represents the number of rows after projection. lon_max and lon_min denote the maximum and minimum values of the longitude range, lat_max and lat_min denote the maximum and minimum values of the latitude range, and res represents the spatial resolution of the data used in this study, which was 0.4°. Each pixel in the original satellite image was mapped into the projected image by the equal latitude-longitude projection transformation formula, which is as follows:

x = (lon − lon_min)/res,  y = (lat_max − lat)/res,

where x represents the abscissa in the image after projection, y represents the ordinate in the image after projection, and lon and lat are the longitude and latitude of the pixel.
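A minimal sketch of this equal latitude-longitude mapping, using the study-area bounds and the resolution stated in the text (the function name is ours; rounding is used here to guard against floating-point error):

```python
# Study-area bounds and resolution as given in the text.
LON_MIN, LON_MAX = 118.52, 128.72
LAT_MIN, LAT_MAX = 25.28, 35.48
RES = 0.4  # spatial resolution in degrees

def lonlat_to_xy(lon, lat):
    """Equal latitude-longitude projection: map (lon, lat) to the
    (column, row) pixel position in the projected image."""
    x = int(round((lon - LON_MIN) / RES))  # abscissa (column)
    y = int(round((LAT_MAX - lat) / RES))  # ordinate (row), row 0 at north edge
    return x, y
```

Note that the row index is measured down from the northern edge, matching the usual image convention of the origin at the top-left.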

Data Normalization
The data used in this study were the water vapor and long-wave infrared band data of the FY-4A satellite multi-channel imager (FY-4A AGRI Imager channels 9–14) after radiometric calibration. The values in these channels are brightness temperature values, and the data range of the brightness temperature values was determined by calibration to be 124–325 K. Therefore, for mapping purposes, this paper assigned 124 as the minimum value and 325 as the maximum value and mapped the data to the [0, 1] interval through the following formula:

x_norm = (x − min)/(max − min),

where min is the minimum value in the sample data and max is the maximum value of the sample data.
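This min-max mapping is a one-liner; the sketch below hard-codes the 124 K and 325 K bounds from the text as defaults (the function name is ours):

```python
def normalize_tbb(tbb, t_min=124.0, t_max=325.0):
    """Min-max normalization of a brightness temperature to [0, 1],
    using the calibrated data range stated in the text."""
    return (tbb - t_min) / (t_max - t_min)
```

Applied element-wise to a NumPy array of brightness temperatures, this maps 124 K to 0 and 325 K to 1.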

Method
Based on the U-Net network [21] with the attention mechanism, the recurrent convolution, and the residual module, a new model was proposed, named ARRU-Net, for the recognition of severe convective clouds using the cloud image prediction sequence from FY-4A data. The data used in this study were the FY-4A AGRI Imager channel 9–14 data. This method first input five satellite cloud images with a time interval of 30 min into the ARRU-Net model and predicted five satellite cloud images for the next 2.5 h. Then, the ARRU-Net model segmented the severe convective clouds based on the predicted image sequence.

The Proposed ARRU-Net Model
Based on the U-Net network, this study introduces a new network named ARRU-Net using the attention mechanism and the residual module, which changes the original U-Net convolution into a recurrent convolution, as shown in Figure 3.

Attention Mechanism
In meteorological satellite data, surrounding geographical features and the resemblance between snow cover and cloud clusters can cause the learning direction of models to deviate from the intended target. Additionally, meteorological satellite data contain multiple spectral channels, each playing a distinct role in cloud detection. In light of these challenges, incorporating an attention mechanism into cloud detection models for meteorological satellite imagery can facilitate the focused learning of differences between cloud clusters and other regions and of the varying impacts of each channel on cloud formation. This approach aims to enhance the efficiency and accuracy of cloud detection.
The structure of the attention module is shown in Figure 4. The attention mechanism takes two feature maps as inputs, g and x_l, which are linearly transformed into A and B through 1 × 1 convolutions. The resulting feature maps are then added together and passed through a ReLU activation function to obtain the intermediate feature map. After another 1 × 1 convolution operation, followed by a sigmoid function and resampling, the attention coefficient α is obtained. Finally, the attention coefficient α is multiplied by x_l to obtain the output feature map.
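This gating can be sketched in NumPy. Here the 1 × 1 convolutions are represented as channel-mixing matrices, g and x_l are assumed to share a spatial size (the resampling step is omitted), and all weights are random placeholders rather than trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(g, x, w_g, w_x, psi):
    """g, x: feature maps of shape (C, H, W); w_g, w_x: (C_int, C)
    matrices acting as 1x1 convolutions; psi: (1, C_int).
    Returns x scaled by the attention coefficient alpha."""
    a = np.einsum('oc,chw->ohw', w_g, g)               # 1x1 conv of g
    b = np.einsum('oc,chw->ohw', w_x, x)               # 1x1 conv of x
    q = np.maximum(a + b, 0.0)                         # ReLU
    alpha = sigmoid(np.einsum('oc,chw->ohw', psi, q))  # (1, H, W), in (0, 1)
    return alpha * x                                   # broadcast over channels

rng = np.random.default_rng(0)
g = rng.standard_normal((8, 16, 16))
x = rng.standard_normal((8, 16, 16))
out = attention_gate(g, x, rng.standard_normal((4, 8)),
                     rng.standard_normal((4, 8)),
                     rng.standard_normal((1, 4)))
```

Because alpha lies in (0, 1), the gate can only attenuate features of x_l, never amplify them; training pushes alpha toward 1 in regions relevant to the target.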

Recurrent Convolutional Block
Recurrent convolution is widely employed in text classification [22]. It has also found applications in computer vision, such as object recognition, as demonstrated by Liang et al. [23], and image segmentation, as demonstrated by Alom et al. [24].
This model changes the convolutional layers of U-Net into recurrent convolutional layers to learn multi-scale features of different receptive fields and to fully utilize the output feature map, as shown in Figure 5. Recurrent convolution can extract spatial features from cloud imagery data by applying convolutional operations at each time step. These features capture information such as the shapes, textures, and structures of different cloud types, thereby providing richer input features for subsequent predictive tasks. In each recurrent convolutional block, the Conv + BN + ReLU operation is repeated t times by adjusting the total time-step parameter to t.
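A minimal single-channel sketch of such a block, assuming the common recurrent-convolution form in which the running output is summed with the block input before each repeated convolution (batch normalization is omitted, and the 3 × 3 kernel is a toy stand-in):

```python
import numpy as np

def conv2d_same(x, k):
    """Zero-padded 'same' 3x3 convolution on a single-channel map."""
    h, w = x.shape
    xp = np.pad(x, 1)
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * k)
    return out

def recurrent_conv_block(x, k, t=2):
    """Repeat Conv + ReLU t times; after the first step, the running
    output is fed back and summed with the block input."""
    o = np.maximum(conv2d_same(x, k), 0.0)
    for _ in range(t - 1):
        o = np.maximum(conv2d_same(x + o, k), 0.0)
    return o

x = np.ones((5, 5))
k = np.full((3, 3), 1.0 / 9.0)  # averaging kernel as a stand-in
y = recurrent_conv_block(x, k, t=2)
```

Each repetition reuses the same kernel, so the effective receptive field grows with t at no extra parameter cost.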



Residual Connection
Residual connection [25] is a widely used technique in deep learning. Using direct summation, it combines a nonlinear transformation of the input with the input itself. This method has been shown to effectively address issues such as network degradation [26,27] and shattered gradients during backpropagation, while also making training easier.
The residual connection enables the direct addition of the input cloud imagery to the output cloud imagery, thereby supplementing the feature information lost during the convolutional process. It also helps mitigate the degradation problem often observed in deep networks, allowing more comprehensive cloud characteristics to be extracted within convolutional layers of the same resolution. Consequently, this approach enhances the model's generalization capability. The structure of the residual connection is shown in Figure 6.
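The identity shortcut itself is a one-line operation; this sketch (with an illustrative nonlinear transform in place of the block's convolutions) shows the direct summation:

```python
import numpy as np

def residual_block(x, transform):
    """Identity shortcut: add the block input to the transformed output,
    so feature information lost inside the block is re-injected."""
    return transform(x) + x

x = np.array([1.0, -2.0, 3.0])
y = residual_block(x, lambda v: np.maximum(v, 0.0))  # toy transform (ReLU)
```

During backpropagation, the shortcut contributes an identity term to the gradient, which is what counteracts degradation in deep stacks.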

Severe Convective Cloud Label-Making Method
The characteristic parameters used to recognize severe convective clouds in this study were brightness temperature value TBB9, brightness temperature difference values TBB9−TBB12 and TBB12−TBB13, and texture features based on spectral characteristics.


Analyze Spectral Features
The 9th band (6.25 µm) of FY-4A is the water vapor band, and the 12th band (10.7 µm) and 13th band (12.0 µm) are long-wave infrared bands. Let the brightness temperatures of a pixel at the same position and time in the satellite images of the 9th, 12th, and 13th bands be TBB9, TBB12, and TBB13, respectively. Through analysis, three spectral characteristic quantities were selected: the brightness temperature in the ninth band TBB9 and the brightness temperature difference values TBB9 − TBB12 and TBB12 − TBB13. Figure 7 shows the three spectral features selected in this paper for recognizing severe convective clouds.


Extract Texture Features
The Gabor transform was used to extract the texture features of severe convective cloud images in different directions. The Gabor transform is a short-time-window Fourier transform with a Gaussian window function [28], which satisfies the locality of two-dimensional images in the spatial and frequency domains. In two-dimensional image processing, the Gabor filter has good filtering performance, is similar to the human visual system, and has a good texture detection function [29]. The two-dimensional Gabor filtering function g(x, y) is as follows:

g(x, y) = exp{−π[(xp/Sx)² + (yp/Sy)²]}·cos(2πf·xp),

where xp = x·cosθ + y·sinθ and yp = y·cosθ − x·sinθ. Sx and Sy are the ranges of the variables on the x and y axes, respectively, representing the size of the selected Gabor wavelet window, and f is the frequency of the sine function.
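A sketch of building such a real-valued Gabor kernel in NumPy (the function name and parameter values are ours; a Gaussian envelope of half-width Sx × Sy modulates a cosine of frequency f rotated by theta):

```python
import numpy as np

def gabor_kernel(s_x, s_y, f, theta):
    """Real-valued 2-D Gabor kernel of size (2*s_y+1, 2*s_x+1):
    a Gaussian window modulating a cosine of frequency f, with the
    coordinate frame rotated by theta."""
    ys, xs = np.mgrid[-s_y:s_y + 1, -s_x:s_x + 1]
    xp = xs * np.cos(theta) + ys * np.sin(theta)
    yp = ys * np.cos(theta) - xs * np.sin(theta)
    envelope = np.exp(-np.pi * ((xp / s_x) ** 2 + (yp / s_y) ** 2))
    return envelope * np.cos(2.0 * np.pi * f * xp)

k = gabor_kernel(s_x=7, s_y=7, f=0.1, theta=np.pi / 4)  # 45-degree direction
```

Convolving a brightness temperature image with kernels at theta = 0°, 45°, 90°, and 135° yields the four directional texture responses used later in the paper.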

Image Binarization
Cloud clusters in which the brightness temperature of the water vapor channel was greater than 220 K were removed, and the TBB9 spectral feature was binarized, as shown in Figure 9a.
A brightness temperature difference TBB9 − TBB12 greater than −4 K in the water vapor-infrared window region was used for the preliminary extraction of convective clouds. The spectral characteristic TBB9 − TBB12 was binarized, as shown in Figure 9b.
Cirrus and other noise remained in the convective cloud data preliminarily extracted by the water vapor-infrared window brightness temperature difference, so the split-window brightness temperature difference method used in the experiments excluded noise such as partial cirrus clouds by selecting TBB12 − TBB13 < 2 K as the threshold value, as shown in Figure 9c.
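The three thresholds can be combined into a single Boolean mask; the arrays below are toy values, while the thresholds (220 K, −4 K, 2 K) are those stated in the text:

```python
import numpy as np

# Toy brightness-temperature maps for channels 9, 12, and 13 (kelvin).
tbb9  = np.array([[210.0, 250.0],
                  [215.0, 219.0]])
tbb12 = np.array([[212.0, 255.0],
                  [216.0, 230.0]])
tbb13 = np.array([[211.0, 254.0],
                  [215.5, 227.0]])

# Thresholds from the text: cold water-vapor tops, water vapor minus
# infrared window difference, and split-window difference (cirrus removal).
mask = (tbb9 < 220.0) & ((tbb9 - tbb12) > -4.0) & ((tbb12 - tbb13) < 2.0)
```

Each condition reproduces one of the binarized maps in Figure 9a–c, and the element-wise AND gives their intersection.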
Figure 8 shows the texture features extracted by the Gabor filter in the 0°, 45°, 90°, and 135° directions.
Take the intersection of the two images in Figure 10a,b to obtain the labels needed for the final severe convection recognition, as shown in Figure 11. Set the area threshold to 4; regions with fewer than 4 pixels are excluded from the binarized image, as shown in Figure 12.

Model Performance Evaluation of Cloud Image Prediction
In order to quantitatively evaluate the prediction effect of the model, this study selected the evaluation indexes of peak signal-to-noise ratio (PSNR), root-mean-square error (RMSE), and correlation coefficient (R²). The calculation formulas are as follows:

RMSE = sqrt[(1/(M·N))·Σᵢ Σⱼ (I(i, j) − K(i, j))²],
PSNR = 10·log10(MAX²/MSE),

where MAX is the maximum possible value of a picture pixel, M and N are the height and width of the image, I(i, j) represents the pixel value of the i-th row and j-th column in the observed image, and K(i, j) represents the pixel value of the i-th row and j-th column in the model-predicted image.
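These three indexes can be sketched as follows (the R² here is computed as the coefficient of determination, which is an assumption about the paper's definition):

```python
import numpy as np

def rmse(obs, pred):
    """Root-mean-square error between observed and predicted images."""
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def psnr(obs, pred, max_val):
    """Peak signal-to-noise ratio in dB; max_val is the maximum
    possible pixel value."""
    mse = np.mean((obs - pred) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))

def r_squared(obs, pred):
    """Coefficient of determination between the two images."""
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - np.mean(obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)

obs = np.arange(16, dtype=float).reshape(4, 4)
pred = obs + 1.0  # constant 1 K error
```

With a constant 1 K error, the RMSE is exactly 1, and the PSNR reduces to 20·log10(MAX).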

Model Performance Evaluation of Recognition of Severe Convective Cloud
To further validate the model proposed in this article, four evaluation metrics, namely accuracy, precision, recall, and F1-score, were used to quantitatively analyze the results of recognizing severe convective clouds. The calculation formulas for these four evaluation metrics are as follows:

Accuracy = (TP + TN)/(TP + TN + FP + FN),
Precision = TP/(TP + FP),
Recall = TP/(TP + FN),
F1-score = 2·Precision·Recall/(Precision + Recall),

where TP, TN, FP, and FN denote the numbers of true positive, true negative, false positive, and false negative pixels, respectively.
The length of each series was set to 5 time steps, and the time interval between time nodes was 30 min; [x_{t−4}, x_{t−3}, x_{t−2}, x_{t−1}, x_t] was the input data, [x_{t+1}, x_{t+2}, x_{t+3}, x_{t+4}, x_{t+5}] was the output data, and t was the current moment. From June to September of 2021 and 2022, 5000 time series were selected, of which 4000 were used as the training set, 500 as the validation set, and the remaining 500 as the test set, for a total of 150 training epochs using the RMSprop optimizer.
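A minimal sketch of computing these four metrics from two binary masks (the function name is ours; a degenerate input with no predicted positives would need a zero-division guard omitted here for brevity):

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Pixel-wise accuracy, precision, recall, and F1-score for a
    binary severe-convective-cloud mask."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_pred & y_true)    # true positives
    tn = np.sum(~y_pred & ~y_true)  # true negatives
    fp = np.sum(y_pred & ~y_true)   # false positives
    fn = np.sum(~y_pred & y_true)   # false negatives
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2.0 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred)
```

On this toy example, 2 of 3 true cloud pixels are found and 2 of 3 predicted pixels are correct, so precision, recall, and F1 all equal 2/3.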

Comparison of Cloud Image Prediction Models
Various deep learning models were used to compare the effect of cloud image prediction. The predicted results are shown in Figure 13. It can be seen from Figure 13 that, compared with the ARRU-Net model, the images predicted by the other models were blurred: only the general shape of the cloud could be seen, the fine-grained details of the clouds were significantly compromised, and the images became more and more blurred over time. However, the images predicted by the ARRU-Net model proposed in this study were closer to the label images, and their sharpness was significantly higher than that of U-Net. Over time, the sharpness of the images changed little, and more cloud details could be predicted.
In order to evaluate the comparison between this method and the other models more objectively, the PSNR, RMSE, and R² indexes of the four models' predicted images on the test set were calculated, as shown in Figure 14. It can be seen from the figure that the prediction effect of ARRU-Net was better than that of the other models. The average PSNR of the images predicted by ARRU-Net at the next five moments was more than 33 dB, the average RMSE at the five moments was less than 7 K, and the average R² at the five moments was higher than 0.88. The ARRU-Net model achieved an average RMSE reduction of 1.3 K and a PSNR increase of 1.95 dB compared to U-Net in predicting the following five time steps on the test set. This proves that the method can predict images that are clearer and more similar to the label images and has higher accuracy for long-term prediction.
We compared our method, ARRU-Net, with five other methods: Opticalflow-LK [30], DBPN [31], SRCloudNet [32], AFNO [33], and GAN+Mish+Huber [34]. Among these approaches, SRCloudNet, GAN+Mish+Huber, and AFNO were the newest state-of-the-art methods. As shown in Table 2, the ARRU-Net model outperformed some of the other methods.
ARRU-Net and the other models were used to recognize severe convective clouds on the cloud image prediction sequence. The models' inputs were the predicted satellite cloud image sequence, and the outputs were the recognition results for the satellite cloud image sequence. The label images were formed according to the label-making method introduced in Section 3.2. The RMSprop optimizer was used for 150 training epochs. Figure 15 shows the severe convection cloud recognition results of the ARRU-Net model and the other models on the same test data for the predicted satellite cloud images. As can be seen from Figure 16, the average accuracy and recall for cloud recognition were 97.62% and 83.34%, respectively; compared with the U-Net model, this is a 2% increase in accuracy and a 4% increase in recall, indicating that this method can effectively eliminate cirrus clouds and improve the accuracy of severe convective cloud recognition. Therefore, quantitative and qualitative analyses show the clear advantages of this method: the ability of the ARRU-Net model to capture cloud features was improved, achieving more accurate severe convective cloud recognition.

Conclusions
The study was conducted to recognize severe convective clouds based on the FY-4A satellite image prediction sequence, in order to provide more accurate guidance for mitigating the impact of severe convective weather. Based on the U-Net network with the attention mechanism, the recurrent convolution, and the residual module, this study proposed a new model, named ARRU-Net, which predicts the following 2.5 h of satellite cloud images and recognizes severe convective clouds within them. The characteristic parameters used to recognize severe convective clouds were brightness temperature values TBB 9 , brightness temperature difference values TBB 9 − TBB 12 and TBB 12 − TBB 13 , and texture features based on spectral characteristics.
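The recurrent residual convolution idea at the core of ARRU-Net can be illustrated with a minimal sketch (a single fixed kernel stands in for the trainable convolution layers of the actual model; the function name and kernel choice are illustrative assumptions): the unit refines its activation t times by re-convolving the sum of the block input and the current activation, and then adds the block input back as an identity shortcut.

```python
import numpy as np
from scipy.signal import convolve2d

def recurrent_residual_unit(x, kernel, t=2):
    """Sketch of one recurrent residual convolution unit.

    x      : 2-D input feature map
    kernel : 2-D convolution kernel (fixed here; learned in the real model)
    t      : number of recurrent refinement steps
    """
    # Initial convolution with ReLU activation.
    h = np.maximum(convolve2d(x, kernel, mode="same"), 0.0)
    # Recurrent steps: feed the activation back in with the input.
    for _ in range(t):
        h = np.maximum(convolve2d(x + h, kernel, mode="same"), 0.0)
    # Residual (identity) connection around the whole unit.
    return x + h
```

The residual shortcut eases gradient flow through the deep encoder-decoder, while the recurrent steps let the same weights accumulate context over several passes.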
The results of the experiments indicated that the proposed method surpassed the comparison models. The RMSE, PSNR, and R2 of the predicted results were 5.48 K, 35.52 dB, and 0.92, respectively, and the ARRU-Net model achieved an average RMSE reduction of 1.3 K and a PSNR increase of 1.95 dB compared to U-Net in predicting the following five time steps on the test set. The average accuracy and recall for cloud recognition were 97.62% and 83.34%, respectively; compared with the U-Net model, accuracy increased by 2% and recall by 4%, indicating that this method can effectively eliminate cirrus clouds and improve the accuracy of severe convective cloud recognition.
Despite the promising results of this study, the model's performance in predicting cloud images decreased over time. This was due to the sparse temporal sampling of the data, which can cause significant changes in the objects between adjacent images; consequently, the prediction and segmentation of longer-term and small-scale convective clouds were affected. Further research is necessary to enhance the resolution of the predicted images and to improve long-term sequence prediction accuracy.

Figure 1. Illustration of the selected area and the observation area of the FY-4A satellite.


Figure 4. Attention mechanism structure of the proposed ARRU-Net model.


Figure 5. Two kinds of convolutional blocks: (a) a basic unit of the U-Net convolution; (b) the recurrent convolution of the proposed ARRU-Net model.

The residual connection links convolutional layers of the same resolution; consequently, this approach enhances the model's generalization capability. The structure of the residual connection is shown in Figure 6.

Figure 6. Residual connection structure of the proposed ARRU-Net model.


Figure 9. Image binarization: (a) brightness temperature of the water vapor channel TBB 9 < 220 K; (b) brightness temperature difference TBB 9 − TBB 12 > −4 K; (c) brightness temperature difference of water vapor-infrared window TBB 12 − TBB 13 < 2 K.

3.2.4. Closed Operations and Intersection Operations
(1) A closed operation was carried out on the preliminary recognition results of severe convection by TBB 9 , TBB 9 − TBB 12 , and TBB 12 − TBB 13 , as shown in Figure 10.
(2) The intersection of the two images in Figure 10a,b was taken to obtain the labels needed for the final severe convection recognition, as shown in Figure 11.
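The label-making steps above can be sketched as follows (illustrative code only, assuming numpy arrays of brightness temperatures; the structuring-element size is an assumption not given in the text, and since the text does not fully specify which two closed masks feed the intersection, this sketch intersects all three):

```python
import numpy as np
from scipy.ndimage import binary_closing

def make_convection_label(tbb9, tbb12, tbb13, structure=np.ones((3, 3))):
    """Binarize by the three spectral thresholds, apply a morphological
    closing to each mask, then intersect the closed masks."""
    m1 = tbb9 < 220.0            # water vapor channel threshold (K)
    m2 = (tbb9 - tbb12) > -4.0   # TBB9 - TBB12 threshold (K)
    m3 = (tbb12 - tbb13) < 2.0   # water vapor - IR window threshold (K)
    closed = [binary_closing(m, structure=structure) for m in (m1, m2, m3)]
    return closed[0] & closed[1] & closed[2]
```

The closing fills small gaps inside each preliminary mask before the intersection removes pixels that fail any of the spectral tests.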

Figure 11. The recognition result after the intersection operation of Figure 10a,b.


Figure 12. The recognition result after applying the area threshold: (a) original image; (b) the final label image.
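The area-threshold step of Figure 12 can be sketched as a connected-component filter (illustrative code; the minimum-area value is an assumption, not taken from the paper):

```python
import numpy as np
from scipy.ndimage import label as cc_label

def area_filter(mask, min_area=16):
    """Remove connected components smaller than min_area pixels, keeping
    only regions large enough to plausibly be severe convective clouds."""
    labeled, _ = cc_label(mask)                 # label connected components
    sizes = np.bincount(labeled.ravel())        # pixel count per component
    keep = sizes >= min_area                    # which components survive
    keep[0] = False                             # background is never kept
    return keep[labeled]                        # map decision back to pixels
```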


4.2.2. Comparison of Recognition of Severe Convective Cloud Based on the Cloud Image Prediction Sequence
Figure 15 shows an example of recognition results for a predicted image sequence in the test set using the ARRU-Net model. In the segmentation results of the predicted image sequence, the gray base map is the predicted image, green represents hit pixels, blue represents missed pixels, and red represents falsely detected pixels. The ARRU-Net model presented in this study produced noticeably more green hits, and fewer blue misses and red false detections, than the other models.
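The green/blue/red evaluation map and the pixel-wise accuracy and recall can be computed from a predicted mask and a label mask as follows (a sketch; function and variable names are illustrative):

```python
import numpy as np

def eval_masks(pred, label):
    """Pixel-wise confusion categories and metrics for boolean masks:
    hits (green), misses (blue), and false detections (red)."""
    hit = pred & label            # green: predicted and labeled
    miss = ~pred & label          # blue: labeled but not predicted
    false_det = pred & ~label     # red: predicted but not labeled
    tp, fn, fp = int(hit.sum()), int(miss.sum()), int(false_det.sum())
    tn = pred.size - tp - fn - fp
    accuracy = (tp + tn) / pred.size
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return hit, miss, false_det, accuracy, recall
```

The three boolean maps are what get colored over the gray predicted image in Figure 15, while accuracy and recall averaged over the test set give the figures reported in the text.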


Figure 15. Comparison of the results of recognition of severe convective cloud based on the cloud image prediction sequence: green represents hit pixels, blue represents missed pixels, and red represents falsely detected pixels. (a) input; (b) label; (c) U-Net; (d) ConvLSTM; (e) 3DCNN; (f) U-Net+Residual+Recurrent; (g) ARRU-Net.


Author Contributions: Conceptualization, X.Y. and Y.L.; methodology, Q.C., X.Y. and M.C.; software, Q.C. and X.Y.; investigation, X.Y. and Y.L.; writing-original draft preparation, Q.C.; writing-review and editing, X.Y., Y.L. and Q.X.; supervision, Q.X. and P.Z. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by Laoshan Laboratory science and technology innovation projects, grant number LSKJ202201202; the Hainan Key Research and Development Program, grant number ZDYF2023SHFZ089; the Fundamental Research Funds for the Central Universities, grant number 202212016; and the Hainan Provincial Natural Science Foundation of China, grant number 122CXTD519.

Data Availability Statement: Not applicable.

Table: Channel Type | Central Wavelength | Spectral Bandwidth | Spatial Resolution | Main Applications (FY-4A channel specifications; table body not recovered).
°E to 128.72°E and 25.28°N to 35.48°N. The formula for calculating the number of rows in the projected image is as follows,

Table 2. Comparison of the performance of our model and other methods in cloud image prediction.


Table 3. Comparison of the performance of our model and other methods in cloud image recognition.
