Weather Radar Echo Extrapolation Method Based on Deep Learning

: In order to forecast high-intensity and rapidly changing phenomena such as thunderstorms, heavy rain and hail within 2 h, and to reduce the influence of destructive weather, this paper proposes a weather radar echo extrapolation method based on deep learning. The proposed method combines data preprocessing, the convolutional long short-term memory (Conv-LSTM) neuron and an encoder–decoder model. We collected eleven thousand weather radar echo samples at high spatiotemporal resolution; these data are preprocessed before entering the neural network for training, to improve data quality and training performance. Next, a neuron that integrates the structure and advantages of the convolutional neural network (CNN) and long short-term memory (LSTM), called Conv-LSTM, is applied to solve the problem that the fully connected LSTM (FC-LSTM) cannot extract the spatial information of the input data. It replaces the fully connected structure in the input-to-state and state-to-state transitions so that Conv-LSTM can extract information from the spatial dimensions as well. Meanwhile, because the input and output sequences differ in size, the encoder–decoder model is adopted and combined with the Conv-LSTM neuron. During training, a mean square error (MSE) loss function weighted according to the rainfall rate is used. Finally, matrix "point-to-point" verification metrics, including the probability of detection (POD), critical success index (CSI) and false alarm ratio (FAR), together with the spatial verification method of contiguous rain areas (CRA), are used to examine the radar echo extrapolation results. Under a threshold of 30 dBZ at a lead time of 1 h, we achieved 0.60 (POD), 0.42 (CSI) and 0.51 (FAR), compared with 0.42, 0.28 and 0.58 for the CTREC algorithm, and 0.30, 0.24 and 0.71 for the TITAN algorithm.
Meanwhile, at 1 h, we achieved 1.35 (total MSE) compared with 3.26 for the CTREC algorithm and 3.05 for the TITAN algorithm. The results demonstrate that the deep-learning-based radar echo extrapolation method is clearly more accurate and stable than traditional radar echo extrapolation methods in nowcasting.


Introduction
The role of strong convective weather forecasting in today's society is self-evident: the evolution of such weather is extremely complex, and it has a large impact on agriculture and social activities, easily causing disasters and loss of life and property. Therefore, the inherent laws, characteristics and future trends of this kind of weather have long been the focus of weather forecasting departments. Forecasting this kind of weather over the very short term is called nowcasting (near weather forecasting) [1]. Nowcasting technology is mainly divided into three branches: extrapolation techniques (which combine identification, tracking and extrapolation), numerical weather prediction, and expert system forecasting combined with multiple observation data and analysis methods [2,3]. Numerical weather prediction involves the solution of complex physical equations, and it is difficult for it to satisfy the accuracy and real-time requirements of precipitation prediction [4]. Expert system forecasting needs hardware support and personnel assistance, and it must integrate a variety of meso- and small-scale observation data and different weather prediction technologies as well [5,6].
Extrapolation forecasting, which performs well in nowcasting and has been developed fairly maturely, can provide useful reference and early warning within 0-2 h. The mainstream extrapolation algorithms include the cross-correlation, monomer (cell) centroid and optical flow methods [7]. The first two have been widely used in many local meteorological forecasting departments [8]. In this paper, we use a cross-correlation method (coordinate tracking radar echoes by correlation, CTREC) and a monomer centroid method (thunderstorm identification, tracking, analysis and nowcasting, TITAN) as baselines for the proposed deep-learning-based weather radar echo extrapolation method.
The cross-correlation method predicts well for stratiform cloud systems and mixed stratiform-convective systems that change slowly and evolve stably. However, its accuracy is low for severe convective systems with rapid changes of direction and complex motion trends [9-12]. The centroid method [13,14] can effectively track convective cells of high intensity, but it struggles to identify echoes of low intensity and complex structure, and the storm cells [14-16] must not develop too violently; otherwise, tracking easily fails. Since both extrapolation methods have these defects, extrapolation prediction based on deep learning has been developed.
Deep learning is built on machine learning algorithms and theory and was developed to meet the requirements of artificial intelligence [17]. A deep learning model is usually end-to-end: we only need to feed in the data and collect the output, so users do not require deep domain expertise. In recent years, deep learning has made major breakthroughs in both technology and theory [18], showing excellent abilities in many fields. By formulating an optimization algorithm, constructing a neural network model and training on a large amount of data, a deep neural network can effectively "learn" the internal correlations of high-spatiotemporal-resolution radar data sequences, quickly capturing the evolution law and motion state of the radar echo. At present, nowcasting based on deep learning is mainly realized through radar echo extrapolation [19-23]. Compared with CTREC or TITAN, a deep learning model can overcome their disadvantages, tracking and forecasting severe convective weather more stably and accurately. Moreover, as deep learning develops, deep-learning-based radar echo extrapolation has ever greater potential.
Meanwhile, several kinds of high-resolution data exist in the meteorological field, such as radar data from traditional single-polarization and more advanced dual-polarization Doppler radars, as well as real-time observations from ground stations and satellites [24,25]. The amount of data is so large that it is natural to combine it with deep learning. If we can build a deep learning model on a compute-intensive server, train it on these data, save the end-to-end model and deploy it directly in an operational weather forecast system, real-time prediction at the scale of minutes or seconds can be expected. Such capability will be essential to the development of weather forecast services.
The main contributions of this paper are as follows: (1) Read large quantities of weather radar data from radar base data; the benefits of massive data for deep learning are shown in this paper. (2) Choose effective data quality control methods to filter the clutter, and use the best of three interpolation methods. (3) Find appropriate parameters for the neural network. (4) Select a reasonable loss function and set up a weight matrix to assist the training of the neural network. (5) Compile multiple tables from the experimental results and evaluation criteria to demonstrate the accuracy of the deep learning system.
This paper includes six sections. The first section is the introduction, which introduces the background of nowcasting technology and illustrates the practicability and advantages of the deep learning system. The second section gives the data preprocessing methods for producing the input data and improving training quality. The third section introduces the core principles and algorithms of the deep learning method. The fourth section explains the evaluation criteria of the extrapolation results, so that we can verify the results from different aspects. The fifth section presents the quantitative result analyses and figures for the traditional extrapolation algorithms and the deep learning algorithm. The final section gives the conclusions and prospects of this paper.

Data Preprocessing
The new generation of weather radar detects not only meteorological targets but also non-meteorological targets. The quality of the weather radar echo data has a direct impact on the extrapolation experiments. The main factors affecting data quality are ground clutter and noise clutter. These two kinds of clutter degrade the radar's measurement of precipitation and the integrity of the echo display; they also affect feature extraction, target judgment and result calculation in the extrapolation experiments. We therefore apply quality control algorithms to filter this clutter. The data after quality control are then interpolated from polar coordinates onto a plane grid. Finally, the interpolated data are normalized and fed to the extrapolation method as input. The flowchart of the data preprocessing is shown in Figure 1.

Noise Clutter
For the noise clutter, we filter out isolated points and fill in missing detection points.
Filtering isolated points requires traversing every point of the radar echo data; if the point is valid, a rectangular N * N window is created around it. The number of valid data M in the window is counted and the proportion of valid data is S = M / N^2. A threshold M_0 is then set to determine whether the point is isolated (typically M_0 = 0.7; we verified that thresholds between 0.5 and 0.7 filter isolated points effectively). If S is less than M_0, the point is judged to be isolated and is set to invalid.
Filling in missing points is also known as the alopecia areata (patchy dropout) problem. Similar to the filtering of isolated points, this method traverses the radar echo data and creates a rectangular M * M window on each traversed point (in this paper M = 5, a value found appropriate through experiments), then counts the number of valid data in the window and compares it with a threshold (the default is 12; the threshold can be varied around 12, up to the number of points one wants to fill in). If the number of valid data exceeds this threshold, the target grid point is replaced with the mean of the valid data in the window, (1/n) Σ V_i, where V_i is the value of the i-th valid datum in the window and n is the number of valid data. The specific schematic diagram is shown in Figure 2.
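The two window operations above can be sketched in Python as follows; this is a minimal, unoptimized version that assumes invalid data are stored as NaN and that a missing point is filled with the window mean. The window sizes, the 0.7 proportion threshold and the count threshold of 12 follow the text.

```python
import numpy as np

def filter_isolated(field, n=5, m0=0.7):
    """Set a valid point to invalid when the fraction of valid data
    in the surrounding n*n window falls below the threshold m0."""
    out = field.copy()
    h, w = field.shape
    r = n // 2
    for i in range(h):
        for j in range(w):
            if np.isnan(field[i, j]):
                continue
            win = field[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            s = np.count_nonzero(~np.isnan(win)) / win.size  # valid fraction S
            if s < m0:
                out[i, j] = np.nan                           # isolated point
    return out

def fill_missing(field, m=5, min_valid=12):
    """Fill an invalid point with the mean of the valid data in the
    surrounding m*m window when at least min_valid neighbours are valid."""
    out = field.copy()
    h, w = field.shape
    r = m // 2
    for i in range(h):
        for j in range(w):
            if not np.isnan(field[i, j]):
                continue
            win = field[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            valid = win[~np.isnan(win)]
            if valid.size >= min_valid:
                out[i, j] = valid.mean()
    return out
```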

Ground Clutter
In this paper, ground clutter is identified using three features: the reflectivity mean radial texture (TDBZ), the vertical gradient of reflectivity (VGZ) and the absolute radial velocity (V_abs) [26,27]. Ground clutter differs significantly from meteorological echo in all three aspects. TDBZ is computed as the mean squared gate-to-gate reflectivity difference along the radial within a sector window:

TDBZ = (1 / (N_gates · N_radials)) Σ_i Σ_j (Z_{i,j} − Z_{i−1,j})^2

Here, i and j represent the range bin number and radial number of the reflectivity factor, respectively; N_gates and N_radials are the numbers of range bins and radials in the sector region centered on coordinates (i, j). In this paper, both N_gates and N_radials are set to 5 (a value found appropriate through experiments). TDBZ is most effective for separating ground clutter from precipitation echo far from the radar center (range > 150 km), where the TDBZ of ground clutter is relatively large.
The vertical gradient of reflectivity (VGZ) reflects how the echo varies in the vertical direction and is an essential feature for separating precipitation echo from ground clutter. Ground clutter usually appears at low elevations and disappears as the elevation increases, so the magnitude of VGZ for ground clutter is generally large. It is computed as

VGZ = (Z_up − Z_low) / (H_up − H_low)

where Z_low is the reflectivity factor at the low elevation, Z_up is the reflectivity factor at the high elevation with the same azimuth and range bin number, and H_up and H_low are the corresponding heights; the reference height H_up is 3-4.5 km and H_low is the height at the low elevation.
Additionally, V_abs is the absolute radial velocity corresponding to the azimuth and range bin number. Because the resolutions of radial velocity and reflectivity differ, there are four radial-velocity range bins per reflectivity range bin; therefore, four consecutive radial-velocity grid points in the radial direction correspond to one range bin of the reflectivity factor.
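A minimal sketch of the three clutter features is given below. The exact TDBZ formula is not reproduced in this text, so the version here assumes the common definition as the mean squared gate-to-gate reflectivity difference along the radial within the sector window; `vgz` and `v_abs` follow the descriptions above.

```python
import numpy as np

def tdbz(refl, i, j, n_gates=5, n_radials=5):
    """Reflectivity texture: mean squared gate-to-gate difference along the
    radial, over an n_radials x n_gates sector centred on gate i, radial j.
    refl is indexed as refl[radial, gate]."""
    rg, rr = n_gates // 2, n_radials // 2
    sector = refl[max(0, j - rr):j + rr + 1, max(0, i - rg):i + rg + 1]
    diffs = np.diff(sector, axis=1)          # differences along the radial
    return float(np.mean(diffs ** 2))

def vgz(z_low, z_up, h_low, h_up):
    """Vertical gradient of reflectivity between two elevations (dBZ/km)."""
    return (z_up - z_low) / (h_up - h_low)

def v_abs(radial_velocity):
    """Absolute radial velocity; ground clutter is near-stationary (|v| ~ 0)."""
    return abs(radial_velocity)
```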

Data Interpolation
Data interpolation maps the grid points in the plane grid region of the Cartesian coordinate system (hereinafter, the Cartesian grid) into the polar coordinate system centered on the radar station. The polar coordinates of a grid point P(x, y, z) in the Cartesian system are obtained as follows:

R = sqrt(x^2 + y^2 + z^2), θ = arctan(x / y), φ = arctan(z / sqrt(x^2 + y^2))

In the above equations, x, y and z represent the coordinates in the Cartesian coordinate system, and R, θ and φ represent the radial distance, azimuth and elevation of the point in polar coordinates, respectively.
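The coordinate transform can be sketched as below; this version neglects earth curvature and beam refraction, which a production implementation would have to account for.

```python
import math

def cartesian_to_polar(x, y, z):
    """Map a grid point (x, y, z), with the radar at the origin, to polar
    coordinates (R, theta, phi): slant range, azimuth clockwise from north
    (degrees), and elevation angle (degrees). Earth curvature and beam
    refraction are neglected in this sketch."""
    r_ground = math.hypot(x, y)                      # distance on the ground
    R = math.sqrt(x * x + y * y + z * z)             # radial (slant) distance
    theta = math.degrees(math.atan2(x, y)) % 360.0   # azimuth from north
    phi = math.degrees(math.atan2(z, r_ground))      # elevation
    return R, theta, phi
```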
The method of eight-point linear interpolation is adopted; its schematic is shown in Figure 3. There are eight data points adjacent to the interpolation point P: f_1 to f_4 are the four data points on the elevation surface above P, and f_5 to f_8 are the four on the surface below P, each identified by its elevation, azimuth and range. The value of the point to be interpolated is obtained through bilinear interpolation within each surface, followed by linear interpolation between the two surfaces.

Experimental Result Analysis
First, we analyze the experimental results of noise and ground clutter filtering in the data quality control. Figure 4 shows that after noise filtering, the noise points on the radar echo map become fewer and the edge of the echo becomes smoother. After ground clutter filtering, the clutter located in the middle of the map is removed, and the quality of the radar echo map is significantly improved.
To generate the constant altitude plan position indicator (CAPPI) used as the input of the radar echo extrapolation methods, we tried three interpolation methods: nearest-neighbor combined with vertical linear interpolation (NVI), linear interpolation in the vertical followed by the horizontal direction (VHI) and eight-point linear interpolation (EPI). Figure 5 compares the original radar echo image with the interpolation results of the three methods. Because EPI considers all three factors (radial, azimuth and elevation), more grid points are interpolated and the results are smoother. Therefore, EPI was used in the extrapolation experiments.

Deep Learning Algorithm
The first concern in deep learning is the specification and quality of the training data; the second is the selection and optimization of the training algorithm; and the third is the parameter settings, network depth and layer matching of the training network. The configuration of each link has a greater or lesser influence on the training effect, so, despite its complexity, the deep learning approach accounts for these factors more comprehensively.
Based on deep learning theory, this section combines the CNN and LSTM neurons to form the Conv-LSTM neurons, which serve as the core and engine of neural network training. The Conv-LSTM is used as the neuron of the encoder-decoder model to form a time-series prediction model for radar echo extrapolation.
When training in deep learning, a suitable loss function should be selected as the optimization target to gradually improve the convergence of the training. Different network "tasks" call for different loss functions. In this paper, mean square error (MSE) and related loss functions are used for training optimization; experiments show that such loss functions give the network the best convergence. The flowchart of the method proposed in this paper is shown in Figure 6. CNN is a deep neural network whose core idea is the convolution operation [28]. Its three core techniques are the receptive field, weight sharing and the downsampling layer [29,30].
The structure of CNN generally includes an input layer, convolutional layer, excitation function, downsampling layer, full-connection layer and output layer.
In the convolution operation, the convolution kernel scans the input; after elementwise multiplication and the addition of a bias, the value of a neuron in the next layer is obtained:

Z^{l+1}(i, j) = Σ_k Σ_x Σ_y [Z_k^l(s_0 i + x, s_0 j + y) · w_k^{l+1}(x, y)] + b

L^{l+1} = (L^l + 2p − f) / s_0 + 1

In Equations (6) and (7), Z^l and Z^{l+1} represent the input and output of the (l + 1)-th layer and L^{l+1} is the size of Z^{l+1}; the feature maps are assumed to have equal length and width. Z(i, j) is a pixel of the feature map; K is the number of channels of the feature map; f is the side length of the square convolution kernel; s_0 is the stride of the kernel movement; and p is the amount of zero padding during the convolution.
When f = 1, s_0 = 1 and p = 0, the convolution operation computed by the cross-correlation algorithm is equivalent to a fully connected operation. The convolution and downsampling operations are shown in Figure 7, where * represents the convolution operation. The hidden and output layers, or two hidden layers, are connected through an excitation (activation) function. The most common excitation functions in deep learning are the Sigmoid, Tanh, Softmax and ReLU functions. The first two are saturating nonlinear functions, whereas ReLU is a rectified linear function and the most commonly used of them.
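Equations (6) and (7) can be illustrated with a minimal single-channel sketch: `conv_output_size` implements the output-size formula and `conv2d` the plain cross-correlation (one channel, one kernel; the helper names are ours, not the paper's).

```python
import numpy as np

def conv_output_size(l_in, f, s0, p):
    """Spatial size after convolution: L_out = (L_in + 2p - f) / s0 + 1."""
    return (l_in + 2 * p - f) // s0 + 1

def conv2d(x, kernel, s0=1, p=0):
    """Plain cross-correlation of a single-channel square feature map."""
    if p:
        x = np.pad(x, p)                       # zero padding of width p
    f = kernel.shape[0]
    n = (x.shape[0] - f) // s0 + 1             # output side length
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = np.sum(x[i * s0:i * s0 + f, j * s0:j * s0 + f] * kernel)
    return out
```

With f = 1, s_0 = 1 and p = 0, each output pixel is a scalar multiple of the corresponding input pixel, which is the pointwise (fully connected) special case mentioned above.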

LSTM Neural Network
The recurrent neural network (RNN) is mainly used in time-series prediction. Its most distinctive feature is that the output of a neuron at one moment can be fed back into the neuron as input, so the data at adjacent moments become correlated and dependent; this is why RNN is applied to time series. For a multilayer RNN, only three weight matrices need to be learned per layer. Because the weights are shared across time, just as CNN weights are shared across space, the number of parameters to be learned in an RNN is significantly reduced.
The disadvantage of RNN is also obvious: if the prediction horizon is long, repeated multiplication makes the gradients grow or shrink too severely, leading to the problems of gradient explosion and vanishing [31].
To solve the gradient explosion and vanishing problems of the RNN in long-horizon prediction, LSTM was built as an extension of the RNN. LSTM is an upgraded version of RNN proposed by Hochreiter and Schmidhuber in 1997 [32]. It has been proved to have an excellent ability to deal with long-sequence problems [33]. The structure of LSTM is shown in Figure 8. The core idea of LSTM is to preserve and propagate the cell state over long times. Three "gate switches" control the weighting of the cell and neuron states at each moment: the forgetting, input and output gates. The equations of LSTM are as follows:

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
c_t = f_t ∘ c_{t-1} + i_t ∘ c̃_t
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
h_t = o_t ∘ tanh(c_t)

Here, ∘ represents the Hadamard product; x_t is the input at the current moment; the W are weight matrices and the b are biases; c_{t-1} is the cell state at the previous moment; f_t is the value of the forgetting gate (i.e., how much of the previous cell state should be forgotten); i_t is the input gate; c_t is the cell state at the current moment; o_t is the output gate; and h_t is the final output, the Hadamard product of the output gate and the squashed current state.
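The six gate equations can be sketched directly in NumPy; the weight shapes below (each gate matrix maps the concatenated [h_{t-1}, x_t]) are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. Each W[k] maps the concatenated [h_prev, x] to a gate
    pre-activation; '*' below is the Hadamard (element-wise) product."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(W['f'] @ z + b['f'])        # forget gate
    i = sigmoid(W['i'] @ z + b['i'])        # input gate
    g = np.tanh(W['c'] @ z + b['c'])        # candidate cell state
    c = f * c_prev + i * g                  # new cell state
    o = sigmoid(W['o'] @ z + b['o'])        # output gate
    h = o * np.tanh(c)                      # new hidden state / output
    return h, c
```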

Conv-LSTM Neural Network
The classical LSTM structure flattens the data into one dimension for prediction, which handles temporal correlation well. However, FC-LSTM can extract only time-series information, not spatial information. Spatial data, especially radar echo data, carry much structural information that FC-LSTM cannot process.
To solve this problem, a convolutional structure for the input-to-state and state-to-state transitions was introduced. Conv-LSTM uses convolution instead of full connection to extract the spatial information of the sequence; in other words, the main difference between FC-LSTM and Conv-LSTM is that Conv-LSTM replaces the matrix multiplications with convolution operations. The equations are as follows:

f_t = σ(W_xf * X_t + W_hf * H_{t-1} + b_f)
i_t = σ(W_xi * X_t + W_hi * H_{t-1} + b_i)
C̃_t = tanh(W_xc * X_t + W_hc * H_{t-1} + b_c)
C_t = f_t ∘ C_{t-1} + i_t ∘ C̃_t
o_t = σ(W_xo * X_t + W_ho * H_{t-1} + b_o)
H_t = o_t ∘ tanh(C_t)

where * represents the convolution operation and ∘ the Hadamard product.
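The gate computations of Conv-LSTM can be sketched by swapping each matrix multiplication for a 2-D convolution. The version below is single-channel, loop-based and illustrative only; the kernel and bias shapes are our assumptions.

```python
import numpy as np

def conv2d_same(x, k):
    """'Same'-size 2-D cross-correlation with zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    H, W = x.shape
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h_prev, c_prev, W):
    """One Conv-LSTM step: the gates use convolutions of the input and the
    previous hidden state instead of matrix multiplications."""
    f = sigmoid(conv2d_same(x, W['xf']) + conv2d_same(h_prev, W['hf']) + W['bf'])
    i = sigmoid(conv2d_same(x, W['xi']) + conv2d_same(h_prev, W['hi']) + W['bi'])
    g = np.tanh(conv2d_same(x, W['xc']) + conv2d_same(h_prev, W['hc']) + W['bc'])
    c = f * c_prev + i * g                  # Hadamard products, as in FC-LSTM
    o = sigmoid(conv2d_same(x, W['xo']) + conv2d_same(h_prev, W['ho']) + W['bo'])
    h = o * np.tanh(c)
    return h, c
```

Note that the state update (`c = f * c_prev + i * g`) is unchanged from FC-LSTM; only the input-to-state and state-to-state maps become convolutions, so the hidden state keeps the spatial layout of the radar image.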

Encoder-Decoder Model
The encoder-decoder model is adopted because the input and output are asymmetric. With this model, inputs of one length can be used to compute outputs of a different length, removing the constraint that a plain LSTM must have symmetric input and output. The basic idea is to use two stacks of recurrent layers, one as the encoder and the other as the decoder; the encoder and decoder structures may likewise be asymmetric.
The encoder of the deep learning model takes the radar echo data as input; after multi-layer downsampling and convolutional processing, the cell state is packaged and sent to the decoder. The decoder takes the cell state as input and restores it to the output data through multiple layers of deconvolution and upsampling. Figure 9 shows the structure of the encoder-decoder model. In this paper, we used three Conv-LSTM layers as the encoder and three Conv-LSTM layers as the decoder, and added downsampling to the encoder and upsampling to the decoder, implemented as convolution and deconvolution, respectively.
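The data flow of the encoder-decoder model (the state handoff and the independent input/output lengths) can be sketched with a stand-in recurrent cell; a real implementation would use the Conv-LSTM layers described above.

```python
import numpy as np

def cell(x, h):
    # Stand-in recurrent cell; a real model would use a Conv-LSTM step here.
    return np.tanh(0.5 * x + 0.5 * h)

def encoder_decoder(inputs, n_out):
    """Encode a frame sequence into a state, then unroll the decoder n_out
    times. Input and output lengths are independent, which is the point of
    the encoder-decoder model."""
    h = np.zeros_like(inputs[0])
    for x in inputs:                 # encoder: consume the whole input sequence
        h = cell(x, h)
    outputs, y = [], np.zeros_like(h)
    for _ in range(n_out):           # decoder: generate the forecast sequence
        y = cell(y, h)               # previous output fed back with the state
        outputs.append(y)
    return outputs
```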

Loss Function
When training in deep learning, the loss function serves as the optimization target that the optimization algorithm gradually improves on the dataset; an appropriate loss function gives the network the best convergence.
The radar echo extrapolation in this paper is a machine learning regression problem: a supervised learning setting in which the predicted target values are continuous and should approximate the real values.
The loss function is the criterion by which results are judged: the smaller its value, the better the performance and robustness of the model. Since the output of a regression problem is continuous, the loss function can be chosen among MSE, mean absolute error (MAE) and root mean square error (RMSE); their calculation methods are shown in Equations (11)-(13). These loss functions have similar properties: they are all computed "point-to-point" between matrices, allowing one to see directly how similar the predicted and true values are. In this paper, MSE is chosen as the loss function.
We adopted a weight matrix that weights pixels according to the reflectivity factor value: low reflectivity values occur very frequently, while the probability of occurrence decreases as the reflectivity rises, so higher values receive larger weights. In Equation (9), x represents the value of the reflectivity factor, in dBZ.
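Since the weighting equation itself is not reproduced here, the sketch below assumes a simple banded scheme in which the weight grows with the observed reflectivity; the band edges and weight values are illustrative only, not the paper's.

```python
import numpy as np

def weighted_mse(pred, truth,
                 thresholds=(20.0, 30.0, 40.0),
                 weights=(1.0, 2.0, 5.0, 10.0)):
    """MSE in which each pixel is weighted by the observed reflectivity band,
    so that rarer high-dBZ echoes contribute more to the loss.
    Band edges and weights here are illustrative assumptions."""
    w = np.full(truth.shape, weights[0])
    for t, wt in zip(thresholds, weights[1:]):
        w[truth >= t] = wt                    # heavier rain -> larger weight
    return float(np.sum(w * (pred - truth) ** 2) / np.sum(w))
```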

Evaluation Criteria of Extrapolation Results
Verification of precipitation forecasts is generally divided into two classes. The first class is the matrix "point-to-point" verification method, which computes the point-by-point difference between two matrices of the same size and then averages the differences, letting us see the difference between the two matrices statistically. Representative scores of this class include POD, CSI and FAR.
The second class is spatial verification. The traditional point-to-point method easily leads to the "double penalty" phenomenon, tending to treat a displaced precipitation forecast as a failed forecast. To overcome this, spatial verification techniques have been developed in recent years and applied to the evaluation of precipitation forecasts. They evaluate the prediction results from another angle, making the verification more comprehensive and detailed. The contingency table used by the point-to-point scores is shown in Table 1.

                          Prediction Is Positive    Prediction Is Negative
Observation is positive           TP                        FN
Observation is negative           FP                        TN

For example, weather radar echo extrapolation requires a reflectivity threshold in dBZ to be set. TP is the number of points where both the observation and the prediction exceed the threshold; FN is the number of points where the observation exceeds the threshold but the prediction does not; FP is the number of points where the observation is below the threshold but the prediction exceeds it; and TN is the number of points where both are below the threshold.
The equations of POD, CSI and FAR are as follows:

POD = TP / (TP + FN)
CSI = TP / (TP + FN + FP)
FAR = FP / (TP + FP)
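These scores can be computed from a reflectivity threshold as follows; the code is a direct transcription of the contingency table and the three equations.

```python
import numpy as np

def contingency(obs, pred, thr):
    """Contingency counts at reflectivity threshold thr (dBZ)."""
    o, p = obs >= thr, pred >= thr
    tp = np.sum(o & p)      # hit
    fn = np.sum(o & ~p)     # miss
    fp = np.sum(~o & p)     # false alarm
    tn = np.sum(~o & ~p)    # correct negative
    return tp, fn, fp, tn

def pod_csi_far(obs, pred, thr):
    tp, fn, fp, _ = contingency(obs, pred, thr)
    pod = tp / (tp + fn)           # probability of detection
    csi = tp / (tp + fn + fp)      # critical success index
    far = fp / (tp + fp)           # false alarm ratio
    return pod, csi, far
```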

Spatial Test Method
The calculation process of CRA is to select the observation area to be evaluated and find the corresponding area on the predicted precipitation map; the error between these two areas is the error before displacement. The predicted precipitation map is then shifted, and the shift that minimizes the error between the predicted and observed maps is kept; the mean square error at that shift is the translation error. The union of the predicted area, the predicted area after the optimal shift and the observed area is called the CRA verification area.
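The shift search and the resulting error split can be sketched as follows. This toy version shifts the forecast with periodic wraparound (`np.roll`) over a small search window, whereas the real CRA procedure operates on bounded contiguous rain areas; the decomposition follows the standard CRA identity MSE_total = MSE_displacement + MSE_volume + MSE_pattern.

```python
import numpy as np

def cra_decompose(forecast, observed, max_shift=3):
    """Find the shift of the forecast that minimises the MSE against the
    observation, then split the total error into displacement, volume and
    pattern components. Wraparound shifting is a simplification."""
    mse = lambda a, b: float(np.mean((a - b) ** 2))
    total = mse(forecast, observed)
    best_mse, best_field = total, forecast
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(forecast, (dy, dx), axis=(0, 1))
            m = mse(shifted, observed)
            if m < best_mse:
                best_mse, best_field = m, shifted
    displacement = total - best_mse                            # removed by the shift
    volume = float((best_field.mean() - observed.mean()) ** 2) # mean-bias term
    pattern = best_mse - volume                                # residual shape error
    return {'total': total, 'displacement': displacement,
            'volume': volume, 'pattern': pattern}
```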
The error of the precipitation forecast can be divided into three parts: displacement error, volume (intensity) error and pattern (shape) error:

MSE_total = MSE_displacement + MSE_volume + MSE_pattern

The total error is MSE_total = (1/N) Σ (f_i − O_i)^2, where f_i and O_i represent the prediction and observation results in the CRA verification area and N represents the number of data points compared. The displacement error is the part removed by the optimal shift, MSE_displacement = MSE_total − MSE_shift, where MSE_shift is the translation error, i.e., the minimum mean square error after shifting the forecast. The volume error is the squared difference between the average prediction and the average observation after displacement, MSE_volume = (F̄ − Ō)^2 (19). Finally, the pattern error is obtained by subtracting the volume error from the translation error, MSE_pattern = MSE_shift − MSE_volume.

CTREC Algorithm
Figure 10 shows the extrapolation results of the CTREC algorithm on the test set at four moments. We used two different evaluation algorithms to evaluate the test set, and the evaluation results are presented in Tables 2-4. We selected 10, 20, 30 and 40 dBZ as echo thresholds because they are the boundary values distinguishing light rain (10-20 dBZ), moderate rain (20-30 dBZ), heavy rain (30-40 dBZ) and torrential rain (greater than 40 dBZ). As presented in Tables 2 and 3, the higher the POD and CSI and the lower the FAR, the better the extrapolation results. As shown in Table 4, the total MSE reflects the performance of the extrapolation from another perspective, with the percent displacement, percent pattern and percent volume judging the results in three aspects; the smaller the total MSE, the better the extrapolation results.

TITAN Algorithm
As shown in Figure 11, the ellipses in (a) and (b) are storm cells with a reflectivity factor greater than 30 dBZ. The time interval between the two images is 6 min, and the number of cells differs between the two echoes. The division, merger, extinction and generation of cells can be seen in the radar echo cells identified at each moment in Figures 11 and 12. To track and predict the cells accurately, appropriate judgment and restriction conditions need to be added. We therefore used the radar echo images of the first six moments with least squares fitting to obtain the development process of the different cells. A representative cell prediction process is selected for analysis, as shown in Figures 13 and 14. The prediction evaluation results are presented in Tables 5 and 6.
Figure 15 shows the extrapolation results of the deep learning algorithm on the test set at four moments. Two different evaluation algorithms were used to evaluate the test set, and the evaluation results are presented in Tables 7-9.

Conclusions and Prospects
As an important means of nowcasting, the extrapolation algorithm in the meteorological field, on which this paper focuses, is worthy of further improvement and perfection. Our proposed deep learning extrapolation method takes the reflectivity factor data in the Doppler weather radar base data as input. Before feeding them into the neural network, we performed several preprocessing operations so that the input data could meet the requirements of training. The data preprocessing is essential, and the effect of the training improves significantly after the data are preprocessed.
The deep learning algorithm achieved good results under the chosen thresholds and prediction time range. Multiple tables of data demonstrate the advantages of the method proposed in this paper.
For the matrix "point-to-point" verification, under thresholds of 10 dBZ or 20 dBZ, whether at 0.5 h or 1 h, the POD, CSI and FAR of the deep learning algorithm differ only slightly from those of the CTREC algorithm; the reason is that radar echo with low dBZ is easy to forecast. However, under higher thresholds such as 30 dBZ, the deep learning algorithm clearly outperforms both traditional algorithms. For the spatial verification, at 0.5 h and 1 h, the total MSE of the deep learning algorithm is 1.15 and 1.35, compared with 2.99 and 3.26 for the CTREC algorithm and 2.73 and 3.05 for the TITAN algorithm. Consequently, the stability of the deep learning algorithm is better than that of the CTREC and TITAN algorithms. The figures support the same conclusion: for the CTREC extrapolation results in Figure 10, panels (d), (f) and (h) show that at 30 min, 42 min and 60 min the shapes of the extrapolation results deviate considerably from the observations; for the TITAN extrapolation results in Figures 13 and 14, the results also change somewhat; whereas for the deep learning extrapolation results in Figure 15, panels (d), (f) and (h) show that at 30 min, 42 min and 60 min the shapes change only a little.
The experimental results confirm that, both statistically and morphologically, the method proposed in this paper is superior to the traditional radar echo extrapolation methods, the CTREC and TITAN algorithms.
Additionally, compared with the CTREC algorithm, the extrapolation results of the deep learning algorithm are continuous, with no discrete points, which matters for judging and measuring the precipitation area. Compared with the TITAN algorithm, the deep learning algorithm can not only extrapolate the low-intensity echo region but also achieves better accuracy in the high-intensity echo region. Furthermore, the deep learning algorithm responds to the disappearance and generation of echoes in time, allowing it to react quickly to severe convective weather. Through training, learning and feature extraction on massive data, the deep learning extrapolation algorithm forms a system that automatically captures the inherent laws of the data and predicts their development trend. This is of great help in bringing precipitation forecasting into operational use.