Convolutional LSTM Architecture for Precipitation Nowcasting Using Satellite Data †

: The short-term prediction of precipitation is a difﬁcult spatio-temporal task due to the non-uniform characterization of meteorological structures over time. Currently, neural networks such as convolutional LSTM have shown ability for the spatio-temporal prediction of complex problems. In this research, we propose an LSTM convolutional neural network (CNN-LSTM) architecture for immediate prediction of various short-term precipitation events using satellite data. The CNN-LSTM is trained with NASA Global Precipitation Measurement (GPM) precipitation data sets, each at 30-min intervals. The trained neural network model is used to predict the sixteenth precipitation data of the corresponding ﬁfteen precipitation sequence and up to a time interval of 180 min. The results show that the increase in the number of layers, as well as in the amount of data in the training data set, improves the quality of the forecast.


Introduction
Precipitation nowcasting refers to the prediction of rainfall in a local region over a short period of time, generally up to six hours [1]. Short-term prediction of weather events is important for public safety from high-impact meteorological events such as flash floods, tropical cyclones, thunderstorms, lightning, high-speed wind, etc. which can affect large populations or areas of significant economic investment. Precipitation nowcasting is also useful for weather forecasts and guidance in aviation, marine safety, ground traffic control, and construction industries. Nowcasting is one of the most challenging problems in weather forecasting because of the non-uniform and flawed characterization of the meteorological structures over time. Traditional methods for forecasting based on numerical weather prediction (NWP) are not suitable for short-term predictions because they are highly computationally expensive, sensitive to noise, and they depend a lot on initial conditions of the event [2]. They cause a delay in short-term predictions because of data assimilation and simulation steps required in NWP models which make the forecast irrelevant by the time it is made.
Existing methods for precipitation nowcasting can roughly be categorized into two classes [3], namely, NWP based methods and radar echo extrapolation-based methods. For the NWP approach, making predictions at the nowcasting timescale requires a complex simulation of the physical equations in the atmosphere model. Thus, the current stateof-the-art operational precipitation nowcasting systems [4,5] often adopt the faster and more accurate extrapolation-based methods. Some computer vision techniques, especially optical flow-based methods, have proven useful for making accurate extrapolation of radar maps [4,6,7]. However, the success of these optical flow-based methods is limited because the flow estimation step and the radar echo extrapolation step are separated and it is challenging to determine the model parameters to give good prediction performance. These technical issues may be addressed by viewing the problem from the machine learning perspective. In essence, precipitation nowcasting is a spatiotemporal sequence forecasting problem with the sequence of past satellite images as input and the sequence of a fixed number of future satellite images as output. However, such learning problems, regardless of their exact applications, are nontrivial in the first place due to the high dimensionality of the spatiotemporal sequences especially when multi-step predictions have to be made, unless the spatiotemporal structure of the data is captured well by the prediction model. Moreover, building an effective prediction model for the radar echo data is even more challenging due to the chaotic nature of the atmosphere.
Recent advances in deep learning, especially recurrent neural network (RNN) and long short-term memory (LSTM) models [8][9][10][11][12][13][14][15][16], provide some useful insights on how to tackle this problem. According to the philosophy underlying the deep learning approach, if we have a reasonable end-to-end model and sufficient data for training it, we are close to solving the problem. In this paper, we propose a novel convolutional LSTM network for precipitation nowcasting. We formulate precipitation nowcasting as a spatiotemporal sequence forecasting problem that can be solved under the general sequence-to-sequence learning framework proposed in [15].

IMERG Dataset
IMERG is the unified algorithm that provides multi-satellite precipitation data. The data is obtained from passive microwave sensors of the precipitation measuring satellites comprising the global precipitation measurement (GPM) constellation [17]. The IMERG dataset is available in temporal resolutions of 30 min, 3 h, 1 day, 7 days, and 30 days. All IMERG datasets have a spatial resolution of 0.1. Since our goal is short-term forecasting of precipitation, we use the IMERG datasets with a temporal resolution of 30 min. The datasets with a temporal resolution of 30 min have been available since March 2014. IMERG datasets with a temporal resolution of 30 min are available in HDF5, GeoTIFF, NetCDF, ASCII, PNG, KMZ, OpenDAP, GrADS and THREDDS data formats. For our research, we use the HDF5 format IMERG dataset [18] for all subsequent analysis. We only use the 'precipitatonCal' field from the HDF5 dataset, which is multi-satellite precipitation data with gauge calibration and has a unit of mm/h.

Nowcasting Problem and Training Data
In a precipitation nowcasting problem using satellite data, the spatial region is represented by a M x N grid with Z measurement values varying over time. At any time (t), the observation is a tensor X where X ∈ R MxNxZ where R is the observed feature (precipitation). If the observation is recorded periodically, we get a sequence of observed features X <1> , X <2> , X <3> , . . . , X <t> . The nowcasting problem is then to predict the next sequence X<t+1> given the previous observations. In this research, we choose a square grid (M = N = 120) from the IMERG dataset as shown in Figure 1.
In our study, we would like to predict the sequence X <1> , X <2> , X <3> , . . . , X <15> from the previous fifteen observations at an interval of 30 min. For each input precipitation data, we use the subsequent precipitation data as the output precipitation in the training set. Therefore

Development of the Convolutional LSTM Network Architecture
We develop convolutional neural networks by stacking one, two and three LSTM layers for spatial and temporal feature learning which is followed by a 3D convolutional layer for the next 30 min of precipitation prediction, as shown in Figure 2. In the last layer of the architecture, we use ReLU as the activation layer. This is because precipitation nowcasting is a regression problem where the output of the convolutional neural network is a precipitation value. Since precipitation cannot take negative values, we choose ReLU to turn any negative activations into zeros (i.e., no rain).

Results and Discussion
Below is an example in the 29 April 2015 test dataset of a predicted storm from t + 30 min to t + 180 min illustrated below using the convolutional LSTM network with three layers (Figure 3). The model predicts that the precipitation values are initially good up to t + 180 min, although for the last intervals the precision decreases slightly. Interestingly, in all cases, the model preserves the direction and speed of the storm.  From the images, we find that the model slightly underestimates precipitation values above 20 mm/h as we rarely find predicted precipitation above 20mm/h. The reason for this is that the number of training samples is much smaller for higher precipitation values and so the neural network is more biased towards the prediction of lower precipitation values. This is also a general problem with the unbalanced dataset in deep learningbased techniques [19]. The neural network, however, estimates the speed and direction of storms accurately from past precipitation data, and the shape of the predicted precipitation corresponds to the observed precipitation. This is because the network has learned the spatial correlations between different timestamps of the previous sequences during end-toend training.
In Figures 4-7 we see how the Convolutional LSTM neural network with three layers exceeds the ones with one and two layersby having smaller RMSE and MAE. We also compared the prediction results of each model using the correlation at each time step in the prediction. Although the accuracy of the prediction decreases as the prediction time step progresses, the Convolutional LSTM network with more stacked layers continues to perform better at each time step.

Conclusions
In this article, we presented a new convolutional LSTM architecture for forecasting precipitation from space satellite data. We found that the LSTM model with three layers obtained the best results and predicts precipitation with good accuracy even for a lead time of 180 min. We conclude that convolutional LSTM is very suitable for capturing spatiotemporal relations in the satellite-based precipitation dataset for short-term forecasting. The model preserves the speed and directions of the precipitation in the forecasted results well. Satellite-based precipitation nowcasting is quite important, as radar data has limitations of not being available in all regions. A significant improvement in results could be expected by using a larger training set, using a convolutional LSTM neural network with four layers, by performing hyperparameter tuning, by pre-classifying the storm type with geographic information, and through the use of a weighted loss function.