1. Introduction
As one of the most destructive natural calamities, drought occurs when rainfall amounts are below normal for a long period. Drought is characterized by high frequency, long duration, wide influence [1,2], and damaging effects on grain yields and water supplies, so modeling and forecasting rainfall amounts and drought is of great significance. Accurate precipitation predictions are required for the precise estimation of drought in an area [3]. More accurate and timely rainfall prediction can advance drought research and greatly improve future water management policies.
Due to the nonlinear, stochastic and highly complex nature of rainfall data, timely and exact rainfall forecasting has remained a challenging task, and more sophisticated techniques are needed. The autoregressive integrated moving average (ARIMA) model and neural networks (NNs) are widely used [1,4,5]. ARIMA has good prediction accuracy and flexibility for different types of time series data, such as those found in hydrology [6,7], finance [8], agriculture [9], and medicine [10]; however, ARIMA cannot adequately simulate the nonlinear structure of precipitation. Because linear methods cannot capture the nonlinear characteristics of rainfall processes, nonlinear time series methods should be considered when predicting rainfall [11,12]. NNs overcome this shortcoming and can model the complex, mostly nonlinear relationships in precipitation time series to achieve higher precision in precipitation predictions [13,14,15,16,17]. Compared with the ARIMA model, the neural network structure has the advantages of self-organization, self-learning and nonlinear approximation, but it also has the disadvantage of assuming that the inputs and outputs are independent. NNs are an efficient tool for modeling, function approximation and prediction in complex problems. Many scholars have found that the main advantage of neural networks is their good accuracy, especially when the relationships among variables are nonlinear, compared with other artificial intelligence (AI) models such as gene expression programming (GEP). The GEP model is more sensitive to the quality of observations than NN models, so its performance is usually inferior to that of NN models [18,19,20]. Due to the influences of the observation field environment, climate, instrument performance, installation mode, and human factors, precipitation observations often contain systematic and random errors. Therefore, NN models are a good choice for predicting precipitation. Compared with simple NN models, deep learning models can automatically learn complex temporal patterns through high-level abstraction and nonlinear transformation and can approximate complex functions [21]. Thus, deep learning models such as long short-term memory (LSTM) can address the nonlinear and periodic problems presented by rainfall forecasting [10,22]. However, the performance of LSTM depends on other factors, such as sample size and noise. In short, it is not advisable to rely on a single LSTM or a single ARIMA model to predict rainfall because each individual model may not perform well in all circumstances. Hybrid techniques, on the other hand, combine the strengths of several time series models to overcome the shortcomings of each model and improve prediction accuracy. In most cases, hybrid models achieve higher prediction accuracy than any single model [1,4,23,24]. For example, Shishegaran et al. [23] developed a hybrid model for predicting the air quality index by combining ARIMA and GEP. Mehdizadeh et al. [24] developed the hybrid models GEP-FARIMA, MARS-FARIMA, MLR-FARIMA, GEP-SETAR, MARS-SETAR and MLR-SETAR for modeling monthly streamflow; their results showed that the hybrid models offered more accurate results than the single models and MLR-FARIMA and MLR-SETAR models. Accordingly, this study presents a new hybrid wavelet-ARIMA-LSTM model that takes advantage of the unique strengths of wavelet transformation, ARIMA and LSTM for accurately predicting monthly precipitation.
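The decompose, forecast and recombine idea behind such hybrids can be illustrated with a minimal sketch. This section does not specify the wavelet family, the decomposition level, or which component each model handles, so everything below is an assumption: a one-level Haar split stands in for the wavelet transformation, and the two placeholder forecasters (`last_value`, `mean_value`) only mark where an ARIMA model and an LSTM might be plugged in. All function names are hypothetical.

```python
def haar_split(series):
    """One-level Haar split of an even-length series.

    Returns (smooth, detail), both the same length as the input, where
    smooth holds the pairwise means and detail holds the residuals, so
    that smooth[i] + detail[i] == series[i].
    """
    if len(series) % 2 != 0:
        raise ValueError("series length must be even for this sketch")
    smooth = []
    for i in range(0, len(series), 2):
        avg = (series[i] + series[i + 1]) / 2.0
        smooth.extend([avg, avg])
    detail = [x - s for x, s in zip(series, smooth)]
    return smooth, detail


def last_value(component):
    """Placeholder forecaster (stand-in for a fitted model): repeat the last value."""
    return component[-1]


def mean_value(component):
    """Placeholder forecaster (stand-in for a fitted model): use the component mean."""
    return sum(component) / len(component)


def hybrid_forecast(series, smooth_model, detail_model):
    """Forecast each component separately and add the two forecasts."""
    smooth, detail = haar_split(series)
    return smooth_model(smooth) + detail_model(detail)
```

Because the two components sum back to the original series, the one-step forecasts of the components can simply be added to obtain a forecast for the series itself.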
Drought is generally classified as meteorological drought, agricultural drought, hydrological drought or socioeconomic drought according to drought timescales and impacts [25,26,27]. Among these categories, meteorological drought usually precedes the other types of drought and is determined by the degree of precipitation deficit in an area over a period of time. This paper studies meteorological drought. Because classification criteria differ, drought events are assigned different severity levels. Many meteorological drought indices are used to describe the hydrometeorological characteristics of drought at different scales, including the China Z-Index (CZI) [28,29,30,31,32], which is used in this paper.
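As a rough illustration of how a CZI value is obtained from a precipitation series, the sketch below assumes the commonly cited formulation of the index, in which precipitation is taken to follow a Pearson type III distribution: Z = (6/Cs)(Cs·φ/2 + 1)^(1/3) − 6/Cs + Cs/6, where φ is the standardized precipitation and Cs is the skewness coefficient. The formula is not stated in this section, and the function name `czi` is hypothetical.

```python
import math


def czi(precip):
    """Return the China Z-Index for every value of a precipitation series.

    Assumed formulation (not given in this section):
        phi = (x - mean) / sigma
        Cs  = sum((x - mean)^3) / (n * sigma^3)
        Z   = 6/Cs * cbrt(Cs/2 * phi + 1) - 6/Cs + Cs/6
    """
    n = len(precip)
    mean = sum(precip) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in precip) / n)
    if sigma == 0:
        raise ValueError("series has zero variance")
    cs = sum((x - mean) ** 3 for x in precip) / (n * sigma ** 3)

    values = []
    for x in precip:
        phi = (x - mean) / sigma
        if abs(cs) < 1e-12:
            # In the limit Cs -> 0 the index reduces to the standardized value.
            values.append(phi)
            continue
        base = cs / 2.0 * phi + 1.0
        # Real cube root that also works for negative arguments.
        cube_root = math.copysign(abs(base) ** (1.0 / 3.0), base)
        values.append(6.0 / cs * cube_root - 6.0 / cs + cs / 6.0)
    return values
```

Lower values indicate drier conditions. The thresholds that map index values to drought severity classes are not given in this section and are therefore left out of the sketch.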
Drought is a complex phenomenon and one of the most unpredictable natural disasters [33,34,35]. The causes of drought are extremely complex and are related not only to natural factors such as meteorology but also to human activities; thus, collecting data on, and selecting, the potential influencing factors of drought is difficult. With inadequate sample sizes and poor information, accurate forecasting of drought events is a difficult task. The key problem in drought prediction is how to make accurate predictions for uncertain systems. The grey model GM (1, 1), which was introduced by Deng [36], focuses on resolving uncertain problems with small sample sizes [37]. GM (1, 1) can effectively reflect the exponential growth characteristics of system change trends [38] because its time response function is an exponential function. The discrete grey model DGM (1, 1) was proposed by Xie and Liu [39] to address the prediction errors caused by the shift from discrete to continuous form in the traditional grey model. DGM (1, 1) fits pure exponential sequences exactly and places no restrictions on the development coefficient, which broadens the application scope of the model. However, in practice, these advantages of DGM (1, 1) cannot be fully exploited because real data are generally inconsistent with exponential growth, so scholars usually choose GM (1, 1) instead of DGM (1, 1) when solving practical problems. Since they were first proposed, grey prediction models have been widely employed in various fields, such as electricity consumption prediction [40,41], air pollution forecasting [42,43,44] and energy forecasting [45,46,47]. Such practical applications show that grey prediction models have wide applicability, especially in situations with incomplete information and inaccurate data. Grey prediction models have successfully dealt with various problems, but only a few scholars have used them to study drought prediction, even though the prediction of drought events conforms to the characteristics of grey systems. Therefore, this paper uses GM (1, 1) and DGM (1, 1) to predict the occurrence of drought events. In addition, we choose GM (1, 1) and DGM (1, 1) because they are the most basic and widely used grey prediction models.
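The difference between the two grey models can be made concrete with a minimal sketch, assuming their standard textbook forms rather than the exact procedure used later in this paper. GM (1, 1) fits x0(k) + a·z1(k) = b on the accumulated series and predicts with the exponential time response function; DGM (1, 1) fits the discrete recursion x1(k+1) = β1·x1(k) + β2 directly. The function names `gm11` and `dgm11` and the helpers are hypothetical.

```python
import math


def _linfit(xs, ys):
    """Ordinary least squares for ys = slope * xs + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def _cumsum(x0):
    """First-order accumulated generating operation (1-AGO)."""
    total, out = 0.0, []
    for v in x0:
        total += v
        out.append(total)
    return out


def _difference(x1_hat):
    """Inverse accumulation: recover the original-scale series."""
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, len(x1_hat))]


def gm11(x0, horizon=0):
    """Fitted values plus `horizon` forecasts from GM (1, 1).

    Assumes a non-zero development coefficient a.
    """
    x1 = _cumsum(x0)
    # Background values: means of consecutive accumulated values.
    z1 = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    slope, b = _linfit(z1, x0[1:])  # x0(k) = -a * z1(k) + b
    a = -slope
    n = len(x0) + horizon
    # Exponential time response function of the whitening equation.
    x1_hat = [(x0[0] - b / a) * math.exp(-a * k) + b / a for k in range(n)]
    return _difference(x1_hat)


def dgm11(x0, horizon=0):
    """Fitted values plus `horizon` forecasts from DGM (1, 1)."""
    x1 = _cumsum(x0)
    beta1, beta2 = _linfit(x1[:-1], x1[1:])  # x1(k+1) = beta1 * x1(k) + beta2
    x1_hat = [x0[0]]
    for _ in range(1, len(x0) + horizon):
        x1_hat.append(beta1 * x1_hat[-1] + beta2)
    return _difference(x1_hat)
```

For a purely illustrative series, `gm11([3.0, 8.0, 14.0, 21.0, 27.0], horizon=1)[-1]` returns the one-step-ahead forecast. On a pure geometric sequence `dgm11` reproduces the data up to rounding, whereas `gm11` leaves a small residual, which matches the motivation given above for the discrete model.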
Generally, hybrid models have higher prediction accuracy than single models. In this paper, the wavelet-ARIMA-LSTM method is proposed for the first time. It combines the advantages of wavelet transformation, ARIMA and LSTM and can predict future precipitation more accurately. Many researchers consider drought to be the most complex and least understood disaster. Accurate forecasting of drought events is very difficult when samples are insufficient and information is poor. The grey system model places particular emphasis on handling the uncertainty caused by small samples. On this basis, the study uses GM (1, 1) and DGM (1, 1) to predict drought years for drought risk analyses and drought warnings. The major objectives of this study are: (1) to develop a hybrid wavelet-ARIMA-LSTM method to predict monthly precipitation for the period 1967–2017 in Northeast China; (2) to analyze drought characteristics in Northeast China based on the drought index CZI; and (3) to use GM (1, 1) and DGM (1, 1) to predict the occurrence of drought events and compare the predictive capabilities of the two methods. The research results are expected to provide decision support for rainfall prediction, which in turn will help in planning adaptive measures to reduce drought impacts and in disaster prevention.
4. Conclusions
This study employed the proposed W-AL, single ARIMA and single LSTM models to predict precipitation, using monthly data from 1967 to 2017 for three stations. In addition, drought was analyzed with the CZI drought index, using annual precipitation amounts for the same years. Finally, drought years, as classified by the CZI, were predicted with the grey prediction models GM (1, 1) and DGM (1, 1). The following main conclusions are drawn from this study.
At all ratios of training to test sets and for all climate types, the proposed W-AL model exhibited higher prediction accuracy than ARIMA and LSTM for monthly precipitation data. In addition, a comparison of the R² values obtained by the W-AL models at the three stations shows that the W-AL method fitted best in Linjiang, with its humid and semihumid climate, followed by Changchun, with its semiarid and semihumid climate, and Qian Gorlos, with its arid and semiarid climate.
The CZI results revealed that drought events have become more frequent since 1987 and that all stations experienced some level of drought in the 1980s, mid-2000s and early 2010s. The results also indicated that slight drought events were more numerous in Changchun and moderate drought events were more numerous in Linjiang and Qian Gorlos.
GM (1, 1) and DGM (1, 1) were used to predict drought years. According to the results, GM (1, 1) always showed higher accuracy than DGM (1, 1) across climate types, with average relative errors ranging from 2.22% to 6.66%. Therefore, GM (1, 1) was applied to predict drought years, and its predictions were close to the actual conditions. Additionally, the prediction of drought events was best in the relatively arid areas and worst in the relatively humid area.
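The average relative error used to compare the two grey models can be computed as in the following minimal sketch. It assumes the usual definition, the mean of |actual − predicted| / |actual| expressed as a percentage, which this section does not spell out; the function name is hypothetical.

```python
def mean_relative_error(actual, predicted):
    """Average relative error in percent (assumed definition, see lead-in)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

Under this definition, a lower value for one model than for another on the same series indicates the closer fit.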