Drought Forecasting Using Stochastic Models in a Hyper-arid Climate

Drought forecasting plays a crucial role in drought mitigation actions. Thus, this research deals with linear stochastic models (autoregressive integrated moving average (ARIMA)) as a suitable tool to forecast drought. Several ARIMA models are developed for drought forecasting using the Standardized Precipitation Evapotranspiration Index (SPEI) in a hyper-arid climate. The results reveal that all developed ARIMA models demonstrate the potential ability to forecast drought over different time scales. In these models, the p, d, q, P, D and Q values are quite similar for the same SPEI time scale. This is in correspondence with autoregressive (AR) and moving average (MA) parameter estimate values, which are also similar. Therefore, the ARIMA model (1, 1, 0) (2, 0, 1) could be considered as a general model for the Al Qassim region. Meanwhile, the ARIMA model (1, 0, 3) (0, 0, 0) at 3-SPEI and the ARIMA model (1, 1, 1) (2, 0, 1) at 24-SPEI could be generalized for the Hail region. The ARIMA models at the 24-SPEI time scale is the best forecasting models with high R 2 (more than 0.9) and lower values of RMSE and MAE, while they are the least forecasting at the 3-SPEI time scale. Accordingly, this study recommends that ARIMA models can be very useful tools for drought forecasting that can help water resource managers and planners to take precautions considering the severity of drought in advance.


Introduction
Water demand has significantly increased in many parts of the world, especially in hyper-arid regions that are facing a shortage of available water resources.To make matters worse, world-climatic change is leading to more droughts on the Earth's surface [1].Additionally, droughts have always been a normal, recurrent event in arid and semi-arid areas [2].Floods and droughts projected for the 21st century show large significant changes from those in the 20th century [3].Prominently, the impact of intense long drought on natural ecosystems essentially concerns regional agriculture, water resources and the environment [4].The lack of a precise assessment of drought in certain situations may lead to incorrect decisions and actions by managers and policy makers [5].Therefore, scientists categorized the drought phenomenon into four major groups: meteorological, agricultural, hydrologic and socio-economic [6,7].These drought categories are usually described using drought detection and monitoring indices, which are based on different natural variables in given time intervals, such as precipitation (PCPN), soil moisture, potential evapotranspiration (PET), vegetation condition, ground water and surface water [8].In other words, a drought index captures the physical characteristics and their associated impacts.
The most important drought characteristics that may have to be taken into consideration within the index are: the duration, intensity, magnitude and spatial extent of the drought [9].However, combining these characteristics in one index is extremely difficult.Thus, many researchers have proposed various indices; among these are: the Palmer Drought Severity Index (PDSI) [10], Surface Water Supply Index (SWSI) [11], Palfai Aridity Index (PAI) [12], Standardized Precipitation Index (SPI) [13] and Standardized Precipitation Evapotranspiration Index (SPEI) [14].In fact, the drought analysis via one of these indices depends on the nature of the index, local conditions, data availability and validity [15].
The pros and cons of the above-mentioned drought indices depend on the simplicity and temporal flexibility of their application.The World Meteorological Organization (WMO) is recommending the SPI as a standard drought index [16].This is because of the simplicity of its calculations, since precipitation data themselves are sufficient to conduct the test without requiring any statistical complications [17][18][19].In addition, the SPI could show a high performance in detecting and measuring drought intensity [20].Despite this convenience, the SPI has still some problems in terms of water balance.Therefore, the SPEI was developed to avoid these problems in the SPI version.The SPEI takes into consideration the sensitivity of PDSI to the changes in PET and the multi-scalar nature of the SPI [21].The calculation procedure of SPEI is similar to that of SPI [14].Hence, the difference between the two indices is that the SPEI uses the climatic water balance (the difference between PCPN and PET).The SPEI, therefore, has been used in recent studies as an appropriate index to monitor the drought in many locations of the world [21][22][23][24].
Nowadays, the major challenge in drought research is to develop suitable methods and techniques to predict the onset and termination points of droughts [25].Hence, the importance of drought forecasting comes from the future mitigation of drought impacts [26].Several attempts have been made to apply statistical models in hydrologic drought forecasting based on time series methods, such as autoregressive integrated moving average (ARIMA) models, exponential smoothing and neural networks.The ARIMA is a typical statistical analysis model that uses time series data to predict future trends [27].It can be used in predicting the hydrologic variables, such as: annual and monthly stream flows and precipitation [28].The ARIMA model approach can outperform most other statistical models, like exponential smoothing and neural network, in hydrologic time series [29].This relative advantage of the ARIMA model is due to its statistical properties, as well as the well-known methodology in building the model.
Much research has focused on drought forecasting in recent years.Mishra and Singh developed a new hybrid wavelet-Bayesian regression model for simulating the hydrological drought time series [30].Özger et al. created a combination model based on wavelet and fuzzy logic for long lead-time drought forecasting [31].Özger et al. investigated the use of a wavelet fuzzy logic model to estimate the PDSI based on meteorological variables [32].Mishra et al. developed a drought forecasting model using hybrid, individual stochastic and artificial neural network (ANN) models for drought forecasting based on SPI [33].These attempts were useful and satisfactory in forecasting time series droughts.
It should be emphasized that, in the previous mentioned published articles, the SPEI was not used as the drought index.Therefore, the main objective of the present study is to develop the stochastic models (ARIMA) to fit and forecast the SPEI series at different time scales, in addition to providing a descriptive analysis of drought-relevant climate parameters during 61 years (1950-2011).

Study Area
This study was conducted in the Al-Qassim and Hail regions in the central part of the kingdom of Saudi Arabia (KSA).The regions have a total area of 157,000 km 2 (Figure 1a).The predominant terrain according to the digital elevation model (DEM) is plain lands with some peaks higher than 2000 m (Figure 1b).Five cities across both regions are selected to perform the study: Buraydah, Ar Rass and Ash Shinan from Al-Qassim and Hail and Al Ghazalah from Hail.Their altitudes vary from 616 to 1083 m above sea level.

Data Sources and Preparation
The available monthly records of precipitation, maximum and minimum air temperature of the selected locations were obtained from two data sources: the first is the Global Precipitation Climatology Center [34], while the second is the Presidency of Meteorology and Environment (PME) of the KSA.Data from both sources were compared to each other in order to complete the missing data and establish data series for the period 1950-2011.The other missing climatic data were estimated from the neighboring weather stations using linear regression methods [35].The quality of established data series was investigated through non-parametric tests.
A long-term dataset for SPEI series of 3-, 6-, 12-and 24-month time scales for each selected location (Figure 1b) were created from the Global SPEI database (SPEIbase V2.0) [36].This database offers long-time, robust information about drought conditions at the global scale, with a 0.5-degree spatial resolution and a monthly time resolution.It has also a multi-scale character that provides time-scales for the SPEI between 1 and 48 months.This dataset is made available under the open database license [37].According to drought classification based on SPEI (Table 1), drought events occur at a time when the value of SPEI is negative, while they end when the SPEI value turns positive.The calculation of the SPEI in V2.0 is based on the original SPI (the calculation procedure is described in detail by Li et al., [38]) and the climatic water balance (PCPN-PET) that uses the FAO-56 Penman-Monteith equation [39].The PET equation can be expressed as follows:

Autoregressive Integrated Moving Average (ARIMA) Modelling Approach
Stochastic or time series models (ARIMA) are an approach of time series forecasting.The models provide a systematic-empirical method for forecasting and analyzing the hydrologic time series data.Therefore, the Box-Jenkins approach for ARIMA model development allows an appropriate remedy for non-stationarity in the historical time series data.This property of the ARIMA model is due to the combination of the autoregressive (AR) and moving average (MA) parts.The AR and MA create the multiplicative ARIMA (p, d, q) (P, D, Q)S model, where (p, d, q) is the non-seasonal part and (P, D, Q)S is the seasonal part of the model.The non-seasonal part of the ARIMA model (AR) can be expressed as: where ∅ () and θ () are polynomials for p and q order, respectively.Meanwhile, the seasonal part of the ARIMA model that defines the multiplicative seasonal of model can be written as: where p, d and q are non-negative integers that refer to the order of the autoregressive, integrated and moving average parts of the model, respectively; and P, D, Q and S are the order of seasonal auto-regression, the number of seasonal differencing, the order of seasonal moving average and the length of the season, respectively.∅  , Φ  , θ  , and Θ  are coefficients of the polynomials.

Autoregressive Integrated Moving Average (ARIMA) Model Development
Basically, the development and selection of the appropriate ARIMA model are achieved by an iterative procedure that consists of three steps [27,40,41]; these steps are as follows.

Model Identification
This step consists of identifying the possible ARIMA model that represents the behavior of the time series.The series behavior was investigated by the autocorrelation function (ACF) and partial autocorrelation function (PACF).The ACF and PACF were used to assist in determining the order of the model.The information provided by ACF and PACF is useful to suggest the type of models that can be built.The final model was then selected using the penalty function statistics through the Akaike information criterion (AIC) and Schwarz-Bayesian criterion (SBC).These criteria help to rank models (models having the lowest value of criterion being the best).The AIC and SBC take the mathematical form as shown below: where k is number of parameters in the model, (p + q + P + Q); L is the likelihood function of the ARIMA model; and n is the number of observations.

Parameter Estimation
After identifying the appropriate model as an essential step, the estimation of model parameters was achieved.The model estimate values for the AR and MA parts were calculated using the procedure suggested by Box and Jenkins [40].The AR and MA parameters were tested to make sure that they are statistically significant or not.The associated parameters, such as standard error of estimates and their related t-values, are also calculated.

Diagnostic Checking
Diagnosing the ARIMA model is a crucial part and the last step of the model development.It refers to the model adequacy that checks the model assumptions certainly.This model adequacy assures that the time series satisfies the model assumptions and that the forecast is reliable.Several diagnostic statistics and plots of residuals are investigated to see if the residuals are correlated white noise or not.The conducted residuals analysis of each SPEI fitted model is summarized as follows: ACF and PACF of residuals, histograms of residuals and residual distribution around the mean.The performance of the predictions resulting from the ARIMA (p, d, q) (P, D, Q)S models was evaluated by the following measures for goodness-of-fit: where N is the number of forecasting events, Xm the observed SPEI and Xs the predicted SPEI.
In this study, all of the SPEI forecasting ARMIA models have been developed using the forecasting feature incorporated in the IBM SPSS 20 software [42].Since the seasonal models require a periodicity, the seasonal cycle was set as an integer equal to 12.

Climate Descriptive Analysis
A descriptive analysis of climatic parameters was conducted as an attempt to understand the behavior of drought over the study area better.Table 2 represents annual means of the most drought-relevant parameters.Based on statistical reasoning, there are no significant differences in mean annual temperatures between the studied meteorological stations; where the highest temperature value was recorded in Ar Rass City (26.44 °C• year −1 ).Meanwhile, the lowest value was recorded in Hail City (21.83 °C• year −1 ).Concurrently, the monthly maximum mean temperature occurred in June, July and August; whereas the lowest monthly mean temperature was recorded in December, January and February.Generally, Al-Qassim is warmer than Hail; this slight rise in monthly mean temperature may be due to the variation in altitudes in both regions.
Along with the temperature, the monthly mean precipitation is varied over the year.Generally, very low precipitation rates are in most months of the year, except June, July, August and September having without no precipitation.Both regions presently have a high difference between the monthly mean precipitation and PET (water deficit), as presented in Table 2.The water deficit for Buraydah, Ar Rass, Ash Shinan, Hail and Al Ghazalah were 165.37, 169.12, 146.46, 131.74 and 162.4 mm year −1 , respectively.This deficit in water supply needs to be compensated by other water resources and/or following good water management and rationalization methods.Similar trends of PET appear in many places in the world, such as China [43].
Overall, the analysis of drought-relevant climatic parameters for both regions is proving notable decreasing levels in precipitation.This precipitation associates with a significant increase in temperature.Therefore, this situation will drive an acute increment in the frequency and magnitude of drought episodes [44,45].In fact, the primary reason for concern about turning to drought conditions is that they have a direct negative impact on the actual water resources.T m = mean annual temperature; P m = mean annual precipitation; PET = potential evapotranspiration.

Drought Frequency Variations
The SPEI time series plots at different time scales covering the period 1950-2011 for Al-Qassim and Hail regions are depicted in Figure 2.These figures show that the whole study area is going toward more droughts.For a deeper investigation of drought variation frequency, the time series of the study period is divided into two main sub-periods (i.e., 1950-1979 and 1981-2011) for both regions.The drought frequencies are calculated and then plotted in Figure 3. Generally, the frequency curve of the first sub-period was almost above the curve of the second period; therefore, an increasing drought frequency was occurring during both sub-periods in all areas.In the first sub-period (1950-1979), a moderately dry condition was relatively low in comparison with the second sub-period (1981-2011).The drought frequency was 0.58%, 1.39%, 2.21%, 2.67 and 0.80% in Buraydah, Ar Rass, Ash Shinan, Hail and Ghazalah, respectively.Meanwhile, there is a significant increase in the drought frequencies (moderately dry condition) during the second sub-period.These frequencies are 19.90%,17.52%, 20.38%, 15.73% and 17.21% in Buraydah, Ar Rass, Ash Shinan, Hail and Ghazalah, respectively.Based on this analysis, the drought episodes become more frequent in all study areas.
Figure 4 depicts the SPEI at different time scales for Buraydah and Hail cities during the study period.As is evident from this figure, the drought episodes are fluctuating in both cities.In the first five decades, the overall drought trend is approaching the natural limits of near normal and moderately wet scales with intermittent anomalous values in some years.These anomalous values are attributed to rainfall scarcity; whereas, in the last decade, the onset of drought conditions with severely and extremely dry began to emerge.Both cities represent a similar time evolution, with minimal distinguished differences between them.This circumstance is a result of climate change impact [8] and appears in other places in the world, e.g., Egypt [22], Turkey [46], Portugal [47] and China [24].

Autoregressive Integrated Moving Average (ARIMA) Model Development
Although it is difficult to forecast drought, relatively reliable drought forecasting remains a vital tool for water resource managers.Thus, in this study, a number of ARIMA models were developed using the SPEI data series at different time scales.The datasets from 1950 to 1989 were used for model development at 3-, 6-, 12-and 24-SPEI time scale.The ARIMA model development is described briefly for Al Ghazalah City as an instance.

Model Identification
Looking at the sample ACF and PACF plots of the time series in Figure 5, there are non-seasonal and seasonal pattern in the time series.The ACF is damping out in a sine-wave behavior with some significant spikes along the time series.There is a significant spike at Lag 1 in the PACF, which refers to AR (1) as the non-seasonal part of the ARIMA model.The final model selection was based on the lowest values of the penalty function statistics (AIC and SBC).The identified best models for all cities at different SPEI time scales according to minimum AIC and SBC are shown in Table 3.

Table 3. Autoregressive integrated moving average (ARIMA) models of Standardized
Precipitation Evapotranspiration Index for various study areas with forecasting fit measures and penalty function statistics for each estimated model of observed and predicted data.4 presents model parameters at different lags, standard error, t-value and the associated significance value at a significance level < 0.05 for Al Ghazalah city.It can be observed that the standard error of most model parameters is very small compared to the parameter values.This is with the exception of the model parameter values of the SPEI at three-month time scale.Additionally, all of the ARIMA model parameters are significant at significance level < 0.05.Thus, these parameters should be combined in the model.The same conclusion was found in the other models.

Diagnostic Checking of Residuals
After the estimation of model parameters, diagnostic checking was performed to verify the adequacy of the model.The ACF and PACF of the residuals at different time scales are shown in Figures 6 and 7.All of the ACF and PACF values are within the confidence of 0.01 bands for all lags.Therefore, there is no significant correlation between residuals.Figure 8 depicts the histograms of residuals for the SPEI at different time scales, which are normally distributed.This indicates that the fitted model is adequate for the SPEI time series data and residuals to white noise.Figure 9 shows the scatterplots of the residuals against predicted values of SPEI at different time scales.In this case, the scatterplots show no obvious patterns, and all of the residuals are distributed randomly around zero.Consequently, forecasts of these models are good and adequate.
Table 3 presents the highest accuracy forecasting models related to each considered SPEI time scale with accuracy fit measures (R 2 , RMSE, and MAE).Generally, drought forecasting with ARIMA models in Al-Qassim and Hail has shown considerable results, with R 2 ranging between 0.656 and 0.974.However, in most cases, the R 2 of all ARIMA models increases to greater than 0.8, which matches with the lower values of RMSE and MAE.This is with the exception of ARIMA models at the three-month time scales, which demonstrate the lowest R 2 values.Therefore, it can be summarized that the ARIMA models with longer time scales are manifesting a forecasting ability and fitting accurately with the drought prediction in the future.Very similar results appear around the world and apply to the SPI, which is the basis of the SPEI, i.e., China [48] and India [29].
It is noted that the SPEI time series for all cities has almost the same trend for similar time series (Figure 2).Additionally, the signals of the 12-SPEI and 24-SPEI time series are similar; similarly for the signals of the 3-SPEI and 6-SPE time series.Therefore, the order d equal to one appears for the time series based on 12 and 24 months, while this do not appear in the corresponding 3-and 6-month scales.

Model Forecasting
Obviously, forecasting plays an important role in decision making.It can help to foresee the future uncertainty based on the behavior of the past and current observations.The forecasting through the ARIMA models provides a good basis for hydrologic phenomena.For drought forecasting, one city from each region has been selected and then used to forecast drought.The testing data series of the SPEI at different time scales is from 1990 to 2011.The predicted and observed SPEI were plotted for evaluating the extent of agreement between both data series; Buraydah and Hail cities represent Al-Qassim and Hail regions, respectively.The comparison of both observed and predicted data (Figures 10 and 11) illustrates the high accuracy of forecasted values.Undoubtedly, by increasing the number of SPEI time series, the ability of the model to forecast the future will be enhanced.The reason for the enhancement is that the increasing number of SPEI time series refines the SPEI's output values, which decreases abrupt shifts of the SPEI curve.
By comparing the AR and MA coefficients, it is found that the ARIMA models of the 24-month time scale for Buraydah and Ar Rass are quite similar.Meanwhile, the ARIMA models of three-month time scale are similar in Ash Shinan, Hail and Al Ghazalah.The same remarks were realized with ARIMA models of the 24-month time scale.Table 5 presents the similar parameter estimates between the developed ARIMA models for both regions.From the results in Table 3, the values of p, d, q, P, D and Q obtained from the models fitted for the five cities are quite similar at the same time scales.Therefore, the ARIMA model (1, 1, 0) (2, 0, 1) at 24-SPEI could be generalized for the entire Al-Qassim region.Additionally, the ARIMA model (1, 0, 3) (0, 0, 0) at 3-SPEI and the ARIMA model (1, 1, 1) (2, 0, 1) at 24-SPEI could be applied to the whole Hail region.This is because they are located near each other.

Conclusions
In this work, the SPEI is presented as a powerful multi-scalar drought index to investigate drought event variations in Al-Qassim and Hail regions.The objective of this study has been divided into two parts.The first part of the objective is the assessment of climatic parameters and drought frequency based on SPEI.According to this assessment, both regions are suffering from the presence of water deficit condition up to 169.12 mm• year −1 in Ar Rass City as a result of high temperatures that are associated with low precipitation rates.Therefore, the risk of drought occurrence increases due to the water deficit, which eventually will create tremendous threats to water resources.At the same time, the SPEI variation is going towards an abnormal trend of drought (severely and extremely dry) since the last decade.Anyway, these results should sound alarm bells in hyper-arid regions.
It is worth noting that one of the most problematic issues facing hydrologists is forecasting drought events.Hence, the second part of the objective was developing and testing the ARMIA models for drought forecasting using SPEI with 3-, 6-, 12-, 24-month time scales.Based on the AIC and SBC values, the best ARIMA models were identified.The reliability of forecasted future values is a core point for forecasters.This is because the drought mitigation policies will be implemented based on the forecasted values.Thus, after the estimation of the parameters of selected models, a series of diagnostic checking tests were performed.Among the possible ARIMA models, the ARIMA model (1, 1, 0) (2, 0, 1) at 24-SPEI could be chosen as a general model for the Al-Qassim region.Meanwhile, the ARIMA model (1, 0, 3) (0, 0, 0) at 3-SPEI and the ARIMA model (1, 1, 1) (2, 0, 1) at 24-SPEI could be generalized for the Hail region.These results may be due to the cities being located near each other.It is also noted that the results obtained from the ARIMA models for the studied cities showed that the ARIMA model at the 24-SPEI time scale is the best forecasting model, which matches the high coefficient of determination R 2 (more than 0.9) and lower values of RMSE and MAE.Meanwhile, the ARIMA model at the 3-SPEI time scale is the worst model for drought prediction for all cities.Since these ARIMA models show their powerful ability in drought forecasting, they can play a very important role as useful tools for water resource managers and planners to take precautions considering drought for other hyper-arid regions in advance.
Obviously, the linkage between the climate change represented in droughts and available water resources is necessary.Thus, in hyper-arid regions, confronting the actual/future drought conditions with withdrawal/recharge periods of groundwater is one of the most important investigations and should be considered as a priority in future research.

Figure 1 .
Figure 1.(a) Map of the administrative regions of the Kingdom of Saudi Arabia (KSA) with the indication of the study location (dotted blue area).(b) Digital elevation map for Al-Qassim and Hail with the study's selected weather stations that are used in this paper.

Figure 2 .
Figure 2. Standardized Precipitation Evapotranspiration Index (SPEI) series at different time scales for Al-Qassim and Hail regions during 1950-2011.

Figure 3 .
Figure 3. Drought cumulative percent of the Standardized Precipitation Evapotranspiration Index (SPEI) series over a 24-month time scale for different locations within the central area of Saudi Arabia.

Figure 4 .
Figure 4. Temporal evolution for the data series of Standardized Precipitation Evapotranspiration Index (SPEI) at time scales from one to 24 months for the period 1950-2010 in (a) Buraydah and (b) Hail cities.

Figure 5 .
Figure 5. Autocorrelation function (ACF) and partial autocorrelation function (PACF) based on a periodicity of 12 months for model selection at different Standardized Precipitation Evapotranspiration Index (SPEI) time series.

Figure 10 .
Figure 10.Observed and predicted standardized precipitation evapotranspiration index (SPEI) series over different time scales based on mean precipitation during 1990 to 2011 for Buraydah City.

Figure 11 .
Figure 11.Observed and predicted standardized precipitation evapotranspiration index (SPEI) series over different time scales based on mean precipitation during 1990 to 2011 for Hail City.

Table 1 .
Classes of dryness/wetness grade according to the Standardized Precipitation Evapotranspiration (SPEI) values.

Table 2 .
Coordinates and altitude of the weather stations used in this study with drought-relevant climatic parameters.

Table 4 .
Statistical analysis of autoregressive integrated moving average (ARIMA) model parameters, including the autoregressive (AR) and moving average (MA) orders with associated significance values at significance level < 0.05 for Al Ghazalah City.

Table 5 .
Similar parameter estimate values at the same Standardized Precipitation Evapotranspiration Index (SPEI) time scale including the autoregressive (AR) and moving average (MA) orders with associated significance values at significance level < 0.05 for both regions.