1. Introduction
Rainfall as a natural phenomenon plays an important role in driving the hydrological cycle. Precise information on the amount and distribution of rainfall is indispensable in many hydrological applications, e.g., climate change assessment, drought monitoring, flood forecasting and extreme weather prediction [1,2,3].
Rain gauges and satellite rainfall products are two of the most widely used sources of data for rainfall measurements [4]. Although individual rain gauges provide rainfall values at relatively high accuracy, their often sparse regional coverage limits the spatial resolution of rainfall measurements required for the kind of hydrological studies mentioned above. Difficulties in estimating rainfall have been addressed in many studies [5,6], especially in developing countries where ground-based rainfall networks may be sparse or even non-existent [7]. In fact, areal rainfall data from even a dense rain gauge network may be highly uncertain [8,9], as the spatial distribution of rainfall is usually obtained by some kind of geostatistical interpolation of point rainfall data (e.g., [10,11,12,13,14]).
An alternative approach to rainfall estimation is offered by satellite rainfall products [15,16,17]. Recent satellite-based rainfall products can provide accurate rainfall data sets at high spatial and temporal resolutions for a wide range of hydrological applications [18,19]. Hughes [20] presented a preliminary analysis of the potential for using satellite rainfall estimates through a comparison with available point gauge data for four poorly gauged river basins in South Africa, Zambia, and Angola.
A large number of satellite rainfall products with steadily increasing spatial and temporal resolution have become available since the 1990s, e.g., Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) [21,22], the Tropical Rainfall Measuring Mission (TRMM), and the Passive Microwave InfraRed technique (PMIR) [23]. Su et al. [24] first assessed the performance of four recent and widely used satellite-based precipitation datasets over the upper Yellow River basin in China during the 2001–2012 time period for the simulation of streamflow for two flood events. The four datasets were the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks–Climate Data Record (PERSIANN–CDR), the version 7 (V7) Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) product (3B42), and two products from CMORPH (the Climate Prediction Center Morphing technique): the bias-corrected product (CMORPH–CRT) and the satellite-gauge blended product (CMORPH–BLD). Whereas the 2005 flood event was well predicted with all four satellite-based precipitation data sets, they performed poorly for the 2012 flood event, as the latter was induced by more torrential rainfall with larger estimation errors.
Another way to estimate rainfall time series is to build a prediction model with satellite surface soil moisture products. A novel approach named SM2RAIN, proposed by [25], employs soil moisture observations to infer the rainfall. This technique is based on the inversion of the water balance equation and has already been successfully applied to in situ [25] and to satellite soil moisture data [26,27,28,29] in different regions. Ciabatta et al. [30] employed rainfall estimates obtained through SM2RAIN in hydrological modeling to predict the river discharge over four catchments in Italy during the 4-year period 2010–2013. Massari et al. [31] used SM2RAIN-corrected daily rain gauge data in flood modeling in a small watershed in southern France and showed the superiority of this correction approach over the use of rain gauge data alone.
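The inversion idea can be illustrated with a minimal sketch. It assumes the simplified SM2RAIN form often quoted in the literature, p(t) ≈ Z* ds/dt + a s(t)^b, in which evapotranspiration and surface runoff during rainfall are neglected; this may differ from the exact formulation used in this study. The function name `sm2rain_sketch` and the default parameter values are hypothetical placeholders, not calibrated values.

```python
import numpy as np

def sm2rain_sketch(sm, dt=1.0, z_star=80.0, a=10.0, b=3.0):
    """Estimate rainfall per time step from relative soil moisture in [0, 1].

    Assumed simplified form: p(t) = Z* * ds/dt + a * s(t)**b
    z_star: soil water capacity [mm], a: drainage coefficient [mm/step],
    b: drainage exponent [-]. All default values are illustrative only.
    """
    s = np.asarray(sm, dtype=float)
    ds_dt = np.diff(s) / dt               # change in relative saturation
    s_mid = 0.5 * (s[1:] + s[:-1])        # saturation at the interval midpoint
    p = z_star * ds_dt + a * s_mid ** b   # storage change plus drainage
    return np.clip(p, 0.0, None)          # negative estimates are set to zero

# Hypothetical relative soil moisture series (not data from this study)
sm = [0.2, 0.2, 0.5, 0.4]
rain = sm2rain_sketch(sm)
```

Note that when the soil stays saturated (s constant at 1), the storage term vanishes and the estimate reduces to the drainage term, whatever the actual rainfall amount.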
As calibration and validation of the SM2RAIN model for estimating water balance components and rainfall constitute a time-consuming iterative process, other non-parametric approaches such as artificial neural networks (ANNs) have been proposed and applied to the prediction of complex physical systems, such as rainfall, in many parts of the world (e.g., [32,33,34,35,36]). However, in most of these studies the ANN has been used in the form of a classical input–output multilayer perceptron model between various climate components as input and rainfall as output, with only a few taking into account likely (auto-)lagged relationships in the climate variables and/or the rainfall [37].
This deficiency of classical ANN in describing time-lagged input-output correlations is partly remedied by the NARX (nonlinear auto-regressive with exogenous inputs) neural network model introduced by [38] as a new representation for a wide range of discrete and nonlinear systems. NARX is a dynamic neural network that uses time delays as well as feedback (memory) connections between both output and input layers to come up with more reliable ANN-prediction models [39,40].
Wunsch et al. [41] applied NARX successfully to obtain groundwater-level forecasts for several wells in three different types of aquifers, namely porous, fractured and karst aquifers in south-west Germany, using precipitation and temperature as input parameters.
In this paper we describe a new application of the NARX neural network to better predict continuous rainfall series across the Karkheh river basin (KRB), Iran, which has been the focus of several studies by the authors in recent years (e.g., [42]). To this end, changes of relative AMSR–E satellite soil moisture and measured temperature data are used as input data in NARX to estimate the rainfall. These estimates are then compared with the ground-based observations, with the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks–Climate Data Record (PERSIANN–CDR) product, and with the estimates obtained by [29] using the SM2RAIN approach.
3. Results and Discussion
As mentioned in the introduction, two approaches to daily rainfall estimation in the KRB are used and compared with each other: (1) the SM2RAIN algorithm, which incorporates soil moisture observations from AMSR–E, and (2) the NARX neural network algorithm, which also employs AMSR–E soil moisture observations, together with ground-based mean air temperature, as input and gauge-measured rainfall as output.
3.1. SM2RAIN Rainfall Estimation Using AMSR–E Soil Moisture Data
The SM2RAIN model is calibrated using AMSR–E soil moisture and gauge rainfall data for the period 1 January 2003 to 31 December 2005 and validated with data for the remaining 9 months from 1 January 2006 to 30 September 2006. The calibration is performed as an iterative process in which the free parameters of the SM2RAIN algorithm (Equations (1)–(4)) are adjusted within their allowable ranges until the estimated rainfall values are in line with the measured ones, using the coefficient of determination (R²) and the root-mean-square error (RMSE) as quantitative statistical measures.
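The kind of iterative adjustment described above can be sketched as an exhaustive search over discretized parameter ranges that keeps the parameter set with the lowest RMSE. This is only one possible realization and is not necessarily the procedure used here; the helper `grid_calibrate`, the toy model and all parameter ranges are hypothetical.

```python
import itertools
import numpy as np

def grid_calibrate(model, param_ranges, inputs, obs):
    """Return (best_params, best_rmse) over all combinations in param_ranges.

    model: callable model(inputs, **params) returning simulated values
    param_ranges: dict mapping parameter name to a list of candidate values
    """
    obs = np.asarray(obs, dtype=float)
    names = list(param_ranges)
    best_params, best_rmse = None, np.inf
    for values in itertools.product(*(param_ranges[n] for n in names)):
        params = dict(zip(names, values))
        sim = np.asarray(model(inputs, **params), dtype=float)
        score = float(np.sqrt(np.mean((sim - obs) ** 2)))  # RMSE of this trial
        if score < best_rmse:
            best_params, best_rmse = params, score
    return best_params, best_rmse

# Toy stand-in model and synthetic "observations" (illustrative only)
def toy_model(s, a, b):
    return a * np.asarray(s, dtype=float) ** b

s = [0.1, 0.5, 0.9]
obs = toy_model(s, a=10.0, b=3.0)
best, err = grid_calibrate(toy_model, {"a": [5.0, 10.0, 15.0], "b": [1.0, 2.0, 3.0]}, s, obs)
```

An exhaustive grid is simple but scales poorly with the number of parameters; a gradient-free optimizer would be a natural substitute for larger parameter spaces.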
Time series of the daily SM2R-AMSRE-estimated and observed rainfall for the different stations are shown in the upper panels of Figure 3, with the AMSR–E soil moisture time series depicted in the corresponding lower panels.
Table 2 lists the R² and root-mean-square error (RMSE) values of the SM2RAIN model fits obtained for the 5 KRB climate stations for both the calibration and validation periods. The table shows that the observed rainfall data are reproduced with reasonable accuracy. In the validation period, the R²-values range from 0.33 for Khorramabad to 0.65 for Ilam. The RMSE is lowest for Ahvaz station, in accordance with the better R²-value there. Lower performance is obtained for Ilam and Khorramabad stations, most likely due to the presence of more noise in the associated satellite soil moisture data (see Figure 3).
In addition, the surface conditions and topography at the climate stations affect the outcome. Thus, the high-altitude stations Ilam and Khorramabad in the mountainous regions, with more snow cover and frozen soils, do worse than the low-altitude station Ahvaz.
Comparisons between the gauge-measured and SM2R-AMSRE-estimated rainfall show that SM2RAIN underestimates the total rainfall amount at all sites. The major reason for this is that, once saturation is reached, the soil moisture remains constant whatever the rainfall amount [25,47,62,63].
This issue could also explain why the RMSEs are higher at the high-altitude stations, which experience more large rainfall events. The optimized values of the four parameters (after calibration/validation) that control the water cycle in the SM2RAIN method (see Equations (1)–(3)) are listed in Table 3.
As in [25,47], the values of these parameters are consistent with their physically expected values.
3.2. Rainfall Estimation Using the Nonlinear Autoregressive Network with Exogenous Inputs (NARX) Neural Network
With AMSR–E satellite soil moisture and ground-measured temperature as input and gauge-measured rainfall as output (open-loop, see Figure 2), the new NARX neural forecasting model was trained iteratively for the period January 2003 to September 2005 by adjusting the number of hidden neurons and delays for each KRB station until a minimal RMSE was obtained. The subsequent testing was performed in a closed-loop setup with data for the period September 2005 to September 2006. As mentioned, the application of the NARX model requires the tuning of the parameters of the neural network. Table 4 presents the optimal number of hidden neurons and delays d (Equation (5)) found after lengthy trial-and-error runs aimed at the lowest RMSE.
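The difference between open-loop training and closed-loop testing can be illustrated with a small sketch. For brevity, a linear least-squares map stands in for the hidden layer of the neural network, so this is a linear ARX surrogate and not the NARX network used in this study; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def lagged_design(y, x, d):
    """Regressor rows [y(t-1..t-d), x(t-1..t-d), 1] and targets y(t), t = d..n-1."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    rows = [np.concatenate([y[t - d:t][::-1], x[t - d:t][::-1].ravel(), [1.0]])
            for t in range(d, len(y))]
    return np.array(rows), y[d:]

def fit_open_loop(y, x, d):
    """Open loop: the delayed outputs fed to the model are the observed ones."""
    X, target = lagged_design(y, x, d)
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return w

def predict_closed_loop(w, y_init, x, d):
    """Closed loop: the model's own predictions are fed back as delayed outputs.

    y_init holds the d observed values preceding the forecast; x covers the
    same d steps plus the forecast horizon.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = list(np.asarray(y_init, dtype=float)[-d:])
    for t in range(d, len(x)):
        feats = np.concatenate([np.array(y[t - d:t])[::-1],
                                x[t - d:t][::-1].ravel(), [1.0]])
        y.append(float(feats @ w))
    return np.array(y[d:])

# Synthetic series (illustrative only): y(t) = 0.5 y(t-1) + 0.3 x(t-1) + 0.1
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = np.zeros(50)
for t in range(1, 50):
    y[t] = 0.5 * y[t - 1] + 0.3 * x[t - 1] + 0.1

w = fit_open_loop(y, x, d=1)
pred = predict_closed_loop(w, y[:1], x, d=1)
```

In closed-loop mode, prediction errors are fed back into later steps, so the test-period skill is generally a stricter measure than the open-loop training fit.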
Figure 4 shows the average NARX-estimated daily rainfall for the training and testing periods at the 5 KRB stations, obtained with the parameters of Table 4.
Furthermore, the training and one-year prediction accuracies of the NARX model were evaluated by the coefficient of determination (R²), the Nash–Sutcliffe efficiency (NSE) and the RMSE, all of which are listed in Table 5. Similar to the results of the SM2RAIN method in the previous section, the best and worst NARX model performances are obtained for stations Ahvaz and Khorramabad, respectively, with R²-values of 0.57 for the former and 0.17 for the latter in the testing phase.
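The three performance measures can be computed as in the following sketch. It assumes that R² is taken as the squared Pearson correlation between observed and simulated rainfall, which may differ from the definition used for the tables; the sample values are hypothetical.

```python
import numpy as np

def rmse(obs, sim):
    """Root-mean-square error, in the units of the data."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 equals the mean of obs."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def r2(obs, sim):
    """Squared Pearson correlation (assumed definition of R2)."""
    return float(np.corrcoef(obs, sim)[0, 1] ** 2)

# Hypothetical daily rainfall values in mm (not data from this study)
obs = [0.0, 2.0, 4.0, 6.0]
sim = [1.0, 2.0, 3.0, 4.0]
scores = {"R2": r2(obs, sim), "NSE": nse(obs, sim), "RMSE": rmse(obs, sim)}
```

In this example the simulated series is an exact linear function of the observed one, so R² equals 1 although the NSE is only 0.7; a high correlation alone therefore does not rule out systematic under- or overestimation.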
3.3. PERSIANN–CDR Satellite-Based Rainfall
Besides the SM2RAIN- and NARX-estimated rainfalls, the PERSIANN–CDR satellite-based rainfall is also compared with the observed rainfall at the five climate stations in the KRB (see Figure 5).
The statistical performance of the PERSIANN–CDR product in reproducing the observed rainfall is given in Table 6.
The best results are achieved for Ahvaz station, followed by Hamedan station, although with values of R² and NSE substantially lower (and RMSE higher) than those obtained with the SM2RAIN (Table 2) and NARX (Table 5) models. The reason for this somewhat disappointing performance of the PERSIANN–CDR product could be the lack of training of the neural network parameters over Iran, due to the limited gauge information [64,65], and the low quality of the longwave infrared (IR)-based precipitation estimates [64,66]. Moreover, as can be seen from Figure 5, PERSIANN–CDR tends to underestimate the rainfall at all stations, particularly for heavy rainfall events.
3.4. Comparison of SM2RAIN- and NARX-Simulated Rainfall
Comparison of the rainfall series predicted by NARX (Figure 4) with those of SM2RAIN (Figure 3), as well as of the corresponding statistical performance indicators (Table 2 and Table 5), shows that for climate stations Ahvaz and Kermanshah the NARX-predicted rainfall has a higher correlation with the observed rainfall than the SM2RAIN-predicted one, and this holds for both the training/calibration and the testing/validation periods.
For station Hamedan, NARX provides almost the same, or even slightly better, results than SM2RAIN. In contrast, for Khorramabad and Ilam stations, SM2RAIN is generally superior, delivering a higher correlation than NARX for all periods. A more revealing picture of the performance differences between the two methods is provided by the plots of the biases, i.e., the differences between the simulated and observed rainfall, for the KRB stations in Figure 6. As can be clearly seen, the SM2RAIN model generally has less bias, and hence also a lower RMSE (see Table 2 and Table 5), than the NARX model at all stations. Moreover, as mentioned earlier, SM2RAIN has a tendency to underestimate higher rainfall rates due to saturation, which is why many of the SM2RAIN biases are negative, whereas the NARX biases show more temporary, systematic over- or underprediction of the rainfall. In any case, these results indicate that a physically based model (SM2RAIN) is, at least in this application, indeed superior to a non-physical neural network model (NARX).
4. Conclusions
In this study, the recently developed SM2RAIN algorithm [26] and a new NARX neural network model are applied to convert AMSR–E soil moisture data into daily rainfall estimates at 5 climate stations in the KRB, Iran, which has been a major study region of the authors for some time. The results show that SM2RAIN is able to predict the rainfall at the KRB stations, which are located in different climate regions from the mountainous north to the flat south of the KRB, with varying reliability. Thus, the SM2RAIN-simulated rainfall shows good correlations with the observed one, with R²-values ranging from 0.32 to 0.79 during the calibration and validation periods.
The new NARX neural network developed here turns out to be fast and robust and is also able to approximate the daily rainfall data at the same KRB stations in an acceptable manner, with R² values ranging between 0.17 and 0.65 for the testing period. From the time series of the biases obtained with the two prediction methods (Figure 6), it can be inferred that, although SM2RAIN underestimates daily rainfall in many cases, this method works somewhat better than NARX, which produces higher biases and RMSE (Table 2 and Table 5) at all stations. Whether this holds generally, or only in the present KRB application, is yet to be investigated. However, given that SM2RAIN is a physical model, its slight superiority may come as no surprise. On the other hand, the appealing feature of the NARX network is that, thanks to the use of exogenous (external) input data, its network complexity is reduced compared with classical multilayer perceptron neural networks. As an additional independent check against the observed rainfall, the PERSIANN–CDR rainfall product has also been applied, but it shows a lower performance than the SM2RAIN and NARX models.
In conclusion, the results of the present study indicate that both the SM2RAIN and the NARX models, using AMSR–E satellite soil moisture products, have a high potential for real-time rainfall prediction, but they should be further applied with other satellite soil moisture data sets to more catchments worldwide with different physiographic characteristics in order to better assess their practical usefulness.