Knowledge of the spatio-temporal variations in the distribution of precipitation is of critical importance to hydrological/hydrometeorological modeling. As the quality of hydrological and hydrometeorological model outputs depends greatly on the quality of the input precipitation estimates, realistic spatio-temporal distributions of precipitation are required for reliable modeling [1
Two data sources are routinely used to obtain reliable estimates of precipitation. The first source comprises the point measurements acquired from rain gauges. The spatio-temporal estimates of precipitation are routinely obtained by applying various interpolation algorithms to the rain gauge data. It is not always possible, however, to generate reliable precipitation estimates from rain gauge data alone, as the quality of the resulting estimates depends on the number and spatial configuration of those data [3
The second source comprises radar or satellite observations. Weather radar, which measures the reflectivity of water droplets at a certain height [5
], can provide fine resolution estimates of precipitation. However, such precipitation estimates suffer from various types of errors [7
], so a proper correction of the radar estimates should be considered for many hydrological applications. In addition, weather radar networks are available only for restricted regions. Satellite-based estimates are another source of quantitative precipitation data. Indeed, many missions producing satellite-derived precipitation estimates have been operating since the 1990s, such as the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA), the Global Precipitation Measurement (GPM) and the Global Change Observation Mission-Water (GCOM-W) [9
]. These satellite precipitation products can provide periodic and regional information about the distribution of precipitation and its variations. Despite their advantages compared with rain gauge data, satellite precipitation products also have some limitations. First, their spatial resolution, which ranges from 10 km to 25 km for most products, is too coarse to analyze precipitation distributions at a local scale. Thus, spatial downscaling to increase spatial resolution [12
] is essential when using coarse spatial resolution satellite-derived precipitation data for local-scale analysis in areas where rain gauges are very sparse. Several statistical/geostatistical methods have been proposed for spatial downscaling, some of which integrate auxiliary environmental variables, such as elevation and vegetation index, at a fine spatial resolution via regression analysis and residual correction [4
]. Promising downscaling results have been obtained by previous studies, but the predictive performance of any downscaling method is subject to the accuracy of the input satellite precipitation product. A large number of rain gauges throughout the world is already used for precipitation retrieval from satellite data. For example, the TMPA 3B43 products have been rescaled by monthly rain gauge data from the Global Precipitation Climatology Project (GPCP) and the Climate Assessment and Monitoring System (CAMS) [21
]. However, the accuracy of satellite-derived precipitation products might be unsatisfactory in some areas where accessibility to rain gauge data was restricted [22
If properly integrated with satellite precipitation products, rain gauge data can be effectively used to map precipitation at a fine spatial resolution, since precipitation estimates from rain gauges and satellites have complementary characteristics in terms of data availability and accuracy. Precipitation data from rain gauges are usually regarded as true measurements, hence information provided by rain gauges could be used to adjust bias or errors in satellite precipitation products, thereby helping to improve the predictive performance of fine resolution mapping of precipitation.
Despite the potential of integrating rain gauge data with satellite precipitation products, some challenging issues still need to be addressed. The major issue is the difference in scale between rain gauge data and satellite products. Rain gauge data can be regarded as point measurements, whereas satellite precipitation products can be considered as areal or aggregated measurements at a certain coarse spatial resolution. Thus, this scale mismatch should be handled in an appropriate manner during the integration of data from different supports. A detailed discussion on the difficulty in comparing satellite-derived precipitation estimates with point rain gauges can be found in Porcù et al. [23
]. In addition, an appropriate method should also be developed for the adjustment of satellite products using rain gauge data. As mentioned above, satellite precipitation products such as TMPA have been generated through a gauge adjustment step. However, the bias-corrected coarse resolution precipitation product might still have bias or errors at a finer resolution because rescaling to the monthly rain gauge data was implemented at the coarse resolution. Thus, local adjustment using rain gauge data is still required to generate improved precipitation estimates at the fine spatial resolution. To integrate rain gauges and satellite precipitation products, many studies have developed advanced methods, such as double kernel smoothing [24
], Bayesian combination [25
], conditional merging (CM) [26
], cokriging (CK) [27
] and kriging with an external drift (KED) [28
]. Hunink et al. [31
] combined two different TRMM products with rain gauge data and satellite-derived environmental variables, such as NDVI and DEM. However, this approach requires sufficient rain gauges to establish quantitative relationships between input variables. Although bias correction or local adjustment has been applied by the above methods, very few studies have considered the difference in scale during integration.
Recently, Duan and Bastiaanssen [32
] downscaled TRMM 3B43 products with fine resolution environmental variables and then integrated the downscaled TRMM precipitation estimates with rain gauge data. The errors at rain gauges were interpolated and then added to correct the downscaled TRMM precipitation estimates. However, there was no step to relate the downscaled TRMM precipitation estimates to the rain gauge data prior to interpolation of the errors. Local adjustment with rain gauges may facilitate correcting errors in the downscaled satellite-derived precipitation estimates. When integrating rain gauges and satellite-derived precipitation estimates, the predictive performance usually depends on many factors. Even if the difference in scale has been properly accounted for, the spatial configuration or density of rain gauges, as well as the quality of the coarse resolution satellite precipitation product, might affect the predictive performance. In particular, the density of rain gauges greatly affects not only the spatial pattern of mapping results, but also the degree of local adjustment of the satellite precipitation product. Therefore, the impact of the density of rain gauges on predictive performance should also be quantitatively evaluated, in conjunction with the development of a proper integration method. However, very few studies have considered these two important aspects in the same body of work.
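One way to set up an evaluation of gauge density is to keep only a fraction of the gauges for prediction and hold out the rest for validation. The sketch below illustrates that idea only. The helper name `split_gauges`, the use of purely random subsampling and the density fractions are assumptions made for illustration, not the sampling design of this study; the gauge count of 71 is taken from the data description.

```python
import numpy as np

def split_gauges(n_gauges, fraction, seed=0):
    """Randomly keep a fraction of gauges for prediction and hold out the rest.

    Hypothetical helper: random subsampling is an assumed design, not
    necessarily the scheme used in the study.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_gauges)          # random ordering of gauge indices
    n_keep = max(1, int(round(fraction * n_gauges)))
    used = np.sort(order[:n_keep])             # gauges used for integration
    held_out = np.sort(order[n_keep:])         # gauges reserved for validation
    return used, held_out

# Example: 71 gauges at assumed densities of 25%, 50% and 75%.
for fraction in (0.25, 0.50, 0.75):
    used, held_out = split_gauges(71, fraction, seed=42)
    print(f"{fraction:.0%}: {used.size} used, {held_out.size} held out")
```

Repeating the split with several seeds would give a rough picture of how much the error statistics depend on which gauges happen to be retained, in addition to how many.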
In this study, we investigate the benefits of integrating coarse resolution satellite precipitation products and rain gauge data for fine resolution mapping of precipitation, with emphasis on the development of a geostatistical downscaling-integration framework. A two-stage geostatistical approach is presented that considers the differences in spatial resolution between rain gauge and satellite precipitation products, as well as the errors in satellite precipitation products. In the first stage, the coarse resolution satellite precipitation data are downscaled at a fine spatial resolution using area-to-point (ATP) kriging [33
] to allow direct comparison with rain gauge data. In the second stage, the downscaled precipitation estimates are integrated with rain gauge data using multivariate kriging. During this second stage, the errors in the satellite-derived precipitation estimates are adjusted based on the rain gauge data. Three geostatistical algorithms, i.e., simple kriging with local means (SKLM), KED and CM, are applied for integration at a fine spatial resolution. To analyze the impact of the rain gauge density on the predictive performance, scenarios with various rain gauge densities are tested by cross-validation. Our approach differs from previous studies in that the entire procedure, covering both downscaling and integration, is based on geostatistics, which provides consistent and flexible ways to handle the change of support, quantify spatial correlation and integrate data. The methodological developments and applications are demonstrated by an experiment on the integration of TRMM monthly precipitation estimates and rain gauge data acquired over South Korea.
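The first stage can be illustrated with a minimal sketch of ordinary ATP kriging on a small grid. Several things here are assumptions rather than details of this study: the point-support covariance is taken as a known exponential model (in practice it has to be derived from the coarse data by variogram deconvolution), all coarse blocks are used for every prediction, and the 5 km cells with a factor of 5 are a scaled-down stand-in for the 25 km to 1 km case. The function names and the example values are hypothetical.

```python
import numpy as np

def exp_cov(h, sill=1.0, corr_range=75.0):
    """Assumed point-support exponential covariance (practical range corr_range)."""
    return sill * np.exp(-3.0 * h / corr_range)

def atp_kriging(z_coarse, factor, cell=5.0, corr_range=75.0):
    """Downscale a coarse grid by ordinary area-to-point kriging (sketch)."""
    ny, nx = z_coarse.shape
    fy, fx = ny * factor, nx * factor
    # Fine-cell centres; these also discretize the coarse blocks.
    yy, xx = np.meshgrid((np.arange(fy) + 0.5) * cell,
                         (np.arange(fx) + 0.5) * cell, indexing="ij")
    pts = np.column_stack([yy.ravel(), xx.ravel()])
    h = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    c_pp = exp_cov(h, corr_range=corr_range)            # point-to-point covariance
    # Averaging matrix: row k holds 1/factor^2 for the fine cells inside block k.
    block_id = (np.arange(fy)[:, None] // factor) * nx + (np.arange(fx)[None, :] // factor)
    n_blocks, n_pts = ny * nx, fy * fx
    avg = np.zeros((n_blocks, n_pts))
    avg[block_id.ravel(), np.arange(n_pts)] = 1.0 / factor**2
    c_bp = avg @ c_pp                                    # block-to-point covariance
    c_bb = c_bp @ avg.T                                  # block-to-block covariance
    # Ordinary kriging system with a Lagrange multiplier (weights sum to one).
    lhs = np.ones((n_blocks + 1, n_blocks + 1))
    lhs[:n_blocks, :n_blocks] = c_bb
    lhs[-1, -1] = 0.0
    rhs = np.vstack([c_bp, np.ones((1, n_pts))])
    weights = np.linalg.solve(lhs, rhs)[:n_blocks]       # one column per fine cell
    return (weights.T @ z_coarse.ravel()).reshape(fy, fx)

def upscale(fine, factor):
    """Average fine cells back to the coarse blocks."""
    ny, nx = fine.shape[0] // factor, fine.shape[1] // factor
    return fine.reshape(ny, factor, nx, factor).mean(axis=(1, 3))

# Hypothetical 3 x 3 coarse field of monthly precipitation (mm).
coarse = np.array([[100.0, 120.0, 90.0],
                   [150.0, 200.0, 130.0],
                   [110.0, 160.0, 140.0]])
fine = atp_kriging(coarse, factor=5)
print(np.allclose(upscale(fine, 5), coarse))  # block averages of the fine grid vs. the input
```

The final line checks that averaging the downscaled values over each block gives back the coarse value, which is what makes the downscaled field directly comparable with point gauge data without changing the satellite totals.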
2. Study Area and Data
A downscaling and integration experiment was conducted using coarse resolution satellite precipitation products and rain gauge data acquired over South Korea. Six monthly TRMM 3B43 products from May to October 2013 were used as the satellite-derived precipitation estimates. The choice of May to October, rather than all 12 months, reflects the precipitation characteristics of South Korea. Seasonally, 50%–60% of annual precipitation falls in summer, and the winter precipitation is less than 10% of the total annual precipitation [34
]. As part of the summer East-Asian Monsoon system, a rainy season due to stationary front rain, locally known as the Changma in Korea, starts in late June and continues until late July, bringing frequent heavy rainfall [34
]. Typhoons also affect the Korean Peninsula from June to October [34
]. Given these precipitation characteristics, TRMM products from May to October 2013 were chosen for this experiment.
The original monthly TRMM data at a resolution of 0.25° were first geocoded using a transverse Mercator projection with a spatial resolution of 25 km. Monthly accumulated precipitation in millimeters was then prepared for integration with the rain gauge data. Monthly accumulated rainfall measurements obtained from 71 automated synoptic observing systems (ASOS) over South Korea were used for integration and for validating the prediction results (Figure 1
). ASOS data are available mainly over land, so the integration and downscaling results were generated only over land. The target resolution for downscaling was set experimentally to 1 km.
To further investigate the differences among the three multivariate kriging algorithms, we computed correlation coefficients by comparing error statistics and predictions for different inputs and algorithms using all of the months and densities (Table 3
). Since error statistics obtained from all months were used in the computations, the rRMSEs were used as error statistics instead of the RMSEs. First, the impact of TRMM precipitation data on the predictive performance of each multivariate kriging algorithm was analyzed. The errors in the TRMM precipitation had a strong positive correlation with the predictive performance for all three algorithms (Table 3
a). A strong positive correlation was also observed in the average of correlation coefficients between the TRMM precipitation data and the predicted values by each multivariate kriging algorithm for all months and densities. Strong correlations were obtained by all three algorithms, but SKLM and KED had relatively stronger correlations than CM. For merging rain gauge and radar data using KED, similar findings were obtained by Berndt et al. [28
], where KED was more sensitive to the radar data quality than CM. These measures indicate that the gain of integrating TRMM precipitation with rain gauges depends on the errors or quality of the TRMM precipitation data, as well as on the density of the rain gauges.
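The error statistics and correlations discussed above can be computed along the following lines. This is a sketch under a stated assumption: the rRMSE is taken here to be the RMSE divided by the mean of the observed values, which is a common definition but is not spelled out in this section. The function names and the sample values are hypothetical.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error between observations and predictions."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

def rrmse(observed, predicted):
    """Relative RMSE. Assumed definition: RMSE divided by the observed mean."""
    return rmse(observed, predicted) / float(np.mean(observed))

def pearson(a, b):
    """Pearson correlation coefficient between two equally long series."""
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical gauge observations and predictions (mm) at validation gauges.
gauge = np.array([120.0, 95.0, 210.0, 180.0, 150.0])
trmm = np.array([135.0, 110.0, 190.0, 170.0, 165.0])
merged = np.array([125.0, 100.0, 200.0, 176.0, 156.0])

print("rRMSE of TRMM  :", rrmse(gauge, trmm))
print("rRMSE of merged:", rrmse(gauge, merged))
print("corr(TRMM, merged):", pearson(trmm, merged))
```

Dividing by the observed mean removes the dependence on the monthly precipitation amount, which is why a relative measure is the natural choice when statistics from wet and dry months are pooled.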
To analyze the differences between predictions with and without the integration of the TRMM precipitation product, the error statistics and predictions obtained by OK were compared with those obtained using multivariate kriging algorithms (Table 3
b). The correlation between the rRMSEs for OK and each multivariate kriging algorithm was also strong. In addition, the average correlation coefficients between the OK predictions and the predicted values by each multivariate kriging algorithm were also very strong. All of the multivariate kriging algorithms showed high correlations, but CM had the largest correlation coefficients, unlike the analysis result regarding the impact of errors in the TRMM precipitation data. These differences in the impact analysis results may be explained by the characteristics of each multivariate kriging algorithm. In SKLM and KED, soft data contribute to the estimation of the local means. The errors in the soft data are adjusted using hard data, but they might not be fully adjusted. Residuals are the main target of kriging in SKLM and KED, so their dependency on the local means also affects the prediction results. Meanwhile, CM predictions are the sum of the contributions from both the hard and soft data. When few rain gauges are used for CM, the errors in the OK predictions may greatly affect the final CM predictions. Based on these findings, the impacts of errors in the OK predictions may be reduced if the correlation with the OK predictions is low while that with the TRMM precipitation data is high. For example, the CM predictions for S25 in June had a correlation coefficient of 0.45 with the OK predictions, but 0.948 with the TRMM precipitation data. Thus, the RMSE of the CM predictions (31.44) was smaller than that of the OK predictions (40.44). However, intrinsic errors in the TRMM precipitation data may also have degraded the CM predictions. Thus, the tentative conclusions obtained from this study should be evaluated extensively through experiments using longer time-series datasets.
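The role of the local means in SKLM described above can be made concrete with a short sketch. The assumptions are stated plainly: the local mean is a linear regression of gauge values on the collocated downscaled satellite values, the residual covariance is an exponential model with an assumed range and no nugget, and all gauges are used for every prediction. The function names and the sample values are hypothetical, and the sketch is not the implementation used in this study.

```python
import numpy as np

def exp_cov(h, corr_range=50.0):
    """Assumed exponential residual covariance (unit sill, no nugget)."""
    return np.exp(-3.0 * h / corr_range)

def sklm(gauge_xy, gauge_z, soft_at_gauges, target_xy, soft_at_targets, corr_range=50.0):
    """Simple kriging with local means (sketch)."""
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    gauge_z = np.asarray(gauge_z, dtype=float)
    soft_at_gauges = np.asarray(soft_at_gauges, dtype=float)
    soft_at_targets = np.asarray(soft_at_targets, dtype=float)
    # Local means: soft data calibrated against hard data by linear regression.
    slope, intercept = np.polyfit(soft_at_gauges, gauge_z, 1)
    mean_gauges = intercept + slope * soft_at_gauges
    mean_targets = intercept + slope * soft_at_targets
    # Residuals at gauges are what is actually kriged.
    residuals = gauge_z - mean_gauges
    d_gg = np.linalg.norm(gauge_xy[:, None, :] - gauge_xy[None, :, :], axis=2)
    d_gt = np.linalg.norm(gauge_xy[:, None, :] - target_xy[None, :, :], axis=2)
    # Simple kriging weights: solve C_gg w = c_gt for every target at once.
    weights = np.linalg.solve(exp_cov(d_gg, corr_range), exp_cov(d_gt, corr_range))
    return mean_targets + weights.T @ residuals

# Hypothetical gauges (km coordinates), gauge totals and downscaled satellite values (mm).
gauge_xy = [[0.0, 0.0], [30.0, 10.0], [60.0, 40.0], [10.0, 70.0], [80.0, 80.0]]
gauge_z = [120.0, 150.0, 210.0, 95.0, 180.0]
soft_g = [110.0, 160.0, 190.0, 100.0, 200.0]
target_xy = [[20.0, 20.0], [50.0, 60.0]]
soft_t = [140.0, 170.0]
print(sklm(gauge_xy, gauge_z, soft_g, target_xy, soft_t))
```

Far from any gauge the kriged residual tends to zero, so the prediction falls back to the regression-based local mean. This is where incompletely adjusted errors in the soft data carry through to the result, consistent with the sensitivity of SKLM to the quality of the satellite product noted above.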
Advanced geostatistical kriging algorithms have also been proposed that integrate datasets measured over different supports within a single stage, unlike our two-stage approach. Liu and Journel [45
] proposed block kriging to integrate well-logging data with coarse scale geophysical exploration data. Goovaerts [46
] also proposed area-and-point kriging as a general framework for combining point- and areal-support data and demonstrated its effectiveness through case studies in the fields of soil science and medical geography. The impact analysis in this study showed that the reliability of the integration of coarse resolution satellite-derived precipitation estimates with rain gauges was still susceptible to errors in areal-support data. Since the variants of conventional kriging directly integrate data measured over different supports, intrinsic errors in the areal-support data may propagate to the integration procedure, thereby yielding unreliable final prediction results at a fine spatial resolution. If a priori information is available regarding the error variance, this information could be used to adjust the kriging weights [45
]. However, information on the error variance is not always available. Thus, the final prediction results obtained by direct integration without error adjustment may be affected severely by the errors in the areal-support data. To verify this statement, the two-stage approach presented in this study should be compared with the direct integration approach.
In order to focus on the benefit of integrating coarse resolution satellite-derived precipitation estimates with rain gauge data, other environmental variables related to precipitation were not considered in this study. If auxiliary environmental variables at a fine spatial resolution are available, they can easily be integrated within the geostatistical framework presented in this study. These days, many fine scale auxiliary variables, such as DEM and the vegetation index, can be readily obtained from satellites (e.g., ASTER DEM and MODIS products). These auxiliary variables can be integrated in either (1) the downscaling step or (2) the integration step. If auxiliary variables at a fine spatial resolution are used during the downscaling step, several downscaling methods mentioned in the Introduction Section (e.g., linear regression-ATP residual kriging [4
] and geographically-weighted regression-spline interpolation of residuals [15
]) can be employed to generate the satellite-derived precipitation estimates while accounting for the spatial heterogeneity at a fine spatial resolution. As discussed in Park et al. [17
], however, the benefit of incorporating fine resolution auxiliary variables for downscaling is not always large when compared with downscaling without the auxiliary variables. In addition, a regression model with higher explanatory power does not always improve predictive performance, because of the intrinsic errors in the input coarse resolution data [48
]. If the downscaled satellite-derived precipitation estimates with the fine resolution auxiliary variables are integrated with rain gauge data, the final prediction results at the fine resolution might not show improved performance in some areas with sparse rain gauges. The second possible integration approach is to use the auxiliary variables at the fine resolution as additional inputs for the multivariate kriging algorithms applied in this study. SKLM is more efficient than other multivariate kriging algorithms because many auxiliary variables can be used to estimate the local means via multiple linear/non-linear regression modeling. Meanwhile, KED is only applicable when the auxiliary variables are linearly related to precipitation. It also becomes computationally demanding when many auxiliary variables are integrated. CM has been applied to integrate only one type of radar or satellite precipitation product, so the applicability of CM in this setting is not clear. Given these issues, the integration of datasets from both different supports and multiple sources should be tested extensively in future research.
Another important issue in downscaling and integration is to quantify the uncertainty attached to prediction. In particular, uncertainty quantification or assessment in the downscaling of coarse resolution satellite products is very important because downscaling is regarded as an under-determined inverse problem [4
]. In downscaling of the coarse resolution satellite product, the uncertainty can be assessed within a stochastic simulation framework [4
]. The multiple downscaled realizations from stochastic simulation can be used as inputs of multivariate kriging for integration with rain gauge data. Subsequently, the comparison of the differences between multiple integration results can be used to quantify the uncertainty or impact of downscaled satellite-derived precipitation estimates on, say, hydrological assessments. As the downscaled satellite-derived precipitation estimates are integrated with rain gauge data, however, there are several different sources of uncertainty at ungauged locations, such as the spatial configuration of rain gauges and spatial correlation structures of the residuals, as well as the uncertainty of the downscaled satellite-derived precipitation estimates. Due to the complexity of considering all different sources of uncertainty, uncertainty assessment based on stochastic simulation was not considered in this study. A stochastic simulation framework for downscaling (conditional ATP simulation [36
]) and integration (multivariate conditional simulation [39
]) has already been established in geostatistics. Thus, the two-stage geostatistical kriging approach presented in this paper could be extended to the stochastic simulation framework for target-specific uncertainty assessment, such as the impact of the incorporation of fine resolution auxiliary variables on the downscaling of coarse resolution satellite precipitation products.
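Once multiple integration results are available, one per downscaled realization, their differences can be summarized cell by cell. The sketch below covers only this summary step. The realizations are random stand-ins generated for illustration, not outputs of conditional simulation, and the function name and the choice of summary statistics (mean, standard deviation and the width of the 5th to 95th percentile interval) are assumptions.

```python
import numpy as np

def realization_spread(stack):
    """Summarize a stack of realizations with shape (n_realizations, ny, nx)."""
    stack = np.asarray(stack, dtype=float)
    mean = stack.mean(axis=0)                       # ensemble mean per cell
    std = stack.std(axis=0, ddof=1)                 # sample standard deviation per cell
    p05, p95 = np.percentile(stack, [5, 95], axis=0)
    return mean, std, p95 - p05                     # width of the 5%-95% interval

# Stand-in for 100 integration results on a 4 x 4 grid (mm); purely synthetic.
rng = np.random.default_rng(0)
stack = 150.0 + 20.0 * rng.standard_normal((100, 4, 4))
mean, std, width = realization_spread(stack)
print(std.round(1))
```

Maps of the standard deviation or interval width would show where the integration results are most sensitive to the downscaled input, for example in areas far from rain gauges.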