A Rainfall Model Based on a Geographically Weighted Regression Algorithm for Rainfall Estimations over the Arid Qaidam Basin in China

Accurate rainfall estimations based on ground-based rainfall observations and satellite-based rainfall measurements are essential for hydrological and environmental modeling in the Qaidam Basin of China. We evaluated the accuracy of daily and monthly scale Tropical Rainfall Measuring Mission (TRMM) rainfall products in the Qaidam Basin. A Geographically Weighted Regression (GWR) was used to estimate the spatial distribution of the TRMM product error using altitude and geographical latitude and longitude as independent variables. Finally, a rainfall model was developed by combining ground-based and satellite-based rainfall measurements, and the model precision was validated with a cross-validation method based on rainfall gauge measurements. The TRMM precipitation observations may contain errors compared with the ground-measured precipitation, and the error for daily data was higher than that for monthly data. A time series of TRMM rainfall measurements at the same location showed errors at certain time intervals. The ground-based and satellite-based rainfall GWR model improved the error in the TRMM rainfall products. This rainfall estimation model with a 1-km spatial resolution is applicable in the Qaidam Basin in which there is a sparse network of rainfall gauges, and is significant for spatial investigations of hydrology and climate change.


Introduction
Accurate estimations of rainfall distribution are critical for hydrological and environmental models [1]. They are also crucial for statistically evaluating meteorological conditions and the influence of various factors on environmental change [2,3]. Directly measuring rainfall from rain gauges is the most accurate way to determine site-scale precipitation. However, point rainfall measurements only represent the average rainfall over an area, and the uncertainty of these measurements is determined by the density of rain gauges [4]. There is a sparse network of rain gauges (14 meteorological sites) in the Qaidam Basin of China, which covers about 260,000 km². Rainfall distribution in this region is typically estimated based on this network of rain gauges, and is thus subject to a large degree of uncertainty. Moreover, the accuracy of rain gauge-based measurements is strongly affected by the complex spatial and temporal variability of rainfall at specific locations [5]. Because of high altitude, desert, and harsh weather conditions, it is difficult to expand the network of rain gauges in the Qaidam Basin. Therefore, accurately estimating rainfall distribution by combining ground-based rainfall observations with other data sources is essential in uninhabited regions including the desert, high mountains, and plateau in and around the Qaidam Basin.
Satellite-based techniques are another way to achieve regional rainfall measurements [6-10]. Compared with ground-based measurements, satellite-based methods provide rainfall estimations at much finer temporal and spatial resolutions [11]. The advantages of satellite-based rainfall measurements mitigate some of the problems associated with ground-based rainfall gauges [12]. Among the existing satellite-based global rainfall products, Tropical Rainfall Measuring Mission (TRMM) estimates are the most widely used rainfall products [13-16]. Since its launch in 1997, TRMM has provided critical precipitation measurements in tropical and subtropical regions of the planet [17]. Numerous studies have applied and evaluated TRMM rainfall data for environmental and meteorological research [18-22].
The spatial distribution of rainfall has dual characteristics; it is a combination of continuity and discontinuity, as well as of certainty and randomness [12,23]. The inevitable impacts of environmental factors give rise to the complex spatial variability of rainfall. This is the basic reason for zonal rainfall distributions. The condition of the underlying surface is another important factor impacting the spatial distribution of rainfall; in particular, topography has the most significant influence on the spatial distribution of rainfall [24]. Continental plateaus (for example, the Tibetan Plateau in China) and large mountains influence regional circulation patterns, thereby affecting large-scale rainfall distributions. The non-stationary relationship between precipitation and altitude has been confirmed by many statistical analyses and numerical experiments [25-27].
Due to the complex characteristics of rainfall spatial variability and its numerous influential factors, it is difficult to exactly estimate the spatial distribution of rainfall [28]. Conventional methods for estimating rainfall distributions are based on gauges and employ spatial interpolations. Thus far, there are at least 10 types of rainfall spatial interpolation methods [29,30], and these can be divided into two categories. The first type is based on ground-based rainfall measurements only. The rainfall at a specific location is estimated from gauges at many adjacent sites without considering the impacts of geographical position, topography, and other factors. Geographical spatial interpolation algorithms such as the Spline, Ordinary Kriging, and Inverse Distance Weighting techniques are typical representative methods [31,32]. The second type of spatial estimation method combines ground-based rainfall measurements with major influential factors such as geographical position and topography [32,33]. For example, the regression-based Parameter-elevation Relationships on Independent Slopes Model (PRISM) uses point data, a Digital Elevation Model (DEM), other spatial datasets, and an encoded spatial climate knowledge base to estimate precipitation at various time scales [34]. The PRISM dataset was shown to accurately represent spatial precipitation patterns. However, linear climate-elevation relationship regression methods and station-weighting functions still need to be evaluated for high quality precipitation estimations.
Satellite-based rainfall data cannot accurately represent local rainfall distributions [12]. However, rainfall estimations from ground-based measurements also contain obvious uncertainty in areas with sparse networks of rainfall gauges. Merging ground-based and satellite-based rainfall measurements improves the accuracy of rainfall distribution estimations [4,12,28]. The TRMM product 3B42RT(V7) (hereafter 3B42RT) is a precipitation product that integrates information from both ground-based and satellite-based rainfall measurements [13,14,19,22,35].
As summarized by Kidd and Huffman (2011) and Kidd and Levizzani (2011), there have been a number of efforts to improve the accuracy of rainfall estimation models [36,37], including instantaneous rainfall estimations using neural network analysis [38] and rainfall estimations based on a combination of satellite infrared and lightning information [39]. The Food and Agriculture Organization (FAO) of the United Nations has also attempted to develop and improve satellite rainfall estimation methods [40]. Studies on combined radar-gauge rainfall measurements began in the 1970s [41,42]. Since then, significant progress has been made in combined gauge-satellite rainfall distribution estimations [43,44]. However, a consistent definition for "Rainfall Blending" does not exist in meteorological or hydrological fields. There are two basic methods for merging gauge and satellite rainfall data. The first is a simple globally averaged calibration. However, due to the influence of land surface conditions and rainfall types, this method cannot reflect the non-stationary spatial relationship between satellite rainfall observations or ground-based gauge measurements and actual rainfall. Therefore, local estimation methods such as the Geographically Weighted Regression algorithm (GWR) [27], Objective Analysis (OA), Optimum Interpolation (OI), the variation method, geo-statistics [45], adaptive kernel density estimation [46], and condition fusion are widely used for regional rainfall estimations.
Geographically Weighted Regression (GWR) is a robust algorithm that has been successfully used in spatial rainfall analyses. GWR can theoretically integrate geographical location, altitude, and other factors for spatial rainfall estimations, and reflects the non-stationary spatial relationship between these factors and rainfall [27,32,47,48]. Based on GWR, Xu et al. (2015) [27] developed a new satellite-based monthly precipitation downscaling algorithm that exploits the non-stationary relationship between precipitation and both NDVI and DEM. The study showed that the downscaled precipitation datasets performed better than those from traditional downscaling algorithms over the eastern Tibetan Plateau and the TianShan Mountains. In that research, precipitation was regressed using the non-stationarity of the relationship between TRMM and NDVI and DEM based on GWR. The ground-based precipitation measurements were not used as a variable to generate the background rainfall conditions before the precipitation estimation [27]. The GWR rainfall model has only been applied in local areas or small watersheds with concentrated networks of rainfall gauges, and research is lacking in regions with sparsely distributed gauges. Furthermore, rainfall varies among different climatic regions, and GWR applications require further study in arid or desert regions such as the Qaidam Basin in China [47].
The aim of the present study is to merge ground-based and satellite-based rainfall measurements for accurate spatial rainfall estimations in the Qaidam Basin of China, which has a sparse network of rainfall gauges. First, the 3B42RT rainfall product was evaluated on both daily and monthly scales. Then, a spatial rainfall fusion technique combining ground-based and satellite-based measurements was established based on the GWR algorithm and applied to daily and monthly scale spatial rainfall estimations. The accuracy of the rainfall distribution estimations was evaluated by comparing the results to ground-based rain gauge observations and conventional satellite-based rainfall measurements. This study will improve regional rainfall estimations based on 3B42RT data along with rain gauge observations in arid or desert regions, and strengthen our understanding of the relationship between 3B42RT data and rain gauge measurements in this region.

Study Area
The Qaidam Basin (90°16′E-99°16′E, 35°00′N-39°20′N), a type of plateau basin, is located in the northwest of Qinghai Province and surrounded by the Kunlun Mountains, Altun Mountains, and Qilian Mountains. The boundary of the Qaidam Basin used in our study was determined according to the terrain features of the basin, including parts of Qinghai and Gansu Province. The Qaidam Basin is 800 km long from east to west, 300 km wide from north to south, and has an area of 257,768 km². The basin is located in a crescent valley formed by mountains and plateaus. It has an altitude of over 3000 m and is the highest of China's four largest basins. The basin has a plateau continental climate and is characterized by drought. The annual precipitation in the Qaidam Basin is 200 mm in the southeast and progressively decreases to 15 mm in the northwest. The relative air humidity in this basin typically ranges from 30% to 40% but can fall below 5%. The average annual temperature is low and the winds are strong, creating a harsh natural environment. Most of the land in this area is classified as arid desert, and the main soil types are saline desert soil and gypsum desert soil. Due to scarce precipitation, large temperature differences, and strong wind, desertification and salinization in the basin are obvious and the environment is fragile. The Qaidam Basin is the area of the Qinghai-Tibet Plateau that is most sensitive to climate change.

3B42RT Data
The TRMM is a joint U.S. and Japan satellite mission for monitoring tropical and subtropical precipitation and estimating the associated latent heat. Rainfall measuring instruments on the TRMM satellite include the Precipitation Radar (PR), an electronic scanning radar operating at 13.8 GHz; the TRMM Microwave Imager (TMI), a nine-channel passive microwave radiometer; and the Visible and Infrared Scanner (VIRS), a five-channel visible/infrared radiometer. The purpose of the 3B42 algorithm is to produce TRMM-adjusted merged-infrared (IR) precipitation and Root-Mean-Square (RMS) precipitation-error estimates. In this study, we used the dataset version 7 (3B42RT) from 1 January 2001 to 31 December 2013 with a spatial resolution of 0.25° × 0.25°. The 3B42RT combination is computed from 50°N-S every 3 h from the 3B40RT (High Quality) and 3B41RT (Variable Rainrate) fields.

Meteorological and Digital Elevation Data
The monthly and daily precipitation data were obtained from the basic and standard surface meteorological stations in the study area. About 14 years of monthly and daily meteorological data (2000-2013) were acquired from the China Meteorological Data Network [49]. The quality of the raw data was strictly controlled and inspected during the production process. To improve the precision of the precipitation distribution estimations, a 100-km buffer surrounding the study area was established and 14 meteorological sites were selected to construct the precipitation distribution model (Figure 1). A DEM with a 1-km spatial resolution was used because elevation is directly related to the precipitation distribution. The DEM was derived from the "China Western Environment and Ecology Science Data Center" [50].

Precipitation Spatial Estimation Model Development
Satellite-based measurements provide spatially continuous but inaccurate precipitation estimates, whereas ground-based precipitation observations provide accurate but spatially discontinuous information. Accurate continuous precipitation distribution estimations can be generated by merging satellite-based and ground-based precipitation data. Let $P$ be the real precipitation field for a region, $P_o$ the observed precipitation field formed by $n$ observation sites, and $P_b$ the precipitation background field (or initial evaluation field). The relationship among these variables is described in terms of the precipitation background error $e_b(i)$ and the observation error at observation site $i$, both of which have a mathematical expectation of 0. Therefore, the precipitation background error can be estimated using the difference between the actual precipitation measurements and the background value as:

$$\hat{e}_b(j) = f\left[P_o(1) - P_b(1),\; P_o(2) - P_b(2),\; \cdots,\; P_o(n) - P_b(n)\right]$$

where $\hat{e}_b(j)$ is the precipitation background error at a position $j$ that has no ground-based measurement, and $P_o(1) - P_b(1), P_o(2) - P_b(2), \cdots, P_o(n) - P_b(n)$ represent the differences between observed precipitation values and background values at the $n$ points with actual precipitation measurements. GWR was used as the estimator $f$. The precipitation distribution is calculated by adding the precipitation background value and the background field error estimated using GWR:

$$P(j) = P_b(j) + \hat{e}_b(j) = P_b(j) + f\left[P_o(1) - P_b(1),\; P_o(2) - P_b(2),\; \cdots,\; P_o(n) - P_b(n)\right]$$

where $P(j)$ is the precipitation at a point $j$ that has no ground-based precipitation measurement, and $P_b(j)$ is the satellite-based background precipitation value at $j$.

The temporal scales of data from the meteorological stations are daily and monthly. The daily ground-based rainfall measurements included total precipitation from 8:00 P.M. on the previous day to 8:00 P.M. on the current day, and monthly precipitation was calculated as the sum of the daily precipitation. The 3B42RT data were strictly controlled for quality, and values with poor quality or missing samples were eliminated from the dataset. The three-hour 3B42RT data were used to derive the daily and monthly precipitation background values. To calculate the daily 3B42RT precipitation data, three-hour 3B42RT data were taken from the last observation on the previous day to the first seven observations on the current day. Monthly 3B42RT precipitation was calculated as the sum of the daily precipitation.
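The daily aggregation window described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name `daily_totals_from_3h` is hypothetical, and the input is assumed to hold rain rates in mm/h (hence the multiplication by the 3 h step), with eight fields per calendar day in time order.

```python
import numpy as np

def daily_totals_from_3h(rates_mm_per_h, hours_per_step=3.0):
    """Aggregate 3-hourly rain rates into daily totals (hypothetical helper).

    Day d (d >= 1) is the last 3-hourly field of day d-1 plus the first
    seven fields of day d, following the window described in the text.
    Day 0 is skipped because it has no previous day.
    Assumption: the input holds rates in mm/h, eight slots per day.
    """
    rates = np.asarray(rates_mm_per_h, dtype=float)
    if rates.ndim != 1 or rates.size % 8 != 0:
        raise ValueError("expected a 1-D series with eight slots per day")
    n_days = rates.size // 8
    depth = rates * hours_per_step               # rate (mm/h) -> depth (mm)
    # Shift the daily window back by one slot: start at slot 7 of day 0.
    window = depth[7:7 + 8 * (n_days - 1)]
    return window.reshape(n_days - 1, 8).sum(axis=1)

def monthly_total(daily_mm):
    """Monthly precipitation as the sum of the daily values of one month."""
    return float(np.sum(daily_mm))

# Example: three days of synthetic 3-hourly rates.
example_daily = daily_totals_from_3h(np.arange(24) / 3.0)
```

Missing or low-quality fields are assumed to have been removed or filled beforehand; the sketch does not handle gaps.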

GWR Model
GWR is an extension of the common linear regression model. GWR directly builds a relationship between location and parameters using spatial x, y coordinates, as well as the local fitting relationship between the dependent variable and independent variables. GWR can also integrate multiple factors to fit the dependent variable. As a robust tool to describe spatial heterogeneity, GWR uses regression coefficients that are not based on global information; rather, they vary with location and are generated by a local regression estimation using sub-sampled data from the nearest neighboring observations. The principle of GWR is as follows:

$$y_i = \beta_0(\mu_i, v_i) + \sum_{j} \beta_j(\mu_i, v_i)\, x_{ij} + \varepsilon_i$$

where $y_i$ and $x_{i1}, x_{i2}, \cdots, x_{ij}$ are the observations of the dependent variable $y$ and the independent variables $x_j$ at the geographical position $(\mu_i, v_i)$, $\mu_i$ represents longitude, $v_i$ represents latitude, and $\beta_j(\mu_i, v_i)$ represents unknown parameters at the observation site $(\mu_i, v_i)$, which can also be recognized as n unknown parameters. By selecting elevation as the independent variable, a GWR model was established based on the precipitation background error at the weather station and actual precipitation measurements. $\varepsilon_i$ represents independent identically distributed random errors at site $i$ in accordance with the spherical disturbance hypothesis, with a mean value of 0 and a variance of $\sigma^2$; the errors are normally assumed to follow the distribution $N(0, \sigma^2)$.
To solve the mathematical problem, a local regression model was built using the actual observation site $i$ and its adjacent observation samples. The spatial distance decay weight matrix $w_{ij}$ reflects the differences between adjacent observation values and point $i$. GWR comprehensively considers the local autocorrelation of the dependent variable and its cross-correlation with the independent variables, and both are used for spatial estimations. The spatial autocorrelation of the dependent variable is described by the weight matrix, whereas the cross-correlations between the dependent variable and independent variables are reflected by the local regression index. A self-adaption function (similar to the Gauss truncation function) was used to adapt and adjust the bandwidth and to optimize the spatial weight matrix; it defines each weight $w_{ij}$ from $d_{ij}$, the distance between $j$ and $i$, and from the bandwidth $b$. The cross-validation method was used to determine the optimal bandwidth $b$. The $b$ value corresponding to the minimum value of cross-validation is the optimal bandwidth.
$$CV = \sum_{i=1}^{n} \left[ y_i - \hat{y}_{\neq i}(b) \right]^2$$

where $CV$ is the cross-validation result, $y_i$ is the observed value at site $i$, and $\hat{y}_{\neq i}(b)$ is the value estimated at site $i$ with bandwidth $b$. The subscript $\neq i$ indicates that the observation at site $i$ itself is eliminated during the regression parameter estimation process, so that only the other observation sites within the range of bandwidth $b$ are used.
In our study, the differences between ground-based and satellite-based precipitation observations at 14 meteorological sites were calculated. Then, the GWR model was used to estimate the spatial distribution of the difference values, and elevation was selected as an independent variable with significant influence on the spatial distribution of precipitation. Finally, an estimated error value was added to the satellite-derived precipitation background value to calculate the spatial precipitation.
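The three steps above (gauge-minus-satellite differences, GWR on elevation, correction of the background field) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the bi-square kernel is an assumed stand-in for the truncated Gauss-like weighting function, the bandwidth is fixed rather than adaptive, coordinates are assumed to be projected (km), and all function names are hypothetical.

```python
import numpy as np

def bisquare(d, b):
    """Assumed kernel: weights decay with distance and are 0 beyond bandwidth b."""
    return np.where(d < b, (1.0 - (d / b) ** 2) ** 2, 0.0)

def gwr_predict(xy_obs, x_obs, y_obs, xy_new, x_new, b):
    """Fit y = beta0 + beta1 * x by weighted least squares at each target point."""
    X = np.column_stack([np.ones(len(x_obs)), x_obs])
    preds = np.empty(len(xy_new))
    for k, (p, xk) in enumerate(zip(xy_new, x_new)):
        d = np.hypot(*(xy_obs - p).T)            # distances to all gauges
        sw = np.sqrt(bisquare(d, b))             # sqrt-weights for lstsq
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y_obs * sw, rcond=None)
        preds[k] = beta[0] + beta[1] * xk
    return preds

def cv_score(xy, x, y, b):
    """Leave-one-out CV: sum of squared errors with site i excluded from its own fit."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        y_hat = gwr_predict(xy[keep], x[keep], y[keep], xy[i:i + 1], x[i:i + 1], b)[0]
        total += (y[i] - y_hat) ** 2
    return total

def best_bandwidth(xy, x, y, candidates):
    """Return the candidate bandwidth with the smallest CV score."""
    scores = [cv_score(xy, x, y, b) for b in candidates]
    return candidates[int(np.argmin(scores))]

def merge(xy_g, elev_g, gauge, sat_g, xy_grid, elev_grid, sat_grid, candidates):
    """P(j) = P_b(j) + estimated background error at each grid point j."""
    resid = gauge - sat_g                        # P_o(i) - P_b(i) at the gauges
    b = best_bandwidth(xy_g, elev_g, resid, candidates)
    e_hat = gwr_predict(xy_g, elev_g, resid, xy_grid, elev_grid, b)
    return sat_grid + e_hat

# Synthetic example: 14 gauges in an 800 km x 300 km box (values are made up).
rng = np.random.default_rng(0)
xy_g = rng.uniform([0, 0], [800, 300], size=(14, 2))
elev_g = rng.uniform(2700, 4500, 14)
sat_g = rng.uniform(0, 5, 14)
gauge = sat_g + 2.0 + 0.001 * elev_g             # error exactly linear in elevation
xy_grid = rng.uniform([0, 0], [800, 300], size=(5, 2))
elev_grid = rng.uniform(2700, 4500, 5)
sat_grid = rng.uniform(0, 5, 5)
merged = merge(xy_g, elev_g, gauge, sat_g, xy_grid, elev_grid, sat_grid,
               [1000.0, 1500.0, 2000.0])
```

Nothing in this sketch prevents a negative merged value where the estimated error exceeds the background; how such cases were handled is not stated in the text.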

Accuracy Evaluation
The cross-validation method was used to evaluate the accuracy of the GWR model. There are 14 meteorological sites in and around the study area. All samples from January 2001 to November 2013 at one site were used as the validation data, and samples from the other 13 sites were used to generate the GWR model. The cross-validation process was repeated 14 times, with the samples from each site used exactly once as validation data. The results from the 14 meteorological sites were then combined to produce the accuracy estimate.
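The leave-one-site-out procedure described above can be sketched generically. The function `leave_one_site_out` and its `predict` callback signature are hypothetical; the inverse-distance predictor `idw` is only a simple stand-in so that the example runs on its own, and in the study the predictor would be the GWR merging model.

```python
import numpy as np

def leave_one_site_out(xy, series, predict):
    """Estimate every site's series using only the other sites.

    xy:      (n_sites, 2) site coordinates
    series:  (n_sites, n_times) values at each site and time step
    predict: callable(xy_train, values_train, xy_target) -> float
    """
    n_sites, n_times = series.shape
    est = np.empty((n_sites, n_times), dtype=float)
    for s in range(n_sites):
        train = np.arange(n_sites) != s          # hold out site s entirely
        for t in range(n_times):
            est[s, t] = predict(xy[train], series[train, t], xy[s])
    return est

def idw(xy_train, values, xy_target, power=2.0):
    """Stand-in predictor (inverse distance weighting), not the paper's model."""
    d = np.hypot(*(xy_train - xy_target).T)
    w = 1.0 / np.maximum(d, 1e-9) ** power
    return float(np.sum(w * values) / np.sum(w))

# Example with five made-up sites and four time steps.
sites = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 2]], dtype=float)
values = np.full((5, 4), 3.0)
values[0] = 100.0                                # outlier at the held-out site
estimates = leave_one_site_out(sites, values, idw)
```

Because site 0 is excluded from its own fit, its estimate depends only on the other four sites, which is the property the validation relies on.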
Quantitative and classification indicators were used to evaluate the consistency between ground-based and satellite-based precipitation observations. Quantitative indicators included Absolute Error, Mean Absolute Error (MAE), Relative Deviation, Systematic Relative Deviation, and Correlation Coefficient ($r$). The relative mean absolute error is the MAE normalized by the mean observed precipitation $\bar{O}$, that is, $ABIAS = MAE/\bar{O}$.

The skill index was used to assess the improvement or reduction in the accuracy of the precipitation fusion scheme compared to others. The formulas are as follows:

$$SI_{MAE} = 1 - \frac{MAE_1}{MAE_2}, \qquad SI_r = \frac{r_1 - r_2}{r_2}$$

where $MAE_1$ and $MAE_2$ are the Mean Absolute Errors of the two precipitation distribution estimation schemes being compared (the fusion scheme and the reference scheme, respectively), and $r_1$ and $r_2$ are the Correlation Coefficients of the two schemes. Theoretically, the ranges of $SI_{MAE}$ and $SI_r$ are $(-\infty, 1)$ and $(-\infty, +\infty)$, respectively. The threshold values of $SI_{MAE}$ and $SI_r$ are both 0, which is used to determine whether the estimation accuracy for the fusion scheme is improved compared to the reference scheme. Values of $SI_{MAE}$ and $SI_r$ greater than 0 indicate that the precipitation estimation accuracy is improved for the fusion scheme, and the degree of improvement increases with the $SI_{MAE}$ and $SI_r$ values. Values below 0 indicate reduced precipitation estimation accuracy, with lower values corresponding to a larger reduction.

The accuracy evaluations of the daily and monthly scale precipitation estimations were implemented separately. The raw 0.25° × 0.25° 3B42RT data were resampled to 1 km × 1 km using the B-Spline method, and then used to generate the merging GWR precipitation model. Moreover, the raw and resampled 3B42RT data were used to compare and analyze the accuracy and effect of the GWR model. These steps help to determine whether the merging GWR model accounted for the error in the 3B42RT rainfall products.
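The indicators and skill indices can be computed along the following lines. This is a sketch under stated assumptions: errors are taken as estimate minus observation, the skill indices are assumed to take the forms $SI_{MAE} = 1 - MAE_1/MAE_2$ and $SI_r = (r_1 - r_2)/r_2$ (forms consistent with the ranges given in the text and with the reported daily correlation skill of 3.12), and the function names are hypothetical.

```python
import numpy as np

def evaluate(est, obs):
    """Quantitative indicators for estimated vs. observed precipitation."""
    est = np.asarray(est, dtype=float)
    obs = np.asarray(obs, dtype=float)
    diff = est - obs                             # assumed sign convention
    mae = np.abs(diff).mean()
    return {
        "ME": diff.mean(),                       # mean error
        "MAE": mae,                              # mean absolute error
        "RMSE": np.sqrt((diff ** 2).mean()),     # root mean square error
        "ABIAS": mae / obs.mean(),               # MAE relative to mean observation
        "r": np.corrcoef(est, obs)[0, 1],        # Pearson correlation coefficient
    }

def skill_mae(mae_1, mae_2):
    """Assumed form: positive when scheme 1 has the smaller MAE."""
    return 1.0 - mae_1 / mae_2

def skill_r(r_1, r_2):
    """Assumed form: relative gain in correlation of scheme 1 over scheme 2."""
    return (r_1 - r_2) / r_2

# Daily correlation coefficients reported in the text: GWR 0.6538, 3B42RT 0.1587.
si_r_daily = skill_r(0.6538, 0.1587)
stats = evaluate([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

With the rounded daily values quoted in the text, `skill_r` reproduces the reported 3.12, which is the main support for the assumed form of that index.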

Daily Precipitation Distribution Estimations
Scatter plots between the daily ground-based precipitation observations and the raw (0.25° × 0.25°) 3B42RT data, the resampled (1 km × 1 km) 3B42RT data, and the precipitation estimated with the GWR model from all sites between January 2001 and October 2013 are shown in Figure 2. The correlation between GWR-estimated and ground-based precipitation measurements (Figure 2c) was improved compared with the raw and resampled 3B42RT data (Figure 2a,b). The scatter plots of raw and resampled 3B42RT vs. ground-measured precipitation were similar (Figure 2a,b). The correlation between raw and resampled 3B42RT data and ground measurements was low. These results indicate that at a daily scale, 3B42RT precipitation observations may contain error compared with ground-measured precipitation. The convergence of scattered points between the raw and resampled 3B42RT data and ground-measured precipitation was not obvious (Figure 2a,b). However, the precision of GWR-estimated precipitation obviously improved, as shown in Figure 2c.
Because the raw and resampled 3B42RT data had similar accuracy, the GWR accuracy was evaluated against the resampled 3B42RT data. Correlation coefficients, root mean square error, mean error, and mean absolute error were used to evaluate the accuracy of precipitation estimated with the GWR model and the 3B42RT interpolation method (Table 1). The comparative analysis showed that the estimation accuracy of the GWR method was greater than that of the resampled 3B42RT observation. The GWR-estimated precipitation was significantly correlated with the actual precipitation, with a correlation coefficient of 0.6538 (p ≤ 0.01). However, the correlation coefficient between the resampled 3B42RT data and actual precipitation was only 0.1587 (p ≤ 0.01). The root mean square error for the GWR model was 1.63, compared with 2.57 for the resampled 3B42RT observation. Correspondingly, the mean absolute error (0.52) for the GWR model was lower than that for the resampled 3B42RT observation (0.78), with a relative mean absolute error of 1.01 compared to 1.52, respectively. The mean error for the GWR model was slightly higher than that for the resampled 3B42RT observation, with a value of 0.10 compared to −0.03, respectively.
Skill indices were used to assess the improvement in the accuracy of the GWR model compared to resampled 3B42RT data (Table 2). The precipitation estimation accuracy was significantly improved with the GWR model compared with the resampled 3B42RT data based on correlation coefficients and relative mean absolute error. The skill index was 3.12 for the correlation coefficient and 0.35 for the relative mean absolute error.
Remote Sens. 2016, 8, 311

Skill Indices: correlation coefficient, relative mean absolute error
Value: 3.12, 0.35

This study employed two randomly selected meteorological observation points (Xiaozaohuo (52707) and Dulan (52836)). A time series of the GWR results, 3B42RT data, and actual precipitation was extracted from January to November 2013 to evaluate the relationships between them, as shown in Figure 3. Using resampled 3B42RT precipitation to reflect the spatial distribution did not improve the estimations, but precipitation estimated with the GWR model was well correlated with the ground-based measurements (Figure 3a). During months with high rainfall (May-August), the 3B42RT observations were significantly lower than the actual precipitation (Figure 3b). However, during months with low rainfall (January-April and September-October), several main rainfall events observed by 3B42RT were not recorded through ground-based measurement. This indicates that the 3B42RT observations overestimated precipitation in these months. The GWR precipitation estimates compensated for the underestimated precipitation observations during May-August, and the overestimated precipitation was corrected during January-April and September-October. The annual rainfall distribution estimated with the GWR model was consistent with ground observations, showing that rainfall was highest during summer. The GWR-estimated precipitation at Dulan was similar to the results at Xiaozaohuo, indicating the consistency of the results. The 3B42RT observations did not reflect actual rainfall events, whereas the GWR estimations were more consistent with the actual conditions, especially in September.
The typical dates with precipitation in spring, summer, autumn, and winter in 2013 were used to compare the GWR-estimated precipitation spatial distribution with the actual precipitation observed at meteorological sites (Figure 4). In general, the estimated and actual precipitation were consistently correlated. In areas with more estimated precipitation, the actual precipitation was also large, and vice versa. The degree of correlation between the estimated and actual precipitation in the study area was better than that of the surrounding region. This was mainly due to the scarcity of meteorological sites in the surrounding region, which resulted in lower accuracy of the GWR model (Figure 4b,c). Because of the difference between 3B42RT data and ground-based rainfall measurements in some regions, there were some discrepancies between the GWR results and actual precipitation.

Monthly Precipitation Distribution Estimations
Raw and resampled 3B42RT data and meteorological data extracted from all sites between January 2001 and October 2013 were used to analyze the corresponding relationships between the 3B42RT observations, GWR-estimated precipitation, and ground-based precipitation measurements. Figure 5 shows scatter plots of the raw (0.25° × 0.25°) and resampled 3B42RT data (1 km × 1 km), GWR-estimated precipitation (1 km × 1 km), and ground-measured precipitation.
The scatter plots of raw and resampled 3B42RT vs. ground-measured precipitation were similar. The correlation between the 3B42RT data and meteorological observations was higher on a monthly scale than on a daily scale (Figure 5a,b). However, the scattered points for the raw and resampled 3B42RT data against ground-measured precipitation showed a fan-shaped distribution and the correlation was low (Figure 5a,b). The correlation between GWR-estimated and ground-based precipitation measurements (Figure 5c) was clearly improved compared with the raw and resampled 3B42RT data (Figure 5a,b), and the points converged more tightly around the actual precipitation measurements (Figure 5c). These results indicate that the estimation accuracy of the GWR model was significantly improved compared to the original and resampled 3B42RT data.

To evaluate the accuracy of the GWR model relative to the resampled 3B42RT data, their accuracy statistics were compared, as shown in Table 3. The GWR model, which integrates satellite-based observations with ground-based precipitation observations and elevation information, produced more accurate monthly scale precipitation estimates than the resampled 3B42RT data. The correlation coefficient, root mean square error, mean error, and mean absolute error were all improved. The correlation coefficient between GWR-estimated precipitation and actual precipitation observations was 0.8910 (p ≤ 0.01), whereas that between the resampled 3B42RT data and actual precipitation observations was 0.5749 (p ≤ 0.01). The root mean square error for the GWR model was 12.47, compared with 23.10 for the resampled 3B42RT observations. The mean error and mean absolute error of the GWR-estimated precipitation were 0.38 and 6.83, respectively, both smaller in magnitude than the corresponding values for the resampled 3B42RT data (−0.99 and 13.53, respectively). The mean absolute error of the GWR model was thus only about half that of the resampled 3B42RT data.
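The four accuracy statistics compared above can be computed as in the following minimal sketch. It assumes the mean error is defined as estimated minus observed, which this section does not state explicitly. The function name `accuracy_metrics` and the sample values are hypothetical and are not data from this study.

```python
import numpy as np

def accuracy_metrics(estimated, observed):
    """Return correlation coefficient, RMSE, mean error, and mean absolute error."""
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    # Assumed sign convention: positive differences mean overestimation.
    diff = est - obs
    return {
        "r": float(np.corrcoef(est, obs)[0, 1]),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "me": float(np.mean(diff)),
        "mae": float(np.mean(np.abs(diff))),
    }

# Hypothetical monthly totals, for illustration only.
gauge = [2.0, 10.0, 35.0, 60.0, 20.0, 5.0]
satellite = [6.0, 4.0, 20.0, 85.0, 30.0, 12.0]
metrics = accuracy_metrics(satellite, gauge)
print(metrics)
```

A mean error close to zero can hide large errors of opposite sign, which is why the mean absolute error and root mean square error are reported alongside it.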
Remote Sens. 2016, 8, 311
Table 4 shows that the GWR model had higher precipitation estimation accuracy than the resampled 3B42RT data. The skill indices of the GWR model were all greater than 0. The skill index of the correlation coefficient was 0.55, indicating that the precipitation estimation accuracy of the GWR model was greater than that of the resampled 3B42RT data. The skill index of the mean absolute error was 0.49, indicating that the estimation error was also reduced with the GWR model compared to the resampled 3B42RT data.
This study employed two randomly selected meteorological observation points, Dachaidan (52713) and Wudaoliang (52908). A time series of the GWR results and actual precipitation was extracted from January 2001 to November 2013 to evaluate the relationship between them, as shown in Figure 6. The difference between GWR-estimated and actual precipitation was lower than that between the 3B42RT data and actual precipitation. The 3B42RT data overestimated precipitation before 2008, whereas the GWR model results were better correlated with the actual precipitation. Moreover, the 3B42RT data underestimated precipitation during 2008-2010, but this was accounted for in the GWR model, and the accuracy of the 3B42RT-derived precipitation was effectively improved during this period. The precipitation estimation accuracy during 2011-2013 was similar to that before 2008. Overall, the GWR-estimated precipitation values were more accurate than the 3B42RT data.
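The skill indices quoted from Table 4 can be cross-checked against the Table 3 statistics. The sketch below assumes the skill index is the relative improvement over the resampled 3B42RT data. This section does not give the formula, so that definition and the function names are assumptions.

```python
def skill_higher_is_better(model, reference):
    """Relative improvement for a score where larger is better (e.g., correlation)."""
    return (model - reference) / abs(reference)

def skill_lower_is_better(model, reference):
    """Relative improvement for an error where smaller is better (e.g., MAE)."""
    return (reference - model) / abs(reference)

# Monthly statistics reported in the text: GWR vs. resampled 3B42RT.
r_skill = skill_higher_is_better(0.8910, 0.5749)
mae_skill = skill_lower_is_better(6.83, 13.53)
print(round(r_skill, 3), round(mae_skill, 3))
```

Under this assumed definition, the correlation skill comes out at about 0.55 and the mean absolute error skill at about 0.50. Both are close to the reported 0.55 and 0.49, and the small gap may reflect rounding of the tabulated inputs.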
The accuracy of the precipitation estimates made by the GWR model at Wudaoliang was lower than that at Dachaidan. This may be because the difference between the 3B42RT observations and actual precipitation was smaller at Wudaoliang than at Dachaidan. Precipitation was neither over- nor underestimated at Wudaoliang during the study period from January 2001 to November 2013, and a good relationship existed between the 3B42RT and actual precipitation data at this site. In spite of this, the relationship between the GWR results and actual precipitation was improved, whereas that with the resampled 3B42RT data did not change. The correlation between GWR-estimated and actual precipitation was clearly improved compared with the original 3B42RT data.
Typical months were selected in spring, summer, autumn, and winter to evaluate the corresponding relationship between the monthly GWR model results and actual precipitation measurements (Figure 7). The spatial precipitation distribution was accurately estimated by the GWR model, which integrated satellite-based precipitation observations and ground-based rainfall measurements. The GWR-estimated spatial precipitation distribution was consistent with the actual rainfall distribution. The correlation between the monthly GWR results and actual monthly rainfall was better than that between the daily GWR results and daily rainfall observations (Figure 4). The precipitation distribution determined by the GWR model in the study area showed that rainfall in the central Qaidam Basin was lower than that in the eastern and southern areas of the basin (Figure 7b,d). High-value pixels in the 3B42RT data differed from those in the surrounding region, which may have affected the accuracy of the GWR model (Figure 7a,c). Due to the lack of ground meteorological sites, it was difficult to inspect and correct the 3B42RT values. In the future, other data sources should be integrated into the GWR model to improve the spatial precipitation estimation results.

Discussion
GWR is a robust algorithm that has been successfully used in spatial rainfall analyses. GWR can theoretically integrate geographical location, altitude, and other factors for spatial rainfall estimations, and reflects the non-stationary spatial relationship between these factors and rainfall. Compared with other traditional precipitation merging models, GWR not only has higher precision, but can also readily explain the spatial distribution characteristics of precipitation. GWR is based on kernel regression and local smoothing. It relates the spatial autocorrelation of precipitation and the non-stationary spatial relationship between these factors and rainfall to achieve the final precipitation distribution.
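The kernel-weighted local regression idea described above can be illustrated with a minimal sketch. It assumes a Gaussian kernel with a fixed bandwidth and a single explanatory variable. The kernel, the bandwidth, the function name `gwr_predict`, and the synthetic station values are illustrative assumptions, not the configuration or data used in this study.

```python
import numpy as np

def gwr_predict(coords, X, y, target_coord, target_x, bandwidth):
    """Fit a locally weighted linear regression at one target location."""
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Gaussian kernel: stations closer to the target receive larger weights.
    dist = np.linalg.norm(coords - np.asarray(target_coord, dtype=float), axis=1)
    w = np.exp(-0.5 * (dist / bandwidth) ** 2)
    # Add an intercept column, then solve the weighted least-squares problem.
    A = np.column_stack([np.ones(len(X)), X])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    x_row = np.concatenate([[1.0], np.atleast_1d(target_x)])
    return float(x_row @ beta), beta

# Synthetic stations: the response depends linearly on elevation (hypothetical).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(12, 2))   # station positions in km
elev = rng.uniform(2600.0, 4500.0, size=12)      # station elevations in m
y = 5.0 + 0.01 * elev
pred, beta = gwr_predict(coords, elev[:, None], y, (50.0, 50.0), [3000.0], bandwidth=40.0)
print(pred, beta)
```

Because the coefficients are re-estimated at every target location, they can vary over space, which is how the method captures a non-stationary relationship. A sparse network leaves few stations with meaningful weight near any given target.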

As found in our research, the ground-based and satellite-based GWR model improved the correlation between in-situ rainfall observations and remotely sensed rainfall measurements in the Qaidam Basin. However, rainfall varies among different climatic regions, and GWR applications require further study in other arid or desert regions. Although the precipitation merging GWR model improved the precipitation estimation accuracy, it has non-uniform and local difference characteristics. The GWR model accuracy differed at various locations, as shown by the temporal and spatial validation results. Further research and development of the GWR model is currently being conducted.
Furthermore, there are too few rain gauge stations in the study area to achieve high-quality daily rainfall estimations. Hu (2013) used GWR to investigate the effect of station network density on the GWR precipitation model in the Ganjiang River Basin. There is no doubt that the local accuracy of the GWR precipitation model is higher in study areas with denser rain gauge station networks. When the ground gauge density is lower than ~1300 km² per gauge, the estimation accuracy varies dramatically with the number of gauges, whereas if the density is above ~380 km² per gauge, the accuracy is insensitive to variation in the number of gauges. When the gauge network density is below ~2500 km² per gauge, the accuracy of spatial precipitation estimations obtained by data merging is gradually improved compared to those obtained by traditional interpolation methods based on only ground observations [47].
The Qaidam Basin is 800 km long from east to west, 300 km wide from north to south, and has an area of 257,768 km². Along with weather stations in the buffer zone, the gauge network density in the Qaidam Basin is about 18,412 km² per gauge, a far sparser network than the 2500 km² per gauge threshold. The sparse gauge network had a great impact on the accuracy of the GWR model. However, in this case, the accuracy of the spatial precipitation estimation obtained by data merging may be greater than that obtained by traditional interpolation methods based on only ground observations.
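The density figure above can be reproduced with a short calculation. It assumes the 14 meteorological sites mentioned in the Introduction are the gauges being counted. The variable names are illustrative.

```python
basin_area_km2 = 257_768      # basin area reported in the text
n_gauges = 14                 # assumed: the 14 sites cited in the Introduction
threshold_km2_per_gauge = 2500

area_per_gauge = basin_area_km2 / n_gauges
ratio_to_threshold = area_per_gauge / threshold_km2_per_gauge
print(area_per_gauge, round(ratio_to_threshold, 1))
```

Each gauge therefore represents roughly seven times more area than the ~2500 km² per gauge threshold.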

Conclusions
A GWR model was developed to derive rainfall distribution estimations from ground-based and satellite-based rainfall measurements. Before the model was built, the accuracy of the daily and monthly scale 3B42RT products was evaluated against rain gauge measurements. The accuracy of the GWR model was then validated.
Both daily and monthly 3B42RT products may contain errors compared to in-situ rainfall observations. The accuracy of the monthly 3B42RT products was slightly higher than that of the daily scale products. The ground-based and satellite-based GWR model improved the correlation between in-situ rainfall observations and remotely sensed rainfall measurements in the Qaidam Basin. The correlation coefficients between GWR-estimated precipitation and in-situ rainfall observations at daily and monthly scales were 0.6538 and 0.8910, with root mean square errors of 1.63 and 12.47, respectively. The GWR model accounted for the error in the 3B42RT rainfall products.
Temporally, the annual seasonal rainfall distribution derived from the GWR model was generally consistent with the actual rainfall distribution, although the accuracy varied among sites. Spatially, the monthly rainfall estimation accuracy of the GWR model was higher than that of the daily scale results. However, the GWR model accuracy differed among locations due to error in the 3B42RT data.
The GWR model integrates ground-based and satellite-based measurements, and thus is applicable in the Qaidam Basin, in which there is a sparse network of rainfall gauges. The model results with a 1-km spatial resolution can achieve precipitation assessments in remote areas, which is valuable for spatial simulations of hydrological conditions and climate change.
Further research and development of the GWR model is currently being conducted. The current model merges ground-based rainfall observations, 3B42RT rainfall products, altitude, and geographical location information. A model that combines surface observations, 3B42RT and elevation data, and other observations (e.g., radar data and land cover types) will help to further improve the accuracy of precipitation distribution estimations. Estimations made by the GWR model over various time scales also need to be further studied. In this study, the accuracy of the GWR model was validated through the cross-validation method and randomly selected meteorological sites. A more comprehensive validation method should be investigated to ensure the effectiveness of rainfall estimations. The PRISM dataset was shown to be an accurate representation of spatial climate patterns. Further research and validation of the GWR model should include comparisons with PRISM to investigate its suitability for different regions and time scales. In addition, our GWR rainfall model did not take the temporal correlation of rainfall into account. Spatial-temporal interpolation and modeling (e.g., ensemble Kalman filtering and fixed rank filtering) will be applied to the Qaidam Basin in future research.
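The gauge-based cross-validation mentioned above can be sketched as follows. The text does not specify which cross-validation scheme was used, so leave-one-out is an assumption. Inverse-distance weighting (`idw`) is only a stand-in predictor for illustration and is not the GWR model. All names and values are hypothetical.

```python
import numpy as np

def leave_one_out(coords, values, predict):
    """Predict each gauge from all the others and return the held-out predictions."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    preds = np.empty(len(values))
    for i in range(len(values)):
        keep = np.arange(len(values)) != i  # drop gauge i from the training set
        preds[i] = predict(coords[keep], values[keep], coords[i])
    return preds

def idw(train_coords, train_values, target, power=2.0):
    """Inverse-distance weighting, used here only as a stand-in predictor."""
    d = np.linalg.norm(train_coords - target, axis=1)
    if np.any(d == 0):
        return float(train_values[np.argmin(d)])
    w = 1.0 / d ** power
    return float(np.sum(w * train_values) / np.sum(w))

# Hypothetical gauge layout (km) and rainfall values (mm).
coords = [(0, 0), (0, 10), (10, 0), (10, 10), (5, 5)]
rain = [10.0, 12.0, 14.0, 16.0, 13.0]
held_out = leave_one_out(coords, rain, idw)
mae = float(np.mean(np.abs(held_out - np.asarray(rain))))
print(held_out, mae)
```

Any predictor with the same call signature, including a GWR fit, could be passed in place of `idw`. With few gauges, each held-out prediction rests on a small training set, so the resulting error estimate is itself uncertain.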

Figure 2. Scatter plots between raw 3B42RT and ground-measured precipitation (a); resampled 3B42RT and ground-measured precipitation (b); and GWR-estimated and ground-measured precipitation (c) on a daily scale.

Figure 5. Scatter plots between raw 3B42RT and ground-measured precipitation (a); resampled 3B42RT and ground-measured precipitation (b); and GWR-estimated and ground-measured precipitation (c) on a monthly scale.

Table 2. Estimation accuracy of precipitation distribution estimated with GWR compared with resampled 3B42RT data.

Table 3. Precipitation distribution estimation accuracy of the GWR model.

Table 4. Skill index of the GWR model compared to the original 3B42RT data.