Comparisons of Spatially Downscaling TMPA and IMERG over the Tibetan Plateau

Accurate precipitation data are crucial in many applications such as hydrology, meteorology, and ecology. Compared with ground observations, satellite-based precipitation estimates provide much more spatial information to characterize precipitation. In this study, the satellite-based precipitation products of the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (IMERG) and the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) were first evaluated over the Tibetan Plateau (TP) in 2015 against ground observations at both annual and monthly scales. Second, a random forest algorithm was used to obtain annual downscaled results (~1 km) based on the IMERG and TMPA data, and the downscaled results were examined against rain gauge data. Third, a disaggregation algorithm was used to obtain monthly downscaled results from those at the annual scale. The results indicated that (1) IMERG performed better than TMPA at both annual and monthly scales; (2) IMERG had few anomalies, while TMPA displayed a significant number of outliers in the central and western parts of the TP; (3) random forest was a promising algorithm for acquiring high-resolution precipitation data with improved accuracy; and (4) the downscaled results based on IMERG performed better than those based on TMPA.


Introduction
Precipitation plays a significant role in the global water cycle and energy exchanges [1]. Accurate precipitation information is highly desirable in various scientific and applied fields such as water resource management, weather prediction, and disaster monitoring and control [2]. Currently, there are three kinds of independent instruments for obtaining precipitation data: rain gauges, weather radar and satellite-based sensors [3]. Although conventional point-based measurements from rain gauges provide relatively accurate rainfall values at the point scale, they are not suitable for providing continuous spatial precipitation distributions [4]. Weather radar can obtain rainfall data at finer spatiotemporal resolutions. However, weather radar suffers from errors from various sources, such as beam blockage in complex terrain and range degradation [5]. In contrast, satellite-based remote sensing has great potential to provide comprehensive estimates of precipitation globally with reasonable spatiotemporal resolution and accuracy.
Several projects, including the Global Precipitation Climatology Project (GPCP) [6][7][8], the Global Satellite Mapping of Precipitation (GSMaP) project [9], and the Tropical Rainfall Measuring Mission (TRMM) [10][11][12], have produced satellite-based precipitation estimates and published freely available precipitation products with different temporal and spatial resolutions, such as the TRMM Multisatellite Precipitation Analysis (TMPA) data set [12]. The TRMM, carrying a single-frequency precipitation radar (PR; the Ku-band at 13.8 GHz) and a multichannel TRMM Microwave Imager (TMI; frequencies ranging between 10 and 85.5 GHz), has provided a range of precipitation products since 1997 [10] and was intended to provide the 'best' satellite precipitation estimates [12]. TMPA data have been widely applied in various fields, such as hydrological modeling [13], flood prediction [14] and climatology [15], over the past two decades.
Following the success of TRMM, the Global Precipitation Measurement (GPM) Core Observatory was launched in February 2014 as a joint effort of the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA). The GPM Core Observatory carries a dual-frequency precipitation radar (DPR; the Ku-band at 13.6 GHz and the Ka-band at 35.5 GHz) and a conical-scanning multichannel GPM Microwave Imager (GMI; frequencies ranging between 10 and 183 GHz). Compared with TRMM's instruments, GPM extended its sensor package and can detect light and solid precipitation more accurately than the TRMM sensors [2,16]. Numerous studies have examined the accuracy of GPM rainfall products. The performance of the Integrated Multi-satellitE Retrievals for GPM (IMERG) and TMPA products was evaluated over mainland China at various spatiotemporal resolutions [17]. In southeast India, the accuracy of IMERG data was validated and compared with that of TMPA and GSMaP data [18]. In Far East Asia, the IMERG and TMPA data were employed to evaluate topographical and seasonal precipitation features [19]. These studies mainly demonstrated that IMERG provided more accurate rainfall estimates than TMPA.
However, the spatial resolution of TMPA data (0.25°) is too coarse to meet the demands of some research, especially at the regional and basin scales. As a result, numerous downscaling models combining different environmental variables as auxiliary data have been explored to derive precipitation estimates at the 1 km scale. To downscale TMPA data, an exponential regression model between TMPA and the normalized difference vegetation index (NDVI) was first applied in [20]. Based on this downscaling method, a statistical regression downscaling algorithm introducing the NDVI and the digital elevation model (DEM) was developed [21]. The regression kriging method was also found to be feasible for downscaling TMPA data [22]. In [23], a downscaling algorithm based on geographically weighted regression (GWR), which considers a regional regression model, was introduced. Recently, a spatial data mining algorithm that introduced a series of land surface variables (NDVI, land surface temperature (LST) for both day (LST-d) and night (LST-n), DEM, etc.) was proposed [24,25]. The resolution of IMERG (0.1°) is likewise still too coarse, and it is important to downscale IMERG data to a finer resolution (~1 km) for applications at the regional or basin scales. Many studies have evaluated the quality and accuracy of IMERG data, but few have focused on the downscaling performance of IMERG.
Random forest (RF) is a nonparametric statistical regression algorithm developed from the classification and regression tree (CART). In this study, the IMERG and TMPA data were used to obtain downscaled results at the 1 km scale based on nonparametric regression models constructed using RF. Moreover, both the original satellite-based precipitation estimates (TMPA and IMERG) and the downscaled results were further analyzed and compared to reveal their differences at different spatial scales.
The objectives of this study were as follows: (1) to compare the quality of TMPA and IMERG over the TP at both annual and monthly temporal scales against ground observations; (2) to obtain downscaled results (~1 km) based on TMPA and IMERG data, respectively; and (3) to generate the monthly downscaled results based on those at annual scale.

Study Area
The TP is located in southwestern China (Figure 1) between 73°-104° E and 26°-40° N. It is the highest plateau on Earth. The extent of the TP is approximately 2.5 million km², with an average elevation of >4000 m above mean sea level. Because of the complex topography and extremely high elevation, the precipitation on the TP exhibits substantial spatiotemporal variability [26]. Spatially, much of the precipitation occurs in the central and southern parts of the plateau, with little in the western and northern parts. Temporally, the amount of rainfall differs seasonally. Spring and summer receive approximately 90% of the annual precipitation because of the effects of the Indian monsoon, which brings substantial water vapor from the sea. In winter, the westerlies influence the western part of the plateau, resulting in little rainfall. The TP contains the upstream reaches of five major Asian rivers, namely the Yellow, Brahmaputra, Ganges, Indus and Yangtze Rivers [27,28], which supply water resources for more than 1.4 billion people [29].

Rain Gauges
The rain gauge data used in this study were provided by the Third Pole Environment Database (http://en.tpedatabase.cn/portal/MetaDataInfo.jsp?MetaDataId=249472). This dataset provides daily precipitation data from 1979 to 2015 for the TP and its surroundings. We obtained precipitation records from 113 rain gauge stations, which are distributed unevenly across the TP (Figure 1). Daily precipitation records from January 2015 through December 2015 were used in this study. The monthly and annual precipitation totals for each station were calculated by accumulating the daily precipitation.

TRMM Satellite Precipitation Dataset
The TRMM was launched in November 1997 through cooperation between NASA and JAXA to monitor and investigate tropical and subtropical rainfall systems. The TMPA employed inputs from two different types of satellite sensors. The primary data source was precipitation-related microwave sensors, including the TMI, the Special Sensor Microwave/Imager (SSM/I) and Special Sensor Microwave Imager/Sounder (SSMIS), the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E), the Advanced Microwave Sounding Unit (AMSU) and the Microwave Humidity Sounders.
The NDVI is typically used to determine vegetation productivity and distribution [31,32], and correlates well with precipitation at the annual temporal scale [25,33,34]. The MOD13A3 monthly NDVI product from January 2015 to December 2015, with a spatial resolution of 1 km, was employed in this study. MOD13A3 is derived from atmospherically corrected reflectance in the red and near-infrared wavebands of the Moderate Resolution Imaging Spectroradiometer (MODIS) on the Terra satellite, and can be downloaded at https://ladsweb.modaps.eosdis.nasa.gov/search/. The annual NDVI value was obtained by averaging the monthly NDVI values over the year.

Land Surface Temperature Datasets
Land surface temperature datasets were collected by the MODIS sensors aboard the Terra and Aqua satellites. MODIS provides global day and night land surface temperature records with errors controlled between −1 K and 1 K. MOD11A2 products with a spatial resolution of 1 km and an eight-day temporal resolution were obtained from https://lpdaac.usgs.gov/dataset_discovery/modis/modis_products_table/mod11a2_v006. In 2015, there were forty-five 8-day periods; we averaged the day and night land surface temperatures over the TP across these periods to determine the year-round land surface temperature for day (LST-d) and night (LST-n).

Topography Datasets
The Shuttle Radar Topography Mission (SRTM) is an international project spearheaded by the National Geospatial-Intelligence Agency (NGA) and NASA, and it was launched in February 2000. The SRTM project provided high-resolution DEM data with a spatial coverage ranging from 56° S to 60° N and a spatial resolution of one arc second (~30 m). In this study, the dataset with a resolution of three arc seconds (approximately 90 m) was adopted as the basic topographical data (http://srtm.csi.cgiar.org). Other topographical parameters, including slope [35], aspect [36], curvature [37], radiation [38], slope length and steepness (LS) [39], the topographic wetness index (TWI) [40], the multiresolution valley bottom flatness index (MrVBF) [41] and the terrain ruggedness index [42], were derived from the DEM data. Curvature, radiation, and ruggedness are abbreviated as Curv, Radi, and Rugg, respectively.

Random Forest
RF was developed as an extension of CART to improve the accuracy and stability of the CART model [43]. To perform a regression in CART, the model searches every distinct value of every predictor in the entire data set ($S$) to find the optimal predictor and split value that partition the data into two groups ($S_1$ and $S_2$), such that the overall sum of squared errors (SSE) is minimized [44]:
$$SSE = \sum_{i \in S_1} (y_i - \bar{y}_1)^2 + \sum_{i \in S_2} (y_i - \bar{y}_2)^2 \qquad (1)$$
where $\bar{y}_1$ and $\bar{y}_2$ represent the averages of the training set outcomes of groups $S_1$ and $S_2$, respectively. The process continues within sets $S_1$ and $S_2$ until the number of samples in the splits falls below some threshold.
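The split search described above can be illustrated with a minimal sketch. It is not the implementation used in the study; the function names (`sse`, `best_split`), the use of midpoints as candidate thresholds and the toy data are assumptions made for illustration only.

```python
from statistics import fmean

def sse(values):
    """Sum of squared deviations of the values from their own mean."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(X, y):
    """Search every predictor and every distinct value for the split that
    minimises SSE(S1) + SSE(S2).

    X: list of rows (one list of predictor values per sample); y: outcomes.
    Returns (predictor_index, threshold, total_sse); rows with
    value <= threshold fall in S1, the others in S2.
    """
    best = (None, None, float("inf"))
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values
        # (an illustrative convention, not taken from the paper).
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            s1 = [yi for row, yi in zip(X, y) if row[j] <= t]
            s2 = [yi for row, yi in zip(X, y) if row[j] > t]
            total = sse(s1) + sse(s2)
            if total < best[2]:
                best = (j, t, total)
    return best

# Hypothetical toy data: [elevation (m), NDVI] -> annual precipitation (mm)
X = [[3000, 0.6], [3200, 0.5], [4500, 0.2], [4800, 0.1]]
y = [600.0, 580.0, 200.0, 180.0]
print(best_split(X, y))
```

A full CART tree would apply `best_split` recursively to each resulting group until the stopping threshold on group size is reached.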
To reduce the variance of the estimates, the bagging (short for bootstrap aggregation) strategy was proposed. The bagging method generates m bootstrap samples of the original data and trains a tree model on each bootstrap sample. Each model is then used to generate a prediction for a new sample, and these m predictions are averaged to obtain the bagged model's prediction. However, the trees in bagging are not completely independent of each other because of the underlying relationship between the predictors and responses. The trees from different bootstrap samples may have similar structures, which prevents bagging from optimally reducing the variance of the predicted values. Random forest was developed to reduce the correlation between trees by randomly selecting predictors at each split.
The main steps of the random forest algorithm were as follows.
(1) Randomly generate ntree bootstrap samples from the original dataset. The elements not selected are referred to as 'out of bag' (OOB) samples.
(2) For each split, randomly select $m_{try}$ of the original predictors and choose the best predictor among them to partition the dataset. $m_{try}$ is the tuning parameter of the model.
(3) Predict new data (OOB elements) by averaging the predictions of the ntree trees.
(4) Use the OOB samples to estimate the prediction error. The OOB estimate of the error rate [45] was calculated as follows:
$$ER_{OOB} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - f_{oob}[X_i] \right)^2 \qquad (2)$$
where $f_{oob}[X_i]$ represents the OOB prediction for the ith observation, $y_i$ is the corresponding observed value, and n is the number of samples.
Random forest can also provide a measure of variable importance. One approach is to look at the increase in the OOB error estimate when a specific predictor variable is randomly permuted while the other predictors are kept unchanged. The more the error increases, the more important the variable is. These variable importance values are used to rank the predictors in terms of their relative contribution to the model. The random forest model was generated using the randomForest package in R (https://cran.r-project.org/web/packages/randomForest/).
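The bootstrap and OOB steps described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the randomForest implementation: it uses a single predictor and a one-split tree (`fit_stump`, a hypothetical stand-in for a full CART tree), so the random selection of $m_{try}$ predictors is not exercised, and the OOB error is computed as a mean squared error, which is an assumption about the form of the error estimate.

```python
import random
from statistics import fmean

def fit_stump(xs, ys):
    """One-split regression tree on a single predictor (stand-in for CART)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = fmean(left), fmean(right)
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, ml, mr)
    if best is None:
        # Bootstrap sample with one distinct x value: predict the mean.
        m = fmean(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def bagged_oob_mse(xs, ys, ntree=200, seed=0):
    """Grow ntree trees on bootstrap samples and return the OOB mean squared error."""
    rng = random.Random(seed)
    n = len(xs)
    oob_preds = [[] for _ in range(n)]
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        in_bag = set(idx)
        tree = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # sample i is out of bag for this tree
                oob_preds[i].append(tree(xs[i]))
    # Average the OOB predictions per sample, then compare with observations.
    pairs = [(ys[i], fmean(p)) for i, p in enumerate(oob_preds) if p]
    return sum((y - f) ** 2 for y, f in pairs) / len(pairs)

# Hypothetical toy data with a sharp step in the response
xs = list(range(20))
ys = [100.0 if x < 10 else 500.0 for x in xs]
print(bagged_oob_mse(xs, ys))
```

Because each sample is out of bag for roughly a third of the trees, the OOB error comes at no extra cost beyond fitting the forest, which is why it is convenient for tuning.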

Main Downscaling Steps at the Annual Scale
RF was used to downscale the TMPA and IMERG data over the TP at the annual scale. The main steps in the downscaling process were as follows:
(1) Sum the TMPA and IMERG monthly precipitation to annual precipitation ($P^{0.25^\circ}_{TMPA}$, $P^{0.1^\circ}_{IMERG}$).
(2) Aggregate the land surface variables from their original resolutions to resolutions of 0.25° and 0.1°, respectively.
(3) Establish the random forest model $RF_{TMPA}$ between the land surface variables at 0.25° and $P^{0.25^\circ}_{TMPA}$, and establish the random forest model $RF_{IMERG}$ between the land surface variables at 0.1° and $P^{0.1^\circ}_{IMERG}$.
(4) Estimate annual precipitation at 0.25° using $RF_{TMPA}$, and estimate annual precipitation at 0.1° using $RF_{IMERG}$.
(5) Map the residuals at 0.25° ($P^{0.25^\circ}_{resi}$) by computing the difference between $P^{0.25^\circ}_{TMPA}$ and the estimated annual precipitation at 0.25°. Similarly, map the residuals at 0.1° ($P^{0.1^\circ}_{resi}$) by computing the difference between $P^{0.1^\circ}_{IMERG}$ and the estimated annual precipitation at 0.1°.
(6) Obtain the residuals at 1 km from those at 0.25° and 0.1° ($P^{0.25^\circ}_{resi}$ and $P^{0.1^\circ}_{resi}$) using a simple spline tension interpolator. Spline interpolation is typically used for regularly spaced data such as these residuals.
(7) Generate the annual precipitation estimates at 1 km, termed the downscaled results before residual correction, using $RF_{TMPA}$ and $RF_{IMERG}$, respectively, in combination with the land surface variables at 1 km.
(8) Add the residuals at 1 km obtained in step (6) to the downscaled results before residual correction obtained in step (7) to obtain the final downscaled results after residual correction.
The steps used for downscaling are shown in Figure 2.
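The flow of steps (2) to (8) can be illustrated with a one-dimensional toy sketch. It is deliberately simplified and does not reproduce the study's implementation: a least-squares line (`fit_linear`) stands in for the RF regression, linear interpolation between cell centres (`interp_to_fine`) stands in for the spline tension interpolator, and a single covariate replaces the full set of land surface variables. All function names and data are hypothetical.

```python
from statistics import fmean

def block_mean(fine, factor):
    """Aggregate a 1-D fine-resolution series to coarse cells by averaging."""
    return [fmean(fine[i:i + factor]) for i in range(0, len(fine), factor)]

def fit_linear(x, y):
    """Least-squares line y = a + b*x (placeholder for the RF regression)."""
    mx, my = fmean(x), fmean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return lambda v: a + b * v

def interp_to_fine(coarse, factor):
    """Linear interpolation between coarse cell centres (placeholder for the
    spline interpolator); values are held constant beyond the outer centres."""
    if len(coarse) == 1:
        return [coarse[0]] * factor
    out = []
    for k in range(len(coarse) * factor):
        pos = (k + 0.5) / factor - 0.5  # position in coarse-centre coordinates
        pos = min(max(pos, 0.0), len(coarse) - 1.0)
        i = min(int(pos), len(coarse) - 2)
        w = pos - i
        out.append((1 - w) * coarse[i] + w * coarse[i + 1])
    return out

def downscale(p_coarse, cov_fine, factor):
    cov_coarse = block_mean(cov_fine, factor)                     # step (2)
    model = fit_linear(cov_coarse, p_coarse)                      # step (3)
    est_coarse = [model(c) for c in cov_coarse]                   # step (4)
    resid_coarse = [p - e for p, e in zip(p_coarse, est_coarse)]  # step (5)
    resid_fine = interp_to_fine(resid_coarse, factor)             # step (6)
    est_fine = [model(c) for c in cov_fine]                       # step (7)
    return [e + r for e, r in zip(est_fine, resid_fine)]          # step (8)

# Hypothetical example: two coarse cells, each covering two fine cells
print(downscale([200.0, 600.0], [0.1, 0.3, 0.5, 0.7], factor=2))
```

The residual correction matters because the regression fitted at the coarse scale never reproduces the satellite estimates exactly; adding the interpolated residuals back keeps the downscaled field anchored to the original product.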

Disaggregation from Annual Precipitation to Monthly Precipitation
Considering the time lag of the response of the NDVI to precipitation [46], we used a disaggregation algorithm to obtain the monthly downscaled results. The disaggregation algorithm was based on the assumption that the monthly ratios are as defined in Equation (3) [47]:
$$Ratio_i = \frac{OrgSat\_Pre_i}{\sum_{j=1}^{12} OrgSat\_Pre_j} \qquad (3)$$
The numerator $OrgSat\_Pre_i$ represents the precipitation in the ith month estimated from the original satellite precipitation product, and the denominator represents the annual precipitation. Note that the spatial resolution of $Ratio_i$ is the same as that of the original satellite precipitation data; the ratios were further interpolated to 1 km by spline interpolation, consistent with the downscaled annual results. The monthly downscaled results were then obtained by multiplying the monthly ratios (~1 km) by the annual downscaled results.
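For a single pixel, the ratio-based disaggregation can be sketched as follows. The function names and the monthly values are hypothetical, the handling of a pixel with zero annual precipitation is an assumption, and the spline interpolation of the ratios to 1 km is omitted.

```python
def monthly_ratios(monthly_sat):
    """Ratio_i = monthly satellite precipitation / annual total (one pixel)."""
    annual = sum(monthly_sat)
    if annual == 0:
        # Assumption: a pixel with no annual precipitation gets zero ratios.
        return [0.0] * len(monthly_sat)
    return [m / annual for m in monthly_sat]

def disaggregate(annual_downscaled, ratios):
    """Split a downscaled annual total into monthly values using the ratios."""
    return [annual_downscaled * r for r in ratios]

# Hypothetical monthly satellite estimates (mm) for one coarse pixel
monthly_sat = [2, 3, 5, 10, 30, 80, 120, 100, 60, 15, 4, 1]
ratios = monthly_ratios(monthly_sat)
monthly_downscaled = disaggregate(500.0, ratios)
print([round(v, 1) for v in monthly_downscaled])
```

Because the ratios sum to one, the monthly values add up to the downscaled annual total, so the seasonal cycle comes from the satellite product while the spatial detail comes from the annual downscaling.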

Validation
Validation statistics, including the coefficient of determination (R²), bias, the mean absolute error (MAE) and the root mean square error (RMSE), were used to assess the performance of the original satellite-based precipitation products and the downscaled results against precipitation observed at rain gauges. R² indicates the strength of the linear relationship between the precipitation estimates and ground observations, ranging between 0 and 1 with an optimal value of 1. Bias denotes the degree of over- or underestimation, with a perfect value of 0. A positive bias represents overestimation, indicating that the satellite value is higher than the gauge value. MAE shows the mean magnitude of the errors, and RMSE was used to represent the distribution of errors. Smaller values of MAE and RMSE indicate more accurate and precise results. The perfect score for both MAE and RMSE is 0.
In these statistics, $P_i$ is the satellite precipitation estimate, $M_i$ is the precipitation measured at the rain gauge stations, and n is the total number of samples.
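The formulas for these statistics are not reproduced in the text above. A plausible set of definitions, given here as a sketch under stated assumptions, uses the standard forms: R² is assumed to be the squared Pearson correlation, and bias is assumed to be the relative bias expressed as a percentage, which is consistent with the percentage values reported in the results. $\bar{P}$ and $\bar{M}$ denote the means of $P_i$ and $M_i$ and are introduced here for illustration.

```latex
\begin{align}
R^2 &= \frac{\left[\sum_{i=1}^{n} (P_i - \bar{P})(M_i - \bar{M})\right]^2}
            {\sum_{i=1}^{n} (P_i - \bar{P})^2 \, \sum_{i=1}^{n} (M_i - \bar{M})^2} \\
\mathrm{bias} &= \frac{\sum_{i=1}^{n} (P_i - M_i)}{\sum_{i=1}^{n} M_i} \times 100\% \\
\mathrm{MAE} &= \frac{1}{n} \sum_{i=1}^{n} \left| P_i - M_i \right| \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} (P_i - M_i)^2}
\end{align}
```

Under these definitions, MAE and RMSE carry the units of precipitation (mm), while R² and bias are dimensionless.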

Original Results and Validations Using Ground Observations at Annual and Monthly Scales
Figure 3a-c represented the annual precipitation maps derived from original TMPA, resampled IMERG and original IMERG in the year 2015 over the TP, respectively.Over the region, both TMPA and IMERG data, as well as the resampled IMERG, captured the typical southeast-northwest decreasing precipitation patterns, which were consistent with the patterns measured by rain gauges displayed in Figure 1.TMPA and IMERG patterns were close as they utilized similar sensors including IR and PMW sensors, to retrieve precipitation.However, TMPA presented some exceptional values in the western and northern parts of the TP relative to the neighboring pixels.These anomalies did not appear in IMERG estimates resulting in more continuous precipitation trends.To compare the performance of the original TMPA data with resampled IMERG, as well as the original IMERG data, we validated these products against ground observations from 113 rain gauge stations on the TP in 2015.Figure 4a shown the scatterplot of the validation between ground observations and the original TMPA data at 0.25° resolution.Figure 4b,c demonstrate the validations of resampled IMERG data at 0.25° resolution and original IMERG data at 0.1° resolution, against ground observations, respectively.Both satellite precipitation products generally overestimated precipitation over the TP compared to ground observations.The original TMPA was the mostly overestimated product compared to ground observations (R 2 ~ 0.67, bias ~ 22.40%).IMERG shown improved accuracy (R 2 ~ 0.74, bias ~ 12.23%) at 0.1° resolution.The resampled IMERG data also outperformed the original TMPA data, with R 2 ~ 0.73 and bias ~ 12.40%, at the same spatial resolution of 0.25°.To compare the performance of the original TMPA data with resampled IMERG, as well as the original IMERG data, we validated these products against ground observations from 113 rain gauge stations on the TP in 2015.Figure 4a shown the scatterplot of the validation between ground 
observations and the original TMPA data at 0.25 • resolution.Figure 4b,c demonstrate the validations of resampled IMERG data at 0.25 • resolution and original IMERG data at 0.1 • resolution, against ground observations, respectively.Both satellite precipitation products generally overestimated precipitation over the TP compared to ground observations.The original TMPA was the mostly overestimated product compared to ground observations (R 2 ~0.67, bias ~22.40%).IMERG shown improved accuracy (R 2 ~0.74, bias ~12.23%) at 0.1 • resolution.The resampled IMERG data also outperformed the original TMPA data, with R 2 ~0.73 and bias ~12.40%, at the same spatial resolution of 0.25  To compare the performance of the original TMPA data with resampled IMERG, as well as the original IMERG data, we validated these products against ground observations from 113 rain gauge stations on the TP in 2015.Figure 4a shown the scatterplot of the validation between ground observations and the original TMPA data at 0.25° resolution.Figure 4b,c demonstrate the validations of resampled IMERG data at 0.25° resolution and original IMERG data at 0.1° resolution, against ground observations, respectively.Both satellite precipitation products generally overestimated precipitation over the TP compared to ground observations.The original TMPA was the mostly overestimated product compared to ground observations (R 2 ~ 0.67, bias ~ 22.40%).IMERG shown improved accuracy (R 2 ~ 0.74, bias ~ 12.23%) at 0.1° resolution.The resampled IMERG data also outperformed the original TMPA data, with R 2 ~ 0.73 and bias ~ 12.40%, at the same spatial resolution of 0.25°.Figure 5 displayed the monthly mean precipitation from 113 rain gauges, the original TMPA, resampled IMERG and the original IMERG, respectively.Both TMPA and IMERG successfully captured the precipitation dynamics intra-annually, with heavy precipitation occurred at June, July, August and September (referred as wet seasons), while little precipitation 
happened in November, December, January and February (referred as dry seasons).IMERG estimates were better than TMPA estimates, due to its values were closer to the ground measured precipitation amounts.Table 1 presented the statistical results at monthly scale.Both TMPA and IMERG products performed better during the wet seasons and performed worse during the dry seasons.During dry seasons, the volume of precipitation was lower, and the durations of the precipitation was shorter in time which made the detection more complicated by PWM sensor.The performance of IMERG was slightly better than TMPA during the dry seasons with relatively higher values of R 2 and lower values of bias, MAE and RMSE, respectively.One of the possible reasons for the improvements may attribute to the extended sensors that equipped on IMERG (DPR and GMI) than sensors equipped on TMPA (PR and TMI), leading to a more accurate detection of light and solid precipitation.Another likely reason to explain why IMERG performed better than TMPA during the dry season was that the different PWM retrievals.IMERG employed the GPROF2014 while TMPA employed GPROF2010.GPROF2014 was Bayesian in nature, and used "dynamic" detection threshold values, compared with GPROF2010 of regression and relative "static" detection threshold values.Therefore, Figure 5 displayed the monthly mean precipitation from 113 rain gauges, the original TMPA, resampled IMERG and the original IMERG, respectively.Both TMPA and IMERG successfully captured the precipitation dynamics intra-annually, with heavy precipitation occurred at June, July, August and September (referred as wet seasons), while little precipitation happened in November, December, January and February (referred as dry seasons).IMERG estimates were better than TMPA estimates, due to its values were closer to the ground measured precipitation amounts.Figure 5 displayed the monthly mean precipitation from 113 rain gauges, the original TMPA, resampled IMERG and the 
original IMERG, respectively. Both TMPA and IMERG successfully captured the intra-annual precipitation dynamics, with heavy precipitation occurring in June, July, August and September (referred to as the wet seasons) and little precipitation in November, December, January and February (referred to as the dry seasons). IMERG estimates were better than TMPA estimates because their values were closer to the ground-measured precipitation amounts.
Table 1 presented the statistical results at the monthly scale. Both TMPA and IMERG products performed better during the wet seasons and worse during the dry seasons. During dry seasons, the volume of precipitation was lower and the precipitation events were shorter, which made detection by PMW sensors more difficult. IMERG performed slightly better than TMPA during the dry seasons, with relatively higher values of R² and lower values of bias, MAE and RMSE. One possible reason for the improvement is that the sensors available to IMERG (DPR and GMI) were more advanced than those available to TMPA (PR and TMI), leading to more accurate detection of light and solid precipitation. Another likely reason why IMERG performed better than TMPA during the dry seasons was the difference in PMW retrievals. IMERG employed GPROF2014 while TMPA employed GPROF2010. GPROF2014 was Bayesian in nature and used "dynamic" detection thresholds, whereas GPROF2010 was regression-based and used relatively "static" detection thresholds. Therefore, GPROF2014 resulted in far fewer false alarms. During wet seasons, both TMPA and IMERG corresponded well with ground observations, with R² values of approximately 0.70. IMERG was still more accurate (bias ~10%, MAE ~22 mm and RMSE ~32 mm) than TMPA (bias ~20%, MAE ~25 mm, RMSE ~37 mm). For the resampled IMERG data at 0.25° resolution, the statistical results were very close to those of the original IMERG at 0.1°, and thus the resampled data provided more accurate precipitation estimates than TMPA at 0.25°. Based on these evaluations, we concluded that IMERG data were more accurate than TMPA data, with anomalies removed at the annual scale.
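The statistics used in this evaluation can be illustrated with a short Python sketch. The exact formulas used in the study are defined in an earlier section that is not repeated here, so the definitions below are assumptions: R² is taken as the squared Pearson correlation coefficient and bias as the relative difference of the totals. The function name `evaluate` and the sample values are hypothetical.

```python
import math


def evaluate(sat, obs):
    """Compare satellite estimates with gauge observations (same length, same units).

    Assumed definitions (not taken from the paper):
      R2   = squared Pearson correlation coefficient
      bias = sum(sat - obs) / sum(obs), i.e. relative bias
    """
    n = len(obs)
    mean_s = sum(sat) / n
    mean_o = sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sat, obs))
    var_s = sum((s - mean_s) ** 2 for s in sat)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    errors = [s - o for s, o in zip(sat, obs)]
    return {
        "R2": cov ** 2 / (var_s * var_o),
        "bias": sum(errors) / sum(obs),
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e ** 2 for e in errors) / n),
    }


# Hypothetical monthly totals (mm) at four gauges.
gauge = [10.0, 20.0, 30.0, 40.0]
satellite = [12.0, 22.0, 33.0, 44.0]
stats = evaluate(satellite, gauge)
print({name: round(value, 3) for name, value in stats.items()})
```

A positive bias in this sketch means the satellite product overestimates the gauge totals, which matches the sign convention implied by the text.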

Main Results in the Downscaling Procedure Using Random Forest
The key parameters affecting the predictive performance of the RF model were ntree and m_try. To determine the optimal parameter values, the OOB error rate, as defined in Equation (2), was applied. The OOB error rate was an internal estimate of the predictive performance provided by the bagging model. The OOB error usually correlated well with the predictive performance obtained from cross-validation or a test set. In addition, using the OOB error rate could decrease the computational time required to tune the RF model.
Because the RF was protected from overfitting [43], the model was not adversely affected when a large number of trees were built for the forest. Figure 6a,b indicated that the OOB error in RF_TMPA and RF_IMERG did not decrease significantly when ntree increased from 900 to 1500. To balance accuracy and computational burden, 1000 was finally selected as the value of ntree for both RF_TMPA and RF_IMERG.
To specify the m_try parameter for RF_TMPA and RF_IMERG, we tested five values of m_try that ranged from 2 to 10 and independently generated 1000 tree models at each m_try value. The boxplots of the OOB error for RF_TMPA and RF_IMERG were presented in Figure 7a,b, respectively. The lower and upper edges of the central box were the first and third quartiles (25% and 75%, respectively). The band and the circle inside the box represented the median and mean OOB error, respectively. For RF_TMPA, the OOB error decreased when m_try increased from 2 to 6 and then increased as m_try increased from 6 to 10. For RF_IMERG, the OOB error decreased when m_try increased from 2 to 8. Therefore, a combination of ntree = 1000 and m_try = 6 was applied to build the model and downscale the TMPA data, and ntree = 1000 and m_try = 8 was employed to build the model and downscale the IMERG data over the TP.
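The OOB-based tuning described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the authors' implementation: `n_estimators` and `max_features` stand in for ntree and m_try, the OOB error is taken as the mean squared error of the out-of-bag predictions (Equation (2) is not reproduced here, so this definition is an assumption), and the helper names `oob_mse` and `tune_mtry` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def oob_mse(X, y, ntree, mtry, seed=0):
    """Fit a random forest and return the MSE of its out-of-bag predictions."""
    rf = RandomForestRegressor(
        n_estimators=ntree,   # number of trees (ntree)
        max_features=mtry,    # predictors tried at each split (m_try)
        bootstrap=True,
        oob_score=True,       # store OOB predictions for the training samples
        random_state=seed,
    )
    rf.fit(X, y)
    return float(np.mean((y - rf.oob_prediction_) ** 2))


def tune_mtry(X, y, ntree, mtry_values, seed=0):
    """Return the m_try value with the lowest OOB error and all errors."""
    errors = {m: oob_mse(X, y, ntree, m, seed) for m in mtry_values}
    best = min(errors, key=errors.get)
    return best, errors


# Synthetic stand-in for coarse-resolution predictors and annual precipitation.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 10))
y = (400 + 80 * X[:, 0] - 50 * X[:, 1] + 30 * X[:, 2] * X[:, 3]
     + rng.normal(scale=10, size=300))

# ntree is kept small here so the sketch runs quickly.
best_mtry, oob_errors = tune_mtry(X, y, ntree=200, mtry_values=[2, 4, 6, 8, 10])
print(best_mtry, oob_errors)
```

Because the OOB predictions come from trees that did not see the sample during training, no separate validation split is needed, which is the computational saving the text refers to.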
Figure 8a,b demonstrated the variable importance in RF_TMPA and RF_IMERG, respectively. The results indicated that NDVI, DEM, LST-d and LST-n were the main factors contributing to the random forest models and that these factors influenced the spatial patterns and amounts of precipitation at the annual scale. The spatial distribution of the dominant factors influencing precipitation over the TP was first explored in [25]. In the central and northeastern TP, which were mainly covered by grassland and meadows (which were easily influenced by precipitation), NDVI was the most important factor. The dominant factors were LST-d and LST-n in the northwest and west, where temperatures were low and elevations were high, and where precipitation could be influenced through thermal and dynamic mechanisms. In the southeast, topography played the most important role in influencing precipitation. Orographic rainfall typically occurred in the southeast, where the topography varied greatly from valleys to high mountains, with elevations ranging from 100 to 4000 m.

Downscaled Results and Validations Using Ground Observations
The annual downscaled results at 1 km resolution using random forest models based on TMPA and IMERG data were presented in Figure 9b,c, respectively. We additionally obtained downscaled results at 0.1° based on TMPA, which were displayed in Figure 9a. The downscaled results based on both TMPA and IMERG captured the spatial distribution of annual precipitation seen in the original TMPA, original IMERG and gauge precipitation, showing a decreasing trend from south to north, and from west to east. Notably, the downscaled results based on TMPA data showed a contiguous precipitation map in which the anomalies that appeared in the original TMPA data in the western and northern parts of the TP were removed. The downscaled results based on IMERG data still displayed continuous variation and gradual trends without outliers.

The performance of the downscaled results was validated against ground observations. Both sets of downscaled results still overestimated precipitation compared with ground observations, but to a lesser degree, with reduced positive bias. The MAE and RMSE values also decreased and the R² values increased in the downscaled results compared with the corresponding satellite products, indicating the improved accuracy of the downscaled results. Figure 10a showed the validation of the downscaled results based on TMPA data at 10 km resolution, while Figure 10b,c showed the validations of the downscaled results based on TMPA and IMERG data at 1 km, respectively, against ground observations. The downscaled results based on IMERG at 1 km had the best performance (R² ~0.86, bias ~7.73%) among the three sets of downscaled results and showed improved accuracy compared with the original IMERG data (R² ~0.74, bias ~12.23%). The downscaled results based on TMPA at 10 km and at 1 km also showed improvements compared with the original TMPA data (R² ~0.67, bias ~22.40%), while the downscaled results based on TMPA at 1 km (R² ~0.82, bias ~11.63%) performed better than those based on TMPA at 10 km resolution (R² ~0.77, bias ~14.13%). The original IMERG data and the downscaled results based on TMPA at 10 km resolution performed remarkably similarly against ground observations, with R² ~0.75 and bias ~13%.
Based on the disaggregation algorithm introduced in Section 3.3, the monthly downscaled results based on TMPA and IMERG were obtained. Table 2 presented the statistical results at the monthly scale. The downscaled results based on IMERG at 1 km showed the best accuracy at the monthly scale, with R² ~0.82 and bias ~7% during wet seasons and R² ~0.62 and bias ~10% during dry seasons. The downscaled results based on TMPA at 1 km also corresponded well with ground observations during wet seasons (R² ~0.79 and bias ~10%) and showed relatively poorer performance during dry seasons (R² ~0.58 and bias ~15%). The performance of the downscaled results based on TMPA at 10 km resolution was equivalent to that of the original IMERG data at 10 km but better than that of the original TMPA data at 0.25°. Generally, the downscaled results based on TMPA and IMERG were more accurate than the original satellite-based precipitation products in each month.
Systematic anomalies were observed in the western and northern parts of the TP in the original TMPA 3B43 data. This phenomenon was also found in [25,48,49], which concluded that the systematic anomalies in the TP interior in satellite-based precipitation estimates were mainly caused by water bodies. The distribution of inland water bodies was shown in Figure 11c. It was obtained from the Global Lakes and Wetlands Database (GLWD) (downloaded from: https://www.worldwildlife.org/publications/global-lakes-and-wetlands-database-large-lake-polygons-level-1). The spatial distribution of the systematic anomalies was very close to that of the inland water bodies. We further investigated the precipitation estimates at the anomalies by comparing them with rain gauges located at or near the anomalies. Seven anomaly sites located near water bodies were analyzed, as shown in Figure 12.
It was obvious that TMPA data tended to overestimate precipitation over inland water bodies, with a mean bias of ~78%. In contrast, IMERG data agreed quite well with ground observations (bias ~10%). The PMW retrievals might be the main source of the anomalies, and the deficiencies of the PMW retrievals in the TMPA products (GPROF2010) led to the systematic overestimation. Because IMERG employed the latest version of the Goddard Profiling Algorithm (GPROF2014) to compute precipitation estimates from all PMW sensors onboard GPM satellites, it provided more consistent precipitation estimates over water bodies. In GPROF2014, the surface type was classified based on the monthly emissivity for all SSM/I frequencies. The inland water bodies were treated as independent classes and were trained in the Bayesian model, which was beneficial for estimating precipitation over water bodies. Thus, the unified and updated PMW algorithms used in IMERG products resulted in improved precipitation estimates over water bodies compared with TMPA products. Another possible reason for the improvement in the precipitation estimates over water bodies was that IMERG used GMI, which had a much finer resolution than the TMI onboard TRMM and could thus better identify the inland water bodies.
For the monthly downscaled results based on TMPA and IMERG data obtained using the random forest algorithm and the monthly disaggregation algorithm, we also calculated the precipitation estimates at the same seven anomaly sites. The results were shown in Figure 13. The downscaled results based on both TMPA and IMERG data at 1 km resolution corresponded well with rain gauges. The bias was ~12% and ~8% for the downscaled results based on TMPA and IMERG, respectively. The downscaled results based on TMPA and IMERG also eliminated the anomalies, indicating that random forest was an effective way to improve the spatial resolution with reasonable accuracy while removing systematic anomalies.
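The step from annual to monthly downscaled values can be illustrated as follows. Section 3.3 is not reproduced in this part of the text, so the fraction-based scheme below is an assumption rather than the authors' exact algorithm: each month receives the share of the annual total that it has in the original coarse product. The function name `disaggregate` and all values are hypothetical.

```python
def disaggregate(annual_fine, monthly_coarse):
    """Split a downscaled annual total into monthly values.

    annual_fine    : downscaled annual precipitation at one fine pixel (mm)
    monthly_coarse : monthly precipitation of the original coarse pixel
                     that contains the fine pixel (mm)
    """
    annual_coarse = sum(monthly_coarse)
    if annual_coarse == 0:
        # No coarse precipitation to derive fractions from; return zeros.
        return [0.0 for _ in monthly_coarse]
    return [annual_fine * m / annual_coarse for m in monthly_coarse]


# Three "months" are used for brevity; a real series would have twelve.
monthly_original = [10.0, 30.0, 60.0]
annual_downscaled = 120.0
monthly_downscaled = disaggregate(annual_downscaled, monthly_original)
print(monthly_downscaled)
```

Under this assumption the monthly values always sum to the downscaled annual total, so any accuracy gained at the annual scale carries over to each month in proportion to its share.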

The Relationships between the Spatial Resolutions and the Accuracies of the Precipitation Estimates against Ground Observations
Although the accuracy of the precipitation estimates improved with finer spatial resolution according to the statistical results presented in Tables 1 and 2 of this study, this did not mean that the finer the spatial resolution, the better the accuracy. In contrast, in [50], the performance of the satellite-based precipitation products improved with coarser spatial resolution in the Lower Colorado River Basin (LCRB). The relationship between spatial resolution and the accuracy of satellite-based products might be affected by the density of rain gauge networks. For instance, regions such as the LCRB had dense rain gauge networks, which made it possible to assess the performance of the IMERG products in detail at various resolutions. In the TP, by contrast, the rain gauge networks were sparse and poorly constructed, which made the region unsuitable for research on the relationship between spatial resolution and the accuracy of remote sensing products. Generally, to compare point-based measurements (from rain gauges) with gridded estimates, the precipitation values from gauges located in the same pixel should be averaged and then compared with the corresponding pixel value. More gauges would be contained in one pixel after upscaling, and thus the errors at individual gauges might be neutralized or alleviated. This might be one of the reasons why the performance of remote sensing products improved after spatial resolution degradation. Further studies should be conducted to investigate the relationship between spatial resolution and the accuracy of satellite-based products, especially over regions where the gauge network is sparse.
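The gauge-to-pixel comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: a regular latitude/longitude grid whose lower-left corner is (`lon0`, `lat0`), and hypothetical gauge coordinates and values. It shows how the same gauges fall into fewer pixels, and are therefore averaged, once the grid is coarsened.

```python
import math
from collections import defaultdict


def pixel_index(lon, lat, lon0, lat0, res):
    """Column and row of the grid cell that contains a point."""
    return (math.floor((lon - lon0) / res), math.floor((lat - lat0) / res))


def average_gauges_per_pixel(gauges, lon0, lat0, res):
    """Average all gauge values that fall inside the same grid cell.

    gauges: iterable of (lon, lat, precipitation) tuples.
    Returns {(col, row): mean precipitation}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lon, lat, value in gauges:
        key = pixel_index(lon, lat, lon0, lat0, res)
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical gauges: (longitude, latitude, annual precipitation in mm).
gauges = [(90.05, 30.05, 400.0), (90.22, 30.12, 500.0), (90.33, 30.04, 300.0)]

fine = average_gauges_per_pixel(gauges, lon0=90.0, lat0=30.0, res=0.1)
coarse = average_gauges_per_pixel(gauges, lon0=90.0, lat0=30.0, res=0.25)
print(len(fine), coarse)
```

At 0.1° each gauge sits in its own pixel, whereas at 0.25° two gauges share a pixel and their values are averaged before the comparison, which is the mechanism the text offers for the apparent gain in accuracy at coarser resolution.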

Future Research
In this study, the precipitation measured by rain gauges was used to evaluate the performance of the original TMPA and IMERG data, as well as the downscaled results based on these satellite-based precipitation products. However, due to the harsh environment and complex terrain, the rain gauges were sparsely and unevenly distributed over the TP. The majority of the gauges were located in the north and east and at middle or low altitudes, while few gauges were located in the western part of the TP or at high elevations [51]. Thus, the assessment may not be representative of the ungauged areas in the western part of the TP and of regions at high elevations. Although radar precipitation data are also widely used as 'ground truth' to evaluate the quality of satellite-based precipitation data, radar coverage is inadequate over the TP. So far, there are only four working radar stations, shown in Figure 14. Additionally, other shortcomings, such as beam blockage and range degradation, make radar precipitation nearly useless over the TP. Therefore, more reliable ground reference data are highly needed for the TP.
Overall, this study evaluated the performance of TMPA and IMERG and compared the downscaled results based on each product at annual and monthly temporal scales. However, the performance of satellite-based precipitation products may vary under extreme rainfall conditions, which occur at meteorological time scales (e.g., hourly) [52]. It is very important to obtain precipitation estimates with both finer spatial resolution and higher accuracy to capture extreme rainfall events, which is meaningful for flood disaster prevention. Therefore, additional downscaling techniques should be developed, especially for finer temporal scales (hourly or half-hourly), to improve our understanding of precipitation activity during extreme rainfall events.

Conclusions
In this study, we focused on downscaling TMPA and IMERG data to generate precipitation estimates with a finer spatial resolution (~1 km) at the annual scale, and a disaggregation algorithm was applied to obtain the monthly downscaled results from those at the annual scale. Both the original satellite-based precipitation products (TMPA and IMERG) and the downscaled results were evaluated against ground observations. The main conclusions from this study were as follows:
(1) The IMERG data provided more accurate precipitation estimates than TMPA data at both the annual and monthly scales.

Figure 1. Spatial distribution of gauge stations and the Digital Elevation Model (DEM) over the Tibetan Plateau (TP).

… ($P^{0.25^\circ}_{\mathrm{resi}}$) by computing the difference between $P^{0.25^\circ}_{\mathrm{TMPA}}$ and the estimated annual precipitation at 0.25°. Similarly, map the residuals at 0.1° ($P^{0.1^\circ}_{\mathrm{resi}}$) by computing the difference between $P^{0.1^\circ}_{\mathrm{IMERG}}$ and the estimated annual precipitation at 0.1°. (6) Obtain the residuals ($P^{0.25^\circ}$ …

Figure 2. The flow chart of the random forest-based downscaling algorithm used in the study.

Figure 4. Scatterplots of the validation between ground observations and (a) the original TMPA data at a spatial resolution of 0.25°; (b) the resampled IMERG data at a spatial resolution of 0.25°; and (c) the original IMERG data at a spatial resolution of 0.1°, at the annual scale.

Figure 5. Monthly mean precipitation from 113 rain gauges (gray bar), the original TMPA (blue line), the resampled IMERG (green line) and the original IMERG (red line) for the TP from January to December 2015.

Figure 6. Graphs of the variation of the 'out of bag' (OOB) error rate for (a) RF_TMPA and (b) RF_IMERG at different ntree values. Note: OOB error denoted the 'out of bag' error provided by the bagging model. RF_TMPA and RF_IMERG denoted the random forest models based on TMPA data and IMERG data, respectively.

Figure 7. Boxplots of the variation of the OOB error rate for (a) RF_TMPA and (b) RF_IMERG at different m_try values. Note: OOB error denoted the 'out of bag' error provided by the bagging model. RF_TMPA and RF_IMERG denoted the random forest models based on TMPA data and IMERG data, respectively.

Figure 8. The order of variable importance in (a) RF_TMPA and (b) RF_IMERG. Note: RF_TMPA and RF_IMERG denoted the random forest models based on TMPA data and IMERG data, respectively.

Figure 9. Spatial distribution of precipitation maps generated from (a) downscaled results based on TMPA data at 0.1°, (b) downscaled results based on TMPA data at 1 km and (c) downscaled results based on IMERG data at 1 km, over the TP in 2015.

Figure 10. Scatterplots of the validation between ground observations and (a) downscaled results based on TMPA data at 10 km; (b) downscaled results based on TMPA data at 1 km and (c) downscaled results based on IMERG data at 1 km, at the annual scale.

Figure 11. Spatial distribution of precipitation estimates generated from (a) TMPA 3B43, (b) IMERG Final and (c) inland water bodies from the Global Lakes and Wetlands Database (GLWD). The black circles represent the anomaly sites.

Figure 12. Comparison of mean monthly precipitation data for the anomalies covering inland water bodies in the original TMPA 3B43 data (blue line) and the original IMERG data (red line) with the data from seven rain gauge stations (black line).

Figure 13. Comparison of mean monthly precipitation data for the anomalies covering inland water bodies in the downscaled results based on TMPA data (blue line) and the downscaled results based on IMERG data (red line) with the data from seven rain gauge stations (black line).

Figure 14. The spatial distribution of radar stations over the TP.

Table 1. Coefficients of determination (R²), bias, mean absolute error (MAE) and root mean square error (RMSE) of the original TMPA data, the resampled IMERG data and the original IMERG data, compared with ground observations for each month in 2015.

3.2. Main Results in the Downscaling Procedure Using Random Forest

GPROF2014 resulted in far fewer "false" alarms. During wet seasons, both TMPA and IMERG corresponded well with ground observations, with R² values of approximately 0.70. IMERG was still more accurate (bias ~10%, MAE ~22 mm, RMSE ~32 mm) than TMPA (bias ~20%, MAE ~25 mm, RMSE ~37 mm). For the resampled IMERG data at 0.25° resolution, the statistical results were very close to those of the original IMERG at 0.1°, so the resampled data also provided more accurate precipitation estimates than TMPA at 0.25°. Based on these evaluations, we concluded that IMERG data were more accurate than TMPA data, with anomalies removed at the annual scale.
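As an illustration of how the four validation statistics quoted above could be computed from paired gauge and satellite values, a minimal Python sketch follows. It rests on assumptions that this section does not confirm: R² is taken as the squared Pearson correlation coefficient, and bias is taken as the relative total bias in percent. The function name `validation_metrics` and the sample values are hypothetical and are not data from this study.

```python
import numpy as np

def validation_metrics(gauge, estimate):
    """Return R^2, relative bias (%), MAE and RMSE for paired samples.

    Assumed definitions (not confirmed by the text of this section):
    R^2 is the squared Pearson correlation; bias is the summed
    difference divided by the summed gauge values, in percent.
    """
    gauge = np.asarray(gauge, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    diff = estimate - gauge
    r = np.corrcoef(gauge, estimate)[0, 1]
    return {
        "r2": r ** 2,
        "bias_pct": 100.0 * diff.sum() / gauge.sum(),
        "mae": np.abs(diff).mean(),
        "rmse": np.sqrt((diff ** 2).mean()),
    }

# Hypothetical monthly totals in mm, for illustration only.
gauge = [10.0, 20.0, 30.0, 40.0]
estimate = [12.0, 18.0, 33.0, 45.0]
metrics = validation_metrics(gauge, estimate)
```

With these definitions, MAE and RMSE carry the units of the input (mm here), while R² and bias are dimensionless, which matches how the statistics are reported in the text.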

Table 2. Coefficients of determination (R²), bias, mean absolute error (MAE) and root mean square error (RMSE) for the downscaled results based on TMPA data at 10 km, the downscaled results based on TMPA data at 1 km and the downscaled results based on IMERG data at 1 km, against ground observations for each month in 2015.