Assessment of TMPA 3B42V7 and PERSIANN-CDR in Driving Hydrological Modeling in a Semi-Humid Watershed in Northeastern China

Recent developments in satellite precipitation products provide an unprecedented opportunity for better precipitation estimation and thus broaden their hydrological applications. However, because of the errors and uncertainties in satellite products, a thorough validation is usually required before they are put into real hydrological application. As such, this study aims to provide a comprehensive evaluation of the performance of Tropical Rainfall Measuring Mission Multi-satellite Precipitation Analysis (TMPA) 3B42V7 and Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR), as well as their adequacy in simulating hydrological processes in a semi-humid region in northeastern China. TMPA 3B42V7 showed a superior performance at the daily and monthly time scales and captured the rainfall-intensity distribution well. Intra-annual comparisons indicated that TMPA 3B42V7 was more representative from January to September, whereas PERSIANN-CDR was more reliable from October to December. The Soil and Water Assessment Tool (SWAT) driven by gauge precipitation data performed excellently with NSE > 0.9, while the performances of the TMPA 3B42V7- and PERSIANN-CDR-based models were satisfactory with NSE > 0.5. The performances varied under different flow levels and hydrological years. Water balance analysis indicated a better performance of TMPA 3B42V7 in simulating the hydrological processes, including evapotranspiration, groundwater recharge, and total runoff. The runoff components (i.e., base flow, subsurface flow, and surface flow) driven by TMPA 3B42V7 were more consistent with the actual hydrological features. This study will not only help identify suitable satellite precipitation products for local water resources management, but also serve as a reference for poorly gauged regions with similar hydrologic and climatic conditions around the world, especially northeastern China and western Russia.


Introduction
Precipitation is acknowledged as a key element of the water cycle that directly affects hydrological processes and thereby influences water resources management and flood control [1][2][3]. For decades, precipitation monitoring has relied on ground-based gauges, whose demands on terrain and funding have limited their deployment worldwide [4]. Satellite remote sensing has therefore brought new opportunities for better precipitation estimation. The Tropical Rainfall Measuring Mission (TRMM), launched in 1997, was the first satellite carrying a precipitation radar, and it greatly encouraged the development of satellite precipitation products, including TRMM Multi-satellite Precipitation Analysis (TMPA), Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR), Climate
(BRR), which is located 66 km downstream from the river head and has a drainage area of 2085 km², accounting for approximately 74% of the entire Biliu River Basin (BRB). The region has a semi-humid temperate monsoon climate with four distinct seasons. The mean annual precipitation is about 739 mm, with more than 70% of the precipitation occurring during the flood season from July to September (see Figure 2 for the spatial distribution of mean annual precipitation). The mean annual air temperature is about 10 °C. January is the coldest month, with a mean air temperature of around −8 °C, and July is the warmest, with a mean air temperature of 24 °C. The terrain is hilly in the northeast and northwest and flat near the reservoir, with surface elevation ranging from 41 m to 1108 m. Forest (FRST) and agricultural land (AGRL) are the two major land use types, covering 72% and 19% of the BRRW, respectively, while pasture (PAST), surface water (WATR), and residential land (URLD) occupy the remaining 9% of the total area (Figure 1b). The soils are dominated by Haplic Luvisols (LVh, 60%) and Haplic Phaeozems (PHh, 31%) (Figure 1c).
BRR is part of the East-to-West Water Transfer Project of Liaoning Province, a backbone water conservancy project that is vital to the social and economic development of the province. The reservoir has a total storage capacity of 9.3 × 10⁸ m³ and a designed annual water supply of 4.4 × 10⁸ m³, which serves more than 80% of the domestic and industrial water use of Dalian city. In addition to water supply, the reservoir serves flood control, power generation, and agricultural irrigation. Therefore, a comprehensive knowledge of the rainfall-runoff process in the BRRW, obtained through model simulation and prediction, underpins the drinking water supply and irrigation for millions of people.

Gauge Precipitation Data
Observed precipitation is the basis for evaluating the performance of satellite precipitation products. Figure 2 shows the locations of the eight rain gauges and their control areas (varying from 166 to 371 km²) derived with the Thiessen polygon method. Daily precipitation at these gauges from 1 January 2000 to 31 December 2015 was obtained from the Hydrology Bureau of Liaoning Province; each daily value covers the period from 0:00 UTC (Coordinated Universal Time) to 24:00 UTC of the current day. The spatial distribution of mean annual precipitation (from 2000 to 2015) of the gauge records, interpolated by the inverse distance weighting method, is displayed in Figure 2.

Satellite Precipitation Products
3B42V7 is the latest post-real-time version of the TRMM Multi-Satellite Precipitation Analysis (TMPA) product and is superior to all previous versions [30]. The 3B42V7 dataset has been available since January 1998 and covers the global latitude belt from 50° S to 50° N with a spatial resolution of 0.25° × 0.25° and a temporal resolution of 3 h [31]. In this study, the daily precipitation data from 2000 to 2015 are evaluated; they can be downloaded freely from the Goddard Earth Sciences Data and Information Services Center (https://mirador.gsfc.nasa.gov). The daily 3B42V7 data cover the same time span as the gauge observations (0:00 UTC to 24:00 UTC) and are further aggregated into monthly values.
PCDR is maintained by the University of California and the National Oceanic and Atmospheric Administration (NOAA) [32]. It is a multi-satellite precipitation product generated by the PERSIANN algorithm using GridSat-B1 IR satellite data [33]. The product has been available since 1983, covering a wider latitude belt (60° S to 60° N) than 3B42V7 at the same spatial resolution (0.25° × 0.25°). In this study, the daily PCDR data from 2000 to 2015 were obtained from the Center for Hydrometeorology and Remote Sensing (http://chrsdata.eng.uci.edu/).
Though both 3B42V7 and PCDR products are bias-corrected on a monthly basis using the Global Precipitation Climatology Centre (GPCC) dataset [7], none of the eight rain gauges in the study area are included in the GPCC gridded gauge-analysis product. It is therefore essential to examine the accuracy of the two satellite products based on the ground observations, which will be a valuable reference for such semi-humid areas where the satellite precipitation products have been less evaluated.

Accuracy Assessment of Satellite Precipitation Products
To understand the spatiotemporal variability of 3B42V7 and PCDR against gauge precipitation data, the evaluation is conducted at different spatial scales (i.e., grid scale and watershed scale) and temporal scales (i.e., daily and monthly) from 2000 to 2015. At the grid scale, the gauge observations are compared with the satellite data of the grid cells in which the gauges are located, that is, grids No.4, 5, 7, and 8 in Figure 2. When more than one gauge falls within a satellite grid, the arithmetic average of the gauge data is used (e.g., the average of the five gauges in grid No.5 is compared with the satellite data). For the watershed-scale comparison, the gauge-based areal precipitation is computed with Thiessen polygons, and the satellite areal precipitation is derived by averaging the satellite data of the grids that have more than 50% of their area inside the watershed (i.e., grids No.4, 5, 7, and 8) [15].
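The watershed-scale aggregation described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the array layout (days in rows, gauges or grids in columns), and the `inside_fraction` input are assumptions introduced here.

```python
import numpy as np

def thiessen_areal_precip(gauge_precip, polygon_areas):
    """Area-weighted mean of gauge precipitation (Thiessen polygon method).

    gauge_precip: array of shape (n_days, n_gauges), in mm.
    polygon_areas: Thiessen control area of each gauge (any consistent unit).
    """
    p = np.asarray(gauge_precip, dtype=float)
    w = np.asarray(polygon_areas, dtype=float)
    w = w / w.sum()          # normalise areas to weights that sum to 1
    return p @ w             # one areal value per day

def satellite_areal_precip(grid_precip, inside_fraction, min_fraction=0.5):
    """Arithmetic mean over grids with more than min_fraction of their area
    inside the watershed.

    grid_precip: array of shape (n_days, n_grids), in mm.
    inside_fraction: fraction of each grid cell lying inside the watershed
    (a hypothetical input that would come from a GIS overlay).
    """
    g = np.asarray(grid_precip, dtype=float)
    keep = np.asarray(inside_fraction, dtype=float) > min_fraction
    return g[:, keep].mean(axis=1)
```

With eight gauges, `polygon_areas` would hold the eight control areas from Figure 2; the satellite average would then be taken over the four retained grids.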
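Four basic statistical indices are computed to evaluate the accuracy of satellite-derived precipitation data against gauge observations. The correlation coefficient (CC) is used to assess the linear correlation between satellite- and gauge-based precipitation. The root-mean-square error (RMSE) and mean absolute error (MAE) both measure the average error magnitude, with the former giving greater weight to large errors. The relative bias (BIAS) represents the systematic bias of satellite precipitation. The indices are expressed as follows:

$$\mathrm{CC} = \frac{\sum_{i=1}^{n}\left(P_{obs,i}-\overline{P}_{obs}\right)\left(P_{sat,i}-\overline{P}_{sat}\right)}{\sqrt{\sum_{i=1}^{n}\left(P_{obs,i}-\overline{P}_{obs}\right)^{2}}\,\sqrt{\sum_{i=1}^{n}\left(P_{sat,i}-\overline{P}_{sat}\right)^{2}}} \qquad (1)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_{sat,i}-P_{obs,i}\right)^{2}} \qquad (2)$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|P_{sat,i}-P_{obs,i}\right| \qquad (3)$$

$$\mathrm{BIAS} = \frac{\sum_{i=1}^{n}\left(P_{sat,i}-P_{obs,i}\right)}{\sum_{i=1}^{n}P_{obs,i}} \times 100\% \qquad (4)$$

where $P_{obs,i}$ is the observed precipitation from the rain gauges, $P_{sat,i}$ is the precipitation estimated by the 3B42V7 or PCDR product, $\overline{P}_{obs}$ and $\overline{P}_{sat}$ are the mean values of observed and estimated precipitation, respectively, and $n$ is the number of samples.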
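As an illustration of how the four indices described above could be computed for paired daily series, the following sketch uses NumPy. The function name and the dictionary output are choices made here for illustration, and BIAS is taken as the summed satellite-minus-gauge difference relative to the summed gauge total, so that negative values indicate underestimation.

```python
import numpy as np

def continuous_metrics(p_obs, p_sat):
    """Return CC, RMSE, MAE (mm) and BIAS (%) for paired precipitation series."""
    obs = np.asarray(p_obs, dtype=float)
    sat = np.asarray(p_sat, dtype=float)
    diff = sat - obs                              # satellite minus gauge
    cc = np.corrcoef(obs, sat)[0, 1]              # Pearson correlation
    rmse = np.sqrt(np.mean(diff ** 2))            # penalises large errors more
    mae = np.mean(np.abs(diff))                   # average error magnitude
    bias = 100.0 * diff.sum() / obs.sum()         # relative bias in percent
    return {"CC": cc, "RMSE": rmse, "MAE": mae, "BIAS": bias}
```

The same function applies unchanged at the grid scale (gauge average versus the enclosing grid) and at the watershed scale (areal averages).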
To assess the ability of the satellite products to detect precipitation events, three categorical statistical indices are used: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI) [34]. A threshold of 1 mm/day is used to distinguish precipitation from non-precipitation events, following Feidas et al. [35]. The perfect values of POD, FAR, and CSI are 1, 0, and 1, respectively. The formulas are as follows:

$$\mathrm{POD} = \frac{H}{H+M} \qquad (5)$$

$$\mathrm{FAR} = \frac{F}{H+F} \qquad (6)$$

$$\mathrm{CSI} = \frac{H}{H+M+F} \qquad (7)$$

where $H$ is the number of hits (observed rain that was correctly detected), $M$ is the number of misses (observed rain that was not detected), and $F$ is the number of false alarms (rain detected but not observed).
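The contingency counts and the three detection indices can be sketched as below. This is an illustrative implementation only: the function name is hypothetical, and treating a day with exactly 1 mm as a rain day (`>=`) is an assumption, since the text does not state how the boundary is handled.

```python
import numpy as np

def _ratio(num, den):
    # Return NaN instead of raising when a denominator is zero.
    return num / den if den > 0 else float("nan")

def detection_scores(p_obs, p_sat, threshold=1.0):
    """POD, FAR and CSI for daily series, using a rain/no-rain threshold in mm/day."""
    obs_rain = np.asarray(p_obs, dtype=float) >= threshold   # assumption: >=
    sat_rain = np.asarray(p_sat, dtype=float) >= threshold
    hits = int(np.sum(obs_rain & sat_rain))            # H: observed and detected
    misses = int(np.sum(obs_rain & ~sat_rain))         # M: observed, not detected
    false_alarms = int(np.sum(~obs_rain & sat_rain))   # F: detected, not observed
    return {
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, hits + false_alarms),
        "CSI": _ratio(hits, hits + misses + false_alarms),
    }
```

For example, a product that flags rain on most days will tend to score a high POD but also a high FAR, which pulls its CSI down.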

SWAT Model Application
SWAT is a time-continuous, semi-distributed, and physically based hydrological model designed to simulate water, sediment, nutrient, and pesticide transport at the catchment scale on a daily, monthly, or yearly time step [36][37][38]. The model divides a watershed into sub-basins connected by the stream network, and further into hydrological response units (HRUs), which are the smallest units for which the hydrological processes are calculated. The model has been successfully applied in many regions of Northeast China [39,40]. More information about the model can be found in the SWAT theoretical documentation [41] and in the literature [36,42,43].
A variety of data were collected to perform the SWAT simulation, including topography, land use, soil types, and meteorological and hydrological conditions. The digital elevation model (DEM) with 90 m resolution was downloaded from the National Aeronautics and Space Administration Shuttle Radar Topographic Mission (http://srtm.csi.cgiar.org/). The land cover and soil type data with 1 km resolution were obtained from the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences (http://www.resdc.cn). Daily records of maximum and minimum air temperature, relative humidity, and wind speed were obtained from the China Meteorological Administration (http://data.cma.cn). Since there are no meteorological stations inside the study region, the adjacent stations of Xiongyue
Remote Sens. 2020, 12, 3133
The SUFI-2 (Sequential Uncertainty Fitting, ver. 2) algorithm built into the SWAT-CUP software is employed for sensitivity analysis and auto-calibration [44]. Based on the sensitivity analysis, 13 parameters are selected for model calibration. Three goodness-of-fit measures, namely the coefficient of determination (R²), the Nash-Sutcliffe coefficient of efficiency (NSE) [45], and the relative bias ratio (Bias), are adopted to evaluate the performance of the streamflow simulation. They are expressed as follows:

$$R^{2} = \frac{\left[\sum_{i=1}^{n}\left(Q_{o,i}-\overline{Q}_{o}\right)\left(Q_{m,i}-\overline{Q}_{m}\right)\right]^{2}}{\sum_{i=1}^{n}\left(Q_{o,i}-\overline{Q}_{o}\right)^{2}\,\sum_{i=1}^{n}\left(Q_{m,i}-\overline{Q}_{m}\right)^{2}} \qquad (8)$$

$$\mathrm{NSE} = 1-\frac{\sum_{i=1}^{n}\left(Q_{o,i}-Q_{m,i}\right)^{2}}{\sum_{i=1}^{n}\left(Q_{o,i}-\overline{Q}_{o}\right)^{2}} \qquad (9)$$

$$\mathrm{Bias} = \frac{\sum_{i=1}^{n}\left(Q_{m,i}-Q_{o,i}\right)}{\sum_{i=1}^{n}Q_{o,i}} \times 100\% \qquad (10)$$

where $Q_{o,i}$ and $Q_{m,i}$ represent the observed and simulated streamflow in the $i$th time period, respectively, and $\overline{Q}_{o}$ and $\overline{Q}_{m}$ represent the averages of observed and simulated streamflow, respectively.
To evaluate the application potential of satellite precipitation estimates in hydrological simulation, each of the three precipitation datasets (i.e., ground gauge, 3B42V7, and PCDR) is used independently to drive the model and to obtain its own optimal parameter values. This helps in understanding the accuracy and reliability of a given satellite dataset in producing hydrological variables when it is the sole source of precipitation in the study region.
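The three goodness-of-fit measures named above (R², NSE, and Bias) could be computed from observed and simulated monthly streamflow as in the sketch below. The function name is hypothetical, and the sign convention for Bias (simulated minus observed, so that negative values mean underestimation) is inferred from how the results are discussed.

```python
import numpy as np

def streamflow_metrics(q_obs, q_sim):
    """Return R2, NSE and Bias (%) for observed vs. simulated streamflow."""
    qo = np.asarray(q_obs, dtype=float)
    qm = np.asarray(q_sim, dtype=float)
    do = qo - qo.mean()                       # deviations of observations
    dm = qm - qm.mean()                       # deviations of simulations
    r2 = np.sum(do * dm) ** 2 / (np.sum(do ** 2) * np.sum(dm ** 2))
    nse = 1.0 - np.sum((qo - qm) ** 2) / np.sum(do ** 2)
    bias = 100.0 * np.sum(qm - qo) / np.sum(qo)
    return {"R2": r2, "NSE": nse, "Bias": bias}
```

A simulation that is offset by a constant amount illustrates why the measures are reported together: its R² stays at 1 while NSE and Bias both reveal the systematic error.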
The simulated streamflow and water budget components (i.e., surface runoff, subsurface runoff, groundwater flow, and evapotranspiration) based on the different precipitation datasets are analyzed and discussed in Sections 3.2 and 3.3.

Daily Precipitation
Figure 3 shows the scatter plots of daily data from the two satellite products (3B42V7 and PCDR) versus rain gauges at the grid and watershed scales, together with the corresponding statistical indices. During the evaluation period from 2000 to 2015, there are, in total, 23,376 pairs for the grid-based comparison and 5844 pairs for the watershed-based comparison.
At the grid scale, 3B42V7 and PCDR both had a small BIAS, at −3.95% and 2.09%, respectively, indicating an overall underestimation by 3B42V7 and an overestimation by PCDR. 3B42V7 had a larger CC (0.63) and a smaller MAE (1.93 mm) than PCDR (CC: 0.56, MAE: 2.23 mm), whereas PCDR reported a smaller RMSE (6.79 mm) and a smaller absolute BIAS (2.09%) than 3B42V7 (RMSE: 6.87 mm, BIAS: −3.95%).
It should be noted that, because RMSE squares the differences, positive and negative errors cannot cancel each other out and large errors are weighted more heavily than in MAE, so RMSE better reflects the random component of the errors [4,46].
Regarding the detection capability of the satellite precipitation estimates, 3B42V7 and PCDR both displayed a moderate ability to detect the gauge precipitation events, with POD at 0.60 and 0.78, respectively. Although PCDR outperformed 3B42V7 in POD, its higher FAR clearly reduced its overall skill: 62% of the events it detected were false alarms. This resulted in a lower CSI for PCDR (CSI: 0.34) than for 3B42V7 (CSI: 0.42).
The agreement between satellite data and gauge data differed among grids. As shown in Table 1, grid No.5 generally has a better agreement between the satellite-based and gauge-based data, as indicated by the higher CC and CSI and the smaller RMSE, MAE, and FAR. This is probably because the five gauge stations scattered over this grid better represent its areal precipitation. As noted in previous studies [14,47], the gauge observations that are usually used as reference datasets may themselves be subject to uncertainties due to recording errors and deficiencies in regionalization, and the rain gauge density may affect the evaluation results. At the watershed scale, both 3B42V7 and PCDR showed a better agreement than at the grid scale, because all statistical indices are closer to their corresponding perfect values, except for the BIAS of PCDR. In general, 3B42V7 outperformed PCDR at both spatial scales, as indicated by its higher CC and CSI and its smaller RMSE (for the watershed scale only), MAE, BIAS, and FAR.
Additionally, it is generally recognized that the distribution pattern of precipitation with different intensities is an important feature and has significant effects on streamflow modeling and flood forecasting [48,49]. Therefore, the occurrence frequency of daily precipitation and the corresponding contribution to the total precipitation are evaluated for different intensities at the watershed scale ( Figure 4).
"Non-rainy" days (0 mm) formed the most frequent class, with an occurrence rate of 66.5% for the gauge observation and 3B42V7, and 41.5% for PCDR. The second largest class was "0-1 mm" for the gauge and 3B42V7 data, occurring on 15.6% and 15.2% of the total number of days, respectively, followed by the class of "1-10 mm", occurring on 12.3% and 12.8% of days, respectively. In general, 3B42V7 had occurrence rates similar to the gauge data in the "non-rainy" and "light rain" classes. This differs from some previous studies, which stated that the TMPA product is less effective in correctly specifying moderate and light rain rates [2,15].
For PCDR, however, more "0-1 mm" days (occurrence rate: 25.2%) and "1-10 mm" days (occurrence rate: 28.5%) were recorded than in the gauge observation. The discrepancies were also evident in the contribution pattern, where PCDR overestimated the light rainfall classes (0-1 mm and 1-10 mm) and underestimated the high rainfall classes (>25 mm). Similar findings on the performance of PCDR were reported in Northeastern China [50], the contiguous United States [33], and Italy [51].
In addition, no events above 100 mm were recorded in PCDR, while both the gauge and 3B42V7 data had two events larger than 100 mm, reflecting an inadequate ability of PCDR to detect extreme precipitation events. 3B42V7, however, showed a contribution pattern similar to that of the gauge data for all classes, with an acceptable discrepancy within ±5%, indicating that the 3B42V7 data are more adequate in characterizing the rainfall structure of the gauge observation.
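The occurrence frequency and contribution analysis above can be reproduced in outline with a simple binning routine. The sketch below is illustrative only: the class edges are assembled from the intensity ranges mentioned in the text, and the handling of values that fall exactly on an edge (assigned to the lower class) is an assumption.

```python
import numpy as np

# Upper edges of the classes in mm/day; the last class (>100) is open-ended.
# Assumption: a value equal to an edge belongs to the lower class.
EDGES = [0.0, 1.0, 10.0, 25.0, 50.0, 100.0]
LABELS = ["0", "0-1", "1-10", "10-25", "25-50", "50-100", ">100"]

def intensity_distribution(daily_precip):
    """Return (frequency %, contribution %) per intensity class.

    frequency: share of days falling in each class.
    contribution: share of total precipitation delivered by each class.
    Assumes non-negative daily values and a non-zero total.
    """
    p = np.asarray(daily_precip, dtype=float)
    # right=True gives index i such that EDGES[i-1] < x <= EDGES[i]
    idx = np.digitize(p, EDGES, right=True)
    counts = np.bincount(idx, minlength=len(LABELS))
    amounts = np.bincount(idx, weights=p, minlength=len(LABELS))
    frequency = 100.0 * counts / p.size
    contribution = 100.0 * amounts / p.sum()
    return dict(zip(LABELS, frequency)), dict(zip(LABELS, contribution))
```

Running the routine on the gauge, 3B42V7, and PCDR areal series in turn and comparing the resulting dictionaries gives the kind of class-by-class comparison shown in Figure 4.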
Further, the performance of the daily precipitation of the 3B42V7 and PCDR products under different rainfall intensities was assessed at the watershed scale (Figure 5). Four indices, i.e., CC, RMSE, MAE, and BIAS, were applied for comparison. CC increased from 0.20 to 0.42 (averaged over the two products) as the rainfall intensity increased, whereas RMSE, MAE, and BIAS indicated a declining performance of both satellite-based products as the rainfall intensity increased. The performances of the two satellite products differed with rainfall intensity. 3B42V7 showed better agreement for torrential rain (>50 mm), whereas PCDR performed better for heavy rainfall events (25-50 mm) except for the BIAS index. For light (1-10 mm) and moderate (10-25 mm) events, the two products were comparable, with a larger CC and smaller BIAS for 3B42V7 and a smaller RMSE and MAE for PCDR.

Monthly Precipitation
Figure 5. Statistical indices of 3B42V7 and PCDR data under different precipitation intensities at daily scale.
Figure 6 shows the comparison of monthly satellite precipitation and gauge data. The detection indices (POD, FAR, and CSI) cannot be meaningfully evaluated at the monthly scale, because every month in all three datasets has precipitation larger than 1 mm. The correlation is greatly improved at the monthly scale: the CC values were larger than 0.90 for both products at the grid and watershed scales.
Because precipitation varies within the year, the performance of the satellite products is further evaluated for each calendar month. Figure 7 shows the statistical indices of CC, RMSE, and BIAS for each month. Overall, both 3B42V7 and PCDR can capture the monthly precipitation pattern of the gauge observation. The linear relationships were satisfactory, with CC above 0.80 for 3B42V7 and above 0.70 for PCDR in all months except September. RMSE increased as the monthly precipitation increased, with the maximum RMSE occurring in August for both satellite products. The variations in BIAS clearly showed underestimation during the high-precipitation period from June to September (for PCDR only in June and August) and overestimation during the other months. It is worth noting that the systematic errors were relatively small in the wet season (i.e., from June to September), with absolute values of BIAS lower than 20%, but large in the dry season (i.e., from December to the following March), with a BIAS of up to 60%. In general, 3B42V7 produced a smaller absolute value of BIAS throughout the year (10.6% for 3B42V7 and 18.1% for PCDR on average) and was slightly superior in CC and RMSE from January to August, while PCDR outperformed 3B42V7 from September to December, as indicated by its larger CC and smaller RMSE.

Streamflow Simulation
Table 2 shows the calibrated values of the 13 parameters, all of which lie within their proper ranges. Figure 8 shows the observed and simulated monthly streamflow at the JC and BRR stations (Figure 1a); the corresponding statistical indices are summarized in Table 3.
The simulated streamflow driven by gauge data exhibited a closer agreement with the observed data than the model simulations using 3B42V7 and PCDR data. The NSE and R² were particularly high for the gauge-based simulation; the NSE ranged from 0.92 to 0.96, and R² ranged from 0.93 to 0.97. The good fit demonstrates that the model is capable of capturing the key features of the observed hydrograph at the monthly scale when forced by gauge precipitation. Therefore, the model is considered robust and provides a sound basis for further testing the precision and applicability of the satellite products.
The model performances driven by the two satellite products were lower than that of the gauge-based simulation. At BRR station, the 3B42V7-based model led to a slightly better model efficiency and a lower systematic error than PCDR, as indicated by a larger NSE of 0.76 (PCDR: 0.65), a larger R² of 0.77 (PCDR: 0.67), and a smaller Bias of 18.3% (PCDR: 22.9%) on average over the calibration and validation periods. Similarly, 3B42V7 outperformed PCDR in reproducing the streamflow at JC station, with NSE of 0.68 and 0.56, R² of 0.70 and 0.59, and Bias of 9.8% and 17.4% for 3B42V7 and PCDR, respectively. Overall, consistent with the quality evaluation of the two satellite products, 3B42V7 performed better than PCDR in depicting streamflow hydrographs.

Performance under Different Flow Levels and Hydrological Years
In addition to the overall hydrographs, the ability to capture flow peaks and low-flow conditions is of particular importance for flood control and drought prevention. To investigate the model performance at different flow levels, monthly observed streamflow exceeding its 90% quantile is defined as high flow, streamflow below its 50% quantile is defined as low flow, and streamflow between these two is defined as moderate flow. On this basis, the entire study period is divided into 16 months of high flow, 84 months of moderate flow, and 68 months of low flow. The statistical indices of the streamflow simulation under the different flow conditions are summarized in Table 4. Under the high flow condition, the gauge-based model generated a very good simulation, with R² of 0.92 and 0.85 at JC and BRR, respectively. The calculated Bias was also small, within ±10%. However, both the 3B42V7- and PCDR-based models underestimated streamflow, as indicated by the negative Bias ranging from −13.8% to −28.4%. Model performance using 3B42V7 data was better than that using PCDR, with a larger R² of 0.63 and 0.54 at JC and BRR, respectively, and a smaller Bias of −13.8% and −21.7% at JC and BRR, respectively. The better performance of the 3B42V7-based model under the high flow condition is probably due to the better ability of 3B42V7 to capture torrential rains (Figures 4 and 5).
For moderate flow, the gauge-based model generated a satisfactory R² of 0.69 and 0.74 at JC and BRR, respectively, but a larger Bias of −30% at BRR. The model performances using 3B42V7 and PCDR data were lower. The R² ranged from 0.16 to 0.27, and the Bias was larger than 40% at JC but within −10% at BRR. Therefore, the gauge observations are more accurate in depicting the runoff process under moderate flow conditions, whereas the satellite products match the amount of streamflow well (at BRR only). At the low flow level, model simulations driven by the three precipitation datasets all produced unsatisfactory results, with R² ranging from 0.01 to 0.29 and absolute values of Bias larger than 35% (except for the gauge-based simulation at JC). As demonstrated in previous studies, the poor performance under relatively low flow levels may be attributable to the uncertainties of other input data (such as soil type) and parameters, as well as the limitations of the SWAT internal algorithms under low-flow conditions [52,53]. Therefore, reliable precipitation data and improved model algorithms are both essential to enable a more accurate and realistic streamflow simulation [54].
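The flow-level split described above can be illustrated with a quantile-based classifier. This is a sketch under stated assumptions: the function name is hypothetical, the thresholds are read as non-exceedance quantiles of the observed series, and NumPy's default linear interpolation is used. The quantile convention used in the study is not stated, so the sketch is not expected to reproduce the reported month counts.

```python
import numpy as np

def classify_flow_levels(q_obs, high_q=0.90, low_q=0.50):
    """Label each observed monthly flow as 'high', 'moderate' or 'low'."""
    q = np.asarray(q_obs, dtype=float)
    high_thr = np.quantile(q, high_q)   # flows above this are 'high'
    low_thr = np.quantile(q, low_q)     # flows below this are 'low'
    return np.where(q > high_thr, "high",
                    np.where(q < low_thr, "low", "moderate"))
```

The resulting labels can then be used as masks, so that indices such as R² and Bias are computed separately for the months in each class.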
The performances of the satellite precipitation datasets in generating streamflow are also evaluated for different hydrological years. Prior to the evaluation, the historical sequence of streamflow at BRR station from 1986 to 2015 is used to plot the annual runoff frequency curve. Annual runoff frequencies (F) of 37.5% and 62.5% are then adopted as the classification standard [55,56], and the 16-year simulation period is divided into wet years (F ≤ 37.5%, including 2005 and 2010 to 2013), normal years (37.5% < F ≤ 62.5%, including 2004, 2006, 2007, 2008, and 2015), and dry years (F > 62.5%, including 2002, 2003, 2009, and 2014). Table 5 lists the evaluation indices of the streamflow simulation during the wet, normal, and dry years. The gauge-based model achieved better results than the 3B42V7- and PCDR-based models for all indices, with R² above 0.9 for the wet and normal years and above 0.5 for the dry years. Generally, simulation results got worse as the frequency increased, as indicated by the decreased R² and increased absolute values of Bias. The performances of the 3B42V7- and PCDR-based models were satisfactory during the wet and normal years, with R² ranging from 0.54 to 0.83; however, they yielded a remarkable underestimation at BRR, with Bias ranging from −17.1% to −32.8%. Generally, 3B42V7 performed better than PCDR in wet years, with a larger R² and smaller Bias, while the performances of the two products were comparable in the normal years. In the dry years, both the 3B42V7- and PCDR-based simulations showed poorer correlations with the observed streamflow.
Table 5. Statistical indices of the simulated monthly streamflow using three precipitation inputs during the wet, normal, and dry years.
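The wet/normal/dry classification can be illustrated with an empirical frequency estimate. This is a sketch only: the text does not say how the annual runoff frequency curve was constructed, so the Weibull plotting position F = m/(n + 1), with m the rank of a year's runoff in descending order, is an assumption introduced here, as is the function name.

```python
import numpy as np

def classify_hydrological_years(years, annual_runoff,
                                wet_max=0.375, dry_min=0.625):
    """Map each year to 'wet', 'normal' or 'dry' from its runoff frequency F."""
    r = np.asarray(annual_runoff, dtype=float)
    n = r.size
    order = np.argsort(-r)                   # indices from largest to smallest
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(1, n + 1)       # rank 1 = largest annual runoff
    freq = ranks / (n + 1.0)                 # assumed Weibull plotting position
    labels = {}
    for year, f in zip(years, freq):
        if f <= wet_max:
            labels[year] = "wet"
        elif f <= dry_min:
            labels[year] = "normal"
        else:
            labels[year] = "dry"
    return labels
```

In practice the frequency would be computed over the full 1986 to 2015 record and the labels then read off for the simulated years only.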

Water Balance
In addition to streamflow prediction, the ability to reproduce other hydrological variables is another important indicator for assessing satellite precipitation data [3]. Water balance analysis was carried out to compare the major components of the hydrological cycle from monthly simulations driven by the three precipitation datasets. The SWAT model partitions precipitation into evaporation, transpiration, ground discharge (including base flow), and runoff (including surface runoff, base flow, and subsurface flow) [38]. Numerical comparisons of the averaged water balance components from 2002 to 2015 are shown in Table 6. In the gauge-based simulation, 73.5% of precipitation was lost through evaporation and transpiration, and a very similar proportion of 73.6% was produced by 3B42V7, whereas the corresponding proportion in PCDR was slightly higher (75%). As a key component determining the amount of base flow, groundwater recharge accounted for 13.9%, 10.5%, and 19% of precipitation in the gauge-, 3B42V7- and PCDR-driven simulations, respectively. The total runoff generated by the 3B42V7-based model (198.7 mm) was similar to that of the gauge-based model (186.2 mm). Less total runoff was produced in the PCDR case (155.8 mm), although the PCDR precipitation estimate was larger than both the gauge observation and the 3B42V7 estimate, probably because of the overestimated losses through evaporation, transpiration and groundwater recharge. The differences in runoff components between the PCDR- and gauge-based models were notable, especially for surface runoff (86.1 mm for gauge and 18.1 mm for PCDR) and base flow (79.9 mm for gauge and 110.0 mm for PCDR).
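To make the difference in runoff composition explicit, the figures quoted above can be converted into shares of total runoff. The sketch below is a worked calculation rather than output from the study: it assumes that total runoff is the sum of surface runoff, base flow and subsurface flow, so that subsurface flow is obtained as a residual, and it reads the total runoff values as 186.2 mm for the gauge-based model and 155.8 mm for the PCDR-based model. The variable and function names are illustrative.

```python
# Values in mm as quoted in the text; subsurface flow is not quoted and is
# derived below as a residual (assumption: total = surface + base + subsurface).
runoff_mm = {
    "gauge": {"total": 186.2, "surface": 86.1, "base": 79.9},
    "PCDR": {"total": 155.8, "surface": 18.1, "base": 110.0},
}


def runoff_shares(components):
    """Return each component as a percentage of total runoff, rounded to 0.1%."""
    total = components["total"]
    parts = {
        "surface": components["surface"],
        "base": components["base"],
        "subsurface": total - components["surface"] - components["base"],
    }
    return {name: round(100.0 * value / total, 1) for name, value in parts.items()}


shares = {source: runoff_shares(values) for source, values in runoff_mm.items()}
for source, result in shares.items():
    print(source, result)
```

Under these assumptions, surface runoff makes up a little under half of total runoff in the gauge-based simulation but only about one eighth in the PCDR-based simulation, where base flow accounts for roughly 70%.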

Note: P% stands for the composition ratio of precipitation; R% stands for the composition ratio of total runoff.
Remote Sens. 2020, 12, x; doi: FOR PEER REVIEW www.mdpi.com/journal/remotesensing

Annual and Monthly Runoff Distributions
The runoff distributions (i.e., monthly and annual) and components (i.e., surface flow, subsurface flow, and base flow) are of great concern in evaluating the water resources from different sources. Figure 9 displays the partitioning of annual total runoff into surface runoff, base flow and subsurface flow from 2002 to 2015. Generally, the two satellite datasets captured the overall trends of annual precipitation and total runoff well. Both the gauge- and 3B42V7-based results indicated that surface runoff was the dominant contributor to total runoff, followed by base flow, while PCDR led to an overestimation of base flow and an underestimation of surface runoff.
Figure 10 further displays the compositions of monthly runoff driven by gauge, 3B42V7 and PCDR data. The monthly variations of total runoff were consistent among the three simulations, with more than two thirds of total runoff occurring in July and August. In general, 3B42V7 generated a contribution pattern similar to that of the gauge-based simulation; that is, surface runoff was the dominant component from January to August, and subsurface flow and base flow became the major contributors from September to December. In contrast, PCDR was not able to reproduce the intra-annual runoff composition: it tended to overestimate base flow for most of the period from March to November and to underestimate the role of surface flow.

Conclusions
This study assessed the accuracy of two satellite precipitation products, i.e., 3B42V7 and PCDR, and their utility in driving hydrological simulation in a semi-humid region, the Biliu River Reservoir Watershed (China).
Results indicated that 3B42V7 generally outperformed PCDR in estimating the gauge precipitation. The performance of both satellite products improved with temporal and spatial aggregation, from the daily to the monthly scale and from the grid to the watershed scale. The abilities of the two satellite datasets to detect precipitation events were comparable. When precipitation was divided into different intensities, 3B42V7 was superior in determining rainfall occurrence under all intensities, whereas PCDR overestimated light rainfall (1-10 mm) and underestimated moderate (10-25 mm) and heavy rainfall (>25 mm). Evaluation indices for different grids showed better agreement between the daily satellite estimates and gauge precipitation in a data-rich grid. Therefore, it is worthwhile to densify the precipitation measurement network with additional stations, which would increase the amount of data available for future validation.
R², NSE, and Bias were used to evaluate the monthly hydrological model performance. The 3B42V7-based simulation showed better agreement between the simulated and observed streamflow than the simulation using PCDR. Specifically, the gauge data were well suited for simulating the high and moderate flows, and the 3B42V7 data were also applicable to high flow simulation. However, neither 3B42V7 nor PCDR was capable of modeling the moderate and low flow conditions. Further, water balance analysis indicated that the 3B42V7 data and gauge observations yielded very similar water balance and runoff compositions, while PCDR overestimated the base flow and underestimated the surface flow most of the time. Overall, the results of this study suggest the adequacy and applicability of 3B42V7 in describing hydrological features in the study area, which is vital for water resources planning and management where gauge-based observations are scarce.
To date, more than 30 global precipitation datasets, including gauge-based, reanalysis, and satellite data, are available as alternative precipitation data sources [57]. Although many of them have shown great potential in real hydrological applications, there is still much room to improve the accuracy of satellite products for various hydrologic purposes, such as flood forecasting, drought prevention, and water supply. Therefore, further in-depth work on satellite precipitation correction and validation is needed over different parts of the world, so that decision-makers can identify alternative precipitation sources for water resources planning and management.