Assessing the Uncertainties of Four Precipitation Products for SWAT Modeling in Mekong River Basin

Using hydrological simulation to evaluate the accuracy of satellite-based and reanalysis precipitation products always suffers from large uncertainty. This study evaluates four widely used global precipitation products with high spatial and temporal resolutions [i.e., AgMERRA (AgMIP Modern-Era Retrospective Analysis for Research and Applications), MSWEP (Multi-Source Weighted-Ensemble Precipitation), PERSIANN-CDR (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record), and TMPA (Tropical Rainfall Measuring Mission 3B42 Version 7)] against gauge observations with six statistical metrics over the Mekong River Basin (MRB). Furthermore, the Soil and Water Assessment Tool (SWAT), a widely used semi-distributed hydrological model, is calibrated using different precipitation inputs. Both model performance and the uncertainties of parameters and predictions are quantified. The following findings were obtained: (1) The MSWEP and TMPA precipitation products have good accuracy, with higher CC and POD and lower ME and RMSE, and the AgMERRA precipitation estimates perform better than PERSIANN-CDR in this ranking; (2) in each of the six climate regions of the MRB, all six metrics are worse than those for the whole MRB. AgMERRA better reproduces the occurrence and contributions at different precipitation intensities, and MSWEP has the best performance in the Cwb, Cwa, Aw, and Am regions, which lie in the low latitudes; (3) daily streamflow predictions obtained using MSWEP precipitation estimates are better than those simulated with the other three products in terms of both model performance and parameter uncertainties; and (4) although MSWEP better captures precipitation at different intensities in different climatic regions, its performance can still be improved, especially in regions at higher altitude.


Introduction
Hydrological models are vital tools for water resources management and allocation policy development, and for assessing the impact of climate change and human activities (e.g., land use change and dam construction) on water yield [1][2][3]. With the development of computer technology, more and more hydrological models have been developed and applied in river basins all over the world [4,5]. In general, these models can be divided into two types, namely the lumped model However, there are several necessary factors worth further analyzing. Firstly, existing studies often evaluate the performance of precipitation products by treating the study area as a whole or by simply dividing it into several sub-regions according to the location of hydrological stations. Few studies have analyzed the performance of satellite/reanalysis precipitation estimates in different climatic zones over the Mekong River Basin. Secondly, previous studies mainly focus on the capacity of precipitation products to reconstruct the streamflow process compared with gauge observations, for example by calibrating the model with each precipitation product or by using parameters calibrated with gauge observations together with different precipitation inputs. Questions remain, however, about the uncertainty of model parameters under different precipitation inputs. Thirdly, most previous studies have found that the APHRODITE precipitation estimates, developed by the Japan Meteorological Agency (JMA), represent the temporal and spatial characteristics of precipitation in the Mekong River Basin well. However, APHRODITE only provides precipitation estimates until 2007, which hinders the analysis of the temporal and spatial evolution of precipitation over the past 10 years in the Mekong River Basin.
Furthermore, considering that the Mekong River flows through six countries and has complex climate and topographical features, a long sequence of more accurate precipitation estimates is urgently needed.
Therefore, the objectives of this study are as follows: (1) to evaluate the accuracy of satellite/reanalysis precipitation products over the whole MRB and in different climatic zones using multiple statistical metrics; (2) to investigate the impact of different precipitation inputs on SWAT modeling at Stung Treng station; and (3) to estimate parameter and streamflow simulation uncertainties using the four precipitation inputs.

Study Area
The Mekong River Basin (MRB) is located in Southeast Asia and has a drainage area of 795,000 km² (Figure 1). The Mekong is the tenth longest river in the world and the largest cross-border river in Asia. It originates on the Qinghai-Tibet Plateau of China and flows from north to south through six countries (i.e., China, Laos, Myanmar, Thailand, Vietnam, and Cambodia), finally flowing into the South China Sea near Ho Chi Minh City in southern Vietnam [32,33]. The upper reaches of the MRB (UMB), also known as the LanCang River, are located in China; with a river length of 2139 km, they flow through Qinghai Province, the Tibet Autonomous Region, and Yunnan Province and drain an area of ~195,000 km² (~24% of the entire MRB). The lower reaches of the MRB (LRB) flow through the other five countries, with a river length of ~2770 km and a drainage area of ~600,000 km² (~76% of the entire MRB). The MRB contains seven broad physiographic regions featuring diverse topography, drainage patterns, and geomorphology: the UMB contains the Qinghai-Tibetan Plateau, the Three Rivers Area, and the LanCang River Basin, while the Northern Highlands, Khorat Plateau, Tonle Sap Lake, and Mekong Delta make up the LRB.
The climate of the MRB ranges from temperate to tropical. Some of the high mountains in the URB are covered by permanent glaciers and snow, and flow in the dry season of the MRB is mainly fed by snow and glacier meltwater, especially in the middle reaches. The climate of the LRB is mostly tropical, with more precipitation and higher temperatures. Average annual precipitation of the MRB ranges from as little as 600 mm in the URB to more than 3000 mm in the LRB. Influenced primarily by southeast monsoons, streamflow of the MRB has distinct flood and dry seasons: the flood season, extending from June to November, accounts for 80% to 90% of the annual flow, while the dry season accounts for the remaining 10% to 20%. Average annual streamflow of the MRB is 13,300 m³/s (at Stung Treng station from 1910 to 2012). The sub-figure in the lower left corner of Figure 1 shows the climate classification of the MRB, which was downloaded from http://koeppen-geiger.vu-wien.ac.at/present.htm. This climate classification method is based on recent data from the Climatic Research Unit (CRU) of the University of East Anglia and the Global
Figure 1. Locations of the Mekong River Basin (MRB) and the meteorological and hydrological stations. The sub-figure at the bottom left is the climate region classification of the MRB from the World Maps of Köppen-Geiger Climate Classification (Notation: A, C, D, and E mean equatorial, warm temperate, snow, and polar climates, respectively; m and w mean monsoonal and winter-dry precipitation regimes; a, b, c, and T mean hot summer, warm summer, cool summer, and polar tundra temperature features).

Geographical Data
The DEM (digital elevation model) data with 90 m spatial resolution were collected from the National Aeronautics and Space Administration Shuttle Radar Topography Mission (NASA-SRTM) at http://srtm.csi.cgiar.org/SELECTION/inputCoord.asp. The soil data used in this study are from the Harmonized World Soil Database v 1.2 with 1 km spatial resolution, downloaded from the Food and Agriculture Organization of the United Nations; this soil dataset contains two layers, namely topsoil and subsoil [35]. The land use data with 1 km spatial resolution were downloaded from Global Land Cover 2000 (http://bioval.jrc.ec.europa.eu/products/glc2000/products.php).

Precipitation Products
In this study, five precipitation datasets covering 1998-2007 were used for SWAT modeling (Table 1). These precipitation data can be divided into three categories based on their sources: (1) observed precipitation data from gauge observations, which were downloaded from the China Meteorological Data Sharing Service System (CMDSS; http://data.cma.cn/) and the Mekong River Commission (MRC; http://www.mrcmekong.org/); (2) one reanalysis precipitation product, AgMERRA [36], which was selected because it also provides all the other meteorological data needed for SWAT modeling, namely daily wind speed, maximum/minimum temperature, relative humidity, and solar radiation; and (3) three satellite precipitation products: MSWEP [27], which represents one of the latest remote sensing fusion datasets; PERSIANN-CDR [37], which has been widely shown to perform well in multiple watersheds around the world [13,15]; and TMPA, from which many other precipitation products have been derived [38]. The in situ gauge precipitation data for the URB were collected from the CMDSS and comprise daily precipitation for the period 1998 to 2007. The other daily precipitation data were obtained from the MRC historical observation dataset. Stations with one year or more of continuously missing data were removed, and the inverse distance weighted interpolation method was used to fill in the remaining missing data. Finally, as shown in Figure 1, data from 82 precipitation stations were used as a baseline to evaluate the four reanalysis/satellite-based precipitation products. Daily streamflow data were obtained from one station on the main stream of the MRB, namely Stung Treng. The control area of Stung Treng station is 635,000 km² [39].
This station was chosen because streamflow downstream of Stung Treng is significantly affected by storm surges, tides, and sea level rise, as well as by complex flood propagation due to the delta system and the reserved flow of the Tonle Sap Lake [40].
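The gap-filling step mentioned in this section can be illustrated with a minimal sketch of inverse distance weighted interpolation. The function name `idw_fill`, the exponent `power=2`, the use of planar Euclidean distance on station coordinates, and the use of all available neighbours are assumptions made for illustration; the paper does not state which settings were used.

```python
import numpy as np

def idw_fill(target_xy, neighbor_xy, neighbor_values, power=2.0):
    """Estimate a missing daily value at one station from neighbouring stations.

    target_xy: (x, y) of the station with the missing value.
    neighbor_xy: array of shape (n, 2) with neighbour coordinates.
    neighbor_values: length-n array of same-day values (NaN where missing).
    power: distance exponent (assumed value, not taken from the paper).
    """
    target_xy = np.asarray(target_xy, dtype=float)
    neighbor_xy = np.asarray(neighbor_xy, dtype=float)
    values = np.asarray(neighbor_values, dtype=float)

    valid = ~np.isnan(values)          # ignore neighbours that are also missing
    if not valid.any():
        return float("nan")

    dist = np.linalg.norm(neighbor_xy[valid] - target_xy, axis=1)
    if np.any(dist == 0.0):            # co-located station: use its value directly
        return float(values[valid][dist == 0.0][0])

    weights = 1.0 / dist**power        # closer stations get larger weights
    return float(np.sum(weights * values[valid]) / np.sum(weights))
```

For example, a station at the origin with neighbours at distances 1 and 2 reporting 10 mm and 20 mm would be filled with (1 × 10 + 0.25 × 20) / 1.25 = 12 mm under these assumptions.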

AgMERRA Dataset
The AgMERRA climate forcing datasets were developed as part of the Agricultural Model Intercomparison and Improvement Project (AgMIP) to provide daily time series for the period 1980-2010 at the global scale [41]. The AgMERRA products provide daily precipitation, maximum/minimum temperature, relative humidity, and solar radiation at a spatial resolution of 0.25°; the variables other than precipitation are additional meteorological data that are useful for the SWAT model in the MRB. The MERRA-Land product has been incorporated into AgMERRA, which has substantially improved the resolution of the daily precipitation distribution and extreme precipitation over other datasets, and AgMERRA also uses the NASA/GEWEX solar radiation budget data to improve the solar radiation values. More details about the AgMERRA algorithms can be found in Reference [42] and Ruane, Goldberg and Chryssanthacopoulos [41]. The AgMERRA daily precipitation, maximum/minimum temperature, relative humidity, wind speed, and solar radiation were obtained from the National Aeronautics and Space Administration at https://data.giss.nasa.gov/impacts/agmipcf/agmerra/. In the later model evaluation and uncertainty analysis, only the precipitation input was changed, while the other meteorological data remained unchanged.
Remote Sens. 2019, 11, 304

MSWEP Dataset
The Multi-Source Weighted-Ensemble Precipitation (MSWEP) is a new precipitation product that merges gauge observations with satellite and reanalysis precipitation. It was developed by Beck et al. [27] at the European Commission (Italy) and has a spatial resolution of 0.1°/0.25° with global coverage from 1979 to present. The latest product version is MSWEP 2.1, which is based on the recently released CHPclim dataset (Climate Hazards Group Precipitation Climatology) but uses more accurate regional datasets instead where available [27,43]. A correction was applied that infers catchment-average precipitation from mean streamflow observations at 13,762 stations to remove gauge under-catch and orographic effects. The variability of MSWEP precipitation was determined by a weighted average of seven precipitation products: two are atmospheric model reanalysis products (JRA-55 and ERA-Interim), three are satellite-based products (CMORPH, GSMaP-MVK, and TMPA), and the other two are based solely on interpolation of gauge observations [27]. Finally, the MSWEP precipitation product was adjusted against four state-of-the-art gauge-based datasets, namely GPCP-1DD, WFDEI-CRU, CPC Unified, and TMPA 3B42.

PERSIANN-CDR Dataset
The Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (hereafter abbreviated as PERSIANN-CDR) is a multi-satellite precipitation product with high spatial and temporal resolution. It was developed by applying the PERSIANN algorithm to GridSat-B1 infrared satellite data, with the NCEP stage IV hourly precipitation data used to train the artificial neural network [15,37]. PERSIANN-CDR provides long-term (from 1983 to the delayed present) daily precipitation estimates at 0.25° spatial resolution. The estimates are finally corrected using the 2.5° monthly product of GPCP (Global Precipitation Climatology Project), which contains gauge information; as a result, the PERSIANN-CDR precipitation estimates are consistent with the GPCP 2.5° monthly data [37].

TMPA Dataset
The Tropical Rainfall Measuring Mission Multi-Satellite Precipitation Analysis 3B42 Research Version 7 product (hereafter abbreviated to TMPA) was used as the last satellite precipitation product in this study. The TMPA dataset was originally developed by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA) at high spatial (0.25°) and temporal (3 h) resolution. On a global scale, this dataset provides precipitation estimates between 50°S and 50°N from 1998 to October 2014, combining satellite remote sensing data from various sensors with Global Precipitation Climatology Centre (GPCC) monthly gauge observations. Numerous published papers have evaluated the performance of TMPA by direct comparison with gauge observations or through hydrological modeling [15,43,44]. However, relatively few studies have considered the uncertainty of TMPA in hydrological modeling [45].

Methodology
As described in Section 2.1, the MRB flows through six climatic zones, and precipitation has different temporal and spatial characteristics in different climatic regions [46]. Therefore, the evaluation of precipitation products was conducted in the six different climatic zones. Firstly, the 82 pixels that contain at least one gauge were used to evaluate the performance of the four precipitation products. The daily gauge observations in each pixel were taken as the baseline for comparison with the precipitation products. The comparison was conducted using statistical methods and SWAT modeling that considers parameter uncertainties [47,48].
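The pixel-gauge pairing described above can be sketched as follows. This is only an illustration: the helper names, the 0.25° cell size, and the grid origin at (-90°, -180°) are assumptions, and each real product defines its own grid origin and cell-centre convention, which would need to be checked before use.

```python
import math
from collections import defaultdict

def pixel_index(lat, lon, resolution=0.25, lat0=-90.0, lon0=-180.0):
    """Return the (row, col) of the grid cell that contains a point.

    lat0/lon0 are the assumed south-west corner of the grid (hypothetical).
    """
    row = math.floor((lat - lat0) / resolution)
    col = math.floor((lon - lon0) / resolution)
    return row, col

def gauges_per_pixel(gauges, resolution=0.25):
    """Group gauges by the pixel they fall in.

    gauges: iterable of (gauge_id, lat, lon) tuples.
    Returns a dict mapping (row, col) to a list of gauge ids, so pixels
    with at least one gauge are exactly the keys of the result.
    """
    pixels = defaultdict(list)
    for gauge_id, lat, lon in gauges:
        pixels[pixel_index(lat, lon, resolution)].append(gauge_id)
    return dict(pixels)
```

When several gauges fall in the same pixel, their daily values could be averaged before comparison with the pixel estimate; whether the study did so is not stated in the text.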

Statistical Methods
To quantitatively evaluate the accuracy of the four precipitation products (i.e., AgMERRA, MSWEP, PERSIANN-CDR, and TMPA) compared with the gauge observations, the Correlation Coefficient (CC), Root Mean Square Error (RMSE), and Relative Error (RE) were used in this study. The detailed meaning and calculation formulas of these three indicators can be found in Table 2 and other literature [15,18,49]. To evaluate the accuracy of the precipitation products at different intensity levels, we divided the daily precipitation into seven classes (i.e., 0, 0~1, 1~5, 5~10, 10~15, 15~20, >20 mm/day). It should be noted that the assessment of precipitation in different climatic regions was based on the averages of the gauge observations and of each precipitation product in each region.

Table 2 (columns: Statistic Metric, Equation, Perfect Value; first row: Mean error (ME)).
(Notation: N represents the number of samples; S_n represents the satellite/reanalysis precipitation estimate or simulated streamflow; G_n represents the gauge observed precipitation or streamflow; σ_S represents the standard deviation of satellite/reanalysis precipitation or simulated streamflow; σ_G represents the standard deviation of gauge observed precipitation or streamflow; n_11 represents precipitation detected by both the gauge and the satellite product; n_01 represents precipitation detected by the gauge but not by the satellite precipitation product; n_10 is the opposite of n_01; S̄ represents the mean value of satellite/reanalysis precipitation estimates or simulated streamflow; and Ḡ represents the mean value of gauge observed precipitation or streamflow).
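Because the equations of Table 2 did not survive in this copy, the following sketch uses the commonly used textbook definitions of the six precipitation metrics, written with the notation above (S for the product, G for the gauge, and the contingency counts n_11, n_10, n_01). The rain/no-rain threshold of 0.1 mm/day and the function name are assumptions; the paper's own formulas may differ in detail.

```python
import numpy as np

def precipitation_metrics(sat, gauge, threshold=0.1):
    """Compute ME, RMSE, CC, POD, FAR, and CSI for paired daily series.

    sat, gauge: equal-length sequences of daily precipitation (mm/day).
    threshold: rain/no-rain cutoff in mm/day (assumed, not from the paper).
    """
    s = np.asarray(sat, dtype=float)
    g = np.asarray(gauge, dtype=float)

    me = np.mean(s - g)                    # positive ME means overestimation
    rmse = np.sqrt(np.mean((s - g) ** 2))
    cc = np.corrcoef(s, g)[0, 1]           # Pearson correlation coefficient

    s_rain = s >= threshold
    g_rain = g >= threshold
    n11 = int(np.sum(s_rain & g_rain))     # detected by both (hits)
    n10 = int(np.sum(s_rain & ~g_rain))    # product only (false alarms)
    n01 = int(np.sum(~s_rain & g_rain))    # gauge only (misses)

    def ratio(num, den):
        # Return NaN instead of dividing by zero when a category is empty.
        return num / den if den > 0 else float("nan")

    return {
        "ME": float(me),
        "RMSE": float(rmse),
        "CC": float(cc),
        "POD": ratio(n11, n11 + n01),      # probability of detection
        "FAR": ratio(n10, n11 + n10),      # false alarm ratio
        "CSI": ratio(n11, n11 + n10 + n01) # critical success index
    }
```

The sign convention for ME (product minus gauge) is chosen so that a positive value indicates overestimation, in line with how ME is interpreted in the results.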

Model Description
The Soil and Water Assessment Tool (SWAT) is used in this study because it has been successfully applied all over the world. The model is a comprehensive, continuous-time, process-based, semi-distributed macroscale model with a large number of parameters to be calibrated [50]. It was proposed by the Agricultural Research Service of the United States Department of Agriculture (USDA-ARS) to evaluate the effects of water management policies and non-point source pollution [6,45], and it has since been improved by scholars from all over the world, resulting in many versions of the model. It can be used to simulate the variability of hydrological processes, vegetation growth, erosion, agricultural non-point source pollution, and water resources management [1,15]. SWAT discretizes the study basin into several sub-basins based on the DEM and slope data, and each sub-basin is subsequently divided into multiple HRUs (Hydrologic Response Units) according to land use, soil characteristics, and topography. The internal hydrological cycle of the model is calculated from the water and energy balance, which is driven by meteorological inputs such as precipitation, maximum/minimum temperature, solar radiation, wind speed, and relative humidity at different time steps. The quantities of water, contaminants, and sediment are calculated for each sub-basin from the daily/sub-daily inputs, and these loads are then routed through the main stream and reservoirs within the study basin using the Muskingum method. More theory about the model can be found in other literature [15,45,51] and the official model manual [8].
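To make the routing step more concrete, here is a minimal sketch of the textbook Muskingum scheme for a single reach. It is not SWAT's implementation (SWAT derives the storage constant from channel properties internally); the function name and the choice to start from steady state are assumptions for illustration.

```python
def muskingum_route(inflow, k, x, dt, initial_outflow=None):
    """Route an inflow hydrograph through one reach with the Muskingum method.

    inflow: sequence of inflows at a fixed time step.
    k: storage time constant, in the same time unit as dt.
    x: weighting factor (typically between 0 and 0.5).
    dt: time step.
    initial_outflow: outflow at the first step; defaults to the first inflow,
        i.e. the reach is assumed to start at steady state.
    """
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom   # c0 + c1 + c2 equals 1

    outflow = [inflow[0] if initial_outflow is None else initial_outflow]
    for i in range(1, len(inflow)):
        # O_i = c0 * I_i + c1 * I_(i-1) + c2 * O_(i-1)
        outflow.append(c0 * inflow[i] + c1 * inflow[i - 1] + c2 * outflow[-1])
    return outflow
```

Because the three coefficients sum to one, a constant inflow is passed through unchanged, while a flood peak is delayed and attenuated.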

Model Setup
SWAT2012 (developed by USDA-ARS, located in Maryland, United States) coupled with ArcGIS Version 10.2 (developed by the Environmental Systems Research Institute, located in California, United States) was used to set up the model for the selected hydrometric station, Stung Treng. The MRB above Stung Treng was divided into 397 sub-basins and further into 1499 HRUs based on the soil and land use datasets. The slope categories used in the HRU definition are 0~5%, 5~10%, 10~15%, and >15%. The soil data were reclassified into 16 types and the land use data into five types according to the database provided with the SWAT model. The detailed input data of the model are shown in Table 1. Daily gauge-based precipitation data were obtained from 82 rain gauges (18 located in China and the others in Laos, Myanmar, Thailand, Cambodia, and Vietnam; see Figure 1) for the period 1998 to 2007. To account for the impact of altitude on rainfall and temperature in the upper alpine region, five elevation bands were applied in the modeling; these modify the regional precipitation and temperature by weighting the elevation differences between bands [50]. More information about the elevation bands in the SWAT model can be found in other literature [45,51] and the official SWAT model documentation [8].

Model Calibration and Uncertainty Analysis
In this study, the first two years (1998 to 1999) were used as the warm-up period to reduce the impact of the initial state of the model. For each model simulation, the daily simulated streamflow at the selected hydrometric station was calibrated individually over the period 2000 to 2003 and then validated over the period 2004 to 2007, based on the daily observed streamflow at Stung Treng (Figure 1). The model was calibrated using a separate software package named SWAT-CUP (SWAT Calibration and Uncertainty Program) [50]. Five uncertainty methods are provided in SWAT-CUP (i.e., SUFI-2, GLUE, Parameter Solution, MCMC, and PSO) [52]. SUFI-2 was used to calibrate the SWAT model in this study because of its computational efficiency and simplicity in data preparation. It has also been found to be suited to global optimization and uncertainty analysis [53]. Before calibration, a parameter sensitivity analysis was conducted using the one-at-a-time and global sensitivity analysis (500 runs) methods built into SWAT-CUP for 24 commonly sensitive parameters identified in the literature [15,50]. After the sensitivity analysis, SWAT calibration was started from the initial ranges of the more sensitive parameters, which are shown in Table 3, and the model was calibrated over three iterations. The upper and lower bounds of the parameters were constrained to be physically reasonable based on the literature [45,52] and the official SWAT model documentation [8]. Two thousand simulations were run in each iteration; after each iteration, SUFI-2 provided a new set of parameter ranges (normally narrowed), on which the next iteration was based. Detailed information about SUFI-2 and the protocol for calibrating SWAT can be found in Abbaspour et al. [1] and Tuo et al. [45].
Table 3. Initial ranges of the 10 more sensitive calibrated parameters and their final optimal values for five precipitation inputs (Gauge, AgMERRA, MSWEP, PERSIANN-CDR, and TMPA).
There are multiple metrics in SWAT-CUP for evaluating model performance. This study followed the approach suggested by Abbaspour et al. [1]. The 95PPU (95% prediction uncertainty) of the outputs, which accounts for the parameter uncertainty, was used to evaluate the performance of SWAT. The Nash-Sutcliffe efficiency coefficient (NSE) [54], the coefficient of determination (R²), and the percent bias (PBIAS) were used as indicators of the best simulation among the 2000 simulations. The NSE measures the magnitude of the difference between the observed and simulated streamflow, with NSE = 1 indicating the best simulation result, while R² represents the similarity of the trends in the simulated and observed streamflow, with R² = 1 indicating the best simulation result. The PBIAS estimates the average tendency of the simulated streamflow; its optimum value is 0, and positive values indicate model underestimation and vice versa. The formulas for NSE, R², and PBIAS are listed in Table 2.
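The three goodness-of-fit indicators described above can be sketched with their usual definitions. The function names are illustrative, and the PBIAS sign convention (observed minus simulated in the numerator) is chosen to match the statement that positive values indicate underestimation; the exact formulas of Table 2 are not reproduced in this copy.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, values <= 0 are no better than the mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    r = np.corrcoef(np.asarray(obs, dtype=float), np.asarray(sim, dtype=float))[0, 1]
    return float(r ** 2)

def pbias(obs, sim):
    """Percent bias; positive means the simulation underestimates on average."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(100.0 * np.sum(obs - sim) / np.sum(obs))
```

Note that R² only measures how well the simulated and observed series co-vary: a simulation that is uniformly 10% too low still has R² = 1, which is why NSE and PBIAS are reported alongside it.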

Parameters
As stated in Abbaspour et al. [1] and Tuo et al. [45], parameter uncertainties are described as the final ranges of the sensitive parameters with which the SWAT model achieves a satisfactory result [55,56]. The output uncertainties, which are propagated from the parameter uncertainties using the Latin hypercube sampling method, can then be calculated at the 2.5% and 97.5% levels of the distribution of a specific variable. In this study, the uncertainty analysis of SUFI-2 is based on a Bayesian framework through the sequential and fitting process, which is represented by an envelope of good simulations defined by the 95PPU. To quantify the degree of the prediction uncertainties, two indices are used in this study, namely the P-factor and the R-factor [57]. The P-factor is the fraction of the observed variable bracketed by the 95PPU; it ranges from zero to one, where one is the best result and means that 100% of the observations fall within the model simulation uncertainty. The R-factor is the ratio of the mean thickness of the 95PPU to the standard deviation of the observed variable; it ranges from zero to infinity, with zero as the optimal value. A P-factor of one together with an R-factor of zero means a perfect simulation that matches the observed variable, which is difficult to achieve because of uncertainties from multiple sources and measurement errors [52]. As suggested by Abbaspour et al. [1], P-factor > 0.7 and R-factor < 1.5 are considered acceptable simulation results for streamflow. The P-factor and R-factor can be calculated as follows:
$$\text{P-factor} = \frac{n_{in}}{N}$$
$$\text{R-factor} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(q_{u,i} - q_{l,i}\right)}{\sigma_q}$$
where $N$ is the total number of observations, $n_{in}$ is the number of observations bracketed by the 95PPU, $q_{u,i}$ and $q_{l,i}$ are the upper and lower bounds of the 95PPU of the simulated data at time step $i$, and $\sigma_q$ is the standard deviation of the observed variable $q$.
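The chain from parameter ranges to the two uncertainty indices can be sketched as follows. This is a simplified illustration rather than the SUFI-2 algorithm as implemented in SWAT-CUP: the function names are hypothetical, the band is taken directly from the 2.5th and 97.5th percentiles of an ensemble of simulations, and the population standard deviation (NumPy's default) is assumed for σ_q.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng):
    """Draw a Latin hypercube sample inside the given parameter ranges.

    bounds: sequence of (low, high) pairs, one per parameter.
    Each parameter range is split into n_samples equal strata and exactly
    one value is drawn from each stratum, in random order.
    """
    bounds = np.asarray(bounds, dtype=float)
    n_params = bounds.shape[0]
    strata = np.arange(n_samples)[:, None]
    u = (strata + rng.random((n_samples, n_params))) / n_samples
    for j in range(n_params):
        u[:, j] = rng.permutation(u[:, j])   # decouple strata across parameters
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

def ppu_band(simulations, lower=2.5, upper=97.5):
    """Return the lower and upper 95PPU bounds at each time step.

    simulations: array of shape (n_simulations, n_time_steps).
    """
    sims = np.asarray(simulations, dtype=float)
    return np.percentile(sims, lower, axis=0), np.percentile(sims, upper, axis=0)

def p_factor(obs, q_l, q_u):
    """Fraction of observations that fall inside the 95PPU band."""
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((obs >= q_l) & (obs <= q_u)))

def r_factor(obs, q_l, q_u):
    """Mean band thickness divided by the standard deviation of the observations."""
    obs = np.asarray(obs, dtype=float)
    return float(np.mean(np.asarray(q_u) - np.asarray(q_l)) / np.std(obs))
```

In use, each row returned by `latin_hypercube` would be passed to the hydrological model, the resulting streamflow series stacked into `simulations`, and the band compared with the observed series; a wider band raises the P-factor but also the R-factor, which is the trade-off the two thresholds are meant to balance.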

Evaluation of the Performance of Four Precipitation Products

Evaluation in the Whole MRB
Figure 2 shows scatterplots comparing daily precipitation estimates from AgMERRA, MSWEP, PERSIANN-CDR, and TMPA with gauge observations for the MRB. As shown in Figure 2, daily MSWEP and TMPA agree well with gauge-observed precipitation over the whole MRB, and the other products are slightly inferior. A summary of the six metrics for the four precipitation products at daily resolution over the whole MRB is shown in the upper left corner of Figure 2. These indices were calculated based on the arithmetic mean of gauge observations and pixel estimates over the whole MRB. Generally, PERSIANN-CDR slightly overestimated daily precipitation (positive ME = 0.25), while the other three products tend to underestimate it. TMPA has the lowest ME and RMSE and the highest CC, with POD, FAR, and CSI of 0.93, 0.05, and 0.89, respectively, indicating that TMPA can capture most precipitation events over the whole MRB. Among the four precipitation products, PERSIANN-CDR has the lowest CC but the highest POD, which means that PERSIANN-CDR is the best at detecting the occurrence of rain events in the MRB.
The intensity distribution of daily precipitation at different magnitudes has been used in many articles to evaluate the performance of satellite/reanalysis precipitation products against gauge observations [18,58]. This method is suitable for evaluating the ability of satellite/reanalysis precipitation products to detect light and heavy precipitation in different climatic zones. The occurrence and contribution of daily precipitation at different intensities (i.e., 0~0.1, 0.1~1, 1~5, 5~10, 10~20, and >20 mm/day) were calculated over the pixels containing at least one gauge from 1998 to 2007 (Figure 3).
For the whole MRB, the intensity class with the highest occurrence is 1~5 mm/day, which accounted for 15.5% of total rainfall. AgMERRA slightly overestimated both the occurrence and the contribution in this class, while the other three products showed the opposite (except that MSWEP slightly overestimated the contribution). For daily precipitation in the range 0~0.1 mm, PERSIANN-CDR performed best in estimating the occurrence, whereas MSWEP showed an obvious overestimation. For daily precipitation ranging from 0.1 to 1 mm, all four products estimated both occurrence and contribution well. For daily precipitation ranging from 5 to 10 mm, PERSIANN-CDR is basically consistent with the gauge observations (with almost equal indicator values). For daily precipitation larger than 20 mm, all four products tend to overestimate the occurrence and contribution, with AgMERRA showing the largest overestimation. In conclusion, daily PERSIANN-CDR reproduces the occurrence and contribution structures of the various precipitation intensities better than the other three products, particularly for slight (0~0.1 mm/day) and heavy (10~20 mm/day) precipitation. For the occurrence of daily precipitation ranging from 0 to 0.1 mm, the latest product, MSWEP, shows a large overestimation, which should be noted by the dataset developers.
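The occurrence and contribution statistics discussed here can be computed with a short sketch like the one below. The bin edges follow the six intensity classes named in the text; treating each class as closed on the left and open on the right, and the function name, are assumptions for illustration.

```python
import numpy as np

# Interior edges of the six classes: 0~0.1, 0.1~1, 1~5, 5~10, 10~20, >20 mm/day
INTERIOR_EDGES = np.array([0.1, 1.0, 5.0, 10.0, 20.0])

def intensity_distribution(daily_precip, interior_edges=INTERIOR_EDGES):
    """Return (occurrence %, contribution %) for each intensity class.

    occurrence: share of days falling in each class.
    contribution: share of the total precipitation amount from each class.
    Boundary values are assigned to the higher class (assumed convention).
    """
    p = np.asarray(daily_precip, dtype=float)
    p = p[~np.isnan(p)]                       # drop missing days
    n_bins = len(interior_edges) + 1

    idx = np.searchsorted(interior_edges, p, side="right")
    counts = np.bincount(idx, minlength=n_bins)
    sums = np.bincount(idx, weights=p, minlength=n_bins)

    occurrence = 100.0 * counts / p.size
    total = p.sum()
    contribution = 100.0 * sums / total if total > 0 else np.zeros(n_bins)
    return occurrence, contribution
```

Applying the same function to the gauge series and to each product series at the same pixels gives directly comparable percentages, so over- or underestimation in a class appears as a difference between the two results.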
Figure 2. Scatterplots comparing daily precipitation from the satellite/reanalysis precipitation products with gauge observations for the Mekong River Basin. The black line is the diagonal line, and the blue line is the best-fit line obtained with the least squares method.

Figure 3. The intensity distribution of daily precipitation in different ranges (0~0.1, 0.1~1, 1~5, 5~10, 10~20, and >20 mm/day), and their relative contribution to the total precipitation amounts during the period 1998 to 2007 (Note: the intensity of 0~0.1 mm/day contributed less than 1% of rainfall, so it is not shown in the right corner of Figure 3).

Evaluation in Different Climatic Zones
Regional comparison over the six climatic zones was also conducted to characterize comprehensively the errors of AgMERRA, MSWEP, PERSIANN-CDR, and TMPA over the MRB. In general, all six metrics of the four precipitation products in the six climatic zones were worse than those for the whole MRB (Table 4). This indicates that the products perform differently in different climate zones, and that aggregating more sites when assessing precipitation may cancel out certain errors. MSWEP has the best metrics in ET, Dwc, Aw, and Am, corresponding to the upstream and downstream parts of the MRB, which have relatively dry and wet climates, respectively. The CC of MSWEP in these climatic zones was greater than 0.6, and the POD was greater than 0.79. For Cwb and Cwa, AgMERRA has the best performance, with higher CC, POD, and CSI. Most metrics of the four precipitation products are worst in Dwc compared with the other climatic regions. One possible reason is the special geographical location of the region, where the products may estimate snowfall inaccurately; another is that only one rainfall gauge is located in this climatic zone. ET and Dwc are the drier and colder of the six climatic zones; there, the four products overestimated precipitation (positive ME), except for AgMERRA in ET, while in the other four, relatively humid zones most of the products underestimated precipitation to varying degrees (negative ME). POD and CSI become larger as the climate becomes more humid, which means that the precipitation products detect the occurrence of precipitation more accurately in humid areas. PERSIANN-CDR and TMPA perform more poorly in ET, Dwc, Cwb, and Cwa than in the other two climatic zones.
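For readers who want to reproduce metrics of this kind, the sketch below computes the six statistics from paired daily series using their standard textbook definitions. The function and variable names are hypothetical, the 0.1 mm/day rain/no-rain threshold is an assumption (the threshold used in the study is not given in this section), and the input values are synthetic.

```python
import math

RAIN_THRESHOLD = 0.1  # mm/day; assumed rain/no-rain threshold


def _ratio(num, den):
    """Divide, returning NaN when the denominator is zero."""
    return num / den if den else float("nan")


def evaluate_product(product, gauge, threshold=RAIN_THRESHOLD):
    """Compute CC, ME, RMSE, POD, FAR, and CSI for paired daily series."""
    n = len(gauge)
    pairs = list(zip(product, gauge))
    mean_p = sum(product) / n
    mean_g = sum(gauge) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in pairs)
    var_p = sum((p - mean_p) ** 2 for p in product)
    var_g = sum((g - mean_g) ** 2 for g in gauge)

    # Contingency counts for rain detection.
    hits = sum(p >= threshold and g >= threshold for p, g in pairs)
    misses = sum(p < threshold and g >= threshold for p, g in pairs)
    false_alarms = sum(p >= threshold and g < threshold for p, g in pairs)

    return {
        "CC": _ratio(cov, math.sqrt(var_p * var_g)),
        "ME": mean_p - mean_g,  # positive means overestimation
        "RMSE": math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n),
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, hits + false_alarms),
        "CSI": _ratio(hits, hits + misses + false_alarms),
    }


# Synthetic example (mm/day): one miss and one false alarm.
metrics = evaluate_product(
    product=[0.5, 2.0, 4.0, 0.0, 6.0],
    gauge=[0.0, 1.0, 5.0, 2.0, 6.0],
)
```

With ME defined as product minus gauge, a positive value indicates overestimation and a negative value underestimation, which is the reading used in the paragraph above.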
As for the whole MRB, the intensity distribution of daily precipitation at different thresholds, as well as the relative contributions to total precipitation, was analyzed in the six climatic regions to reveal the error characteristics of AgMERRA, MSWEP, PERSIANN-CDR, and TMPA (Figure 4). As shown in Figure 4, the products perform best in different climate regions because of their distinct data sources and bias-correction algorithms. For daily precipitation of 0~0.1 mm, AgMERRA showed the best agreement with the gauge observations over the six climate regions in both occurrence and contribution. In the ET region, which represents a cold climate, AgMERRA estimated precipitation at each intensity more accurately than the other three products. MSWEP and PERSIANN-CDR underestimated the occurrence of 0.1~1 mm/day but greatly overestimated the contributions of heavier precipitation (10~20 mm/day and >20 mm/day), while TMPA slightly underestimated the occurrence of 0.1~1 mm/day and slightly overestimated the contribution of >25 mm/day. In the Dwc region, all four products underestimated the occurrence of 0~0.1 mm/day; for intensities greater than 20 mm/day, MSWEP greatly overestimated both the occurrence and the contribution compared with the gauge observations. It should be noted that, although PERSIANN-CDR and TMPA estimate the occurrence and contribution of precipitation at different intensities more accurately, they have relatively low correlation coefficients with the gauge observations (Table 4), which may be related to errors in the timing of precipitation events.
In the Cwb region, apart from an overestimation of the occurrence of 0.1~1 mm/day, the products are relatively accurate in estimating daily precipitation at different intensities. In the Cwa region, the four products give relatively good estimates of both occurrence and contribution at different intensities. In the Aw and Am regions, which are also humid, all four products show larger contributions than the gauge observations at intensities greater than 20 mm/day, except PERSIANN-CDR in Aw. At 10~20 mm/day, AgMERRA, MSWEP, and TMPA slightly underestimate the occurrence and contribution, while PERSIANN-CDR performs better.

Model Calibration and Validation
The SWAT model was calibrated with each of the five precipitation inputs (gauge observations, AgMERRA, MSWEP, PERSIANN-CDR, and TMPA) against the daily streamflow from 1998 to 2003, with the first two years used for model warm-up [57]. The NSE, R², and PBIAS were used to evaluate the model performance and are listed in Table 5. Table 5 shows that all five precipitation inputs can reproduce the daily streamflow in the MRB in both the calibration and validation periods; TMPA in the calibration period and the gauge observations in the validation period perform best, with NSE and R² of 0.94. For the gauge-driven simulation, the NSE reaches 0.92 and 0.94 in the calibration and validation periods, respectively. This result indicates that the SWAT model has a good capacity to simulate the daily streamflow of the MRB, which also provides a basis for our subsequent research. The PBIAS values of the simulations driven by the four precipitation products are larger than those of the gauge-driven simulation, which indicates a larger uncertainty in the streamflow simulated with the four products; AgMERRA has the largest PBIAS (14.1%). The simulation results of the five precipitation inputs (including the gauge observations) for different streamflow percentiles were also analyzed. Figure 5 shows box plots of daily observed and simulated streamflow during the calibration and validation periods. In the left panel of Figure 5, the gauge precipitation simulates high streamflow best, while the four precipitation products tend to underestimate it substantially. For low streamflow, the gauge-driven simulation is slightly too high, while the AgMERRA and TMPA simulations are significantly lower than the observed streamflow.
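The three performance measures can be illustrated with a short sketch. The definitions below are the commonly used ones and are assumptions about this study: R² is taken as the squared Pearson correlation, and PBIAS follows the convention in which positive values indicate that the simulation underestimates the observations. The streamflow values are synthetic.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def r_squared(obs, sim):
    """Coefficient of determination, taken here as the squared Pearson correlation."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_o = sum((o - mean_obs) ** 2 for o in obs)
    var_s = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def pbias(obs, sim):
    """Percent bias; positive values mean the simulation underestimates (assumed convention)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


# Synthetic daily streamflow (m^3/s): the simulation is 10% too low everywhere.
observed = [10, 20, 30, 40, 50]
simulated = [9, 18, 27, 36, 45]
scores = {
    "NSE": nse(observed, simulated),
    "R2": r_squared(observed, simulated),
    "PBIAS": pbias(observed, simulated),
}
```

The example shows why the three measures are reported together: a simulation that is uniformly 10% too low keeps R² at 1 and a high NSE, and only PBIAS exposes the systematic volume error.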
Considering the simulated results across the different streamflow percentiles, MSWEP showed the best performance among the precipitation products. In the validation period, the AgMERRA and TMPA simulations also underestimated low streamflow, while the gauge-driven simulation overestimated it.
Table 6 shows the P and R values, which represent the uncertainty of multiple simulations. In terms of P, the streamflow simulated with all four precipitation products reaches good results; MSWEP performs best, with P equal to 0.99, meaning that 99% of the observed streamflow data were enveloped by the 95PPU. In terms of the thickness of the 95PPU (R values), PERSIANN-CDR performs best among the four products. In general, all four precipitation products achieve good results in terms of prediction uncertainty according to the criteria suggested by Abbaspour et al. [1].

Parameters and Prediction Uncertainties
It can be seen from the above that, although the model simulations achieved good results, the rainfall products perform differently in different climate regions, which implies large uncertainties in the parameters of the model simulation process. Figure 6 shows the 6 more sensitive of the 10 parameters we calibrated. Among these, CN2 has been confirmed by many studies as the most sensitive parameter in different study areas, and it mainly controls the production of surface runoff [1,9,51]. Different precipitation inputs result in different CN2 ranges: the range most similar to that of the gauge simulation is obtained with MSWEP (0.03 to 0.09 vs 0.01 to 0.17), while the greatest difference is observed for AgMERRA (−0.44 to 0.12). ALPHA_BNK is the baseflow alpha factor for bank storage, and it displays a larger variability than CN2 in both the best simulation values and the parameter ranges for the four precipitation inputs. SOL_BD, the moist bulk density of the two soil layers, shows a smaller variability than CN2 and ALPHA_BNK, except for AgMERRA. Therefore, SOL_BD in the MSWEP, PERSIANN-CDR, and TMPA simulations is less affected by the change in precipitation input, whereas for AgMERRA the soil runoff process is quite different from that of the gauge simulation. CH_K2 is a very important parameter related to the effective hydraulic conductivity in the main channel alluvium. Different precipitation products result in different best CH_K2 values and ranges (67.8, 121.0, 142.3, 164.9, and 236.5 for gauge, AgMERRA, MSWEP, PERSIANN-CDR, and TMPA, respectively). GW_REVAP is responsible for water exchange between shallow aquifers and vegetation roots. The GW_REVAP values of AgMERRA and MSWEP are closest to the gauge-calibrated value in terms of both the optimal values and the ranges.
ESCO is the soil evaporation compensation factor, which accounts for the evapotranspiration process from soil moisture to the atmosphere. The optimal ESCO value for PERSIANN-CDR is much lower than that of the gauge simulation, which means that the simulated evapotranspiration is greater (Table 3).
In conclusion, considering the uncertainty of the model parameters, the runoff simulation driven by MSWEP is the best, because its optimal parameters and parameter ranges are the closest to those of the gauge-driven simulation. The results also show that different precipitation inputs affect both the parameter ranges and the best calibrated values. However, to bring the simulated streamflow more in line with the observed streamflow, the internal algorithm of SWAT-CUP adjusts parameters that control different components of the hydrological process (e.g., potential evapotranspiration and surface runoff). Table 7 shows the different components of the hydrological process calibrated with the five precipitation inputs (including the gauge observations). Table 7 shows that, although the precipitation inputs and parameter values are very different, the main hydrological variables are adjusted to similar values based on the observed streamflow using SWAT-CUP [1,45]. Therefore, when observed streamflow data are used to calibrate a hydrological model, additional datasets (e.g., glacier data, soil moisture data, potential evapotranspiration data) should be used to validate it [45,51].

Simulation with Fixed Parameters Set Using Different Precipitation Products
In the previous sections, we analyzed the effects of different precipitation products on the calibrated parameters of the SWAT model in the MRB. We found that, although different precipitation products can achieve good simulation results, the parameters that give the best simulation differ significantly, which means that precipitation input is a crucial source of model simulation uncertainty.
In this section, we define a fixed set of parameters by averaging the parameter values calibrated for the five simulations (i.e., calibrated with gauge observations, AgMERRA, MSWEP, PERSIANN-CDR, and TMPA) that maximized the NSE value. The purpose of this test is to find a compromise among the different optimal parameter sets in order to evaluate the impact of different precipitation inputs. The values of the fixed parameter set are listed in Table 8, and the results simulated with the fixed parameter set are shown in Figure 7 and Table 9.
In general, the simulation results obtained with the fixed parameter set are inferior to those obtained by automatic calibration with the respective precipitation data. Figure 7 and Table 9 show that the gauge-driven simulation performs best in terms of NSE, R², RMSE, and PBIAS. Among the four satellite and reanalysis precipitation products, TMPA and PERSIANN-CDR achieved good simulation results, although both overestimate low flows. AgMERRA and MSWEP both achieved a smaller NSE and greatly underestimated the high-flow process.
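The construction of the fixed parameter set can be sketched as a simple per-parameter average over the five calibrations. The function name is hypothetical, and only CH_K2 is filled in because it is the only parameter whose five values are quoted in the text (read here as the best-fit values); the resulting number is an illustration and is not claimed to equal the entry in Table 8.

```python
def average_parameter_sets(calibrated):
    """Average each parameter over all calibrations.

    `calibrated` maps a precipitation input name to a dict of
    parameter name -> best calibrated value.
    """
    names = next(iter(calibrated.values())).keys()
    return {
        name: sum(params[name] for params in calibrated.values()) / len(calibrated)
        for name in names
    }


# CH_K2 values quoted in the text for the five precipitation inputs.
best_values = {
    "Gauge": {"CH_K2": 67.8},
    "AgMERRA": {"CH_K2": 121.0},
    "MSWEP": {"CH_K2": 142.3},
    "PERSIANN-CDR": {"CH_K2": 164.9},
    "TMPA": {"CH_K2": 236.5},
}
fixed_set = average_parameter_sets(best_values)
```

Because the average sits away from every individual optimum, each input is expected to perform somewhat worse with the fixed set than with its own calibrated parameters, which is the effect the test is designed to expose.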

Discussion
A comparison of CC, RMSE, ME, POD, FAR, and CSI between the daily gauge precipitation and AgMERRA, MSWEP, PERSIANN-CDR, and TMPA over the whole MRB shows that MSWEP agrees with the gauge observations better than the other three products (with the highest CC and lowest RMSE). This is mainly because MSWEP was developed by taking full advantage of gauge, satellite-based, and reanalysis precipitation data (e.g., CPC Unified and GPCC, CMORPH, GSMaP-MVK, TMPA, ERA-Interim, and JRA-55), many of which are gauge-adjusted [27]. TMPA outperforms the other three products in terms of ME (lowest ME), which is mainly attributed to its poor sampling [15]. This finding is consistent with the results of Chen et al. [21]. Based on the three categorical statistical indices (POD, FAR, CSI), this study shows that all four precipitation products are capable of detecting precipitation events, with relatively high POD and CSI. Previous studies show that APHRODITE performs well compared with TMPA, CMORPH, CRU, ERA-Interim, and CFSR [21,30,31,39]. To the best of our knowledge, MSWEP is applied to the MRB for the first time here, and we found that it outperforms APHRODITE, which covers the period 1951-2007. This conclusion may help compensate for the lack of recent meteorological data in the Mekong River Basin.
The Mekong River flows through six different climate regions, and precipitation has different characteristics in each of them. First, all six metrics of the four precipitation products in the six climatic zones were worse than those for the whole MRB, a result of the cancellation of positive and negative errors when more sites are aggregated [58]. Comparing the statistical indices (CC, RMSE, ME, POD, FAR, and CSI) and the intensity distributions at a daily scale over the six climate regions shows that PERSIANN-CDR and TMPA have a low CC with the gauge observations in the ET region, which represents a cold plateau area. They also underestimated the occurrence of 0~0.1 mm/day but overestimated the occurrence and contribution above the 20 mm/day threshold. This is consistent with the findings of Tong et al. [59] and Xu et al. [60], who conducted their evaluations in Tibet, where topographic and climatic conditions are similar to those of the ET region. For the Dwc region, Figure 4 and Table 4 clearly show that the four precipitation products agree poorly with the gauge observations. The first reason is that there are more snowfall events in this climate zone, and these products are relatively poor at estimating snowfall [45]; the second possible reason is that there is only one rainfall gauge in this climate zone, which may introduce a larger random error into the assessment of the products [61]. For the Cwb and Cwa regions, all products agree relatively well with the gauge observations except PERSIANN-CDR, which has the lowest CC and POD of the four products, meaning that it can hardly detect the occurrence of rainfall events. This finding agrees well with that of Jiang et al. [62], who conducted their study in a Cwa area.
In the Aw and Am areas, which represent humid regions, this study shows that MSWEP has the largest CC (0.82, 0.73), followed by TMPA (0.82, 0.72) and PERSIANN-CDR (0.77, 0.70), which is consistent with the findings of Alijanian et al. [43]. The evaluation also shows that the daily precipitation estimates from AgMERRA and MSWEP perform relatively better than those from PERSIANN-CDR and TMPA over the whole MRB and in each climatic zone. Considering that AgMERRA data are limited to 1980-2010, MSWEP is more suitable for meteorological research in the MRB.
Tuo et al. [45] and the references therein showed that the NSE cannot be used as the only indicator to evaluate the strengths and weaknesses of a hydrological model driven by different precipitation inputs. Therefore, P and R, rather than NSE alone, are used in this study to compare the accuracy of streamflow reconstruction based on the four precipitation estimates at Stung Treng station. In addition, parameter and prediction uncertainties were analyzed to investigate the influence of different precipitation inputs on the calibrated parameter ranges in each iteration and on the best model performance. Daily streamflow simulations driven by MSWEP perform slightly better than those driven by the other three products, which is consistent with the higher CC and lower RMSE of daily MSWEP precipitation over the MRB. In terms of the uncertainty of the model simulation (P and R), the daily precipitation estimates from all four products give relatively good simulation results (with P > 0.7 and R < 1.5 [1]). However, when we evaluated the precipitation products in different climate zones, some daily precipitation products did not perform well in some zones. This means that, even if the uncertainty of the model is considered through multiple simulations, it is difficult to determine which precipitation product performs better because of the uncertainties in the internal calculations of the model and in the parameters, as also pointed out by Tuo et al. [45]. These findings also highlight that, for a relatively large study area, precipitation should be assessed separately in different sub-regions. Regarding the parameter uncertainties for different precipitation inputs, we found that the ranges of the 10 sensitive parameters obtained with MSWEP are closest to those obtained with the gauge precipitation data, so we can cautiously conclude that MSWEP gives a good simulation when parameter uncertainty is considered.
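The two uncertainty measures can be illustrated with a minimal sketch. It assumes the definitions usually associated with SUFI-2 in SWAT-CUP: P is the fraction of observations that fall inside the 95PPU band, and R is the mean band width divided by the standard deviation of the observations. The function names, the use of the sample standard deviation, and the band values are assumptions for illustration; only the thresholds P > 0.7 and R < 1.5 come from the text.

```python
import statistics


def p_factor(obs, lower, upper):
    """Fraction of observations bracketed by the 95PPU band."""
    inside = sum(lo <= o <= up for o, lo, up in zip(obs, lower, upper))
    return inside / len(obs)


def r_factor(obs, lower, upper):
    """Mean 95PPU band width relative to the standard deviation of observations."""
    mean_width = sum(up - lo for lo, up in zip(lower, upper)) / len(obs)
    return mean_width / statistics.stdev(obs)  # sample std is an assumption


def is_acceptable(p, r, p_min=0.7, r_max=1.5):
    """Apply the thresholds quoted in the text (P > 0.7 and R < 1.5)."""
    return p > p_min and r < r_max


# Synthetic streamflow and 95PPU bounds; the third observation falls below the band.
observed = [1.0, 2.0, 3.0, 4.0]
ppu_lower = [0.5, 1.5, 3.5, 3.5]
ppu_upper = [1.5, 2.5, 4.5, 4.5]
p_value = p_factor(observed, ppu_lower, ppu_upper)
r_value = r_factor(observed, ppu_lower, ppu_upper)
```

The two measures pull in opposite directions: widening the band raises P but also raises R, so a good calibration brackets most observations with a band that stays narrow relative to the variability of the observed flow.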
In general, the results of this study are expected to provide valuable recommendations for hydrometeorological users who study climate change in the Mekong Basin with the four precipitation products examined here. As pointed out by the World Meteorological Organization (WMO), studying climate change requires at least 30 years of meteorological data. Both MSWEP and AgMERRA have shown good accuracy in the different climate zones and in the hydrological simulations, which considered the parameter and prediction uncertainties. However, MSWEP provides data for more than 30 years (1979 to near present), whereas AgMERRA only provides daily precipitation data from 1980 to 2010. Therefore, among the four products, we prefer MSWEP for hydrometeorological research in the Mekong River Basin.

Conclusions
In this study, we evaluated the accuracy of the daily AgMERRA, MSWEP, PERSIANN-CDR, and TMPA precipitation products over the whole MRB and six sub-regions with different climate characteristics, using daily precipitation observations from 82 gauges across the MRB as the reference. The SWAT model was then used to evaluate how accurately the different precipitation products reconstruct the daily streamflow at Stung Treng station. Furthermore, we analyzed the parameter uncertainties, the prediction uncertainties with a fixed parameter set, and the potential relationships between these parts. The main findings of this study are as follows:
1. The daily precipitation estimates of MSWEP and TMPA perform almost equally well, with the highest CC and POD and the lowest RMSE over the whole MRB, and AgMERRA performs better than PERSIANN-CDR. However, all metrics of the four precipitation products in the six climate regions were worse than those for the whole MRB. Specifically, the performance of TMPA and PERSIANN-CDR declines markedly, especially in the ET and Dwc regions, while AgMERRA and MSWEP perform relatively well in all six climate regions. AgMERRA performs well in each climate zone, while MSWEP and PERSIANN-CDR show an obvious overestimation in the ET and Dwc areas (for heavy rain with daily precipitation greater than 20 mm/day). In the Am, Aw, Cwa, and Cwb regions, MSWEP has the best performance, and the other products perform differently at different intensities.
2. The MSWEP precipitation estimates achieved the best simulation results when the model's performance for low-flow and high-flow processes is considered (Figure 5), followed by PERSIANN-CDR, TMPA, and AgMERRA. Considering the uncertainty of the model's multiple simulations, MSWEP and PERSIANN-CDR obtained good simulation results with higher P and lower R values, and the TMPA simulation was better than that of AgMERRA.
3. In analyzing the uncertainty of the calibrated parameters under different precipitation inputs, the parameter ranges determined with the gauge precipitation data were used as a benchmark for the parameter ranges and optimal parameter values of the other inputs. We found that MSWEP performs better, while the other three precipitation products have larger uncertainties.
4. We also evaluated the impact of different precipitation inputs on model performance with a fixed parameter set. We found that PERSIANN-CDR and TMPA were less sensitive to changes in parameters; these two products have higher NSE and R² values but simulate low-streamflow processes more poorly, while MSWEP has a relatively smaller NSE value but simulates low streamflow better. This conclusion is based on a single test in the specific study area of the Mekong River Basin and needs to be verified in other watersheds.
In general, based on the accuracy of the four precipitation products in the different climate zones and over the whole MRB, we recommend daily MSWEP precipitation for hydrometeorological studies in the Mekong River Basin. Calibrating the model against streamflow data with different precipitation products affects other hydrological processes, such as evapotranspiration and baseflow, and this uncertainty would eventually affect the formulation of water management policies. Therefore, when hydrological model simulation is used to evaluate the strengths and weaknesses of precipitation products, the uncertainty associated with the different precipitation inputs should also be considered.