Inter-Comparison of Gauge-Based Gridded Data, Reanalysis and Satellite Precipitation Product with an Emphasis on Hydrological Modeling

Abstract: Precipitation is essential for modeling the hydrologic behavior of watersheds. Multiple precipitation products exist, drawn from different sources and with different precision. We evaluate the influence of different precipitation products on model parameters and streamflow predictive uncertainty using a Soil and Water Assessment Tool (SWAT) model for a forest-dominated catchment in India. We used IMD (gridded rainfall dataset), TRMM (satellite product), bias-corrected TRMM (corrected satellite product) and NCEP-CFSR (reanalysis dataset) over the period 1998–2012 for simulating streamflow. The precipitation analysis using statistical measures revealed that the TRMM and CFSR data slightly overestimate rainfall compared to the ground-based IMD data. However, the TRMM estimates improved after applying a bias correction. The Nash–Sutcliffe (and R²) values for TRMM, TRMMbias and CFSR are 0.58 (0.62), 0.62 (0.63) and 0.52 (0.54), respectively, when the model is calibrated with IMD data (Scenario A). The models calibrated individually with each precipitation product (Scenario B) yielded Nash–Sutcliffe (and R²) values of 0.71 (0.76), 0.74 (0.78) and 0.76 (0.77) for the TRMM, TRMMbias and CFSR datasets, respectively. Thus, the hydrological model-based evaluation revealed that calibrating the model with the individual rainfall data as input increased the accuracy of the streamflow simulation. The IMD- and TRMM-forced models performed better in capturing the streamflow than the CFSR reanalysis-driven model. Overall, our results showed that TRMM data, after proper correction, could be a good alternative to ground observations for driving hydrological models.


Introduction
Precipitation data are crucial to many applications related to human life and water management. Examples include estimating the hydrological water balance [1], improving management practices, hydropower planning, development projects and flood controls. Conventionally, rain gauges provide the direct measurement of precipitation. However, rain gauges are often not enough to understanding than the original 3b42v7 data. Similar inferences were reported by Bitew et al. [29] and Tuo et al. [2] in their respective analysis.
All the studies mentioned above underline the need for thorough validation and bias correction of these precipitation products before they are used as input to a hydrological model. In addition, errors and uncertainties in the precipitation inputs are likely to propagate into the hydrologic simulations [45]. A multitude of studies showed that recalibrating the model with the precipitation product considerably increases the model performance [2,17,41,46–52]. However, the parameter ranges obtained through recalibration with the precipitation data may be improbable, thereby questioning the model's applicability to real-world problems. Further, in such recalibration of the models, it is essential to estimate the parameter uncertainty due to the precipitation input. Thus, validation and evaluation of these precipitation products are critical for any hydrologic modeling study [53,54]. Because the calibration and output of a hydrological model are a function of the input precipitation characteristics, it is imperative to evaluate the effect of the different precipitation products on the parameter estimation (calibration) and the streamflow simulations. There is also considerable scope for advancing the understanding of how different precipitation products affect hydrologic model-based streamflow simulations, parameter estimation, predictive uncertainty and their applicability in the Indian subcontinent.
Even though a plethora of work has compared reanalysis and satellite datasets individually, only a limited number of studies have intercompared different classes of precipitation data sets. Therefore, in this study, we compared three commonly used classes of precipitation products: (i) a gauge-based rainfall product (IMD), (ii) reanalysis data (NCEP-CFSR) and (iii) satellite precipitation (TRMM and bias-corrected TRMM). This study aimed to (1) compare the statistical characteristics of the four precipitation data sets, (2) investigate the adequacy of these precipitation products in driving the semi-distributed SWAT hydrological model and (3) evaluate the parameter uncertainty involved. These precipitation inputs are used to develop a semi-distributed SWAT model for a semi-arid, forest-dominated coastal basin, the Nagavali River Basin in India.

Study Area
We selected the Nagavali River Basin (NRB) since the literature on semi-arid and semi-humid climatic regions is limited. NRB lies between the Godavari and Mahanadi river basins in the eastern part of India, with a catchment area of about 9510 km² (Figure 1). The river begins in the Eastern Ghats at an elevation of 1600 m. It traverses about 160 km through Odisha and enters Andhra Pradesh near Kuneru village of Vizianagaram district at an elevation of about 152 m [55]. The basin extends from 18°17′ to 19°44′ north latitude and from 82°53′ to 83°54′ east longitude. The river is about 255 km long, of which about 62% lies in Odisha and the rest in Andhra Pradesh. In this study, we considered the catchment area of 9056 km² up to the gauging and discharge measurement station at Srikakulam. NRB receives an average annual precipitation of around 1140 mm, and the average temperature of the river basin is 28.2 °C in the post-monsoon season and 34.2 °C in the pre-monsoon season.

Datasets
We used four precipitation products from different sources and of different resolutions that are among the most frequently used in hydrological applications. These include gridded IMD (0.25° × 0.25°), TRMM 3B42 (0.25° × 0.25°), bias-corrected TRMM (0.25° × 0.25°) and reanalysis NCEP-CFSR (0.31° × 0.31°). All the datasets were selected for the common observation period 1998–2012. Table 1 and Supplementary Information section S1 present detailed information about the precipitation data. The IMD gridded data were generated from the observed data of 6995 gauging stations across India using spatial interpolation for the period 1901–2013 [14]. Details about these data can be obtained from http://imd.gov.in (homepage/rainfall information).

Verification Strategy
Various statistical metrics were used to evaluate the precipitation products against the gauge-based IMD gridded data, both qualitatively and quantitatively. The IMD dataset is a popular, validated dataset used in several other works [57–61]. Pai et al. [14] and Yeggina et al. [62] evaluated the performance of the IMD gridded dataset against gauge rainfall estimates and showed that it could be used as a reliable alternative for gauge rainfall. To measure the correspondence between the estimates of each precipitation product and the IMD data, we use the false alarm ratio (FAR), probability of detection (POD) and critical success index (CSI) to evaluate the TRMM, bias-corrected TRMM and CFSR data against the reference rainfall data (IMD). Detailed information on these statistical metrics is presented in Table 2 and Supplementary Information section S2. Instead of a point-to-point analysis, we analyzed the spatially averaged rainfall over the entire catchment, considering the size and homogeneity of the catchment [63]. Based on the average rainfall for a given day, these statistical measures were computed by comparing rainfall and non-rainfall events for the evaluated (TRMM and CFSR) and reference (IMD) rainfall data. In general, when POD < 0.65, FAR > 0.35 and CSI < 0.45, the product is assumed to perform poorly in terms of detecting rainfall. The calculation of these three metrics is explained in [64].

Table 2. Definition of statistical metrics used to evaluate the TRMM, bias-corrected TRMM and CFSR data with the reference rainfall data (IMD).

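S. No | Statistical Metric | Equation | Description
1 | Probability of detection (POD) | POD = H/(H+M) | Fraction of the rainfall events in the reference data that are correctly detected by the satellite. It ranges from 0 to 1 (1 is the best value).
2 | False alarm ratio (FAR) | FAR = F/(F+H) | Fraction of the rainfall events detected by the satellite for which the reference data showed no rainfall. It ranges from 0 to 1 (0 is the best value).
3 | Critical success index (CSI) | CSI = H/(H+M+F) | It is more accurate when correct negative values are not considered. It ranges from 0 to 1 (1 is the perfect value).
H, M and F denote the number of hits, misses and false alarms, respectively.

For the verification of the discharge simulation from the hydrological model, we adopted popular performance evaluation measures, namely the Nash–Sutcliffe coefficient (NS) [65], the percentage of bias (PBIAS) and the coefficient of determination (R²):

$$\mathrm{NS} = 1 - \frac{\sum_{i=1}^{n}\left(Q_{obs,i} - Q_{sim,i}\right)^{2}}{\sum_{i=1}^{n}\left(Q_{obs,i} - \bar{Q}_{obs}\right)^{2}} \qquad (1)$$

$$\mathrm{PBIAS} = \frac{\sum_{i=1}^{n}\left(Q_{obs,i} - Q_{sim,i}\right)}{\sum_{i=1}^{n} Q_{obs,i}} \times 100 \qquad (2)$$

$$R^{2} = \left[\frac{\sum_{i=1}^{n}\left(Q_{obs,i} - \bar{Q}_{obs}\right)\left(Q_{sim,i} - \bar{Q}_{sim}\right)}{\sqrt{\sum_{i=1}^{n}\left(Q_{obs,i} - \bar{Q}_{obs}\right)^{2}}\,\sqrt{\sum_{i=1}^{n}\left(Q_{sim,i} - \bar{Q}_{sim}\right)^{2}}}\right]^{2} \qquad (3)$$

where $Q_{sim,i}$ and $Q_{obs,i}$ are the simulated and observed values at the ith time step, respectively, and $\bar{Q}_{obs}$ and $\bar{Q}_{sim}$ are the means of the n observed and simulated values.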
The value of NS ranges from −∞ to 1, with NS = 1 indicating the best model. PBIAS (Equation (2)) measures the average tendency of the simulated data to be larger or smaller than the observed data (Gupta et al. [66]), and the optimal value of PBIAS is zero. If PBIAS is positive, the model underestimates the streamflow, and if PBIAS is negative, the model overestimates the streamflow. The model performance is unsatisfactory if NS ≤ 0.5, satisfactory if 0.5 < NS ≤ 0.65, good if 0.65 < NS ≤ 0.75 and very good if 0.75 < NS ≤ 1.00; for streamflow, a PBIAS beyond ±25% is also regarded as unsatisfactory [67]. R² measures the pattern similarity between the observed and simulated data, and its value ranges between 0 and 1. The higher the value of R², the higher the similarity.
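As an illustration of how these three measures can be computed, the following is a minimal sketch in plain Python. The function names and the sample flow values are hypothetical and are not taken from the study; the PBIAS sign convention follows the text (positive values mean underestimation), and the performance classes follow the thresholds quoted above.

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe coefficient: 1 minus the ratio of squared error to observed variance."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    observed_spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / observed_spread


def pbias(obs, sim):
    """Percentage bias; positive means the model underestimates the observed flow."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


def classify_ns(ns):
    """Performance class from NS, using the thresholds quoted in the text."""
    if ns > 0.75:
        return "very good"
    if ns > 0.65:
        return "good"
    if ns > 0.50:
        return "satisfactory"
    return "unsatisfactory"


# Made-up monthly flows, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 37.0]
ns = nash_sutcliffe(observed, simulated)
print(ns, pbias(observed, simulated), r_squared(observed, simulated), classify_ns(ns))
```

In practice the same functions would be applied to the monthly observed and simulated streamflow series at the gauging station.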

Hydrological Model
The precipitation products mentioned above were used to drive a semi-distributed Soil and Water Assessment Tool (SWAT) model. Here, we briefly describe the SWAT model and its governing equations, procedures to set up the model, and finally, the way to calibrate and evaluate the model.
SWAT is a comprehensive, physically based, continuous-time, semi-distributed hydrological model [68,69]. It was developed by the Agricultural Research Service of the United States Department of Agriculture and is widely used to study large-scale hydrological conditions in different regions [70–73]. It mimics various hydrological components of watersheds by solving process-based equations. The various processes in the SWAT model are simulated at daily time steps based on the water balance equation (Equation (4), [74]):

$$SW_{t} = SW_{0} + \sum_{i=1}^{t}\left(R_{day} - Q_{surf} - E_{a} - w_{seep} - Q_{gw}\right) \qquad (4)$$

where $SW_{0}$ and $SW_{t}$ represent the initial soil water content and the soil water content at any time t (in days), respectively (in mm). $E_{a}$, $R_{day}$ and $Q_{surf}$ represent the amount of evapotranspiration, precipitation and surface runoff (in mm) on any day. The amount of percolation and return flow on a day (in mm) is denoted by $w_{seep}$ and $Q_{gw}$, respectively. The Soil Conservation Service (SCS) curve number (CN) method (Equations (5) and (6)) is used for estimating the surface runoff:

$$Q_{surf} = \frac{\left(R_{day} - I_{a}\right)^{2}}{R_{day} - I_{a} + S} \qquad (5)$$

$$S = 25.4\left(\frac{1000}{CN} - 10\right) \qquad (6)$$

$Q_{surf}$ and $R_{day}$ represent the accumulated runoff or rainfall excess (in mm) and the rainfall received on that particular day (in mm), respectively [74]. $I_{a}$ denotes the initial loss, which includes detention, interception and percolation before the runoff. $S$ denotes the retention parameter. The retention parameter varies spatially owing to land-use change, soil type, management options and slope, and temporally because of changes in soil water. The storage routing technique is used for calculating the percolation (Arnold et al. [75]), assuming that percolation occurs whenever the water content exceeds the field capacity and the layer below is unsaturated (Mosbahi et al. [76]; Sridhar [53]). In this study, the Penman–Monteith method is adopted for estimating potential evapotranspiration. Using the DEM, SWAT divides the entire basin into subbasins and subsequently into hydrological response units (HRUs) based on unique combinations of land use, soil classification, slope and management.
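The curve number calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the retention parameter is taken from the commonly used SCS relation S = 25.4(1000/CN − 10), and the initial loss is approximated as Ia = 0.2·S, a conventional ratio that the text does not state. The function names are hypothetical.

```python
def retention_parameter(curve_number):
    """Retention parameter S (mm) from the curve number (standard SCS relation, assumed)."""
    return 25.4 * (1000.0 / curve_number - 10.0)


def surface_runoff(rain_mm, curve_number, ia_ratio=0.2):
    """Daily surface runoff (mm) from daily rainfall (mm) with the SCS-CN method.

    ia_ratio is the assumed ratio of the initial loss Ia to S (0.2 by convention).
    """
    s = retention_parameter(curve_number)
    ia = ia_ratio * s
    if rain_mm <= ia:
        # All rainfall is absorbed by the initial loss, so no runoff is generated.
        return 0.0
    return (rain_mm - ia) ** 2 / (rain_mm - ia + s)


# Illustrative values only: 50 mm of rain on surfaces with different curve numbers.
for cn in (60, 80, 100):
    print(cn, round(surface_runoff(50.0, cn), 2))
```

A higher curve number gives a smaller retention parameter and therefore more runoff for the same rainfall, which is why CN2 appears among the calibrated parameters later in the paper.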
Primary meteorological variables such as temperature, precipitation, solar radiation, relative humidity and wind speed are required, in addition to land use, soil characteristics and land cover, to set up a hydrological model using SWAT. A brief description of the input data and data sources is presented in Table 3. The land use/land cover and soil maps of the study area are shown in Figure 2. The daily river discharge data were obtained from the Water Resources Information System of India (Central Water Commission [77]) for the period 1985–2012 at Srikakulam station, as shown in Table 3. The reservoir details and the inflow and outflow data were obtained from the Andhra Pradesh Water Resources Department and were incorporated in the model for implementing regulations.

Table 3. Detailed information on primary meteorological variables required to set up a hydrological model using the Soil and Water Assessment Tool (SWAT).

QSWAT, interfaced with the Quantum Geographical Information System (QGIS), was used for developing the model. As a preliminary step, all the information was re-projected to a common projection and resampled to a 30 m spatial resolution. The TauDEM tool in QSWAT was used for stream network generation and watershed delineation. The entire basin was divided into 43 subbasins using a threshold of 150 km². A total of 200 HRUs with distinct land use, soil and slope were formed by fixing a threshold of 200 HRUs. Here, the analysis was carried out at the hydrologic response unit (HRU) level on a monthly time step. Further, SWAT Calibration and Uncertainty Procedures (SWAT-CUP) 2012 with sequential uncertainty fitting (SUFI-2) was used for auto-calibration, sensitivity analysis and validation (for more detail, please refer to Setti et al. [55]).

Model Development and Evaluation
For simulating streamflow from the different precipitation inputs, two scenarios were considered. Scenario A: calibrating the model with IMD gridded data and rerunning the calibrated model with the other precipitation products. Scenario B: calibrating the model individually with each of the rainfall products. Scenario A allows investigating the impact of differences in rainfall products on streamflow simulation accuracy, whereas Scenario B helps in investigating the impact of the input rainfall data on the calibrated parameters, sensitivity analysis and streamflow simulation accuracy, and in verifying whether these rainfall products can be used as an alternative source of input rainfall data for model calibration.
For both scenarios, the selection of sensitive parameters, automatic calibration and validation were performed with the SUFI-2 optimization algorithm in the SWAT-CUP tool package. Sensitive parameters were identified using the Latin-Hypercube One-factor-At-a-Time method with 2000 simulations. This procedure tests the model sensitivity by modifying only one parameter at a time and keeping the rest unchanged. In total, 18 parameters were identified from past literature (Setti et al. [63]), and a sensitivity analysis was performed on them. In this study, we considered the first two years, i.e., 1998-2000, as the warm-up period to reduce the effect of initial conditions. The monthly simulated streamflow of the Nagavali River Basin was calibrated over the period 2001–2008 and then validated over the period 2009–2012 against the observed streamflow at Srikakulam station (as shown in Figure 1). For each iteration, 2000 simulations were run for the calibration period. Following each iteration, the parameter ranges were modified (nearest to the fitted value) with reference to the values suggested by the program and to their reasonable physical limits. Interested readers may refer to [78,79] for more details about the protocol to calibrate the SWAT model.
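To make the sampling step more concrete, the sketch below draws a Latin hypercube sample of parameter sets in plain Python. It is not the SWAT-CUP implementation: the function name and the parameter ranges are hypothetical, and neither the one-factor-at-a-time perturbation nor the SWAT run for each parameter set is included. It only shows the stratified sampling idea, in which each parameter range is split into as many equal intervals as there are simulations and every interval is used exactly once.

```python
import random


def latin_hypercube(ranges, n_samples, seed=0):
    """Draw n_samples parameter sets so that each parameter's range is split into
    n_samples equal strata and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    samples = [dict() for _ in range(n_samples)]
    for name, (low, high) in ranges.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)  # independent random pairing of strata across parameters
        width = (high - low) / n_samples
        for k, stratum in enumerate(strata):
            # Uniform draw inside the assigned stratum.
            samples[k][name] = low + (stratum + rng.random()) * width
    return samples


# Hypothetical ranges for two of the parameters named in the paper.
parameter_ranges = {"CN2": (35.0, 98.0), "ESCO": (0.0, 1.0)}
parameter_sets = latin_hypercube(parameter_ranges, n_samples=2000)
print(len(parameter_sets), parameter_sets[0])
```

Each of the resulting parameter sets would then be passed to one model run, and the objective function value of every run would be recorded for the sensitivity ranking and the calibration.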

Results and Discussion
We present the results in three parts: first, a comparison of the satellite and reanalysis data with the ground-based IMD data; second, an evaluation of the hydrological model driven by the precipitation products under the two scenarios; and finally, a comparison of the annual water balance components.

Comparison of Satellite Data and Reanalysis Data with Ground-Based IMD Data
First, we computed statistical parameters such as the minimum, maximum, average, standard deviation and skewness coefficient of all precipitation products (Table 4), considering only rainy days (rainfall magnitude ≥ 2.5 mm/day). We observed that the maximum precipitation values were slightly overestimated by the TRMM and CFSR datasets in comparison with the IMD product. Similarly, the CFSR is more positively skewed than the IMD, showing a general tendency for overestimation. The empirical cumulative distribution function (CDF) of the daily precipitation during 1998–2012 was computed for the four datasets (Figure 3). There is only a slight difference between the CDFs obtained for the IMD, TRMM and CFSR data at low rainfall values. However, the deviation is significant in the precipitation range from 30 to 60 mm/day for the TRMM and CFSR data. The CDF of the bias-corrected TRMM closely matches that of IMD.

The IMD dataset for the period 1998–2013 revealed a total of 1591 rainy days and 4252 non-rainy days. On the other hand, the total rainy days estimated by the TRMM, bias-corrected TRMM and CFSR are 978, 932 and 1161, respectively (Table 5). Using the threshold of 2.5 mm/day, the values of POD, FAR and CSI were estimated. The value of POD was found to be around 0.61, 0.69 and 0.73 (Table 6) for TRMM, bias-corrected TRMM and CFSR, respectively, indicating that the CFSR product is closest to the IMD dataset in terms of hits. Similarly, in terms of FAR and CSI, the CFSR data seem to perform better than the TRMM-based datasets. For example, the CFSR data have the lowest value of FAR (0.38) and the highest value of CSI (0.50). These analyses showed that the CFSR product performs better than the TRMM and bias-corrected TRMM products. However, it is to be noted that this analysis does not consider the quantity of rainfall; instead, it only takes hits and misses into consideration.
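The detection scores discussed above can be reproduced from two daily series with a short Python sketch such as the one below. The function name and the sample rainfall values are hypothetical; the 2.5 mm/day threshold and the FAR and CSI formulas are taken from the text, while POD = H/(H+M) is assumed to follow its standard definition.

```python
def detection_scores(reference, estimate, threshold=2.5):
    """Return (POD, FAR, CSI) for two equally long daily rainfall series (mm/day).

    A day counts as rainy when rainfall is at or above the threshold.
    """
    hits = misses = false_alarms = 0
    for ref, est in zip(reference, estimate):
        ref_rain = ref >= threshold
        est_rain = est >= threshold
        if ref_rain and est_rain:
            hits += 1
        elif ref_rain:
            misses += 1
        elif est_rain:
            false_alarms += 1
        # Days that are dry in both series (correct negatives) are not counted.
    nan = float("nan")
    pod = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (false_alarms + hits) if false_alarms + hits else nan
    csi = hits / (hits + misses + false_alarms) if hits + misses + false_alarms else nan
    return pod, far, csi


# Made-up catchment-average rainfall for six days (mm/day).
imd_daily = [0.0, 3.0, 10.0, 0.0, 6.0, 1.0]
product_daily = [0.0, 4.0, 0.0, 5.0, 8.0, 2.5]
print(detection_scores(imd_daily, product_daily))
```

Because only the rainy or non-rainy state of each day enters the counts, a product can score well here and still misestimate rainfall amounts, which is the limitation noted at the end of the paragraph.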
In order to compare the data products based on the rainfall amount, we chose the rainfall intensity classification of the World Meteorological Organization (WMO) standard, Geneva, Switzerland, 2012 [80]: (1) rain < 2.5 mm (no/tiny rain), (2) 2.5 mm ≤ rain < 5 mm (low moderate rain), (3) 5 mm ≤ rain < 10 mm (high moderate rain), (4) 10 mm ≤ rain < 20 mm (low heavy rain), (5) 20 mm ≤ rain < 50 mm (high heavy rain) and (6) rain ≥ 50 mm (violent rain). The four precipitation datasets display a similar probability of occurrence of dry days (rain < 2.5 mm/d), namely 49%, 44%, 44% and 37% for IMD, TRMM, bias-corrected TRMM and CFSR, respectively. However, within the wet days, the probability of occurrence of high moderate rain (low heavy rain) was found to be 30% (22%), 26% (25%), 28% (24%) and 31% (27%) for IMD, TRMM, bias-corrected TRMM and CFSR, respectively. The CFSR data thus have a higher probability of occurrence in the low heavy rain range than is observed in the IMD data.
To evaluate the overall pattern and rainfall quantity, the rainfall datasets were considered on three time scales (daily, monthly and annual). On a daily scale, the linear Pearson correlation between the IMD data and the TRMM, bias-corrected TRMM and CFSR datasets was found to be 0.44, 0.83 and 0.82, respectively. The daily and average monthly time series of the rainfall events of the four datasets are shown in Figure 4a
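The intensity classes listed above translate directly into a small lookup, sketched here in Python. The function names and the sample series are hypothetical; the class limits are the ones quoted in the text.

```python
from collections import Counter


def rain_class(rain_mm):
    """Map a daily rainfall amount (mm) to the intensity class used in the text."""
    if rain_mm < 2.5:
        return "no/tiny rain"
    if rain_mm < 5.0:
        return "low moderate rain"
    if rain_mm < 10.0:
        return "high moderate rain"
    if rain_mm < 20.0:
        return "low heavy rain"
    if rain_mm < 50.0:
        return "high heavy rain"
    return "violent rain"


def class_frequencies(daily_rain):
    """Fraction of days falling in each intensity class."""
    counts = Counter(rain_class(r) for r in daily_rain)
    total = len(daily_rain)
    return {name: count / total for name, count in counts.items()}


# Made-up daily series (mm/day), for illustration only.
sample = [0.0, 1.2, 3.0, 7.5, 7.9, 15.0, 30.0, 60.0]
print(class_frequencies(sample))
```

Applying `class_frequencies` to each product's catchment-average daily series would give occurrence probabilities of the kind compared in the paragraph above.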

Hydrological Model-Based Evaluation of the Precipitation Products
As mentioned, two scenarios were considered: Scenario A, calibrating the SWAT model with IMD gridded data and rerunning the calibrated model with the other precipitation products, and Scenario B, calibrating the model individually with each precipitation product.

Scenario A: SWAT Model Calibrated with IMD Data
This scenario attempts to unravel the effects of the precipitation products on streamflow simulation, specifically when the model driven by these products has been calibrated with IMD data.

Calibration of the SWAT Model with IMD Data
Initially, the model was calibrated using the IMD rainfall data as input after finding the sensitive parameters (using the method mentioned in Section 3). The calibration was done using SWAT-CUP with the objective function of maximizing the NS between the observed and simulated flows. The performance statistics of the model simulation showed that the calibration is good, with NS values of 0.85 and 0.79 and correlation coefficients of 0.91 and 0.88 during the calibration and validation periods, respectively.
In the second step, the other precipitation products were used for driving the calibrated model. Figure 5 shows the streamflow simulation results from each of the precipitation products, and the corresponding performance statistics are shown in Table 7. It is observed that, even though the TRMM dataset-based model performs satisfactorily according to Moriasi et al. [67], the systematic difference is significantly reduced in the bias-corrected TRMM-based model, with PBIAS = −7.4%. The streamflow simulated using the CFSR data is overestimated, with PBIAS = −16.2%. However, it is to be noted that in the case of the CFSR, the streamflow overestimations are not systematic; rather, the streamflow pattern closely follows the precipitation pattern. There is considerable streamflow overestimation in the year 2003, during which the CFSR rainfall estimates were higher than the IMD data. On the other hand, the streamflow was underestimated during the years 2000, 2004 and 2005, when the rainfall estimates from CFSR were lower than IMD. Thus, there is no systematic under- or overestimation in the streamflow simulation. These results demonstrate that the bias-corrected TRMM data are a potential alternative to IMD data.

Scenario B: SWAT Model Calibrated Individually with Precipitation Products
This scenario evaluates the applicability of the precipitation products for calibrating the model and investigates the differences in model calibration, sensitive parameters and the ability to simulate streamflow accurately.

Sensitive Parameters
To evaluate the difference introduced at the stage of sensitivity analysis by using different precipitation products, we compared the ranking of the sensitive parameters obtained. Out of several parameters, we selected 18 commonly used parameters (Table 8) for streamflow simulation using the four precipitation products. To identify the sensitive parameters for each precipitation dataset, the model was run for 2000 simulations of the eighteen selected parameters. Interestingly, the most sensitive parameters obtained for the four datasets were significantly different (Table 9), except for the deep percolation factor (RCHRG_DP) and the curve number (CN2), which were significant for all products. The second most significant factor was the soil evaporation compensation factor (ESCO) using IMD and TRMM, whereas it was the baseflow alpha factor (ALPHA_BF) and the threshold depth of water in the shallow aquifer (GWQMN) using the bias-corrected TRMM and CFSR data, respectively. The most sensitive parameters across all the data sets include RCHRG_DP, ESCO and GWQMN, which shows that the crucial processes governing the hydrology of the system are evapotranspiration and groundwater recharge. This is in line with the general understanding of the dominant processes in forest and agricultural catchments.

Table 9. Sensitivity parameter ranking for each dataset with default parameter values using the sequential uncertainty fitting (SUFI-2) algorithm.

Model Calibration
Here, out of the eighteen parameters, the ten most sensitive parameters for each dataset were further considered for calibration of the corresponding models.
The individual models were developed based on the four data sets and were calibrated at a monthly scale using the SUFI-2 algorithm. The results of the model calibration are listed for all the models in Table 10. Following the model performance classification of Moriasi et al. [67], the model based on IMD precipitation data performs well, with NS (R²) values of 0.85 (0.87) and 0.79 (0.82) during the calibration and validation periods, respectively. The CFSR data-based model also performs satisfactorily, with NS values of 0.76 and 0.73 during calibration and validation, respectively. The models based on the TRMM and bias-corrected TRMM datasets also performed well, with NS equal to 0.71 and 0.74, respectively, as shown in Table 10. A similar pattern was observed in the PBIAS values during the calibration and validation periods.

Figure 6 shows the streamflow simulation results during calibration and validation. Based on these results, the IMD data produce the best results in comparison with the other datasets, followed closely by the bias-corrected TRMM dataset. It is also to be noted that the bias-corrected TRMM dataset yields better results than the TRMM data. Further, the results based on CFSR are satisfactory when compared to its performance in Scenario A. However, there is an overestimation of streamflow for most of the simulation period, except for the years 2000, 2004 and 2005. This implies that recalibrating the model with the individual dataset has increased the model performance to a great extent. A similar observation was also reported by Tuo et al. [2] and Zhang et al. [41]. Besides evaluating the best simulation from each precipitation data set, as explained above, we also investigated the ensemble of all available simulations.
We calculated the distribution of the NS values obtained for the total of 2000 simulations. Figure 7 shows the empirical CDF of the NS values obtained from the four models. All four models reached a satisfactory level of performance. However, interesting results were obtained using the bias-corrected TRMM data set, where most of the simulations (around 80%) were above NS = 0.5, while the IMD, TRMM and CFSR data sets had lower fractions of about 70%, 20% and 65%, respectively. The IMD data-based model displays a larger fraction (50%) of model simulations with NS > 0.65, which represents good model performance. Comparing the bias-corrected TRMM and TRMM-based models, there is a significant improvement in the bias-corrected TRMM-based results. This analysis indicates the uncertainty in model performance when using different precipitation data sets.
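The ensemble summary described above amounts to an empirical CDF of the objective function values and the share of simulations exceeding a performance threshold. A minimal Python sketch follows; the function names and the five sample values are hypothetical stand-ins for the 2000 values that SWAT-CUP would return for one precipitation product.

```python
def empirical_cdf(values):
    """Return (value, cumulative probability) pairs for a list of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def fraction_above(values, threshold):
    """Share of simulations whose value exceeds the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


# Made-up NS values for five simulations.
ns_values = [0.2, 0.4, 0.55, 0.7, 0.8]
print(empirical_cdf(ns_values))
print(fraction_above(ns_values, 0.5), fraction_above(ns_values, 0.65))
```

Computing `fraction_above` at 0.5 and 0.65 for each product gives the kind of percentages compared in the paragraph above.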

Parameter Uncertainty
The parameter uncertainty arising from the use of different precipitation data sets was estimated by investigating the variations in the best-fit parameters and in the parameter ranges obtained during the calibration. Table 11 and Figure 8 show the best-fit values and the ranges, respectively, of nine globally sensitive hydrological parameters. Among the critical parameters shown in Figure 8, ESCO, an important parameter related to soil evaporation, has different ranges for the different datasets. It is also to be noted that the ESCO range obtained using IMD is different from the ones obtained from the other precipitation data sets. The TRMM and CFSR dataset-based models have a high ESCO value (>0.5) when compared to the IMD and bias-corrected TRMM-based models, indicating lower evapotranspiration compared to the other datasets, which is also reflected in the overestimation of streamflow.

Table 11. Values for best-fit parameters obtained using the four different datasets.

The deep aquifer percolation fraction (RCHRG_DP), the most dominant parameter for the four precipitation datasets, displays variation in both the best-fit values and the ranges across the precipitation inputs. This indicates that the IMD-based model allows more water for deep percolation than the CFSR-driven model. Similarly, the best parameter ranges for GWQMN (a parameter controlling the depth of water in the shallow aquifer) were higher for the IMD-based model than for the other models, which indicates that the models based on TRMM and CFSR data resulted in less shallow aquifer storage.

Interestingly, there is no significant difference in the CN2 parameter, which reflects the surface runoff, in terms of both the best-fit value and the range. However, for other parameters, such as SOL_AWC, HRU_SLP and CH_K2, the ranges and the best-fit values are different.
Overall, the analysis shows that different precipitation inputs affect both the best estimate of a parameter and its range. It is to be noted that SWAT-CUP adjusts the parameters in such a way that the observed streamflow values are matched. Even though the models capture the observed flow, the parameter values, and hence the distribution of water in the catchment, are entirely different. For example, the average water balances in the study area obtained using the different precipitation data sets lead to different values of evapotranspiration, base flow and percolation, but more or less similar surface runoff values. It can be observed that the calibrated bias-corrected TRMM-forced model has a low ESCO when compared to the other models, resulting in higher values of evapotranspiration.

Uncertainty in Streamflow Simulation
Even though the parameter uncertainties contribute to uncertainty in all output variables obtained from the model, in this section we limit our discussion to the uncertainty in the streamflow, as shown in Figure 9. We evaluated the uncertainty using the p-factor and the R-factor obtained during the calibration. The p-factor represents the fraction of observed streamflow falling within the 95% prediction uncertainty (95PPU) band and varies from 0 to 1, where p = 1 indicates that 100% of the observed flow falls within the band (i.e., a perfect model simulation considering the uncertainty). The R-factor is estimated as the ratio of the average width of the 95PPU band to the standard deviation of the observed streamflow. According to Abbaspour et al. [81], an R-factor of less than 1.5, again depending on the situation, would be desirable for a satisfactory model simulation.
In this study, the percentage of observed data bracketed by the 95PPU for IMD, TRMM, bias-corrected TRMM and CFSR is 83%, 68%, 72% and 64%, respectively, during the calibration and 85%, 67%, 69% and 61%, respectively, during the validation period, as shown in Figure 9 and Table 10. However, comparing the R-factors (which measure the width of the 95PPU), it can be observed that the CFSR and bias-corrected TRMM-based models have low values, indicating higher levels of uncertainty. Further, the observed peak values do not fall within the 95PPU obtained from these data sets. Thus, it is clear from these results that different precipitation inputs generate distinct prediction uncertainties in the estimation of streamflow.

Figure 10 represents the annual average of the water balance components based on the different rainfall inputs. For comparison, the observed streamflow and evapotranspiration (ET) are provided from the literature. Figure 10a presents the results of the models calibrated with IMD data (Scenario A), and Figure 10b shows the results of the models calibrated individually with each rainfall data type.
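The two uncertainty indices can be computed from the observed series and the lower and upper limits of the uncertainty band, as in the hedged Python sketch below. The function names and sample numbers are hypothetical, the band limits are taken as given rather than derived from an ensemble, and the use of the sample standard deviation is an assumption, since the text does not say which estimator is used.

```python
import statistics


def p_factor(observed, lower, upper):
    """Fraction of observations that fall inside the uncertainty band."""
    inside = sum(1 for q, lo, up in zip(observed, lower, upper) if lo <= q <= up)
    return inside / len(observed)


def r_factor(observed, lower, upper):
    """Average band width divided by the standard deviation of the observations.

    The sample standard deviation is used here as an assumption.
    """
    mean_width = sum(up - lo for lo, up in zip(lower, upper)) / len(observed)
    return mean_width / statistics.stdev(observed)


# Made-up monthly flows and band limits, for illustration only.
observed_flow = [10.0, 20.0, 30.0, 40.0]
band_lower = [8.0, 15.0, 31.0, 35.0]
band_upper = [12.0, 25.0, 40.0, 45.0]
print(p_factor(observed_flow, band_lower, band_upper))
print(r_factor(observed_flow, band_lower, band_upper))
```

The two indices pull in opposite directions: widening the band raises the p-factor but also raises the R-factor, so both have to be read together when comparing precipitation products.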

Precipitation
The catchment-averaged rainfall estimates from TRMM and CFSR indicate that these data sets have a general tendency to overestimate rainfall compared to the IMD data. This is consistent with the results reported in [41].

Streamflow
On comparing the observed and simulated streamflow volumes, we noted that the model based on IMD data reproduced the annual streamflow accurately; the performance of the other data sets, however, depends on how the model was calibrated.
When the model was calibrated with IMD data (Scenario A), the CFSR-based model overestimated the average annual streamflow. In particular, it can be seen from Figure 5 that the non-monsoon streamflow was higher. The TRMM and bias-corrected TRMM models, in contrast, resulted in slight over- and underestimation, respectively. The results closely follow the pattern in the rainfall estimates, showing that rainfall is the primary driving factor in streamflow simulation. Under Scenario B, all the models show a comparatively better ability to capture the streamflow volume, and the bias-corrected TRMM model in particular reproduced the streamflow accurately. Moreover, a significant improvement was observed in the CFSR-based model.

Evapotranspiration
The ET simulated by the IMD-based model (641.5 mm) was well within the range of 600–700 mm reported for the study area [53]. Under Scenario A, TRMM, CFSR and bias-corrected TRMM all resulted in an overestimation of the ET, although by different amounts. Even though TRMM and CFSR have similar precipitation amounts, the distribution of the water balance components is different. This difference may be attributed to the difference in rainfall pattern and intensity, as discussed in Section 4.1. When the model was calibrated with the satellite data sets (the objective function used here is the performance in streamflow volume), TRMM and bias-corrected TRMM significantly overestimated the ET, whereas the CFSR data resulted in a lower value, close to the IMD-based model estimate. Overall, we notice a considerable increase in both ET and streamflow estimates for all the data sets due to the overestimation of the rainfall.
We observed that precipitation is the primary input driving the hydrological model and also the main source of uncertainty, as different precipitation datasets lead to different best parameter ranges. Further, uncertainty in the input precipitation propagates to the model estimates of the water balance components and, therefore, to the water resources management and policy decisions based on the model results.

Conclusions
We compared four precipitation datasets of different categories, sources and resolutions, namely IMD, TRMM, bias-corrected TRMM and CFSR, using a hydrological model. We investigated the impact of precipitation products on hydrological model calibration and model predictions for a forest-dominated river basin in India. The following are the salient findings of the study.
All the considered precipitation products showed a good correlation with each other as well as with the IMD gridded data on a daily scale. The correlation coefficients (CC) of CFSR and bias-corrected TRMM with the IMD data were approximately 0.8. However, a quantitative comparison of the precipitation showed that CFSR and TRMM data have a slight tendency to overestimate precipitation. We also observed that bias-corrected TRMM data are comparable with the IMD data. The deviation is significant in the precipitation range from 30 mm/day to 60 mm/day. We created two scenarios for evaluating the precipitation products using the SWAT model. In Scenario A, the model was calibrated using IMD precipitation data as input, and this model was later run with the other data sets. In Scenario B, individual models were developed with each of the rainfall products as input. From these scenarios, we observed that the performance of streamflow modeling increased when the model was calibrated individually for each of the rainfall products.
The results from Scenario B showed that the different precipitation datasets resulted in different sets of sensitive parameters, parameter ranges and water balance components. The IMD data-based model yielded the best results in terms of streamflow simulation. The model based on the bias-corrected TRMM dataset produced results close to it and thus can be used as an alternative to gauge precipitation. In summary, the choice of precipitation product has a vital role in model performance, prediction uncertainties and parameter uncertainties in streamflow simulations. Water balance estimation based on different precipitation datasets can lead to different conclusions, and therefore the uncertainty generated by the use of different precipitation inputs must be taken into consideration.