Evaluation of Various Precipitation Products Using Ground-Based Discharge Observation at the Nujiang River Basin, China

Abstract: Precipitation observation and prediction are difficult in many high-elevation regions because of the complex terrain and the lack of in situ observations for comparison. The Nujiang River (upper and middle Salween River) basin in the Tibetan Plateau is no exception. To address this shortcoming, we propose using the gauge-observed discharge time series at the basin outlet (Jiayuqiao hydrological station) to evaluate the performance of four precipitation products (satellite-based products and reanalysis datasets). A physically based distributed cryosphere hydrological model with coupled snow and frozen soil physics was adopted to convert the basin-wide gridded precipitation into basin-outlet discharge. First, we corrected and evaluated the four precipitation products: a relationship was established between each precipitation product and the available (limited) gauge rainfall within different elevation zones, and then used to correct the product across the study basin. Second, the distributed cryosphere hydrological model was used to simulate the basin-outlet runoff driven by each corrected precipitation product. Based on comparisons of observed and simulated runoff, the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA2) precipitation performs better in the upper Nujiang River basin than the other precipitation products.


Introduction
The Tibetan Plateau (TP) is sometimes known as "The Third Pole" of the world [1]. The TP's average elevation is more than 4000 m, and its total area is more than 2.5 × 10⁶ km², lending it another nickname, "the roof of the world" [2]. The TP is the headwaters for many of South and East Asia's major rivers, including the Yangtze, Yellow, Ganges, and Salween Rivers; it is therefore important to study the hydrologic cycle of the TP [3]. The TP has experienced rapid climate change over the past three decades [4][5][6], including changes to the hydrological cycle in the TP river basins, which can have an important societal impact on local and downstream populations. Consequently, the study of regional precipitation is of great significance for the prediction, management, and utilization of water resources over the TP.
The upper and middle reaches of the Salween River basin, located in the southeast portion of the TP (Figure 1), are also known as the Nujiang River (NR) basin in China. The headstream of the NR is
Figure 1. The river networks, hydrological station, and precipitation stations used in this study are plotted. The Jiayuqiao hydrological station is shown as a black triangle.

Precipitation in the NR basin has undergone noticeable changes during the era of rapid global climate change. Du et al. found that from 1971 to 2008, precipitation in the basin increased at a rate of 21.0 mm/decade [8], primarily due to an increase in summer precipitation of 9.8 mm/decade, along with spring and autumn increases of 5.0 mm/decade and a slight increase in winter (0.8 mm/decade). Zhou et al. found that from 1980 to 2008, the average annual precipitation of the whole basin fluctuated greatly and showed an increasing trend of 13.8 mm/decade [9]. Yang et al. found that from 1981 to 2010, the trend in extreme precipitation events in the basin was complex. Annual maximum daily precipitation increased in both the upstream and downstream regions of the NR basin, but showed no change to a slight decrease in the midstream region. However, extreme precipitation in the basin as a whole has increased significantly in the past 40 years [10].
Unfortunately, the scarcity and uneven distribution of meteorological stations have long been the main restriction on hydrological simulation on the TP [11][12][13]. The NR basin has very few observational stations, and the data quality and continuity are insufficient for distributed hydrological modeling in the study area. Apart from a few national weather stations and rain gauges, it is difficult to set up observation stations because of environmental and topographic factors. It is therefore important to examine the accuracy of precipitation products in the basin.
The objective of this study is to evaluate the basin-wide precipitation of the NR from different operational precipitation products (satellite-based and reanalysis) by using ground-based discharge observations and a distributed cryosphere hydrological model. The paper is organized as follows. Section 2 describes the study area, datasets, and methodology. Results are described in Section 3, and discussions are given in Section 4. Section 5 is the conclusion.

Study Area
The NR basin is in the southeastern TP of China [14]. The basin has an area of 137,800 km², the river is 2013 km long, and its average gradient is 0.204% [15]. The elevation in the region declines by 4840 m from northwest to southeast (Figure 1). Within the basin, precipitation and other meteorological forcing data differ markedly from the northwest to the southeast.


Datasets
Geomorphological data were obtained from the 90 m resolution digital elevation model (DEM) from the National Aeronautics and Space Administration (NASA) Shuttle Radar Topography Mission (SRTM) [16]. These data for the NR basin were then resampled to a 5 km resolution DEM. Static land use data at 1 km from the U.S. Geological Survey were modified based on glacier coverage data from the International Center for Integrated Mountain Development (ICIMOD). The glacier inventory coverage was produced using Landsat 5-Multispectral Scanner System and Landsat 7-Enhanced Thematic Mapper+ images and high-resolution images of Google Earth, in combination with SRTM DEMs [17]. Soil parameters were from the globally consistent digital soil data of the Food and Agriculture Organization (FAO) [18]. The maps of land use and soil are shown in Figure 2. Meteorological data (except for precipitation data) used in this study include wind speed, air temperature, air pressure, specific humidity, and downward longwave and solar radiation, which are used to drive the hydrological model. Meteorological data were obtained from gridded (0.25° × 0.25°) 3-hourly datasets from the Global Land Data Assimilation System (GLDAS) version 2.0 for 2012 to 2016 [19][20][21][22]. Dynamic vegetation input data included leaf area index (LAI) and fraction of photosynthetically active radiation (FPAR) at 1 km spatial resolution as 8-day composites, computed from Moderate Resolution Imaging Spectroradiometer (MODIS) data [23]. GLDAS, Tropical Rainfall Measuring Mission (TRMM), and Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA2) precipitation data were extracted from gridded (0.25° × 0.25° for GLDAS, 0.25° × 0.25° for TRMM, 0.625° × 0.5° for MERRA2) 3-hourly data obtained from the NASA Goddard Space Flight Center [24,25]. The China Meteorological Forcing Dataset (CMFD) data are also gridded (0.1° × 0.1°) 3-hourly data [26,27].
All the precipitation data are from 2012 to 2016.
China Meteorological Administration (CMA) stations have long time series of precipitation data [28,29], but because the sparse CMA stations cover only a limited part of the study basin, a few rain gauges from the Ministry of Water Resources of the People's Republic of China (MWR) were also used. Based on the data available, 15 stations with daily precipitation data in the monsoon season (June to September) between 2012 and 2016 were chosen. There are also three hydrological stations, from high to low elevation, within this basin [30]. However, human activities have significantly changed the river basin downstream of the TP [31], so we selected an upstream hydrological station that more accurately reflects the unaltered processes in this region. Jiayuqiao station, which is the highest of the three hydrological stations, was therefore selected for use in this study. All stations used in this study are shown in Figure 1.

Methodology
There are significant differences between the observed precipitation and the product datasets, which we suspect derive from the complex terrain [12][35][36][37][38]. Therefore, the method we used for the precipitation correction is mainly based on the elevation gradient. For each precipitation product, monthly observed precipitation from the rain stations is available for the 15 grid cells of the product that contain a station. We compared the monthly precipitation between the observed data and the product, and calculated the proportionality coefficient at each station. We then derived the relationship between station elevation and proportionality coefficient. This relationship can be extrapolated to every grid point in the study area, which is necessary because there are no stations above a certain elevation.
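The per-station step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not define the proportionality coefficient exactly, so the sketch assumes it is the ratio of total observed to total product precipitation over the paired months, and the function name and sample values are hypothetical.

```python
def proportionality_coefficient(observed_monthly, product_monthly):
    """Ratio of total observed to total product precipitation.

    Assumed definition (not confirmed by the text): k = sum(obs) / sum(product),
    so that k * product reproduces the observed total. Months where either
    value is missing (None) are skipped.
    """
    pairs = [(o, p) for o, p in zip(observed_monthly, product_monthly)
             if o is not None and p is not None]
    total_obs = sum(o for o, _ in pairs)
    total_prod = sum(p for _, p in pairs)
    if total_prod <= 0:
        raise ValueError("product precipitation must sum to a positive value")
    return total_obs / total_prod


# Hypothetical monsoon-season monthly totals (mm) at one station, June to September.
gauge = [80.0, 120.0, 110.0, 60.0]
product = [70.0, 100.0, 90.0, 40.0]

k = proportionality_coefficient(gauge, product)
corrected = [k * p for p in product]  # scaled product at the station's grid cell
```

With this definition the corrected series preserves the gauge total by construction; if the coefficient were instead defined as product over observed, the correction would divide by it rather than multiply.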
In this study, we used a water and energy budget-based distributed hydrological model (WEB-DHM) with coupled snow and frozen soil physics [32][33][34][39]. Based on enthalpy theory, the WEB-DHM has been substantially improved for cold region hydrology by incorporating realistic simulations of snow variables (using a 3-layer energy balance snow scheme) and soil water phase changes (with a new frozen soil scheme) [33,40]. The overall structure of WEB-DHM is shown in Figure 3.
Figure 3. (A) Overall structure of WEB-DHM: (a) division from basin to subbasin; (b) subdivision from subbasin to flow intervals comprising several model grids; (c) discretization from a model grid to a number of geometrically symmetrical hillslopes; and (d) description of the water moisture transfer from atmosphere to river. $R_{sw}$, $R_{lw}$, and $H$ are downward solar radiation, downward longwave radiation, and sensible heat flux, respectively [32]. (B) A detailed description of the enthalpy-based 3-layer snow module coupled with the frozen soil module in the WEB-DHM [33].
The time step of the WEB-DHM model is one hour, and the model outputs can be expressed as hourly, daily, or monthly variables (e.g., discharge, heat fluxes) [32]. Since only daily discharge observations at the Jiayuqiao gauge are available for this study, we used daily results to compare the simulated and observed discharges. Hourly input data would be the best option for the model, but it is not always feasible to obtain all inputs at hourly resolution, and input data with longer time intervals have been used before [34]. In this study, daily data of each precipitation product were used.
At each model grid cell, the output variables are computed from the water and energy balances, which gives the model a sound physical foundation.
The WEB-DHM has already been widely used, and the model has been shown to perform well in several basins in the TP, including the Naqu River basin, the Lake Seling Co basin, the upper Yellow River basin, and the Lhasa River basin. Additionally, the model has shown good simulation ability in other countries, such as the upper Tone River basin in Japan and the Dudhkoshi River basin in Nepal [21][28][40][41][42][43][44][45][46][47].
To evaluate the precipitation products, a hydrological model of the study area must first be established. Forced by the different precipitation datasets, the model simulates the runoff. The Nash-Sutcliffe efficiency (NSE) [48] and the relative root mean square error (RRMSE) are used to evaluate the simulated runoff; they are defined in Equations (1) and (2), respectively. In the equations, $Q_S$ is the simulated runoff, $Q_O$ is the observed runoff, $\bar{Q}_O$ is the average observed runoff, and $n$ is the number of observations in the calculation.
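Using the symbols defined above, the two metrics can be written as follows. The NSE is given in its usual Nash-Sutcliffe form [48]; the RRMSE is written here as the root mean square error normalized by the mean observed runoff, which is an assumption consistent with the listed symbols rather than a form confirmed by the text. The index $i$ over the $n$ observations is introduced for illustration.

```latex
\begin{equation}
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left( Q_{O,i} - Q_{S,i} \right)^{2}}
                        {\sum_{i=1}^{n} \left( Q_{O,i} - \bar{Q}_O \right)^{2}}
\tag{1}
\end{equation}

\begin{equation}
\mathrm{RRMSE} = \frac{\sqrt{\dfrac{1}{n} \sum_{i=1}^{n} \left( Q_{S,i} - Q_{O,i} \right)^{2}}}
                      {\bar{Q}_O}
\tag{2}
\end{equation}
```

An NSE of 1 indicates a perfect match, 0 indicates a simulation no better than the observed mean, and negative values indicate a simulation worse than the observed mean; a smaller RRMSE indicates a closer fit.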

Results
Because of the high elevation, complex terrain, and harsh environment in the NR basin, there are very few meteorological stations or rain gauges, which makes it difficult to represent precipitation over the entire basin from the few observational stations alone. Therefore, based on the CMA and MWR station datasets, this study evaluated and corrected four precipitation products with the aid of the WEB-DHM hydrological model and basin-outlet discharge observations.

Comparison of Different Precipitation Products
Precipitation patterns in the NR basin are highly influenced by the timing and strength of the South Asian summer monsoon. We therefore separated the annual precipitation into non-monsoon season and monsoon season precipitation (Figure 4). The products are similar to each other in the non-monsoon season but differ significantly in the monsoon season. For hydrological simulations, it is therefore necessary to correct the precipitation products in the monsoon season before comparing and using them.
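The seasonal split can be sketched as below. This is a minimal sketch that assumes the monsoon season is June to September, as stated for the station data; the function name and the constant 1 mm/day series are hypothetical and serve only to show the bookkeeping.

```python
from datetime import date, timedelta

MONSOON_MONTHS = {6, 7, 8, 9}  # June to September, as defined for the station data


def seasonal_totals(daily):
    """Sum daily precipitation (mm) into monsoon and non-monsoon totals per year.

    `daily` maps datetime.date -> precipitation in mm.
    """
    totals = {}
    for day, value in daily.items():
        season = "monsoon" if day.month in MONSOON_MONTHS else "non_monsoon"
        year_totals = totals.setdefault(
            day.year, {"monsoon": 0.0, "non_monsoon": 0.0}
        )
        year_totals[season] += value
    return totals


# Hypothetical series for illustration: 1 mm on every day of 2012 (a leap year).
start = date(2012, 1, 1)
daily = {start + timedelta(days=i): 1.0 for i in range(366)}
totals = seasonal_totals(daily)
```

Applying the same function to each product at each grid cell gives the two seasonal fields that are compared between products.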

Correction of Precipitation Products
To correct the four precipitation products (GLDAS, TRMM, MERRA2, and CMFD) in this study, we first compared the precipitation data between the 15 precipitation stations and the corresponding grid cells in these products. We calculated the R² of daily precipitation between the precipitation products and the stations. The comparison of daily precipitation between the four precipitation products and four CMA stations is shown in Figure 5. The coefficients between each product and each station are all below −0.1, which limits the direct application of the precipitation products and shows that they need to be improved. In Figure 5, a large number of points lie very close to the coordinate axes, which indicates a mismatch between gauges and products on most days (i.e., the products record no precipitation when the gauges do, and vice versa). This might be caused by the lack of maintenance of some rain gauges at high altitudes, which degraded the quality of the observed data. Because the daily precipitation from the products and stations shows negative R², we chose monthly data for a further comparison. Based on monthly data, the average R² is around 0.1 (Figure 6 and Table 1), which is much higher than the values calculated at the daily scale. Monthly aggregation reduces the errors caused by the fine time scale (e.g., timing mismatches between gauge and product). However, the data still need to be improved, since the residual error is mostly caused by the data quality. Therefore, for further comparison and correction, we used the monthly data to calculate a proportionality coefficient between the observations and each product at each station. We found that the proportionality coefficients did not vary among stations at the same elevation but differed between elevation zones. Therefore, each product needs to be corrected separately for its different elevation zones.
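The effect of moving from daily to monthly comparison can be illustrated with a small sketch. The text does not state which R² formula was used; the negative values reported suggest a coefficient of determination of the form 1 − SS_res/SS_tot rather than a squared correlation, and that form is assumed here. The function names and the gauge and product values are hypothetical.

```python
from collections import defaultdict


def monthly_sums(daily):
    """Aggregate {(year, month, day): mm} into {(year, month): mm}."""
    sums = defaultdict(float)
    for (year, month, _day), value in daily.items():
        sums[(year, month)] += value
    return dict(sums)


def coefficient_of_determination(observed, estimated):
    """Assumed form: 1 - SS_res / SS_tot.

    Negative when the estimates fit worse than the observed mean would.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical data: the product has the right monthly amounts but is one day late.
gauge_daily = {(2012, 6, 1): 10.0, (2012, 6, 2): 0.0, (2012, 6, 3): 10.0,
               (2012, 6, 4): 0.0, (2012, 7, 1): 30.0, (2012, 7, 2): 0.0}
product_daily = {(2012, 6, 1): 0.0, (2012, 6, 2): 10.0, (2012, 6, 3): 0.0,
                 (2012, 6, 4): 10.0, (2012, 7, 1): 0.0, (2012, 7, 2): 30.0}

days = sorted(gauge_daily)
daily_r2 = coefficient_of_determination(
    [gauge_daily[d] for d in days], [product_daily[d] for d in days]
)

gauge_monthly = monthly_sums(gauge_daily)
product_monthly = monthly_sums(product_daily)
months = sorted(gauge_monthly)
monthly_r2 = coefficient_of_determination(
    [gauge_monthly[m] for m in months], [product_monthly[m] for m in months]
)
```

In this constructed case the daily points all fall on the axes and the daily score is negative, while the monthly totals agree exactly. Real data would retain amount errors after aggregation, which is consistent with the low monthly R² reported.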
According to the above proportional relationship, the station-observed data can be used for detailed correction of the precipitation products. However, at high altitudes (>4500 m), there are no rain stations, so it is still not possible to correct the precipitation data from the different products directly. Therefore, we assumed that the correction function from lower elevation stations can be extrapolated to the high-altitude regions. For example (Figure 6), we used four CMA stations and 11 MWR stations to correct the MERRA2 data. The CMA stations, at altitudes of 4528 m, 3955 m, 3085 m, and 1453 m, have proportionality coefficients of 1.1942, 0.9613, 0.8018, and 0.5536, respectively. By combining the altitudes and proportionality coefficients of all 15 stations (the altitudes of the 11 MWR stations are not given here), we obtained a curve that was extrapolated to the high-altitude regions. With this correction, the product values at the grid cells corresponding to each station, before and after correction, were compared with the monthly station measurements (Figures 7 and 8). Figure 8 shows that the corrected precipitation is much closer to the observed data than the original product. For each product, the corrected data are much closer to the line y = x (1:1 line) than the original data, and the R² values are also higher. Based on Figures 7 and 8, we produced the corrected spatial distribution in the monsoon season for each precipitation product (Figure 9). In Figure 9, the precipitation distributions of GLDAS, TRMM, and MERRA2 are close to each other, with some differences in numerical values; the precipitation distribution of CMFD is significantly different from the other products, and its precipitation in high-altitude regions (>4500 m) is significantly higher than that of the other products after correction.
The figures therefore show that the spatial patterns of the precipitation products still need further verification, and further analysis and evaluation are needed to establish the reliability and applicability of the products. To this end, the applicability of the precipitation products is assessed by using the hydrological model to simulate runoff in the high-altitude areas of the upper NR basin (with the outlet at Jiayuqiao hydrological station).
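The elevation-based extrapolation can be sketched with the four CMA values quoted above. This is only an illustration under stated assumptions: the form of the fitted curve is not given in the text, so a straight line is assumed; the 11 MWR stations are omitted because their altitudes are not listed; and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Altitudes (m) and MERRA2 proportionality coefficients of the four CMA stations.
elevation_m = [4528.0, 3955.0, 3085.0, 1453.0]
coefficient = [1.1942, 0.9613, 0.8018, 0.5536]

slope, intercept = fit_line(elevation_m, coefficient)


def coefficient_at(elevation):
    """Coefficient from the assumed linear relationship at a given elevation (m)."""
    return slope * elevation + intercept


# Extrapolate above the highest station, e.g. to a grid cell at 5000 m.
k_5000 = coefficient_at(5000.0)
```

Each grid cell of the product would then be scaled by the coefficient evaluated at that cell's DEM elevation. Because the highest cells lie above every station, the values there depend entirely on the assumed curve shape.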

Comparison of Simulated Discharges Driven by Precipitation Products
Precipitation is input to the WEB-DHM model on a grid: the precipitation value at each grid point is read into the model. This provides an adequate basis for assessing whether the spatial distribution of precipitation is reasonable.
WEB-DHM simulates runoff through physical mechanisms from a variety of forcing data. To evaluate the precipitation products, only the precipitation dataset was changed between simulations, while the other input meteorological variables were kept the same. To evaluate the precipitation correction, we simulated the runoff using both the original and the corrected precipitation data, which gave 8 runoff outputs from the 4 products. Taking the observed discharge at Jiayuqiao station as the true value, the output runoff was evaluated using NSE and RRMSE. The results are shown in Figure 10, and comparisons at the monthly scale are shown in Figure 11.
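The two scores can be computed as in the following sketch. It assumes the standard Nash-Sutcliffe form for NSE and takes RRMSE to be the root mean square error divided by the mean observed discharge, which is an assumption about the normalization. The function names and discharge values are hypothetical.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((O - S)^2) / sum((O - mean(O))^2)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rrmse(observed, simulated):
    """RMSE normalized by the mean observed value (assumed normalization)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mse = sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n
    return math.sqrt(mse) / mean_obs


# Hypothetical daily discharges (m^3/s) at the basin outlet.
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [80.0, 150.0, 260.0, 300.0]

score_nse = nse(observed, simulated)
score_rrmse = rrmse(observed, simulated)
```

Running both functions on each of the 8 simulated series against the same observed series gives one NSE and RRMSE pair per product and correction state, which is the kind of comparison summarized in Figure 10.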

Figure 11. Scatterplot of monthly runoff (standard) simulated by original and corrected precipitation.
The index values in Figures 10 and 11 show that, for each product, the runoff simulated with the corrected precipitation is better than that simulated with the original data: the NSE and R² are larger and the RRMSE is lower. However, the NSE is negative and the RRMSE is large in Figure 10, which means that the agreement between the simulated and observed runoff is still poor. Since the majority of the Jiayuqiao catchment is ungauged (Figure 1, in which Jiayuqiao station is plotted as a black triangle and rain gauges as black crosses), the precipitation cannot be corrected accurately, especially in the high-altitude region. Nevertheless, because the variation trends of the runoff simulated with each product are consistent with the observed runoff, the results can be used to discuss which precipitation product performs better in the NR basin, where ground-based data are scarce.

Discussion
The observed precipitation data in the NR basin are sparse and the precipitation products have their own uncertainties, so the current precipitation products should be corrected before evaluation. Because the spatial distribution of precipitation strongly correlates with the elevation gradient, we were able to correct the precipitation products based on the relationship between elevation and precipitation, and then evaluate the products.
Compared with the observed data, all four uncorrected products underestimate the precipitation in the NR basin. After the correction is applied, the CMFD precipitation is higher than the others, especially in high-elevation areas. We then used the corrected precipitation to simulate the runoff. The runoff simulated with both the corrected and the uncorrected products underestimates the observed runoff, although the corrected product is generally closer to the observations than the uncorrected product. We expect that this is due to the sparseness of the precipitation observation network, particularly at higher elevations.
Figure 8 shows that the correction method is effective, because the corrected precipitation is closer to the observations than the original data for each product. However, the R² values of the corrected precipitation are still low, which might be caused by the low quality and quantity of the observed data. More ground-based observations are therefore needed to correct the precipitation more accurately. This study is an initial exploration in comparing various precipitation products in the upper Nujiang River basin. The indices (Figures 6, 8, 10 and 11, and Table 1) are admittedly not yet satisfactory, which may be attributed to the very limited observations open to public use (currently only a few CMA stations are used in operational precipitation products in the study area).
In Figures 10 and 11, the evaluation indices give different results as to which corrected precipitation product is the most accurate for this region. Figure 10 shows that MERRA2 is better than the other three products because it has the largest NSE and the smallest RRMSE. In Figure 11, if R² is used as the evaluation basis, TRMM and MERRA2 are almost the same. Although the GLDAS-driven runoff lies very close to the line y = x (1:1 line), its R² is not as high as that of TRMM and MERRA2. From this comparison, MERRA2 is slightly better than the others in the NR basin. To check whether each product correlates well with the observed data, we also calculated the R² and proportionality coefficient (k) of the four products (Table 1). The results indicate that MERRA2 is again better than the other three products, with an R² of 0.114, compared with −0.146 for GLDAS, 0.099 for TRMM, and 0.111 for CMFD. However, the evaluation indices indicate that none of the products is of good quality (all have negative NSE). It is not surprising that no operational product performs well against observed hydrological data in the NR basin, given the lack of ground-based precipitation observations. The reanalysis data are combined with atmospheric model simulations, which may explain why the MERRA2 product is the most accurate of the four products. Further corrections to the precipitation, using additional observational data, are necessary for this region.

Conclusions
In this study, because of the lack of meteorological stations and rain gauges in the TP and the large uncertainties of the precipitation products, we chose the upper Nujiang River basin to evaluate and correct the precipitation products. We found the following: (1) All four precipitation products consistently underestimated the precipitation in the NR basin, because they do not incorporate enough ground-based observations. (2) Correcting the precipitation products based on the elevation gradient brought all products closer to the observed values. (3) After correction, MERRA2 provides more accurate precipitation than the other products in the upper NR basin, as judged by driving a hydrological model with the corrected data and comparing the simulated discharge with the hydrological station data.
We conclude that more high-quality ground-based observations (e.g., integrating CMA, MWR, and other in situ precipitation observations) should be used when generating operational products, in order to improve both regional precipitation estimates and hydrological simulations.

Conflicts of Interest:
The authors declare no conflict of interest.