The Qinghai-Tibet Plateau (TP, 26°–40° N, 73°–104° E) is the largest and highest plateau in the world, with an average altitude above 4000 m over an area of about 2.5 million km². Complex orography and harsh weather make it difficult to install and maintain synoptic stations in the TP. Almost all existing stations are located in the east and south of the TP, and 70% of them lie below 4000 m. Because the existing weather stations are scarce and sit at low elevations, they cannot accurately represent the meteorological conditions of the TP. Likewise, sparse and poorly representative meteorological data are of little use for developing hydrological models [3]. Distributed models delineate water cycles at the basin scale, but they also need high-quality weather data as input [5]. The sparse observation network is therefore one of the reasons for the slow progress of hydrological simulation and analysis in the basins of the TP.
Reanalysis datasets are based on remote sensing products and climate model outputs; some are corrected with observed data, and they are an important surrogate for observations [6]. They play a major role in developing models [7], revealing climate change trends [8] and studying un-gauged regions [9]. However, because uncertainties arise during data acquisition and assimilation, most studies still begin with an evaluation of the reanalysis [10]. The accuracy of reanalysis is mainly assessed in one of two ways: (1) comparing reanalysis with the corresponding observed data [11]; (2) using reanalysis as input to drive hydrological models and then comparing the hydrological features of the model output with observations [13].
The first approach is usually applied at large scales, where a certain number of weather stations are available. Many statistical indexes are used to measure the quality of reanalysis datasets, such as the correlation coefficient (R), relative bias (BIAS) and root mean square error (RMSE). Average temperature and precipitation from reanalysis are evaluated more often than other meteorological variables. Wang et al. [15] compared two types of ERA-Interim datasets with gridded observation datasets. Their results showed that, after topographic correction, the reanalysis temperature distribution closely reproduces the temperature conditions of the TP, and that its increasing trend is similar to that of the observed data. Likewise, Gao et al. [16] showed that ERA-Interim temperature performs well in the TP; R at the monthly scale ranges from 0.973 to 0.999 when compared with data from 75 stations above 3000 m. Song et al. [17] compared precipitation from eight gridded datasets with station observations in the Asian high mountains; the results indicated that gauge-based or multi-source datasets performed better, and that merged datasets are potentially useful for modeling water cycles. You et al. [18] compared multi-source datasets with gridded precipitation observations over the TP; most datasets can capture the precipitation distribution and identify variations in mean monthly precipitation. Wang et al. [19] compared precipitation, temperature, radiation, wind speed and surface pressure from six reanalysis products with observed data. The results indicate that different products differ in their ability to reproduce meteorological elements. For example, ERA-Interim performs well for temperature, whereas the Global Land Data Assimilation System (GLDAS) performs best for precipitation. In conclusion, reanalysis datasets can display the broad distribution of meteorological elements over the TP, but corrections using observed data are essential to minimize errors.
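The three indexes named above can be illustrated with a short Python sketch. It assumes paired daily series from one station and the co-located reanalysis grid cell, and it assumes relative bias is defined as the sum of differences divided by the sum of observations, a common convention that the text does not spell out. The function name and the sample values are hypothetical.

```python
import math


def evaluation_indexes(obs, sim):
    """Return (R, relative BIAS, RMSE) for paired observed and reanalysis series."""
    if len(obs) != len(sim) or len(obs) < 2:
        raise ValueError("obs and sim must be paired series with at least two values")
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n

    # Pearson correlation coefficient (R)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    r = cov / math.sqrt(var_obs * var_sim)

    # Relative bias (assumed definition): total difference over total observed
    bias = sum(s - o for o, s in zip(obs, sim)) / sum(obs)

    # Root mean square error, in the units of the variable
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / n)
    return r, bias, rmse


# Made-up values for illustration only, not data from the study
station = [0.0, 2.0, 4.0, 6.0]
reanalysis = [1.0, 3.0, 5.0, 7.0]
r, bias, rmse = evaluation_indexes(station, reanalysis)
```

In this toy case the reanalysis tracks the station perfectly in shape (R of 1) while still carrying a constant offset, which is why R is normally reported together with BIAS and RMSE rather than alone.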
With long, continuous time series and high spatial resolution, reanalysis datasets are well suited to driving hydrological models, especially in regions with few weather stations. The quality and the temporal and spatial resolution of the meteorological input data largely determine the output of distributed models. Many studies evaluate reanalysis at the watershed scale by using hydrological models. Thomas et al. [20] evaluated ten satellite and reanalysis datasets in six watersheds of different sizes in West Africa. Gilles et al. [21] analyzed how combining different reanalysis and weather station data affects the accuracy of discharge modeling in Canada and the USA. Both concluded that reanalysis datasets can be an alternative to observed data. Some reanalysis datasets perform well in runoff simulation, with satisfactory Nash-Sutcliffe efficiency (NSE) values, especially those bias-corrected with weather station data. Kan et al. [22] evaluated the Climate Prediction Center Morphing Technique (CMORPH), the Tropical Rainfall Measuring Mission Multi-satellite Precipitation Analysis (TRMM 3B42 V6), the China Meteorological Forcing Dataset (CMFD) and the Asian Precipitation-Highly Resolved Observational Data Integration Towards Evaluation of Water Resources (APHRODITE) in the upper Yarkant River. The results indicate that the precipitation distribution from CMFD is more appropriate because it is consistent with the distribution of glaciers, and that the satellite-based CMORPH gives better results when forcing the Variable Infiltration Capacity (VIC) model. Guo et al. [23] compared two kinds of multi-source reanalysis data in hydrological simulation in the Lasa River Basin: with the HIMS model, the NSE is above 0.7 at the daily scale and 0.8 at the monthly scale. Gao et al. [24] analyzed the use of CFSR and ERA-Interim to drive the VIC model in the Kash River Basin, and the results indicate that ERA-Interim is superior to CFSR. Hence, a reliable dataset can substitute for observed data in basins with sparsely distributed weather stations.
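The NSE scores quoted in these studies follow the standard Nash-Sutcliffe efficiency, which compares model error against the variance of the observed discharge. The Python sketch below is illustrative only; the function name and the discharge values are hypothetical and do not come from any of the cited studies.

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(obs) != len(sim) or not obs:
        raise ValueError("obs and sim must be non-empty paired series")
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    observed_spread = sum((o - mean_obs) ** 2 for o in obs)
    if observed_spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - squared_error / observed_spread


# Made-up discharge values (m3/s) for illustration only
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 37.0]
nse = nash_sutcliffe(observed, simulated)

# A simulation that always returns the observed mean scores zero
nse_mean_model = nash_sutcliffe(observed, [25.0] * 4)
```

A negative NSE therefore means the simulation performs worse than simply using the mean of the observed discharge as the prediction.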
Previous studies, whether they use the first or the second method, mostly focus on precipitation and average temperature. However, a complete dataset for distributed or semi-distributed hydrological models needs not only precipitation and average temperature, but also max/min temperatures, relative humidity, atmospheric pressure, wind speed, solar radiation, etc. For example, the SWAT model, one of the most popular models, is extensively applied in runoff simulation and prediction, sediment transport, etc. [25], and requires not only daily precipitation and temperature but also relative humidity, atmospheric pressure and wind speed as input weather data in order to compute evapotranspiration [26]. Relative humidity, wind speed and max/min temperatures are also of great value in research. Relative humidity reflects how close atmospheric moisture is to saturation and affects surface water, energy budgets, aerosol formation, the growth of plants and animals, etc. [27]. Wind speed describes the movement of the atmosphere and influences other weather phenomena such as precipitation and smog [29]. Maximum and minimum temperatures are more responsive to extreme weather events [31].
CMADS and CFSR are two more complete datasets that contain several meteorological elements and are recommended by the official SWAT website (https://swat.tamu.edu/). CFSR has been widely used around the world. Dile et al. [33] and Abeyou et al. [34] used CFSR to drive three different hydrological models in the Blue Nile River Basin; their results indicate that CFSR is capable of forcing hydrological models, with simulation results the same as, or better than, those forced by weather station data. In China, CFSR has been used in the Bahe River Basin [35], the Kaidu River Basin [36], the Kash River Basin [24], etc. CMADS, built by Dr. Xianyong Meng of the China Institute of Water Resources and Hydropower Research (IWHR) and bias-corrected with observed data, has been used in several basins, including China’s Juntanghu watershed [37] and the Manas River Basin [40], with satisfactory results. However, comprehensive evaluation and application of these two datasets in the TP remain scarce, especially for CMADS. Thus, in this paper, precipitation, max/min temperatures, relative humidity and wind speed from CMADS and CFSR were evaluated using data from 131 weather stations. The Yellow River Source Basin was also selected for hydrological simulation and analysis.
Reanalysis datasets are an important alternative to observed data, especially for regions with few weather stations. They provide several meteorological variables at higher resolution, which benefits hydrological simulation. However, reanalysis datasets still differ in their sources, bias-correction methods, resolution, temporal coverage, etc. CMADS and CFSR are evaluated in this paper, and the results show that bias correction with observed weather data is important for reanalysis. CMADS assimilates nearly 40,000 regional automatic stations under China’s 2421 National Automatic and Business Assessment Centers, so its data accuracy is considerably improved. A complete dataset should contain as many climate elements as possible. The accuracy of relative humidity, wind speed, solar radiation, etc. should receive more attention, not only because of their meteorological significance but also because they are important to hydrological, ecological and erosion research. The evaluation results for CMADS and CFSR indicate that relative humidity and wind speed still have room for improvement (Table 1, Figure 2 and Figure 5). In addition, long time series are more representative. CMADS covers only 9 years, compared with 35 years for CFSR, which is too short. Overall, given the good performance of CMADS in reproducing meteorological elements and forcing hydrological models, it is hoped that its authors will extend the time series to make it easier to assess hydrological changes in a long-term context.
In this paper, precipitation, max/min temperatures, relative humidity and wind speed from CMADS and CFSR are evaluated. The discrepancies between the two datasets are demonstrated, and the main results are as follows:
In the comparison with 131 meteorological stations, daily precipitation is harder to reproduce accurately than temperature. The average R for CMADS precipitation is 0.46, similar to that of CFSR (R = 0.43). R for the CMADS and CFSR max/min temperatures is higher, within 0.93–0.98. CMADS and CFSR both underestimate max/min temperatures, so the average BIAS is cold. The average RMSE of max/min temperatures is within 2.99–8.22 °C. The deviation of the CMADS and CFSR temperature time series is close to that of the observed data. Relative humidity and wind speed from CMADS are superior to those from CFSR according to the various indexes.
The professional interpolation software ANUSPLIN is used to obtain the spatial distribution of annual average precipitation, max/min temperatures, relative humidity and wind speed from the weather station data. The distributions of the three kinds of data (observations, CMADS and CFSR) are generally similar, but local differences are evident. CFSR overestimates precipitation over the whole TP, with unusually large values in the southeast. CMADS precipitation is similar to the observed data in both distribution and amount. For maximum and minimum temperatures, the three datasets are more consistent. The observed distribution of relative humidity is moister in the southeast and drier in the west, which differs from what CMADS and CFSR present. The wind speed distributions of observations and reanalysis differ clearly in the northwest.
CMADS has distinct advantages in hydrological simulation compared with observed data and CFSR. Runoff simulations achieved satisfactory results in the Yellow River Source Basin: the NSE of CMADS+SWAT is 0.78 in calibration and 0.68 in validation, the NSE of CFSR+SWAT is 0.69 and 0.52, and OBS+SWAT is unsatisfactory (NSE < 0). An obvious snowmelt process appeared in March, and temperature and soil moisture increased significantly around this time. Only eleven weather stations are located in the Yellow River Source Basin, and they lie in the lower-elevation areas of the eastern region, so they are not representative. It is therefore difficult to achieve satisfactory simulation results only by adjusting SWAT parameters. Both reanalysis datasets improve the runoff simulation in the watershed owing to their high resolution and quality, although CFSR overestimates precipitation in the Yellow River Source Basin, which results in excessive runoff. According to GeoDetector, 2 m air temperature, soil moisture and soil temperature at 1.038 m depth contribute more to snowmelt. Climate forcing data are important: deviations in precipitation (Table 1, Figure 2) lead to different amounts of runoff (Figure 6), and temperature, humidity, wind speed, etc. also play an important role in calculating evapotranspiration. Evaluating reanalysis datasets before using them to force hydrological models is essential.
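The text does not describe how GeoDetector ranks the contributing factors. As a rough illustration, the sketch below computes the factor-detector q-statistic commonly associated with the geographical detector method, q = 1 − Σ N_h σ_h² / (N σ²), where a factor that has been discretized into strata explains more of the target variable as q approaches 1. The use of population variance, the function names and the sample values are assumptions made for this example, not details taken from the study.

```python
from collections import defaultdict


def population_variance(values):
    """Variance with divisor N (assumed here; some implementations use N - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def q_statistic(target, strata):
    """Share of the target's variance explained by a stratified factor (0 to 1)."""
    if len(target) != len(strata) or not target:
        raise ValueError("target and strata must be non-empty paired series")
    groups = defaultdict(list)
    for value, label in zip(target, strata):
        groups[label].append(value)
    total = len(target) * population_variance(target)
    if total == 0:
        raise ValueError("q is undefined for a constant target")
    within = sum(len(g) * population_variance(g) for g in groups.values())
    return 1.0 - within / total


# Made-up snowmelt values and factor classes for illustration only
snowmelt = [1.0, 1.0, 5.0, 5.0]
q_strong = q_statistic(snowmelt, ["cold", "cold", "warm", "warm"])
q_none = q_statistic(snowmelt, ["a", "b", "a", "b"])
```

Here the first factor separates low and high snowmelt completely, while the second carries no information about it, so the two q values sit at the extremes of the range.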