Investigate the Applicability of CMADS and CFSR Reanalysis in Northeast China

Reanalysis datasets can provide alternative and complementary meteorological data sources for hydrological studies or other scientific studies in regions with few gauge stations. This study evaluated the accuracy of two reanalysis datasets, the China Meteorological Assimilation Driving Datasets for the Soil and Water Assessment Tool (SWAT) model (CMADS) and Climate Forecast System Reanalysis (CFSR), against gauge observations (OBS) by using interpolation software and statistical indicators in Northeast China (NEC), as well as their annual average spatial and monthly average distributions. The reliability and applicability of the two reanalysis datasets were assessed as inputs in a hydrological model (SWAT) for runoff simulation in the Hunhe River Basin. Statistical results reveal that CMADS performed better than CFSR for precipitation and temperature in NEC with the indicators closer to optimal values (the ratio of standard deviations of precipitation and maximum/minimum temperature from CMADS were 0.92, 1.01, and 0.995, respectively, while that from CFSR were 0.79, 1.07, and 0.897, respectively). Hydrological modelling results showed that CMADS + SWAT and OBS + SWAT performed far better than CFSR + SWAT on runoff simulations. The Nash-Sutcliffe efficiency (NSE) of CMADS + SWAT and OBS + SWAT ranged from 0.54 to 0.95, while that of CFSR + SWAT ranged from −0.07 to 0.85, exhibiting poor performance. The CMADS reanalysis dataset is more accurate than CFSR in NEC and is a suitable input for hydrological simulations.


Introduction
Precipitation and temperature play important roles in the climatic system and have been widely used to evaluate terrestrial ecosystem responses to climate change [1,2]. Compared with temperature, the spatio-temporal heterogeneity of precipitation is more complex and variable. Northeast China (NEC, 38°-54°N, 115°-136°E), located in the middle and high latitudes of the Northern Hemisphere, is the highest latitude area and an important granary of China. It is primarily characterized by a temperate monsoon climate and is a typically vulnerable climate area [3]. Due to the complex terrain and adverse weather and environmental conditions in NEC, weather stations have an uneven spatial distribution, and the density of the stations is 0.2 per 1000 km². Xu et al. [4] found that acceptable and relatively stable hydrological model performance was achieved when the rain gauge density ranged between 1.0 and 1.4 per 1000 km². Low density and uneven spatial distribution of gauge stations may lead to poor performance in hydrological models, particularly in regions with complex terrain. To address the scarcity and uneven spatial distribution of stations, other sources of meteorological data can be used as alternatives in simulation modelling studies.
Weather radar provides relatively high-resolution spatial and temporal precipitation data for hydrological modelling studies, but it is still associated with many challenges, such as limited spatial coverage, short data series, difficulty in identifying different precipitation forms, and uncertainty in the relationship between the radar reflectivity factor and rainfall rate [5,6]. Satellite remote sensing precipitation data are continuous and consistent in observation time and cover a wider area than gauge stations and weather radar [7]. However, owing to the limited accuracy of infrared precipitation estimation and the long revisit period of microwave sensors, satellite precipitation products have certain limitations, and their ability to detect light and solid precipitation needs to be improved [8,9].
Another alternative precipitation data source is the grid-point climate product (i.e., reanalysis data), which is developed from modeled and remotely sensed satellite data sources. Reanalysis datasets have the advantages of uninterrupted regional coverage and high spatial-temporal resolution and may provide a new and complementary meteorological data source for hydrological simulation and other applications [10,11]. At present, the widely used reanalysis precipitation products include the Climate Forecast System Reanalysis (CFSR) (~38 km) from the National Centers for Environmental Prediction (NCEP) [12], the Modern-Era Retrospective analysis for Research and Applications (MERRA) (~50 km) from the U.S. National Aeronautics and Space Administration (NASA) [13], the ERA-Interim (~82 km) issued by the European Centre for Medium-Range Weather Forecasts (ECMWF) [14], and the JRA-55 (~60 km) issued by the Japan Meteorological Agency (JMA) [15]. Liu et al. [16] evaluated the spatial and temporal performances of ERA-Interim precipitation and temperature in Mainland China. They found that ERA-Interim precipitation presented trends similar to those of the interpolated ground station data (decreasing gradually from southeast to northwest) at both annual and seasonal scales and showed good consistency in temperature, but caution was required when using ERA-Interim in areas with complex terrain. Zhao et al. [17] quantitatively evaluated the quality of NCEP-2 and CFSR reanalysis seasonal temperature data in China using detrended fluctuation analysis and noticed that the reliability of the reanalysis data varied among regions and seasons. Tan et al. [18] recommended using Asian Precipitation-Highly-Resolved Observational Data Integration towards Evaluation of Water Resources (APHRODITE) precipitation and CFSR temperature data as inputs for modeling of Malaysian water resources, because APHRODITE data often underestimated the extreme precipitation and streamflow while CFSR data dramatically overestimated them. Owing to uncertainties in the forecasting model, data assimilation scheme, and data sources of reanalysis datasets, their performance varies spatially, making it essential to assess their quality and capability in specific regions before applying them to hydrological applications [19,20].
A new reanalysis dataset called the China Meteorological Assimilation Driving Datasets for the Soil and Water Assessment Tool model (CMADS) Version 1.1 (~25 km) covering East Asia, has attracted the interest of many scholars [21][22][23][24][25][26]. Some studies have been conducted to evaluate the accuracy of precipitation estimates from CMADS using hydrological models [23][24][25][26]. However, only a limited number of studies have performed a straightforward comparison between CMADS and gauge observations (OBS) in precipitation, let alone assessed the temperature estimates from CMADS. For example, Guo et al. [26] adopted three reanalysis datasets-satellite data Tropical Rainfall Measuring Mission (TRMM) 3B42V7, interpolation dataset deriving from meteorological stations, and CMADS to drive rainfall-runoff models in the Lijiang River Basin-and found that CMADS performed the best for hydrological modeling. CFSR is a new generation reanalysis dataset of NCEP that has been widely used globally. CFSR and CMADS are selected in this study as they have high spatial resolution and are easy to obtain. Both of them can be conveniently applied in the Soil and Water Assessment Tool (SWAT) model, which is the hydrological model used in this study, to assess the performance of reanalysis products in runoff simulation. Some studies have compared CFSR with CMADS in West [27][28][29][30] and Central [31] China. For instance, Liu et al. [27] found that CMADS Water 2020, 12, 996 3 of 18 performed better than CFSR and OBS in the hydrological modelling results in the Qinghai-Tibet Plateau. The reanalysis dataset CMADS could be a complementary meteorological data source for OBS. Meng et al. [28] and Gao et al. [31] found that CMADS performed as well as OBS in runoff simulations, while CFSR performed the worst. Wang et al. [29] showed that both CMADS and CFSR could not be used as supplements to OBS. 
The performance of reanalysis data may vary from place to place because of climatic patterns, land surface condition, and their inherent uncertainties. According to the previous literature, this is the first study to compare CMADS with CFSR in hydrological application in Northeast China.
In this study, the evaluation work was conducted using statistics and hydrological modelling. Firstly, the two reanalysis datasets (CMADS and CFSR) were interpolated to regular meteorological stations over NEC, and their respective interpolations were compared with gauge observations (OBS) from the China Meteorological Administration (CMA) according to a set of meaningful diagnostic statistics. Then, the spatial interpolation distributions of annual average precipitation and maximum/minimum temperature from these three data sources in NEC were compared, as well as their monthly average precipitation and maximum/minimum temperature at the regular meteorological stations. Finally, the two original grid-point reanalysis datasets and OBS were used as inputs for runoff simulation in the hydrological modeling framework. The objective of this study is to perform quality assessment of two reanalysis datasets and their reliability and applicability for runoff simulation in NEC. The remainder of this paper is organized as follows. Section 2 introduces the basic information of NEC, data, and research methods. Section 3 presents the results and discussion, including the comparisons of two reanalysis datasets at gauge stations and their spatial distribution. The conclusions are presented in Section 4.

Study Area
NEC (38°43′-53°36′N, 115°30′-135°02′E) is comprised of Liaoning, Jilin, and Heilongjiang Provinces and Inner Mongolia Autonomous Region and has an area of 1.24 million square kilometers, with the south facing the sea and the southeast close to the ocean (Figure 1). Its elevations vary from 0 to 2667 m, and the region can generally be divided into four geomorphological units (mountain, plateau, hill, and plain) in terms of different geology, geomorphology, and formation types. NEC is almost surrounded by mountains and hills on all sides, and there are plains in the middle and the east (Songliao and Sanjiang Plain, respectively). The Hulunbeier Plateau is located in the northwest of the Greater Khingan Range (GKR) with elevations ranging between 600 and 800 m. The highest peak is located in the Changbai Mountains (CBM).
The area west of the GKR is characterized by a temperate continental climate, and the area east of the GKR is characterized by a temperate monsoon climate. From the southeast to the northwest, the annual mean temperature declines from above 7 °C to below −0.4 °C, the annual mean precipitation declines from 1100 to 200 mm, and the climate changes from humid, to semi-humid, to semi-arid. The Hunhe River Basin (HRB), located in the south of NEC and having three geomorphological units (i.e., mountain, hill, and plain), was selected to examine the quality of the two reanalysis datasets using the hydrological model. The catchment has an area of 7919 km², and the highest elevation is 1122 m. Two hydrological stations are located upstream and at the catchment outlet, respectively.

Data Collection
OBS data from 1987 to 2013 at 236 regular gauge stations (as shown in Figure 1) in NEC were collected from the CMA. OBS comprise point-scale data including precipitation, mean/maximum/minimum temperature, relative humidity, and wind speed at daily scale. CMADS, developed by Prof. Xianyong Meng from China Agricultural University (CAU), was based on Local Analysis and Prediction System/Space-Time Multiscale Analysis System (LAPS/STMAS) and was constructed using loop nesting of data, projection of resampling models, and bilinear interpolation [28,32,33]. CMADS uses the six-hourly reanalysis components of the European Centre for Medium-Range Weather Forecasts (ECMWF) as its basic background fields and assimilates the regular raw station data and data from satellites and radars. The precipitation data of CMADS merges information derived from the global satellite product Climate Prediction Center Morphing (CMORPH) and daily precipitation records observed at 2421 national automatic stations from the China National Meteorological Information Centre. CMADS V1.1 is now available from 1979 to 2018, covering regions between 0°-65°N and 60°-160°E with a spatial resolution of 0.25° × 0.25°. It can be downloaded from its official website (http://www.cmads.org/).

CFSR is a global coupled atmosphere-ocean-land surface-sea ice assimilation system developed by NCEP, with data from 1979 to July 2014 at a resolution of 38 km [12]. CFSR accounts for changes in CO₂, aerosols, and trace gases in the atmospheric model and assimilates the radiance measurements from a series of National Oceanic and Atmospheric Administration (NOAA) polar-orbiting satellites. Two sets of global precipitation analyses, the pentad Climate Prediction Center (CPC) Merged Analysis of Precipitation (CMAP) and global daily gauge analysis, are used in the CFSR land surface analysis. The precipitation data of CFSR are generated by blending the two datasets with the CFSR background on 6-hourly GDAS (Global Data Assimilation System) precipitation. The CFSR dataset is available on the website (http://globalweather.tamu).
Both CMADS and CFSR are comprised of grid-point data of accumulated 24-h precipitation (mm), maximum and minimum temperature (

Spatial Interpolation Method
Spatial interpolation is an important way to obtain meteorological data in continuous space or to estimate values at an unknown point. Thin plate spline interpolation is well-suited for the interpolation of meteorological elements [34]. It provides accurate estimates of climate by allowing for the spatially varying dependence on topography and is not limited by spatial scale [35]. Building on thin plate spline interpolation theory and on previous studies, Hutchinson developed dedicated interpolation software (Australian National University spline interpolation, ANUSPLIN) for surface fitting of climate data [36]. The underlying statistical model is as follows:
$$z_i = f(x_i) + b^{T} y_i + e_i, \quad i = 1, \ldots, N$$
where $z_i$ is the dependent variable (i.e., temperature or precipitation) at location $i$, $x_i$ is an independent variable, $f(x_i)$ is an unknown smooth function of $x_i$, $y_i$ is an independent covariate, $b^{T}$ is the coefficient of $y_i$, $e_i$ is a random error with an expected value of 0, and $N$ is the number of observations. ANUSPLIN has been widely used throughout the world, and more detailed information about its application and each module is described by Hutchinson [36]. The ANUSPLIN setup options for the different meteorological variables used in this study are shown in Table 1.
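To make the thin plate spline idea concrete, the following is a minimal sketch in plain Python. It is not ANUSPLIN and does not reproduce its options: it assumes two-dimensional coordinates only, omits the covariate term and the smoothing step (so the surface passes exactly through the data), and the function names `tps_kernel`, `solve_linear`, and `fit_tps` are hypothetical names chosen for illustration.

```python
import math

def tps_kernel(r):
    """2-D thin plate spline radial basis r^2 * log(r), defined as 0 at r = 0."""
    return 0.0 if r == 0.0 else r * r * math.log(r)

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (e.g., collinear or repeated points)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_tps(points, values):
    """Fit an exact (non-smoothing) thin plate spline; return a predictor f(x, y)."""
    n = len(points)
    size = n + 3  # n radial weights plus a constant and two linear terms
    a = [[0.0] * size for _ in range(size)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            a[i][j] = tps_kernel(math.hypot(xi - xj, yi - yj))
        a[i][n:] = [1.0, xi, yi]
        a[n][i], a[n + 1][i], a[n + 2][i] = 1.0, xi, yi
    coef = solve_linear(a, list(values) + [0.0, 0.0, 0.0])
    weights, (a0, a1, a2) = coef[:n], coef[n:]

    def predict(x, y):
        radial = sum(w * tps_kernel(math.hypot(x - px, y - py))
                     for w, (px, py) in zip(weights, points))
        return a0 + a1 * x + a2 * y + radial

    return predict

# Illustrative station coordinates and values (made up for this sketch).
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.2)]
observed = [3.0, 4.5, 2.0, 5.0, 3.7]
surface = fit_tps(stations, observed)
print(surface(0.4, 0.6))  # estimate at an ungauged location
```

In operational use the spline is normally fitted with a smoothing parameter and, for climate variables, with elevation as an additional variable or covariate; both are left out here to keep the sketch short.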

Hydrological Modelling
The Soil and Water Assessment Tool (SWAT) model developed by the U.S. Department of Agriculture (USDA) is a continuous-time, semi-distributed, and process-based river basin model, which is designed to predict the impact of land management practices on water, sediment, and agricultural chemical yields in large complex watersheds [37,38]. SWAT simulates hydrological processes at the daily time scale based on inputs such as weather, topography, land use, and soil. In SWAT, the catchment is divided into sub-basins, and each sub-basin is further divided into hydrological response units (HRUs) with homogeneous land use, soil, and slope. The watershed hydrology is simulated in the HRUs based on the water balance equation, which considers precipitation, irrigation, evapotranspiration, surface runoff, lateral flow, and percolation to shallow and/or deep aquifers [39], and the equation is as follows:
$$SW_t = SW_0 + \sum_{i=1}^{t} \left( R_i - Q_i - ET_i - P_i - QR_i \right)$$
where $SW_t$ and $SW_0$ are the final and initial soil water content, respectively; $t$ is time in days; and $R_i$, $Q_i$, $ET_i$, $P_i$, and $QR_i$ are the daily amounts of precipitation, runoff, evapotranspiration, percolation, and return flow, respectively, with all units in mm. The Soil Conservation Service (SCS) curve method [40] was used to calculate the surface runoff, the Penman-Monteith method [41] was used to calculate the potential evapotranspiration, and the variable storage routing method [42] was selected to route water through the main channel over the study area. A more detailed description of the SWAT model can be found in Arnold et al. [43] and on its official website (https://swat.tamu.edu/).
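The water balance described above (final soil water equals initial soil water plus the daily sum of precipitation minus runoff, evapotranspiration, percolation, and return flow) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not SWAT code; the function name `soil_water_content` and the daily values are hypothetical.

```python
def soil_water_content(sw0, precip, runoff, et, percolation, return_flow):
    """Return the final soil water content SW_t (mm) after all supplied days.

    sw0 is the initial soil water content (mm); the other arguments are
    equal-length sequences of daily amounts (mm).
    """
    series = (precip, runoff, et, percolation, return_flow)
    if len({len(s) for s in series}) != 1:
        raise ValueError("all daily series must have the same length")
    sw = sw0
    for r, q, e, p, qr in zip(*series):
        sw += r - q - e - p - qr  # daily net change in soil water
    return sw

# Three illustrative days (values are made up for this sketch).
final_sw = soil_water_content(
    sw0=100.0,
    precip=[10.0, 0.0, 5.0],
    runoff=[2.0, 0.0, 1.0],
    et=[3.0, 3.0, 3.0],
    percolation=[1.0, 0.5, 0.5],
    return_flow=[0.5, 0.5, 0.5],
)
print(final_sw)
```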
There are two different parameterization strategies for the SWAT model: (1) calibration and validation are performed separately for each meteorological dataset, and (2) the best fitted parameters, derived from the model calibrated with observed meteorological data and observed runoff, are used to simulate streamflow in the SWAT model driven by reanalysis datasets. It is generally known that parameter non-uniqueness is an inherent property of inverse modeling, so no single best parameter set exists. Precipitation data are a crucial factor for hydrological simulation, but the uncertainties in observed meteorological data (such as the limited number of gauge stations, the distribution of the sites, and missing data) may lead to poor hydrological performance and hence to incorrect parameters. Therefore, in this study, strategy (1) was adopted to calibrate the SWAT model using observed runoff and simulated runoff forced by OBS, CMADS, and CFSR, respectively.
In the SWAT projects, the HRB was divided into 34 sub-basins and 317 HRUs based on the geographical data (i.e., DEM, soil, and land use). Then CMADS, CFSR, and OBS were respectively used to drive the SWAT model (namely, CMADS + SWAT, CFSR + SWAT, and OBS + SWAT), and the SWAT Calibration Uncertainty Program (SWAT-CUP) was applied to calibrate the parameters. The calibration period was 2008-2010, and the validation period was 2011-2013. Table 2 lists the model parameters and their fitted values in the respective models.
Marks: "r_" means that the parameter is multiplied by (1 + calibration value), and "v_" means that the parameter is replaced by the calibration value. OBS + SWAT, CMADS + SWAT, and CFSR + SWAT mean that the SWAT model was forced by OBS, CMADS, and CFSR, respectively.
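The two qualifiers in the table note can be illustrated with a short Python sketch. The helper name `apply_calibration_change` is hypothetical, and the parameter names and numbers in the example are purely illustrative; they are not the fitted values of Table 2.

```python
def apply_calibration_change(parameter, current_value, calibration_value):
    """Apply a calibration value according to the qualifier prefix.

    "r_" : relative change, the current value is multiplied by (1 + calibration value)
    "v_" : replacement, the current value is replaced by the calibration value
    """
    if parameter.startswith("r_"):
        return current_value * (1.0 + calibration_value)
    if parameter.startswith("v_"):
        return calibration_value
    raise ValueError(f"unknown qualifier in parameter name: {parameter!r}")

# Illustrative only: a curve number lowered by 10% and a baseflow factor replaced.
print(apply_calibration_change("r__CN2.mgt", 80.0, -0.10))
print(apply_calibration_change("v__ALPHA_BF.gw", 0.048, 0.35))
```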

Performance Indicators
In order to assess the accuracy of CMADS and CFSR precipitation and temperature (i.e., maximum and minimum temperature) in NEC, the two reanalysis datasets are interpolated to the regular gauge stations, and then their interpolations are compared with gauge observations. Four statistical indices are used as the evaluation indicators, i.e., the correlation coefficient (R), the relative error (RE), the root mean square error (RMSE), and the ratio of standard deviations (STD_ratio). Some information on the statistical indices is shown in Table 3. In the formulas for R, RE, RMSE, and STD_ratio, $x_i$ refers to the meteorological interpolation values at the regular gauge stations derived from reanalysis datasets, and $y_i$ refers to the gauge observations; $\bar{x}$ and $\bar{y}$ are the mean values. In addition, the Nash-Sutcliffe efficiency (NSE) and coefficient of determination (R²) are used to measure the accuracy of runoff simulation of the SWAT model driven by CMADS, CFSR, or OBS. In the formulas for NSE and R², $Q_i^{O}$ and $Q_i^{s}$ refer to the observed and simulated runoff values, respectively; $\bar{Q}^{O}$ and $\bar{Q}^{s}$ are the mean values; and n and m are the number of values. Additionally, as recommended by Moriasi et al. [44], four model performance ranks (i.e., "Very good", "Good", "Satisfactory", and "Unsatisfactory") were assigned based on the values of R² and NSE (as shown in Table 4).
Table 3. Statistical indices for the evaluation of precipitation, temperature, and runoff in this study.
Table 4. Model performance ranks for runoff simulation at monthly temporal scale [44].
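Because the formulas of Table 3 are not reproduced in the text, the following Python sketch uses the commonly used textbook definitions of these indicators as an assumption: RE is taken as the difference of the means divided by the observed mean, and STD_ratio as the standard deviation of the reanalysis values divided by that of the observations. The function names are hypothetical. With these assumptions, the RE of the December mean precipitation values quoted in the results (4.03 mm for CMADS and 1.13 mm for OBS) evaluates to about 2.57, which matches the reported value.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _std(values):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def correlation(x, y):
    """Pearson correlation coefficient R between reanalysis x and observations y."""
    mx, my = _mean(x), _mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def relative_error(x, y):
    """RE, assumed here to be (mean(x) - mean(y)) / mean(y)."""
    return (_mean(x) - _mean(y)) / _mean(y)

def rmse(x, y):
    """Root mean square error between x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def std_ratio(x, y):
    """Ratio of standard deviations, assumed here to be std(x) / std(y)."""
    return _std(x) / _std(y)

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency of simulated runoff against observed runoff."""
    mo = _mean(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mo) ** 2 for o in observed)
    return 1.0 - num / den

def r_squared(observed, simulated):
    """Coefficient of determination, taken as the squared Pearson correlation."""
    return correlation(observed, simulated) ** 2

# December mean precipitation from the text: CMADS 4.03 mm, OBS 1.13 mm.
print(round(relative_error([4.03], [1.13]), 2))
```

The small observed mean in the denominator is what makes RE large in dry months even when the absolute difference is only a few millimetres.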

Comparison and Evaluation of CMADS and CFSR
Four evaluation indicators (i.e., R, RE, RMSE, and STD_ratio) were used for the comparison of CMADS and CFSR with observed data for precipitation, maximum temperature (T_max), and minimum temperature (T_min) at the monthly scale. The comparison at the 236 gauge stations is displayed in Figure 2, and the comparison results are shown in Table 5. For precipitation, 77% of the stations had a higher RE for CFSR than for CMADS. We found that high RE values usually occurred in winter, when there was less precipitation; even minor differences in precipitation would result in a high RE because the denominator values were small. For the highest RE value from CMADS in Figure 2, the mean precipitation values from CMADS and OBS in December were 4.03 and 1.13 mm, respectively, an absolute error of only 2.90 mm but an RE of 2.57, whereas the RE ranged from 0.002 to 0.160 during April-October. The RMSE of CMADS precipitation ranged from 3.40 to 39.79 mm, with an average value of 13.4 mm, and that of CFSR ranged from 5.41 to 34.81 mm, with an average value of 17.9 mm. The RMSE of CFSR was higher than that of CMADS at 79% of the stations, and 56% of the stations fell below their respective average value for both CMADS and CFSR. The STD_ratio of CMADS precipitation ranged from 0.60 to 1.13, with an average value of 0.92, and that of CFSR precipitation ranged from 0.44 to 1.23, with an average value of 0.79. For the STD_ratio, 62% of the stations fell within 1 ± 0.1 for CMADS, while for CFSR, it was only 26%.

For maximum temperature, the RE at 57% of the stations for CMADS fell within ±0.05, while that for CFSR was only 29%. When the average monthly observed maximum temperature is slightly above or below zero, it can produce a relatively high or low RE. When excluding the lowest RE values (−28.03 and −140.67), the average RE values of the CMADS and CFSR maximum temperature would be 0.02 and 0.08, respectively. The RMSE of CMADS maximum temperature was always lower than that of CFSR at each station, as shown in Figure 2. Regarding the RMSE, 82% of the stations fell below 1 for CMADS, while it was only 2% for CFSR. For CMADS, 100% of the stations fell within 1 ± 0.04 for the STD_ratio, while it was only 15% for CFSR. The four evaluation indicators for comparing CMADS and CFSR with observed data for the minimum temperature were similar to those for the maximum temperature.
It is clear from Figure 2 that the variation ranges of most statistical indicator values of CMADS were smaller than those of CFSR. In Table 5, the R values indicated that both CMADS and CFSR had a high linear correlation with the observed data, particularly CMADS, and the RE, RMSE, and STD_ratio showed that the precipitation and maximum/minimum temperature of CMADS were closer to the observed weather data than those of CFSR.

Spatial Annual and Monthly Average Distribution of CMADS, CFSR, and Observed Data
To compare CMADS and CFSR with observed data (OBS) in NEC, the spatial distributions of annual average precipitation and maximum/minimum temperature from these three data sources based on the ANUSPLIN interpolation software are shown in Figure 3. It is clear from the figure that precipitation in NEC was primarily characterized by a decline from the southeast to the northwest. As the terrain blocked and lifted the air flow, the water vapor from the sea/ocean was blocked by the mountains in the process of moving, which resulted in abundant precipitation in the CBM and the southeast of the Lesser Khingan Range (LKR), ranging from 600 to 1150 mm (as shown by OBS). The precipitation ranged from 300 to 600 mm in the Songliao and Sanjiang Plain (PLN) and from 185 to 400 mm in the Hulunbeier Plateau (as shown by OBS). The annual precipitation in NEC was mainly in the range of 400-800 mm.
Based on the spatial distribution of precipitation in NEC, we found that it was mostly influenced by water vapor and elevation. Leaving aside the precipitation amount, both CMADS and CFSR had precipitation spatial distributions similar to that of OBS, i.e., precipitation was higher in the mountain regions and lower in the plains and plateau. In the whole NEC region, the precipitation of CMADS was lower than that of OBS, while that of CFSR was higher than that of OBS. The region of maximum precipitation in CMADS was similar to that in OBS, whereas that in CFSR was shifted slightly to the northeast (Figure 3).
The maximum temperature ranged from 1 to 16 °C, the minimum temperature ranged from −16 to 9 °C, and relatively low and high maximum/minimum temperatures appeared in the GKR and the coastal areas, respectively (Figure 3). The temperature (i.e., maximum and minimum temperature) gradually increased from north to south and from mountain to plain, which revealed that the temperature spatial distribution was mainly influenced by latitude and elevation. The maximum temperature ranged mainly from 8 to 16 °C in the PLN and from 1 to 8 °C in the other regions; correspondingly, the minimum temperature ranges were −2 to 9 °C and −16 to −2 °C, respectively. Figure 3 showed that both CMADS and CFSR had temperature spatial distributions and values similar to those of OBS, particularly CMADS.
The spatial RMSE of the stations from CMADS and CFSR is shown in Figure 4. Regarding precipitation, the relatively high RMSE (ranging from 15 to 40 mm) mainly occurred in the coastal areas, mountains, and the connecting regions between plains and mountains. When the water vapor was transported from the southeast to the northwest, the precipitation events became complex and variable due to the influence of elevation. In addition, precipitation in these areas was usually larger than that in the inland plain (Figure 3), resulting in a higher RMSE. Regarding temperature, the RMSE in the mountains was larger than that in the plain because of the influence of elevation. On the whole, the RMSE of precipitation and temperature from CMADS was less than that from CFSR (Figure 4), which was consistent with the RMSE result in Figure 2 and Table 5.
In addition, NEC was simply divided into three regions from southeast to northwest according to the spatial terrain and climatic classification, i.e., the Changbai Mountains (CBM), the Songliao and Sanjiang Plain (PLN), and the Greater Khingan Range (GKR) including the Hulunbeier Plateau. The monthly average meteorological elements from CMADS, CFSR, and OBS in these regions are displayed in Figure 5. The amount of precipitation gradually decreased from the CBM to the PLN to the GKR, and the temperature correspondingly declined, which was consistent with the spatial distribution shown in Figure 3. The monthly average precipitation and temperature distributions were uneven within a year. Both high precipitation and high temperature occurred in summer, while winter was dry and cold.
The differences between the maximum temperature in July and January from the southeast to the northwest were 34.6 °C in the CBM, 40.5 °C in the PLN, and 44.0 °C in the GKR, respectively; correspondingly, those for minimum temperature were 38.6, 42.3, and 44.0 °C, respectively. This demonstrated that the monthly average temperature distributions within a year were more uneven in the cold area than in the relatively warm area. Further, the temperature of CMADS was closer to OBS than that of CFSR (Figure 5 and Table 5).
Figure 5 shows that the highest monthly average precipitation in most NEC regions was concentrated in July; however, precipitation in August in the CBM was nearly as high as that in July because of the monsoon climate and the high elevation. In this period, the occurrence of precipitation events and heavy precipitation increased. The monthly average precipitation at the stations of the three regions appeared to be underestimated by CMADS (Figure 5), which could explain why the CMADS annual average precipitation was lower than that of OBS in most areas (Figure 3). The monthly average precipitation of CMADS agreed better with OBS than that of CFSR did (Figure 5), which was consistent with the R at all stations in Figure 2. The performance of CMADS in precipitation may result from (a) CMORPH satellite data, (b) interpolation error, and (c) measurement error in OBS. CMORPH, which has the advantages and disadvantages of satellite precipitation products, is taken as the background field of CMADS precipitation, and it has difficulty determining precipitation below 4 mm because of the interference of ground surface reflectivity [9,45]. The Chinese regional hourly precipitation fusion product based on automatic precipitation stations is also assimilated in the precipitation data of CMADS; however, this may lead to errors when performing inverse interpolation. Additionally, precipitation events below 1.0 mm account for 56% of OBS events, and the weight of observation sampling error, measurement instrument error, and other factors increases in the calculation of light precipitation, particularly in winter. CFSR overestimated the monthly average precipitation during the dry months and sometimes over- or underestimated it during the rainy months (Figure 5).
The annual average precipitation at the stations from CFSR was higher than that of OBS in the three regions (Figure 5), which was consistent with their spatial distributions shown in Figure 3. This might have occurred because CFSR often overestimates light precipitation and underestimates heavy precipitation, as has been demonstrated in the Kaidu River Basin [46] and in Bolivia [47]. The complex topography and physiognomy, the physical parameterization of the numerical model, and other factors can account for major differences between CFSR and OBS in precipitation. In addition, the precipitation and temperature of the study area, the Hunhe River Basin (HRB), are also displayed in Figure 5. The Hunhe River originates from the Changbai Mountains; therefore, the monthly average distributions of precipitation and temperature were similar between CBM and HRB. However, in the HRB, precipitation in August was clearly larger than that in July because heavier rainstorm events occurred during August.
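The station-level comparison above relies on indicators such as RMSE, the correlation coefficient R, and the ratio of standard deviations. The following is a minimal Python sketch of how such indicators might be computed for one station. The function name `compare_to_obs`, the direction of the ratio (reanalysis over OBS), the relative-bias definition, and all sample values are assumptions made for illustration, not the procedure or data used in this study.

```python
import numpy as np

def compare_to_obs(reanalysis, obs):
    """Compare a reanalysis series with gauge observations (same length).

    Returns RMSE, Pearson R, ratio of standard deviations
    (reanalysis / obs, an assumed convention) and relative bias in percent.
    """
    rea = np.asarray(reanalysis, dtype=float)
    ref = np.asarray(obs, dtype=float)
    rmse = float(np.sqrt(np.mean((rea - ref) ** 2)))
    r = float(np.corrcoef(rea, ref)[0, 1])
    std_ratio = float(np.std(rea) / np.std(ref))
    rel_bias_pct = float(100.0 * (rea.sum() - ref.sum()) / ref.sum())
    return {"rmse": rmse, "r": r, "std_ratio": std_ratio,
            "rel_bias_pct": rel_bias_pct}

# Hypothetical monthly precipitation totals (mm) at a single station.
obs_mm = [5.0, 8.0, 15.0, 35.0, 60.0, 95.0, 160.0, 150.0, 60.0, 35.0, 15.0, 7.0]
rea_mm = [9.0, 12.0, 20.0, 40.0, 66.0, 99.0, 150.0, 138.0, 64.0, 40.0, 21.0, 11.0]

for name, value in compare_to_obs(rea_mm, obs_mm).items():
    print(f"{name}: {value:.3f}")
```

A ratio of standard deviations and an R close to 1, together with an RMSE and relative bias close to 0, would indicate close agreement with OBS under these definitions.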

Runoff Simulation in the Hunhe River Basin
In order to further assess the accuracy of meteorological datasets and their capability for runoff prediction in NEC, the meteorological datasets (CMADS, CFSR, and OBS) were used as inputs into a hydrological model (SWAT) to simulate runoff. The runoff simulations, including one derived from an uncalibrated SWAT model forced by OBS (named "Uncalibrated"), were compared with the observed runoff at the hydrological stations in the HRB. The monthly runoff simulations at two hydrological stations are shown in Figure 6, and the performance indicators are shown in Table 6.
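For reference, the two performance indicators used for the runoff simulations, the Nash-Sutcliffe efficiency (NSE) and the coefficient of determination (R²), can be sketched in Python as follows. The standard definitions are used, with R² taken as the squared Pearson correlation; the function names and the sample runoff values are hypothetical, and the rating classes of Table 4 are not reproduced here.

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared
    simulation error to the variance of the observations about their mean."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def r_squared(sim, obs):
    """Coefficient of determination as the squared Pearson correlation.
    Undefined (NaN) if either series is constant."""
    r = np.corrcoef(np.asarray(sim, dtype=float), np.asarray(obs, dtype=float))[0, 1]
    return float(r ** 2)

# Hypothetical monthly runoff (m^3/s), for illustration only.
observed = [12.0, 15.0, 30.0, 85.0, 140.0, 60.0, 25.0, 14.0]
simulated = [10.0, 18.0, 34.0, 78.0, 125.0, 70.0, 30.0, 12.0]

print(f"NSE = {nse(simulated, observed):.2f}")
print(f"R2  = {r_squared(simulated, observed):.2f}")
```

NSE equals 1 for a perfect simulation, equals 0 when the simulation is no better than the observed mean, and becomes negative when it is worse than the mean. Because R² measures only linear association, a simulation with a systematic bias can have a high R² and still a low NSE.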
According to Tables 4 and 6, during the calibration period, the uncalibrated runoff simulations achieved "S" (satisfactory; 0.6 < NSE/R² < 0.7) at the two hydrological stations. During the validation period, very good performance ("V"; NSE/R² > 0.8) was achieved at Beikouqian station, while unsatisfactory performance ("US"; NSE/R² < 0.5) was obtained at Shenyang station. The uncalibrated simulations appeared to overestimate runoff after the flood season every year (Figure 6). During the calibration period, all NSE values from both OBS + SWAT and CMADS + SWAT were above 0.8 at the two hydrological stations, all R² values were above 0.85, and their performance ranks reached "V". Both NSE and R² values from CFSR + SWAT ranged from 0.78 to 0.85 during the calibration period, with the performance ranks reaching "G" (good). During the validation period, all values of NSE and R² from both OBS + SWAT and CMADS + SWAT ranged from 0.54 to 0.96, with the performance ranks reaching "V" and "S" at Beikouqian and Shenyang hydrological stations, respectively; in contrast, the values of NSE and R² from CFSR + SWAT ranged from −0.07 to 0.50, and it was therefore rated "US".
Figure 6 shows the relationship between measured precipitation and runoff simulation at Beikouqian and Shenyang hydrological stations in the HRB.
The larger precipitation and runoff peaks mainly occurred in August, particularly in wet years, which was consistent with the monthly average precipitation in August in the HRB (Figure 5). The monthly average precipitation distributions from CMADS and OBS were similar (Figure 5); thus, similar runoff simulations were derived from CMADS + SWAT and OBS + SWAT. Their runoff simulations were close to the observed runoff, indicating that good performance had been achieved (Table 6). Other studies have also verified the high accuracy of CMADS in runoff simulation over the Qinghai-Tibet Plateau [27], Heihe River Basin [28], and Korean Peninsula [23]. Overall, the runoff simulations from CFSR + SWAT followed the observed runoff, but an obvious overestimation of runoff could be seen in August 2012, and an underestimation in August 2013 (Figure 6). Therefore, the runoff simulation performance of CFSR + SWAT was relatively poor (Table 6). According to the reports we collected [48], heavy rainstorms caused by the monsoon occurred in August 2013, leading to an extraordinary 50-year flood. By comparing the monthly average precipitation of CFSR and OBS in August, we found that CFSR overestimated the measured precipitation by 39.9 mm (+16.5%) in 2012 but underestimated it by 95.0 mm (−33.7%) in 2013. Additionally, the accumulated precipitation was also large in August 2010, resulting in floods, but the precipitation intensity was weaker than that in 2013. In this month, CFSR underestimated the measured precipitation by about 91 mm (−23.4%). Blacutt et al. [47] found that, when separating the precipitation into classes, CFSR overestimated light to moderate precipitation (1-20 mm/day) and underestimated very heavy rain (>50 mm/day) in Bolivia. He et al. [49] reported that CFSR failed to capture extreme precipitation >120 mm/day in East China.
Therefore, CFSR may tend to overestimate precipitation in regions or periods with light to moderate precipitation and to clearly underestimate extreme precipitation, resulting in a similar pattern in the runoff simulation (Figures 5 and 6). This disadvantage could limit the use of CFSR in hydrological studies. Some studies have attempted to correct the precipitation of CFSR using different methods (e.g., the inverse distance weighted method, power transformation, and local intensity scaling) in China [46,50,51]. For example, Li [50] used the inverse distance weighted method to correct CFSR precipitation data in the Jinghe River Basin, Western China. The correction greatly improved the performance of CFSR in runoff simulations, from "US" to "S" or to "G". Overall, CFSR is not suitable for runoff simulation in the HRB. Some correction of CFSR precipitation may improve its application in hydrological simulations, but it is more convenient to use CMADS directly.
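The class-wise behaviour described above (overestimation of light to moderate precipitation, underestimation of very heavy precipitation) can be checked by grouping days according to observed intensity and averaging the error in each group. The Python sketch below illustrates this idea. The 1, 20, and 50 mm/day bounds follow the classes quoted from Blacutt et al. [47]; the label "heavy" for the intermediate 20-50 mm/day class, the function name, and the sample values are assumptions made for illustration.

```python
import numpy as np

def bias_by_intensity_class(reanalysis, obs, edges=(1.0, 20.0, 50.0)):
    """Mean error (reanalysis minus OBS, mm/day) per observed intensity class.

    Days are assigned to a class by their observed amount. A class with no
    days maps to None. The label "heavy" for the middle class is an assumption.
    """
    rea = np.asarray(reanalysis, dtype=float)
    ref = np.asarray(obs, dtype=float)
    classes = [
        ("light-moderate", edges[0], edges[1]),
        ("heavy", edges[1], edges[2]),
        ("very heavy", edges[2], np.inf),
    ]
    result = {}
    for name, low, high in classes:
        mask = (ref >= low) & (ref < high)
        result[name] = float(np.mean(rea[mask] - ref[mask])) if mask.any() else None
    return result

# Hypothetical daily precipitation (mm/day), for illustration only.
obs_daily = [2.0, 10.0, 30.0, 60.0, 80.0]
rea_daily = [4.0, 13.0, 30.0, 40.0, 60.0]

print(bias_by_intensity_class(rea_daily, obs_daily))
```

A positive mean error in the light-to-moderate class combined with a negative mean error in the very heavy class would correspond to the pattern reported for CFSR.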
In addition to the quality of meteorological data, the number and spatial distribution of meteorological stations are two important factors that affect hydrological simulations. The upstream and downstream of the HRB are wide and narrow, respectively. Four meteorological stations are distributed across the upstream main stream and tributary, middle stream, and downstream regions. According to Xu et al. [4], a rain gauge density of 1.0 to 1.4 per 1000 km² is appropriate; the density in the HRB is only 0.5 per 1000 km², yet OBS + SWAT achieved good performance in the runoff simulation. This might have occurred because the meteorological stations are relatively evenly distributed in the HRB, so that they reflect the precipitation pattern well and offset the scarcity of stations. Unfortunately, meteorological stations are unevenly distributed in most areas, in which case CMADS may serve as a supplement to OBS.

Conclusions
In this study, we compared the precipitation and temperature (i.e., Tmax and Tmin) from reanalysis datasets (CMADS and CFSR) against those from gauge observations (OBS) at each station in Northeast China (NEC), as well as their annual average spatial distributions, monthly average distributions, and accuracy for runoff simulation. Compared with CFSR, CMADS had a higher consistency with OBS in precipitation and temperature, and the runoff simulation results of CMADS and OBS were much better than those of CFSR. The main conclusions of this paper are as follows:
(1) From the comparison between CMADS and CFSR at the regular stations, it was clear that the statistical indicators of CMADS were usually close to the optimal values, and their ranges were smaller than those of CFSR (Figure 2 and Table 5). Both CMADS and CFSR temperatures were close to those of OBS, particularly CMADS temperature, and their average RMSE ranged from 0.76 to 1.74 °C. Compared with temperature, there were more uncertainties in accurately predicting precipitation, and the average RMSE of CMADS and CFSR precipitation was 13.4 and 17.9 mm, respectively. The annual average spatial precipitation distributions from the three data sources were similar, and all of them correctly reflected the spatial distribution characteristics of precipitation in NEC. However, CMADS tended to underestimate the annual average precipitation while CFSR overestimated it (Figure 3), which corresponded to the monthly average distributions based on the partitions (Figure 5). Moreover, the maximum precipitation region of CFSR was not consistent with that of OBS, as it was shifted slightly to the northeast (Figure 3). The annual average and monthly average temperatures from the three data sources were similar in spatial distribution and in value, particularly for CMADS, which also confirmed the evaluation results based on the statistical indicators in Table 5.
(2) Overall, CMADS + SWAT and OBS + SWAT both achieved a very good (or satisfactory) performance in the runoff simulation. Their simulated monthly runoff was consistent with the observed runoff (Figure 6). The NSE and R² of CMADS + SWAT ranged from 0.54 to 0.95 and from 0.73 to 0.97, respectively. The ranges of NSE and R² from OBS + SWAT were 0.63-0.94 and 0.78-0.96, respectively. However, the NSE and R² of CFSR + SWAT ranged from −0.07 to 0.85 and from 0.39 to 0.85, respectively, and the runoff simulations of CFSR + SWAT were poor because CFSR overestimated light-to-moderate precipitation and underestimated heavy precipitation. Its runoff simulation performance was good only during the calibration period. Overall, we found that CFSR was not suitable for runoff simulation, and that CMADS can be used as a supplement to OBS in regions of NEC with few or no stations.
(3) The precipitation and temperature of CMADS were more consistent with the OBS than that of CFSR. In summary, the precipitation data exhibits greater heterogeneity than the temperature data. Owing to the complexity and variability of precipitation events in space and time, it is difficult to