Multiscale Assessments of Three Reanalysis Temperature Data Systems over China

Abstract: Temperature is one of the most important meteorological variables for global climate change and sustainable human development. It plays an important role in agroclimatic regionalization and crop production. To date, temperature data have come from a wide range of sources. A detailed understanding of the reliability and applicability of these data will support research in crop modelling, agricultural ecology and irrigation. In this study, temperature reanalysis products produced by the China Meteorological Administration Land Data Assimilation System (CLDAS), the U.S. Global Land Data Assimilation System (GLDAS) and the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis version 5 (ERA5)-Land are verified against hourly observations collected from 2265 national automatic weather stations (NAWS) in China for the period 2017–2019. These three systems are advanced and widely used multi-source data fusion and reanalysis systems. The station observations have gone through data quality control (QC) and are taken as “true values” in the present study. The three reanalysis temperature datasets were spatially interpolated to the station locations at each time using the bilinear interpolation method. The accuracy of the gridded datasets was then evaluated by calculating statistical metrics. The conclusions are as follows. (1) Based on the evaluation of temporal variability and spatial distribution as well as correlation and bias analysis, all three reanalysis products are reasonable in China. (2) Statistically, the CLDAS product has the highest accuracy, with a root mean square error (RMSE) of 0.83 °C. The RMSEs of the ERA5-Land and GLDAS products are 2.72 °C and 2.91 °C, respectively. This result indicates that CLDAS performs better than ERA5-Land and GLDAS, while ERA5-Land performs better than GLDAS. (3) The accuracy of the data decreases with increasing elevation, which is common to all three products. This implies that more caution is needed when using the three reanalysis temperature datasets in mountainous regions with complex terrain. The major conclusion of this study is that the CLDAS product demonstrates relatively high reliability, which is of great significance for the study of climate change and for forcing crop models.


Introduction
Climate change and its impact on agricultural regionalization and crop production is one of the most important fields of study around the world. Temperature is an important indicator of the energy balance of the Earth's surface and directly affects global climate change. Accurate temperature data can reasonably drive crop models to simulate the impact of climate change on agricultural production, which is a key tool for exploring planting management systems and putting forward adaptation strategies [1][2][3]. To date, temperature data come from a wide range of sources. Conventional temperature observations at ground-based weather stations are single-point observations. Although a single-point observation may have high accuracy, its spatial coverage is quite limited, especially in mountainous areas with complex terrain, where the observation stations are often sparsely and unevenly distributed due to multiple constraints, such as topography, environmental conditions and maintenance difficulties. As a result, observations in such areas often have certain limitations in representativeness and applicability [4][5][6]. Temperature retrievals from remote sensing can provide spatially continuous observations, yet the retrieval accuracy is low [7][8][9]. Numerical model simulations have certain advantages regarding spatial-temporal resolution. However, the model results are severely affected by the choice of physical parameterization schemes [10,11], which often leads to large uncertainties in the output.
In recent years, various real-time analysis or reanalysis datasets (hereafter referred to as gridded datasets) on regular grids with high spatial resolution and temporal continuity have been produced by different multi-source data fusion and assimilation systems. A data fusion and assimilation system can take advantage of various data, analyze and process data from different sources, such as direct observations, retrievals and model simulations, and output gridded datasets. A gridded dataset can cover a large spatial area over a long period, and thus effectively makes up for the lack of observations in areas where observation stations are sparse [12]. These gridded datasets provide basic data support for gridded forecasting, climate analysis and application services in meteorological agencies [13]. They can also be used as input data to drive land surface, hydrological and ecological models to obtain more reliable results [10].
At present, various global and regional reanalysis datasets have been released. These datasets, including atmospheric and surface datasets, are produced by different data assimilation and fusion systems. The atmospheric reanalysis datasets include meteorological variables, such as temperature, relative humidity and UV winds on various pressure levels. The surface reanalysis datasets are composed of surface air temperature, soil moisture, etc. To date, the National Centers for Environmental Prediction (NCEP) and National Center for Atmospheric Research (NCAR) reanalysis products [14,15], the ECMWF reanalysis product (ERA5) [16][17][18], and the Japan Meteorological Agency (JMA) product JRA-55 [19] are the most widely used in the meteorological field. In practical applications, however, the requirements for the spatial resolution of near-surface elements are higher than those for atmospheric elements in the upper air. Therefore, land surface fusion systems have been developing rapidly in recent years, and the fusion datasets that include near-surface meteorological elements and soil variables have been widely used in weather and climate prediction, water resources management and water cycle studies. In the early 2000s, the National Aeronautics and Space Administration (NASA) and the National Oceanic and Atmospheric Administration (NOAA) jointly established the Global Land Data Assimilation System (GLDAS) [20]. In 2019, ECMWF released the high-accuracy ERA5-Land gridded surface dataset [21]. In 2015, the China Meteorological Administration Land Data Assimilation System (CLDAS-V2.0) was successfully developed [12,22,23].
Following the continuous improvement of observational systems, assimilation systems and numerical models, the spatial and temporal resolutions of gridded temperature datasets have also increased. They provide a rich data source for studies of regional atmospheric circulation mechanisms and climate change. However, due to differences in the input data sources, as well as in the fusion models and assimilation systems, the quality of the simulated temperature varies among products [24]. Therefore, the accuracy and applicability of temperature reanalysis datasets have always been a major concern of meteorologists. Many studies have evaluated the applicability of surface air temperature in several reanalysis products, such as ERA-40, JRA-25, NCEP/NCAR and NCEP/DOE. It is found that these gridded reanalysis datasets can, to a certain degree, reflect the spatial and temporal distribution characteristics of the observations [25][26][27][28][29][30][31], yet the differences between them show obvious regional and seasonal variations. Although the near-surface air temperatures from GLDAS, ERA5-Land and CLDAS have each been evaluated over limited areas, a comprehensive and detailed evaluation and comparison of these data over the land areas of China has not been conducted. Note that the evaluation of gridded surface air temperature datasets is an important component of climate change studies. The results of such an evaluation provide a valuable reference for understanding regional temperature changes and promoting sustainable development.
Based on observations collected at automatic weather stations in China, this study analyzes the accuracy of near-surface air temperature in the GLDAS, ERA5-Land and CLDAS gridded datasets over mainland China from different temporal and spatial perspectives. The results of the evaluation will help researchers to understand the applicability of these gridded datasets in China and provide a reference for the selection of appropriate temperature datasets in studies of climate change, extreme weather, the Earth's energy balance and various numerical models. Meanwhile, this study will also help research institutions to further improve the algorithms used for producing these gridded datasets.

Data
Table 1 lists the spatial and temporal resolutions of the datasets used in this study and their coverage areas. It contains in situ data (NAWS) and three gridded datasets (GLDAS, ERA5-Land and CLDAS).

GLDAS Data
GLDAS evolved from the land information system [32] and is a land surface data assimilation system that consists of multiple land surface models. It is applied to integrate observation-based data and produce surface state variables (such as soil moisture and surface temperature) and flux variables (such as evaporation, latent heat flux and sensible heat flux). GLDAS includes four land surface models [33], i.e., Noah, Mosaic, CLM and VIC. Driven by Princeton University's global meteorological dataset, GLDAS-2 created a more climatologically consistent dataset that covers the period from 1948 to 2010 [34]. The horizontal resolution of this dataset is 0.25° and the temporal resolution is 3 h. It covers the area of (60°S-90°N, 180°W-180°E). In the present study, GLDAS gridded temperature data are downloaded from the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) (http://disc.sci.gsfc.nasa.gov/hydrology/data-holdings, accessed on 5 September 2021).

ERA5-Land Data
ERA5-Land is a rerun of the land component of the ERA5 climate reanalysis, with a series of improvements that make it better suited to land applications [21,[35][36][37]. In particular, ERA5-Land runs at enhanced resolution (9 km vs. 31 km in ERA5). The temporal frequency of the output is hourly and the fields are masked for all oceans, making them lighter to handle. ERA5-Land is produced by a single model simulation that is not coupled to the atmospheric model of the ECMWF Integrated Forecasting System (IFS). The ERA5-Land historical dataset for the period 1950-1980 was released in September 2021, and the dataset for the period since 1981 was initially released to users in 2019 and is being updated in real time. The spatial and temporal resolutions of the dataset are 0.1° and 1 h, respectively. In the present study, the ERA5-Land surface air temperature gridded dataset is downloaded from the ECMWF Copernicus Climate Change Service Climate Data Store (C3S CDS) (https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-land?tab=form, accessed on 5 September 2021).

CLDAS Data
Version 2 of the China Meteorological Administration Land Data Assimilation System (CLDAS-V2.0) was developed by the National Meteorological Information Center of the China Meteorological Administration. It runs four physical parameterization schemes [23] (Noah, CLM3.5, the Common Land Model CoLM and Noah-MP) to simulate various land surface variables, such as soil temperature, soil moisture, evapotranspiration and surface heat flux. The CLDAS datasets mainly include the forcing dataset and the land surface dataset. The forcing dataset is produced by fusing observations collected at more than 60,000 automatic weather stations with numerical model predictions and satellite remote sensing data, using multi-grid variational analysis technology, a discrete ordinates radiation model, a hybrid radiation estimation model and a terrain correction algorithm [12,23]. CLDAS provides gridded 2 m air temperature, 2 m humidity, 10 m UV wind, surface pressure, surface incident solar radiation, precipitation and other elements. In this study, the hourly gridded temperature dataset of CLDAS-V2.0 is obtained from the National Meteorological Information Center of the China Meteorological Administration (http://data.cma.cn/, accessed on 5 September 2021). The spatial and temporal resolutions of the dataset are 0.05° and 1 h, respectively, and it covers the area of (0°N-60°N, 70°E-140°E).

NAWS Observation Data
NAWS observations are obtained from the CIMISS database of the Sichuan Meteorological Observation Data Center. In total, 2281 national weather stations in mainland China with data integrity above 98% are selected (Figure 1). The observation instruments at these weather stations are regularly calibrated, upgraded and maintained by professionals in the meteorological field. The data collected at these weather stations have passed the national, provincial and station quality controls, and all data are marked with QC flags [38]. Because data observed at alpine stations are not representative of the surrounding plain areas, these stations are excluded. Finally, data collected at the remaining 2265 stations are considered to be the most reliable observational data and are used as the benchmark for the evaluation of CLDAS, ERA5-Land and GLDAS.

Data Processing
In this study, hourly 2 m temperature data collected at 2265 NAWS for the period 2017-2019 are used to evaluate the accuracy of the CLDAS, ERA5-Land and GLDAS gridded datasets. Only data flagged as "correct" by the QC are selected to produce the "true value" dataset for the evaluation of the reanalysis datasets. To address possible impacts caused by the relocation of weather stations during the evaluation period, the GLDAS, ERA5-Land and CLDAS temperatures are spatially interpolated to the station locations using the bilinear interpolation method [39], based on the latitude and longitude of each station at each time, to obtain comparable sequences. A total of 19,642,844 samples have been obtained. By calculating the statistical metrics defined in Section 2.3, the accuracy of the gridded datasets can be evaluated.
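The interpolation step can be illustrated with a minimal sketch. It assumes a regular latitude-longitude grid stored as a nested list with a known origin and spacing; the function name, argument names and grid layout are hypothetical and are not taken from the processing chain used in this study. Missing values and grid edges beyond simple clamping are not handled.

```python
import math

def bilinear_interpolate(grid, lat0, lon0, dlat, dlon, lat, lon):
    """Interpolate a regular lat-lon grid to a station location.

    grid[i][j] holds the value at latitude lat0 + i*dlat and
    longitude lon0 + j*dlon (hypothetical layout for this sketch).
    """
    # Fractional grid indices of the station.
    fi = (lat - lat0) / dlat
    fj = (lon - lon0) / dlon
    # Lower-left corner of the enclosing cell, clamped to the grid.
    i = min(max(int(math.floor(fi)), 0), len(grid) - 2)
    j = min(max(int(math.floor(fj)), 0), len(grid[0]) - 2)
    # Weights within the cell, clamped to [0, 1].
    wi = min(max(fi - i, 0.0), 1.0)
    wj = min(max(fj - j, 0.0), 1.0)
    # Interpolate along longitude on both rows, then along latitude.
    south = (1 - wj) * grid[i][j] + wj * grid[i][j + 1]
    north = (1 - wj) * grid[i + 1][j] + wj * grid[i + 1][j + 1]
    return (1 - wi) * south + wi * north

# Toy 2 x 2 grid with 0.25 degree spacing, values in degrees Celsius.
grid = [[10.0, 12.0],
        [14.0, 16.0]]
center = bilinear_interpolate(grid, 30.0, 100.0, 0.25, 0.25, 30.125, 100.125)
corner = bilinear_interpolate(grid, 30.0, 100.0, 0.25, 0.25, 30.0, 100.0)
```

At the cell center the result is the mean of the four surrounding grid values, and at a grid node it reduces to the node value.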

Metrics Used for Evaluation
Correlation coefficient (COR), mean error (ME, hereafter also referred to as BIAS) and root mean square error (RMSE) are used to compare CLDAS, GLDAS and ERA5-Land data with NAWS observations. They are defined as follows:

$$\mathrm{COR} = \frac{\sum_{i=1}^{N} (G_i - \bar{G})(O_i - \bar{O})}{\sqrt{\sum_{i=1}^{N} (G_i - \bar{G})^2}\,\sqrt{\sum_{i=1}^{N} (O_i - \bar{O})^2}}$$

$$\mathrm{ME} = \frac{1}{N} \sum_{i=1}^{N} (G_i - O_i)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (G_i - O_i)^2}$$

where $O_i$ is the weather station observation, $G_i$ is the gridded temperature data interpolated to the station locations, $\bar{O}$ and $\bar{G}$ are the corresponding sample means, and $N$ is the total number of samples used in the evaluation (number of stations). COR varies within [−1.0, 1.0]; the closer the value is to 1, the better the consistency of the data, and the closer it is to −1, the stronger the inverse relationship. When COR is 0, there is no linear relationship between the product and the observations. BIAS reflects the degree of deviation of the gridded temperature data from the observations at the station. A negative value indicates that the temperature is underestimated in the reanalysis dataset, while a positive value indicates that it is overestimated. The closer the RMSE is to 0, the more accurate the gridded temperature dataset is. Throughout the evaluation period, all metrics are calculated from the accumulated hourly samples.
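As an illustration of the three metrics, the following sketch computes COR, ME and RMSE for paired samples using the standard definitions, with the error taken as gridded value minus observation so that a negative ME means underestimation. The function name and the toy values are hypothetical; COR is undefined when either series is constant.

```python
import math

def evaluate(obs, grd):
    """Return (COR, ME, RMSE) for paired observation / gridded samples."""
    n = len(obs)
    if n != len(grd) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_o = sum(obs) / n
    mean_g = sum(grd) / n
    # Errors are gridded minus observed: negative means underestimation.
    diffs = [g - o for o, g in zip(obs, grd)]
    me = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Pearson correlation coefficient.
    cov = sum((o - mean_o) * (g - mean_g) for o, g in zip(obs, grd))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_g = sum((g - mean_g) ** 2 for g in grd)
    cor = cov / math.sqrt(var_o * var_g)
    return cor, me, rmse

# Toy example: the gridded series is uniformly 1 degC too cold.
obs = [10.0, 12.0, 14.0, 16.0]
grd = [9.0, 11.0, 13.0, 15.0]
cor, me, rmse = evaluate(obs, grd)
```

In this toy case the correlation is perfect even though the product is biased, which is why COR is always reported together with ME and RMSE.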

Overall Accuracy during the Study Period
For the evaluation period of 2017-2019, the overall accuracy results are listed in Table 2. The average of the temperature observations at the weather stations in mainland China is 13.93 °C, and the averages of CLDAS, ERA5-Land and GLDAS are 13.88 °C, 13.22 °C and 13.55 °C, respectively, which are 0.06 °C, 0.71 °C and 0.38 °C lower than the station observations. Note that the average of CLDAS is very close to that of the observations. In terms of correlation, the highest value of 0.998 is found between CLDAS and the observations and the lowest value of 0.970 is found between GLDAS and the observations, while the value for ERA5-Land lies in between. The biases of the three reanalysis datasets are all negative in mainland China, indicating that temperatures in these gridded datasets are underestimated compared to station observations, and the underestimation is most severe in ERA5-Land with a value of 0.71 °C. The RMSEs for CLDAS, ERA5-Land and GLDAS are 0.83 °C, 2.72 °C and 2.91 °C, respectively. Overall, both the correlation and bias metrics indicate that the accuracy of CLDAS is clearly higher than that of the other two gridded datasets for the evaluation period. The accuracy of ERA5-Land is better than that of GLDAS, although the difference between them is small.

Temperatures at the four hours of 00, 06, 12 and 18 UTC on the 15th day of each month from January to December 2019 are used to represent the annual mean temperature in 2019. Scatterplots of temperature from the NAWS observations and the three gridded datasets, as well as their linear fits, are displayed in Figure 2. The goodness-of-fit values (R²) are 0.995, 0.95 and 0.945 for CLDAS, ERA5-Land and GLDAS, respectively, which again shows that CLDAS has the highest accuracy.
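The linear fitting behind such scatterplots can be sketched as an ordinary least-squares fit of the gridded values against the observations, with R² computed as one minus the ratio of residual to total sum of squares. This is a generic illustration under that assumption; the function name and sample values are hypothetical and the fitting procedure actually used for Figure 2 is not specified here.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares fit y = a*x + b and its R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx                 # slope
    b = mean_y - a * mean_x       # intercept
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot

# Toy observed (x) and gridded (y) temperatures in degrees Celsius.
obs = [0.0, 10.0, 20.0, 30.0]
grd = [-1.0, 9.5, 19.0, 29.5]
slope, intercept, r2 = linear_fit_r2(obs, grd)
```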

Evaluation at Individual Stations
Figure 3 displays the spatial distributions of the correlation coefficients between the three gridded datasets and observations in mainland China. For most stations, the COR values for CLDAS are higher than those for GLDAS and ERA5-Land. For all three datasets, the COR values decrease from east to west. As shown in Figure 3a for CLDAS, the COR values are greater than 0.99 at most stations, except for a few individual stations over the Tibetan Plateau, the Hengduan Mountains and other high-elevation areas. For ERA5-Land (Figure 3b), stations with a COR larger than 0.99 are concentrated in Northeast China, North China, and the middle and lower reaches of the Yangtze River. The COR values decrease from 0.98 to 0.95 over inland China and are largely below 0.96 in West China. As shown in Figure 3c for GLDAS, the spatial pattern of COR is similar to that for ERA5-Land, while the COR values are largely smaller than those for ERA5-Land over inland China and the Sichuan Basin. Figure 3d presents the kernel density estimation (KDE) of the density of stations corresponding to COR values for the three gridded datasets. Note that the greater the number of stations with COR close to 1.0, the better the correlation of the gridded dataset with station observations. Figure 3d clearly indicates that CLDAS is the best among the three datasets, while ERA5-Land is better than GLDAS.

Spatial distributions of RMSEs for the three datasets are presented in Figure 4, which indicates that the RMSEs for CLDAS are smaller than those for GLDAS and ERA5-Land at most stations. The value of RMSE increases from east to west for all three datasets. As shown in Figure 4a for CLDAS, the RMSE values are smaller than 0.5 °C at all of the stations, except for those in Xinjiang, Yunnan, the Tibetan Plateau and the high-elevation areas in western Sichuan. The spatial distributions of RMSEs for ERA5-Land (Figure 4b) and GLDAS (Figure 4c) tend to be similar, with the values concentrated over 1.0-3.0 °C for ERA5-Land and over 1.5-4.0 °C for GLDAS. The KDE of the density of stations with RMSE for the three datasets is presented in Figure 4d, which shows clearly that CLDAS is better than the other two datasets.

Figure 5 displays the spatial distributions of BIAS for the three datasets in mainland China. For CLDAS (Figure 5a), the biases at most stations vary between −1.0 and 1.0 °C, with positive values in the east and negative values in the west, where the terrain elevation is relatively high. The spatial distributions of bias for ERA5-Land (Figure 5b) and GLDAS (Figure 5c) are basically consistent. Large positive values occur in the North China Plain and the Taklimakan Desert in Xinjiang, while negative values mostly occur in Fujian and southwest China (except the Sichuan Basin). Figure 5d displays the KDE of the density of stations with BIAS for the three datasets. It shows that positive and negative biases each account for half of the total stations for CLDAS, while negative biases prevail for ERA5-Land and the opposite is true for GLDAS.
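The station-density curves in panels (d) can be approximated by a kernel density estimate over the per-station metric values. The sketch below assumes a Gaussian kernel with a fixed, user-chosen bandwidth; the kernel, the bandwidth and the sample values are illustrative assumptions, since the settings used for the figures are not stated.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples`."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centered on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Toy per-station RMSE values (degC); the bandwidth is an assumed setting.
station_rmse = [0.4, 0.5, 0.45, 0.6, 1.2]
density = gaussian_kde(station_rmse, bandwidth=0.1)
peak = density(0.5)
tail = density(1.2)
```

A dataset whose density peaks closer to 0 for RMSE (or closer to 1 for COR) performs better at more stations.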

At Different Times of the Day
Figure 6 displays the diurnal features of the four statistical metrics for the evaluation of the three datasets over the period 2017-2019. Multi-year averages of temperature at different times of the day are shown in Figure 6a, which indicates that the three gridded datasets exhibit diurnal temperature variations consistent with the observations. COR values at different times of the day are displayed in Figure 6b, which suggests that the COR for CLDAS changes little, with a value around 0.997. The COR value for ERA5-Land first decreases from 0.978 at 00 UTC to 0.97 at 06 UTC, then gradually increases and reaches its peak at 12 UTC, and then slowly decreases again. The COR for GLDAS first increases from 0.974 at 00 UTC to 0.979 at 03 UTC, and then gradually decreases. Figure 6c presents RMSEs at different times of the day. The largest RMSEs occur at 15, 06 and 12 UTC for CLDAS, ERA5-Land and GLDAS, respectively, with values of 0.94 °C, 2.86 °C and 3.27 °C. Biases at different times of the day for the three datasets are displayed in Figure 6d, which shows positive biases for CLDAS at 00, 03 and 06 UTC, with the largest positive bias of 0.22 °C at 03 UTC and the largest negative bias of −0.29 °C at 12 UTC. GLDAS has positive biases at 00 and 03 UTC, with a bias of up to 0.66 °C at 00 UTC, and negative biases at all other times, with the largest negative bias of −1.43 °C at 12 UTC. ERA5-Land exhibits negative biases at all times of the day, with the largest negative bias of −0.98 °C at 09 UTC. Overall, negative biases prevail at different times of the day for all three datasets.
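A diurnal breakdown of this kind amounts to grouping the paired samples by UTC hour before computing each metric. The sketch below does this for the mean bias only; the tuple layout and the sample values are hypothetical.

```python
from collections import defaultdict

def bias_by_hour(samples):
    """Mean bias per UTC hour.

    samples: iterable of (utc_hour, observed, gridded) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for hour, obs, grd in samples:
        sums[hour] += grd - obs   # gridded minus observed
        counts[hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}

# Toy samples: warm bias at 00 UTC, cold bias at 12 UTC.
samples = [(0, 10.0, 10.5), (0, 12.0, 12.5),
           (12, 20.0, 19.0), (12, 22.0, 20.0)]
hourly_bias = bias_by_hour(samples)
```

The same grouping key can be replaced by the date, month or season to obtain daily, monthly or seasonal statistics.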

Daily Evaluation
Figure 7 presents daily variations of the evaluation metrics during 2017-2019. Daily mean temperatures for the three datasets and observations are shown in Figure 7a, which indicates that the daily temperature variation during the study period for the three datasets is consistent with station observations. Figure 7b presents daily CORs during the study period. The daily COR between CLDAS and station observations changes little, whereas the CORs of ERA5-Land and GLDAS with station observations exhibit large daily variations, with values ranging between 0.891-0.977 and 0.848-0.979, respectively. Daily RMSEs of the three datasets are displayed in Figure 7c, which indicates that the RMSEs of CLDAS are clearly smaller than those of the other two datasets. The RMSEs of ERA5-Land and GLDAS are close, yet the RMSEs of ERA5-Land are smaller than those of GLDAS on most days. Daily RMSEs of CLDAS, ERA5-Land and GLDAS vary between 0.61-2.35 °C, 1.97-3.80 °C and 2.43-3.76 °C, respectively. Figure 7d presents daily biases of the three datasets. The biases of CLDAS are mostly negative but very close to 0.0. The biases of ERA5-Land are also negative almost all of the time, except for a few days. GLDAS is dominated by negative biases in spring and summer, while positive biases mainly occur in autumn and winter, consistent with the monthly and seasonal results.

Monthly Changes
Figure 8 shows monthly changes of temperature and evaluation metrics for the three datasets over 2017-2019. The curves of monthly average temperature of the three gridded datasets are basically consistent with the observed values (Figure 8a). Monthly CORs are presented in Figure 8b, which shows that the three gridded datasets exhibit similar monthly variation patterns, i.e., the COR gradually decreases from January to June and gradually increases from July to December. A possible reason is that, because China is located in the Northern Hemisphere, the average temperature gradually increases from January to June and the range of hourly temperature variation increases accordingly. However, the reanalysis temperature products are limited by their spatial resolution and respond to this temperature variation with a certain regional smoothing, which results in a decreasing COR. Conversely, the COR gradually increases from July to December. The CORs of GLDAS are significantly lower than those of ERA5-Land during April-October, but no obvious differences can be found in other months. Figure 8c shows monthly RMSEs. The RMSEs of ERA5-Land are overall smaller than those of GLDAS but are slightly higher in November and February. Monthly biases of CLDAS and ERA5-Land are all negative, while negative biases prevail in GLDAS (Figure 8d), with positive biases only occurring from November 2017 to February 2018 and from October 2018 to February 2019. Among the three gridded datasets, the monthly biases of GLDAS vary the most and those of CLDAS vary the least.

Seasonal Changes
Figure 9 shows seasonal characteristics of the evaluation metrics over the study period. Seasonal average temperatures of the three gridded datasets and station observations are presented in Figure 9a. Spring mean temperatures of the station observations and the CLDAS, ERA5-Land and GLDAS datasets are 15.0 °C, 14.96 °C, 14.06 °C and 14.57 °C, respectively. Summer mean temperatures of these four datasets are 24.61 °C, 24.56 °C, 23.87 °C and 23.88 °C, respectively. Autumn mean temperatures are 14.23 °C, 14.16 °C, 13.63 °C and 13.95 °C, respectively. Winter mean temperatures are 1.64 °C, 1.58 °C, 1.08 °C and 1.55 °C, respectively. In all four seasons, the seasonal mean temperature of CLDAS is the closest to the observations, followed by that of GLDAS, and that of ERA5-Land deviates the most. Figure 9b shows the seasonal correlation between the three gridded datasets and station observations. The correlation is the lowest in summer and the highest in autumn, and it is higher in winter than in spring. This is a common feature of all three gridded datasets. RMSEs are displayed in Figure 9c, which shows that the RMSEs of the three datasets are the largest in winter, with a value of 0.91 °C for CLDAS and 3.1 °C for both ERA5-Land and GLDAS. Figure 9d presents seasonal biases. Compared to station observations, temperature in all four seasons is underestimated in the three gridded datasets. The largest negative bias of CLDAS is −0.06 °C, which appears in autumn. The largest negative bias of ERA5-Land appears in winter, with a value of −0.93 °C. GLDAS has its largest negative bias of −0.73 °C in summer.

Evaluation over Subregions Divided according to Climate Regimes
With reference to previous studies [40,41], China is divided into eight subregions for evaluation based on topographic and climatic characteristics. Results of the evaluation over climate regimes are listed in Table 3, which shows that the RMSEs of ERA5-Land and GLDAS are the largest in subregion II. This subregion is located in the Tibetan Plateau, where negative biases prevail. This is also the region with the largest negative bias. In contrast, the RMSE of CLDAS is the smallest in subregion II, where positive biases appear. The RMSEs of ERA5-Land and GLDAS are smaller in subregions V, VI and VII than in other subregions. Subregions V, VI and VII are located in eastern China, where the terrain is relatively flat. The RMSEs of the three datasets are larger in subregion IV than in other subregions, which is attributed to the fact that subregion IV is located in the transitional zone from the Tibetan Plateau to the Sichuan Basin, where the terrain is extremely complex.

The province is the highest-level administrative division in China and possesses certain geographical and human attributes. Most operational meteorological services and scientific research projects are organized by territory. Therefore, it is necessary to evaluate the gridded datasets from the perspective of the provinces. For this reason, all of the national automatic weather stations in China are grouped by province, and the evaluation metrics in each province are calculated individually. Results are listed in Table 4, which shows that, except for Tibet and Guizhou for GLDAS and Tibet for ERA5-Land, the CORs of these two gridded datasets with station observations are above 0.90, while the CORs of CLDAS are above 0.99 in all of the provinces. The RMSEs of CLDAS are below 1.0 °C in all of the provinces, except for Shanxi and Xinjiang, where the values are 1.012 °C and 1.088 °C, respectively. The RMSEs of ERA5-Land and GLDAS are below 3.0 °C in all of the provinces except for Gansu, Xinjiang, Qinghai, Guizhou, Sichuan and Tibet, and the largest RMSEs of the two datasets both occur in Tibet, with values of 7.86 °C and 6.292 °C, respectively. The provinces with negative biases in CLDAS, ERA5-Land and GLDAS account for 61%, 81% and 55%, respectively, of the total number of provinces in mainland China. The evaluation results show that the quality of CLDAS, ERA5-Land and GLDAS is significantly better in the eastern provinces than in the western provinces of China. Compared with ERA5-Land and GLDAS, CLDAS is closer to observations in each individual province. ERA5-Land is better than GLDAS in all of the provinces except for Sichuan, Qinghai and Tibet, where the biases of ERA5-Land are slightly larger than those of GLDAS.

Discussion
The present study reveals some important findings that differ from previous studies [30,41,42]. For example, the biases of CLDAS, ERA5-Land and GLDAS at night are larger than those in the daytime, and all three datasets have negative biases at night. Monthly biases of the three gridded datasets show certain regularities. From January to June, their correlations with station observations gradually decrease, and the biases increase. From July to December, the correlations gradually increase, and the biases decrease. Seasonal correlations of the three datasets with observations are the lowest in summer and the highest in autumn, while the correlation in winter is higher than that in spring. A similar assessment also found a monthly variation in the GLDAS evaluation results, but with the lowest deviation in August [43], which may be due to the different evaluation periods.
In addition, temperature variation is significantly related to geographical location and terrain factors, such as altitude and slope. The Integrated Nowcasting through Comprehensive Analysis (INCA) system [44] has been used for the fine-grid simulation and application of temperature over complex terrain, and has been compared with three other interpolation methods (including inverse distance weighting and ordinary Kriging) [45]. The altitude of the station has a great impact on the results of the four gridding methods, and the error increases gradually with the elevation of the verification station. However, that work focused mainly on Zhejiang Province in eastern China [46]. There are few reports that provide a detailed evaluation with stations classified according to terrain across the whole of China.
The topography in China is high in the west and low in the east, showing a staircase-like distribution with multiple terrain patterns and large mountainous areas. The 2265 observation stations used in this study are located at different elevations. The highest station is the Amdo Station in Tibet, with an elevation of 4800 m. The lowest station is the Turpan Station in Xinjiang, western China, with an elevation of −48.7 m. The evaluation at individual stations and over various regions in the present study indicates that the accuracy of the three datasets is, to a certain degree, related to topography. This is because surface air temperature in gridded datasets is simulated at each fixed grid cell, where the elevation is the grid-average value. However, the elevation of a weather station may not represent the average elevation of its nearby area well, which may lead to biases in the gridded dataset. In the following, the observational stations are further classified by slope and elevation, and the influences of these two main terrain features on the accuracy of the gridded datasets are discussed.

Impact of Terrain Elevation on the Accuracy of Gridded Dataset
According to their elevations, the stations are divided into eight categories, i.e., elevation < 500 m, ≥500-1000 m, ≥1000-1500 m, ≥1500-2000 m, ≥2000-2500 m, ≥2500-3000 m, ≥3000-3500 m and ≥3500 m. Figures 11 and 12 show the bias characteristics of the three gridded datasets at different elevations. The correlations of ERA5-Land and GLDAS with station observations both decrease with increasing elevation, while their average biases gradually increase, with more severe underestimation and a wider spread of biases at individual stations. Compared to ERA5-Land and GLDAS, the CLDAS dataset is less affected by elevation. Several previous studies have also found that elevation differences between stations and model grids are a major reason for the biases in reanalysis datasets [47-49]. Specifically, weather stations over the Hengduan Mountains in western Sichuan are concentrated in river valleys, where the elevation differs greatly from that of the surrounding areas. Large cold biases are found in the gridded datasets in this area because the station elevations there are lower than the heights of the corresponding model grids. Stations located on mountain tops are probably higher than the model grids at the same place. As a result, warm biases are found at these stations in the gridded datasets. The above discussion indicates that an elevation correction of temperature in the gridded datasets can effectively reduce the biases and improve the applicability of the datasets [31,50,51]. In addition, possible input data errors, model system errors and interpolation errors (from Gaussian grid to latitude-longitude grid) of the fusion system are also sources of bias.
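The grouping of stations into the eight elevation classes listed above can be sketched as follows. This is a minimal illustration, not the evaluation code used in this study: the record layout, the function names and the sample values are hypothetical, and the bias is assumed to be defined as gridded value minus observation.

```python
from bisect import bisect_right
from collections import defaultdict
from math import sqrt

# Boundaries (m) between the eight elevation classes listed in the text.
ELEVATION_EDGES = [500, 1000, 1500, 2000, 2500, 3000, 3500]
ELEVATION_LABELS = ["<500", "500-1000", "1000-1500", "1500-2000",
                    "2000-2500", "2500-3000", "3000-3500", ">=3500"]


def elevation_class(elevation_m):
    """Return the class label; each lower boundary is inclusive."""
    return ELEVATION_LABELS[bisect_right(ELEVATION_EDGES, elevation_m)]


def bias_and_rmse(pairs):
    """Mean bias and RMSE for (gridded, observed) temperature pairs."""
    errors = [gridded - observed for gridded, observed in pairs]
    n = len(errors)
    bias = sum(errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)
    return bias, rmse


def metrics_by_elevation(records):
    """records: iterable of (station_elevation_m, gridded_c, observed_c)."""
    groups = defaultdict(list)
    for elevation_m, gridded, observed in records:
        groups[elevation_class(elevation_m)].append((gridded, observed))
    return {label: bias_and_rmse(pairs) for label, pairs in groups.items()}


# Hypothetical sample records, for illustration only.
sample = [(100, 10.5, 10.0), (200, 9.5, 10.0),
          (3600, 1.0, 4.0), (4800, -2.0, 2.0)]
result = metrics_by_elevation(sample)
```

Using `bisect_right` makes each lower boundary inclusive, so a station at exactly 500 m falls into the 500-1000 m class, which matches the "≥" notation in the class definitions.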

Impact of Slope on the Accuracy of Gridded Dataset
According to the classification of slopes proposed by the International Geographical Union and the Geomorphological Mapping Committee for the application of detailed geomorphological maps [52], the slope grades are divided into: plain (0°-0.5°), slight slope (>0.5°-2°), gentle slope (>2°-5°), slope (>5°-15°), steep slope (>15°-35°), cliff slope (>35°-55°) and vertical slope (>55°-90°). Figures 13 and 14 display the RMSE and BIAS characteristics of the three datasets over different types of slope. The RMSEs and BIASs of ERA5-Land and GLDAS both increase with increasing slope, while the correlations of the two datasets with observations decrease. The underestimation of temperature in the two gridded datasets also gradually intensifies, with a wider spread of biases at individual stations. Compared with these two datasets, CLDAS is less affected by terrain slope.
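The slope grades above can be expressed as a small lookup, sketched here under the assumption that each upper bound is inclusive, as the ">" notation on the lower bounds suggests. The function name and the shortened class labels are illustrative.

```python
from bisect import bisect_left

# Upper bounds (degrees) of the seven slope grades; each is inclusive.
SLOPE_UPPER_BOUNDS = [0.5, 2.0, 5.0, 15.0, 35.0, 55.0, 90.0]
SLOPE_LABELS = ["plain", "slight slope", "gentle slope", "slope",
                "steep slope", "cliff slope", "vertical slope"]


def slope_class(slope_deg):
    """Return the slope grade for a slope angle in degrees (0 to 90)."""
    if not 0.0 <= slope_deg <= 90.0:
        raise ValueError("slope must be between 0 and 90 degrees")
    return SLOPE_LABELS[bisect_left(SLOPE_UPPER_BOUNDS, slope_deg)]


# Hypothetical station slopes, for illustration only.
grades = [slope_class(s) for s in (0.3, 0.5, 1.2, 5.0, 20.0, 60.0)]
```

Stations grouped this way can then be evaluated class by class with the same RMSE, bias and correlation metrics used elsewhere in the evaluation.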

Conclusions
In the present study, three gridded temperature datasets (CLDAS, ERA5-Land and GLDAS) that have been widely used in mainland China are evaluated for the three years 2017-2019 on multiple time scales, from hours of the day to daily, monthly and seasonal scales. Spatially, the evaluation is conducted at single stations and over various climate regimes and administrative regions. The results indicate that the three gridded datasets can represent the near-surface air temperature in mainland China and realistically reflect the overall characteristics of temperature over major land areas of China. Compared to station observations, temperatures in all three datasets are underestimated to varying degrees. The underestimation is most severe in ERA5-Land, followed by that in GLDAS. Overall, CLDAS exhibits the highest accuracy in mainland China, ERA5-Land the second highest, and GLDAS the lowest. However, note that the accuracies of ERA5-Land and GLDAS are only slightly different, and the two datasets demonstrate their own advantages and disadvantages in different regions.
In summary, differences in evaluation results can be attributed to various factors, including different resolutions of the gridded datasets, different remapping methods used to match gridded data with station observations, and different evaluation metrics. The present study compares the evaluation results of three gridded temperature datasets from different perspectives and finds that CLDAS has the highest accuracy in mainland China, followed by ERA5-Land, with GLDAS having the lowest. However, the CLDAS dataset mainly covers China and the surrounding areas, whereas ERA5-Land and GLDAS datasets are

Figure 1. Distribution of National Automatic Weather Stations (NAWS) in China.

Figure 10. Subregions of China according to climate regimes.

Table 1. Characteristics of datasets.

Table 3. Evaluation results over subregions of different climate regimes.

Table 4. Evaluation results over provinces in mainland China.