Evaluation of HRCLDAS and ERA5 Datasets for Near-Surface Wind over Hainan Island and South China Sea

Near-surface wind data are particularly important for Hainan Island and the South China Sea, and they are available from a wide range of sources. A detailed understanding of the reliability of these datasets supports related research. In this study, the hourly near-surface wind data from the High-Resolution China Meteorological Administration (CMA) Land Data Assimilation System (HRCLDAS) and the fifth-generation ECMWF atmospheric reanalysis (ERA5) were evaluated against ground-based automatic meteorological observations for Hainan Island and the South China Sea. The results are as follows: (1) the temporal trends of the HRCLDAS and ERA5 near-surface wind data were basically the same as those of the observations, but HRCLDAS had a smaller bias, smaller root-mean-square errors, and higher correlation coefficients with respect to the observations; (2) the quality of the HRCLDAS and ERA5 near-surface wind data was better over the islands of the South China Sea than over the land area of Hainan Island. However, over the coastal areas of Hainan Island and the island stations near Sansha, the quality of the HRCLDAS near-surface wind data was better than that of ERA5; (3) the quality of the HRCLDAS near-surface wind data was better than that of ERA5 over all types of landforms. The wind speed deviation of both ERA5 and HRCLDAS was largest along the coast, and the quality of the ERA5 wind direction data was poorest over the mountains, whereas that of HRCLDAS was poorest over hilly areas; (4) the accuracy of HRCLDAS at all wind levels was higher than that of ERA5. ERA5 significantly overestimated low-grade winds and underestimated high-grade winds. The accuracy of HRCLDAS wind ratings over the islands of the South China Sea was significantly higher than that over the land area of Hainan Island, especially for the higher wind ratings; and (5) during the typhoon process, the wind simulated by HRCLDAS was closer to the observations, and its simulation of higher wind speeds was more accurate than that of ERA5.


Introduction
Near-surface wind is one of the most important meteorological parameters. It is a major factor in various industries of economic importance, such as agriculture, fishery, transportation, construction, and water conservancy engineering. Hainan Province is located at the southernmost tip of China, and it includes Hainan Island and more than two million square kilometers of the South China Sea. Owing to these geographical features, wind disasters occur frequently in Hainan Province, so near-surface wind data are particularly important for Hainan. Although site observations are accurate, they alone cannot meet data needs in areas where stations are sparse, such as areas of complex topography and large sea areas [1-4]. With the improvement of numerical models and model interpretation techniques and the diversified requirements of meteorological service [...]

Surface Meteorological Observation Data
The surface meteorological observation data include the quality-controlled two-minute average wind speed and two-minute average wind direction data for Hainan Province from 3 April to 31 October 2020 (Table 1). The Hainan Meteorological Information Center provided the data, and they were obtained from the National Comprehensive Meteorological Information Sharing Platform through the MUSIC interface. The stations include two national climate observatories, three national reference climate stations, four national basic meteorological stations, 12 national meteorological observatories, 383 conventional meteorological observatories, and six offshore buoy stations. Figure 1 shows the station distribution.
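The observations are reported as speed and direction, whereas the evaluation also covers the U and V components, so a conversion between the two forms is needed at some point. The text does not state how this was done; the following is a minimal sketch that assumes the usual meteorological convention (direction is where the wind blows from, in degrees clockwise from north). The function names `wind_components` and `speed_direction` are hypothetical.

```python
import numpy as np

def wind_components(speed, direction_deg):
    """Convert speed and meteorological direction to (u, v).

    Assumption: direction is the bearing the wind blows FROM,
    measured clockwise from north, in degrees.
    """
    speed = np.asarray(speed, dtype=float)
    theta = np.deg2rad(np.asarray(direction_deg, dtype=float))
    u = -speed * np.sin(theta)  # eastward (zonal) component
    v = -speed * np.cos(theta)  # northward (meridional) component
    return u, v

def speed_direction(u, v):
    """Inverse conversion: (u, v) to speed and direction in [0, 360)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    speed = np.hypot(u, v)
    direction = np.rad2deg(np.arctan2(-u, -v)) % 360.0
    return speed, direction
```

With this convention a westerly wind (direction 270°) has a positive U component, and a northerly wind (direction 0°) has a negative V component. The inverse function would be the one applied to a gridded product that supplies U and V directly.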

ERA5 Near Surface Wind Data Product
The ERA5 reanalysis is the latest generation of reanalysis data developed by the ECMWF. ERA5 is substantially upgraded and improved when compared with its predecessor ERA-Interim [20,26,27], and it is currently open access. The latitude and longitude grid resolution of the ERA5 reanalysis data is 0.25° × 0.25°, and the time resolution is 1 h. The data were downloaded from the C3S Climate Data Store (CDS) through the ECMWF Web API using Python scripts (Table 2).

HRCLDAS Near Surface Wind Data Product
HRCLDAS is a high-resolution land surface data assimilation system that was developed by the National Meteorological Information Center of the CMA. The system uses multiple grid variational technology [28] and a terrain correction algorithm, combined with numerical prediction data, satellite data, and site observation data, to generate atmospheric data-driven products [24]. The latitude and longitude grid resolution of the HRCLDAS data (Table 2) is 0.01° × 0.01°, and the time resolution is 1 h. The data were obtained from the CMA Data Service Centre.
Atmosphere 2021, 12, 766
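Both gridded products sit on regular latitude-longitude grids (0.25° for ERA5, 0.01° for HRCLDAS), so a station can be matched to its closest grid point by an index lookup along each axis. The following is a minimal sketch of such a nearest-neighbor lookup. The function name, the argument layout, and the assumption that the field is stored as a latitude-by-longitude array are illustrative, not the authors' code, and closeness is measured in degrees rather than in great-circle distance.

```python
import numpy as np

def nearest_grid_value(field, grid_lats, grid_lons, st_lat, st_lon):
    """Sample a 2-D field (lat x lon) at the grid point nearest each station.

    grid_lats, grid_lons: 1-D coordinate vectors of a regular grid (degrees).
    st_lat, st_lon: station coordinates (scalars or 1-D arrays, degrees).
    """
    field = np.asarray(field)
    grid_lats = np.asarray(grid_lats, dtype=float)
    grid_lons = np.asarray(grid_lons, dtype=float)
    st_lat = np.atleast_1d(np.asarray(st_lat, dtype=float))
    st_lon = np.atleast_1d(np.asarray(st_lon, dtype=float))
    # Index of the closest grid latitude and longitude for every station.
    i = np.abs(grid_lats[:, None] - st_lat[None, :]).argmin(axis=0)
    j = np.abs(grid_lons[:, None] - st_lon[None, :]).argmin(axis=0)
    return field[i, j]
```

For hourly data the same indices can be reused for every time step, since the station list and the grid do not change.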

Methodology
In this study, the surface meteorological observation data are regarded as the "true value", and the HRCLDAS and ERA5 data are interpolated to each station using the nearest-neighbor interpolation method. The bias, root-mean-square error (RMSE), and correlation coefficient (COR) between the interpolation results and observations are calculated to judge the quality of the products [19,22,23]. These evaluation indices are computed for wind speed, the U wind component, and the V wind component. The wind direction is a circular quantity that ranges from 0° to 360°, so the evaluation indices for the wind direction are computed with the error constrained such that its absolute value does not exceed 180°. In the evaluation index formulas, O_i is the site observation value, G_i is the value obtained by interpolating the HRCLDAS or ERA5 data to the inspection site, and N is the total number of samples that participate in the inspection.
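The indices named above can be sketched as follows. This is a minimal sketch that assumes the conventional definitions (bias as the mean of G_i − O_i, RMSE as the square root of the mean squared difference, COR as the Pearson coefficient) and, for direction, wraps each difference into [−180°, 180°) so that its magnitude never exceeds 180°. The function names are hypothetical. How wind directions are correlated is not stated in the text, so no direction COR is shown.

```python
import numpy as np

def bias(g, o):
    """Mean difference between product values g and observations o."""
    return float(np.mean(np.asarray(g, float) - np.asarray(o, float)))

def rmse(g, o):
    """Root-mean-square error of g against o."""
    diff = np.asarray(g, float) - np.asarray(o, float)
    return float(np.sqrt(np.mean(diff ** 2)))

def cor(g, o):
    """Pearson correlation coefficient between g and o."""
    return float(np.corrcoef(np.asarray(g, float), np.asarray(o, float))[0, 1])

def direction_error(g_deg, o_deg):
    """Signed direction difference wrapped into [-180, 180) degrees (assumed form)."""
    diff = np.asarray(g_deg, float) - np.asarray(o_deg, float)
    return (diff + 180.0) % 360.0 - 180.0

def direction_bias(g_deg, o_deg):
    """Mean of the wrapped direction differences."""
    return float(np.mean(direction_error(g_deg, o_deg)))

def direction_rmse(g_deg, o_deg):
    """RMSE computed from the wrapped direction differences."""
    return float(np.sqrt(np.mean(direction_error(g_deg, o_deg) ** 2)))
```

The wrapping matters near north: a product direction of 350° against an observed 10° counts as an error of −20°, not 340°.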
For the wind speed, the wind grade of the HRCLDAS and ERA5 data and of the observed data is determined from the wind speed of each hour. Table 3 shows the classification of wind power. The accuracy (AC), the strong rate (FS), and the weak rate (FW) of the wind grade are evaluated by comparing the wind grade of each gridded dataset with that of the observed data. In the formulas for these rates, k represents the wind classification inspection level, NR_k is the number of correct stations for the wind level inspection at level k, NS_k is the number of the strong wind level inspection stations at level k, NW_k is the number of the weak wind level inspection stations at level k, and NF_k is the total number of wind power inspection stations at level k.
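The grade-based rates can be illustrated with a short sketch. It assumes that each rate is the corresponding count divided by NF_k and expressed as a percentage, and that the wind classification follows the standard Beaufort speed thresholds; Table 3 is not reproduced here, so the thresholds in `GRADE_UPPER` are an assumption, as are the function names.

```python
import numpy as np

# Assumed upper speed bound (m/s) of Beaufort grades 0..11; grade 12 is open-ended.
GRADE_UPPER = np.array([0.2, 1.5, 3.3, 5.4, 7.9, 10.7,
                        13.8, 17.1, 20.7, 24.4, 28.4, 32.6])

def wind_grade(speed):
    """Map wind speed (m/s) to an integer wind grade 0..12."""
    return np.searchsorted(GRADE_UPPER, np.asarray(speed, dtype=float), side="left")

def grade_rates(product_speed, observed_speed):
    """Return {grade k: {'AC', 'FS', 'FW'}} in percent, keyed by observed grade."""
    gp = wind_grade(product_speed)
    go = wind_grade(observed_speed)
    rates = {}
    for k in np.unique(go):
        sel = go == k
        nf = int(sel.sum())              # NF_k: samples observed at grade k
        nr = int((gp[sel] == k).sum())   # NR_k: product grade matches
        ns = int((gp[sel] > k).sum())    # NS_k: product grade too strong
        nw = int((gp[sel] < k).sum())    # NW_k: product grade too weak
        rates[int(k)] = {"AC": 100.0 * nr / nf,
                         "FS": 100.0 * ns / nf,
                         "FW": 100.0 * nw / nf}
    return rates
```

Under these definitions the three rates for any grade sum to 100%, which is a convenient sanity check on a computed table.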

Analysis of Time Series Variation
The time series of the daily average wind speed (Figure 2a), wind direction (Figure 2e), U component (Figure 2i), and V component (Figure 2m) for Hainan Island and the South China Sea from April to October 2020 were analyzed. The trend of the ERA5 and HRCLDAS wind products over time is basically the same as the observation data trend. For the HRCLDAS data, the time series of the four wind variables closely follow the observations. However, the ERA5 data show much larger departures from the observations: the wind speed is significantly overestimated, whereas the wind direction is close to the observations; the U component is underestimated most of the time, and the V component is significantly overestimated.
Figure 2b,f,j,n shows the time series of the daily average RMSE for wind speed, wind direction, U component, and V component, respectively. The RMSE of all four wind variables is lower for HRCLDAS than for ERA5. From April to September, the RMSE of the HRCLDAS wind speed was largely within the range from 0.2 to 1 m s⁻¹, and the RMSE of the ERA5 wind speed was mostly within the range from 1 to 4 m s⁻¹. The wind speed in October was significantly higher than that from April to September, and the RMSE of HRCLDAS also increased in October, with values between approximately 1 and 2 m s⁻¹. The RMSE of ERA5 increased more than that of HRCLDAS in October, with values between approximately 3 and 6 m s⁻¹. From April to September, the RMSE of the HRCLDAS wind direction was mainly concentrated between 30° and 60°, and the RMSE of the ERA5 wind direction was mainly concentrated in the range from 60° to 90°. In October, the wind direction RMSE of both datasets decreased significantly, and the decrease for ERA5 was greater than that for HRCLDAS. The RMSE of the HRCLDAS U and V components was similar to that of wind speed, and most values were within 1 m s⁻¹. For ERA5, the RMSE of the V component was clearly larger than that of the U component, and its variation was larger.
The daily average COR of the HRCLDAS wind speed from April to October (Figure 2c) was mostly between 0.9 and 1.0, while that of ERA5 was mostly below 0.6. The COR of the wind direction of the two datasets (Figure 2g) was lower than that of the wind speed: the COR of HRCLDAS was between 0.4 and 0.8, and that of ERA5 was less than 0.4, with occasional negative correlations. The COR of the U component (Figure 2k) and V component (Figure 2o) for the HRCLDAS data were similar, whereas the COR of the V component for the ERA5 data was slightly better than that of the U component.
The bias of HRCLDAS from April to October was smaller than that of ERA5. For wind speed, the HRCLDAS data showed a negative bias from April to September; in October, the bias increased significantly and became positive. For the wind direction, the bias of HRCLDAS and ERA5 was mostly positive; the bias of HRCLDAS was less than 6°, and the bias of ERA5 varied more dramatically over time, mostly between 0° and 30°. For the U and V components, the bias between HRCLDAS and the observations was approximately 0 m s⁻¹ throughout, without distinct variability. The U component of the ERA5 data showed a mostly negative bias, whereas the V component showed a mostly positive bias, and the bias of the V component was larger than that of the U component.
In general, for daily average wind speed, wind direction, U component, and V component, HRCLDAS was closer to the observations than ERA5, with a lower RMSE, smaller bias, and higher COR.

Comparative Analysis of Each Station
[...] (Figure 3g,h) were generally similar to that of the wind speed. The stations with a larger RMSE for the V component than the U component for ERA5 were mainly distributed in the coastal areas of Hainan Island. The RMSE for the U and V components of HRCLDAS were similar, and the stations with a lower RMSE for the V component than the U component were mainly distributed in the central and northern parts of Hainan Island.
[...] with lower COR were mainly in the complex terrain areas, such as central Wuzhishan City. The stations with a higher COR for the U component of ERA5 (Figure 4e) were mainly distributed in the South China Sea and along the coast of Hainan Island. The COR for the V component of ERA5 (Figure 4g) was significantly higher than that of the U component in the eastern part of Hainan Island. The station distribution of the COR for the U and V components of HRCLDAS (Figure 4f,h) was similar to that of wind speed; most values were above 0.95, and the stations with a low COR appeared in Wuzhishan City.

Comparative Analysis of Land and Sea
The performance of the two wind data products from April to October 2020 was evaluated separately for the land stations of Hainan Island and for the island stations in order to analyze the performance of ERA5 and HRCLDAS over land and sea. There are 70 island stations in Hainan Province, and Figure 5 shows the evaluation results.
For wind speed, the bias of ERA5 on land and the islands was 1.91 m s⁻¹ and 2.14 m s⁻¹, respectively, and the bias of HRCLDAS on land and the islands was 0.01 m s⁻¹ and 0.04 m s⁻¹, respectively. All of these values are positive, and the bias for land stations was smaller than that for island stations. The RMSE of the two wind speed datasets followed the same pattern as the bias for the two underlying surfaces, being smaller on land than on the islands. The correlation coefficients of the two wind speed datasets were similar for land and the islands: for ERA5 they were 0.53 and 0.52, respectively, and for HRCLDAS they were 0.94 and 0.95, respectively.
For wind direction, ERA5 and HRCLDAS showed a positive bias on land and the islands. The bias of ERA5 on land (8.78°) was higher than that on the islands (7.22°). The bias of HRCLDAS on land (1.24°) was slightly smaller than that on the islands (1.62°). The RMSEs of the two datasets were smaller on the islands than on land. The RMSEs of ERA5 on land and the islands were 77.2° and 56.3°, and those of HRCLDAS were 24.9° and 9.0°. The CORs of ERA5 on land and the islands were 0.27 and 0.51, respectively, and those of HRCLDAS were 0.60 and 0.85, respectively.
For the U component, ERA5 had a negative bias on land and the islands, with values of −0.77 m s⁻¹ and −0.50 m s⁻¹, respectively, so the bias was larger in magnitude on land than on the islands. HRCLDAS had a negative bias on land with a value of −0.02 m s⁻¹, and it had a positive bias on the islands with a value of 0.03 m s⁻¹. The RMSE and COR of the two datasets on land were both smaller than on the islands.
For the V component, ERA5 had a positive bias on land and the islands, with values of 0.81 m s⁻¹ and 1.54 m s⁻¹, respectively, and the bias on land was smaller than that on the islands. The bias of HRCLDAS was similar to that of the U component. The RMSEs of the two datasets were both lower on land than on the island stations. The COR of ERA5 for the land stations was lower than that for the islands, with values of 0.64 and 0.71, respectively. The COR of HRCLDAS on land was slightly higher than that on the islands, with values of 0.964 and 0.957, respectively.
In general, HRCLDAS wind products had a smaller bias, smaller RMSE, and larger COR for both land and the islands when compared with ERA5. The quality of the HRCLDAS and ERA5 wind products for the islands was slightly better than that for land.

Comparative Analysis of Different Landforms
The landform has a strong influence on the wind, and different physical processes affect the wind conditions of stations on different landforms. Owing to the difference between land and sea, the stations that are close to the sea are affected by the sea and land breeze; winds in flat inland areas are influenced by land surface processes and land-atmosphere interactions; the wind in mountainous and hilly areas may be affected by localized circulation that is driven by complex topography [19,29]. Accordingly, the stations on Hainan Island are classified according to their geomorphological characteristics, distance from the sea, and altitude (Figure 6): (1) coast stations, if located no further than 10 km from the sea; (2) inland stations, if located further than 10 km from the sea and situated on a plain or tableland; (3) hill stations, if located further than 10 km from the sea and with an altitude between 100 m and 500 m; and (4) mountain stations, if located further than 10 km from the sea and with an altitude that is greater than 500 m. Some previous studies have used similar classifications for wind [19,30].
There are 159 coast stations, 135 inland stations, 36 hill stations, and 44 mountain stations on Hainan Island. The quality of the ERA5 and HRCLDAS near-surface wind data for the different landforms was evaluated, and Figure 7 shows the results. Figure 7a-c shows the bias, RMSE, and COR of the ERA5 and HRCLDAS wind speed data for the different landforms, respectively. The bias, RMSE, and COR of the ERA5 wind speed data decreased in the order of coast, inland, hill, and mountain. The wind speed bias of ERA5 was between 1.1 and 2.3 m s⁻¹, the RMSE was between 1.9 and 3.2 m s⁻¹, and the COR was between 0.39 and 0.53. The bias and RMSE of the HRCLDAS wind speed data for all of the landforms were significantly lower than those of ERA5, and the COR was higher than that of ERA5. The HRCLDAS wind speed data had a slight negative deviation for inland stations and positive deviations elsewhere. The deviation in inland and hilly areas was close to 0, and the deviation in coastal and mountainous areas was slightly higher, at around 0.025 m s⁻¹. The maximum RMSE, 0.71 m s⁻¹, occurred on the coast, and the RMSE for inland, hill, and mountain stations differed little, at approximately 0.4 m s⁻¹. The COR did not differ much between landforms, ranging between 0.93 and 0.96, with the lowest value in coastal areas.
Figure 7d-f shows the bias, RMSE, and COR of the ERA5 and HRCLDAS wind direction data for the different landforms. The wind direction deviations of ERA5 and HRCLDAS were positive for all landforms. The bias of the ERA5 wind direction data was between 6.3° and 11.9°, with the highest bias occurring inland, followed by hills, and then by coastal and mountain regions. The bias of the HRCLDAS wind direction data showed little variation across landforms, ranging from 1.04° to 1.32°. The RMSE of the ERA5 wind direction data increased in the order of coastal, inland, hilly, and mountainous areas, ranging from 68.7° to 92.9°. The RMSE of the HRCLDAS wind direction data was highest in hilly areas (60.9°), slightly lower in mountainous areas, and lowest in coastal areas (43.4°). The COR distributions of the two wind direction datasets across landforms mirror their RMSE distributions: where the RMSE was large, the COR was small. The COR of the ERA5 wind direction data was between 0.13 and 0.34, lowest in the mountains and highest on the coast. The COR of the HRCLDAS wind direction data was higher than that of ERA5, ranging from 0.5 to 0.65, with the lowest value in the hills and the highest on the coast.
Figure 7g-l shows the bias, RMSE, and COR of the U and V wind components for the two datasets. The U component of ERA5 and HRCLDAS had a negative deviation for all landforms, the V component deviation of ERA5 was positive, and the V component deviation of HRCLDAS was negative. The absolute value of the ERA5 bias decreased in the order of coastal, inland, hilly, and mountainous areas, and the bias of HRCLDAS was close to 0. The distribution of the other evaluation indicators for the U and V components of ERA5 and HRCLDAS was basically the same as for the wind speed, and the quality of the U component of the two datasets was slightly lower than that of the V component.
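The four landform classes defined at the start of this section amount to a simple decision rule on distance to the sea and altitude. The sketch below is one possible reading of that rule. It assumes that "plain or tableland" can be approximated as an altitude below 100 m and that an altitude of exactly 500 m counts as a hill; the text specifies neither, and the function name and the `plain_max_altitude_m` parameter are hypothetical.

```python
def classify_station(dist_to_sea_km, altitude_m, plain_max_altitude_m=100.0):
    """Assign a station to coast / inland / hill / mountain.

    Assumptions (not stated in the text): a plain or tableland is taken to be
    any site below plain_max_altitude_m, and 500 m itself counts as a hill.
    """
    if dist_to_sea_km <= 10.0:
        return "coast"      # no further than 10 km from the sea
    if altitude_m > 500.0:
        return "mountain"   # further than 10 km, above 500 m
    if altitude_m >= plain_max_altitude_m:
        return "hill"       # further than 10 km, 100 m to 500 m
    return "inland"         # further than 10 km, low-lying


# Example with made-up stations: (distance to sea in km, altitude in m)
stations = {"A": (5.0, 800.0), "B": (20.0, 50.0), "C": (20.0, 300.0), "D": (20.0, 600.0)}
classes = {name: classify_station(d, z) for name, (d, z) in stations.items()}
```

Distance to the sea is tested first, so a high station within 10 km of the coast is still classed as a coast station, in line with the definition of class (1).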

Evaluation of Wind Speed by Grade
The accuracy, strong rate, and weak rate of the hourly wind grades of ERA5 and HRCLDAS were analyzed on the basis of the wind grade of the hourly observation data. Table 4 shows the sample size of each wind grade of the observations from April to October 2020. Grade 1 wind has the largest sample size, and the sample size decreases with increasing wind grade. Because of the small number of wind samples of Grades 10 and 11, the wind speed data of these two grades were not included in the assessment in this section.
Figure 8 shows the wind grade accuracy rate, strong rate, and weak rate of ERA5 and HRCLDAS for different wind grades from April to October. Figure 8a shows the accuracy rate. The accuracy rate of HRCLDAS was higher than that of ERA5 for all wind levels. The highest accuracy rate of ERA5 was 43.9% for Grade 4 wind, and the accuracy of the other wind grades decreased progressively on either side of Grade 4. The accuracy rate of HRCLDAS for all wind levels was above 60%: the lowest was 63.8% for Grade 0 wind, the second lowest was 69% for Grade 6 wind, the accuracy rate of the other wind grades increased progressively on either side of Grade 6, and Grade 1 wind was the highest, reaching 92.1%.
Figure 8b,c shows the strong rate and weak rate of each wind grade. The strong and weak rates of HRCLDAS were lower than the accuracy. When compared with the observations, the HRCLDAS wind was more likely to be stronger in Grades 0 and 1 and weaker in the other wind grades, which means that HRCLDAS was more likely to underestimate the wind speed. ERA5 had a relatively high strong rate when the wind grade was between 0 and 3, and the wind speed of ERA5 was significantly overestimated when compared with the observations, especially for Grade 0 wind, for which its strong rate reached 99.5%. The strong rate of ERA5 decreased with increasing wind grade. When the wind grade was higher than Grade 5, the weak rate of ERA5 was higher than the strong rate and accuracy rate, which indicated that the wind speed was significantly underestimated. The weak rate increased with the wind grade, and the highest weak rate was 100% for Grade 9.
The accuracy of ERA5 and HRCLDAS at each station under all wind levels was analyzed. Figure 9a shows the distribution of the ERA5 wind accuracy for each station. The stations with an accuracy between 0 and 20% were mainly distributed in the eastern, northern, and coastal areas of Hainan Island. The wind accuracy rates of offshore stations in the South China Sea were relatively high, while those of some island stations in Xisha were lower than 5%. Figure 9b shows the distribution of the HRCLDAS wind accuracy rate for each station. The accuracy rates of HRCLDAS were higher than those of ERA5 for most of the stations. There were 285 stations with an accuracy rate of more than 80%, accounting for 69.3% of the total number of stations. Most of the stations with a lower accuracy rate were distributed in coastal areas.
Figure 10 shows the distribution of the accuracy, strong rate, and weak rate of ERA5 and HRCLDAS for the different underlying surfaces for each wind grade from April to October. From Figure 10a,c it can be seen that there was little difference in the accuracy of the ERA5 wind grade between land and the islands. The accuracy rate of Grade 4 wind was the highest, at 40.7% and 42.5% on land and the islands, respectively. The accuracy rate of Grade 1-3 and 5 wind was higher on land than on the islands, and the accuracy rate of Grade 4 and 6-8 wind was higher on the islands than on land. The strong rate of ERA5 on land and the islands decreased with increasing wind grade, whereas the weak rate showed the opposite, which meant that the ERA5 data were more likely to underestimate high-grade wind speed over land and overestimate low-grade wind speed over the ocean.
Figure 10b shows the accuracy of HRCLDAS for each grade of wind force on land. With the exception of Grade 0 wind, the accuracy of the wind grades decreased with increasing wind grade, and the highest was 92.3% for Grade 1. The accuracy rates of Grade 1-5 wind were more than 56%, and those of Grade 6-8 wind were less than 50%. On land, the strong rates of wind above Grade 1 were low, the weak rates were higher than the strong rates, and the weak rates of Grade 6-8 wind were higher than the accuracy rates. This means that, over land, HRCLDAS underestimated high-grade wind speed, and the higher the wind grade, the more significant the underestimation. Figure 10d shows the accuracy rate of HRCLDAS for each grade of wind force on the islands. Except for Grade 0 wind, which had an accuracy rate of 70.3%, the accuracy rates of the wind grades were above 80%, with the lowest being 81.9% for Grade 9 and the highest being 91.6% for Grade 4. On the islands, the strong rates and weak rates were low, and the weak rates were slightly higher than the strong rates. The accuracy rate of HRCLDAS on the islands was significantly higher than that on land, especially for high-grade wind speed.

Comparative Analysis of Performance for the Typhoon Process
A typhoon case was selected to evaluate the performance of the two datasets for the typhoon process in order to analyze the performance of ERA5 and HRCLDAS in special weather processes. Typhoon "Nangka" occurred in October 2020. The period when the typhoon had the greatest impact on Hainan Island and the South China Sea was selected for analysis, which is 12-14 October. Figure 11a,b shows the wind vector frequency distribution of ERA5 and HRCLDAS on 12-14 October 2020, respectively. It can be seen that ERA5 had a significant false positive phenomenon in the wind direction of 270 • -315 • , there was clear underreporting between 225 • and 270 • , and the probability of not being simulated for high wind speed was high. The wind vector frequency distribution of HRCLDAS was closer to that of the observations, and the simulation of higher wind speeds was more accurate, but there was also the obvious underestimation and underreporting of high wind speeds near the 180 • wind direction.
Atmosphere 2021, 12, x FOR PEER REVIEW Figure 10b shows the accuracy of HRCLDAS for each grade of wind force o With the exception of Grade 0 wind, the accuracy of the wind grades decreased w increasing wind grade, and the highest was 92.3% for Grade 1. The accuracy rates of 1-5 wind were more than 56%, and those of Grade 6-8 wind were less than 50%. O the strong rates of wind above Grade 1 were low, the weak rates were higher th strong rates, and the weak rates of Grade 6-8 wind were higher than the accurac This means that, over land, HRCLDAS underestimated high-grade wind speed, a higher the wind grade, the more significant the underestimation. Figure 10d sho accuracy rate of HRCLDAS for each grade of wind force on the islands. It can be see except for Grade 0 wind, which had an accuracy rate of 70.3%, the accuracy rates wind grades were above 80%, with the lowest being 81.9% for Grade 9 and the h being 91.6% for Grade 4. On the islands, the strong rates and weak rates were lo the weak rates were slightly higher than the strong rates. It can be seen that the ac rate of HRCLDAS on the islands was significantly higher than that on land, especi high-grade wind speed.

Comparative Analysis of Performance for the Typhoon Process
Several individual stations were selected to compare and analyze the variation of wind speed with time between the two datasets and the observations during the typhoon. Figure 12a depicts a location map of the selected sites, which represent an offshore island station, an offshore land station, a remote island station, and a land station.

Figure 12b shows the time series of near-surface wind speed at the offshore island station (Wanning Bai'an Island Station). HRCLDAS simulated the wind speed change during the typhoon well, with only a few discrepancies, whereas ERA5 clearly overestimated the low wind speed, underestimated the high wind speed, inaccurately simulated the wind speed change, and did not simulate many sudden wind speed changes. Figure 12c shows the time series of near-surface wind speed at the offshore land station (Qionghai Qingge Port Station). ERA5 overestimated the wind speed at the offshore land station when compared with the observations, and there was a time offset in the simulated wind speed variation between ERA5 and the observations. The HRCLDAS wind speed was basically consistent with the observations, except at some of the peak values.

Figure 12d shows the time series of near-surface wind speed at the remote island station (Xisha Station). At this station, the simulation quality of ERA5 and HRCLDAS was better than that for the other kinds of station. ERA5 mostly simulated the change of wind speed, but there was still a time offset, with the simulated changes occurring later than the observations as a whole. Figure 12e shows the time series of near-surface wind speed at the land station (Tunchang Station). ERA5 roughly simulated the change trend of wind speed at the land station, but without enough detail, and the simulated peak time of wind speed was offset. HRCLDAS simulated the change of wind speed at the land station well, but the wind speed value showed a certain deviation.
In general, ERA5 significantly overestimated low wind speeds during the typhoon, did not accurately simulate the change of wind speed, and simulated the peak values with a time offset. When compared with ERA5, HRCLDAS could better simulate the variation characteristics of the wind speed, and its simulation was best at the remote island station.
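The time offset noted for ERA5 can be quantified in several ways. One illustrative option, which is not necessarily the approach used in this study, is to shift the product series against the observations and keep the shift that gives the highest correlation. The function names `pearson` and `best_lag`, and the 6 h search window, are hypothetical choices made for this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def best_lag(product, obs, max_lag=6):
    """Return (lag, r) for the shift, in time steps, with the highest correlation.

    A positive lag means the product series is later than the observations.
    max_lag is an assumed search window, not a value taken from the study.
    """
    best = (0, float("-inf"))
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            p, o = product[lag:], obs[:len(obs) - lag]
        else:
            p, o = product[:lag], obs[-lag:]
        if len(p) < 3:
            continue  # too few overlapping points to correlate
        r = pearson(p, o)
        if r > best[1]:
            best = (lag, r)
    return best


# Hypothetical hourly wind speeds (m/s); the product repeats the observed
# peak two hours late.
obs = [2, 3, 5, 9, 14, 10, 6, 4, 3, 2, 2, 3]
product = [2, 2] + obs[:-2]
lag, r = best_lag(product, obs, max_lag=4)
```

With hourly data, a positive `lag` from this sketch would correspond to a simulated peak that arrives that many hours after the observed one.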

Discussion
The near-surface wind is a major factor in various industries of economic importance, such as agriculture, fishery, transportation, construction, water conservancy projects, and so on. It is necessary to comprehensively evaluate the performance of near-surface wind products for Hainan Island and the South China Sea. In this study, the nearest-neighbor interpolation method was used to evaluate the quality of ERA5 and HRCLDAS near-surface wind data for Hainan Island and the South China Sea from April to October 2020. The bias, RMSE, COR, and wind grade accuracy rate, strong rate, and weak rate were used as evaluation indexes to analyze the temporal and spatial distribution of these indicators. In addition, the performance of the two datasets during a typhoon was evaluated.
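As an illustration of the evaluation indexes named above, the following minimal sketch matches a station to its nearest grid cell and computes the bias, RMSE, and COR with their standard definitions. It assumes a regular latitude/longitude grid, and all function names and sample values are hypothetical rather than taken from the study.

```python
import math


def nearest_grid_value(lat, lon, grid_lats, grid_lons, field):
    """Value of the grid cell whose center is closest to the station.

    On a regular lat/lon grid the nearest cell can be found per axis.
    """
    i = min(range(len(grid_lats)), key=lambda k: abs(grid_lats[k] - lat))
    j = min(range(len(grid_lons)), key=lambda k: abs(grid_lons[k] - lon))
    return field[i][j]


def bias(product, obs):
    """Mean of (product - observation)."""
    return sum(p - o for p, o in zip(product, obs)) / len(obs)


def rmse(product, obs):
    """Root-mean-square error between product and observation."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(product, obs)) / len(obs))


def cor(product, obs):
    """Pearson correlation coefficient between product and observation."""
    n = len(obs)
    mp, mo = sum(product) / n, sum(obs) / n
    spo = sum((p - mp) * (o - mo) for p, o in zip(product, obs))
    spp = sum((p - mp) ** 2 for p in product)
    soo = sum((o - mo) ** 2 for o in obs)
    if spp == 0 or soo == 0:
        return float("nan")  # undefined when either series is constant
    return spo / math.sqrt(spp * soo)


# Hypothetical 3 x 3 grid and station location, for illustration only.
grid_lats = [18.0, 18.25, 18.5]
grid_lons = [109.0, 109.25, 109.5]
field = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
matched = nearest_grid_value(18.3, 109.4, grid_lats, grid_lons, field)

product = [3.0, 5.0, 7.0]  # hypothetical product wind speeds (m/s)
obs = [2.0, 5.0, 6.0]      # hypothetical observed wind speeds (m/s)
```

Nearest-neighbor matching keeps the original grid values unchanged, which is convenient when two products of very different resolution are compared against the same stations.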
The daily mean values of wind speed, wind direction, U component, and V component of HRCLDAS and ERA5 were basically the same as the observations, but the biases and RMSE of HRCLDAS were smaller, with a higher COR. The variation range of the daily mean value of the four wind products of ERA5 was greater than the observed values.
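For readers who wish to reproduce the four wind factors, the sketch below converts wind speed and direction into U and V components using the usual meteorological convention, and computes a wind direction RMSE on differences wrapped to [-180°, 180°). Whether the study wrapped direction differences in this way is not stated, so this is an assumption; the function names are hypothetical.

```python
import math


def to_uv(speed, direction_deg):
    """Convert speed and meteorological direction to (u, v).

    Direction is where the wind blows FROM, clockwise from north;
    u is the eastward component and v is the northward component.
    """
    rad = math.radians(direction_deg)
    return -speed * math.sin(rad), -speed * math.cos(rad)


def direction_difference(product_deg, obs_deg):
    """Signed smallest angle from observation to product, in [-180, 180)."""
    return (product_deg - obs_deg + 180.0) % 360.0 - 180.0


def direction_rmse(product_dirs, obs_dirs):
    """RMSE of wind direction using wrapped differences (an assumed choice)."""
    diffs = [direction_difference(p, o) for p, o in zip(product_dirs, obs_dirs)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# A 10 m/s westerly wind (from 270 degrees) blows toward the east.
u, v = to_uv(10.0, 270.0)
```

Wrapping matters near north: directions of 350° and 10° differ by 20°, not 340°, so an unwrapped difference would inflate the direction error.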
On the basis of the site spatial distribution of the evaluation indexes for the four wind factors, the quality of the HRCLDAS and ERA5 wind speed products was higher in the inland areas of Hainan Island than in its coastal areas. This conclusion is consistent with the performance of ERA5 for Guangdong Province in China [23]. The quality of the wind direction products was higher in the coastal areas of Hainan Island than inland. For the wind speed, the RMSEs between the two datasets and the observations were both larger on the islands than on land, but the CORs were similar. For the wind direction, the RMSEs of HRCLDAS and ERA5 were both smaller on the islands than on land, and the CORs were higher on the islands than on land. In general, the data quality of HRCLDAS and ERA5 was better on the islands in the South China Sea than on land. However, for both land and the islands in the South China Sea, the quality of HRCLDAS was better than that of ERA5, and the advantage was most pronounced in the coastal areas of Hainan Island and at the island stations near Sansha.
The wind speed RMSE of the ERA5 data was the largest on the coast, the RMSE of wind direction was the largest in the mountain areas, and the CORs were both the smallest in the mountain areas; therefore, it is necessary to be cautious when using ERA5 data for mountain areas. This is consistent with the performance of ERA5 in Sweden [19]. The quality of HRCLDAS near-surface wind data was higher than that of ERA5 for all of the different landforms. The quality of the HRCLDAS wind speed data was almost the same for all the different landforms, but it was slightly lower in coastal areas. The HRCLDAS wind direction data had the lowest quality in hilly areas, followed by mountainous areas, which may be due to the complex topography of hilly and mountainous areas.
From the evaluation results of HRCLDAS and ERA5 for each wind grade, it can be seen that the accuracy rate of HRCLDAS for each grade of wind was higher than that of ERA5. ERA5 significantly overestimated low-grade wind and significantly underestimated high-grade wind, with the highest accuracy rate of 43.9% for Grade 4 wind. The accuracy rate of HRCLDAS in each wind grade was more than 60%, and its accuracy rate on islands was significantly higher than that on land, especially the accuracy rate of the high-grade wind speed.
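The grade-based indexes can be sketched as follows. The example assumes that wind grades follow the Beaufort scale and that, for each observed grade, the accuracy rate, strong rate, and weak rate are the fractions of samples in which the product grade is equal to, higher than, or lower than the observed grade. Both points are assumptions made for illustration, and the names used are hypothetical.

```python
from bisect import bisect_right

# Lower speed bounds (m/s) of Beaufort grades 1 to 12; speeds below
# 0.3 m/s are Grade 0. Use of the Beaufort scale is assumed here.
BEAUFORT_LOWER_BOUNDS = [0.3, 1.6, 3.4, 5.5, 8.0, 10.8,
                         13.9, 17.2, 20.8, 24.5, 28.5, 32.7]


def wind_grade(speed):
    """Map a wind speed in m/s to a wind grade from 0 to 12."""
    return bisect_right(BEAUFORT_LOWER_BOUNDS, speed)


def grade_rates(product_speeds, obs_speeds):
    """Return {observed grade: (accuracy, strong, weak)} as fractions."""
    counts = {}
    for p, o in zip(product_speeds, obs_speeds):
        gp, go = wind_grade(p), wind_grade(o)
        c = counts.setdefault(go, [0, 0, 0, 0])  # hit, strong, weak, total
        if gp == go:
            c[0] += 1
        elif gp > go:
            c[1] += 1
        else:
            c[2] += 1
        c[3] += 1
    return {g: (c[0] / c[3], c[1] / c[3], c[2] / c[3])
            for g, c in counts.items()}


# Hypothetical paired samples (m/s), for illustration only.
obs = [0.1, 2.0, 2.5, 6.0, 6.5, 7.0, 12.0]
product = [1.0, 2.2, 4.0, 6.1, 5.0, 7.5, 9.0]
rates = grade_rates(product, obs)
```

Under this definition the three rates for each observed grade sum to one, so a high weak rate at high grades corresponds directly to the underestimation of strong winds described above.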
During special weather processes such as a typhoon, the simulation of wind by HRCLDAS was closer to the observations, and its simulation of higher wind speeds was more accurate than that of ERA5. In addition, HRCLDAS could better simulate the details of the change of wind speed when compared with ERA5. In contrast, ERA5 clearly failed to capture many of the higher wind speeds. ERA5 could simulate the changing trend of wind speed, but the details of the change could not be well simulated, and there was a deviation in the time when the peak wind speed appeared. HRCLDAS and ERA5 both had the best simulation quality for the wind speed at remote island stations.
From the above analysis it can be seen that the quality of HRCLDAS was generally better than the quality of ERA5. The reason for this may be that a multi-grid variational analysis method was adopted for the generation of the HRCLDAS dataset, the background field and the observation field were gradually merged and assimilated from large scale to small scale, and the data that were collected by the automatic wind observation stations were integrated, which ensured the quality of HRCLDAS wind products in China. However, the quality of HRCLDAS near-surface wind data in areas with complex topography and sea-land boundaries still needs to be further strengthened, and how to improve it needs to be further studied.

Conclusions
This study found that HRCLDAS and ERA5 near-surface wind data can both reflect the general characteristics of wind variation over Hainan Island and the South China Sea. ERA5 near-surface wind speed has a large deviation on the coast; ERA5 overestimates the low-grade wind and underestimates the high-grade wind, and it does not reflect the details of the wind speed change. The quality of the ERA5 near-surface wind direction is poor, especially in mountainous areas. When compared with ERA5, HRCLDAS near-surface wind data have a lower bias and RMSE relative to the observations and a higher COR, and they better reflect higher wind speeds in typhoon weather. However, the near-surface wind speed of HRCLDAS also has a large deviation on the coast, and the quality of its wind direction in hilly areas is poor. In general, the quality of HRCLDAS is better than that of ERA5. These differences may be caused by the different data sources and different algorithms. However, there are still some shortcomings in the HRCLDAS data, such as its time series being too short and its coverage being limited to China, which mean that it cannot support long-term climate analysis. Therefore, the products should be selected according to actual needs.