Applying Satellite Data Assimilation to Wind Simulation of Coastal Wind Farms in Guangdong, China

With the development of the wind power industry in China, accurate simulation of near-surface wind plays an important role in wind-resource assessment. Numerical weather prediction (NWP) models have been widely used to simulate the near-surface wind speed. By combining the Weather Research and Forecasting (WRF) model with the three-dimensional variational (3DVar) data assimilation system, our work applied satellite data assimilation to the wind resource assessment of coastal wind farms in Guangdong, China. We compared the simulation results with wind speed observations from seven wind observation towers in the Guangdong coastal area, and the results showed that satellite data assimilation with the WRF model can significantly reduce the root-mean-square error (RMSE) and improve the index of agreement (IA) and correlation coefficient (R). In different months and at different heights (10, 50, and 70 m), the RMSE can be reduced by 0–0.8 m/s from the original 2.5–4 m/s, the IA can be increased by 0–0.2 from the original 0.5–0.8, and the R can be increased by 0–0.3 from the original 0.2–0.7. The fitted Weibull distributions of wind speed show that, with data assimilation, the WRF model simulated the wind speed distribution more accurately. Based on the numerical simulation, our work proposes a wind resource evaluation approach that combines numerical modeling and data assimilation, which will benefit the wind power assessment of wind farms.


Introduction
Energy from fossil fuels has played a major role in the development of modern human civilization, but it also brings serious environmental and climate problems, such as air pollution and global warming. The development of renewable energy is one of the major ways to solve environmental problems and achieve sustainable development, and wind energy has developed rapidly as a major source of clean, renewable energy.
In 2018, China installed an additional 21 GW of wind power capacity, bringing its total wind power capacity to more than 200 GW [1]. Before a wind farm is constructed, the wind resources in the candidate area need to be evaluated, and the location of a wind farm is selected mainly on the basis of the wind resource assessment. Therefore, wind speed simulation in the wind farm area is a key issue in the development of wind power. After many years of development, wind speed simulation in wind resource assessment and prediction now relies on two methods: statistical methods and numerical simulation. Costa et al. [2] briefly reviewed 30 years of development in short-term wind speed forecasting, highlighting that the main forecast method has shifted from statistical models to numerical models and that the two kinds of models have also begun to be combined. Storm et al. [3] used the Weather Research and Forecasting (WRF) [4] model to simulate the low-level jet (LLJ), and the model was able to capture some characteristics of the LLJ, which indicates that the WRF model can be used for short-term wind energy simulation.
In order to improve the accuracy of the numerical weather model in wind speed simulation, there are two approaches: (1) developing the physical parameterization scheme to improve the wind simulation performance at near-surface levels and (2) applying the data assimilation to improve the initial condition of the atmosphere. Some research studies evaluated the parameterization scheme chosen and planetary boundary layer (PBL) development [5][6][7][8]. In addition to the selection and improvement of the PBL scheme, data assimilation is also widely used to improve the wind simulation results of the numerical model.
Liu et al. [9] combined the WRF model with a data assimilation system and a large eddy simulation (LES) model, which increased wind energy simulation resolution to the level of LES. Zhang et al. [10] used the WRF model and data assimilation to forecast near-surface wind speed. In that work, conventional observations and infrared satellite observations were assimilated with 3DVar to improve the model output wind speed. The results showed that, with the improvements of the initial fields, the assimilation of conventional observations and infrared satellite observations significantly improved the wind forecast results. Ancell et al. [11] compared the effects of the ensemble Kalman filter (EnKF) and 3DVar data assimilation on wind forecasting. The results showed that the EnKF assimilation effect is better than the 3DVar assimilation for 24-hour forecasting. Ulazia et al. [12] compared different data assimilation schemes and found that assimilation at an interval of six hours has a better effect on the simulation of wind speed than at an interval of 12 hours. The study also suggested applying data assimilation techniques to mesoscale weather models in wind resource assessment. Che et al. [13] developed a system to predict wind speed at turbine height. The Kalman filter algorithm was used to assimilate the cabin wind data after quality control, and the wind speed prediction of the WRF model was improved. The study also pointed out that data assimilation can effectively reduce random errors and is more important in rare or extreme weather conditions. Ulazia et al. [14] used the WRF model to estimate the offshore wind energy resources on the Iberian Mediterranean coast and the Balearic Islands. The results with and without data assimilation were compared, and they showed that the bias of the wind speed simulation was significantly reduced after the 3DVar data assimilation. Cheng et al. [15] improved short-term (0-3 hour) wind energy forecasting by assimilating wind speed observed at wind turbines into a numerical weather forecast system. The results showed that the assimilation of wind speed can reduce the mean absolute error of the 0-3 hour wind speed forecast by 0.5-0.6 m/s. As can be seen from these related works, data assimilation can improve short-term wind speed simulation results by changing the initial field and providing real-time updates during the model run. An efficient way is to divide a long simulation into multiple short simulations, use the previous numerical weather prediction (NWP) output as the "first guess" field, and then apply data assimilation to update the initial condition before continuing with the next short run.
In this paper, we used the WRF model to make a one-year wind speed simulation for the coastal wind farm area in Yangjiang, Guangdong. Furthermore, 3DVar data assimilation was used to assimilate satellite radiance data. The observation data of seven wind observation towers were used to evaluate the simulation results and to calculate the improvements of data assimilation on near-surface wind speed simulation. The remainder of this paper is organized as follows: Section 2 introduces the experiment, the data, and the evaluation methods; Section 3 analyzes the results of the different tests; Section 4 discusses the results of this article compared with other work; and Section 5 presents the main conclusions.

Wind Observation Data
In order to estimate the improvement that satellite data assimilation brings to wind speed simulation, wind speed observations from seven wind observation towers were used. These wind towers have wind speed observations at different heights (10, 50, and 70 m) and measure the instantaneous wind speed and direction every 10 minutes. The wind tower data were provided by China Huaneng Group Co., Ltd. (CHNG), and all wind towers are located in the wind farm of CHNG. Table 1 shows the geographical locations and altitudes of the seven wind towers. These wind towers are geographically close, and all of them are located near the coast in Yangjiang, Guangdong Province. Table 2 lists the wind-sensor type, model number, and hardware and software versions of the wind towers. Figure 1 shows the towers' locations in the inner domain of the model. The observations cover the whole year of 2012. Quality control (QC) was first performed on the original data; another wind resource assessment study [16] used a similar type of wind observations, so we used the same QC method as that study. The QC method was as follows: (1) If the wind speed value does not change for more than 30 minutes, these data are regarded as invalid. (2) If there is a large difference between the observed wind speeds at different heights, the data with the small value at that time are also considered invalid. The comparison was made among the wind speeds at 70, 50, and 10 m. Table 3 shows the number of observations of each wind tower in the different months of 2012, and Table 4 shows the number of observations remaining after quality control. Some towers have missing data in autumn and winter. The data of Tower 5 at heights of 50 and 70 m are considered invalid because of poor quality.
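The first QC rule can be illustrated with a short Python sketch. It is a minimal illustration and not the authors' code: the function name, the list-based input, and the reading of "more than 30 minutes" as a run of identical 10-minute samples whose span exceeds 30 minutes are all assumptions made here. The second rule is not sketched because its thresholds are not given in the text.

```python
def flag_stuck_values(speeds, step_minutes=10, max_flat_minutes=30):
    """Flag samples that belong to a run of identical wind speeds.

    Returns a list of booleans, True where the sample is part of a run of
    unchanged values lasting longer than max_flat_minutes.
    (Hypothetical helper; the thresholds are parameters, not source values
    beyond the 10-minute step and the 30-minute limit.)
    """
    n = len(speeds)
    invalid = [False] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next value is identical to the first one.
        while end + 1 < n and speeds[end + 1] == speeds[start]:
            end += 1
        # A run of k samples spans (k - 1) sampling intervals.
        duration = (end - start) * step_minutes
        if duration > max_flat_minutes:
            for i in range(start, end + 1):
                invalid[i] = True
        start = end + 1
    return invalid
```

With this reading, five identical consecutive samples (40 minutes) are rejected, while four (exactly 30 minutes) are kept; a stricter or looser reading only requires changing `max_flat_minutes`.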

Numerical Model and Data Assimilation
We set three numerical simulation tests to measure the improvement of wind speed simulation by applying satellite data assimilation. The first test (Test 1) only used cold-start initial conditions from NCEP's final analysis data to create a simulation of wind speed. The second test (Test 2) used the analysis field generated by the data assimilation system as the initial conditions, and the model field was updated four times by the data assimilation system during each simulation run. In order to compare the improvement from satellite data assimilation with that from conventional observations, we set a third test (Test 3), which used the same data assimilation configuration as Test 2, except that conventional surface and upper-air observations were used as the data assimilation input. The model we used was the WRF model (version 3.8.1), and its data assimilation system WRFDA [4] was used for satellite and conventional data assimilation. Figure 2 shows the three nested domains of our simulation tests. The inner domain was mainly located in the coastal area of Yangjiang, Guangdong. Figure 1 shows the distribution of wind towers (red dots) in the inner domain. The physical configuration used the following schemes: Morrison double-moment scheme [17] for microphysics; RRTMG [18] for longwave and shortwave radiation; Noah [19] for the land surface; Kain-Fritsch [20] for cumulus convection; and YSU [21] for the PBL. Table 5 shows the model configuration of the simulations; since the grid resolution of domain 03 was finer than 5 km, we did not set a cumulus convection scheme for domain 03. The data used to generate the initial conditions and boundary forcing of the model were the Final Operational Global Analysis (FNL) data [22], provided by the National Centers for Environmental Prediction (NCEP). The spatial resolution of the FNL data was 1 degree (in both latitude and longitude), and the temporal resolution was 6 hours.
The background error covariance matrix used in 3DVar data assimilation was generated by the NMC method [23]. To calculate the background error, a one-month simulation was made from 1 Jan. 2012 to 1 Feb. 2012. The simulation had the same model settings as Test 1, 2, and 3, and it contained a 12-hour forecast and 24-hour forecast results at both 00:00 and 12:00 UTC. Then, the 62 pairs of results from the 31 days were used to calculate the background error covariance.
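The statistical idea behind the NMC method can be sketched in a few lines of Python: differences between 24-hour and 12-hour forecasts valid at the same time serve as a proxy for background error, and their sample covariance approximates the background error covariance. The sketch below is an illustration under simplifying assumptions, not the procedure used in this work: the state is a short flat list of numbers, the function name is hypothetical, the covariance is normalized by the number of pairs, and the control-variable transforms that operational tools apply are omitted.

```python
def nmc_covariance(forecast_pairs):
    """Estimate a background error covariance from forecast differences.

    forecast_pairs: list of (f24, f12) tuples, where f24 and f12 are
    equal-length lists holding the 24-hour and 12-hour forecasts valid
    at the same time. (Hypothetical helper for illustration.)
    """
    # Forecast differences, one vector per pair.
    diffs = [[a - b for a, b in zip(f24, f12)] for f24, f12 in forecast_pairs]
    n_pairs = len(diffs)
    n_state = len(diffs[0])
    # Remove the mean difference so that only the spread remains.
    mean = [sum(d[i] for d in diffs) / n_pairs for i in range(n_state)]
    anomalies = [[d[i] - mean[i] for i in range(n_state)] for d in diffs]
    # Sample covariance of the differences.
    return [
        [sum(a[i] * a[j] for a in anomalies) / n_pairs for j in range(n_state)]
        for i in range(n_state)
    ]
```

In the setup described above, `forecast_pairs` would hold the 62 pairs obtained from the 31 days of forecasts started at 00:00 and 12:00 UTC.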
The NCEP GDAS Satellite Data [24] were used as the input data for satellite data assimilation. The satellite sensors included AMSUA, HIRS, MHS, and AIRS. Table 6 shows the types of platforms and sensors. In order to process the satellite data before data assimilation, the Community Radiative Transfer Model (CRTM) was used as the radiative transfer model. The CRTM can look up the cloud, surface emissivity, and aerosol coefficients and eliminate the satellite data bias caused by cloud, land, and aerosol. Table 7 shows the resolution of the satellite data. Since some sensors (MHS, HIRS, and AIRS) have higher resolutions than domain 01 (27 km), we applied data thinning to domain 01 in the data assimilation. The conventional observations used in Test 3 include the NCEP ADP Global Surface Observational Weather Data [25] and the NCEP ADP Global Upper Air Observational Weather Data [26]. Most of the conventional observations used in Test 3 are synoptic observations. Figure 3 shows the locations of the synoptic observation stations. Figure 4 shows how the WRF model was run in the three tests. Because we needed wind speed values every 10 minutes rather than long-term variability, we started the model every day in 2012, and each run of the model made just a one-day simulation. The reason is that a long continuous run depends on boundary forcing, which can capture long-term variability but cannot accurately simulate the results at every moment.
In Test 1, we started the WRF model at 18:00 UTC every day and took 6 hours from 18:00 to 00:00 UTC as the spin-up time. Then, from 00:00 to 00:00 UTC the next day, the model output the simulated wind speed as the results of Test 1. Since the time interval of the observation data was 10 minutes, the time interval of the wind speed output of the model was also set to 10 minutes.
In Test 2 and Test 3, we used the WPS (WRF preprocessing system) output of 18:00 UTC as the first guess field, assimilated the observation data (satellite data in Test 2 and conventional data in Test 3) by using 3DVar, and used the 3DVar output as the initial field of the WRF model. From 18:00 UTC to 00:00 UTC, the model was run as a spin-up process. At 00:00 UTC (+1 day), 06:00 UTC (+1 day), 12:00 UTC (+1 day), and 18:00 UTC (+1 day), the data assimilation system was run four more times, each time using the WRF output as the first guess field and the observation data as the input. Unlike the long-time cycling run of WRF-3DVar, our model was cold-started daily based on FNL data. In Tests 1, 2, and 3, we used FNL data to generate the initial field, ran the WRF model for 30 hours, and then began the next day's run. The only difference between Test 1 and Tests 2 and 3 was that Test 2 and Test 3 applied data assimilation five times in each WRF run. Another similar wind resource evaluation work [27] explained that this approach avoids model divergence and the accumulation of truncation errors; the WRF simulations used in that research were 2-day-restart runs. We also calculated the 3DVar mean absolute difference (MAD) of U and V in domain 03 in Test 2, as follows:

$$\mathrm{MAD} = \frac{1}{N}\sum_{i=1}^{N}\left|x_i^{a} - x_i^{b}\right|$$

Here, $N$ is the total grid number in domain 03; $x_i^{a}$ is the U or V wind speed after 3DVar; and $x_i^{b}$ is the U or V wind speed of the first guess field. We calculated the MAD of U and V at the four 3DVar times in the simulation: 00, 06, 12, and 18 UTC. The MAD results at different heights and times are shown in Figure 5; the MAD is stable at 00, 06, 12, and 18 UTC, so there is virtually no difference between the typical operational cycling run and our daily cold-start run. After obtaining the model results, we first interpolated the three-dimensional wind field to heights of 10, 50, and 70 m, using linear interpolation. Then, we interpolated the wind field to each wind tower's latitude and longitude, using bilinear interpolation in the horizontal. We compared the interpolated results with the observed wind speed.
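The two interpolation steps can be sketched as follows. This is a minimal Python illustration, not the post-processing code used in this work: the function names are hypothetical, the vertical column is assumed to be given as heights above ground in ascending order, and the horizontal step assumes that the four surrounding grid points form a rectangle in the chosen coordinates.

```python
def interp_to_height(heights, values, target):
    """Linearly interpolate a vertical column to a target height.

    heights: model level heights above ground (m), ascending.
    values:  wind speed at those levels.
    """
    for k in range(len(heights) - 1):
        z0, z1 = heights[k], heights[k + 1]
        if z0 <= target <= z1:
            w = (target - z0) / (z1 - z0)
            return (1 - w) * values[k] + w * values[k + 1]
    raise ValueError("target height outside the model levels")


def bilinear(x0, x1, y0, y1, f00, f10, f01, f11, x, y):
    """Bilinear interpolation inside the cell [x0, x1] x [y0, y1].

    f00 is the value at (x0, y0), f10 at (x1, y0),
    f01 at (x0, y1), and f11 at (x1, y1).
    """
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    return (f00 * (1 - tx) * (1 - ty) + f10 * tx * (1 - ty)
            + f01 * (1 - tx) * ty + f11 * tx * ty)
```

For a tower, one would first call `interp_to_height` on each of the four surrounding columns and then pass the four results to `bilinear` with the tower's coordinates.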

Results Measurements
In order to evaluate the results of the different tests, the following evaluation indices were calculated to evaluate the errors and correlations between model results and observation data.

Root-Mean-Square Error (RMSE)
The root-mean-square error (RMSE) is widely used in NWP to evaluate the error of wind speed and other meteorological variables. Since the observation data have a 10-minute time resolution, we calculated the RMSE from the 10-minute model wind speed output:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(M_i - O_i\right)^2}$$

Here, $N$ is the number of wind speed observations in each month, $M_i$ is the value of the model result, and $O_i$ is the observation value.

Index of Agreement (IA)
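The Index of Agreement is a standardized measure of the degree of model prediction error [28][29][30]. It can be calculated by the following:

$$\mathrm{IA} = 1 - \frac{\sum_{i=1}^{N}\left(M_i - O_i\right)^2}{\sum_{i=1}^{N}\left(\left|M_i - \bar{O}\right| + \left|O_i - \bar{O}\right|\right)^2}$$

where $M_i$ and $O_i$ are the model and observed values and $\bar{O}$ is the average value of the total observation data. The Index of Agreement varies between 0 and 1, where a value of IA close to 1 indicates well-matched results, and 0 indicates no agreement at all.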
The index of agreement can detect additive and proportional differences in the observed and simulated means and variances [31]. We calculate the IA of the 10-minute model wind speed output to investigate the agreement level of the model output to the wind speed observations.

Pearson Correlation Coefficient (R)
The Pearson Correlation Coefficient (R) is also widely used to evaluate the performance of wind speed simulation of NWP. It reflects the correlation between wind speed simulation series and observation series. If the model output has a high level of R, the error can be largely corrected by postprocessing algorithms.
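$$R = \frac{\mathrm{Cov}(M, O)}{\sqrt{\mathrm{Var}(M)\,\mathrm{Var}(O)}}$$

Here, $\mathrm{Cov}(M, O)$ represents the covariance of the model results and the observed wind speed, and $\mathrm{Var}(M)$ and $\mathrm{Var}(O)$ represent the variances of the model results and the observed wind speed. These variables can be calculated as follows:

$$\mathrm{Cov}(M, O) = \frac{1}{N}\sum_{i=1}^{N}\left(M_i - \bar{M}\right)\left(O_i - \bar{O}\right),\quad \mathrm{Var}(M) = \frac{1}{N}\sum_{i=1}^{N}\left(M_i - \bar{M}\right)^2,\quad \mathrm{Var}(O) = \frac{1}{N}\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^2$$

In our tests, we calculated the R between the 10-minute model wind speed output and the 10-minute wind speed observations. The R results were also compared, to evaluate the simulation results and the improvements of data assimilation.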
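The three indices can be computed from paired model and observation series as in the following Python sketch. It assumes the standard textbook definitions (the Willmott form for IA and population-normalized covariance for R); the function names are chosen here for illustration and are not taken from the original work.

```python
import math


def rmse(model, obs):
    """Root-mean-square error between paired model and observed values."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def index_of_agreement(model, obs):
    """Index of agreement (Willmott form): 1 is a perfect match, 0 is none."""
    o_mean = sum(obs) / len(obs)
    num = sum((m - o) ** 2 for m, o in zip(model, obs))
    den = sum((abs(m - o_mean) + abs(o - o_mean)) ** 2
              for m, o in zip(model, obs))
    return 1.0 - num / den


def pearson_r(model, obs):
    """Pearson correlation coefficient between model and observations."""
    n = len(obs)
    m_mean = sum(model) / n
    o_mean = sum(obs) / n
    cov = sum((m - m_mean) * (o - o_mean) for m, o in zip(model, obs)) / n
    var_m = sum((m - m_mean) ** 2 for m in model) / n
    var_o = sum((o - o_mean) ** 2 for o in obs) / n
    return cov / math.sqrt(var_m * var_o)
```

A model series that is offset from the observations by a constant illustrates how the indices differ: R stays at 1 because the pattern is perfectly correlated, while RMSE and IA both register the offset.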

Weibull Distribution of Wind Speed
In general, the distribution of near-surface wind speed can be fitted by the Weibull distribution [32]. The probability density function of the Weibull distribution is as follows:

$$f(v) = \frac{k}{c}\left(\frac{v}{c}\right)^{k-1}\exp\left[-\left(\frac{v}{c}\right)^{k}\right]$$

where $v$ is the wind speed, $k > 0$ is the shape parameter, and $c > 0$ is the scale parameter of the Weibull distribution.
The Weibull distribution has been widely used in wind resource assessment because, before a wind farm is constructed, the wind speed distribution must be evaluated in order to calculate the amount of electric power the wind farm can generate. The two parameters can be used to determine whether the distribution of the simulated wind speed is similar to that of the observations. If the parameters of the simulation results are close to those of the observations, the model distribution reflects the true wind distribution well, and the model results can be used in the wind resource assessment. Figure 6 shows the Weibull distributions of the seven towers at 10, 50, and 70 m. Tower 5 was missing most of its data at 50 m and all its data at 70 m. Therefore, the distribution of Tower 5 at 50 and 70 m was not analyzed. Table 8 lists the shape parameter and the scale parameter of each Weibull distribution in Figure 6. In all of the subfigures of Figure 6, the wind speed distribution of the red lines is generally shifted toward smaller values than that of the green lines. This means that the wind speed of Test 2 is smaller than the wind speed of Test 1. For the peak position of the wind speed distribution, except in several cases (Tower 1 at 10 m; Tower 3 at 10 m; Tower 5 at 10 m; and Tower 6 at 10, 30, and 70 m), the peak of Test 2 is closer to the observation than that of Test 1. Furthermore, for the peak value of the wind speed distribution, except in four cases (Tower 1 at 70 m, Tower 2 at 70 m, Tower 3 at 70 m, and Tower 6 at 70 m), the peak values of Test 2 are closer to the observations than those of Test 1. Table 8 shows that the shape and scale parameters of Test 1 are larger than those of Test 2 and Test 3; the simulation performance of Test 1 is worse than that of Test 2 and Test 3, mainly due to the systematic overestimation of the wind speed.
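The text does not state how the shape and scale parameters were fitted. As one possible approach, the following Python sketch estimates them by maximum likelihood, solving the likelihood equation for the shape parameter by bisection. The function name, the bracketing interval for the shape parameter, and the tolerance are assumptions made for illustration, and the input must contain only strictly positive wind speeds.

```python
import math


def fit_weibull(speeds, k_low=0.1, k_high=20.0, tol=1e-8):
    """Maximum-likelihood fit of a two-parameter Weibull distribution.

    Returns (k, c): shape and scale. All speeds must be > 0.
    (Hypothetical helper; not necessarily the method used in the paper.)
    """
    logs = [math.log(v) for v in speeds]
    mean_log = sum(logs) / len(logs)

    def score(k):
        # Likelihood equation for the shape parameter; its root is the
        # estimate. The function increases monotonically with k.
        powered = [v ** k for v in speeds]
        weighted = sum(p * l for p, l in zip(powered, logs))
        return weighted / sum(powered) - 1.0 / k - mean_log

    lo, hi = k_low, k_high
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    # With the shape fixed, the scale follows in closed form.
    c = (sum(v ** k for v in speeds) / len(speeds)) ** (1.0 / k)
    return k, c
```

Fitting the simulated and observed series at a tower with the same routine gives two (k, c) pairs that can be compared directly, which is the kind of comparison summarized in Table 8.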

The RMSE, IA, and R Results
Tables A1, A2, and A3 in Appendix A list the RMSE, IA, and R results of the three tests (Test 1, Test 2, and Test 3) at heights of 10, 50, and 70 m. In order to analyze the distributions in different months, the results of each wind tower were calculated separately for each month, from January to December. The vacant positions in the tables mean that there were no valid wind speed observations in that month, so the indices were not calculated.
For the distributions of wind direction, we plotted the wind rose for each month, using the wind speed and wind direction of Test 2. Figures B1, B2, B3, and B4 in Appendix B show the wind roses at different heights in spring, summer, autumn, and winter.
In the results of the correlation coefficients (R) in Table A1 and Table A2, Test 1 had two records that did not pass the significance test (10 m Tower 6 Jul. and 50 m Tower 5 Jan.). This is because (1) the amount of data was too small (790 and 63) and (2) the correlation coefficient was too small. The other correlation coefficients all passed the significance test, with more than 99% confidence.
After averaging the results over the different towers, we obtained the average distribution of RMSE, IA, and R. Figure 7 shows the seven-tower average RMSE, IA, and R of Tests 1, 2, and 3, at 10, 50, and 70 m. As can be seen from Figure 7, the results of Test 2 and Test 3 are better than those of Test 1 on all three indices. Both conventional and satellite data can improve wind speed simulation. However, Test 3 shows smaller improvements in RMSE, IA, and R than Test 2. Compared with the conventional data, satellite data have a wider geographical coverage, and their improvement of wind speed simulation is more significant.

Wind Speed Simulation Results Analysis
From Table A1, Table A2 and Table A3, we can find that the RMSE of the model simulation results varied greatly in different months. In some towers, the gap between the different months even reached 2 m/s (10 m for Tower 2; 70 m for Tower 1) and, in most cases, there was at least a 0.5 m/s gap. Like RMSE, the R also changed greatly with the month. Furthermore, we also found that RMSE and R change a lot at different heights in some cases (Tower 2 December; Tower 3 December).
Stensrud et al. [33] compared the MM4 [34] model output with observed temperature and found that there is systematic bias in the NWP model. In order to analyze the systematic bias, we calculated the mean wind speed of Test 1, Test 2, and the observation data in each month, and we obtained the wind speed anomalies by using the following equation:

$$V' = V - \bar{V}$$

where $V'$ is the wind speed anomaly, $V$ is the wind speed, and $\bar{V}$ is the mean wind speed in each month. In order to analyze the wind speed simulation results in Test 1 and to assess the performance of the WRF model on wind speed simulation, we calculated the RMSE, IA, and R by using both the wind speed and the wind speed anomalies of Test 1 and the observations. Figure 8 shows the average indices from the seven towers. We also calculated the average bias of the mean speed of Test 1 and Test 2; the results are shown in Figure 9. In Figure 8, in the same month and at the same height, the IA is similar between wind speed and wind speed anomalies, but the RMSE can sometimes be very different. The difference in RMSE between wind speed and wind speed anomalies can be caused by systematic bias of the model's simulations: removing the monthly means removes the mean bias, so the larger the systematic bias, the larger the RMSE gap. In the months with large RMSE values, the RMSE gap is always large, indicating that part of the error is caused by systematic bias.
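The relation between the two RMSE values can be made explicit: if each series is reduced to anomalies about its own mean, the squared RMSE of the raw wind speed equals the squared mean bias plus the squared RMSE of the anomalies. The Python sketch below checks this identity on a small example; the function names and the sample numbers are hypothetical and serve only as an illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def anomalies(series):
    """Subtract the series mean (here standing in for the monthly mean)."""
    m = mean(series)
    return [v - m for v in series]


def decompose_error(model, obs):
    """Return (total RMSE, mean bias, RMSE of the anomalies).

    The three satisfy total**2 == bias**2 + anomaly_rmse**2.
    """
    total = rmse(model, obs)
    bias = mean(model) - mean(obs)
    anomaly_rmse = rmse(anomalies(model), anomalies(obs))
    return total, bias, anomaly_rmse


# Made-up numbers: a model that overestimates by 2 m/s on average.
total, bias, anomaly_rmse = decompose_error([5.0, 7.0, 6.0, 9.0],
                                            [3.0, 6.0, 4.0, 6.0])
```

In this example the mean bias accounts for most of the total error, which corresponds to the situation described for the months with a large gap between the two RMSE values.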
In Figure 8, the RMSE is less than 3 m/s in May, June, July, August, and September, with the lowest value in August. Among the IA results, August also has the highest value, and the values of April and May are lower than those of the other months. The results of R are similar to those of IA, with the highest value in August and the lowest values in April and May. From these distributions of the indices, the simulation results are generally best in summer and worst in spring. From Figure B1 in Appendix B, we can see that the main wind directions in spring are east, south, and southeast, and the wind speed distribution is particularly dense in some directions, namely from ocean to land. We can infer that the poor performance of the spring simulation may be caused by the wind from the ocean. However, in summer (Figure B2), especially in July and August, although there are winds from the ocean direction, the wind is still distributed over many directions.
From the performance of RMSE, MAE, IA, and R in winter, we can see that, although the RMSE and MAE values are large in winter, IA and R are also large. From the RMSEs of wind speed and wind speed anomalies, we can find that the gap between them in winter is larger, when compared with other months. Figure 9 also shows that, in winter, the bias of mean speed is larger. The results indicate that the wind speed simulation has large systematic bias in winter, but since the IA and R are also large in winter, the wind speed pattern can be simulated well. Additionally, as is seen in Figure 8 and Figure 9, in some other months like March, April, June, July, and November, there also exists considerable systematic bias.
The simulation results change less with height than with season. In winter, the RMSE at 10 m is larger than the RMSE at 50 and 70 m, while it is smaller in other seasons. The R and IA results also show that the 10 m simulation performed better in winter. From Appendix B, we can see that the wind speed distribution at different heights is basically the same, and the wind speed at 10 m is only slightly smaller than that at 50 and 70 m. , Test 2 reduced the RMSE by more than 1 m/s. The significant reduction in RMSE indicates that the bias of wind speed simulation becomes smaller after the data assimilation is used.

Data Assimilation Results Analysis
The Index of Agreement results of 10, 50, and 70 m (Table A1, Table A2 and Table A3) show that, except in some cases (10 m Tower 7 May, 10 m Tower 7 Jul., 10 m Tower 7 Nov., 50 m Tower 7 Jan., 50 m Tower 7 Feb., and 50 m Tower 7 Mar.), Test 2 has a larger value of IA than Test 1 in the rest of the cases. The increments of IA vary from 0 to 0.2, which is a significant improvement of the wind speed simulation.
In the R results at 10, 50, and 70 m (Table A1, Table A2 and Table A3), some cases show a large difference between Test 1 and Test 2. The increase of R indicates that satellite data assimilation significantly improved the correlation between simulation results and observations. We calculated the average RMSE reduction and the average increases in IA and R of Test 2 relative to Test 1; the results for wind speed and wind speed anomalies are shown in Figure 10.

Figure 10. Difference between Test 2 and Test 1 results; "10 m", "50 m", and "70 m" are the reduced root-mean-square error (RMSE), increased index of agreement (IA), and increased correlation coefficient (R) results of wind speed at 10, 50, and 70 m. The "anomalies" histograms are the results of the reduced RMSE, increased IA, and increased R calculated using wind speed anomalies in Test 1 and Test 2.
The RMSE results in Figure 10 show larger RMSE reductions in March, April, May, July, and October, and smaller reductions in November, December, January, February, and June. The results of IA and R are roughly the same as those of RMSE. Data assimilation can significantly improve the wind simulation results in March-May and July-October.
By comparing the results of wind speed and wind speed anomalies, we can find that the reduced RMSE values have gaps between the two results, especially in March, April, and July, when the reduced RMSE values of the anomalies results are smaller than the wind speed results. Moreover, Figure 9 shows that Test 2 has less mean speed bias than Test 1. This means that, in these months, some systematic bias was corrected by data assimilation.
From Figures B1, B2, and B3 in Appendix B, we can find that during March-May and July-October, the main wind directions were south, southeast, and east, while the main wind direction during November-February was north, and the north wind during November-February had high speed and was very stable. The northerly wind may be caused by the winter monsoon. The winter monsoon is driven by large-scale circulation, so the effect of data assimilation may be limited.
From Figure 10, we can find that, in March-May and July-October, although the results in Test 1 differ greatly in different months, after data assimilation, the differences in Test 2 become smaller. Figure 10 shows that, compared with Test 1, Test 2 improved a lot in March, April, May, and October. Data assimilation can solve some of the bad cases of simulations in spring and autumn.
For the performance of data assimilation at different heights, we found that, compared with the lower level (10 m), most cases at the higher levels (50 and 70 m) have larger RMSE reductions and increments of IA and R. This result indicates that, through data assimilation, simulation results at higher levels improved more than those at lower levels. It can be seen from Figure 4 that the wind speed of 10 m is smaller than the wind speed of 50 and 70 m, so the error reduction is small. At the same time, the wind speed of 10 m is affected by the terrain, and the data assimilation has a greater effect on 50 and 70 m.
The incremental field can reflect the dynamic adjustments from data assimilation. To investigate the incremental field between Test 1 and Test 2, the MAD was calculated as follows:

$$\mathrm{MAD} = \frac{1}{N}\sum_{i=1}^{N}\left|x_i^{T2} - x_i^{T1}\right|$$

where $N$ is the total grid number in domain 03; $x_i^{T2}$ is the U or V wind speed of Test 2; and $x_i^{T1}$ is the U or V wind speed of Test 1. Figure 11 shows the vertical distribution of the MAD between Test 1 and Test 2 at 00, 06, 12, and 18 UTC. At each time, the MAD increased with height, reached its maximum value at around 200-300 m, and then decreased with height. The assimilation of satellite data therefore has effects in the troposphere rather than just improving the near-surface layers. Moreover, the MAD of the V component of wind speed is larger than that of the U component, especially at 12 and 18 UTC. In our study case, the V component of wind speed is along the direction of the sea-land breeze, possibly because satellite data assimilation improved the temperature and pressure fields and thereby affected the simulation of the sea-land breeze.
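A vertical MAD profile of this kind can be computed level by level, as in the following Python sketch. The representation of each model level as a nested list of grid values and the function names are assumptions made for illustration; the sketch only shows the averaging of absolute differences over all grid points of a level.

```python
def mean_absolute_difference(field_a, field_b):
    """MAD between two 2-D fields given as nested lists of equal shape."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(field_a, field_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def mad_profile(levels_a, levels_b):
    """MAD at each vertical level; inputs are lists of 2-D fields."""
    return [mean_absolute_difference(a, b)
            for a, b in zip(levels_a, levels_b)]
```

Calling `mad_profile` separately on the U and V fields of the two tests at each output time would give one profile per component and time, the layout described for Figure 11.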

Discussion
In previous research, the simulation of wind speeds in coastal wind farm areas was mainly based on direct simulation by WRF models driven with analysis data [27,35–41]. Since the parameters chosen for WRF can greatly affect the wind simulation, many studies have focused on the selection and improvement of the physical parameterization schemes of the WRF model [5,7,8]. However, another key factor affecting the model results is the initial field and the real-time update of the model fields generated by the data assimilation system. Due to the uncertainty of wind speed changes near the surface, data assimilation has not been widely used in wind speed simulation. Our work used satellite data assimilation and improved the wind speed simulation results.
Wind resource assessment tasks require high quality of both wind speed distribution and time series of wind speed. Our results show that the Weibull distribution of Test 2 is closer to the observation than that of Test 1. Additionally, some statistical results were improved after data assimilation, indicating that the time series of wind speed can also be more accurate.
Application of data assimilation technology for wind speed simulation is a new trend in recent years. Our results show that data assimilation in different seasons has great differences in the improvement of wind simulation and the differences depend on the wind condition, and both systematic bias and random error can be corrected by satellite data assimilation.
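The distinction between systematic bias and random error can be made concrete with a small sketch. It assumes the standard definitions of RMSE and Pearson's R, assumes Willmott's form for the index of agreement (the definition used in this paper lies outside this section), and treats a "wind speed anomaly" as the series minus its own mean. All function names and numbers are hypothetical.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def rmse(sim, obs):
    """Root-mean-square error between simulation and observation."""
    return math.sqrt(mean([(s - o) ** 2 for s, o in zip(sim, obs)]))


def index_of_agreement(sim, obs):
    """Willmott's index of agreement (assumed form)."""
    o_bar = mean(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((abs(s - o_bar) + abs(o - o_bar)) ** 2 for s, o in zip(sim, obs))
    return 1.0 - num / den


def pearson_r(sim, obs):
    """Pearson correlation coefficient."""
    s_bar, o_bar = mean(sim), mean(obs)
    cov = sum((s - s_bar) * (o - o_bar) for s, o in zip(sim, obs))
    var_s = sum((s - s_bar) ** 2 for s in sim)
    var_o = sum((o - o_bar) ** 2 for o in obs)
    return cov / math.sqrt(var_s * var_o)


def anomalies(xs):
    """Remove the series mean, leaving only the fluctuating part."""
    m = mean(xs)
    return [x - m for x in xs]


# Made-up wind speeds (m/s): the simulation is too fast on average.
obs = [5.0, 7.0, 6.0, 9.0, 8.0]
sim = [6.5, 8.0, 8.0, 10.0, 10.0]

bias = mean(sim) - mean(obs)                        # systematic part
rmse_total = rmse(sim, obs)                         # bias + random error
rmse_anom = rmse(anomalies(sim), anomalies(obs))    # random part only

print(bias, rmse_total, rmse_anom)
print(index_of_agreement(sim, obs), pearson_r(sim, obs))
```

In this decomposition the squared RMSE equals the squared bias plus the squared anomaly RMSE, so a drop in the raw RMSE with little change in the anomaly RMSE points to a bias correction, while a drop in the anomaly RMSE points to reduced random error. R is unchanged by removing the means.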

Conclusions
In this paper, a one-year wind speed simulation was performed in the wind farm area of Yangjiang. Through the WRF-3DVar system, satellite data assimilation was applied to wind speed simulation in wind resource assessments. The errors and correlations of both wind speed and wind speed anomalies in the two tests were compared using three indices: RMSE, IA, and R. Finally, we analyzed the differences of each index in different seasons. The main conclusions are as follows.
The Weibull distribution of Test 2 is closer to the observations than that of Test 1; after applying data assimilation, the simulated distribution of wind speed is more accurate.
According to the simulation results for the different seasons, the wind simulation in the coastal areas of Guangdong performs best in summer and worst in spring. This may be because the spring wind mainly comes from the ocean, and because the WRF model has more systematic bias in winter and spring.
Compared to conventional observations, the satellite data have greater geographic coverage, especially over the sea. The simulation results using satellite data assimilation have lower wind speed error and better agreement with the observation data. The RMSE is greatly reduced in all seasons except winter. Comparing the results for wind speed and wind speed anomalies shows that both the systematic bias and the random error were corrected. The IA and R between simulation results and observations are significantly improved in the months with very low correlations (April and May).
Because conventional observations are mainly distributed at inland synoptic observation stations, conventional data assimilation performs worse than satellite data assimilation.
The improvements in RMSE, IA, and R show that data assimilation improves the wind speed simulation in spring and autumn, while the improvements are limited in winter. Data assimilation can significantly improve simulations during periods of poor simulation performance. The wind distribution of the model results shows that the wind direction in winter was the same as that of the winter monsoon, and the systematic bias of the model was large during winter.
The wind speed improvements from data assimilation at the lower level (10 m) were less significant than those at the upper levels (50 and 70 m). This may be because the wind near 10 m is strongly affected by the terrain.
The current methods for wind resource assessment mainly use numerical models to simulate wind speed. This work shows that data assimilation can be used to reduce simulation errors (both systematic bias and random errors) and to improve the correlation between simulation results and observations. Furthermore, the combined WRF-3DVar approach can be applied in wind resource assessment for wind farm site selection and other applications.

Acknowledgments:
The wind observation data of the wind towers were provided by China Huaneng Group Co., Ltd. (CHNG), and the software for decoding the tower data was also provided by CHNG. The authors are very grateful for the observation data and the decoding software provided by CHNG. The FNL data were provided by the CISL Research Data Archive (RDA) website (https://rda.ucar.edu/datasets/ds083.2/). The surface and upper-air observation data were also provided by CISL RDA (surface: https://rda.ucar.edu/datasets/ds461.0/ and upper air: https://rda.ucar.edu/datasets/ds351.0/). The authors are also grateful to the providers of these data.

Conflicts of Interest:
The authors declare no conflicts of interest.

[Table residue: per-tower statistics (Tower 1 to Tower 7) for each test by month; values not recoverable from the source.]
* The correlation coefficient has a confidence level of more than 99%.