Improving the Near-Surface Wind Forecast around the Turpan Basin of the Northwest China by Using the WRF_TopoWind Model

Wind energy is a renewable and clean energy source that has attracted increasing attention worldwide. Northwest China is a region with the most abundant wind energy not only in China, but also in the whole world. To achieve the goal of carbon neutrality, there is an urgent need to make full use of wind energy in Northwest China and to improve the efficiency of wind power generation systems in this region. As the forecast accuracy of the near-surface wind is crucial to the efficiency of wind-generated electricity, improving the near-surface wind forecast is of great importance. This study conducted the first test of incorporating the subgrid surface drag into the near-surface wind forecast under the complex terrain conditions over Northwest China by using the two TopoWind models included in newer versions of the Weather Research and Forecasting (WRF) model. Based on three groups of forecasts (i.e., Control run, Test 01 and Test 02; 28 runs per group), each started at 12:00 UTC of each day during 1–28 October 2020 and run for 48 h, it was shown that, overall, both TopoWind models could improve the near-surface wind speed forecasts under the complex terrain conditions over Northwest China, particularly by reducing the errors in the forecast wind-speed magnitude. In addition to the wind forecast, the forecasts of sea level pressure and 2-m temperature were also improved. Geographical features (wind-farm stations located south of the mountain tended to have more accurate forecasts) and weather systems were found to be crucial to forecast accuracy. Good forecasts tended to appear when the simulation domain was mainly controlled by high-pressure systems with the upper-level jet far from it.


Introduction
Wind energy (WE) is the kinetic energy associated with air flow [1,2]. This type of energy is renewable and clean, and it has received increasing attention and development all over the world [3,4]. However, the disadvantages of WE are also obvious, as it features remarkable randomness, diversity and uncontrollability [5,6]. Therefore, effective utilization of WE requires accurate knowledge of the wind field. There are roughly two ways to obtain information on the wind field. One is to observe the wind field directly, and the other is to predict it by using numerical models [7–9]. The former is highly accurate but cannot provide information on the future wind field; the latter inevitably contains errors but can provide future information [10,11].
Based on a series of equations that describe the atmosphere (e.g., thermodynamic and dynamic equations of the atmosphere), numerical models can gain the information of

The multi-scale processes related to the lower-level wind speed in Northwest China were simulated by using the WRF version 4.1.1. A total of three model configurations were used in this study, as shown in Table 1. Of these, the control run used the configuration that showed overall the best performance in forecasting the near-surface winds in a series of comparisons (in our operational forecasting, we compared a total of 10 model configurations to determine the best one). The Test 01 run was the same as the control run but used the TopoWind model with topo_wind = 1; the Test 02 run was the same as the control run but used the TopoWind model with topo_wind = 2. These two runs were used to evaluate the performances of the two TopoWind models (discussed in the introduction) in improving the near-surface wind forecasts. Only one domain, with 169 × 169 grid points (Figure 1), 51 vertical levels, and a horizontal resolution of 3 km × 3 km, was used in the simulations. The simulation period was from 12:00 UTC 01 to 12:00 UTC 28 October 2020, with the WRF model started at 12:00 UTC each day (there were 28 runs for each model configuration) and run for 48 h.

Table 2. Performances of Test 01 and Test 02 relative to that of control run (%). u10 = 10-m zonal wind; v10 = 10-m meridional wind; spd10 = 10-m wind speed; t2 = 2-m temperature; slp = sea level pressure; z50 = 500-hPa geopotential height; t50 = 500-hPa temperature.
From WRFv3.4 on, the WRF provided a TopoWind model that was associated with the YSU PBL scheme [33] to improve the representation of topographic effects on the near-surface winds. There was a total of three options for the YSU PBL scheme: (i) topo_wind = 0, which did not include the additional topographic effects from the TopoWind model in the near-surface wind calculation; (ii) topo_wind = 1, which included the standard deviation of the subgrid-scale orography in the near-surface wind calculation; and (iii) topo_wind = 2, which calculated the near-surface winds in the same way as topo_wind = 1, but additionally enhanced the calculation of the friction velocity by including the subgrid terrain variance. More detailed information about the TopoWind models can be found in the literature [10,31,32].
In this study, the root mean square error (RMSE) and the correlation coefficient (CORC) were used to evaluate the performances of different model configurations:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(F_i - O_i\right)^2}$$

$$\mathrm{CORC} = \frac{\sum_{i=1}^{N}\left(F_i - \overline{F}\right)\left(O_i - \overline{O}\right)}{\sqrt{\sum_{i=1}^{N}\left(F_i - \overline{F}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(O_i - \overline{O}\right)^2}}$$

where N is the total number used for calculation (e.g., for evaluation of the 24-h forecast, N = 24, as we used hourly output and hourly ERA5/observation for evaluation); F_i is the forecast at time i; O_i is the reanalysis/observation at time i; and the overbar denotes the mean over the N times. The RMSE was utilized to evaluate the forecast's performance in reproducing the magnitude of a meteorological variable, and the CORC was used to evaluate the forecast's performance in representing its variational trend. For calculating RMSE and CORC, because the resolution of the WRF simulation was much higher than that of the ERA5 reanalysis data, we first interpolated the WRF output to a 0.25° × 0.25° resolution (by using bilinear interpolation), which was the same as that of the ERA5 reanalysis data. Then, for the points within the simulation domain (the blue box shown in Figure 1b), (i) we calculated the RMSE and CORC at each point; and (ii) we averaged the RMSEs and CORCs over all points within the simulation domain to get the area-averaged RMSE and CORC for comparing the performances of different model configurations. For the comparison using wind-farm observations, a similar approach was taken, in which we interpolated the WRF output to the 24 wind-farm observational stations (Figure 1b) by using bilinear interpolation.
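As an illustration of the two verification metrics described above, the following is a minimal sketch assuming the standard RMSE and Pearson correlation definitions applied to paired hourly series at a single grid point or station. The function names `rmse` and `corc` are hypothetical and are not taken from the study's own code.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between paired forecast/observed values."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(forecast)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def corc(forecast, observed):
    """Pearson correlation coefficient between paired forecast/observed values."""
    if len(forecast) != len(observed) or len(forecast) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(forecast)
    f_mean = sum(forecast) / n
    o_mean = sum(observed) / n
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(forecast, observed))
    f_var = sum((f - f_mean) ** 2 for f in forecast)
    o_var = sum((o - o_mean) ** 2 for o in observed)
    if f_var == 0 or o_var == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(f_var * o_var)
```

For a 24-h forecast with hourly output, each input would hold N = 24 values; the area-averaged scores would then be the mean of the per-point results.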

Evaluation of the 10-m Wind Speed
As the primary purpose of this study is to evaluate the performance of the TopoWind model in forecasting the near-surface wind (we used 10-m and 70-m wind speed as representatives) under the complex terrain conditions over Northwest China, we first evaluated the near-surface wind forecast. From Figure 2 it can be seen that, for the spd10 forecast within 0-24 h (24-h forecast for convenience), the CORC and RMSE varied notably with time. Overall, for the 28 control runs (Section 2), the largest CORC (~0.78) appeared in the forecast started from 12:00 UTC 16 October (Figure 2a), the smallest CORC (~0.57) appeared in the forecast started from 12:00 UTC 09 October, and the mean CORC among the 28 control runs was ~0.68. The variation of the control runs' RMSE showed an obvious inverse correlation (the correlation coefficient was around −0.69) with their CORC (Figure 2b), implying that forecasts with larger CORCs tended to have smaller RMSEs, and vice versa. Detailed comparisons showed that the smallest RMSE (~1.2 m s−1) appeared in the forecast started from 12:00 UTC 25 October (Figure 2b), the largest RMSE (~2.6 m s−1) appeared in the forecast started from 12:00 UTC 06 October, and the mean RMSE was ~1.6 m s−1. Thus, it can be concluded that the best/worst forecast in terms of CORC and the best/worst forecast in terms of RMSE were usually different. In order to distinguish good forecasts from bad forecasts, we used the following definition: within a series of forecasts, a forecast was regarded as a good forecast if (i) its CORC was larger than the mean CORC of these forecasts, and (ii) its RMSE was at the same time smaller than the mean RMSE of these forecasts; otherwise, it was regarded as a bad forecast. As Figure 2 shows, ~50% of the control runs were good forecasts.
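The good/bad definition above can be sketched as follows. This is only an illustration under the stated criterion (CORC above the series mean and RMSE below the series mean); the function name `classify_forecasts` and the sample values in the usage note are hypothetical.

```python
def classify_forecasts(corcs, rmses):
    """Label each run 'good' if its CORC exceeds the mean CORC of the series
    and its RMSE is below the mean RMSE of the series; otherwise 'bad'."""
    if len(corcs) != len(rmses) or not corcs:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_corc = sum(corcs) / len(corcs)
    mean_rmse = sum(rmses) / len(rmses)
    return [
        "good" if (c > mean_corc and r < mean_rmse) else "bad"
        for c, r in zip(corcs, rmses)
    ]
```

For example, with illustrative CORCs of 0.78, 0.57, 0.70, 0.65 and RMSEs of 1.3, 2.6, 1.9, 1.4 m s−1, only the first run satisfies both conditions, since the third run has an above-average CORC but also an above-average RMSE.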
A similar situation was found in the spd10 forecast within 24-48 h (48-h forecast for convenience; Figure 3): (i) those forecasts with larger/smaller CORCs tended to have smaller/larger RMSEs (the correlation coefficient was around −0.63); (ii) the best/worst forecast in terms of CORC and the best/worst forecast in terms of RMSE were usually different; and (iii) ~50% of the control runs were good forecasts.

Figure 2. The area-averaged correlation coefficient (CORC) for the forecast of 10-m wind speed (started at 12:00 UTC of each day) within 0-24 h (a); and the area-averaged root mean square error (RMSE) associated with the 10-m wind speed forecast mentioned above (b). Black, blue, and red solid lines represent the calculation results of the Control, Test 01 and Test 02 runs, respectively; and the black, blue and red dashed lines are the temporal means of the values represented by black, blue, and red solid lines, respectively. Good (i.e., CORCs of Test 01 and Test 02 are above their corresponding mean CORCs and RMSEs of Test 01 and Test 02 are below their corresponding mean RMSEs) and bad forecasts (the rest) for Test 01 and Test 02 are marked by green and purple lines, respectively.

Figure 3. As in Figure 2, but for the forecast within 24-48 h.
For the 24-h/48-h forecasts, compared to each of the 28 control runs, almost all Test-01 and Test-02 runs showed an increase in their CORCs and a decrease in their RMSEs (Figures 2 and 3), with the changes of the Test-02 runs larger than those of the Test-01 runs. This means that, overall, both TopoWind models could improve the 10-m wind speed forecasts within 0-24 h and 24-48 h, and the topo_wind = 2 option performed better than the topo_wind = 1 option. In order to compare the improvements of the Test-01 and Test-02 runs relative to the control runs, we defined the relative performance (RP) as follows:

$$\mathrm{RP}_{\mathrm{CORC}} = \frac{\overline{\mathrm{CORC}}_{\mathrm{test}} - \overline{\mathrm{CORC}}_{\mathrm{control}}}{\overline{\mathrm{CORC}}_{\mathrm{control}}} \times 100\%, \qquad \mathrm{RP}_{\mathrm{RMSE}} = \frac{\overline{\mathrm{RMSE}}_{\mathrm{test}} - \overline{\mathrm{RMSE}}_{\mathrm{control}}}{\overline{\mathrm{RMSE}}_{\mathrm{control}}} \times 100\%$$

where the subscripts CORC and RMSE stand for the factors that are calculated, the overbar denotes the mean value among all control, Test-01, or Test-02 runs, respectively, the subscript "test" stands for Test 01 or Test 02, and the subscript "control" represents the control run. From its definition, the RP represents changes in forecast accuracy. As shown in Table 2, compared to the control run, on average, for the forecast within 0-24 h, Test 01 increased the mean CORC by ~6% and reduced the mean RMSE by ~4%; and Test 02 increased the mean CORC by ~9% and reduced the mean RMSE by ~13%. For the forecast within 24-48 h, Test 01 increased the mean CORC by ~4% and reduced the mean RMSE by ~6%; and Test 02 increased the mean CORC by ~9% and reduced the mean RMSE by ~14%. Therefore, it can be concluded that (i) both the topo_wind = 1 and topo_wind = 2 options could improve the 10-m wind speed forecast (particularly by reducing the RMSE) under the complex terrain conditions over Northwest China (Table 2); (ii) the improvements were similar for the 24-h (4-13%) and 48-h forecasts (4-14%); and (iii) the topo_wind = 2 option (9-14%) showed a more notable improvement than the topo_wind = 1 option (4-6%).
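The relative performance can be illustrated with a short sketch. It assumes that RP is the percentage change of the test-run mean relative to the control-run mean, so that an improvement appears as a positive value for CORC and a negative value for RMSE; this sign convention and the function name `relative_performance` are assumptions, not something confirmed by the text.

```python
def relative_performance(test_values, control_values):
    """Percentage change of the mean of the test runs relative to the mean of
    the control runs (assumed convention: (test - control) / control * 100)."""
    if not test_values or not control_values:
        raise ValueError("inputs must be non-empty")
    mean_test = sum(test_values) / len(test_values)
    mean_control = sum(control_values) / len(control_values)
    if mean_control == 0:
        raise ValueError("control mean must be non-zero")
    return (mean_test - mean_control) / mean_control * 100.0
```

Under this convention, a reduction of the mean RMSE from 2.0 to 1.74 m s−1 (illustrative numbers only) would give an RP of about −13%.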

Evaluation of the 10-m Zonal and Meridional Wind
Wind is a vector that can be decomposed into the zonal wind and the meridional wind. As illustrated in Table 3, in terms of CORC, for all runs, the v10 forecast showed a much larger correlation coefficient with the 10-m wind speed than the u10 forecast did, whereas, in terms of RMSE, the u10 forecast had a larger correlation coefficient. This means that the forecast of v10 was more important to the variational trend of the 10-m wind speed and the forecast of u10 was more important to the magnitude of the 10-m wind speed. Overall, Test 02 had the largest correlation coefficients of CORCs and RMSEs of u10 and v10 for both the 24-h and 48-h forecasts (Table 3), whereas those of the control run were the smallest.

Table 3. Correlation coefficients between the CORCs/RMSEs of various variables (i.e., u10 and v10) and the CORC/RMSE of the 10-m wind speed. u10 = 10-m zonal wind; v10 = 10-m meridional wind.

Figure 2 shows that, on average, the CORC of the 10-m wind speed (Figure 2) was between the CORC of u10 (Figure 4) and the CORC of v10 (Figure 5), with the former smaller than the latter. This was true for the Control run, Test 01 and Test 02, as both the zonal and meridional winds were important to the variational trend of the 10-m wind speed. In contrast, the RMSE of the 10-m wind speed (Figure 2) was smaller than those of u10 and v10 (Figures 4 and 5). This was because the 10-m wind speed was larger than u10 or v10. From Table 2 it can be found that, for the 24-h forecasts, Test 01 increased the CORCs of the u10 and v10 forecasts by ~3%, and reduced the RMSEs of the u10 and v10 forecasts by ~4%. Test 02 showed a much better performance, as it increased the CORCs of the u10 and v10 forecasts by ~8%, and reduced the RMSEs of the u10 and v10 forecasts by ~11-13%. Similar situations were found for the 48-h forecasts (cf. Figures 4-7), except that the RMSEs of u10 and v10 showed more notable improvements than those of the 24-h forecasts (Table 2). Therefore, it can be concluded that, on average, both TopoWind models could improve the forecasts of u10 and v10 (particularly for the RMSE), with the topo_wind = 2 option showing a better performance than the topo_wind = 1 option.
Figure 5. The area-averaged correlation coefficient (CORC) for the forecast of v10 (started at 12:00 UTC of each day) within 0-24 h (a); and the area-averaged root mean square error (RMSE) associated with the v10 forecast mentioned above (b). Black, blue, and red solid lines represent the calculation results of the Control, Test 01 and Test 02 runs, respectively; and the black, blue and red dashed lines are the temporal means of the values represented by black, blue, and red solid lines, respectively.

Figure 6. The area-averaged correlation coefficient (CORC) for the forecast of u10 (started at 12:00 UTC of each day) within 24-48 h (a); and the area-averaged root mean square error (RMSE) associated with the u10 forecast mentioned above (b). Black, blue, and red solid lines represent the calculation results of the Control, Test 01 and Test 02 runs, respectively; and the black, blue and red dashed lines are the temporal means of the values represented by black, blue, and red solid lines, respectively.

Evaluation of the 70-m Wind Speed
This study used a total of 24 wind-farm observational stations (Figure 8c) to evaluate the forecast of the 70-m wind speed (70 m being the height of the wind-farm wind observations). The simulated 70-m wind speed was produced by vertical interpolation using the wind speeds at the two sigma levels that were closest to the height of 70 m. As Figures 8a,b and 9a,b show, for the same model configuration (e.g., the Control run, Test 01 or Test 02 shown in Table 1), the performance differed from station to station. The different geographical positions of these stations (Figure 8) were a key reason for their different forecast accuracy. Comparisons among all 24 stations show that stations #12-#16 generally had larger CORCs and smaller RMSEs than the other stations (Figures 8a,b and 9a,b). These stations were mainly located south of the mountain, where the terrain was below 1000 m. In addition, different weather systems (i.e., systems that produced the wind) were also an important reason for the different forecast accuracy at different stations.
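The vertical interpolation to 70 m mentioned above can be sketched as follows. The text does not state the interpolation method, so linear interpolation in height between the two bracketing model levels is an assumption here, and the function name and argument names are hypothetical.

```python
def interpolate_wind_to_height(z_lower, spd_lower, z_upper, spd_upper,
                               target_height=70.0):
    """Linearly interpolate wind speed between two model levels.

    z_lower, z_upper: heights (m above ground) of the two levels closest to
    the target height; spd_lower, spd_upper: wind speeds (m/s) at those levels.
    Linear interpolation in height is an assumption of this sketch.
    """
    if z_upper == z_lower:
        raise ValueError("the two levels must have different heights")
    weight = (target_height - z_lower) / (z_upper - z_lower)
    return spd_lower + weight * (spd_upper - spd_lower)
```

For instance, with levels at 50 m and 90 m carrying wind speeds of 6 and 8 m s−1 (illustrative values), the 70-m wind speed under this assumption is 7 m s−1.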
Figure caption (partially recovered): box plots for Test 01 and Test 02, where the solid line within a box marks the median value, the cross shows the mean value, the extent of the boxes corresponds to 25% (first quartile) and 75% (third quartile), and the whiskers correspond to (first quartile) − 1.5 × (interquartile range) and (third quartile) + 1.5 × (interquartile range).
In terms of CORC, for the 24-h forecast, the CORCs of the control run varied from 0.63 (wind-farm station #22) to 0.79 (#15), with a mean value of ~0.69 (Figure 8a). The CORCs of Test 01 varied from 0.65 (station #22) to 0.80 (#15), with a mean value of ~0.70. The CORCs of Test 02 varied from 0.67 (station #21) to 0.80 (#15), with a mean value of ~0.71. As Figure 9c shows, on average, Test 01 and Test 02 increased the 24-station mean CORC by 1.9% and 3.3%, respectively. For the 48-h forecast, similar situations were found (cf. Figures 8a and 9a), except that the CORCs were mostly smaller than those of the 24-h forecast. Overall, Test 01 and Test 02 increased the 24-station mean CORC by 2.1% and 3.5%, respectively (Figure 9c), which were larger increases than those of the 24-h forecast. All in all, the TopoWind models could improve the forecast of the variational trends of the 70-m wind speed, with the topo_wind = 2 option showing a better performance.
In terms of RMSE, for the 24-h forecast, the RMSEs of the control run varied from 1.9 m s−1 (wind-farm station #15) to 3.2 m s−1 (#5), with a mean value of ~2.6 m s−1 (Figure 8b). The RMSEs of Test 01 varied from 1.9 m s−1 (station #15) to 2.9 m s−1 (#5), with a mean value of ~2.4 m s−1. The RMSEs of Test 02 varied from 1.8 m s−1 (station #15) to 3.0 m s−1 (#5), with a mean value of ~2.3 m s−1. As Figure 9c shows, on average, Test 01 and Test 02 reduced the 24-station mean RMSE by 4.9% and 8.8%, respectively. For the 48-h forecast, similar situations were found (cf. Figures 8b and 9b), except that the RMSEs were mostly larger than those of the 24-h forecast (i.e., forecast accuracy was lower). Overall, Test 01 and Test 02 reduced the 24-station mean RMSE by 5.3% and 9.6%, respectively (Figure 9c), which were larger reductions than those of the 24-h forecast (i.e., the improvement for the 48-h forecast was more notable). All in all, the TopoWind models could improve the forecast of the magnitude of the 70-m wind speed, with the topo_wind = 2 option showing a better performance.

Effects of Different Weather Systems
It is known that the performance of a given model configuration in forecasting the wind field differs notably among weather systems [8,15,34–37]. As this study focused on October 2020, it is necessary to first know the main pattern of the surface wind field during this period. As Figure 10a shows, in terms of the temporal mean state, the simulation domain was governed by different wind regimes. Overall, the northerly winds were dominant, and westerly and easterly winds also appeared, particularly in the regions around 44°N, 91°E and 42°N, 91°E. The winds with larger/smaller speed mainly appeared in the eastern/western section of the simulation domain. In terms of the spatial mean state, the fluctuations of the wind were notable (black line in Figure 10b), which indicated that the surface winds changed significantly. For the zonal wind (red line in Figure 10b), the period occupied by negative values was similar to that occupied by positive values. This means that the simulation domain was controlled alternately by easterly and westerly winds. For the meridional wind (blue line in Figure 10b), negative values appeared much more frequently than positive values, which means that the northerly winds were dominant.
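The sign conventions used in this paragraph (negative zonal wind = easterly, negative meridional wind = northerly) can be made concrete with a small sketch. The function names and sample values are hypothetical and serve only to illustrate how the dominance of northerly winds could be quantified from a time series of domain-mean v10.

```python
import math


def wind_speed(u, v):
    """Wind speed (m/s) from zonal (u) and meridional (v) components."""
    return math.hypot(u, v)


def northerly_fraction(v_series):
    """Fraction of times with a northerly wind, i.e., a negative meridional
    component (air moving from north to south)."""
    if not v_series:
        raise ValueError("input must be non-empty")
    return sum(1 for v in v_series if v < 0) / len(v_series)
```

A fraction well above 0.5 for the meridional series, together with a fraction near 0.5 for the analogous count of negative zonal values, would correspond to the behaviour described for Figure 10b.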
As different weather systems were associated with different wind fields [1,34–37], we summarized the good and bad forecasts according to their background environments. As Figures 2 and 3 show, for the 24-h/48-h forecast started on each day (there were 28 runs in Test 01/Test 02), if the CORCs of Test 01 and Test 02 were above their corresponding mean CORCs (among their corresponding 28 runs), and the RMSEs of Test 01 and Test 02 were below their corresponding mean RMSEs, it was regarded as a good forecast; whereas, if the CORCs of Test 01 and Test 02 were below their corresponding mean CORCs, and the RMSEs of Test 01 and Test 02 were above their corresponding mean RMSEs, it was regarded as a bad forecast. If the forecast was good for both the 24-h and 48-h runs, it was marked with a purple closed circle as shown in Figure 11. If the forecast was bad for both the 24-h and 48-h runs, it was marked with a green closed circle. Comparison of the situations with purple and green circles (Figure 11) shows that, during good forecasts, the simulation domain was mainly controlled by high-pressure systems such as a ridge (Figure 11b) or a closed high-pressure center (Figure 11p), and the upper-level jet was mainly far from the domain (Figure 11k). In contrast, during bad forecasts, the simulation domain was mainly controlled by low-pressure systems such as a trough (Figure 11f), and the upper-level jet was mainly close to the domain (Figure 11s). High-pressure systems were generally more stable than low-pressure systems [1], and the upper-level jet could affect lower-level systems through its secondary circulation (regions close to the upper-level jet were affected notably). These were key reasons for the differences between good and bad forecasts.

Effects of Near-Surface Features
As discussed in Section 3, both TopoWind models could improve the near-surface wind forecast. Possible reasons are discussed in this section. As Figures 12a and 13a illustrate, for both the 24-h and 48-h forecasts, on average, Test 01 and Test 02 showed improvements in forecasting the variational trends of slp (i.e., CORCs increased). In addition, as Figures 12b and 13b depict, the improvements in the forecast of the magnitude of slp were more notable (i.e., RMSEs reduced), particularly for the topo_wind = 2 option. This means that the TopoWind models could improve the slp forecast. This is an important reason for the improvement of the wind speed forecast, as slp was crucial in determining wind speed (through the work done by the pressure gradient force; Holton 2004).
Atmosphere 2021, 12, x FOR PEER REVIEW

Figure 11 (caption partially recovered): green shaded circles mark bad forecasts (both the 24- and 48-h forecasts are bad; see the caption of Figure 2).

From Figures 14 and 15, although the TopoWind models showed no obvious effect in improving the forecast of the variational trends of the 2-m temperature, they contributed notably to improving the forecast of the 2-m temperature's magnitude, particularly for the topo_wind = 2 option. A better forecast of the 2-m temperature field contributed to a better forecast of the near-surface wind speed, as baroclinity (which could be represented by the temperature gradient) could enhance/weaken the near-surface wind speed through the baroclinic energy conversion [1,34]. All in all, it can be concluded that the TopoWind models could improve the forecast of slp and the 2-m temperature field, which finally resulted in the improvements of the near-surface wind speed forecast. Moreover, the topo_wind = 2 option was overall better than the topo_wind = 1 option in the forecast of the near-surface wind. This was mainly because the former included the subgrid terrain variance in calculating the friction velocity in addition to the procedures added by the latter.

Figure 14. The area-averaged correlation coefficient (CORC) for the forecast of 2-m temperature (started at 12:00 UTC of each day) within 0-24 h (a); and the area-averaged root mean square error (RMSE) associated with the 2-m temperature forecast mentioned above (b). Black, blue, and red solid lines represent the calculation results of the Control, Test 01 and Test 02 runs, respectively; and the black, blue and red dashed lines are the temporal means of the values represented by black, blue, and red solid lines, respectively.

Limitations of This Study
As discussed above, this study has shown the ability of the two TopoWind models to improve the forecast accuracy of the near-surface winds under the complex terrain conditions over Northwest China. To better understand this conclusion, one needs to know the limitations of this study. (i) The result was based on a forecast period of only around a month. As different weather systems were crucial to the forecast accuracy, we suggest conducting more tests in the future. These tests should cover more types of weather systems, and should also consider the influences of seasonal variations (this study only focused on autumn). (ii) The simulation domain of this study was small relative to Northwest China; therefore, more regions in Northwest China should be used in the evaluations of the TopoWind models. Properly addressing (i) and (ii) will contribute to the final improvement of the near-surface wind forecast over Northwest China. (iii) We only focused on the near-surface features to understand why the TopoWind models could improve the near-surface wind forecast. In fact, as the weather systems that caused the strong winds usually had a deep vertical extent, more vertical levels (such as 700 hPa and 500 hPa) should also be used in the analyses. This will enhance the understanding of the TopoWind models, which is useful for improving them in the future.

Conclusions
Northwest China is a region with the most abundant WE in East Asia and even the world. Because of the WE's notable features of randomness, diversity and uncontrollability, the effective utilization of WE needs accurate near-surface wind forecasts. Although the WRF model has been widely used for wind forecasts worldwide, its forecasts of the near-surface wind still show notable errors. In order to improve the simulation accuracy of the low-level wind speed, two TopoWind models were developed and added to the YSU PBL scheme. This study conducted the first test to check whether the two TopoWind models could improve the near-surface wind prediction under complex terrain conditions over Northwest China. This contributes to making full use of WE in Northwest China and to improving the efficiency of wind power generation systems.
Based on three groups (each group had 28 runs) of forecasts (i.e., Control run, Test 01 and Test 02) started at 12:00 UTC of each day (and run for 48 h) during the period of 01-28 October 2020, we found that: (i) the forecast accuracy of the 10-m wind speed varied from day to day, with ~50% of the forecasts being good forecasts; those forecasts with larger CORCs tended to have smaller RMSEs (i.e., they were inversely correlated), and vice versa; both TopoWind models could improve the 10-m wind speed forecasts under the complex terrain conditions over Northwest China, particularly by reducing the RMSE, and the topo_wind = 2 option showed a better performance (improving the forecast accuracy by 9-14%) than the topo_wind = 1 option (4-6%). (ii) The forecast of the 10-m meridional wind was more important to the forecast of the variational trend of the 10-m wind speed, and the forecast of the 10-m zonal wind was more important to the forecast of the magnitude of the 10-m wind speed; on average, both TopoWind models could improve the forecasts of the 10-m meridional and zonal winds (particularly by reducing the RMSE), with the topo_wind = 2 option showing a better performance (improving the forecast accuracy by 7-14%) than the topo_wind = 1 option (3-6%). (iii) Geographical features (stations located south of the mountain, where the terrain was below 1000 m, tended to have more accurate forecasts) were crucial to the forecast accuracy of the 70-m wind speed; the two TopoWind models could improve the forecasts of the 70-m wind speed, particularly by lowering the RMSE, with the topo_wind = 2 option showing a better performance (improving the forecast accuracy by 3-10%). (iv) Different weather systems were crucial to the forecast accuracy: good forecasts tended to appear when the simulation domain was mainly controlled by high-pressure systems with the upper-level jet far from it, whereas bad forecasts tended to appear when the simulation domain was mainly controlled by low-pressure systems with the upper-level jet close to it. (v) The two TopoWind models could improve the forecast of sea level pressure (which affected the wind field through the work done by the pressure gradient force) and the 2-m temperature field (which influenced the wind field through the baroclinic energy conversion), which finally resulted in the improvements of the near-surface wind speed forecast.

Data Availability Statement: The ECMWF ERA5 reanalysis dataset presented in this study is openly available in the Copernicus Climate Change Service Climate Data Store: https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5 (accessed on 1 June 2021).