Estimating Air Density Using Observations and Re-Analysis Outputs for Wind Energy Purposes

: A method to estimate air density as a function of elevation for wind energy resource assessments is presented. The current practice of using nearby measurements of pressure and temperature is compared with a method that uses re-analysis data. It is found that using re-analysis data to estimate air density gives similar or smaller mean absolute errors compared to using measurements that were on average located 40 km away. A method to interpolate power curves that are valid for different air densities is presented. The new model is implemented in the industry-standard model for wind resource assessment and compared with the current version of that model and shown to lead to more accurate assessment of the air density at different elevations.


Introduction
Wind power capacity has been growing rapidly over the past decades and reached approximately 539 GW globally in 2017 and continued growth is expected over the next five years [1]. The success of new wind projects is largely dependent on the estimated annual energy production (AEP) and its uncertainty at a proposed wind farm site.
The wind power density at a turbine site depends on the wind speed cubed, so there is usually a strong focus on measuring and modelling the wind speed frequency distribution with an accuracy that is as high as possible [2][3][4]. However, the wind power density and thus the AEP also depends on the air density at hub height, but the estimation of air density has received very little attention in literature.
For the estimation of air density the pressure, P, and the virtual temperature T v at the site have to be known. The virtual temperature is a quantity that is often used by meteorologists; because water vapour is less dense than dry air, it specifies the temperature at which moist air would have the same density as dry air. Because measurements of humidity are rarely available and because the influence of humidity on the air density is small, common standards allow for using the temperature instead of the virtual temperature [5,6]. All these parameters will usually be obtained from measurements at the site. However, processing measurements is a cumbersome process, due to measuring errors and availability issues. In addition, short measurement series are not necessarily representative for the whole period when a turbine is operating.
The Wind Atlas Analysis and Application Program (WAsP) software became the standard software to compute the AEP [7], which is implemented in various commercial wind farm planning software tools, such as windPRO [8]. The WAsP program contains submodels to take into account the effects of surface roughness, elevation and atmospheric stability on the flow to accurately estimate the AEP using nearby wind measurements and has been developed by the Technical University of Denmark (DTU) over several decades.
In WAsP version 11 and earlier, there was a utility which used the US standard atmosphere equations to extrapolate the air density to the turbine height given a measured temperature at a specified height [9]. However, to find the air density for a wind farm with a number of wind turbines at different heights, one had to manually compute the air density for each location using this utility. Furthermore, the effect of humidity was not taken into account. Thus, a more user-friendly method was needed, which automatically computes the correct air density based on an elevation map, which is already available in the model because it is used for computing the effect of wind speedups over hills [7].
Recently a number of re-analysis products has appeared, which assimilate a vast amount of observational data into a numerical weather prediction models to represent the past state of the atmosphere as accurately as possible. For example, the Climate Forecast System Re-analysis (CFSR, [10]) and the fifth major re-analysis of the European Centre for Medium-Range Weather Forecasts (ERA5, [11]) are widely used.
To simplify the estimation of air density for wind energy purposes, we therefore present a method, that can use both measurements and outputs from long-term re-analysis to estimate the air density in Section 2.1. The estimated local air density is used for correction of wind turbine power curves and Sections 2.3 and 2.4 presents inter-and extrapolation methods for that purpose. We evaluate the proposed methods in Section 3. Finally, we present the conclusions in Section 4.

Theory
The AEP at a wind turbine site depends on the wind speed (U) and wind direction (θ) frequency distribution, P(U|θ), which is usually approximated by fitting a Weibull distribution to wind measurements in a number of wind sectors. The Weibull frequency distribution is determined by the scale parameter A and the shape parameter k as The wind power density E is then given by where Γ is the gamma distribution and ρ is the air density. ρ can be computed from the ideal gas law, where R d is the gas constant for dry air, R d = 287.05 [12]. Note that we consider air as an ideal gas, although one can also adopt a so-called compressibility factor in Equation (3) and consider it is a real gas [13]. The variation of density with height can be estimated using the hydrostatic equation, where g is the gravitational acceleration and z is the height above the surface. Because g decreases with height, it is common to express Equation (4) using a reference gravitational acceleration g 0 = 9.80665 m s −2 instead of g and the geopotential height H instead of z, where H is given by: where R 0 = 6.357 × 10 6 m is the average radius of the earth. H and z differ only about 0.1% at 10 km and less at lower heights. The derivation of the geopotential height is extensively described in for example [12,14,15]. From here on the equations are expressed in terms of H instead of z. Substituting Equation (3) into Equation (4) and integrating from a certain pressure P 1 at geopotential height H 1 to a second level with pressure P 2 at geopotential height H 2 gives Because the virtual temperature usually decreases with height, T v is approximately a linear function of H, where the lapse rate L = −0.0065 K m −1 , which is taken from the US standard atmosphere model [15]. The result of Equations (6) and (7) is: Because we can now extrapolate T v and P to any height, it is possible to use Equation (3) to obtain ρ at any turbine site given a reference temperature and pressure.
To take into account the effect of humidity, T v is computed as: where q is the specific humidity [12]. It is more common to measure the relative humidity where q s is the specific humidity for saturated air. It is given by: with = 0.622 and e s the saturation vapor pressure, which can be approximated with Tetens formula [12], where e 0 = 611 Pa, T 1 = 273.16 K and T 2 = 35.86 K. It also relevant to compare the above approach to the equation that is used in the International Electrotechnical Commission (IEC) standard 61400-12-1 for power performance measurements of electricity producing wind turbines. They recommend a similar procedure as above, but instead take into account the effect of humidity by approximating ρ as where R w = 461.5 J kg −1 K is the gas constant of water vapour and with a = 2.05 × 10 −5 and c = 0.0631.

Data
Because measurements of temperature, pressure and relative humidity are often not available, it is an attractive alternative to use re-analysis data. Here, we used two data sources: CFSR version 2 and ERA5. The European Centre for Medium-Range Weather Forecasts (ECMWF) recently introduced the ERA5 data which eventually will cover the period from 1950 to present. This data set was generated by model cycle 41r2 of the integrated forecast system (IFS), which uses four dimensional variational data assimilation and uses as many historical observations and satellite data as possible. ERA5 provides hourly estimates of many variables at a 30 km global grid and uses 137 vertical levels from the surface up to a height of 80 km.
The CFSR re-analysis has a horizontal spectral triangular truncation of 126 waves (T126) (≈100 km grid resolution) and 64 vertical levels. It was improved compared to version 1 in several aspects, that were extensively described in [10].
The CFSR model outputs were downloaded at a resolution of 0.20 • with a six-hourly output interval and averaged into monthly means [16]. The ERA5 outputs were obtained at 0.25 • grid spacing in hourly intervals and were also downloaded as monthly means [17]. Both data sets are freely available.
For evaluation of the method presented here, we used the daily mean of hourly observations of T, P and R from the German weather service (DWD) which are also freely available [18]. Both P, T and R were usually measured at 2 m above the surface. Extensive metadata for all sensors is available as well.
Most of the barometers were of the type Vaisala PTB 330 and most of the thermometers were PT 100s. Data were selected from stations that had at least 95% of availability between the 1st of January 2011 until the 1st of January 2018. Stations were selected which did not change the height were T and P were measured during this period. All measurements were done according to the criteria of the World Meteorological Organization (WMO). Sources of long term uncertainty in these data were discussed in [19]. The result of the quality control procedure described above was a set of 77 stations spread over Germany (see Figure 1). The heighest observations came from the Zugspitze, which was located at 2964 m above mean sea level (amsl). Most of the observations came from the lowlands between 0-300 m amsl. To assess the air density prediction model, we used the observed T, P and R at one station and predicted the density at the nearest station. The average horizontal distance for the cross-predictions that resulted from this procedure was ≈60 km, whereas the vertical distance was ≈40 m.
T, P and q were also obtained from the CFSR en ERA5 outputs from the grid cell that was nearest to the target station. T and q are diagnosed from the output fields of the temperature and specific humidity, respectively, at 2 m above the model surface and P is available at the model surface for both the CFSR and ERA5 data.

Interpolation of Power Curves
The power production of a wind turbine depends on air density, wind speed and turbine efficiency. The turbine efficiency depended on factors like air-foil design, tip speed, pitch angle, generator efficiency and a control system which constantly adapts to changing wind conditions [20]. Annual wind production estimates are usually made by a simplified calculation using a power curve provided by the turbine manufacturer. This curve predicts power production for a fixed reference air density as a function of wind speed at turbine hub height.
Sometimes the manufacturer offers multiple power curves for different reference air densities, allowing interpolation to the local air density. Therefore, we first considered the power coefficient C P which is defined by where P is the power and A is the rotor-swept area. We assume that C P has a linear dependence on air density, and this leads to an interpolation formula for the power, Here P 1 (u) and P 2 (u) are production estimates for the reference air densities ρ 1 and ρ 2 , respectively. The second case is the limit near the cut-in wind speed where one of the power curves might predict zero production, P 1 (u) = 0.

Extrapolation of Power Curves
Sometimes we only have a single reference power curve, which we need to extrapolate to the local air density. For this purpose we could use the corrections suggested in the IEC 61400-12-1 standard [6] for power-performance measurements. According to this standard the power output of a stall-regulated turbine is linearly proportional to air density, i.e., we may correct by simple scaling, Modern turbines are usually pitch regulated, and for this turbine type the IEC standard specifies a correction of wind speed instead of the power output, This method will not affect the flat part of the power curve, which is kept constant by pitch control. We then proceed by fitting the scaled power curve by a cubic spline, which is re-sampled to the reference wind-speed values of the original power curve. The reason for this procedure was that it was easier to calculate the annual energy production for a wind farm when the power curves for individual turbines have the same reference wind speeds. Furthermore, we avoided corrections for wind speeds above rated wind, since control systems for some modern turbines will derate the power during storm conditions. Svenningsen [21] suggested an improved version of the IEC correction method for pitch-regulated turbines, The exponent m was equal to 3 for wind speeds below the point of maximum power coefficient C p,max and linearly reduced to a minimum exponent at rated wind speed and above. Here, this minimum exponent was set to m min = 1.5, a generic value which Svenningsen [21] found reasonable for a large number of pitch-regulated turbines.

Results
We performed an evaluation of methods to extrapolate air density using nearby observations or grid points. All methods extrapolated P and T using Equations (6) and (7) from a certain H 1 to another H 2 . In many cases the temperature and pressure sensors were mounted at different H and the appropriate height of each instrument was always used in Equations (6) and (7). In all methods it was assumed that the relative humidity is the same in the source site/grid point and the target site, i.e., R 1 = R 2 . There were three air density estimation methods that use the nearest observations to specify T 1 , P 1 and q 1 : The other methods did not use nearby observations, but instead used T 1 , P 1 and q 1 from the nearest grid point from re-analysis data.

•
Equation (3) using ERA5 outputs with L estimated from the nearest grid point using the closest pressure level and a pressure level 50 hPa above (WAsP 12 ERA5 L).
In brackets we denote the software in which the methods have been implemented, which will be used as labels throughout this paper.

The Effect of Humidity
The density of moist air is lower, because water vapour is less dense than dry air. To investigate the effect of taking this into account, we compared the three approaches that use the nearest observations to predict ρ. As stated in the previous section, the effect of humidity was ignored in WAsP 11. It was recommended in the first edition of IEC standard 61400-12-1 [5] and correction by Equation (13) became mandatory in edition 2 [6].
In Figure 2 it can be seen that there is high correlation between the observed and modelled air density at the sites. Because most sites were in the lowlands, ρ was usually close to 1.225 kg m −3 , which was the value used in the US standard atmosphere and that was often used for normalization of wind turbine power curves. WAsP 11 generally slightly overpredicted ρ, especially for higher air densities, because it did not include the effects of humidity. This was because warm air can contain more water vapour, so the difference in density between dry and non-dry air increases with temperature. The IEC 61400-12-1 and WAsP 12 methods give very similar results and points largely overlap each other in Figure 2.
The results from Figure 2 are also summarized in Table 1. It can be seen that not taking into account humidity (WAsP 11) resulted in a slightly higher mean absolute error and lower R than the two other methods. The WAsP 12 and the IEC 61400-12-1 give similar results as expected.  . Note that the points from the Zugspitze station were omitted because it had a very low ρ, which reduced the readability of the plot. Table 1. Summary error metrics from the methods presented in this paper. The modelled and observed air density at site i of the total number of sites N = 77 are denoted as x i and y i , respectively. Then, the mean absolute error is defined as 1 , where the overbar and σ denote the mean and standard deviation, respectively.

The Effect of a Varying Lapse Rate
In all but the ERA5 method L is assumed constant. To explore this assumption, we study the lapse rate as a function of latitude, because the vertical temperature gradient varies significantly at the pole and the equator. The zonal mean of L in a 50 hPa layer (which corresponds to ≈500 m) above the surface is shown in Figure 3. In addition, L = −0.0065 K m −1 , i.e., the assumption in the US standard atmosphere is shown for comparison.  It can be seen that L from the ERA5 outputs crosses the value of −0.0065 K m −1 at ≈60 • S and at ≈40 • N. In between these latitudes, L from the ERA5 data was lower than −0.0065 K m −1 and reached a minimum (≈ −0.009 K m −1 ) near the equator, which indicates the temperature decreases more with height than assumed in the US standard atmosphere. However, near the poles and particularly at the South Pole, L −0.0065 K m −1 . This is due to the negative surface energy balance at the poles, which causes a strong temperature inversion, i.e., the temperature often increases with height [22].
Although the lapse rate was only determined from the ERA5 data, it was observed that using L from ERA5 in combination with a measured T 1 , P 1 and R 1 also lowered the mean absolute error in Table 1. It is unlikely that the lapse rate can be measured with sufficient accuracy using a meteorological mast, because they are usually not tall enough to calculate meaningful differences between temperature sensors mounted at two heights.

Using Re-Analysis Outputs
We evaluate the use of methods that use re-analysis data to prescribe T 1 , P 1 and q 1 and subsequently estimate ρ (Figure 4). For comparison the best performing method which uses the nearest observations is also shown. It can be seen that the correlation between the modelled and observed ρ is near 1 for all three methods. The ρ estimated from CFSR data appears to have a slight positive bias compared to the observations, whereas particularly ρ using ERA5 data and the local L performs very well.  These features can be further quantified using Table 1. It can be seen that the WAsP 12 CFSR method had a slightly higher mean absolute error when compared to the method that uses nearby observations, but R is slightly higher. Using the ERA5 outputs to estimate ρ resulted in a lower mean absolute error than using all other methods and has the highest R.
To investigate in more detail the counter-intuitive result that using model outputs compared to observations results in a more accurate estimation of ρ, it can be hypothesized that the horizontal and vertical extrapolation distance play a role in this comparison: as stated in Section 2.2, the average distance to the nearest measurements used for extrapolation is usually higher than that from the distance the nearest grid point from the re-analysis outputs. Figure 5a shows the mean absolute error between the modelled and predicted ρ as a function of horizontal distance. For clarity we have again included only the best performing extrapolation method that uses nearby measurements. It can be seen that indeed the average distance to the nearest grid point is much smaller than that to the nearest observations. In addition, there is a trend when using the WAsP 12 method, that larger distances result in larger mean absolute errors. When using re-analysis outputs, there was no clear trend visible, because the distance can never be greater than half the grid spacing of the re-analysis outputs (≈10 km). In Figure 5b the mean absolute error as a function of vertical extrapolation distance is shown. There are increasing errors with increasing vertical distance using all methods. As expected, when using the ERA5 outputs in combination with a prescribed lapse rate compared to using a standard lapse rate, the mean absolute errors became smaller for large vertical extrapolations.
L ≈ −0.0056 K m −1 as obtained from ERA5 in Germany, which was close to the standard value of −0.0065 K m −1 (see Figure 3). Therefore larger deviations can be expected near the equator or near the poles and in these regions it is clearly better to account for the varying lapse rate using the WAsP 12 ERA5 L method.
When comparing the different methods (Table 1), using the ERA5 instead of the CFSR outputs results in a substantial decrease in mean absolute error. In this comparison, using the ERA5 also results in the lowest mean absolute errors when comparing with the methods that use the nearest observations instead.
In this study we evaluated the air density when considering it as a ideal gas instead of a real gas. Therefore we also evaluated ρ using Equation 1 from Picard et al. [13] instead of Equation (3). This led to a small increase in ρ, but because the compressibility factor is applied both in the observations and in the modelling it did not significantly change the results presented here.

Example
To illustrate the significance of power-curve air-density corrections, we examined the test project Parque Fictio, which is distributed with WAsP. Instead of the usual wind turbine generator in the project, we applied a Vestas V82/1650 kW turbine for which the manufacturer provides a range of reference power curves for air densities separated by ∆ρ = 0.03 kg/m 3 . The test project was situated in Chile (31 • 33' W, 71 • 28' S) and had eight turbine sites (T1-T8) at altitudes varying from 475 to 600 m to which we add 70 m to reach turbine hub height z hub . Using the CFSR data we expected air density at the turbine sites to vary between 1.134 and 1.148 kg/m 3 with an average of 1.140 kg/m 3 . Table 2 shows the calculated annual energy productions for the turbine sites with the lowest and highest air density and for the entire wind farm. Energy productions for individual turbine sites will both depend on the power curve and wind speed distribution, and as a reminder of this we include the mean wind speed U in the table. The differences between production estimates using power curves interpolated to the local air density at individual turbine sites and estimates using nearby reference power curves are O(1%) in the present example. Some manufactures only present a single power curve for a standard air density, ρ ref = 1.225 kg/m 3 , in the sales material and then expect project developers to request site-specific power curves. However, the experience in wind resource assessment courses [9] is that people frequently forget to adapt the power curves or make errors in estimating the air density. The last column of the table shows that this may lead to large errors in case an inexperienced user use this power curve without correction. We could of course correct the power curve for ρ ref = 1.225 kg/m 3 by extrapolation instead of interpolation. That would lead to production estimates of 52,043 GWh (+1.35%) using the IEC method and 51,550 GWh (+0.39%) using the Svenningsen method, which are better than the estimates using an uncorrected power curve but also highlighting the uncertainty associated with extrapolation to a very different air density. Finally, using WAsP 12 ERA5 changed the overall AEP to 51,414 GWh (+0.13%) and using WAsP 11 resulted in 51,654 GWh (+0.60%).

Conclusions and Discussion
Revised methods for the estimation of air density using nearby observations or re-analysis outputs were implemented and evaluated. It was found that taking into account the effect of humidity by using T v instead of T resulted in lower mean absolute errors when using nearby measurements to estimate ρ. The method implemented here gave nearly identical results to the equation suggested in the IEC-61400-12-1 standard, but has the advantage that it can take into account the virtual temperature lapse rate as well.
In addition, outputs from the nearest grid point from two re-analysis models were used to estimate ρ. Using the CFSR version 2 re-analysis resulted in slightly higher mean absolute errors compared to using nearby observations. Using the ERA5 outputs resulted in lower mean absolute errors compared to using nearby observations. Using additionally a local lapse rate further decreased the mean absolute errors.
Due to the high accuracy of the ERA5 outputs and its availability from 1950 to now, there is no need to do any long-term correction, as would be required using a shorter measured time series of T, P and RH. Therefore, using the nearest ERA5 grid point is likely the most accurate way to estimate ρ in western Europe when measurements are located more than 10 km away. Only when high quality observations of T, P and R are available at the site within 10 km one can decide to override the values obtained from the re-analysis. Possibly the re-analysis outputs can be less accurate in other parts of the world where less observations are assimilated into the global model that is used to compute the re-analysis [11].
In the future it could be interesting for the wind energy community to include more re-analysis data, such as the frequently used MERRA2 dataset. In addition, it is important to state the impact of climate change on the results presented here: under the intermediate scenarios of the IPCC assessment the mean global temperature in 2046-2065 is expected to be 1.3-1.4 • C higher than in the 1986-2005 period [23]. This would be equivalent to a decrease in ρ of ≈0.5% for the sites presented here, i.e., much higher than the uncertainty due to using different datasets. Therefore, future work could instead use the global output of climate models to estimate realistic values of the temperature over the life span of a typical wind energy project (∼20 yrs).
The power curves supplied by wind turbine manufactures are available for or limited number of reference air densities. We may, however, adapt available power curves to the local air density estimated by re-analysis datasets and using the interpolation and extrapolation methods of Sections 2.3 and 2.4. In WAsP 12, this is done automatically for individual turbine sites, which may have individual terrain elevation and thereby individual local air density. We note that the ERA5 data set and the ability to alter L will be included in a future release of WAsP and is not yet available in the currently available software (www.wasp.dk). However, the ERA5 data are provided as Supplementary Material so the results here can also be reproduced before the software is officially released. Author Contributions: R.F. contributed to all aspects of this paper. M.N. contributed to conceptualization, methodology, resources, writing-original draft preparation, writing-review and editing.
Funding: This research was funded by the Danish EUDP project 64018-0095 GASP and from internal resources.