Fog and Low Stratus Obstruction of Wind Lidar Observations in Germany—A Remote Sensing-Based Data Set for Wind Energy Planning

Abstract: Coherent wind Doppler lidar (CWDL) is a cost-effective way to estimate wind power potential at hub height without the need to build a meteorological tower. However, fog and low stratus (FLS) can have a negative impact on the availability of lidar measurements. Advance information about such reductions in wind data availability at a prospective lidar deployment site is beneficial when planning a measurement strategy. In this paper, we show that availability reductions by FLS can be estimated by comparing time series of lidar measurements, conducted with WindCubes v1 and v2, with time series of cloud base altitude (CBA) derived from satellite data. This enables us to compute average maps (2006–2017) of estimated availability for Germany, including FLS-induced data losses, which can be used for planning purposes. These maps show that the lower mountain ranges and the Alpine regions in Germany often reach the critical data availability threshold of 80% or fall below it. Especially during the winter time, special care must be taken when using lidar in southern and central regions of Germany. If only shorter lidar campaigns are planned (3–6 months), the representativeness of weather types should be considered as well, because in individual years and under persistent weather types, lowland areas might also be temporarily affected by higher rates of data loss. This is shown by different examples, e.g., during radiation fog under anticyclonic weather types.


Introduction
Wind turbine planning requires a reliable estimate of the wind power potential at a prospective site, which in turn requires knowledge of the wind conditions. It is recommended that the assessment of the wind conditions be based on representative wind measurements at the projected hub height for at least 12 months with a data availability of at least 80% [1,2]. Representativeness must include the consideration of seasonal differences in wind speed. In Germany, the average hub height of newly installed turbines in 2018 was 133 m above ground [3]. In general, the optimal hub height depends on the type of wind turbine, the wind regime at the location, and the economic costs and benefits with regard to different planning scenarios [4,5]. Currently, the next generation of turbines is trending toward hub heights of 160 m, which will most likely increase to 180 m in the future [6]. For the planning and financing of wind parks at a given location, it is of the utmost importance that the wind situation at hub height, and thus the expected wind energy revenue, can be properly estimated before construction. Most inland wind parks are planned in complex topography and at higher elevations because of the higher wind speeds there [6][7][8]. Unfortunately, hardly any wind measurements are available in these mostly remote and rural planning areas, particularly not at hub height. Reanalysis data or even data from wind atlases might help, but the results are not always reliable, particularly in complex terrain [9][10][11]. Traditionally, measurements relied on meteorological towers, which are very expensive because the tower must be built up to hub height, which requires permits, the acquisition of a suitable building site, and the dismantling of the tower after the time-limited measurement campaign [12].
A more cost-effective alternative is coherent wind Doppler lidar (CWDL) technology, which has supplemented or even replaced towers in recent years [8,12–15]. While lidar measurements are cost-effective and flexible compared to meteorological masts, fog and low stratus (FLS) events are known to cause loss of measurement data [16,17]. During FLS conditions, the high number of fog droplets increases the reflectivity of the signal, which raises the carrier-to-noise ratio (CNR) at lower measurement ranges and leads to a greater decrease at longer ranges [17]. Consequently, the availability, which is defined as the percentage of CNR values above a predefined threshold during a certain time period [18], also decreases at the higher measurement ranges during such events. This has a negative impact on the overall availability of a lidar measurement campaign. To minimize low-availability periods, it is advantageous to be able to estimate the lidar data availability at potential deployment sites when planning the lidar measurement strategy. While FLS is one of the most important drivers of low availability, its spatial and temporal variability is mostly unknown for remote planning sites without visibility observations. Weather satellites have long been used successfully to detect the presence of fog and low stratus [19][20][21][22] and can therefore provide useful information about FLS at high spatiotemporal resolution. A newly developed scheme based on machine learning, called HYFOG, allows the spatially explicit calculation of ground fog, based on the retrieval of FLS cloud base altitudes (CBA) for the entire area of Europe from data of the Meteosat Second Generation (MSG) SEVIRI (Spinning Enhanced Visible and Infrared Imager) instrument [23,24].
The aim of the current paper is to show that the CBA product from the HYFOG algorithm can support the prediction of low availability of lidar wind measurements in time and space. This is of great importance because an analysis of ground fog frequency for Germany revealed a very high frequency of ground fog, especially in the higher terrain of the lower mountain ranges, which are also among the interesting onshore sites for future wind turbines [6][7][8]. To show the feasibility of such an approach, we compare the lidar data availability of field measurements, obtained by WindCube v1 and v2 lidars, to the presence of low clouds derived with the HYFOG algorithm, and finally develop a map of availability for lidar wind measurements under FLS conditions.

Satellite Data
For this study, Meteosat Second Generation (MSG) SEVIRI satellite data from 2014 to 2017, which have a temporal resolution of 15 min and a nominal spatial resolution of 3 km at the sub-satellite point, were used. These data were processed with the state-of-the-art hybrid fog retrieval algorithm outlined in [23] to obtain information on cloud base altitudes (CBA). The algorithm uses a random forest machine learning model to predict cloud base altitudes. The model is trained with cloud base altitudes reported by synoptic weather observation (SYNOP) and Meteorological Aviation Routine Weather Report (METAR) stations, together with the satellite pixel information at these station locations. The general procedure is to train a model for every time slot of the MSG data, using the station data for that time at which clouds were detected, and then to predict CBA for the whole MSG scene (see Figure 1 for an example scene). This product has been validated extensively using a leave-location-out cross-validation scheme and shows good overall performance, with a probability of detection (POD) of 0.61 and a Heidke Skill Score (HSS) of 0.58 [23]. The CBA data for 2006 to 2017 with a temporal resolution of 15 min are available upon request.
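For readers less familiar with the two validation scores quoted above, the following minimal sketch shows how POD and HSS are conventionally computed from a 2×2 contingency table. The function names and the example counts are illustrative assumptions and are not taken from the validation in [23].

```python
def probability_of_detection(hits, misses):
    """POD: fraction of observed events that were also predicted."""
    return hits / (hits + misses)


def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """HSS for a 2x2 contingency table (1 = perfect, 0 = no skill over chance)."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    numerator = 2.0 * (a * d - b * c)
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    return numerator / denominator


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_pod = probability_of_detection(hits=60, misses=40)
example_hss = heidke_skill_score(
    hits=60, false_alarms=20, misses=40, correct_negatives=880
)
print(f"POD = {example_pod:.2f}, HSS = {example_hss:.2f}")
```

A perfect classification (no misses, no false alarms) yields a value of 1 for both scores.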

Digital Elevation Data
Furthermore, the WorldClim Digital Elevation Model (DEM) [25] was used for elevation data. The DEM was resampled to MSG projection and resolution. This data was used to correct cloud base altitudes at station locations by the difference between the station elevation and the elevation of the DEM used in the HYFOG algorithm (see Section 2.4 for more details).

Lidar Data
Lidar measurements were conducted with Leosphere WindCube v1 and v2 CWDL systems, which can be set up to measure at up to twelve arbitrary height levels. The focal points of the lidar systems were set to 140 m. For this study, time series data from 25 sites (for locations see Figure 2), each with twelve consecutive measuring months, were available. The data for each site contain availability values in percent at 10-min intervals for multiple measurement heights up to 200 m. Overall, the dataset comprised over 5.3 million data points. All sites are situated in Germany at elevations ranging from 61 m to 703 m a.s.l. (see Figure 3 for the height distribution). The availability, henceforth called the 10-min availability, is defined as the percentage of CNR values above the threshold of −22 dB within the respective ten-minute interval [18,26].
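The definition of the 10-min availability can be illustrated with a short sketch. It assumes that the individual CNR values of one ten-minute interval are available as a plain list; the function name and the sample values are hypothetical, and the actual lidar processing chain is not described in this text.

```python
CNR_THRESHOLD_DB = -22.0  # threshold stated in the text for WindCube v1 and v2


def ten_min_availability(cnr_values_db, threshold_db=CNR_THRESHOLD_DB):
    """Percentage of CNR values strictly above the threshold in one interval."""
    if not cnr_values_db:
        raise ValueError("at least one CNR value is required")
    valid = sum(1 for cnr in cnr_values_db if cnr > threshold_db)
    return 100.0 * valid / len(cnr_values_db)


# Hypothetical CNR samples (dB) from one ten-minute interval at one height.
sample_cnr_db = [-18.5, -21.0, -23.4, -25.0, -19.9]
sample_availability = ten_min_availability(sample_cnr_db)
print(sample_availability)  # three of five values exceed -22 dB
```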

Data Evaluation
CBA pixel values were extracted from the HYFOG product for each station location and each time step. As the temporal resolution of the lidar and satellite datasets differ (lidar: 10 min frequency; CBA product: 15 min frequency), and because measurement times were not aligned, the CBA product was subsequently resampled to the temporal frequency of the lidar measurements using linear interpolation. Additionally, as the CBA product only supplies information at 3 km resolution per pixel but the topography can change significantly within one pixel in complex terrain, the height difference between the actual elevation of the lidar site and the elevation of the digital elevation model used in the HYFOG algorithm was added to the cloud base altitude. On average this difference is 47 m. Because the 12 measurement levels of the lidars were not set up to the same heights for all sites, data from fixed height ranges were binned. A height bin width of 15 m with the first bin center at 47.5 m above ground was chosen. Thus, the first bin stretched from 40 m to 55 m above ground, the second one from 55 m to 70 m, and so on. If at any site two measurement levels fell into the same bin, the measurement level with the height closest to the bin center was chosen, thus warranting that only one availability value per bin per site per time step was used in the analysis. Two measurement levels might fall into the same bin if the lidar is set up with a height difference between two levels smaller than the chosen 15 m bin width. This kind of set-up is done to cover the prospective hub height at a site with smaller measurement level increments. As only a few sites covered the height bin of 107.5 m, data of this bin were not representative and therefore left out of the analysis. For each height bin the mean of the 10-min availability was calculated for cloud-free and CBA ≤ 100 m situations.
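The preprocessing steps described above (temporal interpolation of the CBA product, elevation correction, and height binning) can be sketched as follows. This is a simplified illustration under stated assumptions: bins are treated as half-open intervals (lower edge included), the elevation difference is taken as site elevation minus DEM elevation, and the handling of cloud-free or missing CBA values is omitted. All function names are hypothetical.

```python
import math

BIN_WIDTH_M = 15.0        # bin width stated in the text
FIRST_BIN_LOWER_M = 40.0  # lower edge of the first bin (center 47.5 m)


def interpolate_cba(times_min, cba_m, target_min):
    """Linearly interpolate the 15-min CBA series to a lidar time stamp."""
    for i in range(len(times_min) - 1):
        t0, t1 = times_min[i], times_min[i + 1]
        if t0 <= target_min <= t1:
            weight = (target_min - t0) / (t1 - t0)
            return cba_m[i] + weight * (cba_m[i + 1] - cba_m[i])
    raise ValueError("target time outside the satellite time series")


def correct_cba(cba_m, site_elevation_m, dem_elevation_m):
    """Add the site-minus-DEM elevation difference (assumed sign convention)."""
    return cba_m + (site_elevation_m - dem_elevation_m)


def bin_center(height_m):
    """Center of the 15 m height bin that contains a measurement level."""
    index = math.floor((height_m - FIRST_BIN_LOWER_M) / BIN_WIDTH_M)
    if index < 0:
        raise ValueError("height below the first bin")
    return FIRST_BIN_LOWER_M + (index + 0.5) * BIN_WIDTH_M


def select_levels(level_heights_m):
    """Keep one level per bin: the level closest to the bin center."""
    chosen = {}
    for height in level_heights_m:
        center = bin_center(height)
        if center not in chosen or abs(height - center) < abs(chosen[center] - center):
            chosen[center] = height
    return chosen


# Hypothetical set-up with two levels (145 m and 150 m) in the 152.5 m bin.
selected = select_levels([140.0, 145.0, 150.0, 160.0])
print(selected)
```

In this example, the 150 m level is kept for the 152.5 m bin because it lies closer to the bin center than the 145 m level.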
To get a spatial estimate for the availability, we assumed a commonly used 10-min availability threshold of 80%, which means that intervals (data points) with a 10-min availability below 80% are considered invalid measurements and thus not available. With the CBA ≤ 100 m frequencies for 2006 to 2017 (f_CBA≤100 m), we can calculate yearly and monthly maps of estimated availability for Germany. As can be seen in Figure 4 (marked by the arrow), the threshold is reached at a measurement height of about 130 m. All higher measurement levels are also below the threshold; therefore, the maps are valid for measurement heights of 130 m and above.
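As a rough illustration of how such a map could be derived from the CBA ≤ 100 m frequency, the following sketch treats every interval with CBA ≤ 100 m as lost and every other interval as available. This is a simplifying assumption made here for illustration; it is not necessarily the exact formulation used for the maps in this paper. The function names and the small example grid are hypothetical.

```python
def estimated_availability(f_cba_le_100m):
    """Estimated availability in percent from the CBA <= 100 m frequency (0..1).

    Assumption: all intervals with CBA <= 100 m are lost, all others are valid.
    """
    if not 0.0 <= f_cba_le_100m <= 1.0:
        raise ValueError("frequency must be a fraction between 0 and 1")
    return 100.0 * (1.0 - f_cba_le_100m)


def availability_map(frequency_grid):
    """Apply the estimate to every pixel of a 2D frequency grid."""
    return [[estimated_availability(f) for f in row] for row in frequency_grid]


# Hypothetical 2x2 grid of CBA <= 100 m frequencies (fractions of all time steps).
frequency_grid = [[0.05, 0.10], [0.25, 0.50]]
availability_grid = availability_map(frequency_grid)

# Pixels at or below the 80% planning threshold mentioned in the text.
critical_mask = [[value <= 80.0 for value in row] for row in availability_grid]
print(availability_grid)
print(critical_mask)
```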

Results
For the cloud-free case, the characteristic curve with high 10-min availability at lower measurement heights and a subsequent continuous drop up to the maximum measurement height can be seen. This is due to the higher probability at greater measurement distances that the CNR value does not reach the threshold required to be accepted as valid. For the analyzed data set, the average 10-min availability under cloud-free conditions is 97.7% at the lowest measurement height (bin center 47.5 m); it increases slightly to a maximum value of 98.4% at height bin 62.5 m and then drops to 81.7% at height bin 197.5 m (blue dots in Figure 4). In contrast, in the CBA ≤ 100 m situation (red dots in Figure 4), a sharper drop in 10-min availability with rising measurement height, down to 52.8% at height bin 197.5 m, can be observed.

Diurnal and Monthly Investigation
As of 2018, the majority of newly installed wind turbines in Germany featured a hub height above 140 m [3], and hub heights will increase further in the future. Therefore, the measurement height bin of 152.5 m is examined explicitly in the following.
In the diurnal cycle at the 152.5 m measurement height bin (see Figure 5), the availability in the cloud-free case is generally very high with only slight variations. For the cloudy case, availability ranges from 63% (at 09:00 UTC) to 77% (at 19:00 UTC) and is lowest during the morning hours. To investigate seasonal effects, the availability was averaged on a monthly basis for the measurement height bin of 152.5 m. Figure 6 shows that, compared to the cloud-free case, the cloudy situations have substantially lower data availability during the winter season. During winter, situations with cloud base altitudes below 100 m are often characterized by extremely low stratus or even ground fog, and therefore most of the lidar signal is lost at very low heights. During the summer months, this is not the case. In addition, summer overall has far fewer situations with cloud base heights below 100 m, and the uncertainty of the CBA estimated by the HYFOG algorithm is higher than during winter (see [23]). This higher uncertainty may cause low cloud events (CBA ≤ 100 m) to be missed or, conversely, cloud-free or CBA > 100 m events to be wrongly classified as low cloud events.
Because the mean is not associative, the total mean can differ from the mean of the group means (here, the monthly means) if the groups contain different numbers of items. This is why the numbers in Figures 4 and 5 do not appear to match those of Figure 6: the months from March to September have lower case counts than the months from October to February. The total mean (Figure 4) is therefore dominated by October to February.
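The effect of unequal group sizes on the two kinds of mean can be made concrete with a small numerical sketch. The values below are invented for illustration and are not taken from the lidar data set.

```python
def pooled_mean(groups):
    """Mean over all items, regardless of the group they belong to."""
    total = sum(sum(values) for values in groups.values())
    count = sum(len(values) for values in groups.values())
    return total / count


def mean_of_group_means(groups):
    """Unweighted mean of the per-group means."""
    means = [sum(values) / len(values) for values in groups.values()]
    return sum(means) / len(means)


# Hypothetical 10-min availabilities (%) for low-cloud cases in two months:
# many cases in December, few in July.
cases_by_month = {
    "December": [60.0] * 8,
    "July": [90.0] * 2,
}
overall = pooled_mean(cases_by_month)
monthly_average = mean_of_group_means(cases_by_month)
print(overall, monthly_average)
```

Here the pooled mean (66%) is pulled toward the month with more cases, whereas the unweighted mean of the monthly means is 75%.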

Availability Maps
The annual availability map for Germany and surrounding areas reveals that the lower mountain ranges and the Alpine region generally show availabilities at or even below the critical level of 80% (used, for example, in [1]) (Figure 7). Hotspot areas of critical availability are the lower mountain ranges of the Black Forest, the Vosges mountains, the Swiss Jura, and the Swabian Alps in the south, but also some more northerly mountain ranges like the Thuringian Forest, Rhön, Spessart, and the Harz Mountains, as well as the northwestern mountain areas of Hunsrück, Taunus, and the Rothaar Mountains. While the lowlands of Germany generally show higher availability (≥ 80%), it should be kept in mind that the maps only show the expected availability with regard to the frequency of conditions with cloud base altitudes of 100 m and below. Other factors, such as periods of exceptionally clean air (low aerosol backscatter), can further reduce the availability of lidar wind measurements even under cloud-free conditions [16]. To reduce the data loss due to FLS during a complete measurement period, spatial availability information at monthly resolution is required. Here, the availability reveals a clear seasonal course. The monthly availability maps (see Figure 8) show that during the summer months availability might reach the problematic threshold only at the highest elevations. On the other hand, during all months of the winter half of the year (October to March), the highest elevations are all below the critical threshold, and the availabilities at middle elevations already approach critical values, at about 85%. Therefore, special care regarding the temporal representativity must be taken when designing measurement strategies for these regions, especially during the winter time. In particular, lower availabilities due to FLS are widely found in central and southern Germany in November and December.
This is especially important if the currently necessary 12-month measurement period of the TR-6 (Technische

Spatial Availability Distribution with Regard to Weather Patterns
While Figure 8 shows the long-term average availability on a monthly basis, on shorter time scales the occurrence of fog and low clouds is influenced by the weather situation at that time. Therefore, a few persistent weather types might dominate the measurements during shorter observation periods and thus result in higher rates of data loss. In general, lower availabilities in areas with lower elevation can be expected during anticyclonic weather situations, which favor ground fog, while cyclonic weather patterns reduce the availability in elevated terrain (see Figure 9). Of course, the actual occurrence of ground fog in anticyclonic situations also depends heavily on the local meteorological conditions. In extreme cases of persistent ground fog, which occur frequently in the winter half year, as shown in Figure 10a, measurement availability can be heavily impacted. While on average (over a twelve-year period) this fog pattern has a very low occurrence for most of the year, during the winter months, and especially in November and December, these extreme conditions can be observed on almost 10% of the days (see Figure 11) and may significantly lower the monthly availability during a measurement campaign, even in terrain with lower elevations. On the other hand, as can be seen in Figure 10b, other extreme conditions only affect areas with higher elevations like the lower mountain ranges. The fog patterns P8 and P13 in Figure 10a,b are based on a self-organizing map approach, presented in [24], that assigns daily averaged fog frequency maps to specific fog patterns. Figure 10c ("high over central Europe" (HM) weather type) and Figure 10d ("anticyclonic southwesterly" (SWA) weather type) show the average availabilities of two general weather situations (GWS), based on the work in [27], that are commonly associated with favorable conditions for lowland radiative fog.
It can be seen that, although the critical 80% threshold is not reached, availability reductions can be expected in low-elevation areas. While these situations occur somewhat more often during the winter time (close to 20% of days), Figure 11 shows that they can also be observed on a significant number of days during the summer months. Thus, when planning measurement campaigns, certain weather situations should be taken into account as a factor for reduced availability.

Figure 10. Availability for the fog patterns P8 and P13 (a,b) from [24] and for the GWS high over central Europe (HM) (c) and anticyclonic southwesterly (SWA) (d) from the work in [27], with an assumed 10-min availability threshold of 80% at measurement heights of 130 m and above.

Conclusions
It could be shown that information about the occurrence of low clouds derived from satellite data shows, on average, good agreement with decreasing 10-min availability of wind measurements conducted with CWDL. Based on this finding, we computed availability maps for Germany. To our knowledge, maps of this kind have not been available up to now, and they should be a valuable aid in the planning process for lidar measurement strategies. It can be seen that more complex terrain in particular, like the lower mountain ranges in Germany, requires special care when using lidar technology, because these areas reach critical availability values, especially during the winter time. This is especially important as these areas are among the interesting sites for future wind turbines. Additionally, many areas that are still above the critical availability threshold might reach the critical level if other factors, such as exceptionally clean air, occur during the measurement period.
Unfortunately, in complex terrain the CBA product is not always able to resolve local cloud cover which most likely leads to mismatches of the cloud base altitude derived from the satellite data compared to the actual cloud base height as present at the site of the lidar measurement. Therefore, we are currently working on improving the spatial resolution of the CBA product to be able to produce maps of higher accuracy.
Furthermore, it needs to be emphasized that the results shown are only valid for the Leosphere WindCube v1 and v2, as only data from these two types of lidar were used in the analysis. For other types of lidar, possibly with stronger laser units, the results probably differ, as the CNR thresholds for the WindCube 200S and 400S, for example, are slightly lower (−26 dB and −28 dB, respectively) [18]. Therefore, detection efficiency should be slightly higher compared to the WindCube v1 and v2. During moderate visibility reductions (haze), this could result in slightly higher availability, but comparable problems in the case of strong fog or dense clouds were reported for the more powerful lidar systems [18,28].