Short-Term Wind Power Forecasting at the Wind Farm Scale Using Long-Range Doppler LiDAR

Abstract: It remains unclear to what extent remote sensing instruments can effectively improve the accuracy of short-term wind power forecasts. This work seeks to address this issue by developing and testing two novel forecasting methodologies, based on measurements from a state-of-the-art long-range scanning Doppler LiDAR. Both approaches aim to predict the total power generated at the wind farm scale with a five minute lead time and use successive low-elevation sector scans as input. The first approach is physically based and adapts the solar short-term forecasting approach referred to as "smart-persistence" to wind power forecasting. The second approaches the same short-term forecasting problem using convolutional neural networks. The two methods were tested over a 72 day assessment period at a large wind farm site in Victoria, Australia, and a novel adaptive scanning strategy was implemented to retrieve high-resolution LiDAR measurements. Forecast performances during ramp events and under various stability conditions are presented. Results showed that both LiDAR-based forecasts outperformed the persistence and ARIMA benchmarks in terms of mean absolute error and root-mean-squared error. This study is therefore a proof-of-concept demonstrating the potential offered by remote sensing instruments for short-term wind power forecasting applications.


Introduction
Worldwide energy markets are undergoing a rapid shift towards low carbon technologies and renewable energy sources. Driven by the latest technology advancements and the associated reduction in investment costs [1], wind power has recently gained considerable traction with more than 60 GW installed in 2019 alone, bringing the total installed capacity worldwide to 651 GW [2]. These increasingly large wind penetration levels present new technical challenges due to the stochastic and intermittent nature of wind, together with the inability of wind farms to provide reserve power [3,4]. Intra-hourly (i.e., within an hour) variability in regions where a large number of wind turbines are concentrated over a small spatial extent is of particular concern, due to correlated fluctuations amongst neighbouring groups of turbines or wind farms [5,6]. Rapid increases or decreases of wind power generation over a short amount of time, also called "ramp events", are especially challenging to forecast and represent a threat to electric systems' security [7]. Upward ramps often incur energy losses through curtailment, whereas downward ramps can lead to significant power disruptions due to the lack of backup generation [8]. In addition, a growing number of energy markets are moving towards shorter dispatch and pricing time frames in an effort to limit spot price fluctuations and ensure system reliability. In countries such as Belgium, France, Germany [9] and Australia [10], markets operate on a five minute basis, and forecasts at this time scale are required to reduce the uncertainty and costs associated with ancillary services. Accurate and timely short-term wind power forecasts are therefore a key component to mitigate the aforementioned issues and increase the control and integration of wind farms [11,12].
The terminology "short-term" used in this study relates to prediction horizons up to one hour, which is sometimes also referred to as "very short-term" or "ultra-short-term" in the literature.
In particular, ground-based optical remote sensing instruments measuring the incoming wind field such as LiDARs, sodars and radars have become increasingly popular and are considered of great potential for wind power forecasting applications [12][13][14]. Theoretically, a priori knowledge of the incoming wind field provides valuable information on the forthcoming conditions at the wind farm site.
This paper introduces and assesses the performance of two new methodologies to predict wind power with a five minute lead time based on LiDAR data, herein referred to as the "LiDAR-based forecasts". The study also presents an innovative dynamic scanning strategy designed to improve the sampling frequency of the incoming wind field. The paper is organised as follows. Section 2 introduces the LiDAR technology and reviews the state-of-the-art in remote sensing forecasting. The forecasting methodologies and evaluation frameworks are established in Section 3. The models are tested using real data over a 72 day assessment period, and their forecasting skills are compared against persistence and autoregressive integrated moving average (ARIMA) benchmarks in Section 4. Finally, conclusions and a discussion of future work are presented in Section 5.

Doppler LiDAR Working Principle
Pulsed-coherent Doppler LiDARs (referred to as "Doppler LiDAR" herein) probe the flow through the atmosphere by means of pulsed light beams, with a measurement range extending horizontally up to 30 km [15,16]. Scanning Doppler LiDARs allow the positioning of their beam in any direction within a hemisphere through a revolving scanning head and mirrors system. The term "range gate" refers to the distance from the LiDAR along the line-of-sight (LOS), and the "elevation angle" denotes the LOS incline relative to the horizontal plane located at the height of the LiDAR head. The fundamental scanning configuration most relevant to short-term forecasting is the plan position indicator (PPI) scan, in which the laser beam sweeps over a circular conic surface of interest with a fixed low elevation angle (Figure 1).
The working principle of Doppler LiDARs relies on detecting minor frequency shifts in back-scattered light, induced by the movement of aerosol particles (i.e., soot, dust, pollen, sand, sea salt) transported by the wind in the direction of the laser beam. Doppler LiDARs hence only measure the radial (or "along-the-beam") component of the wind, and post-processing is required to retrieve the horizontal wind speed and direction across the area of interest.
Figure 1. Representation of a PPI scan. The figure also displays the wind analysis spherical system (defined by the azimuth angle $\phi$ and the elevation angle $\theta$) along with the Cartesian coordinate system (defined by the coordinates x, y and z). The blue dots along the LiDAR beam illustrate the range gates over which the radial velocities are averaged.

Wind Field Remote Sensing for Wind Power Forecasting
Several studies have presented methodologies for predicting wind speed or wind power based on wind field remote sensing. Reference [17] forecast wind power solely based on scanning Doppler LiDAR observations, using spatial correlations between measurements at various upwind distances from the LiDAR as a propagation model. In [18], two long-range LiDARs were placed in a row to forecast wind speeds over flat terrain with look-ahead times between five and 45 min. Reference [19] used a dual-scanning LiDAR system configuration (two LiDARs separated by 4 km scanning the same domain) to predict wind speeds over the Danish North Sea five minutes in advance. Their proposed methodology, which included forward propagation of wind vectors and local terrain corrections, could outperform the persistence and ARIMA benchmark under stable and neutral stability conditions. Reference [20] presented various model formulations for short-term wind speed forecasting using a single LiDAR setup. Of particular interest is the use of machine learning methods such as the convolutional long short-term memory (ConvLSTM) neural network [21] to directly process raw radial velocity measurements from the LiDAR, hence bypassing the need for wind field reconstruction and the associated computational costs. The methodology was shown to yield an improvement over persistence for lead times less than four minutes, but failed at the five minute horizon, possibly relating to the limited measurement range of the scanning LiDAR used in the study (4 km).
Achieving high prediction skills is inherently more challenging within complex terrain as opposed to flat sites (coastal/offshore). Indeed, rugged topography and local terrain roughness often lead to complex wind flows, which are more difficult to model. In [22], a long-range LiDAR was used to predict wind power ramps for a single reference turbine under complex topography conditions. The resulting forecast could not outperform the persistence benchmark for the lead times considered in the study (up to 20 min).
The forecasting approaches described above are deterministic by nature, i.e., they provide a single estimate of the most likely conditions at the time of forecast. Probabilistic forecasts, which also provide information about the uncertainty of the forecast, have gained increasing traction due to the additional support they provide to decision-makers [23][24][25]. Recent studies presented a probabilistic framework to forecast offshore wind power generation using dual radar measurements [26,27]. Reference [26] focused on predicting power generation five minutes ahead for a small number of turbines (first wind-facing row of the wind farm) under a precise set of conditions (wind speed less than 16 m s$^{-1}$ and wind direction between 191° and 282°). Building on this approach, Reference [27] extended the methodology to the entire wind farm by including efficiency correction factors for wake-affected turbines. The method could outperform the probabilistic persistence benchmark during specific ramp events, but was unsuccessful for longer assessment periods.
Reference [28] applied a similar methodology to forecast the wind power generation of seven free-flow offshore wind turbines, also with a five minute lead time. The proposed forecast exceeded its persistence counterpart during unstable conditions, although it failed to yield improvements under stable and neutral stability conditions. The author attributed the larger errors observed under these conditions to uncertainties in extrapolating the wind speed to hub height.
Uncertainties remain about the extent to which knowledge of the incoming wind velocity field can effectively increase the accuracy of short-term wind power forecasts, as techniques for the best use of remote sensing data are still under development. Most of the effort to date has focused on predicting wind speeds or power generation from a limited number of turbines under specific atmospheric conditions. The present study builds on existing research by proposing and evaluating two new methodologies to predict power generation at the wind farm scale.

The Site and Data Collection
Data for this study originated from the Mount Mercer wind farm. The site is located in central Victoria, Australia, and is comprised of 64 Senvion MM92 wind turbines (2.05 MW rated power model) distributed over an area of 2650 ha. All data presented in this study were collected with a second-scale resolution over a year-long measurement campaign, ranging from 1 November 2019 to 1 November 2020. An exception to this are the data used for in situ power curves, typically requiring a more extensive data set (see Section 3.3.2). The forecast assessment period extended over 72 days between 20 August 2020 and 1 November 2020, while data collected between 1 November 2019 and 19 August 2020 were used for model formulation and training (see Section 3.3).
The topography in the vicinity of the Mount Mercer wind farm is of moderate complexity and generally slopes towards the south, from 255 m above the Australian height datum (m AHD) in the southwest corner to 370 m AHD in the northwest corner. The site's highest point is located on top of Mount Mercer (northwest corner of the site), at 427 m AHD. The wind farm is bordered to the west and north by forests extending towards the northwest. The region is also characterised by diverse wind regimes, with frequent shifts from the north to the west or southeast and vice versa. The wind rose for the assessment period is shown in Figure 2, indicating preferred wind directions from the north, west, and southeast sectors.
Wind farm power data were collected from each turbine. Outliers and periods of abnormal operation (i.e., external curtailment, negative generation and outages) were removed from the data set. These corresponded to 4.50% of the initial assessment period. A time series of the number of turbines available for generation was also retrieved. Wind and temperature sensors were installed on two 80 m high met masts, denoted MM1 and MM2, which are respectively located in the northwest and southeast corners of the site (Figure 3). Wind speed data were collected from cup anemometers installed at 80 m above ground level (AGL) on both met masts. Similarly, wind direction and temperature data were retrieved from sensors installed on each met mast at 35 m and 76 m AGL (wind direction) and 2.2 m and 76 m AGL (temperature). Additional sensors placed on both met masts at 76 m AGL measured pressure and relative humidity. Missing data were interpolated using simple linear interpolation. Topography data were sourced at a 90 m spatial resolution from the Shuttle Radar Topography Mission (SRTM) database [29].
A scanning-head pulsed-coherent Doppler LiDAR manufactured by Leosphere (WindCube 400S) is located at the top of Mount Mercer (latitude: −37.81961, longitude: 143.86365; point of highest topography). The locations of the LiDAR, met masts and turbines, along with the topography of the area are shown in Figure 3. The LiDAR is installed on a 2 m high platform, with its lens sitting at approximately 430 m AHD. The LiDAR is set up so as to perform continuous low-elevation PPI sector scans (i.e., sweeps over a segmented conical area). An elevation angle of 0.6° was adopted so that the LiDAR beams were undisturbed in most directions while being as close to the horizontal as possible. The rotation speed of 3° s$^{-1}$ with an accumulation time of 1 s was chosen as a trade-off between spatial and temporal resolution.
The range gate length (LOS distance over which wind speeds are averaged) and the display resolution (LOS distance between two measurements) were 150 m and 75 m, respectively. In this configuration, the LiDAR's maximum range gate under undisturbed conditions was 12.25 km, and each scan comprised 4770 observation points (30 azimuth angles × 159 range gates). For illustrative purposes, Figure 4a shows the sampling elevation of the LiDAR (i.e., distance between ground level and sampling height) and its maximum range gate (12.25 km). The main LiDAR features and scanning parameters are summarised in Table 1.
An adaptive scanning methodology was implemented in an effort to increase the sampling frequency of the incoming wind field while maintaining sufficiently fine spatial resolution. To do so, one full rotation was performed every 11 min and the average wind direction was computed using the velocity azimuth display method applied to each range gate (see Section 3.2.2). The starting and final azimuth angles were then automatically updated to perform 80° sector scans centred on the calculated upstream wind direction. The dynamic scanning strategy implemented in this study meant that the time required to sample the incoming wind field was reduced from 123 s (full PPI scan) to 33 s. To the best of the authors' knowledge, the implementation of a dynamic scanning strategy constitutes a major innovation of this study as such an application has not been reported in the current body of literature. The LiDAR data set is publicly available [30].
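The sector update can be illustrated with a minimal sketch. The function names below are hypothetical and are not part of the LiDAR control software; the only values taken from the text are the 80° sector width and the 3° s$^{-1}$ rotation speed.

```python
def sector_bounds(upstream_dir_deg, width_deg=80.0):
    """Start and end azimuths (degrees, in [0, 360)) of a sector scan
    centred on the upstream wind direction."""
    half = width_deg / 2.0
    start = (upstream_dir_deg - half) % 360.0
    end = (upstream_dir_deg + half) % 360.0
    return start, end


def sweep_time_s(width_deg, rotation_speed_deg_s=3.0):
    """Pure sweep time of the scanning head, ignoring any overhead
    (repositioning, acceleration), which is an assumption of this sketch."""
    return width_deg / rotation_speed_deg_s


start, end = sector_bounds(350.0)   # sector wraps through north
full_sweep = sweep_time_s(360.0)    # pure sweep time of a full rotation
sector_sweep = sweep_time_s(80.0)   # pure sweep time of an 80 degree sector
```

The scan durations reported in the text (123 s and 33 s) are longer than the pure sweep times computed this way; the text does not break down the difference.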

Filtering
First, spurious data within each scan were removed. The carrier-to-noise ratio (also sometimes called the signal-to-noise-ratio) (CNR) is commonly used to filter out noisy data associated with low signal quality [31][32][33][34]. In this study, all data associated with a CNR less than −32 dB (low signal quality at far ranges; [35]) or higher than 0 dB (hard-target) were rejected. Data flagged as erroneous by the manufacturer's internal quality status were also excluded. The maximum valid range was then calculated for each scan as the last range containing at least 50% of valid data, and data from all farther range gates were discarded.
Given that scans associated with poor visibility conditions tend to produce erroneous wind field estimation leading to large forecast errors [31,36], a quality flag was attributed to each scan before further processing. A scan was considered as a suitable input for the LiDAR forecasts if the maximum valid range exceeded 3 km and at least 80% of the observed range was valid. All scans failing to meet these criteria were flagged as erroneous and discarded. The 3 km threshold was established based on the minimum clear vision distance necessary to probe wind conditions five minutes ahead assuming wind velocities averaging 10 m s$^{-1}$ (10 m s$^{-1}$ × 300 s = 3 km). Out of the 641,189 scans in the initial data set, 92,926 (14.49%) were removed following the filtering process outlined above.
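The CNR filter and the scan quality flag described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name and array layout are hypothetical, the manufacturer's status flag is omitted, and "at least 80% of the observed range was valid" is interpreted here as the fraction of valid points within the maximum valid range.

```python
import numpy as np

CNR_MIN_DB = -32.0          # below: low signal quality
CNR_MAX_DB = 0.0            # above: hard target
MIN_RANGE_M = 3000.0        # 10 m/s * 300 s
MIN_VALID_FRACTION = 0.8    # assumed interpretation, see lead-in


def quality_control(radial_velocity, cnr_db, ranges_m):
    """Filter one scan and return (filtered_velocities, max_valid_range, scan_ok).

    radial_velocity and cnr_db have shape (n_azimuth, n_range_gates);
    ranges_m holds the distance of each range gate from the LiDAR.
    """
    radial_velocity = np.asarray(radial_velocity, dtype=float)
    valid = (cnr_db >= CNR_MIN_DB) & (cnr_db <= CNR_MAX_DB)
    filtered = np.where(valid, radial_velocity, np.nan)

    # Last range gate for which at least 50% of the beams are valid.
    gate_fraction = valid.mean(axis=0)
    good_gates = np.flatnonzero(gate_fraction >= 0.5)
    if good_gates.size == 0:
        return np.full_like(filtered, np.nan), 0.0, False
    last = good_gates[-1]

    # Discard everything beyond the maximum valid range.
    filtered[:, last + 1:] = np.nan
    max_valid_range = float(ranges_m[last])

    valid_fraction = valid[:, :last + 1].mean()
    scan_ok = bool(max_valid_range > MIN_RANGE_M
                   and valid_fraction >= MIN_VALID_FRACTION)
    return filtered, max_valid_range, scan_ok
```

Scans for which `scan_ok` is false would be discarded before wind field reconstruction; the remaining gaps inside the valid range are the ones filled by the interpolation step.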
Finally, missing values remaining within the maximum valid range were interpolated using a 2-dimensional (2D) nearest neighbour interpolant [37].

Wind Field Reconstruction
As LiDARs only measure radial velocities, a single point measurement would lead to an infinite number of possible combinations of the Cartesian velocity components [38]. Three-dimensional reconstruction of the wind vectors therefore requires additional models or hypotheses about the flow.
The wind field reconstruction method implemented in this study relies on the assumption that wind direction is homogeneous within every range gate. First, the average wind direction within each range gate, $\omega_{rg}$, was calculated using the velocity azimuth display (VAD) technique [39]. Briefly stated, the radial velocity was expressed solely as a sinusoidal function of the azimuth angle [40], and the mean horizontal wind speed was retrieved from the best fit function on the radial velocities from each range gate circle. Second, the horizontal wind vectors $V_H$ were calculated for each grid point through projection of the radial velocities onto $\omega_{rg}$, using the azimuth angle $\phi$ defined in Figure 1. This wind field reconstruction method has the advantage of being robust and computationally inexpensive. On the other hand, the method will likely fail to characterise complex flow features due to its underlying homogeneity assumption [41].
For illustrative purposes, Figure 4b shows the reconstructed wind field based on a sector scan on 1 November 2019 at 22:38:40 (UTC+10). The figure displays conditions shortly before a large upward ramp associated with a cold front and characterised by a power generation increase of 121 MW (92% of rated power) in 33 min.
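A minimal sketch of the two reconstruction steps for a single range gate is given below. It assumes the common geometric model $V_r = V_H \cos\theta \cos(\phi - \omega)$, with azimuth measured clockwise from north, radial velocity positive away from the LiDAR, and $\omega$ taken as the direction towards which the wind blows. These conventions, the function names and the `min_cos` cut-off for beams nearly perpendicular to the wind are assumptions of the sketch and may differ from those used in the study.

```python
import numpy as np


def vad_direction(azimuth_deg, radial_velocity):
    """Least-squares fit of v_r = a + b*cos(az) + c*sin(az) for one range gate.

    Returns the direction (degrees) towards which the wind blows,
    under the sign conventions stated in the lead-in.
    """
    az = np.deg2rad(azimuth_deg)
    design = np.column_stack([np.ones_like(az), np.cos(az), np.sin(az)])
    (a, b, c), *_ = np.linalg.lstsq(design, radial_velocity, rcond=None)
    return float(np.rad2deg(np.arctan2(c, b)) % 360.0)


def horizontal_speed(radial_velocity, azimuth_deg, direction_deg,
                     elevation_deg=0.6, min_cos=0.2):
    """Project radial velocities onto the fitted wind direction.

    Beams for which the projection factor is smaller than min_cos are
    returned as NaN, since dividing by a near-zero cosine amplifies noise.
    """
    delta = np.deg2rad(np.asarray(azimuth_deg) - direction_deg)
    proj = np.cos(delta) * np.cos(np.deg2rad(elevation_deg))
    proj = np.where(np.abs(proj) < min_cos, np.nan, proj)
    return np.asarray(radial_velocity) / proj
```

In practice the direction would be fitted on the periodic full rotation and then applied to the sector scans, as described in the data collection section.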

Benchmarks: Persistence and ARIMA
The LiDAR forecasts were compared against two standard time series models: persistence and ARIMA. Both benchmarks were based on a one minute resolution power generation time series. The persistence model (P) simply assumes that conditions at the time of the forecast are the same as the current conditions.
Owing to the strong temporal auto-correlation of winds over short time scales [20], the persistence method tends to produce more accurate predictions than most statistical and physical models with a very short-term horizon (seconds to a few minutes; [42]) and remains the industry standard for short-term forecast evaluation to this day [22].
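For a one minute resolution series, the persistence benchmark amounts to shifting the observations by the lead time. The sketch below is illustrative; the function name is hypothetical.

```python
import numpy as np


def persistence_forecast(power_mw, horizon_steps=5):
    """Persistence forecast aligned with its valid time.

    The value at index i is the forecast valid at time i, i.e. the
    observation made horizon_steps earlier. The first horizon_steps
    entries have no forecast and are set to NaN.
    """
    power_mw = np.asarray(power_mw, dtype=float)
    forecast = np.full_like(power_mw, np.nan)
    forecast[horizon_steps:] = power_mw[:-horizon_steps]
    return forecast
```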
The ARIMA (autoregressive integrated moving average) model [43] uses previous observations (AR) and errors (MA) as predictors for future outcomes, as well as differencing operations to ensure stationary transformations (I). Model order estimation was performed automatically using the auto-ARIMA method [44]. The method first assesses stationarity using the augmented Dickey-Fuller test, a standard statistical method used to test non-stationarity in time series through unit root testing [45]. A grid search approach was then implemented to find the optimal set of model parameters minimising the AIC (Akaike information criterion; [46]). In short, the AIC is a model evaluation metric that rewards goodness-of-fit, but penalises over-fitting. The optimum model found was an ARIMA(4,0,2) [47]. The model coefficients were then fit to the training set (1 November 2019-19 August 2020) using the conditional sum of squares likelihood maximisation approach. Finally, out-of-sample predictions were produced using the test set (20 August 2020-01 November 2020), and the value corresponding to a forecast horizon of five minutes was retained.

Smart Persistence
Smart persistence models were first introduced for solar power forecasting applications [48][49][50]. These models incorporate easily predictable solar power generation drivers such as the clear-sky index into the persistence model to improve forecast accuracy. In this study, we extended the concept to LiDAR-based wind power forecasting.
The core principle is to estimate the wind farm ramp rate, i.e., the rate of power generation change over a forecasting window, and to adjust the persistence model accordingly. Mathematically, the smart persistence (SP) is defined as follows:
$$P_{t+h|t} = P_t + \alpha \beta h$$
where $P_{t+h|t}$ is the future power generation for time $t + h$ at time origin $t$ (MW), $P_t$ is the power generation at time $t$ (MW), $\alpha$ is a damping parameter, $\beta$ is the predicted ramp rate (MW min$^{-1}$) and $h$ is the forecast lead time (min) (five minutes).
The ramp rate is calculated as follows. First, the wind fields from the five sector scans closest to the forecast time $t$ and associated with a "valid" quality flag are retrieved. Next, wind vectors are propagated five minutes forward in time using Taylor's frozen turbulence hypothesis [51,52], which assumes that turbulent structures ("eddies") are transported at the rate of the mean wind speed. In other words, wind vectors are propagated forward whilst preserving the same wind speed and direction over time. Similar propagation models have been applied in numerous remote sensing forecasting studies (e.g., [19,22,26,27,28,53]).
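The propagation step can be sketched as a simple advection in which every wind vector keeps its speed and direction. The function name and the Cartesian layout (positions in metres, velocity components in m s$^{-1}$) are assumptions of this sketch.

```python
import numpy as np


def advect(x_m, y_m, u_ms, v_ms, lead_time_s=300.0):
    """Move each wind vector along its own velocity for lead_time_s seconds.

    Speed and direction are left unchanged (frozen turbulence), so only
    the positions are updated.
    """
    x_new = np.asarray(x_m, dtype=float) + np.asarray(u_ms, dtype=float) * lead_time_s
    y_new = np.asarray(y_m, dtype=float) + np.asarray(v_ms, dtype=float) * lead_time_s
    return x_new, y_new
```

With a 10 m s$^{-1}$ wind, a vector sampled 3 km upstream of the farm arrives at the farm after five minutes, which is consistent with the 3 km minimum range used in the scan quality flag.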
Propagated wind vectors are then spatially averaged before being converted to turbine-equivalent power generation (TEPG) using an in situ power curve. The approach that consists of converting forward-propagated wind fields to turbine generation using in situ power curves was first implemented in [26]. In this study, the power curve was generated using ten minute averaged turbine generation and met mast wind speed measurements collected from 06 April 2017 to 12 January 2019 and grouped into 0.5 m s$^{-1}$ bins (Figure 5). A ten minute averaging window was required to ensure converged power values. We then computed the wind farm generation by multiplying the TEPG by the number of turbines actively generating. This accounts for times when not all turbines are available for generation, e.g., when a portion of the wind farm is under maintenance.
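A bin-averaged power curve and the conversion to wind farm generation can be sketched as follows. The function names are hypothetical, and the linear interpolation between bin centres is an assumption of the sketch rather than a detail given in the text.

```python
import numpy as np


def binned_power_curve(wind_speed_ms, turbine_power_mw, bin_width=0.5):
    """Average turbine power within wind speed bins of width bin_width.

    Returns the bin centres and the mean power in each populated bin.
    """
    ws = np.asarray(wind_speed_ms, dtype=float)
    p = np.asarray(turbine_power_mw, dtype=float)
    bin_index = np.floor(ws / bin_width).astype(int)
    centres, means = [], []
    for k in np.unique(bin_index):
        centres.append((k + 0.5) * bin_width)
        means.append(p[bin_index == k].mean())
    return np.array(centres), np.array(means)


def farm_power(mean_speed_ms, centres, means, n_turbines):
    """Turbine-equivalent power from the curve, scaled by available turbines."""
    tepg = np.interp(mean_speed_ms, centres, means)
    return tepg * n_turbines
```

For example, `farm_power(speed, centres, means, 64)` would give the estimate with all 64 turbines available, and a smaller count would be used while part of the farm is under maintenance.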
The steps above were carried out for all five LiDAR scans, and the predicted ramp rate β was determined as the slope of the best linear fit of the resulting power estimates with respect to time. Finally, the optimum value for the damping parameter α (0.27) was derived empirically, minimising the mean absolute error based on the training data set (1 November 2019-19 August 2020).
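The final step can be sketched as below, assuming the forecast takes the form of the current generation plus the damped ramp rate multiplied by the lead time. The function name is hypothetical; the default damping value is the one reported in the text.

```python
import numpy as np


def smart_persistence(p_now_mw, scan_times_min, scan_power_mw,
                      alpha=0.27, horizon_min=5.0):
    """Smart persistence forecast from the power estimates of five scans.

    scan_times_min are the valid times (minutes) of the LiDAR-derived
    power estimates and scan_power_mw the estimates themselves (MW).
    The ramp rate beta (MW/min) is the slope of a linear fit.
    """
    beta, _intercept = np.polyfit(scan_times_min, scan_power_mw, 1)
    return p_now_mw + alpha * beta * horizon_min
```

With `alpha = 0` the expression reduces to the plain persistence benchmark, so the damping parameter controls how much of the LiDAR-derived trend is trusted.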

Deep Convolutional Neural Network
The second approach tackles the same short-term forecasting challenge using convolutional neural networks (CNNs). Briefly stated, a neural network is a computational model that uses back-propagation algorithms [54] to update its parameters (weights and bias) so as to minimise a given cost function. The fundamental building blocks of a neural network, also called "nodes", connect an input variable to an output variable via a transfer function (e.g., rectified linear activation function or "ReLU"; [55]). The overall tendency is towards deeper and more intricate model architectures [56], hence the denomination "deep convolutional neural networks" (DCNNs).
The DCNN architecture developed as part of this study is shown in Figure 6. As for the SP forecast, the DCNN model uses as inputs the wind fields from five valid sector scans, the power generation ("$P_t$") and the number of turbines available ("num_WT"), with the output being the wind farm generation at time $t + h$ ("$P_{t+h|t}$"). The wind fields are stacked together to form a multi-channel 2D image of shape 30 × 159 × 5 (each wind field comprises wind speed data from 30 azimuth angles and 159 range gates). Given that the wind sector being probed by the LiDAR cannot be learned directly from the input images, wind direction data ("wd") is provided separately as numerical input. All data were then normalised to a [0, 1] range to improve model convergence and training time [57].
The LiDAR scan images were processed through seven two-dimensional convolution layers (Conv2D) with 3 × 3 convolution kernels. The number of filters within each layer ranged between 16 and 128. Such a model design was inspired by previous research suggesting deep model architectures combined with small receptive fields of a 3 × 3 convolution window yield superior outcomes [58]. To reduce the number of trainable parameters and the risk of over-fitting, the first two layers of the CNN were down-sampled using a two-dimensional max pooling layer (MaxPooling2D) [59] with a 2 × 2 kernel. A second input channel processed the remaining numerical data through a standard multi-layer perceptron (MLP). Finally, the two model branches were combined and connected to the output via a regression node with linear activation. The resulting DCNN comprised 669,577 trainable parameters.
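The preparation of the image input can be sketched with NumPy alone. The function names are hypothetical, and taking the scaling bounds from the training data is an assumption of the sketch; the text only states that all data were normalised to a [0, 1] range.

```python
import numpy as np


def stack_scans(wind_fields):
    """Stack five (30, 159) wind speed fields into one (30, 159, 5) image."""
    return np.stack(wind_fields, axis=-1)


def minmax_scale(x, lo, hi):
    """Scale x to [0, 1] given fixed bounds lo and hi."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)


def minmax_inverse(x_scaled, lo, hi):
    """Convert scaled values back to physical units."""
    return np.asarray(x_scaled, dtype=float) * (hi - lo) + lo
```

The inverse transform is the operation needed to convert the scaled network output back to power in MW.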
The data set was split into training, validation and test segments following recommendations in [59]: 60% was used for model training (1 November 2019-07 June 2020), 20% for validation (08 June 2020-19 August 2020) and the remaining 20% for testing (20 August 2020-01 November 2020). The division into the training, validation and test sets was done chronologically because wind field properties are strongly auto-correlated over short time scales, and randomising the split would result in over-fitting. The DCNN forecast was implemented through the Keras framework [59] using TensorFlow [60] as the backend engine. The model was trained with an HPC-Cloud Hybrid System [61] utilising a single graphics processing unit (GPU) core with 64 GB memory. Different optimizers were tested for the iterative update of the network's weights based on the training data set, including RMSprop [62], stochastic gradient descent [63], Adam [64] and Nadam [65]. The Adam algorithm with a learning rate of $10^{-5}$ and the mean absolute error as the loss function was empirically chosen. The model was trained over 200 epochs with a batch size of 64. Scaled forecasts were obtained by processing the inputs from the test data set using the model weights associated with the lowest validation loss. The output was finally converted back to power through inverse feature scaling.
Figure 7 shows in green the distribution of valid forecast times throughout the assessment period (20 August 2020-01 November 2020). These account for 84.19% of the series in terms of temporal coverage. Periods discarded from the assessment period included (1) times when fewer than five valid sector scans could be retrieved within the five minutes before the forecast (11.31%; black lines in Figure 7) and (2) periods of abnormal wind farm operation (outages and negative or curtailed generation; 4.50%; red lines in Figure 7).
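The chronological 60/20/20 division described in this section can be sketched as an index split that preserves time order. The function name is hypothetical, and the sketch assumes the samples are already sorted by time.

```python
import numpy as np


def chronological_split(n_samples, train_frac=0.6, val_frac=0.2):
    """Return index arrays for training, validation and test segments.

    No shuffling is applied: the earliest samples are used for training
    and the latest for testing.
    """
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    idx = np.arange(n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```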

Model Evaluation
As discussed in Section 3.3.1, the novel LiDAR-based forecasts (SP and DCNN) were assessed against P and ARIMA to determine in which proportion they could improve over these techniques. All forecasts were computed at one minute resolution, and the gain relative to the benchmarks was quantified as follows:
$$\%\mathrm{Imp}_{bench} = \frac{\varepsilon_{bench} - \varepsilon_{LF}}{\varepsilon_{bench}} \times 100$$
where $\varepsilon_{LF}$ and $\varepsilon_{bench}$ are the errors from the LiDAR-based forecasts and the benchmarks, respectively. There is currently no consensus on which error metric is superior when it comes to forecast evaluation. In this study, the forecasts were assessed using the two most frequently used error metrics, namely the mean absolute error (MAE) and the root-mean-squared error (RMSE). The main advantage of the MAE is as an intuitive characterisation of the average forecast error [66]. In contrast, the RMSE tends to give higher weight to the larger errors and is especially valuable when significant errors are highly undesirable. Another model evaluation metric used in this study is the daily outperformance percentage (%DO), defined as the percentage of days recording improvement over the benchmark within the assessment period.
We also assessed the performance of the forecasts under various atmospheric stability regimes. In order to estimate stability conditions at the site, the Monin-Obukhov length ($L$) was calculated from the Richardson number. Since only wind speed measurements at one height were available throughout the reporting period, we calculated the surface-layer Richardson number ($RI_s$) as follows [67]:
$$RI_s = \frac{g}{\bar{\theta}_v} \, \frac{\Delta \bar{\theta}_v / \Delta z_\theta}{\left( \bar{U} / \Delta z_u \right)^2}$$
in which $g$ is the gravitational acceleration, $\bar{\theta}_v$ is the virtual potential temperature, $\Delta \bar{\theta}_v$ is its difference between the two temperature measurement heights, $\bar{U}$ is the wind speed measured at 80 m AGL, $\Delta z_\theta$ = 75.8 m and $\Delta z_u$ = 80 m. We then related $RI_s$ to $L$ using the relation given in [68], in which $z_m$ is the geometric mean of the heights used for calculating $RI_s$. $L$ was calculated every ten minutes using wind speed and temperature measurements averaged over a ten minute moving window.
Atmospheric stability conditions at the site were determined using Monin-Obukhov lengths divided into three classes: conditions at the site were considered stable if 0 m < $L$ < 200 m, unstable if −200 m < $L$ < 0 m and near-neutral in all other cases [69].
We further evaluated the performance of the forecasts independently during wind power ramps, i.e., when large forecast errors were likely to occur. The ramp identification approach followed that of a previous study at the site focusing on ramp characterisation [70], where ramps were identified as the 1% strongest variations of the wind power time series based on continuous wavelet analysis. Further details about the ramp characterisation can be found in [70]. We used three binary ramp detection statistics, namely the forecast accuracy (FA), ramp capture (RC) and critical success index (CSI), defined as follows [7]:
$$\mathrm{FA} = \frac{TF}{TF + FF}, \qquad \mathrm{RC} = \frac{TF}{TF + MR}, \qquad \mathrm{CSI} = \frac{TF}{TF + FF + MR}$$
in which TF (true forecast) refers to the correct identification of a ramp event (prediction = 1, truth = 1), FF (false forecast) is the number of false positives (prediction = 1, truth = 0) and MR (missed ramp) is the number of ground truth ramp events not detected by the forecast (prediction = 0, truth = 1). We further examined the ramp detection skill using the amplitude error $\varepsilon_a$, duration error $\varepsilon_d$ and timing error $\varepsilon_t$ defined by:
$$\varepsilon_a = \Delta P_{pred} - \Delta P_{truth}, \qquad \varepsilon_d = \Delta T_{pred} - \Delta T_{truth}, \qquad \varepsilon_t = t_{pred} - t_{truth}$$
in which $\Delta P$ is the ramp amplitude, $\Delta T$ is the ramp duration (rise time) and $t$ is the ramp time at the centre of the ramp.
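The evaluation metrics can be sketched as follows. The function names are hypothetical. The sketch assumes that the improvement is the relative error reduction with respect to the benchmark and that FA and CSI follow their usual contingency-table definitions, TF/(TF + FF) and TF/(TF + FF + MR); inputs with no ramps at all (zero denominators) are not handled.

```python
import numpy as np


def mae(obs, pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(obs) - np.asarray(pred))))


def rmse(obs, pred):
    """Root-mean-squared error."""
    return float(np.sqrt(np.mean((np.asarray(obs) - np.asarray(pred)) ** 2)))


def improvement_pct(err_forecast, err_benchmark):
    """Relative error reduction (%) of a forecast over a benchmark."""
    return 100.0 * (err_benchmark - err_forecast) / err_benchmark


def stability_class(obukhov_length_m):
    """Stability class from the Monin-Obukhov length L (m)."""
    if 0.0 < obukhov_length_m < 200.0:
        return "stable"
    if -200.0 < obukhov_length_m < 0.0:
        return "unstable"
    return "near-neutral"


def ramp_scores(pred_ramp, true_ramp):
    """FA, RC and CSI from binary ramp indicators (1 = ramp, 0 = no ramp)."""
    pred = np.asarray(pred_ramp, dtype=bool)
    truth = np.asarray(true_ramp, dtype=bool)
    tf = int(np.sum(pred & truth))     # true forecasts
    ff = int(np.sum(pred & ~truth))    # false forecasts
    mr = int(np.sum(~pred & truth))    # missed ramps
    fa = tf / (tf + ff)
    rc = tf / (tf + mr)
    csi = tf / (tf + ff + mr)
    return fa, rc, csi
```

For instance, `improvement_pct(mae(obs, lidar), mae(obs, bench))` would give the MAE-based gain of a LiDAR-based forecast over a benchmark, where `obs`, `lidar` and `bench` are aligned one minute series.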

Results and Discussion
The distribution of five minute power changes throughout the assessment period is displayed in Figure 8. The histogram provides valuable insights on the degree of volatility expected in the forecasts. For example, it is shown that $|\Delta P_{5min}|$ < 1 MW corresponds to 42% of the time series in terms of temporal coverage. The relatively narrow distribution observed in Figure 8 also supports the use of P as a benchmark in this study.
The key results for all forecasts (P, ARIMA, SP and DCNN) are presented in Table 2. It is shown that the MAE and RMSE of the LiDAR-based forecasts were lower compared to their benchmarks. The DCNN model exhibited superior accuracy for all error metrics presented in Table 2, with notably 90% of the days reporting improvement over persistence in terms of RMSE. ARIMA and SP showed a similar %DO$_P$ according to the RMSE, and ARIMA outperformed SP according to the MAE.
To better understand the features driving LiDAR-based forecast accuracy, Table 3 presents the gains relative to the benchmarks broken down into wind sectors, stability conditions and periods (all/ramp/no-ramp). %Imp$_P$ and %Imp$_{ARIMA}$ were positive for all presented categories, demonstrating the effectiveness of the LiDAR-based models. We also observed that the DCNN model always outperformed SP except for two cases shown in bold in Table 3.
Results in Table 3 suggest improved performances of the LiDAR-based forecasts under westerly and southerly wind regimes. Figure 9 shows the complex relationship between daily %Imp$_P$ and wind direction in greater detail. We hypothesised that the observed variability was due to a combination of three factors. First, differences in LiDAR sampling elevation influenced the LiDAR-forecast performance. Figure 4a indicates the sampling height of the flow field was approximately 100 m AGL for north- and east-facing scans, but could reach up to 400 m AGL for southward scans.
Secondly, the forests and the hilly topography to the north were likely to induce complex surface layer flows that might not be adequately captured by the LiDAR-based models. Thirdly, easterly winds were characterised by little variability compared to other wind sectors. Only two out of the 96 identified ramps were associated with easterly winds, whereas 43 originated from the west sector. As persistence inherently performs well over stationary wind regimes, a lower %Imp_P with easterlies was expected. Note that the prevalence of westerly ramps was associated with mesoscale and synoptic-scale frontal systems characteristic of southeast Australia [70]. The considerations above underline the potentially critical role of terrain roughness, topography and synoptic-scale meteorology, adding another layer of complexity to short-term forecasting. Table 3 also indicates that the LiDAR-based forecast skill was generally lower under stable atmospheric conditions than under unstable and neutral conditions. Much like for easterly winds, we attributed this behaviour to differences in power variability. Indeed, the standard deviation of power generation under stable, neutral and unstable conditions was 38 MW, 40 MW and 45 MW, respectively. Therefore, the lower %Imp_P under stable conditions could be explained by a reduced variability benefiting the P forecast. We also generally observed lower performance improvements under unstable conditions compared to neutral conditions. A possible explanation for this is that changes occurring ahead of the wind farm are less likely to be transmitted downwind under unstable conditions due to the erratic nature of turbulent eddies [71]. Again, we wish to point out that the interpretations above are only speculative, and further investigations are required to verify these postulates. A total of 96 ramps were identified throughout the assessment period, amongst which 43 were downward ramps.
The ramp amplitudes (maximum power change) and rise times varied between 21% and 88% of the rated capacity and 5 to 56 min, respectively, as seen in Figure 10. Results in Table 3 indicate that the performance of the LiDAR-based forecasts was significantly higher during ramp conditions, with a %Imp_P,MAE of 18.59% reported for the DCNN model. This behaviour resulted from both an increased ability of the models to predict ramps and a reduced performance inherent to P during variable conditions. To further investigate the effect of ramps on forecast performances, Table 4 presents a comprehensive list of ramp-specific statistics. By definition, P will detect the exact same ramps as the ground truth with a lag equal to the forecast horizon. Such a behaviour is reflected in Table 4, where P reports a five-minute timing error, near-perfect binary ramp detection metrics and near-zero amplitude and duration errors. Note that the deviations from the theoretical values were due to falsely identified ramps (FF) resulting from communication outages. Despite being associated with a lower MAE and RMSE, the LiDAR-based forecasts exhibited worse performance relative to their benchmarks in terms of FA, RC, CSI, $\varepsilon_a$, $\varepsilon_t$ and $\varepsilon_d$. This illustrates the inability of the models to accurately predict changes in the sign of the rate of power change ("turning points"). That is, the LiDAR-based forecasts were limited by their tendency to under- and overshoot at turning points, despite their superior overall accuracy during ramps. Finally, we assessed the forecast residuals. Figure 11 shows the box-plot distributions of the forecast errors (observations minus predictions) for the four methods, where the edges of the boxes indicate the 25th and 75th percentiles and the whiskers extend from the 5th to the 95th percentiles. P exhibits the widest distribution, followed by ARIMA and SP.
In line with previous analyses, the DCNN model displayed the smallest margins, with a [5th, 95th] percentile range of [−6.31 MW, 6.11 MW]. The near-zero (<0.1 MW) average reported for all methods suggested that none of the models suffered from a significant over- or under-estimation bias.
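The error metrics and residual statistics discussed above can be reproduced with a short Python sketch. It assumes that the percentage improvement is the relative reduction of an error metric with respect to the benchmark, and that persistence at a one-step horizon simply reuses the last observation. The sample values and function names are purely illustrative and are not taken from the study.

```python
import math
import statistics


def mae(observed, predicted):
    """Mean absolute error between aligned observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root-mean-squared error between aligned observations and predictions."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def pct_improvement(model_error, benchmark_error):
    """Relative error reduction over a benchmark, in percent (assumed definition)."""
    return 100.0 * (benchmark_error - model_error) / benchmark_error


def residual_summary(observed, predicted):
    """Mean bias and 5th/95th percentiles of residuals (observations minus predictions)."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    # n=20 yields cut points every 5%; the first and last are the 5th and 95th.
    cuts = statistics.quantiles(residuals, n=20, method="inclusive")
    return {"mean": statistics.fmean(residuals), "p05": cuts[0], "p95": cuts[-1]}


if __name__ == "__main__":
    power = [50.0, 52.0, 55.0, 53.0, 50.0, 48.0]   # illustrative MW values
    truth = power[1:]                               # values to be forecast
    persistence = power[:-1]                        # last observation carried forward
    model = [51.0, 54.0, 54.0, 51.0, 49.0]          # hypothetical model output

    print("MAE persistence:", mae(truth, persistence))
    print("MAE model:", mae(truth, model))
    print("%Imp over persistence (MAE):",
          pct_improvement(mae(truth, model), mae(truth, persistence)))
    print("Residual summary:", residual_summary(truth, persistence))
```

A mean residual close to zero together with a narrow [5th, 95th] percentile range corresponds to the behaviour reported for the DCNN model in Figure 11.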
The normality of the LiDAR-based forecasts' residuals was further assessed through Q-Q (quantile-quantile) plot analysis. Figure 12 shows the relationship between the distribution of the residuals (sample quantiles) and a normal distribution (theoretical quantiles) for the SP and DCNN forecasts. The graph indicates that the errors from both forecasts followed an approximately normal, light-tailed distribution. The analysis above suggested that the model errors were normal and uncorrelated, which is a good indication of the effectiveness of the proposed approaches.
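As an illustration of the Q-Q analysis, the following Python sketch pairs sorted residuals with standard normal quantiles and summarises the straightness of the resulting plot with a correlation coefficient. The plotting positions (i − 0.5)/n are one common convention and are an assumption here, as are the function names and the synthetic residuals. The procedure behind Figure 12 may differ.

```python
import math
from statistics import NormalDist, fmean


def normal_qq_points(residuals):
    """Return (theoretical, sample) quantile pairs for a normal Q-Q plot."""
    n = len(residuals)
    sample = sorted(residuals)
    standard_normal = NormalDist()
    # Assumed plotting positions (i - 0.5) / n for i = 1..n.
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sample))


def qq_correlation(residuals):
    """Pearson correlation of the Q-Q points; values near 1 suggest normality."""
    pairs = normal_qq_points(residuals)
    xs = [t for t, _ in pairs]
    ys = [s for _, s in pairs]
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Synthetic residuals in MW, drawn from a normal distribution for illustration.
    synthetic = NormalDist(mu=0.0, sigma=3.0).samples(500, seed=1)
    print("Q-Q correlation:", qq_correlation(synthetic))
```

Points lying close to a straight line indicate agreement with the normal distribution. A Q-Q plot describes distributional shape only, so serial correlation of the residuals would need a separate diagnostic such as the autocorrelation function.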

Conclusions
Intermittent power generation sources such as solar and wind present technical challenges associated with their inability to contribute to electric system security (i.e., providing frequency control, dispatchability and inertial response). Accurate wind power forecasts could help overcome this issue by making the wind farm a more controllable resource. This work introduced two novel methodologies using LiDAR remote sensing to predict power generation at the wind farm scale with a lead time of five minutes. The first builds upon the so-called "smart persistence" approach and uses multiple near-horizontal sector scans to retrieve the wind farm ramp rate, i.e., the expected rate of power generation change over the forecast window. The second addresses the same forecasting problem through the development and training of a deep convolutional neural network. The two models presented were shown to outperform the persistence and ARIMA benchmarks in terms of the MAE and RMSE. The effectiveness of the proposed methods was particularly evident during ramp events, during which a 19% MAE improvement over the persistence benchmark was notably reported.
The study endeavoured to objectively assess the performance of LiDAR-based forecasts throughout an extended time period and is one of the first studies to predict onshore power generation at the wind farm scale using remote sensing. This approach contrasts with previous studies, which generally focused on wind power from a limited number of turbines under ideal conditions. This work also implemented a dynamic scanning strategy to acquire high-resolution upfield wind field measurements. The methods presented in this paper could be applied to different wind farms with a similar setup, provided a sufficient amount of data is available for model training. In particular, the LiDAR-based forecasts are expected to perform at least equally well at offshore wind farm locations, where the effects of topography are less significant.
While this study presented a compelling argument for integrating remote sensing measurements into short-term wind power forecasts, some limitations need to be considered. A significant limitation is the varying measurement range of the LiDAR depending on atmospheric conditions. In the present research, scans associated with poor visibility accounted for up to 11.31% of the LiDAR data set. During these periods, the LiDAR was essentially "blind", and fallback mechanisms must be implemented. In addition, the Taylor frozen turbulence hypothesis used for the SP forecast may not accurately reflect the complexity of atmospheric processes within a semi-complex topography. Such a model could be improved using computational fluid dynamics modelling. Further improvements of the method could also come from incorporating data from the upstream edge of the farm in the models. Finally, the present study only focused on five-minute-ahead predictions, and future studies should also target other lead times.
The innovative nature of remote sensing forecasting can be viewed as both an opportunity and a challenge. The opportunity is that the full potential of remote sensing for minute-scale forecasting is yet to be discovered. As a result, more efforts should be directed towards exploring various model formulations to identify the best performing approaches.

Data Availability Statement: The LiDAR data set was published in [30]. Remaining wind farm data are confidential and therefore not publicly available.