Wind Speed Modeling by Nested ARIMA Processes

: Wind speed modelling is of increasing interest, both for basic research and for applications, as, e.g., for wind turbine development and strategies to construct large wind power plants. Generally, such modelling is hampered by the non-stationary features of wind speed data that, to a large extent, reﬂect the turbulent dynamics in the atmosphere. We study how these features can be captured by nested ARIMA models. In this approach, wind speed ﬂuctuations in given time windows are modelled by one stochastic process, and the parameter variation between successive windows by another one. For deriving the wind speed model, we use 20 months of data collected at the FINO1 platform at the North Sea and use a variable transformation that best maps the wind speed onto a Gaussian random variable. We ﬁnd that wind speed increments can be well reproduced for up to four standard deviations. The distributions of extreme variations, however, strongly deviate from the model predictions.


Introduction
When planning wind power plants, the choice of its location as well as the precise positions of the composing wind turbines depend on estimates of wind speed at those particular spots. By deriving efficient models that can, under certain accuracy, convert wind speed estimates into wind power estimates, one is able to forecast deficit and excess of generated power and decide in advance the amount of necessary energy to handle in the market and for maintaining stable functioning of power grids.
ARIMA (Autoregressive Integrated Moving Average) models have been often used in the last decades [1] for modelling wind speed and wind power variations over large time lags, typically of the order of one hour. They are simple to implement and, if of low order, can be interpreted as discretised forms of differential equations. They were used for modelling output of wind power plants in connection with investment strategies in the energy market [2], variations of energy demand [3], and monthly variations of wind power generation [4]. For describing fluctuations, on shorter time-scales of about 10 min, ARIMA models could be applied by increasing the number of regression and moving average terms (higher order) and introducing frequency decomposition [5]. One main advantage of ARIMA models is their cheap computational cost, since they are based on simple iterative procedures [6].
More sophisticated recursive models are the so-called artificial neural networks, which include nonlinear terms and derive the weight of each particular contribution through a learning process, based on previous sets of measurements. They have been applied also for wind speed prediction [7]. Though more precise in value prediction than standard ARIMA models [8,9], artificial neural networks are difficult to interpret due to their higher complexity.
For this reason, ARIMA models remain as a widely used approach for wind speed modelling [10], sometimes in combination with other methods. For example, hourly and ten minute averages of wind speed data collected in Mexico were modelled with an ARIMA model and compared with a neural network that uses additional exogenous meteorological variables [8]. Though using considerably more data, neural networks seem to guarantee improvements not better than 6%. ARIMA models with exogenous variables have been also developed [11] for modelling 15 min average wind speed. By studying wind speed series from China, it has been found that an ARIMA model combined with empirical mode decomposition has a better accuracy in comparison with neural networks and other machine learning approaches [12], achieving a mean absolute error smaller than 4%. However, this accurary is obtained for ten minute averages. For one hour, the accuracy of all models decreases considerably. A further approach, using empirical mode decomposition in combination with neural networks, was followed in [13]. In a recent review, different approaches for modelling ultra-short, short, medium and long-term wind speed fluctuations are discussed [14], including neural networks, ARIMA models and wavelet methods.
With respect to wind data analysis, there are two major issues that challenge the application and usefulness of ARIMA models. First, wind speeds are not Gaussian distributed, but typically can be well described by a Weibull distribution [15]. Secondly, the distribution of wind speed increments is also not Gaussian, showing fat tails [16]. These issues raise the question whether these problems can be resolved by a refined approach based on ARIMA processes. In this paper, we investigate up to which extent ARIMA models can reproduce statistical features of wind speed data, by addressing both these issues. We focus on 15 min of wind speed measurements collected at the FINO1 platform at the North Sea [17], see Figure 1a,b. These measurements were performed over a period of 20 months with a sample frequency of 1 Hz. The choice of 15 min is motivated by forecasting strategies in energy markets, which commonly use predictions of energy production 15 min ahead. To address the non-Gaussianity of wind speed distributions, we consider a transformed variable, which is a power of the wind speed. The respective exponent is determined by the requirement that the distribution of the transformed variable fits a Gaussian distribution as best as possible. The corresponding optimisation procedure is outlined in the Appendix A, where we provide a table also of exponent values as a function of the two parameters specifying the Weibull distribution. This can be used for other wind speed data sets.
We find that the best ARIMA model for the wind speed data on the time scale of 15 min is an ARIMA(0, 1, 2) model. The coefficients change for each 15 min window and are also modelled by ARIMA processes. We call this combination of the ARIMA(0, 1, 2) model with the ARIMA models for its coefficients a nested ARIMA model and show that it is able to predict distributions of wind speed increments over a range of up to four standard deviations. Beyond this range, however, this approach has its limits. The methodology resembles that of the so-called superstatistics approach [18] and earlier work on the distribution of velocity increments in turbulent flows [19]. Distribution parameters have been modelled with stochastic differential equations, for instance to model the evolution of volume-price distributions in the stock exchange [20,21]. Stochastic differential equations are a further means of data modelling and can in general be interpreted straightforwardly. They have been recently applied for describing short-time fluctuations of wind power outputs in wind turbines and wind farms [22,23].
In Section 2, we describe the ARIMA model for the wind speed and its parameters. In Section 3, we first describe the data used and the pre-processing of these data. Then, we derive the ARIMA model that best suits the set of measurements on the scale of 15 min. In Section 4, we introduce the nested ARIMA model and compare its predictions for the distributions of wind speed increments with the corresponding distributions of the measured values. In Section 5, we summarise our main results and discuss their implications for future research.

ARIMA-Model
ARIMA model stands for Autoregressive Integrated Moving Average model [10], where the evolution of a variable x in discrete time t is described as (1) Here, p, d and q are the parameters quantifying the order of the model, usually represented as (p, d, q), φ j (j = 1, . . . , p) and θ k (k = 1, . . . , q) are coefficients, and w k are uncorrelated Gaussian random numbers with zero mean and variance σ 2 w . The operator ∆ corresponds to a discrete derivative and is defined as Without loss of generality, the mean of x t is zero. Parameter d specifies the minimum number of "differentiations" needed to map the original (possibly non-stationary) process onto a stationary process. Parameter p specifies the number of previous x values entering the ARIMA process and defines the order of the auto-regressive (AR) part of the model. Parameter q gives the number of previous additive Gaussian white noise terms and defines the order of the moving average (MA) part of the model. For modelling a time series {x t } 1≤t≤N , an ARIMA process can be derived by estimating its order (p, d, q), the sets φ = {φ j } 1≤j≤p , θ = {θ k } 1≤k≤q of coefficients and the noise strength σ w by a maximum likelihood estimator in combination with the Bayesian Information Criterion [24]. The corresponding procedure is illustrated in Figure 2 for a test series {x (test) t } 1≤t≤N of N = 10 5 values generated by Equation (1) with (p, d, q) = (2, 1, 2), σ w = 0.1, and coefficients φ 1 = 0.5, φ 2 = −0.1, θ 1 = 0.3, and θ 2 = −0.2. The test series is shown in Figure 2a, and in Figure 2b the series of its increments ∆x t−1 ] is plotted; the corresponding distributions are shown in the right panels of these graphs.
To estimate parameter d, the unit-root test [10] is performed, starting with d = 0. In case it fails, the order d is increased by one and the test repeated. The first value of d for which ∆ d x t fulfils the unit-root test is chosen. For the series {x (test) t } 1≤t≤N , this yielded the correct value d = 1.
To determine q and p, first maximal values τ q and τ p of these parameters are estimated from analysing the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the time series {x t } 1≤t≤N , respectively [10,25]. The ACF measures the correlation between two increment values ∆ d x t and ∆ d x t+τ and, for d = 1; it is given by where σ 2 is the variance of the increments ∆x t . In general, the ACF must be defined with time-dependent variances σ 2 x (t), σ 2 x (t + τ), but in this case the time series ∆x t is stationary and σ 2 becomes time-independent. For the test series, the ACF is shown in Figure 2c. The upper bound τ q of q is estimated from the smallest value of τ, where the ACF becomes negative, i.e., τ q = min τ {γ(τ) < 0}. In Figure 2c, the corresponding value τ q = 3 is indicated by the arrow.
with the corresponding histogram ρ(∆x). This series fulfils the unit-root test [10] and we show in (c) its autocorrelation function (ACF) and in (d) its partial autocorrelation function (PACF). From these data, the range 0 ≤ p ≤ τ p and 0 ≤ q ≤ τ q of possible (p, q) values are estimated (see text). Within these ranges, the Bayesian Information Criterion (BIC) is used to obtain an optimal (p, q) choice. (e) 30 pieces of predicted data from the ARIMA model adapted to the test series and the model of maximum persistence in comparison with the values generated by the ARIMA(2, 1, 2) process. In the right panel, the distributions of the deviations between the predicted and generated values are shown. (f) Accuracy of the prediction of the parameters (p, d, q) dependent on the size N of the time series. This accuracy is given by the percentage of correct estimates, when generating n = 250 test series with N data points by using Equation (1).
The PACF quantifies the direct correlation between x t and x t−τ after removing the linear dependence on the intermediate values x t−k with k = 1, . . . , τ − 1. Specifically, the PACF ϕ(τ) is the last component of the vector [25] For the test series, the PACF is shown in Figure 2d. We estimate the upper bound τ p from a least square fit of |ϕ(τ)| to an exponential decay |ϕ(0) The corresponding value τ p = 4 is indicated by the arrow in Figure 2d.
Given d as well as p and q in 0 ≤ p ≤ τ p and 0 ≤ p ≤ τ q , a maximum likelihood function estimator can be defined, based on assumptions regarding Gaussian statistics of residuals and by using reasonable initialisations of the recurrence Equation (1) (see, e.g., reference [10] for details). The likelihood function L X (φ, θ, σ w ) estimates the probability of finding the time series X = {x t } 1≤t≤N realised for the sets φ = {φ j } 1≤j≤p , θ = {θ k } 1≤k≤q of coefficients and the noise strength σ w in the ARIMA process (1). Maximising L X (φ, θ, σ w ) with respect to the model parameters, best estimatorŝ φ(p, q),θ(p, q), andσ w (p, q) are obtained for each p and q.
The maximum likelihood method yields parameter sets for each p and q, but one still needs to decide which of the p and q in the ranges 0 ≤ p ≤ τ p and 0 ≤ p ≤ τ q should be selected. Larger p and q will give better maximised likelihoods, but at the price of a higher complexity of the model. To take into account both (i) how well the models fit the data, quantified by the likelihood L(φ(p, q),θ(p, q),σ w (p, q)), and (ii) how many parameters are needed for obtaining this likelihood, the Bayesian Information Criterion (BIC) [24] B X (p, q) = (p + q + 1) ln N − 2 ln L X (φ(p, q),θ(p, q),σ w (p, q)) is often introduced. The best values p and q are the ones yielding the smallest B X (p, q). For the test series {x (test) t } 1≤t≤N , we obtain the correct order (p, d, q) = (2, 1, 2) and coefficients values that, within the numerical uncertainties, agree with the values used for generating the series:φ 1 = 0.49 ± 0.03, φ 2 = −0.101 ± 0.005,θ 1 = 0.31 ± 0.03,θ 2 = −0.20 ± 0.02, andσ 2 w = 0.01. The error associated to the variance of the Gaussian noise can be estimated as SE(σ 2 w ) =σ 2 w 2/(N − 1) ∼ 4 × 10 −5 . With this estimation of the parameters, it is clear that the respective adapted ARIMA(2, 1, 2) model withφ,θ andσ w would give predictions within an uncertainty that almost agrees with the intrinsic noise of the ARIMA(2, 1, 2) process used for generating the test series. It is instructive to compare corresponding predictions with the simplest prediction model, the maximum persistence model witĥ x t+1 = x t . In Figure 2e, we show, as an example, 30 predicted data values of the adapted ARIMA(2, 1, 2) process and that of the maximum persistence model in comparison with additional data (N = 10 5 + k, k = 1, 2, . . .) generated for the test series. In the right panel of this figure, the distributions of the deviations between predicted and generated data are shown. The smaller width of the distribution for the adapted ARIMA model shows that it significantly improves the prediction quality over the maximum persistence model.
To evaluate the accuracy of the method described above for determining p and q, we generated a number n of test series {x (j) t } 1≤t≤N , j = 1, . . . , n, for the same ARIMA(2, 1, 2) model and determined the fraction of series, where the above analysis recovers the correct values. The so estimated accuracy will increase with the size N of the test series. This is shown in Figure 2f, where the accuracy, estimated from n = 250 test series, is plotted as a function of N for the case (p, d, q) = (2, 1, 2). It can be seen that for N 10 3 , the accuracy becomes larger than 90%.

ARIMA Model for Wind Speed Measurements
We analyse data collected at the FINO1 platform, which is located about 45 km north from the island of Borkum in the North Sea (see Figure 1a). The wind speeds were measured over 20 months, from September 2015 to April 2017, at a sampling frequency of 1 Hz. The FINO1 platform measures wind speed at eight different heights. Here, we consider the measurements at a height of 100 m, taken by a cup anemometer.
As illustrated in Figure 1b, the series {v(m)} 1≤m≤M of 20 months wind speed data (M ∼ = 5.2 × 10 7 ) is transformed into a series of backward averages over M 0 = 900 values, corresponding to 15 min, This series of backward-averaged values is shown in Figure 3a. All windows of 15 min data having more than 25% missing data (225 values) were disregarded. In total, this pre-processing resulted in only 65 missingv values. The distribution of thev, shown in Figure 3b, is well fitted by a Weibull form with scale parameter λ and shape parameter k. By least square fitting, we obtained λ = 10.17 m/s and k = 2.07, corresponding to a mean µv = 9.16 m/s and standard deviation σv = 4.76 m/s. Because ARIMA models yield Gaussian distributed values, we consider a transformed variable u, which is a power law of the wind speed, i.e., u =v α (8) with 0 < α < 1. An optimal value α opt of α is determined by requiring the distribution of values in the series 1≤t≤N to be close to a Gaussian. The corresponding optimisation procedure is described in the Appendix A and yields α opt = 0.586 ± 10 −3 for the series {v(t)} 1≤t≤N . Figure 3c shows {u(t)} 1≤t≤N and in Figure 3d the respective distribution of u values is plotted (symbols). The mean and standard deviation of this distribution are µ u = 3.53 (m/s) α opt and σ u = 1.14 (m/s) α opt . As can be seen from Figure 3d, the Gaussian distribution (dashed lines) with µ u and σ u fits the data (symbols) well. Previous works have considered other values of α, e.g., α = 1/2 in Ref.
[1] and α = 1/3 in Ref. [26]. In general, the optimal exponent varies from one data sample to another. The optimisation procedure in the Appendix A gives α opt in dependence of the Weibull parameters λ and k, and we provide a table of α opt values for k and λ lying in ranges typical for wind speed distributions.
We now intend to adapt an ARIMA model to the transformed wind speed u. To this end, we consider time windows shorter than the full 20-month series for the following reasons: firstly, we want to validate whether there exists a typical order (p, d, q) of the ARIMA model. Secondly, for practical applications, one cannot expect data to be available over a period as large as 20 months. As illustrated above, cf. Figure 2f, about 1000 values are sufficient to estimate the parameters of the ARIMA model with good accuracy. We fix the size of the time window to N s = 3 × 10 3 (approximately one month). This yields a total of n s 5.5 × 10 4 training series {u(t + t)} 1≤t≤N s , t = 1, . . . , n s . In case a training series contains some of the 65 missingv-values (see above), we consider a slightly smaller N s .
We now apply the methods described in the previous Section 2 to obtain ARIMA models for each of the training series. In all cases, we find d = 1. The analysis of the ACF and PACF was performed for the complete series {u(t)} 1≤t≤N and yielded τ p = τ q = 5. The p and q values for the training series vary and their joint histogram is shown in Figure 4. In this histogram, a maximum occurs at (p, q) = (0, 2). We therefore consider (p, d, q) = (0, 1, 2) as the order of the ARIMA model for predicting 15 min wind speed averages in our data set. Next, we evaluate the power of the ARIMA(0, 1, 2) model to predict valuesû(t) for u(t) from the series {u(t )} t−N s ≤t ≤t−1 of N s = 3 × 10 3 previous values. The previous values are used to determine the optimal model parametersθ 1 ,θ 2 andσ w with the maximum likelihood estimator, and the ARIMA process with these optimal parameters gives the respective model series {û(t )} t−N s ≤t ≤t−1 . The prediction is obtained by setting w t = w t = 0 in Equation (1), and by replacing w t−1 and w t−2 with the residuals t−1 and t−2 , The corresponding prediction for the 15 min averaged wind speed isv(t) =û(t) 1/α opt . Figure 5a shows the first 30 predictionsv(t) (open circles), t = 3001, . . . , 3030, in comparison with the measured valuesv(t) (full circles). Compared to the simple maximum persistence prediction, v pers (t + 1) =v(t) (crosses), it does not give a significant improvement. This can be clearly seen from Figure 5b, where the distribution of the deviations between each of the models and the real measurement is shown. We find in both cases an average of zero ( v model −v 3 × 10 −3 ) and similar standard deviations, σ (012) = 0.7495 and σ pers = 0.7525.
One could suppose that the predictive power of the ARIMA modelling becomes better, when taking higher orders (p, q) (with d = 1 kept fixed). However, when performing an analysis with such models of higher order, we obtained results comparable to those shown in Figure 5. To evaluate the performance of the ARIMA model, we consider the mean absolute percentual error (MAPE) defined as of the wind speed increments. Here,v i denotes the prediction of the ith measured value of the wind speeds, and v i is the corresponding measured value. The mean of wind speeds is denoted byv. Table 1 shows the results of our ARIMA(0, 1, 2) model together with the values obtained for other models. The MAPE of the ARIMA(0, 1, 2) process shows values comparable to the most accurate ARIMA models. Table 1. Performance of the ARIMA(0,1,2) model, using the metric defined in Equation (10), in comparison with different data sets and models reported in [14].

Nested ARIMA Model for Wind Speeds
Apart from the prediction of wind speeds based on previous values, it is an important issue to develop tools for generating surrogate wind speed data with statistical features that resemble key features of real data. This allows researchers to use such surrogate data in studies without access to real data, or when an averaging over a large amount of data is required. As mentioned in the Introduction, the distribution of wind speed increments [v(t + τ) −v(t)] for a time lag τ shows also strong deviations from a Gaussian. As the ARIMA(0, 1, 2) with fixed parameters θ 1 , θ 2 , and σ w corresponds to a simple random walk, it is clear that the time variation of these parameters is important for obtaining non-Gaussian features for the increments. In the prediction procedure described in Section 3, the time variations is mediated by the optimal adaptation of the model parameters to the set of N s = 3 × 10 3 previous, and assumed to be known real values.
For generating surrogate data, we now consider the corresponding time series of optimal parametersθ 1 (t),θ 2 (t), andσ w (t). For convenient notation, we will write θ 1 (t), θ 2 (t), and σ w (t) for these series, i.e., we drop the circumflex. As for σ w (t), we consider its (normalised) logarithm, whereσ w represent the average over all σ w (t). The use of the logarithm is motivated by a previous study [19], where the distribution of velocity increments in turbulent flows was successfully modelled by considering the velocity increments to follow a Gaussian process with fluctuating diffusivity (variance) that is log-normal distributed. Note also that the logarithm implies that ζ(t) can have positive and negative values. Figure 6 shows the time series of θ 1 (t), θ 2 (t) and ζ(t) together with their respective histograms. These time series are now modelled also by ARIMA processes, resulting in a nested ARIMA model for generating surrogate data. Applying the procedure described in Section 2, we find that θ 1 is best described by an ARIMA (3, 1, 1) process, parameter θ 2 by an ARIMA (1, 1, 0) process, and ζ by an ARIMA(2, 1, 2) process: The coefficients φ θ 1 ,θ 2 ,ζ 1 , φ θ 1 ,ζ 2 , φ θ 1 3 , η θ 1 ,ζ 1 and η ζ 2 are given in Table 2 together with the standard deviations σ θ 1 ,θ 2 ,ζ w of the independent Gaussian noise terms w The nested ARIMA model that combines the ARIMA(0, 1, 2) model (13) with Equations (12a)-(12c) provides transformed wind speed increments ∆u, whose increment statistics can now be analysed. Table 2. Estimated coefficients entering Equations (12a)-(12c) of the ARIMA processes describing the stochastic evolution of the coefficients in Equations (11) and (13). Notice that each coefficient is modelled with an ARIMA process of different order: ARIMA(3, 1, 1) for θ 1 , ARIMA(1, 1, 0) for θ 2 , and ARIMA(2, 1, 2) for ζ.
9.91 7. In (e,f) the time series and histogram of the noise strength ζ are shown (see Equation (11)).
In Figure 7a-d we show the increment statistics of the wind speeds v(t) = u(t) 1/α opt , for different time lags τ = 0.25, 2, 4 and 8 h (red dashed lines). In the same plot, we also show the corresponding increment statistics of the 15 min averages of measured wind speeds (solid black lines). For a time-lag of 15 min (Figure 7a), the simulated increment distribution follows approximately the distribution of the measured values, except close to the maximum where it is more sharply peaked. When the time-lag is increased, the measured wind speed increment distributions become more similar to a Gaussian distribution, as it has been reported earlier [27,28]. However, this feature is not reproduced by the modelled increment distributions, which exhibit a significant leptokurtic shape even for large time lags.
In Figure 7e,f, we show the skewness S and the kurtosis κ of the modelled and measured increment distributions as a function of the time lag τ. As the values for S are not strongly deviating from zero, both for the modelled and measured data, the distributions can be considered to be approximately symmetric. The kurtosis κ in contrast shows significant differences: while it approaches κ = 3 of a Gaussian for very large time lags in the case of the measured increment distribution, the kurtosis for the modelled distribution remains nearly constant at a large value of about nine for all lags.
Based on these findings, one can not expect the full distributions of the measured and modelled velocity increments to match. Indeed, performing a Kolmogorov-Smirnov test retrieves a rejection when setting a significance level of 5%. Therefore, modelled and measured distributions are not the same. Nevertheless, we consider the nested ARIMA model as a first step for modelling the intermittent character reflected in the non-Gaussian tails of the increment distributions. Possible improvements of this approach are discussed below.

Conclusions
In this manuscript, we applied ARIMA models to a series of 15 min of the wind speeds measured at the North Sea. We focused on two problems. The first was to evaluate the power of these models for wind speed forecasting. The second was to ascertain the ability of these models to generate surrogate data for wind speeds. To overcome the problem that ARIMA models yield Gaussian distributions, while measured wind speed distributions resemble a Weibull form, a power law transformation was applied. Different from earlier studies, which used ad hoc values for the exponent in the transformation, we here determined an optimal value.
The evaluation of the predictive power was based on a subdivision of the time series into moving windows of about one month. We found that the best predictive ARIMA model for the set of wind speeds is ARIMA(0, 1, 2), though showing no significantly better accuracy than a simple maximum persistence model. In fact, our results showed also that predictions of the maximum persistence model could not be significantly improved by using ARIMA models of higher order.
With respect to generating surrogate data, we applied the ARIMA(0, 1, 2) model to each of the moving time windows. Thereby, time series of the coefficients and the noise parameter are obtained, which were also modelled by ARIMA processes. We called this combination of ARIMA processes a nested ARIMA model. Our results provide evidence that this nested ARIMA model is able to approximately reproduce wind speed increments yielding strong non-Gaussian features. In particular, the tails of the increment distributions could be nearly recovered for time lags below two hours. Consequently, within this time window, the nested ARIMA model can be used as a simple tool for generating surrogate wind speed data, for example, as input data for simulations of power grid dynamics under fluctuating wind power injection [29,30].
For larger time lags, the nested ARIMA model is not well suited for generating surrogates of wind speed data, showing pronounced deviations between the modelled and measured increment distributions. In particular, we observed a considerably large leptokurtic shape of the modelled distributions. These findings suggest to favor a modeling of wind speed data by alternative approaches, such as stochastic differential equations [22,31,32]. Differential equation approaches have proven helpful in minimising the amount of input data [33] in situations where large data sets are needed for training neural networks. Moreover, it has been shown that, to reconstruct stochastic series of wind tower vibrations, artificial neural networks retrieve accurate estimates of the mean and standard deviation of the tower vibration increments [34]. However, higher order moments, namely skewness and kurtosis, are better reconstructed with differential equation approaches.
Alternatively, one could try to further improve the modelling based on ARIMA processes. For instance, instead of using the power-law transformation of the wind speed, it could be better to apply a similar transformation to the wind speed increments. This in particular may avoid to obtain increment distributions with a too strong leptokurtic shape. Furthermore, the non-Gaussian character of the parameter evolution (cf. Figure 5) could also be mitigated by applying some optimised nonlinear transformation to them.  Acknowledgments: Financial support from the Deutsche Forschungsgemeinschaft (MA 1636/9-1) and from the bilateral cooperation FAPERJ-DFG (LI 1599/3-1) is gratefully acknowledged. The authors also thank the FINO-project and the Forschungs-und Entwicklungszentrum Fachhochschule Kiel GmbH for providing the data sets.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Optimal Non-Linear Variable Transformation of Wind Speeds into a Gaussian Variable
In previous studies, the square-or cubic-root of the wind velocity v was considered to be Gaussian-distributed [1,26]. In this appendix, we derive a optimal value α opt of the exponent α ∈ (0, 1] in the transformation that best approximates the distribution of u to a Gaussian, if v is distributed according to a Weibull form ρ W (v) (see Equation (7)). We show how the optimal exponent changes for different combinations of parameters k and λ of the Weibull distribution and provide a table that can be used for other data sets of wind speed measurements.
As wind speeds v are always positive, u should also be positive. Therefore, we consider the truncated Gaussian distribution for u ≥ 0. In case of an exact transformation to a Weibull distribution, we would have are the cumulative distributions of the Weibull and truncated Gaussian, respectively. An optimal exponent α opt can be determined by considering a distance measure between F W (v) and F G (v α ), and by minimising this distance with respect to α. One possibility would be to take the maximum distance sup v |F W (v) − F G (v α )| as suggested by the Kolmogorov-Smirnov statistic. Here, we use the L 2 -norm (integral over squared deviations) as the distance measure. The determining equation for the optimal exponent α opt then is and the functions µ = µ(α) and σ 2 = σ 2 (α) are given by The terms in Equation (A7) thus are: Substituting Equation (A9) into (A7), and the result into Equation (A6), yields a nonlinear equation that can be solved with respect to α for given parameters k and λ specifying the Weibull distribution.
For wind speed time series, the parameters of the associated Weibull distribution vary typically in the ranges 1.5 < k < 4 and 5 < λ < 20 (m/s). In Figure A1a, the optimal exponent is plotted as a function of both Weibull parameters in these ranges, and, in Figure A1b, we show the corresponding value of Λ(α). It can be seen that the transformation parameter α opt strongly depends on k, but is almost insensitive to the scale parameter λ. The minimum of the distance measure, Λ(α opt ), quantifies the quality of the transformation. The larger the minimum, the more the transformed Weibull distribution deviates from a truncated Gaussian distribution. The quality of the transformation increases with decreasing λ, i.e., series with smaller average wind speeds transform better to a truncated Gaussian. The α opt values for different pairs of k and λ are listed in Table A1. These can be used for other series of wind speeds.