Resources 2013, 2(3), 370-384; doi:10.3390/resources2030370
Published: 3 September 2013
Abstract: The goal of this paper is to show how to derive the multivariate Weibull probability density function from the multivariate Standard Normal one and to show its applications. Having Weibull distribution parameters and a correlation matrix as input data, the proposal is to obtain a precise multivariate Weibull distribution that can be applied in the analysis and simulation of wind speeds and wind powers at different locations. The main advantage of the distribution obtained, over those generally used, is that it is defined by the classical parameters of the univariate Weibull distributions and the correlation coefficients, all of which can be easily estimated. As a special case, attention has been paid to the bivariate Weibull distribution, where the hypothesis test of the correlation coefficient is defined.
1. Introduction
The Weibull distribution is a continuous probability distribution that was described by Waloddi Weibull in 1951 [1]. It can be applied in a wide range of fields such as survival analysis, reliability engineering or weather forecasting, to mention just a few [1,2,3,4,5]. Mathematically, it can be written as a function of three parameters, i.e., location parameter γ, scale parameter C and shape parameter k. The location parameter gives information about the minimum value of the set, so it can be substituted by 0 in the case of wind speed sets.
When more than one Weibull-distributed variable is considered and the dependence among them is relevant, the so-called multivariate distribution [2,3] has to be applied. A multivariate distribution describes the joint outcome of several variables. Therefore, to evaluate the simultaneous behavior of a number of dependent variables, each of them described by a Weibull distribution, a multivariate Weibull distribution is needed. When the variables are independent, they can be evaluated separately.
So far, most of the multivariate Weibull expressions for the Cumulative Distribution Function (CDF) or Probability Density Function (PDF) are based on models [4,5] that approximately describe the joint probability distribution of a group of variables. The most noteworthy one was introduced by Marshall and Olkin in 1967 [6], but others are also of interest, such as those presented by Lee [7], Roy and Mukherjee [8], Crowder [9], Lu and Bhattacharyya [10] and Patra and Dey [11]. Bivariate Weibull models have also been developed based on the following copula functions [4]: Farlie-Gumbel-Morgenstern, Clayton, Ali-Mikhail-Haq, Gumbel-Hougaard, Gumbel-Barnett and Nelsen Ten. However, defining most of these models completely requires estimating parameters with no direct interpretation, i.e., in most cases those parameters have no direct relationship to the univariate Weibull model parameters or to the correlation coefficients, which are used as a measurement of the dependence.
In this paper, a model is proposed for the multivariate Weibull PDF, based on the classic parameters used in the definition of a univariate Weibull model and on the correlation coefficients among the marginal distributions. It develops the change of variables from Normal to Weibull used in [12,13]. It is applied to wind speed and wind power, as there seems to be some agreement on the importance of the Weibull distribution for the modeling and simulation of these in wind turbines or wind farms [14,15,16]. Therefore, this paper describes the simultaneous behavior of wind speed, and of wind power, at two or more locations. The bivariate case is analyzed on its own due to its importance, and the correlation coefficient inference procedure is established.
The structure of the paper is as follows: Section 2 derives the multivariate Weibull PDF from the Standard Normal one and outlines its application to wind speed and wind power, Section 3 deals with the bivariate Weibull PDF, Section 4 presents a case study and Section 5 states the conclusions.
2. Multivariate Weibull Distribution
The proposed multivariate Weibull distribution is obtained by means of a Normal to Weibull change of variables. The procedure to derive it is as follows:
- (1) The key point of the procedure is to define a change of variable from a Standard Normal distributed variable to a Weibull distributed one. This transformation must be differentiable and its inverse function has to exist. The Standard Normal distributed variable is created in order to use its known features and transfer them to the Weibull one;
- (2) Establish as many changes of variable as the number of variables, n, considering the different Weibull parameters for each case;
- (3) Obtain the multivariate Weibull PDF from the multivariate Standard Normal PDF by applying the change of variable.
Finally, the relationship between the correlation coefficients in the multivariate Standard Normal PDF and in the Weibull one has to be checked in order to establish a multivariate Weibull PDF that depends on marginal Weibull parameters and the correlation coefficients between pairs of Weibull variables.
2.1. Normal to Weibull Change of Variables
In order to define a change of variables from a Normal distributed variable to a Weibull one, the Probability Integral Transform [17] is applied, which maps each variable to a Uniform distributed one. The Uniform distributed variables derived from the Normal and Weibull ones are then equated, which establishes a relationship between them.
The CDF of a univariate Weibull distribution [2,3,4] with scale parameter C and shape parameter k is defined in Equation (1):
On the other hand, the CDF of the univariate Normal distribution [2,3,4] with mean value µ and standard deviation σ is defined in Equation (2):
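For reference, assuming a zero location parameter for the Weibull variable and writing the Normal CDF with the error function, the two CDFs take the following standard forms (the symbols follow the definitions above):

```latex
F_u(u) = 1 - \exp\left[-\left(\frac{u}{C}\right)^{k}\right], \qquad u \ge 0

F_x(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]
```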
As mentioned earlier, the Probability Integral Transform states that a variable from any given continuous distribution can be converted into a variable with a Uniform distribution on [0, 1], i.e., applying its own CDF to a variable provides a new, uniformly distributed variable. So, y_{u} = F_{u}(u) and y_{x} = F_{x}(x) are uniformly distributed variables.
Both variables y_{u} and y_{x} can be matched in order to establish a relationship between a Weibull and a Normal distributed variable. The whole process can be understood in two steps: First, the conversion of a Normal distributed variable into a Uniform one and then the conversion of this one into a Weibull one. Therefore, if both CDFs are matched, then the Weibull distributed variable, u, can be expressed as a function of the Normal one x, such as in Equation (3):
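Written out step by step, matching the two CDFs and solving for u gives the following. This is a sketch that assumes the standard Weibull CDF with zero location parameter and the error-function form of the Normal CDF:

```latex
1 - \exp\left[-\left(\frac{u}{C}\right)^{k}\right]
  = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]

\exp\left[-\left(\frac{u}{C}\right)^{k}\right]
  = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)

u = C\left\{-\ln\left[\frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]\right\}^{1/k}
```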
Thus, from a Normal distributed variable with given parameters (µ, σ), a Weibull one can be derived, with the desired parameters (C, k). Notice that this transformation can be applied to other types of variables to obtain an equation that relates two variables with different distributions.
In order to derive further results, the transformation given in Equation (3) will be referred to as ntw(x;C,k), where x is a Standard Normal variable (µ = 0 and σ = 1) and C and k are the Weibull parameters. For a single variable, x_{i}, the notation is expressed in Equation (4):
The inverse transformation is also needed and denoted as ntw^{−1}(u;C,k) where u is the Weibull variable. For a single variable, u_{i}, that function is shown in Equation (5):
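As an illustration, the transformation and its inverse can be sketched in a few lines of Python using only the standard library. The function names `ntw` and `ntw_inv` mirror the notation above; writing them through the Normal upper tail and the Normal quantile function is an implementation choice made here, not something stated in the text.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def ntw(x, C, k):
    """Map a Standard Normal value x to a Weibull(C, k) value."""
    # 1 - Phi(x) = 0.5 * erfc(x / sqrt(2)) is the upper tail of N(0, 1);
    # it underflows to 0 for very large x, so x is assumed to be moderate.
    upper_tail = 0.5 * math.erfc(x / math.sqrt(2.0))
    return C * (-math.log(upper_tail)) ** (1.0 / k)


def ntw_inv(u, C, k):
    """Map a Weibull(C, k) value u > 0 back to a Standard Normal value."""
    # Weibull CDF at u (expm1 keeps precision for small u),
    # followed by the Standard Normal quantile function.
    p = -math.expm1(-((u / C) ** k))
    return _STD_NORMAL.inv_cdf(p)  # raises StatisticsError if p is 0 or 1


if __name__ == "__main__":
    u = ntw(0.3, C=8.0, k=2.0)
    # The second printed value should be close to the starting point 0.3.
    print(u, ntw_inv(u, C=8.0, k=2.0))
```

A quick sanity check is that x = 0 (the Normal median) maps to the Weibull median, C (ln 2)^{1/k}.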
The derivative of ntw( ) is denoted as ntwʹ(x;C,k) and expressed in Equation (6), also for a single variable.
In Equation (6) the derivative of the error function is used [18], which is shown in Equation (7):
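As an illustration, applying the chain rule to the Standard Normal form of the transformation gives one possible closed form of the derivative. The shorthand h(x_i) is introduced here only for readability and is not part of the original notation:

```latex
\frac{d}{dx}\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\, e^{-x^{2}}

h(x_i) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{x_i}{\sqrt{2}}\right),
\qquad
\frac{d h}{d x_i} = -\frac{1}{\sqrt{2\pi}}\, e^{-x_i^{2}/2}

\mathrm{ntw}'(x_i; C_i, k_i)
  = \frac{C_i}{k_i}\left[-\ln h(x_i)\right]^{\frac{1}{k_i}-1}
    \frac{e^{-x_i^{2}/2}}{\sqrt{2\pi}\; h(x_i)}
```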
By using Equations (4–6), the change of variables can be extended to multiple variables.
2.2. Multivariate Normal to Weibull Change
In order to extend the transformation to several variables, the multivariate Standard Normal distribution has to be considered. Its PDF, for a positive definite covariance matrix, is shown in Equation (8):
By using Equation (8), the PDF corresponding to the multivariate Weibull distribution is obtained through Equation (9):
Thus, the Jacobian matrix has non-zero elements only on its diagonal, since each u_{i} depends only on x_{i}. The determinant of this matrix is given in Equation (11):
The multivariate Weibull PDF of a group of variables is shown in Equation (12) as a function of the multivariate Standard Normal PDF and ntw( ).
Equation (12) can be expressed as in Equation (13) if Equation (8) is taken into account.
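The construction can be sketched numerically as follows. This is a minimal standard-library sketch, not the authors' implementation: it evaluates the multivariate Standard Normal PDF at x_i = ntw^{-1}(u_i; C_i, k_i) and multiplies by the Jacobian factors dx_i/du_i, written here as the ratio of the marginal Weibull PDF to the Standard Normal PDF (an equivalent form of 1/ntw'). All function names are hypothetical, and the matrix R is assumed to be a positive definite correlation matrix given as a list of lists.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def weibull_pdf(u, C, k):
    """Univariate Weibull PDF with scale C and shape k."""
    return (k / C) * (u / C) ** (k - 1.0) * math.exp(-((u / C) ** k))


def ntw_inv(u, C, k):
    """Weibull(C, k) value -> Standard Normal value."""
    return _STD.inv_cdf(-math.expm1(-((u / C) ** k)))


def multivariate_weibull_pdf(u, C, k, R):
    """Joint PDF at the point u for Weibull marginals (C[i], k[i])."""
    n = len(u)
    x = [ntw_inv(ui, Ci, ki) for ui, Ci, ki in zip(u, C, k)]
    L = cholesky(R)
    # Forward substitution L z = x, so that x^T R^{-1} x = z . z
    z = []
    for i in range(n):
        z.append((x[i] - sum(L[i][j] * z[j] for j in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    quad = sum(zi * zi for zi in z)
    log_normal_pdf = -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)
    # Jacobian of the change of variables: product of dx_i / du_i
    jacobian = 1.0
    for ui, Ci, ki, xi in zip(u, C, k, x):
        jacobian *= weibull_pdf(ui, Ci, ki) / _STD.pdf(xi)
    return math.exp(log_normal_pdf) * jacobian
```

With an identity correlation matrix the result reduces to the product of the marginal Weibull PDFs, which is a convenient check of any implementation.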
Notice that Equation (13) depends on two parameters per variable (C_{i} and k_{i}) and one for every pair of them (ρ_{ij}). However, the correlation coefficient ρ, between pairs of variables is established according to Standard Normal distributed variables, so the relationship between both parameters has to be obtained. By making a numerical approach based on the Cholesky decomposition (see Appendix), an interval can be obtained for each parameter where the correlation coefficients in Equation (13) can be considered according to Weibull variables, i.e., where the elements of the covariance matrix, Σ, in Equation (13) correspond to the correlation coefficients between pairs of variables u_{i} and u_{j}. The intervals are shown in Equation (14):
Therefore, in many cases the parameters used in Equation (13) are defined by the behavior of the group of Weibull variables.
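The kind of numerical comparison described above can be illustrated with a small Monte Carlo sketch. It generates a pair of correlated Standard Normal series (the bivariate special case of the Cholesky construction in the Appendix), maps them to Weibull variables, and compares the two sample correlation coefficients. The function names, the sample size and the seed are assumptions made for this illustration only.

```python
import math
import random


def ntw(x, C, k):
    """Standard Normal value -> Weibull(C, k) value."""
    return C * (-math.log(0.5 * math.erfc(x / math.sqrt(2.0)))) ** (1.0 / k)


def pearson(a, b):
    """Sample (Pearson) correlation coefficient of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


def compare_correlations(rho, C1, k1, C2, k2, n=20000, seed=0):
    """Return (correlation of the Normal pair, correlation of the Weibull pair)."""
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1.append(a)
        # Second row of the 2x2 Cholesky factor: [rho, sqrt(1 - rho^2)]
        x2.append(rho * a + math.sqrt(1.0 - rho * rho) * b)
    u1 = [ntw(x, C1, k1) for x in x1]
    u2 = [ntw(x, C2, k2) for x in x2]
    return pearson(x1, x2), pearson(u1, u2)


if __name__ == "__main__":
    r_normal, r_weibull = compare_correlations(0.6, 8.0, 2.0, 8.0, 2.0)
    print(r_normal, r_weibull)
```

Repeating the comparison over a grid of (C, k) values is one way to explore where the two coefficients stay close to each other.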
Even though Equation (13) seems somewhat complex, it should be emphasized that, when it is implemented in a software application, its complexity does not depend on the number of variables, n. Moreover, some of the models referred to in Section 1 serve only the bivariate case, and those applicable to the multivariate one do not consider the correlation coefficients, which are the classical measurements of dependence.
2.3. Multivariate Wind Speed Distribution
According to [14,15,16] the cumulative behavior of the wind speed at location i, v_{i}, can be described by a Weibull distribution with parameters C_{i} and k_{i}, and the relationship between every two distributions can be described by its correlation coefficient [19,20,21,22]. So, the multivariate wind speed PDF, or multilocation wind speed PDF, for a group of n variables (v_{1},…,v_{n}), with Weibull parameters C_{i} and k_{i} and correlation matrix R_{v} referred to the pairs of Normally distributed variables, is shown in Equation (15):
On the other hand, in most cases the Weibull parameters of the wind speed distributions lie in the intervals expressed in Equation (14), so, as explained in Section 2.2, R_{v} can also represent the correlation matrix of the v_{i} variables.
2.4. Multivariate Wind Power Distribution
The relationship between wind speed and wind power [14,16] is defined by Equation (16):
Therefore, applying a change of variables, P_{i} can be described by a Weibull distribution of parameters C_{i}ʹ and k_{i}ʹ [14], expressed in Equation (17):
And in Equation (19), as a function of the parameters of the wind speed distribution.
In both cases, Equations (18) and (19), the matrix R_{P} contains the correlation coefficients corresponding to the pairs of Normally distributed variables. In most cases the parameters of the Weibull distributions defined in Equation (17) lie outside the intervals expressed in Equation (14), so, as explained in Section 2.2, R_{P} does not represent the correlation matrix of the P_{i} variables.
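The change of variables from wind speed to wind power can be sketched as follows, under the assumption that the speed-power relationship is the ideal cubic law P = (1/2) d A v^3, with air density d and swept area A, and no power coefficient or cut-in/cut-out limits. Under that assumption, a Weibull(C, k) wind speed yields a Weibull power with scale C' = (1/2) d A C^3 and shape k' = k/3. The function names are hypothetical.

```python
import math


def wind_power(v, A, d):
    """Ideal wind power in W for speed v (m/s), area A (m^2), density d (kg/m^3)."""
    return 0.5 * d * A * v ** 3


def power_weibull_params(C, k, A, d):
    """Weibull scale and shape of P when v ~ Weibull(C, k) and P = 0.5 d A v^3."""
    return 0.5 * d * A * C ** 3, k / 3.0


def weibull_cdf(u, C, k):
    """Univariate Weibull CDF with scale C and shape k."""
    return 1.0 - math.exp(-((u / C) ** k))


if __name__ == "__main__":
    C, k, A, d = 8.0, 2.0, 7853.0, 1.225
    C_p, k_p = power_weibull_params(C, k, A, d)
    v = 6.5
    # Both probabilities should agree: P <= P(v) exactly when speed <= v.
    print(weibull_cdf(v, C, k), weibull_cdf(wind_power(v, A, d), C_p, k_p))
```

Because the shape parameter is divided by three, a typical wind speed shape near 2 becomes a power shape below 1, which is consistent with the remark that the power parameters usually fall outside the intervals of Equation (14).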
3. Bivariate Weibull Distribution
In many cases the bivariate Weibull distribution is sought in order to describe the wind speed or wind power behavior at a pair of locations. Due to its importance, it is developed here as a particular case.
3.1. Bivariate Weibull Distribution Applied to Wind Speed
Equation (9) specifically for n = 2 is shown in Equation (20):
And the bivariate Standard Normal PDF is expressed in Equation (21):
By using Equations (20) and (21) the bivariate Weibull PDF Equation (22) is obtained as a function of C_{1}, k_{1}, C_{2}, k_{2} and ρ, which stands for the correlation coefficient between x_{1} and x_{2} but, as has been stated above, can be considered as the correlation coefficient between v_{1} and v_{2}.
In order to simplify Equation (22), as it depends on x_{1} and x_{2}, their relationships with v_{1} and v_{2} are shown in Equation (23):
As stated, in most cases the Weibull parameters to define the wind speed behavior lie in the intervals given in Equation (14), so Normal and wind speed correlation coefficients can be considered equal. The bivariate wind speed PDF for several values of ρ is shown in Figure 2, Figure 3 and Figure 4 (C_{1} = 8, k_{1} = 2, C_{2} = 8, k_{2} = 2).
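One way to write the resulting bivariate density explicitly is sketched below. It assumes that the marginal Weibull PDFs are denoted f_i (a shorthand introduced here) and that x_i = ntw^{-1}(v_i; C_i, k_i); the form follows from dividing the bivariate Standard Normal PDF by the product of the two univariate ones. Setting ρ = 0 reduces it to the product of the two marginals.

```latex
f_i(v_i) = \frac{k_i}{C_i}\left(\frac{v_i}{C_i}\right)^{k_i-1}
           \exp\left[-\left(\frac{v_i}{C_i}\right)^{k_i}\right],
\qquad
x_i = \sqrt{2}\,\operatorname{erf}^{-1}\left(1 - 2\exp\left[-\left(\frac{v_i}{C_i}\right)^{k_i}\right]\right)

f_V(v_1, v_2) = \frac{f_1(v_1)\, f_2(v_2)}{\sqrt{1-\rho^{2}}}
  \exp\left[-\frac{\rho^{2}\left(x_1^{2} + x_2^{2}\right) - 2\rho\, x_1 x_2}{2\left(1-\rho^{2}\right)}\right]
```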
3.2. Correlation Coefficient Inference
In order to perform the correlation coefficient inference between two variables (u_{1}, u_{2}), the sample correlation coefficient has to be tested with a hypothesis [23,24,25]. The sample value, r, is obtained using Equation (24), once the Weibull variables are changed to Normal ones (y_{1}, y_{2}).
The hypotheses to be checked in this case are the following:
H_{0}: ρ = ρ_{0}
H_{1}: ρ ≠ ρ_{0}
According to [23,24,25], the variable Z is Normal distributed with the parameters given in Equation (26):
Depending on the significance level, α, a confidence interval, CI = [z_{min}, z_{max}], is established, whose limits are expressed implicitly in Equation (27):
So, if z, obtained through Equation (25), belongs to the CI, the Null Hypothesis H_{0} can be accepted with significance level α, and if not, it cannot be accepted.
If H_{0} is accepted, the correlation coefficient corresponding to the bivariate Weibull distribution has to be obtained.
According to previous sections, the correlation coefficients corresponding to the bivariate Normal distribution and to the Weibull one, when it represents a pair of wind speed variables, are approximately the same, so if H_{0} is accepted, it can be said that ρ_{0} can be taken as an estimation of the correlation coefficient in a bivariate Weibull distribution with significance level of α.
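The test can be sketched in Python as follows, assuming that the change of variable for the sample coefficient is the usual Fisher transformation z = (1/2) ln((1 + r)/(1 − r)) and that Z is taken as Normal with mean (1/2) ln((1 + ρ_{0})/(1 − ρ_{0})) and standard deviation 1/√(n − 3). These are the standard textbook forms and are assumptions here; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def correlation_test(r, rho0, n, alpha=0.05):
    """Two-sided test of H0: rho = rho0 using the Fisher z transformation.

    Returns (z, (z_min, z_max), inside), where `inside` is True when z
    falls within the interval, i.e. H0 is not rejected at level alpha.
    """
    z = math.atanh(r)                    # 0.5 * ln((1 + r) / (1 - r))
    mean = math.atanh(rho0)
    std = 1.0 / math.sqrt(n - 3)
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided quantile
    z_min, z_max = mean - q * std, mean + q * std
    return z, (z_min, z_max), z_min < z < z_max


if __name__ == "__main__":
    # Values taken from the case study in Section 4.
    z, (z_min, z_max), inside = correlation_test(r=0.6202, rho0=0.6012, n=1000)
    print(round(z, 4), round(z_min, 4), round(z_max, 4), inside)
```

Working on the z scale rather than directly on r is the usual design choice because the sampling distribution of z is close to Normal with a variance that does not depend on ρ.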
3.3. Bivariate Weibull Distribution Applied to Wind Power
According to Equation (19), the bivariate wind power PDF is expressed in Equation (28):
The bivariate wind power PDF for several values of ρ is shown in Figure 6, Figure 7 and Figure 8 (C_{1} = 8, k_{1} = 2, A_{1} = 7853 m^{2}, d_{1} = 1.225 kg/m^{3}, C_{2} = 8, k_{2} = 2, A_{2} = 7853 m^{2}, d_{2} = 1.225 kg/m^{3}).
4. Case Study
As a case study, it is assessed whether a certain model of the correlation coefficient between a pair of locations, which estimates its value as a function of the distance between them, can be accepted.
The main features of the wind in Galicia, in the Northwest of Spain, are that during winter the winds blow from the Southwest and are very constant and powerful, while during the summer the winds normally blow softly from the Northeast. There are a great number of meteorological stations spread throughout Galicia [26]. If wind speed data series are collected from them, the Weibull distribution parameters for each location can be obtained. Moreover, the correlation coefficient of each pair can be derived according to Equation (24).
In order to estimate the correlation coefficient for locations with a low number of simultaneous sample values of wind speed measures, the relationship between the correlation coefficient and the distance can be analyzed [27]. The great-circle distance between two locations (i and j) can be obtained through Equation (30), as a function of their geographical coordinates.
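A great-circle distance can be computed, for example, with the haversine formula, sketched below. This is a standard and numerically stable formulation, not necessarily the exact form used in the text; the mean Earth radius of 6371 km, the function name and the sample coordinates are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed value


def great_circle_km(lat_i, lon_i, lat_j, lon_j):
    """Great-circle distance in km between two points given in degrees."""
    phi_i, phi_j = math.radians(lat_i), math.radians(lat_j)
    d_phi = phi_j - phi_i
    d_lambda = math.radians(lon_j - lon_i)
    a = (math.sin(d_phi / 2.0) ** 2
         + math.cos(phi_i) * math.cos(phi_j) * math.sin(d_lambda / 2.0) ** 2)
    # Clamp against rounding so that asin never receives a value above 1.
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


if __name__ == "__main__":
    # A quarter of the equator: two illustrative points, not station coordinates.
    print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```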
Therefore, including all the possible pairs (distance, correlation coefficient) in Galicia and by means of the least square method, the relationship Equation (31) is derived.
So, if the correlation coefficient between a location not included in the previous analysis (Coto Muiño), and another one that is included (Melide), needs to be estimated, all that has to be done is to apply Equation (30) between both locations, and then Equation (31), after which a value of ρ_{0} = 0.6012 is obtained.
Moreover, by utilizing simultaneous sample data (n = 1000) from both locations, the sample correlation coefficient can be obtained through Equation (24), r = 0.6202, and the change suggested in Equation (25) applied, to obtain z = 0.7253.
Considering α = 0.05, Z ~ N(0.6951, 0.0317), the CI obtained is CI = [0.6330, 0.7571]. Therefore, as explained in the previous sections, as z_{min}< z < z_{max}, ρ_{0} = 0.6012 can be accepted as the correlation coefficient of the bivariate Weibull distributions corresponding to the wind speed data of those locations. The results are shown in Table 1.
Table 1. Results of the case study.

| Variable | Obtained value | Minimum value | Maximum value |
|---|---|---|---|
| r | 0.6202 | 0.5601 | 0.6394 |
| z | 0.7253 | 0.6330 | 0.7571 |
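The limits for r in Table 1 can be related to the limits for z by inverting the change of variable. The short sketch below assumes that this change is the Fisher transformation, whose inverse is the hyperbolic tangent; the function name is hypothetical.

```python
import math


def z_to_r(z):
    """Inverse Fisher transformation: r = tanh(z)."""
    return math.tanh(z)


z_min, z_max = 0.6330, 0.7571          # confidence limits for z (Table 1)
r_min, r_max = z_to_r(z_min), z_to_r(z_max)
# The rounded values should be close to the r limits listed in Table 1.
print(round(r_min, 4), round(r_max, 4))
```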
5. Conclusions
In this paper, a Normal to Weibull change of variables has been defined. The process can be applied to any type of variables, even inversely. The multivariate Weibull PDF has been obtained and justified; it depends on the classic parameters of each single variable and on the correlation coefficients between pairs of them. It improves on former approaches, which mainly consist of models based on parameters that have no direct relationship with the univariate parameters or with the usual dependence measurement. The function proposed can be easily implemented in a software application regardless of the number of variables. The bivariate case seems somewhat complex compared to other models, but it uses the correlation coefficient between both variables. For n variables, each defined by a Weibull distribution, with given correlation coefficients between pairs, the PDF proposed is not an approximation; it provides exact results. Additionally, the application of the multivariate Weibull PDF to wind speed and wind power has been explained and derived. The bivariate case for both has also been specified due to its relevance, and some figures are given for clarification purposes. Moreover, the inference of the correlation coefficient in the bivariate wind speed distribution is explained and applied to a particular case.
Appendix
The decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose is called the Cholesky decomposition. A Hermitian matrix is a square matrix with complex entries that is equal to its conjugate-transpose.
Therefore, given Ω, Hermitian and positive-definite, the Cholesky decomposition consists of obtaining L, fulfilling Equation (32):
The Cholesky decomposition is mainly used for the numerical solution of linear equations, linear least squares problems, non-linear optimization or in Kalman filters. Here it is utilized in another application: the Monte Carlo [28,29,30] simulation.
Given X, a matrix of uncorrelated series of samples, and Ω, the desired correlation matrix for these series, Equation (33) is applied in order to obtain Y, a matrix of correlated series of samples according to Ω, where L is the result of the Cholesky decomposition of Ω.
Moreover, if X is a matrix of series of samples where each of these series follows a Normal distribution, the resulting matrix in Equation (33), Y, is also a matrix of series of samples that follows a Normal distribution, which can easily be demonstrated. Equation (33) has the shape shown in Equation (34).
The jth series is formed by m elements of the type expressed in Equation (35).
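If x_{li} is distributed according to a N(µ_{l},σ_{l}) distribution and the series are independent, then y_{ji} will follow a N(Σ_{l} L_{jl}µ_{l}, (Σ_{l} L_{jl}^{2}σ_{l}^{2})^{1/2}) distribution, where L_{jl} are the elements of L. Moreover, if µ_{1} = µ_{2} =…= µ and σ_{1} = σ_{2} =…= σ, then y_{ji} will follow a N(µΣ_{l} L_{jl}, σ(Σ_{l} L_{jl}^{2})^{1/2}) distribution. And, in the Standard case, if µ = 0 and σ = 1, then y_{ji} will follow a N(0, (Σ_{l} L_{jl}^{2})^{1/2}) distribution.
On the other hand, as L fulfills Equation (32) and Ω is a correlation matrix with unit diagonal, it is always true that Σ_{l} L_{jl}^{2} = 1; therefore, in the particular case µ_{1} = µ_{2} =…= 0, σ_{1} = σ_{2} =…= 1, y_{ji} will follow a N(0,1) distribution.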
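The procedure described in this Appendix can be sketched with the standard library alone. The sketch assumes that the series are stored as rows, so that the correlated series are obtained as Y = L X with y_{ji} = Σ_{l} L_{jl} x_{li}; the orientation, the function names, the example matrix and the seed are assumptions made for illustration.

```python
import math
import random


def cholesky(omega):
    """Lower-triangular L with L L^T = omega (real, symmetric, positive definite)."""
    n = len(omega)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(omega[i][i] - s)
            else:
                L[i][j] = (omega[i][j] - s) / L[j][j]
    return L


def correlate(X, omega):
    """X: n uncorrelated series of m samples each. Returns Y = L X."""
    L = cholesky(omega)
    n, m = len(X), len(X[0])
    return [[sum(L[j][l] * X[l][i] for l in range(j + 1)) for i in range(m)]
            for j in range(n)]


def pearson(a, b):
    """Sample correlation coefficient of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


if __name__ == "__main__":
    omega = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.5], [0.3, 0.5, 1.0]]
    rng = random.Random(1)
    X = [[rng.gauss(0.0, 1.0) for _ in range(20000)] for _ in range(3)]
    Y = correlate(X, omega)
    # Sample correlations of Y should be close to the entries of omega.
    print(pearson(Y[0], Y[1]), pearson(Y[0], Y[2]), pearson(Y[1], Y[2]))
```

Each row of L has unit Euclidean norm when Ω has a unit diagonal, which is why Standard Normal inputs remain Standard Normal after the transformation.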
Conflicts of Interest
The authors declare no conflict of interest.
References
- Weibull, W. A statistical distribution function of wide applicability. J. Appl. Mech. Trans. ASME 1951, 18(3), 293–297.
- Devore, J.L. Probability and Statistics; Brooks/Cole: Belmont, CA, USA, 2010.
- Milton, J.S.; Arnold, J.C. Introduction to Probability and Statistics; McGraw-Hill: New York, NY, USA, 2002.
- Pham, H. Handbook of Engineering Statistics; Springer: Piscataway, NJ, USA, 2006.
- Murthy, D.N.P.; Xie, M.; Jiang, R. Weibull Models; John Wiley: Hoboken, NJ, USA, 2004.
- Marshall, A.W.; Olkin, I. A multivariate exponential distribution. J. Am. Stat. Assoc. 1967, 62, 30–44, doi:10.1080/01621459.1967.10482885.
- Lee, L. Multivariate distributions having Weibull properties. J. Multivar. Anal. 1979, 9(2), 267–277, doi:10.1016/0047-259X(79)90084-8.
- Roy, D.; Mukherjee, S.P. Some characterizations of bivariate life distributions. J. Multivar. Anal. 1989, 28, 1–8.
- Crowder, M. A multivariate distribution with Weibull connections. J. R. Statist. Soc. B 1989, 51(1), 93–107.
- Lu, J.; Bhattacharyya, G.K. Some new constructions of bivariate Weibull models. Ann. Inst. Statist. Math. 1990, 42(3), 543–559.
- Patra, K.; Dey, D.K. A multivariate mixture of Weibull distributions in reliability modeling. Stat. Probab. Letters 1999, 45, 225–235.
- Liu, P.L.; Kiureghian, A.D. Multivariate distribution models with prescribed marginals and covariances. Probab. Eng. Mech. 1986, 1(2), 105–112.
- Morales, J.M.; Baringo, L.; Conejo, A.J.; Minguez, R. Probabilistic power flow with correlated wind sources. IET Gener. Transm. Distrib. 2010, 4, 641–651, doi:10.1049/iet-gtd.2009.0639.
- Troen, I.; Petersen, E.L. European Wind Atlas; Riso National Laboratory: Roskilde, Denmark, 1989.
- Wind Turbines. Part 1: Design Requirements; IEC 61400-1; IEC Standards: Geneva, Switzerland, 2005.
- Freris, L.L. Wind Energy Conversion Systems; Prentice Hall: London, UK, 1990.
- Stuart, A.; Ord, K. Kendall’s Advanced Theory of Statistics; Oxford University Press Inc.: New York, NY, USA, 1994.
- Bronshtein, I.; Semendiaev, K. Handbook of Mathematics; Springer: New York, NY, USA, 2007.
- Lu, X.; McElroy, M.; Kiviluoma, J. Global potential for wind-generated electricity. Proc. Natl. Acad. Sci. USA 2009, 106, 10933–10938, doi:10.1073/pnas.0904101106.
- Correia, P.F.; Ferreira de Jesús, J.M. Simulation of correlated wind speed and power variates in wind parks. Electr. Power Syst. Res. 2010, 80(5), 592–598, doi:10.1016/j.epsr.2009.10.031.
- Segura-Heras, I.; Escrivá-Escrivá, G.; Alcázar-Ortega, M. Wind farm electrical power production model for load flow analysis. Renew. Energy 2011, 36(3), 1008–1013.
- Vallée, F.; Lobry, J.; Deblecker, O. System reliability assessment method for wind power integration. IEEE Trans. Power Syst. 2008, 23(3), 1288–1297.
- Jobson, J.D. Applied Multivariate Data Analysis, Volume I: Regression and Experimental Design; Springer: New York, NY, USA, 1991.
- Johnson, R.A.; Wichern, D.W. Applied Multivariate Statistical Analysis; Prentice Hall: Upper Saddle River, NJ, USA, 2007.
- Kleinbaum, D.G.; Kupper, L.L.; Muller, K.E.; Nizam, A. Applied Regression Analysis and Multivariable Methods; Thomson Brooks/Cole: Pacific Grove, CA, USA, 1998.
- Meteogalicia Homepage. Available online: http://www.meteogalicia.es (accessed on 1 November 2012).
- Freris, L.L.; Infield, D. Renewable Energy in Power Systems; John Wiley & Sons: Chichester, UK, 2008.
- Metropolis, N.; Ulam, S. The Monte Carlo Method. J. Am. Stat. Assoc. 1949, 44, 335–341, doi:10.1080/01621459.1949.10483310.
- Rubinstein, R.Y. Simulation and the Monte Carlo Method; Wiley Interscience: Hoboken, NJ, USA, 2008.
- Gentle, J.E. Random Number Generation and Monte Carlo Methods; Springer: New York, NY, USA, 2005.
© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).