The Compound Inverse Rayleigh as an Extreme Wind Speed Distribution and its Bayes Estimation

: The paper deals with the Compound Inverse Rayleigh distribution, shown to constitute a proper model for the characterization of the probability distribution of extreme values of wind-speed, a topic which is gaining growing interest in the field of renewable generation assessment, both in view of wind power production evaluation and the wind-tower mechanical reliability and safety. The first part of the paper illustrates such model starting from its origin as a generalization of the Inverse Rayleigh model - already proven to be a valid model for extreme wind-speeds - by means of a continuous mixture generated by a Gamma distribution on the scale parameter, which gives rise to its name. Moreover, its validity to interpret different field data is illustrated, also by means of numerous numerical examples based upon real wind speed measurements. Then, a novel Bayes approach for the estimation of such extreme wind-speed model is proposed. The method relies upon the assessment of prior information in a practical way, that should be easily available to system engineers. In practice, the method allows to expre ss one’s prior beliefs both in terms of parameters, as customary, and/or in terms of probabilities. The results of a large set of numerical simulations – using typical values of wind-speed parameters - are reported to illustrate the efficiency and the accuracy of the proposed method. The validity of the approach is also verified in terms of its robustness with respect to significant differences compared to the assumed prior information.


Introduction
In recent years, renewable energy has become one of the major research topics among power system engineers. Especially wind energy is considered one of the largest potential renewable energy able to generate a great amount of electricity with a low value of polluting emissions into the atmosphere [1][2].
The potential of wind energy is strictly related to the availability and the intensity of wind in a certain geographical area. These two characteristics depend on the climatic conditions as air temperature, wind speed, solar radiation, and other factors. Therefore, the wind electrical energy use needs a deep analysis of the wind conditions.
In particular, the electrical power produced by the wind energy is modelled as a statistical distribution of the wind speed (WS) random variable (RV); this determines that the estimation of wind electrical energy production is strictly related to the accuracy of the adopted wind speed distribution [3][4][5][6][7][8][9][10][11][12][13][14][15][16]. So, an accurate wind speed distribution modeling is the first step to achieve accurate wind energy production estimation. Moreover, as discussed in [17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32], the randomness of WS also has a great impact on the mechanical reliability of wind power systems, since extreme values of wind speed (EWS) may damage sensible components of the structures, such as towers and wind blades, so that EWS characterization also constitutes a basic tool for an efficient wind turbine design. Furthermore, as also pointed out in [15][16][17][18][19], values of WS that are greater than the "cut-off" value of the wind generator are generally undesirable, since the electric generator has to be disconnected from the wind turbine to avoid damages to the electrical section of the wind power system; consequently, the "cut off" value of the generator must be chosen according to the characterization of EWS in the particular location, since it has a great impact on aggregate power production [1][2][3][4][21][22][23]. So, EWS is also indirectly related to the electricity production, as discussed in [4], although the paper is primarily devoted to the issue of mechanical safety. Furthermore, as reported in [24], a reliable assessment of the expected EWS values is necessary -still in view of guaranteeing a desired safety level-also for the design of structures at exposed locations (e.g. bridges, radio masts), and not only for the wind turbines.
For such purpose, a new model, the Compound Inverse Rayleigh (CIR) distribution, is proposed and illustrated in the paper. Moreover, a novel Bayes method of inference for such model is adopted in the paper, in view of designing an efficient method of estimation.
Concerning the topic of estimation, it should be remarked that EWS characterization is also important in order to attain an efficient estimation of the upper quantiles of the wind speed. Indeed, wind power statistics is highly affected by the "extreme tails" of the probability distribution of the wind speed random variable [24][25][26][27][28]. The upper quantiles estimation of the wind can result very crucial, especially in view of the fact that such quantiles are to be transformed into the corresponding quantiles of the "Wind Power Density" (WPD), formulated in W/m 2 by the well-known cubic rule [1][2][3][4]: where W is the wind-speed in m/s;  is the air density (nominally 1.225 kg/m 3 at 15.55 °C and 101.325 Pa). Indeed, it is known from basic statistical properties, a given p-quantile, wp, of the WS is transformed into the p-quantile of the RV Z=WPD by means of same cubic rule (1), thus emphasizing any possible bias in the quantile estimations. This holds especially in the cases of upper extreme quantiles (e.g., those corresponding to probability values p=0.95 or 0.99) which corresponds to very high values of power, around which the power curve is most sensitive to even very small variations of its argument, so that a small error in wind speed evaluation may cause a large error in wind power, because of the above cubic relation. That is one first reason why electric utilities need a careful wind speed statistics assessment.
With this aim, a new model, the Compound Inverse Rayleigh (CIR) distribution, is illustrated and proposed in the paper as valid model for extreme wind speeds. Such CIR model takes its name (as well as its derivation) from the more popular Inverse Rayleigh distribution [19,[33][34][35], which was proposed since a few decades for the purpose of modeling positive random variables, and especially for its applications in reliability and survival times studies. Such model also constitutes a particular case of the Inverse Weibull distribution [33] which, which -apart being rather popular for its reliability applications [36][37][38] -has been also adopted as a valid model for the characterization and estimation of EWS [6][7][8][9][10][11]23]. The IR model is studied in this paper in relation to applications to EWS data analysis, by extending a similar approach promoted by some of the authors in a previous paper on the subject [19].
The following expressions for CIR, CDF and PDF are reported, while their derivation from the more known Rayleigh and Inverse Rayleigh models will be shown afterwards: (2) where is a positive scale parameter. It is apparent that is the median value, or the "0.50quantile" of the CIR model. Indeed: F( ) = 0.5 (4) It is recalled that the p-quantile is a value x* of a RV is such that p100% of the observed values of the RV fall below x*, i.e. -in the case of a continuous RV as in the present case -it is the unique solution of F(x*)=p, being F(x) the CDF of the RV. So, the median value is equal to the 0.5-quantile, such that 50% of the observed values of the WS values fall below . In particular, the upper quantiles of the WS probability density function (PDF) are a key topic of the present study.
The rest of this paper is organized as follows. In section 2, the CIR model is developed, and illustrated with some details also concerning its deduction from other models, or its similarities with other, more popular models, such as the Pareto one [24,33].
Then, Sec. 3 shows many applications of the CIR model to extreme wind speed real data. In Section 4, we introduce a suitable Bayesian approach [39][40][41] for the estimation of the SI, with reference to the CIR model. The method constitutes a "practical" Bayesian statistical inference approach [40] for the estimation of the EWS distribution from available data, by using a model quantile or the model CDF as the proper input for the prior assessment of the model. In order to assess the overall performance of the proposed Bayes estimation, a large series of Monte Carlo simulations have been carried out, whose results are illustrated in the successive section 5, on the basis of typical values of real EWS data. Such results assess the good performance of the proposed approach, especially when compared to "classical" estimation methods, such as the Maximum Likelihood estimation, the method of Moments, or the Quantile Estimation method [28][29][39][40][41]. Moreover, to highlight a special feature of the proposed method, in the final part of section 5 the robustness of the proposed estimation method is illustrated, by means of a further large series of simulations. Such "robustness analysis" clearly shows that the same estimation efficiency holds, at least in most of the examined cases, also when the true prior model is very different from the one assumed for the random parameter under study.

CIR model genesis
As stated in Sec. 1, the CIR distribution is obtained from a mixture of IR distributions, arising from the random variation of some related weather variables, such as temperature and pressure. The mixture of the shape parameter has been adopted also for the widespread Weibull model [29] in order to allow for the randomness of environmental conditions of the windfarm under study. Such continuous mixture generated by a probability distribution on a given parameter of the WS distribution allows to obtain a more flexible WS distribution. This method, when applied to the IR model, gives rise to the above introduced "Compound Inverse Rayleigh" distribution, leading to the analytical expression of its CDF and PDF reported in (2) and (3) respectively. It is recalled that, given a positive random variable X (the EWS in this study), the IR Model CDF F(x)=P(X≤x), with positive shape parameter may be expressed by the following equation: with the following probability density function (PDF): In this paper, the so-called CIR model constitutes a generalization of the Inverse Rayleigh Distribution obtained by means of a continuous mixture of its parameter . So, let the parameter α be considered -for the reasons above outlined -as an uncertain quantity, which is to be modeled in a Bayesian approach [39] through a Gamma PDF [33]: where is the scale parameter and is the shape parameter of the PDF of α, both positive numbers, and Γ(•) is the Gamma special function defined as follows: Let the EWS RV be denoted by X, and possessing the IR CDF of (5), conditional to a given value of the parameter . Then, using the total probability theorem, the unconditional CDF of the RV X can be expressed, denoting (for covering also more general cases )  = -2 , as: and, by means of some straightforward manipulations, as follows: So, equation (9) can be rewritten as: or, denoting by = 1/ , the following expressions for the CDF and PDF are obtained: where , are a scale and a shape parameters, respectively. The above model belongs to the family of "Inverse Burr" distributions [28][29]33]. Assuming = 1 , so that the mixing Gamma distribution becomes an Exponential distribution, the expression of the CIR distribution of (2) and (3) is obtained.

CIR model statistical parameters
In this sub section, some graphical examples are illustrated for the CIR model, as well as its comparisons with the IR and similar models.
By virtue of (3), the median value -or the 0.5-quantile -of the CIR distribution is simply: Any p-quantile Xp may be expressed by means of the median value m by: with = √ and = 1 − . Of course, if p=0.5, then kp=1, and (14) coincides with (13).
The mean value is expressed by:

CIR model examples related to WS distribution
In Fig.1, three typical CIR PDF curves corresponding to the analytical expression of (1)-(2) are shown, characterized by three different values of the parameter  (10,25,40). The values on the xaxis are WS values in m/s. It may be shown indeed that the two models become close numerically for large values of the WS. It is simple to note that when the CDF argument diverges, or, in practice when the WS value x is sufficiently high, so that ≫ max ( , ) , the first-order Maclaurin series expansions of the IR and CIR CDF reported respectively in eqns. (2) and (4) show that ( ) ≅ ( ) when and are not very different, since: (16) and:

Comparison among similar WS models
It seems opportune to compare the CIR model with other similar models such as the IR and the Exponential model. Further on, also some hints at a possible approximation with the Pareto CDFalso widely adopted for extreme value data characterization -are illustrated in the present subsection.
First, the IR and CIR model are examined. It is interesting to compare, in order to get a better knowledge of the two models in their "extreme tails", the quantiles of the two models. when they possess the same median or 0.50-quantile. So, let m be the common median value. The CIR model is expressible directly in terms of m, of course, being m=, as follows: Also the IR CDF (as well as the corresponding PDF) is expressible in terms of the median m, as follows, by denoting: Such re-parameterization can be useful, as remarked in [37][38] , in the phase of a proper model identification, since different probability models often imply similar estimated values of "central parameters" -such as the median or mean value -when fitted to the same data, but exhibit different "extreme" tails, i.e. extreme quantiles such as the ones corresponding to values of p equal to p=0.95 or p=0.99.
The generic p-quantile of the IR and CIR models -respectively denoted as Xp and X'p, are simply expressible in terms of their median m as follows: In Table I, three different p-quantiles (for p= 0.63, 0.95, 0.99, in column 3,4,5) are reported for an IR model (row 1) and a CIR model (row 2) with the same median m=7 m/s. A not very large value of m has been chosen for sake of illustration, with the purpose to highlight that, also in such case, the extreme quantiles may sometimes be rather different. Moreover, the "large tails" of the above models under studies are also made more evident by reporting the corresponding quantities for a "classical" Exponential model, i.e. with CDF expressible as follows in terms of the mean value : For a more complete comparison also the mean (expected) values of the three models are reported. The one of the Exponential model is reported as a parameter in (23), those of the IR model and the CIR model (as reported above) are respectively: For the Exponential model, it is well known that m =, and that the mean is equal to the pquantile for p=0.63, which is the reason why such 0.63 quantile is reported, for easier reference with the other models. In all the three cases, the mean is higher than the median, as always happens with unimodal distributions with positive skewness (i.e., with "right tails" [33]). The curves of the PDF corresponding to the three models are shown in Fig. 2.

Figure 2.
The curves of the three PDF corresponding to the models (the Exponential, IR, CIR) in Table I, with the same median value 7 m/s. Table I, as also from the curves, that, although possessing the same median (and about the same mean) -or more generally being similar in their central part" (including the 0.63-quantile) -the IR and CIR models behave quite different in their extreme right tails, the one of the CIR model being much larger than the one of the IR model. For instance, the 0.95-quantile has a 18% increase from the IR to the CIR model, while the 0.99-quantile has a 20% increase. Both models, of course, possess larger tails when compared to the Exponential model.

It is clearly apparent, from the values reported in
It is remarkable that the above right-hand sides in eqns. (16) and (17) can be seen as a particular case of the Pareto distribution (although only for very large x values), a very classical model for describing the extreme values distribution [31][32][33]. The Pareto CDF has indeed the following expression (which, if h=2, is the same expression of (14) and (15) for finite values of x): A similar argument also applies to the Inverse Log-logistic and the Inverse Burr models [28][29]. (the latter is also known in literature as "Dagum distribution" [42]), whose CDF can be expressed respectively as follows: being () all positive parameters.
The problem of the estimation of the CIR model is dealt by means of a proper Bayes inference approach in sec. 4. First, in the following sec. 3, many applications of the CIR model to real WS or EWS datasets are shown.

EWS datasets
The CIR model is applied to characterize real-world data collected at several locations in Greece in 2012 [25][26]. Data consists of hourly observations of wind speed, from which the EWS are extracted in terms of Period-Maxima (PM) EWS and Peaks-Over-Threshold (POT) EWS. PM values are the maximum WS values in a given time interval (one week for the present data set), while POT values re only the WS values which are greater than a given threshold value. Four locations, that significantly differ in terms of wind speed characteristics, are presented in this Section: Amintayo, Beui, Koilada, and Pontokomi.
In order to keep the dimension of data consistent, the threshold picked for building the POT EWS dataset is not the same across the locations. It is set at 10 m/s for locations Amintayo and Pontokomi, and 5 m/s for Beui and Koilada. With reference to the PM EWS dataset building, the period considered is 1 week for all the four locations. With this approach, the eight considered datasets have the size and the statistical characteristics presented in Table 2. The goodness of fit of the CIR distribution upon these data is compared to the goodness of fit of other benchmark distributions that are typically used for EWS characterization, i.e., the Inverse Burr (IB) distribution, the Inverse Weibull (IW) distribution, the Gumbel (GU) distribution [33], and the Inverse Rayleigh (IR) distribution.
To evaluate the performance of the CIR distribution and to compare it with the benchmark distributions, the outcomes of the Kolmogorov-Smirnov (KS) test [41][42], i.e., the KS test statistics ( ), is presented. The Adjusted Determination Coefficient ( ) is presented too, in order to account for the number of parameters in each distribution [41].

Period-maxima EWS characterization
The results for the PM EWS characterization are shown in Table 3. Bold values indicate the best results for each location.
The CIR distribution has overall the best performance for the locations Koilada and Pontokomi, and it also shows the smallest KS test statistic for the location Amintayo (note that the CIR for the Amintayo location is only slightly smaller than the greatest value achieved by the IW distribution). Nevertheless, the CIR distribution is not suitable to characterize the PM EWS at the Beui location, as it is unable to reach performance close to the IB distribution, which is the best pick for this location. In only one case (i.e., the GU distribution for the location Amintayo) the KS test was not successful. With the purpose of providing a graphical interpretation of the fitting, Figure 3 shows the sample CDF and the fitted CIR, IB, and IR distributions, for the Amintayo, Beui, Koilada, and Pontokomi locations, respectively.

Peak-over-threshold EWS characterization
The results for the POT EWS characterization are shown in Table 4. Bold values indicate the best results for each location. The CIR distribution always returns the smallest KS test statistic and the greatest , in all the considered locations, denoting an overall better fitting than the fitting of PM EWS. of estimation, any parameter to be estimated is seen as a fixed but unknown variable. Instead, in Bayesian parameter estimation [39][40][41]43], each unknown parameter is seen as a random variable, so that it possesses a "prior" PDF representing the probabilistic information or the "belief" of the observer about the parameter. Then, letting the parameter (possibly a vector) denoted by w, its prior PDF g(w) is integrated and updated with field data -denoted by D -by the below reported well known Bayes' theorem, which allows to obtain the the posterior distribution g(w|D) of w, conditional to the observed data D: where: • L(D|w) denotes the Likelihood (i.e. the joint PDF in the present case) of the data D conditional to the parameter w; • C is a constant (with respect to the parameter values), obtained by the "normalizing" integral: The above integral is a multiple one in the case w is a vector of parameters, say: w= (w1, w2,..,mwp), so that dw= dw1dw2..dwp. In a Bayesian analysis, the posterior distribution sums up the current status of knowledge about the unknown parameters of probability distributions, integrating the prior knowledge (before data observation) -or the initial belief about the parameter -represented by the prior distribution, and the knowledge produced by data in terms of the likelihood function of the observed data, which in a sense measures the information contained in the data. Once the posterior distribution, here in terms of posterior PDF g(w|D) of (29), has been deduced, there are various possible choice or the Bayes point estimate of the parameter w. A very common choice as the "best" Bayes estimate -in the mean square error sense -is the posterior mean of such PDF. As well known [39], such estimate minimizes the posterior mean squared error.
Another popular choice, here performed, is the so called "Maximum a Posteriori Probability" (MAP) estimation, in which the point estimate w° of the parameter w is the value which maximizes the posterior PDF g(w|D): where Sw is the parameter space, i.e. the set of all possible values that w can assume. Such choice, which by definition maximizes the probability of a correct estimation (or minimizes the estimation error probability), constitutes a "classical" method in Bayesian analysis and has gained new interest in last years, being also adopted in the framework of the recent machine learning methods [44][45][46].
Both the posterior mean and the MAP estimator, as will be apparent from the following equations, require numerical methods, but the MAP estimator is somewhat simpler to compute, being nonetheless very close to the posterior mean.
In view of the estimation of the CIR model, the parameter w to be estimated, appearing in eqns.
(29)-(31), i.e. the "input" of the estimation process, may naturally be the parameter  of (2) -(4), also reported in the following eq. (32). The parameter  also represents the median of the CIR distribution. However, also any other parameter =() related to the unknown parameter  may be chosen as the input parameter. It is recalled indeed that, in the Bayesian methodology, also any functions of the random parameter  are themselves RV, described by appropriate distributions. In particular here, once a given WS value x has been fixed, the attention may be focused on the inference on value of the CIR CDF at x: Indeed, also the CDF -as well as other parameters (e.g. moments and quantiles) -may be considered as the input of the estimation process. It is remarked that, although formally equal to eqn.
(2), the expression in (32) should not be seen as a function of the WS x, which is instead a fixed constant there This is indeed a function of the unknown parameter  thus constituting a new parameter, on which it should be easy to assess a prior PDF. For mathematical convenience, alternatively, by introducing the new RV: the above expression of the CIR CDF can be re-parametrized (omitting for clarity the dependence from x, which is a constant, so writing F instead of F(x)): Of course, assessing a prior PDF on the random parameter  implies -being x a given constant assessing a prior PDF on the random parameter Y, as well as on F and other related parameters; of course, also the converse is true. The choice of the most convenient way depends on the kind of information that the observer possesses.
For instance, a reasonable prior assessment could be the one assessing the probability that the WS is higher than a given value x: this seems indeed a realistic piece of information, which should be available to the engineer based on past WS data. By the above relation, such "exceedance probability" is expressed by: being S(x)=1-F(x) the "Survival function" of the WS RV. It is interesting to remark that this can be considered as a "un-safety index", accounting for the most severe EWS amplitudes as disturbances for wind tower safety. In summary, for the above estimation, two methods are considered in the paper in order to establish an appropriate prior distribution as illustrated in the following sub-sections, i.e. assessing assigning a Lognormal prior distribution to the parameter  (sec. 4.1.), or a Beta prior distribution to the parameter S (sec. 4.2). Although, in general, the two prior assessments (i.e., assessing a prior distribution to the parameter  or a prior distribution to the parameter S) can be made equivalent by appropriate choice of the relevant prior PDF parameters, choosing one or the other as the basic input information has some peculiar analytical aspects that are highlighted in the following, since -for instance -a Lognormal prior distribution on the parameter  does not imply a Beta prior distribution on the parameter S, but a so-called "Beta-Lognormal" distribution discussed in the following, which can nonetheless approximated to certain extent by a Beta distribution.
Finally, it is remarked that the choice of assessing a prior distribution to the parameter S is in accordance with the so-called "practical" Bayes estimation, which is widely adopted nowadays in various applications after being first proposed in [36] in the framework of reliability analyses.

"First approach": choice of a Lognormal prior distribution the parameter 
Let us examine the first choice, for which a Lognormal prior PDF is suggested for its great flexibility and for its easy transformation into the corresponding prior PDF for the above introduced RV Y and S, when needed. Let  be a Lognormal RV, with the following PDF [33]: The function g() is zero for  <0; parameters ,  are respectively the scale and the shape parameters of the Lognormal PDF. Then, by virtue of (33), also Y is an opportune Lognormal RV, because of known properties of the Lognormal model [33] (among which, the fact that -given a Lognormal RV Y -any power function, Y k is still a Lognormal RV, whatever the value of the real constant k). The parameters of the PDF of Y are simply deducible from those of . Then, the RV S of (35) possesses a known PDF, which can be expressed as follows. First, the new RV : is introduced. It is again a Lognormal RV with PDF, here denoted by h(t), simply expressible in terms of the Lognormal PDF of (36): Then, the PDF of S is easily expressible in terms of the one of T, since: Then, denoting by q(s) the PDF of S, it can be expressed in terms of the above PDF h(t) of T by means of a "Beta-Lognormal" PDF (introduced in [47]), which -by well-known rules of PDF transformations, is given by: Of course, since F= 1-S, also the prior PDF of F is simply expressible. The above PDF should be conveniently approximated by a Beta PDF, as discussed in [47]. Indeed, it is known that a Beta RV, say Z, may be characterized, under some hypotheses, as the ratio Z= U/(U+V) = 1/(1+W), being U and V independent Gamma RV [33,43], and so W= V/U the ratio of two independent Gamma RV. Because of the known similarity between the Gamma and the Lognormal PDF (for given common values of mean and standard deviation, [46]), it may be expected that also the PDF of Y in (39) may be at least approximated by a Beta PDF, as in our application was successfully verified. Indeed, in the following section 5 it is shown that the estimation results are not so different when assuming a Lognormal PDF on  (as discussed here), or when assuming a Beta PDF on S, as discussed in the following sub-section.

"Second approach": choice of a Beta prior distribution the parameter S
With the purpose of assessing the prior information on S, a Beta PDF [33,39,41,43] may be successfully adopted, as generally done for RV in (0,1) [39], due to its great flexibility in this interval.
The Beta PDF for S, f(s), with positive parameters (p,q), can be expressed as follows: It is noticeable that in such case also F = 1-S is a Beta RV [33], whose PDF can be obtained from (41) simply swapping the roles of parameters (p,q). Such choice also implies that the RV Y of (35), which is obtained as a function of S (as simply deducible from (35) itself) by: is a RV with support (0, ∞), which possesses a so-called "Beta Prime" distribution [33], with the same positive parameters (p,q) of the Beta PDF of S. This means that the PDF u(s) of Y can be expressed as follows: Such PDF of Y implies in turn an analytical expression also for the parameter , which is also a RV with support (0, ∞) , and by means of (33) is proportional to the square root of Y: This means that the PDF of  can be still expressed, with some adequate transformation, in terms of the Beta Prime distribution of (43). I.e., given the PDF u(y;p.q) of (43), the implied prior PDF of , say v(), is expressed by: The applications of the above equations (36) and (45), expressing respectively the prior PDF of  according the first approach and the second approach, have been performed in the course of the computations of the following section, as described afterwards. Then, both in the case of the first and the second approach, in order to obtain the Bayes estimate (be it the posterior mean or the MAP estimate) , the posterior PDF shall be computed according (29), i.e. by multiplying the above prior PDF ((36) or (45), according to the case under study) and the Likelihood function L(D|w). Such Likelihood function, which can be denoted by L(D|), since in this case is the only parameter to be estimated, is the joint PDF of the CIR RV which constitute the sample of n WS values (x1, x2 , …,xn), i.e. it is the product of n PDF f(xj| ) j=1..n, such as the one in (3), evaluated in xj (j=1..n), being n the sample size. As long as the MAP estimate is concerned, it isn't important to evaluate the constant C in (29), since its value does not change the maximum of the above product g() L(D|). However, the MAP estimate must be obtained numerically (and the same would be true also for the posterior mean or other possible estimates).

Introduction: illustration of the Bayes estimation results
The present section illustrates a series of numerical applications for assessing the estimation efficiency of the proposed Bayes approach based on a large set of simulated experiments. All the relevant results are expressed in terms of the estimation of the only model parameter, , here considered as an unknown. Nonetheless, as it was illustrated in previous sub-section 4.3, in some cases, according to the "second approach", the primary source of (prior) information will not be considered the information on the parameter, but the information on a given values of the S. This efficiency assessment is carried out in "absolute" terms, as well as in terms of 'relative efficiency' [41-43-46] of the Bayes approach compared to the "classical" approach. As is well known [41,[43][44][45][46], the classical approach is generally based on the aforementioned ML estimation method, which in the present case can be easily derived from the more general one regarding the EWS model presented in [29]. However, as discussed in [29], such MLE procedure yields sometimes some convergence problems. In such cases, simpler ways to attain parameter estimates are the Method of moments estimation [41] (briefly, "moments estimation ") and the "quantile estimation" (QE) [29] methods. In the case of the CIR model, the moment estimation procedure consists in solving a simple system of one equation, obtained equating the first sample moment (or the "sample mean", i.e. the average value of WS in the sample) to the theoretical mean value. The QE procedure consists in equating a given sample quantile to the analogous theoretical quantile. Since the parameter  is equal to the median of the CIR model, it is convenient in this case to adopt a QE method by choosing the sample median. As intuitive from previous equations expressing the statistical parameters of the CIR model (showing that the mean and the median are related by a linear relationship), it may be expected that the moment estimation and the QE yield very close estimates for , as was verified in the course of the computations of the present section. Moreover, in all the applications here presented they resulted practically equal to the MLE. So, for brevity and computational simplicity in the following the value of the QE estimate is reported as the "classical" estimate.
As hinted at before, the Bayes MAP estimates have been obtained, in each simulation, numerically (since an analytical solution of the above equations is not available), on the basis of the performed simulations which are described in the following sub-sections, respectively for the first and the second approach (i.e. the choice of a Lognormal prior distribution on the parameter  or the choice of a beta prior distribution on the parameter S, as explained in section 4).
The results are reported with the following subdivision: 3) Case C (in sub-section 5.4), in which a "robustness analysis" is illustrated, The three cases (A,B,C) are further subdivided into more sub-cases (A1-A3, B1-B3, C1-C4), as explained in the following .

Case A: Bayes estimation results for the "first approach" (choice of a Lognormal prior distribution on the parameter ).
Tables 5-7 illustrate the results of 3 sets of simulation cases, among the many more performed, in which a Lognormal prior PDF g(; ) as in (36)  These three cases are selected to highlight different degrees of prior uncertainty; indeed, a greater prior CV involves a greater degree of prior uncertainty, thus the estimates are expected to be less efficient when examining the results rom case A1 to case A3, as will be confirmed to some extent by the results hereafter. Out of the above cases, case A1 is to be considered the "worst" case since it possesses the highest CV value. In Fig. 5, the Lognormal prior PDF of case A2 for  is shown. index: the more REFF overcomes unity, the more efficient is the Bayes estimate compared to the classic estimate. Furthermore, the meaningful "accuracy indexes" listed here below were also considered as a quantitative way to assess the "relative bias" (or the "precision") of the Bayes estimator [39], which should be of course as small as possible: • Mean Relative Error of the Bayes estimator (BMRE); • Maximum Relative Error of the Bayes estimator (BMAXRE).
In all simulated samples for the estimation of , the estimates are carried out for different values of sample size n -as usually happens in reality -which can be classified as small, i.e. n=5,10, and moderately large, i.e. n=20,30. It is apparent from the above tables that, as desired, the indexes BMSE, BMRE and BMAXRE are very limited, while the REFF index is always larger, often much larger than 1 (unless the sample size and/or the prior CV are relatively high), accounting for a very efficient estimation procedure. A more detailed discussion of the reported results is postponed to the end of next sub-section, in order to examine also the cases (B1, B2, B3) relevant to the "second approach".

Case B: Bayes estimation results for the "Second approach" (choice of a Beta prior distribution the parameter S).
According to the same methodology as in the previous sub-section, tables 8-10 illustrate the results of 3 sets of simulation cases, among the many more performed, in which the before discussed choice of a Beta PDF on the Survival function S= S(x), at x=11.5 m/s is adopted. All the cases reported here are relevant to a prior mean E[S ] = 0.5 at x=11.5 m/s (which is -in a probabilistic sense -to a certain degree equivalent to assume a prior mean of the median having value 11.5 m/s, as in cases A). As in case A, the same three different assumptions on the prior CV are held, originating the three sub-cases B1, B2, B3, reported below together with the corresponding values (p,q) of the beta prior PDF u(p,q) of (43): • In Fig.6. the Beta prior PDF of case B1 for S is shown.   (in terms of BMSE) are reasonably limited, for each value of n. Moreover, the relative efficiency (REFF index) with respect to the classic estimate is always larger than 1, not only for small sample sizes (n = 1, 3), as expected, but also for sample sizes as large as n = 30 (of course, as well known [39], the REFF index generally decreases with n).
Looking in succession to the three tables, the efficiency of the Bayes estimates is seen to increase from case A1 to A3, and from case B1 to B3. This was to be expected, since the prior precision (as measured by the prior CV) increases as the CV decreases. Nonetheless, the results show that also in cases A1 and B1, the "worst" cases here reported (i.e. those implying the highest CV value), the results denote an excellent efficiency. This is true both in absolute terms and with respect to the classic estimation results. Moreover, the estimation efficiency seems generally greater, in most cases, for case B than for case A.
Similar considerations, as the ones regarding the estimation efficiency; may be made with respect to the precision of the estimates, described by the indices BMRE and BMAXRE, which are always very small; this feature appears to a certain extent independent from the sample size.

Case C: Bayes estimation results relevant to a robustness analysis
The numerical assessment of Bayes approach efficiency is closed by a "robustness" analysis [39] with respect to the choice of the prior distribution (but always retaining, except in the last case, the same values of the assumed prior mean and CV). The purpose of a robustness analysis is to certify, to a certain degree, that the performance of the estimation method is not so much sensitive with respect to "wrong" prior assumptions, in a sense which will be better clarified afterwards.
In the cases examined here, and illustrated by the last four tables 11-14, it is hypothesized the Bayes computations are performed assuming a "wrong" prior distribution, be it the Lognormal model of case A for  or the Beta model of case B for S: instead, a different prior distribution generates the true values of  or S. In detail, the following more cases are illustrated in Tables 11-14, indicated overall as "Case C", with the following subdivision: • Case C1: The prior distribution of  is Normal, with the same mean (11.5 m/s) and CV=0.15 of the assumed prior distribution of  , which is the Lognormal prior PDF of case A1; • Case C2: The prior distribution of  is Uniform, with the same mean (11.5 m/s) and CV=0.15 of the assumed prior distribution of  , which is again the Lognormal prior PDF of case A1. This implies that the prior distribution of  is Uniform in the interval (a,b), with: a=8.5122; b= 14.4878; • Case C3: The prior distribution of S is Uniform, with the same mean (0.50) and CV=0.15 of the assumed prior distribution of S, which is the Beta prior PDF of case B1. This implies that the prior distribution of S is Uniform in the interval (a,b), with: a=0.3701; b=0.6299; • Case C4: The prior distribution of S is Uniform in (01), while the assumed prior distribution of S which is the Beta prior PDF of case B1. This implies that the prior distribution of S has the same mean (0.50) of the assumed prior, but a quite different CV (0.5477).
The four cases are representative of a wide range of cases in which the assumed prior PDF is a wrong one, and in particular, in the last case C4, a very wrong assumption is hypothesized, since the Uniform PDF in (01) is very different from the Beta PDF; the Uniform PDF possesses a mean value 0.5 and, as well known [33], a standard deviation equal to 1/√12,and thus a CV equal to 2/√12 ≈ 0.5477. It is remarked that any any case the less favourable prior PDFs are adopted as the reference models, i.e. those of cases A1 and B1, possessing the highest values of the CV.
The reported results show that the efficiency and the precision of the of the estimates is not so much sensitive with respect to "wrong" assumptions on the prior distribution, at least when the values of the assumed prior mean and CV are the "correct" ones. Indeed, the REFF index is less than 1 in only three cases: one is relevant to the case C2 with sample size n=30, the other two are relevant to the case C4 with sample size n=20 and n=30, which may be not surprising given -in such cases -the large difference between the assumed prior model and the true prior model, and the high value of CV and sample size. Also in such cases, nonetheless, the very low values of the BMSE should be noticed.
It is remarked that the above terms "wrong" and "correct" are to be interpreted with a certain degree of caution, since -in the framework of the Bayes inference -the prior assumptions have the meaning of personal or subjective beliefs, based upon the experimenter's experience or past data, and should not be justified. Nonetheless, in order that the results could be, in a sense, also accepted in the framework of an objective estimation philosophy [39], the robustness appears to be a very desirable property, also aimed at circumventing possible criticisms that an "ad hoc" prior assumptions is made with the purpose of obtaining the desired results.
Finally, the proposed Bayes estimation method appears to be both efficient and robust. It is remarked that the proposed method of estimation appears to be both simple and very practical, since it only requires some prior guess on the median of the EWS distribution, or on the probability that the wind speed is larger than a given value, and that such choice could be expressed by very simple and flexible prior distributions, such as the Beta and the Lognormal models.

Conclusions
In the paper, the Compound Inverse Rayleigh model has been motivated for the probabilistic characterization of extreme values of wind-speed, also illustrating its validity in interpreting real wind-speed data sets. Moreover, the statistical estimation of the model has been developed by means of a novel Bayes estimation method, whose results have been illustrated in the second part of the paper, on the basis of a very large set of numerical simulations. Such Bayesian approach constitutes a "practical" statistical method for the estimation of the above distribution parameter from available data, proposing two alternative, simple prior models, the Beta and the Lognormal models, for the prior information assessment of a model quantile or of the model Survival Function. Before introducing its practical adoption to real data sets, and then its estimation, the model has been deduced analytically as a mixture of Inverse Rayleigh models (a mixture which originates the model's denomination), also discussing with some detail its comparison with other more popular distributions adopted for extreme wind-speeds, such as the Pareto, Gumbel, Inverse Weibull, Inverse Burr and more.
In section 5, different performance indexes, such as the MSE, the "Relative Efficiency", the "Mean Relative Error" and the "Maximum Relative Error" indices have been evaluated by means of a large set of numerical simulation, with the purpose of a thorough assessment of the performances of the estimation method, with very satisfactory results. In the final section, a numerical analysis is devoted to the robustness assessment of the methodology (i.e. using prior distributions different from the ones chosen here), again with good results., showing that the estimation efficiency holds, at least in most of the examined cases, also when the true prior model is rather different from the one assumed for the computations.