Scale Mixture of Rayleigh Distribution

: In this paper, the scale mixture of Rayleigh (SMR) distribution is introduced. It is proven that this new model, initially deﬁned as the quotient of two independent random variables, can be expressed as a scale mixture of a Rayleigh and a particular Generalized Gamma distribution. Closed expressions are obtained for its pdf, cdf, moments, asymmetry and kurtosis coefﬁcients. Its lifetime analysis, properties and Rényi entropy are studied. Inference based on moments and maximum likelihood (ML) is proposed. An Expectation-Maximization (EM) algorithm is implemented to estimate the parameters via ML. This algorithm is also used in a simulation study, which illustrates the good performance of our proposal. Two real datasets are considered in which it is shown that the SMR model provides a good ﬁt and it is more ﬂexible, especially as for kurtosis, than other competitor models, such as the slashed Rayleigh distribution.


Introduction
Rayleigh distribution is a continuous and positive distribution named after Lord Rayleigh (J.W. Strutt 1880-1919), who introduced it in connection with a problem in acoustics. Since then, it has been widely used in different fields of science and technology and has become one of the most popular models for describing skewed positive data. Siddiqui [1] discussed the relationship of certain problems with the Rayleigh distribution. Miller [2] worked with this distribution for modeling the length of a vector in an N-dimensional euclidean space, whose components are independent and normally distributed according to a N(0, σ 2 ) distribution. Polovko [3], in the field of reliability, highlight that the hazard function of a Rayleigh distribution is an increasing linear function of time. Hirano [4] provides us the history and relevant properties of this model. Results on unbiased minimum variance estimation in this model can be seen in Lopez-Blazquez et al. [5]. Bayesian analysis has been carried out by Ahmed et al. [6], who discussed and compared classical and Bayesian methods by using the mean squared error. As for applications of interest in the fields of Engineering and Physics, Sarti et al. [7], Kalaiselvi et al. [8] and Dhaundiyal and Singh [9] can be cited, among others. All these references illustrate the potential interest of this model from a theoretical and applied point of view.
Recall that a continuous random variable (RV) X follows a Rayleigh (R) distribution with scale parameter σ > 0, denoted as X ∼ R(σ), if its probability density function (pdf) is , where X and U are independent RVs, X ∼ R(σ), U follows an uniform distribution on (0,1), U ∼ U(0, 1) and q > 0 is a scalar. The pdf of T is where Γ(·) denotes the gamma function, and F(·; a; b) the cdf of a gamma distribution with shape and rate parameters a and b, respectively. In this paper, an extension of the R distribution is introduced following the general method to obtain distributions with a higher kurtosis coefficient than the slash version of the Rayleigh model proposed by Iriarte et al. [10], and applied successfully by other authors: Reyes et al. [11] to obtain the Generalized Modified slash model, Reyes et al. [12] to get a generalization of Birnbaum-Saunders, Iriarte et al. [13] and Segovia [14] to extend the quasi-gamma and power Maxwell distributions, respectively. It will be proven that our proposal admits a representation as a scale mixture of a Rayleigh and a Generalized Gamma distribution. Throughout the different sections in this paper, it will be shown that the SMR distribution can be used for modeling positive, right skew data with a heavy right tail.
The outline of this paper is as follows. Section 2 is devoted to the study of the SMR model; the property of mixture is presented and closed expressions are given for its pdf, cdf, moments, asymmetry and kurtosis coefficient. In Section 3, results of interest in survival analysis and reliability are given. These are the survival and hazard function, mean residual life and order statistics. We highlight that, in contrast to the Rayleigh distribution, whose hazard function is increasing and linear, the hazard function of the SMR model is unimodal. In Section 4, the Rényi entropy is obtained. Section 5 is devoted to moment and ML estimation of parameters. An EM type algorithm is given, which has closed expressions in all the stages, and therefore it allows us to estimate the parameters in a computationally efficient way. In Section 6, a simulation study is carried out, which suggests the consistency of estimators even for moderate sample sizes. In Section 7, two real applications from the fields of survival analysis and engineering are provided. Plots and criteria are considered that show that the SMR model fits better than R and SR models, especially on the right tail of the datasets. The simulation study and applications have been carried out by using R programming language. Some final conclusions can be seen in Section 8.

Definition and Properties
In this section, a new extension of R distribution is introduced. Simple expressions (which can be computed easily in many softwares) for its pdf and cdf are obtained, along with its moments, skewness and kurtosis coefficients. It is also proved that this model can be expressed as a scale mixture of distributions. This result is the basis to apply the EM algorithm in Section 5.
Our proposal is based on the Generalized Gamma (GG) distribution introduced by Stacy [15] and whose pdf is given in Definition 1.

Definition 1.
An RV Z follows a three-parameter GG distribution, proposed by Stacy [15], if its pdf is We denote as Z ∼ GG(a, d, p).
Next, the new model is introduced.

Definition 2.
An RV T follows an SMR distribution with parameters σ > 0 and q > 0, if T can be expressed as the ratio of two independent RVs with X ∼ R(σ) and Y ∼ GG(1, q, 2). We use the notation T ∼ SMR(σ, q).
Next, the pdf of T is given.

Remark 1.
A GG(1, q, 2) model has been proposed for Y in (3) in order to get a closed expression for the pdf of T given in (4).
Next, the cumulative distribution function (cdf) of T is obtained.

Proposition 2.
Let T ∼ SMR(σ, q) with σ > 0 and q > 0. Then the cdf of T is Proof. Taking into account (4) and the change of variable u = x 2 2σ + 1, the result is obtained.
Plots of (4) and (6) are given in Figure 1 for σ = 1 and several values of q. It will be seen that √ σ > 0 is a scale parameter and q > 0 is a shape parameter since it affects the skewness and kurtosis coefficients in this model.
The next proposition shows that the SMR model can be expressed as a scale mixture of distributions.
Note that the last integrand corresponds to the pdf of an RV with GG 1 + t 2 2σ 1/2 , q + 2, 2 distribution. Therefore, such integral is 1 and then

Remark 2.
Proposition 3 is a key result in this paper since it allows us to generate random samples of SMR model. It will be also useful for the application of EM algorithm, simulation studies and computation of ML estimators.

Moments
In this subsection, the moments of SMR distribution are obtained. For this, the next lemma will be useful.
Proof. By definition, By making the change of variable u = y 2 , the result is immediate.
In Proposition 4, the noncentral moments of an SMR model are given.
Proof. By using the representation given in (3) and the independence of X and Y From Proposition 4, the explicit expression of the noncentral moments, µ r = E[T r ], for r = 1, 2, 3, 4 and the variance of T ∼ SMR(σ, q), V(T) follow. (8), the first four noncentral moments and the variance of T, say V(T), are given by The next proposition gives us the asymmetry coefficient, β 1 , of an SMR(σ, q) model.
Note that the asymmetry and kurtosis coefficients only depend on the parameter q. Therefore q is a shape parameter since it determines the asymmetry and kurtosis coefficients of this model. In addition, it is of interest to compare both coefficients to other models, such as SR distribution, introduced in Iriarte et al. [10]. Figure 2 shows the plots for the asymmetry, (a), and kurtosis, (b), coefficients for SMR and SR models. We highlight that both coefficients are much higher for the SMR distribution than the ones for the SR model. That means that SMR distribution is more flexible with respect to the asymmetry and kurtosis than SR model.
To conclude, note that (9) and plot (a) in Figure 2 suggest that β 1 takes values in [0.68 ; +∞). On the other hand, (10) and plot (b) in Figure 2 suggest that β 2 approximately takes values in [3.36; +∞). That is, we are dealing with a skew right model, (the right tail is long relative to the left one), with positive kurtosis, i.e., heavy tails.

Lifetime Analysis
Since the SMR distribution is nonnegative and asymmetric, it can be used to model survival time data. In this section, the main features of interest in this field are studied. First the survival and hazard function for an SMR model are given.

Remark 4. From
Note that the intervals where h(t; σ, q) is monotone only depend on the parameter σ, they do not depend on q.
Plots of h(t; σ, q) are given in Figure 3 for σ = 1 and q < 1 in (a) and q > 1 in (b). In both settings, we note that the peak increases with q.

Remark 5.
For λ > 0, the survival function of the SMR model satisfies that Therefore, the survival function of the SMR distribution is regularly varying.

Mean Residual Life
Next the mean residual life is studied. Recall that for a nonnegative RV T, its mean residual life is defined as with S(·) the survival function of T.
It will be proved that this characteristic can be expressed in terms of the survival function of a Pearson Type VII (PTVII) distribution.

Definition 3.
An RV W follows a PTVII distribution with parameters m, c and ξ (see Johnson et al. [17]) if its pdf is given by with w ∈ R, m > 1/2, c > 0 and ξ ∈ R. We denote as W ∼ PTV I I (m, c, ξ).

Proposition 8.
The mean residual life of T ∼ SMR(σ, q), q > 1, can be obtained as where S W denotes the survival function of an RV W ∼ PTV I I q 2 , √ 2σ, 0 .

Proof. Recall that
Note that the integrand in previous expression is, up to a constant, the pdf of a PTV I I q 2 , √ 2σ, 0 distribution. Therefore we can write with S W the survival function of W ∼ PTV I I q 2 , √ 2σ, 0 , q > 1.
In Proposition 9, an explicit expression for (14) is given in terms of the regularized incomplete beta function, defined as This expression is useful from a computational point of view.

Proposition 9.
For q > 1, the mean residual life of T ∼ SMR(σ, q) can be obtained as Proof. Let us consider the integral in (15) and the change of variable u = z 2 2σ By using the relationship and Γ(1/2) = √ π, from (16) and (17) is obtained.

Order Statistics
Given a random sample of size n of T ∼ SMR(σ, q), let us denote by T (j) the jth−order statistics, j ∈ {1, . . . , n}.
In particular, the pdf of the minimum, T (1) , is and the pdf of the maximum, T (n) , is Proof. Since we are dealing with an absolutely continuous model, the pdf of the jth−order statistics is obtained by applying where F and f denote the cdf and pdf of the parent distribution, T ∼ SMR(σ, q) in this case.

Remark 6.
From (18), note that T (1) ∼ SMR(σ, qn). That is, when sampling from an SMR(σ, q) distribution then the minimum is also distributed according to an SMR model.

Entropy
In this section Rényi entropy is obtained for the SMR model. Given T an RV and γ ∈ (0, 1) (1, ∞), the Rényi entropy of T is given by where log = log e denotes the Naperian logarithm.

Inference
In this section, moment and ML estimation methods are studied.

Moment Method Estimators
Let T 1 , . . . , T n be a random sample from T ∼ SMR(σ, q). Let us consider the first and second sample moments, denoted by T = ∑ n i=1 T i n and T 2 = ∑ n i=1 T 2 i n , respectively. Proposition 12. Given T 1 , . . . , T n a random sample from T ∼ SMR(σ, q) with q > 2, the moment method estimators of σ and q areσ where (23) must be solved numerically to obtainq m . Later,q m must be replaced in (22) to getσ m .

Proof. Consider the moment method equations
with µ i the noncentral moments of T given in Corollary 1 for i = 1, 2. σ m given in (22) is obtained from (24). From (22) and (25), we have , which can be rewritten as (23) and must be solved numerically to obtainq m .

ML Estimation
Let T 1 , . . . , T n be a random sample from T ∼ SMR(σ, q). Then the log-likelihood function is Taking partial derivatives in l(σ, q) with respect to σ and q and setting them equal to zero, we get where (26) must be solved numerically to getσ, which must be substituted into (27) to obtainq. Since we do not have explicit expressions for ML estimators, EM algorithm can be implemented to get these estimates. The next section is devoted to reach this aim.

ML Estimation Using EM-Algorithm
The EM-algorithm (Dempster et al. [18]) enables the computationally efficient determination of the ML estimates when iterative procedures are required. Next, details about the implementation of this algorithm for the SMR model are given. The method is based on Proposition 3 and Lemma 2. Lemma 2. Let X ∼ Gamma(k, β) with k > 0 shape parameter and β > 0 rate parameter, that is its pdf is Let T ∼ SMR(σ, q), by applying the hierarchical representation given in Proposition 3, we can write T|Y = y ∼ R(σ/y 2 ) and Y ∼ GG (1, q, 2). The joint pdf of (T, Y) is where Y is a latent variable and the parameter vector is θ = (σ, q) T , θ ∈ R + × R + . Given T 1 , . . . , T n , a random sample of size n from T ∼ SMR(σ, q), let us denote by t = [t 1 , . . . , t n ] the observed data, y = [y 1 , . . . , y n ] the unobserved data and t c = [t , y ] the complete data, that is, the original data t augmented by y.
Let l c (θ|t c ) and Q(θ| θ) = E[l c (θ|t c )|t, θ] the complete log-likelihood function associated with t c and its expected value, respectively. From (28), l c (θ|t c ) is where c is a constant, which does not depend on θ.
In order to get Q(θ| θ) and apply EM-algorithm, we need to calculate E[y 2 i |t i , θ] and E[log(y i )|t i , θ]. From (28) and (2) where C is a constant that does not depend on y i . As (29) is related to the pdf of a Generalized Gamma distribution, specifically, On the other hand, Lemma 2 can be applied to (30) with Taking into account that we have Let Q(θ| θ) = E[l c (θ|t c )|t, θ] be the conditional expectation of the complete log-likelihood function. We have (31) and (32), respectively. Taking first partial derivative with respect to σ, the estimate of σ is given bŷ where t 2 y 2 is the mean of t 2 i y 2 i . By proceeding similarly, the estimate of q iŝ where log(y) is the mean of log(y i ) and Ψ −1 (·) is the inverse of the digamma function. Therefore, the EM algorithm is • M-step: Update the vector of parameters θ = (σ, q) .
• E and M steps are repeated until a suitable convergence is reached.
EM algorithm implementation was carried out by using R software. Three functions were implemented.
In the function used in E step, estimates of y 2 i (k+1) and log(y i ) (k+1) were generated. In the function used in M step, estimates of σ and q were obtained. Later, these steps were repeated by using a function with the convergence criterion. The criterion was to stop when the difference between the successive values obtained is less than a a value fixed in advance, specifically it was

Simulation Study
In this section, the performance of ML estimates for finite sample sizes is studied. It is empirically checked if the proposed estimators satisfy desirable properties, such as asymtotic unbiasedness, asymptotic efficiency, asymptotic normality. Values of the Rayleigh and the Generalized Gamma distributions were generated to get values of the SMR distribution introduced in (4). The EM algorithm was used to compute the estimates of parameters in the SMR model, and their standard errors. This procedure was carried out 10,000 times with sample sizes n = {30, 40, . . . , 190, 200}, and taking as values of the parameters σ = 1 and 10; q = 1, 1.5, 2 and 3.
As summaries of these simulations, the average of estimates, average of standard errors of estimates, average bias, squared root of mean squared error (RMSE) and the empirical coverage probability of confidence intervals for the parameters (with confidence level 0.95) were calculated. From Figures 4-11, average bias, RMSE and empirical coverage probability (in percentages) are plotted for n varying in {30, 40, . . . , 190, 200}. It is observed that if the sample size increases then bias and RMSE decrease, this fact suggests that the estimators are consistent. In addition, we note that the empirical coverage probability (in percentage) approaches to the nominal level (95%) when the sample size increases.

Real Data Illustration
In this section, we present applications of SMR model in two real datasets from the fields of lifetime analysis and reliability. To illustrate its good performance, our proposal is compared to other competing models previously introduced in literature. These are R and SR distributions.

Application to Patients with Bladder Cancer
We consider a dataset corresponding to remission times (in months) of a random sample of 128 bladder cancer patients studied by Lee and Wang [19]. The SMR(σ, q) model is proposed to describe this dataset. The EM algorithm was used to estimate σ and q. Their corresponding standard errors were obtained by using the inverse of the hessian matrix. Table 1 provides the descriptive summaries. These are: the sample size n, the sample mean T, the standard deviation S, the sample asymmetry coefficient √ b 1 , the sample kurtosis coefficient b 2 , the sample minimum min(T) and the sample maximum max(T). We highlight that we are dealing with positive, positive skewed data with a really high kurtosis, b 2 = 18.483.  Table 2 provides the ML estimates of parameters and their standard errors in parentheses for SMR, R and SR models. These models are compared by using the Akaike Information Criterion (AIC), Akaike [20] and the Bayesian Information Criterion (BIC), introduced in Schwarz [21]. Both criteria reveal that the SMR model provides a better fit to this data since their values are less than the others.  Figure 12 depicts the histogram of this dataset along with the estimated pdf for R, SR and SMR distributions. Moreover, Figure 13 provides the corresponding QQ-plots for these models. Note that the QQ-plots for R and SR models show that these distributions does not provide a good fit on the right tail of these data. However the QQ-plot for the SMR distribution does not exhibit this drawback.
All these plots and summaries suggest that the SMR model provides a better fit to this dataset than the other models under consideration.

Application to Number of Failures of an Air Conditioning System
The second dataset under consideration was reported by Proschan [22]. This dataset consists of the times between failures of the air-conditioning equipment in 13 Boeing 720 aircrafts. A similar outline to that given in previous applications has been followed. Table 3 shows the basic statistical summaries. We highlight that, again, a high value of kurtosis is observed b 2 = 8.023.  In Table 4, results related to the fit of R, SR and SMR models are given. Note that the SMR model provides a better fit over the others since its AIC and BIC values are less than those in R and SR models.
In Figure 14, the histogram and the estimated pdfs are given. Figure 15 provides the corresponding QQ-plots. Note that the SMR model provides a better fit, especially for the right tail of the dataset.  Figure 14. Density adjusted for the number of failures of an air conditioning system in the R, SR and SMR distributions.

Conclusions
In this paper, the SMR distribution is introduced. As strengths of our proposal, we highlight that it is more flexible as for its kurtosis coefficient and hazard function than the Rayleigh and slashed Rayleigh distribution. Closed expressions are given for its main characteristics: pdf, cdf, moments and related coefficients. Since we are dealing with a positive and skew right model, it can be of interest for modeling survival time and reliability data. For this reason, features of interest in this field such as the hazard function, mean residual life and order statistics are studied. A closed expression is obtained for the Rényi entropy. The special interest is the EM estimation algorithm based on the hierarchical representation proposed for this model. A simulation study is included, which suggests that the ML estimators are consistent even for moderate sample sizes. Two real applications are included, in which AIC, BIC and QQ-plots show that our proposal provides a better fit than R and SR distributions, especially on the right tail of these datasets.

Conflicts of Interest:
The authors declare no conflict of interest.