Mathematics
  • Article
  • Open Access

3 January 2024

Scale Mixture of Exponential Distribution with an Application

1
Departamento de Estadística y Ciencias de Datos, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
2
Departamento de Estadística, Facultad de Ciencias, Universidad del Bío-Bío, Concepción 4081112, Chile
3
Department of Quantitative Methods in Economics and TIDES Institute, University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain
4
Departamento de Ciencias Matemáticas y Físicas, Facultad de Ingeniería, Universidad Católica de Temuco, Temuco 4780000, Chile
This article belongs to the Special Issue Computational Statistical Methods and Extreme Value Theory

Abstract

This article presents an extended distribution that builds upon the exponential distribution. This extension is based on a scale mixture between the exponential and beta distributions. By utilizing this approach, we obtain a distribution that offers increased flexibility in terms of the kurtosis coefficient. We explore the general density, properties, moments, asymmetry, and kurtosis coefficients of this distribution. Statistical inference is performed using both the moments and maximum likelihood methods. To show the performance of this new model, it is applied to a real dataset with atypical observations. The results indicate that the new model outperforms two other extensions of the exponential distribution.

1. Introduction

A scale mixture is a statistical model that combines two or more probability distributions to generate a new distribution. In a scale mixture, one distribution is used to determine the scale parameter of another distribution. For example, in a normal scale mixture, the scale parameter of a normal distribution is determined by another distribution, such as a gamma distribution (see Andrews and Mallows []). This allows for greater flexibility in modeling data that may have varying levels of variability.
Scale mixtures are commonly used in Bayesian statistics, where the scale parameter is often treated as a random variable (Fernández and Steel []). They can also be used in other areas of statistics, such as in the modeling of heavy-tailed distributions. The slash methodology is used for distributions that arise from a scale mixture. The slash distribution is a symmetric extension of the standard normal distribution; it is represented as the quotient of two independent random variables, one standard normal and the other Beta(q, 1). Thus, we say that W has a slash distribution if
W = X/Y,
where X ~ N(0, 1), Y ~ Beta(q, 1), q > 0, and X is independent of Y (see Johnson et al. []). This distribution has heavier tails than the normal distribution, i.e., it has greater kurtosis. The properties and inference of this family are discussed in Rogers and Tukey [], Mosteller and Tukey [], and Kafadar []. Wang and Genton [] offered a multivariate version of the slash distribution and a multivariate skew version. Various works have used the slash methodology to extend distributions with positive support, such as Olmos et al. [], Rivera et al. [], and Castillo et al. [], among others.
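This tail-thickening effect is easy to see by simulation. The sketch below (Python, illustrative only; it is not part of the original article) draws slash variates as X/Y, using the fact that Y = U^(1/q) ~ Beta(q, 1) for U uniform, and compares tail mass with the standard normal:

```python
import random

random.seed(7)

def slash_sample(q, n):
    # W = X / Y with X ~ N(0, 1) independent of Y = U^(1/q) ~ Beta(q, 1)
    return [random.gauss(0.0, 1.0) / (random.random() ** (1.0 / q))
            for _ in range(n)]

n = 100_000
w = slash_sample(1.0, n)                        # canonical slash (q = 1)
z = [random.gauss(0.0, 1.0) for _ in range(n)]  # standard normal

tail_w = sum(abs(v) > 3.0 for v in w) / n
tail_z = sum(abs(v) > 3.0 for v in z) / n
```

With this seed, the slash sample places far more mass beyond |3| than the normal sample, which is the heavier-tail behavior described above.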
Overall, scale mixtures provide a flexible framework for modeling data with varying levels of variability, allowing for more accurate and robust statistical analysis.
The principal objective of this article is to introduce a new extension of the exponential (E) distribution, with probability density function (pdf) given by f_X(x; λ) = λ exp(-λx), λ, x > 0, based on a scale mixture; this new distribution has a more flexible kurtosis coefficient and can thus be used for modeling atypical data. Some extensions of the exponential distribution are the Weibull distribution and the generalized exponential (GE) distribution studied by Gupta and Kundu [,,]; the latter is a particular case of the exponentiated Weibull distribution, with zero location parameter, introduced by Mudholkar et al. [].
This article is organized as follows. In Section 2, we give the representation of this new distribution and derive the density, basic properties, moments, and the asymmetry and kurtosis coefficients. In Section 3, we perform inference using estimation by moments and maximum likelihood (ML) with the EM algorithm. In Section 4, we show an application to a real dataset. The code needed to reproduce the results is available in Appendix A and, in the case of the EM algorithm, as Supplementary Material.

2. Density Function and Properties

In this section, we introduce the density, properties, and graphs of the new distribution.

2.1. Scale Mixture

Definition 1. 
We say that the random variable Z has a pdf given by
f_Z(z; λ, q) = λ e^{-2λz} 1F1(q, 2q+1; 2λz), z > 0,    (1)
where λ > 0 is a scale parameter, q > 0 is a shape parameter, and 1F1 is the confluent hypergeometric function (see Abramowitz and Stegun []), which is given by
1F1(a, b; x) = [Γ(b)/(Γ(a)Γ(b-a))] ∫_0^1 v^{a-1} (1-v)^{b-a-1} e^{xv} dv, b > a > 0,    (2)
where Γ(·) is the gamma function. We call Z a scale mixture of the exponential (SME) distribution and write Z ~ SME(λ, q).
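As a numerical sanity check (not part of the original article), the density in (1) can be evaluated with a plain power-series implementation of 1F1; the resulting function integrates to one, as a pdf must. A minimal Python sketch:

```python
import math

def hyp1f1(a, b, x, terms=400):
    # Kummer's confluent hypergeometric 1F1(a, b; x) via its power series
    # sum over n of (a)_n / (b)_n * x^n / n!
    s, t = 1.0, 1.0
    for n in range(terms):
        t *= (a + n) / (b + n) * x / (n + 1)
        s += t
    return s

def dsme(z, lam, q):
    # SME(lambda, q) density: lambda * e^(-2*lambda*z) * 1F1(q, 2q+1; 2*lambda*z)
    return lam * math.exp(-2.0 * lam * z) * hyp1f1(q, 2.0 * q + 1.0, 2.0 * lam * z)

# trapezoid rule on [0, 40]; for q = 5 the tail mass beyond 40 is negligible
h = 0.01
total = sum(dsme(k * h, 1.0, 5.0) * (h if 0 < k < 4000 else h / 2.0)
            for k in range(4001))
```

The series form is adequate here because the argument 2λz stays moderate over the integration range; a production implementation would use a library routine instead.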
The following proposition shows that the SME distribution arises as a scale mixture of the E and Beta distributions.
Proposition 1. 
If Z | X = x ~ E(2λx) and X ~ Beta(q, q), then Z ~ SME(λ, q).
Proof. 
The marginal pdf of Z is given by
f_Z(z; λ, q) = ∫_0^1 f_{Z|X}(z) f_X(x) dx
= ∫_0^1 2λx e^{-2λxz} [1/B(q, q)] x^{q-1} (1-x)^{q-1} dx
= [2λ/B(q, q)] ∫_0^1 x^q (1-x)^{q-1} e^{-2λzx} dx
= [2λ e^{-2λz}/B(q, q)] ∫_0^1 x^q (1-x)^{q-1} e^{2λz(1-x)} dx,
where B(·, ·) is the beta function; making the transformation u = 1 - x and using the confluent hypergeometric function given in (2), the result is obtained.    □
Remark 1. 
In this scale mixture of the exponential distribution, we use the Beta distribution, motivated by the representation of the slash distribution, since this generates distributions with greater kurtosis.
The following proposition shows that the SME distribution can also be obtained as the quotient of two independent random variables, i.e., via the slash methodology.
Proposition 2. 
Let X ~ E(λ) and Y ~ Beta(q, q) be independent. Then, Z = X/(2Y) ~ SME(λ, q).
Proof. 
Using the stochastic representation Z = X/(2Y) and procedures based on the Jacobian method, we write
Z = X/(2Y), V = Y  ⟺  X = 2ZV, Y = V,
J = | ∂X/∂Z  ∂X/∂V ; ∂Y/∂Z  ∂Y/∂V | = | 2v  2z ; 0  1 | = 2v,
f_{Z,V}(z, v) = |J| f_{X,Y}(2zv, v) = 2v f_X(2zv) f_Y(v), 0 < v < 1, z > 0.
Hence, marginalizing with respect to variable V, we arrive at the density of Z, which is given by
f_Z(z; λ, q) = [2λ/B(q, q)] ∫_0^1 v^q (1-v)^{q-1} e^{-2λzv} dv = [2λ e^{-2λz}/B(q, q)] ∫_0^1 v^q (1-v)^{q-1} e^{2λz(1-v)} dv.
The result follows by making the transformation u = 1 v and using the confluent hypergeometric function given in (2).    □
In Figure 1, we show the pdf of the SME distribution for two values of parameter q with λ = 3, and compare it with the E(3) distribution.
Figure 1. Densities SME(3, 1) (solid line), SME(3, 5) (dashed line), and E(3) (dotted line).
We perform a brief comparison illustrating that the tails of the SME distribution are heavier than those of the E distribution.
Table 1 shows P ( Z > z ) for different values of z in the distributions mentioned. It is clear that the SME distribution has much heavier tails than the E distribution.
Table 1. Tails comparison for different SME and E distributions.

2.2. Properties

In this subsection, we study some properties of the SME distribution.

2.3. Cumulative Distribution Function

The following proposition shows the cdf of the SME distribution, which is generated using the representation given in (1).
Proposition 3. 
Let Z S M E ( λ , q ) . Then, the cdf of Z is given by
F_Z(z; λ, q) = 1 - e^{-2λz} 1F1(q, 2q; 2λz), z > 0,
where λ > 0 and q > 0 .
Proof. 
Calculating the cdf of Z directly, we have
F_Z(z; λ, q) = ∫_0^z λ e^{-2λt} 1F1(q, 2q+1; 2λt) dt
= [2λ/B(q, q)] ∫_0^1 v^q (1-v)^{q-1} ∫_0^z e^{-2λtv} dt dv
= [1/B(q, q)] ∫_0^1 v^{q-1} (1-v)^{q-1} (1 - e^{-2λzv}) dv
= 1 - [1/B(q, q)] ∫_0^1 v^{q-1} (1-v)^{q-1} e^{-2λzv} dv;
the result follows using the confluent hypergeometric function given in (2).    □
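The closed-form cdf makes tail probabilities such as those in Table 1 easy to compute. The sketch below (Python, illustrative, not from the article; hyp1f1 is a plain power-series implementation) evaluates F_Z and compares the SME and E survival functions:

```python
import math

def hyp1f1(a, b, x, terms=400):
    # power series for Kummer's 1F1(a, b; x)
    s, t = 1.0, 1.0
    for n in range(terms):
        t *= (a + n) / (b + n) * x / (n + 1)
        s += t
    return s

def psme(z, lam, q):
    # SME cdf from Proposition 3: F(z) = 1 - e^(-2*lambda*z) * 1F1(q, 2q; 2*lambda*z)
    return 1.0 - math.exp(-2.0 * lam * z) * hyp1f1(q, 2.0 * q, 2.0 * lam * z)

surv_sme = 1.0 - psme(5.0, 1.0, 5.0)  # P(Z > 5) under SME(1, 5)
surv_e = math.exp(-5.0)               # P(Z > 5) under E(1)
```

For these parameter values the SME survival probability is substantially larger than the exponential one, in line with the tails comparison of Table 1.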

2.4. Reliability Analysis

The reliability function r(t) and hazard function h(t) of the SME distribution, which are obtained using the representation given in (1), are given in the following corollary.
Corollary 1. 
Let T S M E ( λ , q ) . Then, the r ( t ) and h ( t ) of T are given by
1. 
r(t) = e^{-2λt} 1F1(q, 2q; 2λt),
2. 
h(t) = λ 1F1(q, 2q+1; 2λt) / 1F1(q, 2q; 2λt),
where λ > 0 and q > 0 .
Figure 2 shows that the hazard function of the SME distribution is monotonically decreasing; it is constant only in the limit case, as parameter q tends to infinity, where it coincides with the hazard function of the E distribution (which is constant and equal to λ).
Figure 2. Hazard function SME(1, 1) (solid line), SME(1, 5) (dashed line), SME(1,10) (dotted line), and SME(1,∞) = E(1) (horizontal dashed line).
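Corollary 1 can be explored numerically. The sketch below (Python, illustrative, not from the article) confirms that h(0) = λ and that the hazard decreases, consistent with Figure 2:

```python
def hyp1f1(a, b, x, terms=400):
    # power series for Kummer's 1F1(a, b; x)
    s, t = 1.0, 1.0
    for n in range(terms):
        t *= (a + n) / (b + n) * x / (n + 1)
        s += t
    return s

def hsme(t, lam, q):
    # hazard from Corollary 1: lambda * 1F1(q, 2q+1; 2*lambda*t) / 1F1(q, 2q; 2*lambda*t)
    return lam * hyp1f1(q, 2 * q + 1, 2 * lam * t) / hyp1f1(q, 2 * q, 2 * lam * t)

rates = [hsme(t, 1.0, 5.0) for t in (0.5, 1.0, 2.0, 5.0, 10.0)]
```

That the hazard decreases is expected on general grounds: the SME model is a mixture of exponentials (Proposition 1), and mixtures of constant-hazard distributions have decreasing hazard.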

2.5. Order Statistics

Let Z_1, Z_2, ..., Z_n be a random sample from Equation (1), and let Z_{1:n}, Z_{2:n}, ..., Z_{n:n} denote the corresponding order statistics. It is well known that the pdf of the k-th order statistic, i.e., Y = Z_{k:n}, is given by
f_Y(y) = [n!/((k-1)!(n-k)!)] F_Z^{k-1}(y) [1 - F_Z(y)]^{n-k} f_Z(y)
= [n! λ e^{-2λ(n-k+1)y}/((k-1)!(n-k)!)] [1 - e^{-2λy} 1F1(q, 2q; 2λy)]^{k-1} [1F1(q, 2q; 2λy)]^{n-k} 1F1(q, 2q+1; 2λy).
Therefore, the pdf of the largest order statistic Z ( n ) = Z n : n is given by
f_{Z_(n)}(y) = n λ e^{-2λy} [1 - e^{-2λy} 1F1(q, 2q; 2λy)]^{n-1} 1F1(q, 2q+1; 2λy),
and the pdf of the smallest order statistic Z ( 1 ) = Z 1 : n is given by
f_{Z_(1)}(y) = n λ e^{-2nλy} [1F1(q, 2q; 2λy)]^{n-1} 1F1(q, 2q+1; 2λy).
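As a check on these formulas (again an illustrative Python sketch with a plain series implementation of 1F1, not part of the article), the density of the smallest order statistic integrates to one:

```python
import math

def hyp1f1(a, b, x, terms=400):
    # power series for Kummer's 1F1(a, b; x)
    s, t = 1.0, 1.0
    for n in range(terms):
        t *= (a + n) / (b + n) * x / (n + 1)
        s += t
    return s

def pdf_min(y, lam, q, n):
    # density of Z_(1) for a sample of size n:
    # n * lam * e^(-2n*lam*y) * 1F1(q, 2q; 2*lam*y)^(n-1) * 1F1(q, 2q+1; 2*lam*y)
    return (n * lam * math.exp(-2.0 * n * lam * y)
            * hyp1f1(q, 2 * q, 2 * lam * y) ** (n - 1)
            * hyp1f1(q, 2 * q + 1, 2 * lam * y))

# trapezoid rule on [0, 30]; the tail of the minimum decays fast for q = 5, n = 3
h = 0.01
total = sum(pdf_min(k * h, 1.0, 5.0, 3) * (h if 0 < k < 3000 else h / 2.0)
            for k in range(3001))
```

At y = 0 the density equals nλ, since both hypergeometric factors are 1 there.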
The following proposition shows that, when parameter q tends to infinity in the SME distribution, it converges to the E( λ ) distribution.
Proposition 4. 
Let Z ~ SME(λ, q). If q → ∞, then Z converges in law to a random variable Z ~ E(λ).
Proof. 
Let Z ~ SME(λ, q) with Z = X/(2Y), where X ~ E(λ) and Y ~ Beta(q, q).
We study the convergence in law of Z. Since Y ~ Beta(q, q), we have E[Y] = 1/2 and Var[Y] = 1/(4(2q+1)). Applying Chebyshev's inequality to Y, for all ε > 0,
P(|Y - 1/2| > ε) ≤ Var(Y)/ε² = 1/(4ε²(2q+1)).    (3)
If q → ∞, the right-hand side of (3) tends to zero; i.e., Y converges in probability to 1/2, and then
Y →_P 1/2 and 2Y →_P 1 as q → ∞.
Since X ~ E(λ), applying Slutsky's lemma to Z = X/(2Y), we have
Z →_L X ~ E(λ) as q → ∞;
that is, for increasing values of q, Z converges in law to an E(λ) distribution.    □

2.6. Moment-Generating Function and Moments

The following proposition shows the moment-generating function M Z ( t ) of the SME distribution, which is generated using the representation given in (1).
Proposition 5. 
Let Z S M E ( λ , q ) . Then, the moment-generating function of Z is given by
M_Z(t) = -(λ/t) 2F1(1, q+1; 2q+1; 2λ/t),
where λ > 0 and q > 0 .
Proof. 
Calculating M_Z(t) directly, we obtain
M_Z(t) = λ ∫_0^1 [2/B(q, q)] x^q (1-x)^{q-1} ∫_0^∞ e^{tz} e^{-2λxz} dz dx
= λ ∫_0^1 [2/B(q, q)] x^q (1-x)^{q-1} (2λx - t)^{-1} dx
= -(λ/t) ∫_0^1 [2/B(q, q)] x^q (1-x)^{q-1} [1 - (2λ/t)x]^{-1} dx,
and using the Gauss hypergeometric function 2F1, which is given by
2F1(a, b; c; x) = [Γ(c)/(Γ(b)Γ(c-b))] ∫_0^1 v^{b-1} (1-v)^{c-b-1} (1-xv)^{-a} dv,
where c > a + b or a + b - 1 < c ≤ a + b (for details on this function, see Abramowitz and Stegun []), this result is obtained.    □
Using Proposition 1, we can calculate the r -th distributional moment.
Proposition 6. 
Let Z ~ SME(λ, q). Then, for r = 1, 2, ... and q > r, the r-th distributional moment is given by
μ_r = E(Z^r) = Γ(r+1) B_r / (2^r λ^r B_0),    (5)
where λ > 0 is a scale parameter, q > 0 is a shape parameter, and B_i = B(q-i, q) = Γ(q-i)Γ(q)/Γ(2q-i).
Proof. 
Using the representation given in Proposition 1, it follows that
μ_r = E(Z^r) = E[E(Z^r | X)] = E[Γ(r+1)/(2λX)^r] = [Γ(r+1)/(2^r λ^r)] E(X^{-r}),
where E(X^{-r}) = B_r/B_0 are the inverse moments of the Beta(q, q) distribution.    □
Remark 2. 
The moment μ_r exists for every real r whenever r+1 ∉ Z⁻, q-r ∉ Z⁻, and 2q-r ∉ Z⁻, where Z⁻ denotes the negative integers.
Corollary 2. 
Let Z S M E ( λ , q ) . Then, the mean and variance are given, respectively, by
E(Z) = (2q-1)/(2λ(q-1)), q > 1, and Var(Z) = (2q-1)(2q²-3q+2)/(4λ²(q-2)(q-1)²), q > 2.
Corollary 3. 
Let Z S M E ( λ , q ) . Then, the asymmetry ( β 1 ) and kurtosis ( β 2 ) coefficients, for q > 3 and q > 4 , respectively, are
β_1 = 2(3B_0²B_3 - 3B_0B_1B_2 + B_1³)/(2B_0B_2 - B_1²)^{3/2}
and
β_2 = 3(8B_0³B_4 - 8B_0²B_1B_3 + 4B_0B_1²B_2 - B_1⁴)/(2B_0B_2 - B_1²)².
Remark 3. 
Figure 3 shows that when parameter q approaches 3, the asymmetry coefficient tends to infinity; in the same way, when parameter q approaches 4, the kurtosis coefficient tends to infinity. We observe that β_1 ~ (q-3)^{-1.5} as q → 3⁺ and β_2 ~ (q-4)^{-2} as q → 4⁺, where ~ denotes asymptotic equivalence. This shows the flexibility of the SME distribution in the asymmetry and kurtosis coefficients.
Figure 3. Plots of the asymmetry and kurtosis coefficients of the SME distribution.
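Since β_1 and β_2 depend only on q (λ cancels in both ratios), they can be computed stably from the inverse moments of the Beta(q, q) distribution, E[X^{-r}] = B_r/B_0 = Γ(q-r)Γ(2q)/(Γ(q)Γ(2q-r)), using log-gamma. The Python sketch below (illustrative, not from the article; algebraically it is Corollary 3 after dividing numerator and denominator by powers of B_0) reproduces the behavior in Figure 3 and the convergence to the exponential values β_1 = 2 and β_2 = 9 as q grows:

```python
import math

def inv_moment(q, r):
    # E[X^(-r)] for X ~ Beta(q, q): Gamma(q-r)Gamma(2q) / (Gamma(q)Gamma(2q-r)), q > r
    return math.exp(math.lgamma(q - r) + math.lgamma(2 * q)
                    - math.lgamma(q) - math.lgamma(2 * q - r))

def shape_coeffs(q):
    # asymmetry beta1 (q > 3) and kurtosis beta2 (q > 4) of SME(lambda, q);
    # raw moments are taken with 2*lambda = 1, which cancels in both ratios
    mu = [1.0] + [math.gamma(r + 1) * inv_moment(q, r) for r in (1, 2, 3, 4)]
    var = mu[2] - mu[1] ** 2
    c3 = mu[3] - 3 * mu[1] * mu[2] + 2 * mu[1] ** 3
    c4 = mu[4] - 4 * mu[1] * mu[3] + 6 * mu[1] ** 2 * mu[2] - 3 * mu[1] ** 4
    return c3 / var ** 1.5, c4 / var ** 2
```

Both coefficients decrease toward the exponential limits as q grows, matching Proposition 4.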

3. Inference

In this section, the moment and ML estimators for the SME distribution are discussed.

3.1. Moment Estimators

Proposition 7. 
Let Z_1, ..., Z_n be a random sample of size n from the Z ~ SME(λ, q) distribution. Then, the moment estimators (θ̂_M) of θ = (λ, q), for q > 2, are given by
λ̂_M = (2q̂_M - 1)/(2 Z̄ (q̂_M - 1)),    (6)
q̂_M = [5m₂ - 8Z̄² + √(m₂(9m₂ - 16Z̄²))] / [4(m₂ - 2Z̄²)],    (7)
where Z̄ is the sample mean and m₂ is the sample mean of the squared observations. We first compute q̂_M from (7) and then substitute this value into (6) to obtain λ̂_M.
Proof. 
From (5), and considering the first two equations of the moment method, we have
Z̄ = (2q-1)/(2λ(q-1)) and m₂ = (2q-1)/(λ²(q-2)),
where m₂ denotes the sample mean of the squared observations.
The result is obtained by solving for λ and q.    □

3.2. ML Estimators

Given an observed sample Z_1, ..., Z_n from the SME(λ, q) distribution, the log-likelihood function for the parameters λ and q given z = (z_1, ..., z_n) can be written as
l(λ, q) = n log(λ) - 2λ Σ_{i=1}^n z_i + Σ_{i=1}^n log 1F1(q, 2q+1; 2λz_i).    (8)
The ML estimators are obtained by maximizing the log-likelihood function given in (8). Partially differentiating the log-likelihood function with respect to each parameter and equating to zero, the following normal equations are obtained as
n/λ - 2 Σ_{i=1}^n z_i + Σ_{i=1}^n H_1(z_i; λ, q)/H(z_i; λ, q) = 0,    (9)
Σ_{i=1}^n H_2(z_i; λ, q)/H(z_i; λ, q) = 0,    (10)
where H(z_i; λ, q) = 1F1(q, 2q+1; 2λz_i), H_1(z_i; λ, q) = ∂H(z_i; λ, q)/∂λ, and H_2(z_i; λ, q) = ∂H(z_i; λ, q)/∂q.
Numerical methods, such as the Newton–Raphson algorithm, can be employed to find solutions for Equations (9) and (10). Another approach to obtain the maximum likelihood estimates is by maximizing (8) using the “optim” subroutine in the R software package (R version 4.3.2) []. The EM algorithm is used as an alternative approach to obtain the ML estimators in the next subsection.
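A detail worth noting: the Appendix code evaluates the likelihood as λ·1F1(q+1, 2q+1; -2λz) rather than λ e^{-2λz} 1F1(q, 2q+1; 2λz). The two agree by Kummer's transformation, 1F1(a, b; x) = e^x 1F1(b-a, b; -x). A quick numerical check (Python, illustrative, not from the article):

```python
import math

def hyp1f1(a, b, x, terms=400):
    # power series for Kummer's 1F1(a, b; x); adequate for moderate |x|
    s, t = 1.0, 1.0
    for n in range(terms):
        t *= (a + n) / (b + n) * x / (n + 1)
        s += t
    return s

def pdf_direct(z, lam, q):
    # density exactly as written in (1)
    return lam * math.exp(-2 * lam * z) * hyp1f1(q, 2 * q + 1, 2 * lam * z)

def pdf_kummer(z, lam, q):
    # same density after Kummer's transformation (the form used in the Appendix code)
    return lam * hyp1f1(q + 1, 2 * q + 1, -2 * lam * z)
```

The transformed form avoids multiplying a decaying exponential by a growing hypergeometric factor, which is numerically convenient in the likelihood.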

3.3. EM Algorithm

The iterative method for finding the ML estimators based on the EM algorithm can be applied using the stochastic representation of the SME model provided in Proposition 1 (see Dempster et al. []). In order to simplify the estimation process, latent variables X 1 , , X n are introduced through a hierarchical representation of the SME model.
Z_i | X_i = x_i ~ E(2λx_i) and X_i ~ Beta(q, q), i = 1, ..., n.
Hence, the complete likelihood function for θ = ( λ , q ) can be expressed as
l_c(θ) = n log(2λ) - 2λ Σ_{i=1}^n z_i x_i - n log B(q, q) + q (Σ_{i=1}^n log x_i + Σ_{i=1}^n log(1-x_i)) + c,
where c collects the terms that do not depend on θ.
Let x̂_i = E(X_i | Z_i = z_i), û_i = E(log X_i | Z_i = z_i), and v̂_i = E(log(1-X_i) | Z_i = z_i). Such expectations can be computed numerically, considering that
f(x_i | Z_i = z_i) ∝ x_i^q (1-x_i)^{q-1} e^{-2λz_i x_i}, i = 1, ..., n,
i.e., X_i | Z_i = z_i ~ CH(q+1, q, 2λz_i), where CH is the confluent hypergeometric distribution introduced by Gordy []. Then, x̂_i = [(q+1)/(2q+1)] 1F1(q+2, 2q+2; -2λz_i)/1F1(q+1, 2q+1; -2λz_i). With these definitions, the expected value of the complete log-likelihood function given the observed data is
Q(θ | θ̂^(k)) = n log(2λ) - 2λ Σ_{i=1}^n x̂_i^(k) z_i - n log B(q, q) + q (Σ_{i=1}^n û_i^(k) + Σ_{i=1}^n v̂_i^(k)).
Therefore, the EM algorithm to estimate vector θ is given as follows:
  • E-step: For i = 1 , , n , use θ ^ ( k 1 ) , the estimate of θ at the ( k 1 ) -th iteration of the algorithm, to compute
x̂_i^(k) = [(q̂^(k-1)+1)/(2q̂^(k-1)+1)] · 1F1(q̂^(k-1)+2, 2q̂^(k-1)+2; -2λ̂^(k-1) z_i)/1F1(q̂^(k-1)+1, 2q̂^(k-1)+1; -2λ̂^(k-1) z_i), û_i^(k) = D_i^{10(k)}, and v̂_i^(k) = D_i^{01(k)},
    where
D_i^{ab(k)} = ∫_0^1 (log x_i)^a (log(1-x_i))^b g(x_i | θ̂^(k-1)) dx_i,
    and g ( · θ ^ ( k 1 ) ) corresponds to the pdf of the C H ( q ^ ( k 1 ) + 1 , q ^ ( k 1 ) , 2 λ ^ ( k 1 ) z i ) model.
  • M1-step: Update λ ^ ( k ) as
λ̂^(k) = n / (2 Σ_{i=1}^n z_i x̂_i^(k)).
  • M2-step: Update q ^ ( k ) as the solution for the non-linear equation
ψ(q) - ψ(2q) = (1/2)(ū^(k) + v̄^(k)),
where ψ(·) is the digamma function and ū^(k) and v̄^(k) denote the means of û_1, û_2, ..., û_n and v̂_1, v̂_2, ..., v̂_n evaluated at the k-th step, respectively.
The E-step, M1-step, and M2-step are repeated until convergence is obtained, for instance, until the maximum distance between the estimates obtained in two consecutive iterations is less than a specified value. Codes for the EM algorithm are available as Supplementary Material.
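The M2-step requires only a univariate root-finder. Since ψ(q) - ψ(2q) is increasing in q, bisection is enough. The sketch below is illustrative Python, not from the article; the digamma routine (recurrence shift plus asymptotic expansion) is supplied because the Python standard library has none:

```python
import math

def digamma(x):
    # shift x above 6 via psi(x+1) = psi(x) + 1/x, then use the
    # asymptotic expansion psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + ...
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    return (r + math.log(x) - 0.5 * inv
            - inv ** 2 * (1.0 / 12 - inv ** 2 * (1.0 / 120 - inv ** 2 / 252)))

def solve_m2(rhs, lo=1e-6, hi=1e6, iters=200):
    # bisection for psi(q) - psi(2q) = rhs; the left-hand side increases in q
    g = lambda q: digamma(q) - digamma(2.0 * q)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a consistency check, feeding the solver the value ψ(3) - ψ(6) should return q = 3.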

3.4. Observed Information Matrix

Let Z_1, ..., Z_n be a random sample from the SME(λ, q) distribution. The observed information matrix is given by
I_n(λ, q) = - ( ∂²l(λ,q)/∂λ²   ∂²l(λ,q)/∂λ∂q ; ∂²l(λ,q)/∂q∂λ   ∂²l(λ,q)/∂q² ),
such that
∂²l(λ,q)/∂λ² = -n/λ² + Σ_{i=1}^n [H_3(z_i; λ, q) H(z_i; λ, q) - H_1²(z_i; λ, q)] / H²(z_i; λ, q),
∂²l(λ,q)/∂λ∂q = Σ_{i=1}^n [H_4(z_i; λ, q) H(z_i; λ, q) - H_1(z_i; λ, q) H_2(z_i; λ, q)] / H²(z_i; λ, q),
∂²l(λ,q)/∂q∂λ = Σ_{i=1}^n [H_5(z_i; λ, q) H(z_i; λ, q) - H_2(z_i; λ, q) H_1(z_i; λ, q)] / H²(z_i; λ, q),
∂²l(λ,q)/∂q² = Σ_{i=1}^n [H_6(z_i; λ, q) H(z_i; λ, q) - H_2²(z_i; λ, q)] / H²(z_i; λ, q),
where H_3(z_i; λ, q) = ∂H_1(z_i; λ, q)/∂λ, H_4(z_i; λ, q) = ∂H_1(z_i; λ, q)/∂q, H_5(z_i; λ, q) = ∂H_2(z_i; λ, q)/∂λ, and H_6(z_i; λ, q) = ∂H_2(z_i; λ, q)/∂q.

3.5. Simulation Study

To evaluate the effectiveness of the proposed approach, we conducted a simulation study to assess the performance of the estimation procedure for the parameters λ and q in the SME model. The study involved simulating 1000 samples from the SME model with three different sample sizes: n = 50 , 100 , and 200. The objective of the simulation was to analyze the behavior of the ML estimators for the parameters. The simulation utilized Algorithm 1 to generate samples from the SME model.
Algorithm 1 Algorithm to simulate values from the Z ~ SME(λ, q) distribution.
1: Generate U ~ U(0, 1).
2: Compute X = -log(U).
3: Generate W ~ Beta(q, q).
4: Compute Z = X/(2λW).
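Algorithm 1 is straightforward to reproduce. The Python sketch below (illustrative, stdlib only, not from the article; Beta draws are built from two gamma variates) generates SME samples and checks the sample mean against Corollary 2:

```python
import math
import random

random.seed(2024)

def rbeta(a, b):
    # Beta(a, b) as G_a / (G_a + G_b) with independent gamma variates
    ga = random.gammavariate(a, 1.0)
    gb = random.gammavariate(b, 1.0)
    return ga / (ga + gb)

def rsme(lam, q):
    # Algorithm 1: X = -log(U) ~ E(1), W ~ Beta(q, q), Z = X / (2*lam*W)
    x = -math.log(1.0 - random.random())
    return x / (2.0 * lam * rbeta(q, q))

lam, q, n = 1.0, 5.0, 200_000
mean_hat = sum(rsme(lam, q) for _ in range(n)) / n
# theoretical mean (2q - 1) / (2*lam*(q - 1)) = 1.125 for these values
```

Using -log(1 - U) rather than -log(U) avoids evaluating log at zero while drawing the same E(1) variate.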
The ML estimates were calculated using the EM algorithm for each generated sample. The mean bias (Bias), relative bias (Relat. Bias), standard errors (SEs), and root mean squared error (RMSE) are shown in Table 2. Based on the table, it can be concluded that the ML estimates are stable. The bias is reasonable and decreases as the sample size increases. Additionally, the standard errors and the root mean squared error become closer as the sample size increases, indicating that the standard errors of the estimators are estimated accurately. Moreover, the coverage probability (CP) converges to the nominal value of 95%, suggesting that the normal approximation to the asymptotic distribution of the ML estimators in the SME model is reasonable, even with moderate sample sizes.
Table 2. ML estimations for parameters λ and q of the SME distribution.

4. Application

In this section, we present an application to a real dataset and compare the fits of the Weibull, GE, and SME distributions. The pdf of the GE distribution is given next.
A random variable X has a GE distribution with scale parameter λ and shape parameter q if its pdf is given by
f(x; λ, q) = qλ (1 - e^{-λx})^{q-1} e^{-λx}, x > 0,
with λ > 0 and q > 0 . We denote this by X G E ( λ , q ) .
This dataset consists of the repair times (in hours) of a sample of 46 airborne communications receivers, available in Devore [] (p. 44). The data are as follows:
0.2 0.3 0.5 0.5 0.5 0.6 0.6 0.7 0.7 0.7 0.8 0.8 0.8 1.0 1.0 1.0 1.0 1.1 1.3 1.5 1.5 1.5 1.5 2.0 2.0 2.2 2.5 2.7 3.0 3.0 3.3 3.3 4.0 4.0 4.5 4.7 5.0 5.4 5.4 7.0 7.5 8.8 9.0 10.3 22.0 24.5
The codes for this application are available in the Appendix A.
Table 3 shows a descriptive summary of the data, where b 1 and b 2 are the asymmetry and kurtosis coefficients of the sample, respectively.
Table 3. Descriptive summary of Repair Time data.
Computing first the moment estimators under the SME model, we obtain λ̂_M = 0.335 and q̂_M = 3.398. Using the moment estimates as initial values, the ML estimates for the Weibull, GE, and SME distributions, together with the corresponding AIC and BIC values, are computed and presented in Table 4.
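The moment estimates can be reproduced directly from the data and Equations (6) and (7). The Python sketch below (illustrative, not from the article) gives values close to the reported λ̂_M = 0.335 and q̂_M = 3.398, with small differences attributable to rounding:

```python
import math

# repair times (hours) of the 46 airborne communications receivers
z = [0.2, 0.3, 0.5, 0.5, 0.5, 0.6, 0.6, 0.7, 0.7, 0.7, 0.8, 0.8, 0.8,
     1.0, 1.0, 1.0, 1.0, 1.1, 1.3, 1.5, 1.5, 1.5, 1.5, 2.0, 2.0, 2.2,
     2.5, 2.7, 3.0, 3.0, 3.3, 3.3, 4.0, 4.0, 4.5, 4.7, 5.0, 5.4, 5.4,
     7.0, 7.5, 8.8, 9.0, 10.3, 22.0, 24.5]

m1 = sum(z) / len(z)                 # sample mean
m2 = sum(v * v for v in z) / len(z)  # sample mean of squares

# Equation (7): moment estimator of q
q_m = ((5 * m2 - 8 * m1 ** 2 + math.sqrt(m2 * (9 * m2 - 16 * m1 ** 2)))
       / (4 * (m2 - 2 * m1 ** 2)))
# Equation (6): moment estimator of lambda
lam_m = (2 * q_m - 1) / (2 * m1 * (q_m - 1))
```

These closed-form values are then natural starting points for the ML optimization.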
Table 4. ML estimates for the Weibull, GE, and SME models, and AIC and BIC values.
Table 4 shows the parameter estimates for the Weibull, GE, and SME distributions obtained with the ML method, together with the corresponding Akaike information criterion (AIC) proposed by Akaike [] and Bayesian information criterion (BIC) proposed by Schwarz []. For the dataset analyzed, and according to the AIC and BIC selection criteria, the SME model gives a better fit to the data than the Weibull and GE models.
Figure 4 (left) presents the histogram of the dataset with the curves of the fitted models. To allow a clearer appreciation of the fits for the repair times (hours) of the 46 airborne communications receivers, Figure 4 (right) zooms in on the tails of the histogram. This shows more conclusively that the SME model places greater probability in the tails than the Weibull and GE models. To complete the analysis of the fits to this dataset, Figure 5 presents the QQ-plots of the three fitted distributions.
Figure 4. SME (solid line), GE (dashed line), and Weibull (dotted line).
Figure 5. QQ-plots for repair time of 46 airborne communications receivers dataset: (left) Weibull model; (center) GE model; (right) SME model.
Figure 5 shows that the theoretical quantiles of the proposed SME model present a better fit to the quantiles of the repair time data of the sample than the theoretical quantiles of the Weibull and GE models. Thus, as stated above, based on the AIC and BIC selection criteria, the SME model presents a better fit with this dataset.

5. Conclusions

This paper presents an extension of the exponential distribution based on the slash methodology. This results in a distribution which is represented using the confluent hypergeometric function. We study its properties and its ML estimation using the EM algorithm, and present a simulation study and an application to real data. Some other characteristics of the SME distribution are as follows:
  • The SME distribution has two representations, given in (1) and in Proposition 1.
  • Based on the scale mixture representation, the EM algorithm was implemented to compute the maximum likelihood estimators of the SME distribution.
  • The simulation study shows that the ML estimators produce very good results with small samples.
  • Our application shows that the SME distribution is a good option when the data have a heavy right tail; this is confirmed by the AIC and BIC model selection criteria in a comparison with the Weibull and GE distributions.
  • We are working on an extension of the SME distribution that will have a more flexible mode, as well as using it to model data with covariables.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math12010156/s1.

Author Contributions

Conceptualization, J.A.B., Y.M.G. and H.W.G.; methodology, Y.M.G. and H.W.G.; software, J.A.B., Y.M.G. and E.G.-D.; validation, Y.M.G., H.W.G. and E.G.-D.; formal analysis, J.A.B. and O.V.; investigation, J.A.B., O.V. and E.G.-D.; writing—original draft preparation, Y.M.G. and O.V.; writing—review and editing, Y.M.G., O.V. and E.G.-D.; funding acquisition, H.W.G. and O.V. All authors have read and agreed to the published version of the manuscript.

Funding

The research of J.A. Barahona and H.W. Gómez was supported by Semillero UA-2023.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

  • Density function.
    The hypergeometric function contained in the CharFun package was used to obtain the graph of the density function.
    eexp <- function(a, b1, b2, b3)
    {
    x <- seq(0, 6, 0.04)
    y <- a*exp(-2*a*x)*hypergeom1F1(2*a*x, b1, 2*b1+1)
    y1 <- a*exp(-2*a*x)*hypergeom1F1(2*a*x, b2, 2*b2+1)
    y2 <- a*exp(-2*a*x)*hypergeom1F1(2*a*x, b3, 2*b3+1)
    y3 <- a*exp(-a*x)
    plot(x, y, type = "l", ylim = c(0, 0.6), xlim = c(0, 3), xlab = "z",
      ylab = "Density")
    lines(x, y1, lty = 2)
    lines(x, y2, lty = 3)
    lines(x, y3, lty = 4)
    }
    eexp(3, 1, 5, 10)
  • Hazard function
    The hypergeometric function contained in the CharFun package was also used to obtain the graph of the hazard function.
    hexp <- function(a, b1, b2, b3)
    {
    x <- seq(0, 10, 0.04)
    y <- a*hypergeom1F1(2*a*x, b1, 2*b1+1)/hypergeom1F1(2*a*x, b1, 2*b1)
    y1 <- a*hypergeom1F1(2*a*x, b2, 2*b2+1)/hypergeom1F1(2*a*x, b2, 2*b2)
    y2 <- a*hypergeom1F1(2*a*x, b3, 2*b3+1)/hypergeom1F1(2*a*x, b3, 2*b3)
    y3 <- a*exp(-a*x)/exp(-a*x)
    plot(x, y, type = "l", ylim = c(0, 1), xlim = c(0, 10), xlab = "t",
      ylab = "Hazard function")
    lines(x, y1, lty = 2)
    lines(x, y2, lty = 3)
    lines(x, y3, lty = 5)
    }
    hexp(1, 1, 5, 10)
  • Asymmetry Coefficient
    q <- seq(3.01, 20.01, 0.01)
    b0 <- beta(q, q)
    b1 <- beta(q-1, q)
    b2 <- beta(q-2, q)
    b3 <- beta(q-3, q)
    Asym <- (2*(3*b0^2*b3 - 3*b0*b1*b2 + b1^3))/(2*b0*b2 - b1^2)^(3/2)
    plot(q, Asym, type = "l", ylim = c(0, 20), xlab = "q",
     ylab = "Asymmetry Coefficient")
  • Kurtosis Coefficient
    q <- seq(4.01, 140, 0.01)
    b0 <- beta(q, q)
    b1 <- beta(q-1, q)
    b2 <- beta(q-2, q)
    b3 <- beta(q-3, q)
    b4 <- beta(q-4, q)
    Kurt <- (3*(8*b0^3*b4 - 8*b0^2*b1*b3 + 4*b0*b1^2*b2 - b1^4))/(2*b0*b2 - b1^2)^2
    plot(q, Kurt, type = "l", xlim = c(5, 100), ylim = c(0, 40), xlab = "q",
      ylab = "Kurtosis Coefficient")
  • Application
    The dataset, related to the repair time (hours) for a simple total sample of 46 airborne communications receivers:
    0.2 0.3 0.5 0.5 0.5 0.6 0.6 0.7 0.7 0.7 0.8 0.8 0.8 1.0 1.0 1.0 1.0 1.1 1.3 1.5 1.5 1.5 1.5 2.0 2.0 2.2 2.5 2.7 3.0 3.0 3.3 3.3 4.0 4.0 4.5 4.7 5.0 5.4 5.4 7.0 7.5 8.8 9.0 10.3 22.0 24.5
    Parameter estimation using maximum likelihood estimators, to contrast the SME model with the Weibull and generalized exponential models:
    #SME
    library(CharFun)
    # y contains the repair-time data listed above
    se3 <- function(theta){
    lambda = theta[1]
    q = theta[2]
    f = -log(lambda) - log(hypergeom1F1(-2*lambda*y, q+1, 2*q+1))
    log.f = sum(f)
    return(log.f)}
    #Iterative method
    n = optim(par = c(0.1067734, 4), se3, hessian = TRUE, method = "L-BFGS-B",
    lower = c(0, 0), upper = c(Inf, Inf))
    n
    #Inverse of the Hessian matrix
    solve(n$hessian)
    #Standard errors
    sqrt(round(diag(solve(n$hessian)), 5))
     
    #Weibull
    se4 <- function(theta){
    lambda = theta[1]
    q = theta[2]
    f = -log(lambda) - log(q) - (q-1)*log(y) + lambda*y^q
    log.f = sum(f)
    return(log.f)}
    #Iterative method
    n = optim(par = c(0.106, 2), se4, hessian = TRUE, method = "L-BFGS-B",
    lower = c(0, 0), upper = c(Inf, Inf))
    n
    #Inverse of the Hessian matrix
    solve(n$hessian)
    #Standard errors
    sqrt(round(diag(solve(n$hessian)), 5))
     
    #GE
    se5 <- function(theta){
    lambda = theta[1]
    q = theta[2]
    f = -log(q) - log(lambda) - (q-1)*log(1-exp(-lambda*y)) + lambda*y
    log.f = sum(f)
    return(log.f)}
    #Iterative method
    n = optim(par = c(0.5268, 0.7904), se5, hessian = TRUE, method = "L-BFGS-B",
    lower = c(0, 0), upper = c(Inf, Inf))
    n
    #Inverse of the Hessian matrix
    solve(n$hessian)
    #Standard errors
    sqrt(round(diag(solve(n$hessian)), 5))
     
    library(CharFun)
    # x contains the repair-time data listed above
    hist(x, freq = F, ylim = c(0, 0.17), ylab = "Density", xlab = "Variable", main = "")
    #SME, values obtained by fitting the model:
    a1 = 0.3722
    b1 = 2.3078
    curve(a1*exp(-2*a1*x)*hypergeom1F1(2*a1*x, b1, 2*b1+1), add = T)
    #GE, values obtained by fitting the model:
    a2 = 0.2694
    b2 = 0.9583
    curve(b2*a2*(1-exp(-x*a2))^(b2-1)*exp(-x*a2), lty = 2, add = T)
    #Weibull, values obtained by fitting the model:
    a3 = 0.3337
    b3 = 0.8986
    curve(a3*b3*(x*a3)^(b3-1)*exp(-(x*a3)^b3), lty = 3, add = T)
     
    # QQ-plots
    #Weibull
    datos = x
    lambda = 0.3337
    q = 0.8986
    Fx = 1 - exp(-(lambda*datos)^q)
    f = qnorm(Fx)
    library(nortest)
    qqnorm(f, pch = 1, frame = FALSE, ylim = c(-4, 4),
    xlim = c(-3, 3), main = "", cex.lab = 1.5, cex.main = 2,
    xlab = "Theoretical quantiles Weibull",
    ylab = "Quantiles sample repair time")
    qqline(f, col = "black", lwd = 2)
     
    #GE
    datos = x
    lambda = 0.2694
    q = 0.9582
    Fx = (1 - exp(-lambda*datos))^q
    f = qnorm(Fx)
    library(nortest)
    qqnorm(f, pch = 1, frame = FALSE, ylim = c(-4, 4),
    xlim = c(-3, 3), main = "", cex.lab = 1.5, cex.main = 2,
    xlab = "Theoretical quantiles GE",
    ylab = "Quantiles sample repair time")
    qqline(f, col = "black", lwd = 2)
     
    #SME
    library(CharFun)
    datos = x
    lambda = 0.3722
    q = 2.3078
    Fx = 1 - exp(-2*lambda*datos)*hypergeom1F1(2*lambda*datos, q, 2*q)
    f = qnorm(Fx)
    library(nortest)
    qqnorm(f, pch = 1, frame = FALSE, ylim = c(-4, 4),
    xlim = c(-3, 3), main = "", cex.lab = 1.5, cex.main = 2,
    xlab = "Theoretical quantiles SME",
    ylab = "Quantiles sample repair time")
    qqline(f, col = "black", lwd = 2)

References

  1. Andrews, D.F.; Mallows, C.L. Scale Mixtures of Normal Distributions. J. R. Stat. Soc. Ser. B (Methodol.) 1974, 36, 99–102. [Google Scholar] [CrossRef]
  2. Fernández, C.; Steel, M.F. Bayesian Regression Analysis with Scale Mixtures of Normals. Econom. Theory 2000, 16, 80–101. [Google Scholar] [CrossRef]
  3. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions, 2nd ed.; Wiley: New York, NY, USA, 1995; Volume 1. [Google Scholar]
  4. Rogers, W.H.; Tukey, J.W. Understanding some long-tailed symmetrical distributions. Stat. Neerl. 1972, 26, 211–226. [Google Scholar] [CrossRef]
  5. Mosteller, F.; Tukey, J.W. Data Analysis and Regression; Addison-Wesley: Reading, MA, USA, 1977. [Google Scholar]
  6. Kafadar, K. A biweight approach to the one-sample problem. J. Am. Stat. Assoc. 1982, 77, 416–424. [Google Scholar] [CrossRef]
  7. Wang, J.; Genton, M.G. The multivariate skew-slash distribution. J. Stat. Plann. Inference 2006, 136, 209–220. [Google Scholar] [CrossRef]
  8. Olmos, N.M.; Varela, H.; Bolfarine, H.; Gómez, H.W. An extension of the generalized half-normal distribution. Stat. Pap. 2014, 55, 967–981. [Google Scholar] [CrossRef]
  9. Rivera, P.; Barranco-Chamorro, I.K.; Gallardo, D.I.; Gómez, H.W. Scale Mixture of Rayleigh Distribution. Mathematics 2020, 8, 1842. [Google Scholar] [CrossRef]
  10. Castillo, J.; Gaete, K.; Muñoz, H.; Gallardo, D.I.; Bourguignon, M.; Venegas, O.; Gómez, H.W. Scale Mixture of Maxwell-Boltzmann Distribution. Mathematics 2023, 11, 529. [Google Scholar] [CrossRef]
  11. Gupta, R.D.; Kundu, D. Generalized Exponential Distributions. Aust. N. Z. J. Stat. 1999, 41, 173–188. [Google Scholar] [CrossRef]
  12. Gupta, R.D.; Kundu, D. Exponentiated Exponential Family: An Alternative to Gamma and Weibull Distributions. Biom. J. 2001, 43, 117–130. [Google Scholar] [CrossRef]
  13. Gupta, R.D.; Kundu, D. Generalized exponential distribution: Existing results and some recent developments. J. Stat. Plann. Inference 2007, 137, 3537–3547. [Google Scholar] [CrossRef]
  14. Mudholkar, G.S.; Srivastava, D.K.; Freimer, M. The exponentiated Weibull Family—A reanalysis of the Bus-Motor-Failure data. Technometrics 1995, 37, 436–445. [Google Scholar] [CrossRef]
  15. Abramowitz, M.; Stegun, I.A. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th ed.; National Bureau of Standards: Washington, DC, USA, 1970. [Google Scholar]
  16. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://www.R-project.org/ (accessed on 12 January 2023).
  17. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc. B Stat. Methodol. 1977, 39, 1–38. [Google Scholar]
  18. Gordy, M. A Generalization of Generalized Beta Distributions; Finance and Economics Discussion Series (FEDS); Board of Governors of the Federal Reserve System: Washington, DC, USA, 1998; p. 28. [Google Scholar]
  19. Devore, J.L. Probabilidad y Estadística para Ingeniería y Ciencias, 7th ed.; Cengage Learning Editores: Santa Fe, México, 2008. [Google Scholar]
  20. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723. [Google Scholar] [CrossRef]
  21. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
