Univariate and Bivariate Models Related to the Generalized Epsilon–Skew–Cauchy Distribution

In this paper, we consider a stochastic representation of the epsilon–skew–Cauchy distribution, viewed as a member of the family of skewed distributions discussed in Arellano-Valle et al. (2005). The stochastic representation facilitates derivation of distributional properties of the model. In addition, we introduce symmetric and asymmetric extensions of the Cauchy distribution, together with an extension of the epsilon–skew–Cauchy distribution. Multivariate versions of these distributions can be envisioned. Bivariate examples are discussed in some detail.


Introduction
Mudholkar and Hutson (2000) [1] studied an asymmetric normal distribution that they called the epsilon-skew-normal, {ESN(ε); |ε| < 1}, with asymmetry or skewness parameter ε. When ε = 0, the distribution reduces to the standard normal distribution. The family thus consists of a parameterized set of usually asymmetric distributions that includes the symmetric standard normal density as a special case. Specifically, we say that X ∼ ESN(ε) if its density is of the form
$$f(x; \varepsilon) = \phi\!\left(\frac{x}{1 - \varepsilon\,\mathrm{sgn}(x)}\right),$$
where x ∈ R, φ is the standard normal density and sgn(·) is the sign function.
Arellano-Valle et al. (2005) [2] discuss extensions of this model, together with associated inference procedures. They consider a class of epsilon-skew-symmetric distributions associated with a particular symmetric density f(·) and indexed by an asymmetry parameter ε, with densities given by
$$h(x; \varepsilon) = f\!\left(\frac{x}{1 - \varepsilon\,\mathrm{sgn}(x)}\right), \qquad (1)$$
where x ∈ R and |ε| < 1. If X has density of the form (1), then we say that X is an epsilon-skew-symmetric random variable and we write X ∼ ES f(ε). Arellano-Valle et al. (2005) [2] extend this family to the epsilon-skew-exponential-power model, which can exhibit more or less asymmetry and kurtosis than the ESN model. On the other hand, Gómez et al. (2007) [3] study the Fisher information matrix for the epsilon-skew-t model, which was used earlier in the study of a financial series by Hansen (1994) [4]; see also Gómez et al. (2008) [5]. Note that if in (1) we set $f(t) = 1/[\pi(1 + t^2)]$, we obtain the epsilon-skew-Cauchy model.
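As a minimal sketch of the construction in (1), the Python snippet below evaluates an epsilon-skew-symmetric density with the Cauchy kernel f(t) = 1/[π(1 + t²)]. It assumes that (1) has the usual Mudholkar-Hutson form, i.e., the kernel is evaluated at x/(1 + ε) for x < 0 and at x/(1 − ε) for x ≥ 0. The function names (`cauchy_kernel`, `es_pdf`, `esc_cdf`) are illustrative and do not come from the paper.

```python
import math


def cauchy_kernel(t):
    """Standard Cauchy density f(t) = 1 / (pi * (1 + t^2))."""
    return 1.0 / (math.pi * (1.0 + t * t))


def es_pdf(x, eps, kernel=cauchy_kernel):
    """Epsilon-skew-symmetric density, assuming the form f(x / (1 - eps*sgn(x)))."""
    if not -1.0 < eps < 1.0:
        raise ValueError("eps must lie in (-1, 1)")
    scale = 1.0 + eps if x < 0 else 1.0 - eps
    return kernel(x / scale)


def esc_cdf(x, eps):
    """CDF of the epsilon-skew-Cauchy law, from integrating es_pdf piecewise."""
    if not -1.0 < eps < 1.0:
        raise ValueError("eps must lie in (-1, 1)")
    if x < 0:
        return (1.0 + eps) * (0.5 + math.atan(x / (1.0 + eps)) / math.pi)
    return (1.0 + eps) / 2.0 + (1.0 - eps) * math.atan(x / (1.0 - eps)) / math.pi
```

Under this assumed form the probability mass to the left of zero is (1 + ε)/2, so a positive ε moves mass to the negative half-line, and ε = 0 recovers the standard Cauchy distribution.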
We will write X ∼ N(0, 1) to indicate that X has a standard normal distribution, and we will write Y ∼ HN(0, 1) to indicate that Y has a standard half-normal distribution, i.e., that Y = |X| where X ∼ N(0, 1).
The distribution of the ratio X/Y of two random variables is of interest in problems in the biological and physical sciences, econometrics, and ranking and selection. It is well known that if X ∼ N(0, 1), $Y_1$ ∼ N(0, 1) and $Y_2$ ∼ HN(0, 1) are independent, then the random variables $X/Y_1$ and $X/Y_2$ both have Cauchy distributions; see Johnson et al. (1994, [6] Chapter 16). Behboodian et al. (2006) [7] and Huang and Chen (2007) [8] study the distribution of such quotients when the component random variables are skew-normal (of the form studied in Azzalini (1985) [9]). The principal objective of the present paper is to study the behavior of such quotients when the component random variables have epsilon-skew-normal distributions.
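The ratio result quoted above can be checked numerically. The following sketch, written in plain Python with illustrative function names, simulates the ratio of a standard normal to an independent standard normal and to an independent standard half-normal, and returns empirical quartiles that can be compared with those of the standard Cauchy distribution (−1, 0 and 1). It is a sanity check under simulation noise, not a proof.

```python
import random


def simulate_ratios(n, seed=0):
    """Draw n values of X/Y1 and X/Y2 with X, Y1 ~ N(0,1) and Y2 ~ HN(0,1)."""
    rng = random.Random(seed)
    ratio_normal, ratio_half_normal = [], []
    while len(ratio_normal) < n:
        x = rng.gauss(0.0, 1.0)
        y1 = rng.gauss(0.0, 1.0)
        y2 = abs(rng.gauss(0.0, 1.0))
        if y1 == 0.0 or y2 == 0.0:
            continue  # guard against an (essentially impossible) zero denominator
        ratio_normal.append(x / y1)
        ratio_half_normal.append(x / y2)
    return ratio_normal, ratio_half_normal


def empirical_quartiles(sample):
    """Return rough empirical 25%, 50% and 75% quantiles of the sample."""
    s = sorted(sample)
    n = len(s)
    return s[n // 4], s[n // 2], s[(3 * n) // 4]
```

Quartiles are used rather than means and variances because the Cauchy distribution has no finite moments of order one or higher.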
The paper is organized in the following manner. In Section 2, we describe a representation of the epsilon-skew-Cauchy model. In Section 3, we consider the distribution of the ratio of two independent random variables, one of which has an ESN (ε) distribution and the other a standard normal distribution. In addition, an extension of the epsilon-skew-Cauchy (ESC) distribution is introduced. Bivariate versions of these distributions are discussed in Section 4. Extensions to higher dimensions can be readily envisioned, but are not discussed here. In Section 5, some of the bivariate distributions introduced in this paper are considered as possible models for a particular real-world data set.
Proof. With the transformation $Z_1 = X/Y$ and $W = Y$, whose Jacobian is $|J| = w$, we obtain
$$f_{Z_1, W}(z, w) = w\, f_{X,Y}(zw, w),$$
where z ∈ R, w > 0.

General Bivariate Mudholkar-Hutson Distribution
Define $Z_1, Z_2, Z_3$ to be i.i.d. standard normal random variables. For i = 1, 2, 3 define
$$U_i = \begin{cases} -\alpha_i & \text{with probability } \gamma_i, \\ \beta_i & \text{with probability } 1 - \gamma_i, \end{cases}$$
where $\alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3$ are positive numbers and $0 < \gamma_1, \gamma_2, \gamma_3 < 1$. Thus the parameters $\alpha_i$ and $\beta_i$ govern the magnitudes of the negative and positive values, respectively, taken by the discrete random variable $U_i$, while the parameters $\gamma_i$ control how often negative and positive values are taken by $U_i$. In addition, assume that all six random variables $Z_1, Z_2, Z_3, U_1, U_2$ and $U_3$ are independent, and define
$$(X, Y) = \left(\frac{U_1 |Z_1|}{U_3 |Z_3|},\; \frac{U_2 |Z_2|}{U_3 |Z_3|}\right). \qquad (2)$$
The model (2) is highly flexible since it allows for different behavior in each of the four quadrants of the plane. From (2) it may be recognized that only fractional moments (of order less than one) of X and Y exist.
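Note that if we define $(W_1, W_2) = (|Z_1|/|Z_3|,\, |Z_2|/|Z_3|)$, then its joint density is
$$f_{W_1, W_2}(w_1, w_2) = \frac{2}{\pi}\left(1 + w_1^2 + w_2^2\right)^{-3/2}, \qquad w_1 > 0,\; w_2 > 0, \qquad (3)$$
in which case we say that $(W_1, W_2)$ has a standard bivariate half-Cauchy distribution.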
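The latent-variable construction lends itself to direct simulation. The sketch below is one possible reading of it, and rests on two assumptions: that each $U_i$ equals $-\alpha_i$ with probability $\gamma_i$ and $\beta_i$ otherwise, and that (2) has the ratio form $X = U_1|Z_1|/(U_3|Z_3|)$, $Y = U_2|Z_2|/(U_3|Z_3|)$. The names `draw_u` and `sample_gbmh` are hypothetical.

```python
import random


def draw_u(rng, alpha, beta, gamma):
    """Assumed law of U: -alpha with probability gamma, +beta otherwise."""
    return -alpha if rng.random() < gamma else beta


def sample_gbmh(n, alpha, beta, gamma, seed=0):
    """Draw n pairs (X, Y) from the assumed ratio form of representation (2).

    alpha, beta, gamma are length-3 sequences (alpha_i, beta_i > 0, 0 < gamma_i < 1).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z = [abs(rng.gauss(0.0, 1.0)) for _ in range(3)]  # |Z_1|, |Z_2|, |Z_3|
        u = [draw_u(rng, alpha[i], beta[i], gamma[i]) for i in range(3)]
        denom = u[2] * z[2]                               # U_3 |Z_3|
        pairs.append((u[0] * z[0] / denom, u[1] * z[1] / denom))
    return pairs
```

Under these assumptions a point falls in the open first quadrant when $U_1$, $U_2$ and $U_3$ share the same sign, which happens with probability $(1-\gamma_1)(1-\gamma_2)(1-\gamma_3) + \gamma_1\gamma_2\gamma_3$; comparing this with the simulated frequency is a quick check of the sampler.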
Using (3) and conditioning on $U_1$, $U_2$ and $U_3$, we obtain the density of (X, Y), albeit in a complicated form:

Some special cases which might be considered include the following:

1. Mudholkar and Hutson type. For this we set:
In this case the density (4) simplifies somewhat to become:
A further specialization of the density (5) can be considered, as follows.

2. Homogeneous Mudholkar and Hutson type. For this we set:
This homogeneity results in a little simplification of (4), thus:
It is easy to see that the parameter ε is not identifiable in (6). An adjustment to ensure identifiability involves introducing the constraint ε ≥ 0.

3. Equal weights. In this case we assume that $\alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3$ are positive numbers and
The pdf (7) is not identifiable because the values of the $\alpha_i$ can be interchanged with those of the $\beta_i$ without changing f(x, y). Moreover, multiplying all of the α's and β's by a common constant does not change f(x, y). So, one way to get identifiability in the model (7) is to set $\alpha_i = \beta_i$ (i = 1, 2, 3) and $\alpha_3$ equal to 1. In that case, Equation (7) takes the form
However, this is then recognizable as being simply a scaled version of the standard bivariate Cauchy density (compare with Equation (3)).
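The two invariances behind the non-identifiability of (7) can be made explicit. The short derivation below assumes that representation (2) has the ratio form $X = U_1|Z_1|/(U_3|Z_3|)$, $Y = U_2|Z_2|/(U_3|Z_3|)$, and that "equal weights" means each $U_i$ takes its negative value $-\alpha_i$ and its positive value $\beta_i$ with probability 1/2; both are readings of the text rather than displayed formulas.

```latex
% Scaling: replace (\alpha_i, \beta_i) by (c\alpha_i, c\beta_i), c > 0, so that U_i \mapsto c\,U_i.
\frac{(cU_1)\,|Z_1|}{(cU_3)\,|Z_3|} = \frac{U_1|Z_1|}{U_3|Z_3|} = X,
\qquad
\frac{(cU_2)\,|Z_2|}{(cU_3)\,|Z_3|} = \frac{U_2|Z_2|}{U_3|Z_3|} = Y .

% Swapping: with equal weights, exchanging \alpha_i and \beta_i for i = 1, 2, 3
% yields (U_1', U_2', U_3') \overset{d}{=} (-U_1, -U_2, -U_3), and
\frac{(-U_1)\,|Z_1|}{(-U_3)\,|Z_3|} = X,
\qquad
\frac{(-U_2)\,|Z_2|}{(-U_3)\,|Z_3|} = Y .
```

In both cases the joint law of (X, Y) is unchanged, which is why a normalization of the α's and β's is needed before the parameters can be estimated.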
We now consider the marginal densities for the random variable (X, Y) defined by (2).
Consequently, the density of X is of the form $f_X(x) = \frac{2}{\pi}\,\gamma_1 \gamma_3 \cdots$, which is a mixture of half-Cauchy densities. The density of Y is then of the same form with $\alpha_1, \beta_1$ replaced by $\alpha_2, \beta_2$.
To get a more direct bivariate version of the original Mudholkar-Hutson density in which the α's, β's and γ's were related to each other, we go back to the original representation, i.e., Equation (2), but now we set
So the density is of the form
A more general version of the density, with $\gamma_3 = 1$, is of the form: (without loss of generality set

Needless to say, we can consider analogous models in which, instead of assuming that $(W_1, W_2) = (|Z_1|/|Z_3|,\, |Z_2|/|Z_3|)$ has a density of the form (3), we assume that it has another bivariate density with support R+ × R+, e.g., a bivariate normal restricted to R+ × R+, or a bivariate Pareto, etc.
Another general bivariate MH model, which includes the bivariate skew-Cauchy distribution given by (2), is the bivariate skew-t model. This model can be obtained by replacing $|Z_3|$ by $V_3 \sim \chi^2_\nu$ in Equation (2). Thus, if we assume that the six random variables $Z_1, Z_2 \overset{iid}{\sim} N(0, 1)$, $V_3 \sim \chi^2_\nu$, $U_1$, $U_2$ and $U_3$ are independent, and define
then, because (W 1 ,
The pair of variables in (8) allows one to model a wider variety of paired data sets than that given in (2), because it can also model light and heavy tails differently in each quadrant of the coordinate plane. Additionally, it is possible to compute the moments of order r.

Proposition 3. The expected value of (X, Y) in (8) is given by
provided that ν > 1.

On the other hand, since
Therefore, from (8) we have
and the result follows directly.

Following the proof of the previous proposition, it is possible to obtain the moments of order r, provided that ν > r.
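The condition ν > r can be traced to the negative moments of the chi-square variable. As a hedged illustration, the following are standard facts about the normal and chi-square distributions (they are not formulas copied from the paper), stated for $Z \sim N(0, 1)$ and $V \sim \chi^2_\nu$:

```latex
\mathrm{E}\,|Z|^{r} = \frac{2^{r/2}\,\Gamma\!\left(\frac{r+1}{2}\right)}{\sqrt{\pi}},
\qquad r > -1,
\qquad\qquad
\mathrm{E}\,V^{-r/2} = \frac{\Gamma\!\left(\frac{\nu - r}{2}\right)}{2^{r/2}\,\Gamma\!\left(\frac{\nu}{2}\right)},
\qquad \nu > r .
```

The second expectation is finite only when ν > r. This matches the stated condition if the denominator in (8) is proportional to $\sqrt{V_3}$, which is an assumption here about how $V_3$ enters (8).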
In applications, it will usually be appropriate to augment these models by the introduction of location, scale and rotation parameters, i.e., to consider where $\mu \in (-\infty, \infty)^2$ and Σ is positive definite.

Application
The data that we will use were collected by the Australian Institute of Sport and reported by Cook and Weisberg (1994) [10]. The data set consists of values of several variables measured on n = 202 Australian athletes. Specifically, we shall consider the pair of variables (Ht, Wt), which are the height (cm) and the weight (kg) measured for each athlete.
We fitted the bivariate Mudholkar-Hutson distributions for five different cases. In addition to the general case, which is given by the pdf $f(x, y; \alpha_i, \beta_i, \gamma_i, \nu, \mu, \Sigma)$ based on (9), where i = 1, 2, 3, $\mu = (\mu_1, \mu_2)$ is the location parameter and
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12} & \Sigma_{22} \end{pmatrix}$$
is the symmetric positive definite scale matrix, we also consider four special cases:
1. When the pdf is given by Equation (4). That is, taking ν → ∞ in the bivariate skew-t MH model.
2. When the pdf is given by
which is the bivariate MH distribution specified using the special case (1.), where $|\varepsilon_i| < 1$ for all i = 1, 2, 3.
3. When the pdf is given by $f(x, y; \varepsilon, \mu_j, \Sigma_{11}, \Sigma_{22}, \Sigma_{12}) = f(x, y; 1 + \varepsilon, 1 - \varepsilon, \ldots)$, which is the bivariate MH distribution specified using the special case (2.), where |ε| < 1.
4. The bivariate Cauchy distribution.
All fits were obtained by maximizing the likelihood numerically, combining Hessian-based algorithms with the simulated annealing algorithm. Standard errors of the estimates were computed from 1000 bootstrap samples.
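To illustrate the fit-then-bootstrap workflow in the simplest possible setting, the sketch below estimates the single parameter ε of a univariate epsilon-skew-Cauchy density and attaches a bootstrap standard error. It is deliberately not the procedure used in the paper: the model is univariate, a plain grid search stands in for the Hessian-based and simulated annealing optimizers, and the density is assumed to have the form f(x/(1 − ε sgn(x))) with the Cauchy kernel. All function names are hypothetical, and the data are meant to be simulated rather than taken from the athletes data set.

```python
import math
import random


def esc_loglik(eps, data):
    """Log-likelihood under the assumed density f(x / (1 - eps*sgn(x))), Cauchy f."""
    total = 0.0
    for x in data:
        scale = 1.0 + eps if x < 0 else 1.0 - eps
        u = x / scale
        total -= math.log(math.pi * (1.0 + u * u))
    return total


def fit_eps(data, step=0.02):
    """Grid-search maximum likelihood estimate of eps over [-0.98, 0.98]."""
    grid = [-0.98 + step * i for i in range(int(round(1.96 / step)) + 1)]
    return max(grid, key=lambda e: esc_loglik(e, data))


def sample_esc(n, eps, rng):
    """Simulate via X = U * W, W half-Cauchy, U = -(1+eps) w.p. (1+eps)/2, else 1-eps."""
    out = []
    for _ in range(n):
        w = abs(rng.gauss(0.0, 1.0)) / abs(rng.gauss(0.0, 1.0))
        if rng.random() < (1.0 + eps) / 2.0:
            out.append(-(1.0 + eps) * w)
        else:
            out.append((1.0 - eps) * w)
    return out


def bootstrap_se(data, n_boot, rng):
    """Nonparametric bootstrap standard error of the grid-search estimate."""
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]
        estimates.append(fit_eps(resample))
    mean = sum(estimates) / n_boot
    var = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return math.sqrt(var)
```

A typical call would be `data = sample_esc(400, 0.4, random.Random(1))`, followed by `fit_eps(data)` and `bootstrap_se(data, 30, random.Random(2))`. A much larger number of resamples, such as the 1000 used in the paper, gives a more stable standard error at a proportionally higher cost.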
To compare model fits, we used the Akaike information criterion (see Akaike, 1974 [11]), namely
$$\mathrm{AIC} = -2\,\ell(\hat{\theta}) + 2k,$$
where $\ell(\hat{\theta})$ is the maximized log-likelihood and k is the dimension of θ, the vector of parameters of the model being considered.

Table 1 displays the results of the fits for the 100 women. The five competing models are the general bivariate skew-t MH model with pdf given in Equation (9), the general bivariate MH model with pdf given in Equation (4), the bivariate MH (1.) model with pdf given in Equation (5), the bivariate MH (2.) model with pdf given in Equation (6), and the bivariate Cauchy model. The table shows the maximum likelihood estimates (mle's) for the five models; the last column shows the estimated standard errors (se) of the estimates. Non-identifiability of the bivariate skew-t MH model is evidenced by the huge value of the estimated standard error of the estimate of ν. However, the AIC indicates that the data are best fitted by the general bivariate skew-t MH model (9). Figure 2 shows the contour lines of the fitted pdf.

Table 2 displays the results of the fits for the 102 men, again comparing the five competing models. The maximum likelihood estimates (mle's) for the five models, the corresponding AIC values and the estimated standard errors of the estimates are displayed in the table. Again, the AIC indicates that the data are best fitted by the general bivariate skew-t MH model (9). Figure 3 shows the contour lines of the fitted pdf. Non-identifiability of the general bivariate skew-t MH (9) and general bivariate MH (4) models is shown by the large values of the estimated standard errors for some estimates.

Table 3 displays the results of the fits for the full set of n = 202 Australian athletes, regardless of gender. It shows the maximum likelihood estimates (mle's) for the five models together with the corresponding AIC values and the estimated standard errors of the estimates. The AIC here also indicates that the data are best fitted by the general bivariate skew-t MH model (9). Figure 4 shows the contour lines of the fitted pdf. Thus, in all three cases (women, men and combined), the best-fitting model according to the AIC was the general bivariate skew-t MH model (9), which suggests that the more general model is worth considering to explain the variability of these data sets.
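The model comparison above reduces to a small computation once each model's maximized log-likelihood and parameter count are known. The sketch below uses the standard definition AIC = −2 × (maximized log-likelihood) + 2k; the function names are illustrative, and no values from Tables 1 to 3 are reproduced here.

```python
def aic(loglik, k):
    """Akaike information criterion: -2 * maximized log-likelihood + 2 * k."""
    return -2.0 * loglik + 2.0 * k


def rank_by_aic(fits):
    """Rank models from best (smallest AIC) to worst.

    fits maps a model name to a pair (maximized log-likelihood, number of parameters).
    Returns a list of (name, AIC) pairs sorted by increasing AIC.
    """
    scores = {name: aic(loglik, k) for name, (loglik, k) in fits.items()}
    return sorted(scores.items(), key=lambda item: item[1])
```

Because the penalty term 2k grows with the number of parameters, a richer model is preferred only when its gain in log-likelihood exceeds the number of extra parameters it introduces.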

Concluding Remarks
The Mudholkar-Hutson skewing mechanism admits flexible extensions in both the univariate and multivariate cases. Stochastic representations of such extended models typically have corresponding likelihood functions that are somewhat complicated (for example, Equations (4) and (9)). This is partly compensated for by the ease of simulation for such models using the representation in terms of latent variables (the $U_i$'s and $Z_i$'s in (2) or (8)). The data set analyzed in Section 5 illustrates the potential advantage of considering the extended bivariate Mudholkar-Hutson model, since the basic bivariate M-H model does not provide an acceptable fit to the "athletes" data. Needless to say, applying Occam's razor, it would always be desirable to consider the hierarchy of the five nested bivariate models that were considered in Section 5, in order to determine whether one of the simpler models might be adequate to describe a particular data set. Indeed, there may be other special cases of the model (4), intermediate between (4) and (5) say, that might profitably be considered for some data sets. However, it should be remarked that selection of such sub-models for consideration should be done prior to inspecting the data. The addition of the general bivariate Mudholkar-Hutson (GBMH) model to the data analyst's tool-kit should provide desirable flexibility for modeling data sets which exhibit behavior somewhat akin to, but not perfectly adapted to, more standard bivariate Cauchy models.