A Note on Pareto-Type Distributions Parameterized by Its Mean and Precision Parameters

Abstract: Pareto-type distributions are well-known distributions used to fit heavy-tailed data. However, the standard parameterizations used for Pareto-type distributions are poorly suited to modeling. In this note, we suggest new parameterizations that are better suited to the purpose. In addition, we propose regression models where the response variable is Pareto-type distributed using new parameterizations that are indexed by mean and precision parameters. The main motivation for these new parameterizations is the useful interpretation of the regression coefficients in terms of the mean and precision, as is usual in the context of regression models. Parameter estimation for these new models is based on maximum likelihood. Some numerical illustrations of the estimators are presented with a discussion of the obtained results. Finally, we illustrate the practicality of the new models by means of two applications to real data sets.


Introduction
The Pareto distribution was originally applied by Pareto [1] to model the unequal distribution of wealth. Although it was proposed long ago, many recent works still use this distribution; see, for instance, Wang and Li [2], Shrahili et al. [3] and Sharpe and Juárez [4], to name a few. The random variable Y has the Pareto distribution if its cumulative distribution function (CDF) for y ≥ β is given by
$$F(y; \alpha, \beta) = 1 - \left(\frac{\beta}{y}\right)^{\alpha}, \tag{1}$$
where β > 0 is a scale parameter and α > 0 is a shape parameter. The parameter β is only a scale factor, whereas α is known as the tail index. When this distribution is used to model the distribution of wealth, the parameter α is called the Pareto index. Here, this distribution is called the Pareto Type I distribution (Pareto [5]). The Lomax distribution (Lomax [6]), also called the Pareto Type II distribution, is a Pareto Type I distribution shifted so that its support begins at zero. Its CDF is of the form
$$F(y; \alpha, \beta) = 1 - \left(1 + \frac{y}{\beta}\right)^{-\alpha}, \quad y > 0. \tag{2}$$
There is a relation between the Pareto Type II distribution and the generalized Pareto distribution (GPD), which is widely used in the study of extreme values and peaks over thresholds. The CDF of the GPD is given by (Pickands [7])
$$F(y; \alpha, \beta) = 1 - \left(1 + \frac{\alpha y}{\beta}\right)^{-1/\alpha}.$$
Regression models are typically constructed to model the mean of a distribution. Despite the nice properties of Pareto-type distributions, none of their parameters corresponds to the expectation, which complicates the interpretation of regression models specified using these distributions. In this context, we propose new parameterizations of these distributions that are indexed by mean and precision parameters. Parameterizations of statistical models are not unique. In general, a particular parameterization is chosen for interpretation of the parameters and/or for computational convenience. The current manuscript falls into the first category (interpretation of the parameters).
The main advantage of our new parameterization is the straightforward interpretation of the regression coefficients in terms of the expectation of the positive response variable, as is usual in the context of generalized linear models.
The paper is organized as follows. In Section 2, we present new parameterizations of the Pareto-type distributions indexed by mean and precision parameters. Section 3 introduces the Pareto-type regression models with varying mean and precision. Furthermore, numerical results from Monte Carlo simulation experiments are presented and discussed. In Section 4, we provide applications to two real data sets. Concluding remarks and possible points for future research are given in Section 5.
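The three CDFs discussed in this section can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper: the function names are hypothetical, and the GPD is written with generic shape `xi` and scale `sigma` to avoid clashing with the Lomax symbols. It checks two standard facts numerically: the Lomax CDF is the Pareto Type I CDF shifted by β, and a Lomax(α, β) variable is a GPD with shape 1/α and scale β/α.

```python
def pareto1_cdf(y, alpha, beta):
    """Pareto Type I CDF: 1 - (beta / y)^alpha for y >= beta, and 0 otherwise."""
    if y < beta:
        return 0.0
    return 1.0 - (beta / y) ** alpha


def lomax_cdf(y, alpha, beta):
    """Lomax (Pareto Type II) CDF: 1 - (1 + y / beta)^(-alpha) for y > 0."""
    if y <= 0:
        return 0.0
    return 1.0 - (1.0 + y / beta) ** (-alpha)


def gpd_cdf(y, xi, sigma):
    """GPD CDF for shape xi > 0 and scale sigma > 0 (heavy-tailed case only)."""
    if y <= 0:
        return 0.0
    return 1.0 - (1.0 + xi * y / sigma) ** (-1.0 / xi)


alpha, beta = 3.0, 2.0
for y in (0.5, 1.0, 4.0):
    shifted_pareto = pareto1_cdf(y + beta, alpha, beta)  # shift support to start at zero
    as_gpd = gpd_cdf(y, 1.0 / alpha, beta / alpha)       # Lomax written as a GPD
    print(y, lomax_cdf(y, alpha, beta), shifted_pareto, as_gpd)
```

The three printed probabilities in each row should agree up to floating-point error.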

Pareto-Type Distributions with Alternative Parameterizations
In this section, we provide four reparameterizations for Pareto-type distributions indexed by mean and precision parameters.

Pareto Distribution
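The probability density function (PDF) of the Pareto model with CDF given in Equation (1) is
$$f(y; \alpha, \beta) = \frac{\alpha \beta^{\alpha}}{y^{\alpha+1}}, \quad y \geq \beta. \tag{3}$$
The mean and variance of the distribution are given by
$$E(Y) = \frac{\alpha\beta}{\alpha-1}, \ \alpha > 1, \quad \text{and} \quad \mathrm{Var}(Y) = \frac{\alpha\beta^{2}}{(\alpha-1)^{2}(\alpha-2)}, \ \alpha > 2, \tag{4}$$
respectively. In practice, α is frequently assumed to be larger than 2, so that the distribution has a finite variance.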
Using the proposed parameterization, the RPa density in (3) can be written as

Power Function Distribution
The power function model is a two-parameter distribution, which is the distribution of the reciprocal of a variable distributed according to the Pareto distribution, i.e., if X has a Pareto distribution with parameters α and β, then $Y = X^{-1}$ has a power function distribution with parameters α > 0 (shape parameter) and β > 0 (scale parameter).
We begin with the PDF of the power function distribution given by The mean and variance of Y are A new parameterization of the power function distribution is given by With this new parameterization, it follows from (7) that Hereafter, we consider the notation Y ∼ RPo(µ, φ) to specify that Y is a random variable following a reparameterized power function distribution, with mean µ and precision parameter φ > 0. We remark that this parameterization has not been proposed in the statistical literature.
Using this alternative parameterization, the PDF for the RPo distribution in (6) can be written as

Lomax Distribution
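The PDF associated with the shifted Pareto (Lomax) CDF in (2) is given by
$$f(y; \alpha, \beta) = \frac{\alpha}{\beta}\left(1 + \frac{y}{\beta}\right)^{-(\alpha+1)}, \quad y > 0. \tag{9}$$
The mean and the variance for (9) are given by
$$E(Y) = \frac{\beta}{\alpha-1}, \ \alpha > 1, \quad \text{and} \quad \mathrm{Var}(Y) = \frac{\alpha\beta^{2}}{(\alpha-1)^{2}(\alpha-2)}, \ \alpha > 2, \tag{10}$$
respectively. We consider an alternative parameterization of the Lomax distribution in terms of the mean and precision parameters, under which
$$E(Y) = \mu \quad \text{and} \quad \mathrm{Var}(Y) = \mu^{2}[\omega(\phi)]^{2},$$
where $\omega(\phi) = \sqrt{(\phi+2)/\phi}$ represents the coefficient of variation of Y, which depends only on φ > 0. Moreover, φ represents a precision parameter because, for a fixed µ and an increasing φ, the corresponding variance decreases. From now on, we use the notation Y ∼ RLo(µ, φ) to indicate that Y has a reparameterized Lomax distribution with mean µ > 0 and precision parameter φ > 0. We highlight that this parameterization has not previously been proposed in the statistical literature. Using the proposed parameterization, we can write the RLo PDF as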
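One mapping that reproduces a mean of µ and a coefficient of variation of sqrt((φ + 2)/φ) is α = φ + 2 and β = µ(φ + 1), applied to a Lomax density with shape α and scale β. This mapping is an assumption inferred from those two moments, not a formula taken from the text, and the function names below are hypothetical. The sketch evaluates the resulting density and confirms the moments from the closed-form Lomax mean and variance.

```python
import math


def rlo_to_lomax(mu, phi):
    """Map (mu, phi) to Lomax (alpha, beta).

    ASSUMPTION: alpha = phi + 2 and beta = mu * (phi + 1), chosen so that the
    mean is mu and the coefficient of variation is sqrt((phi + 2) / phi).
    """
    if mu <= 0 or phi <= 0:
        raise ValueError("mu and phi must be positive")
    return phi + 2.0, mu * (phi + 1.0)


def rlo_pdf(y, mu, phi):
    """Density of the reparameterized Lomax under the assumed mapping."""
    alpha, beta = rlo_to_lomax(mu, phi)
    if y <= 0:
        return 0.0
    return (alpha / beta) * (1.0 + y / beta) ** (-(alpha + 1.0))


def lomax_mean_var(alpha, beta):
    """Closed-form mean and variance of the Lomax distribution (alpha > 2)."""
    mean = beta / (alpha - 1.0)
    var = alpha * beta ** 2 / ((alpha - 1.0) ** 2 * (alpha - 2.0))
    return mean, var


mu, phi = 2.5, 4.0
mean, var = lomax_mean_var(*rlo_to_lomax(mu, phi))
cv = math.sqrt(var) / mean
print(mean, cv, math.sqrt((phi + 2.0) / phi))  # mean equals mu; both CV values agree
```

Because φ = α − 2 under this mapping, the restriction φ > 0 is the same as α > 2, which is the condition for a finite variance.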

Generalized Pareto Distribution
The GPD (Pickands [7]) is a two-parameter family of distributions, with PDF given by where β > 0 and α are the scale and shape parameters, respectively. For α > 0 the range of y is 0 < y < β/α and for α < 0 the range is y > 0. One of the interesting features of this distribution is its simple mathematical form. The mean and the variance associated with (12) are given by respectively.
A new insight into the GPD can be obtained by performing a reparameterization on the random variable X, whose PDF is given in (12). Consider the parameterization With this alternative parameterization, it follows from (13) that Hereafter, we use Y ∼ RGPD(µ, φ) to say that Y follows a reparameterized GPD distribution with mean µ > 0 and precision φ > 0. We also note that this parameterization is already known in the statistical literature (see, for instance, Bourguignon and do Nascimento [8]). Thus, the RGPD PDF in equation (12) can be written as

Other Models Parameterized in Terms of the Mean and Precision Parameters
For the RPa and RLo models, we considered the restriction α > 2 and for the RGPD model we take into account the restriction α < 1/2, in order to guarantee the existence of the mean and variance. In principle, this can be a disadvantage because the class of models that we are considering is smaller than the original proposals. In return, we obtain models where, under some conditions, the coefficients can be interpreted in a very useful way, as we will see in Section 3.
In the literature, there are many models with positive support parameterized in terms of the mean and with a variance that is quadratic in the mean, i.e., the mean and variance of the model are given by µ and $\mu^{2}[\omega(\phi)]^{2}$, respectively, where µ, φ > 0 and ω(·) is a positive function. To name a few examples, we refer to the reparameterized gamma and Weibull models, both available through the GA and WEI3 functions in the gamlss.dist [9] package of the software R [10]; the reparameterized Birnbaum-Saunders model [11]; and the reparameterized slash half-normal distribution [12], among others. The recommendation is to fit several such models and, based on model selection criteria such as the Akaike [13] (AIC) and Schwarz [14] (BIC) criteria, choose a model and validate it using some kind of residuals. For instance, we suggest the quantile residuals (QR) discussed in Dunn and Smyth [15].
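The selection and validation steps mentioned above can be sketched as follows, using the usual definitions AIC = 2k − 2ℓ and BIC = k log(n) − 2ℓ, and the quantile residual Φ⁻¹(F(y)) for a continuous response. The function names and the log-likelihood values are hypothetical placeholders for illustration only, not results from the paper.

```python
import math
from statistics import NormalDist


def aic(loglik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Schwarz (Bayesian) information criterion for n observations."""
    return k * math.log(n) - 2.0 * loglik


def quantile_residual(y, cdf):
    """Quantile residual: standard normal quantile of the fitted CDF at y."""
    u = cdf(y)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return NormalDist().inv_cdf(u)


# Hypothetical maximized log-likelihoods for two candidate models, n = 86, k = 4.
candidates = {"model_A": -100.0, "model_B": -104.5}
for name, ll in candidates.items():
    print(name, aic(ll, 4), round(bic(ll, 4, 86), 3))

# Under a correctly specified model, quantile residuals behave like N(0, 1) draws.
unit_exponential_cdf = lambda y: 1.0 - math.exp(-y)  # stand-in for a fitted CDF
print(quantile_residual(math.log(2.0), unit_exponential_cdf))  # median maps to ~0
```

Lower AIC and BIC values indicate the preferred model. The residuals can then be inspected with a normal Q-Q plot.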

Modelling and Inference
The main advantage of reparameterizing models in terms of the mean is that the coefficients become interpretable when a regression structure is incorporated into the mean. For this, let $Y_1, \ldots, Y_n$ be n independent random variables, where each $Y_i$, i = 1, ..., n, follows the PDF given in Equation (5), (8), (11) or (14), depending on whether we are interested in the RPa, RPo, RLo or RGPD model, with mean $\mu_i$ and precision parameter $\phi_i$. In order to introduce a regression structure in the Pareto-type models, we assume that
$$g_1(\mu_i) = \eta_{1i} = x_i^{\top}\beta \quad \text{and} \quad g_2(\phi_i) = \eta_{2i} = z_i^{\top}\nu, \tag{15}$$
where $\beta = (\beta_1, \ldots, \beta_p)^{\top}$ and $\nu = (\nu_1, \ldots, \nu_q)^{\top}$ are vectors of unknown regression coefficients, which we assume to be functionally independent, $\beta \in \mathbb{R}^p$ and $\nu \in \mathbb{R}^q$, with p + q < n, $\eta_{1i}$ and $\eta_{2i}$ are the linear predictors, and $x_i = (x_{i1}, \ldots, x_{ip})^{\top}$ and $z_i = (z_{i1}, \ldots, z_{iq})^{\top}$ are vectors of p and q known regressors, respectively, for i = 1, ..., n. Additionally, we assume that the ranks of $X = (x_1, \ldots, x_n)^{\top}$ and $Z = (z_1, \ldots, z_n)^{\top}$ are p and q, respectively. The link functions $g_1 : \mathbb{R}^{+} \to \mathbb{R}$ and $g_2 : \mathbb{R}^{+} \to \mathbb{R}$ in (15) must be strictly monotone and at least twice differentiable, such that $\mu_i = g_1^{-1}(x_i^{\top}\beta)$ and $\phi_i = g_2^{-1}(z_i^{\top}\nu)$, with $g_1^{-1}(\cdot)$ and $g_2^{-1}(\cdot)$ being the inverse functions of $g_1(\cdot)$ and $g_2(\cdot)$, respectively. For the logarithmic link $g_1(u) = \log(u)$, so that $\mu_i = \exp(x_i^{\top}\beta)$, the components of β are interpreted as follows:
• $\exp(\beta_0)$, where $\beta_0$ denotes the intercept when the model includes one, represents the mean of the response variable when all the covariates are equal to 0. Of course, this interpretation is valid only as long as it makes sense.
• $\exp(\beta_j)$ for a non-intercept coefficient $\beta_j$ represents the multiplicative change in the mean, that is, a change of $100[\exp(\beta_j) - 1]\%$, when the j-th covariate increases by one unit and the others are held fixed.
The log-likelihood function is given by
$$\ell(\beta, \nu) = \sum_{i=1}^{n} \log f(y_i; \mu_i, \phi_i),$$
where $f(\cdot\,; \mu_i, \phi_i)$ is the PDF in (5), (8), (11) or (14), depending on the reparameterized model to be used. The maximum likelihood (ML) estimators of β and ν, say $\hat{\beta}$ and $\hat{\nu}$, respectively, can be obtained by solving simultaneously the nonlinear system of equations $\partial \ell(\beta, \nu)/\partial \beta = 0$ and $\partial \ell(\beta, \nu)/\partial \nu = 0$. However, no closed-form expressions for the ML estimates are obtained, except in the Pareto and Lomax models in the case where $x_i = z_i = 1$, for i = 1, ..., n, i.e., when only the intercept term is included in both sets of covariates. Therefore, we must use an iterative method for nonlinear optimization.
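As a minimal sketch of the objective that such an iterative method would maximize, the code below evaluates the log-likelihood of an RLo regression. It rests on three labeled assumptions that the text does not confirm: the mapping α = φ + 2 and β_scale = µ(φ + 1) between (µ, φ) and the Lomax shape and scale, a log link for the mean, and a log link for the precision. The function names are hypothetical.

```python
import math


def rlo_logpdf(y, mu, phi):
    """Log-density of the reparameterized Lomax.

    ASSUMPTION: shape = phi + 2 and scale = mu * (phi + 1).
    """
    shape = phi + 2.0
    scale = mu * (phi + 1.0)
    return math.log(shape) - math.log(scale) - (shape + 1.0) * math.log1p(y / scale)


def loglik(beta_coef, nu_coef, y, X, Z):
    """Log-likelihood with log links (assumed) for both mu_i and phi_i.

    X and Z are lists of covariate rows, including a leading 1 for the intercept.
    """
    total = 0.0
    for y_i, x_i, z_i in zip(y, X, Z):
        mu_i = math.exp(sum(b * x for b, x in zip(beta_coef, x_i)))
        phi_i = math.exp(sum(v * z for v, z in zip(nu_coef, z_i)))
        total += rlo_logpdf(y_i, mu_i, phi_i)
    return total


# Tiny worked case: one observation, all coefficients zero, so mu = phi = 1.
y = [1.0]
X = [[1.0, 0.3]]
Z = [[1.0, 0.3]]
print(loglik([0.0, 0.0], [0.0, 0.0], y, X, Z), math.log(8.0 / 27.0))
```

In practice the negative of `loglik` would be handed to a general-purpose optimizer, such as SciPy's `scipy.optimize.minimize`, with standard errors taken from the inverse of the observed information matrix.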

A Simulation Study
Here, we present a simulation study. For this, we consider the RLo model and use the same covariate to model the parameters µ and φ. The covariate values were drawn from the uniform distribution. We considered two combinations of parameters: scenario 1, with $\beta_0 = 1.5$, $\beta_1 = 0.8$, $\nu_0 = -1.2$ and $\nu_1 = 0.7$; and scenario 2, with $\beta_0 = -2.1$, $\beta_1 = 1.3$, $\nu_0 = 0.8$ and $\nu_1 = 1.2$. We also considered three sample sizes: 50, 100, and 200. For each combination of parameters and sample size, we drew 1000 replicates and computed the maximum likelihood estimates and their respective estimated standard errors. Table 1 summarizes the mean bias (bias), the mean of the estimated standard errors (se), and the respective 95% coverage probabilities (cp). The results suggest that the bias is acceptable in all cases, and that both the bias and the se decrease as the sample size increases. Additionally, the coverage probabilities move closer to the nominal value as n increases. These results suggest that the estimators for the RLo model are consistent.
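The data-generating step of such a study can be sketched with inverse-CDF sampling. The sketch assumes that the covariate is uniform on (0, 1), that both links are logarithmic, and that (µ, φ) map to the Lomax shape and scale as α = φ + 2 and β = µ(φ + 1). None of these details is stated in the text, and the function names are hypothetical. Fitting each replicate and summarizing bias, se and cp is omitted.

```python
import math
import random


def rlo_quantile(u, mu, phi):
    """Inverse CDF of the reparameterized Lomax.

    ASSUMPTION: shape = phi + 2 and scale = mu * (phi + 1).
    """
    shape = phi + 2.0
    scale = mu * (phi + 1.0)
    return scale * ((1.0 - u) ** (-1.0 / shape) - 1.0)


def simulate_rlo_regression(n, beta_coef, nu_coef, rng):
    """Draw n pairs (x, y) with log links (assumed) for both mu and phi."""
    data = []
    for _ in range(n):
        x = rng.random()  # covariate, assumed uniform on (0, 1)
        mu = math.exp(beta_coef[0] + beta_coef[1] * x)
        phi = math.exp(nu_coef[0] + nu_coef[1] * x)
        y = rlo_quantile(rng.random(), mu, phi)  # inverse-transform sampling
        data.append((x, y))
    return data


rng = random.Random(2024)  # fixed seed so the replicate is reproducible
# Scenario 1 coefficients from the text, with n = 200.
sample = simulate_rlo_regression(200, (1.5, 0.8), (-1.2, 0.7), rng)
print(len(sample), all(y >= 0.0 for _, y in sample))
```

A full Monte Carlo experiment would repeat this draw 1000 times for each scenario and sample size, maximizing the log-likelihood on every replicate.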

Real-World Data Analysis
In this section, we present two applications of the proposed models using real data for illustrative purposes.

Lomax Regression Model
This data set was obtained from the Department of Mining of the University of Atacama, Chile, to study the concentration of some ores in the soil. The data set corresponds to 86 measurements of the concentrations of zinc (Zn) and uranium (U), both in parts per million (ppm). We consider a regression model to explain the quantity of Zn in terms of the quantity of U. For this, we considered that
For comparative purposes, we also considered $y_i \sim \mathrm{RGa}(\mu_i, \phi_i)$, a reparameterized version of the gamma distribution such that $E(Y_i) = \mu_i$ and $\mathrm{Var}(Y_i) = \mu_i^{2}\phi_i^{2}$, a structure for the mean and variance very similar to that of the RLo model, where $\mu_i$ and $\phi_i$ are defined based on the regression structure given in (16). Table 2 shows the results for these models. Note that the RLo model presents lower AIC and BIC values, suggesting the use of the RLo model instead of the RGa model for this particular problem. Additionally, as $\exp(\hat{\beta}_1) = 0.974$, the mean of Zn decreases by 2.6% (95% confidence interval 1.5-3.6%) for each increase of 1 ppm in U. Finally, Figure 1 shows the estimated mean of Zn for each value of U.
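The percentage interpretation of the slope follows from the log link: a one-unit increase in the covariate multiplies the mean by exp(β₁), which is a change of 100[exp(β₁) − 1]%. The helper names below are hypothetical. The only number taken from the text is exp(β̂₁) = 0.974.

```python
import math


def percent_change(exp_coef):
    """Percentage change in the mean for a one-unit covariate increase."""
    return 100.0 * (exp_coef - 1.0)


def mean_ratio(coef, delta):
    """Ratio of means when the covariate increases by delta under a log link."""
    return math.exp(coef * delta)


exp_beta1 = 0.974  # reported value of exp(beta_1 hat)
print(round(percent_change(exp_beta1), 1))  # -2.6, i.e. a 2.6% decrease per ppm of U

beta1 = math.log(exp_beta1)
print(mean_ratio(beta1, 10.0))  # multiplicative effect of a 10 ppm increase in U
```

The same transformation applied to the endpoints of a confidence interval for β₁ gives the interval for the percentage change.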

Pareto Regression Model
This data set was presented in Gunst and Mason (1980). The data set gives different measures of air pollution, together with environmental, demographic and socioeconomic variables, for 60 Standard Metropolitan Statistical Areas of the United States. We considered the percentage of families with income under $3000 (poor) and the relative pollution potential of hydrocarbons (hc). We consider a regression model to explain hc in terms of poor,
For comparative purposes, we also considered $y_i \sim \mathrm{RWe}(\mu_i, \phi_i)$, a reparameterized version of the Weibull distribution such that $E(Y_i) = \mu_i$ and $\mathrm{Var}(Y_i) = \mu_i^{2}\{\Gamma(2/\phi_i + 1)/\Gamma^{2}(1/\phi_i + 1) - 1\}$, again a structure for the mean and variance very similar to that of the RPa model, where $\mu_i$ and $\phi_i$ are defined based on the regression structure given in (17). Table 3 shows the results for this case. Note that the RPa model presents lower AIC and BIC values, suggesting the use of the RPa model instead of the RWe model for this particular problem. Finally, Figure 2 shows the estimated mean of hc for each value of poor.

Concluding Remarks and Discussion
In this work, we studied new parameterizations for the Pareto-type distributions in terms of mean and precision parameters. Furthermore, we proposed regression models where the response variable is Pareto-type distributed using these new parameterizations. The model parameters are estimated by the maximum likelihood method. A Monte Carlo simulation study shows that the maximum likelihood estimators behave reasonably well in finite samples. Finally, the usefulness of the proposed methodology is shown through two applications.