Comparing Estimation Methods for the Power–Pareto Distribution

Caeiro, Frederico; Norouzirad, Mina

doi:10.3390/econometrics12030020


by Frederico Caeiro ¹,* and Mina Norouzirad ²

¹ Department of Mathematics and Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology (NOVA FCT), 2829-516 Caparica, Portugal

² Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology (NOVA FCT), 2829-516 Caparica, Portugal

* Author to whom correspondence should be addressed.

Econometrics 2024, 12(3), 20; https://doi.org/10.3390/econometrics12030020

Submission received: 14 May 2024 / Revised: 21 June 2024 / Accepted: 1 July 2024 / Published: 11 July 2024


Abstract

Non-negative distributions are important tools in various fields. Given the importance of achieving a good fit, the literature offers hundreds of different models, from the very simple to the highly flexible. In this paper, we consider the power–Pareto model, which is defined by its quantile function. This distribution has three parameters, allowing the model to take different shapes, including symmetrical and left- and right-skewed. We provide different distributional characteristics and discuss parameter estimation. In addition to the already-known Maximum Likelihood and Least Squares of the logarithm of the order statistics estimation methods, we propose several additional methods. A simulation study and an application to two datasets are conducted to illustrate the performance of the estimation methods.

Keywords: parameter estimation; power–Pareto distribution; quantile function

1. Introduction

Univariate continuous distributions play a crucial role in modeling real-world phenomena. While well-known distributions like the normal, exponential, and Pareto distributions are commonly used, there is often a need for specialized distributions, to model specific data patterns. One established practice for defining more flexible distributions is through the quantile function (QF)

Q (p) = inf {x : F (x) \geq p}, 0 \leq p \leq 1,

where

F (x) = P (X \leq x)

represents the cumulative distribution function (CDF) of the random variable X. Thus, if F is strictly increasing, Q and F are inverse functions of each other, and

F (Q (p)) = p

. QFs possess numerous distinct characteristics that are absent in CDFs. We highlight that new and more flexible QFs can be easily constructed from the combination of existing QFs. For example, the product of QFs is still a valid QF.

Since the QF provides all valuable information about the distribution’s shape, several QF models have been proposed in the literature. The symmetric Tukey lambda distribution (Tukey 1960) and its asymmetric version, known as the generalized lambda distribution (Ramberg and Schmeiser 1972), are both defined in terms of their QF. Similarly, the quantile-based skew logistic distribution introduced by Gilchrist (2000) is also defined through its QF. More recently, Sankaran et al. (2016) introduced a new QF resulting from the sum of the QFs of the generalized Pareto and Weibull distributions.

In some cases, the density and distribution functions for distributions expressed through QFs are not available in closed form, except for specific parameter values. However, those functions can be easily computed by numerically inverting the corresponding QF. One significant advantage of these distributions is the simplicity of their QF, which facilitates the generation of random values through the use of uniform random variables and the application of inference procedures based on quantiles.

In this article, we are interested in the power–Pareto distribution introduced in Gilchrist (2000) and further studied in Hankin and Lee (2006). This is a versatile family of distributions for a non-negative random variable, such as income and wealth. This model is formed through the product of the power and Pareto QFs as

Q (p ∣ c, λ_{1}, λ_{2}) = c p^{λ_{1}} {(1 - p)}^{- λ_{2}}, 0 \leq p \leq 1,

(1)

where

c > 0

,

min (λ_{1}, λ_{2}) \geq 0

, and

max (λ_{1}, λ_{2}) > 0

. We write

X \sim P P (c, λ_{1}, λ_{2})

whenever X has the QF in Equation (1). The parameter c is related to the scale, while

λ_{1}

and

λ_{2}

control the shape of the distribution. By fixing or restricting some of this distribution’s parameters, we obtain well-known reduced versions. More precisely, if

λ_{1} = 0

and

λ_{2} > 0

, then X follows a Pareto (type I) distribution with the QF

Q (p) = c {(1 - p)}^{- λ_{2}}, 0 \leq p \leq 1,

and if

λ_{1} > 0

and

λ_{2} = 0

, X has a scaled power distribution with the QF

Q (p) = c p^{λ_{1}}, 0 \leq p \leq 1 .

Furthermore, it can be observed that when

λ_{1} = λ_{2} > 0

, X has the well-known log-logistic distribution, which is a special case of Burr (1942) type XII and Dagum (1977) family of distributions (for further details, see Caeiro and Mateus 2024). The case

λ_{1} = λ_{2} = 0

is not considered here, as it results in a degenerate distribution at c. In the literature, the power–Pareto model in Equation (1) is also known as the Davies distribution (Hankin and Lee 2006) or Hankin–Lee distribution (Nair and Vineshkumar 2010). Hankin and Lee (2006) proposed two inference procedures to estimate the parameters c,

λ_{1}

, and

λ_{2}

in Equation (1), namely maximum likelihood and least squares on the logged order statistics. The authors also compared the efficiency of those two estimation methods through their variances. Since maximum likelihood estimators can be biased for small sample sizes, we argue that considering only the variance of the estimators may not provide a comprehensive assessment of their performance and could lead to misleading conclusions. Therefore, the primary goal of this paper is to discuss a broader set of estimation techniques and to consider additional criteria, accounting for bias as well as variance, for a fairer comparison of the estimators.
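As a small illustration of Equation (1) and its reduced versions, the following Python sketch evaluates the QF and checks numerically that the Pareto, power, and log-logistic CDFs invert it in the corresponding special cases. The function names and parameter values are illustrative assumptions and are not taken from the original study.

```python
def pp_quantile(p, c, lam1, lam2):
    """Power-Pareto quantile function of Equation (1), for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return c * p ** lam1 * (1.0 - p) ** (-lam2)


def pareto_cdf(x, c, lam2):
    """Pareto type I CDF with scale c and tail index 1 / lam2 (case lam1 = 0)."""
    return 1.0 - (x / c) ** (-1.0 / lam2)


def power_cdf(x, c, lam1):
    """Scaled power CDF on (0, c) with exponent 1 / lam1 (case lam2 = 0)."""
    return (x / c) ** (1.0 / lam1)


def loglogistic_cdf(x, c, lam):
    """Log-logistic CDF with scale c and shape 1 / lam (case lam1 = lam2 = lam)."""
    return 1.0 / (1.0 + (x / c) ** (-1.0 / lam))


p = 0.3  # arbitrary probability level used for the check
checks = {
    "pareto": pareto_cdf(pp_quantile(p, 2.0, 0.0, 0.5), 2.0, 0.5),
    "power": power_cdf(pp_quantile(p, 2.0, 0.5, 0.0), 2.0, 0.5),
    "log-logistic": loglogistic_cdf(pp_quantile(p, 2.0, 0.5, 0.5), 2.0, 0.5),
}
# Each entry of `checks` should reproduce p, since F(Q(p)) = p.
```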

The remainder of the paper is organized as follows. In Section 2, we describe various known properties of the power–Pareto model, like probability density and distribution functions, moments, and quantile-based measures. Several inferential procedures for the parameters of the power–Pareto distribution are discussed in Section 3. In Section 4, we conduct Monte Carlo simulations to analyze the performance of the different inferential procedures. In Section 5, we apply the inferential methods to two real datasets, and Section 6 concludes the article.

2. Statistical Properties of the Power–Pareto Distribution

2.1. Functions

From now on, we use

θ = (c, λ_{1}, λ_{2})

to denote the three parameters of the power–Pareto model. The derivative of

Q (p ∣ θ)

, denoted as

q (p ∣ θ) = \partial Q (p ∣ θ) / \partial p

, is known as the quantile density function. For the model in Equation (1), this function is given by

q (p ∣ θ) = Q (p ∣ θ) (\frac{λ_{1}}{p} + \frac{λ_{2}}{1 - p}), 0 \leq p \leq 1 .

(2)

Note that the quantile density function in Equation (2) satisfies the identity

f (Q (p ∣ θ)) q (p ∣ θ) = 1,

where

f (\cdot)

is the probability density function.

If we exclude the cases where the power–Pareto reduces to the power, the Pareto, or the log-logistic distributions, neither the distribution function nor the density function can be expressed in closed form. Thus, these functions have to be computed through numerical inversion of the QF. Suppose that

u = u (x ∣ θ)

is the solution of the equation

x = Q (u ∣ θ)

. Then, the CDF can be expressed as

F (x ∣ θ) = u

, and the density function can be derived from the inverse function rule as

\begin{matrix} f (x ∣ θ) & = & \frac{\partial F (x ∣ θ)}{\partial x} \\  & = & {(\frac{\partial Q (u ∣ θ)}{\partial u})}^{- 1} \\  & = & {(Q (u ∣ θ) (\frac{λ_{1}}{u} + \frac{λ_{2}}{1 - u}))}^{- 1} . \end{matrix}

(3)

The density at the left tail can be approximated by

f (x ∣ θ) \sim \frac{1}{c λ_{1}} {(\frac{x}{c})}^{\frac{1}{λ_{1}} - 1};

(4)

Similarly, the right tail density can be approximated by

f (x ∣ θ) \sim \frac{1}{c λ_{2}} {(\frac{c}{x})}^{\frac{1}{λ_{2}} + 1} .

(5)

In addition, we have

1 - F (x ∣ θ) \sim {(\frac{x}{c})}^{- α},

(6)

for large x, where

α = 1 / λ_{2}

is the upper tail index (Finkelstein et al. 2006; Schluter 2018). Hence, the power–Pareto model belongs to the class of heavy-tailed distributions. In numerous applications, it is crucial to estimate accurately the tail index

α

in Equation (6). We refer the reader to Beirlant et al. (2012, 2004); Mehta and Yang (2022); Ndlovu and Chikobvu (2023); Reiss and Thomas (2007), among others. As noted in Hankin and Lee (2006), Equations (4) and (5) show that

λ_{1}

controls the behavior of the left-hand tail, while

λ_{2}

governs the right-hand tail. A larger value of

λ_{1}

results in a shorter left tail, whereas a larger value of

λ_{2}

leads to a longer right tail. This relationship is illustrated in Figure 1, where different parameter values are used to depict the probability density function.
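The numerical inversion behind the CDF and Equation (3) can be sketched as follows. This is a minimal illustration that uses plain bisection as the root finder, which is an assumption made here for simplicity; the function names and the tolerance are likewise hypothetical.

```python
def pp_quantile(u, c, lam1, lam2):
    """Power-Pareto QF of Equation (1), evaluated for 0 < u < 1."""
    return c * u ** lam1 * (1.0 - u) ** (-lam2)


def pp_cdf(x, c, lam1, lam2, tol=1e-13):
    """Solve x = Q(u) for u by bisection; Q is increasing in u on (0, 1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pp_quantile(mid, c, lam1, lam2) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pp_pdf(x, c, lam1, lam2):
    """Density from Equation (3): reciprocal of the quantile density at u = F(x)."""
    u = pp_cdf(x, c, lam1, lam2)
    return 1.0 / (pp_quantile(u, c, lam1, lam2) * (lam1 / u + lam2 / (1.0 - u)))


# Log-logistic special case (c = 1, lam1 = lam2 = 0.5), where F(x) = x^2 / (1 + x^2)
# and f(x) = 2x / (1 + x^2)^2 are available in closed form for comparison.
cdf_at_two = pp_cdf(2.0, 1.0, 0.5, 0.5)
pdf_at_two = pp_pdf(2.0, 1.0, 0.5, 0.5)
```

In practice a faster root finder could replace the bisection loop; bisection is used here only because monotonicity of Q makes it simple and safe.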

2.2. Moments

The k-th moment can be expressed in an explicit form as follows:

E (X^{k}) = \int_{0}^{1} {(Q (p ∣ θ))}^{k} d p = c^{k} B (1 + k λ_{1}, 1 - k λ_{2}), λ_{2} < \frac{1}{k},

(7)

where

B (a, b) = \int_{0}^{1} x^{a - 1} {(1 - x)}^{b - 1} d x

, with

a > 0

and

b > 0

, represents the Beta function. Using the notation

b (k, λ_{1}, λ_{2}) = B (1 + k λ_{1}, 1 - k λ_{2})

, the mean (

μ

) and the variance (

σ^{2}

) are

\begin{matrix} μ & = & c b (1, λ_{1}, λ_{2}), \\ σ^{2} & = & c^{2} (b (2, λ_{1}, λ_{2}) - b^{2} (1, λ_{1}, λ_{2})), \end{matrix}

and exist if

λ_{2} < 1

and

λ_{2} < \frac{1}{2}

, respectively. Some other measures, like the coefficient of variation (CV), Pearson’s skewness (

S_{p}

), and kurtosis (

K_{p}

) can also be easily obtained in explicit forms,

\begin{matrix} CV & = & \frac{\sqrt{b (2, λ_{1}, λ_{2}) - b^{2} (1, λ_{1}, λ_{2})}}{b (1, λ_{1}, λ_{2})}, λ_{2} < \frac{1}{2}, \\ S_{p} & = & \frac{b (3, λ_{1}, λ_{2}) - 3 b (1, λ_{1}, λ_{2}) b (2, λ_{1}, λ_{2}) + 2 b^{3} (1, λ_{1}, λ_{2})}{{(b (2, λ_{1}, λ_{2}) - b^{2} (1, λ_{1}, λ_{2}))}^{3 / 2}}, λ_{2} < \frac{1}{3}, \\ K_{p} & = & \frac{b (4, λ_{1}, λ_{2}) - 4 b (1, λ_{1}, λ_{2}) b (3, λ_{1}, λ_{2}) + 6 b^{2} (1, λ_{1}, λ_{2}) b (2, λ_{1}, λ_{2}) - 3 b^{4} (1, λ_{1}, λ_{2})}{{(b (2, λ_{1}, λ_{2}) - b^{2} (1, λ_{1}, λ_{2}))}^{2}}, λ_{2} < \frac{1}{4} . \end{matrix}
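The moment formulas can be evaluated directly from Equation (7). The sketch below is illustrative only: the function names are assumptions, the Beta function is computed through log-gamma values, and the coefficient of variation is taken as the standard ratio σ/μ.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b) for a > 0 and b > 0, via log-gamma for stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def pp_raw_moment(k, c, lam1, lam2):
    """k-th raw moment of Equation (7); it exists only when lam2 < 1 / k."""
    if k * lam2 >= 1.0:
        raise ValueError("moment of order k requires lam2 < 1 / k")
    return c ** k * beta_fn(1.0 + k * lam1, 1.0 - k * lam2)


def pp_mean_var_cv(c, lam1, lam2):
    """Mean, variance, and coefficient of variation (sigma / mu); needs lam2 < 1/2."""
    m1 = pp_raw_moment(1, c, lam1, lam2)
    m2 = pp_raw_moment(2, c, lam1, lam2)
    var = m2 - m1 ** 2
    return m1, var, math.sqrt(var) / m1


# Pareto special case (c = 1, lam1 = 0, lam2 = 0.25, i.e., tail index 4),
# whose mean 4/3 and variance 2/9 are known in closed form.
mean, var, cv = pp_mean_var_cv(1.0, 0.0, 0.25)
```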

2.3. Quantile Measures

Quantile-based measures of distributional characteristics, including location, dispersion, skewness, and kurtosis, exhibit less sensitivity to outliers when compared to conventional moments. For the power–Pareto distribution, the median (M) and the interquartile range (IQR) are, respectively, given by

\begin{matrix} M & = & Q (1 / 2 ∣ θ) = c 2^{λ_{2} - λ_{1}}, \\ IQR & = & Q (3 / 4 ∣ θ) - Q (1 / 4 ∣ θ) = c 4^{λ_{2} - λ_{1}} (3^{λ_{1}} - 3^{- λ_{2}}) . \end{matrix}

The asymmetry and peakedness of the distribution can be analyzed using Bowley (1901) Skewness (

S_{B}

) and Moors (1988) Kurtosis (

K_{M}

) quantile-based coefficients,

\begin{matrix} S_{B} & = & \frac{Q (3 / 4 ∣ θ) - 2 Q (1 / 2 ∣ θ) + Q (1 / 4 ∣ θ)}{IQR} \\  & = & \frac{3^{λ_{1}} - 2^{1 + λ_{1} - λ_{2}} + 3^{- λ_{2}}}{3^{λ_{1}} - 3^{- λ_{2}}}, \end{matrix}

and

\begin{matrix} K_{M} & = & \frac{Q (7 / 8 ∣ θ) - Q (5 / 8 ∣ θ) + Q (3 / 8 ∣ θ) - Q (1 / 8 ∣ θ)}{IQR} \\  & = & \frac{2^{λ_{2} - λ_{1}} (7^{λ_{1}} - 5^{λ_{1}} 3^{- λ_{2}} + 3^{λ_{1}} 5^{- λ_{2}} - 7^{- λ_{2}})}{3^{λ_{1}} - 3^{- λ_{2}}} . \end{matrix}

Unlike the moment-based measures, which exist only for sufficiently small values of the parameter λ_{2}, all the aforementioned quantile-based measures exist over the complete parameter space.
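Because these measures only require evaluating the QF at a few fixed probabilities, they are simple to compute. The following sketch, with hypothetical function names and an arbitrary parameter choice, obtains them directly from Equation (1); the results can be compared with the closed-form expressions above.

```python
def pp_quantile(p, c, lam1, lam2):
    """Power-Pareto QF of Equation (1), for 0 < p < 1."""
    return c * p ** lam1 * (1.0 - p) ** (-lam2)


def quantile_measures(c, lam1, lam2):
    """Median, IQR, Bowley skewness, and Moors kurtosis computed from the QF."""
    def q(p):
        return pp_quantile(p, c, lam1, lam2)

    median = q(0.5)
    iqr = q(0.75) - q(0.25)
    bowley = (q(0.75) - 2.0 * q(0.5) + q(0.25)) / iqr
    moors = (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / iqr
    return median, iqr, bowley, moors


# Arbitrary illustrative parameters; lam2 = 0.9 is a case where the variance
# does not exist, yet all four quantile-based measures are finite.
median, iqr, bowley, moors = quantile_measures(2.0, 0.4, 0.9)
```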

2.4. Order Statistics

Let

X_{1}, X_{2}, \dots, X_{n}

be a random sample of size n from a population with the QF defined in Equation (1), and let

X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(n)}

be the corresponding ascending order statistics. Order statistics play a crucial role in statistical inference due to their ability to provide valuable insights into the distribution of X, as well as in estimation procedures for parameters of the model. The density function of

X_{(i)}

is

f_{(i)} (x) = \frac{1}{B (i, n - i + 1)} {(F (x))}^{i - 1} {(1 - F (x))}^{n - i} f (x) .

Note that

f_{(i)} (x)

does not have a closed form, since neither the CDF nor the density function can be expressed in closed form. However, the single moments of the order statistics,

μ_{(i)} = E (X_{(i)})

, can be easily obtained from the corresponding QF in Equation (1). For the class of distributions in Equation (1),

μ_{(i)}

, can be expressed as follows:

\begin{matrix} μ_{(i)} & = & \frac{1}{B (i, n - i + 1)} \int_{0}^{1} Q (p ∣ θ) p^{i - 1} {(1 - p)}^{n - i} d p \\  & = & c \frac{B (i + λ_{1}, n - i + 1 - λ_{2})}{B (i, n - i + 1)}, n - i + 1 - λ_{2} > 0 . \end{matrix}

(8)

Thus, as explicit formulas for moments of order statistics exist, several mathematical quantities associated with order statistics can be derived from Equation (8).
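Equation (8) can be evaluated with log-gamma functions to avoid overflow in the Beta ratio for larger n. The sketch below is an illustration under that implementation choice; the function name is an assumption.

```python
import math


def expected_order_stat(i, n, c, lam1, lam2):
    """E(X_(i)) from Equation (8); requires n - i + 1 - lam2 > 0."""
    if n - i + 1 - lam2 <= 0:
        raise ValueError("E(X_(i)) does not exist: need n - i + 1 - lam2 > 0")
    # log of B(i + lam1, n - i + 1 - lam2)
    log_num = (math.lgamma(i + lam1) + math.lgamma(n - i + 1 - lam2)
               - math.lgamma(n + 1 + lam1 - lam2))
    # log of B(i, n - i + 1)
    log_den = math.lgamma(i) + math.lgamma(n - i + 1) - math.lgamma(n + 1)
    return c * math.exp(log_num - log_den)


# Sanity check: c = 1, lam1 = 1, lam2 = 0 gives Q(p) = p (standard uniform),
# for which E(U_(i)) = i / (n + 1).
uniform_means = [expected_order_stat(i, 5, 1.0, 1.0, 0.0) for i in range(1, 6)]
```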

Additional properties can be found in Giorgi and Nadarajah (2010); Nair et al. (2013); Sunoj and Sankaran (2012).

3. Estimation Methods for the Power–Pareto Distribution

In this section, we discuss the parameter estimation methods employed in this paper. For the estimation of the parameters of the aforementioned reduced versions of the power–Pareto model, we refer to Bhatti et al. (2018); Caeiro et al. (2015); Caeiro and Mateus (2023); Lu and Tao (2007); Mateus and Caeiro (2022); Rytgaard (1990); Shakeel et al. (2016); Zaka et al. (2013). Concerning the three-parameter power–Pareto model in Equation (1), Hankin and Lee (2006) proposed estimating the parameters by two methods: maximum likelihood and least squares on the logged order statistics. The variance–covariance matrices of the estimators from those two methods are also provided in Hankin and Lee (2006). The maximum likelihood estimators possess desirable asymptotic properties. However, for small samples, this method may be less efficient than other estimation methods. Therefore, in this paper, we consider not only the estimation methods in Hankin and Lee (2006), but also new estimation methods. In the following, let

x_{1}

,

x_{2}

, …,

x_{n}

represent a sample of size n, from the power–Pareto distribution with all three parameters assumed unknown.

3.1. Maximum Likelihood (ML)

The maximum likelihood (ML) estimators of the three parameters are obtained by solving an optimization problem, which involves maximizing the likelihood function, or equivalently, minimizing the negative log-likelihood function. This can be expressed as follows:

{\hat{θ}}^{ML} = \underset{θ}{argmin} \{\sum_{i = 1}^{n} (log (Q (u_{i} ∣ θ)) + log (\frac{λ_{1}}{u_{i}} + \frac{λ_{2}}{1 - u_{i}}))\},

(9)

where

u_{i}

represents the solution of the equation

x_{i} = Q (u_{i} ∣ θ)

. Here,

{\hat{θ}}^{ML} = ({\hat{c}}^{ML}, {\hat{λ}}_{1}^{ML}, {\hat{λ}}_{2}^{ML})

denotes the ML estimate of

θ = (c, λ_{1}, λ_{2})

.

While the ML method yields estimators that are asymptotically unbiased and efficient for large sample sizes, the lack of a closed-form expression for the probability density function requires

{\hat{θ}}^{ML}

to be obtained through a three-dimensional numerical search. This makes the ML method computationally intensive, and convergence of the negative log-likelihood to the global minimum can be sensitive to the initial values. Thus, this estimation method for the parameters of the power–Pareto can be computationally complex and challenging, especially for large datasets. Additionally, the ML method can be impacted by model misspecification. Therefore, it is crucial to consider alternative methods, potentially with closed-form expressions for the estimators.
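To make the computational cost concrete, the sketch below evaluates the negative log-likelihood for a given θ. Since log f(x_i) = -log q(u_i), the negative log-likelihood is the sum of log Q(u_i) + log(λ_1/u_i + λ_2/(1 - u_i)), and each u_i has to be found by inverting the QF. The bisection solver and all function names are assumptions made for illustration; this is not the implementation of the Davies package, and no optimizer is included.

```python
import math


def pp_quantile(u, c, lam1, lam2):
    """Power-Pareto QF of Equation (1), for 0 < u < 1."""
    return c * u ** lam1 * (1.0 - u) ** (-lam2)


def solve_u(x, c, lam1, lam2, tol=1e-13):
    """Find u with Q(u) = x by bisection (Q is increasing on (0, 1))."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pp_quantile(mid, c, lam1, lam2) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def neg_log_likelihood(theta, data):
    """Sum of log q(u_i), i.e., minus the log-likelihood of the sample.

    Assumes every observation lies inside the support implied by theta.
    """
    c, lam1, lam2 = theta
    total = 0.0
    for x in data:
        u = solve_u(x, c, lam1, lam2)  # one numerical inversion per observation
        total += math.log(pp_quantile(u, c, lam1, lam2))
        total += math.log(lam1 / u + lam2 / (1.0 - u))
    return total


# Pareto special case (c = 1, lam1 = 0, lam2 = 0.5): f(2) = 2 * 2**(-3) = 1/4,
# so a single observation x = 2 contributes log(4).
nll_pareto = neg_log_likelihood((1.0, 0.0, 0.5), [2.0])
```

A three-dimensional optimizer would call `neg_log_likelihood` many times, each call requiring n inversions, which is where the cost for large samples arises.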

3.2. Log Quantile Least Squares (LQLS)

Hankin and Lee (2006) proposed a regression method for estimating the parameters of the power–Pareto distribution using order statistics. To achieve a simple linear relation involving the parameters, a log transformation is applied, yielding the sum of squares

\sum_{i = 1}^{n} {[log x_{(i)} - E (log (X_{(i)}))]}^{2}

(10)

that needs to be minimized with respect to the vector parameters

θ

. Since X is continuous, the inverse probability integral transform guarantees

X \overset{d}{=} Q (U ∣ θ)

, where U denotes a random variable uniformly distributed on the interval

(0, 1)

. Consequently,

X_{(i)} \overset{d}{=} Q (U_{(i)} ∣ θ), i = 1, \dots, n,

(11)

where

U_{(i)}

denotes the

i^{th}

-order statistic from a sample of size n from a uniform distribution on

(0, 1)

. Note that

U_{(i)}

has a Beta distribution with parameters i and

n - i + 1

. Using Equation (11), we have

log (X_{(i)}) \overset{d}{=} λ_{0} + λ_{1} log (U_{(i)}) - λ_{2} log (1 - U_{(i)}), i = 1, \dots, n,

with

λ_{0} = log (c)

. Taking expectations on both sides gives an expression that is linear in the parameters, with

\begin{matrix} E (log U_{(i)}) & = & ψ (i) - ψ (n + 1) = - \sum_{k = i}^{n} \frac{1}{k}, \\ E (log (1 - U_{(i)})) & = & ψ (n - i + 1) - ψ (n + 1) = - \sum_{k = n - i + 1}^{n} \frac{1}{k}, \end{matrix}

where

ψ

is the digamma function, the derivative of the log gamma function. For n integer,

ψ (n) = - γ + \sum_{i = 1}^{n - 1} \frac{1}{i},

where

γ

is Euler’s constant. Then, by introducing the notation

λ = (λ_{0}, λ_{1}, λ_{2})

, Equation (10) can be expressed in matrix form as

S (λ) = {(Y - X λ)}^{⊤} (Y - X λ)

where

Y

is a column matrix with the logarithm of the order statistics from the sample,

log X_{(i)}

, and

X

is an

n \times 3

matrix where the

i^{th}

row is given by

(1, a_{i}, - a_{n - i + 1})

, with

a_{i} = - \sum_{k = i}^{n} \frac{1}{k}

. Applying the least squares method, the vector parameters are estimated by

{\hat{λ}}^{LQLS} = {(X^{⊤} X)}^{- 1} X^{⊤} Y;

Consequently,

{\hat{θ}}^{LQLS} = (exp ({\hat{λ}}_{0}^{LQLS}), {\hat{λ}}_{1}^{LQLS}, {\hat{λ}}_{2}^{LQLS}) .

(12)

The LQLS method offers several advantages. Firstly, it is more robust against outliers, as the logarithmic transformation reduces the influence of those values. Secondly, unlike the ML method, estimates are based on the order statistics and require straightforward calculations, leading to computational efficiency. However, the LQLS method may exhibit lower efficiency when compared to the ML method and can be sensitive to small sample sizes.
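The closed-form nature of the LQLS estimator can be illustrated with a short sketch that builds the design matrix and solves the 3 × 3 normal equations. The third column is taken as -a_{n-i+1}, so that its coefficient is λ_2 itself, in line with the sign of the log(1 - U_(i)) term. The function names and the small Gaussian elimination routine are assumptions made to keep the example free of external libraries.

```python
import math


def a_coef(i, n):
    """a_i = -sum_{k=i}^{n} 1/k, which equals E(log U_(i))."""
    return -sum(1.0 / k for k in range(i, n + 1))


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    m = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * m
    for r in range(m - 1, -1, -1):
        known = sum(aug[r][k] * sol[k] for k in range(r + 1, m))
        sol[r] = (aug[r][m] - known) / aug[r][r]
    return sol


def lqls_estimate(sample):
    """Return (c, lam1, lam2) from least squares on the logged order statistics."""
    xs = sorted(sample)
    n = len(xs)
    rows = [(1.0, a_coef(i, n), -a_coef(n - i + 1, n)) for i in range(1, n + 1)]
    y = [math.log(x) for x in xs]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(3)] for a in range(3)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(3)]
    lam0, lam1, lam2 = solve_linear(xtx, xty)
    return math.exp(lam0), lam1, lam2


# Noise-free check: a "sample" whose logs equal their expected values exactly
# should return the parameters used to build it (here c = 2, 0.4, 0.3).
n = 20
ideal = [math.exp(math.log(2.0) + 0.4 * a_coef(i, n) - 0.3 * a_coef(n - i + 1, n))
         for i in range(1, n + 1)]
c_hat, lam1_hat, lam2_hat = lqls_estimate(ideal)
```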

3.3. Percentile (P)

Percentile points were first used for the determination of parameters of the Weibull model (Kao 1959). This method is nowadays popular due to its simplicity. Estimators are found from the relation, through the CDF or the QF, between probabilities and percentile values. To estimate the parameters, one must consider the same number of percentiles. Therefore, given three distinct cumulative probability levels

p_{1}

,

p_{2}

, and

p_{3}

(

0 < p_{1} < p_{2} < p_{3} < 1

), the corresponding

100 p_{i} %

percentiles,

i = 1, 2, 3

, are the values

q_{1}

,

q_{2}

, and

q_{3}

such that

F (q_{i} ∣ θ) = p_{i} \Leftrightarrow q_{i} = Q (p_{i} ∣ θ), i = 1, 2, 3,

with Q the QF in Equation (1). Next, applying a log transformation to the ratio between two consecutive percentiles, we obtain

log \frac{q_{2}}{q_{1}} = λ_{1} log \frac{p_{2}}{p_{1}} + λ_{2} log \frac{1 - p_{1}}{1 - p_{2}},

and

log \frac{q_{3}}{q_{2}} = λ_{1} log \frac{p_{3}}{p_{2}} + λ_{2} log \frac{1 - p_{2}}{1 - p_{3}} .

Solving the above two equations for

λ_{1}

and

λ_{2}

, we obtain

λ_{1} = \frac{log \frac{1 - p_{2}}{1 - p_{3}} log \frac{q_{2}}{q_{1}} - log \frac{1 - p_{1}}{1 - p_{2}} log \frac{q_{3}}{q_{2}}}{log \frac{p_{2}}{p_{1}} log \frac{1 - p_{2}}{1 - p_{3}} - log \frac{p_{3}}{p_{2}} log \frac{1 - p_{1}}{1 - p_{2}}},

(13)

and

λ_{2} = \frac{- log \frac{p_{3}}{p_{2}} log \frac{q_{2}}{q_{1}} + log \frac{p_{2}}{p_{1}} log \frac{q_{3}}{q_{2}}}{log \frac{p_{2}}{p_{1}} log \frac{1 - p_{2}}{1 - p_{3}} - log \frac{p_{3}}{p_{2}} log \frac{1 - p_{1}}{1 - p_{2}}} .

(14)

Next, we use the following equation for the second percentile:

q_{2} = c p_{2}^{λ_{1}} {(1 - p_{2})}^{- λ_{2}} \Leftrightarrow c = q_{2} p_{2}^{- λ_{1}} {(1 - p_{2})}^{λ_{2}} .

(15)

The estimators are obtained by replacing, in Equations (13)–(15), the percentiles

q_{i}

, by the corresponding sample percentiles. A possible choice for the probabilities is

(p_{1}, p_{2}, p_{3}) = (0.1, 0.5, 0.9)

. Alternatively, the percentiles can be taken directly from the order statistics. Let I be a set of three distinct values from the first n positive integers,

\{1, 2, \dots, n\}

, where n denotes the sample size. The percentiles are then chosen as

q_{i} = x_{(i)}

,

i \in I

, associated to the cumulative probabilities

p_{i} = (i - a) / (n + b)

, where a and b are real constants. A popular choice of the constants is

a = 0

and

b = 1

.

The P method offers simplicity in computation and robustness against outliers. This makes it straightforward to implement and suitable for exploratory analysis and initial estimation, providing a quick and effective way to estimate parameters. However, it may be less efficient and less accurate compared to other methods.
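Equations (13)–(15) translate directly into a few lines of code. In the sketch below, the function name and argument layout are assumptions; the check feeds in exact population percentiles, for which the estimator should return the generating parameters.

```python
import math


def percentile_estimate(q, p):
    """Apply Equations (13)-(15) to three percentiles q at probabilities p."""
    q1, q2, q3 = q
    p1, p2, p3 = p
    a1 = math.log(p2 / p1)
    b1 = math.log((1.0 - p1) / (1.0 - p2))
    a2 = math.log(p3 / p2)
    b2 = math.log((1.0 - p2) / (1.0 - p3))
    y1 = math.log(q2 / q1)
    y2 = math.log(q3 / q2)
    det = a1 * b2 - a2 * b1                      # common denominator
    lam1 = (b2 * y1 - b1 * y2) / det             # Equation (13)
    lam2 = (a1 * y2 - a2 * y1) / det             # Equation (14)
    c = q2 * p2 ** (-lam1) * (1.0 - p2) ** lam2  # Equation (15)
    return c, lam1, lam2


# Exact percentiles of PP(2, 0.4, 0.9) at the probabilities (0.1, 0.5, 0.9).
probs = (0.1, 0.5, 0.9)
exact = tuple(2.0 * p ** 0.4 * (1.0 - p) ** (-0.9) for p in probs)
c_hat, lam1_hat, lam2_hat = percentile_estimate(exact, probs)
```

With real data, `exact` would be replaced by the corresponding sample percentiles.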

3.4. Least Squares (LS) and Weighted Least Squares (WLS)

Here we consider the difference between the empirical and the theoretical CDF. Then, the least squares (LS) estimator of

θ

, denoted by

{\hat{θ}}^{LS} = ({\hat{c}}^{LS}, {\hat{λ}}_{1}^{LS}, {\hat{λ}}_{2}^{LS})

, can be obtained as

{\hat{θ}}^{LS} = \underset{θ}{argmin} \{\sum_{i = 1}^{n} {(F (x_{(i)} ∣ θ) - \frac{i}{n + 1})}^{2}\} .

(16)

Furthermore, the estimation of parameters using the weighted least squares (WLS) method, symbolized as

{\hat{θ}}^{WLS} = ({\hat{c}}^{WLS}, {\hat{λ}}_{1}^{WLS}, {\hat{λ}}_{2}^{WLS})

, can be determined by

{\hat{θ}}^{WLS} = \underset{θ}{argmin} \{\sum_{i = 1}^{n} \frac{{(n + 1)}^{2} (n + 2)}{i (n - i + 1)} {(F (x_{(i)} ∣ θ) - \frac{i}{n + 1})}^{2}\} .

(17)

The LS method involves minimizing the squared difference between the empirical and theoretical CDFs. This method is straightforward to implement and interpret, making it accessible for various applications. However, LS assumes homoscedasticity, which is not valid, since the variance of

F (x_{(i)} ∣ θ)

depends on the index i. This violation does not affect the bias of the estimators, but may increase their variance. On the other hand, the weighting scheme used in the WLS method addresses heteroscedasticity by assigning larger weights to observations that are closer to the center of the sample and smaller weights to observations that are closer to the edges of the sample. Additionally, both the LS and WLS methods are computationally intensive, since both depend on the CDF, which needs to be computed numerically.

3.5. Quantile Least Squares (QLS)

The quantile least squares (QLS) estimator of distribution parameters, denoted by

{\hat{θ}}^{QLS} = ({\hat{c}}^{QLS}, {\hat{λ}}_{1}^{QLS}, {\hat{λ}}_{2}^{QLS})

, can be derived by

{\hat{θ}}^{QLS} = \underset{θ}{argmin} \{\sum_{i = 1}^{n} {(x_{(i)} - μ_{(i)})}^{2}\},

(18)

with

μ_{(i)}

defined in Equation (8).

The QLS estimator minimizes the squared difference between the order statistics and their expected value, which can be easily obtained from Equation (8). A limitation of this method is that

μ_{(n)}

only exists if

λ_{2} < 1

; therefore, the QLS should only be considered if

λ_{2}

is small, and in any case below 1. Furthermore, the accuracy of parameter estimates can be affected by the presence of large outliers.

A weighted version of this method was not considered because it would further restrict its domain of validity.
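A minimal sketch of the QLS objective in Equation (18) is given below, including a guard for the region where the largest expected order statistic does not exist. Returning infinity in that region is one possible convention assumed here, and the function names are hypothetical.

```python
import math


def expected_order_stat(i, n, c, lam1, lam2):
    """E(X_(i)) from Equation (8), computed through log-gamma values."""
    log_num = (math.lgamma(i + lam1) + math.lgamma(n - i + 1 - lam2)
               - math.lgamma(n + 1 + lam1 - lam2))
    log_den = math.lgamma(i) + math.lgamma(n - i + 1) - math.lgamma(n + 1)
    return c * math.exp(log_num - log_den)


def qls_objective(theta, sample):
    """Sum of squared differences between order statistics and their means."""
    c, lam1, lam2 = theta
    if lam2 >= 1.0:
        return math.inf  # E(X_(n)) does not exist, so the objective is undefined
    xs = sorted(sample)
    n = len(xs)
    return sum((x - expected_order_stat(i, n, c, lam1, lam2)) ** 2
               for i, x in enumerate(xs, start=1))


# Idealized sample equal to the expected order statistics of PP(1, 0.4, 0.4).
n = 10
ideal = [expected_order_stat(i, n, 1.0, 0.4, 0.4) for i in range(1, n + 1)]
value_at_truth = qls_objective((1.0, 0.4, 0.4), ideal)
```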

4. Comparison of the Estimation Methods by Monte Carlo Simulation

In this section, a Monte Carlo simulation study is carried out to compare the performance of the proposed P, LS, WLS, and QLS estimation methods with each other and with the ML and LQLS methods proposed by Hankin and Lee (2006). The Davies package was used for the ML method. Parameter estimation with the LS, WLS, and QLS methods was performed with the optim function of R (version 4.0.0), using the starting values provided by the davies.start function in the Davies package. The power–Pareto distribution was used to generate

r = 1000

samples with sizes

n = 10

, 20, 50, 75, and 100. Sample values are generated using the inversion method. In the simulation study, the following parameter combinations were considered:

Case 1: $(c, λ_{1}, λ_{2}) = (1, 0.1, 0.1)$ ;
Case 2: $(c, λ_{1}, λ_{2}) = (1, 0.1, 0.4)$ ;
Case 3: $(c, λ_{1}, λ_{2}) = (1, 0.4, 0.4)$ ;
Case 4: $(c, λ_{1}, λ_{2}) = (1, 0.4, 0.9)$ ;
Case 5: $(c, λ_{1}, λ_{2}) = (1, 0.9, 0.4)$ .

All the parameter combinations provide a power–Pareto distribution with finite mean value and different levels of positive skewness and kurtosis. Both measures increase with respect to

λ_{2}

and decrease with respect to

λ_{1}

. The corresponding densities, for all five cases, are presented in Figure 2.
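The inversion method used to generate the samples amounts to evaluating the QF at uniform random numbers. The following sketch illustrates this for the five cases; the seed and function name are arbitrary assumptions, and the original study was carried out in R rather than Python.

```python
import random


def pp_sample(n, c, lam1, lam2, rng):
    """Draw n power-Pareto values by inversion: X = Q(U), U ~ Uniform(0, 1)."""
    # rng.random() lies in [0, 1), so (1 - u) is never zero.
    uniforms = (rng.random() for _ in range(n))
    return [c * u ** lam1 * (1.0 - u) ** (-lam2) for u in uniforms]


# The five parameter combinations (c, lam1, lam2) of the simulation study.
cases = {
    1: (1.0, 0.1, 0.1),
    2: (1.0, 0.1, 0.4),
    3: (1.0, 0.4, 0.4),
    4: (1.0, 0.4, 0.9),
    5: (1.0, 0.9, 0.4),
}

rng = random.Random(2024)  # arbitrary seed, for reproducibility of the sketch only
samples = {case: pp_sample(10, *theta, rng) for case, theta in cases.items()}
```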

For each of the three parameters of

θ

, denoted generically by

θ

, we computed the simulated average bias (ABias), median bias (MBias), and root mean squared error (RMSE) of the corresponding estimator

\hat{θ}

. The statistics are defined by

\begin{matrix} ABias (\hat{θ}) & = & \frac{1}{r} \sum_{i = 1}^{r} ({\hat{θ}}_{i} - θ), \\ MBias (\hat{θ}) & = & median ({\hat{θ}}_{1}, {\hat{θ}}_{2}, \dots, {\hat{θ}}_{r}) - θ, \\ RMSE (\hat{θ}) & = & \sqrt{\frac{1}{r} \sum_{i = 1}^{r} {({\hat{θ}}_{i} - θ)}^{2}}, \end{matrix}

where

{\hat{θ}}_{i}

is the estimate of

θ

computed using the

i^{th}

sample.

As a global criterion of comparison, we also computed the average absolute difference between the true and the estimated CDFs,

D_{abs} = \frac{1}{r} \sum_{i = 1}^{r} (\frac{1}{n} \sum_{j = 1}^{n} | F (x_{i j} | θ) - F (x_{i j} | \hat{θ}) |)

(19)

and the average of the maximum absolute difference between the true and estimated CDFs,

D_{\max} = \frac{1}{r} \sum_{i = 1}^{r} \underset{1 \leq j \leq n}{max} | F (x_{i j} | θ) - F (x_{i j} | \hat{θ}) |,

(20)

where

x_{i j}

represents the

j^{th}

observation in the

i^{th}

sample. The smaller the values of

D_{abs}

and

D_{\max}

, the better the fit to the data.
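The comparison criteria are simple summaries of the r replicate estimates. A minimal sketch follows, with hypothetical function names and made-up numbers used purely to show the arithmetic (they are not simulation results).

```python
import math
import statistics


def bias_rmse(estimates, true_value):
    """Average bias, median bias, and RMSE of replicate estimates of one parameter."""
    r = len(estimates)
    abias = sum(e - true_value for e in estimates) / r
    mbias = statistics.median(estimates) - true_value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / r)
    return abias, mbias, rmse


def cdf_distances(true_cdf, fitted_cdf):
    """Mean and maximum absolute CDF difference within a single sample.

    Averaging these two values over the r samples gives Equations (19) and (20).
    """
    diffs = [abs(t - f) for t, f in zip(true_cdf, fitted_cdf)]
    return sum(diffs) / len(diffs), max(diffs)


# Toy inputs chosen only to illustrate the calculations.
abias, mbias, rmse = bias_rmse([0.9, 1.0, 1.4], 1.0)
mean_abs, max_abs = cdf_distances([0.1, 0.5, 0.9], [0.2, 0.5, 0.7])
```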

The ABias, MBias, and RMSE are presented in Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7, while the related Table A1, Table A2, Table A3, Table A4 and Table A5, with the corresponding values, are given in Appendix A. It is important to note that it was impossible to obtain estimates provided by the QLS method for a few samples. This was due to the non-convergence of the optimization method used to solve Equation (18). The number of cases where convergence was achieved is indicated beneath each table. This issue is not critical, as the QLS method generally demonstrates the poorest performance. Thus, we do not advise its use.

Moreover, since only small to moderate sample sizes were considered and the simulated biases are subject to sampling error, it is difficult to assess the convergence of both the median and the mean simulated bias to zero. However, it is evident that if

n = 75

or

n = 100

, the simulated bias is usually closer to zero than if

n = 10

or

n = 20

. In almost all cases, the RMSE of the estimators of the parameters c,

λ_{1}

, and

λ_{2}

decreases toward zero, when the sample size increases.

Based on RMSE values in Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7, it is evident that the performance of various estimation methods varies based on the values of

λ_{1}

and

λ_{2}

, and also the sample size (n). Regarding the RMSE, we also have the following additional comments:

For small sample sizes, such as $n = 10$ , the P method generally demonstrates the highest efficiency. However, it is not a recommended method for larger sample sizes.
The WLS method consistently outperforms the LS estimator in estimating each of the three parameters.
The LQLS method always has a good performance for samples of size $n \geq 20$ . The WLS has a similar performance to the LQLS method if $λ_{1} \neq λ_{2}$ . If $λ_{1} = λ_{2}$ , LQLS and WLS methods have a similar performance for $n \geq 50$ .
The ML method shows strong performance when $λ_{1} = 0.1$ and $λ_{2} \leq 0.4$ and when the sample size is equal to or larger than 50. Thus, we do not recommend its use for samples of size smaller than $n = 50$ .

Table 1 and Table 2 provide a comparative analysis of Monte Carlo simulated mean absolute difference and mean maximum absolute difference between true and estimated CDFs. The best values are highlighted in bold. The insights derived from the analysis of these tables can be summarized as follows:

The performance rankings across different methods are consistent between the two tables.
The WLS method demonstrates a very good performance, typically yielding the smallest or second smallest values of $D_{abs}$ and $D_{\max}$ .
The LS method consistently performs slightly worse than WLS, and LQLS shows similar performance to WLS when $n \leq 50$ , except when $λ_{1} = 0.1$ and $λ_{2} = 0.4$ . The ML method is never the best performer, but it shows good performance if $n \geq 50$ and $λ_{1} < λ_{2}$ or $λ_{1} = λ_{2} = 0.4$ .
The remaining methods exhibit poor performance. Both the P and QLS methods generally provide the largest absolute differences. The exception is the QLS method, for small sample sizes and $λ_{1} = λ_{2} = 0.1$ .

5. Application

In this section, we use two real datasets to illustrate the behavior of the estimators, described in Section 3. To compare the fitted power–Pareto model we computed the Kolmogorov–Smirnov (K-S) statistic and associated p-value for each method. Since parameters are estimated, the p-value of the K-S test is obtained using Monte Carlo simulation. To measure the goodness-of-fit, we also computed the empirical correlation coefficient

r_{Q}

, between empirical quantiles

x_{(i)}

and the corresponding estimated quantiles

q_{i} = Q (p_{i} ∣ \hat{θ})

,

i = 1, 2, \dots, n

, where p_{i} is the cumulative probability associated with x_{(i)} (Beirlant et al. 2004). Since both vectors have monotonically increasing values,

r_{Q}

will be non-negative.
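The correlation coefficient between empirical and fitted quantiles can be computed as in the sketch below. The plotting positions i/(n + 1) used to evaluate the fitted QF are an assumption made for this illustration, as is the function name.

```python
import math


def qq_correlation(sample, c, lam1, lam2):
    """Pearson correlation between sorted data and fitted power-Pareto quantiles."""
    xs = sorted(sample)
    n = len(xs)
    probs = [i / (n + 1) for i in range(1, n + 1)]  # assumed plotting positions
    qs = [c * p ** lam1 * (1.0 - p) ** (-lam2) for p in probs]
    mean_x = sum(xs) / n
    mean_q = sum(qs) / n
    sxq = sum((x - mean_x) * (q - mean_q) for x, q in zip(xs, qs))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sqq = sum((q - mean_q) ** 2 for q in qs)
    return sxq / math.sqrt(sxx * sqq)


# Data lying exactly on the fitted quantiles give a correlation of one.
n = 25
on_line = [2.0 * (i / (n + 1)) ** 0.4 * (1.0 - i / (n + 1)) ** (-0.3)
           for i in range(1, n + 1)]
r_q = qq_correlation(on_line, 2.0, 0.4, 0.3)
```

Because the correlation is invariant to location and scale changes, a high value indicates that the Q-Q plot is close to a straight line, which need not coincide with the diagonal.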

5.1. Household Income by State in USA

The U.S. Census Bureau defines “household income” as the gross income of all people aged 15 years or older who live in the same housing unit, regardless of their relationship. Household income reflects the standard of living in distinct households and is an important indicator of the local and national economies. Table 3 presents a dataset comprising the median household income in 2016 in the United States, in dollars, of

n = 52

states, as available on the website data.world.^1

The histogram and the boxplot of these observations, in Figure 8, are compatible with the power–Pareto distribution.

Table 4 summarizes the estimated parameters, K-S statistics, associated p-values, and the empirical correlation coefficient for various statistical methods applied to the household income dataset.

Regarding Table 4, it is shown that all estimation techniques produce p-values exceeding

0.05

, so the power–Pareto model is not rejected at the 5% significance level, whichever estimation method is used. Considering that a lower K-S statistic and a higher p-value signify a better fit, and a higher

r_{Q}

implies a stronger relationship between observed and expected quantiles, the P method stands out with notably high p-value and

r_{Q}

, indicating a good fit. Moreover, the LQLS method achieves the highest

r_{Q}

, further supporting its efficacy. Although the QLS method has a large

r_{Q}

value, the p-value is the lowest.

Figure 9 depicts Q-Q plots, comparing the observed data with the estimated quantiles provided from various methods. If the points in the Q-Q plots align closely along the diagonal line, it indicates that the estimated distribution provides an adequate statistical fit. Figure 10 provides the empirical CDF vs. the fitted CDF, for the six different estimation methods.

Figure 9 shows a good similarity between empirical and fitted quantiles in the body of the distribution, although there are discrepancies in the right tail. All methods provide a good correspondence in the body of the distribution. But the LQLS and QLS methods provide the best correspondence in the right tail. Similar conclusions can be drawn from Figure 10.

5.2. Peak Concentrations

To study accidental releases of hazardous gases, a common method is to release a finite volume of gas instantaneously into a surrounding flow field and then measure the concentration at a fixed location downwind. In a series of 100 repeated experiments conducted by Hall (1991), a key quantity for risk assessment was the peak concentration achieved. The dataset, studied by Hankin and Lee (2006), is provided in Table 5.

In Figure 11, we present the histogram and the boxplot of the dataset. Both plots are compatible with the power–Pareto distribution.

Table 6 provides the estimated parameters, K-S statistics, the associated p-values, and the empirical correlation coefficient for various statistical methods for the peak concentration dataset.

The data conform well to the distribution for all estimation methods, with all associated p-values exceeding 0.05 and empirical correlation coefficients close to 1. Results for the different estimation methods are similar, except for QLS, which presents a much higher K-S value. Furthermore, the P, LS, and WLS methods demonstrate favorable outcomes, as indicated by their low K-S statistics, high p-values, and high empirical correlation coefficients, r_{Q}.

Figure 12 presents Q-Q plots, contrasting the observed data with the estimated quantiles derived from the fitted power–Pareto distribution. Both the P and WLS methods demonstrate a good correspondence, with similar patterns and some discrepancies in the right tail. The QLS again evidences overfitting in the right tail.

Figure 13 displays the empirical and fitted CDFs. All methods work quite well for analyzing this dataset. However, the P and WLS are the ones that provide the best correspondence between CDFs.

6. Conclusions

This study examines the power–Pareto model for non-negative variables. The model has three parameters and can exhibit various shapes, making it suitable for modelling both symmetrical and skewed data. The paper explores distributional characteristics, with a particular focus on different parameter estimation techniques, some of them introduced in this work.

The numerical analysis reveals the importance of selecting an appropriate estimation method based on both sample size and the values of the power–Pareto distribution parameters. Our results indicate that for very small sample sizes, the P method performs well in terms of RMSE. However, for larger sample sizes, the LQLS and WLS methods emerge as adequate choices and are recommended for practical applications.

Additionally, it is worth noting that the ML method also exhibits good performance for larger sample sizes, typically with at least 100 observations. However, it is essential to consider the computational time associated with this method, which is longer when compared to other methods, a factor to weigh in the decision-making process.

Author Contributions

Conceptualization, F.C.; methodology, F.C. and M.N.; software, F.C. and M.N.; validation, F.C.; formal analysis, F.C.; investigation, F.C. and M.N.; data curation, F.C. and M.N.; writing—original draft preparation, M.N.; writing—review and editing, F.C. and M.N.; visualization, F.C. and M.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work is funded by national funds through the FCT—Fundação para a Ciência e a Tecnologia, I.P., under the scope of the projects UIDB/00297/2020 (https://doi.org/10.54499/UIDB/00297/2020, accessed on 6 June 2024) and UIDP/00297/2020 (https://doi.org/10.54499/UIDP/00297/2020, accessed on 6 June 2024) (Center for Mathematics and Applications).

Data Availability Statement

The data supporting the findings in Section 5 of this study are available within the article.

Acknowledgments

The authors thank the referees for their comments and suggestions that led to an improvement of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

QF	Quantile function
CDF	Cumulative distribution function
IQR	Interquartile range
ML	Maximum Likelihood
LQLS	Log quantile least squares
P	Percentile
LS	Least squares
WLS	Weighted least squares
QLS	Quantile least squares
ABias	Average bias
MBias	Median bias
RMSE	Root mean squared error

Appendix A. Monte Carlo Simulation Results

Tables A1–A5 provide the ABias, MBias, and RMSE for the cases in Section 4, with the best values highlighted in bold. Figures 3–7 display the same results graphically.
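As a rough illustration of how such summaries can be produced, the Python sketch below simulates power–Pareto samples by inverse transform, fits them, and computes the ABias, MBias, and RMSE over the replicates. Several parts are assumptions rather than the authors' procedure: the three measures are taken as mean error, median error, and root mean squared error; the estimator is a plain least-squares regression of the log order statistics on log p_i and −log(1 − p_i) with plotting positions p_i = i/(n + 1), which may differ from the paper's LQLS method; and all function names, the seed, and the number of replicates are hypothetical.

```python
import math
import random
import statistics


def power_pareto_quantile(p, c, lam1, lam2):
    """Quantile function Q(p) = c * p**lam1 * (1 - p)**(-lam2), for 0 < p < 1."""
    return c * p ** lam1 * (1.0 - p) ** (-lam2)


def simulate(n, c, lam1, lam2, rng):
    """Inverse-transform sampling X = Q(U), with U uniform on the open interval (0, 1)."""
    sample = []
    while len(sample) < n:
        u = rng.random()
        if u > 0.0:  # random() lies in [0, 1); skip an exact zero
            sample.append(power_pareto_quantile(u, c, lam1, lam2))
    return sample


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def log_least_squares_fit(sample):
    """Fit log x_(i) = log c + lam1 * log p_i - lam2 * log(1 - p_i) by least squares.

    Plotting positions p_i = i / (n + 1) are an assumption of this sketch.
    """
    xs = sorted(sample)
    n = len(xs)
    rows = [(1.0, math.log(i / (n + 1)), -math.log(1.0 - i / (n + 1)))
            for i in range(1, n + 1)]
    ys = [math.log(x) for x in xs]
    # Normal equations (X'X) beta = X'y for the three coefficients.
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(3)] for j in range(3)]
    xty = [sum(r[j] * y for r, y in zip(rows, ys)) for j in range(3)]
    log_c, lam1, lam2 = solve_linear(xtx, xty)
    return math.exp(log_c), lam1, lam2


def abias(estimates, true_value):
    return statistics.fmean(estimates) - true_value


def mbias(estimates, true_value):
    return statistics.median(estimates) - true_value


def rmse(estimates, true_value):
    return math.sqrt(statistics.fmean([(e - true_value) ** 2 for e in estimates]))


def monte_carlo(n, c, lam1, lam2, replicates=200, seed=1):
    """Return ABias, MBias, and RMSE of each parameter estimate over the replicates."""
    rng = random.Random(seed)
    fits = [log_least_squares_fit(simulate(n, c, lam1, lam2, rng))
            for _ in range(replicates)]
    summary = {}
    for j, (name, truth) in enumerate((("c", c), ("lam1", lam1), ("lam2", lam2))):
        est = [fit[j] for fit in fits]
        summary[name] = {"ABias": abias(est, truth),
                         "MBias": mbias(est, truth),
                         "RMSE": rmse(est, truth)}
    return summary
```

For example, `monte_carlo(50, 1.0, 0.1, 0.1)` mimics the setting of Table A1 at n = 50. Its numbers are not expected to reproduce the table, since the estimator, seed, and number of replicates differ. For an iterative method such as QLS, the footnotes to the tables suggest that only the converged replicates enter the summaries.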

Table A1. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = λ_{2} = 0.1.

n		ML	LQLS	P	LS	WLS	QLS*
10	$ABias (\hat{c})$	$0.0155$	$0.0060$	$0.0051$	$0.0141$	$0.0127$	$- 0.0036$
	$MBias (\hat{c})$	$- 0.0042$	$- 0.0039$	$- 0.0055$	$0.0073$	$- 0.0001$	$- 0.0036$
	$RMSE (\hat{c})$	$0.1576$	$0.1232$	$0.1164$	$0.1432$	$0.1360$	$0.1756$
	$ABias ({\hat{λ}}_{1})$	$- 0.0004$	$0.0005$	$- 0.0165$	$0.0174$	$0.0172$	$0.0055$
	$MBias ({\hat{λ}}_{1})$	$- 0.0131$	$- 0.0065$	$- 0.0204$	$0.0077$	$0.0089$	$- 0.0005$
	$RMSE ({\hat{λ}}_{1})$	$0.0841$	$0.0669$	$0.0641$	$0.0890$	$0.0836$	$0.0671$
	$ABias ({\hat{λ}}_{2})$	$- 0.0039$	$0.0016$	$- 0.0148$	$0.0116$	$0.0125$	$- 0.0062$
	$MBias ({\hat{λ}}_{2})$	$- 0.0123$	$- 0.0070$	$- 0.0219$	$0.0025$	$0.0049$	$- 0.0164$
	$RMSE ({\hat{λ}}_{2})$	$0.0812$	$0.0684$	$0.0642$	$0.0883$	$0.0832$	$0.0644$
20	$ABias (\hat{c})$	$0.0094$	$0.0027$	$0.0070$	$0.0094$	$0.0061$	$- 0.0199$
	$MBias (\hat{c})$	$0.0041$	$0.0012$	$0.0010$	$0.0051$	$0.0043$	$0.0066$
	$RMSE (\hat{c})$	$0.1047$	$0.0789$	$0.0906$	$0.0972$	$0.0880$	$0.1859$
	$ABias ({\hat{λ}}_{1})$	$0.0003$	$- 0.0007$	$- 0.0086$	$0.0098$	$0.0074$	$0.0004$
	$MBias ({\hat{λ}}_{1})$	$- 0.0059$	$- 0.0064$	$- 0.0115$	$0.0063$	$0.0051$	$- 0.0013$
	$RMSE ({\hat{λ}}_{1})$	$0.0570$	$0.0435$	$0.0513$	$0.0596$	$0.0526$	$0.0475$
	$ABias ({\hat{λ}}_{2})$	$- 0.0047$	$- 0.0008$	$- 0.0111$	$0.0034$	$0.0043$	$- 0.0069$
	$MBias ({\hat{λ}}_{2})$	$- 0.0096$	$- 0.0051$	$- 0.0142$	$- 0.0014$	$0.0010$	$- 0.0121$
	$RMSE ({\hat{λ}}_{2})$	$0.0559$	$0.0438$	$0.0518$	$0.0587$	$0.0528$	$0.0472$
50	$ABias (\hat{c})$	$- 0.0006$	$- 0.0014$	$- 0.0008$	$- 0.0011$	$- 0.0017$	$- 0.0322$
	$MBias (\hat{c})$	$- 0.0032$	$- 0.0048$	$- 0.0020$	$- 0.0022$	$- 0.0035$	$- 0.0011$
	$RMSE (\hat{c})$	$0.0527$	$0.0491$	$0.0598$	$0.0581$	$0.0518$	$0.1917$
	$ABias ({\hat{λ}}_{1})$	$- 0.0019$	$- 0.0012$	$- 0.0046$	$0.0019$	$0.0011$	$- 0.0024$
	$MBias ({\hat{λ}}_{1})$	$- 0.0029$	$- 0.0034$	$- 0.0056$	$0.0013$	$- 0.0002$	$- 0.0004$
	$RMSE ({\hat{λ}}_{1})$	$0.0299$	$0.0279$	$0.0347$	$0.0349$	$0.0305$	$0.0342$
	$ABias ({\hat{λ}}_{2})$	$- 0.0009$	$0.0004$	$- 0.0031$	$0.0037$	$0.0032$	$- 0.0048$
	$MBias ({\hat{λ}}_{2})$	$- 0.0026$	$- 0.0011$	$- 0.0052$	$0.0008$	$0.0004$	$- 0.0058$
	$RMSE ({\hat{λ}}_{2})$	$0.0310$	$0.0283$	$0.0346$	$0.0362$	$0.0316$	$0.0354$
75	$ABias (\hat{c})$	$0.0014$	$0.0001$	$0.0020$	$0.0018$	$0.0009$	$- 0.0125$
	$MBias (\hat{c})$	$0.0008$	$- 0.0016$	$0.0022$	$0.0013$	$0.0010$	$0.0020$
	$RMSE (\hat{c})$	$0.0405$	$0.0401$	$0.0493$	$0.0470$	$0.0416$	$0.1304$
	$ABias ({\hat{λ}}_{1})$	$- 0.0008$	$- 0.0006$	$- 0.0021$	$0.0023$	$0.0014$	$- 0.0006$
	$MBias ({\hat{λ}}_{1})$	$- 0.0016$	$- 0.0026$	$- 0.0029$	$0.0020$	$0.0016$	$0.0005$
	$RMSE ({\hat{λ}}_{1})$	$0.0230$	$0.0222$	$0.0274$	$0.0275$	$0.0234$	$0.0268$
	$ABias ({\hat{λ}}_{2})$	$- 0.0017$	$- 0.0003$	$- 0.0031$	$0.0010$	$0.0009$	$- 0.0032$
	$MBias ({\hat{λ}}_{2})$	$- 0.0026$	$- 0.0010$	$- 0.0053$	$- 0.0010$	$- 0.0007$	$- 0.0047$
	$RMSE ({\hat{λ}}_{2})$	$0.0239$	$0.0233$	$0.0282$	$0.0282$	$0.0243$	$0.0287$
100	$ABias (\hat{c})$	$0.0017$	$0.0001$	$0.0015$	$0.0018$	$0.0010$	$- 0.0286$
	$MBias (\hat{c})$	$- 0.0010$	$- 0.0016$	$0.0005$	$- 0.0002$	$- 0.0015$	$0.0011$
	$RMSE (\hat{c})$	$0.0350$	$0.0353$	$0.0432$	$0.0404$	$0.0354$	$0.1805$
	$ABias ({\hat{λ}}_{1})$	$- 0.0002$	$- 0.0005$	$- 0.0015$	$0.0020$	$0.0012$	$- 0.0022$
	$MBias ({\hat{λ}}_{1})$	$- 0.0008$	$- 0.0015$	$- 0.0017$	$0.0015$	$0.0008$	$0.0000$
	$RMSE ({\hat{λ}}_{1})$	$0.0204$	$0.0200$	$0.0243$	$0.0236$	$0.0201$	$0.0277$
	$ABias ({\hat{λ}}_{2})$	$- 0.0019$	$- 0.0003$	$- 0.0025$	$0.0005$	$0.0005$	$- 0.0047$
	$MBias ({\hat{λ}}_{2})$	$- 0.0024$	$- 0.0013$	$- 0.0046$	$- 0.0007$	$- 0.0004$	$- 0.0048$
	$RMSE ({\hat{λ}}_{2})$	$0.0208$	$0.0200$	$0.0247$	$0.0236$	$0.0204$	$0.0286$

* The numbers of convergence cases are 983 (n = 10), 978 (n = 20), 966 (n = 50), 985 (n = 75), and 969 (n = 100).

Table A2. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = 0.1, λ_{2} = 0.4.

n		ML	LQLS	P	LS	WLS	QLS*
10	$ABias (\hat{c})$	$0.0853$	$0.0482$	$0.1121$	$0.1063$	$0.0957$	$0.2065$
	$MBias (\hat{c})$	$- 0.0579$	$0.0199$	$0.0572$	$- 0.0064$	$- 0.0063$	$0.1475$
	$RMSE (\hat{c})$	$0.4478$	$0.3366$	$0.3707$	$0.4333$	$0.4001$	$0.6011$
	$ABias ({\hat{λ}}_{1})$	$0.0019$	$- 0.0008$	$- 0.0048$	$0.0474$	$0.0433$	$0.1051$
	$MBias ({\hat{λ}}_{1})$	$- 0.0912$	$0.0043$	$- 0.0052$	$0.0057$	$0.0108$	$0.0669$
	$RMSE ({\hat{λ}}_{1})$	$0.1462$	$0.1343$	$0.1303$	$0.1696$	$0.1546$	$0.2374$
	$ABias ({\hat{λ}}_{2})$	$- 0.0196$	$0.0061$	$- 0.0652$	$0.0133$	$0.0174$	$- 0.1057$
	$MBias ({\hat{λ}}_{2})$	$- 0.0181$	$- 0.0487$	$- 0.1005$	$- 0.0012$	$0.0031$	$- 0.1360$
	$RMSE ({\hat{λ}}_{2})$	$0.2231$	$0.2375$	$0.2217$	$0.2490$	$0.2405$	$0.2128$
20	$ABias (\hat{c})$	$0.0267$	$0.0280$	$0.0786$	$0.0514$	$0.0415$	$0.1414$
	$MBias (\hat{c})$	$- 0.0238$	$0.0277$	$0.0497$	$0.0007$	$0.0049$	$0.1462$
	$RMSE (\hat{c})$	$0.2725$	$0.2142$	$0.2671$	$0.2535$	$0.2207$	$0.4967$
	$ABias ({\hat{λ}}_{1})$	$- 0.0028$	$0.0014$	$0.0026$	$0.0236$	$0.0186$	$0.0901$
	$MBias ({\hat{λ}}_{1})$	$- 0.0220$	$0.0068$	$0.0010$	$0.0059$	$0.0084$	$0.0626$
	$RMSE ({\hat{λ}}_{1})$	$0.1058$	$0.0833$	$0.1016$	$0.1090$	$0.0901$	$0.2491$
	$ABias ({\hat{λ}}_{2})$	$- 0.0079$	$- 0.0056$	$- 0.0506$	$0.0032$	$0.0042$	$- 0.0928$
	$MBias ({\hat{λ}}_{2})$	$- 0.0059$	$- 0.0299$	$- 0.0700$	$- 0.0044$	$- 0.0046$	$- 0.1111$
	$RMSE ({\hat{λ}}_{2})$	$0.1693$	$0.1553$	$0.1754$	$0.1741$	$0.1587$	$0.1800$
50	$ABias (\hat{c})$	$- 0.0017$	$0.0065$	$0.0224$	$0.0099$	$0.0075$	$0.0880$
	$MBias (\hat{c})$	$- 0.0110$	$0.0019$	$0.0074$	$- 0.0050$	$- 0.0024$	$0.1169$
	$RMSE (\hat{c})$	$0.1232$	$0.1342$	$0.1616$	$0.1359$	$0.1173$	$0.3460$
	$ABias ({\hat{λ}}_{1})$	$- 0.0052$	$- 0.0011$	$- 0.0022$	$0.0054$	$0.0037$	$0.0656$
	$MBias ({\hat{λ}}_{1})$	$- 0.0087$	$0.0020$	$- 0.0034$	$- 0.0006$	$0.0009$	$0.0656$
	$RMSE ({\hat{λ}}_{1})$	$0.0496$	$0.0528$	$0.0686$	$0.0613$	$0.0495$	$0.1350$
	$ABias ({\hat{λ}}_{2})$	$- 0.0014$	$- 0.0011$	$- 0.0175$	$0.0050$	$0.0037$	$- 0.0644$
	$MBias ({\hat{λ}}_{2})$	$- 0.0068$	$- 0.0120$	$- 0.0264$	$- 0.0035$	$- 0.0047$	$- 0.0804$
	$RMSE ({\hat{λ}}_{2})$	$0.0963$	$0.1020$	$0.1163$	$0.1059$	$0.0946$	$0.1496$
75	$ABias (\hat{c})$	$- 0.0002$	$0.0080$	$0.0216$	$0.0117$	$0.0085$	$0.0765$
	$MBias (\hat{c})$	$- 0.0046$	$0.0072$	$0.0143$	$0.0046$	$0.0024$	$0.1187$
	$RMSE (\hat{c})$	$0.0911$	$0.1110$	$0.1336$	$0.1095$	$0.0941$	$0.3288$
	$ABias ({\hat{λ}}_{1})$	$- 0.0035$	$0.0003$	$0.0012$	$0.0059$	$0.0037$	$0.0611$
	$MBias ({\hat{λ}}_{1})$	$- 0.0045$	$0.0022$	$0.0018$	$0.0040$	$0.0026$	$0.0621$
	$RMSE ({\hat{λ}}_{1})$	$0.0356$	$0.0427$	$0.0557$	$0.0485$	$0.0385$	$0.1240$
	$ABias ({\hat{λ}}_{2})$	$- 0.0027$	$- 0.0029$	$- 0.0145$	$0.0005$	$0.0002$	$- 0.0595$
	$MBias ({\hat{λ}}_{2})$	$- 0.0030$	$- 0.0078$	$- 0.0202$	$- 0.0066$	$- 0.0041$	$- 0.0714$
	$RMSE ({\hat{λ}}_{2})$	$0.0746$	$0.0838$	$0.0943$	$0.0832$	$0.0740$	$0.1423$
100	$ABias (\hat{c})$	$0.0011$	$0.0065$	$0.0167$	$0.0088$	$0.0063$	$0.0579$
	$MBias (\hat{c})$	$- 0.0060$	$0.0008$	$0.0092$	$0.0029$	$0.0010$	$0.1192$
	$RMSE (\hat{c})$	$0.0774$	$0.0966$	$0.1161$	$0.0916$	$0.0785$	$0.3323$
	$ABias ({\hat{λ}}_{1})$	$- 0.0021$	$0.0004$	$0.0012$	$0.0047$	$0.0028$	$0.0565$
	$MBias ({\hat{λ}}_{1})$	$- 0.0023$	$0.0009$	$0.0003$	$0.0019$	$0.0017$	$0.0619$
	$RMSE ({\hat{λ}}_{1})$	$0.0307$	$0.0376$	$0.0490$	$0.0408$	$0.0324$	$0.1217$
	$ABias ({\hat{λ}}_{2})$	$- 0.0031$	$- 0.0026$	$- 0.0117$	$0.0004$	$0.0001$	$- 0.0567$
	$MBias ({\hat{λ}}_{2})$	$- 0.0043$	$- 0.0076$	$- 0.0175$	$- 0.0009$	$- 0.0030$	$- 0.0650$
	$RMSE ({\hat{λ}}_{2})$	$0.0636$	$0.0721$	$0.0832$	$0.0696$	$0.0622$	$0.1411$

* The numbers of convergence cases are 949 (n = 10), 962 (n = 20), 960 (n = 50), 959 (n = 75), and 952 (n = 100).

Table A3. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = 0.4, λ_{2} = 0.4.

n		ML	LQLS	P	LS	WLS	QLS*
10	$ABias (\hat{c})$	$0.2517$	$0.1202$	$0.0984$	$0.1953$	$0.1829$	$0.4519$
	$MBias (\hat{c})$	$- 0.0054$	$- 0.0156$	$- 0.0176$	$0.0262$	$0.0081$	$0.2224$
	$RMSE (\hat{c})$	$0.8934$	$0.6031$	$0.5528$	$0.7712$	$0.7186$	$1.1709$
	$ABias ({\hat{λ}}_{1})$	$0.0182$	$0.0019$	$- 0.0730$	$0.0710$	$0.0719$	$0.2361$
	$MBias ({\hat{λ}}_{1})$	$- 0.0427$	$- 0.0259$	$- 0.0932$	$0.0327$	$0.0355$	$0.1309$
	$RMSE ({\hat{λ}}_{1})$	$0.3683$	$0.2677$	$0.2599$	$0.3592$	$0.3393$	$0.6460$
	$ABias ({\hat{λ}}_{2})$	$- 0.0277$	$0.0063$	$- 0.0476$	$0.0442$	$0.0449$	$- 0.1261$
	$MBias ({\hat{λ}}_{2})$	$- 0.0690$	$- 0.0282$	$- 0.0808$	$0.0086$	$0.0186$	$- 0.1559$
	$RMSE ({\hat{λ}}_{2})$	$0.3239$	$0.2735$	$0.2605$	$0.3529$	$0.3380$	$0.2464$
20	$ABias (\hat{c})$	$0.1758$	$0.0492$	$0.0769$	$0.0988$	$0.0739$	$0.2839$
	$MBias (\hat{c})$	$0.0412$	$0.0046$	$0.0001$	$0.0216$	$0.0204$	$0.1953$
	$RMSE (\hat{c})$	$0.5886$	$0.3440$	$0.4030$	$0.4443$	$0.3911$	$0.9170$
	$ABias ({\hat{λ}}_{1})$	$0.0414$	$- 0.0029$	$- 0.0361$	$0.0405$	$0.0304$	$0.1716$
	$MBias ({\hat{λ}}_{1})$	$- 0.0131$	$- 0.0256$	$- 0.0478$	$0.0255$	$0.0212$	$0.1106$
	$RMSE ({\hat{λ}}_{1})$	$0.2776$	$0.1741$	$0.2055$	$0.2389$	$0.2108$	$0.1106$
	$ABias ({\hat{λ}}_{2})$	$- 0.0454$	$- 0.0031$	$- 0.0423$	$0.0126$	$0.0164$	$- 0.1047$
	$MBias ({\hat{λ}}_{2})$	$- 0.0504$	$- 0.0206$	$- 0.0536$	$- 0.0060$	$0.0012$	$- 0.1269$
	$RMSE ({\hat{λ}}_{2})$	$0.2387$	$0.1751$	$0.2069$	$0.2350$	$0.2119$	$0.2008$
50	$ABias (\hat{c})$	$0.0290$	$0.0090$	$0.0181$	$0.0161$	$0.0093$	$0.1586$
	$MBias (\hat{c})$	$- 0.0126$	$- 0.0192$	$- 0.0086$	$- 0.0091$	$- 0.0153$	$0.1755$
	$RMSE (\hat{c})$	$0.2590$	$0.2000$	$0.2475$	$0.2393$	$0.2112$	$0.4555$
	$ABias ({\hat{λ}}_{1})$	$0.0022$	$- 0.0048$	$- 0.0186$	$0.0075$	$0.0042$	$0.1206$
	$MBias ({\hat{λ}}_{1})$	$- 0.0093$	$- 0.0135$	$- 0.0225$	$0.0051$	$- 0.0174$	$0.1279$
	$RMSE ({\hat{λ}}_{1})$	$0.1430$	$0.1116$	$0.1389$	$0.1396$	$0.1221$	$0.3026$
	$ABias ({\hat{λ}}_{2})$	$- 0.0085$	$0.0017$	$- 0.0119$	$0.0149$	$0.0131$	$- 0.0750$
	$MBias ({\hat{λ}}_{2})$	$- 0.0110$	$- 0.0043$	$- 0.0206$	$0.0033$	$0.0022$	$- 0.0941$
	$RMSE ({\hat{λ}}_{2})$	$0.1279$	$0.1132$	$0.1385$	$0.1450$	$0.1265$	$0.1603$
75	$ABias (\hat{c})$	$0.0251$	$0.0102$	$0.0219$	$0.0225$	$0.0192$	$0.1271$
	$MBias (\hat{c})$	$- 0.0001$	$- 0.0064$	$0.0085$	$0.0058$	$0.0040$	$0.1567$
	$RMSE (\hat{c})$	$0.2098$	$0.1634$	$0.2018$	$0.1991$	$0.1993$	$0.4106$
	$ABias ({\hat{λ}}_{1})$	$0.0036$	$- 0.0025$	$- 0.0088$	$0.0103$	$0.0075$	$0.0994$
	$MBias ({\hat{λ}}_{1})$	$- 0.0072$	$- 0.0105$	$- 0.0117$	$0.0085$	$0.0057$	$0.1114$
	$RMSE ({\hat{λ}}_{1})$	$0.1168$	$0.0886$	$0.1096$	$0.1124$	$0.0995$	$0.2683$
	$ABias ({\hat{λ}}_{2})$	$- 0.0091$	$- 0.0013$	$- 0.0120$	$0.0032$	$0.0016$	$- 0.0652$
	$MBias ({\hat{λ}}_{2})$	$- 0.0106$	$- 0.0039$	$- 0.0208$	$- 0.0051$	$- 0.0033$	$- 0.0783$
	$RMSE ({\hat{λ}}_{2})$	$0.0968$	$0.0930$	$0.1127$	$0.1146$	$0.1044$	$0.1518$
100	$ABias (\hat{c})$	$0.0238$	$0.0079$	$0.0171$	$0.0235$	$0.0162$	$0.1031$
	$MBias (\hat{c})$	$- 0.0016$	$- 0.0064$	$0.0018$	$0.0006$	$- 0.0052$	$0.1563$
	$RMSE (\hat{c})$	$0.2038$	$0.1432$	$0.1764$	$0.2003$	$0.1706$	$0.4092$
	$ABias ({\hat{λ}}_{1})$	$0.0051$	$- 0.0020$	$- 0.0060$	$0.0111$	$0.0020$	$0.0888$
	$MBias ({\hat{λ}}_{1})$	$- 0.0027$	$- 0.0059$	$- 0.0068$	$0.0063$	$0.0033$	$0.1110$
	$RMSE ({\hat{λ}}_{1})$	$0.1122$	$0.0800$	$0.0972$	$0.1047$	$0.0859$	$0.2651$
	$ABias ({\hat{λ}}_{2})$	$- 0.0088$	$- 0.0012$	$- 0.0099$	$0.0001$	$0.0001$	$- 0.0645$
	$MBias ({\hat{λ}}_{2})$	$- 0.0117$	$- 0.0051$	$- 0.0185$	$- 0.0044$	$- 0.0021$	$- 0.0752$
	$RMSE ({\hat{λ}}_{2})$	$0.0825$	$0.0798$	$0.0987$	$0.0993$	$0.0892$	$0.1525$

* The numbers of convergence cases are 977 (n = 10), 958 (n = 20), 962 (n = 50), 960 (n = 75), and 949 (n = 100).

Table A4. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = 0.4, λ_{2} = 0.9.

n		ML	LQLS	P	LS	WLS	QLS*
10	$ABias (\hat{c})$	$0.8308$	$0.3513$	$0.4594$	$0.7081$	$0.6520$	$6.5721$
	$MBias (\hat{c})$	$- 0.0791$	$0.0156$	$0.0484$	$0.0029$	$- 0.0105$	$1.1266$
	$RMSE (\hat{c})$	$3.3311$	$1.2160$	$1.5800$	$2.9448$	$2.6376$	$19.7535$
	$ABias ({\hat{λ}}_{1})$	$0.0247$	$- 0.0001$	$- 0.0694$	$0.1238$	$0.1205$	$2.1121$
	$MBias ({\hat{λ}}_{1})$	$- 0.0987$	$- 0.0106$	$- 0.0639$	$0.0315$	$0.0551$	$0.5679$
	$RMSE ({\hat{λ}}_{1})$	$0.4876$	$0.3658$	$0.3765$	$0.4962$	$0.4657$	$11.4752$
	$ABias ({\hat{λ}}_{2})$	$- 0.0515$	$0.0138$	$- 0.0797$	$0.0504$	$0.0574$	$- 0.5457$
	$MBias ({\hat{λ}}_{2})$	$- 0.0566$	$- 0.1060$	$- 0.1754$	$0.0317$	$0.0047$	$- 0.5764$
	$RMSE ({\hat{λ}}_{2})$	$0.5814$	$0.5502$	$0.5653$	$0.6279$	$0.6095$	$0.6298$
20	$ABias (\hat{c})$	$0.5669$	$0.1522$	$0.2916$	$0.2796$	$0.2231$	$11.8500$
	$MBias (\hat{c})$	$0.0053$	$0.0482$	$0.0824$	$0.0234$	$0.0045$	$1.5789$
	$RMSE (\hat{c})$	$1.9100$	$0.6254$	$0.8795$	$0.9625$	$0.8223$	$39.5185$
	$ABias ({\hat{λ}}_{1})$	$0.0690$	$0.0007$	$- 0.0200$	$0.0645$	$0.0508$	$2.9156$
	$MBias ({\hat{λ}}_{1})$	$- 0.0317$	$0.0013$	$- 0.0364$	$0.0266$	$0.0245$	$0.8483$
	$RMSE ({\hat{λ}}_{1})$	$0.4436$	$0.2305$	$0.2819$	$0.3250$	$0.2808$	$7.1807$
	$ABias ({\hat{λ}}_{2})$	$- 0.0734$	$- 0.0111$	$- 0.0995$	$0.0143$	$0.0180$	$- 0.5095$
	$MBias ({\hat{λ}}_{2})$	$- 0.0675$	$- 0.0597$	$- 0.1403$	$0.0028$	$0.0007$	$- 0.5279$
	$RMSE ({\hat{λ}}_{2})$	$0.4729$	$0.3577$	$0.4106$	$0.4349$	$0.3980$	$0.5904$
50	$ABias (\hat{c})$	$0.0958$	$0.0449$	$0.0913$	$0.0702$	$0.0549$	$25.8752$
	$MBias (\hat{c})$	$- 0.0172$	$- 0.0112$	$0.0076$	$- 0.0230$	$- 0.0146$	$2.2231$
	$RMSE (\hat{c})$	$0.6234$	$0.3466$	$0.4536$	$0.4785$	$0.3971$	$122.0664$
	$ABias ({\hat{λ}}_{1})$	$0.0068$	$- 0.0047$	$- 0.0151$	$0.0140$	$0.0106$	$6.5544$
	$MBias ({\hat{λ}}_{1})$	$- 0.0163$	$- 0.0017$	$- 0.0172$	$0.0001$	$0.0071$	$1.5314$
	$RMSE ({\hat{λ}}_{1})$	$0.2183$	$0.1468$	$0.1905$	$0.1929$	$0.1629$	$24.7446$
	$ABias ({\hat{λ}}_{2})$	$- 0.0206$	$- 0.0008$	$- 0.0347$	$0.0183$	$0.0132$	$- 0.4825$
	$MBias ({\hat{λ}}_{2})$	$- 0.0293$	$- 0.0215$	$- 0.0544$	$- 0.0038$	$- 0.0095$	$- 0.4759$
	$RMSE ({\hat{λ}}_{2})$	$0.2510$	$0.2342$	$0.2728$	$0.2703$	$0.2429$	$0.5576$
75	$ABias (\hat{c})$	$0.0677$	$0.0390$	$0.0775$	$0.0662$	$0.0511$	$36.4554$
	$MBias (\hat{c})$	$0.0080$	$0.0073$	$0.0264$	$0.0080$	$0.0030$	$2.6556$
	$RMSE (\hat{c})$	$0.4865$	$0.2835$	$0.3631$	$0.4389$	$0.3886$	$155.8396$
	$ABias ({\hat{λ}}_{1})$	$0.0080$	$- 0.0010$	$- 0.0038$	$0.0177$	$0.0118$	$8.4604$
	$MBias ({\hat{λ}}_{1})$	$- 0.0041$	$0.0002$	$- 0.0017$	$0.0134$	$0.0057$	$1.9628$
	$RMSE ({\hat{λ}}_{1})$	$0.1639$	$0.1177$	$0.1527$	$0.1578$	$0.1301$	$19.2592$
	$ABias ({\hat{λ}}_{2})$	$- 0.0250$	$- 0.0057$	$- 0.0298$	$0.0022$	$0.0011$	$- 0.4697$
	$MBias ({\hat{λ}}_{2})$	$- 0.0270$	$- 0.0155$	$- 0.0410$	$- 0.0112$	$- 0.0063$	$- 0.4659$
	$RMSE ({\hat{λ}}_{2})$	$0.1905$	$0.1925$	$0.2215$	$0.2117$	$0.1885$	$0.5442$
100	$ABias (\hat{c})$	$0.0541$	$0.0304$	$0.0604$	$0.0476$	$0.0457$	$41.6837$
	$MBias (\hat{c})$	$0.0076$	$0.0034$	$0.0175$	$0.0019$	$0.0015$	$2.8660$
	$RMSE (\hat{c})$	$0.4540$	$0.2453$	$0.3105$	$0.2947$	$0.4184$	$173.9902$
	$ABias ({\hat{λ}}_{1})$	$0.0048$	$- 0.0005$	$- 0.0017$	$0.0147$	$0.0109$	$10.6657$
	$MBias ({\hat{λ}}_{1})$	$0.0032$	$- 0.0007$	$0.0028$	$0.0071$	$0.0029$	$2.3627$
	$RMSE ({\hat{λ}}_{1})$	$0.1220$	$0.1049$	$0.1346$	$0.1308$	$0.1192$	$25.2684$
	$ABias ({\hat{λ}}_{2})$	$- 0.0221$	$- 0.0050$	$- 0.0251$	$0.0004$	$- 0.0011$	$- 0.4648$
	$MBias ({\hat{λ}}_{2})$	$- 0.0265$	$- 0.0133$	$- 0.0385$	$- 0.0053$	$- 0.0042$	$- 0.4600$
	$RMSE ({\hat{λ}}_{2})$	$0.1584$	$0.1652$	$0.1948$	$0.1762$	$0.1627$	$0.5403$

* The numbers of convergence cases are 913 (n = 10), 936 (n = 20), 941 (n = 50), 946 (n = 75), and 944 (n = 100).

Table A5. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = 0.9, λ_{2} = 0.4.

n		ML	LQLS	P	LS	WLS	QLS*
10	$ABias (\hat{c})$	$0.5116$	$0.4565$	$0.1674$	$0.4133$	$0.4064$	$0.7991$
	$MBias (\hat{c})$	$0.1972$	$- 0.0579$	$- 0.1179$	$0.0812$	$0.0219$	$0.3257$
	$RMSE (\hat{c})$	$1.3260$	$2.0416$	$1.0804$	$1.3088$	$1.2627$	$1.8824$
	$ABias ({\hat{λ}}_{1})$	$0.0642$	$0.0064$	$- 0.1922$	$0.1072$	$0.1207$	$0.4752$
	$MBias ({\hat{λ}}_{1})$	$0.0193$	$- 0.0777$	$- 0.2576$	$0.0638$	$0.0787$	$0.2136$
	$RMSE ({\hat{λ}}_{1})$	$0.6308$	$0.5373$	$0.5156$	$0.6451$	$0.6222$	$1.3140$
	$ABias ({\hat{λ}}_{2})$	$- 0.0569$	$0.0068$	$- 0.0180$	$0.0733$	$0.0634$	$- 0.1531$
	$MBias ({\hat{λ}}_{2})$	$- 0.1791$	$- 0.0025$	$- 0.0488$	$0.0034$	$0.0088$	$- 0.2005$
	$RMSE ({\hat{λ}}_{2})$	$0.4170$	$0.3701$	$0.3542$	$0.4941$	$0.4647$	$0.2817$
20	$ABias (\hat{c})$	$0.4242$	$0.1479$	$0.1295$	$0.2003$	$0.1596$	$0.5729$
	$MBias (\hat{c})$	$0.2188$	$- 0.0407$	$- 0.0539$	$0.0691$	$0.0439$	$0.2837$
	$RMSE (\hat{c})$	$0.9908$	$0.7706$	$0.7506$	$0.7503$	$0.6785$	$1.4994$
	$ABias ({\hat{λ}}_{1})$	$0.1136$	$- 0.0102$	$- 0.1035$	$0.0593$	$0.0472$	$0.3848$
	$MBias ({\hat{λ}}_{1})$	$0.0605$	$- 0.0647$	$- 0.1349$	$0.0448$	$0.0341$	$0.1899$
	$RMSE ({\hat{λ}}_{1})$	$0.4821$	$0.3566$	$0.4080$	$0.4346$	$0.3968$	$1.1138$
	$ABias ({\hat{λ}}_{2})$	$- 0.0939$	$0.0010$	$- 0.0279$	$0.0249$	$0.0268$	$- 0.1224$
	$MBias ({\hat{λ}}_{2})$	$- 0.1189$	$0.0011$	$- 0.0344$	$0.0014$	$0.0070$	$- 0.1507$
	$RMSE ({\hat{λ}}_{2})$	$0.3044$	$0.2324$	$0.2815$	$0.3192$	$0.2837$	$0.2304$
50	$ABias (\hat{c})$	$0.2037$	$0.0335$	$0.0343$	$0.0426$	$0.0254$	$0.3695$
	$MBias (\hat{c})$	$0.0297$	$- 0.0351$	$- 0.0480$	$- 0.0083$	$- 0.0221$	$0.2526$
	$RMSE (\hat{c})$	$0.6502$	$0.3682$	$0.4320$	$0.3960$	$0.3338$	$2.0387$
	$ABias ({\hat{λ}}_{1})$	$0.0820$	$- 0.0109$	$- 0.0466$	$0.0126$	$0.0071$	$0.3530$
	$MBias ({\hat{λ}}_{1})$	$0.0179$	$- 0.0284$	$- 0.0670$	$- 0.0015$	$0.0021$	$0.2278$
	$RMSE ({\hat{λ}}_{1})$	$0.3711$	$0.2296$	$0.2731$	$0.2602$	$0.2292$	$2.1254$
	$ABias ({\hat{λ}}_{2})$	$- 0.0493$	$0.0063$	$- 0.0023$	$0.0246$	$0.0208$	$- 0.0894$
	$MBias ({\hat{λ}}_{2})$	$- 0.0285$	$0.0062$	$- 0.0084$	$0.0088$	$0.0138$	$- 0.1094$
	$RMSE ({\hat{λ}}_{2})$	$0.1973$	$0.1466$	$0.1899$	$0.1941$	$0.1620$	$0.1792$
75	$ABias (\hat{c})$	$0.1406$	$0.0268$	$0.0366$	$0.0424$	$0.0327$	$0.2482$
	$MBias (\hat{c})$	$0.0149$	$- 0.0173$	$- 0.0043$	$0.0078$	$0.0016$	$0.2125$
	$RMSE (\hat{c})$	$0.5084$	$0.2902$	$0.3427$	$0.3067$	$0.2891$	$1.1944$
	$ABias ({\hat{λ}}_{1})$	$0.0569$	$- 0.0073$	$- 0.0265$	$0.0143$	$0.0098$	$0.2395$
	$MBias ({\hat{λ}}_{1})$	$0.0028$	$- 0.0245$	$- 0.0336$	$0.0126$	$0.0048$	$0.2033$
	$RMSE ({\hat{λ}}_{1})$	$0.2913$	$0.1814$	$0.2134$	$0.2039$	$0.1808$	$1.2468$
	$ABias ({\hat{λ}}_{2})$	$- 0.0364$	$0.0014$	$- 0.0075$	$0.0074$	$0.0055$	$- 0.0800$
	$MBias ({\hat{λ}}_{2})$	$- 0.0185$	$0.0029$	$- 0.0138$	$- 0.0033$	$0.0020$	$- 0.0916$
	$RMSE ({\hat{λ}}_{2})$	$0.1591$	$0.1202$	$0.1547$	$0.1522$	$0.1290$	$0.1688$
100	$ABias (\hat{c})$	$0.1248$	$0.0207$	$0.0300$	$0.0321$	$0.0263$	$0.2184$
	$MBias (\hat{c})$	$0.0117$	$- 0.0137$	$- 0.0154$	$- 0.0015$	$- 0.0061$	$0.2094$
	$RMSE (\hat{c})$	$0.4667$	$0.2570$	$0.3008$	$0.2611$	$0.2446$	$0.5693$
	$ABias ({\hat{λ}}_{1})$	$0.0501$	$- 0.0060$	$- 0.0182$	$0.0112$	$0.0085$	$0.2250$
	$MBias ({\hat{λ}}_{1})$	$0.0027$	$- 0.0190$	$- 0.0266$	$0.0119$	$0.0043$	$0.2164$
	$RMSE ({\hat{λ}}_{1})$	$0.2663$	$0.1645$	$0.1900$	$0.1731$	$0.1542$	$0.9415$
	$ABias ({\hat{λ}}_{2})$	$- 0.0307$	$0.0012$	$- 0.0069$	$0.0055$	$0.0027$	$- 0.0693$
	$MBias ({\hat{λ}}_{2})$	$- 0.0149$	$0.0026$	$- 0.0121$	$- 0.0031$	$- 0.0010$	$- 0.0826$
	$RMSE ({\hat{λ}}_{2})$	$0.1415$	$0.1040$	$0.1350$	$0.1275$	$0.1101$	$0.1561$

* The numbers of convergence cases are 971 (n = 10), 974 (n = 20), 963 (n = 50), 955 (n = 75), and 967 (n = 100).

Note

1	https://data.world/garyhoov/household-income-by-state (accessed on 6 June 2024).

References

Beirlant, Jan, Frederico Caeiro, and M. Ivette Gomes. 2012. An overview and open research topics in statistics of univariate extremes. Revstat–Statistical Journal 10: 1–31.
Beirlant, Jan, Yuri Goegebeur, Johan Segers, and Jozef L. Teugels. 2004. Statistics of Extremes: Theory and Applications. Chichester: John Wiley & Sons.
Bhatti, Sajjad Haider, Shahzad Hussain, Tanvir Ahmad, Muhammad Aslam, Muhammad Aftab, and Muhammad Ali Raza. 2018. Efficient estimation of Pareto model: Some modified percentile estimators. PLoS ONE 13: e0196456.
Bowley, Arthur L. 1901. Elements of Statistics. London: PS King & Son.
Burr, Irving W. 1942. Cumulative frequency functions. The Annals of Mathematical Statistics 13: 215–32.
Caeiro, Frederico, Ana P. Martins, and Inês J. Sequeira. 2015. Finite sample behaviour of classical and quantile regression estimators for the Pareto distribution. In AIP Conference Proceedings. New York: AIP Publishing LLC.
Caeiro, Frederico, and Ayana Mateus. 2023. A new class of generalized probability-weighted moment estimators for the Pareto distribution. Mathematics 11: 1076.
Caeiro, Frederico, and Ayana Mateus. 2024. Reduced bias estimation of the shape parameter of the log-logistic distribution. Journal of Computational and Applied Mathematics 436: 115347.
Dagum, Camilo. 1977. A new model for personal income distribution: Specification and estimation. Économie Appliquée 30: 413–37.
Finkelstein, Mark, Howard G. Tucker, and Jerry Alan Veeh. 2006. Pareto tail index estimation revisited. North American Actuarial Journal 10: 1–10.
Gilchrist, Warren. 2000. Statistical Modelling with Quantile Functions. Boca Raton: Chapman & Hall/CRC.
Giorgi, Giovanni Maria, and Saralees Nadarajah. 2010. Bonferroni and Gini indices for various parametric families of distributions. METRON 68: 23–46.
Hall, David J. 1991. Repeat Variability in Instantaneously Released Heavy Gas Clouds–Some Wind Tunnel Experiments. Technical report LR 804 (PA). Stevenage: Warren Spring Laboratory.
Hankin, Robin K. S., and Alan Lee. 2006. A new family of non-negative distributions. Australian & New Zealand Journal of Statistics 48: 67–78.
Kao, John H. K. 1959. A graphical estimation of mixed Weibull parameters in life-testing of electron tubes. Technometrics 1: 389–407.
Lu, Hai-Lin, and Shin-Hwa Tao. 2007. The estimation of Pareto distribution by a weighted least square method. Quality & Quantity 41: 913–26.
Mateus, Ayana, and Frederico Caeiro. 2022. Improved shape parameter estimation for the three-parameter log-logistic distribution. Computational and Mathematical Methods 2022: 8400130.
Mehta, Navya Jayesh, and Fan Yang. 2022. Portfolio optimization for extreme risks with maximum diversification: An empirical analysis. Risks 10: 101.
Moors, Johannes Josephus Antonius. 1988. A quantile alternative for kurtosis. Journal of the Royal Statistical Society: Series D (The Statistician) 37: 25–32.
Nair, Narayanan Unnikrishnan, and Balakrishnapillai Vineshkumar. 2010. L-moments of residual life. Journal of Statistical Planning and Inference 140: 2618–31.
Nair, Narayanan Unnikrishnan, Paduthol Godan Sankaran, and Narayanaswamy Balakrishnan. 2013. Quantile-Based Reliability Analysis. New York: Springer.
Ndlovu, Thabani, and Delson Chikobvu. 2023. The generalised Pareto distribution model approach to comparing extreme risk in the exchange rate risk of Bitcoin/US dollar and South African rand/US dollar returns. Risks 11: 100.
Ramberg, John S., and Bruce W. Schmeiser. 1972. An approximate method for generating symmetric random variables. Communications of the Association for Computing Machinery 15: 987–90.
Reiss, Rolf-Dieter, and Michael Thomas. 2007. Statistical Analysis of Extreme Values with Applications to Insurance, Finance, Hydrology and Other Fields. Basel: Birkhäuser Verlag.
Rytgaard, Mette. 1990. Estimation in the Pareto distribution. ASTIN Bulletin: The Journal of the IAA 20: 201–16.
Sankaran, Paduthol Godan, Narayanan Unnikrishnan Nair, and Narayanan Nellikkattu Midhu. 2016. A new quantile function with applications to reliability analysis. Communications in Statistics-Simulation and Computation 45: 566–82.
Schluter, Christian. 2018. Top incomes, heavy tails, and rank-size regressions. Econometrics 6: 10.
Shakeel, Muhammad, Muhammad Ahsan ul Haq, Ijaz Hussain, Alaa Mohamd Abdulhamid, and Muhammad Faisal. 2016. Comparison of two new robust parameter estimation methods for the power function distribution. PLoS ONE 11: e0160692.
Sunoj, Sreenarayanapurath Madhavan, and Paduthol Godan Sankaran. 2012. Quantile based entropy function. Statistics & Probability Letters 82: 1049–53.
Tukey, John Wilder. The Practical Relationship between the Common Transformations of Percentages and Counts and of Amounts. Technical Report 36. Princeton: Statistical Techniques Research Group, Princeton University.
Zaka, Azam, Navid Feroze, and Ahmad Saeed Akhter. 2013. A note on modified estimators for the parameters of the power function distribution. International Journal of Advanced Science and Technology 59: 71–84.

Figure 1. The density function in (3) for fixed parameters c = 1, λ_{1} = 0.1 (left), λ_{1} = 0.4 (right), and selected values for λ_{2}.

Figure 2. The density function for cases 1–5.

Figure 3. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = 0.1, λ_{2} = 0.1.

Figure 4. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = 0.1, λ_{2} = 0.4.

Figure 5. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = 0.4, λ_{2} = 0.4.

Figure 6. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with c = 1, λ_{1} = 0.4, λ_{2} = 0.9. The QLS method is omitted because its values fall outside the range of the plot.

Figure 7. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $λ_{1} = 0.9$, $λ_{2} = 0.4$.

Figure 8. Histogram and boxplot for the household income dataset.

Figure 9. Q-Q plots of the household income dataset.

Figure 10. Empirical vs. fitted CDFs using different estimators for the household income dataset.

Figure 11. Histogram and boxplot for the peak concentration dataset.

Figure 12. Q-Q plots of the peak concentration dataset.

Figure 13. Empirical vs. fitted CDFs using different estimators for the peak concentration dataset.

Table 1. Monte Carlo simulated average absolute difference $D_{abs}$ in Equation (19).

$λ_{1}$	$λ_{2}$	n	ML	LQLS	P	LS	WLS	QLS*
0.10	0.10	10	$0.0996$	$0.0935$	$0.1041$	$0.0927$	$0.0912$	$0.0877$
		20	$0.0668$	$0.0633$	$0.0717$	$0.0656$	$0.0642$	$0.0617$
		50	$0.0405$	$0.0399$	$0.0437$	$0.0410$	$0.0399$	$0.0395$
		75	$0.0330$	$0.0318$	$0.0352$	$0.0329$	$0.0319$	$0.0319$
		100	$0.0290$	$0.0278$	$0.0307$	$0.0283$	$0.0276$	$0.0277$
	0.40	10	$0.0983$	$0.1176$	$0.1287$	$0.0921$	$0.0905$	$0.0965$
		20	$0.0675$	$0.0723$	$0.0841$	$0.0653$	$0.0635$	$0.0728$
		50	$0.0403$	$0.0430$	$0.0493$	$0.0409$	$0.0395$	$0.0538$
		75	$0.0315$	$0.0338$	$0.0373$	$0.0328$	$0.0316$	$0.0474$
		100	$0.0271$	$0.0292$	$0.0320$	$0.0282$	$0.0273$	$0.0456$
0.40	0.40	10	$0.1000$	$0.0935$	$0.1056$	$0.0926$	$0.0912$	$0.0954$
		20	$0.0686$	$0.0633$	$0.0717$	$0.0656$	$0.0642$	$0.0694$
		50	$0.0405$	$0.0399$	$0.0437$	$0.0410$	$0.0399$	$0.0483$
		75	$0.0320$	$0.0318$	$0.0352$	$0.0330$	$0.0320$	$0.0412$
		100	$0.0278$	$0.0278$	$0.0307$	$0.0284$	$0.0276$	$0.0383$
	0.90	10	$0.0991$	$0.1053$	$0.1237$	$0.0923$	$0.0909$	$0.1129$
		20	$0.0688$	$0.0663$	$0.0775$	$0.0655$	$0.0640$	$0.1199$
		50	$0.0406$	$0.0408$	$0.0451$	$0.0411$	$0.0398$	$0.1458$
		75	$0.0325$	$0.0325$	$0.0356$	$0.0329$	$0.0318$	$0.1612$
		100	$0.0276$	$0.0283$	$0.0310$	$0.0283$	$0.0275$	$0.1736$
0.90	0.40	10	$0.0996$	$0.0922$	$0.1015$	$0.0924$	$0.0910$	$0.0995$
		20	$0.0690$	$0.0640$	$0.0708$	$0.0654$	$0.0641$	$0.0749$
		50	$0.0440$	$0.0403$	$0.0440$	$0.0410$	$0.0398$	$0.0527$
		75	$0.0349$	$0.0321$	$0.0354$	$0.0329$	$0.0318$	$0.0441$
		100	$0.0296$	$0.0282$	$0.0310$	$0.0283$	$0.0275$	$0.0413$

* Convergence of this estimation method is not achieved in all cases.

Table 2. Monte Carlo simulated average of the maximum absolute difference $D_{\max}$ in Equation (20).

$λ_{1}$	$λ_{2}$	n	ML	LQLS	P	LS	WLS	QLS*
0.10	0.10	10	$0.1876$	$0.1658$	$0.1961$	$0.1592$	$0.1559$	$0.1535$
		20	$0.1234$	$0.1115$	$0.1347$	$0.1157$	$0.1122$	$0.1089$
		50	$0.0723$	$0.0698$	$0.0802$	$0.0734$	$0.0705$	$0.0693$
		75	$0.0584$	$0.0557$	$0.0630$	$0.0587$	$0.0561$	$0.0562$
		100	$0.0515$	$0.0487$	$0.0554$	$0.0507$	$0.0484$	$0.0488$
	0.40	10	$0.1807$	$0.2246$	$0.2555$	$0.1562$	$0.1528$	$0.1682$
		20	$0.1239$	$0.1353$	$0.1722$	$0.1143$	$0.1098$	$0.1304$
		50	$0.0715$	$0.0786$	$0.0982$	$0.0727$	$0.0692$	$0.0982$
		75	$0.0552$	$0.0605$	$0.0712$	$0.0580$	$0.0550$	$0.0882$
		100	$0.0475$	$0.0519$	$0.0600$	$0.0500$	$0.0473$	$0.0855$
0.40	0.40	10	$0.1894$	$0.1656$	$0.1990$	$0.1592$	$0.1561$	$0.1654$
		20	$0.1278$	$0.1115$	$0.1348$	$0.1159$	$0.1122$	$0.1234$
		50	$0.0722$	$0.0698$	$0.0802$	$0.0734$	$0.0705$	$0.0878$
		75	$0.0567$	$0.0557$	$0.0630$	$0.0590$	$0.0567$	$0.0766$
		100	$0.0491$	$0.0487$	$0.0554$	$0.0511$	$0.0491$	$0.0714$
	0.90	10	$0.1855$	$0.1919$	$0.2394$	$0.1578$	$0.1542$	$0.2110$
		20	$0.1272$	$0.1179$	$0.1514$	$0.1156$	$0.1115$	$0.2456$
		50	$0.0727$	$0.0718$	$0.0845$	$0.0735$	$0.0706$	$0.3158$
		75	$0.0574$	$0.0572$	$0.0642$	$0.0586$	$0.0558$	$0.3550$
		100	$0.0488$	$0.0498$	$0.0561$	$0.0506$	$0.0484$	$0.3816$
0.90	0.40	10	$0.1860$	$0.1614$	$0.1896$	$0.1586$	$0.1553$	$0.1704$
		20	$0.1289$	$0.1126$	$0.1335$	$0.1152$	$0.1119$	$0.1323$
		50	$0.0809$	$0.0708$	$0.0817$	$0.0732$	$0.0701$	$0.0954$
		75	$0.0630$	$0.0565$	$0.0640$	$0.0590$	$0.0563$	$0.0815$
		100	$0.0527$	$0.0495$	$0.0561$	$0.0506$	$0.0485$	$0.0771$

* Convergence of this estimation method is not achieved in all cases.

Table 3. Household income by state dataset.

60,309	48,237	77,351	58,328	46,894	68,070	72,084	77,556	59,294	72,508
52,277	54,678	73,684	57,780	62,706	57,300	60,365	58,032	46,345	43,103
51,950	75,346	73,820	58,319	71,728	41,983	56,199	58,302	60,651	56,623
77,900	69,940	49,493	62,758	54,920	61,478	55,146	52,039	60,407	62,290
62,851	55,505	58,685	52,448	59,396	68,932	62,145	67,880	71,822	45,308
61,103	59,073

Table 4. Parameter estimates under all methods, with K-S statistics, the associated p-values, and $r_{Q}$, for the household income dataset.

Method	$\hat{c}$	${\hat{λ}}_{1}$	${\hat{λ}}_{2}$	K-S	p-Value	$r_{Q}$
ML	59,636.68	0.0855	0.0909	0.0965	0.6819	0.9838
LQLS	61,322.05	0.0981	0.0723	0.1064	0.5624	0.9868
P	58,936.56	0.0904	0.1004	0.0858	0.8067	0.9822
LS	59,604.21	0.0893	0.0937	0.0980	0.6639	0.9836
WLS	59,578.92	0.0890	0.0934	0.0965	0.6823	0.9837
QLS	62,520.99	0.1093	0.0635	0.1222	0.3880	0.9858
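
As a rough illustration of how the K-S column in Table 4 can be checked, the Python sketch below computes the one-sample K-S statistic for the ML estimates on the Table 3 data. It assumes the power–Pareto quantile function $Q(u) = c\,u^{λ_{1}}(1-u)^{-λ_{2}}$, $0 < u < 1$, and evaluates the fitted CDF by inverting $Q$ with bisection. The function names, the bisection tolerance, and the choice of pure Python are illustrative assumptions, not the authors' implementation.

```python
# Household income data transcribed from Table 3 (52 observations).
income = [
    60309, 48237, 77351, 58328, 46894, 68070, 72084, 77556, 59294, 72508,
    52277, 54678, 73684, 57780, 62706, 57300, 60365, 58032, 46345, 43103,
    51950, 75346, 73820, 58319, 71728, 41983, 56199, 58302, 60651, 56623,
    77900, 69940, 49493, 62758, 54920, 61478, 55146, 52039, 60407, 62290,
    62851, 55505, 58685, 52448, 59396, 68932, 62145, 67880, 71822, 45308,
    61103, 59073,
]


def power_pareto_quantile(u, c, lam1, lam2):
    """Assumed quantile function Q(u) = c * u**lam1 / (1 - u)**lam2, 0 < u < 1."""
    return c * u**lam1 / (1.0 - u) ** lam2


def power_pareto_cdf(x, c, lam1, lam2, tol=1e-12):
    """Fitted CDF F(x), obtained by solving Q(u) = x with bisection.

    Q is strictly increasing on (0, 1) for positive parameters,
    so the root is unique.
    """
    if x <= 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power_pareto_quantile(mid, c, lam1, lam2) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ks_statistic(sample, cdf):
    """One-sample K-S statistic: sup |F_n(x) - F(x)| over the sample points."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare F with the empirical CDF just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# ML estimates from Table 4.
c_hat, lam1_hat, lam2_hat = 59636.68, 0.0855, 0.0909
d_ml = ks_statistic(
    income, lambda x: power_pareto_cdf(x, c_hat, lam1_hat, lam2_hat)
)
print(f"K-S statistic (ML fit): {d_ml:.4f}")
```

If the transcribed data and rounded estimates are faithful, the printed value should be close to the K-S entry reported for ML in Table 4. Replacing the three estimates with another row of the table gives the corresponding statistic for that method.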

Table 5. Peak concentration dataset.

12.100	1.701	9.074	7.056	7.025	4.777	8.870	7.656	10.920	6.806
8.757	5.670	12.890	7.119	2.523	9.055	7.341	3.938	10.460	11.050
6.678	3.026	6.806	11.750	5.742	4.007	7.340	2.849	6.418	8.456
5.702	7.262	6.086	7.568	7.941	14.030	7.844	3.150	7.818	8.554
5.796	3.497	7.087	15.800	4.316	7.591	13.990	9.185	6.286	11.040
11.280	6.804	5.292	6.273	10.840	6.587	8.757	9.344	5.513	11.040
16.160	11.500	5.072	9.041	8.927	7.560	4.694	6.832	15.380	10.250
10.550	7.655	5.229	14.900	7.087	2.646	3.704	9.293	6.117	13.650
5.072	6.045	6.458	4.993	7.403	13.480	11.530	9.926	3.451	16.910
9.010	3.215	5.859	10.020	6.962	11.440	5.765	6.928	5.171	7.825

Table 6. Parameter estimates under all methods, with K-S statistics, the associated p-values, and $r_{Q}$, for the peak concentration dataset.

Method	$\hat{c}$	${\hat{λ}}_{1}$	${\hat{λ}}_{2}$	K-S	p-Value	$r_{Q}$
ML	8.3220	0.3189	0.1812	0.0634	0.7924	0.9928
LQLS	8.4020	0.3213	0.1729	0.0651	0.7649	0.9934
P	7.9760	0.3182	0.1984	0.0590	0.8565	0.9908
LS	7.7014	0.2740	0.2293	0.0509	0.9459	0.9834
WLS	8.1498	0.3142	0.1980	0.0584	0.8647	0.9908
QLS	9.1620	0.3842	0.1383	0.0854	0.4349	0.9930

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
