Article

Mean Squared Error Representative Points of Pareto Distributions and Their Estimation

1 Faculty of Science and Technology, BNU-HKBU United International College, Zhuhai 519087, China
2 Guangdong Provincial/Zhuhai Key Laboratory of IRADS, BNU-HKBU United International College, Zhuhai 519087, China
* Author to whom correspondence should be addressed.
Entropy 2025, 27(3), 249; https://doi.org/10.3390/e27030249
Submission received: 25 January 2025 / Revised: 17 February 2025 / Accepted: 25 February 2025 / Published: 27 February 2025
(This article belongs to the Special Issue Number Theoretic Methods in Statistics: Theory and Applications)

Abstract

Pareto distributions are widely applied in various fields, such as economics, finance, and environmental studies. The modeling of real-world data has created a demand for the discretization of Pareto distributions. In this paper, we propose using mean squared error representative points (MSE-RPs) as the discrete representation of Pareto distributions. We demonstrate the existence and uniqueness of these representative points under certain parameter settings and provide a theoretical k-means algorithm for the computation of MSE-RPs for Pareto I and Pareto II distributions. Furthermore, to enhance the applicability of MSE-RPs, we employ three methodological approaches to estimate the MSE-RPs of Pareto distributions. By analyzing the estimation bias under different parameters and methods, we recommend estimating the distribution parameters first before estimating the MSE-RPs for Pareto I and Pareto II distributions. For Pareto III and Pareto IV distributions, we suggest using the B_q quantile estimator for MSE-RP estimation. Building on this, we analyze the sources of estimation bias and propose an effective method for determining the number of MSE-RPs based on information gain truncation. Through simulations and real data studies, we demonstrate that the proposed methods for MSE-RP estimation are effective and can be used to fit the empirical distribution function of data accurately.

1. Introduction

The Pareto distribution was originally introduced by the Italian economist Vilfredo Pareto in his seminal work on economics, where it was used as a model for income distribution [1]. Pareto observed that a small proportion of the population tends to control the majority of wealth—a phenomenon commonly referred to as the “80/20 rule”, meaning that 20% of people own 80% of the income. Since then, this heavy-tailed distribution has been widely applied in fields such as economics, finance, and risk management [2], particularly for modeling phenomena characterized by significant inequality or extreme outcomes. Examples include modeling stock return distributions [3] and the extreme tails of financial and insurance loss datasets [4]. The classical Pareto distribution, often referred to as the Pareto Type I distribution, is defined by a skewed heavy-tailed model:
$$F_X(x) = 1 - \left(\frac{x}{\sigma}\right)^{-\alpha}, \quad x \geq \sigma > 0,\ \alpha > 0,$$
where σ > 0 is the scale parameter, typically representing the minimum income in income models, and  α > 0 is the shape parameter, commonly referred to as the Pareto index. The Pareto index α serves as a measure of inequality, with larger values indicating more equitable distributions. For income analysis, the Pareto index typically fluctuates around 1.5 [5].
While the Pareto Type I distribution performs well in modeling the income of high-income groups, it often fails to capture income distributions across the entire population. To address this limitation, economists and statisticians have extended the Pareto distribution by introducing additional flexibility through the location parameter μ and inequality parameter γ , resulting in the development of Pareto Type II, III, and IV distributions [6]. These generalized Pareto distributions have found broader applications. For instance, the Pareto Type II distribution is used for flood modeling and rainfall analysis [7], the Pareto Type III distribution is commonly applied to earthquake intensity modeling [8], and the Pareto Type IV distribution is employed in insurance risk analysis [9].
In statistical research, most studies on the Pareto family focus on data modeling and statistical inference. A major challenge in statistical inference lies in the complex expressions of the generalized Pareto family. Parameter estimation for the Pareto family often involves significant computational effort due to the lack of closed-form expressions for methods such as moment estimation (ME) and maximum likelihood estimation (MLE). Additionally, considerable research has focused on the performance of parameter estimation in small samples and the robustness of these estimates [10,11,12]. In practical data modeling, it is often necessary to discretize continuous data into discrete models while retaining the desirable properties of continuous distributions. For example, Ghosh [13] proposed a discrete version of the Pareto IV distribution derived from rounding continuous random variables to improve the accuracy of lifetime modeling.
In this paper, we focus on the discretization of the Pareto family under the minimum mean squared error (MSE). In statistics, for a given continuous distribution, we can identify optimal discrete approximations, referred to as representative points (RPs), based on different kinds of errors. Different error metrics yield different types of RPs, such as mean squared error representative points (MSE-RPs) [14], Monte Carlo representative points (MC-RPs) [15], and quasi-Monte Carlo representative points (QMC-RPs) [16]. Among these, MSE-RPs are the most widely used, as they demonstrate superior performance over other RPs in distribution fitting [15].
MSE-RPs were initially applied to optimize signal transmission [17]. Since signal distortion in transmission is defined as the expected squared error between the quantizer input and output, which aligns with the MSE-RP criterion, Max successfully applied it to minimize signal distortion using quantizers with fixed output levels [18]. Fang and He [14] further advanced the computation and applications of MSE-RPs. Subsequently, MSE-RPs have been widely studied and applied in various fields, including clothing standardization [14], determining optimal sizes and shapes for protective masks [19], signal and image processing, information theory [20], psychiatric classification [21], statistical inference using resampling techniques [22], and reducing variance in one-dimensional Monte Carlo estimations [23]. They have also been utilized in parameter estimation, such as using MSE-RPs from the gamma distribution as standard samples to improve parameter estimation accuracy [24].
The majority of existing studies on MSE-RPs have focused on univariate continuous distributions, such as normal, exponential, gamma, Weibull, logistic, and mixed normal distributions. However, no studies have been conducted on MSE-RPs derived from Pareto distributions. Owing to the complexity of Pareto density functions, the study of MSE-RP generation and estimation presents significant challenges. Current studies propose two main approaches to generate MSE-RPs for continuous distributions. The first approach relies on the self-consistency property shared by MSE-RPs and k-means cluster centers, which has led to the development of k-means-based algorithms for generating MSE-RPs. This method was initially proposed by Lloyd [25] and Max [18], and is referred to as the Lloyd I algorithm, specifically designed for univariate continuous distributions. To mitigate the impact of initial values and training sets on MSE-RPs, Linde et al. [26] proposed an iterative algorithm, referred to as the LBG algorithm. Fang et al. [27] later developed an enhanced version of the LBG algorithm, employing number-theoretic methods to generate initial points and training sets, which is referred to as the NTLBG algorithm. The second approach derives from the definition of MSE-RPs, with the objective of finding a discrete distribution that minimizes the MSE between it and the density function. This involves solving a series of equations for obtaining the MSE-RPs, a method referred to as the Fang–He algorithm [14]. Compared to k-means based methods, the Fang–He algorithm offers greater accuracy in the computation of MSE-RPs. However, it requires verifying the convergence of the algorithm, which needs to be addressed separately for each specific distribution.
In practical applications, it is often necessary to optimally discretize continuous data, which involves estimating the MSE-RPs of the underlying continuous distribution that the data follow. Tarpey [28] proposed four methods for estimating the MSE-RPs of univariate continuous distributions: maximum likelihood estimation, semi-parametric estimation, quantile estimation, and the k-means algorithm. Subsequent research has expanded these methods. For instance, Matsuura et al. [29] developed an optimal estimation approach for the t-distribution and extended it to the location–scale family and multivariate distributions. In the case of the Pareto distribution, we primarily study the latter two methods for estimating MSE-RPs.
The Pareto distribution is a right-skewed distribution, where the estimation bias of the quantiles at the tail is relatively large. Consequently, when using the quantile method to estimate the MSE-RPs located in the tail, the bias will also be significant. Similarly, due to the smaller number of samples in the tail, the k-means method also introduces substantial bias. The inaccurate estimation of MSE-RPs in the tail increases the bias of the overall estimate. In fact, the amount of information contained in the MSE-RPs at the tail is very limited. This information can be quantified using information gain (IG), which has been applied to evaluate the information content of MSE-RPs in mixed normal distributions [30]. Based on this, we introduce a truncation method based on IG to determine the sufficient number of MSE-RPs. This approach aims to reduce the estimation bias while preserving as much information as possible in the MSE-RPs.
In conclusion, our study focuses on the generation and estimation of MSE-RPs for the four types of Pareto distributions. We examine the properties of MSE-RPs for Pareto distributions and propose a reliable algorithm for their generation based on these properties. In addition, simulation results reveal that existing methods for estimating MSE-RPs suffer from significant bias when applied to skewed and heavy-tailed distributions. To address this issue, we introduce the concept of IG-truncated representative points. This approach also provides a new perspective for estimating MSE-RPs for other heavy-tailed distributions.
The structure of this paper is as follows: Section 2 introduces the fundamentals of the Pareto family. Section 3 presents the MSE-RPs of Pareto distributions, including their properties, computation methods, and results. Section 4 compares different methods for estimating the MSE-RPs of Pareto distributions and proposes an IG-based truncation method for selecting the number of MSE-RPs. In Section 5, two real datasets are analyzed in order to illustrate the proposed methods. Finally, this paper is concluded in Section 6.

2. Preliminaries of Pareto Distributions

2.1. Four Types of Pareto Distributions

The two-parameter Pareto distribution was first introduced by Vilfredo Pareto, and its cumulative distribution function (CDF) is given in Equation (1). This distribution is also known as the Pareto Type I distribution, denoted as X ∼ P(I)(σ, α).
Building upon this, Pareto proposed two additional variations. The Pareto Type II distribution, also known as the Lomax distribution, is a three-parameter distribution derived from the analysis of business failure data [31]. Its CDF is given by
$$F_X(x) = 1 - \left(1 + \frac{x-\mu}{\sigma}\right)^{-\alpha}, \quad x \geq \mu,\ \sigma, \alpha > 0,$$
where μ is the location parameter, generally assumed to be non-negative. Additionally, α > 1 is required to ensure a finite expected value. This distribution is denoted as P(II)(μ, σ, α).
The Pareto Type III distribution is another three-parameter distribution, and its CDF is defined as follows:
$$F_X(x) = 1 - \left(1 + \left(\frac{x-\mu}{\sigma}\right)^{1/\gamma}\right)^{-1}, \quad x > \mu,\ \sigma, \gamma > 0,$$
denoted as P(III)(μ, σ, γ). When μ = 0 and γ ≤ 1, the parameter γ represents the Gini index for inequality.
In 1983, Arnold introduced the Pareto Type IV distribution, also referred to as the Burr distribution. Its CDF is expressed as follows:
$$F_X(x) = 1 - \left(1 + \left(\frac{x-\mu}{\sigma}\right)^{1/\gamma}\right)^{-\alpha}, \quad x \geq \mu,\ \sigma, \gamma, \alpha > 0,$$
denoted as P(IV)(μ, σ, γ, α).
The four types of Pareto distributions are not entirely independent and can be transformed into one another by adjusting the parameters. Specifically, the Pareto(I), Pareto(II), and Pareto(III) distributions are all special cases of the Pareto(IV) distribution:
$$P(I)(\sigma, \alpha) = P(IV)(\sigma, \sigma, 1, \alpha), \qquad P(II)(\mu, \sigma, \alpha) = P(IV)(\mu, \sigma, 1, \alpha), \qquad P(III)(\mu, \sigma, \gamma) = P(IV)(\mu, \sigma, \gamma, 1).$$
In its early development, the Pareto distribution was considered a specific form of the exponential distribution, as it can be transformed into an exponential distribution. If U ∼ Γ(1, 1) is a standard exponential random variable, then X = σe^{U/α} ∼ P(I)(σ, α) and Y = μ + σ(e^{U/α} − 1) ∼ P(II)(μ, σ, α).
Furthermore, all four types of Pareto distributions can be transformed into their standard forms by adjusting their parameters. For k > 0 and γ > 0, the following hold:
  • If X ∼ P(I)(1, α), then Z = σX^{1/k} ∼ P(I)(σ, kα).
  • If X ∼ P(II)(0, 1, α), then Z = μ − σ + σ(1 + X)^{1/k} ∼ P(II)(μ, σ, kα).
  • If X ∼ P(III)(0, 1, 1), then Z = μ + σX^{γ} ∼ P(III)(μ, σ, γ).
  • If X ∼ P(IV)(0, 1, 1, α), then Z = μ + σ((1 + X)^{1/k} − 1)^{γ} ∼ P(IV)(μ, σ, γ, kα).
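As an illustration of these relationships, the following is a minimal Python sketch (our own construction, with arbitrary example parameter values) that generates Pareto I–III samples through the exponential and standard-form transformations above and checks one empirical tail probability against the corresponding survival function:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
u = rng.exponential(size=n)       # U ~ Gamma(1, 1), a standard exponential variable

mu, sigma, gamma, alpha = 0.5, 2.0, 0.4, 3.0   # example values, not from the paper
x1 = sigma * np.exp(u / alpha)                 # X = sigma*e^(U/alpha) ~ P(I)(sigma, alpha)
x2 = mu + sigma * (np.exp(u / alpha) - 1.0)    # Y ~ P(II)(mu, sigma, alpha)

# Standard-form route for Pareto III: draw X ~ P(III)(0, 1, 1) by inverting
# F(x) = 1 - (1 + x)^(-1), then apply Z = mu + sigma * X^gamma.
v = rng.uniform(size=n)
x3 = mu + sigma * (v / (1.0 - v)) ** gamma     # Z ~ P(III)(mu, sigma, gamma)

# Empirical tail check against the Pareto I survival function (t/sigma)^(-alpha)
t = 5.0
print(np.mean(x1 > t), (t / sigma) ** (-alpha))
```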

2.2. Parameter Estimation of Pareto Distributions

Although numerous studies have explored parameter estimation for Pareto distributions, most focus on the Pareto Type I distribution due to its simplicity, widespread use, and interpretability. Here, we divide the discussion into two parts: parameter estimation for the Pareto Type I distribution and for the Pareto Type II–IV distributions.
1. Pareto Type I Distribution:
(a) Maximum Likelihood Estimation (MLE): Assume a sample X_1, …, X_n is drawn from P(I)(σ, α), and let X_{(n,1)} < ⋯ < X_{(n,n)} denote the order statistics. The likelihood function is given by
$$L(\alpha, \sigma \mid X_1, \ldots, X_n) = \prod_{i=1}^{n} \frac{\alpha\sigma^{\alpha}}{X_i^{\alpha+1}}\, I\left(\sigma \leq X_{(n,1)}\right).$$
When α is known, the likelihood function is an increasing function of σ on the admissible range σ ≤ X_{(n,1)}. Thus, the MLE of σ is
$$\hat{\sigma}_{ML} = X_{(n,1)}.$$
Taking the logarithm of the likelihood function and differentiating with respect to α yields
$$\hat{\alpha}_{ML} = \frac{n}{\sum_{i=1}^{n}\log\left(X_i/\hat{\sigma}_{ML}\right)}.$$
While MLE is asymptotically consistent, normal, and efficient, it is not the minimum variance unbiased estimator (MVUE). The MVUE for this distribution is given by
$$\hat{\sigma}_{U} = \left(1 - \frac{1}{(n-1)\hat{\alpha}_{ML}}\right)\hat{\sigma}_{ML}, \qquad \hat{\alpha}_{U} = \frac{n-2}{n}\,\hat{\alpha}_{ML}.$$
(b) Moment Estimation (ME): For a Pareto Type I distribution with α > 1, the first moment is
$$E(X) = \frac{\alpha\sigma}{\alpha-1}, \quad \alpha > 1.$$
Using the first moment, the method of moments estimators are as follows:
$$\hat{\alpha}_{ME} = \frac{n\bar{X} - X_{(n,1)}}{n\left(\bar{X} - X_{(n,1)}\right)}, \qquad \hat{\sigma}_{ME} = \left(1 - \frac{1}{n\hat{\alpha}_{ME}}\right)X_{(n,1)}.$$
While α ^ M E requires α > 1 , σ ^ M E does not rely on this condition, as it is derived from the CDF of X ( n , 1 ) [10]. Simulations by Lu and Tao [32] indicate that ME performs better than MLE for estimating σ , provided the estimation of σ is independent of α .
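The closed-form estimators above translate directly into code. The following is a minimal sketch (the function name is ours) that returns the MLE, MVUE, and moment-type estimates side by side:

```python
import numpy as np

def pareto1_estimates(x):
    """MLE, MVUE, and moment-type estimators for P(I)(sigma, alpha),
    transcribing the formulas above."""
    x = np.asarray(x, float)
    n, xmin, xbar = x.size, x.min(), x.mean()
    sigma_ml = xmin                                   # sigma_hat_ML = X_(n,1)
    alpha_ml = n / np.log(x / sigma_ml).sum()
    sigma_u = (1.0 - 1.0 / ((n - 1) * alpha_ml)) * sigma_ml
    alpha_u = (n - 2) / n * alpha_ml
    alpha_me = (n * xbar - xmin) / (n * (xbar - xmin))
    sigma_me = (1.0 - 1.0 / (n * alpha_me)) * xmin
    return {"ML": (sigma_ml, alpha_ml), "MVUE": (sigma_u, alpha_u),
            "ME": (sigma_me, alpha_me)}
```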
2. Pareto Type II–IV Distributions:
(a) Maximum Likelihood Estimation (MLE): For simplicity, we focus on the Pareto Type IV distribution, as Pareto Types II and III are special cases (see Equation (5)). The log-likelihood function for a sample X_1, …, X_n from P(IV)(μ, σ, γ, α) is
$$\log L(\mu, \sigma, \gamma, \alpha \mid X_1, \ldots, X_n) = \left(\frac{1}{\gamma}-1\right)\sum_{i=1}^{n}\log\frac{X_i-\mu}{\sigma} - (\alpha+1)\sum_{i=1}^{n}\log\left(1+\left(\frac{X_i-\mu}{\sigma}\right)^{1/\gamma}\right) - n\log\gamma - n\log\sigma + n\log\alpha.$$
The order statistic X_{(n,1)} serves as a consistent estimator of μ. Once μ is estimated, the problem reduces to estimating the parameters of P(IV)(0, σ, γ, α) using the remaining n − 1 samples, X_{(n,2)}, …, X_{(n,n)}. Taking partial derivatives of the log-likelihood function with respect to σ, γ, and α yields
$$\frac{\partial \log L}{\partial \sigma} = \frac{\alpha(n-1)}{\gamma\sigma} - \frac{\alpha+1}{\gamma\sigma}\sum_{i=1}^{n-1}\frac{1}{1+\left(X_i/\sigma\right)^{1/\gamma}} = 0,$$
$$\frac{\partial \log L}{\partial \gamma} = \frac{\alpha}{\gamma^{2}}\sum_{i=1}^{n-1}\log\frac{X_i}{\sigma} - \frac{\alpha+1}{\gamma^{2}}\sum_{i=1}^{n-1}\frac{\log\left(X_i/\sigma\right)}{1+\left(X_i/\sigma\right)^{1/\gamma}} - \frac{n-1}{\gamma} = 0,$$
$$\frac{\partial \log L}{\partial \alpha} = -\sum_{i=1}^{n-1}\log\left(1+\left(\frac{X_i}{\sigma}\right)^{1/\gamma}\right) + \frac{n-1}{\alpha} = 0.$$
Solving these equations yields the MLEs for P ( I V ) ( 0 , σ , γ , α ) .
(b) Moment Estimation (ME): Direct computation of moments for Pareto distributions is challenging. Arnold proposed a method based on constructing a statistic relying on the Gamma distribution [6]. Define
$$X = \mu + \sigma\left(\frac{U_1}{U_2}\right)^{\gamma} \sim P(IV)(\mu, \sigma, \gamma, \alpha),$$
where U_1 ∼ Γ(1, 1) and U_2 ∼ Γ(α, 1) are independent. Using the moments of the Gamma distribution, the δ-th moment of X − μ is
$$E\left[(X-\mu)^{\delta}\right] = \frac{\sigma^{\delta}\,\Gamma(1+\delta\gamma)\,\Gamma(\alpha-\delta\gamma)}{\Gamma(\alpha)}, \quad -\frac{1}{\gamma} < \delta < \frac{\alpha}{\gamma}.$$
By constructing equations from these moments and using X ( n , 1 ) as a consistent estimator of μ , the moment estimators for Pareto Type IV distributions can be obtained. However, the existence of moments depends on the parameter conditions.
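Because the score equations above rarely admit closed-form solutions, the MLEs are typically obtained numerically. The following is a minimal sketch (our own, not the paper's routine) that replaces μ by the sample minimum and optimizes the remaining parameters of P(IV)(0, σ, γ, α) on the log scale:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, x):
    # theta = (log sigma, log gamma, log alpha) keeps all parameters positive
    sigma, gamma, alpha = np.exp(theta)
    z = (x / sigma) ** (1.0 / gamma)
    ll = (np.log(alpha) - np.log(gamma) - np.log(sigma)
          + (1.0 / gamma - 1.0) * np.log(x / sigma)
          - (alpha + 1.0) * np.log1p(z))
    return -ll.sum()

def fit_pareto4(x):
    x = np.asarray(x, float)
    mu = x.min()                    # X_(n,1) as a consistent estimator of mu
    y = x[x > mu] - mu              # reduce to P(IV)(0, sigma, gamma, alpha)
    res = minimize(neg_loglik, x0=np.zeros(3), args=(y,), method="Nelder-Mead")
    sigma, gamma, alpha = np.exp(res.x)
    return mu, sigma, gamma, alpha
```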

3. MSE-RPs of Pareto Distributions

3.1. Definition and Properties of MSE-RPs

Given a continuous random variable X ∼ F(x), a set of points {y_i ∈ R^d, i ∈ {1, …, m}} and their associated probabilities form the MSE-RPs of F(x) if and only if the following condition is satisfied:
$$E\left[d^{2}(x \mid y_1, \ldots, y_m)\right] = \min_{\xi_i \in \mathbb{R}^d,\, i = 1, \ldots, m} E\left[d^{2}(x \mid \xi_1, \ldots, \xi_m)\right],$$
where
$$d^{2}(x \mid \xi_1, \ldots, \xi_m) = \min_{j \in \{1, \ldots, m\}} \left\|x - \xi_j\right\|^{2}$$
represents the minimum squared ℓ₂-norm distance between x and the ξ_j, j ∈ {1, …, m}.
The probability of y_i ∈ R^d is given by P(y_i) = P(X ∈ V_i), where
$$V_i = \left\{x \in \mathbb{R}^d \mid (x-y_i)'(x-y_i) \leq (x-y_j)'(x-y_j),\ \forall j \neq i\right\}, \quad i \in \{1, \ldots, m\}.$$
MSE-RPs essentially represent the discretization of a continuous distribution. Through the y_i and their associated probabilities, the support of the continuous distribution is divided into m adjacent and non-overlapping regions, known as Voronoi partitions. For each partition V_i, its center y_i satisfies
$$E\left[X \mid X \in V_i\right] = y_i, \quad i = 1, \ldots, m.$$
These centers are also referred to as self-consistent points. Tarpey and Flury [33] proved that the set of self-consistent points minimizing the MSE is precisely the set of MSE-RPs.
For one-dimensional continuous distributions, MSE-RPs can be arranged as y_1 < y_2 < ⋯ < y_m. In this case, the probability corresponding to y_i is given by
$$P_m(y_i) = \int_{M_i}^{M_{i+1}} f(x)\,dx, \quad i = 1, \ldots, m,$$
where M_1 = F^{-1}(0), M_i = (y_{i-1} + y_i)/2 for i = 2, …, m, and M_{m+1} = F^{-1}(1).
For location–scale distributions with standard forms, it suffices to study their MSE-RPs in the standard case. These standard MSE-RPs can then be extended to the full parameter space using the location–scale transformation properties of MSE-RPs proposed by Zoppè [34]. Since the Pareto Type IV distribution belongs to the location–scale family and Pareto Types I-III are special cases of it, we focus on the MSE-RPs of P ( I V ) ( 0 , 1 , γ , α ) . This leads to the following corollary.
Corollary 1.
If {y_i, i = 1, …, m} are the m MSE-RPs of P(IV)(0, 1, γ, α), then {z_i = σy_i + μ, i = 1, …, m} are the m MSE-RPs of P(IV)(μ, σ, γ, α).

3.2. Existence and Uniqueness

The existence of MSE-RPs for a continuous distribution requires that its first two moments exist [35]. For Pareto distributions, the conditions for the existence of the second moment are as follows:
$$P(I)(\sigma, \alpha):\ \alpha > 2; \qquad P(II)(\mu, \sigma, \alpha):\ \alpha > 2; \qquad P(III)(\mu, \sigma, \gamma):\ 0 < \gamma < \frac{1}{2}; \qquad P(IV)(\mu, \sigma, \gamma, \alpha):\ \alpha/\gamma > 2.$$
If the parameters of a Pareto distribution do not satisfy the above conditions, its MSE-RPs do not exist.
Next, we examine the shape of the density function. According to the theorem by Trushkin [36], if the density function f ( x ) of a continuous distribution is unimodal and log-concave, the MSE-RPs of the distribution are unique. The density function of P ( I V ) ( 0 , 1 , γ , α ) is unimodal, and we provide the following lemma regarding its shape.
Lemma 1.
Let X ∼ P(IV)(0, 1, γ, α) with 1/γ > 1. Define the random variable z = x^{1/γ}/(1 + x^{1/γ}), 0 < z < 1. Then, there exists a z_2 given by
$$z_2 = \frac{(\alpha+1)\left(\frac{1}{\gamma}-1\right) + \sqrt{(\alpha+1)^{2}\left(\frac{1}{\gamma}-1\right)^{2} + 4(\alpha+1)\left(\frac{1}{\gamma}-1\right)}}{2(\alpha+1)/\gamma},$$
such that when z ∈ (0, z_2), the density function of P(IV)(0, 1, γ, α) is log-concave. Moreover, as α → +∞, z_2 → 1 − γ.
We prove that the density function is log-concave on this region by demonstrating that the first derivative of its logarithm is strictly decreasing there. The detailed proof is provided in Appendix A. Using this lemma together with Trushkin's theorem, we establish the existence and uniqueness of the MSE-RPs of the Pareto IV distribution under the condition 1/γ > 1.
For other parameter settings where the density function is not log-concave, we rely on proving the uniqueness of the solutions to the system of equations to demonstrate the uniqueness of the MSE-RPs. This approach was proposed by Fang and He [14] in their proof of the uniqueness of MSE-RPs for the normal distribution. The method involves showing that the equations derived from setting the first derivative of (20) to zero have a unique solution.
By taking the partial derivative of (20), we obtain
$$\frac{\partial E\left[d^{2}(x \mid y_1, \ldots, y_m)\right]}{\partial y_i} = 0, \quad i = 1, \ldots, m,$$
which simplifies to the following system of equations:
$$L_i = \int_{M_i}^{M_{i+1}} (x - y_i)\, f(x)\,dx = 0, \quad i = 1, \ldots, m.$$
For i = 1, …, m − 1, taking the partial derivative of L_i with respect to y_{i+1} gives
$$\frac{\partial L_i}{\partial y_{i+1}} = \frac{1}{2}\, f(M_{i+1})\left(M_{i+1} - y_i\right) > 0, \quad i = 1, \ldots, m-1.$$
Similarly, for L_m, taking the partial derivative with respect to y_m yields
$$\frac{\partial L_m}{\partial y_m} = \frac{1}{2}\, f(M_m)\left(y_m - M_m\right) - \left(1 - F(M_m)\right) < 0.$$
From (29) and (30), it can be seen that, for the i-th equation, once y_j, j < i + 1, are known, the equation has a unique solution for y_{i+1}. When solving the system, an initial value is typically assigned to y_1, and the values are sequentially substituted into the equations to solve for the subsequent variables. For the m − 1 remaining unknowns there are m equations, with the last two equations both involving y_m. By iteratively comparing the solutions of the last two equations for y_m, the system converges to the unique solution when the two solutions coincide.
Next, we analyze the existence conditions for solutions to the system of Equations (29) and (30) for P(IV)(0, 1, γ, α). It is known that L_i is a monotonic function of y_{i+1}, where y_{i+1} > 0. The condition for the existence of a solution to L_i = 0 is
$$L_i\left(y_{i+1} \to y_i\right) \cdot L_i\left(y_{i+1} \to +\infty\right) < 0.$$
Theorem 1.
For the Pareto distribution P(IV)(0, 1, γ, α), the system of Equations (29) has a unique solution when y_1 < E(X) and
$$y_i < \frac{E(X) - \sum_{j=1}^{i-1} y_j P_j}{1 - F(M_i)}, \quad i = 2, \ldots, m-1,$$
where P_j = F(M_{j+1}) − F(M_j) is the probability associated with y_j.
The proof is provided in Appendix A.
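To illustrate the sequential scheme behind Theorem 1, here is a minimal sketch for P(I)(1, α); pdf and solve_chain are hypothetical helpers, not code from the paper, and the outer adjustment of y_1 (comparing the two solutions for y_m) is omitted. The bracketing step assumes the existence conditions of Theorem 1 hold:

```python
import numpy as np
from scipy import integrate, optimize

def pdf(x, alpha):
    return alpha * x ** (-alpha - 1.0)        # density of P(I)(1, alpha), x >= 1

def solve_chain(y1, m, alpha):
    """Given a trial y_1, solve L_1 = ... = L_{m-1} = 0 sequentially for y_2..y_m."""
    y, M = [y1], 1.0                          # M_1 = F^{-1}(0) = 1
    for _ in range(m - 1):
        yi = y[-1]
        def L(y_next):
            # L_i = int_{M_i}^{(y_i + y_next)/2} (x - y_i) f(x) dx
            val, _ = integrate.quad(lambda x: (x - yi) * pdf(x, alpha),
                                    M, (yi + y_next) / 2.0)
            return val
        hi = yi + 1.0
        while L(hi) < 0.0:                    # expand until the root is bracketed
            hi *= 2.0
        y_next = optimize.brentq(L, yi, hi)   # L is increasing in y_next, cf. (29)
        M = (yi + y_next) / 2.0
        y.append(y_next)
    return np.array(y)
```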

3.3. Generation of MSE-RPs

From the definition of MSE-RPs, it can be seen that solving for MSE-RPs is essentially an optimization problem concerning (20). The conditions for the existence of a solution to this optimization problem are strict, and explicit solutions for MSE-RPs generally cannot be derived by taking partial derivatives of the objective function. To address this issue, Fang and He [14] proposed an iterative method for solving the system of nonlinear Equations (28). Alternatively, MSE-RPs can also be obtained by leveraging their property as the set of self-consistent points that minimizes the mean squared error (MSE). Since the centroids produced by the k-means algorithm are also self-consistent points, when MSE-RPs exist and are unique, k-means can be used to find candidate centroids, with the set of centroids that minimizes the MSE being the MSE-RPs. This method is efficient and fast but requires multiple iterations until the results stabilize and may converge to a local minimum.
To address the drawbacks of these methods, Xu et al. [37] proposed a k-means method based on the special properties of MSE-RPs for exponential distributions. This method avoids solving large systems of nonlinear equations and prevents k-means from converging to local minima. Inspired by this, we propose a similar method for generating MSE-RPs for Pareto distributions.
For Pareto I and II distributions, substituting their density functions into the last equation in (28) and simplifying yields
$$P(I)(1, \alpha):\ \left[\frac{\alpha}{1-\alpha}\,x^{1-\alpha} + y_m\,x^{-\alpha}\right]\Bigg|_{(y_{m-1}+y_m)/2}^{+\infty} = 0,$$
$$P(II)(0, 1, \alpha):\ \left[\frac{\alpha}{1-\alpha}\,(1+x)^{1-\alpha} + (1+y_m)(1+x)^{-\alpha}\right]\Bigg|_{(y_{m-1}+y_m)/2}^{+\infty} = 0.$$
Simplifying further, we obtain
$$P(I)(1, \alpha):\ \alpha y_{m-1} + (2-\alpha)\,y_m = 0,$$
$$P(II)(0, 1, \alpha):\ \alpha y_{m-1} + 2 + (2-\alpha)\,y_m = 0.$$
These equations describe the relationship between the last two points in a set of MSE-RPs for Pareto I and II distributions, leading to the following theorems.
Theorem 2.
Suppose y_1 < y_2 < ⋯ < y_m are the m MSE-RPs of a Pareto distribution P(I)(1, α) with α > 2. Then, the following relationship holds for y_{m−1} and y_m:
$$y_m = \frac{\alpha\,y_{m-1}}{\alpha-2}.$$
Theorem 3.
Suppose y_1 < y_2 < ⋯ < y_m are the m MSE-RPs of a Pareto distribution P(II)(0, 1, α) with α > 2. Then, the following relationship holds for y_{m−1} and y_m:
$$y_m = \frac{\alpha\,y_{m-1} + 2}{\alpha-2}.$$
Unfortunately, for Pareto III and IV distributions, no similar theorems exist. Therefore, we propose the following Algorithm 1 for Pareto I and II distributions. 
Algorithm 1: Theoretical k-means algorithm for Pareto I and II distributions
  • Step 1: For a given pdf p(x), a number of MSE-RPs m, an initial iteration t = 0, and a tolerance ε, input an initial set of number-theoretic method representative points (NTM-RPs) [16] y_1^{(t)} < y_2^{(t)} < ⋯ < y_m^{(t)}. Define a partition of R as
$$I_i^{(t)} = \left[a_i^{(t)}, a_{i+1}^{(t)}\right),\ i = 1, \ldots, m-1, \qquad I_m^{(t)} = \left[a_m^{(t)}, a_{m+1}^{(t)}\right),$$
    where
$$a_1^{(t)} = -\infty, \qquad a_i^{(t)} = \frac{y_{i-1}^{(t)} + y_i^{(t)}}{2},\ i = 2, \ldots, m, \qquad a_{m+1}^{(t)} = +\infty.$$
  • Step 2: Calculate the probabilities
$$p_j^{(t)} = \int_{I_j^{(t)}} p(x)\,dx, \quad j = 1, \ldots, m.$$
  • Step 3: Calculate the conditional means. For j = 1, …, m − 1,
$$y_j^{(t+1)} = \frac{\int_{I_j^{(t)}} x\,p(x)\,dx}{\int_{I_j^{(t)}} p(x)\,dx} = \frac{\int_{I_j^{(t)}} x\,p(x)\,dx}{p_j^{(t)}}.$$
    For P(I)(1, α),
$$y_m^{(t+1)} = \frac{\alpha}{\alpha-2}\,y_{m-1}^{(t+1)};$$
    for P(II)(0, 1, α),
$$y_m^{(t+1)} = \frac{\alpha\,y_{m-1}^{(t+1)} + 2}{\alpha-2}.$$
  • Step 4: Sort {y_1^{(t+1)}, …, y_m^{(t+1)}} from smallest to largest.
  • Step 5: Calculate |y_m^{(t+1)} − b|, where
$$b = \frac{\int_{I_m^{(t)}} x\,p(x)\,dx}{p_m^{(t)}}.$$
  • Step 6: If |y_m^{(t+1)} − b| < ε, the process stops, and {y_j^{(t)}} are delivered as the MSE-RPs of the distribution with probabilities {p_j^{(t)}}. Otherwise, let t := t + 1 and return to Step 1.
This method not only guarantees that the points obtained are MSE-RPs by leveraging Theorems 2 and 3, but also significantly improves convergence speed by replacing the computation of y m .
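To make Algorithm 1 concrete, the following is a minimal Python sketch for P(I)(1, α) with α > 2 (the function names are ours, and the quantile-based starting points merely stand in for the NTM-RPs of Step 1):

```python
import numpy as np
from scipy import integrate

def pareto1_pdf(x, alpha):
    return alpha * x ** (-alpha - 1.0)              # support x >= 1

def theoretical_kmeans_pareto1(alpha, m, eps=1e-10, max_iter=1000):
    u = (np.arange(1, m + 1) - 0.5) / m
    y = (1.0 - u) ** (-1.0 / alpha)                 # F^{-1}(u) as initial points
    for _ in range(max_iter):
        # Step 1: partition boundaries (support edge, midpoints, +infinity)
        a = np.concatenate(([1.0], (y[:-1] + y[1:]) / 2.0, [np.inf]))
        p = np.empty(m)
        y_new = np.empty(m)
        for j in range(m):
            # Step 2: cell probabilities
            p[j], _ = integrate.quad(pareto1_pdf, a[j], a[j + 1], args=(alpha,))
            if j < m - 1:
                # Step 3: conditional means of the first m - 1 cells
                mj, _ = integrate.quad(lambda x: x * pareto1_pdf(x, alpha),
                                       a[j], a[j + 1])
                y_new[j] = mj / p[j]
        # Step 3 (tail): Theorem 2 replaces the improper tail integral
        y_new[m - 1] = alpha / (alpha - 2.0) * y_new[m - 2]
        y_new.sort()                                # Step 4
        # Steps 5-6: compare with the conditional mean of the last cell
        bm, _ = integrate.quad(lambda x: x * pareto1_pdf(x, alpha),
                               a[m - 1], np.inf)
        if abs(y_new[-1] - bm / p[-1]) < eps:
            return y_new, p
        y = y_new
    return y, p
```

For example, theoretical_kmeans_pareto1(5.0, 10) approximates the ten MSE-RPs of P(I)(1, 5) together with their probabilities.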
For Pareto III and IV distributions, solving (28) directly is challenging due to the presence of terms such as x^{1/γ}, x^{2/γ}, …, x^{(γ−1)/γ} and ln(1 + x^{1/γ}), which make the equations complex and computationally expensive. Thus, we also adopt the k-means method (Algorithm 2) for obtaining MSE-RPs for these distributions [38].
Algorithm 2: Parametric k-means algorithm
  • Step 1: For a given pdf p(x), a number of MSE-RPs m, and an initial iteration t = 0, input an initial set of points y_1^{(t)} < y_2^{(t)} < ⋯ < y_m^{(t)}. Define a partition of R as
$$I_i^{(t)} = \left[a_i^{(t)}, a_{i+1}^{(t)}\right),\ i = 1, \ldots, m-1, \qquad I_m^{(t)} = \left[a_m^{(t)}, a_{m+1}^{(t)}\right),$$
    where
$$a_1^{(t)} = -\infty, \qquad a_i^{(t)} = \frac{y_{i-1}^{(t)} + y_i^{(t)}}{2},\ i = 2, \ldots, m, \qquad a_{m+1}^{(t)} = +\infty.$$
  • Step 2: Calculate the probabilities
$$p_j^{(t)} = \int_{I_j^{(t)}} p(x)\,dx, \quad j = 1, \ldots, m.$$
  • Step 3: Calculate the conditional means
$$y_j^{(t+1)} = \frac{\int_{I_j^{(t)}} x\,p(x)\,dx}{\int_{I_j^{(t)}} p(x)\,dx} = \frac{\int_{I_j^{(t)}} x\,p(x)\,dx}{p_j^{(t)}}, \quad j = 1, \ldots, m.$$
  • Step 4: If {y_j^{(t)}} and {y_j^{(t+1)}} are identical, the process stops, and {y_j^{(t)}} are delivered as the MSE-RPs of the distribution with probabilities {p_j^{(t)}}. Otherwise, let t := t + 1 and return to Step 1.
Using the parametric k-means method is highly effective for calculating MSE-RPs. However, this method cannot guarantee that the output is always the MSE-RPs and may instead converge to other sets of self-consistent points, as pointed out by Stampfer and Stadlober [38]. Therefore, we maximize the sample size for each calculation and select the set of points with the smallest MSE until the results stabilize. Since calculating MSE-RPs for Pareto III and IV distributions is slow, we provide up to 31 MSE-RPs and their corresponding probabilities for reference in Appendix B.

4. Estimation of MSE-RPs from Pareto Distributions

Currently, methods for estimating MSE-RPs from a sample can be categorized into three types based on whether the underlying distribution type and parameters are known:
1. The first type requires knowledge of the distribution type. After estimating the distribution parameters, the MSE-RPs are computed directly from the estimated distribution.
2. The second type also requires knowledge of the distribution type but does not involve parameter estimation. Instead, it estimates MSE-RPs by locating the sample quantiles corresponding to the positions of the MSE-RPs. This method is limited to distributions with location–scale properties.
3. The third type does not require any prior information about the distribution. It directly estimates MSE-RPs from the sample by leveraging the property that k-means cluster centers are also self-consistent points. MSE-RPs are estimated as the centers of the clusters formed by classifying the sample (see the sketch after this list).
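As a quick illustration of the third type, here is a minimal sketch (the function name is ours) that takes sorted k-means cluster centers as MSE-RP estimates:

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_rps(sample, m, seed=0):
    """Estimate m MSE-RPs as the sorted centers of a k-means clustering,
    using the fact that cluster centers are self-consistent points."""
    km = KMeans(n_clusters=m, n_init=10, random_state=seed)
    km.fit(np.asarray(sample, float).reshape(-1, 1))
    return np.sort(km.cluster_centers_.ravel())
```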
The first type of method faces challenges when estimating MSE-RPs for Pareto distributions, regardless of whether ME or MLE is used: the ME method requires the existence of moments, while the MLE method is computationally complex, making accurate results difficult to achieve. Therefore, in the simulations presented in this paper, we estimate MSE-RPs under the assumption that the parameters γ and α are known, and only μ and σ are estimated. Specifically, our estimates are as follows:
- For a sample following P(I)(σ̂, α), the MSE-RPs are estimated as ŷ = σ̂y, where y are the MSE-RPs of P(I)(1, α).
- For a sample following P(IV)(μ̂, σ̂, γ, α), the MSE-RPs are estimated as ŷ = σ̂y + μ̂, where y are the MSE-RPs of P(IV)(0, 1, γ, α).
The second type of method views MSE-RPs as specific quantiles of the distribution. For distributions with location–scale properties, the positions of quantiles remain unchanged under parameter transformation. Thus, the MSE-RPs can be estimated by determining their corresponding quantile positions in the standard distribution and then estimating these quantiles in the sample. The steps are as follows:
1. Compute the MSE-RPs y_1, …, y_m of the continuous distribution F(x).
2. Compute the position q_i of each y_i in the standard distribution F(x):
$$q_i = \int_{-\infty}^{y_i} f(x)\,dx, \quad i = 1, \ldots, m.$$
3. In an independent and identically distributed sample S_1, …, S_n, estimate the sample quantiles Q(q_i) for i = 1, …, m, which serve as the estimated MSE-RPs for the sample.
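A minimal sketch of these three steps (our own helper; np.quantile's default interpolation temporarily stands in for the specialized estimators discussed next):

```python
import numpy as np

def estimate_rps_by_quantiles(sample, std_rps, std_cdf):
    """std_rps: MSE-RPs of the standard distribution; std_cdf: its CDF."""
    q = std_cdf(np.asarray(std_rps, float))             # Step 2: q_i = F(y_i)
    return np.quantile(np.asarray(sample, float), q)    # Step 3

# e.g. for P(I)(1, 5): std_cdf = lambda y: 1.0 - y ** -5.0
```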
The accuracy of this method depends heavily on the choice of the quantile estimation method. Different distributions have unique characteristics, and the optimal method for quantile estimation may vary. Since the Pareto distribution is a heavy-tailed distribution, we consider the following four quantile estimation methods, which are known to perform well in the tails of distributions. Consider estimating the quantile Q ( q ) ; the sample size is n and X ( i ) is the i-th order statistic of the sample:
  • HD_q quantile estimator [39]:
$$HD_q = \sum_{i=1}^{n}\left[\int_{(i-1)/n}^{i/n}\frac{\Gamma(n+1)}{\Gamma((n+1)q)\,\Gamma((n+1)p)}\,t^{(n+1)q-1}(1-t)^{(n+1)p-1}\,dt\right]X_{(i)}, \quad p = 1-q.$$
  • B_q quantile estimator [40]:
$$B_q = \sum_{i=1}^{n} n^{-1}\,\frac{\Gamma(n+1)}{\Gamma(i)\,\Gamma(n-i+1)}\,q^{i-1}(1-q)^{n-i}\,X_{(i)}.$$
  • Q_j quantile estimator [41]:
    1. Choose an increasing sequence of proportions {q_0, q_1, …, q_{m−1}, q_m}, where q_0 = 0 and q_m = 1, with m ≤ n, corresponding to the cumulative probabilities of each quantile.
    2. Estimate the quantiles as follows:
$$Q_j = X_{(\lfloor I_j\rfloor)} + \left(X_{(\lceil I_j\rceil)} - X_{(\lfloor I_j\rfloor)}\right)\left(I_j - \lfloor I_j\rfloor\right), \quad j = 1, \ldots, m-1,$$
    where I_j = q_j n + 1/2, ⌊I_j⌋ is the largest integer less than or equal to I_j, and ⌈I_j⌉ is the smallest integer greater than or equal to I_j.
  • NO quantile estimator [42]:
$$\begin{aligned} NO_q =\; & \left(2q\,B(0;n,q) + q\,B(1;n,q)\right)X_{(1)} + (2-3q)\,B(0;n,q)\,X_{(2)} - (1-q)\,B(0;n,q)\,X_{(3)} \\ & + \sum_{i=1}^{n-2}\left((1-q)\,B(i;n,q) + q\,B(i+1;n,q)\right)X_{(i+1)} \\ & - q\,B(n;n,q)\,X_{(n-2)} + (3q-1)\,B(n;n,q)\,X_{(n-1)} + \left((1-q)\,B(n-1;n,q) + (2-2q)\,B(n;n,q)\right)X_{(n)}, \end{aligned}$$
    where B(i; n, q) is the binomial probability of i successes in n trials with success probability q.
The first two methods are modifications of kernel-based quantile estimation methods, optimized by selecting the best bandwidth under various conditions. The last two methods adjust sample quantiles based on order statistics.
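For reference, here are minimal sketches of the first two estimators as reconstructed above (function names ours); the HD_q weights are Beta-distribution masses and the B_q weights are binomial probabilities:

```python
import numpy as np
from scipy.stats import beta, binom

def hd_quantile(sample, q):
    """HD_q: weight X_(i) by the Beta((n+1)q, (n+1)(1-q)) mass on ((i-1)/n, i/n]."""
    x = np.sort(np.asarray(sample, float))
    n = x.size
    w = np.diff(beta.cdf(np.arange(n + 1) / n, (n + 1) * q, (n + 1) * (1 - q)))
    return float(w @ x)

def bq_quantile(sample, q):
    """B_q: weight X_(i) by C(n-1, i-1) q^(i-1) (1-q)^(n-i)."""
    x = np.sort(np.asarray(sample, float))
    n = x.size
    w = binom.pmf(np.arange(n), n - 1, q)        # k = i - 1 successes out of n - 1
    return float(w @ x)
```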
We apply these methods to estimate the MSE-RPs of P(I)(1, 5), P(II)(0, 1, 5), P(III)(0, 1, 0.2), and P(IV)(0, 1, 0.1, 0.5) for three sample sizes. Each method is tested through 1000 simulations, and the bias is reported in Table 1. Here, n represents the sample size, m represents the number of MSE-RPs, and the bias is calculated as follows:
$$\text{bias} = \operatorname{average}\left\|\widehat{\text{RPs}} - \text{RPs}\right\|.$$
The bold numbers in Table 1 indicate the minimum bias in each row. We analyze the impact of sample size, the number of MSE-RPs, the type of Pareto distribution, and parameter values on the accuracy of representative point estimation. From Table 1, it can be observed that, for fixed parameter values and a fixed number of MSE-RPs, the estimation bias decreases as the sample size increases. Conversely, as the number of MSE-RPs increases, the estimation bias increases for both quantile-based methods and the k-means method. However, the bias remains nearly unchanged for parameter-based estimation methods. This is because increasing the number of MSE-RPs does not introduce new estimators, ensuring that the estimation remains highly stable.
The type of Pareto distribution also significantly affects the bias of estimation methods in Table 1, mainly due to differences in distribution complexity and tail behavior. For Pareto I and II distributions, parameter-based estimation methods perform the best, as parameter estimation is relatively straightforward. However, for Pareto III and IV distributions, which introduce additional parameters, solving the nonlinear Equation (17) often results in convergence issues or requires large sample sizes, leading to increased bias. In contrast, the B q method utilizes quantile positions, which are more robust to heavy tails. The flexibility of Pareto III/IV in modeling the tail behavior aligns well with quantile-based estimation, thus reducing sensitivity to parameter misestimation.
The impact of parameters on MSE-RP estimation primarily manifests in the tail region. Smaller values of γ and α increase the difficulty of estimating μ , which relies heavily on the sample minimum. This, in turn, exacerbates the bias of parameter-based estimation methods. However, the B q method remains relatively stable, particularly for Pareto III and IV distributions, where the tail dominates. The B q method effectively captures the critical regions, improving the accuracy of MSE-RP estimation.
Furthermore, we analyze the sources of bias in quantile-based estimation. To investigate how different methods estimate MSE-RPs across various positions in the distribution, we plot the estimation results for m = 10 MSE-RPs of P ( I ) ( 1 , 5 ) in Figure 1.
In Figure 1, the red dots represent the true values of the MSE-RPs. From the box-plots, it can be observed that for the k-means and B q methods, the estimation bias exhibits increasing fluctuation as the MSE-RPs move farther away from the mean of the Pareto distribution. Meanwhile, the Q j method shows low sensitivity to the estimation of MSE-RPs in the tail. The H D q method has a significant estimation bias for the last representative point, and the NO method exhibits a large overall bias with noticeable increases in fluctuation.
Therefore, considering all factors, the B q method is the most stable and accurate method for estimating the MSE-RPs of the Pareto distribution when γ and α are small. However, by examining the MSE-RPs and their corresponding probabilities, it can be observed that as the number of MSE-RPs increases, the probability associated with the tail MSE-RPs becomes significantly smaller, while their corresponding values become exceedingly large. Estimating these tail MSE-RPs notably increases the overall estimation bias.
Li et al. [30] previously proposed a method to measure the information gain (IG) of MSE-RPs, similar to the variance ratio used in principal component analysis. The IG ranges from 0 to 1 and is calculated as
$$IG = 1 - \frac{MSE(y_1, \ldots, y_k \mid x)}{\operatorname{var}(x)},$$
where k ≤ m is the number of MSE-RPs used in calculating the IG. We can use the same IG function to calculate the number of MSE-RPs and their corresponding coverage under different levels of information gain. See Appendix C for details.
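A minimal sketch of this computation for a univariate density (our own helpers, assuming the support, mean, and variance of X are known):

```python
import numpy as np
from scipy import integrate

def mse_of_points(points, pdf, lower, upper):
    """E[min_j (X - y_j)^2]: integrate (x - y_j)^2 over each Voronoi cell."""
    y = np.sort(np.asarray(points, float))
    edges = np.concatenate(([lower], (y[:-1] + y[1:]) / 2.0, [upper]))
    total = 0.0
    for j, yj in enumerate(y):
        val, _ = integrate.quad(lambda x, c=yj: (x - c) ** 2 * pdf(x),
                                edges[j], edges[j + 1])
        total += val
    return total

def information_gain(points, pdf, var, lower, upper):
    return 1.0 - mse_of_points(points, pdf, lower, upper) / var

# e.g. for P(I)(1, 5): pdf = lambda x: 5.0 * x ** -6.0, lower = 1.0,
# upper = np.inf, var = 5.0 / 48.0
```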
By calculating the information gain, we find that for the Pareto distribution, as the number of MSE-RPs increases, the information gain for the tail MSE-RPs becomes very small. For example, for the P ( I ) ( 1 , 5 ) distribution, when m = 17 , the first 12 MSE-RPs account for 90% of the information. We re-estimate the MSE-RPs for the above distributions with m = 17 under different IG proportions, and the results are presented in Table 2.
The bold numbers in Table 2 indicate the minimum bias in each row. From the comparison between Table 1 and Table 2, it can be observed that sacrificing a small amount of information gain can significantly improve the accuracy of representative point estimation. Furthermore, when the parameters γ and α are small, the B q quantile method provides relatively accurate and stable estimates.
The IG truncation method dynamically determines the number of MSE-RPs required to achieve a specified IG level. This process focuses on selecting the most informative points while discarding tail points, which, although having high estimation bias, contribute little to the overall information. As a result, the outcomes presented in Table 2 can be considered robust to the choice of m.
In practical applications, the number of MSE-RPs needed can be determined by referring to Appendix C, which provides the number of MSE-RPs corresponding to different levels of IG and their coverage regions.

5. Real Data Study

5.1. Case I

We utilize commonly used Pareto I distribution data, which are taken from the appendix of Brazauskas’ study [43]. These data were also used by Kim et al. to demonstrate the best parameter estimator for the Pareto I distribution [44]. Therefore, we can directly use the estimated parameters of this dataset from their paper. The study indicates that the data follow a P ( I ) ( 497.085 , 1.2078 ) distribution, for which MSE-RPs do not exist.
Using the properties of the Pareto distribution discussed in Section 2 and Formula (6), this dataset can be approximately transformed into data following a P(I)(1, 5) distribution, enabling the estimation of MSE-RPs. Since Formula (6) is a one-to-one transformation, no significant loss of information occurs during this process. To verify that the data follow this distribution, we performed a KS test on both the transformed and original data; the results were consistent, with a p-value of 0.8943 and a KS statistic of 0.0472. We therefore conclude that the data follow the distribution mentioned above and that no information is lost in the transformation.
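The transformation and check can be sketched as follows; `claims` is a placeholder for the raw data of [43], and the power form is the inverse of Formula (6) for this parameter combination:

```python
import numpy as np
from scipy import stats

sigma, alpha, target = 497.085, 1.2078, 5.0
# If X ~ P(I)(sigma, alpha), then (X/sigma)^(alpha/target) ~ P(I)(1, target)
z = (claims / sigma) ** (alpha / target)
# KS test against the Pareto I CDF F(z) = 1 - z^(-5), z >= 1
res = stats.kstest(z, lambda t: 1.0 - t ** (-target))
print(res.statistic, res.pvalue)
```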
The transformed data are observed to have a distribution range of ( 1.0014 , 3.0836 ) . Referring to the IG coverage tables provided in Appendix C, we perform the following estimations:
- For 90% ≤ IG ≤ 95%, we first estimate m = 7 MSE-RPs and select the first 6 MSE-RPs as those providing valid information.
- For 95% ≤ IG ≤ 98%, we first estimate m = 18 MSE-RPs and select the first 14 MSE-RPs as those providing valid information.
We compare the fit of the data using MSE-RPs estimated using different methods.
From Figure 2, it can be observed that as IG increases, the empirical distribution fitted using MSE-RPs becomes closer to the true distribution. Furthermore, apart from the MSE-RPs estimated using the k-means method, which exhibit significant bias, the MSE-RPs estimated using the other three methods show no noticeable differences in the fitting process.

5.2. Case II

Next, we estimate MSE-RPs for real data from a Pareto IV distribution. The data were obtained from testing the unit stress voltage of miniature light bulbs [45]. According to the literature, the data follow a P(IV)(0, 95.6575, 0.3631, 6.0973) distribution. Using the properties of the Pareto distribution and Formula (9), the data are transformed to correspond to a P(IV)(0, 1, 0.1, 0.5) distribution. Since Formula (9) is also a one-to-one transformation, no significant loss of information occurs during this process. To confirm that the data follow the specified distribution, we conducted a KS test on both the original and transformed data; the results were consistent, with a p-value of 0.2004 and a KS statistic of 0.0957. We therefore conclude that the data follow the mentioned distribution and that the transformation does not result in any loss of information.
The transformed data are observed to have a distribution range of ( 0.1187 , 1.9827 ) . Referring to the IG coverage tables provided in Appendix C, we perform the following estimations:
- For 90% ≤ IG ≤ 95%, we first estimate m = 11 MSE-RPs and select the first 9 MSE-RPs as those providing valid information.
- For 95% ≤ IG ≤ 98%, we first estimate m = 20 MSE-RPs and select the first 16 MSE-RPs as those providing valid information.
We compare the fit of the data distribution using MSE-RPs estimated using different methods.
From Figure 3, it can be observed that, similarly, as IG increases, the empirical distribution fitted using MSE-RPs becomes closer to the true distribution. However, unlike Case I, in Case II, the MSE-RPs estimated using the B q method result in a closer fit to the true distribution in terms of empirical distribution, while the other three methods exhibit larger biases.

6. Conclusions

This paper primarily investigates the MSE-RPs of Pareto distributions and their estimation. We provided the uniqueness conditions for MSE-RPs and derived the corresponding uniqueness intervals. For the computation of the MSE-RPs of Pareto I and II distributions, we proposed an improved, theoretically grounded k-means method.
For the estimation of MSE-RPs, we compared three categories of methods: the k-means method, the quantile-based estimation method, and methods that first estimate the parameters and then estimate the MSE-RPs. Based on simulations, we found that the k-means method is unsuitable for estimating MSE-RPs of Pareto distributions. The MLE-based parameter estimation method has the smallest bias when estimating MSE-RPs of Pareto I distributions, while the ME-based parameter estimation method performs best for Pareto II distributions. As the parameters γ and α decrease, the bias when using the B q quantile method to estimate MSE-RPs becomes minimal.
Since the tail MSE-RPs of Pareto distributions contribute significant bias while having very low probabilities, we propose using the IG function to calculate the information content of MSE-RPs. We also provide the number of MSE-RPs required for three levels of IG and their corresponding covered ranges to help select an appropriate number of MSE-RPs in practical analyses. Simulation results demonstrate that sacrificing a small amount of information gain can significantly improve the accuracy of representative point estimation.
Additionally, we applied the proposed methods to estimate MSE-RPs for real data from two different distributions. Using different IG intervals, we estimated MSE-RPs for the data and used them to fit the empirical distribution functions of the real data. These results were consistent with the simulation findings.
In conclusion, this paper studies the MSE-RPs of Pareto distributions and their estimation methods, analyzes the bias characteristics of representative point estimation, and proposes an IG-based optimization method for selecting MSE-RPs. The effectiveness and applicability of the proposed method were validated through simulations and real data studies.

Author Contributions

Conceptualization and writing—review and editing, X.P.; methodology and software, X.L.; software and writing—original draft preparation, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the National Key R & D Program of China (No. 2022YFC3600300) and in part by Guangdong Provincial Key Laboratory of IRADS (2022B1212010006).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in Case I were obtained from Table A2 in the Appendix of Reference [43]. The data used in Case II were obtained from Table 1 in Section 6 of reference [13]. Further details about the data sources can be found in the cited references. No new data were created in this study.

Acknowledgments

We would like to express our heartfelt gratitude to Zhou Yongdao for his invaluable suggestions, which greatly contributed to the improvement of this work.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. Proof

Appendix A.1

Proof of Lemma 1.
Taking the logarithm of the density function of P(IV)(0, 1, γ, α), we have
$$g(x) = \log f(x) = \log\alpha - \log\gamma - (\alpha+1)\log\left(1 + x^{1/\gamma}\right) + \left(\frac{1}{\gamma}-1\right)\log x.$$
By taking the second derivative of g(x), we obtain
$$\frac{\partial^{2} g(x)}{\partial x^{2}} = \frac{1}{x^{2}}\left[\frac{\alpha+1}{\gamma^{2}}\left(\frac{x^{1/\gamma}}{1+x^{1/\gamma}}\right)^{2} - \frac{\alpha+1}{\gamma}\left(\frac{1}{\gamma}-1\right)\frac{x^{1/\gamma}}{1+x^{1/\gamma}} - \left(\frac{1}{\gamma}-1\right)\right].$$
From (A2), it is evident that when 1/γ − 1 ≤ 0, we have ∂²g(x)/∂x² ≥ 0, which implies the function is not log-concave. Therefore, it is required that 1/γ > 1.
Let z = x^{1/γ}/(1 + x^{1/γ}), 0 < z < 1; then
$$h(z) = \frac{\alpha+1}{\gamma^{2}}z^{2} - \frac{\alpha+1}{\gamma}\left(\frac{1}{\gamma}-1\right)z - \left(\frac{1}{\gamma}-1\right).$$
It is clear that h(z) is an upward-opening parabola and that z is an increasing function of x. The two roots of h(z) = 0 are as follows:
$$z_{1} = \frac{(\alpha+1)\left(\frac{1}{\gamma}-1\right) - \sqrt{(\alpha+1)^{2}\left(\frac{1}{\gamma}-1\right)^{2} + 4(\alpha+1)\left(\frac{1}{\gamma}-1\right)}}{2(\alpha+1)/\gamma}, \qquad z_{2} = \frac{(\alpha+1)\left(\frac{1}{\gamma}-1\right) + \sqrt{(\alpha+1)^{2}\left(\frac{1}{\gamma}-1\right)^{2} + 4(\alpha+1)\left(\frac{1}{\gamma}-1\right)}}{2(\alpha+1)/\gamma}.$$
Clearly, z_1 < 0 and z_2 > 0. Next, we examine whether z_2 is within the range of z:
$$z_{2} - 1 = -\frac{1}{2}(1+\gamma) + \frac{1}{2}\sqrt{(1-\gamma)^{2} + \frac{4(\gamma-\gamma^{2})}{\alpha+1}}.$$
To determine whether z_2 > 1 or z_2 < 1, we square both parts of (A5) and subtract the squares. Since
$$(1-\gamma)^{2} + \frac{4(\gamma-\gamma^{2})}{\alpha+1} - (1+\gamma)^{2} = \frac{-4\gamma\alpha - 4\gamma^{2}}{\alpha+1} < 0,$$
it follows that z_2 < 1. Therefore, when 1/γ > 1 and z lies within (0, z_2), the density function of P(IV)(0, 1, γ, α) is log-concave, which guarantees the uniqueness of the MSE-RPs. Furthermore, as α → +∞, z_2 → 1 − γ. □

Appendix A.2

Proof of Theorem 1.
Let
$$G(y_{i+1}) = M_{i+1}F(M_{i+1}) - M_i F(M_i) - \int_{M_i}^{M_{i+1}} F(x)\,dx - y_i\left(F(M_{i+1}) - F(M_i)\right),$$
where M_1 = F^{-1}(0), M_i = (y_{i-1} + y_i)/2, i = 2, …, m, and M_{m+1} = F^{-1}(1).
To prove that the equation has a solution, for i = 1, …, m − 1 we need to show that G(y_{i+1} = y_i) < 0 and lim_{y_{i+1}→+∞} G(y_{i+1}) > 0. For i = m, we need to show that G(y_m = y_{m−1}) > 0 and lim_{y_m→+∞} G(y_m) < 0.
1. Case I: For i = 1, …, m − 1,
$$G(y_{i+1} = y_i) = (y_i - M_i)\,F(M_i) - \int_{M_i}^{y_i} F(x)\,dx.$$
By the mean value theorem, there exists η ∈ (M_i, y_i) such that ∫_{M_i}^{y_i} F(x) dx = (y_i − M_i) F(η). Therefore, G(y_{i+1} = y_i) < 0. Moreover,
$$\lim_{y_{i+1}\to+\infty} G(y_{i+1}) = \lim_{M_{i+1}\to+\infty}\left[(y_i - M_i)\,F(M_i) + (M_{i+1} - y_i)\,F(M_{i+1}) - \int_{M_i}^{M_{i+1}} F(x)\,dx\right].$$
Substituting F(x) = 1 − (1 + x^{1/γ})^{−α}, we have
$$h(y_i) = \lim_{y_{i+1}\to+\infty} G(y_{i+1}) = (y_i - M_i)\left(F(M_i) - 1\right) + \int_{M_i}^{+\infty}\left(1 - F(x)\right)dx = y_i\left(F(M_i) - 1\right) + \int_{M_i}^{+\infty} x f(x)\,dx,$$
where the last equality holds because, when α/γ > 2,
$$\lim_{x\to+\infty} x\left(F(x) - 1\right) = -\lim_{x\to+\infty} x^{1-\alpha/\gamma} = 0.$$
For i = 1,
$$h(y_1) = -y_1 + \int_{0}^{+\infty} x f(x)\,dx = E(X) - y_1.$$
Thus, when y_1 < E(X), h(y_1) > 0, which ensures the existence of a solution. For i = 2, …, m − 1,
$$h(y_i) = y_i\left(F(M_i) - 1\right) + \int_{0}^{+\infty} x f(x)\,dx - \sum_{j=1}^{i-1}\int_{M_j}^{M_{j+1}} x f(x)\,dx = y_i\left(F(M_i) - 1\right) + E(X) - \sum_{j=1}^{i-1} y_j P_j,$$
where the last equality uses ∫_{M_j}^{M_{j+1}} x f(x) dx = y_j P_j with P_j = F(M_{j+1}) − F(M_j), which follows from L_j = 0. Thus, when
$$y_i < \frac{E(X) - \sum_{j=1}^{i-1} y_j P_j}{1 - F(M_i)},$$
we have h(y_i) > 0, and the equation has a solution.
2. Case II: For i = m,
$$G(y_m = y_{m-1}) = \lim_{N\to+\infty}\left[(N - y_{m-1})\,F(N) - \int_{y_{m-1}}^{N} F(x)\,dx\right].$$
By the mean value theorem, there exists η ∈ (y_{m−1}, N) such that ∫_{y_{m−1}}^{N} F(x) dx = (N − y_{m−1}) F(η). Thus, G(y_m = y_{m−1}) > 0. Finally,
$$\lim_{y_m\to+\infty} G(y_m) = \lim_{M_m\to+\infty}\left[-y_m\left(F(M_{m+1}) - F(M_m)\right)\right] < 0.\ \square$$

Appendix B. MSE-RPs of Pareto III and IV Distributions

Table A1. The points of MSE-RPs from P(III)(0, 1, 0.2). Each row lists the attained MSE followed by the m representative points in increasing order (the RP17–RP31 continuation columns of the original layout are merged into their rows).

m = 2: MSE 0.081005705; RPs 0.893248503, 1.624569032
m = 3: MSE 0.046924; RPs 0.792100, 1.264540, 2.150314
m = 4: MSE 0.030749; RPs 0.722458, 1.095883, 1.588422, 2.665352
m = 5: MSE 0.021748; RPs 0.670249, 0.990624, 1.343028, 1.899748, 3.175156
m = 6: MSE 0.016210; RPs 0.629031, 0.915762, 1.198240, 1.573247, 2.206232, 3.681976
m = 7: MSE 0.012554; RPs 0.595321, 0.858417, 1.099461, 1.385825, 1.796710, 2.510388, 4.186923
m = 8: MSE 0.010013; RPs 0.567032, 0.812355, 1.026063, 1.261175, 1.564830, 2.016989, 2.813247, 4.690613
m = 9: MSE 0.008174; RPs 0.542819, 0.774120, 0.968406, 1.170558, 1.412900, 1.739604, 2.235580, 3.115306, 5.193419
m = 10: MSE 0.006800; RPs 0.521766, 0.741610, 0.921328, 1.100659, 1.304042, 1.559465, 1.912082, 2.453194, 3.416831, 5.695577
m = 11: MSE 0.005746; RPs 0.503227, 0.713457, 0.881789, 1.044432, 1.221182, 1.431611, 1.703127, 2.083222, 2.670200, 3.717980, 6.197247
m = 12: MSE 0.004920; RPs 0.486728, 0.688722, 0.847862, 0.997781, 1.155313, 1.335200, 1.555742, 1.845052, 2.253536, 2.886806, 4.018854, 6.698539
m = 13: MSE 0.004260; RPs 0.471912, 0.666734, 0.818261, 0.958149, 1.101222, 1.259236, 1.445329, 1.677757, 1.985886, 2.423314, 3.103136, 4.319518, 7.199533
m = 14: MSE 0.003725; RPs 0.458507, 0.646998, 0.792087, 0.923850, 1.055678, 1.197364, 1.358900, 1.553005, 1.798412, 2.126004, 2.592731, 3.319267, 4.620018, 7.700288
m = 15: MSE 0.003285; RPs 0.446297, 0.629138, 0.768688, 0.893721, 1.016564, 1.145654, 1.288950, 1.455822, 1.659071, 1.918161, 2.265638, 2.761895, 3.535249, 4.920388, 8.200847
m = 16: MSE 0.002918; RPs 0.435111, 0.612864, 0.747579, 0.866934, 0.982432, 1.101539, 1.230836, 1.377547, 1.550909, 1.764044, 2.037287, 2.404934, 2.930878, 3.751119, 5.220652, 8.701244
m = 17: MSE 0.002610; RPs 0.424810, 0.597944, 0.728389, 0.842877, 0.952257, 1.063273, 1.181533, 1.312824, 1.464112, 1.644728, 1.868254, 2.155975, 2.543988, 3.099726, 3.966900, 5.520828, 9.201507
m = 18: MSE 0.002348; RPs 0.415283, 0.584195, 0.710828, 0.821087, 0.925287, 1.029620, 1.138984, 1.258163, 1.392609, 1.549251, 1.737649, 1.971919, 2.274345, 2.682864, 3.268472, 4.182610, 5.820932, 9.701655
m = 19: MSE 0.002124; RPs 0.406433, 0.571465, 0.694667, 0.801208, 0.900961, 0.999682, 1.101739, 1.211192, 1.332442, 1.470827, 1.633366, 1.829920, 2.075187, 2.392483, 2.821608, 3.437139, 4.398262, 6.120974, 10.201706
m = 20: MSE 0.001930; RPs 0.398184, 0.559631, 0.679718, 0.782960, 0.878848, 0.972791, 1.068747, 1.170240, 1.280923, 1.405031, 1.547906, 1.716732, 1.921712, 2.178159, 2.510446, 2.960250, 3.605744, 4.613867, 6.420963, 10.701674
m = 21: MSE 0.001762; RPs 0.390469, 0.548588, 0.665831, 0.766116, 0.858610, 0.948434, 1.039225, 1.134101, 1.236160, 1.348856, 1.476378, 1.624142, 1.799537, 2.013145, 2.280909, 2.628276, 3.098813, 3.774299, 4.829431, 6.720907, 11.201569
m = 22: MSE 0.001614; RPs 0.383232, 0.538250, 0.652880, 0.750496, 0.839981, 0.926216, 1.012580, 1.101874, 1.196783, 1.300186, 1.415454, 1.546794, 1.699742, 1.881919, 2.104304, 2.383489, 2.746003, 3.237314, 3.942814, 5.044962, 7.020812, 11.701401
m = 23: MSE 0.001485; RPs 0.376424, 0.528543, 0.640759, 0.735950, 0.822744, 0.905821, 0.988349, 1.072880, 1.161777, 1.257491, 1.362796, 1.481044, 1.616503, 1.774855, 1.963975, 2.195251, 2.485936, 2.863649, 3.375766, 4.111295, 5.260464, 7.320682, 12.201178
m = 24: MSE 0.001370; RPs 0.370003, 0.519402, 0.629380, 0.722352, 0.806724, 0.886999, 0.966171, 1.046591, 1.130371, 1.219635, 1.316709, 1.424326, 1.545859, 1.685666, 1.849592, 2.045778, 2.286034, 2.588279, 2.981231, 3.514177, 4.279747, 5.475941, 7.620523, 12.700906
m = 25: MSE 0.001269; RPs 0.363935, 0.510773, 0.618665, 0.709599, 0.791774, 0.869544, 0.945754, 1.022593, 1.101971, 1.185757, 1.275939, 1.374784, 1.485021, 1.610073, 1.754403, 1.924033, 2.127382, 2.376686, 2.690539, 3.098762, 3.652556, 4.448176, 5.691397, 7.920337, 13.200591
m = 26: MSE 0.001178; RPs 0.358186, 0.502610, 0.608552, 0.697601, 0.777774, 0.853287, 0.926865, 1.000556, 1.076111, 1.155196, 1.239534, 1.331043, 1.431969, 1.545062, 1.673812, 1.822802, 1.998240, 2.208827, 2.467235, 2.792733, 3.216251, 3.790908, 4.616585, 5.906833, 8.220127, 13.700237
m = 27: MSE 0.001097; RPs 0.352729, 0.494869, 0.598982, 0.686283, 0.764620, 0.838090, 0.909309, 0.980212, 1.052419, 1.127429, 1.206762, 1.292060, 1.385207, 1.488451, 1.604584, 1.737175, 1.890933, 2.072259, 2.290146, 2.557699, 2.894874, 3.333705, 3.929236, 4.784977, 6.122252, 8.519896, 14.199850
m = 28: MSE 0.001023; RPs 0.347541, 0.487516, 0.589908, 0.675580, 0.752225, 0.823833, 0.892927, 0.961344, 1.030594, 1.102044, 1.177046, 1.257031, 1.343599, 1.438623, 1.544373, 1.663692, 1.800235, 1.958847, 2.146128, 2.371361, 2.648096, 2.996970, 3.451131, 4.067546, 4.953354, 6.337657, 8.819647, 14.699431
m = 29: MSE 0.000957; RPs 0.342599, 0.480518, 0.581285, 0.665436, 0.740515, 0.810419, 0.877586, 0.943772, 1.010392, 1.078707, 1.149931, 1.225327, 1.306273, 1.394351, 1.491441, 1.599845, 1.722464, 1.863050, 2.026586, 2.219873, 2.452493, 2.738437, 3.099030, 3.568534, 4.205839, 5.121717, 6.553048, 9.119380, 15.198984
m = 30: MSE 0.000897; RPs 0.337883, 0.473846, 0.573078, 0.655800, 0.729425, 0.797761, 0.863172, 0.927344, 0.991612, 1.057145, 1.125049, 1.196447, 1.272543, 1.354691, 1.444469, 1.543774, 1.654949, 1.780963, 1.925665, 2.094181, 2.293518, 2.533556, 2.828733, 3.201060, 3.685917, 4.344118, 5.290069, 6.768427, 9.419099, 15.698512
m = 31: MSE 0.000843; RPs 0.333378, 0.467476, 0.565251, 0.646630, 0.718899, 0.785786, 0.849591, 0.911934, 0.974085, 1.037136, 1.102100, 1.169989, 1.241865, 1.318901, 1.402441, 1.494069, 1.595710, 1.709754, 1.839238, 1.988116, 2.161658, 2.367080, 2.614562, 2.918990, 3.303065, 3.803283, 4.482385, 5.458410, 6.983796, 9.718803, 16.198018
Table A2. The corresponding probabilities of MSE-RPs from P ( I I I ) ( 0 , 1 , 0.2 ) .
Table A2. The corresponding probabilities of MSE-RPs from P ( I I I ) ( 0 , 1 , 0.2 ) .
m | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | P15 | P16
m = 2 | 0.759735 | 0.240265
m = 3 | 0.534851 | 0.400680 | 0.064469
m = 4 | 0.383170 | 0.430096 | 0.164274 | 0.022460
m = 5 | 0.283123 | 0.400708 | 0.234239 | 0.072513 | 0.009417
m = 6 | 0.215635 | 0.353219 | 0.267478 | 0.123827 | 0.035341 | 0.004501
m = 7 | 0.168676 | 0.304742 | 0.274248 | 0.163070 | 0.068132 | 0.018763 | 0.002369
m = 8 | 0.134990 | 0.261237 | 0.265497 | 0.187507 | 0.099284 | 0.039460 | 0.010681 | 0.001343
m = 9 | 0.110152 | 0.224090 | 0.248945 | 0.199247 | 0.124372 | 0.061963 | 0.023989 | 0.006435 | 0.000807
m = 10 | 0.091389 | 0.192998 | 0.229277 | 0.201662 | 0.142152 | 0.082844 | 0.039889 | 0.015218 | 0.004061 | 0.000509
m = 11 | 0.076910 | 0.167141 | 0.209103 | 0.197859 | 0.153120 | 0.100258 | 0.056114 | 0.026481 | 0.010016 | 0.002665 | 0.000334
m = 12 | 0.065529 | 0.145634 | 0.189765 | 0.190239 | 0.158475 | 0.113563 | 0.071070 | 0.038801 | 0.018086 | 0.006804 | 0.001807 | 0.000226
m = 13 | 0.056437 | 0.127683 | 0.171881 | 0.180501 | 0.159545 | 0.122862 | 0.083860 | 0.050991 | 0.027400 | 0.012671 | 0.004751 | 0.001260 | 0.000158
m = 14 | 0.049069 | 0.112623 | 0.155667 | 0.169784 | 0.157524 | 0.128628 | 0.094123 | 0.062225 | 0.037124 | 0.019742 | 0.009082 | 0.003398 | 0.000901 | 0.000113
m = 15 | 0.043020 | 0.099912 | 0.141131 | 0.158819 | 0.153385 | 0.131472 | 0.101855 | 0.072015 | 0.046592 | 0.027443 | 0.014491 | 0.006643 | 0.002482 | 0.000658 | 0.000082
m = 16 | 0.038000 | 0.089119 | 0.128177 | 0.148056 | 0.147879 | 0.132010 | 0.107256 | 0.080146 | 0.055336 | 0.035269 | 0.020590 | 0.010819 | 0.004948 | 0.001846 | 0.000489 | 0.000061
m = 17 | 0.033789 | 0.079895 | 0.116667 | 0.137757 | 0.141558 | 0.130792 | 0.110628 | 0.086584 | 0.063072 | 0.042822 | 0.027006 | 0.015667 | 0.008203 | 0.003745 | 0.001397 | 0.000370
m = 18 | 0.030226 | 0.071967 | 0.106447 | 0.128061 | 0.134823 | 0.128287 | 0.112300 | 0.091421 | 0.069659 | 0.049821 | 0.033418 | 0.020915 | 0.012078 | 0.006309 | 0.002877 | 0.001072
m = 19 | 0.027186 | 0.065111 | 0.097369 | 0.119031 | 0.127952 | 0.124871 | 0.112592 | 0.094810 | 0.075065 | 0.056086 | 0.039578 | 0.026313 | 0.016376 | 0.009426 | 0.004914 | 0.002239
m = 20 | 0.024573 | 0.059150 | 0.089294 | 0.110677 | 0.121135 | 0.120843 | 0.111794 | 0.096937 | 0.079329 | 0.061526 | 0.045305 | 0.031651 | 0.020904 | 0.012956 | 0.007438 | 0.003873
m = 21 | 0.022311 | 0.053941 | 0.082098 | 0.102986 | 0.114497 | 0.116431 | 0.110156 | 0.097993 | 0.082534 | 0.066110 | 0.050483 | 0.036766 | 0.025493 | 0.016753 | 0.010350 | 0.005931
m = 22 | 0.020340 | 0.049365 | 0.075671 | 0.095923 | 0.108119 | 0.111808 | 0.107889 | 0.098159 | 0.084788 | 0.069855 | 0.055047 | 0.041539 | 0.030000 | 0.020681 | 0.013539 | 0.008344
m = 23 | 0.018614 | 0.045327 | 0.069918 | 0.089448 | 0.102046 | 0.107104 | 0.105165 | 0.097603 | 0.086209 | 0.072807 | 0.058973 | 0.045890 | 0.034313 | 0.024620 | 0.016897 | 0.011029
m = 24 | 0.017095 | 0.041749 | 0.064753 | 0.083516 | 0.096301 | 0.102411 | 0.102124 | 0.096470 | 0.086912 | 0.075030 | 0.062267 | 0.049773 | 0.038352 | 0.028472 | 0.020325 | 0.013901
m = 25 | 0.015750 | 0.038564 | 0.060106 | 0.078082 | 0.090894 | 0.097797 | 0.098876 | 0.094886 | 0.087009 | 0.076599 | 0.064954 | 0.053168 | 0.042059 | 0.032157 | 0.023738 | 0.016878
m = 26 | 0.014554 | 0.035719 | 0.055913 | 0.073103 | 0.085820 | 0.093308 | 0.095508 | 0.092958 | 0.086603 | 0.077590 | 0.067074 | 0.056075 | 0.045401 | 0.035618 | 0.027065 | 0.019889
m = 27 | 0.013487 | 0.033168 | 0.052121 | 0.068536 | 0.081071 | 0.088974 | 0.092087 | 0.090774 | 0.085784 | 0.078079 | 0.068674 | 0.058509 | 0.048362 | 0.038815 | 0.030249 | 0.022871
m = 28 | 0.012530 | 0.030872 | 0.048682 | 0.064345 | 0.076633 | 0.084816 | 0.088665 | 0.088409 | 0.084632 | 0.078136 | 0.069806 | 0.060495 | 0.050940 | 0.041722 | 0.033249 | 0.025772
m = 29 | 0.011670 | 0.028800 | 0.045557 | 0.060493 | 0.072490 | 0.080843 | 0.085279 | 0.085920 | 0.083216 | 0.077828 | 0.070522 | 0.062064 | 0.053143 | 0.044325 | 0.036033 | 0.028550
m = 30 | 0.010893 | 0.026924 | 0.042709 | 0.056949 | 0.068624 | 0.077062 | 0.081958 | 0.083358 | 0.081595 | 0.077215 | 0.070873 | 0.063251 | 0.054986 | 0.046622 | 0.038584 | 0.031175
m = 31 | 0.010190 | 0.025220 | 0.040108 | 0.053685 | 0.065017 | 0.073471 | 0.078724 | 0.080759 | 0.079820 | 0.076349 | 0.070906 | 0.064092 | 0.056492 | 0.048617 | 0.040889 | 0.033622
m | P17 | P18 | P19 | P20 | P21 | P22 | P23 | P24 | P25 | P26 | P27 | P28 | P29 | P30 | P31
m = 17 | 0.000046
m = 18 | 0.000284 | 0.000036
m = 19 | 0.000834 | 0.000221 | 0.000028
m = 20 | 0.001763 | 0.000657 | 0.000174 | 0.000022
m = 21 | 0.003085 | 0.001404 | 0.000523 | 0.000138 | 0.000017
m = 22 | 0.004775 | 0.002482 | 0.001129 | 0.000420 | 0.000111 | 0.000014
m = 23 | 0.006785 | 0.003878 | 0.002014 | 0.000916 | 0.000341 | 0.000090 | 0.000011
m = 24 | 0.009052 | 0.005561 | 0.003176 | 0.001649 | 0.000749 | 0.000279 | 0.000074 | 0.000009
m = 25 | 0.011511 | 0.007483 | 0.004591 | 0.002620 | 0.001360 | 0.000618 | 0.000230 | 0.000061 | 0.000008
m = 26 | 0.014097 | 0.009593 | 0.006227 | 0.003817 | 0.002177 | 0.001130 | 0.000513 | 0.000191 | 0.000051 | 0.000006
m = 27 | 0.016747 | 0.011839 | 0.008042 | 0.005214 | 0.003194 | 0.001821 | 0.000944 | 0.000429 | 0.000160 | 0.000042 | 0.000005
m = 28 | 0.019408 | 0.014169 | 0.009996 | 0.006781 | 0.004392 | 0.002689 | 0.001532 | 0.000795 | 0.000361 | 0.000134 | 0.000036 | 0.000004
m = 29 | 0.022033 | 0.016538 | 0.012045 | 0.008483 | 0.005748 | 0.003720 | 0.002276 | 0.001297 | 0.000672 | 0.000305 | 0.000114 | 0.000030 | 0.000004
m = 30 | 0.024584 | 0.018904 | 0.014151 | 0.010287 | 0.007234 | 0.004897 | 0.003167 | 0.001937 | 0.001104 | 0.000572 | 0.000260 | 0.000097 | 0.000026 | 0.000003
m = 31 | 0.027029 | 0.021231 | 0.016277 | 0.012158 | 0.008823 | 0.006198 | 0.004192 | 0.002710 | 0.001657 | 0.000944 | 0.000489 | 0.000222 | 0.000083 | 0.000022 | 0.000003
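The probabilities in Tables A2 and A4 are the masses that the underlying distribution assigns to each point's cell: the support is split at the midpoints of adjacent MSE-RPs, and Pk is the probability of the k-th cell. The following is a minimal sketch of this computation (our own code, not taken from the paper, using the Arnold parameterization in which P(III)(0, 1, γ) has CDF F(x) = 1 − 1/(1 + x^(1/γ)) for x > 0):

import numpy as np

def pareto3_cdf(x, gamma=0.2):
    # CDF of P(III)(0, 1, gamma): F(x) = 1 - 1/(1 + x**(1/gamma)) for x > 0
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, 1.0 - 1.0 / (1.0 + np.maximum(x, 0.0) ** (1.0 / gamma)), 0.0)

def cell_probabilities(rps, cdf):
    # Split the support at the midpoints of adjacent MSE-RPs and take
    # the probability mass of each resulting cell.
    rps = np.sort(np.asarray(rps, dtype=float))
    cuts = (rps[:-1] + rps[1:]) / 2.0
    F = np.concatenate(([0.0], cdf(cuts), [1.0]))
    return np.diff(F)

As a check against the tables: for m = 24, the midpoint of the first two points in Table A1 is (0.370003 + 0.519402)/2 ≈ 0.444703, and F(0.444703) ≈ 0.017095, which is exactly P1 in the m = 24 row of Table A2.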
Table A3. The points of MSE-RPs from P(IV)(0, 1, 0.1, 0.5).
m | MSE | RP1 | RP2 | RP3 | RP4 | RP5 | RP6 | RP7 | RP8 | RP9 | RP10 | RP11 | RP12 | RP13 | RP14 | RP15 | RP16
m = 2 | 0.060997 | 1.073328 | 1.797978
m = 3 | 0.035699 | 1.001741 | 1.421058 | 2.369184
m = 4 | 0.023552 | 0.950106 | 1.259380 | 1.756498 | 2.927609
m = 5 | 0.016738 | 0.909934 | 1.165271 | 1.499730 | 2.087874 | 3.479813
m = 6 | 0.012520 | 0.877230 | 1.101445 | 1.355579 | 1.736728 | 2.417078 | 4.028470
m = 7 | 0.009724 | 0.849773 | 1.053992 | 1.261844 | 1.541057 | 1.972432 | 2.744937 | 4.574898
m = 8 | 0.007773 | 0.826198 | 1.016533 | 1.195035 | 1.415336 | 1.725089 | 2.207414 | 3.071896 | 5.119828
m = 9 | 0.006357 | 0.805607 | 0.985714 | 1.144306 | 1.327099 | 1.566758 | 1.908503 | 2.441918 | 3.398221 | 5.663702
m = 10 | 0.005296 | 0.787374 | 0.959595 | 1.103973 | 1.261260 | 1.456271 | 1.717401 | 2.091572 | 2.676082 | 3.724081 | 6.206803
m = 11 | 0.004481 | 0.771051 | 0.936964 | 1.070775 | 1.209858 | 1.374446 | 1.584345 | 1.867679 | 2.274413 | 2.909991 | 4.049591 | 6.749318
m = 12 | 0.003841 | 0.756303 | 0.917022 | 1.042710 | 1.168304 | 1.311128 | 1.486132 | 1.711938 | 2.017753 | 2.457091 | 3.143704 | 4.374829 | 7.291381
m = 13 | 0.003329 | 0.742874 | 0.899212 | 1.018482 | 1.133771 | 1.260437 | 1.410459 | 1.597154 | 1.839288 | 2.167695 | 2.639643 | 3.377263 | 4.699852 | 7.833086
m = 14 | 0.002913 | 0.730564 | 0.883135 | 0.997212 | 1.104427 | 1.218740 | 1.350190 | 1.508903 | 1.707853 | 1.966500 | 2.317542 | 2.822096 | 3.610697 | 5.024702 | 8.374504
m = 15 | 0.002571 | 0.719216 | 0.868495 | 0.978282 | 1.079037 | 1.183673 | 1.300901 | 1.438808 | 1.606909 | 1.818377 | 2.093623 | 2.467316 | 3.004468 | 3.844030 | 5.349412 | 8.915686
m = 16 | 0.002286 | 0.708702 | 0.855064 | 0.961247 | 1.056733 | 1.153635 | 1.259709 | 1.381673 | 1.526852 | 1.704682 | 1.928799 | 2.220684 | 2.617033 | 3.186775 | 4.077279 | 5.674005 | 9.456674
m = 17 | 0.002045 | 0.698916 | 0.842665 | 0.945772 | 1.036893 | 1.127505 | 1.224656 | 1.334104 | 1.461720 | 1.614588 | 1.802324 | 2.039155 | 2.347697 | 2.766704 | 3.369027 | 4.310458 | 5.998500
m = 18 | 0.001841 | 0.689774 | 0.831159 | 0.931604 | 1.019055 | 1.104474 | 1.194366 | 1.293790 | 1.407615 | 1.541370 | 1.702149 | 1.899885 | 2.149465 | 2.474674 | 2.916335 | 3.551232 | 4.543578
m = 19 | 0.001666 | 0.681201 | 0.820431 | 0.918545 | 1.002873 | 1.083947 | 1.167846 | 1.259107 | 1.361880 | 1.480630 | 1.620792 | 1.789607 | 1.997395 | 2.259742 | 2.601620 | 3.065934 | 3.733397
m = 20 | 0.001515 | 0.673137 | 0.810387 | 0.906441 | 0.988078 | 1.065474 | 1.144362 | 1.228880 | 1.322646 | 1.429371 | 1.553357 | 1.700079 | 1.877000 | 2.094869 | 2.369992 | 2.728541 | 3.215504
m = 21 | 0.001383 | 0.665531 | 0.800951 | 0.895164 | 0.974460 | 1.048709 | 1.123361 | 1.202238 | 1.288558 | 1.385480 | 1.496506 | 1.625910 | 1.779281 | 1.964349 | 2.192316 | 2.480221 | 2.855441
m = 22 | 0.001268 | 0.658336 | 0.792057 | 0.884612 | 0.961854 | 1.033383 | 1.104417 | 1.178524 | 1.258612 | 1.347425 | 1.447886 | 1.563423 | 1.698354 | 1.858430 | 2.051670 | 2.289743 | 2.590433
m = 23 | 0.001167 | 0.651516 | 0.783649 | 0.874702 | 0.950123 | 1.019283 | 1.087199 | 1.157233 | 1.232046 | 1.314068 | 1.405790 | 1.510027 | 1.630204 | 1.770730 | 1.937543 | 2.138968 | 2.387155
m = 24 | 0.001077 | 0.645035 | 0.775682 | 0.865362 | 0.939159 | 1.006237 | 1.071444 | 1.137969 | 1.208276 | 1.284548 | 1.368948 | 1.463838 | 1.571998 | 1.696896 | 1.843059 | 2.016630 | 2.226251
m = 25 | 0.000997 | 0.638865 | 0.768113 | 0.856532 | 0.928870 | 0.994106 | 1.056942 | 1.120420 | 1.186845 | 1.258201 | 1.336399 | 1.423458 | 1.521682 | 1.633859 | 1.763530 | 1.915357 | 2.095698
m = 26 | 0.000926 | 0.632979 | 0.760907 | 0.848163 | 0.919180 | 0.982776 | 1.043521 | 1.104335 | 1.167389 | 1.234508 | 1.307402 | 1.387827 | 1.477726 | 1.579390 | 1.695647 | 1.830124 | 1.987632
m = 27 | 0.000862 | 0.627356 | 0.754034 | 0.840210 | 0.910026 | 0.972151 | 1.031042 | 1.089511 | 1.149617 | 1.213056 | 1.281375 | 1.356125 | 1.438973 | 1.531833 | 1.637009 | 1.757384 | 1.896690
m = 28 | 0.000805 | 0.621974 | 0.747467 | 0.832636 | 0.901354 | 0.962153 | 1.019389 | 1.075782 | 1.133294 | 1.193513 | 1.257857 | 1.327710 | 1.404526 | 1.489929 | 1.585829 | 1.694564 | 1.819087
m = 29 | 0.000753 | 0.616815 | 0.741181 | 0.825408 | 0.893117 | 0.952714 | 1.008465 | 1.063009 | 1.118225 | 1.175611 | 1.236477 | 1.302073 | 1.373683 | 1.452707 | 1.540754 | 1.639750 | 1.752077
m = 30 | 0.000706 | 0.611864 | 0.735154 | 0.818497 | 0.885274 | 0.943778 | 0.998189 | 1.051078 | 1.104251 | 1.159129 | 1.216933 | 1.278803 | 1.345886 | 1.419405 | 1.500735 | 1.591488 | 1.693618
m = 31 | 0.000663 | 0.607106 | 0.729369 | 0.811879 | 0.877792 | 0.935294 | 0.988493 | 1.039892 | 1.091239 | 1.143885 | 1.198977 | 1.257566 | 1.320686 | 1.389417 | 1.464951 | 1.548656 | 1.642158
m | RP17 | RP18 | RP19 | RP20 | RP21 | RP22 | RP23 | RP24 | RP25 | RP26 | RP27 | RP28 | RP29 | RP30 | RP31
m = 17 | 9.997500
m = 18 | 6.322913 | 10.538188
m = 19 | 4.776647 | 6.647255 | 11.078759
m = 20 | 3.915529 | 5.009672 | 6.971538 | 11.619229
m = 21 | 3.365050 | 4.097631 | 5.242661 | 7.295767 | 12.159612
m = 22 | 2.982322 | 3.514574 | 4.279707 | 5.475616 | 7.619951 | 12.699919
m = 23 | 2.700629 | 3.109187 | 3.664081 | 4.461761 | 5.708542 | 7.944095 | 13.240159
m = 24 | 2.484553 | 2.810813 | 3.236039 | 3.813570 | 4.643795 | 5.941443 | 8.268204 | 13.780340
m = 25 | 2.313520 | 2.581941 | 2.920986 | 3.362877 | 3.963046 | 4.825811 | 6.174322 | 8.592281 | 14.320469
m = 26 | 2.174753 | 2.400779 | 2.679319 | 3.031149 | 3.489705 | 4.112508 | 5.007812 | 6.407180 | 8.916331 | 14.860551
m = 27 | 2.059891 | 2.253796 | 2.488030 | 2.776688 | 3.141303 | 3.616523 | 4.261959 | 5.189799 | 6.640021 | 9.240355 | 15.400592
m = 28 | 1.963236 | 2.132138 | 2.332832 | 2.575273 | 2.874051 | 3.251450 | 3.743332 | 4.411400 | 5.371773 | 6.872845 | 9.564357 | 15.940596
m = 29 | 1.880765 | 2.029768 | 2.204376 | 2.411860 | 2.662510 | 2.971407 | 3.361589 | 3.870134 | 4.560831 | 5.553736 | 7.105655 | 9.888339 | 16.480566
m = 30 | 1.809559 | 1.942425 | 2.096288 | 2.276606 | 2.490882 | 2.749742 | 3.068757 | 3.471722 | 3.996928 | 4.710254 | 5.735688 | 7.338452 | 10.212303 | 17.020505
m = 31 | 1.747447 | 1.867019 | 2.004072 | 2.162801 | 2.348830 | 2.569899 | 2.836968 | 3.166103 | 3.581850 | 4.123715 | 4.859669 | 5.917632 | 7.571237 | 10.536250 | 17.560417
Table A4. The corresponding probabilities of MSE-RPs from P(IV)(0, 1, 0.1, 0.5).
m | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | P15 | P16
m = 2 | 0.838195 | 0.161805
m = 3 | 0.642074 | 0.317052 | 0.040874
m = 4 | 0.480670 | 0.392115 | 0.113025 | 0.014190
m = 5 | 0.360663 | 0.407760 | 0.177812 | 0.047784 | 0.005981
m = 6 | 0.274208 | 0.389276 | 0.224062 | 0.086585 | 0.022992 | 0.002876
m = 7 | 0.212063 | 0.354907 | 0.251141 | 0.122229 | 0.045963 | 0.012174 | 0.001523
m = 8 | 0.166921 | 0.315534 | 0.262271 | 0.151081 | 0.070187 | 0.026203 | 0.006935 | 0.000868
m = 9 | 0.133615 | 0.276854 | 0.261638 | 0.172141 | 0.092750 | 0.042471 | 0.015821 | 0.004186 | 0.000524
m = 10 | 0.108618 | 0.241452 | 0.253125 | 0.185756 | 0.112147 | 0.059021 | 0.026891 | 0.010010 | 0.002649 | 0.000331
m = 11 | 0.089537 | 0.210216 | 0.239839 | 0.192916 | 0.127760 | 0.074591 | 0.038905 | 0.017693 | 0.006584 | 0.001742 | 0.000218
m = 12 | 0.074738 | 0.183188 | 0.224055 | 0.194845 | 0.139506 | 0.088447 | 0.050925 | 0.026464 | 0.012025 | 0.004474 | 0.001184 | 0.000148
m = 13 | 0.063088 | 0.160034 | 0.207328 | 0.192766 | 0.147618 | 0.100223 | 0.062318 | 0.035661 | 0.018502 | 0.008404 | 0.003127 | 0.000827 | 0.000103
m = 14 | 0.053791 | 0.140284 | 0.190660 | 0.187775 | 0.152501 | 0.109792 | 0.072685 | 0.044785 | 0.025555 | 0.013248 | 0.006017 | 0.002239 | 0.000592 | 0.000074
m = 15 | 0.046280 | 0.123453 | 0.174657 | 0.180790 | 0.154634 | 0.117189 | 0.081801 | 0.053484 | 0.032807 | 0.018694 | 0.009688 | 0.004400 | 0.001637 | 0.000433 | 0.000054
m = 16 | 0.040143 | 0.109092 | 0.159650 | 0.172542 | 0.154509 | 0.122550 | 0.089560 | 0.061522 | 0.039968 | 0.024460 | 0.013928 | 0.007216 | 0.003277 | 0.001219 | 0.000323 | 0.000040
m = 17 | 0.035076 | 0.096807 | 0.145794 | 0.163592 | 0.152594 | 0.126068 | 0.095944 | 0.068755 | 0.046827 | 0.030318 | 0.018532 | 0.010548 | 0.005465 | 0.002482 | 0.000923 | 0.000244
m = 18 | 0.030853 | 0.086264 | 0.133131 | 0.154354 | 0.149307 | 0.127969 | 0.100994 | 0.075103 | 0.053237 | 0.036087 | 0.023321 | 0.014245 | 0.008106 | 0.004200 | 0.001907 | 0.000709
m = 19 | 0.027304 | 0.077181 | 0.121636 | 0.145122 | 0.145009 | 0.128487 | 0.104790 | 0.080538 | 0.059101 | 0.041635 | 0.028147 | 0.018170 | 0.011095 | 0.006313 | 0.003270 | 0.001485
m = 20 | 0.024297 | 0.069324 | 0.111245 | 0.136102 | 0.140002 | 0.127850 | 0.107438 | 0.085064 | 0.064361 | 0.046864 | 0.032894 | 0.022203 | 0.014324 | 0.008744 | 0.004975 | 0.002577
m = 21 | 0.021732 | 0.062501 | 0.101876 | 0.127425 | 0.134531 | 0.126271 | 0.109056 | 0.088715 | 0.068988 | 0.051707 | 0.037474 | 0.026246 | 0.017699 | 0.011414 | 0.006967 | 0.003964
m = 22 | 0.019528 | 0.056550 | 0.093438 | 0.119175 | 0.128793 | 0.123942 | 0.109770 | 0.091541 | 0.072976 | 0.056121 | 0.041819 | 0.030220 | 0.021138 | 0.014246 | 0.009185 | 0.005606
m = 23 | 0.017623 | 0.051339 | 0.085843 | 0.111396 | 0.122939 | 0.121033 | 0.109703 | 0.093602 | 0.076335 | 0.060081 | 0.045882 | 0.034065 | 0.024573 | 0.017173 | 0.011570 | 0.007458
m = 24 | 0.015968 | 0.046758 | 0.079004 | 0.104106 | 0.117084 | 0.117689 | 0.108972 | 0.094969 | 0.079090 | 0.063578 | 0.049631 | 0.037732 | 0.027949 | 0.020138 | 0.014066 | 0.009474
m = 25 | 0.014521 | 0.042715 | 0.072840 | 0.097304 | 0.111316 | 0.114035 | 0.107689 | 0.095714 | 0.081273 | 0.066613 | 0.053045 | 0.041188 | 0.031222 | 0.023092 | 0.016626 | 0.011609
m = 26 | 0.013251 | 0.039135 | 0.067280 | 0.090980 | 0.105695 | 0.110173 | 0.105952 | 0.095908 | 0.082923 | 0.069196 | 0.056115 | 0.044407 | 0.034359 | 0.025995 | 0.019207 | 0.013822
m = 27 | 0.012130 | 0.035952 | 0.062257 | 0.085113 | 0.100266 | 0.106189 | 0.103853 | 0.095622 | 0.084081 | 0.071344 | 0.058836 | 0.047373 | 0.037332 | 0.028816 | 0.021773 | 0.016077
m = 28 | 0.011138 | 0.033114 | 0.057713 | 0.079678 | 0.095057 | 0.102148 | 0.101470 | 0.094922 | 0.084794 | 0.073080 | 0.061214 | 0.050076 | 0.040123 | 0.031528 | 0.024297 | 0.018343
m = 29 | 0.010255 | 0.030574 | 0.053595 | 0.074649 | 0.090085 | 0.098107 | 0.098871 | 0.093872 | 0.085106 | 0.074430 | 0.063257 | 0.052513 | 0.042719 | 0.034113 | 0.026753 | 0.020594
m = 30 | 0.009467 | 0.028295 | 0.049856 | 0.069998 | 0.085359 | 0.094107 | 0.096116 | 0.092527 | 0.085060 | 0.075419 | 0.064978 | 0.054684 | 0.045111 | 0.036555 | 0.029122 | 0.022807
m = 31 | 0.008761 | 0.026243 | 0.046457 | 0.065697 | 0.080880 | 0.090180 | 0.093254 | 0.090941 | 0.084700 | 0.076079 | 0.066391 | 0.056592 | 0.047295 | 0.038843 | 0.031389 | 0.024964
m | P17 | P18 | P19 | P20 | P21 | P22 | P23 | P24 | P25 | P26 | P27 | P28 | P29 | P30 | P31
m = 17 | 0.000031
m = 18 | 0.000188 | 0.000023
m = 19 | 0.000552 | 0.000146 | 0.000018
m = 20 | 0.001170 | 0.000435 | 0.000115 | 0.000014
m = 21 | 0.002053 | 0.000932 | 0.000347 | 0.000092 | 0.000011
m = 22 | 0.003189 | 0.001652 | 0.000750 | 0.000279 | 0.000074 | 0.000009
m = 23 | 0.004552 | 0.002590 | 0.001341 | 0.000609 | 0.000227 | 0.000060 | 0.000008
m = 24 | 0.006107 | 0.003727 | 0.002120 | 0.001098 | 0.000499 | 0.000186 | 0.000049 | 0.000006
m = 25 | 0.007818 | 0.005039 | 0.003075 | 0.001749 | 0.000906 | 0.000412 | 0.000153 | 0.000041 | 0.000005
m = 26 | 0.009649 | 0.006497 | 0.004188 | 0.002556 | 0.001454 | 0.000753 | 0.000342 | 0.000127 | 0.000034 | 0.000004
m = 27 | 0.011565 | 0.008072 | 0.005435 | 0.003503 | 0.002138 | 0.001216 | 0.000630 | 0.000286 | 0.000106 | 0.000028 | 0.000004
m = 28 | 0.013537 | 0.009736 | 0.006795 | 0.004575 | 0.002949 | 0.001799 | 0.001024 | 0.000530 | 0.000241 | 0.000090 | 0.000024 | 0.000003
m = 29 | 0.015538 | 0.011463 | 0.008244 | 0.005753 | 0.003873 | 0.002496 | 0.001523 | 0.000867 | 0.000449 | 0.000204 | 0.000076 | 0.000020 | 0.000003
m = 30 | 0.017543 | 0.013230 | 0.009759 | 0.007017 | 0.004897 | 0.003297 | 0.002125 | 0.001297 | 0.000738 | 0.000382 | 0.000174 | 0.000065 | 0.000017 | 0.000002
m = 31 | 0.019532 | 0.015015 | 0.011321 | 0.008349 | 0.006003 | 0.004189 | 0.002820 | 0.001818 | 0.001109 | 0.000631 | 0.000327 | 0.000148 | 0.000055 | 0.000015 | 0.000002

Appendix C. Information Gain and Related Covered Range

Table A5. The IG from different numbers of MSE-RPs of P(I)(1, 5) and the related covered range.
Number of MSE-RPs | IG | Covered Number of MSE-RPs | Covered Range
m = 6 | 90% ≤ IG < 95% | 6 | (1, 4.557853)
m = 7 | 90% ≤ IG < 95% | 6 | (1, 3.122798)
m = 8 | 90% ≤ IG < 95% | 7 | (1, 3.510284)
m = 8 | 95% ≤ IG < 98% | 8 | (1, 5.850473)
m = 9 | 90% ≤ IG < 95% | 7 | (1, 2.800594)
m = 9 | 95% ≤ IG < 98% | 9 | (1, 6.495583)
m = 10 | 90% ≤ IG < 95% | 8 | (1, 3.078516)
m = 10 | 95% ≤ IG < 98% | 9 | (1, 4.284110)
m = 11 | 90% ≤ IG < 95% | 9 | (1, 3.356274)
m = 11 | 95% ≤ IG < 98% | 10 | (1, 4.670642)
m = 12 | 90% ≤ IG < 95% | 9 | (1, 2.840237)
m = 12 | 95% ≤ IG < 98% | 11 | (1, 5.056998)
m = 13 | 90% ≤ IG < 95% | 10 | (1, 3.057155)
m = 13 | 95% ≤ IG < 98% | 11 | (1, 3.911437)
m = 13 | 98% ≤ IG < 99% | 13 | (1, 9.072026)
m = 14 | 90% ≤ IG < 95% | 10 | (1, 2.688677)
m = 14 | 95% ≤ IG < 98% | 12 | (1, 4.188889)
m = 14 | 98% ≤ IG < 99% | 14 | (1, 9.715537)
m = 15 | 90% ≤ IG < 95% | 11 | (1, 2.866721)
m = 15 | 95% ≤ IG < 98% | 12 | (1, 3.490814)
m = 15 | 98% ≤ IG < 99% | 14 | (1, 6.215339)
m = 16 | 90% ≤ IG < 95% | 12 | (1, 3.044731)
m = 16 | 95% ≤ IG < 98% | 13 | (1, 3.707577)
m = 16 | 98% ≤ IG < 99% | 15 | (1, 6.601282)
m = 17 | 90% ≤ IG < 95% | 12 | (1, 2.734661)
m = 17 | 95% ≤ IG < 98% | 14 | (1, 3.924305)
m = 17 | 98% ≤ IG < 99% | 16 | (1, 6.987163)
m = 18 | 90% ≤ IG < 95% | 13 | (1, 2.885668)
m = 18 | 95% ≤ IG < 98% | 14 | (1, 3.400669)
m = 18 | 98% ≤ IG < 99% | 16 | (1, 5.298153)
m = 19 | 90% ≤ IG < 95% | 13 | (1, 2.637633)
m = 19 | 95% ≤ IG < 98% | 15 | (1, 3.578605)
m = 19 | 98% ≤ IG < 99% | 17 | (1, 5.575373)
m = 19 | IG ≥ 99% | 19 | (1, 12.93129)
m = 20 | 90% ≤ IG < 95% | 14 | (1, 2.768771)
m = 20 | 95% ≤ IG < 98% | 16 | (1, 3.756524)
m = 20 | 98% ≤ IG < 99% | 18 | (1, 5.852566)
m = 20 | IG ≥ 99% | 20 | (1, 13.574203)
Table A6. The IG from different numbers of MSE-RPs of P(III)(0, 1, 0.2) and the related covered range.
Number of MSE-RPs | IG | Covered Number of MSE-RPs | Covered Range
m = 7 | 90% ≤ IG < 95% | 7 | (0, 4.186923)
m = 8 | 90% ≤ IG < 95% | 7 | (0, 2.813247)
m = 9 | 90% ≤ IG < 95% | 8 | (0, 3.115306)
m = 9 | 95% ≤ IG < 98% | 9 | (0, 5.193419)
m = 10 | 90% ≤ IG < 95% | 8 | (0, 2.453194)
m = 10 | 95% ≤ IG < 98% | 10 | (0, 5.695577)
m = 11 | 90% ≤ IG < 95% | 9 | (0, 2.670200)
m = 11 | 95% ≤ IG < 98% | 10 | (0, 3.717980)
m = 12 | 90% ≤ IG < 95% | 9 | (0, 2.253536)
m = 12 | 95% ≤ IG < 98% | 11 | (0, 4.018854)
m = 13 | 90% ≤ IG < 95% | 10 | (0, 2.423314)
m = 13 | 95% ≤ IG < 98% | 12 | (0, 4.319518)
m = 14 | 90% ≤ IG < 95% | 11 | (0, 2.592731)
m = 14 | 95% ≤ IG < 98% | 12 | (0, 3.319267)
m = 15 | 90% ≤ IG < 95% | 11 | (0, 2.265638)
m = 15 | 95% ≤ IG < 98% | 13 | (0, 3.535249)
m = 15 | 98% ≤ IG < 99% | 15 | (0, 8.200847)
m = 16 | 90% ≤ IG < 95% | 12 | (0, 2.404934)
m = 16 | 95% ≤ IG < 98% | 14 | (0, 3.751119)
m = 16 | 98% ≤ IG < 99% | 16 | (0, 8.701244)
m = 17 | 90% ≤ IG < 95% | 13 | (0, 2.543988)
m = 17 | 95% ≤ IG < 98% | 14 | (0, 3.099726)
m = 17 | 98% ≤ IG < 99% | 16 | (0, 5.520828)
m = 18 | 90% ≤ IG < 95% | 13 | (0, 2.274345)
m = 18 | 95% ≤ IG < 98% | 15 | (0, 2.682864)
m = 18 | 98% ≤ IG < 99% | 17 | (0, 4.182610)
m = 19 | 90% ≤ IG < 95% | 14 | (0, 2.392483)
m = 19 | 95% ≤ IG < 98% | 16 | (0, 3.437139)
m = 19 | 98% ≤ IG < 99% | 18 | (0, 6.120974)
m = 20 | 90% ≤ IG < 95% | 14 | (0, 2.178159)
m = 20 | 95% ≤ IG < 98% | 16 | (0, 2.960250)
m = 20 | 98% ≤ IG < 99% | 18 | (0, 4.613867)
m = 21 | 90% ≤ IG < 95% | 15 | (0, 2.280909)
m = 21 | 95% ≤ IG < 98% | 17 | (0, 3.098813)
m = 21 | 98% ≤ IG < 99% | 19 | (0, 4.829431)
m = 21 | IG ≥ 99% | 21 | (0, 11.201569)
Table A7. The IG from different numbers of MSE-RPs of P(IV)(0, 1, 0.1, 0.5) and the related covered range.
Number of MSE-RPs | IG | Covered Number of MSE-RPs | Covered Range
m = 7 | 90% ≤ IG < 95% | 7 | (0, 4.574898)
m = 8 | 90% ≤ IG < 95% | 7 | (0, 3.071896)
m = 9 | 90% ≤ IG < 95% | 8 | (0, 3.398221)
m = 9 | 95% ≤ IG < 98% | 9 | (0, 5.663702)
m = 10 | 90% ≤ IG < 95% | 8 | (0, 2.676082)
m = 10 | 95% ≤ IG < 98% | 10 | (0, 6.206803)
m = 11 | 90% ≤ IG < 95% | 9 | (0, 2.909991)
m = 11 | 95% ≤ IG < 98% | 10 | (0, 4.049591)
m = 12 | 90% ≤ IG < 95% | 10 | (0, 3.143704)
m = 12 | 95% ≤ IG < 98% | 11 | (0, 4.374829)
m = 13 | 90% ≤ IG < 95% | 10 | (0, 2.639643)
m = 13 | 95% ≤ IG < 98% | 12 | (0, 4.699852)
m = 14 | 90% ≤ IG < 95% | 11 | (0, 2.822096)
m = 14 | 95% ≤ IG < 98% | 12 | (0, 3.610697)
m = 15 | 90% ≤ IG < 95% | 11 | (0, 2.467316)
m = 15 | 95% ≤ IG < 98% | 13 | (0, 3.844030)
m = 15 | 98% ≤ IG < 99% | 15 | (0, 8.915686)
m = 16 | 90% ≤ IG < 95% | 12 | (0, 2.617033)
m = 16 | 95% ≤ IG < 98% | 14 | (0, 4.077279)
m = 16 | 98% ≤ IG < 99% | 16 | (0, 9.456674)
m = 17 | 90% ≤ IG < 95% | 13 | (0, 2.766704)
m = 17 | 95% ≤ IG < 98% | 14 | (0, 3.369027)
m = 17 | 98% ≤ IG < 99% | 16 | (0, 5.998500)
m = 18 | 90% ≤ IG < 95% | 13 | (0, 2.474674)
m = 18 | 95% ≤ IG < 98% | 15 | (0, 3.551232)
m = 18 | 98% ≤ IG < 99% | 17 | (0, 6.322913)
m = 19 | 90% ≤ IG < 95% | 14 | (0, 2.601620)
m = 19 | 95% ≤ IG < 98% | 16 | (0, 3.733397)
m = 19 | 98% ≤ IG < 99% | 18 | (0, 6.647255)
m = 20 | 90% ≤ IG < 95% | 15 | (0, 2.728541)
m = 20 | 95% ≤ IG < 98% | 16 | (0, 3.215504)
m = 20 | 98% ≤ IG < 99% | 18 | (0, 5.009672)
m = 21 | 90% ≤ IG < 95% | 15 | (0, 2.480221)
m = 21 | 95% ≤ IG < 98% | 17 | (0, 3.365050)
m = 21 | 98% ≤ IG < 99% | 19 | (0, 5.242661)
m = 22 | 90% ≤ IG < 95% | 16 | (0, 2.590433)
m = 22 | 95% ≤ IG < 98% | 18 | (0, 3.514574)
m = 22 | 98% ≤ IG < 99% | 20 | (0, 5.475616)
m = 22 | IG ≥ 99% | 22 | (0, 12.699919)
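Given the points and probabilities tabulated above, the truncation reported in Tables A5–A7 amounts to keeping the smallest leading block of MSE-RPs whose cumulative share of the total information gain reaches a threshold; the covered range then runs from the lower endpoint of the support to the largest retained point. The sketch below shows only this bookkeeping and assumes that the per-point IG shares have already been computed by the criterion defined in the main text (function and variable names are ours, not the paper's):

import numpy as np

def ig_covered_range(rps, ig_shares, threshold, lower_bound=0.0):
    # Smallest k such that the first k MSE-RPs reach `threshold` of the
    # total IG, together with the covered range (lower_bound, rps[k-1]).
    # `rps` must be sorted ascending; `ig_shares` should sum to 1.
    cum = np.cumsum(ig_shares)
    k = min(int(np.searchsorted(cum, threshold)) + 1, len(rps))
    return k, (lower_bound, rps[k - 1])

For instance, in Table A6 the IG ≥ 99% row for m = 21 covers all 21 points, and its upper endpoint, 11.201569, is exactly RP21 of the m = 21 row in Table A1.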

Figure 1. Box-plots of the estimates of m = 10 MSE-RPs from P(I)(1, 5). Panels (a–e) show the box-plots of the estimates of each representative point obtained with the HDq quantile estimator, the Bq quantile estimator, the Qj quantile estimator, the NO quantile estimator, and the k-means method, respectively.
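Panel (a) of Figure 1 uses the Harrell–Davis estimator, which estimates the quantile at level q as a Beta-weighted average of the order statistics; an MSE-RP is then estimated by evaluating it at the probability level implied by that point's cumulative cell probability. The following is a minimal sketch of the estimator itself (our own code, assuming numpy and scipy are available):

import numpy as np
from scipy.stats import beta

def harrell_davis(sample, q):
    # Harrell-Davis estimate of the q-th quantile: weight the i-th order
    # statistic by the Beta(q(n+1), (1-q)(n+1)) mass on ((i-1)/n, i/n].
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    a, b = q * (n + 1), (1 - q) * (n + 1)
    w = np.diff(beta.cdf(np.arange(n + 1) / n, a, b))
    return float(w @ x)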
Figure 2. Case I data fitted with P(I)(1, 5) using MSE-RPs estimated by different methods. (a) Fit using the MSE-RPs covering 90% ≤ IG < 95%. (b) Fit using the MSE-RPs covering 95% ≤ IG < 98%.
Figure 3. Case II data fitted with P(IV)(0, 1, 0.1, 0.5) using MSE-RPs estimated by different methods. (a) Fit using the MSE-RPs covering 90% ≤ IG < 95%. (b) Fit using the MSE-RPs covering 95% ≤ IG < 98%.
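The k-means estimates compared in Figures 1-3 exploit the self-consistency property of MSE-RPs: each point must equal the conditional mean of its own cell. For the Pareto I distribution this conditional mean is available in closed form whenever α > 1, so a theoretical k-means (Lloyd-type) iteration can be run without any sampling. The sketch below is a minimal version of such an iteration (the initialization, iteration count, and names are our own choices, not the paper's algorithm verbatim):

import numpy as np

def pareto1_mse_rps(sigma, alpha, m, iters=500):
    # Lloyd-type iteration for m MSE-RPs of Pareto I(sigma, alpha), alpha > 1:
    # alternate midpoint cell boundaries and conditional means of the cells.
    if alpha <= 1:
        raise ValueError("conditional means require alpha > 1")

    def seg_mean(a, b):
        # E[X | a <= X < b] for the density alpha * sigma**alpha * x**(-alpha-1);
        # the sigma**alpha factor cancels between numerator and denominator.
        hi_mass = 0.0 if np.isinf(b) else b ** (-alpha)
        hi_num = 0.0 if np.isinf(b) else b ** (1.0 - alpha)
        return (alpha / (alpha - 1.0)) * (a ** (1.0 - alpha) - hi_num) / (a ** (-alpha) - hi_mass)

    # start from equally spaced quantiles of the fitted distribution
    probs = (np.arange(m) + 0.5) / m
    pts = sigma * (1.0 - probs) ** (-1.0 / alpha)
    for _ in range(iters):
        bounds = np.concatenate(([sigma], (pts[:-1] + pts[1:]) / 2.0, [np.inf]))
        pts = np.array([seg_mean(bounds[i], bounds[i + 1]) for i in range(m)])
    return pts

Each pass recomputes the cell boundaries as midpoints of adjacent points and replaces each point by the conditional mean of its cell; the self-consistent limit of this iteration is the set of MSE-RPs.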
Table 1. Comparison of the mean absolute bias from 1000 simulations of MSE-RP estimation, for sample sizes n = 20, 50, 100 and numbers of MSE-RPs m = 5, 10, 17, across four types of Pareto distributions and different estimation methods.
P(I)(1, 5)
n | m | k-means | HDq | Bq | Qj | NO | MLE | ME
n = 20 | m = 5 | 0.6244 | 0.7308 | 0.5248 | 0.5949 | 0.5831 | 0.0215 | 0.0926
n = 20 | m = 10 | 1.1965 | 1.3143 | 0.9585 | 0.9729 | 1.0274 | 0.0260 | 0.1136
n = 50 | m = 5 | 0.4999 | 0.4832 | 0.4027 | 0.4299 | 0.4524 | 0.0084 | 0.0588
n = 50 | m = 10 | 1.0495 | 1.1282 | 0.8115 | 0.8270 | 0.8827 | 0.0098 | 0.0734
n = 50 | m = 17 | 1.5442 | 1.4437 | 1.1951 | 1.2309 | 1.2693 | 0.0122 | 0.0830
n = 100 | m = 5 | 0.4330 | 0.3619 | 0.3340 | 0.4179 | 0.3675 | 0.0040 | 0.0403
n = 100 | m = 10 | 0.9402 | 0.9950 | 0.7154 | 0.7582 | 0.7707 | 0.0005 | 0.0165
n = 100 | m = 17 | 1.4375 | 1.3352 | 1.0844 | 1.1220 | 1.1584 | 0.0061 | 0.0582
P(II)(0, 1, 5)
n | m | k-means | HDq | Bq | Qj | NO | MLE | ME
n = 20 | m = 5 | 0.6327 | 0.6391 | 0.5318 | 0.5909 | 0.5881 | 0.3004 | 0.2449
n = 20 | m = 10 | 1.2019 | 1.1455 | 0.9519 | 0.9672 | 1.0334 | 0.4343 | 0.3501
n = 50 | m = 5 | 0.5001 | 0.4517 | 0.4005 | 0.4242 | 0.4579 | 0.1676 | 0.1480
n = 50 | m = 10 | 1.0482 | 1.0016 | 0.8047 | 0.8201 | 0.8769 | 0.2341 | 0.2215
n = 50 | m = 17 | 1.2133 | 1.2449 | 0.9206 | 0.9530 | 0.9812 | 0.3084 | 0.2775
n = 100 | m = 5 | 0.4191 | 0.3445 | 0.3212 | 0.4027 | 0.3611 | 0.1116 | 0.1074
n = 100 | m = 10 | 0.9367 | 0.8971 | 0.7084 | 0.7551 | 0.7698 | 0.1646 | 0.1542
n = 100 | m = 17 | 1.1286 | 1.1215 | 0.8195 | 0.8511 | 0.8836 | 0.2091 | 0.1985
P(III)(0, 1, 0.2)
n | m | k-means | HDq | Bq | Qj | NO | MLE | ME
n = 20 | m = 5 | 0.4507 | 0.3973 | 0.3499 | 0.3798 | 0.3850 | 0.4682 | 0.5153
n = 20 | m = 10 | 0.8443 | 0.8867 | 0.6548 | 0.6767 | 0.7055 | 0.4795 | 0.5273
n = 50 | m = 5 | 0.3491 | 0.2579 | 0.2463 | 0.2586 | 0.2781 | 0.4039 | 0.4199
n = 50 | m = 10 | 0.7155 | 0.7263 | 0.5196 | 0.5378 | 0.5706 | 0.4092 | 0.4314
n = 50 | m = 17 | 1.0939 | 1.0846 | 0.8075 | 0.8303 | 0.8636 | 0.4078 | 0.4300
n = 100 | m = 5 | 0.3079 | 0.1938 | 0.1867 | 0.1975 | 0.2059 | 0.4682 | 0.5153
n = 100 | m = 10 | 0.6184 | 0.6178 | 0.4380 | 0.4727 | 0.4813 | 0.3542 | 0.3651
n = 100 | m = 17 | 0.9937 | 0.9762 | 0.7107 | 0.7345 | 0.7623 | 0.3617 | 0.3726
P(IV)(0, 1, 0.1, 0.5)
n | m | k-means | HDq | Bq | Qj | NO | MLE | ME
n = 20 | m = 5 | 0.5168 | 0.5184 | 0.4140 | 0.4325 | 0.4533 | 0.7317 | 0.7478
n = 20 | m = 10 | 0.9466 | 1.0216 | 0.7395 | 0.7775 | 0.8026 | 0.7453 | 0.7638
n = 50 | m = 5 | 0.4065 | 0.3319 | 0.3038 | 0.3693 | 0.3374 | 0.6847 | 0.6920
n = 50 | m = 10 | 0.8222 | 0.8568 | 0.6170 | 0.6563 | 0.6682 | 0.6844 | 0.6917
n = 50 | m = 17 | 1.0939 | 1.0846 | 0.8075 | 0.8303 | 0.8636 | 0.6726 | 0.6849
n = 100 | m = 5 | 0.3566 | 0.2479 | 0.2382 | 0.2404 | 0.2641 | 0.6376 | 0.6418
n = 100 | m = 10 | 0.7219 | 0.7425 | 0.5237 | 0.5565 | 0.5741 | 0.6381 | 0.6431
n = 100 | m = 17 | 0.9937 | 0.9762 | 0.7107 | 0.7345 | 0.7623 | 0.6412 | 0.6470
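Table 1 shows that for P(I) and P(II) the parametric plug-in estimates (the MLE and ME columns) achieve far smaller bias than the direct sample-based estimators, whereas for P(III) and P(IV) the Bq quantile estimator is the better choice. For Pareto I the plug-in route is particularly simple because the MLE is available in closed form; the following is a minimal sketch combining it with the Lloyd-type iteration shown after Figure 3 (function names are ours):

import numpy as np

def pareto1_mle(x):
    # Closed-form MLE for Pareto I: sigma_hat = min(x),
    # alpha_hat = n / sum(log(x_i / sigma_hat)).
    x = np.asarray(x, dtype=float)
    sigma_hat = x.min()
    alpha_hat = x.size / np.log(x / sigma_hat).sum()
    return sigma_hat, alpha_hat

# Plug-in MSE-RP estimation for a sample:
#   sigma_hat, alpha_hat = pareto1_mle(sample)
#   rps_hat = pareto1_mse_rps(sigma_hat, alpha_hat, m=10)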
Table 2. Comparison of the mean absolute bias from 1000 simulations of MSE-RP estimation at three levels of IG with m = 17.
P(I)(1, 5)
IG | n | k-means | HDq | Bq | Qj | NO | MLE | ME
90% ≤ IG < 95% | n = 100 | 0.3677 | 0.1373 | 0.1145 | 0.1796 | 0.1147 | 0.0033 | 0.0321
90% ≤ IG < 95% | n = 50 | 0.4245 | 0.1725 | 0.1534 | 0.2062 | 0.1583 | 0.0067 | 0.0458
95% ≤ IG < 98% | n = 100 | 0.5693 | 0.2572 | 0.2333 | 0.2898 | 0.2552 | 0.0039 | 0.0375
95% ≤ IG < 98% | n = 50 | 0.6434 | 0.3426 | 0.3029 | 0.3466 | 0.3294 | 0.0079 | 0.0535
IG ≥ 98% | n = 100 | 0.9768 | 0.8248 | 0.5915 | 0.6400 | 0.6532 | 0.0049 | 0.0475
IG ≥ 98% | n = 50 | 1.0733 | 0.9683 | 0.6977 | 0.7355 | 0.7558 | 0.0100 | 0.0678
P(II)(0, 1, 5)
IG | n | k-means | HDq | Bq | Qj | NO | MLE | ME
90% ≤ IG < 95% | n = 100 | 0.3684 | 0.1347 | 0.1172 | 0.1723 | 0.1195 | 0.0671 | 0.0643
90% ≤ IG < 95% | n = 50 | 0.4244 | 0.1773 | 0.1577 | 0.2112 | 0.1644 | 0.0988 | 0.0901
95% ≤ IG < 98% | n = 100 | 0.5708 | 0.2581 | 0.2412 | 0.2879 | 0.2645 | 0.0966 | 0.0922
95% ≤ IG < 98% | n = 50 | 0.6430 | 0.3354 | 0.3053 | 0.3492 | 0.3367 | 0.1424 | 0.1290
IG ≥ 98% | n = 100 | 0.9805 | 0.7551 | 0.6055 | 0.6455 | 0.6630 | 0.1512 | 0.1438
IG ≥ 98% | n = 50 | 1.0725 | 0.8693 | 0.7017 | 0.7396 | 0.7624 | 0.2229 | 0.2011
P(III)(0, 1, 0.2)
IG | n | k-means | HDq | Bq | Qj | NO | MLE | ME
90% ≤ IG < 95% | n = 100 | 0.3074 | 0.1055 | 0.0920 | 0.1225 | 0.0921 | 0.3624 | 0.3682
90% ≤ IG < 95% | n = 50 | 0.3525 | 0.1324 | 0.1213 | 0.1539 | 0.1268 | 0.4078 | 0.4206
95% ≤ IG < 98% | n = 100 | 0.3817 | 0.1469 | 0.1331 | 0.1638 | 0.1382 | 0.3618 | 0.3682
95% ≤ IG < 98% | n = 50 | 0.4363 | 0.1820 | 0.1723 | 0.2013 | 0.1866 | 0.4065 | 0.4207
IG ≥ 98% | n = 100 | 0.6628 | 0.4709 | 0.3625 | 0.3878 | 0.3964 | 0.3604 | 0.3689
IG ≥ 98% | n = 50 | 0.7429 | 0.5813 | 0.4387 | 0.4629 | 0.4809 | 0.4045 | 0.4227
P(IV)(0, 1, 0.1, 0.5)
IG | n | k-means | HDq | Bq | Qj | NO | MLE | ME
90% ≤ IG < 95% | n = 100 | 0.3424 | 0.1203 | 0.1061 | 0.1497 | 0.1094 | 0.6424 | 0.6457
90% ≤ IG < 95% | n = 50 | 0.3816 | 0.1615 | 0.1454 | 0.1898 | 0.1527 | 0.6781 | 0.6850
95% ≤ IG < 98% | n = 100 | 0.4310 | 0.1691 | 0.1563 | 0.1964 | 0.1694 | 0.6422 | 0.6459
95% ≤ IG < 98% | n = 50 | 0.4785 | 0.2248 | 0.2082 | 0.2486 | 0.2257 | 0.6773 | 0.6848
IG ≥ 98% | n = 100 | 0.7559 | 0.5726 | 0.4279 | 0.4615 | 0.4746 | 0.6417 | 0.6464
IG ≥ 98% | n = 50 | 0.8246 | 0.7006 | 0.5137 | 0.5482 | 0.5595 | 0.6748 | 0.6844