Entropy | Article | Open Access | 24 November 2025
Representative Points of the Inverse Gaussian Distribution and Their Applications

1 Faculty of Science and Technology, Beijing Normal-Hong Kong Baptist University, Zhuhai 519087, China
2 The Key Lab of Random Complex Structures and Data Analysis, The Chinese Academy of Sciences, Beijing 100045, China
3 Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science (IRADS), Beijing Normal-Hong Kong Baptist University, 2000 Jintong Road, Zhuhai 519088, China
* Author to whom correspondence should be addressed.
This article belongs to the Section Information Theory, Probability and Statistics

Abstract

The inverse Gaussian (IG) distribution, as an important class of skewed continuous distributions, is widely applied in fields such as lifetime testing, financial modeling, and volatility analysis. This paper makes two primary contributions to the statistical inference of the IG distribution. First, a systematic investigation is presented, for the first time, into three types of representative points (RPs)—Monte Carlo (MC-RPs), quasi-Monte Carlo (QMC-RPs), and mean square error RPs (MSE-RPs)—as a tool for the efficient discrete approximation of the IG distribution, thereby addressing the common scenario where practical data is discrete or requires discretization. The performance of these RPs is thoroughly examined in applications such as low-order moment estimation, density function approximation, and resampling. Simulation results demonstrate that the MSE-RPs consistently outperform the other two types in terms of approximation accuracy and robustness. Second, the Harrell–Davis (HD) and three Sfakianakis–Verginis (SV1, SV2, SV3) quantile estimators are introduced to enhance the representativeness of samples from the IG distribution, thereby significantly improving the accuracy of parameter estimation. Moreover, case studies based on real-world data confirm the effectiveness and practical utility of this quantile estimator methodology.

1. Introduction

Statistical distributions hold a pivotal position in information theory, as they outline the probabilistic features of data or signals, thereby directly influencing the precision and effectiveness of information representation, transmission, compression, and reconstruction. Entropy, being the foremost metric in the realm of information theory, relies on the statistical distribution of the random variable. Numerous applications in information theory necessitate the presumption of the data’s statistical distribution. While the normal distribution is commonly adopted in most statistical analyses owing to its mathematical ease and broad applicability, real-world data often display skewness, prompting the need for more adaptable models. Brownian motion is a widely used model for stochastic processes. In 1915, Schrödinger [] described the probability distribution of the first passage time in Brownian motion. Thirty years later, in 1945, Tweedie [] gave the inverse relationship between the cumulant generating function of the first passage time distribution and that of the normal distribution, and named it the inverse Gaussian (IG) distribution. Then, in 1947, Wald [] derived the distribution as a limiting form for the sample size distribution in a sequential probability ratio test, leading to it being known in the Russian literature as the Wald distribution. The IG distribution is discussed in various books on stochastic processes and probability theory, such as Cox and Miller [] and Bartlett []. This distribution is also known as the Gaussian first passage time distribution [], and sometimes as the first passage time distribution of Brownian motion with positive drift []. The IG distribution is suitable for modeling asymmetric data due to its skewness and relationship to Brownian motion. Folks and Chhikara [] and Chhikara and Folks [] have conducted a comprehensive examination of the mathematical and statistical properties of this distribution.
The interpretation of the inverse Gaussian random variable as a first passage time indicates its potential usefulness in examining lifetimes or the frequency of event occurrences across various fields. Chhikara and Folks [] suggested that the IG distribution is a useful model for capturing the early occurrence of events like failures or repairs in the lifetime of industrial products. Iyengar and Patwardhan [] indicated the possibility of using the IG distribution as a useful way to model failure times of equipment. The IG distribution has found applications in various fields. For example, Kourogiorgas et al. [] applied the IG distribution to a new rain attenuation time series synthesizer for Earth–space links, which enables accurate evaluation of how satellite communication networks are operating. In insurance and risk analysis, Punzo [] used the IG distribution to model bodily injury claims and to analyze economic data concerning Italian households’ incomes. In traffic engineering, Krbálek et al. [] used IG distribution models for vehicular flow.
In statistics, for an unknown continuous statistical distribution, using an empirical distribution of random samples is a conventional approach to approximate the target distribution. Nevertheless, this method frequently results in low accuracy. Thus, to retain as much information of the target distribution as possible, the support points for the discrete approximation, also called representative points (RPs), are investigated. For a comprehensive review of RPs, one may refer to Fang and Pan []. Representative points hold significant potential for applications in statistical simulation and inference. One promising application is a new approach to simulation and resampling that integrates number-theoretic methods. Li et al. [] proposed moment-constrained mean square error representative samples (MCM-RS), which are generated from a continuous distribution via a method that minimizes the mean square error (MSE) between the sample and the original distribution while ensuring the first two sample moments match preset values. Peng et al. [] utilized representative points from a Beta distribution to weight patient origin data, thereby constructing the Patient Regional Index (PRI). This study serves as a practical application of representative points from different distributions. In the existing literature, various types of representative points corresponding to different statistical distributions have been investigated. Especially for complex distributions, the study of their representative points is indispensable. Related work includes discretizing a mixture of normal distributions (MixN) by a fixed number of points under minimum mean squared error, MSE-RPs of Pareto distributions and their estimation, and RPs of the generalized alpha skew-t distribution with applications []. To the best of our knowledge, the representative points of the inverse Gaussian distribution have not yet been investigated, despite their potential utility. Therefore, we investigate three types of representative points for the inverse Gaussian distribution and explore their applications in the estimation of moments and the density function, as well as in resampling.
This paper aims to investigate the applications of RPs as well as a dedicated parameter estimation approach based on nonparametric quantile estimators for the IG distribution. The main contributions of this work are summarized as follows:
1.
We establish, to the best of our knowledge, the first systematic comparative framework for three distinct types of representative points—Monte Carlo (MC-RPs), quasi-Monte Carlo (QMC-RPs), and mean square error RPs (MSE-RPs)—specifically on the Inverse Gaussian distribution. This framework provides a benchmark for evaluating discrete approximation quality in terms of moment estimation, density approximation, and resampling efficiency.
2.
We introduce a novel parameter estimation methodology for the IG distribution by employing the Harrell–Davis (HD) [] and Sfakianakis–Verginis (SV1, SV2, SV3) [] quantile estimators. This constitutes the first application and comprehensive demonstration of these estimators for enhancing sample representativeness and significantly improving the accuracy of IG parameter estimation.
3.
Through extensive simulations and real-world case studies, we provide a comprehensive performance analysis. Our results not only conclusively demonstrate the superiority of MSE-RPs in approximation tasks but also validate the practical utility and effectiveness of the proposed quantile-based estimation framework.
The rest of this paper is organized as follows. Section 2 introduces the fundamental properties of the inverse Gaussian distribution. Section 3 details the generation of representative points and their applications in statistical simulation. Section 4 analyzes resampling methods based on representative points for the IG distribution. Section 5 discusses parameter estimation for the IG distribution using samples enhanced by the introduced nonparametric quantile estimators. A real-data case study is presented in Section 6. Finally, Section 7 provides the conclusions and future research directions.

2. Basic Properties of the IG Distribution

Definition 1.
Consider a random variable X following the IG distribution, denoted as $X \sim IG(\mu, \lambda)$. The probability density function (pdf) of X is defined as
$$f(x; \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}} \exp\left\{ -\frac{\lambda (x - \mu)^2}{2 \mu^2 x} \right\},$$
where $x > 0$, $\mu > 0$, and $\lambda > 0$. Denote its distribution function by $F_{IG}(x; \mu, \lambda)$.
The mean of the IG distribution is $\mu$, and the variance is $\mu^3/\lambda$ []. The maximum likelihood estimates (MLEs) of $\mu$ and $\lambda$ are
$$\hat{\mu}_{ML} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \hat{\lambda}_{ML} = n \Big/ \sum_{i=1}^{n} \left( \frac{1}{X_i} - \frac{1}{\bar{X}} \right),$$
where $X_1, \ldots, X_n$ is a random sample from $IG(\mu, \lambda)$. Furthermore, it is also known that $\bar{X} \sim IG(\mu, n\lambda)$, that $\lambda \sum_{i=1}^{n} (1/X_i - 1/\bar{X}) \sim \chi^2_{n-1}$, and that $\bar{X}$ and $\lambda \sum_{i=1}^{n} (1/X_i - 1/\bar{X})$ are independent []. Folks and Chhikara [] proved that the uniformly minimum variance unbiased estimators for $\mu$ and $\lambda$ are
$$\hat{\mu}_{UMVUE} = \bar{X}, \qquad \hat{\lambda}_{UMVUE} = (n - 3) \Big/ \sum_{i=1}^{n} \left( \frac{1}{X_i} - \frac{1}{\bar{X}} \right).$$
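As a concrete illustration, these closed-form estimators translate directly into code. The following Python sketch is our own (function names and NumPy usage are not from the paper) and assumes a one-dimensional array of positive observations:

```python
import numpy as np

def ig_mle(x):
    """MLE of (mu, lambda) for IG(mu, lambda) from a 1-D sample x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mu_hat = x.mean()
    lam_hat = n / np.sum(1.0 / x - 1.0 / mu_hat)
    return mu_hat, lam_hat

def ig_umvue(x):
    """UMVUE of (mu, lambda); the lambda estimate requires n > 3."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mu_hat = x.mean()
    lam_hat = (n - 3) / np.sum(1.0 / x - 1.0 / mu_hat)
    return mu_hat, lam_hat
```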
The shapes of the IG distributions with varying parameter sets are depicted in Figure 1. In Figure 1a, where μ is fixed, lower values of λ produce a high, sharp peak near the origin followed by a steep decline in probability. As λ increases, the peak becomes less pronounced and the density becomes flatter, with mass shifting away from the origin toward the mean. From Figure 1b, it can be seen that as μ increases, the peak value of the density curve gradually decreases, and the peak position moves in the positive direction of the x-axis. At the same time, the entire curve extends to the right, reflecting the influence of the mean μ of the inverse Gaussian distribution on the location and shape of the distribution: the larger μ is, the further to the right the center of the distribution lies, and the flatter the peak becomes.
Figure 1. (a) Density functions of IG ( μ , λ ) for fixed μ and increasing λ . (b) Density functions of IG ( μ , λ ) for fixed λ and increasing μ .

3. Representative Points of IG Distribution

Three types of representative points of IG ( μ , λ ) will be discussed in this section, namely MC-RPs, QMC-RPs, and MSE-RPs, the latter obtained from the parametric k-means algorithm, the NTLBG algorithm, and by solving a system of nonlinear equations (the Fang–He algorithm []). Furthermore, we also discuss the applications of representative points to moment estimation and density estimation.

3.1. Three Types of Representative Points

3.1.1. MC-RPs

Let $X \sim F(x; \theta)$ be a random vector, where $\theta$ denotes the parameters. In conventional statistics, inferences about the population are drawn using an independent and identically distributed random sample $x_1, \ldots, x_k$ from F. The empirical distribution is defined as
$$F_k(x) = \frac{1}{k} \sum_{i=1}^{k} I\{x_i \le x\},$$
where $I_A$ is the indicator function of the set A. This is a discrete distribution assigning probability $1/k$ to each sample point, serving as a consistent approximation to $F(x)$.
In statistical simulation, let $Y_{MC}$ denote the random samples generated computationally via Monte Carlo methods; we denote this as $Y_{MC} \sim F_k(y)$. Efron [] extended this idea into the bootstrap method, where samples are drawn from the empirical distribution $F_k$ rather than from F.
Although widely used, the MC method has limited efficiency due to the slow convergence rate $O_p(1/\sqrt{k})$ of $F_k(x)$ to $F(x)$. This slow convergence leads to suboptimal performance in numerical integration, motivating alternative approaches.

3.1.2. QMC-RPs

Consider computing a high-dimensional integral in the canonical form:
$$I(g) = \int_0^1 \cdots \int_0^1 g(x_1, \ldots, x_d)\, dx_1 \cdots dx_d = \int_{C^d} g(\mathbf{x})\, d\mathbf{x},$$
where g is a continuous function defined on $C^d = [0, 1]^d$. Monte Carlo methods approximate $I(g)$ using random samples from $U(C^d)$, achieving a convergence rate of $O_p(1/\sqrt{k})$. Quasi-Monte Carlo (QMC) methods improve on this by constructing point sets $\mathcal{Y} = \{y_1, \ldots, y_k\}$ that are evenly dispersed over $C^d$, achieving a convergence rate of $O(k^{-1} (\log k)^d)$. For the theoretical foundations and methodologies of QMC, one may consult the works of Hua and Wang [] as well as Niederreiter []. In earlier research, the star discrepancy was frequently used by many scholars as a metric to assess the uniformity of $\mathcal{Y}$ within $C^d$. The star discrepancy is defined as
$$D(G, G^{(k)}) = \sup_{\mathbf{x} \in \mathbb{R}^d} \left| G(\mathbf{x}) - G^{(k)}(\mathbf{x}) \right|,$$
where $G(\mathbf{x})$ represents the cumulative distribution function (cdf) of $U(C^d)$ and $G^{(k)}(\mathbf{x})$ is the empirical distribution corresponding to $\mathcal{Y}$. An optimal set $\mathcal{Y}$ minimizes $D(G, G^{(k)})$. In such a case, the points in $\mathcal{Y}$ are termed QMC-RPs, which serve as support points of $G^{(k)}(y)$ with each point having an equal probability of $1/k$.
The optimality of the point set $\{q_i\} = \{(2i - 1)/(2k)\}$ has a profound theoretical foundation. As demonstrated in Examples 1.1 and 1.3 of the seminal work by Fang and Wang [], this set achieves the lowest star-discrepancy, establishing its optimality under a prominent criterion in quasi-Monte Carlo methods for any continuous distribution. Meanwhile, from the perspective of statistical distance minimization, Barbiero and Hitaj [] presented the following fundamental result, proving that the same point set is also optimal under the Cramér–von Mises criterion.
Theorem 1.
Let $F(x)$ be a strictly increasing and continuous cumulative distribution function, and let k be a positive integer. Consider the set $\mathcal{F}_k$ of all discrete distributions with k support points, with cumulative distribution function $\hat{F}(x)$.
Then, for any $r > 0$, the unique optimal discrete distribution $\hat{F} \in \mathcal{F}_k$ that minimizes the Cramér–von Mises distance family
$$d_r(F, \hat{F}) = \int_0^1 \left| t - \hat{F}\big(F^{-1}(t)\big) \right|^r dt$$
is given by the following:
Support points (quantization points): $x_i = F^{-1}(q_i)$, where $q_i = \frac{2i - 1}{2k}$, for $i = 1, \ldots, k$.
Probabilities (weights): $p_i = \frac{1}{k}$, for $i = 1, \ldots, k$ (i.e., a discrete uniform distribution).
In this paper, we construct the QMC-RPs for the distribution $F_{IG}(x; \mu, \lambda)$ by applying the above theoretical result. Specifically, the support points are generated via the inverse transformation method using the optimal point set
$$b_j = F_{IG}^{-1}\left( \frac{2j - 1}{2k} \right), \quad j = 1, \ldots, k,$$
with corresponding probability $P(Y = b_j) = \frac{1}{k}$ [16,32]. Here, $F_{IG}^{-1}(y)$ is the inverse function of $F_{IG}(x)$, and the point set $\left\{ \frac{2j - 1}{2k},\; j = 1, \ldots, k \right\}$ is uniformly scattered on the interval (0, 1) and, as established above, optimal in both the star-discrepancy and Cramér–von Mises senses.
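A minimal sketch of this construction in Python follows; the function name is ours. Note that SciPy's invgauss uses a different parameterization: $X \sim IG(\mu, \lambda)$ corresponds to invgauss(mu/lam, scale=lam), a standard mapping that is worth verifying locally.

```python
import numpy as np
from scipy.stats import invgauss

def qmc_rps(mu, lam, k):
    """QMC-RPs b_j = F^{-1}((2j-1)/(2k)) of IG(mu, lam), each with weight 1/k."""
    q = (2 * np.arange(1, k + 1) - 1) / (2 * k)   # optimal uniform point set on (0, 1)
    b = invgauss.ppf(q, mu / lam, scale=lam)      # support points via inverse transform
    p = np.full(k, 1.0 / k)
    return b, p

# e.g. qmc_rps(1.0, 1.0, 5) can be compared with the k = 5 QMC-RPs of IG(1, 1)
# tabulated in Appendix A (Table A6).
```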

3.1.3. MSE-RPs

MSE-RPs, also known as principal points [], are representative points designed to minimize the mean square error between the distribution and its discrete approximation. MSE-RPs were independently proposed by Cox [], Fang and He [], and many others. For a random variable $X \sim IG(\mu, \lambda)$ with density $f(x)$, MSE-RPs are constructed as follows:
Take $-\infty < b_1 < b_2 < \cdots < b_k < \infty$ and define a stepwise function
$$Q_b(x) = b_i, \quad \text{when } a_i < x \le a_{i+1},\; i = 1, \ldots, k,$$
where $a_1 = -\infty$, $a_i = (b_i + b_{i-1})/2$ for $i = 2, \ldots, k$, and $a_{k+1} = \infty$. Define the mean square error (MSE), measuring the deviation between $F(x)$ and $Q_b(x)$, as follows:
$$\mathrm{MSE}(\mathbf{b}) = \mathrm{MSE}(b_1, \ldots, b_k) = \frac{1}{\sigma^2} E\big[X - Q_b(X)\big]^2 = \frac{1}{\sigma^2} \int_{-\infty}^{+\infty} \min_i (x - b_i)^2 f(x)\, dx = \frac{1}{\sigma^2} \sum_{i=1}^{k} \int_{a_i}^{a_{i+1}} (x - b_i)^2 f(x)\, dx.$$
The vector $\mathbf{b}^* = (b_1^*, \ldots, b_k^*)$ at which $\mathrm{MSE}(\mathbf{b})$ attains its minimum gives the MSE-RPs of $F_{IG}(x; \mu, \lambda)$. The probability attached to each representative point is
$$P(Q_b(X) = b_i) = p_i, \quad i = 1, \ldots, k,$$
where
$$p_1 = \int_{-\infty}^{\frac{b_1 + b_2}{2}} f(x)\, dx = \int_{a_1}^{a_2} f(x)\, dx, \qquad p_i = \int_{\frac{b_{i-1} + b_i}{2}}^{\frac{b_i + b_{i+1}}{2}} f(x)\, dx = \int_{a_i}^{a_{i+1}} f(x)\, dx, \quad i = 2, \ldots, k - 1, \qquad p_k = \int_{\frac{b_{k-1} + b_k}{2}}^{\infty} f(x)\, dx = \int_{a_k}^{a_{k+1}} f(x)\, dx.$$
In this paper, we use the following three main different approaches to generate MSE-RPs:
(a) NTLBG algorithm: Combines QMC methods with the k-means-based LBG algorithm for univariate distributions [].
(b) Parametric k-means algorithm: An iterative method (Lloyd’s algorithm) for finding RPs of continuous univariate distributions [,].
(c) Fang–He algorithm: Solves a system of nonlinear equations to find highly accurate MSE-RPs, though computationally intensive for large k [].
For convenience, we use PKM-RPs, NTLBG-RPs, and FH-RPs to denote RPs from the parametric k-means algorithm, the NTLBG algorithm, and the Fang–He algorithm [] of the IG( μ , λ ), respectively.
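To make the parametric k-means idea concrete, here is a hedged sketch of a Lloyd-type iteration for MSE-RPs of IG(μ, λ). The implementation choices are ours (QMC-RPs as the starting configuration, numerical quadrature for the cell means, a fixed iteration count); it is not the exact PKM, NTLBG, or Fang–He code used in the paper.

```python
import numpy as np
from scipy.stats import invgauss
from scipy.integrate import quad

def lloyd_mse_rps(mu, lam, k, n_iter=200):
    """Approximate MSE-RPs of IG(mu, lam) by Lloyd's algorithm."""
    dist = invgauss(mu / lam, scale=lam)  # IG(mu, lam) in SciPy's parameterization
    # start from the QMC-RPs b_j = F^{-1}((2j-1)/(2k))
    b = dist.ppf((2 * np.arange(1, k + 1) - 1) / (2 * k))
    for _ in range(n_iter):
        # cell boundaries a_i: midpoints between adjacent points (support starts at 0)
        a = np.concatenate(([0.0], (b[:-1] + b[1:]) / 2, [np.inf]))
        probs = dist.cdf(a[1:]) - dist.cdf(a[:-1])
        # update each b_i to the conditional mean of X on its cell (a_i, a_{i+1}]
        b = np.array([quad(lambda x: x * dist.pdf(x), a[i], a[i + 1])[0] / probs[i]
                      for i in range(k)])
    return b, probs
```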

3.2. Lower Moments Estimation Based on RPs of IG Distribution

In this subsection, we consider estimation of the mean, variance, skewness, and kurtosis of $X \sim IG(\mu, \lambda)$. The QMC-RPs and the MSE-RPs (the latter comprising PKM-RPs, FH-RPs, and NTLBG-RPs) described in the previous subsection are used for simulation. To save space in the main text, the specific numerical results of these representative points under different sizes ( k = 5 , 10 , 15 , 20 , 25 , 28 , 30 ) are provided in Appendix A, to which readers may refer for the detailed data.
We obtained the mean and variance of $X \sim IG(\mu, \lambda)$ in Section 2: $E(X) = \mu$, $\mathrm{Var}(X) = \mu^3/\lambda$. Denote the skewness and kurtosis of X by
$$\mathrm{Sk}(X) = \frac{E(X - \mu)^3}{\mathrm{Var}(X)^{3/2}}, \qquad \mathrm{Ku}(X) = \frac{E(X - \mu)^4}{\mathrm{Var}(X)^2} - 3.$$
It can be concluded that
$$\mathrm{Sk}(X) = 3\sqrt{\mu/\lambda}, \qquad \mathrm{Ku}(X) = 15\mu/\lambda.$$
Consider a group of representative points $\mathbf{b} = \{b_1, \ldots, b_k\}$ from $IG(\mu, \lambda)$, where $b_i$ occurs with probability $p_i$. The corresponding statistics are
$$E(\mathbf{b}) = \sum_{i=1}^{k} b_i p_i, \quad \mathrm{Var}(\mathbf{b}) = \sum_{i=1}^{k} (b_i - \mu_b)^2 p_i, \quad \mathrm{Sk}(\mathbf{b}) = \frac{1}{\mathrm{Var}(\mathbf{b})^{3/2}} \sum_{i=1}^{k} (b_i - \mu_b)^3 p_i, \quad \mathrm{Ku}(\mathbf{b}) = \frac{1}{\mathrm{Var}(\mathbf{b})^{2}} \sum_{i=1}^{k} (b_i - \mu_b)^4 p_i - 3,$$
where $\mu_b = E(\mathbf{b})$.
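These discrete moment formulas are straightforward to evaluate. The following Python sketch (function name ours) computes all four statistics for a given RP set; their biases against the theoretical values of IG(1, 1) (mean 1, variance 1, skewness 3, kurtosis 15) are what Table 1 reports.

```python
import numpy as np

def rp_moments(b, p):
    """Mean, variance, skewness, and excess kurtosis of the discrete RP distribution."""
    b, p = np.asarray(b, dtype=float), np.asarray(p, dtype=float)
    mean = np.sum(b * p)
    var = np.sum((b - mean) ** 2 * p)
    skew = np.sum((b - mean) ** 3 * p) / var ** 1.5
    kurt = np.sum((b - mean) ** 4 * p) / var ** 2 - 3.0
    return mean, var, skew, kurt
```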
In the following comparisons, we employ MC-RPs, QMC-RPs, FH-RPs, PKM-RPs, and NTLBG-RPs from IG(1, 1) and consider the point-set sizes n = 5 , 10 , 15 , 20 , 25 , 28 , 30 . The corresponding mean, variance, skewness, and kurtosis are 1, 1, 3, and 15, respectively. For each statistic (mean, variance, skewness, and kurtosis) there are five estimators and five corresponding biases; Table 1 shows the numerical results. Note that MC-RPs consist of random samples of size n. For the sake of fair comparison, we generate 10 samples, each of size n, and then take the average of the estimated statistics as the result of MC-RPs (10).
Table 1. Estimation bias of variance, kurtosis, mean, and skewness.
We have marked the RP methods with the smallest absolute bias in boldface in Table 1. From the results in Table 1 we may make the following observations: (1) the estimators of FH-RPs and PKM-RPs are more accurate than those of MC-RPs (10), QMC-RPs, and NTLBG-RPs; (2) FH-RPs and PKM-RPs show the same performance in the estimation of these four statistics. In fact, the bias values in Table 1 are rounded to four decimal places, which makes FH-RPs and PKM-RPs appear to perform identically. However, their performance is not exactly the same: if the bias values are rounded to eight decimal places, it can be observed that FH-RPs perform slightly better than PKM-RPs.

3.3. Density Estimation via RPs of IG Distribution

In this section, we estimate the density function of the inverse Gaussian distribution IG(1, 1), choose n = 30 as the size of the input data, and compare the four RP methods (QMC-RPs, FH-RPs, PKM-RPs, and NTLBG-RPs). It is noteworthy that for density estimation based on MC-RPs, owing to the inherent randomness of the Monte Carlo method, the density curve obtained from each fitting varies significantly. Therefore, the results of MC-RPs-based density estimation are not presented here.
Rosenblatt [] and Parzen [] provide a way to estimate the density function from a set of samples $x_1, \ldots, x_n$. The estimator
$$\hat{p}_h(x) = \frac{1}{n} \sum_{i=1}^{n} k_h(x - x_i) = \frac{1}{nh} \sum_{i=1}^{n} \mathrm{ker}\!\left( \frac{x - x_i}{h} \right)$$
is called the kernel density estimate, where $\mathrm{ker}(\cdot)$ is the kernel function, h is the bandwidth, and $k_h(y) = \frac{1}{h} \mathrm{ker}(y/h)$. The most popular kernel is the standard normal density function; therefore, we choose it as the kernel. In this section, we use a set of representative points with associated probabilities in place of n i.i.d. samples. In this case, (6) becomes
$$\hat{p}_h(x) = \sum_{i=1}^{n} k_h(x - x_i)\, p_i = \frac{1}{h} \sum_{i=1}^{n} \mathrm{ker}\!\left( \frac{x - x_i}{h} \right) p_i,$$
and the choice of the bandwidth h is very important.
In our experiment, we divide the range of x into two parts and choose h to minimize the $L_2$-distance between $\hat{p}_h(x)$ and $p(x)$ in each part. Now, (7) becomes
$$\hat{p}_h(x) = \sum_{i=1}^{n} k_{h_j}(x - x_i)\, p_i = \frac{1}{h_j} \sum_{i=1}^{n} \mathrm{ker}\!\left( \frac{x - x_i}{h_j} \right) p_i, \quad x \in \text{part } j,\; j = 1, 2.$$
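A hedged Python sketch of the weighted estimator (7) with a Gaussian kernel is given below. For simplicity it uses a single bandwidth and a grid search minimizing a discretized $L_2$-distance; the two-zone selection of (8) would apply the same search separately on each part of the range. All names and the grid choices are ours.

```python
import numpy as np
from scipy.stats import norm

def weighted_kde(x_grid, b, p, h):
    """Weighted KDE: (1/h) * sum_i ker((x - b_i)/h) * p_i on a grid of x values."""
    x_grid, b, p = map(np.asarray, (x_grid, b, p))
    u = (x_grid[:, None] - b[None, :]) / h
    return (norm.pdf(u) * p[None, :]).sum(axis=1) / h

def best_h(b, p, pdf_true, x_grid, h_grid):
    """Pick h minimizing the discretized L2-distance to the true density."""
    errs = [np.trapz((weighted_kde(x_grid, b, p, h) - pdf_true(x_grid)) ** 2, x_grid)
            for h in h_grid]
    return h_grid[int(np.argmin(errs))]
```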
The four density estimators are shown in Figure 2. From Figure 2, it can be observed that FH-RPs and PKM-RPs have similar performance, and their performance is better than that of QMC-RPs and NTLBG-RPs.
Figure 2. Kernel density estimations from 4 kinds of RPs, when sample size n = 30 . (a) QMC-RPs (b) FH-RPs. (c) PKM-RPs. (d) NTLBG-RPs.
The recommended h and L 2 -distance between each density estimator and IG(1, 1) in each zone are given in Table 2.
Table 2. Recommended h and L 2 -distance under n = 30.

4. Resampling Based on RPs of IG Distribution

In this section, we use four kinds of RPs (QMC-RPs, FH-RPs, PKM-RPs, and NTLBG-RPs), introduced in Section 3, to form four different populations for resampling. Specifically, k denotes the number of representative points used for each distribution, while n represents the sample size generated in each resampling procedure. We then employ the samples obtained via resampling for statistical inference. The steps of the resampling procedure used in this paper are as follows:
Step 1. Generate a random number U from $U(0, 1)$, the uniform distribution on (0, 1).
Step 2. Define a random variable Y by
$$Y = \begin{cases} b_1, & U < p_1, \\ b_2, & p_1 \le U < p_1 + p_2, \\ \;\vdots & \\ b_k, & \sum_{i=1}^{k-1} p_i \le U. \end{cases}$$
Step 3. Repeat the above two steps n times, and we have a sample of Y, y 1 , , y n . Calculate the given statistic T.
Step 4. Repeat the above three steps 1000 times, and we obtain a sample of T, T 1 , , T 1000 .
Step 5. Use the mean of a sample of T to infer the statistic of the population.
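These five steps amount to repeated i.i.d. sampling from the discrete RP distribution. A compact Python sketch (function and argument names are ours) is given below; NumPy's weighted choice implements Steps 1 and 2 directly.

```python
import numpy as np

def resample_statistic(b, p, n, T, n_rep=1000, rng=None):
    """Mean of statistic T over n_rep resamples of size n from the RP distribution."""
    rng = np.random.default_rng() if rng is None else rng
    b, p = np.asarray(b, dtype=float), np.asarray(p, dtype=float)
    stats = [T(rng.choice(b, size=n, p=p)) for _ in range(n_rep)]
    return np.mean(stats)

# e.g. bias of the variance estimate for IG(1, 1), whose true variance is 1:
# bias = resample_statistic(b, p, n=30, T=lambda y: y.var(ddof=1)) - 1.0
```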
Now we apply the four RP methods for estimation of four statistics of X I G ( 1 ,   1 ) : mean, variance, skewness, and kurtosis. Table 3, Table 4 and Table 5 show estimation biases for the above four statistics, where QMC-RPs, FH-RPs, PKM-RPs, and NTLBG-RPs are employed involving the following cases: sample size n = 30 , 50 , 100 , and representative points k = 5 , 15 , 30 .
Table 3. Estimation bias for k = 5 by resampling.
Table 4. Estimation bias for k = 15 by resampling.
Table 5. Estimation bias for k = 30 by resampling.
The results in Table 3, Table 4 and Table 5 reveal distinct performance patterns among the four methods. NTLBG-RPs demonstrate superior accuracy in mean and variance estimation, achieving near-zero biases across all configurations. FH-RPs and PKM-RPs excel in higher-moment characterization, particularly for kurtosis and skewness estimation. QMC-RPs provide reasonable mean estimates but show significant limitations in capturing higher-order moments. This performance dichotomy suggests a fundamental trade-off between central moment accuracy and tail behavior characterization. The optimal method should therefore be selected to match the specific application requirements, prioritizing either the distribution center or its tail properties.

5. MLE via Quantile Estimators of IG Distribution

Let $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ be a set of n i.i.d. samples from the IG distribution with pdf $f(x; \theta)$. When $\theta = (\mu, \lambda)$, the log-likelihood function is defined as
$$l(\mu, \lambda; \mathbf{x}) = \sum_{i=1}^{n} \log f(x_i; \mu, \lambda) = \frac{n}{2} \log \lambda - \frac{n}{2} \log(2\pi) - \frac{3}{2} \sum_{i=1}^{n} \log x_i - \frac{\lambda}{2\mu^2} \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{x_i}.$$
When $\theta = (\alpha, \mu, \lambda)$, the pdf $f(x; \theta)$ is
$$f(x) = \left[ \frac{\lambda}{2\pi (x - \alpha)^3} \right]^{1/2} \exp\left\{ -\frac{\lambda (x - \alpha - \mu)^2}{2\mu^2 (x - \alpha)} \right\} I_{[\alpha, \infty)}(x),$$
and the log-likelihood function is defined as
$$l(\alpha, \mu, \lambda; \mathbf{x}) = \frac{n}{2} \log \lambda - \frac{n}{2} \log(2\pi) - \frac{3}{2} \sum_{i=1}^{n} \log(x_i - \alpha) - \frac{\lambda}{2\mu^2} \sum_{i=1}^{n} \frac{(x_i - \alpha - \mu)^2}{x_i - \alpha}.$$
The goal of MLE is to find the model parameters $\theta$ that maximize the log-likelihood function over the parameter space $\Theta$, that is,
$$\hat{\theta}_{MLE} = \arg\max_{\theta \in \Theta} l(\theta; \mathbf{x}).$$
The sequential number theoretic optimization algorithm (SNTO), introduced by Fang and Wang [], represents a broadly applicable optimization technique. Subsequently, this SNTO algorithm can be employed to determine the numerical solutions θ ^ M L E through the maximization of Equations (19) and (21).
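We do not reproduce SNTO here. As a stand-in playing the same role, the following hedged Python sketch maximizes the two-parameter log-likelihood above with a general-purpose bounded optimizer; the optimizer choice, starting point, and bounds are our own assumptions, not the paper's procedure.

```python
import numpy as np
from scipy.optimize import minimize

def ig_loglik(theta, x):
    """Log-likelihood l(mu, lambda; x) of IG(mu, lambda)."""
    mu, lam = theta
    n = x.size
    return (n / 2 * np.log(lam) - n / 2 * np.log(2 * np.pi)
            - 1.5 * np.sum(np.log(x))
            - lam / (2 * mu ** 2) * np.sum((x - mu) ** 2 / x))

def ig_mle_numeric(x):
    """Numerical MLE of (mu, lambda) by bounded quasi-Newton optimization."""
    x = np.asarray(x, dtype=float)
    res = minimize(lambda th: -ig_loglik(th, x), x0=[x.mean(), 1.0],
                   bounds=[(1e-8, None), (1e-8, None)])
    return res.x  # (mu_hat, lambda_hat)
```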
In this part, we study parameter estimation for four two-parameter inverse Gaussian distributions (IG(1, 1), IG(1, 0.5), IG(7, 1), IG(3, 3)) and one three-parameter inverse Gaussian distribution, IG(1, 0.5, 1). Some results for IG(1, 0.5), IG(7, 1), and IG(3, 3) are displayed in Appendix A to save space. The density plots corresponding to the four two-parameter distributions are shown in Figure 3. The three-parameter inverse Gaussian distribution is derived from the two-parameter one by adding a location-shift parameter, which does not alter the shape of the distribution; therefore, it need not be plotted separately.
Figure 3. (a) Density plot of IG(1, 0.5). (b) Density plot of IG(1, 1). (c) Density plot of IG(7, 1). (d) Density plot of IG(3, 3).

5.1. Two Nonparametric Quantile Estimators

5.1.1. HD Estimator

The Harrell–Davis quantile estimator [] is a linear combination of the order statistics and admits a jackknife variance. It offers a significant gain in efficiency, particularly for small samples. Let $\mathbf{x} = \{x_1, \ldots, x_n\}$ be a random sample of size n from an inverse Gaussian distribution. Denote by $X_{(i)}$ the i-th order statistic (the i-th smallest value) in $\mathbf{x}$ and by $F^{-1}(p)$ the p-th population quantile. The quantile estimator based on the random sample is
$$Q(p) = \sum_{i=1}^{n} W_{n,i}\, X_{(i)},$$
where
$$W_{n,i} = \frac{1}{B\big((n+1)p, (n+1)(1-p)\big)} \int_{(i-1)/n}^{i/n} y^{(n+1)p - 1} (1 - y)^{(n+1)(1-p) - 1}\, dy = I_{i/n}\big[(n+1)p, (n+1)(1-p)\big] - I_{(i-1)/n}\big[(n+1)p, (n+1)(1-p)\big],$$
and $I_x(a, b)$ denotes the incomplete beta function.
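A hedged Python sketch of the HD estimator follows (function name ours); SciPy's regularized incomplete beta function supplies the weights $W_{n,i}$ directly.

```python
import numpy as np
from scipy.special import betainc  # regularized incomplete beta function I_x(a, b)

def harrell_davis(x, p):
    """Harrell-Davis estimate of the p-th quantile: Q(p) = sum_i W_{n,i} X_(i)."""
    x = np.sort(np.asarray(x, dtype=float))   # ascending order statistics
    n = x.size
    a, b = (n + 1) * p, (n + 1) * (1 - p)
    edges = np.arange(n + 1) / n              # 0, 1/n, ..., 1
    w = betainc(a, b, edges[1:]) - betainc(a, b, edges[:-1])  # weights W_{n,i}
    return np.sum(w * x)
```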

5.1.2. SV Estimators

The SV estimators, proposed by Sfakianakis and Verginis [], offer alternative methods for quantile estimation with advantages for small sample sizes and extreme quantiles. Let $\mathbf{x} = \{x_1, \ldots, x_n\}$ be a random sample of size n from an inverse Gaussian distribution. The q-th quantile of the population, $Q(q)$, lies in one of the intervals $S_i$ formed by consecutive order statistics. Define the random variables $\delta_i$ as
$$\delta_i = \begin{cases} 1, & X_{(i)} \le Q(q), \\ 0, & X_{(i)} > Q(q), \end{cases}$$
where $\delta_i \sim \mathrm{Bernoulli}(q)$. The Bernoulli distribution is a discrete distribution with two possible outcomes: 1 (success) with probability q and 0 (failure) with probability $1 - q$; here, q represents the probability of success for each trial. Then, assuming the $\delta_i$ are independent, their sum $N = \delta_1 + \delta_2 + \cdots + \delta_n$ has a binomial distribution with success probability q. So, $P(Q(q) \in S_i) = P(N = i) = B(i; n, q)$, where $i = 0, 1, \ldots, n$. Let the random variable $\Psi(q) = Q_i(q)$, where $Q_i(q)$ is a point estimator of $Q(q)$ conditioned on the event $Q(q) \in S_i$, $i = 0, 1, \ldots, n$. An estimator of $Q(q)$ is obtained by calculating $Q(q) = E(\Psi(q))$.
Three different definitions of Q i ( q ) are examined to derive three distinct quantile estimators, denoted as Q S V 1 ( q ) , Q S V 2 ( q ) , and Q S V 3 ( q ) . Table 6 shows construction formulas of three quantile estimators and gives assumption of Q 0 ( q ) or Q n ( q ) of these estimators.
Table 6. Definitions of three quantile estimators.
Substitute the above equations into Ψ ( q ) = Q i ( q ) and find the expectation Q ( q ) = E ( Ψ ( q ) ) . After simplification, the following three SV estimators are obtained:
$$Q_{SV1}(q) = \frac{2B(0; n, q) + B(1; n, q)}{2}\, X_{(1)} + \frac{B(0; n, q)\big(X_{(2)} - X_{(3)}\big)}{2} + \sum_{i=2}^{n-1} \frac{B(i; n, q) + B(i-1; n, q)}{2}\, X_{(i)} + \frac{B(n; n, q)\big(X_{(n-1)} - X_{(n-2)}\big)}{2} + \frac{2B(n; n, q) + B(n-1; n, q)}{2}\, X_{(n)},$$
$$Q_{SV2}(q) = \sum_{i=0}^{n-1} B(i; n, q)\, X_{(i+1)} + \big(2X_{(n)} - X_{(n-1)}\big) B(n; n, q),$$
$$Q_{SV3}(q) = \sum_{i=1}^{n} B(i; n, q)\, X_{(i)} + \big(2X_{(1)} - X_{(2)}\big) B(0; n, q).$$
In the next section, these four quantile estimators are applied to QMC data approximating the inverse Gaussian distribution in order to compare their effects in revising the data.
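For concreteness, here is a hedged Python sketch of $Q_{SV2}(q)$ (function name ours); $B(i; n, q)$ is the Binomial(n, q) pmf, and SV1 and SV3 follow the same pattern from their formulas above.

```python
import numpy as np
from scipy.stats import binom

def sv2_quantile(x, q):
    """SV2 estimate of the q-th quantile from a sample x."""
    x = np.sort(np.asarray(x, dtype=float))      # ascending order statistics
    n = x.size
    pmf = binom.pmf(np.arange(n + 1), n, q)      # B(i; n, q), i = 0, ..., n
    # sum_{i=0}^{n-1} B(i; n, q) X_(i+1) + (2 X_(n) - X_(n-1)) B(n; n, q)
    return np.sum(pmf[:n] * x) + (2 * x[-1] - x[-2]) * pmf[n]
```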

5.2. Estimation Accuracy Measures

In this subsection, six methods for parameter estimation are presented: (1) the Plain method, traditional maximum likelihood estimation (MLE) based on random samples with SNTO optimization; (2) the HD method, which uses QMC data (HD quantiles) combined with SNTO optimization; (3)–(5) the SV1, SV2, and SV3 methods, which are based on QMC data (SV quantiles) with the different SV quantile constructions, also using SNTO optimization; and (6) the MLE-analytic formulas method, a traditional MLE approach based on the analytical formulas without SNTO optimization.
There are many metrics for assessing estimation precision. The following four accuracy measures, labeled (a) through (d), are examined in this study. Across 100 Monte Carlo iterations, the averages of these accuracy measures are compiled in tables titled “average accuracy measures” for assessment purposes. The true distribution is denoted by $F_{true}$ or $f_{true}$, and the estimated distribution by $F_{est}$ or $f_{est}$.
(a) The $L_2$-distance between the cumulative distribution function (c.d.f.) F of the underlying distribution and its estimate. The $L_2$-distance (L2.cdf) between two c.d.f.s is expressed as
$$L_2(F_{true}, F_{est}) = \left[ \int_{\mathbb{R}^d} \big(F_{true}(x) - F_{est}(x)\big)^2\, dx \right]^{1/2}.$$
(b) The $L_2$-distance between the probability density function (pdf) f of the underlying distribution and its estimate. The $L_2$-distance (L2.pdf) between two density functions is given by
$$L_2(f_{true}, f_{est}) = \left[ \int_{\mathbb{R}^d} \big(f_{true}(x) - f_{est}(x)\big)^2\, dx \right]^{1/2}.$$
(c) The Kullback–Leibler (KL) divergence, also known as relative entropy, serves as an indicator of the disparity between two probability distributions. The KL divergence from $F_{est}$ to $F_{true}$ is formulated as
$$D_{KL}(F_{true} \,\|\, F_{est}) = \int f_{true}(x) \ln \frac{f_{true}(x)}{f_{est}(x)}\, dx.$$
(d) The absolute bias index (ABI) is employed to quantify the overall bias in parameter estimation. Let $\hat{\mu}$ and $\hat{\lambda}$ represent the estimated values of $\mu$ and $\lambda$ for the inverse Gaussian distribution, where $\mu > 0$ and $\lambda > 0$. The ABI is defined as
$$ABI = \frac{1}{2} \left( \frac{|\mu - \hat{\mu}|}{\mu} + \frac{|\lambda - \hat{\lambda}|}{\lambda} \right).$$
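For reference, the following Python sketch evaluates measures (a)–(d) for a fitted IG model. The discretization choices (a finite grid, trapezoidal quadrature, the upper truncation point, and the small constant guarding the KL logarithm) are our own assumptions.

```python
import numpy as np

def accuracy_measures(F_true, F_est, f_true, f_est, mu, lam, mu_hat, lam_hat,
                      x_grid=np.linspace(1e-6, 50.0, 20000)):
    """L2.cdf, L2.pdf, KL, and ABI between a true IG model and an estimated one."""
    l2_cdf = np.sqrt(np.trapz((F_true(x_grid) - F_est(x_grid)) ** 2, x_grid))
    l2_pdf = np.sqrt(np.trapz((f_true(x_grid) - f_est(x_grid)) ** 2, x_grid))
    ft, fe = f_true(x_grid), f_est(x_grid)
    eps = 1e-300  # guards log(0/0) on the far tail
    kl = np.trapz(ft * np.log((ft + eps) / (fe + eps)), x_grid)
    abi = 0.5 * (abs(mu - mu_hat) / mu + abs(lam - lam_hat) / lam)
    return l2_cdf, l2_pdf, kl, abi
```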

5.2.1. MLE of IG ( μ , λ )

Five MLE-based methods are compared for sample sizes n = 30 , 50 , 100 . Table 7 presents the average accuracy measures corresponding to the L2-distances, the KL divergence, and the ABI. For each metric, the best performance across different distributions and sample sizes is highlighted in bold; this convention is applied consistently in the remaining tables.
Table 7. MLE estimations—average accuracy measures ( n = 30 , 50 , 100 ; N = 100 ).
The comparative analysis of Table 7 reveals that the Plain method consistently excels in L2.cdf (bold in 9/12 cases for IG(1, 1), IG(1, 0.5), IG(7, 1), and IG(3, 3)) and ABI (7/12 bold values), demonstrating robustness for cumulative distribution accuracy, while the HD method dominates L2.pdf (bold in 9/12 cases) and KL divergence (7/12 bold values), indicating superior probability density estimation. The QSV variants show mixed results with occasional competitiveness but generally underperform. Performance improves with sample size ( n = 30 → 100 ), particularly for HD and Plain, where the error metrics decrease monotonically. Notably, IG(3, 3) exhibits unique behavior with shared dominance between Plain and HD, while the QSV methods struggle, highlighting distribution-dependent efficacy. Overall, HD proves optimal for density-focused tasks (L2.pdf/KL) across most inverse Gaussian parameterizations, whereas the Plain method suits cumulative metrics (L2.cdf/ABI), with the QSV methods lacking consistent advantages.
To better interpret the results presented in Table 7, we study the frequency of different rankings for the five methods across various metrics. The ranking of these methods is determined based on 100 Monte Carlo simulations, where each simulation records the relative performance of the methods across the specified accuracy metrics. The tables present the frequency of each method achieving ranks 1 through 5, providing a comprehensive assessment of their effectiveness. This analysis aims to identify the most robust method for parameter estimation under varying sample sizes ( n = 30 , 50 , 100 ).
Table 8 presents the ranking distribution of the five methods (Plain, HD, QSV1, QSV2, QSV3) across the four accuracy measures (L2.cdf, L2.pdf, KL, ABI), based on 100 Monte Carlo simulations for sample sizes n = 30 , 50 , 100 of IG(1, 1). Since the results for IG(1, 0.5), IG(7, 1), and IG(3, 3) are similar to those for IG(1, 1), they are not presented individually in the main text but are included in Appendix A, to which readers may refer for further details.
Table 8. The rank of 5 methods in 4 accuracy measures (IG(1, 1)).
In Table 8, the Plain method consistently ranks third or lower across all measures and sample sizes, indicating stable but suboptimal performance. HD performs moderately, often ranking second or third, with occasional first-place rankings, suggesting reliability but limited excellence. QSV1 and QSV2 frequently achieve first-place rankings, particularly in KL and ABI for larger n, but also show higher variability, with notable fifth-place occurrences, indicating sensitivity to specific conditions. QSV3 exhibits balanced performance, with frequent first- and second-place rankings, especially in L2.pdf and ABI at n = 100 , suggesting robustness across diverse measures. As n increases, QSV2 and QSV3 generally improve in top rankings, while Plain and HD remain consistent but less competitive.
The analysis of Table 9 and Table 10 (with additional results for IG(1, 0.5) and IG(7, 1) provided in Appendix A, Table A4 and Table A5) reveals distinct performance patterns across the inverse Gaussian distributions. A consistent pattern emerges for the IG(1, 1) and IG(7, 1) distributions (see Table 9 and Table A5), where the QSV3 method consistently outperforms other contenders in density-related metrics (L2.pdf and KL divergence), achieving the lowest errors across nearly all sample sizes. This makes QSV3 the recommended choice for density-focused tasks. For parameter estimation under these distributions, the Plain method provides the most accurate estimates of μ , especially at smaller sample sizes. However, a different dynamic is observed for the IG(3, 3) distribution (Table 10). Here, the dominance shifts, with Plain and HD methods sharing the lead. Notably, the HD method demonstrates exceptional performance at the largest sample size ( n = 100 ), achieving optimal values in the majority of metrics. This indicates that for distributions with certain parameter configurations, traditional discretization methods can be highly effective for parameter estimation at larger n.
Table 9. Estimation results on IG(1, 1), N = 100.
Table 10. Estimation results on IG(3, 3), N = 100 .
Overall, the choice of optimal discretization method is context-dependent. We recommend QSV3 for applications prioritizing density estimation, Plain for cumulative distribution metrics and μ estimation at small n, and HD for parameter estimation tasks when dealing with larger samples from certain distributions. The Analytic formulas method occasionally excels in specific scenarios but lacks consistency, while QSV1 and QSV2 exhibit limited advantages in this study.

5.2.2. MLE of IG ( μ , α , λ )

In this subsection, parameter estimation for the three-parameter inverse Gaussian distribution IG ( 1 , 0.5 , 1 ) is presented. We know that obtaining an analytical MLE solution for the three-parameter inverse Gaussian distribution is challenging due to the multimodal nature of its log-likelihood function, particularly the complexity in estimating α , which renders direct solutions to the derivative equations impractical. To overcome the limitations of analytical solutions, numerical optimization methods such as SNTO can be used. We employed the SNTO algorithm on a random sample for parameter estimation in the optimization step, comparing its performance with the traditional MLE method for estimating μ ^ and λ ^ based on analytical expressions at a given α .
Table 11 presents parameter estimation results for the IG(1, 0.5, 1) distribution across five methods (Plain, HD, QSV1, QSV2, QSV3) and three sample sizes ( n = 30 , 50 , 100 ), evaluated on four accuracy measures (L2.pdf, L2.cdf, KL, ABI). QSV3 demonstrates superior performance at n = 30 and n = 50 , achieving the lowest errors in L2.pdf, L2.cdf, and KL, indicating its effectiveness for smaller samples. At n = 100 , HD outperforms the others with the lowest errors across all measures, suggesting its strength with larger samples. Plain shows consistent but moderate performance, often excelling in ABI, while QSV1 and QSV2 exhibit higher variability with suboptimal rankings across measures. The results highlight QSV3’s robustness at lower n and HD’s dominance as the sample size increases.
Table 11. Estimation results on IG(1, 0.5, 1), N = 100 .

6. Case Study

In this section, we utilize the inverse Gaussian distribution to model a data set arising in engineering planning and design. The data set is taken from Example 7 of Chhikara and Folks (1978) [].
The data set gives 25 runoff amounts at Jug Bridge, Maryland, as shown in Table 12.
Table 12. Data for example.
The histogram and Q-Q plot of the data set are shown in Figure 4. According to the fitted IG density, the empirical density, and the Q-Q plot, the data set appears to follow an inverse Gaussian distribution. However, graphical methods alone have limitations for identifying the distribution type. Therefore, we also conduct a K-S test on the data to avoid misspecifying the distribution.
Figure 4. (a) Q-Q Plot for inverse Gaussian distribution. (b) Histogram with inverse Gaussian fit.
The K-S test statistic for this example is D = 0.0621 with a p-value of 0.9976. Therefore, it is reasonable to assume that the data set follows an inverse Gaussian distribution. We use the K-S test and three performance indicators to compare the fitting performance of the five methods: the bias, the sum of squares due to error (SSE), and the coefficient of determination ( $R^2$ ).
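A hedged Python sketch of this goodness-of-fit check follows: fit IG by the MLE formulas of Section 2, then run a one-sample K-S test against the fitted c.d.f. (the mapping of IG(μ, λ) to SciPy's invgauss(μ/λ, scale=λ) is as noted earlier).

```python
import numpy as np
from scipy.stats import invgauss, kstest

def ks_check(x):
    """Fit IG(mu, lambda) by MLE and return the K-S statistic and p-value."""
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean()
    lam_hat = x.size / np.sum(1.0 / x - 1.0 / mu_hat)
    return kstest(x, invgauss(mu_hat / lam_hat, scale=lam_hat).cdf)
```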
Table 13 presents the parameter estimates obtained by the different methods, together with the maximized likelihood values, the results of the K-S goodness-of-fit test, and the performance indicators. For each measure, the best-performing methods are marked in bold to facilitate comparison. Among the revised methods considered, QSV1-MLE demonstrates the best fitting performance.
Table 13. The estimation result of real data.

7. Conclusions

This paper investigates statistical simulation and parameter estimation for the IG distribution, studying and comparing five RP methods to enhance inference accuracy and employing QMC-data revisions with the HD and SV quantile estimators for MLE. The superiority of MSE-RPs in moment and density estimation, alongside their effectiveness in resampling accuracy, underscores their potential for improving inference precision. The integration of QMC-data revisions with the HD and SV quantile estimators, particularly QSV1 and QSV3, optimizes MLE performance across various sample sizes and parameter scenarios, as validated on the real runoff dataset. These findings suggest that MSE-RPs, combined with tailored quantile approaches, provide a reliable framework for addressing the challenges of skewed distributions, with broad implications for fields such as lifetime analysis, satellite communications, and risk modeling, thereby advancing the accuracy of entropy-based evaluations in practical settings.

Author Contributions

Conceptualization and supervision, K.-T.F.; writing—review and editing, X.-L.P.; methodology and software, W.-W.H.; software and writing—original draft preparation, W.-W.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the BNBU research grant UICR0200023-25 and in part by Guangdong Provincial Key Laboratory of IRADS (2022B1212010006).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data used in the case study were obtained from Reference []. Further details about the data sources can be found in the cited references. No new data were created in this study.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A

Table A1. The rank of 5 methods in 4 accuracy measures (IG(1, 0.5)).
nMethodL2.cdfL2.pdf
Rank1 Rank2 Rank3 Rank4 Rank5 Rank1 Rank2 Rank3 Rank4 Rank5
30Plain5306230124412311
30HD939124004246750
30QSV11315172233161293528
30QSV23951748261192430
30QSV33411828194293541
KLABI
30Plain94236130154416250
30HD2364416262538283
30QSV118562942185281534
30QSV23375253023852341
30QSV3381091726381813922
L2.cdfL2.pdf
50Plain10305370114012370
50HD1338123615256820
50QSV121192019211516164112
50QSV23113956311331637
50QSV325121229223861451
KLABI
50Plain10443213185117231
50HD93937141102829321
50QSV1246123127264321721
50QSV22657214125341751
50QSV33161221303114181126
L2.cdfL2.pdf
100Plain221734023232340
100HD1042444033848110
100QSV125141315331420173217
100QSV2385165035711938
100QSV32518931174632445
KLABI
100Plain1314721053724340
100HD74326240123419350
100QSV12413725311815331123
100QSV2365411442977552
100QSV3328161925367171525
Table A2. The rank of 5 methods in 4 accuracy measures (IG(7, 1)).
Table A2. The rank of 5 methods in 4 accuracy measures (IG(7, 1)).
nMethodL2.cdfL2.pdf
Rank1 Rank2 Rank3 Rank4 Rank5 Rank1 Rank2 Rank3 Rank4 Rank5
30Plain1833837419336366
30HD818621205216563
30QSV113291729121529123014
30QSV2296101540296121934
30QSV34210653741115538
KLABI
30Plain1834636618329374
30HD818646491760140
30QSV113311029171431102718
30QSV2307141930289171333
30QSV33912483739106738
L2.cdfL2.pdf
50Plain1336544213373452
50HD518661103177550
50QSV11327212217123293017
50QSV237105202838861929
50QSV3321043513495151
KLABI
50Plain1434445313359403
50HD41974307206391
50QSV113291029191622162719
50QSV2361252027321481630
50QSV333863503297646
L2.cdfL2.pdf
100Plain2371942044211430
100HD917591414127680
100QSV11330102918222843115
100QSV235991532301561435
100QSV341821484042549
KLABI
100Plain3421837033526360
100HD320689072552124
100QSV1212922721232272523
100QSV234772428311272327
100QSV339344503677545
Table A3. The rank of 5 methods in 4 accuracy measures (IG(3, 3)).
Table A3. The rank of 5 methods in 4 accuracy measures (IG(3, 3)).
nMethodL2.cdfL2.pdf
Rank1 Rank2 Rank3 Rank4 Rank5 Rank1 Rank2 Rank3 Rank4 Rank5
30Plain161954110142538221
30HD132721354122931226
30QSV130201015253219101821
30QSV2307412472219121235
30QSV31535152692022112324
KLABI
30Plain132547141132741181
30HD142921306132629248
30QSV131161510283416101327
30QSV226981938221191939
30QSV320301029112133112312
L2.cdfL2.pdf
50Plain6137290111943270
50HD183410362103534192
50QSV13216913302718142318
50QSV23594547361561132
50QSV3132793516201271843
KLABI
50Plain7285312092540251
50HD114017293132529312
50QSV129121018313116111329
50QSV2377811373013111036
50QSV320121430242219121928
L2.cdfL2.pdf
100Plain318753133149161
100HD940743153728291
100QSV1331161535232182523
100QSV2402345136451045
100QSV315318341233991930
KLABI
100Plain2295217083628280
100HD7392034072340300
100QSV134102134131217932
100QSV233671341307101340
100QSV324181822182416141927
Table A4. Estimation results on IG(1, 0.5), N = 100 .
n | Method | L2.pdf | L2.cdf | KL | ABI | $\hat{\mu}$ | $\hat{\lambda}$
30 | Plain | 0.07335 | 0.02055 | 0.00462 | 0.07528 | 0.99117 | 0.57086
30 | HD | 0.09692 | 0.03336 | 0.00737 | 0.10874 | 1.04214 | 0.58768
30 | QSV1 | 0.12600 | 0.03741 | 0.01631 | 0.15858 | 0.95335 | 0.63526
30 | QSV2 | 0.13636 | 0.06959 | 0.01630 | 0.18828 | 1.15247 | 0.61204
30 | QSV3 | 0.02076 | 0.01404 | 0.00086 | 0.04045 | 0.96127 | 0.52109
30 | Analytic formulas | 0.06303 | 0.01826 | 0.00368 | 0.07274 | 0.97843 | 0.56195
50 | Plain | 0.03534 | 0.00983 | 0.00105 | 0.03612 | 0.99346 | 0.53285
50 | HD | 0.04659 | 0.01997 | 0.00164 | 0.05583 | 1.03875 | 0.53646
50 | QSV1 | 0.06421 | 0.02064 | 0.00429 | 0.08323 | 0.96340 | 0.56493
50 | QSV2 | 0.07791 | 0.05360 | 0.00632 | 0.11825 | 1.13863 | 0.54893
50 | QSV3 | 0.01822 | 0.01106 | 0.00030 | 0.02424 | 0.97320 | 0.48916
50 | Analytic formulas | 0.05150 | 0.01489 | 0.00242 | 0.05874 | 0.98228 | 0.54989
100 | Plain | 0.01899 | 0.00528 | 0.00030 | 0.01933 | 0.99604 | 0.51735
100 | HD | 0.02247 | 0.01172 | 0.00041 | 0.02889 | 1.02662 | 0.51558
100 | QSV1 | 0.03241 | 0.01169 | 0.00116 | 0.04467 | 0.97479 | 0.53207
100 | QSV2 | 0.04274 | 0.03316 | 0.00220 | 0.06649 | 1.08829 | 0.52234
100 | QSV3 | 0.00897 | 0.00572 | 0.00008 | 0.01226 | 0.98588 | 0.49481
100 | Analytic formulas | 0.01556 | 0.00627 | 0.00029 | 0.02278 | 0.98500 | 0.51527
The best performance within each measure is highlighted in bold.
Table A5. Estimation results on IG(7, 1), N = 100 .
n | Method | L2.pdf | L2.cdf | KL | ABI | $\hat{\mu}$ | $\hat{\lambda}$
30 | Plain | 0.02825 | 0.03129 | 0.00166 | 0.04631 | 6.93597 | 1.08347
30 | HD | 0.04269 | 0.04957 | 0.00380 | 0.06611 | 7.02712 | 1.12835
30 | QSV1 | 0.05288 | 0.06005 | 0.00610 | 0.08786 | 6.92041 | 1.16435
30 | QSV2 | 0.05545 | 0.06837 | 0.00651 | 0.09580 | 7.15371 | 1.16965
30 | QSV3 | 0.00079 | 0.00834 | 0.00002 | 0.00855 | 6.89497 | 1.00209
30 | Analytic formulas | 0.04011 | 0.04516 | 0.00359 | 0.07640 | 6.79364 | 1.12331
50 | Plain | 0.01586 | 0.01745 | 0.00052 | 0.02738 | 6.93861 | 1.04598
50 | HD | 0.02339 | 0.02638 | 0.00110 | 0.03413 | 6.99633 | 1.06773
50 | QSV1 | 0.03187 | 0.03539 | 0.00213 | 0.05279 | 6.92572 | 1.09496
50 | QSV2 | 0.03388 | 0.04260 | 0.00230 | 0.05917 | 7.14069 | 1.09824
50 | QSV3 | 0.01103 | 0.01484 | 0.00022 | 0.02016 | 6.91901 | 0.97126
50 | Analytic formulas | 0.03340 | 0.03711 | 0.00235 | 0.05578 | 6.91821 | 1.09988
100 | Plain | 0.00949 | 0.01163 | 0.00020 | 0.02062 | 6.90626 | 1.02784
100 | HD | 0.01157 | 0.01287 | 0.00026 | 0.01685 | 6.99344 | 1.03277
100 | QSV1 | 0.01740 | 0.01949 | 0.00065 | 0.03298 | 6.89592 | 1.05109
100 | QSV2 | 0.01859 | 0.02741 | 0.00067 | 0.03783 | 7.17362 | 1.05086
100 | QSV3 | 0.00579 | 0.01180 | 0.00007 | 0.01496 | 6.89049 | 0.98572
100 | Analytic formulas | 0.01116 | 0.01402 | 0.00235 | 0.01919 | 7.05329 | 1.03077
The best performance within each measure is highlighted in bold.
Table A6. The points of RPs from I G ( 1 ,   1 ) .
categoryRP1RP2RP3RP4RP5RP6RP7RP8
n = 5 QMC-RPs0.2376250.4297420.6758411.0851202.143034
FH-RPs0.4125241.0773682.0613543.6165106.523829
PKM-RPs0.4125231.0773662.0613483.6165016.523817
NTLBG-RPs0.4100671.0666642.0356733.5640046.421223
n = 10 QMC-RPs0.1841130.2853250.3797230.4831710.6048320.7561100.9561091.244060
FH-RPs0.2827800.5833360.9405311.3772291.9184902.6012143.4866824.689563
PKM-RPs0.2827780.5833310.9405231.3772161.9184722.6011903.4866534.689529
NTLBG-RPs0.2612600.5187380.8203601.1898701.6530572.2482303.0347484.122999
n = 15 QMC-RPs0.1624200.2376250.3008760.3636210.4297420.5019270.5828660.675841
FH-RPs0.2302890.4283580.6439970.8880241.1673741.4892361.8624582.298770
PKM-RPs0.2302870.4283520.6439860.8880081.1673511.4892071.8624212.298726
NTLBG-RPs0.1777010.2972680.4204560.5581350.7183210.9094911.1422471.429964
n = 20 QMC-RPs0.1498040.2121750.2617860.3086430.3556540.4043730.4559630.511506
FH-RPs0.2006940.3509710.5062740.6746700.8599391.0650681.2930801.547407
PKM-RPs0.2006850.3509570.5062540.6746430.8599041.0650251.2930271.547344
NTLBG-RPs0.1379580.2099470.2776200.3474370.4227820.5065850.6019230.712768
n = 25 QMC-RPs0.1412530.1957910.2376250.2759540.3133070.3508970.3895060.429742
FH-RPs0.1812460.3038590.4260990.5549030.6930500.8423491.0043521.180611
PKM-RPs0.1812420.3038490.4260830.5548790.6930180.8423081.0043001.180549
NTLBG-RPs0.1207630.1759020.2249580.2727800.3211830.3714870.4249700.482953
n = 28 QMC-RPs0.1372620.1883710.2269300.2617860.2953270.3286670.3624800.397263
FH-RPs0.1723760.2833370.3920990.5051830.6250660.7532370.8908731.039072
PKM-RPs0.1723710.2833260.3920810.5051560.6250310.7531920.8908181.039005
NTLBG-RPs0.1154540.1653810.2089110.2504820.2915650.3331810.3761070.421139
n = 30 QMC-RPs0.1349370.1841130.2208620.2538270.2853250.3164170.3477350.379723
FH-RPs0.1672780.2718100.3732590.4779210.5881330.7052400.8302580.964093
PKM-RPs0.1672720.2717980.3732390.4778930.5880960.7051930.8302000.964023
NTLBG-RPs0.1125440.1600320.2007610.2391800.2770450.3149990.3534150.393126
categoryRP9RP10RP11RP12RP13RP14RP15RP16
n = 10 QMC-RPs1.7253602.922076
FH-RPs6.4662019.623196
PKM-RPs6.4661619.623159
NTLBG-RPs5.7509208.653409
n = 15 QMC-RPs0.7853550.9181361.0851201.3059951.6218602.1430343.411364
FH-RPs2.8145403.4337314.1935355.1562776.4386208.30104811.562225
PKM-RPs2.8144883.4336734.1934705.1562076.4385488.30097911.562190
NTLBG-RPs1.7901942.2480182.8417753.6324914.7112916.2883629.129651
n = 20 QMC-RPs0.5721670.6393150.7146720.8005080.8999501.0175181.1601141.339036
FH-RPs1.8321772.1525382.5150862.9285043.4045703.9598224.6185165.418287
PKM-RPs1.8321032.1524532.5149892.9283953.4044493.9596884.6183705.418128
NTLBG-RPs0.8441041.0018721.1934831.4290181.7214822.0890412.5555893.153091
n = 25 QMC-RPs0.4721610.5173220.5658340.6183950.6758410.7392010.8097790.889279
FH-RPs1.3728161.5828881.8130842.0661052.3452412.6545762.9992633.385945
PKM-RPs1.3727421.5828031.8129872.0659942.3451182.6544392.9991133.385782
NTLBG-RPs0.5470160.6187880.7007770.7958940.9076061.0408011.2013311.396686
n = 28 QMC-RPs0.4334330.4713810.5115060.5542380.6000590.6495320.7033260.762262
FH-RPs1.1989651.3717841.5589201.7619741.9828242.2237072.4873192.776958
PKM-RPs1.1988871.3716941.5588161.7618561.9826932.2235612.4871582.776784
NTLBG-RPs0.4691470.5210230.5777730.6408950.7123460.7946210.8903951.003197
n = 30 QMC-RPs0.4127430.4471200.4831710.5212270.5616470.6048320.6512510.701455
FH-RPs1.1076491.2618831.4278481.6067321.7999002.0089422.2357322.482506
PKM-RPs1.1075671.2617881.4277391.6066091.7997632.0087902.2355652.482323
NTLBG-RPs0.4347010.4785390.5255870.5768290.6332020.6963440.7682720.850992
categoryRP17RP18RP19RP20RP21RP22RP23RP24
n = 20 QMC-RPs1.5746611.9094442.4569553.771838
FH-RPs6.4221647.7480479.65899212.981382
PKM-RPs6.4219957.7478649.65879712.981110
NTLBG-RPs3.9348874.9932696.5398319.328942
n = 25 QMC-RPs0.9799951.0851201.2092881.3595871.5476211.7942862.1430342.709786
FH-RPs3.8233824.3234614.9028715.5860806.4110347.4412528.79549210.738640
PKM-RPs3.8232064.3232734.9026715.5858686.4108117.4410198.79525010.738390
NTLBG-RPs1.6364421.9328972.3033472.7722903.3760424.1674275.2231966.759912
n = 28 QMC-RPs0.8273660.8999500.9817351.0750411.1831051.3106261.4647871.657343
FH-RPs3.0967243.4517933.8488384.2966614.8072035.3972246.0912916.927496
PKM-RPs3.0965353.4515893.8486214.2964314.8069605.3969696.0910246.927220
NTLBG-RPs1.1380591.3009971.4986901.7409752.0416882.4166492.8871193.486758
n = 30 QMC-RPs0.7561100.8160360.8822620.9561091.0393111.1342071.2440601.373608
FH-RPs2.7519663.0474183.3729723.7338214.1366554.5903025.1067335.702749
PKM-RPs2.7517673.0472043.3727433.7335764.1363964.5900285.1064455.702449
NTLBG-RPs0.9474891.0617751.1981361.3622301.5613871.8058562.1074662.482508
categoryRP25RP26RP27RP28RP29RP30RP31RP32
n = 25 QMC-RPs4.058467
FH-RPs14.102428
PKM-RPs14.102179
NTLBG-RPs9.558364
n = 28 QMC-RPs1.9094442.2650272.8411564.206263
FH-RPs7.9695909.33676911.29476914.677905
PKM-RPs7.9693089.33645911.29443414.677578
NTLBG-RPs4.2676395.3241286.8475299.558364
n = 30 QMC-RPs1.5300931.7253601.9807112.3403682.9220764.296947
FH-RPs6.4029787.2455968.2944949.66908611.63564515.030035
PKM-RPs6.4026657.2452728.2941609.66874411.63530015.029713
NTLBG-RPs2.9545703.5550094.3283025.3555696.8475299.558364
Table A7. The corresponding probabilities of RPs from I G ( 1 ,   1 ) .
categoryP1P2P3P4P5P6P7P8
n = 5 QMC-RPs0.2000000.2000000.2000000.2000000.200000
FH-RPs0.5434290.2806050.1222930.0442810.009393
PKM-RPs0.5434270.2806050.1222940.0442810.009393
        NTLBG-RPs  0.539400  0.281200  0.123800  0.045600  0.010000
n = 10  QMC-RPs    0.100000  0.100000  0.100000  0.100000  0.100000  0.100000  0.100000  0.100000
        FH-RPs     0.303210  0.250172  0.171229  0.113110  0.072568  0.044597  0.025567  0.013023
        PKM-RPs    0.303207  0.250171  0.171229  0.113111  0.072568  0.044598  0.025567  0.013024
        NTLBG-RPs  0.259600  0.235000  0.174400  0.124400  0.085800  0.056800  0.034600  0.019000
n = 15  QMC-RPs    0.066667  0.066667  0.066667  0.066667  0.066667  0.066667  0.066667  0.066667
        FH-RPs     0.197129  0.198779  0.159822  0.123196  0.093408  0.069955  0.051655  0.037432
        PKM-RPs    0.197125  0.198777  0.159821  0.123197  0.093409  0.069956  0.051656  0.037433
        NTLBG-RPs  0.099000  0.127800  0.126600  0.118600  0.108000  0.096200  0.083600  0.070200
n = 20  QMC-RPs    0.050000  0.050000  0.050000  0.050000  0.050000  0.050000  0.050000  0.050000
        FH-RPs     0.139870  0.159043  0.140258  0.117302  0.096240  0.078195  0.063079  0.050521
        PKM-RPs    0.139858  0.159039  0.140256  0.117302  0.096241  0.078196  0.063081  0.050523
        NTLBG-RPs  0.041600  0.064000  0.072600  0.076000  0.077400  0.077400  0.076600  0.075200
n = 25  QMC-RPs    0.040000  0.040000  0.040000  0.040000  0.040000  0.040000  0.040000  0.040000
        FH-RPs     0.105020  0.129728  0.121821  0.107483  0.092647  0.078981  0.066896  0.056387
        PKM-RPs    0.105013  0.129722  0.121816  0.107481  0.092646  0.078981  0.066897  0.056389
        NTLBG-RPs  0.024000  0.040000  0.047200  0.051000  0.052400  0.053600  0.054000  0.054800
n = 28  QMC-RPs    0.035714  0.035714  0.035714  0.035714  0.035714  0.035714  0.035714  0.035714
        FH-RPs     0.090203  0.115859  0.112045  0.101330  0.089310  0.077762  0.067244  0.057882
        PKM-RPs    0.090195  0.115852  0.112040  0.101326  0.089309  0.077761  0.067245  0.057884
        NTLBG-RPs  0.019600  0.032800  0.039600  0.042600  0.044200  0.044800  0.045200  0.045400
n = 30  QMC-RPs    0.033333  0.033333  0.033333  0.033333  0.033333  0.033333  0.033333  0.033333
        FH-RPs     0.082054  0.107826  0.106082  0.097321  0.086883  0.076561  0.066981  0.058328
        PKM-RPs    0.082046  0.107817  0.106076  0.097317  0.086881  0.076561  0.066981  0.058329
        NTLBG-RPs  0.017400  0.029600  0.035400  0.038600  0.040400  0.040800  0.040800  0.041200

category           P9        P10       P11       P12       P13       P14       P15       P16
n = 10  QMC-RPs    0.100000  0.100000
        FH-RPs     0.005301  0.001224
        PKM-RPs    0.005301  0.001224
        NTLBG-RPs  0.008200  0.002200
n = 15  QMC-RPs    0.066667  0.066667  0.066667  0.066667  0.066667  0.066667  0.066667
        FH-RPs     0.026432  0.018001  0.011639  0.006965  0.003678  0.001544  0.000366
        PKM-RPs    0.026433  0.018002  0.011640  0.006965  0.003679  0.001544  0.000366
        NTLBG-RPs  0.056600  0.043400  0.031200  0.020400  0.011400  0.005400  0.001600
n = 20  QMC-RPs    0.050000  0.050000  0.050000  0.050000  0.050000  0.050000  0.050000  0.050000
        FH-RPs     0.040125  0.031535  0.024455  0.018643  0.013901  0.010068  0.007012  0.004625
        PKM-RPs    0.040127  0.031537  0.024457  0.018645  0.013902  0.010069  0.007013  0.004626
        NTLBG-RPs  0.072800  0.069000  0.063800  0.057400  0.049600  0.041200  0.032000  0.023200
n = 25  QMC-RPs    0.040000  0.040000  0.040000  0.040000  0.040000  0.040000  0.040000  0.040000
        FH-RPs     0.047317  0.039514  0.032814  0.027069  0.022149  0.017945  0.014364  0.011327
        PKM-RPs    0.047319  0.039516  0.032817  0.027071  0.022151  0.017947  0.014365  0.011328
        NTLBG-RPs  0.055200  0.055600  0.056200  0.056200  0.055800  0.054800  0.052600  0.049400
n = 28  QMC-RPs    0.035714  0.035714  0.035714  0.035714  0.035714  0.035714  0.035714  0.035714
        FH-RPs     0.049638  0.042416  0.036105  0.030598  0.025797  0.021618  0.017984  0.014831
        PKM-RPs    0.049640  0.042418  0.036107  0.030600  0.025800  0.021620  0.017986  0.014833
        NTLBG-RPs  0.045800  0.046000  0.046400  0.047000  0.047800  0.048600  0.048800  0.048800
n = 30  QMC-RPs    0.033333  0.033333  0.033333  0.033333  0.033333  0.033333  0.033333  0.033333
        FH-RPs     0.050614  0.043782  0.037752  0.032439  0.027763  0.023651  0.020039  0.016869
        PKM-RPs    0.050616  0.043785  0.037755  0.032442  0.027765  0.023653  0.020041  0.016872
        NTLBG-RPs  0.041000  0.041000  0.041400  0.041800  0.042200  0.043200  0.044000  0.044600

category           P17       P18       P19       P20       P21       P22       P23       P24
n = 20  QMC-RPs    0.050000  0.050000  0.050000  0.050000
        FH-RPs     0.002817  0.001512  0.000644  0.000154
        PKM-RPs    0.002818  0.001512  0.000644  0.000154
        NTLBG-RPs  0.015400  0.009000  0.004400  0.001400
n = 25  QMC-RPs    0.040000  0.040000  0.040000  0.040000  0.040000  0.040000  0.040000  0.040000
        FH-RPs     0.008767  0.006629  0.004862  0.003426  0.002284  0.001405  0.000761  0.000327
        PKM-RPs    0.008769  0.006629  0.004863  0.003426  0.002284  0.001405  0.000761  0.000327
        NTLBG-RPs  0.044800  0.039200  0.032800  0.026000  0.019200  0.012800  0.007400  0.003800
n = 28  QMC-RPs    0.035714  0.035714  0.035714  0.035714  0.035714  0.035714  0.035714  0.035714
        FH-RPs     0.012104  0.009754  0.007739  0.006025  0.004579  0.003375  0.002389  0.001600
        PKM-RPs    0.012105  0.009755  0.007740  0.006026  0.004580  0.003376  0.002390  0.001600
        NTLBG-RPs  0.048200  0.046600  0.043600  0.040000  0.035200  0.029400  0.023200  0.017200
n = 30  QMC-RPs    0.033333  0.033333  0.033333  0.033333  0.033333  0.033333  0.033333  0.033333
        FH-RPs     0.014094  0.011670  0.009559  0.007730  0.006153  0.004805  0.003662  0.002706
        PKM-RPs    0.014096  0.011672  0.009561  0.007731  0.006154  0.004806  0.003663  0.002707
        NTLBG-RPs  0.045200  0.045400  0.044600  0.043200  0.040600  0.037400  0.032600  0.027600

category           P25       P26       P27       P28       P29       P30
n = 25  QMC-RPs    0.040000
        FH-RPs     0.000079
        PKM-RPs    0.000079
        NTLBG-RPs  0.001200
n = 28  QMC-RPs    0.035714  0.035714  0.035714  0.035714
        FH-RPs     0.000988  0.000537  0.000231  0.000056
        PKM-RPs    0.000988  0.000537  0.000231  0.000056
        NTLBG-RPs  0.011600  0.007000  0.003400  0.001200
n = 30  QMC-RPs    0.033333  0.033333  0.033333  0.033333  0.033333  0.033333
        FH-RPs     0.001920  0.001289  0.000798  0.000434  0.000188  0.000046
        PKM-RPs    0.001921  0.001289  0.000798  0.000435  0.000188  0.000046
        NTLBG-RPs  0.021800  0.016200  0.010800  0.006600  0.003400  0.001200
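The probabilities in the table can be recovered directly from the support points. QMC-RPs carry equal weight 1/n by construction, which is why their rows are constant; for MSE-type points (FH-RPs and PKM-RPs), each point carries the mass of its Voronoi cell, F(m_i) − F(m_{i−1}), where the m_i are midpoints between adjacent sorted points. Below is a minimal sketch of this computation, not the authors' code; it assumes SciPy's parametrization, in which IG(μ, λ) corresponds to invgauss(μ/λ, scale=λ), and the function name cell_probabilities and the demo points are illustrative.

```python
import numpy as np
from scipy.stats import invgauss


def cell_probabilities(points, mu, lam):
    """Voronoi-cell probabilities F(m_i) - F(m_{i-1}) of sorted support
    points under IG(mu, lam); cell boundaries are midpoints between
    adjacent points, plus 0 and infinity at the ends."""
    pts = np.sort(np.asarray(points, dtype=float))
    mids = (pts[:-1] + pts[1:]) / 2.0                  # interior boundaries
    edges = np.concatenate(([0.0], mids, [np.inf]))
    # SciPy's invgauss(m, scale=s) has mean m*s and shape parameter s,
    # so IG(mu, lam) maps to invgauss(mu / lam, scale=lam).
    cdf = invgauss.cdf(edges, mu / lam, scale=lam)
    return np.diff(cdf)                                # one mass per point

# Illustrative check with QMC-style quantile points of IG(1, 1):
# the resulting masses are positive and sum to 1.
n = 10
demo_pts = invgauss.ppf((2 * np.arange(1, n + 1) - 1) / (2 * n), 1.0, scale=1.0)
print(cell_probabilities(demo_pts, mu=1.0, lam=1.0).sum())  # ~1.0
```

Applied to the FH-RP and PKM-RP support points, this construction reproduces their rows above; the two methods target the same MSE criterion, which is why their probabilities agree to several decimal places throughout the table.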

References

  1. Schrödinger, E. Zur Theorie der Fall- und Steigversuche an Teilchen mit Brownscher Bewegung. Phys. Z. 1915, 16, 289–295.
  2. Tweedie, M.C.K. Inverse Statistical Variates. Nature 1945, 155, 453.
  3. Wald, A. Sequential Analysis; Courier Corporation: Chelmsford, MA, USA, 2004.
  4. Cox, D.R.; Miller, H.D. The Theory of Stochastic Processes; Methuen: London, UK, 1965.
  5. Bartlett, M.S. An Introduction to Stochastic Processes; Cambridge University Press: London, UK, 1966.
  6. Moran, P.A.P. An Introduction to Probability Theory; Clarendon Press: Oxford, UK, 1968.
  7. Wasan, M.T.; Roy, L.K. Tables of Inverse Gaussian Percentage Points. Technometrics 1969, 11, 591–604.
  8. Folks, J.L.; Chhikara, R.S. The Inverse Gaussian Distribution and Its Statistical Application—A Review. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 1978, 40, 263–275.
  9. Chhikara, R.S.; Folks, J.L. Estimation of Inverse Gaussian Distribution Function. J. Am. Stat. Assoc. 1974, 69, 250–254.
  10. Chhikara, R.S.; Folks, J.L. The Inverse Gaussian Distribution as a Lifetime Model. Technometrics 1977, 19, 461–468.
  11. Iyengar, S.; Patwardhan, G. Recent Developments in the Inverse Gaussian Distribution. In Handbook of Statistics; Elsevier: Amsterdam, The Netherlands, 1988; Volume 7, pp. 479–490.
  12. Kourogiorgas, C.; Panagopoulos, A.D.; Livieratos, S.N.; Chatzarakis, G.E. Rain Attenuation Time Series Synthesizer Based on Inverse Gaussian Distribution. Electron. Lett. 2015, 51, 2162–2164.
  13. Punzo, A. A New Look at the Inverse Gaussian Distribution with Applications to Insurance and Economic Data. J. Appl. Stat. 2019, 46, 1260–1287.
  14. Krbálek, M.; Hobza, T.; Patočka, M.; Krbálková, M.; Apeltauer, J.; Groverová, N. Statistical Aspects of Gap-Acceptance Theory for Unsignalized Intersection Capacity. Phys. A 2022, 594, 127043.
  15. Fang, K.T.; Pan, J. A Review of Representative Points of Statistical Distributions and Their Applications. Mathematics 2023, 11, 2930.
  16. Li, X.; He, P.; Huang, M.; Peng, X. A New Class of Moment-Constrained Mean Square Error Representative Samples for Continuous Distributions. J. Stat. Comput. Simul. 2025, 95, 2175–2203.
  17. Peng, X.; Huang, M.; Li, X.; Zhou, T.; Lin, G.; Wang, X. Patient Regional Index: A New Way to Rank Clinical Specialties Based on Outpatient Clinics Big Data. BMC Med. Res. Methodol. 2024, 24, 192.
  18. Fang, K.T.; Ye, H.; Zhou, Y. Representative Points of Statistical Distributions and Their Applications in Statistical Inference; CRC Press: New York, NY, USA, 2025.
  19. Harrell, F.E.; Davis, C.E. A New Distribution-Free Quantile Estimator. Biometrika 1982, 69, 635–640.
  20. Sfakianakis, M.E.; Verginis, D.G. A New Family of Nonparametric Quantile Estimators. Commun. Stat. Simul. Comput. 2008, 37, 337–345.
  21. Wald, A. Sequential Analysis; John Wiley & Sons: New York, NY, USA, 1947.
  22. Tweedie, M.C.K. Statistical Properties of Inverse Gaussian Distributions. II. Ann. Math. Stat. 1957, 28, 696–705.
  23. Fang, K.T.; He, S.D. The Problem of Selecting a Given Number of Representative Points in a Normal Population and a Generalized Mills’ Ratio. Acta Math. Appl. Sin. 1984, 7, 293–306.
  24. Efron, B. Bootstrap Methods: Another Look at the Jackknife. Ann. Stat. 1979, 7, 1–26.
  25. Hua, L.K.; Wang, Y. Applications of Number Theory to Numerical Analysis; Springer: Berlin, Germany; Science Press: Beijing, China, 1981.
  26. Niederreiter, H. Random Number Generation and Quasi-Monte Carlo Methods; SIAM: Philadelphia, PA, USA, 1992.
  27. Fang, K.T.; Wang, Y. Number-Theoretic Methods in Statistics; Chapman and Hall: London, UK, 1994.
  28. Barbiero, A.; Hitaj, A. Discrete Approximations of Continuous Probability Distributions Obtained by Minimizing Cramér–von Mises-Type Distances. Stat. Pap. 2022, 64, 1–29.
  29. Flury, B.A. Principal Points. Biometrika 1990, 77, 33–41.
  30. Cox, D.R. Note on Grouping. J. Am. Stat. Assoc. 1957, 52, 543–547.
  31. Linde, Y.; Buzo, A.; Gray, R. An Algorithm for Vector Quantizer Design. IEEE Trans. Commun. 1980, 28, 84–95.
  32. Lloyd, S.P. Least Squares Quantization in PCM. IEEE Trans. Inf. Theory 1982, 28, 129–137.
  33. Tarpey, T. A Parametric k-Means Algorithm. Comput. Stat. 2007, 22, 71–89.
  34. Rosenblatt, M. Remarks on Some Nonparametric Estimates of a Density Function. Ann. Math. Stat. 1956, 27, 832–837.
  35. Parzen, E. On Estimation of a Probability Density Function and Mode. Ann. Math. Stat. 1962, 33, 1065–1076.