Article

A Versatile Distribution Based on the Incomplete Gamma Function: Characterization and Applications

1
Departamento de Estadística y Ciencia de Datos, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
2
Faculty of Basic Sciences, Universidad Católica del Maule, Talca 3480112, Chile
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(11), 1749; https://doi.org/10.3390/math13111749
Submission received: 1 April 2025 / Revised: 15 May 2025 / Accepted: 22 May 2025 / Published: 25 May 2025
(This article belongs to the Section D1: Probability and Statistics)

Abstract

In this study, we introduce a novel distribution related to the gamma distribution, referred to as the generalized incomplete gamma distribution. This new family is defined through a stochastic representation involving a linear transformation of a random variable following a distribution derived from the upper incomplete gamma function. As a result, the proposed distribution exhibits a probability density function that effectively captures data exhibiting asymmetry and both mild and high levels of kurtosis, providing greater flexibility compared to the conventional gamma distribution. We analyze the probability density function and explore fundamental properties, including moments, skewness, and kurtosis coefficients. Parameter estimation is conducted via the maximum likelihood method, and a Monte Carlo simulation study is performed to assess the asymptotic properties of the maximum likelihood estimators. To illustrate the applicability of the proposed distribution, we present two case studies involving real-world datasets related to mineral concentration and the length of odontoblasts in guinea pigs, demonstrating that the proposed distribution provides a superior fit compared to the gamma, inverse Gaussian, and slash-type distributions.

1. Introduction

The gamma distribution is a continuous probability distribution that is widely used in statistics and probability theory. It is characterized by two parameters, α (shape) and β (scale). The shape parameter determines the shape and skew of the distribution, while the scale parameter influences the spread or scale of the distribution.
A special case of this distribution occurs when α is a positive integer, in which case it is also known as the Erlang distribution. In particular, when α = 1, the gamma distribution reduces to the exponential distribution; see Johnson [1].
The gamma distribution is used extensively in statistics, science, and engineering because of its flexibility in modeling continuous positive data. Common applications include reliability and survival analysis, queueing theory, finance, healthcare, environmental science, insurance, quality control, traffic engineering, and biostatistics; see, for instance, Nadarajah and Gupta [2], Husak et al. [3], Mansor et al. [4], Roy et al. [5], Kang et al. [6], and Al-Awadi et al. [7]. The gamma distribution can take on various shapes depending on the values of its shape and scale parameters, making it a versatile choice for modeling a wide range of data types.
Based on this family of distributions, a distribution akin to the gamma distribution is introduced in this manuscript. This novel distribution is defined through a stochastic representation, encompassing a linear relationship with a random variable following a distribution derived from the upper incomplete gamma function. This function is an extension of the complete gamma function and is particularly useful in probability and statistics, especially in contexts involving gamma distributions. The proposed distribution preserves the unimodal nature of the classical gamma distribution, while offering greater flexibility through adjustable skewness and kurtosis. This enhanced adaptability allows it to effectively capture data patterns characterized by light or heavy tails, thereby broadening its applicability to a wider range of real-world scenarios.
The organization of this work is as follows: In Section 2, a brief summary of the incomplete gamma function and the incomplete gamma distribution is provided. In Section 3, the generalized incomplete gamma distribution is derived and characterized in terms of its probability density function (PDF), cumulative distribution function (CDF), and moments, among other properties. In Section 4, the associated inference for the distribution parameters is conducted. Section 5 presents a Monte Carlo simulation study to analyze the asymptotic properties of the maximum likelihood estimators. In Section 6, two applications are presented to demonstrate the practical utility of this distribution. Finally, in Section 7, the conclusions are summarized, some limitations are discussed, and potential future research directions based on this work are proposed.

2. Background

The complete gamma function, denoted as Γ(α), can be extended through the incomplete gamma function (IGF), expressed as Γ(α, x), where it is important to note that Γ(α) = Γ(α, 0). For a complex parameter α with positive real part, Re[α] > 0, and a positive real variable x, the upper IGF is given by
$$\Gamma(\alpha, x) = \int_{x}^{\infty} t^{\alpha - 1} \exp(-t)\, dt. \tag{1}$$
The upper IGF is frequently encountered in probability and statistics, especially in scenarios involving gamma distributions. It offers a method for calculating the PDF and CDF for gamma-distributed random variables when the random variable surpasses a certain threshold x. In recent years, the IGF has been widely used for this purpose. Ozarslan and Ustaoglu [8] introduced an extended incomplete version of Pochhammer symbols in terms of generalized IGFs, while Reynolds and Stauffer [9] expressed definite integrals of hyperbolic and logarithmic functions in terms of this function. Jangid et al. [10] developed Lambert’s law involving the IGF, and From [11] presented new inequalities and limits involving the IGF, among other articles found in the literature in recent years. For more details about the IGF, refer to Abramowitz and Stegun [12].
Some properties of the upper IGF are listed below:
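  • $\Gamma(a + 1, x) = a\,\Gamma(a, x) + x^{a} \exp(-x)$;
  • $\Gamma(\alpha + r, x) = \dfrac{(\alpha + r - 1)!}{(\alpha - 1)!}\,\Gamma(\alpha, x) + x^{\alpha} \exp(-x) \left[ \dfrac{(\alpha + r - 1)!}{\alpha!} + \dfrac{(\alpha + r - 1)!}{(\alpha + 1)!}\,x + \cdots + (\alpha + r - 1)\,x^{r-2} + x^{r-1} \right]$, $r = 1, 2, 3, \ldots$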
Proof. 
Property 1 follows from the definition of the IGF via integration by parts, and Property 2 is obtained from Property 1 by induction on r = 1, 2, 3, ….    □
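Property 1 can be checked numerically. The snippet below is a minimal sketch, not the authors' code: `upper_gamma` is a hypothetical pure-Python helper (power series for small x, continued fraction otherwise) standing in for a library routine, and the test points are arbitrary.

```python
import math

def upper_gamma(a, x, eps=1e-14, max_iter=10_000):
    """Unnormalised upper incomplete gamma Γ(a, x) for real a > 0 and x >= 0."""
    if x == 0.0:
        return math.gamma(a)
    prefactor = math.exp(a * math.log(x) - x)  # x^a * exp(-x)
    if x < a + 1.0:
        # Power series for the lower IGF, then Γ(a, x) = Γ(a) - γ(a, x).
        term = total = 1.0 / a
        for k in range(1, max_iter):
            term *= x / (a + k)
            total += term
            if term < eps * total:
                break
        return math.gamma(a) - prefactor * total
    # Continued fraction (modified Lentz method) for x >= a + 1.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return prefactor * h

# Compare both sides of Γ(a + 1, x) = a Γ(a, x) + x^a exp(-x) at a few points.
for a, x in [(2.5, 1.3), (2.5, 4.0), (2.5, 5.0)]:
    lhs = upper_gamma(a + 1.0, x)
    rhs = a * upper_gamma(a, x) + x**a * math.exp(-x)
    print(f"a={a}, x={x}: lhs={lhs:.12f}, rhs={rhs:.12f}")
```

The three test points deliberately exercise the series branch, the continued-fraction branch, and a mix of the two, so agreement also serves as a sanity check on the helper itself.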
Based on Equation (1), the incomplete gamma (IG) distribution can be defined as follows:
Definition 1.
A random variable X follows an incomplete gamma distribution, with shape parameter α > 0 and support bound parameter q > 0, denoted as X ∼ IG(α, q), if its PDF is given by
$$f_X(x; \alpha, q) = \frac{x^{\alpha - 1} \exp(-x)}{\Gamma(\alpha, q)}, \quad q < x < \infty. \tag{2}$$
The CDF of X is given by
$$F_X(x; \alpha, q) = 1 - \frac{\Gamma(\alpha, x)}{\Gamma(\alpha, q)}, \quad q < x < \infty, \quad \alpha > 0.$$
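As a short sketch of why the CDF takes this form (a step added here for clarity, using only Equations (1) and (2)), the integral of the PDF over (q, x) can be written as a difference of two upper IGFs:

$$F_X(x; \alpha, q) = \int_{q}^{x} \frac{t^{\alpha - 1} \exp(-t)}{\Gamma(\alpha, q)}\, dt = \frac{\Gamma(\alpha, q) - \Gamma(\alpha, x)}{\Gamma(\alpha, q)} = 1 - \frac{\Gamma(\alpha, x)}{\Gamma(\alpha, q)}.$$

In particular, $F_X(q; \alpha, q) = 0$ and $F_X(x; \alpha, q) \to 1$ as $x \to \infty$, as required of a CDF supported on $(q, \infty)$.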
In Section 3, based on the IG distribution, we derive a new three-parameter distribution with support on the interval (0, ∞), making it a viable alternative for modeling positive data. The proposed distribution exhibits flexibility in skewness and kurtosis, providing an alternative to commonly used distributions that may fail to adequately capture empirical skewness and kurtosis levels.

3. The Proposed Model

In this section, we outline the stochastic representation of the generalized incomplete gamma (GIG) distribution, including its PDF, CDF, quantile function, moments, and various properties associated with the model.

3.1. Generalized Incomplete Gamma Distribution

Definition 2.
Let X ∼ IG(α, q). Then, the random variable Y follows a generalized incomplete gamma (GIG) distribution, with scale parameter β > 0, if it admits the stochastic representation
$$Y = \frac{X - q}{\beta}, \tag{3}$$
with α and q as specified in Equation (2). We denote this distribution as Y ∼ GIG(α, β, q).
Proposition 1.
Let Y be distributed according to a GIG distribution with parameters α, β, and q, i.e., Y ∼ GIG(α, β, q). Then, the PDF of the random variable Y is given by
$$f_Y(y; \alpha, \beta, q) = \frac{\beta^{\alpha} \left( y + \frac{q}{\beta} \right)^{\alpha - 1} \exp\left( -(\beta y + q) \right)}{\Gamma(\alpha, q)}, \quad y > 0,$$
where α > 0 and q > 0 are shape parameters, β > 0 is a scale parameter, and Γ(α, q) is defined in Equation (1).
Proof. 
From Equation (3), we have $y = g(x) = (x - q)/\beta$, so that $g^{-1}(y) = \beta y + q$. Then
$$f_Y(y; \alpha, \beta, q) = f_X\left( g^{-1}(y) \right) \left| \frac{d\, g^{-1}(y)}{dy} \right| = \beta\, \frac{[g^{-1}(y)]^{\alpha - 1} \exp\left( -g^{-1}(y) \right)}{\Gamma(\alpha, q)},$$
and replacing $g^{-1}(y) = \beta y + q$ yields the result.   □
Proposition 2.
Let Y ∼ GIG(α, β, q). Then, the CDF of Y is expressed as
$$F_Y(y; \alpha, \beta, q) = 1 - \frac{\Gamma(\alpha, \beta y + q)}{\Gamma(\alpha, q)}, \quad y > 0.$$
Proof. 
Both IGF functions are obtained immediately from their definitions.    □
Proposition 3.
Let Y ∼ GIG(α, β, q). Then, the hazard function (HF) of Y can be expressed as
$$h_Y(y; \alpha, \beta, q) = \frac{\beta^{\alpha} \left( y + \frac{q}{\beta} \right)^{\alpha - 1} \exp\left( -(\beta y + q) \right)}{\Gamma(\alpha, \beta y + q)}, \quad y > 0.$$
Proof. 
By employing the definition of the hazard function, i.e.,
$$h_Y(y; \alpha, \beta, q) = \frac{f_Y(y; \alpha, \beta, q)}{1 - F_Y(y; \alpha, \beta, q)},$$
the result follows immediately.    □
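The PDF, CDF, and HF of Propositions 1 to 3 translate directly into code. The following is a minimal sketch under stated assumptions: the function names are hypothetical, `upper_gamma` is a hand-rolled pure-Python stand-in for a library incomplete gamma routine, and the parameter values are arbitrary.

```python
import math

def upper_gamma(a, x, eps=1e-14, max_iter=10_000):
    """Unnormalised upper incomplete gamma Γ(a, x) for real a > 0 and x >= 0."""
    if x == 0.0:
        return math.gamma(a)
    prefactor = math.exp(a * math.log(x) - x)  # x^a * exp(-x)
    if x < a + 1.0:
        # Power series for the lower IGF, then Γ(a, x) = Γ(a) - γ(a, x).
        term = total = 1.0 / a
        for k in range(1, max_iter):
            term *= x / (a + k)
            total += term
            if term < eps * total:
                break
        return math.gamma(a) - prefactor * total
    # Continued fraction (modified Lentz method) for x >= a + 1.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return prefactor * h

def gig_pdf(y, alpha, beta, q):
    """PDF of Proposition 1 for y > 0; note β^α (y + q/β)^(α-1) = β (βy + q)^(α-1)."""
    s = beta * y + q
    return beta * s ** (alpha - 1.0) * math.exp(-s) / upper_gamma(alpha, q)

def gig_cdf(y, alpha, beta, q):
    """CDF of Proposition 2."""
    return 1.0 - upper_gamma(alpha, beta * y + q) / upper_gamma(alpha, q)

def gig_hazard(y, alpha, beta, q):
    """Hazard function of Proposition 3."""
    s = beta * y + q
    return beta * s ** (alpha - 1.0) * math.exp(-s) / upper_gamma(alpha, s)

# Arbitrary illustrative parameters.
for y in (0.25, 1.0, 3.0):
    print(y, gig_pdf(y, 3.0, 2.0, 2.0), gig_cdf(y, 3.0, 2.0, 2.0), gig_hazard(y, 3.0, 2.0, 2.0))

# With q = 0 and α = 1 the model is exponential, so the hazard is constant and equal to β.
print(gig_hazard(0.7, 1.0, 2.0, 0.0))
```

Computing the hazard from Γ(α, βy + q) directly, rather than as pdf / (1 − cdf), avoids the loss of precision that occurs when the CDF is close to 1.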
To illustrate the graphical performance of the GIG distribution and analyze the effects of parameters on the shape of the PDF, CDF, and HF, we explored its flexibility in terms of skewness and kurtosis. Figure 1, Figure 2 and Figure 3 display plots for the PDF, CDF, and HF, respectively.
In Figure 1 (top left panel), it can be seen that the parameter β controls the scale of the GIG distribution. In the top right and bottom panels, it can be seen that the parameters α and q control the decreasing and unimodal shape of the PDF, modifying the levels of asymmetry and kurtosis.
In Figure 2 (top right and bottom panels), it can be seen that the shape parameters determine the form of the CDF; as α and q increase, the CDF modifies its slope and curvature. Regarding the scale parameter β (top left panel), as it increases, the PDF becomes more compressed, causing a horizontal shift in the CDF. The shape of the curve does not change, but the distribution of values along the horizontal axis does.
Figure 3 shows that the HF of the GIG distribution can be increasing, decreasing, or constant, depending on the choice of parameters.
Remark 1.
Let Y ∼ GIG(α, β, q). The following distributions are special cases of the GIG distribution:
  • If q = 0, then Y ∼ GIG(α, β, q = 0) reduces to the gamma distribution, denoted as Y ∼ Gamma(α, β).
  • If q = 0 and α = 1, then Y ∼ GIG(α = 1, β, q = 0) reduces to the exponential distribution, denoted as Y ∼ Exp(β) (see Johnson [1]).
  • If q = 0, α = λ/2, and β = 1/2, then Y ∼ GIG(α = λ/2, β = 1/2, q = 0) reduces to the chi-squared distribution, denoted as χ²(λ, 1) (see Johnson [1]).
Figure 4 summarizes the relationships among the GIG distribution and the aforementioned special cases.
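The first special case can be verified in one line. As a sketch, setting q = 0 in the PDF of Proposition 1 and using Γ(α, 0) = Γ(α) gives

$$f_Y(y; \alpha, \beta, q = 0) = \frac{\beta^{\alpha}\, y^{\alpha - 1} \exp(-\beta y)}{\Gamma(\alpha)}, \quad y > 0,$$

which is the gamma density with shape α and rate β. Setting α = 1 then yields $\beta \exp(-\beta y)$, the exponential density, and α = λ/2, β = 1/2 yields the chi-squared density with λ degrees of freedom.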

3.2. Quantile Function

Next, we will introduce the quantile function of the GIG distribution and compute specific quantiles, such as quartiles. This function plays a key role in simulation studies, facilitating the generation of pseudo-random samples (see Algorithm 1).
Proposition 4.
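If Y ∼ GIG(α, β, q), then the quantile function of Y is given by
$$Q(p) = F_Y^{-1}(p; \alpha, \beta, q) = \beta^{-1} \left[ \Gamma^{-1}\left( \alpha, (1 - p)\, \Gamma(\alpha, q) \right) - q \right], \quad 0 < p < 1, \tag{4}$$
where $\Gamma^{-1}(\cdot, \cdot)$ denotes the inverse of the upper IGF with respect to its second argument.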
Proof. 
It follows from a direct computation, by applying the definition of quantile function.    □
Corollary 1.
The quartiles of the GIG distribution are given by
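  • (First quartile) $Q(0.25) = \beta^{-1} \left[ \Gamma^{-1}\left( \alpha, 0.75\, \Gamma(\alpha, q) \right) - q \right]$.
  • (Median) $Q(0.5) = \beta^{-1} \left[ \Gamma^{-1}\left( \alpha, 0.5\, \Gamma(\alpha, q) \right) - q \right]$.
  • (Third quartile) $Q(0.75) = \beta^{-1} \left[ \Gamma^{-1}\left( \alpha, 0.25\, \Gamma(\alpha, q) \right) - q \right]$.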
Algorithm 1 Procedure for generating pseudo-random samples from the GIG(α, β, q) distribution.
1: for i = 1, …, n do
2:     Generate a random number u from the standard uniform distribution U(0, 1).
3:     Compute y using the quantile function:
       $$y = Q(u) = F_Y^{-1}(u; \alpha, \beta, q) = \beta^{-1} \left[ \Gamma^{-1}\left( \alpha, (1 - u)\, \Gamma(\alpha, q) \right) - q \right],$$
       where Q(u) is given by Equation (4).
4:     Set y_i = y.
5: return {y_1, y_2, …, y_n}
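Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the inverse upper IGF is replaced by bisection on the CDF of Proposition 2, `upper_gamma` is a hypothetical pure-Python helper, and the parameter values and seed are arbitrary.

```python
import math
import random

def upper_gamma(a, x, eps=1e-14, max_iter=10_000):
    """Unnormalised upper incomplete gamma Γ(a, x) for real a > 0 and x >= 0."""
    if x == 0.0:
        return math.gamma(a)
    prefactor = math.exp(a * math.log(x) - x)  # x^a * exp(-x)
    if x < a + 1.0:
        # Power series for the lower IGF, then Γ(a, x) = Γ(a) - γ(a, x).
        term = total = 1.0 / a
        for k in range(1, max_iter):
            term *= x / (a + k)
            total += term
            if term < eps * total:
                break
        return math.gamma(a) - prefactor * total
    # Continued fraction (modified Lentz method) for x >= a + 1.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return prefactor * h

def gig_cdf(y, alpha, beta, q, norm):
    """CDF of Proposition 2; norm = Γ(α, q) is passed in to avoid recomputation."""
    return 1.0 - upper_gamma(alpha, beta * y + q) / norm

def gig_quantile(p, alpha, beta, q, tol=1e-10):
    """Solve F_Y(y) = p by bisection (a stand-in for the inverse upper IGF)."""
    norm = upper_gamma(alpha, q)
    lo, hi = 0.0, 1.0
    while gig_cdf(hi, alpha, beta, q, norm) < p:  # expand the bracket to the right
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gig_cdf(mid, alpha, beta, q, norm) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gig_sample(n, alpha, beta, q, seed=None):
    """Steps 1-5 of Algorithm 1: draw u ~ U(0, 1) and return Q(u), n times."""
    rng = random.Random(seed)
    return [gig_quantile(rng.random(), alpha, beta, q) for _ in range(n)]

sample = gig_sample(2000, alpha=3.0, beta=2.0, q=2.0, seed=1)
# For these parameters the theoretical mean is (Γ(4, 2)/Γ(3, 2) - 2)/2 = 0.9.
print(sum(sample) / len(sample))
```

Bisection is slower than a dedicated inverse incomplete gamma routine but needs only the CDF, which makes the sketch easy to audit; a library inverse would be the natural replacement in a large simulation.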

3.3. Properties and Moments

Next, we derive the raw moment of order r of the GIG distribution and utilize it to describe its skewness and kurtosis, providing a deeper characterization of the distribution’s shape and tail behavior.
Proposition 5.
Let Y ∼ GIG(α, β, q). For r = 1, 2, …, the r-th moment of Y is given by
$$\mu_r = \mathrm{E}[Y^r] = \frac{M_r(\alpha, q)}{\beta^{r}\, \Gamma(\alpha, q)},$$
where $M_r(\alpha, q) = \sum_{j=0}^{r} \binom{r}{j} (-1)^{j} q^{j}\, \Gamma(r + \alpha - j, q)$.
Proof. 
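Using the stochastic representation given in Equation (3), we obtain that
$$\mathrm{E}[Y^r] = \mathrm{E}\left[ \left( \frac{X - q}{\beta} \right)^{r} \right] = \frac{1}{\beta^{r}}\, \mathrm{E}\left[ (X - q)^{r} \right].$$
Applying the Binomial Theorem, the result is obtained. □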
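As a worked instance of Proposition 5 (a sketch added for illustration), take r = 1. Then $M_1(\alpha, q) = \Gamma(\alpha + 1, q) - q\,\Gamma(\alpha, q)$, and Property 1 of the upper IGF, $\Gamma(\alpha + 1, q) = \alpha\,\Gamma(\alpha, q) + q^{\alpha} \exp(-q)$, gives

$$\mathrm{E}[Y] = \frac{\Gamma(\alpha + 1, q) - q\,\Gamma(\alpha, q)}{\beta\, \Gamma(\alpha, q)} = \frac{1}{\beta} \left[ \alpha - q + \frac{q^{\alpha} \exp(-q)}{\Gamma(\alpha, q)} \right].$$

For q = 0 this reduces to α/β, the mean of the gamma distribution with shape α and rate β. For example, with α = 3, β = 2, and q = 2, one has $\Gamma(3, 2) = 10 \exp(-2)$, so that $\mathrm{E}[Y] = (3 - 2 + 0.8)/2 = 0.9$.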
Corollaries 2 and 3 present useful results derived from the first four moments, including explicit expressions for the mean, the variance, and the coefficients of skewness (CS) and kurtosis (CK).
Corollary 2.
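If Y ∼ GIG(α, β, q), then the first four moments are given by
  • $\mu_1 = \mathrm{E}[Y] = \dfrac{M_1(\alpha, q)}{\beta\, \Gamma(\alpha, q)}$.
  • $\mu_2 = \mathrm{E}[Y^2] = \dfrac{M_2(\alpha, q)}{\beta^{2}\, \Gamma(\alpha, q)}$.
  • $\mu_3 = \mathrm{E}[Y^3] = \dfrac{M_3(\alpha, q)}{\beta^{3}\, \Gamma(\alpha, q)}$.
  • $\mu_4 = \mathrm{E}[Y^4] = \dfrac{M_4(\alpha, q)}{\beta^{4}\, \Gamma(\alpha, q)}$.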
Corollary 3.
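Let Y ∼ GIG(α, β, q). Then the variance, CS, and CK are
$$V(Y) = \frac{\Gamma(\alpha, q)\, M_2 - M_1^{2}}{\beta^{2}\, \Gamma^{2}(\alpha, q)},$$
$$CS(Y) = \frac{\Gamma^{2}(\alpha, q)\, M_3 - 3\, \Gamma(\alpha, q)\, M_2 M_1 + 2 M_1^{3}}{\left[ \Gamma(\alpha, q)\, M_2 - M_1^{2} \right]^{3/2}},$$
$$CK(Y) = \frac{\Gamma^{3}(\alpha, q)\, M_4 - 4\, \Gamma^{2}(\alpha, q)\, M_1 M_3 + 6\, \Gamma(\alpha, q)\, M_1^{2} M_2 - 3 M_1^{4}}{\left[ \Gamma(\alpha, q)\, M_2 - M_1^{2} \right]^{2}},$$
where $M_r = M_r(\alpha, q)$.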
Proof. 
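By definition of the variance, CS, and CK, we have
$$V(Y) = \mu_2 - \mu_1^{2}; \quad CS(Y) = \frac{\mu_3 - 3 \mu_2 \mu_1 + 2 \mu_1^{3}}{[\mu_2 - \mu_1^{2}]^{3/2}} \quad \text{and} \quad CK(Y) = \frac{\mu_4 - 4 \mu_1 \mu_3 + 6 \mu_1^{2} \mu_2 - 3 \mu_1^{4}}{[\mu_2 - \mu_1^{2}]^{2}},$$
and substituting the moments of Corollary 2 yields the result.    □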
To explore the influence of the GIG distribution’s parameters on its shape, we analyze its CS and CK. Specifically, we examine the effect of varying α from 2 to 7, while setting q = 1 , 4 , 6 , 8 . The resulting skewness and kurtosis values are visualized graphically, and the corresponding coefficient values are presented in Table 1.
Figure 5 visually represents the CS and CK values presented in Table 1, revealing a clear relationship between the parameters α and q and the shape of the GIG distribution. Specifically, for α < 1 , the highest skewness and kurtosis values are observed at low values of q. Conversely, when α > 1 , the peak skewness and kurtosis values shift to higher values of q.

4. Parameter Estimation

In this section, we address the maximum likelihood (ML) estimation of the GIG distribution parameters.
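Let $Y_1, Y_2, \ldots, Y_n$ be a random sample from Y ∼ GIG(α, β, q), with observed values $y_1, \ldots, y_n$. The log-likelihood function for θ = (α, β, q) is given by
$$\ell(\theta) = n \alpha \log(\beta) + (\alpha - 1) \sum_{i=1}^{n} \log\left( y_i + \frac{q}{\beta} \right) - \beta \sum_{i=1}^{n} y_i - n q - n \log \Gamma(\alpha, q). \tag{5}$$
Taking partial derivatives with respect to α, β, and q, we obtain the elements of the score vector, $S(\theta) = \left( \partial \ell / \partial \alpha,\ \partial \ell / \partial \beta,\ \partial \ell / \partial q \right)^{\top}$, that is,
$$\frac{\partial \ell(\theta)}{\partial \alpha} = n \log(\beta) + \sum_{i=1}^{n} \log\left( y_i + \frac{q}{\beta} \right) - \frac{n\, \rho_1(\alpha, q)}{\Gamma(\alpha, q)}, \tag{6}$$
$$\frac{\partial \ell(\theta)}{\partial \beta} = \frac{n \alpha}{\beta} - [\alpha - 1] \sum_{i=1}^{n} \frac{q}{\beta\, [y_i \beta + q]} - \sum_{i=1}^{n} y_i, \tag{7}$$
$$\frac{\partial \ell(\theta)}{\partial q} = [\alpha - 1] \sum_{i=1}^{n} \frac{1}{[y_i \beta + q]} - n + \frac{n\, q^{\alpha - 1} \exp(-q)}{\Gamma(\alpha, q)}, \tag{8}$$
where $\rho_1(\alpha, q) = \int_{q}^{\infty} \log(x)\, x^{\alpha - 1} \exp(-x)\, dx$.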
The system S ( θ ) = 0 , defined by Equations (6)–(8), does not yield an explicit analytical solution. Consequently, numerical methods, such as the Newton–Raphson or quasi Newton–Raphson algorithm, are required to obtain the ML estimators (MLEs), denoted by θ ^ = ( α ^ , β ^ , q ^ ) . Alternatively, optimization techniques that directly maximize the log-likelihood function, for instance, the method proposed by MacDonald [13], may also be employed.
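Before handing the log-likelihood to a numerical optimizer, it is useful to confirm that the analytic score agrees with finite differences. The sketch below codes Equation (5) and the β and q components of the score, Equations (7) and (8), and compares them with central differences. The data values, function names, and the pure-Python `upper_gamma` helper are all hypothetical illustrations, not part of the original study.

```python
import math

def upper_gamma(a, x, eps=1e-14, max_iter=10_000):
    """Unnormalised upper incomplete gamma Γ(a, x) for real a > 0 and x >= 0."""
    if x == 0.0:
        return math.gamma(a)
    prefactor = math.exp(a * math.log(x) - x)  # x^a * exp(-x)
    if x < a + 1.0:
        # Power series for the lower IGF, then Γ(a, x) = Γ(a) - γ(a, x).
        term = total = 1.0 / a
        for k in range(1, max_iter):
            term *= x / (a + k)
            total += term
            if term < eps * total:
                break
        return math.gamma(a) - prefactor * total
    # Continued fraction (modified Lentz method) for x >= a + 1.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return prefactor * h

def gig_loglik(theta, ys):
    """Log-likelihood of Equation (5) for theta = (alpha, beta, q)."""
    alpha, beta, q = theta
    n = len(ys)
    return (n * alpha * math.log(beta)
            + (alpha - 1.0) * sum(math.log(y + q / beta) for y in ys)
            - beta * sum(ys) - n * q - n * math.log(upper_gamma(alpha, q)))

def score_beta(theta, ys):
    """Partial derivative with respect to beta, Equation (7)."""
    alpha, beta, q = theta
    return (len(ys) * alpha / beta
            - (alpha - 1.0) * sum(q / (beta * (y * beta + q)) for y in ys)
            - sum(ys))

def score_q(theta, ys):
    """Partial derivative with respect to q, Equation (8)."""
    alpha, beta, q = theta
    n = len(ys)
    return ((alpha - 1.0) * sum(1.0 / (y * beta + q) for y in ys)
            - n + n * q ** (alpha - 1.0) * math.exp(-q) / upper_gamma(alpha, q))

ys = [0.2, 0.5, 0.9, 1.4, 2.3]  # made-up positive observations
theta = (3.0, 2.0, 2.0)
step = 1e-6

# Central finite differences of the log-likelihood in beta and in q.
num_beta = (gig_loglik((3.0, 2.0 + step, 2.0), ys) - gig_loglik((3.0, 2.0 - step, 2.0), ys)) / (2 * step)
num_q = (gig_loglik((3.0, 2.0, 2.0 + step), ys) - gig_loglik((3.0, 2.0, 2.0 - step), ys)) / (2 * step)
print(num_beta, score_beta(theta, ys))
print(num_q, score_q(theta, ys))
```

The α component of the score is omitted because it requires the integral ρ₁(α, q), which would need an additional quadrature routine. The same `gig_loglik` function could be passed (negated) to any general-purpose optimizer to obtain the MLEs.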

Observed Information Matrix

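The asymptotic variance of the MLEs, denoted by $\hat{\theta} = (\hat{\alpha}, \hat{\beta}, \hat{q})$, can be estimated through the Fisher information matrix defined by
$$\mathcal{I}(\theta) = -\mathrm{E}\left[ \frac{\partial^{2} \ell(\theta)}{\partial \theta\, \partial \theta^{\top}} \right],$$
where $\ell(\theta)$ is given by Equation (5). Under suitable regularity conditions, the MLEs are asymptotically normally distributed, i.e.,
$$\mathcal{I}(\theta)^{1/2} \left( \hat{\theta} - \theta \right) \xrightarrow{D} N_3(\mathbf{0}_3, \mathbf{I}_3), \quad \text{as } n \to \infty,$$
where $\xrightarrow{D}$ denotes convergence in distribution, and $N_3(\mathbf{0}_3, \mathbf{I}_3)$ represents the standard trivariate normal distribution (see Casella and Berger [14], Chapter 10). Additionally, the Fisher information matrix $\mathcal{I}(\theta)$ can be approximated by the observed information matrix, computed from the second derivatives of the log-likelihood function as
$$I(\theta) = -\frac{\partial^{2} \ell(\theta)}{\partial \theta\, \partial \theta^{\top}},$$
whose entries are given by
$$I_{\alpha\alpha} = -\frac{\partial^{2} \ell(\theta)}{\partial \alpha^{2}}, \quad I_{\alpha\beta} = -\frac{\partial^{2} \ell(\theta)}{\partial \alpha\, \partial \beta}, \quad \text{and so forth}.$$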
Explicitly, we have
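$$
\begin{aligned}
I_{\alpha\alpha} &= \frac{n}{\Gamma^{2}(\alpha, q)} \left[ \rho_2(\alpha, q)\, \Gamma(\alpha, q) - \rho_1^{2}(\alpha, q) \right], \\
I_{\alpha\beta} &= -\frac{n}{\beta} + \frac{q}{\beta} \sum_{i=1}^{n} \frac{1}{[y_i \beta + q]}, \\
I_{\alpha q} &= -\sum_{i=1}^{n} \frac{1}{[y_i \beta + q]} + \frac{n\, q^{\alpha - 1} \exp(-q)}{\Gamma^{2}(\alpha, q)} \left[ \rho_1(\alpha, q) - \log(q)\, \Gamma(\alpha, q) \right], \\
I_{\beta\beta} &= \frac{n \alpha}{\beta^{2}} - [\alpha - 1] \sum_{i=1}^{n} \frac{q\, [q + 2 y_i \beta]}{\beta^{2}\, [y_i \beta + q]^{2}}, \\
I_{\beta q} &= [\alpha - 1] \sum_{i=1}^{n} \frac{\beta\, [y_i \beta + q] - q \beta}{\beta^{2}\, [y_i \beta + q]^{2}}, \\
I_{q q} &= [\alpha - 1] \sum_{i=1}^{n} \frac{1}{[y_i \beta + q]^{2}} - \frac{n\, q^{\alpha - 1} \exp(-q)}{\Gamma^{2}(\alpha, q)} \left[ \left( \frac{\alpha - 1}{q} - 1 \right) \Gamma(\alpha, q) + q^{\alpha - 1} \exp(-q) \right],
\end{aligned}
$$
where $\rho_2(\alpha, q) = \int_{q}^{\infty} \log^{2}(x)\, x^{\alpha - 1} \exp(-x)\, dx$.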
In practice, obtaining a closed-form expression for the expectation of the above derivatives is generally infeasible. Therefore, the covariance matrix of the MLEs, given by the inverse of the Fisher information matrix, is typically approximated by the inverse of the observed information matrix, $I(\hat{\theta})^{-1}$. The observed information matrix $I(\hat{\theta})$ is computed by evaluating the second derivatives of the log-likelihood function at the MLEs $\hat{\theta}$, explicitly given by
$$I(\hat{\theta}) = -\left. \frac{\partial^{2} \ell(\theta)}{\partial \theta\, \partial \theta^{\top}} \right|_{\theta = \hat{\theta}}.$$
Thus, the asymptotic variances of α ^ , β ^ , and q ^ are estimated by the diagonal elements of I ( θ ^ ) 1 , while their corresponding standard errors (SEs) are obtained as the square roots of these estimated variances. For further theoretical details underlying these results, see [15].
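As a worked illustration (a sketch, assuming q is held fixed as in Sections 5 and 6), the observed information reduces to the 2 × 2 block for (α, β), whose inverse is available in closed form. The estimated standard errors are then

$$\widehat{\mathrm{SE}}(\hat{\alpha}) = \sqrt{\frac{I_{\beta\beta}}{I_{\alpha\alpha} I_{\beta\beta} - I_{\alpha\beta}^{2}}}, \qquad \widehat{\mathrm{SE}}(\hat{\beta}) = \sqrt{\frac{I_{\alpha\alpha}}{I_{\alpha\alpha} I_{\beta\beta} - I_{\alpha\beta}^{2}}},$$

with all entries evaluated at $\hat{\theta}$. Under the asymptotic normality stated above, approximate 95% Wald-type confidence intervals take the form $\hat{\alpha} \pm 1.96\, \widehat{\mathrm{SE}}(\hat{\alpha})$ and $\hat{\beta} \pm 1.96\, \widehat{\mathrm{SE}}(\hat{\beta})$.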

5. Monte Carlo Simulation Study

In this section, we present a simulation study designed to evaluate the performance of the MLEs. To ensure model identifiability, we fix q = 2 and estimate only the parameters α and β. When q is fixed, the resulting GIG distribution is fully determined by two parameters: a shape parameter (α) and a scale parameter (β). This avoids potential identifiability issues that may arise when estimating all three parameters simultaneously, since different combinations of α, β, and q could produce similar distributional shapes. Moreover, the choice q = 2 defines a parsimonious model with the same number of parameters as the gamma distribution, while preserving sufficient flexibility to represent both light- and heavy-tailed behaviors through variations in α.
In particular, the heavy-tailed behavior of the GIG distribution makes it a competitive alternative to slash-type models, such as the slash Fréchet (SFr) [16] and slash half-normal (SHN) [17] distributions, which are commonly used to model data exhibiting substantial skewness and kurtosis.
Pseudo-random samples from the GIG ( α , β , q = 2 ) distribution are generated using the inverse transform method, as detailed in Algorithm 1.
We conducted a Monte Carlo simulation study to evaluate the finite-sample behavior of the MLEs for the GIG model. Specifically, we considered four values for α (6, 7, 8, and 9), three values for β (2, 3, and 4), and four sample sizes ( n = 50 , 100 , 150 , 200 ). For each combination of α , β , and n, we generated 1000 simulated datasets, computed the corresponding MLEs, and estimated their standard errors (SEs). Table 2 summarizes the results, including the mean bias (Bias), average SE, estimated root mean squared error (RMSE), and empirical coverage probability (CP) of 95% confidence intervals. Figure 6 presents the empirical distribution of the ML estimators.
Table 2 shows that as the sample size increased, the bias, SE, and RMSE all decreased, indicating the strong finite-sample performance of the MLEs for the GIG model. Additionally, the SE and RMSE became increasingly similar as the sample size grew, suggesting accurate estimation of the variance in the estimators. Finally, the CP approached the nominal 95% level as n increased, supporting the validity of asymptotic normality assumptions for the GIG model’s MLEs, even in moderate finite samples. Figure 6 illustrates that as the sample size increased, the estimation of the parameters α and β became more precise, converging towards their nominal values, as expected for maximum likelihood estimators.
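The summary measures reported in Table 2 can be computed along the following lines. This is a minimal sketch: the definitions used (bias as the mean estimate minus the true value, RMSE around the true value, and coverage of Wald-type 95% intervals) are the usual conventions and are assumed here, and the function name and toy numbers are hypothetical.

```python
import math

def mc_summary(estimates, std_errors, true_value, z=1.959964):
    """Bias, average SE, RMSE, and empirical coverage over Monte Carlo replicates."""
    reps = len(estimates)
    bias = sum(estimates) / reps - true_value
    mean_se = sum(std_errors) / reps
    rmse = math.sqrt(sum((est - true_value) ** 2 for est in estimates) / reps)
    # A replicate "covers" if the interval estimate ± z * SE contains the true value.
    covered = sum(
        1 for est, se in zip(estimates, std_errors)
        if est - z * se <= true_value <= est + z * se
    )
    return {"bias": bias, "se": mean_se, "rmse": rmse, "cp": covered / reps}

# Toy inputs standing in for replicate-level estimates of beta and their SEs.
toy_estimates = [1.9, 2.1, 2.2, 1.8]
toy_std_errors = [0.1, 0.1, 0.1, 0.1]
summary = mc_summary(toy_estimates, toy_std_errors, true_value=2.0)
print(summary)
```

In a full study, `estimates` and `std_errors` would hold the 1000 replicate-level MLEs and their estimated SEs for one combination of α, β, and n.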

6. Illustrations with Real Data

In this section, we illustrate the performance of the GIG model through two applications with real data, comparing its results against alternative approaches from the existing literature.

6.1. Illustration 1: Rubidium Concentrations Dataset

Rubidium (Rb) is a naturally occurring alkali metal widely distributed in the Earth’s crust, commonly found in minerals, soils, and water. Due to its chemical properties, Rb concentrations serve as valuable indicators of geological and environmental processes. The first dataset consists of Rb concentration measurements from n = 86 soil samples collected by the Mining Department at the University of Atacama, Chile. Table 3 presents key descriptive statistics, including the mean, median, standard deviation (SD), CS, and CK. As shown in this table, the Rb concentration data are strictly positive and exhibit pronounced right-skewness and elevated kurtosis, indicating a significant departure from normality. Given these distributional characteristics, particularly the strong asymmetry and heavy-tailed behavior, the GIG distribution emerges as a compelling choice for modeling these data, providing greater flexibility than standard distributions commonly applied in environmental studies.
Figure 7 presents a boxplot of the Rb concentration data, complementing the summary statistics reported in Table 3. The distribution shows pronounced right skewness, with the median located toward the lower end of the range and a markedly high observation near 400. This graphical evidence is consistent with the elevated values of CS and CK, further supporting the suitability of the GIG distribution, which effectively captures the observed asymmetry and heavy-tailed behavior in the data.
For this illustration, we focus on the GIG distribution with q = 2, whose PDF is given by
$$f_Y(y; \alpha, \beta, q = 2) = \frac{\beta^{\alpha} \left( y + \frac{2}{\beta} \right)^{\alpha - 1} \exp\left( -(\beta y + 2) \right)}{\Gamma(\alpha, 2)}, \quad y > 0.$$
To benchmark its performance, we compared this distribution with other two-parameter distributions, including the Gamma (G), Inverse Gaussian (IG), and slash Fréchet (SFr) distributions, as detailed in Castillo et al. [16]. The PDF of the SFr distribution is given by
$$f_Y(y; \alpha, \beta) = \frac{\beta}{y^{\beta + 1}}\, \Gamma\left( 1 - \frac{\beta}{\alpha},\ y^{-\alpha} \right), \quad y, \alpha, \beta > 0,$$
where α > β, and Γ(a, t) is defined in Equation (1). By comparing these distributions, we aimed to evaluate the flexibility of the GIG model in capturing the characteristics of the data, particularly in terms of skewness and heavy tails.
Statistical metrics and ML estimates were obtained for the GIG distribution and compared with those from the G, IG, and SFr distributions. In addition to these estimates, the corresponding SEs were computed. The performance of the GIG model was evaluated against the alternative distributions based on log-likelihood, the Akaike information criterion (AIC) [18], and Bayesian information criterion (BIC) [19].
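For reference, the model-comparison quantities can be computed from a fitted model as sketched below. The formulas are the standard definitions (AIC = 2k − 2ℓ, BIC = k log n − 2ℓ, and the one-sample KS statistic); the function names are hypothetical, and the log-likelihood value in the example is a made-up number, not a result from Table 4.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for maximised log-likelihood and k parameters."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for n observations."""
    return k * math.log(n) - 2.0 * loglik

def ks_statistic(data, cdf):
    """One-sample Kolmogorov-Smirnov statistic sup |F_n(x) - F(x)|."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at the i-th order statistic.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical log-likelihood for a two-parameter model fitted to n = 86 observations.
print(aic(-100.0, k=2), bic(-100.0, k=2, n=86))

# KS distance of three points from the U(0, 1) CDF, as a toy check.
print(ks_statistic([0.1, 0.4, 0.7], lambda x: x))
```

With q fixed at 2, the GIG model has k = 2 free parameters, the same as the G, IG, and SFr competitors, so differences in AIC and BIC between these models reflect differences in log-likelihood only.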
Table 4 presents the ML estimates, SEs, and the AIC and BIC values for each model. Furthermore, the Kolmogorov–Smirnov (KS) test was applied to assess the goodness of fit for each distribution to the Rb concentration data. The results indicate that the GIG distribution attained the lowest AIC and BIC values, suggesting the best overall fit. Additionally, the KS test confirmed that the GIG model provided a statistically adequate representation of the data at conventional significance levels.
Figure 8 (right) presents the histogram of Rb concentration data overlaid with the PDFs of the fitted models, while the right panel displays the empirical CDF alongside the theoretical CDFs of the considered models. These visual representations corroborate the findings in Table 4, further validating the suitability of the GIG model for this dataset. Figure 8 (left) displays the quantile–quantile (Q-Q) plot for the Rb concentration data fitted with the GIG distribution. The theoretical and sample quantiles align closely along the diagonal, indicating a good fit across both the central part and the tails of the distribution. This supports the ability of the GIG distribution to adequately model data that included outliers.
Figure 9 displays the profile of the log-likelihood function for the GIG distribution fitted to the data, as a function of the parameters α and β, with the other parameter held fixed at its maximum likelihood estimate. In both cases, a single well-defined maximum is observed, suggesting the uniqueness of the maximum likelihood estimator under this model. Furthermore, the smooth and concave shape around the maximum supports the numerical stability of the estimation procedure.

6.2. Illustration 2: ToothGrowth Dataset

We analyzed a dataset that records the length of odontoblasts, cells responsible for tooth growth, in 60 guinea pigs. Originally examined by Crampton [20], this dataset provides valuable insights into the effects of vitamin C on tooth development. It is available in the R programming language [21] under the name ToothGrowth, with the variable len representing odontoblast length.
In this second illustration, we compared the performance of the GIG model against the Slash Half-Normal (SHN) distribution, as introduced by Olmos et al. [17]. Let Z ∼ SHN(α, β) be a random variable with a PDF given by
$$f_Z(z; \alpha, \beta) = \beta \sqrt{\frac{2^{\beta}}{\pi}}\, \alpha^{\beta}\, \Gamma\left( \frac{\beta + 1}{2} \right) z^{-(\beta + 1)}\, G\left( z^{2},\ \frac{\beta + 1}{2},\ \frac{1}{2 \alpha^{2}} \right), \quad z, \alpha, \beta > 0,$$
where G(·, ·, ·) represents the CDF of the G distribution. This comparison aimed to evaluate the flexibility and suitability of these distributions in modeling the observed data.
Table 5 summarizes key descriptive statistics for the len variable in the ToothGrowth dataset. The data consist of positive values, with a mean of approximately 18.81 and a median of 19.25, indicating a slight asymmetry. The CS is slightly negative, implying mild left skewness, while the CK is close to 2, indicating moderate kurtosis.
Table 6 presents the ML estimates, along with their SEs in parentheses, for the parameters of the fitted distributions. Additionally, the table includes the maximum log-likelihood values, AIC and BIC values, as well as the KS test results for assessing goodness-of-fit. From these results, it is evident that the GIG distribution provided the best fit to the len data, as indicated by the lowest AIC and BIC values. Moreover, at conventional significance levels, the KS test results further support the adequacy of the GIG distribution in modeling the data.
Figure 10 (left) displays a histogram of the len data overlaid with the PDFs of the fitted models, while the right panel presents the CDF alongside the theoretical CDFs of the considered models. These graphical representations provide a visual validation of the results summarized in Table 6, further supporting the statistical conclusions.
The results from both applications highlight the flexibility and effectiveness of the GIG distribution in modeling positively skewed and heavy-tailed data. In the first application, the GIG distribution demonstrated superior performance in capturing the variability and asymmetry of the Rb concentration measurements in soil samples, as evidenced by lower AIC and BIC values and a strong agreement with empirical distributions. Similarly, in the second application, the GIG distribution provided the best statistical fit for the len data of the ToothGrowth dataset, outperforming alternative distributions in terms of goodness-of-fit criteria.
The visual analyses, including histograms overlaid with fitted PDFs and empirical vs. theoretical CDFs, further support these findings, reinforcing the suitability of the GIG model for these types of data. Given its flexibility in accommodating diverse distributional shapes, the GIG distribution emerges as a valuable tool for modeling real-world data in various scientific fields.
Figure 11 shows the profile log-likelihood functions for the parameters α (left) and β (right) of the GIG distribution fitted to the len data from the ToothGrowth dataset. In both cases, the curves exhibit a single, well-defined peak, which supports the existence and uniqueness of the maximum likelihood estimators. The smooth and concave shape of the profiles near the maximum further indicates the local identifiability and numerical stability of the estimation process.

7. Conclusions and Future Research Directions

In this work, we have introduced the generalized incomplete gamma distribution, a novel probability distribution derived from the upper incomplete gamma function. The proposed distribution is defined by three parameters: a scale parameter β , a shape parameter α , and an additional parameter q, providing notable flexibility for modeling positive data.
The GIG distribution exhibits remarkable adaptability in terms of skewness and kurtosis, making it well suited for datasets with varying degrees of asymmetry and both moderate and high kurtosis. This versatility positions it as a competitive alternative for modeling positive data across diverse applications.
A particularly interesting case arises when q = 2 , which reduces the model to a parsimonious two-parameter distribution (shape and scale). This configuration yields a heavy-tailed distribution, making it an appealing alternative to the gamma distribution and other commonly used two-parameter models, such as the inverse Gaussian. Moreover, as a heavy-tailed model, it competes favorably with slash-type distributions, including the slash Fréchet and slash half-normal distributions.
Parameter estimation for the GIG distribution was addressed using the maximum likelihood method. Simulation studies were conducted to evaluate the performance of the estimators in the special case q = 2, demonstrating that the maximum likelihood approach yielded satisfactory parameter estimates for the proposed distribution.
Furthermore, we presented two practical applications for the case q = 2, where the GIG distribution outperformed the gamma, inverse Gaussian, slash Fréchet, and slash half-normal distributions. These results highlight the potential of the GIG distribution for modeling data across various domains.
Despite its flexibility, a key limitation of the GIG distribution is the challenge of formulating a regression model based on the mean, which is the conventional approach for non-negative responses. This complexity stems from the analytical difficulty of expressing one of the shape parameters explicitly in terms of the mean, as noted in Corollary 2.
To address this issue, Proposition 1 shows that setting $f'(y; \alpha, \beta, q) = 0$ gives the mode of the GIG distribution as $(\alpha - 1 - q)/\beta$. This allows the reparameterization $\alpha = \delta\beta + q + 1$, where $\delta$ is the mode. The GIG probability density function can then be written in terms of the mode as
$$f_Y(y; \delta, \beta, q) = \frac{\beta^{\delta\beta + q + 1} \left( y + \frac{q}{\beta} \right)^{\delta\beta + q} \exp\{-(\beta y + q)\}}{\Gamma(\delta\beta + q + 1, q)}, \quad y > 0, \quad \delta, \beta, q > 0.$$
This reparameterization provides a foundation for constructing a regression model that quantifies the relationship between a set of explanatory variables and the mode of a positive response variable. Modal regression models are particularly advantageous, as they capture structural patterns that mean-based regression models may overlook. The latter often fail to detect key trends in the response, as illustrated by Chen et al. [22].
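The mode-based density can be checked numerically. The following minimal sketch, using only the Python standard library, evaluates $f_Y(y; \delta, \beta, q)$ on a grid and confirms that it integrates to about one and peaks at $\delta$. The function names are hypothetical, and the series used for $\Gamma(a, q)$ is a convenience choice that is adequate for moderate $q$ but loses accuracy when $\Gamma(a, q)/\Gamma(a)$ is very small.

```python
import math


def log_upper_gamma(a, x, terms=200):
    """log Gamma(a, x) via P(a, x) = exp(-x) x^a sum_n x^n / Gamma(a + n + 1)."""
    log_pref = a * math.log(x) - x
    p = sum(math.exp(log_pref + n * math.log(x) - math.lgamma(a + n + 1))
            for n in range(terms))
    return math.lgamma(a) + math.log1p(-p)   # Gamma(a, x) = Gamma(a) (1 - P)


def gig_mode_pdf(y, delta, beta, q):
    """GIG density parameterized by the mode delta, rate beta, and q."""
    a = delta * beta + q + 1.0               # alpha recovered from the mode
    log_f = (a * math.log(beta) + (a - 1.0) * math.log(y + q / beta)
             - (beta * y + q) - log_upper_gamma(a, q))
    return math.exp(log_f)


delta, beta, q = 2.0, 3.0, 2.0               # illustrative values only
step = 0.001
grid = [i * step for i in range(0, 20001)]   # y in [0, 20]
dens = [gig_mode_pdf(y, delta, beta, q) for y in grid]

mode_hat = grid[dens.index(max(dens))]       # grid point with the largest density
area = step * (sum(dens) - 0.5 * (dens[0] + dens[-1]))  # trapezoidal rule
print(mode_hat, area)
```

In a modal regression, $\delta$ would be linked to covariates, for example through a log link; that step is not shown here because the source leaves the model specification open.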
Future research could explore the applicability of the GIG distribution in other fields and extend the framework to incorporate covariate effects through regression models. The development of a modal regression model based on the GIG distribution remains an open avenue for future study.
In addition, we acknowledge several valuable directions for future research suggested during the review process. Inspired by Fang and Pan [23], a promising extension involves the construction of representative points for the proposed distribution, which could improve discrete approximations and reduce computational complexity in simulation-based applications. Moreover, following the contributions of Li and Fang [24], future work may also explore alternative estimation strategies (such as penalized likelihood and bias-corrected moment methods) to enhance the accuracy of parameter estimates, particularly in terms of reducing the mean squared error. Finally, the development of a Bayesian estimation framework for the GIG distribution represents a complementary research avenue, especially in contexts where prior information is relevant.

Author Contributions

Conceptualization, J.R., C.M., K.I.S. and Y.A.I.; Methodology, J.R., C.M., K.I.S. and Y.A.I.; Software, J.R., C.M., K.I.S. and Y.A.I.; Formal analysis, J.R., C.M., K.I.S. and Y.A.I.; Investigation, J.R., C.M., K.I.S. and Y.A.I.; Writing—original draft, J.R., C.M., K.I.S. and Y.A.I.; Writing—review & editing, J.R., C.M., K.I.S. and Y.A.I. All authors have read and agreed to the published version of the manuscript.

Funding

The research of K.I. Santoro was supported by the project SFP-24-002.

Data Availability Statement

The data presented in this study are openly available in the R dataset ToothGrowth at https://vincentarelbundock.github.io/Rdatasets/doc/datasets/ToothGrowth.html (accessed on 6 January 2025); see reference [20].

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions, 2nd ed.; John Wiley & Sons Ltd.: New York, NY, USA, 1994; Volume 1.
  2. Nadarajah, S.; Gupta, A.K. A generalized gamma distribution with application to drought data. Math. Comput. Simul. 2007, 74, 1–7.
  3. Husak, G.J.; Michaelsen, J.; Funk, C. Use of the gamma distribution to represent monthly rainfall in Africa for drought monitoring applications. Int. J. Climatol. 2007, 27, 935–944.
  4. Mansor, M.M.; Ibrahim, N.; Jamil, S.A.M.; Shafie, N.A.; Zahari, S.M. Visualising the Optimistic, Realistic, and Pessimistic Financial Distress Outlooks for Airport Operations in Malaysia. J. Cases Inf. Technol. 2023, 25, 1–20.
  5. Roy, M.; Brokamp, C.; Balachandran, S. Clustering and Regression-Based Analysis of PM2.5 Sensitivity to Meteorology in Cincinnati, Ohio. Atmosphere 2022, 13, 545.
  6. Kang, C.; Ji, M.; Sekiguchi, Y.; Naito, M.; Sato, C. A high-throughput technique to evaluate the probability distribution of strength of adhesively bonded joints after moisture absorption. J. Adhes. 2023, 101, 18–40.
  7. Al-Awadi, A.T.; Al-Saadi, R.J.M.; Mutasher, A.K.A. Frequency analysis of rainfall events in Karbala city, Iraq, by creating a proposed formula with eight probability distribution theories. Smart Sci. 2023, 11, 639–648.
  8. Ozarslan, M.A.; Ustaoglu, C. Extended Incomplete Version of Hypergeometric Functions. Filomat 2020, 34, 653–662.
  9. Reynolds, R.; Stauffer, A. Definite integral of exponential polynomial and hyperbolic function in terms of the incomplete gamma function. Eur. J. Pure Appl. Math. 2021, 34, 653–662.
  10. Jangid, K.; Purohit, S.D.; Suthar, D.L. A note on lambert’s law involving incompletea-functions. J. Sci. Arts 2022, 22, 91–96.
  11. From, S.G. Some probability theory-based inequalities for the incomplete gamma function. J. Inequalities Spec. Funct. 2023, 14, 1–15.
  12. Abramowitz, M.; Stegun, I.A. (Eds.) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables; See Section 6.5; Dover Publications: New York, NY, USA, 1972.
  13. MacDonald, I.L. Does Newton-Raphson really fail? Stat. Methods Med. Res. 2014, 23, 308–311.
  14. Casella, G.; Berger, R.L. Statistical Inference, 2nd ed.; Duxbury: Pacific Grove, CA, USA, 2002.
  15. Rohatgi, V.K.; Saleh, A.K.M.E. An Introduction to Probability Theory and Mathematical Statistics, 3rd ed.; John Wiley: New York, NY, USA, 2001.
  16. Castillo, J.S.; Rojas, M.A.; Reyes, J. A More Flexible Extension of the Fréchet Distribution Based on the Incomplete Gamma Function and Applications. Symmetry 2023, 15, 1608.
  17. Olmos, N.M.; Varela, H.; Gómez, H.W.; Bolfarine, H. An extension of the half-normal distribution. Stat. Pap. 2012, 53, 875–886.
  18. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
  19. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
  20. Crampton, E.W. The Growth of the Odontoblasts of the Incisor Tooth as a Criterion of the Vitamin C Intake of the Guinea Pig: Five Figures. J. Nutr. 1947, 33, 491–504.
  21. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2025; Available online: https://www.R-project.org/ (accessed on 5 March 2025).
  22. Chen, Y.C.; Genovese, C.R.; Tibshirani, R.J.; Wasserman, L. Nonparametric modal regression. Ann. Stat. 2016, 44, 489–514.
  23. Fang, K.-T.; Pan, J. A Review of Representative Points of Statistical Distributions and Their Applications. Mathematics 2023, 11, 2930.
  24. Li, Y.; Fang, K.-T. A new approach to parameter estimation of mixture of two normal distributions. Commun. Stat.-Simul. Comput. 2024, 53, 1161–1187.
Figure 1. Probability density functions of $Y \sim \mathrm{GIG}(\alpha, \beta, q)$ for the indicated parameter values.
Figure 2. Cumulative distribution functions of $Y \sim \mathrm{GIG}(\alpha, \beta, q)$ for the indicated parameter values.
Figure 3. Hazard function of $Y \sim \mathrm{GIG}(\alpha, \beta, q)$ for the indicated parameter values.
Figure 4. Special cases for the GIG distribution.
Figure 5. Coefficients of skewness and kurtosis of $Y \sim \mathrm{GIG}(\alpha, \beta, q)$ for the indicated parameter values.
Figure 6. Left: Empirical distribution of $\hat{\alpha}$ for different sample sizes and $\alpha = 9$ (vertical black dashed line), $\beta = 3$, and $q = 2$. Right: Empirical distribution of $\hat{\beta}$ for different sample sizes and $\alpha = 9$, $\beta = 3$ (vertical black dashed line), and $q = 2$.
Figure 7. Boxplot of Rb concentration data.
Figure 8. Left: Quantile–quantile plots of the Rb concentration data under the GIG and G distributions. Right: Histogram of Rb concentration data with fitted PDFs.
Figure 9. Profile log-likelihood functions for the parameters α (left) and β (right) of the GIG distribution fitted to the Rb concentration data.
Figure 10. Left: Histogram of the len data with fitted PDFs. Right: Empirical CDFs of the len data with fitted theoretical CDFs.
Figure 11. Profile log-likelihood functions for the parameters α (left) and β (right) of the GIG distribution fitted to the len data.
Table 1. Coefficients of skewness (CS) and kurtosis (CK) of $Y \sim \mathrm{GIG}(\alpha, \beta, q)$ for different values of $\alpha$ and $q$.
| $\alpha$ | CS, q = 1 | CS, q = 4 | CS, q = 6 | CS, q = 8 | CK, q = 1 | CK, q = 4 | CK, q = 6 | CK, q = 8 |
|---|---|---|---|---|---|---|---|---|
| 2 | 1.6198 | 1.8764 | 1.9255 | 1.9502 | 6.7959 | 8.1592 | 8.4724 | 8.6389 |
| 3 | 1.2946 | 1.7282 | 1.8346 | 1.8898 | 5.3806 | 7.2601 | 7.8744 | 8.2225 |
| 4 | 1.0617 | 1.5601 | 1.7268 | 1.8176 | 4.6137 | 6.3676 | 7.2241 | 7.7540 |
| 5 | 0.9142 | 1.3804 | 1.6032 | 1.7329 | 4.2228 | 5.5506 | 6.5487 | 7.2416 |
| 6 | 0.8213 | 1.2005 | 1.4663 | 1.6358 | 4.0029 | 4.8656 | 5.8825 | 6.6984 |
| 7 | 0.7569 | 1.0327 | 1.3206 | 1.5270 | 3.8573 | 4.3432 | 5.2618 | 6.1427 |
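The entries of Table 1 can be reproduced from raw moments. The following minimal sketch assumes that $X = \beta Y + q$ has density $x^{\alpha - 1} e^{-x} / \Gamma(\alpha, q)$ on $(q, \infty)$, so that $\mathrm{E}[X^r] = \Gamma(\alpha + r, q)/\Gamma(\alpha, q)$; since skewness and kurtosis are unchanged by the linear map from $X$ to $Y$, $\beta$ plays no role. It is restricted to integer $\alpha$, for which $\Gamma(n, q)$ has a finite-sum form, and the function names are hypothetical.

```python
import math


def upper_gamma_int(n, q):
    """Gamma(n, q) for a positive integer n: (n-1)! e^{-q} sum_{k<n} q^k / k!."""
    return (math.factorial(n - 1) * math.exp(-q)
            * sum(q ** k / math.factorial(k) for k in range(n)))


def skew_kurt(alpha, q):
    """Skewness and kurtosis coefficients for integer alpha and q > 0."""
    base = upper_gamma_int(alpha, q)
    # Raw moments of X = beta * Y + q (assumed truncated-gamma form).
    m1, m2, m3, m4 = (upper_gamma_int(alpha + r, q) / base for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3                       # third central moment
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4    # fourth central moment
    return mu3 / var ** 1.5, mu4 / var ** 2


cs, ck = skew_kurt(2, 1)
print(round(cs, 4), round(ck, 4))  # 1.6198 6.7959, the alpha = 2, q = 1 entries
```

For non-integer $\alpha$ the finite sum no longer applies and a numerical incomplete gamma routine would be needed instead.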
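Table 2. Estimated bias, SEs, and RMSE for MLEs in finite samples from the GIG model.
| $\alpha$ | $\beta$ | Estim. | Bias (n = 50) | SE (n = 50) | RMSE (n = 50) | CP (n = 50) | Bias (n = 100) | SE (n = 100) | RMSE (n = 100) | CP (n = 100) | Bias (n = 150) | SE (n = 150) | RMSE (n = 150) | CP (n = 150) | Bias (n = 200) | SE (n = 200) | RMSE (n = 200) | CP (n = 200) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9 | 3 | $\hat{\alpha}$ | 0.3065 | 1.1841 | 1.2799 | 95.3 | 0.1504 | 0.8144 | 0.8616 | 95.5 | 0.1044 | 0.6589 | 0.6640 | 95.3 | 0.0633 | 0.5668 | 0.5753 | 95.1 |
| 9 | 3 | $\hat{\beta}$ | 0.1528 | 0.5174 | 0.5677 | 95.3 | 0.0724 | 0.3546 | 0.3805 | 94.7 | 0.0499 | 0.2867 | 0.2914 | 95.6 | 0.0298 | 0.2464 | 0.2518 | 95.3 |
| 9 | 2 | $\hat{\alpha}$ | 0.3065 | 1.1841 | 1.2799 | 95.3 | 0.1506 | 0.8144 | 0.8617 | 95.5 | 0.1045 | 0.6589 | 0.6639 | 95.4 | 0.0633 | 0.5668 | 0.5753 | 95.1 |
| 9 | 2 | $\hat{\beta}$ | 0.1019 | 0.3449 | 0.3785 | 95.4 | 0.0483 | 0.2364 | 0.2537 | 94.7 | 0.0333 | 0.1912 | 0.1942 | 95.6 | 0.0198 | 0.1643 | 0.1679 | 95.3 |
| 8 | 3 | $\hat{\alpha}$ | 0.2285 | 1.0162 | 1.0902 | 95.5 | 0.1121 | 0.7012 | 0.7364 | 95.5 | 0.0776 | 0.5678 | 0.5697 | 95.8 | 0.0446 | 0.4889 | 0.4945 | 94.4 |
| 8 | 3 | $\hat{\beta}$ | 0.1456 | 0.5145 | 0.5587 | 95.6 | 0.0724 | 0.3536 | 0.3755 | 95.0 | 0.0520 | 0.2862 | 0.2882 | 95.4 | 0.0328 | 0.2462 | 0.2489 | 95.1 |
| 8 | 2 | $\hat{\alpha}$ | 0.2285 | 1.0162 | 1.0902 | 95.5 | 0.1123 | 0.7012 | 0.7367 | 95.5 | 0.0779 | 0.5678 | 0.5704 | 95.7 | 0.0446 | 0.4889 | 0.4944 | 94.4 |
| 8 | 2 | $\hat{\beta}$ | 0.0971 | 0.3430 | 0.3725 | 95.6 | 0.0484 | 0.2358 | 0.2504 | 95.0 | 0.0348 | 0.1908 | 0.1923 | 95.4 | 0.0218 | 0.1641 | 0.1659 | 95.1 |
| 7 | 2 | $\hat{\alpha}$ | 0.1455 | 0.8809 | 0.9237 | 95.7 | 0.0673 | 0.6110 | 0.6305 | 95.9 | 0.0413 | 0.4954 | 0.4918 | 95.9 | 0.0164 | 0.4272 | 0.4292 | 94.6 |
| 7 | 2 | $\hat{\beta}$ | 0.1012 | 0.3506 | 0.3691 | 95.9 | 0.0587 | 0.2423 | 0.2507 | 95.7 | 0.0459 | 0.1963 | 0.1942 | 96.3 | 0.0339 | 0.1690 | 0.1678 | 95.7 |
| 7 | 3 | $\hat{\alpha}$ | 0.1457 | 0.8809 | 0.9236 | 95.7 | 0.0672 | 0.6110 | 0.6305 | 95.9 | 0.0413 | 0.4953 | 0.4917 | 95.9 | 0.0163 | 0.4272 | 0.4291 | 94.6 |
| 7 | 3 | $\hat{\beta}$ | 0.1519 | 0.5259 | 0.5537 | 95.9 | 0.0880 | 0.3634 | 0.3761 | 95.7 | 0.0688 | 0.2944 | 0.2913 | 96.3 | 0.0508 | 0.2536 | 0.2516 | 95.7 |
| 6 | 3 | $\hat{\alpha}$ | 0.0629 | 0.7977 | 0.8082 | 96.8 | 0.010 | 0.5554 | 0.5582 | 95.8 | −0.009 | 0.451 | 0.4407 | 96.6 | −0.027 | 0.391 | 0.3894 | 94.6 |
| 6 | 3 | $\hat{\beta}$ | 0.2235 | 0.5741 | 0.5845 | 97.2 | 0.1642 | 0.3984 | 0.4074 | 96.2 | 0.1457 | 0.3232 | 0.3257 | 96.2 | 0.1287 | 0.2787 | 0.2858 | 95.3 |
| 6 | 2 | $\hat{\alpha}$ | 0.0629 | 0.7977 | 0.8082 | 96.8 | 0.010 | 0.5556 | 0.5582 | 95.8 | −0.009 | 0.451 | 0.4408 | 96.6 | −0.027 | 0.390 | 0.3894 | 94.6 |
| 6 | 2 | $\hat{\beta}$ | 0.1490 | 0.3827 | 0.3897 | 97.2 | 0.1095 | 0.2656 | 0.2716 | 96.2 | 0.0971 | 0.2155 | 0.2172 | 96.2 | 0.0858 | 0.1858 | 0.1905 | 95.3 |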
Table 3. Descriptive statistics for Rb concentration dataset.
| n | Minimum | Mean | Median | SD | CS | CK | Maximum |
|---|---|---|---|---|---|---|---|
| 86 | 2.000 | 88.570 | 84.000 | 55.468 | 2.233 | 13.750 | 406.000 |
Table 4. ML estimates (with SEs in parentheses), maximum log-likelihood values, AIC and BIC values, and KS goodness-of-fit test results for the distributions fitted to the Rb concentration data.
| Parameters | GIG | Gamma | IG | SFr |
|---|---|---|---|---|
| $\hat{\alpha}$ | 6.075 (0.597) | 2.346 (0.335) | 88.582 (10.528) | 0.724 (0.118) |
| $\hat{\beta}$ | 0.047 (0.006) | 0.026 (0.004) | 72.953 (11.125) | 0.336 (0.046) |
| log-likelihood | −454.368 | −457.400 | −486.573 | −561.741 |
| AIC | 912.715 | 918.800 | 977.146 | 1127.481 |
| BIC | 917.624 | 923.709 | 982.055 | 1132.390 |
| KS statistic | 0.080 | 0.118 | 0.251 | 0.462 |
| KS p-value | 0.647 | 0.185 | <0.001 | <0.001 |
Table 5. Descriptive statistics for the len data of the ToothGrowth dataset.
| n | Minimum | Mean | Median | SD | CS | CK | Maximum |
|---|---|---|---|---|---|---|---|
| 60 | 4.200 | 18.813 | 19.250 | 7.649 | 0.146 | 2.024 | 33.900 |
Table 6. ML estimates (with SEs in parentheses), maximum log-likelihood values, AIC and BIC values, and KS goodness-of-fit test results for the distributions fitted to the len data.
| Parameters | GIG | G | IG | SHN |
|---|---|---|---|---|
| $\hat{\alpha}$ | 8.860 (0.997) | 4.903 (0.866) | 18.813 (1.285) | 14.101 (1.540) |
| $\hat{\beta}$ | 0.364 (0.054) | 0.260 (0.049) | 67.217 (12.272) | 1.910 (0.268) |
| log-likelihood | −208.256 | −209.224 | −213.520 | −230.495 |
| AIC | 420.512 | 422.448 | 431.040 | 464.990 |
| BIC | 424.701 | 426.637 | 435.229 | 469.179 |
| KS statistic | 0.119 | 0.129 | 0.153 | 0.225 |
| KS p-value | 0.361 | 0.269 | 0.122 | 0.005 |
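The information criteria in Table 6 follow directly from the maximized log-likelihood. The following minimal sketch recomputes the GIG column, assuming $k = 2$ estimated parameters (with $q$ fixed at 2) and $n = 60$ observations as in Table 5. The function names `aic` and `bic` are hypothetical helpers.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion: k * log(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2 * loglik


loglik_gig = -208.256   # maximized log-likelihood of the GIG fit (Table 6)
k, n = 2, 60            # assumed: alpha and beta estimated, q fixed; n from Table 5

aic_gig = aic(loglik_gig, k)
bic_gig = bic(loglik_gig, k, n)
print(round(aic_gig, 3), round(bic_gig, 3))  # 420.512 424.701
```

Since all four fitted models have two estimated parameters, ranking them by AIC or BIC is equivalent to ranking them by log-likelihood.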
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Reyes, J.; Marchant, C.; Santoro, K.I.; Iriarte, Y.A. A Versatile Distribution Based on the Incomplete Gamma Function: Characterization and Applications. Mathematics 2025, 13, 1749. https://doi.org/10.3390/math13111749
