1. Introduction
Bounded distributions on the unit interval (0, 1) have garnered considerable attention in the statistical literature due to their applicability in modeling proportions, rates, and probabilities. The beta distribution remains one of the most widely used models in this class owing to its flexibility in shape and tractable mathematical properties; see Gupta and Nadarajah [1]. However, despite its utility, the beta distribution may not be sufficient to capture data with more complex features such as multimodality or non-monotonic hazard rates; see Lai and Jones [2]. To address these limitations, several generalizations and alternative bounded models have been proposed. The Kumaraswamy distribution, introduced by Kumaraswamy [3], offers similar flexibility to the beta distribution while providing a closed-form cumulative distribution function, which is computationally advantageous in many applications. Subsequent extensions, including the generalized beta, McDonald distribution, and various transformed models, have sought to improve tail behavior, accommodate different skewness structures, or simplify inference. Although classical distributions such as the beta and Kumaraswamy families have long served as foundational tools in this context, there has been continued interest in developing new bounded models that offer greater flexibility and improved fit across a wide range of applications.
More recently, research has focused on transforming well-known continuous distributions into the unit interval to develop new families of bounded distributions. This approach leverages the structural properties of established distributions while adapting them for modeling unit-bounded data. For example, transformations of the Weibull, log-logistic, and Burr families have led to bounded analogues with enhanced flexibility and improved modeling capabilities. See, for example, unit-Weibull (by Mazucheli et al. [4]), unit-logistic (by Menezes et al. [5]), unit-Birnbaum–Saunders (by Mazucheli et al. [6]), unit-gamma (by Dey et al. [7]), unit-Gompertz (by Mazucheli et al. [8]), unit-Lindley (by Mazucheli et al. [9]), unit-extended-Weibull (by Guerra et al. [10]), unit-Teissier (by Krishna et al. [11]), unit-log–log (by Korkmaz and Korkmaz [12]), power new power function (by Karakaya et al. [13]), and unit-Zeghdoudi (by Bashiru et al. [14]), among others. It should be noted that most of these distributions have more than two parameters, which can lead to biased estimates, particularly when the sample size is small.
The Gompertz–Makeham (GM) distribution was originally introduced by Benjamin Gompertz in 1825 and William Makeham in 1860. It is frequently used to construct growth models, analyze insurance data, and describe human mortality; see Marshall and Olkin [15]. This distribution is well known in demography, actuarial science, and reliability engineering due to its ability to capture varying hazard rate structures and aging properties. It has also received recent attention in various real-world applications; for example, see Jodrá [16], Missov and Lenart [17], and Wang and Guo [18], among others. Moreover, the GM model is identifiable for all values of its parameters, as recently noted by Castellares et al. [19].
Suppose that Y is the lifetime random variable of a test subject following the GM distribution, with corresponding cumulative distribution function (CDF) and probability density function (PDF).
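For reference, one common parameterization of the GM distribution is sketched below. The symbols $\alpha$, $\beta$, and $\lambda$ are assumptions made here for illustration and may not match the notation used elsewhere in this paper. Starting from the hazard rate $h(y) = \lambda + \alpha e^{\beta y}$ with $\alpha, \beta > 0$ and $\lambda \ge 0$, integrating the hazard gives the cumulative hazard $H(y) = \lambda y + (\alpha/\beta)(e^{\beta y} - 1)$, and hence the CDF $F = 1 - e^{-H}$ and the PDF $f = h\,e^{-H}$:

```latex
F(y) = 1 - \exp\!\left\{ -\lambda y - \frac{\alpha}{\beta}\left(e^{\beta y} - 1\right) \right\}, \qquad y > 0,

f(y) = \left(\lambda + \alpha e^{\beta y}\right)
       \exp\!\left\{ -\lambda y - \frac{\alpha}{\beta}\left(e^{\beta y} - 1\right) \right\}, \qquad y > 0.
```

Setting $\lambda = 0$ in this form recovers the Gompertz distribution, which is why the Makeham term is often read as an age-independent component of the hazard.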
Despite the usefulness of existing bounded models such as the beta and Kumaraswamy distributions, they exhibit important shortcomings in practice. For instance, the beta distribution, while highly flexible, cannot adequately capture multimodal behaviors or complex hazard rate shapes, and the Kumaraswamy distribution, although computationally convenient, shares similar limitations. More general alternatives, such as the generalized beta or McDonald families, improve tail behavior and skewness but introduce multiple parameters, which can complicate inference and lead to unstable estimates with small or moderate sample sizes. Recently proposed unit distributions—such as the unit-Weibull, unit-gamma, and unit-Birnbaum–Saunders—expand the modeling toolbox; however, most of them cannot simultaneously accommodate decreasing, increasing, bathtub-shaped, and inverted-bathtub-shaped hazard rate functions (HRFs). The proposed unit-GM (UGM) distribution addresses these limitations by retaining the structural advantages of the classical GM model while adapting it to the unit interval. Consequently, the UGM model combines tractability with enhanced flexibility, offering diverse density and hazard rate forms within a parsimonious three-parameter framework.
To address these limitations, we extend the GM random variable to the unit interval and introduce a new unit distribution derived from the traditional GM distribution, hereafter referred to as the UGM distribution. This construction employs a transformation-based approach designed to preserve the desirable aging properties of the original distribution while adapting it to model variables bounded in (0, 1). Beyond its theoretical appeal, the UGM distribution offers several practical advantages. It provides closed-form expressions for the cumulative distribution and survival functions, as well as tractable forms for its moments and entropy, making it particularly suitable for likelihood-based inference and model comparison. Furthermore, through real data applications and Monte Carlo simulations, we demonstrate that the UGM distribution can outperform existing bounded models, including the beta, Kumaraswamy, unit-Weibull, unit-gamma, and unit-Gompertz distributions, across multiple goodness-of-fit metrics.
Because the PDF and CDF of the base GM lifespan model are tractable, the UGM distribution is applicable across a wide range of areas, particularly in life testing and reliability studies. Moreover, it can capture increasing, decreasing, bathtub-shaped, and inverted-bathtub-shaped HRF patterns, which are often challenging to describe adequately using conventional models. The main contributions of this study can be summarized as follows:
Introduction of a novel probability distribution that serves as an effective tool for modeling real-world data, with the potential to outperform existing distributions in capturing decreasing and inverted-bathtub-shaped hazard rate patterns.
Comprehensive investigation of the new distribution’s key properties, including skewness, kurtosis, moments, and tail behavior, which are essential for understanding its characteristics and potential applicability across various domains.
Parameter estimation using eight classical methods, providing a thorough evaluation of their accuracy and efficiency and offering guidance for selecting suitable techniques in practical applications.
A simulation study assessing the performance of the estimation methods under different scenarios using standard statistical accuracy measures.
Application to two engineering datasets—one from oil reservoirs and the other from mechanical components—demonstrating the practical utility of the model and its potential to achieve better fit than existing bounded distributions.
The remainder of this manuscript is organized as follows. Section 2 formally introduces the UGM model. Section 3 presents its fundamental statistical properties. Section 4 addresses parameter estimation using eight classical methods. Section 5 presents a Monte Carlo simulation study. Section 6 applies the model to two engineering datasets. Finally, Section 7 provides concluding remarks.
5. Simulation Study
This section uses a Monte Carlo simulation framework to investigate the behavior of the proposed estimators under various controlled conditions. To this end, 5000 synthetic datasets are generated for each considered scenario, with sample sizes n = 20, 50, 100, 150, and 200. The data are drawn from the UGM distribution under different configurations of the model parameters to capture a range of distributional shapes. In particular, we examine two values of each of the three unknown parameters, arranged in the following five parameter combinations:
Set-1: ;
Set-2: ;
Set-3: ;
Set-4: ;
Set-5: .
All computational analyses were carried out using the R (v 4.2.2) software environment. The values of the three parameters were chosen to encompass a wide range of distributional shapes of the UGM model. In particular, smaller values of generate highly skewed and heavy-tailed distributions, whereas larger values yield more symmetric and lighter-tailed forms. This selection allows the simulation study to evaluate estimator performance under both extreme and moderate scenarios, thereby providing a comprehensive assessment of robustness and accuracy. It is worth noting that the starting points of the three parameters were set to their true values. Alternatively, moment-based or percentile-based initialization, together with grid-search refinement, can be used to assign suitable starting points for each parameter.
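A Monte Carlo design of this kind can be sketched as follows. Because the UGM density and quantile function are defined in Section 2 and are not repeated here, the sketch swaps in a one-parameter stand-in on (0, 1), the power-function distribution with CDF F(x) = x^a, whose MLE is available in closed form. The names `sample_power`, `mle_power`, and `simulate` are hypothetical, and the replication count is reduced from 5000 to 1000 only to keep the example quick.

```python
import math
import random


def sample_power(a, n, rng):
    """Draw n values on (0, 1] from the stand-in CDF F(x) = x**a by inversion."""
    # 1 - rng.random() lies in (0, 1], which avoids log(0) later on.
    return [(1.0 - rng.random()) ** (1.0 / a) for _ in range(n)]


def mle_power(xs):
    """Closed-form MLE of a for F(x) = x**a: a_hat = -n / sum(log x_i)."""
    return -len(xs) / sum(math.log(x) for x in xs)


def simulate(a_true, sizes, reps, seed=1):
    """Return {n: [a_hat_1, ..., a_hat_reps]} for each sample size n."""
    rng = random.Random(seed)
    return {
        n: [mle_power(sample_power(a_true, n, rng)) for _ in range(reps)]
        for n in sizes
    }


# Sample sizes follow the study; a_true = 2.0 is an arbitrary illustrative value.
estimates = simulate(a_true=2.0, sizes=(20, 50, 100, 150, 200), reps=1000)
mean_at_200 = sum(estimates[200]) / len(estimates[200])
```

For the UGM model itself, `sample_power` would be replaced by a sampler based on its quantile function and `mle_power` by a numerical optimizer started from the chosen initial values.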
A comparative evaluation of all estimation procedures was conducted using three standard error metrics: mean squared error (MSE), mean absolute bias (MAB), and relative absolute bias (RAB). For each method, the values of these metrics for the three parameters, along with the total ranks (TRs) and order ranks (ORs), are presented in Tables 2–6. Across all experimental conditions, the results show a clear trend of improved estimator accuracy with larger sample sizes n, which is in line with the consistency of the estimators for all parameters. Notably, the ML and MPS methods consistently provide more accurate and stable estimates than the other approaches.
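The three accuracy measures can be sketched as below, assuming their usual definitions: MSE as the average squared error, MAB as the average absolute error, and RAB as MAB divided by the true value. The exact formulas used in the study are not restated in this section, so these definitions, the function name `accuracy`, and the four estimates are illustrative assumptions.

```python
def accuracy(estimates, true_value):
    """Return (MSE, MAB, RAB) of a list of estimates of one parameter."""
    errors = [est - true_value for est in estimates]
    n = len(errors)
    mse = sum(e * e for e in errors) / n      # mean squared error
    mab = sum(abs(e) for e in errors) / n     # mean absolute bias
    rab = mab / abs(true_value)               # bias relative to the true value
    return mse, mab, rab


# Hypothetical estimates of a parameter whose true value is 2.0.
mse, mab, rab = accuracy([1.8, 2.1, 2.4, 1.9], true_value=2.0)
```

In a full study, `accuracy` would be applied to the replicated estimates of each parameter, for each method and each sample size.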
To summarize overall performance across the different parameter settings, mean total ranks (MTRs) and mean order ranks (MORs) were calculated and are reported in Table 7. This table shows that the MLE outperforms the other estimators in most scenarios, with the MPSE ranking close behind. Ranked by average performance, from most to least effective, the estimation strategies are: MLE, MPSE, CRVME, OLSE, WLSE, PCE, ADE, and RADE. These rankings agree with the individual results presented in Tables 2–6. Taking Set-1 as an example, Figure 2 displays the simulated MSE, MAB, and RAB values for the three parameters. The fitted lines shown in Figure 2 correspond to the estimation methods and support the same findings as Table 2.
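One plausible way to obtain total and order ranks is a rank-sum scheme: rank the methods within each metric (smallest value first), add the ranks to get a total rank, and then rank the totals. How the TRs and ORs were computed is not spelled out in this section, so the scheme, the function names, and all numbers below are illustrative assumptions.

```python
def ranks(values):
    """Rank {method: value}; the smallest value gets rank 1 (ties share the lower rank)."""
    ordered = sorted(values.values())
    return {method: ordered.index(v) + 1 for method, v in values.items()}


def total_and_order_ranks(table):
    """table is {metric: {method: value}}; return (total ranks, order ranks)."""
    methods = next(iter(table.values())).keys()
    total = {method: 0 for method in methods}
    for metric_values in table.values():
        for method, r in ranks(metric_values).items():
            total[method] += r
    return total, ranks(total)


# Hypothetical metric values for three of the eight methods.
table = {
    "MSE": {"MLE": 0.010, "MPSE": 0.012, "OLSE": 0.020},
    "MAB": {"MLE": 0.080, "MPSE": 0.075, "OLSE": 0.110},
    "RAB": {"MLE": 0.040, "MPSE": 0.045, "OLSE": 0.060},
}
total, order = total_and_order_ranks(table)
```

Averaging such ranks over the five parameter sets would then give quantities analogous to the MTRs and MORs.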
To summarize, we recommend using the ML method as the preferred approach in practical applications involving the UGM distribution, with the MPS method serving as a reliable alternative when ML is unsuitable or computationally intensive.
6. Real-World Applications
This section examines two distinct engineering applications to demonstrate the practical utility of the proposed inferential methods and to highlight the advantages of the UGM model in capturing various real-world scenarios. These applications are as follows:
Application 1: This application analyzes twelve core samples extracted from petroleum reservoirs, each obtained across four distinct cross sections, resulting in a total of forty-eight observations. For each core sample, permeability was measured as the primary response variable. Additionally, three key geometric properties were recorded at the cross-section level: total pore area, total pore perimeter, and pore shape. These variables are critical for characterizing the microstructural features influencing fluid flow through the reservoir rock and provide a basis for investigating the relationship between pore geometry and permeability. The petroleum reservoirs dataset was first provided by the R Core Team [26] and later reanalyzed by Mazucheli et al. [4] and Dey et al. [7].
Application 2: This application investigates twenty mechanical components by analyzing their time to failure under controlled operational stress, aiming to quantify their reliability characteristics, estimate distributional parameters, and provide predictive insights into failure risk. Given that this dataset contains a single extreme outlier (0.485), the analysis is presented as an illustrative case study without loss of generality. Accordingly, the results are intended to demonstrate the potential applicability of the UGM distribution in small-sample contexts. The mechanical components dataset was presented by Murthy et al. [27] and later discussed by Elshahhat et al. [28] and Alqasem et al. [22].
For clarification, in Application 1, the UGM distribution captures a non-monotonic hazard rate consistent with systems that exhibit early failures followed by stability, which is a feature not accounted for by competing bounded models. In Application 2, the fitted UGM parameters indicate heavier tails, underscoring the non-negligible probabilities of extreme performance outcomes and their implications for reliability margins.
Table 8 lists the data points for Applications 1 and 2, where each observation in the petroleum reservoirs dataset has been multiplied by two before fitting the UGM model. It should be noted that the permeability data were rescaled to the unit interval (by multiplication with a constant) solely for compatibility with bounded distributions and for computational convenience; this transformation does not alter the substantive interpretation of the results. Furthermore, Table 9 presents a summary of descriptive statistics for the datasets used in Applications 1 and 2, including the mean, mode, three quartiles (Q1, Q2, Q3), standard deviation (SD), and skewness. The summary statistics reveal that Application 1 employed a dataset with a relatively balanced distribution and modest right skew, whereas Application 2 used a dataset with heavy right skew, indicating the presence of extreme upper values. The greater spread and higher mean in Application 1’s data also reflect more variability and generally larger observations compared to the more compact and lower-valued data in Application 2.
To evaluate the performance of the UGM distribution across the full datasets presented in Applications 1 and 2, we compared it against nine alternative lifetime models characterized by flexible and unbounded failure rate structures (see Table 10). The model comparison was conducted using eight widely used goodness-of-fit and information criteria: (i) negative log-likelihood (−log L), (ii) Akaike information criterion (AIC), (iii) Bayesian information criterion (BIC), (iv) consistent Akaike information criterion (CAIC), (v) Hannan–Quinn information criterion (HQIC), (vi) Anderson–Darling (AD) statistic, (vii) Cramér–von Mises (CvM) statistic, and (viii) Kolmogorov–Smirnov (KS) statistic along with its corresponding p-value.
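As a hedged illustration of how criteria (ii)–(v) follow from the negative log-likelihood, the sketch below uses the standard AIC, BIC, and HQIC formulas together with a small-sample-corrected form for CAIC. The CAIC form is an assumption, chosen because it reproduces the tabulated values; the function name `information_criteria` is hypothetical. The inputs are the UGM entries for Application 1 in Table 11: a negative log-likelihood of −25.499, k = 3 parameters, and n = 48 observations.

```python
import math


def information_criteria(neg_loglik, k, n):
    """Information criteria from the negative log-likelihood, k parameters, n observations."""
    two_nll = 2.0 * neg_loglik
    return {
        "AIC": two_nll + 2 * k,
        "BIC": two_nll + k * math.log(n),
        # Assumed small-sample-corrected form; equals AIC + 2k(k + 1)/(n - k - 1).
        "CAIC": two_nll + 2 * k * n / (n - k - 1),
        "HQIC": two_nll + 2 * k * math.log(math.log(n)),
    }


# UGM, Application 1: values taken from Table 11 and the dataset description.
ic = information_criteria(neg_loglik=-25.499, k=3, n=48)
```

Rounded to three decimals, these match the UGM row for Application 1 in Table 11 (−44.998, −39.384, −44.452, and −42.876).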
The detailed results of criteria (i)–(viii), computed using the AdequacyModel package (Marinho et al. [29]), are summarized in Table 11. In addition to the model selection metrics, Table 12 provides the MLEs (along with their standard errors, SEs) of each model parameter. Here, we initialize all parameters with starting points chosen based on their theoretical domains; see, for example, Mariel et al. [30]. According to the estimated selection criteria, the best model is the one with the smallest values of the statistics in (i)–(viii) and the largest KS p-value. The findings in Table 11 favor the proposed UGM model as the most suitable candidate among all competing alternatives presented in Table 10.
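The selection rule can be illustrated with a short sketch that picks the model with the smallest AIC and the model with the largest KS p-value. The (AIC, p-value) pairs are copied from the Application 1 block of Table 11; the variable names are hypothetical.

```python
# (AIC, KS p-value) for Application 1, as reported in Table 11.
app1 = {
    "UGM": (-44.998, 0.880),
    "CUW": (-21.129, 0.163),
    "UBS": (-10.786, 0.012),
    "ULL": (-43.772, 0.666),
    "TL": (-25.743, 0.021),
    "UW": (-42.392, 0.387),
    "UG": (-32.092, 0.109),
    "UGo": (-44.774, 0.876),
    "Kum": (-27.848, 0.104),
    "Beta": (-31.245, 0.102),
}

# Smaller AIC is better; larger p-value is better.
best_by_aic = min(app1, key=lambda model: app1[model][0])
best_by_pvalue = max(app1, key=lambda model: app1[model][1])
```

Both rules select the UGM model for this dataset, with the unit-Gompertz model a close second on each.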
Table 10.
Nine competitive models for UGM distribution.
Model | Symbol | Author(s) |
---|---|---|
Complementary unit-Weibull | CUW | Guerra et al. [10] |
Unit-Birnbaum–Saunders | UBS | Mazucheli et al. [6] |
Unit-log–log | ULL | Korkmaz and Korkmaz [12] |
Topp–Leone | TL | Topp and Leone [31] |
Unit-Weibull | UW | Mazucheli et al. [4] |
Unit-Gamma | UG | Dey et al. [7] |
Unit-Gompertz | UGo | Mazucheli et al. [8] |
Kumaraswamy | Kum | Kumaraswamy [3] |
Beta | Beta | Gupta and Nadarajah [1] |
Table 11.
Summary fit for the UGM and its competitor models.
Model | −log L | AIC | BIC | CAIC | HQIC | AD | CvM | KS distance | KS p-value |
---|---|---|---|---|---|---|---|---|---|
Application 1 |
UGM | −25.499 | −44.998 | −39.384 | −44.452 | −42.876 | 0.244 | 0.035 | 0.085 | 0.880 |
CUW | −12.564 | −21.129 | −17.386 | −20.862 | −19.714 | 2.410 | 0.391 | 0.162 | 0.163 |
UBS | −7.3930 | −10.786 | −7.0436 | −10.519 | −9.3717 | 3.384 | 0.564 | 0.230 | 0.012 |
ULL | −24.886 | −43.772 | −38.030 | −44.206 | −41.358 | 0.398 | 0.059 | 0.105 | 0.666 |
TL | −13.872 | −25.743 | −23.872 | −25.656 | −25.036 | 1.528 | 0.244 | 0.218 | 0.021 |
UW | −23.196 | −42.392 | −38.650 | −42.126 | −40.978 | 0.683 | 0.108 | 0.130 | 0.387 |
UG | −18.046 | −32.092 | −28.350 | −31.826 | −30.678 | 0.788 | 0.128 | 0.174 | 0.109 |
UGo | −25.387 | −44.774 | −39.031 | −44.151 | −40.945 | 0.246 | 0.038 | 0.089 | 0.876 |
Kum | −15.924 | −27.848 | −24.106 | −27.582 | −26.434 | 1.926 | 0.310 | 0.175 | 0.104 |
Beta | −17.622 | −31.245 | −27.502 | −30.978 | −29.830 | 1.666 | 0.267 | 0.176 | 0.102 |
Application 2 |
UGM | −38.048 | −70.097 | −67.110 | −68.597 | −69.514 | 0.342 | 0.046 | 0.167 | 0.633 |
CUW | −24.950 | −45.900 | −43.909 | −45.194 | −45.512 | 2.744 | 0.456 | 0.280 | 0.088 |
UBS | −26.103 | −48.205 | −46.214 | −47.499 | −47.817 | 2.672 | 0.441 | 0.276 | 0.094 |
ULL | −37.808 | −69.617 | −66.625 | −67.911 | −71.228 | 0.517 | 0.066 | 0.172 | 0.620 |
TL | −13.743 | −25.486 | −24.490 | −25.264 | −25.291 | 2.156 | 0.339 | 0.484 | 0.038 |
UW | −35.819 | −67.637 | −65.645 | −66.931 | −67.248 | 0.826 | 0.108 | 0.172 | 0.624 |
UG | −29.272 | −54.544 | −52.553 | −53.839 | −54.156 | 1.579 | 0.232 | 0.215 | 0.314 |
UGo | −8.3496 | −12.699 | −10.708 | −11.993 | −12.310 | 2.078 | 0.324 | 0.511 | 0.035 |
Kum | −25.648 | −47.297 | −45.305 | −46.591 | −46.908 | 2.650 | 0.437 | 0.263 | 0.126 |
Beta | −27.881 | −51.763 | −49.771 | −51.057 | −51.374 | 2.315 | 0.370 | 0.254 | 0.152 |
Table 12.
The fitted parameters of UGM and its competitor models.
Model | MLE (1st parameter) | SE | MLE (2nd parameter) | SE | MLE (3rd parameter) | SE |
---|---|---|---|---|---|---|
Application 1 |
UGM | 0.2388 | 0.1901 | 2.4241 | 0.6719 | 0.1245 | 0.2833 |
CUW | - | - | 1.6067 | 0.1568 | 0.4418 | 0.0343 |
UBS | - | - | 0.6727 | 0.0687 | 0.7222 | 0.0662 |
ULL | - | - | 1.9857 | 0.2336 | 1.9258 | 0.1392 |
TL | - | - | - | - | 2.2062 | 0.3184 |
UW | - | - | 1.0035 | 0.1508 | 2.6984 | 0.3202 |
UG | - | - | 4.0698 | 0.7990 | 4.5516 | 0.9511 |
UGo | - | - | 0.0637 | 0.0316 | 2.6847 | 0.3755 |
Kum | - | - | 2.3129 | 0.3005 | 4.1671 | 0.9699 |
Beta | - | - | 3.3613 | 0.6613 | 4.1643 | 0.8311 |
Application 2 |
UGM | 0.0022 | 0.0005 | 3.0478 | 0.1465 | 0.0116 | 0.0321 |
CUW | - | - | 1.4158 | 0.1981 | 0.1110 | 0.0193 |
UBS | - | - | 0.2841 | 0.0449 | 2.1427 | 0.1347 |
ULL | - | - | 5.9728 | 0.9040 | 1.0038 | 0.0032 |
TL | - | - | - | - | 0.6248 | 0.1397 |
UW | - | - | 0.0032 | 0.0013 | 6.7414 | 0.5128 |
UG | - | - | 17.648 | 5.5289 | 7.9145 | 2.5150 |
UGo | - | - | 46.856 | 70.9036 | 0.0097 | 0.0146 |
Kum | - | - | 1.5865 | 0.2442 | 21.809 | 10.172 |
Beta | - | - | 3.1129 | 0.9369 | 21.826 | 7.0425 |
To initialize the three parameters of the UGM distribution for the proposed computational analysis based on the two real-world datasets in Applications 1 and 2, the contours of the log-likelihood function for these parameters are displayed in Figure 3. The contours indicate that the optimal starting values (red-point coordinates) of the parameters are close to their MLEs reported in Table 12. Moreover, the plots suggest that the MLEs of the three parameters exist and are unique. Therefore, we recommend using these estimates as initial values for subsequent computational iterations.
Figure 4 and Figure 5 use six visualization panels for a detailed comparison of the UGM and its competitor models under the datasets analyzed in Applications 1 and 2: (i) probability–probability (PP) plots, (ii) quantile–quantile (QQ) plots, (iii) fitted reliability functions (RFs), (iv) fitted PDFs, (v) scaled total time on test (TTT) transform plots, and (vi) violin plots with boxplots.
Subplots (a)–(c) visually support the numerical results in Table 11, clearly showing that the UGM model produces fitted values closely aligned with the empirical observations in both Application 1 and Application 2.
Subplot (d) indicates that the fitted UGM density shapes for Applications 1 and 2 are right-skewed and strictly right-skewed, respectively.
Subplot (e) shows that the datasets in Applications 1 and 2 exhibit increasing and upside-down bathtub failure rate shapes, respectively, supporting the theoretical UGM failure rates first depicted in Figure 1.
Subplot (f) demonstrates that Application 1’s data have a wider spread with a higher median, indicating greater variation, whereas Application 2’s data are more concentrated with a lower median, suggesting more consistency. Overall, Application 1’s dataset appears more heterogeneous, while Application 2’s dataset is more homogeneous.
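The scaled TTT transform behind subplot (e) can be sketched as follows, assuming the usual empirical definition: for the ordered sample x_(1) ≤ ... ≤ x_(n), the point at r/n is [x_(1) + ... + x_(r) + (n − r) x_(r)] divided by the sample total. The function name `scaled_ttt` and the four data values are hypothetical.

```python
def scaled_ttt(data):
    """Return the points (r/n, scaled TTT value) for r = 0, 1, ..., n."""
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for r, x in enumerate(xs, start=1):
        running += x  # sum of the r smallest observations
        points.append((r / n, (running + (n - r) * x) / total))
    return points


# Hypothetical unit-interval observations.
curve = scaled_ttt([0.1, 0.2, 0.3, 0.4])
```

For this toy sample the curve passes through (0.25, 0.4), (0.5, 0.7), and (0.75, 0.9), so it is concave and lies above the diagonal, the pattern usually read as an increasing failure rate; a curve that is concave and then convex is usually read as an upside-down bathtub shape.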
While several competing models in Table 11 achieve an acceptable fit, the superiority of the UGM distribution is supported by multiple complementary criteria. In particular, the UGM attains the lowest values of the information-based measures in Table 11, the only exception being the Hannan–Quinn criterion in Application 2, where the ULL model has a lower value, while also yielding the highest KS p-value, indicating a favorable balance between goodness of fit and parsimony. Moreover, graphical diagnostics (Figure 4 and Figure 5) demonstrate a closer agreement between the empirical and theoretical distributions under the UGM model. Beyond these empirical findings, the UGM possesses a theoretical advantage: unlike most unit distributions, it can flexibly accommodate decreasing, increasing, bathtub-shaped, and inverted-bathtub-shaped hazard rate patterns.
7. Conclusions
In this paper, we introduced and studied a novel continuous probability distribution defined on the unit interval (0, 1). The model is constructed by applying a suitable transformation to the classical GM distribution, enabling flexible modeling of bounded data while retaining key characteristics of the parent distribution. We investigated its statistical and mathematical properties, including explicit derivations of the cumulative distribution function, probability density function, quantile function, and moments. Important reliability properties, such as the hazard rate, reversed hazard rate, and mean residual life, were also examined, highlighting the model’s potential in survival and reliability analysis. To estimate the model parameters, eight estimation methods were considered, including maximum likelihood, maximum product of spacings, and ordinary least squares. A comprehensive simulation study demonstrated that maximum likelihood provides the most accurate and efficient parameter estimates, closely followed by the maximum product of spacings method in terms of bias and mean squared error. Applications to real-world datasets further illustrated the model’s flexibility and goodness of fit relative to existing bounded distributions, particularly in capturing diverse hazard rate shapes, including decreasing, increasing, bathtub, and inverted-bathtub forms. The results underscore two major insights. First, the distribution offers a wide range of shapes for both its density and hazard functions, which are difficult to replicate with traditional bounded models. Second, maximum likelihood and maximum product of spacings are the most reliable estimation techniques, providing stable and accurate parameter inference.
This work opens several avenues for future research. Theoretically, extensions to regression frameworks, multivariate forms, and copula-based constructions could further enhance the model’s applicability. Methodologically, Bayesian estimation procedures and computationally efficient algorithms for large datasets merit further investigation. Practically, the distribution can be especially useful for modeling bounded reliability data in finance, biomedical studies, environmental sciences, and other fields, with potential for incorporating covariate information or dependence structures to improve modeling performance. Overall, the proposed distribution represents a flexible and tractable addition to the family of bounded probability models, with potential for both theoretical development and practical application across diverse scientific domains.