1. Introduction
Many real-world datasets, especially those with non-negative and right-skewed observations, have been shown to be fitted well by a single-parameter distribution [1]. The Lindley distribution has been used extensively in numerous fields, including economics, environmental sciences, and medical research. It is characterized by a probability density function with a pronounced right tail and a steep decrease near zero. This distribution works particularly well for applications in survival analysis and reliability studies, as well as for modeling count data with an excess of zeros [2]. As interest in the Lindley distribution and its practical applications has grown in recent years, new estimation methods and more thorough analyses of its statistical characteristics have been developed. By providing a thorough analysis of the Lindley distribution [3], its properties, and its uses, this work adds to the growing body of knowledge. One key advantage of the Lindley distribution is its capacity to account for the effects of covariates or explanatory factors. In particular, it may be used in regression frameworks to model how one or more predictors affect the distribution of the response variable. Because of this, it is a useful analytical tool in a variety of domains, including finance and epidemiology. Despite their widespread applicability, the Lindley distribution and related one-parameter models exhibit notable limitations. For instance, the exponential distribution is constrained by its constant hazard rate, making it unsuitable for datasets with non-monotonic failure patterns. The Lindley distribution itself, while more flexible, often fails to capture complex tail behaviors or to provide adequate fits in the presence of extreme values [4]. Other one-parameter models, such as the Zeghdoudi [5], pseudo-Lindley [6], XLindley [7] and its truncated version [8,9], and the novel one-parameter family (NPFD) [10] distributions, were developed to enhance flexibility. They still encounter difficulties in modeling datasets with varying hazard shapes, or when the underlying failure mechanism involves an initial increase followed by a decrease in risk. Moreover, many of these distributions lack mathematical tractability for Bayesian estimation under censoring schemes, limiting their applicability in reliability and survival analysis. The reader may consult [11,12,13] for further generalizations of the Lindley distribution. The one-parameter Lindley distribution can be understood as a mixture of exponential exp(θ) and gamma(2, θ) distributions. Its statistical characteristics were further studied by Ghitany et al. [14], who showed that it performs better than the traditional one-parameter exponential distribution in a number of ways. The Lindley distribution, which has a single scale parameter, can model data with monotonically increasing failure rates. As a result, distributions that are more adaptable than the standard Lindley model may be needed for the study of some lifetime datasets. The main goal of this study is to develop and investigate a novel single-parameter distribution that combines the benefits of the exponential and Lindley distributions. The proposed model can be used in numerous fields, including biology, engineering, astronomy, actuarial science, and medicine. Additionally, the new distribution has a decreasing mean residual life function and an increasing hazard rate [15]. These features suggest that the proposed model may attract considerable interest in the scientific community.
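The mixture representation mentioned above can be checked numerically. The following is a minimal sketch, assuming the standard Lindley density θ²/(1 + θ) · (1 + x) · exp(−θx) with mixing weight θ/(1 + θ) on the exp(θ) component; the function names are illustrative and do not come from the paper.

```python
import math
import random


def lindley_pdf(x, theta):
    """Lindley density: theta^2 / (1 + theta) * (1 + x) * exp(-theta * x), x >= 0."""
    return theta ** 2 / (1.0 + theta) * (1.0 + x) * math.exp(-theta * x)


def mixture_pdf(x, theta):
    """The same density written as a mixture of exp(theta) and gamma(2, theta)."""
    w = theta / (1.0 + theta)                            # weight of the exp(theta) part
    exp_part = theta * math.exp(-theta * x)              # exp(theta) density
    gamma_part = theta ** 2 * x * math.exp(-theta * x)   # gamma(shape 2, rate theta) density
    return w * exp_part + (1.0 - w) * gamma_part


def lindley_hazard(x, theta):
    """Hazard rate f(x) / S(x) of the Lindley distribution; it increases with x."""
    return theta ** 2 * (1.0 + x) / (1.0 + theta + theta * x)


def lindley_sample(theta, rng):
    """Draw one Lindley variate through the mixture representation."""
    if rng.random() < theta / (1.0 + theta):
        return rng.expovariate(theta)
    # random.gammavariate takes (shape, scale); scale = 1 / rate.
    return rng.gammavariate(2.0, 1.0 / theta)


if __name__ == "__main__":
    rng = random.Random(42)
    draws = [lindley_sample(1.0, rng) for _ in range(20000)]
    # The theoretical mean is (theta + 2) / (theta * (theta + 1)).
    print(sum(draws) / len(draws))
```

The increasing hazard of `lindley_hazard` illustrates why the Lindley model suits data with monotonically increasing failure rates but not non-monotonic ones.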
Next, we analyze the new polynomial single-parameter distribution (NPSD) using a Bayesian approach. For Type II censored data, we derive the maximum likelihood (ML) estimator of the parameter. Additionally, we derive the Bayesian estimators of the parameter under the generalized quadratic (GQ), entropy, and Linex loss functions. Using Pitman's closeness criterion, we conduct a simulation experiment to examine the behavior of the proposed estimators and compare them with the ML estimator. Lastly, we calculate the integrated mean square error (IMSE) of the three Bayesian estimators. This paper is structured as follows: Section 2 presents the formulation of the proposed distribution. Section 3 covers some distributional characteristics of the new model. Section 4 describes several estimation techniques for the model parameter. In Section 5, a Monte Carlo simulation study is used to evaluate the performance of the various estimators. Section 6 and Section 7 provide examples based on real data to demonstrate the outcomes. Section 8 concludes the paper, and Section 9 discusses the findings and limitations.
2. Derivation of the Proposed NPSD
Suppose T is a random variable taking values in [0, +∞) whose distribution depends on an unknown parameter θ in [0, +∞). Its cumulative distribution function (CDF) is written as:
where c(θ) is a real-valued function on [0, +∞) and
We can verify directly that the CDF is right-continuous and that the conditions of non-negativity, differentiability, and (3) hold, which ensures that the corresponding function is a valid probability density function.
3. Statistical Properties and Reliability Measures of the NPSD
Proposition 1. The CDF F_NPSD(t; θ) in (1) of the NPSD satisfies the following:
- 1.
F_NPSD(0; θ) = 0 if
- 2.
F_NPSD(∞; θ) = 1 if
- 3.
F_NPSD(t; θ) is increasing if
Proof. - 1.
We have: F_NPSD(0; θ) =
Equating this to zero and solving with respect to t, we obtain:
- 2.
if it was:
We put in
- 3.
The first derivative of the
NPSD CDF in (1) is obtained as follows:
which must be positive, that is,
□
Particular cases:
In this case, we can choose the polynomial
positive and of second degree, which satisfies the condition (3); we set:
A. Asymptotic behavior:
This section discusses the behavior of the NPSD probability density function (PDF) in (4) at t = 0 and t = ∞, respectively,
Since
= 1
While
Proof. - 1.
The first derivative of the PDF in Equation (4) is as follows:
Such as:
Recall that a quadratic equation of the form $A t^{2} + B t + C = 0$, where $A \neq 0$, $B$, and $C$ are generic real coefficients, has discriminant $\Delta = B^{2} - 4AC$, which gives three cases. If $\Delta > 0$, the quadratic has two distinct real roots. If $\Delta < 0$, the quadratic has two non-real complex conjugate roots. If $\Delta = 0$, the quadratic has a repeated real root. In our case
- (a)
When , ,
- (1)
If , and the (t, ) is decreasing-increasing-decreasing.
- (2)
If , and the (t, ) is increasing-decreasing-increasing.
- (3)
If , (or , ) and , the (t, ) is unimodal.
- (4)
If , (or , ) and , the (t, ) is bathtub-shaped (BSBB).
- (b)
When , has two non-real complex conjugate roots, , .
- (1)
If , the (t, ) is decreasing
- (2)
If , the (t, ) is increasing
- (c)
When , , the (t, ) is decreasing.
- 2.
With and □
B. Survival and hazard rate functions:
For the NPSD, the survival function S_NPSD(t) and the hazard rate function (hrf) h_NPSD(t) are defined as follows:
C. Moments and related measures:
Corollary 1. where: Setting k = 1, 2, 3, 4 in Equation (7) gives the first four moments of the
NPSD random variable. These moments are then used to compute the coefficient of variation, skewness, kurtosis, and variance of the
NPSD, in that order:
where
where
where
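The passage from raw moments to these summary measures follows the usual textbook identities and can be sketched in a few lines. The code below assumes the standard moment-based definitions (central moments expressed through raw moments); the function name is illustrative. Because the NPSD moment formula is not restated here, the exponential distribution with rate 1, whose k-th raw moment is k!, serves as a stand-in check.

```python
import math


def summary_from_raw_moments(m1, m2, m3, m4):
    """Variance, CV, skewness, and kurtosis from the first four raw moments."""
    variance = m2 - m1 ** 2
    sd = math.sqrt(variance)
    # Third and fourth central moments written in terms of raw moments.
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return {
        "variance": variance,
        "cv": sd / m1,
        "skewness": mu3 / sd ** 3,
        "kurtosis": mu4 / variance ** 2,
    }


if __name__ == "__main__":
    # Stand-in check: exponential(1) has raw moments 1, 2, 6, 24.
    print(summary_from_raw_moments(1.0, 2.0, 6.0, 24.0))
```

Substituting the NPSD moments for k = 1, 2, 3, 4 into the same function would give the corresponding measures for the proposed model.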
Special case:
As a specific illustration of (4), the model we propose is obtained in the following manner,
We will put:
Firstly, we should test the condition (3):
Hence:
Which is always satisfied because
therefore
Secondly, we verify the term; in (4) of the NPSD is decreasing.
After substitution, we derive:
since and therefore
Finally, we validate the proposition if
We have:
Where:
Since: and thus consequently:
Then the cumulative distribution function (
cdf) of the
NPSD:
Therefore, the survival function
and hazard rate function
for the
NPSD are respectively defined as follows:
Furthermore, the
th moment of the
NPSD is defined as follows:
where
Proposition 3. The mean, variance, coefficient of variation, skewness, and kurtosis of X are respectively defined as follows:
The skewness and kurtosis show that the new distribution is right-skewed and leptokurtic.
Theorem 1. .
Proof. Let and
We obtain that and from the CDF in (9),
Note that . As a result, noting that is monotone increasing for and all , we conclude that □
4. Estimation of the Unknown Parameters
A Bayesian analysis of the NPSD distribution given in Equation (8) is presented in this section. For Type II censored data, maximum likelihood estimation is first addressed, after which Bayesian estimation under the Linex, Entropy, and Generalized Quadratic loss functions is explored.
4.1. Maximum Likelihood Estimation
We estimate the parameter from Type II censored data. Given an n-sample (x1, x2, …, xn) from the NPSD and a fixed integer m ≤ n, only the m smallest ordered observations (x1, x2, …, xm) are observed.
The following is this sample’s likelihood function:
For
where
Replacing both (8) and (9), we have:
With
the corresponding log-likelihood function is given by:
The maximum likelihood estimator of the parameter θ is the solution of the following non-linear system:
where:
Since problem (15) does not appear to admit an analytical solution, we use numerical methods to obtain an approximate solution. In particular, we use the R package BB to compute the approximate value of the maximum likelihood estimator of θ. The R package BB has been used successfully for solving nonlinear systems of equations; see Varadhan and Gilbert [16].
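The numerical step can be illustrated with a small self-contained sketch. Because the NPSD density and survival function are not restated here, the code uses the one-parameter Lindley distribution as a stand-in, and plain bisection on the score function instead of the BB package; both choices are assumptions made for illustration only. The Type II censored log-likelihood used is the usual one: the sum of log f over the m observed values plus (n − m) · log S at the largest observed value.

```python
import math

# Survival times (in months) of the 33 cancer patients from Data Set 3.
cancer_times = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 6, 6, 7, 8, 8, 8, 9, 10, 11, 11,
                12, 13, 13, 13, 14, 15, 16, 17, 19, 20, 22, 24, 30]


def lindley_score(theta, obs, n):
    """Derivative in theta of the Type II censored Lindley log-likelihood.

    obs holds the m smallest observations in increasing order; n is the
    full sample size, so n - m units are censored at obs[-1].
    """
    m = len(obs)
    x_m = obs[-1]
    observed = m * (2.0 / theta - 1.0 / (1.0 + theta)) - sum(obs)
    censored = (n - m) * ((1.0 + x_m) / (1.0 + theta + theta * x_m)
                          - 1.0 / (1.0 + theta) - x_m)
    return observed + censored


def lindley_mle_type2(obs, n, lo=1e-8, hi=1.0, tol=1e-12):
    """Root of the score by bisection (the Lindley score is decreasing in theta)."""
    obs = sorted(obs)
    while lindley_score(hi, obs, n) > 0.0:   # widen the bracket if needed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lindley_score(mid, obs, n) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Complete sample (m = n = 33) and censored sample (m = 28).
    print(lindley_mle_type2(cancer_times, n=33))
    print(lindley_mle_type2(sorted(cancer_times)[:28], n=33))
```

For a complete sample the Lindley score equation has a closed-form root, which gives a convenient check on the bisection routine. For the NPSD, the same structure would apply with its own score function from (15).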
4.2. Bayesian Estimation
In this section, we address Bayesian estimation. This method treats the unknown parameters as random variables and assumes a prior distribution for the parameter to be estimated, based on available prior information.
For the parameter θ, we use the gamma distribution.
The prior distribution is:
Using Equation (13), the posterior distribution for Type II censored data is obtained as follows (see Appendix A.4):
where:
Estimators and their corresponding risks:
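Definition 1. The posterior risk is the expected value of the loss function with respect to the posterior distribution of the unknown parameter given the observed data: $PR(\delta) = E_{\pi}\left[L(\theta, \delta)\right] = \int_{0}^{\infty} L(\theta, \delta)\, \pi(\theta \mid x)\, d\theta$, where $\delta$ is a decision or estimate and $\pi(\theta \mid x)$ is the posterior density. This quantity is central to Bayesian decision theory, as it quantifies the expected cost of a decision after observing the data. The three loss functions (entropy, generalized quadratic, and Linex) are described in
Table 1 below: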
(1) We obtain the estimator and its corresponding risk (where is an integer)
Under the Entropy loss function:
(2) We obtain the estimator and its corresponding risk (where
under the Generalized quadratic loss function:
(3) We obtain the estimator and its corresponding risk (where
r is an integer) under the Linex loss function:
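To make the three estimators concrete, the sketch below computes them by numerical integration over a grid, for any unnormalised log-posterior. The loss-function forms are assumptions, since Table 1 is not restated here: entropy loss (δ/θ)^p − p·ln(δ/θ) − 1, generalized quadratic loss θ^(γ−1)·(θ − δ)², and Linex loss exp(r(δ − θ)) − r(δ − θ) − 1. Under these forms the Bayes estimators are (E[θ^(−p)])^(−1/p), E[θ^γ]/E[θ^(γ−1)], and −(1/r)·ln E[exp(−rθ)]. All function and argument names are illustrative.

```python
import math


def posterior_expectation(g, log_post, grid):
    """E[g(theta) | data] by the trapezoidal rule on an unnormalised log-posterior."""
    logs = [log_post(t) for t in grid]
    shift = max(logs)                              # guards against underflow
    dens = [math.exp(v - shift) for v in logs]
    num = den = 0.0
    for i in range(len(grid) - 1):
        h = grid[i + 1] - grid[i]
        num += 0.5 * h * (g(grid[i]) * dens[i] + g(grid[i + 1]) * dens[i + 1])
        den += 0.5 * h * (dens[i] + dens[i + 1])
    return num / den


def bayes_estimators(log_post, grid, p=1.0, gamma=1.0, r=1.0):
    """Bayes estimators under the (assumed) entropy, GQ, and Linex losses."""
    def expect(g):
        return posterior_expectation(g, log_post, grid)

    entropy = expect(lambda t: t ** (-p)) ** (-1.0 / p)
    gq = expect(lambda t: t ** gamma) / expect(lambda t: t ** (gamma - 1.0))
    linex = -math.log(expect(lambda t: math.exp(-r * t))) / r
    return {"entropy": entropy, "gq": gq, "linex": linex}


if __name__ == "__main__":
    # Stand-in posterior: gamma with shape 5 and rate 10 (not the NPSD posterior).
    shape, rate = 5.0, 10.0
    grid = [1e-6 + i * (5.0 - 1e-6) / 20000 for i in range(20001)]
    print(bayes_estimators(lambda t: (shape - 1.0) * math.log(t) - rate * t, grid))
```

A gamma posterior is used in the example only because its estimators have closed forms that make the grid computation easy to verify. The NPSD posterior from Appendix A.4 could be passed in through `log_post` in the same way.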
5. Comparing the Likelihood Estimation and the Bayesian Estimation Using Pitman’s Closeness Criterion
We have: .
To evaluate the performance of the proposed estimators, we generated data from the NPSD with true parameter value θ (see Appendix A.1). For each replication, a random sample of size n was generated, and Type-II censoring was imposed by retaining the first m ordered observations. For each configuration, 5000 Monte Carlo replications were performed. For every estimator, we report the posterior risk and the integrated mean squared error in order to compare the performance of the proposed Bayes estimators with the MLEs.
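The simulation design described above can be sketched as follows. The NPSD sampler of Appendix A.1 is not reproduced here, so the example draws from an exponential distribution as a stand-in and uses its closed-form Type II censored MLE; the function names and the chosen values of n, m, and the number of replications are illustrative assumptions.

```python
import random


def type2_censored_sample(draw, n, m, rng):
    """Draw n lifetimes and keep the m smallest, in increasing order."""
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    return sorted(draw(rng) for _ in range(n))[:m]


def exponential_mle_type2(obs, n):
    """Closed-form MLE of the exponential rate under Type II censoring."""
    m = len(obs)
    total_time = sum(obs) + (n - m) * obs[-1]   # total time on test
    return m / total_time


def monte_carlo(estimator, draw, n, m, replications, seed=0):
    """Apply an estimator to independent Type II censored samples."""
    rng = random.Random(seed)
    return [estimator(type2_censored_sample(draw, n, m, rng), n)
            for _ in range(replications)]


if __name__ == "__main__":
    theta = 2.0  # stand-in true value
    estimates = monte_carlo(exponential_mle_type2,
                            lambda rng: rng.expovariate(theta),
                            n=50, m=40, replications=5000, seed=1)
    print(sum(estimates) / len(estimates))
```

Replacing `draw` with an NPSD sampler and `estimator` with the numerical MLE or a Bayes estimator would reproduce the structure of the study.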
We used the BBsolve function of the R package BB to compute the numerical values of the ML estimators, which are listed in Table 2. We note that the estimated values are close to the true value of the parameter, particularly as the sample size n increases. The Bayesian estimators and their posterior risks (PR, in brackets) under the GQ loss function are provided in Table 3, those under the entropy loss function in Table 4, and those under the Linex loss function in Table 5. Table 6 displays the Bayesian estimators and PR (in brackets) for all three loss functions.
We see that the option γ = 1 provides the best posterior risk in Table 3, the estimation under the GQ loss function. Additionally, the smallest posterior risk is obtained when n is large.
Table 4 shows that the value
for
offers the best posterior risk in the estimation under the entropy loss function. In summary, a brief comparison of the three loss functions reveals that the generalized quadratic loss function yields the best results;
Table 6 provides a detailed illustration of these findings. It is evident that the value
yields the best
. We next compare the maximum likelihood estimators with the best Bayesian estimators.
We employ the Pitman closeness criterion in Table 7 for this purpose (see Pitman [17] for more details).
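Definition 2. According to Pitman's closeness criterion, an estimator $\hat{\theta}_1$ is closer to $\theta$ than an estimator $\hat{\theta}_2$ if $P\left(|\hat{\theta}_1 - \theta| < |\hat{\theta}_2 - \theta|\right) > 0.5$. The values of the Pitman probabilities are shown in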
Table 7, which enables us to compare the Bayesian estimators with the MLE estimator under the three loss functions for
,
, and
. Definition 2 states that the Bayesian estimators outperform the MLE when the probability is higher than 0.5. Based on this criterion, we see that the Bayesian estimators of the parameter are superior to the MLE. Additionally, with
,
, and
, the GQ loss function has the best results when compared to the other two loss functions.
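In a simulation, the Pitman probability is typically estimated by the proportion of replications in which one estimator lands strictly closer to the true value than the other. A minimal sketch follows, with an illustrative function name and toy numbers that are not taken from the paper's tables.

```python
def pitman_closeness(est_a, est_b, theta):
    """Share of replications in which est_a is strictly closer to theta than est_b."""
    if len(est_a) != len(est_b) or not est_a:
        raise ValueError("both lists must be non-empty and of equal length")
    wins = sum(abs(a - theta) < abs(b - theta) for a, b in zip(est_a, est_b))
    return wins / len(est_a)


if __name__ == "__main__":
    # Toy replications: hypothetical Bayes and ML estimates of theta = 1.
    bayes = [1.0, 1.2, 0.9]
    mle = [1.5, 1.1, 0.5]
    print(pitman_closeness(bayes, mle, theta=1.0))
```

A value above 0.5 would favor the first estimator in the sense of Definition 2.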
6. Application with Real Data Set
To demonstrate the value of the proposed distribution, three applications are now presented. More specifically, we compare the goodness of fit of the NPSD with that of the exponential, Lindley, Zeghdoudi, XLindley, Xgamma, and new XLindley distributions. To do this, we use the maximum likelihood method to estimate the unknown parameter of each model and report the corresponding standard errors (SE), the estimated log-likelihoods (−2logL), and the values of AIC [18] (Akaike information criterion), AICC (corrected Akaike information criterion), HQIC (Hannan–Quinn information criterion), and BIC (Bayesian information criterion) (see Appendix A.2).
Data Set 1: Populations Recorded by the US Census data
This data set gives the population of the United States (in millions) as recorded by the decennial census for the period 1790–1970. The proposed data set was previously studied by McNeil [
19] and its values are given by
3.93, 5.31, 7.24, 9.64, 12.90, 17.10, 23.20, 31.40, 39.80, 50.20, 62.90, 76.00, 92.00, 105.70, 122.80, 131.70, 151.30, 179.30, 203.20.
| Model | Density | MLE of θ | AIC | BIC | −2logL | AICC | HQIC |
| Exponential | | 0.01435 | 201.3175 | 202.2619 | 199.3175 | 201.5528 | 201.4773 |
| Lindley | | 0.02828 | 207.6266 | 208.5710 | 205.6266 | 207.8619 | 207.7864 |
| XLindley | | 0.02791 | 206.9240 | 207.8684 | 204.9240 | 207.1593 | 207.0838 |
| New-XLindley | | 0.02115 | 201.6523 | 202.5968 | 199.6523 | 201.8876 | 201.8122 |
| Xgamma | | 0.04030 | 215.4385 | 216.3829 | 213.4385 | 215.6738 | 215.5983 |
| Zeghdoudi | | 0.04270 | 220.9815 | 221.9260 | 218.9815 | 221.2168 | 221.1414 |
| NPSD | | 0.01198 | 200.7878 | 201.7300 | 198.7878 | 201.023 | 200.9500 |
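As a cross-check on how the criteria in the table are formed, the sketch below fits the exponential model to Data Set 1 and computes the criteria from −2logL. It assumes the usual definitions with k = 1 parameter and n observations: AIC = −2logL + 2k, BIC = −2logL + k·ln(n), AICC = AIC + 2k(k + 1)/(n − k − 1), and HQIC = −2logL + 2k·ln(ln(n)). The function names are illustrative.

```python
import math

# US population (in millions), decennial census 1790-1970 (Data Set 1).
us_population = [3.93, 5.31, 7.24, 9.64, 12.90, 17.10, 23.20, 31.40, 39.80,
                 50.20, 62.90, 76.00, 92.00, 105.70, 122.80, 131.70, 151.30,
                 179.30, 203.20]


def exponential_fit(data):
    """MLE of the exponential rate and the value of -2 log L at the MLE."""
    n = len(data)
    total = sum(data)
    theta_hat = n / total
    minus2loglik = -2.0 * (n * math.log(theta_hat) - theta_hat * total)
    return theta_hat, minus2loglik


def information_criteria(minus2loglik, n, k=1):
    """AIC, BIC, AICC, and HQIC for a model with k parameters."""
    aic = minus2loglik + 2.0 * k
    return {
        "AIC": aic,
        "BIC": minus2loglik + k * math.log(n),
        "AICC": aic + 2.0 * k * (k + 1) / (n - k - 1),
        "HQIC": minus2loglik + 2.0 * k * math.log(math.log(n)),
    }


if __name__ == "__main__":
    theta_hat, m2ll = exponential_fit(us_population)
    print(theta_hat, m2ll)
    print(information_criteria(m2ll, n=len(us_population)))
```

Under these definitions the computed values should be close to the exponential row of the table above; the same function applies to any fitted one-parameter model once its −2logL is available.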
Data Set 2: Failure Times of Ball Bearings
This data set represents the lifetimes (in millions of revolutions) of 23 ball bearings tested in an industrial study. The proposed data set was previously studied by Lawless [
20] and its values are
17.88, 28.92, 33.00, 41.52, 42.12, 45.60, 48.48, 51.84, 51.96, 54.12, 55.56, 67.80, 68.64, 68.64, 68.88, 84.12, 93.12, 98.64, 105.12, 105.84, 127.92, 128.04, 173.40.
| Model | Density | MLE of θ | AIC | BIC | −2logL | AICC | HQIC |
| Exponential | | 0.01234 | 174.126 | 175.584 | 172.126 | 174.512 | 174.389 |
| Lindley | | 0.02341 | 171.883 | 173.340 | 169.883 | 172.269 | 172.146 |
| XLindley | | 0.02188 | 170.662 | 172.119 | 168.662 | 171.048 | 170.925 |
| New-XLindley | | 0.01872 | 170.457 | 171.915 | 168.457 | 170.843 | 170.720 |
| Xgamma | | 0.02714 | 173.110 | 174.568 | 171.110 | 173.496 | 173.373 |
| Zeghdoudi | | 0.03125 | 176.502 | 177.960 | 174.502 | 176.888 | 176.765 |
| NPSD | | 0.00987 | 168.992 | 170.449 | 166.992 | 169.378 | 169.255 |
Data Set 3: Survival Times of Cancer Patients
This data set provides survival times (in months) for 33 patients suffering from a particular type of cancer. It was previously studied by Lee and Wang [21], and its values are as follows:
1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 6, 6, 7, 8, 8, 8, 9, 10, 11, 11, 12, 13, 13, 13, 14, 15, 16, 17, 19, 20, 22, 24, 30.
| Model | Density | MLE of θ | AIC | BIC | −2logL | AICC | HQIC |
| Exponential | | 0.05231 | 132.713 | 134.098 | 130.713 | 132.942 | 132.836 |
| Lindley | | 0.07514 | 129.773 | 131.158 | 127.773 | 130.002 | 129.896 |
| XLindley | | 0.08122 | 129.118 | 130.503 | 127.118 | 129.347 | 129.241 |
| New-XLindley | | 0.06892 | 128.432 | 129.817 | 126.432 | 128.661 | 128.555 |
| Xgamma | | 0.09345 | 127.546 | 128.931 | 125.546 | 127.775 | 127.669 |
| Zeghdoudi | | 0.08921 | 128.764 | 130.149 | 126.764 | 128.993 | 128.887 |
| NPSD | | 0.03974 | 126.881 | 128.266 | 124.881 | 127.110 | 127.004 |
7. Modeling Cancer Survival Data with the NPSD Distribution
In this section, we illustrate the applicability of the NPSD by carrying out the above estimation procedures on a real data set. The data set provides survival times (in months) for 33 patients suffering from a particular type of cancer. It was previously studied by Lee and Wang [21], and its values are
1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 6, 6, 7, 8, 8, 8, 9, 10, 11, 11, 12, 13, 13, 13, 14, 15, 16, 17, 19, 20, 22, 24, 30.
The Kolmogorov–Smirnov (K-S) test did not reject the fitted NPSD model at the 5% significance level (p = 0.793548), indicating that the model is not contradicted by the observed data. The K-S statistic is 0.012901, which is smaller than the corresponding critical value of 0.025449 at the 5% level of significance.
The following table shows all of the observations:
Table 8 reports the MLE and Bayesian estimates together with their posterior risks. For both (n, m) = (33, 33) and (33, 28), the GQJ estimator has the smallest posterior risk among the Bayesian estimators. Table 9 gives the Pitman closeness probabilities comparing each Bayesian estimator to the MLE. Since all reported values are above 0.5, the Bayesian estimators are preferred to the MLE under Pitman closeness for this data set. The integrated mean square error values of the Bayesian and ML estimators are shown in Table 10. We see that all of the Bayesian estimators outperform the ML estimator, and the generalized quadratic loss function yields the lowest IMSE. Therefore, the preferred estimator depends on the chosen performance criterion.
Definition 3. The integrated mean square error is defined as: The integrated mean square error shows that when n is small, the Bayesian estimators give better results, while when n is large enough, the ML estimators are closer to the true values but have a higher IMSE than the Bayesian estimators. Finally, the same conclusions hold for the real data set.
8. Conclusions
In this study, we proposed a family of distributions with a single parameter. The characteristics examined include the moments, distribution function, characteristic function, failure rate, stochastic ordering, and maximum likelihood estimation. The Lindley and Zeghdoudi distributions lack the flexibility required to analyze and model various types of lifetime and survival data. In contrast, the NPSD is flexible, simple, and easy to use. The new distribution was fitted to three real data sets and compared with other distributions (exponential, Lindley, Zeghdoudi, XLindley, new XLindley, and Xgamma). The results of the comparison confirm the improved fit of the NPSD. We expect that the proposed family will attract further applications in lifetime data analysis, reliability analysis, and actuarial science.
We also examined Bayesian estimators of the NPSD parameter under different loss functions. According to the Monte Carlo study, the Bayesian approach based on the GQJ loss function produced the best estimator compared with the approaches based on the other loss functions. These Bayesian estimators were compared with the maximum likelihood estimator of the unknown parameter using the Pitman closeness criterion and the integrated mean square error. Bayesian estimators yield better results for small n, while the MLE becomes more accurate as n grows large. Lastly, we used a real data set to demonstrate that the same findings hold. A two-parameter extension (NPSD-II) will be developed in a future study.
9. Discussion
The present work introduces a new polynomial single-parameter distribution (NPSD) and investigates its statistical properties, along with Bayesian and non-Bayesian inference procedures under Type-II censoring. The main methodological contribution of this paper lies in the development of a flexible yet parsimonious one-parameter lifetime model that bridges the gap between the exponential and Lindley distributions. By carefully selecting the polynomial components in the general construction, we obtained a tractable special case with closed-form expressions for the density, distribution, survival, and hazard functions, as well as for moments and related measures.
The simulation study provides several important insights into the finite-sample behavior of the proposed estimators. First, the maximum likelihood estimator performs adequately for moderate to large sample sizes, with estimates approaching the true parameter value as n increases. Second, among the Bayesian estimators, those obtained under the generalized quadratic loss function with γ = 1 consistently yield the smallest posterior risks across all sample sizes and censoring levels. Third, the Pitman closeness criterion reveals that the Bayesian estimators dominate the MLE for all configurations considered, with probabilities substantially exceeding 0.5. The GQ loss function with γ = 1 achieves the highest Pitman probabilities, reaching up to 0.951 for n = 200. Fourth, the integrated mean square error analysis confirms that Bayesian estimators outperform the MLE, particularly when the sample size is small or moderate. For the cancer survival data application, the IMSE values for Bayesian estimators range from 0.1093 to 0.1613, compared to 0.1893–0.1903 for the MLE, representing a substantial improvement.
The real-data applications further demonstrate the practical utility of the NPSD. Across three distinct datasets (US census population records, ball bearing failure times, and cancer patient survival times), the proposed model consistently outperforms several competing one-parameter distributions, including the exponential, Lindley, XLindley, new XLindley, Xgamma, and Zeghdoudi distributions. In all cases, the NPSD achieves the lowest values of −2 log L, AIC, AICC, HQIC, and BIC, indicating a superior balance between goodness of fit and model parsimony. The Kolmogorov–Smirnov test does not reject the NPSD for the cancer survival data (p = 0.7935), which supports its adequacy for this dataset. The hazard rate function of the NPSD exhibits increasing behavior for the estimated parameter values, which is consistent with the failure mechanism observed in many biomedical and engineering applications where risk accumulates over time.
Despite its flexibility and favorable performance, the one-parameter formulation of the NPSD has inherent limitations. While it captures right-skewed data effectively and accommodates increasing hazard rates, it cannot model non-monotonic hazard shapes such as bathtub or unimodal failure rates without further extension. The skewness and kurtosis coefficients are fixed constants (1.3156 and 19.0413, respectively), which may limit the model’s ability to adapt to datasets with different tail behaviors. Furthermore, the absence of a scale or location parameter restricts its applicability in regression settings where covariate effects need to be incorporated.
These limitations naturally suggest directions for future research. A two-parameter extension (NPSD-II) could be developed by introducing an additional shape or scale parameter, thereby enhancing flexibility to accommodate various hazard shapes and tail behaviors. Such an extension would allow the model to capture decreasing, increasing, constant, and non-monotonic failure rates, making it applicable to a wider range of reliability and survival datasets. Additionally, incorporating regression structures would enable the modeling of covariate effects on the response variable, broadening the scope of applications in biomedical studies, engineering, and actuarial science. From a Bayesian perspective, future work could explore more robust prior specifications, including informative priors when historical data or expert knowledge is available, as well as hierarchical modeling frameworks for complex data structures. Finally, the development of diagnostic tools for model adequacy and influence diagnostics would further strengthen the practical utility of the NPSD family.
In summary, the NPSD proposed in this paper represents a valuable addition to the toolkit of one-parameter lifetime distributions, offering a competitive fit for skewed nonnegative data while maintaining mathematical tractability for Bayesian inference under censoring. The simulation and real-data results support its practical relevance, and the identified limitations provide clear motivation for future extensions and refinements.