Abstract
In this paper, we present a new univariate flexible generator of distributions, namely, the odd Perks-G class. Some special models in this class are introduced. The quantile function (QFUN), ordinary and incomplete moments (MOMs), generating function (GFUN), moments of residual and reversed residual lifetimes (RLT), and four different types of entropy are all structural aspects of the proposed family that hold for any baseline model. Maximum likelihood (ML) and maximum product spacing (MPS) estimates of the model parameters are given. Bayesian estimates of the model parameters are obtained. We also present a novel log-location-scale regression model based on the odd Perks–Weibull distribution. Due to the significance of the odd Perks-G family and the survival discretization method, both are used to introduce the discrete odd Perks-G family, a novel discrete distribution class. Real-world data sets are used to emphasize the importance and applicability of the proposed models.
1. Introduction
Over the past two decades, a number of generalized classes of statistical models have been developed and explored for the modeling of data in a variety of applications, including in the medical sciences, engineering, environmental and biological studies, life-testing challenges, demographics, actuarial science, and economics. As a result, a number of researchers have presented novel distribution classes that broaden well-known statistical models while also providing a high degree of adaptability for the analysis of data. Accordingly, various classes have been proposed in the statistical literature for generating new distributions by adding one or more parameters. A few famous examples are as follows: the exponentiated Weibull family presented by Mudholkar et al. [1]; the novel approach offered by Marshall and Olkin [2], involving the embedding of a parameter into a class of statistical models; the exponentiated T-X family of distributions reported by Alzaghal et al. [3]; Type II half Logistic-G by Hassan et al. [4]; the Weibull-G family by Bourguignon et al. [5]; the beta-generated family by Eugene et al. [6]; the gamma-generated family by Zografos et al. [7]; the additive Weibull-G family by Hassan et al. [8]; the odd Lindley-G family by Silva et al. [9]; odd inverse power generalized Weibull-G by Al-Moisheer et al. [10]; Marshall-Olkin odd Burr III-G by Afify et al. [11]; Topp–Leone odd Fréchet-G by Al-Marzouki et al. [12]; the transmuted odd Fréchet-G family of distributions by Badr et al. [13]; odd generalized N-H-G by Ahmad et al. [14]; generalized odd Burr III-G by Haq et al. [15] and the odd Fréchet-G family by [16], among others. The Weibull-power Cauchy distribution, presented by Tahir et al. [17], is also worthy of mention. Cordeiro et al. [18] created a new family of generalized distributions. Several authors have introduced new distributions to fit COVID-19 data; see [19,20,21].
Perks [22] has presented a four-parameter extension of the Gompertz–Makeham distribution with the following hazard rate function:
When , the Gompertz–Makeham hazard rate function is obtained. The parameters appear to have been designed by Perks to be non-negative, and Marshall and Olkin [23] have demonstrated that cannot be used. However, by setting and choosing as the limit, the Gompertz–Makeham distribution can be obtained. Richards [24] has recently introduced a modified version of the Perks distribution, which takes the hazard function of the Perks distribution into account:
The Perks distribution has several applications in the field of actuarial science. Haberman et al. [25] and Richards [24] have shown that this distribution is a good fit for pensioner mortality data. The parametric mortality projection is well-described by the Perks distribution, according to Haberman et al. [25]. The cumulative distribution function (cdf) and probability density function (pdf) of the Perks distribution are given as follows:
and
The authors in [26] defined a new idea for the generation of larger families, making use of any pdf as a generator. The above generator is a member of the distribution family, and its cdf is specified by
where c is the pdf of a random variable (RV) T for , G is the cdf of a random variable X, and is a function of G, which satisfies the following conditions:
- (i)
- ;
- (ii)
- is differentiable and monotonically non-decreasing;
- (iii)
- as and as .
The relevant pdf can be obtained as follows:
Inspired by the T-X concept, we develop a new, broader, and more flexible class of distributions, called the odd Perks-G class, by combining = and replacing c by , where ; ; is the baseline cdf, which depends on a parameter vector ; and is the baseline reliability function. For each baseline the cdf is provided as follows:
The associated pdf may be obtained as follows:
where is the baseline pdf of a baseline model. is used to represent an RV X with density function (6). The survival function (SF) of the -G family is:
and the hazard rate function (HRF) is defined as
The -G family can be explained in the following way. Assume Y is a stochastic system’s RV lifetime with a specified continuous G model, where is the odds ratio that an individual (or item) may not be active (failure or death) at time x following a lifetime Y. If the diversity of this chance of failure is denoted by the RV X and, as such, by the extended exponential model with parameters and , then the cdf of X is as follows:
The primary motives for employing the -G family in practice are:
- (i)
- To realize special models for all sorts of HRFs;
- (ii)
- Under the same baseline distribution, to regularly provide better fits than alternative produced models;
- (iii)
- Compared to the baseline model, to increase the adjustability of the kurtosis;
- (iv)
- To construct symmetric, left- and right-skewed, and inverted J-shaped distributions.
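As a numerical sketch of the construction described in this section (a generator cdf evaluated at the baseline odds ratio G/(1 - G)), the following Python code may help. It assumes that the generator is the cdf implied by the two-parameter Perks-type hazard alpha*exp(beta*t)/(1 + alpha*exp(beta*t)); this form, the function names, and the exponential baseline are illustrative assumptions rather than formulas quoted from the text.

```python
import math

def perks_survival(t, alpha, beta):
    """Survival function implied by the ASSUMED hazard
    alpha*exp(beta*t) / (1 + alpha*exp(beta*t)), for t >= 0."""
    z = beta * t
    # log(1 + alpha*exp(z)); switch to the large-z limit to avoid overflow
    if z < 700.0:
        log_term = math.log1p(alpha * math.exp(z))
    else:
        log_term = math.log(alpha) + z
    return math.exp((math.log1p(alpha) - log_term) / beta)

def odd_g_cdf(x, base_cdf, alpha, beta):
    """T-X construction with W(G) = G / (1 - G), the baseline odds ratio."""
    g = base_cdf(x)
    if g >= 1.0:
        return 1.0
    return 1.0 - perks_survival(g / (1.0 - g), alpha, beta)

def exp_cdf(x, lam=1.0):
    """Exponential baseline cdf, used here only as an example of G."""
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0

if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 1.0, 2.0):
        print(x, odd_g_cdf(x, exp_cdf, alpha=0.5, beta=1.0))
```

Any other baseline cdf can be passed as `base_cdf`; the survival function is computed on the log scale so that large odds ratios do not overflow.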
The association between survival time and numerous factors, such as sex, weight, blood pressure, and many more, has recently sparked significant attention in the relevant literature. Different parametric regression models, including the log-location-scale regression model, have been employed in a number of applications to quantify the effects of co-variate variables on survival time. As it has been extensively utilized in clinical trials and many other domains of application, the log-location-scale regression model stands out. In many real-world applications involving lifetime data, determining the link between survival time and independent (explanatory) variables is critical. In this context, the regression model method can be applied. The linear log-location-scale regression odd Perks-X model can be stated as follows:
where is the random error, with density function ; is a vector of unknown parameters of the explanatory variables; is the scale parameter of the regression model; and is the explanatory variable vector, where k is the number of explanatory variables. For more information about linear location-scale regression models, see, for example, [27,28,29,30,31,32].
The remainder of this paper is divided into several sections, structured as follows. In Section 2, a useful expansion of -G is derived and some special models are obtained by means of the -G generator. Several mathematical statistical properties, MOMs, probability-weighted moments (PRWMOMs), residual life (RL) and reversed residual life (RRL) FUNs, and entropy (EN) are investigated in Section 3. In Section 4, non-Bayesian estimates of the model’s parameters are obtained. In Section 5, Bayesian estimates of the model’s parameters are obtained. In Section 6, bootstrap confidence intervals for the model’s parameters are obtained. In Section 7, the log-odd Perks–Weibull regression model is introduced. Simulation studies are described in Section 8. Section 9 details the discretization of the -G family. In Section 10, real-world data sets are used to demonstrate the adaptability of the proposed family. Finally, we present our conclusions.
2. Density of the - Class: Useful Expansions
We propose a handy linear representation of the -G density function in this section. We can write, using the generalised binomial expansion,
For the exponential function, we can use the power series
When we substitute this expansion into Equation (11), we obtain the following:
If and yield a true non-integer, then the following power series occurs:
Applying (12) in (11), for the term , the -G density function can be expressed as an infinite mixture of expo-G density functions, as
where is the expo-G pdf with power parameter and
The cdf of the -G class can also be expressed as a mixture of expo-G cdfs
where is the cdf of the expo-G family with power parameter . Thus, several mathematical and statistical properties of the -G family can be derived directly from those of the expo-G family.
2.1. Special Models
In this section, we examine four different -G special models.
2.1.1. Odd Perks Uniform
Let the parent distribution be uniform in the range , , and . The cdf and pdf of the odd Perks uniform (OPU) are respectively given by
and
2.1.2. Odd Perks Exponential
The exponential cdf and pdf with parameter are and . The cdf and pdf of the odd Perks exponential (OPE) are respectively given by
and
Figure 1 shows various pdf curves for OPE models with different parameter values.
Figure 1.
The pdf curves for OPE models with various parameter values.
2.1.3. Odd Perks–Weibull
Let us consider the Weibull distribution with cdf and pdf values given by and . The odd Perks–Weibull (OPW) has cdf and pdf given, respectively, by
and
Figure 2 shows various pdf curves for OPW models with different parameter values.
Figure 2.
The pdf curves for OPW models with various parameter values.
2.1.4. Odd Perks–Lomax
The Lomax cdf has the parameters and b, where . The odd Perks–Lomax (OPL) model has the cdf
and the associated pdf is given by
Figure 3 shows various pdf curves for OPL models with different parameter values.
Figure 3.
The pdf curves for OPL models with different parameter values.
3. Statistical Features
The statistical features of the -G family are investigated in this section; specifically, the QFUN, MOMs, incomplete MOMs, PRWMOMs, and RL and RRL FUNs.
3.1. Quantiles
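A quantile function can be used for inverse-transform sampling. The sketch below is a hypothetical illustration for the exponential baseline only: it assumes the generator cdf 1 - ((1 + alpha)/(1 + alpha*exp(beta*t)))^(1/beta) evaluated at the baseline odds ratio, which is an assumption on our part and not a formula quoted from this section. The function names are likewise illustrative.

```python
import math
import random

def odds_quantile(u, alpha, beta):
    """Odds-scale value W with F_T(W) = u under the ASSUMED generator cdf
    F_T(t) = 1 - ((1 + alpha) / (1 + alpha*exp(beta*t))) ** (1 / beta)."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    inner = ((1.0 + alpha) * (1.0 - u) ** (-beta) - 1.0) / alpha
    return math.log(inner) / beta

def ope_quantile(u, alpha, beta, lam):
    """Quantile for an exponential baseline with rate lam.

    The baseline odds G/(1 - G) equal exp(lam*x) - 1, so x = log(1 + W) / lam.
    """
    return math.log1p(odds_quantile(u, alpha, beta)) / lam

def ope_sample(n, alpha, beta, lam, seed=0):
    """Inverse-transform sampling: push uniform draws through the quantile."""
    rng = random.Random(seed)
    return [ope_quantile(rng.random(), alpha, beta, lam) for _ in range(n)]

if __name__ == "__main__":
    print(ope_quantile(0.5, alpha=0.5, beta=2.0, lam=1.5))  # median under these settings
    print(ope_sample(3, alpha=0.5, beta=2.0, lam=1.5))
```

For other baselines, the last step would replace `log1p(W) / lam` with the baseline quantile evaluated at W/(1 + W).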
3.2. Moments
In this sub-section, the ordinary MOM and MOM GFUNs of the -G class are derived. Most of the necessary characteristics and features of a distribution can be studied through its MOMs.
Let be an RV having the exp-G pdf with power parameter . The rth moment of the -G family of distributions can be obtained from (13), as follows
Another formula for the rth MOM follows from (2.5), as .
Table 1 provides some numerical values of moments for the OPE model, including , , , , variance (var), skewness (SK), kurtosis (KU), and the coefficient of variation (CV).
Table 1.
Some numerical values of moments with various parameters in the OPE model.
For the MOM GFUN, we now introduce two formulas. According to Equation (13), the first formula can be calculated by
where is the MOM-GFUN of . As a result, the exp-G GFUN may readily be used to determine . The second formula for can be derived from (13) as
where is the mgf of the RV , given by
For each real , the sth incomplete MOMs of X, defined by , can be written as
where
which can be evaluated numerically.
The th PRWMOMs of the -G class are provided by:
based on (5) and (6). Then, after some calculation, we obtain
where
As a result, th PRWMOMs of the -G class can be expressed as
Thus, the th PRWMOMs of X may be generated by combining an unlimited number of exp-G MOMs, provided as
3.3. Residual Lifetimes
The rth-order MOM of the RL is given as:
where . Such a procedure may be used to calculate the rth-order MOM of the RRL.
3.4. Four Different Types of Entropy
The Rényi EN (REN) (see [33]) is characterized by (
As a result, the REN of the -G class is given by
The Tsallis EN (TEN) measure (see [34]) is defined as
The Havrda and Charvat EN (HCEN) measure (see [35]) is defined as
The Arimoto EN (AEN) measure (see [36]) of -G is defined as
Numerical values of the REN, TEN, HCEN, and AEN under various parameter values in the OPE model are provided in Table 2.
Table 2.
Numerical values of the REN, TEN, HCEN, and AEN for the OPE model.
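For orientation, the four entropy measures are commonly written in the following textbook forms for a density f and an order δ > 0 with δ ≠ 1. The symbol δ and the exact normalizations are assumptions here; the parameterization used for the family in this section may differ.

```latex
\begin{align*}
\text{R\'enyi:} \quad & I_R(\delta) = \frac{1}{1-\delta}\,
   \log\!\left(\int_{-\infty}^{\infty} f(x)^{\delta}\,dx\right),\\
\text{Tsallis:} \quad & T(\delta) = \frac{1}{\delta-1}
   \left(1-\int_{-\infty}^{\infty} f(x)^{\delta}\,dx\right),\\
\text{Havrda--Charv\'at:} \quad & HC(\delta) = \frac{1}{2^{1-\delta}-1}
   \left(\int_{-\infty}^{\infty} f(x)^{\delta}\,dx-1\right),\\
\text{Arimoto:} \quad & A(\delta) = \frac{\delta}{1-\delta}
   \left[\left(\int_{-\infty}^{\infty} f(x)^{\delta}\,dx\right)^{1/\delta}-1\right].
\end{align*}
```

All four depend on the density only through the integral of f(x)^δ, so a single numerical integration is enough to evaluate them for a fitted model.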
4. Non-Bayesian Estimation
In this section, we examine two different non-Bayesian estimation approaches for the -G family parameters: The maximum likelihood and maximum product of spacings methods.
4.1. Likelihood Method
Various parameter estimation strategies have been introduced in the literature, the most prominent of which is the maximum likelihood (ML) method, which can be used to construct confidence intervals for model parameters as well as test statistics. Using complete samples, we can calculate the ML estimates (MLEs) of the parameters for the proposed class. Let be a random sample of size n from the -G class with parameters , and . The log-likelihood (LL) FUN is given as
where The components of the score vector are given by
and
where , and is the kth element of the vector of parameters .
4.2. Maximum Product of Spacings (MPS) Estimation
The authors in [37] developed the MPS methodology as an alternative to the MLE method for estimating the parameters of continuous univariate distributions. They argued that, by replacing the likelihood function with a product of spacings, the MPS approach possesses most of the properties of ML. The authors in [38] also considered the MPS technique as an independent approximation of the Kullback–Leibler information measure.
Let be from the -G family with cdf (5) and parameters , and . Then, the uniform spacings of this random sample are defined as
where
The MPSEs can be obtained by maximizing the product of spacings, as follows
The MPSEs of , , and are obtained by differentiating the logarithm of the product of spacings in Equation (28) with respect to each parameter and setting the derivatives to zero. As the resulting non-linear equations are difficult to solve analytically, they are solved numerically using non-linear optimization algorithms (e.g., the Newton–Raphson method). The asymptotic variance–covariance matrix is then computed and used to construct normal-approximation asymptotic confidence intervals (ACIs).
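The MPS idea can be sketched in a few lines of Python. To keep the example independent of the family's cdf, a one-parameter exponential cdf is used as a stand-in, and the objective is maximized by a golden-section search instead of Newton–Raphson; both choices, and all function names, are assumptions made for illustration.

```python
import math
import random

def exp_cdf(x, rate):
    """Stand-in cdf (exponential); replace with the cdf of the model of interest."""
    return 1.0 - math.exp(-rate * x)

def mean_log_spacing(theta, data, cdf):
    """Average log-spacing of the cdf values at the ordered sample."""
    xs = sorted(data)
    probs = [0.0] + [cdf(x, theta) for x in xs] + [1.0]
    total = 0.0
    for lo, hi in zip(probs, probs[1:]):
        total += math.log(max(hi - lo, 1e-300))  # guard against zero spacings
    return total / (len(xs) + 1)

def golden_section_max(f, a, b, tol=1e-8):
    """Maximize a unimodal function f on the interval [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

def mps_estimate(data, cdf, lower=1e-3, upper=20.0):
    """One-parameter MPS estimate over a bracketing interval."""
    return golden_section_max(lambda t: mean_log_spacing(t, data, cdf), lower, upper)

if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.expovariate(2.0) for _ in range(200)]
    print(mps_estimate(sample, exp_cdf))
```

For a model with several parameters, the same objective would be passed to a multivariate optimizer; the one-dimensional search is used here only to keep the sketch dependency-free.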
5. Bayesian Estimation
In this section, we consider the Bayesian estimation of the parameters of the model obtained when data are observed based on the squared error loss function (SELF), defined by
where is an estimator of . We denote the prior and posterior distributions of by and , respectively. Under the SELF, the Bayesian estimate of any FUN of is given by
A prior distribution is important for the development of Bayes estimators.
Under the assumption of gamma prior distributions, we investigate this estimation problem. Therefore, it is assumed that , and follow independent gamma distributions, with , , and if and if is an individual parameter, with respective pdfs given by
Using the informative prior (30) and the likelihood FUN (6), the joint posterior density may be calculated as follows:
The marginal posterior densities of the parameters , and can be derived as
As the marginal posterior densities in (32), (33) and (34) are not well-known distributions, we utilize the Metropolis–Hastings sampler, with a normal proposal distribution, to generate values for , and from (32), (33) and (34).
Furthermore, the approach of Chen and Shao [39] has been widely used to create highest posterior density (HPD) intervals for the unknown parameters of the distribution under Bayesian estimation. For example, an HPD interval can be produced from the MCMC sample output by taking the and percentiles as its two endpoints. The Bayesian credible intervals for the parameters , and are calculated as follows:
- Sort the parameters as , , , and , where N is the length of the generated MCMC.
- The symmetric credible intervals for , and become , , and .
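The sampling and interval steps above can be sketched as follows. This is a minimal illustration, assuming a single rate parameter with an exponential likelihood as a stand-in for the model, a gamma prior, and a normal random-walk proposal; the function names, step size, and chain length are hypothetical choices.

```python
import math
import random

def log_posterior(rate, data, shape, prior_rate):
    """Unnormalized log posterior: exponential likelihood times gamma prior."""
    if rate <= 0.0:
        return -math.inf
    log_lik = len(data) * math.log(rate) - rate * sum(data)
    log_prior = (shape - 1.0) * math.log(rate) - prior_rate * rate
    return log_lik + log_prior

def metropolis_hastings(data, shape, prior_rate, n_iter=20000, burn_in=2000,
                        step=0.3, seed=0):
    """Random-walk Metropolis-Hastings with a normal (symmetric) proposal."""
    rng = random.Random(seed)
    current = 1.0
    current_lp = log_posterior(current, data, shape, prior_rate)
    draws = []
    for i in range(n_iter):
        proposal = rng.gauss(current, step)
        proposal_lp = log_posterior(proposal, data, shape, prior_rate)
        # Accept with probability min(1, posterior ratio)
        if math.log(rng.random() or 1e-300) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            draws.append(current)
    return draws

def summarize(draws, level=0.95):
    """Posterior mean (Bayes estimate under SELF) and equal-tail credible interval."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int(n * (1.0 - level) / 2.0)]
    upper = ordered[min(n - 1, int(n * (1.0 + level) / 2.0))]
    return sum(ordered) / n, (lower, upper)

if __name__ == "__main__":
    rng = random.Random(3)
    sample = [rng.expovariate(2.0) for _ in range(100)]
    print(summarize(metropolis_hastings(sample, shape=2.0, prior_rate=1.0)))
```

In this stand-in setting the posterior is itself a gamma distribution, which makes it easy to check the sampler against the exact posterior mean before moving to a model without a closed form.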
6. Bootstrap CI
We propose bootstrap confidence intervals as an alternative to the asymptotic confidence interval for the parameters of the model. For this purpose, we generated parametric bootstrap samples and constructed two bootstrap confidence intervals. First, we employed the percentile bootstrap method (boot-p) of Efron [40]. Second, we employed the bootstrap-t method (boot-t), based on the idea of Hall [41]. For further information on how these bootstrap confidence intervals work, see [42,43,44].
- (i)
- Boot-p method
Step 1: Generate separate bootstrap samples after computing the MLEs for all parameters, with , and as the actual parameters.
Step 2: Calculate the MLEs of all parameters according to the bootstrap samples, denoted by , and .
Step 3: Repeat Step 2 B times, as needed, in order to obtain a set of bootstrap estimates for , and .
Step 4: Arrange , ,..., in ascending order, as , ,...,.
Step 5: Then, the approximate CIs for , and are calculated as follows
- (ii)
- Boot-t method
Step 1: The approach is the same as that in the boot-p approach.
Step 2: Compute the bootstrap estimate of by replacing the parameters in Equation (24) with their bootstrap estimates, denoting them by and the following statistics
Step 3: Step 2 should be repeated B times, as needed.
Step 4: Arrange , where , in ascending order as .
Step 5: The approximate CIs are then obtained by
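The boot-p steps can be sketched as follows. The example assumes a one-parameter exponential model as a stand-in, because its MLE has a closed form; the function names and the number of bootstrap replications are illustrative assumptions.

```python
import random

def exp_mle(sample):
    """Closed-form MLE of the exponential rate (stand-in for the model's MLE)."""
    return len(sample) / sum(sample)

def boot_p_interval(data, n_boot=2000, level=0.95, seed=0):
    """Parametric percentile bootstrap interval for the rate."""
    rng = random.Random(seed)
    rate_hat = exp_mle(data)  # Step 1: MLE from the observed sample
    estimates = []
    for _ in range(n_boot):  # Steps 2-3: resample from the fitted model, re-estimate
        resample = [rng.expovariate(rate_hat) for _ in data]
        estimates.append(exp_mle(resample))
    estimates.sort()  # Step 4: order the bootstrap estimates
    lower = estimates[int(n_boot * (1.0 - level) / 2.0)]  # Step 5: percentiles
    upper = estimates[min(n_boot - 1, int(n_boot * (1.0 + level) / 2.0))]
    return rate_hat, (lower, upper)

if __name__ == "__main__":
    rng = random.Random(5)
    observed = [rng.expovariate(1.5) for _ in range(50)]
    print(boot_p_interval(observed))
```

The boot-t variant differs only in that each bootstrap estimate is studentized by its estimated standard error before the percentiles are taken.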
7. The Log-Odd Perks–Weibull Regression Model
If X is an RV with an odd Perks–Weibull (OPW) distribution, is an RV with a log-OPW (LOPW) distribution with the transformation parameters and . As a result, the pdf and cdf of the LOPW distribution are as follows:
and
where is the location parameter, are the shape parameters, and is the scale parameter. The SF and HRF are provided by
and
Using the linear location-scale regression model in Equation (1), where , the SF of can thus be written as:
MLE Method for Parameters of the Regression Model
The likelihood FUN of the regression model can be expressed as:
where .
By maximizing the log-likelihood function (42), the MLEs , and of , and can be obtained. The survival function for can be computed using the fitted model ():
The survival function for is derived, using the invariance characteristics of the MLE, as follows:
where and
The asymptotic distribution of is multivariate normal , where is the information matrix, under regularity conditions that hold when the parameter vector lies in the interior of the parameter space rather than on its boundary. The approximate multivariate normal distribution can be used to build approximate confidence regions for particular parameters in in the usual manner.
8. Simulation Studies
8.1. Simulation for OPE Distribution
To demonstrate the performance of the MLE, MPS, and Bayesian estimation methods with respect to the OPE distribution parameters, we ran a Monte Carlo simulation; that is, for two separate sets of parameter values, we randomly produced 10,000 samples of sizes 30, 70, and 150 from the OPE distribution:
The estimators were assessed by computing the bias and mean square error (MSE), as well as the length of the confidence interval (L.CI): the asymptotic CI for MLE and MPS, the bootstrap CI for MLE, and the credible interval determined using the HPD interval for Bayesian estimation. The simulation outcomes are shown in Table 3 and Table 4. These findings show that, as the sample size increased, the empirical means tended to approach the true values of the parameters. Furthermore, as the sample size grew larger, the MSEs and biases decreased.
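The bias and MSE computations in such a study follow a simple pattern, sketched below. An exponential model with its closed-form MLE is assumed as a stand-in for the OPE distribution, and the replication count is reduced for speed; both are illustrative choices.

```python
import random

def simulate_bias_mse(true_rate, n, n_rep=2000, seed=0):
    """Monte Carlo bias and MSE of the exponential-rate MLE for sample size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rep):
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        estimates.append(n / sum(sample))  # MLE of the rate for this replicate
    bias = sum(estimates) / n_rep - true_rate
    mse = sum((e - true_rate) ** 2 for e in estimates) / n_rep
    return bias, mse

if __name__ == "__main__":
    for n in (30, 70, 150):
        print(n, simulate_bias_mse(2.0, n))
```

Running the loop over the three sample sizes shows the same qualitative pattern reported for the simulation: both quantities shrink as n grows.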
8.2. Simulation of the LOPW Regression Model
Next, we conducted a Monte Carlo simulation to examine the performance of the ML parameter estimates of the LOPW regression model. The lifetimes were obtained from the OPW distribution, and independent variables were generated using the uniform distribution in the range (0, 1). A total of 1000 samples were created, using the parameters detailed below.
- In Table 5: multiple regression , , , and .
Table 5. Point and interval estimates by MLE for multiple regression.
- In Table 6: simple regression , , , and .
Table 6. Point and interval estimates by MLE for simple regression.
The simulation was conducted using , and 150.
9. Discretization
There are a variety of approaches in the statistical literature for converting a continuous distribution to a discrete one. The survival discretization method is the most often-used methodology for generating discrete distributions; for further information, see Roy [45]. It requires the existence of a cdf, a continuous and non-negative survival function, and times separated into unit intervals. The discrete distribution PMF is defined as follows:
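A minimal sketch of survival discretization is given below: the probability of the integer x is the drop in the continuous survival function over [x, x + 1). The exponential survival function is an assumed stand-in, and the discrete hazard is taken here as P(X = x) / P(X >= x), which is one common convention and may differ from the one used for the family.

```python
import math

def discrete_pmf(survival, x):
    """P(X = x) = S(x) - S(x + 1) for integer x >= 0."""
    return survival(x) - survival(x + 1)

def discrete_hazard(survival, x):
    """Discrete hazard, taken as P(X = x) / P(X >= x) with P(X >= x) = S(x)."""
    return discrete_pmf(survival, x) / survival(x)

def exp_survival(x, rate=0.7):
    """Stand-in continuous survival function (exponential)."""
    return math.exp(-rate * x)

if __name__ == "__main__":
    for x in range(5):
        print(x, discrete_pmf(exp_survival, x), discrete_hazard(exp_survival, x))
```

With the exponential stand-in, the result is a geometric distribution with constant hazard, which gives a convenient sanity check before substituting a more flexible survival function.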
Then, the PMF of the discrete -G family can be expressed as
The cdf of the discrete -G family is given as follows
and the HRF of the discrete -G family is
Regarding the OPE distribution, the PMF of the discrete OPE (DOPE) distribution is
Figure 4 shows various PMFs for the DOPE distribution under various parameters.
Figure 4.
DOPE PMFs under different parameters.
The HRF of the DOPE distribution is given as
Figure 5 shows various HRFs for the DOPE distribution.
Figure 5.
DOPE HRFs under different parameters.
10. Applications
We utilized three real data sets to test the superiority of the continuous distribution, COVID-19 data from Saudi Arabia to test the superiority of the discrete distribution, and Stanford heart transplant data to test the superiority of the regression model. We used various statistical measures, including Kolmogorov–Smirnov (KOS) with p-value (PV), Cramér–von Mises (CVOM), Anderson–Darling (AND), and Chi-squared (), using different criteria, including the Akaike information criterion (INC) (AKINC), Bayesian INC (BINC), Hannan–Quinn INC (HQINC), and consistent AKINC (CAKINC) statistics.
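For reference, three of the information criteria can be computed from the maximized log-likelihood as sketched below, using their commonly cited definitions. The function name and the example numbers are hypothetical, and CAKINC is left out because its definition varies between sources.

```python
import math

def information_criteria(log_lik, k, n):
    """Standard forms of AKINC, BINC, and HQINC.

    log_lik: maximized log-likelihood; k: number of parameters; n: sample size.
    """
    return {
        "AKINC": 2 * k - 2 * log_lik,
        "BINC": k * math.log(n) - 2 * log_lik,
        "HQINC": 2 * k * math.log(math.log(n)) - 2 * log_lik,
    }

if __name__ == "__main__":
    # Hypothetical fit: log-likelihood -100 with 2 parameters on 50 observations
    print(information_criteria(-100.0, k=2, n=50))
```

Smaller values indicate a better trade-off between fit and complexity, which is the sense in which the models are ranked in the tables of this section.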
10.1. Radiation Failure Mice
We first examined the real data set of radiation failure mice (RFM) reported by Hoel [9], obtained from a laboratory experiment in male mice aged 5–6 weeks that had been exposed to a 300 roentgen radiation dosage. Our goal was to look at other causes of death that were not related to the two main causes of death: reticulum cell sarcoma and thymic lymphoma. The data were 40, 42, 51, 62, 163, 179, 206, 222, 228, 252, 249, 282, 324, 333, 341, 366, 385, 407, 420, 431, 441, 461, 462, 482, 517, 517, 524, 564, 567, 586, 619, 620, 621, 622, 647, 651, 686, 761, and 763. Table 7 presents the MLE with SE and different measures for the RFM data, together with a comparison between our model and different distributions: the Marshall–Olkin alpha power exponential (MOAPEx) introduced by [46], the Marshall–Olkin alpha power Weibull (MOAPW) introduced by [47], the Weibull–Lomax (WL) model introduced by [48], the Kumaraswamy Weibull (KWW) introduced by [49], the alpha power inverse Weibull (APIW) introduced by [50], and the generalized inverse Weibull (GIW) introduced by [51]. Among the fitted models, the OPE distribution attained the smallest values of AKINC, BINC, KOS, CVOM, and AND, and the largest PV. Based on the results presented in Table 7, we note that OPE can be considered as the best model to fit the RFM data. Figure 6 confirms the results shown in Table 7. Figure 7 shows the PP-plot and QQ-plot for the OPE distribution on the RFM data set.
Table 7.
MLE with SE and different measures on RFM data set.
Figure 6.
Estimated cdfs and pdfs for RFM data.
Figure 7.
PP-plot and QQ-plot for OPE applied to RFM data.
10.2. Failure Times of a Certain Product
The second data set, that of Gacula and Kubala [52], contains 26 observations and indicates failure times for a specific product. These data have also been utilized by Nassar et al. [46]. The data are 24, 24, 26, 26, 32, 32, 33, 33, 33, 35, 41, 42, 43, 47, 48, 48, 48, 50, 52, 54, 55, 57, 57, 57, 57, and 61. Table 8 presents the MLE with SE and different measures for the failure time data, together with a comparison between our model and different distributions, including the MOAPEx, MOAPW, WL, KWW, APIW, and GIW distributions. Among the fitted models, the OPE distribution attained the smallest values of AKINC, BINC, KOS, CVOM, and AND, and the largest PV. Based on the results in Table 8, we note that the OPE represented the best model to fit the failure time data. Figure 8 confirms the results shown in Table 8. Figure 9 shows the PP-plot and QQ-plot for the OPE distribution on the failure time data set.
Table 8.
MLE with SE and different measures on the failure time data set.
Figure 8.
Estimated cdfs and pdfs for the failure time data set.
Figure 9.
PP-plot and QQ-plot for OPE on the failure time data set.
10.3. Mechanical Data
The third data set comprises the times measured between failures for repairable mechanical equipment items, obtained from the work of Seber et al. [53]. The data are 0.11, 0.30, 0.40, 0.45, 0.59, 0.63, 0.70, 0.71, 0.74, 0.77, 0.94, 1.06, 1.17, 1.23, 1.23, 1.24, 1.43, 1.46, 1.49, 1.74, 1.82, 1.86, 1.97, 2.23, 2.37, 2.46, 2.63, 3.46, 4.36, and 4.73. These data have been used to fit the extended inverse Gompertz distribution, which was compared to different distributions by Elshahhat et al. [54]. The results shown in Table 9 indicate that the OPE attained the smallest values of AKINC, BINC, KOS, CAKINC, and HQINC, and the largest PV for KOS, when compared with the results discussed by Elshahhat et al. [54]. Thus, the OPE distribution performed better than the inverse-Weibull, APIW, inverse gamma, generalized inverse Weibull (GIW), exponentiated inverted-Weibull, generalized inverted half-logistic, inverted Kumaraswamy, inverted Nadarajah–Haghighi, alpha-power inverse-Weibull, and extended inverse Gompertz (EIGo) distributions, which were discussed in [54]. Figure 10 shows the estimated cdf, estimated pdf, PP-plot, and QQ-plot for the OPE distribution, which confirm the good fit of the OPE distribution to these data.
Table 9.
MLE with SE and different measures on the mechanical data set.
Figure 10.
Estimated cdf, pdf, PP-plot, and QQ-plot for the OPE distribution on the mechanical data set.
10.4. Stanford Heart Transplant Data
Data for patients were acquired from the work of Kalbfleisch and Prentice [55]. The number of days between admittance to a heart transplant program and death was used to calculate the patient survival times. Each patient was linked to the following data: , log survival follow-up time (days); , age (in years); and , prior surgery (coded as 0 = No, 1 = Yes). We present the fitting results for the following model:
where follows the LOPW distribution. Table 10 shows MLE, SE, and Z-values, as well as PVs, for the LOPW regression model, while Table 11 provides different measures obtained for the LOPW regression model. Figure 11 shows the correlation values.
Table 10.
MLE, SE, and Z-values, along with PVs, for the LOPW regression model.
Table 11.
Different measures for the LOPW regression model.
Figure 11.
Correlation matrix.
Then,
The results regarding the prediction of dependent variables using the regression model are shown in Figure 12.
Figure 12.
Data prediction using the regression model.
10.5. COVID-19 Data
We used a COVID-19 data set from Saudi Arabia that spanned 32 days (from 1 September 2021 to 4 October 2021). This data set comprised the newly reported daily deaths. The data were as follows: 6, 7, 8, 5, 7, 7, 6, 6, 7, 6, 6, 7, 6, 5, 5, 7, 5, 6, 5, 5, 6, 5, 7, 5, 4, 6, 5, 5, 5, 3, 3, and 2. These data were obtained from the World Health Organization (at https://covid19.who.int/ accessed on 3 March 2022). Table 12 shows the data values, real frequency (count), and frequencies estimated using different discrete distributions. The distributions used for comparison were the discrete Marshall–Olkin inverse Topp–Leone (DMOITL) introduced by [56], discrete Burr (DB) introduced by [57], discrete inverse Weibull (DIW) introduced by [58], negative binomial distribution (NBinom) introduced by [59], Poisson (Pois), discrete generalized exponential (DGE) suggested by [60], discrete alpha power inverse Lomax (DAPL) introduced by [61], discrete Lindley (DL) introduced by [62], discrete inverse Topp–Leone (DITL) introduced by [63], exponentiated discrete Weibull introduced by [64], and discrete Marshall–Olkin generalized exponential (DMOGE) introduced by [65]. Figure 13 shows that the DOPE distribution was the best model for fitting these data, with estimated frequencies close to the real frequencies. To confirm this conclusion, we used the KS value and Chi-squared () test to determine the best model fit for these data, as well as the AKINC, CAKINC, BINC, and HQINC measures. The results are shown in Table 13. We note that the DOPE distribution attained the smallest values of all of these measures. Figure 14 shows the cdf and PMF for the DOPE distribution of these data.
Table 12.
Estimated count of values determined by each model.
Figure 13.
Graphical plots of the expected frequencies and the data obtained using the PMFs of different distributions.
Table 13.
MLE with SE and different measures for the compared models.
Figure 14.
cdf and PMF for DOPE distribution.
11. Concluding Remarks
In this study, we explored the novel odd Perks-G family, and several of its statistical and mathematical features were established. We obtained some of its special models, including the OPU, OPE, OPW, and OPL distributions. The associated model parameters were estimated using the ML technique, the MPS method, and the Bayesian estimation approach, and simulation studies were conducted to evaluate the effectiveness of the OPE estimators using various estimation methods based on biases, MSE, and the CI length. In addition, the OPW distribution was used to develop a new log-location-scale regression model. The unknown parameters of the new regression model were estimated using the ML method. Furthermore, we introduced the discrete odd Perks-G family using the survival discretization method and obtained the DOPE distribution as a special model. Finally, we examined the utility of -G family distributions using three real data sets, analyzed Stanford heart transplant data using the LOPW regression model, and analyzed COVID-19 data using the discrete model. The OPE distribution outperformed the competing distributions considered in terms of goodness of fit, according to our findings. Furthermore, the LOPW regression model fit the Stanford heart transplant data well. Additionally, the DOPE distribution provided a good fit to the COVID-19 data. In our future research, the new suggested family will be used to generate more new distributions, the statistical properties of which will be explored. We also intend to study the statistical inferences of new models generated using the odd Perks-G family.
Author Contributions
Conceptualization, I.E. and E.M.A.; methodology, I.E. and E.M.A.; software, E.M.A. and M.E.; validation, N.A., S.A.A., M.E. and I.E.; formal analysis, E.M.A.; resources, I.E.; data curation, I.E., N.A. and S.A.A.; writing—original draft preparation, I.E., E.M.A. and M.E.; writing—review and editing, N.A. and S.A.A. and M.E.; funding acquisition, I.E., N.A. and S.A.A. All authors have read and agreed to the published version of the manuscript.
Funding
The authors extend their appreciation to the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University for funding this work through Research Group no. RG-21-09-08.
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
Data sets are available in the application section.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Mudholkar, G.S.; Srivastava, D.K. Exponentiated Weibull family for analyzing bathtub failure-real data. IEEE Trans. Reliab. 1993, 42, 299–302. [Google Scholar] [CrossRef]
- Marshall, A.; Olkin, I. A new method for adding a parameter to a family of distributions with applications to the exponential and Weibull families. Biometrika 1997, 84, 641–652.
- Alzaghal, A.; Famoye, F.; Lee, C. Exponentiated T-X Family of Distributions with Some Applications. Int. J. Stat. Probab. 2013, 3, 31–49.
- Hassan, A.S.; Elgarhy, M.; Shakil, M. Type II half Logistic family of distributions with applications. Pak. J. Stat. Oper. Res. 2017, 13, 245–264.
- Bourguignon, M.; Silva, R.B.; Cordeiro, G.M. The Weibull-G family of probability distributions. J. Data Sci. 2014, 12, 1253–1268.
- Eugene, N.; Lee, C.; Famoye, F. Beta-normal distribution and its applications. Commun. Stat. Theor. Methods 2002, 31, 497–512.
- Zografos, K.; Balakrishnan, N. On families of beta-and generalized gamma-generated distributions and associated inference. Stat. Methodol. 2009, 6, 344–362.
- Hassan, A.S.; Hemeda, S.E. The Additive Weibull-G Family of Probability Distributions. Int. J. Math. Its Appl. 2016, 4, 151–164.
- Gomes-Silva, F.S.; Percontini, A.; de Brito, E.; Ramos, M.W.; Venâncio, R.; Cordeiro, G.M. The odd Lindley-G family of distributions. Austrian J. Stat. 2017, 46, 65–87.
- Al-Moisheer, A.S.; Elbatal, I.; Almutiry, W.; Elgarhy, M. Odd inverse power generalized Weibull generated family of distributions: Properties and applications. Math. Probl. Eng. 2021, 2021, 5082192.
- Afify, A.Z.; Cordeiro, G.M.; Ibrahim, N.A.; Jamal, F.; Elgarhy, M.; Nasir, M.A. The Marshall-Olkin odd Burr III-G family: Theory, estimation, and engineering applications. IEEE Access 2021, 9, 4376–4387.
- Al-Marzouki, S.; Jamal, F.; Chesneau, C.; Elgarhy, M. Topp-Leone odd Fréchet generated family of distributions with applications to COVID-19 data sets. Comput. Model. Eng. Sci. 2020, 125, 437–458.
- Badr, M.M.; Elbatal, I.; Jamal, F.; Chesneau, C.; Elgarhy, M. The transmuted odd Fréchet-G family of distributions: Theory and applications. Mathematics 2020, 8, 958.
- Ahmad, Z.; Elgarhy, M.; Hamedani, G.G.; Butt, N.S. Odd generalized N-H generated family of distributions with application to exponential model. Pak. J. Stat. Oper. Res. 2020, 16, 53–71.
- Haq, M.A.; Elgarhy, M.; Hashmi, S. The generalized odd Burr III family of distributions: Properties, applications and characterizations. J. Taibah Univ. Sci. 2019, 13, 961–971.
- Haq, A.; Elgarhy, M. The odd Fréchet-G class of probability distributions. J. Stat. Appl. Probab. 2018, 7, 189–203.
- Tahir, M.H.; Zubair, M.; Cordeiro, G.M.; Alzaatreh, A.; Mansoor, M. The Weibull-Power Cauchy Distribution: Model, Properties and Applications. Hacet. J. Math. Stat. 2017, 46, 767–789.
- Cordeiro, G.; de Castro, M. A new family of generalized distributions. J. Stat. Comput. Simul. 2011, 81, 883–898.
- Hassan, A.S.; Almetwally, E.M.; Ibrahim, G.M. Kumaraswamy inverted Topp–Leone distribution with applications to COVID-19 data. Comput. Mater. Contin. 2021, 68, 337–358.
- Almetwally, E.M. The odd Weibull inverse Topp–Leone distribution with applications to COVID-19 data. Ann. Data Sci. 2022, 9, 121–140.
- Nagy, M.; Almetwally, E.M.; Gemeay, A.M.; Mohammed, H.S.; Jawa, T.M.; Sayed-Ahmed, N.; Muse, A.H. The New Novel Discrete Distribution with Application on COVID-19 Mortality Numbers in Kingdom of Saudi Arabia and Latvia. Complexity 2021, 2021, 7192833.
- Perks, W. On some experiments in the graduation of mortality statistics. J. Inst. Actuar. 1932, 43, 12–57.
- Marshall, A.W.; Olkin, I. Life Distributions; Springer: New York, NY, USA, 2007; Volume 13.
- Richards, S.J. Applying survival models to pensioner mortality data. Br. Actuar. J. 2008, 14, 257–303.
- Haberman, S.; Renshaw, A. A comparative study of parametric mortality projection models. Insur. Math. Econ. 2011, 48, 35–55.
- Alzaatreh, A.; Lee, C.; Famoye, F. A new method for generating families of continuous distributions. Metron 2013, 71, 63–79.
- Carrasco, J.M.; Ortega, E.M.; Paula, G.A. Log-modified Weibull regression models with censored data: Sensitivity and residual analysis. Comput. Stat. Data Anal. 2008, 52, 4021–4039.
- Silva, G.O.; Ortega, E.M.; Cancho, V.G. Log-Weibull extended regression model: Estimation, sensitivity and residual analysis. Stat. Methodol. 2010, 7, 614–631.
- Hashimoto, E.M.; Ortega, E.M.; Cordeiro, G.M.; Barreto, M.L. The Log-Burr XII regression model for grouped survival data. J. Biopharm. Stat. 2012, 22, 141–159.
- Ortega, E.M.; Cordeiro, G.M.; Kattan, M.W. The log-beta Weibull regression model with application to predict recurrence of prostate cancer. Stat. Pap. 2013, 54, 113–132.
- Alamoudi, H.H.; Mousa, S.A.; Baharith, L.A. Estimation and application in log-Fréchet regression model using censored data. Int. J. Adv. Stat. Probab. 2017, 5, 23–31.
- Baharith, L.A.; Al-Beladi, K.M.; Klakattawi, H.S. The Odds Exponential-Pareto IV Distribution: Regression Model and Application. Entropy 2020, 22, 497.
- Rényi, A. On measures of entropy and information. In Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 20–30 June 1960; Volume 1, pp. 547–561.
- Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 52, 479–487.
- Havrda, J.; Charvat, F. Quantification method of classification processes, concept of structural a-entropy. Kybernetika 1967, 3, 30–35.
- Arimoto, S. Information-theoretical considerations on estimation problems. Inf. Control. 1971, 19, 181–194.
- Cheng, R.C.H.; Amin, N.A.K. Estimating parameters in continuous univariate distributions with a shifted origin. J. R. Stat. Soc. Ser. B (Methodol.) 1983, 45, 394–403.
- Ranneby, B. The maximum spacing method. An estimation method related to the maximum likelihood method. Scand. J. Stat. 1984, 11, 93–112.
- Chen, M.H.; Shao, Q.M. Monte Carlo estimation of Bayesian credible and HPD intervals. J. Comput. Graph. Stat. 1999, 8, 69–92.
- Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; CRC Press: Boca Raton, FL, USA, 1994.
- Hall, P. Theoretical comparison of bootstrap confidence intervals. Ann. Stat. 1988, 16, 927–953.
- Muhammed, H.Z.; Almetwally, E.M. Bayesian and non-Bayesian estimation for the bivariate inverse Weibull distribution under progressive type-II censoring. Ann. Data Sci. 2020, 1–32.
- Nassr, S.G.; Almetwally, E.M.; El Azm, W.S.A. Statistical inference for the extended Weibull distribution based on adaptive type-II progressive hybrid censored competing risks data. Thail. Stat. 2021, 19, 547–564.
- Almongy, H.M.; Almetwally, E.M.; Alharbi, R.; Alnagar, D.; Hafez, E.H.; El-Din, M.M.M. The Weibull generalized exponential distribution with censored sample: Estimation and application on real data. Complexity 2021, 2021, 6653534.
- Roy, D. Discrete Rayleigh distribution. IEEE Trans. Reliab. 2004, 53, 255–260.
- Nassar, M.; Kumar, D.; Dey, S.; Cordeiro, G.M.; Afify, A.Z. The Marshall–Olkin alpha power family of distributions with applications. J. Comput. Appl. Math. 2019, 351, 41–53.
- Almetwally, E.M.; Sabry, M.A.; Alharbi, R.; Alnagar, D.; Mubarak, S.A.; Hafez, E.H. Marshall–Olkin alpha power Weibull distribution: Different methods of estimation based on type-I and type-II censoring. Complexity 2021, 2021, 5533799.
- Tahir, M.H.; Cordeiro, G.M.; Mansoor, M.; Zubair, M. The Weibull-Lomax distribution: Properties and applications. Hacet. J. Math. Stat. 2015, 44, 461–480.
- Cordeiro, G.M.; Ortega, E.M.; Nadarajah, S. The Kumaraswamy Weibull distribution with application to failure data. J. Frankl. Inst. 2010, 347, 1399–1429.
- Basheer, A.M. Alpha power inverse Weibull distribution with reliability application. J. Taibah Univ. Sci. 2019, 13, 423–432.
- De Gusmao, F.R.; Ortega, E.M.; Cordeiro, G.M. The generalized inverse Weibull distribution. Stat. Pap. 2011, 52, 591–619.
- Gacula, M.C., Jr.; Kubala, J.J. Statistical models for shelf life failures. J. Food Sci. 1975, 40, 404–409.
- Seber, G.A.; Wild, C.J. Wiley Series in Probability and Statistics. Linear Regression Analysis; Wiley: Hoboken, NJ, USA, 2003; pp. 36–44.
- Elshahhat, A.; Aljohani, H.M.; Afify, A.Z. Bayesian and Classical Inference under Type-II Censored Samples of the Extended Inverse Gompertz Distribution with Engineering Applications. Entropy 2021, 23, 1578.
- Kalbfleisch, J.D.; Prentice, R.L. The Statistical Analysis of Failure Time Data; John Wiley & Sons: Hoboken, NJ, USA, 2011; Volume 360.
- Almetwally, E.M.; Abdo, D.A.; Hafez, E.H.; Jawa, T.M.; Sayed-Ahmed, N.; Almongy, H.M. The new discrete distribution with application to COVID-19 Data. Results Phys. 2022, 32, 104987.
- Krishna, H.; Pundir, P.S. Discrete Burr and discrete Pareto distributions. Stat. Methodol. 2009, 6, 177–188.
- Jazi, M.A.; Lai, C.D.; Alamatsaz, M.H. A discrete inverse Weibull distribution and estimation of its parameters. Stat. Methodol. 2010, 7, 121–132.
- Fisher, P. Negative Binomial Distribution. Ann. Eugen. 1941, 11, 182–187.
- Nekoukhou, V.; Alamatsaz, M.H.; Bidram, H. Discrete generalized exponential distribution of a second type. Statistics 2013, 47, 876–887.
- Almetwally, E.M.; Ibrahim, G.M. Discrete alpha power inverse Lomax distribution with application of COVID-19 data. Int. J. Appl. Math. 2020, 9, 11–22.
- Gómez-Déniz, E.; Calderín-Ojeda, E. The discrete Lindley distribution: Properties and applications. J. Stat. Comput. Simul. 2011, 81, 1405–1416.
- Eldeeb, A.S.; Ahsan-Ul-Haq, M.; Babar, A. A discrete analog of inverted Topp-Leone distribution: Properties, estimation and applications. Int. J. Anal. Appl. 2021, 19, 695–708.
- Nekoukhou, V.; Bidram, H. The exponentiated discrete Weibull distribution. Sort 2015, 39, 127–146.
- Almetwally, E.M.; Almongy, H.M.; Saleh, H.A. Managing risk of spreading “COVID-19” in Egypt: Modelling using a discrete Marshall-Olkin generalized exponential distribution. Int. J. Probab. Stat. 2020, 9, 33–41.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).