Abstract
The use of statistical distributions to model life phenomena has received considerable attention in the literature. Recent studies have shown the potential of statistical distributions for modeling data in the applied sciences, especially the environmental sciences. Among them, the Weibull distribution is one of the best-known models and can be used very effectively for modeling data on pollution and gas emissions, to name a few fields. In this paper, we introduce a family of distributions, which we call the modified Alpha-Power Weibull-X family of distributions. Based on the proposed family, we introduce a new model with five parameters, the modified Alpha-Power Weibull–Weibull distribution. Some of its mathematical properties are determined. Maximum likelihood and Bayesian estimates of the model parameters are derived, together with maximum likelihood, bootstrap, and Bayesian highest posterior density (HPD) credible intervals for the unknown parameters. A Monte Carlo simulation study, based on the parameters of the proposed model, is performed to evaluate the performance of the estimates. An application to a carbon dioxide emissions dataset is presented to capture unique symmetric and asymmetric patterns and to illustrate the applicability and potential of the model. For this dataset, the proposed model is compared with the modified alpha power Weibull exponential distribution and the two-parameter Weibull distribution. To determine which of the competing distributions is best, we draw on analytical tools such as the Kolmogorov–Smirnov test. Based on these analytical measures, we find that the new model outperforms the competing models.
1. Introduction
Increased burning of fossil fuels, deforestation, soil degradation, and various industrial practices have increased carbon dioxide levels in the atmosphere, raising global concerns about climate change and its impact on the environment. In the field of Big Data science and other related fields, the best possible description of real-world phenomena is an important research topic. We refer the reader to [1,2] for more information. Global carbon dioxide emissions from fossil fuels have increased substantially since 1900. Since 1970, carbon dioxide emissions have increased by about 90%, with increasing emissions from fossil fuel combustion and industry accounting for about 78% of total greenhouse gas emissions from 1970 to 2011. Agriculture, deforestation, and other land use changes accounted for the second largest share. We refer the reader to [3,4,5] for more background on recent reports on carbon dioxide emissions.
The quality of the statistical analysis depends on the statistical distribution chosen to model the data. Since various distributions have been used to represent the data and the well-known classical distributions are not sufficient to explain the actual behavior of the data, many transformed, augmented, composite, and mixed distributions have been developed and applied in various fields. However, there are still many important problems that cannot be explained by the current distributions, so we need more flexible and consistent distributions for these problems; see [6,7,8]. One of the most important and recent problems that has piqued our interest is carbon dioxide emissions. For more information on statistical modeling with different models, see [9,10,11].
Data on carbon dioxide emissions behave in a complex way: they do not have a fixed shape, but rather fluctuate and are unstable. We have therefore set out to find a suitable probability distribution to explain the behavior of carbon dioxide emissions, which are an environmental and health problem.
The objectives of this research are:
- To introduce a new probability distribution that is more flexible and better suited to modeling real data, obtained by adding three additional parameters to the Weibull distribution function.
- To derive general mathematical properties of the new distribution.
- To estimate the parameters of the distribution from complete data using the maximum likelihood and Bayesian estimation methods, and to compare the two methods through Monte Carlo simulation in terms of estimates, biases, and expected errors.
- To apply the results to the study of carbon dioxide emissions. To show the adequacy of the distribution, it is compared with some other special and standard distributions.
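Now, we introduce the suggested family of distributions, called the new modified alpha power Weibull-X family. Let $r(t)$ be the probability density function (PDF) of a random variable (RV) $T$, where $T \in [a, b]$ for $-\infty \le a < b \le \infty$, and let $W[F(x)]$ be a function of $F(x)$, where $F(x)$ is the cumulative distribution function (CDF) of an RV $X$, satisfying the conditions
- (1) $W[F(x)] \in [a, b]$,
- (2) $W[F(x)]$ is differentiable and monotonically increasing, and
- (3) $W[F(x)] \to a$ as $x \to -\infty$ and $W[F(x)] \to b$ as $x \to \infty$.
Alzaatreh et al. [12] define the CDF of the T-X family of distributions as
$$G(x) = \int_{a}^{W[F(x)]} r(t)\,dt. \qquad (1)$$
The corresponding PDF is
$$g(x) = \left\{ \frac{d}{dx} W[F(x)] \right\} r\left\{ W[F(x)] \right\}. \qquad (2)$$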
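As a minimal numerical sketch of the T-X construction of Alzaatreh et al. [12], the snippet below composes a generator CDF with a baseline CDF. Every concrete choice here is an assumption made only for illustration: the link W(F) = -log(1 - F), a Weibull generator for T, an exponential baseline for X, and the helper names `weibull_cdf`, `exponential_cdf`, `tx_cdf` and `example_tx_cdf`. It is not the APMW-X family itself.

```python
import math


def weibull_cdf(t, shape, scale):
    """CDF of a Weibull RV on [0, inf); assumed form 1 - exp(-(t/scale)**shape)."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def exponential_cdf(x, rate):
    """Baseline CDF F(x) of an exponential RV (illustrative choice for X)."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0


def tx_cdf(x, generator_cdf, baseline_cdf):
    """T-X CDF G(x) = R(W(F(x))), with the assumed link W(F) = -log(1 - F)."""
    f = baseline_cdf(x)
    if f <= 0.0:
        return 0.0
    if f >= 1.0:
        return 1.0
    w = -math.log(1.0 - f)  # maps F in (0, 1) onto the support (0, inf) of T
    return generator_cdf(w)


def example_tx_cdf(x):
    """Weibull generator (shape 1.5, scale 2.0) over an exponential baseline (rate 0.7)."""
    return tx_cdf(
        x,
        lambda t: weibull_cdf(t, 1.5, 2.0),
        lambda v: exponential_cdf(v, 0.7),
    )
```

With an exponential baseline this link reduces to W(F(x)) = rate * x, so the composed CDF is simply a rescaled Weibull CDF, which gives an easy sanity check of the composition.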
Based on the alpha power transformation method, the modified alpha power Weibull distribution was proposed by Chettri et al. [13]. The modified alpha power Weibull distribution CDF and PDF are respectively given by
and
where .
The remainder of this article is organised as follows: Section 2 introduces the new modified alpha power Weibull-X family. Section 3 derives the modified alpha power Weibull–Weibull distribution. Section 4 discusses the estimation of the unknown parameters under the squared error, LINEX, and general entropy loss functions. A Monte Carlo simulation study is discussed in Section 5. In Section 6, the proposed model is applied to carbon dioxide emissions data to evaluate its efficiency. In Section 7, the discussion and some future frameworks are presented. Finally, brief conclusions are drawn in Section 8.
2. The New Modified Alpha Power Weibull-X Family
Let ; then, let its CDF and PDF be given by and , respectively. Considering the modified alpha power Weibull distribution as a generator, we obtain the modified alpha power Weibull X family of distributions by replacing x with in the alpha power Weibull distribution; this is due to its unique symmetric and asymmetric patterns. Now, let
where is a vector of parameters. By using (1), we define the cdf of the modified alpha power Weibull (APMW)-X family by
The APMW-X density function is
The survival and hazard functions of APMW-X are given, respectively, by
and
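The relationship between a CDF, its PDF, and the survival and hazard functions can be sketched generically as S(x) = 1 - G(x) and h(x) = g(x) / S(x). The snippet below is only an illustration under stated assumptions: it uses a plain two-parameter Weibull in the form 1 - exp(-(x/scale)**shape) as a stand-in, not the APMW-X expressions, and the function names are hypothetical.

```python
import math


def weibull_pdf(x, shape, scale):
    """Weibull PDF under the assumed (shape, scale) parameterization."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def weibull_cdf(x, shape, scale):
    """Weibull CDF under the same assumed parameterization."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def survival(cdf, x):
    """Survival function S(x) = 1 - G(x) for any CDF callable."""
    return 1.0 - cdf(x)


def hazard(pdf, cdf, x):
    """Hazard rate h(x) = g(x) / S(x); returns inf where S(x) is zero."""
    s = survival(cdf, x)
    if s <= 0.0:
        return math.inf
    return pdf(x) / s
```

For shape 1 the stand-in reduces to an exponential model with constant hazard 1/scale, while shape 2 gives an increasing hazard, which is the kind of monotonic behavior a richer family is meant to extend.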
The main motivations for using the APMW-X family in practice are the following:
- (i) To develop the flexibility and properties of the basic models;
- (ii) To provide a suitable procedure for adding additional parameters in extended models with strong outliers, which are very useful in gas emission modeling;
- (iii) To introduce the extended version of a basic model with closed forms for the CDF and hazard rate function, where the special submodels of this family can be used in the analysis of censored data sets;
- (iv) To model data sets with high tail content, which the special cases of the APMW-X approach can do better than existing competing models.
Figure 1 plots different survival functions (SFs) for the APMW-X family.
Figure 1.
Different SF for the APMW-X .
2.1. Quantile Function
Supposing , we have solved the following equation for the quantile function :
Letting , we have
Thus,
where is the inverse cumulative of the baseline distribution.
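When a quantile function is not available in closed form, the defining equation G(x) = u can be solved numerically. The sketch below uses plain bisection on a non-negative support; the function name `quantile_bisect`, its bracketing strategy, and the Weibull CDF used to check it are assumptions for illustration, not the closed-form quantile derived in this section.

```python
import math


def quantile_bisect(cdf, u, lo=0.0, hi=1.0, tol=1e-10, max_iter=200):
    """Solve cdf(x) = u for x >= lo by bracket expansion followed by bisection."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    # Expand the upper end of the bracket until it encloses the quantile.
    while cdf(hi) < u:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def weibull_cdf(x, shape, scale):
    """Weibull CDF in the assumed form 1 - exp(-(x/scale)**shape)."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def weibull_quantile(u, shape, scale):
    """Closed-form Weibull quantile, used only to check the numerical solver."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)
```

Feeding uniform random numbers through such a quantile function is also the usual inverse-transform way to simulate from the model.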
2.2. The Likelihood Function of the APMW-X Family
Let be the observed sample. The likelihood function based on the APMW-X for is given by
The corresponding log-likelihood function is given by
Let and be the baseline CDF and PDF, respectively. The first partial derivatives of (15) with respect to are given by
and
3. The Modified Alpha Power Weibull–Weibull (APMW-W) Distribution
The aim of this section is to propose a new composite probability distribution model (the modified alpha power Weibull–Weibull distribution) using the alpha power Weibull-X family to obtain a more convenient and flexible distribution for modeling observations. The main motivations for studying and applying the APMW-X method to the Weibull distribution are the following:
- The APMW method is an effective way to add more than three parameters to the distribution family;
- The APMW method makes the distribution richer and more flexible;
- The APMW method provides models that can capture both monotonic and non-monotonic hazard rate functions (HRFs);
- The APMW method gives us a better fit than other modified models with the same or fewer parameters.
Let X be a random variable (R.V.) that follows the two-parameters Weibull distribution , then its CDF, denoted by is given by
Here, , and are the shape and the scale parameter, respectively. The corresponding PDF, denoted by is given by
by taking and to be and , respectively. The CDF and PDF of the APMW-W distribution are given, respectively, by
The survival and hazard functions of APMW-W are given, respectively, by
Figure 2.
Different SF for the APMW-X .
Figure 3.
Different PDF for the APMW-X .
Figure 4.
Different HRF for the APMW-X .
By substituting the CDF of the APMW-W distribution (21) in (13), the quantile function is
if is the quantile of the baseline distribution APMW-W . We can write
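In addition, the effects of the shape parameters on skewness and kurtosis can be determined using quantile measures. We obtain skewness and kurtosis measures of the APMW-W distribution. The skewness measure (SK) of X (see Bowley [14]) is given by
$$SK = \frac{Q(3/4) + Q(1/4) - 2\,Q(1/2)}{Q(3/4) - Q(1/4)},$$
and the kurtosis (K) (see Moors [15]) is given by
$$K = \frac{Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)}{Q(6/8) - Q(2/8)},$$
where $Q(\cdot)$ denotes the quantile function.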
Some quantile values for and are shown in Figure 5. The skewness obtained is 0.592713, and the kurtosis is 2.19598 for the same case shown in Figure 5. Table 1 shows some quantile values for the same case. Figure 6, Figure 7 and Figure 8 show the SK and K for some cases of APMW-W .
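The quantile-based measures of Bowley [14] and Moors [15] can be computed from any quantile function. The sketch below takes a callable `q` (a hypothetical name) that maps a probability to a quantile; the uniform and exponential quantile functions in the usage note are stand-ins chosen because their values are easy to verify by hand, not the APMW-W quantile function.

```python
import math


def bowley_skewness(q):
    """Bowley skewness: (Q(3/4) + Q(1/4) - 2 Q(1/2)) / (Q(3/4) - Q(1/4))."""
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))


def moors_kurtosis(q):
    """Moors kurtosis: (Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)) / (Q(6/8) - Q(2/8))."""
    numerator = q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)
    return numerator / (q(6 / 8) - q(2 / 8))


def exponential_quantile(u):
    """Quantile function of a unit-rate exponential RV (illustrative stand-in)."""
    return -math.log(1.0 - u)
```

For a symmetric distribution such as the uniform, `bowley_skewness(lambda u: u)` is zero, while the right-skewed exponential gives ln(4/3)/ln(3), a positive value.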
Figure 5.
Different quantile functions for the APMW-X .
Table 1.
Some quantile values for and .
Figure 6.
Plots for the and .
Figure 7.
Plots for the and .
Figure 8.
Plots for the and .
4. Estimation of the Parameters
We obtain estimators of the model parameters of the APMW-W distribution in this section.
4.1. The Maximum Likelihood Estimation
Here, we discuss the maximum likelihood estimators (MLEs) of the model parameters of the APMW-W distribution. The first partial derivatives of (15) with respect to are given by
and
where
and
The MLEs of the parameters are obtained by equating Equations (16)–(18), (26) and (27) to zero and solving the resulting equations simultaneously. However, it is difficult to solve these equations to obtain the estimates of the unknown parameters in explicit form. Therefore, a numerical technique can be used to solve these nonlinear equations.
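To illustrate what solving likelihood equations numerically looks like, the sketch below fits the plain two-parameter Weibull baseline by bisection on its profile score equation. This is a deliberately simplified stand-in: the five-parameter APMW-W score equations are not reproduced here, and the function name `weibull_mle`, the bracket [1e-3, 50] for the shape, and the (shape, scale) parameterization 1 - exp(-(x/scale)**shape) are all assumptions.

```python
import math


def weibull_mle(data, lo=1e-3, hi=50.0, tol=1e-10):
    """MLE of (shape, scale) for positive data via the Weibull profile score."""
    n = len(data)
    logs = [math.log(x) for x in data]
    mean_log = sum(logs) / n

    def score(k):
        # Profile score for the shape k; it is increasing in k with a single root.
        powered = [x ** k for x in data]
        weighted = sum(p * l for p, l in zip(powered, logs))
        return weighted / sum(powered) - 1.0 / k - mean_log

    # Bisection on the shape; assumes the root lies inside [lo, hi] and that
    # the data are scaled so that x ** hi does not overflow.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    shape = 0.5 * (lo + hi)
    # Given the shape, the scale has a closed-form maximizer.
    scale = (sum(x ** shape for x in data) / n) ** (1.0 / shape)
    return shape, scale
```

For a model with more parameters, the same idea generally requires a multivariate root finder or a direct numerical maximizer of the log-likelihood rather than one-dimensional bisection.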
4.2. Bayesian Estimation
Bayesian inference is a suitable method for working with complete samples from the APMW-W distribution. Prior predictive distributions can be used to check the reasonableness of a prior for a given situation before observing sample data. The gamma distribution is one of the most commonly used prior distributions and gives good experimental results. We assume that the five model parameters are RVs that follow gamma prior PDFs, one for each parameter. Then, the posterior density of the parameters given the data is given by
where J is the normalizing constant.
5. Monte Carlo Simulation Study
This section is concerned with evaluating the performance of the maximum likelihood and Bayesian estimators of the APMW-W distribution through a Monte Carlo simulation study. The simulation is performed for the parameters and of the APMW-W distribution model.
5.1. MLE Monte Carlo Simulation
The simulation study is conducted as follows.
- Random samples of size are generated from the APMW-W distribution.
- Model parameters are estimated using the maximum likelihood method;
- One thousand replicates are performed to calculate the biases and expected errors (ERs) of these estimators;
- The formulas used to calculate the estimates, biases, and ERs are as follows: and
- Step (4) is also repeated for the parameters and .
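The replicate-based steps above can be sketched as a short Monte Carlo loop. The example is only illustrative and rests on loud assumptions: an exponential model with its closed-form rate MLE stands in for the APMW-W distribution, the expected error is taken to be the mean squared error, and the function name `simulate_bias_mse` is hypothetical.

```python
import random
import statistics


def simulate_bias_mse(true_rate, n, replicates=1000, seed=0):
    """Average estimate, bias and mean squared error of the rate MLE over replicates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Step 1: draw a sample of size n from the (stand-in) model.
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        # Step 2: maximum likelihood estimate; for the exponential it is 1 / mean.
        estimates.append(1.0 / statistics.fmean(sample))
    # Steps 3-4: summarize the replicates.
    mean_estimate = statistics.fmean(estimates)
    bias = mean_estimate - true_rate
    mse = statistics.fmean([(e - true_rate) ** 2 for e in estimates])
    return mean_estimate, bias, mse
```

Calling the function for increasing n (for example 25 and 200) shows the expected shrinkage of the error measure with sample size.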
5.2. The Bootstrap Confidence Intervals: Boot-p Algorithm
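Next, to obtain the boot-p bootstrap confidence intervals (C.I.s) for the unknown parameters, we apply the following algorithm:
- Generate a sample of size n from the APMW-W distribution and estimate the parameters;
- Generate another sample of size n using the estimates from Step 1. Then, re-estimate the parameters to obtain the bootstrap estimates;
- Repeat Step 2 B times;
- Let $\hat{G}(x)$ be the CDF of the bootstrap estimates of a parameter $\theta$. The $100(1-\gamma)\%$ C.I. of $\theta$ is given by $\left( \hat{\theta}_{boot\text{-}p}(\gamma/2),\ \hat{\theta}_{boot\text{-}p}(1-\gamma/2) \right)$, where $\hat{\theta}_{boot\text{-}p}(x) = \hat{G}^{-1}(x)$ and x is prefixed.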
For more details about the bootstrap confidence intervals, one may refer to Kundu and Joarder [16].
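A percentile (boot-p) interval of this kind can be sketched in a few lines. The example below is a hedged illustration, not the implementation used in the study: an exponential model with rate MLE 1/mean stands in for the APMW-W distribution, and the names `boot_p_interval`, `B` and `gamma` (the nominal error level) are assumptions.

```python
import random
import statistics


def boot_p_interval(sample, B=1000, gamma=0.05, seed=0):
    """Parametric percentile-bootstrap interval for an exponential rate."""
    rng = random.Random(seed)
    n = len(sample)
    # Step 1: estimate the parameter from the observed sample.
    rate_hat = 1.0 / statistics.fmean(sample)
    boot_estimates = []
    for _ in range(B):
        # Step 2: simulate a new sample from the fitted model and re-estimate.
        resample = [rng.expovariate(rate_hat) for _ in range(n)]
        boot_estimates.append(1.0 / statistics.fmean(resample))
    # Step 4: read the interval off the empirical CDF of the bootstrap estimates.
    boot_estimates.sort()
    lower = boot_estimates[int((gamma / 2.0) * B)]
    upper = boot_estimates[min(B - 1, int((1.0 - gamma / 2.0) * B))]
    return rate_hat, lower, upper
```

The interval bounds are simply order statistics of the re-estimated parameter, which is why no normal approximation is needed.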
5.3. Bayesian Monte Carlo Simulation Study
We assume that the five model parameters have gamma prior PDFs, one for each parameter, as in Section 4.2. We use the Metropolis–Hastings (M-H) procedure as follows:
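- Set start values for the parameters. Then, simulate a sample of size n from the APMW-W distribution, and initialize the iteration counter;
- Simulate candidate values of the parameters using the proposal distributions;
- Calculate the acceptance probability;
- Simulate U from Uniform(0, 1);
- If U does not exceed the acceptance probability, accept the candidate values; otherwise, keep the current values;
- Increase the iteration counter by one;
- Iterate Steps 2–6 for M repetitions, and obtain the draws of each parameter for all M iterations.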
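The accept/reject loop above can be sketched for a single parameter. The example makes several assumptions that are not taken from the study: an exponential likelihood with a Gamma(a, b) prior on its rate (so the exact posterior is known and the chain can be checked), a symmetric normal random-walk proposal, and the names `log_posterior` and `metropolis_hastings`.

```python
import math
import random


def log_posterior(rate, data_sum, n, a, b):
    """Log of Gamma(a, b) prior times exponential likelihood, up to a constant."""
    if rate <= 0.0:
        return -math.inf
    return (a - 1.0 + n) * math.log(rate) - (b + data_sum) * rate


def metropolis_hastings(data, a=1.0, b=1.0, start=1.0, step=0.3, M=20000, seed=0):
    """Random-walk Metropolis-Hastings chain of length M for the rate parameter."""
    rng = random.Random(seed)
    n, data_sum = len(data), sum(data)
    current = start
    current_lp = log_posterior(current, data_sum, n, a, b)
    chain = []
    for _ in range(M):
        # Step 2: propose a candidate from a symmetric normal proposal.
        proposal = rng.gauss(current, step)
        proposal_lp = log_posterior(proposal, data_sum, n, a, b)
        # Step 3: log acceptance ratio (the proposal terms cancel by symmetry).
        log_ratio = proposal_lp - current_lp
        # Steps 4-5: accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain
```

In this conjugate toy case the posterior is Gamma(a + n, b + sum of the data), so the chain average after burn-in can be compared with the exact posterior mean (a + n) / (b + sum of the data).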
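Suppose the squared error loss function, given by
$$L_{SE}(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^2,$$
where $\theta$ stands for any one of the model parameters. Using the random samples $\theta^{(i)}$, $i = 1, \dots, M$, generated by the M-H technique, with N denoting the burn-in length (nburn), the Bayes estimator of $\theta$ under the squared error loss function is given by
$$\hat{\theta}_{SE} = \frac{1}{M - N} \sum_{i=N+1}^{M} \theta^{(i)}.$$
Next, suppose the LINEX loss function with parameter $c \neq 0$, given by
$$L_{LINEX}(\hat{\theta}, \theta) = e^{c(\hat{\theta} - \theta)} - c(\hat{\theta} - \theta) - 1.$$
The approximate Bayes estimate of $\theta$ under the LINEX loss function is given by
$$\hat{\theta}_{LINEX} = -\frac{1}{c} \ln\left[ \frac{1}{M - N} \sum_{i=N+1}^{M} e^{-c\,\theta^{(i)}} \right].$$
The parameter c in LINEX is chosen as 0.2 and 0.8. Finally, suppose the general entropy (GE) loss function with parameter q, given by
$$L_{GE}(\hat{\theta}, \theta) \propto \left( \frac{\hat{\theta}}{\theta} \right)^{q} - q \ln\left( \frac{\hat{\theta}}{\theta} \right) - 1.$$
The parameter q in GE is chosen as 0.6. The approximate Bayes estimate of $\theta$ under the GE loss function is given by
$$\hat{\theta}_{GE} = \left[ \frac{1}{M - N} \sum_{i=N+1}^{M} \left( \theta^{(i)} \right)^{-q} \right]^{-1/q}.$$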
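Given post-burn-in MCMC draws of a parameter, the three Bayes estimates can be approximated by simple averages. The sketch below uses the standard textbook forms of these estimators; the function names and the symbols `c` (LINEX parameter) and `q` (general entropy parameter) are assumptions, and the draws are assumed positive for the general entropy case.

```python
import math


def bayes_squared_error(draws):
    """Posterior mean: the Bayes estimate under squared error loss."""
    return sum(draws) / len(draws)


def bayes_linex(draws, c):
    """LINEX estimate: -(1/c) * log(mean(exp(-c * theta)))."""
    mean_exp = sum(math.exp(-c * d) for d in draws) / len(draws)
    return -math.log(mean_exp) / c


def bayes_general_entropy(draws, q):
    """General entropy estimate: mean(theta ** -q) ** (-1/q), for positive draws."""
    mean_pow = sum(d ** (-q) for d in draws) / len(draws)
    return mean_pow ** (-1.0 / q)
```

For positive `c` and `q`, both asymmetric estimates fall below the posterior mean whenever the draws are not all equal, which reflects the heavier penalty these losses place on overestimation.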
5.4. MCMC HPD Credible Interval Algorithm
- Arrange the MCMC draws of each parameter in ascending order;
- The lower bounds of the parameters are the draws at the rank ;
- The upper bounds of the parameters are the draws at the rank ;
- Iterate the previous steps M times, and obtain the average values of the lower and upper bounds of the parameters.
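One common way to turn ordered MCMC draws into an HPD-type interval is the shortest-interval rule: among all windows of consecutive ordered draws that cover the target probability, take the narrowest. The sketch below implements that rule as an assumption; it may differ from the fixed-rank bounds listed above, and the names `hpd_interval` and `cred` are hypothetical.

```python
import math


def hpd_interval(draws, cred=0.95):
    """Shortest interval containing a fraction `cred` of the ordered draws."""
    ordered = sorted(draws)
    m = len(ordered)
    # Number of consecutive ordered draws each candidate interval must span.
    k = min(m, max(1, math.ceil(cred * m)))
    best = (ordered[0], ordered[k - 1])
    for i in range(1, m - k + 1):
        candidate = (ordered[i], ordered[i + k - 1])
        if candidate[1] - candidate[0] < best[1] - best[0]:
            best = candidate
    return best
```

For a skewed posterior this interval is shifted toward the mode compared with an equal-tailed interval of the same coverage.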
The point and interval simulation results of the APMW-W distribution for , and are, respectively, presented in Table 2 and Table 3.
Table 2.
Point estimation of the APMW-W parameters.
Table 3.
Interval estimation of the APMW-W parameters.
6. The Carbon Dioxide Emissions Application
The data set of 50 carbon dioxide emissions observations for Saudi Arabia over the period 1970–2019, provided by the World Bank (Albank Aldawli), is considered as an application of the APMW-W distribution. Table 4 shows the descriptive statistics of the carbon dioxide emissions data. The boxplot and Q-Q plot are shown in Figure 9. Figure 10 shows the fitted PDF and CDF of APMW-W. Figure 11 shows the PP plot and the Kaplan–Meier survival function of APMW-W.
Table 4.
Descriptive statistics of the carbon dioxide emissions data.
Figure 9.
The boxplot and Q-Q plot of the carbon dioxide emissions data.
Figure 10.
The fitted PDF and CDF of the APMW-W distribution.
Figure 11.
The PP plot and the Kaplan–Meier survival function of the APMW-W distribution.
The goodness-of-fit of APMW-W is compared with some other models, including the modified alpha-power Weibull exponential distribution (APMW-E) and the two-parameter Weibull distribution (TW-D) (Equation (19)). The estimated values of the APMW-W and the competing models for the given data set of 50 carbon dioxide emissions are shown in Table 5. The distribution functions of these competing distributions are given by:
- APMW-W distribution:
- APMW-E distribution:
- TW-D distribution:
Table 5 shows the estimated parameter values of APMW-W and the competing models. In Table 6, the Kolmogorov–Smirnov test is performed.
Table 5.
Estimated values of the APMW-W and the competing models.
Table 6.
Kolmogorov–Smirnov test.
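The one-sample Kolmogorov–Smirnov statistic used for this comparison is the largest gap between the empirical CDF of the data and a fitted CDF. The sketch below computes only the statistic, not its p-value; the function name `ks_statistic` is an assumption, and any fitted CDF (for example one of the competing models) would be passed in as the `cdf` callable.

```python
def ks_statistic(sample, cdf):
    """One-sample KS statistic D = sup |F_n(x) - F(x)| for a fully specified CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

A smaller statistic means the fitted CDF tracks the empirical CDF more closely, which is the sense in which one model is preferred to another in the comparison.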
7. Discussion and Future Framework
The addition of the three parameters to the family has a significant effect on the diversity of the SF, as shown in Figure 1. The Weibull model becomes very flexible after the APMW-X family parameters are added. Depending on the specific values of the parameters, the density sometimes resembles a bell curve with some skewness and at other times shows strong swings, as seen in Figure 3. The proposed model is a good candidate for data modeling in various financial, industrial, medical and other applications. In addition, it can be seen from Figure 3 that APMW-W has simple features and a flexible failure rate. The simple hazard rate and flexible features are further advantages of the proposed model, along with its heavy-tail behavior.
The one-sample Kolmogorov–Smirnov (KS) statistics with p-values are given in Table 6. From the results in Table 6, it can be seen that the APMW-W model can be selected as the best among the fitted models. In the future, the application can be extended to other scientific areas, such as medical, technical, and industrial ones.
We empirically show that the new five-parameter extension of the Weibull distribution provides a better fit to the carbon dioxide emission data than the competing distributions. The practical example shows that the proposed model is a suitable alternative distribution for modeling carbon dioxide emission data.
8. Conclusions
The main objective of this study is to introduce a new flexible modification of the Weibull model with three additional parameters. The additional parameters give greater flexibility and improve the goodness of fit to the data. We determined the maximum likelihood estimators of the proposed model parameters and performed a Bayesian Monte Carlo simulation study. The performance of the Bayesian estimators is better than that of the corresponding ML estimators. The expected errors support the Bayesian estimator in most cases. The intervals of the Bayesian estimator are narrower than those of the maximum likelihood estimator at the same confidence level. The LINEX and general entropy loss functions behave better and are close in terms of variances. The biases and mean squared errors decrease with increasing sample size. The estimated PDF and CDF plots show that the proposed model fits the data well. The proposed model also fits the Kaplan–Meier survival plot very well. Based on the Kolmogorov–Smirnov one-sample test, the new model provides a better fit than the competing models.
Author Contributions
Formal analysis, W.E.; Funding acquisition, Y.T.; Methodology, W.E.; Software, W.E.; Supervision, W.E.; Writing—original draft, W.E.; Writing—review and editing, Y.T. All authors have read and agreed to the published version of the manuscript.
Funding
The study was funded by Researchers Supporting Project number (RSP2023R488), King Saud University, Riyadh, Saudi Arabia.
Data Availability Statement
All the datasets used in this paper are available from the corresponding author upon request.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Gomez-Deniz, E.; Calderín-Ojeda, E. On the usefulness of the logarithmic skew normal distribution for describing claims size data. Math. Probl. Eng. 2020, 2020, 1420618.
- Ahmad, Z.; Mahmoudi, E.; Hamedani, G. A class of claim distributions: Properties, characterizations and applications to insurance claim data. Commun. Stat.-Theory Methods 2020, 49, 2183–2208.
- Emissions due to Agriculture Global, Regional and Country Trends 2000–2018. Available online: https://www.fao.org (accessed on 15 September 2021).
- 2019 UK Greenhouse Gas Emissions, Final Figures. Available online: https://assets.publishing.service.gov.uk (accessed on 15 September 2021).
- Shafiq, A.; Lone, S.A.; Sindhu, T.N.; Khatib, Y.E.; Al-Mdallal, Q.M.; Muhammad, T. A new modified Kies Fréchet distribution: Applications of mortality rate of COVID-19. Results Phys. 2021, 28, 104638.
- Ahmad, Z.; Mahmoudi, E.; Dey, S. A new family of heavy tailed distributions with an application to the heavy tailed insurance loss data. Commun. Stat.-Simul. Comput. 2020, 49, 4372–4395.
- Klakattawi, H.S.; Aljuhani, W.H. A new technique for generating distributions based on a combination of two techniques: Alpha power transformation and exponentiated TX distributions family. Symmetry 2021, 13, 412.
- Mansoor, M.; Tahir, M.H.; Cordeiro, G.M.; Alzaatreh, S.B. The Marshall-Olkin logistic-exponential distribution. Commun. Stat.-Theory Methods 2019, 48, 220–234.
- El-Khatib, Y.; Hatemi-J, A. Computations of price sensitivities after a financial market crash. In Electrical Engineering and Intelligent Systems; Springer: New York, NY, USA, 2013; pp. 239–248.
- El-Khatib, Y.; Al-Mdallal, Q.M. Numerical simulations for the pricing of options in jump diffusion markets. Arab. J. Math. Sci. 2012, 18, 199–208.
- Foss, S.; Korshunov, D.; Zachary, S. An Introduction to Heavy-Tailed and Subexponential Distributions; Springer Science & Business Media: Berlin, Germany, 2013.
- Alzaatreh, A.; Lee, C.; Famoye, F. A new method for generating families of continuous distributions. Metron 2013, 71, 63–79.
- Chettri, S.; Das, B.; Chakraborty, S. A New Modified Alpha Power Weibull Distribution: Properties, Parameter Estimation and Application. J. Indian Soc. Probab. Stat. 2021, 22, 417–449.
- Bowley, A.L. Elements of Statistics, 4th ed.; Charles Scribner’s Sons: New York, NY, USA, 1920.
- Moors, J. The meaning of kurtosis: Darlington re-examined. Am. Stat. 1986, 40, 283–284.
- Kundu, D.; Joarder, A. Analysis of Type-II progressively hybrid censored data. Comput. Stat. Data Anal. 2006, 50, 2509–2528.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).