1. Introduction
Suppose that a company has $N$ systems functioning independently and producing a certain product at a given time, where $N$ is a random variable determined by the economy, customer demand, etc. Treating $N$ as a random variable is motivated by practice, because failure (of a device, for example) often occurs due to the presence of an unknown number of initial defects in the system. In this paper, we consider the case in which $N$ is a geometric random variable with the probability mass function given by
$$P(N = n) = (1 - p)\,p^{n-1}$$
for $0 < p < 1$, where $n$ is a positive integer. We may take $N$ to follow other discrete distributions, such as the binomial or Poisson distribution, but they need to be truncated at 0 because one must have $N \geq 1$.
Another rationale for taking $N$ to be a geometric random variable is that the “optimum” number can be interpreted as the “number to event”, matching the definition of a geometric random variable, as noted by [1]. The geometric distribution has been widely used for the number of “systems” in the literature; see, for example, [2,3]. It has also been adopted to obtain new classes of distributions; see [4] for the exponential geometric (EG) distribution, [5] for the exponentiated exponential geometric (EEG) distribution, [6] for the Weibull geometric distribution, and [1] for the geometric exponential Poisson (GEP) distribution, to name just a few.
On the other hand, we assume that each of the $N$ systems is made of $\alpha$ parallel components, so that a system shuts down completely only when all of its components fail. We also assume that the failure times of the components of the $i$th system, denoted by $Z_{i1}, \ldots, Z_{i\alpha}$, are independent and identically distributed (iid) with the cumulative distribution function (cdf) $G(x)$ and the probability density function (pdf) $g(x)$. For simplicity of notation, let $Y_i = \max\{Z_{i1}, \ldots, Z_{i\alpha}\}$ stand for the failure time of the $i$th system and let $X$ denote the time to failure of the first out of the $N$ functioning systems, i.e., $X = \min\{Y_1, \ldots, Y_N\}$. Then it can be seen from [5] that the conditional cdf of $X$ given $N$ is given by
$$F(x \mid N) = 1 - \left[1 - G(x)^{\alpha}\right]^{N},$$
and the unconditional cdf of $X$ can thus be written as
$$F(x) = \frac{G(x)^{\alpha}}{1 - p\left[1 - G(x)^{\alpha}\right]}, \tag{1}$$
where $p \in (0, 1)$ is the parameter of the geometric distribution of $N$.
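The new class of distributions in (1) depends on the cdf of the failure times of the components in the system, which may follow a continuous probability distribution such as the exponential, Lindley, or Weibull distribution. As an illustration, if the failure times of the components of the $i$th system are iid exponential random variables with the rate parameter $\lambda > 0$, i.e., $G(x) = 1 - e^{-\lambda x}$ for $x > 0$, then we obtain the EEG distribution due to [5]. Its cdf is given by
$$F(x) = \frac{\left(1 - e^{-\lambda x}\right)^{\alpha}}{1 - p\left[1 - \left(1 - e^{-\lambda x}\right)^{\alpha}\right]}, \quad x > 0,$$
where $\alpha$ is the number of components per system and $p$ is the parameter of the geometric distribution of $N$.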
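The step from the conditional cdf of $X$ given $N$ to the unconditional cdf is a geometric-series computation. The following is a sketch under assumptions that are stated here because the symbols are chosen for illustration: the geometric pmf is parametrized as $P(N = n) = (1-p)p^{n-1}$ for $n = 1, 2, \ldots$, and $G(x)^{\alpha}$ denotes the cdf of the failure time of a single system with $\alpha$ parallel components whose failure times are iid with cdf $G$.

```latex
\begin{align*}
F(x) &= \sum_{n=1}^{\infty} P(N = n)\,\bigl\{1 - [1 - G(x)^{\alpha}]^{n}\bigr\} \\
     &= 1 - (1-p)\,[1 - G(x)^{\alpha}] \sum_{n=1}^{\infty} \bigl\{p\,[1 - G(x)^{\alpha}]\bigr\}^{n-1} \\
     &= 1 - \frac{(1-p)\,[1 - G(x)^{\alpha}]}{1 - p\,[1 - G(x)^{\alpha}]} \\
     &= \frac{G(x)^{\alpha}}{1 - p\,[1 - G(x)^{\alpha}]}.
\end{align*}
```

The series converges because $0 \le p\,[1 - G(x)^{\alpha}] < 1$. Under the same assumptions, letting $p \to 0$ recovers $G(x)^{\alpha}$, the cdf of a single parallel system.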
Please note that in reliability engineering and lifetime analysis, the failure times of the components within each system are often assumed to follow the exponential distribution; see, for example, [4,5,7], among others. This assumption may be unreasonable because the hazard rate of the exponential distribution is constant, whereas some real-life systems may not have constant hazard rates, and the components of a system are often more rigid than the system itself. Accordingly, it is reasonable to let the components of a system follow a distribution whose hazard function is non-constant and can take flexible shapes.
In this paper, we propose a new three-parameter lifetime distribution by compounding the Lindley and geometric distributions based on the new class of distributions in (1). The Lindley distribution was first proposed by [8] in the context of Bayesian statistics, as a counterexample to fiducial statistics. It has recently received considerable attention as an appropriate model for lifetime data, especially in applications modeling stress-strength reliability; see, for example, [9,10,11]. Ghitany et al. [12] argue, through a numerical example, that the Lindley distribution could be a better lifetime model than the exponential distribution, and they show that the hazard function of the Lindley distribution is not constant, indicating the flexibility of the Lindley distribution over the exponential distribution. These observations motivate us to study the structural properties of the distribution in (1) when the failure times of the units of the $i$th system are iid Lindley random variables with the parameter $\theta$, i.e.,
$$G(x) = 1 - \frac{1 + \theta + \theta x}{1 + \theta}\, e^{-\theta x}, \quad x > 0,$$
where the parameter $\theta > 0$. The corresponding cdf of $X$ is given by
$$F(x) = \frac{\left[1 - \frac{1 + \theta + \theta x}{1 + \theta}\, e^{-\theta x}\right]^{\alpha}}{1 - p\left\{1 - \left[1 - \frac{1 + \theta + \theta x}{1 + \theta}\, e^{-\theta x}\right]^{\alpha}\right\}}, \quad x > 0,$$
where the parameters $\alpha > 0$, $\theta > 0$, and $0 < p < 1$. We call this distribution the exponentiated Lindley geometric (ELG) distribution. Indeed, it is necessary to compute the entropy measure for the ELG distribution under the assumption that errors are non-Gaussian (e.g., [13]). Other motivations for the ELG distribution are briefly summarized as follows. (i) It contains several lifetime distributions as special cases, such as the Lindley-geometric (LG) distribution due to [14] when $\alpha = 1$. (ii) It can be viewed as a mixture of the exponentiated Lindley distributions introduced by [15]. (iii) The ELG distribution is a flexible model which can be widely used for modeling lifetime data in reliability and survival analysis. (iv) It exhibits monotonically increasing, monotonically decreasing, unimodal (upside-down bathtub), and bathtub-shaped hazard rates but does not exhibit a constant hazard rate, which makes the ELG distribution superior to other lifetime distributions that exhibit only monotonically increasing, monotonically decreasing, or constant hazard rates.
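The construction behind the ELG distribution can be checked by simulation. The sketch below is illustrative only and rests on stated assumptions: the geometric pmf is taken as $P(N = n) = (1-p)p^{n-1}$, the number of components per system `alpha` is restricted to a positive integer so that it can be simulated physically, the parameter values are arbitrary, and all function names are hypothetical. It draws Lindley lifetimes through the known mixture representation (an exponential with rate $\theta$ with probability $\theta/(1+\theta)$, otherwise a gamma with shape 2 and rate $\theta$), forms the minimum over $N$ systems of the maximum over the components, and compares the empirical cdf with the closed-form expression.

```python
import math
import random


def lindley_cdf(x, theta):
    """Lindley cdf: G(x) = 1 - (1 + theta + theta*x) / (1 + theta) * exp(-theta*x)."""
    return 1.0 - (1.0 + theta + theta * x) / (1.0 + theta) * math.exp(-theta * x)


def elg_cdf(x, alpha, theta, p):
    """Compound cdf G^alpha / (1 - p * (1 - G^alpha)) with G the Lindley cdf."""
    g_alpha = lindley_cdf(x, theta) ** alpha
    return g_alpha / (1.0 - p * (1.0 - g_alpha))


def sample_lindley(theta, rng):
    """Lindley draw via its mixture of Exp(theta) and Gamma(shape 2, rate theta)."""
    if rng.random() < theta / (1.0 + theta):
        return rng.expovariate(theta)
    return rng.gammavariate(2.0, 1.0 / theta)  # second argument is the scale


def sample_geometric(p, rng):
    """Inversion sampler for P(N = n) = (1 - p) * p**(n - 1), n = 1, 2, ..."""
    u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is finite
    return int(math.log(u) / math.log(p)) + 1


def sample_x(alpha, theta, p, rng):
    """First failure among N systems, each failing when all alpha components fail."""
    n = sample_geometric(p, rng)
    return min(
        max(sample_lindley(theta, rng) for _ in range(alpha)) for _ in range(n)
    )


rng = random.Random(0)
alpha, theta, p = 3, 1.5, 0.4  # arbitrary illustrative values
draws = [sample_x(alpha, theta, p, rng) for _ in range(20000)]
for x0 in (0.5, 1.0, 2.0):
    empirical = sum(d <= x0 for d in draws) / len(draws)
    print(x0, round(empirical, 3), round(elg_cdf(x0, alpha, theta, p), 3))
```

The two printed columns should agree up to Monte Carlo error. The closed-form cdf remains well defined for non-integer `alpha`, although the physical interpretation as a number of components then no longer applies.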
The remainder of the paper is organized as follows. In Section 2, we discuss various statistical properties of the new distribution. The maximum-likelihood estimation is considered in Section 3, and an EM algorithm is proposed to find the maximum likelihood estimates because they cannot be obtained in closed form. The maximum-likelihood estimation for censored data is also discussed briefly. In Section 4, two real-data applications are provided for illustrative purposes. Some concluding remarks are given in Section 5.
4. Two Real-Data Applications
In this section, we illustrate the applicability of the ELG distribution using two real-data examples. We use the same data sets to compare the ELG distribution with the Gamma, Weibull, Lindley geometric (LG), and Weibull geometric (WG) distributions, whose densities are given by
for $x > 0$, respectively. To compare the ELG distribution with the four distributions listed above, we use the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the AIC with a correction (AICc) for the two real data sets. In addition, we apply two formal goodness-of-fit statistics, the Cramér-von Mises ($W^*$) and Anderson-Darling ($A^*$) statistics, to further verify which distribution fits the data better; see, for example, [5,20], among others. The smaller the value of the considered criterion, the better the fit to the data.
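The three information criteria are simple functions of the maximized log-likelihood. As a minimal sketch using the standard definitions (AIC $= 2k - 2\ell$, BIC $= k \ln n - 2\ell$, AICc $=$ AIC $+ 2k(k+1)/(n-k-1)$, with $k$ parameters and $n$ observations), the helper below computes all three. The function name is hypothetical, and the log-likelihood value in the usage line is a made-up placeholder, not a value from this paper.

```python
import math


def information_criteria(loglik, k, n):
    """Return AIC, BIC, and AICc for a model with k parameters fitted to n points.

    loglik is the maximized log-likelihood; smaller criterion values are better.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}


# Hypothetical example: a three-parameter model fitted to 128 observations,
# with a placeholder log-likelihood of -410.0 (not taken from the paper).
print(information_criteria(loglik=-410.0, k=3, n=128))
```

Because BIC penalizes each parameter by $\ln n$ rather than 2, it favors the smaller submodel more strongly than AIC once $n$ exceeds about 7.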
The first data set consists of the remission times (in months) of a random sample of 128 bladder cancer patients. This data set, presented in Table 1, was studied by [21] in fitting the extended Lomax distribution and by [22] for the modified Weibull geometric distribution. Table 2 shows the MLEs of the parameters, AIC, BIC, and AICc for the ELG, Gamma, Weibull, LG, and WG distributions for the first data set. We observe from Table 2 that the ELG distribution and its special case LG provide an improved fit over other distributions that are commonly used for fitting lifetime data. The plots of the fitted probability density and survival functions are also shown in Figure 4. Please note that the fitted density and survival functions of the ELG distribution appear to follow the data more closely than those of the Gamma, Weibull, and WG distributions. In addition, we observe from the values of the goodness-of-fit statistics in Table 3 that the ELG distribution fits the current data better than the other distributions under consideration.
As mentioned in Section 3.1, we can adopt the LR statistic to compare the ELG distribution with its special submodels. For example, the LR statistic for testing between the LG and ELG distributions (i.e., $H_0: \alpha = 1$ versus $H_1: \alpha \neq 1$) is
and the corresponding
p-value is
. Thus, we fail to reject
$H_0$ and conclude that there is no statistically significant difference between the fits of the ELG distribution and its submodel LG to these data. This is quite reasonable because the estimate of $\alpha$ in the ELG model is
, which is close to the value 1 that it takes under the LG model.
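The p-value of such a nested-model comparison can be computed without special libraries. The sketch below assumes the usual large-sample approximation in which the LR statistic follows a chi-square distribution with one degree of freedom under the null hypothesis (one parameter is fixed in the submodel), and it uses the identity $P(\chi^2_1 > w) = \operatorname{erfc}(\sqrt{w/2})$. The function name is hypothetical, and the log-likelihood values in the usage line are placeholders, not results from this paper.

```python
import math


def lr_test_one_df(loglik_full, loglik_sub):
    """Likelihood ratio test of a submodel with one restricted parameter.

    Returns the statistic w = 2 * (loglik_full - loglik_sub) and its p-value
    under the chi-square(1) approximation, P(chi2_1 > w) = erfc(sqrt(w / 2)).
    """
    w = 2.0 * (loglik_full - loglik_sub)
    w = max(w, 0.0)  # guard against tiny negative values from optimizer noise
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Placeholder log-likelihoods (hypothetical, for illustration only).
w, p_value = lr_test_one_df(loglik_full=-410.0, loglik_sub=-410.3)
print(round(w, 3), round(p_value, 3))
```

A large p-value, as in the comparison reported above, means that the restricted model is not rejected at conventional significance levels.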
In the second data set, we consider the waiting times (in minutes) before service of 100 bank customers. The data are presented in Table 4. This data set was used by [12] in fitting the Lindley distribution. Table 5 shows the MLEs of the parameters, AIC, BIC, and AICc for the ELG, Gamma, Weibull, LG, and WG distributions for the second data set. Table 5 indicates that the ELG distribution is still a strong competitor to other lifetime distributions. In addition, the plots of the fitted probability density and survival functions are shown in Figure 5. Please note that the ELG and WG distributions perform identically and that the empirical survival curve and the five fitted survival curves almost overlap for this data set, supporting the conclusion that the ELG distribution fits these data at least as well as the four alternative distributions. In addition, we observe from the values of the goodness-of-fit statistics in Table 6 that the ELG distribution fits the current data better than the Gamma, Weibull, and LG distributions and is comparable with the WG distribution.