1. Introduction
Information theory provides intuitive tools for measuring the uncertainty of random variables and the information shared by them; the entropy and the mutual information are two of its central concepts.
Let X be a random variable with a cumulative distribution function (cdf) F(x) and probability density function (pdf) f(x). The differential entropy H(X) of the random variable is defined by Cover and Thomas [1] to be
H(X) = −∫ f(x) log f(x) dx(1)
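As a quick numerical illustration of this definition, the integral can be approximated by a midpoint rule; for the exponential density with rate λ the closed form is 1 − log λ, a standard fact we can check against. (A Python sketch; the function name, integration limits, and step count are arbitrary choices of ours.)

```python
import math

def differential_entropy(pdf, lo, hi, steps=200_000):
    """Midpoint-rule approximation of -integral of f(x) log f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        fx = pdf(x)
        if fx > 0.0:
            total -= fx * math.log(fx) * h
    return total

lam = 2.0
est = differential_entropy(lambda x: lam * math.exp(-lam * x), 0.0, 40.0)
# closed form for the exponential density: 1 - log(lam)
```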
Let us consider a life-testing experiment where n units are kept under observation until failure. These units could be systems, components, or computer chips in reliability study experiments, or they could be patients put under certain drug or clinical conditions. Suppose the life lengths of these n units are independent and identically distributed random variables with a common cdf F(x) and pdf f(x). The data collected from such experiments form the order statistics sample X_{1:n} ≤ X_{2:n} ≤ ⋯ ≤ X_{n:n}, where X_{r:n} is called the rth order statistic (OS).
Suppose that, for some reason, we have to terminate the experiment before all items have failed. For example, individuals in a clinical trial may drop out of the study, or the study may have to be terminated for lack of funds. In an industrial experiment, units may break accidentally. There are, however, many situations in which the removal of units prior to failure is pre-planned. One of the main reasons for this is to save the time and cost associated with testing. Data obtained from such experiments are called censored data.
The most common censoring schemes are Type I and Type II censoring. In conventional Type I censoring, the experiment continues up to a prespecified time T. Any failures that occur after T are not observed. The termination point T of the experiment is assumed to be independent of the failure times. In conventional Type II censoring, the experimenter decides to terminate the experiment after a prespecified number of items fail. In this scenario, only the smallest lifetimes are observed. In Type I censoring, the number of failures observed is random and the endpoint of the experiment is fixed, whereas in Type II censoring the endpoint is random, while the number of failures is fixed.
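The contrast between the two mechanisms can be made concrete with a toy example (the lifetimes below are hypothetical; Python):

```python
# Six hypothetical unit lifetimes from a life test
lifetimes = [2.1, 0.7, 5.3, 1.4, 3.8, 0.9]

# Type I: observation stops at the fixed time T = 2.5;
# the number of observed failures is random (here 4).
T = 2.5
type_i = sorted(t for t in lifetimes if t <= T)

# Type II: observation stops after a fixed number r = 3 of failures;
# the termination time is random (here the 3rd smallest lifetime, 1.4).
r = 3
type_ii = sorted(lifetimes)[:r]

print(type_i)   # [0.7, 0.9, 1.4, 2.1]
print(type_ii)  # [0.7, 0.9, 1.4]
```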
Park [2] studied the entropy of the Type II censored sample. Park [3] considered testing exponentiality based on the Kullback-Leibler information with Type II censored data. The entropy of a single order statistic and of a complete order statistics sample has been studied by Wong and Chen [4] and Ebrahimi et al. [5].
Here we consider progressively Type II censored schemes. Among the different censoring schemes, the progressive censoring scheme has received considerable attention in the last few years, particularly in reliability analysis. It is a more general censoring mechanism than the traditional Type I and Type II censoring [6]. The recent review article by Balakrishnan [7] provides details on progressive censoring schemes and on their different applications. This paper is concerned with simplifying the calculation of the entropy of progressively Type II censored data from an i.i.d. random sample of size n. However, the extension to progressively Type II censored data is not straightforward, because the joint entropy of progressively Type II censored data is an n-dimensional integral. Besides, the removals cause additional complications.
Following Balakrishnan and Aggarwala [8], progressively Type II censored samples can be described as follows. Let n units be placed on test at time zero, and let m failures be observed.
The number m and the censoring scheme (R_1, …, R_m), with n = m + R_1 + ⋯ + R_m, are fixed prior to the test.
At the first failure, R_1 units are randomly removed from the remaining n − 1 surviving units.
At the second failure, R_2 units are randomly removed from the remaining n − R_1 − 2 units.
The test continues until the mth failure, when all remaining R_m = n − m − R_1 − ⋯ − R_{m−1} units are removed from the experiment, so the life testing stops at the mth failure.
The m observed failure times X_{1:m:n} ≤ X_{2:m:n} ≤ ⋯ ≤ X_{m:m:n} constitute the progressively Type II censored OS.
If R_1 = ⋯ = R_{m−1} = 0, then R_m = n − m, which corresponds to Type II censoring.
If R_1 = ⋯ = R_m = 0, then m = n, which corresponds to the usual order statistics.
Thus, the usual OS and Type II censoring become special cases of progressively Type II censored samples, and any result established for progressive Type II censoring generalizes the corresponding result for OS and Type II censoring.
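A progressively Type II censored sample can be simulated with a standard construction: for a standard exponential parent the normalized spacings γ_j(Z_{j:m:n} − Z_{j−1:m:n}) are i.i.d. unit exponentials, where γ_j = n − j + 1 − R_1 − ⋯ − R_{j−1} is the number of units still on test just before the jth failure, and a monotone transform carries the result to any parent cdf F. A Python sketch (the function name and interface are our own):

```python
import math
import random

def progressive_sample(n, scheme, inv_cdf, rng):
    """Simulate one progressively Type II censored sample of size m = len(scheme)
    from the parent distribution with quantile function inv_cdf."""
    m = len(scheme)
    assert n == m + sum(scheme), "scheme must satisfy n = m + R_1 + ... + R_m"
    z, sample = 0.0, []
    for j in range(1, m + 1):
        gamma_j = n - (j - 1) - sum(scheme[: j - 1])  # units still on test
        z += rng.expovariate(1.0) / gamma_j           # next standard-exponential failure
        u = 1.0 - math.exp(-z)                        # progressively censored uniform OS
        sample.append(inv_cdf(u))                     # transform to the parent F
    return sample

# m = 3 observed failures from n = 8 units under scheme (2, 0, 3),
# exponential parent with quantile function -log(1 - u)
x = progressive_sample(8, (2, 0, 3), lambda u: -math.log(1.0 - u), random.Random(0))
```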
The likelihood function may be written as [8]
f(x_1, …, x_m) = c ∏_{i=1}^{m} f(x_i)[1 − F(x_i)]^{R_i}(2)
where c = n(n − R_1 − 1)(n − R_1 − R_2 − 2) ⋯ (n − R_1 − ⋯ − R_{m−1} − m + 1). The joint entropy contained in (X_{1:m:n}, …, X_{i:m:n}), i.e., a collection of the first i progressively Type II censored OS, is defined to be
H_{1…i:m:n} = −∫ ⋯ ∫ f_{1…i}(x_1, …, x_i) log f_{1…i}(x_1, …, x_i) dx_1 ⋯ dx_i(3)
where f_{1…i}(x_1, …, x_i), 1 ≤ i ≤ m, is the density function of (X_{1:m:n}, …, X_{i:m:n}). To our knowledge, Balakrishnan et al. [9] generalized the result of Park [3] on testing exponentiality based on the Kullback-Leibler information with Type II censored data to progressively Type II censored data and obtained an approximation to the joint entropy of progressively Type II censored samples based on nonparametric estimation. Hence, the exact values of the joint entropy of progressively Type II censored samples have not been obtained. Several applications of entropy, such as characterization, goodness-of-fit tests based on censored data, parameter estimation, and quantization theory, are known; see, for example, [3,9].
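The normalizing constant of the progressive censoring likelihood, c = n(n − R_1 − 1)(n − R_1 − R_2 − 2)⋯, is simply the product of the numbers of units still on test just before each observed failure. A small helper (the function name is our own) makes this concrete:

```python
def norm_constant(n, scheme):
    """Normalizing constant of the progressive Type II censoring likelihood:
    the product over failures of the number of units still on test,
    n - (j - 1) - R_1 - ... - R_{j-1}."""
    c, removed = 1, 0
    for j, r in enumerate(scheme, start=1):
        c *= n - (j - 1) - removed
        removed += r
    return c

# Complete sample (no removals): c reduces to n! = 24 for n = 4
print(norm_constant(4, (0, 0, 0, 0)))  # 24
# Scheme (2, 0, 3) with n = 8: c = 8 * 5 * 4 = 160
print(norm_constant(8, (2, 0, 3)))     # 160
```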
In the case of progressive censoring, difficulties arise from the removals as well as from the expression of the joint density, which involves integration over i random variables, so simplifying the calculation of the joint entropy is attractive. In this article we focus on the study of the properties of the joint entropy of progressively Type II censored OS. In Section 2 we develop the idea of Park [2] about the decomposition of the entropy of OS to introduce an indirect approach to the decomposition of the entropy of progressively Type II censored OS. In Section 3 we derive recurrence relations for the entropy of progressively Type II censored samples, which prove helpful in calculating the entropy. In Section 4 we derive an efficient computational method that reduces the r-dimensional integrals in the calculation of the joint entropy to no integral at all: the computation of the entropy of progressively Type II censored samples simplifies to a sum of entropies of the smallest OS of varying sample sizes. In Section 5 we apply our results to computing the entropy of collections of progressively Type II censored samples from normal and exponential distributions.
2. Decomposition of the Joint Entropy
Park [2] and Wong and Chen [4] have shown that the total entropy of an i.i.d. random sample of size n is decreased if the sample is ordered. Park [2] showed how much the entropy of an i.i.d. random sample of size n is decreased if the sample is ordered through the following identity about the entropy of the ordered data
H_{1…n:n} = n H(X) − log n!(4)
In view of Equation (4), and noting that a progressively Type II censored sample can be seen as an ordered sample with the removals (R_1, …, R_m), we have the following result for the entropy of the progressively Type II censored OS sample.
Lemma 2.1.
where
, and
.
Since the progressively Type II censored sample forms a Markov chain [8], we have the following results.
Lemma 2.2.
PROOF. From the Markov chain property of progressively Type II censored OS, the first part follows directly. The second part can be shown by using the first part and the symmetry of the mutual information (Csiszár [10]).
Next we show the following decomposition of the entropy of progressive Type II censored OS.
Lemma 2.3.
PROOF. By the additive property of the entropy measure and Lemma 2.1, we have the result.
We see from Equations (5) and (6) that the entropy of the first r progressively censored data can be obtained from the marginal entropies of the individual progressively censored OS, so we consider these marginal entropies. Let X_{1:m:n}, …, X_{m:m:n} be a progressively Type II censored sample with censoring scheme (R_1, …, R_m). The entropy in a collection of the first i progressively Type II censored OS is defined by Equation (3), and can be written as
H_{1…i:m:n} = −E[log f_{1…i}(X_{1:m:n}, …, X_{i:m:n})]
where f_{1…i} is the joint pdf of the first i order statistics of the progressively Type II censored sample.
Using the Markov chain property of the order statistics from progressively Type II censored samples, we have the following decomposition for the score function:
log f_{1…i}(x_1, …, x_i) = log f_1(x_1) + ∑_{j=2}^{i} log f_{j|j−1}(x_j | x_{j−1})(7)
where f_{j|j−1}(x_j | x_{j−1}) is the pdf of X_{j:m:n} given X_{j−1:m:n} = x_{j−1}. The following decomposition follows from the strong additivity of the entropy
H_{1…i:m:n} = H_{1:m:n} + ∑_{j=2}^{i} H_{j|j−1:m:n}(8)
where H_{j|j−1:m:n} = −E[log f_{j|j−1}(X_{j:m:n} | X_{j−1:m:n})] is the average of the conditional information in X_{j:m:n} given X_{j−1:m:n}.
On the other hand, in view of the result of Balakrishnan and Aggarwala [8], the conditional density of the remaining observations given X_{j−1:m:n} = x is the joint density of a progressively Type II censored sample of size γ_j = m − j + 1 + R_j + ⋯ + R_m, with censoring scheme (R_j, …, R_m), drawn from the parent distribution truncated on the left at x, i.e., with density f(y)/(1 − F(x)), y > x. Therefore the conditional entropy can be written as the double integral
where
and
is defined by
where
and
Since we already know the entropy of the complete sample, the entropy of the progressively Type II censored sample can now easily be derived from Equations (6) and (8).
EXAMPLE 2.1. For the exponential density f(x) = λe^{−λx}, x > 0, we can show that
so that
Thus in that case
where H(X) = 1 − log λ is the entropy in a single observation from the exponential density.
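In the exponential case the memoryless property makes every conditional term explicit: given X_{j−1:m:n}, the next observed failure is distributed as the minimum of γ_j fresh Exp(λ) lifetimes, whose entropy is 1 − log(γ_jλ), so the joint entropy is ∑_{j=1}^{m} [1 − log(γ_jλ)]. A sketch of this closed form (the function name is our own); with no censoring it reduces to the full-sample value n(1 − log λ) − log n!, consistent with the ordered-sample identity:

```python
import math

def entropy_progressive_exponential(n, scheme, lam):
    """Joint entropy of a progressively Type II censored sample from Exp(lam):
    by the memoryless property each conditional failure is the minimum of
    gamma_j fresh Exp(lam) lifetimes, contributing 1 - log(gamma_j * lam)."""
    total, removed = 0.0, 0
    for j, r in enumerate(scheme, start=1):
        gamma_j = n - (j - 1) - removed   # units still on test
        total += 1.0 - math.log(gamma_j * lam)
        removed += r
    return total

# No censoring: should equal n*(1 - log lam) - log n!  (here n = 4, lam = 1)
val = entropy_progressive_exponential(4, (0, 0, 0, 0), 1.0)
```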
REMARK 2.1. We note that all of Park’s results concerning the entropy of the minimum order statistic work for the case of a progressively Type II censored sample, since X_{1:m:n} = X_{1:n}: no removals occur before the first failure, so the first progressively censored OS is simply the sample minimum.
3. Recurrence Relations
Recurrence relations between the cdf (pdf) of OS and progressive Type II censored OS have been studied by many authors for the purpose of simplifying the calculation of moments of OS and progressive Type II censored OS.
The standard recurrence relation for the moments of OS was obtained by Cole [11], and can be written as
n μ_{r:n−1} = r μ_{r+1:n} + (n − r) μ_{r:n}, 1 ≤ r ≤ n − 1,
where μ_{r:n} = E[X_{r:n}] is the moment of the usual OS X_{r:n}.
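For a standard uniform parent the means are available in closed form, E[U_{r:n}] = r/(n + 1), so Cole's recurrence n μ_{r:n−1} = r μ_{r+1:n} + (n − r) μ_{r:n} can be checked exactly with rational arithmetic:

```python
from fractions import Fraction

def mu(r, n):
    """Mean of the r-th order statistic of n i.i.d. Uniform(0,1) variables."""
    return Fraction(r, n + 1)

# Verify n * mu_{r:n-1} = r * mu_{r+1:n} + (n - r) * mu_{r:n} exactly
for n in range(2, 12):
    for r in range(1, n):
        assert n * mu(r, n - 1) == r * mu(r + 1, n) + (n - r) * mu(r, n)
```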
This result can be derived directly from the corresponding recurrence relation between the cdfs of OS. Kamps and Cramér (Lemma 4 of [12]) obtained the corresponding recurrence relation for generalized OS as
Since the generalized OS include the progressively Type II censored OS, it is clear that the case of progressive Type II censoring is subsumed in the above result. By setting
for
,
for
and
, we have
Using Equation (14) and the decomposition of the entropy in Equation (8), we have the following results for the entropy in the progressive censoring scheme.
RELATION 3.1
where
and
.
PROOF. From Equation (
8) we have
On the other hand, Equation (14) yields
Combining Equations (16) and (17) and noting that
since
, the relation follows.
The following relation shows that the entropy of the first r progressively Type II censored OS for one sample size can be obtained as a linear combination of entropies of the first r progressively Type II censored OS of sample size n.
PROOF. For a sample of size
the general decomposition of the entropy of progressive Type II censoring takes the form
By applying RELATION 3.1 to Equation (
20) we get
where the quantities are as defined above. Equation (
21) can be written, by using Equations (
5) and (
6), as
After some simplifications the result follows.
REMARK 3.1. With R_1 = ⋯ = R_m = 0, all results of Section 2 and Section 3 reduce to the corresponding results for the entropy in collections of usual OS.
4. Computational Method for Calculating the Entropy
In this section we provide another approach to simplify the calculation of the entropy in a collection of progressively Type II censored OS. We reduce the r-dimensional integral in the calculation of the entropy to no integral at all: the computation of the entropy in progressively Type II censored samples simplifies to a sum of entropies of the smallest OS of varying sample sizes.
Lemma 4.1. Let X_1, …, X_n be an i.i.d. random sample of size n from pdf f(x) with cdf F(x) and hazard function h(x) = f(x)/(1 − F(x)), and let X_{1:n} ≤ ⋯ ≤ X_{n:n} be the OS corresponding to this sample. Park [2] obtained the entropy in the smallest order statistic as
H_{1:n} = 1 − log n − E[log h(X_{1:n})](23)
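As a check in a case where everything is available in closed form: for the exponential density with rate λ the hazard is constant, h(x) = λ, so Park's formula evaluates to 1 − log n − log λ; meanwhile X_{1:n} is itself Exp(nλ), whose entropy can be integrated numerically. (A Python sketch; the function name and grid choices are ours.)

```python
import math

def entropy_min_exponential(n, lam, hi=40.0, steps=200_000):
    """Midpoint-rule entropy of f1(x) = n*lam*exp(-n*lam*x), the density
    of the minimum of n i.i.d. Exp(lam) lifetimes."""
    h = hi / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        fx = n * lam * math.exp(-n * lam * x)
        if fx > 0.0:
            total -= fx * math.log(fx) * h
    return total

n, lam = 5, 1.5
numeric = entropy_min_exponential(n, lam)
closed = 1.0 - math.log(n) - math.log(lam)  # constant hazard h(x) = lam
```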
Theorem 4.1. Let X_{1:m:n}, …, X_{m:m:n} be a progressively Type II censored sample with censoring scheme (R_1, …, R_m). The entropy in the collection of the first r progressively Type II censored OS can be written as
where
,
,
and
in which empty products are defined as 1.
PROOF. By the Markov chain properties of progressively Type II censored samples, one can write
where f_{j|j−1} is the conditional pdf of X_{j:m:n} given X_{j−1:m:n}, which is also the density of the first order statistic of a sample of size γ_j (the number of units still on test just before the jth failure) from the truncated density f(y)/(1 − F(x)), y > x. Therefore, we have
where
is the expected entropy in
given
i.e.,
By Lemma 4.1 and noting that, conditional on X_{j−1:m:n} = x, X_{j:m:n} has the same pdf as the first order statistic from a random sample of size γ_j with pdf f(y)/(1 − F(x)), y > x, Equation (27) can be written as
where
By interchanging the order of integration and noting that
we have
Therefore Equation (31) can be written as
Thus, by using Equations (26) and (31), the entropy can be expressed as a summation of single integrals as
where
is defined above. From Theorem 1 in Balakrishnan
et al. [
13], we have the following relation for
where,
,
,
and
are defined above.
We reexpress Equation (
33) as
where
is the usual smallest order statistic in a sample of size
. If we use Equations (
23) and (
34) in Equation (
32) the result follows.
We have written a program in the algebraic manipulation package MATHEMATICA [14] for computing the quantities in Theorem 4.1 and Lemma 4.1 above. For a pre-determined progressively Type II censoring scheme, the program returns the numerical value of the entropy. The electronic version of the computer program can be obtained by contacting the corresponding author.
REMARK 4.1. The entropies of the smallest usual order statistics are known for many well-known distributions; see, for example, Park [2] and Asadi et al. [15].