Abstract
When an additional sample for a second stage may not be available, one-stage procedures are needed. This paper proposes one-stage multiple comparisons with the control for exponential median lifetimes under heteroscedasticity, including one-sided and two-sided confidence intervals. Median lifetimes are used because the median is a more robust measure of central tendency than the mean. These intervals can be used to identify treatment populations that are better or worse than the control in terms of median lifetime, in fields such as agriculture, the stock market, and the pharmaceutical industry. Tables of critical values are provided for practical use. An example comparing the survival days for four categories of lung cancer under a standard chemotherapeutic agent demonstrates the proposed procedures.
1. Introduction
In lifetime testing, the lifetime of some products follows an exponential distribution (see Lawless [1]). Assuming that product lifetimes follow a two-parameter exponential distribution, this research develops multiple comparison procedures with the control population in terms of median lifetime. We consider k independent populations, where the ith population possesses a two-parameter exponential distribution, i = 1, …, k. The kth population is regarded as the control population and the first k − 1 populations are regarded as the treatment populations. The location parameters are unknown and are usually called the guaranteed times in reliability analysis. The unequal and unknown scale parameters can be regarded as the mean lifetimes minus the corresponding location parameters, since the mean lifetime for the ith population is the sum of its location and scale parameters, i = 1, …, k. Regarding multiple comparisons with the control population, Ng et al. [2] proposed a procedure in terms of the location parameter under the assumption of equal scale parameters. Under heteroscedasticity (unequal scale parameters), Lam and Ng [3] developed a design-oriented two-stage multiple comparison procedure in terms of location parameters. For the problem of multiple comparisons with the average under heteroscedasticity, Wu and Wu [4] investigated two-stage procedures in terms of exponential location parameters. However, the experimenters may be unable to collect the additional sample that the second stage of a two-stage procedure requires. For this reason, Wu et al. [5] proposed one-stage multiple comparisons with the average instead. Wu [6] proposed a one-stage multiple comparison procedure with the control for exponential distributions in terms of mean lifetimes under heteroscedasticity.
Rather than basing the multiple comparisons with the control on mean lifetimes, we consider median lifetimes in this study, since the median is a more robust measure of central tendency for exponential lifetime distributions than the mean. The median lifetime for the ith population is its location parameter plus ln 2 times its scale parameter, i = 1, …, k. One-sided and two-sided confidence intervals for the deviation of the ith median lifetime from the control median lifetime, i = 1, …, k − 1, are proposed in the next section. All critical values are computed and tabulated for practical use. In Section 3, an example comparing the survival days for four categories of lung cancer illustrates the implementation of the proposed methods. Finally, our conclusions are summarized in Section 4.
2. One-Stage Multiple Comparisons with the Control for Exponential Median Lifetimes under Heteroscedasticity
The exponential distribution is widely used in reliability applications to model data with a constant failure rate; see, for example, Johnson et al. [7]. Writing $\theta_i$ for the location parameter and $\sigma_i$ for the scale parameter of the ith population, the probability density function (pdf) and cumulative distribution function (cdf) for the ith exponentially distributed population are defined as
$$f_i(x) = \frac{1}{\sigma_i}\exp\left(-\frac{x-\theta_i}{\sigma_i}\right), \quad x > \theta_i,$$
and
$$F_i(x) = 1 - \exp\left(-\frac{x-\theta_i}{\sigma_i}\right), \quad x > \theta_i.$$
The location parameters $\theta_i$ and scale parameters $\sigma_i$ are unknown and possibly unequal. The survival function for the ith exponentially distributed population is $1 - F_i(x) = \exp\left(-(x-\theta_i)/\sigma_i\right)$. The qth quantile of the ith exponential distribution is obtained by solving $F_i(x) = q$, which gives $x = \theta_i - \sigma_i \ln(1-q)$, i = 1, …, k. That is, a proportion 1 − q of the products have lifetimes longer than this quantile. Setting q = 0.5, the median lifetimes are obtained as $\theta_i + \sigma_i \ln 2$, i = 1, …, k. In other words, 50% of the products have lifetimes longer than the median lifetime.
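As a minimal sketch of the quantile relation described above, the following function evaluates the qth quantile of a two-parameter exponential distribution from its closed form, location + scale × (−ln(1 − q)). The function names and the argument names `location` and `scale` are illustrative assumptions, not notation taken from the paper.

```python
import math


def exp2_quantile(q, location, scale):
    """Return the qth quantile of a two-parameter exponential distribution.

    `location` and `scale` are hypothetical argument names for the
    location (guaranteed time) and scale parameters.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    # Solve 1 - exp(-(x - location) / scale) = q for x.
    return location - scale * math.log1p(-q)


def exp2_median(location, scale):
    """Median lifetime: the quantile at q = 0.5, i.e. location + scale * ln 2."""
    return exp2_quantile(0.5, location, scale)
```

For instance, `exp2_median(10.0, 5.0)` evaluates 10 + 5 ln 2; the input values here are made up for illustration only.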
Lam and Ng [3] proposed two-stage multiple comparison procedures with the control. However, for reasons such as a limited budget or experimental difficulties, the experimenters may be unable to collect the additional sample for the second stage. In this case, one-stage procedures should be considered instead. Therefore, we propose one-stage multiple comparison procedures for exponential median lifetimes with the control as follows:
Take a random sample of size from denoted by for the one-stage procedures. Let and and let
It is well-known that the complete sufficient statistics for () are (). From Roussas [8], the following three distributional results are observed.
(1) has a chi-square distribution with degrees of freedom (df) denoted by .
(2) has a standard exponential distribution denoted by .
(3) and are independent.
Using the distribution results of (1) and (2), we can find the uniformly minimum variance unbiased estimator (UMVUE) for () as (). Furthermore, the UMVUE for the ith median lifetime is . Then we find the UMVUE for the ith median lifetime deviated from the control median lifetime denoted by as . Based on this estimator, we are going to propose the simultaneous confidence intervals for in Theorem 1.
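The point estimators discussed above can be sketched as follows. This sketch assumes the customary statistics for a two-parameter exponential sample of size m: the sample minimum Y and S = Σ(X_j − Y)/(m − 1). Under that assumption, S is unbiased for the scale parameter, Y − S/m is unbiased for the location parameter, and the median estimate is (Y − S/m) + S ln 2. The function names are hypothetical, and the definition of S is an assumption that should be checked against the procedure actually in use.

```python
import math


def median_lifetime_estimate(sample):
    """Return (location_hat, scale_hat, median_hat) for one sample.

    Assumes Y = min(sample) and S = sum(x - Y) / (m - 1); both the
    statistic definitions and the function name are illustrative.
    """
    m = len(sample)
    if m < 2:
        raise ValueError("at least two observations are required")
    y = min(sample)
    s = sum(x - y for x in sample) / (m - 1)
    location_hat = y - s / m  # unbiased for the location parameter
    scale_hat = s             # unbiased for the scale parameter
    median_hat = location_hat + scale_hat * math.log(2.0)
    return location_hat, scale_hat, median_hat


def median_difference_estimate(treatment, control):
    """Estimated treatment median lifetime minus control median lifetime."""
    return (median_lifetime_estimate(treatment)[2]
            - median_lifetime_estimate(control)[2])
```

A positive value of `median_difference_estimate` suggests a longer median lifetime for the treatment than for the control, although only the simultaneous confidence bounds support a formal conclusion.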
For the ith population, we consider the pivotal quantity
Based on these pivotal quantities, we propose the one-stage multiple comparison procedures with the control in terms of median lifetimes in Theorem 1.
Theorem 1.
For a given, we can find the upper confidence bounds, lower confidence bounds, and two-sided confidence intervals for, i = 1, …, k−1 as follows:
- (a)
- Ifis thepercentile of the distribution of U, where U =), then the simultaneous P* upper confidence bounds forare,.
- (b)
- Ifis thepercentile of the distribution of L, where L=, then the simultaneous P* lower confidence bounds forare,.
- (c)
- Ifis thepercentile of the distribution of T, where T =, then the simultaneous P* two-sided confidence intervals for are
The proof of the above theorem relies on the following lemma given in Lam [9,10]:
Lemma 1.
Suppose X and Y are two random variables, a and b are two positive constants, then.
Proof of Theorem 1.
For (a), we have
It is clear that represents the percentile of the distribution of U and thus the proof is completed.
For (b), we have
It is clear that represents the percentile of the distribution of L and thus the proof is completed.
For (c), combining (a) and (b), we have
It is clear that represents the percentile of the distribution of T and thus the proof is completed. □
When product lifetimes follow a two-parameter exponential distribution, this theorem can be used to find the upper and lower confidence bounds for the difference between each treatment median lifetime and the control median lifetime, i = 1, …, k − 1. It can also be used to find the two-sided simultaneous confidence intervals for these differences. Based on these estimates, experimenters can identify treatment populations that are better than the control, worse than the control, or not much different from the control in terms of median lifetimes. A real-life example demonstrating the application of this theorem is given in Section 3.
It is very difficult to derive the pdf or cdf of U, L, and T in closed form. Using the three distributional results (1)–(3) above, we observe that
, , = 2 m−2. If we can generate independent random variables , , then we can find the empirical distribution of U, L, and T and the critical values , , and are the empirical percentiles of the distributions of U, L, and T, through Monte-Carlo simulation methods.
The steps to find the three critical values in Theorem 1 are as follows:
Step 1: Generate k independent random variables ~ and another k independent random variables ~ and then obtain the k independent random variables .
Step 2: Compute U = ;
L = and
T = , .
Step 3: Repeat Steps 1 and 2 100,000 times. After sorting the simulated values of U, L, and T in ascending order, we obtain their order statistics.
Step 4: The critical values are obtained as = ; = ; = , where [x] is the largest integer less than or equal to x.
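The simulation steps above can be sketched in outline as follows. The statistics U, L, and T are passed in as a callable, so `statistic` is a placeholder the user must supply rather than the paper's formula. Two further assumptions are made: the chi-square variables have 2m − 2 degrees of freedom, and the critical value is the order statistic of rank [P* × number of replications]. All names are hypothetical.

```python
import math
import random


def simulate_critical_value(statistic, k, m, p_star, n_rep=100_000, seed=None):
    """Empirical p_star percentile of a user-supplied statistic.

    `statistic(chi2, expo)` is a placeholder: it receives two lists of
    length k and must return a float (for example U, L, or T).
    """
    if k < 1 or m < 2 or not 0.0 < p_star < 1.0:
        raise ValueError("need k >= 1, m >= 2 and 0 < p_star < 1")
    rng = random.Random(seed)
    values = []
    for _ in range(n_rep):
        # Step 1: k chi-square(2m - 2) draws, generated as
        # Gamma(shape = m - 1, scale = 2), and k standard exponential draws.
        chi2 = [rng.gammavariate(m - 1, 2.0) for _ in range(k)]
        expo = [rng.expovariate(1.0) for _ in range(k)]
        # Step 2: evaluate the statistic for this replication.
        values.append(statistic(chi2, expo))
    # Step 3: sort the simulated values in ascending order.
    values.sort()
    # Step 4: take the order statistic of rank [p_star * n_rep] (1-based).
    rank = max(math.floor(p_star * n_rep), 1)
    return values[rank - 1]
```

As a sanity check rather than a reproduction of Table 1, calling the function with `statistic=lambda chi2, expo: expo[0]`, `k=1`, and `p_star=0.95` should return a value close to −ln 0.05, the 95th percentile of the standard exponential distribution.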
Remark:
For unequal initial sample sizes, Theorem 1 can be modified by replacing m with the sample size of the corresponding population.
For practical use, we compute the three critical values of Theorem 1 with the above algorithm for k = 3, 4, …, 10, m = 2, 3, …, 10, 15, 20, 25, 30, and P* = 0.90, 0.95, and 0.975. The critical values are listed in Table 1. From part (c) of Theorem 1, the length L1 of the two-sided confidence intervals increases with the critical value when c* is fixed. From Table 1, we observe that the critical value increases when P* increases for fixed k and m, or when k increases for fixed P* and m. Therefore, the confidence length L1 increases when P* increases for fixed k and m, or when k increases for fixed P* and m.
Table 1.
Approximate critical values of , , and .
3. Example
Following Maurya et al. [11], we use the survival days of patients with inoperable lung cancer who were subjected to a standard chemotherapeutic agent to illustrate the multiple comparison procedures with the control proposed in Theorem 1. The patients are divided into four categories based on the histological type of their tumor: squamous, small, adeno, and large. The data are part of a larger data set collected by the Veterans Administration Lung Cancer Study Group in the United States. The survival days of 9 patients for four kinds of lung cancer are listed in Table 2:
Table 2.
Survival times for four categories of lung cancer.
Maurya et al. [11] indicated that the data in the four categories may be assumed to be drawn from two-parameter exponential distributions. We regard the first category as the control population.
The required statistics and critical values for P* = 0.90, 0.95, and 0.975 are summarized in Table 3.
Table 3.
The required statistics and critical values.
The upper and lower confidence bounds for the difference between the median lifetime of Category i and that of Category 1, i = 2, 3, 4, under confidence coefficients 0.90, 0.95, and 0.975 are listed in Table 4, using parts (a) and (b) of Theorem 1. Since all upper bounds are positive, no category is selected into the subset of treatment populations that are worse than the control population (Category 1 lung cancer). Since only Category 4 has a positive lower bound for all confidence coefficients, we conclude that only Category 4 is selected into the subset of treatment populations that are better than the control population, with the probability of correct selection being at least 0.90, 0.95, and 0.975, respectively.
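The selection rule used above can be sketched as a small helper: a treatment is declared better than the control when its lower bound is positive, worse when its upper bound is negative, and left undecided otherwise. The function name, the labels, and the bounds in the usage note are hypothetical and are not the values of Table 4.

```python
def classify_treatments(bounds):
    """Classify treatments from confidence bounds on
    (treatment median lifetime - control median lifetime).

    `bounds` maps a treatment label to a (lower, upper) pair.
    """
    decisions = {}
    for label, (lower, upper) in bounds.items():
        if lower > upper:
            raise ValueError(f"lower bound exceeds upper bound for {label!r}")
        if lower > 0:
            # The whole plausible range lies above zero.
            decisions[label] = "better than control"
        elif upper < 0:
            # The whole plausible range lies below zero.
            decisions[label] = "worse than control"
        else:
            decisions[label] = "not distinguished from control"
    return decisions
```

With made-up bounds such as `{"A": (-5.0, 40.0), "B": (3.0, 60.0)}`, only treatment "B" would be declared better than the control.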
Table 4.
The 90%, 95%, and 97.5% upper bounds (the number before comma) and lower bounds (the number after comma) for three categories compared with the control category (Category 1).
The two-sided confidence intervals for the difference between the median lifetime of Category i and that of Category 1, i = 2, 3, 4, with confidence coefficients 0.90, 0.95, and 0.975 are computed using part (c) of Theorem 1, and the results are listed in Table 5. For confidence levels of 0.90 and 0.95, only the confidence interval for Category 4 excludes zero, with a positive lower limit, so we conclude that only Category 4 has longer median survival days than Category 1. For a confidence level of 0.975, no category is identified as having longer median survival days than Category 1.
Table 5.
The 90%, 95% and 97.5% two-sided confidence intervals for three categories compared with the control category (Category 1).
4. Conclusions
In this paper, we analyze the impact of the confidence level P* and the number of populations k on the confidence length. Instead of performing multiple comparisons with the control for exponential mean lifetimes, we propose multiple comparison procedures with the control in terms of median lifetimes, since the median is a more robust measure of central tendency than the mean for exponential lifetime distributions. We also give a real-life example to illustrate how to find the upper bounds, lower bounds, and two-sided confidence intervals for the parameters related to median lifetimes.
Funding
This research was funded by the Ministry of Science and Technology, Taiwan, under grants MOST 108-2118-M-032-001- and MOST 109-2118-M-032-001-MY2, and the APC was funded by MOST 109-2118-M-032-001-MY2.
Acknowledgments
The author wishes to thank an associate editor and the referees for their careful reading and valuable suggestions, which made the article more readable and applicable. The author’s research was supported by the Ministry of Science and Technology, Taiwan, ROC, under grants MOST 108-2118-M-032-001- and MOST 109-2118-M-032-001-MY2.
Conflicts of Interest
The author declares no conflict of interest.
References
- Lawless, J.F. Statistical Models and Methods for Lifetime Data; Wiley: New York, NY, USA, 2003.
- Ng, C.; Lam, K.; Chen, H. Multiple Comparison of Exponential Location Parameters with the Best under Type II Censoring. Am. J. Math. Manag. Sci. 1992, 12, 383–402.
- Lam, K.; Ng, C. Two-stage procedures for comparing several exponential populations with a control when the scale parameters are unknown and unequal. Seq. Anal. 1990, 9, 151–164.
- Wu, S.-F.; Wu, C.-C. Two stage multiple comparisons with the average for exponential location parameters under heteroscedasticity. J. Stat. Plan. Inference 2005, 134, 392–408.
- Wu, S.-F. One stage multiple comparisons with the average for exponential location parameters under heteroscedasticity. Comput. Stat. Data Anal. 2013, 68, 352–360.
- Wu, S.-F. One stage multiple comparisons of k − 1 treatment mean lifetimes with the control for exponential distributions under heteroscedasticity. Commun. Stat. Simul. Comput. 2018, 47, 2968–2978.
- Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions; Wiley: New York, NY, USA, 1994.
- Roussas, G.G. A Course in Mathematical Statistics; Academic Press: San Diego, CA, USA, 1997.
- Lam, K. Subset selection of normal populations under heteroscedasticity. In IPASRAS-II: Proceedings and Discussions of the Second International Conference on Inference Procedures Associated with Statistical Ranking and Selection on the Frontiers of Modern Statistical Inference Procedures, II; ACM: New York, NY, USA, 1992; pp. 307–344.
- Lam, K. An improved two-stage selection procedure. Commun. Stat. Simul. Comput. 1988, 17, 995–1006.
- Maurya, V.; Goyal, A.; Gill, A.N. Simultaneous testing for the successive differences of exponential location parameters under heteroscedasticity. Stat. Probab. Lett. 2011, 81, 1507–1517.
© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).