On the Admissibility of Simultaneous Bootstrap Conﬁdence Intervals

: Simultaneous conﬁdence intervals are commonly used in joint inference of multiple parameters. When the underlying joint distribution of the estimates is unknown, nonparametric methods can be applied to provide distribution-free simultaneous conﬁdence intervals. In this note, we propose new one-sided and two-sided nonparametric simultaneous conﬁdence intervals based on the percentile bootstrap approach. The admissibility of the proposed intervals is established. The numerical results demonstrate that the proposed conﬁdence intervals maintain the correct coverage probability for both normal and non-normal distributions. For smoothed bootstrap estimates, we extend Efron’s (2014) nonparametric delta method to construct nonparametric simultaneous conﬁdence intervals. The methods are applied to construct simultaneous conﬁdence intervals for LASSO regression estimates.


Introduction
In statistical sciences, the computation of simultaneous confidence intervals (SCIs) for parameters of interest plays a major role [1]. For instance, multiple testing problems, regression analysis, and multivariate inference are well known application areas. When the finite sampling distribution of the estimator (here, a vector) under the alternative is known, exact SCIs can be computed. However, that is often not the case, and when the underlying distribution of the estimator is difficult to obtain analytically, bootstrap methods (using drawing with replacement from the data) can be used to approximate its sampling distribution (see, e.g., [2,3], among others). The authors of [4] provide an algorithm to construct SCIs for multivariate estimates, which calculates the number of bootstrap samples that fall outside a confidence region. To achieve the pre-specified level of (1 − α) × 100%, one can trial different values of the lower and upper bounds of the intervals until attaining the target level. In [5], a more efficient way of computing (1 − α) × 100% upper SCIs was developed by sorting the bootstrap realizations of the multivariate estimates by the maximum rank across all the coordinates. Then, the (1 − α) × 100% percentile of the maximum rank is determined. The limit of the confidence interval for each coordinate is the value of the estimate sharing the same rank as the (1 − α) percentile of the maximum rank. By symmetry, the (1 − α) × 100% lower SCIs can also be constructed.
In this note, we improve their method by sharpening the limits of the SCIs for each coordinate. It is observed that the value of the estimate sharing the same rank as the (1 − α) × 100% percentile of the maximum rank is indeed higher than the estimates in the selected (1 − α) × 100% samples, but the threshold for each coordinate is too conservative. We can tighten them rather than using the same maximum rank cutoff values (see details below). Furthermore, we show that the new proposed SCIs are admissible in the sense that there exist no other bootstrap SCIs which have the same (1 − α) × 100% coverage probability while being uniformly smaller than or equal to the proposed confidence intervals in all coordinates and strictly smaller in at least one coordinate. We apply the proposed nonparametric SCIs on LASSO penalized regression estimates to perform post model selection inference. We observe that the nonparametric method is able to preserve the asymmetry of the sampling distribution of the LASSO penalized regression estimates. This is advantageous compared to parametric-based SCIs, which are often constructed symmetrically around the point estimate ignoring the asymmetric nature of the penalized estimates.

Methods
In the following, let Z denote a dataset and let θ=(θ 1 , . . . , θ p ) denote a vector of parameters of interest. Then, the SCI Bootstrap-based algorithms for the computations of the thresholds a j (Z) and b j (Z) will now be discussed. First, we will introduce the Mandel-Betensky intervals (MB) and our proposed modification (MMB), which are obtained by sharpening their individual thresholds. Later, we show that they are admissible at level (1 − α).

Mandel-Betensky Intervals and Modified (MMB) Upper Intervals
We first estimate the upper bounds (using resampling) and will focus later on twosided SCIs. Since the computation of the upper MBB bounds uses the same initial steps as those of the MB intervals, we provide their computational details simultaneously:

2.
For each coordinate j, j = 1, . . . , p, order the bootstrap estimates asθ (1j) ≤θ (2j) ≤ · · · ≤ θ (Bj) . In the case of tied observation, we use random ranks. The rank ofθ bj is denoted as r(b, j). For each sample, find the maximum rank r(b) = max j r(b, j), which is the largest rank associated with the bth bootstrap sample.

4.
Denote the collection of bootstrap samples with the maximum rank below r 1−α as Construct the upper limits for each coordinate: For coordinate j, calculate t j = max b∈Φ r(b, j). The upper limits of the SCIs are given byθ (t 1 1) ,θ (t 2 2) . . . ,θ (t p p) .
In comparison, the MB method uses t j = r 1−α for all j. It can be shown that t j = max b∈Φ r(b, j) ≤ r 1−α . This is because for any b ∈ Φ, r(b, j) ≤ r(b) ≤ r 1−α . The proposed method using individual rank thresholds instead of the same maximum rank threshold for all coordinates. By this means, the new method sharpens the upper bound and leads to universally shorter or equal SCIs. We summarize the results in the following theorem.

1.
Follow the same steps 1 and 2 as in the algorithm for the computation of the MMB upper intervals.

3.
Denote the collection of bootstrap samples with the maximum sample rank equal or below the For each coordinate j, order the bootstrap estimates in set Φ. The new rank of θ bj in set Φ is denoted as r (b, j). For each sample, find the minimum sample rank r (b) = min j r (b, j) which is the smallest rank associated with the bth bootstrap sample in set Φ.

5.
Calculate Denote the collection of bootstrap samples within Φ with the minimum sample rank above To construct the upper and lower limits for each coordinate, calculate t j = max b∈Ψ r(b, j), and w j = min b∈Ψ r(b, j). The simultaneous confidence interval upper limits areθ (t j j) and the lower limits areθ (w j j) , j = 1, . . . , p, respectively. Proofs of Theorem 1 and 2 are provided in Appendix A. The theoretical results above assume that there are no ties for the (1 − α/2) percentile of r(b), and the (α/(2 − α)) percentile of r (b). In practice, however, ties can often occur. Increasing the number of bootstrap samples can decrease the chance of having ties. In the presence of ties in ranking r(b), there could be multiple samples denoted as set S 1 with the same maximum rank r 1−α/2 . We can arbitrarily split them to set Φ and set Φ c while maintaining that there are B(1 − α/2) samples in Φ. A similar strategy can be applied when there are multiple samples denoted as set S 2 with tied minimum sample rank for the (α/(2 − α))−percentile of r (b). The resulting SCIs are not unique, but the admissibility result still holds true if the competing set of intervals have the same coverage rate and the covered set of bootstrap samples Ω satisfies the condition that Ω ∩ Ψ c (S 1 ∪ S 2 ). The condition implies Ω contains at least a sample b such that b is not in Ψ and also not in S 1 ∪ S 2 . That means either maximum rank r(b ) > r 1−α/2 , or the minimum rank r (b ) < r α/2−α . Then, the competing set cannot have intervals universally shorter than the proposed MMB intervals.
We conduct simulation studies to verify the validity of the proposed method. We generate a regression model y i = x T i β + i , where β = (β 1 , . . . , β p ) T , and i either follows a normal N(0, 1), or lognormal(0, 1) distribution, i = 1, . . . , n. We construct the nonparametric 100(1 − α)% SCIs for the unknown regression parameters with varying sample sizes and regression parameters. We compared the performance of the percentile-based methods with the simultaneous confidence intervals based on the multivariate normal (MVN) critical values. We also include the Bonferroni method which uses the critical value of the univariate normal distribution but the significance level is adjusted to α/p. The covariance structure of the multivariate normal is estimated by the sample covariance of the bootstrap estimates. For each simulation setting, we simulate 1000 datasets. For each dataset, we generate 10, 000 bootstrap samples. The significance level α is set to 0.05. It is shown in Table 1 that the proposed MMB intervals maintain satisfactory simultaneous coverage rates close to the nominal level for both normal and lognormal errors and different values of p. The MMB method's coverage probability is very close to that of the MB method, while the interval lengths are uniformly shorter across all the simulation settings. The Bonferroni method is always more conservative than the MVN method. When p increases, for lognormal errors, the MVN method and the Bonferroni method are conservative with familywise type I error rate equal to 0.018 which is much less than the norminal level of 0.05. Results are obtained from 1000 simulated datasets, the number of bootstrap samples is 10,000; "cov-prob" denotes the coverage probability; the size of intervals is calculated as ||θ U −θ L || 2 2 . The simultaneous confidence level is 0.95.

Simultaneous Confidence Interval for Penalized Regression Estimates
In this application, we construct simultaneous confidence intervals using penalized estimates outputted from LASSO generalized linear regression [6]. In order to make post model selection confidence intervals, two strategies were proposed by [7]: bootstrap confidence intervals and smoothed bootstrap confidence interval.
Hereby, different bootstrap samples may lead to different identified submodels s. This implies that for some of the bootstrap samples, θ j could be thresholded to be zero, whereas in other bootstrap samples, the resulting θ j could be nonzero. The author of [7] proposed using the bootstrap of the smoothed estimateθ = ∑ B b=1θ b /B ( [8][9][10]) to correct the erratic jumpiness of the estimates obtained from different bootstrap samples. He also proposed a method for estimating the asymptotic standard error and to construct a confidence interval for a single parameter using the nonparametric delta method. It is, however, desirable to construct SCIs using the vector of smoothed bootstrap estimates. First, we compute the (asymptotic) covariance matrix of two smoothed bootstrap estimates. This result is established for the ideal bootstrap, where B is equal to all the n n possible choices of the bootstrap samples. Given the original dataset Z = (z 1 , . . . , z n ), let the bth bootstrap sample be (z * b1 , . . . , z * bn ) and the two estimates beθ bl , andθ bm . Let Z * bi = #{k ∈ {1, . . . , n} | z * bk = z i }, which is the number of data points in the bootstrap sample equal to z i . Then, the nonparametric delta-method-based estimator of the asymptotic covariance matrix for the ideal smoothed bootstrap estimatesθ l = ∑ B b=1θ bl /B, andθ m = ∑ B b=1θ bm /B, is where Cov * (Z * bi ,θ bl ) and Cov * (Z * bi ,θ bm ) denote the bootstrap covariance between Z * bi and θ bl , and the bootstrap covariance between Z * bi andθ bm , respectively. The ideal smoothed bootstrap estimates are a functional on the data (z 1 , . . . , z n ), and symmetric in their argument. They can be considered as the functionals of the empirical distributionF, which assigns 1/n probability to each of the observed data points. It has been shown by [11], that the asymptotic covariance of two functionals of the empirical distribution, can be obtained as 1 In practice, the estimator of the covariance of the non-ideal bootstrap estimates can take the form .i )(θ bm −θ m ), respectively. Then, the joint limiting distribution of (θ 1 , . . . ,θ p ) is multivariate normal with covariance matrix that can be estimated asΣ = (σ lm ). The numerical evaluation of the multivariate normal distribution is provided with the R package mvtnorm, see [12]. Let c 1−α denote the critical value of the multivariate distribution of Σ, then the SCIs for all the parameters are given by CI l = θ l − c 1−α σ ll ,θ l + c 1−α σ ll , l = 1, . . . , p.
We perform the generalized linear regression with the LASSO penalty on the heart data [13]. The dataset contains the coronary heart disease status of 462 patients and the covariates include sbp (systolic blood pressure), tobacco (cumulative tobacco), ldl (low density lipoprotein cholesterol), adiposity, famhist (family history of heart disease), typea (type-A behavior), obesity, alcohol (current alcohol consumption), and age (age at onset). The optimum penalty size is determined by the Bayesian information criterion. Then, we perform 1000 bootstrap samples and construct the SCIs for the unknown regression parameters. We compare the results of MMB and MB methods with the smoothed bootstrap confidence intervals using the critical values from the multivariate normal distributions and the Bonferroni critical values. The nonparametric delta method is used to derive the limiting covariance structure. It is shown in Table 2 that all four methods provide similar lower and upper bounds. The MVN and Bonferroni methods generate confidence intervals all containing zeros. In comparison, most of the confidence intervals generated by MMB and MB methods have upper bounds or lower bounds at zero value. Figure 1 depicts all the confidence intervals generated by all four methods. Each confidence interval is scaled by the standard deviation of the point estimate so that all intervals can be presented on the same scale. The MMB and MB methods have mostly asymmetric confidence intervals which are either non-negative or non-positive. This asymmetric structure not only reflects the sparse nature of the penalized estimates, but also provides evidence on the direction of the parameter. Compared to the MB method, the proposed MMB method produces shorter intervals.

Discussion
We propose nonparametric simultaneous confidence intervals to provide distributionfree joint inferences for multiple parameters. Compared to existing MB intervals, the proposed modified MB intervals have uniformly sharper upper and lower bounds. When used in post model selection inference, the MMB intervals are advantageous as they preserve the asymmetry of the sampling distributions of the penalized estimates.
When the number of parameters increases with the sample size, it is difficult to develop simultaneous confidence intervals with desired coverage probability for all the parameters. There are a number of challenges for parametric or nonparametric methods: (1) To obtain the empirical multivariate distribution ofθ 1 , . . . ,θ P , it requires much larger bootstrap samples. (2) The correction for the adjustment of multiplicities is more severe.
(3) The quantile for the high-dimensional multivariate distribution is harder to compute. To overcome this issue, we may adopt the approach of false-discovery-rate adjusted multiple confidence intervals [14]. This warrants future investigations to develop nonparametric simultaneous intervals for high dimensional parameters.
Funding: Gao's research was funded by Natural Sciences and Engineering Research Council of Canada.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The R codes of all the numerical studies are available at https://github. com/xingaostat/Nonparametric-simultaneous-intervals, accessed date: 2 July 2021.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: