Quantile-Adaptive Sufficient Variable Screening by Controlling False Discovery

Sufficient variable screening rapidly reduces dimensionality with high probability in ultra-high dimensional modeling. To rapidly screen out the null predictors, we develop a quantile-adaptive sufficient variable screening framework that controls false discovery. Without specifying an actual model, we first introduce a compound testing procedure based on the conditionally imputed marginal rank correlation at different quantile levels of the response to select active predictors in high dimensionality. The test statistic can capture sufficient dependence through two paths: one controls false discovery adaptively, and the other controls the false discovery rate at a prespecified threshold. The procedure is computationally efficient and easy to implement. We establish its theoretical properties under mild conditions. Numerical studies, including simulation studies and real data analysis, provide supporting evidence that the proposal performs reasonably well in practical settings.


Introduction
When the dimension p grows exponentially with the sample size n, the computational burden that ultra-high dimensionality imposes on classical variable selection not only heavily slows down algorithms but also yields unstable solutions [1]. To rapidly screen out inactive predictors, variable screening methods for ultra-high dimensional data have been developed to reduce the dimension while retaining all active variables in the reduced variable space with high probability [2]; this is referred to as the sure screening property. Fan and Lv (2008) proposed sure independence screening (SIS) based on the marginal Pearson correlation coefficient in the linear regression model [1]. Since then, a series of variable screening methods have been proposed, such as screening frameworks based on generalized linear models, additive models, and more general models [3][4][5][6]. These methods rest on specific model assumptions. In many scientific applications, however, the relationship between the predictors and the response is difficult to postulate for ultra-high dimensional data [7]. Model-based screening procedures enjoy fast computation but run the risk of model misspecification [7,8].
To avoid the inconsistency between the assumptions of a regression model and the actual distribution of the data, model-free variable screening methods were initially designed for continuous outcome variables [7][8][9][10][11]. For ultra-high dimensional covariates coupled with a categorical response, Mai and Zou (2013) advocated the Kolmogorov-Smirnov distance for binary classification problems [12]. With a possibly diverging number of classes, marginal feature screening procedures for ultra-high dimensional discriminant analysis were introduced by Huang et al. (2014) and Cui et al. (2015) [13,14]. Han (2019) studied a general and unified nonparametric screening framework under conditionally strictly convex loss [15]. Zhou et al. (2020) established a forward screening procedure based on a new measure called cumulative divergence [16]. Xie et al. (2020) explored a category-adaptive screening procedure for ultrahigh dimensional heterogeneous categorical data [17].
As reported in Hao and Zhang (2017), variable screening results depend on the signal-to-noise (SNR) level: when the signal is weak relative to massive noise variables, it may be difficult to separate the active variables from the noise variables, and the sure screening property may not hold [18]. In this situation, one path is to control false discoveries. In this regard, Tang et al. (2021) explored a quantile correlation-based screening framework (QCS), which screens variables by conducting multiple testing to control the false discovery rate (FDR) [19]. Liu et al. (2022) proposed a two-step approach that specifies the threshold for feature screening with the help of knockoff features so that the FDR is controlled at a prespecified level [20]. Guo et al. (2022) advocated a data-adaptive threshold selection procedure with FDR control based on sample splitting [21].
However, most of the above sure screening methods are not sufficient variable screening (SVS) technology, which was first proposed in Cook (2004) and also discussed by Yin and Hilafu (2015) and Yuan et al. (2022) [22][23][24]. For illustration, consider a population with a response variable Y and a p-dimensional vector of predictors X = (X_1, ..., X_p)^T; let X_A be a subset of X, and let X_{A^c} denote its complement in X. Based on the research of Yin and Hilafu (2015) and Yuan et al. (2022), sufficient variable screening aims to find the smallest and unique active variable set X_A such that Y ⊥⊥ X_{A^c} | X_A; that is, given the set X_A, Y is independent of X_{A^c} [23,24].
In this paper, without any specific regression or parametric assumptions, we advocate a new sufficient variable screening procedure that uses a robust multiple testing procedure with false discovery control to distinguish active variables by splitting the continuous response at different quantile levels. We thus achieve quantile-adaptive sufficient variable screening by controlling the false discovery (QA-SVS-FD). The proposed procedure is based on a one-versus-rest (OVR) test statistic with an asymptotic chi-square distribution under the null hypothesis. With this asymptotic distribution, the sufficient variable set can be estimated precisely either by controlling the FDR accurately at a given level in high dimensionality or by controlling the number of false discoveries adaptively with an error of 1. In addition, the proposed procedure is model-free, measuring independence without any specified distributional model; it is therefore robust for detecting sufficient relevant variables across different model types.
The rest of this paper is organized as follows: Section 2 develops the sufficient variable screening test statistic by using the conditionally imputed marginal rank correlation at different quantile levels of the response. The false discovery controlling procedure under mild conditions is studied in Section 3. Sections 4 and 5 evaluate the proposed procedure's performance via extensive numerical research, containing simulation studies and two real data examples, which verify the robustness and flexibility of our methods. Section 6 gives a short concluding discussion. All the theoretical properties are proved in Appendix A.

Sufficient Screening Utility
As stated in Yuan et al. (2022), existing sufficient variable screening relies on an iterative two-step screening procedure that involves complex computation [24]. This section proposes a novel sufficient variable screening statistic via a quantile-adaptive correlation test (QA-SVS). The quantile-adaptive screening idea goes back to He et al. (2013) [9]; we do not elaborate on it again, as we regard it as a special case of the framework proposed in this paper. Lemma 1. For any j = 1, ..., p and k = 1, ..., K, F_jk(x) = F_j(x) for all x if and only if X_j ⊥⊥ I(Y ∈ G_k). Lemma 1 is proved in Appendix A.2. According to Yuan et al. (2022) [24], the sufficient active variable set is actually screened based on the structure of (Y, X_{A_k}, X_{A_k^c}). Lemma 1 shows that this structure can be reduced to the marginal structure of (Y, X_j) by judging the difference between F_jk(x) and F_j(x) for each j = 1, ..., p and k = 1, ..., K.
In terms of the quantile heterogeneity of the response, consider a series of tests to detect sufficient active variables simultaneously at different quantile levels, that is, for 1 ≤ j ≤ p and 1 ≤ k ≤ K. Rewrite the test in Equation (1) accordingly, where A = ∪_{k=1}^K A_k. To investigate the difference in the conditional distribution of X_j (j = 1, ..., p) across quantile levels, for given k ∈ {1, ..., K}, a variable screening approach is developed by capturing the dependence between I(Y ∈ G_k) and X_j through the following screening utility, where τ̂_jk reflects the difference between the conditional cumulative distribution function (CDF) and the marginal CDF of X_j at each quantile level. Actually, υ̂_jk = 12·(n + 1)·τ̂²_jk / (p̂_k·(1 − p̂_k)), where Var_OVR{τ̂_jk} = p̂_k·(1 − p̂_k)/(12·(n + 1)) represents the variance of τ̂_jk in the one-versus-rest test of the k-th series of H_0,j,k. As defined in Equation (4), a higher υ̂_jk represents a stronger correlation between the variable X_j and I(Y ∈ G_k).
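The screening utility can be sketched in a few lines. Since only a summary of τ̂_jk survives in the text, this minimal Python sketch assumes a Wilcoxon-type statistic: within each quantile slice G_k of the response, the centred ranks of X_j are averaged and standardized by the one-versus-rest variance p̂_k(1 − p̂_k)/(12(n + 1)), so that the resulting υ̂_jk is approximately χ²_1 under independence. The function name and the slicing scheme are illustrative assumptions, not the authors' exact estimator.

```python
import numpy as np

def qa_svs_utility(x, y, K=5):
    """Quantile-adaptive OVR rank utility for one predictor (illustrative sketch).

    For each quantile slice G_k of y, compare the ranks of x inside G_k with
    their overall average via a rank-sum-type statistic, standardized so that
    it is approximately chi-square(1) under independence.  The 12*(n+1)
    scaling follows the text; the precise form of tau_jk is an assumption.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(y)
    ranks = np.argsort(np.argsort(x)) + 1                 # ranks of x, 1..n
    edges = np.quantile(y, np.linspace(0, 1, K + 1))      # slice boundaries
    stats = np.empty(K)
    for k in range(K):
        lo, hi = edges[k], edges[k + 1]
        # last slice is closed on the right so every observation is used
        in_k = (y >= lo) & ((y <= hi) if k == K - 1 else (y < hi))
        p_k = in_k.mean()
        # tau_jk: mean of centred ranks restricted to the slice, scaled by n
        tau = np.mean(in_k * (ranks - (n + 1) / 2)) / n
        # standardize with the OVR null variance p_k(1-p_k)/(12(n+1))
        stats[k] = 12 * (n + 1) * tau**2 / (p_k * (1 - p_k))
    return stats
```

For an active predictor the statistic is large in the extreme slices, while for a null predictor each component behaves like a χ²_1 draw.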

Asymptotic Properties of the Test Statistic
According to the approximation distribution for a sample sum in sampling without replacement from a finite population in Mohamed and Mirakhmedov (2016) [26], the asymptotic properties of τ̂_jk and υ̂_jk are obtained as follows.
Lemma 2 (Asymptotic Distribution of τ̂_jk). If H_0,j,k is true for all k = 1, 2, ..., K and j = 1, 2, ..., p, and lim_{n→∞} p̂_k(1 − p̂_k) > 0, then we obtain the asymptotic normal distribution of τ̂_jk. Corollary 1 (Asymptotic Distribution of υ̂_jk). If H_0,j,k is true for any k = 1, 2, ..., K and j = 1, 2, ..., p, and lim_{n→∞} p̂_k(1 − p̂_k) > 0, then υ̂_jk converges in distribution to χ²_1, where χ²_m denotes the chi-square distribution with m degrees of freedom. If H_0,j is true for any j = 1, 2, ..., p, we obtain υ̂_j = ∑_{k=1}^K υ̂_jk, which converges in distribution to χ²_K. Lemma 2 and Corollary 1 are proved in Appendices A.3 and A.4, respectively. According to Lemma 2, the asymptotic normal distribution of τ̂_jk depends on p̂_k. Thus, to remove the influence of p̂_k on the asymptotic distribution and to handle the composite hypothesis testing in (3), υ̂_jk is introduced in this paper.
When the additional conditions below hold, we obtain Theorems 1 and 2.
Condition (C1) requires that the proportion of samples in each grid be neither too small nor too large. Condition (C2) guarantees that there exists a threshold ρ_0 such that a value υ_jk ≤ ρ_0 indicates a weaker correlation. Condition (C3) allows the number of grids to diverge as n increases, which ensures the rationality of the series of hypothesis tests. Conditions (C1)-(C3) are concise and do not impose any distributional models or moment assumptions on the variables.
Theorem 2 (Ranking Consistency Property). Suppose conditions (C1) and (C2) hold. If K log(p) = o(nρ²_0), then the ranking consistency below holds. We provide the proofs of Theorems 1 and 2 in Appendices A.5 and A.6, respectively. Note that Theorem 1 is established for a fixed number of variables p. As long as 4(n + 2)p·exp(−c_4·n^{1−2κ−ξ}) tends to 0 asymptotically, the sure screening property of QA-SVS is robust to heavy-tailed distributions of the predictors and to the presence of potential outliers. The ranking consistency property in Theorem 2 indicates that the values of υ̂_jk of the sufficient active variables corresponding to the k-th grid rank above those of all inactive ones with high probability, which implies that QA-SVS can separate the active and inactive variables with a certain threshold. Theorems 1 and 2 mainly describe the properties of the marginal utility itself; estimating the threshold that partitions the sufficient variable sets is the task of Section 3.

False Discovery Control Model
Based on Theorems 1 and 2, we design two routes for screening sufficient active variables by considering the false discovery (FD): one controls the cardinality of the FD adaptively by detecting outliers, and the other controls the false discovery rate (FDR) accurately via a survival function.
We shall prove this property in Appendix A.7. Theorem 3 implies that the adaptive threshold ρ̂_0 can separate the active and inactive variables with low false discovery with high probability, which converges to 1 − e^{−1} as n increases. The expectation and variance of the number of false discoveries can be controlled at 1 + O(n^{−1/2}), indicating that the number of selected variables is sufficiently controlled. The sufficiently screened set is defined as Â_{k,ρ̂_0} = {j : υ̂_jk ≥ ρ̂_0, 1 ≤ j ≤ p}. The definition of Â_{k,ρ̂_0} estimates the smallest and unique active variable set X_A such that Y ⊥⊥ X_{A^c} | X_A. Furthermore, we obtain the sufficient screening property of Â_{k,ρ̂_0}.

Corollary 2 (Sufficient Screening Property by AFD).
Suppose conditions (C1)-(C3) hold. Then the probability bound below holds, where c_6 is some positive constant and s_k = |A_k| is the true model size, k = 1, ..., K.
Corollary 2 is proved in Appendix A.8. In fact, Corollary 2 can also be regarded as the sure screening property of Fan and Lv (2008). Under the definition of sufficient variables, screening the sufficient variables by controlling false discovery yields more precise results, so we rename the property in Corollary 2 the sufficient screening property. We call the proposed AFD control procedure QA-SVS-AFD. QA-SVS-AFD is computationally efficient, and its validity in detecting active variables is guaranteed by Corollary 2. A stock-in-trade of existing screening methods such as Xie et al. (2020) [17] is to control the cardinality of the screened active variable set with a fixed threshold, making the number of screened variables negligible relative to the ultra-high dimensionality; however, the number of false discoveries remains non-negligible. In this paper, the QA-SVS-AFD procedure controls false discovery precisely by forcing the expectation and variance of the number of false discoveries to converge to 1.
Theorem 3 estimates the threshold by controlling the rejection region at level O(1/p); in other words, we reject the null hypothesis H_0,j,k at a significance level of about 1/p. As a result, the maximal subset of variables in the rejection region is the AFD estimate of the sufficient active variable set. The AFD control path is summarized as Algorithm 1 (QA-SVS-AFD).
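The AFD path above can be sketched briefly, assuming the χ²_1 null of Corollary 1: rejecting each H_0,j,k at level about 1/p keeps the expected number of false discoveries among the nulls near 1. The helper below is an illustrative sketch, not the paper's exact Algorithm 1.

```python
import numpy as np
from scipy.stats import chi2

def afd_select(stats):
    """Adaptive-FD selection (sketch): reject each marginal null at level
    ~1/p under the chi-square(1) null, so the expected number of false
    discoveries among the p nulls is about 1."""
    stats = np.asarray(stats, float)
    p = len(stats)
    rho0 = chi2.ppf(1 - 1.0 / p, df=1)   # threshold with null tail prob 1/p
    return np.where(stats >= rho0)[0]
```

Active variables with statistics far in the χ²_1 tail are retained, while on average only about one null survives the cut.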

False Discovery Rate Control Model
The adaptive false discovery control model sets the rejection region adaptively through the rejection probability of the null hypothesis test, which can lead to a large type-II error in hypothesis testing (2). Therefore, similar to Tang et al. (2021) [19], to control the type-I error in hypothesis testing (2), a false discovery rate (FDR) control procedure is developed for testing H_0,j,k simultaneously for j = 1, ..., p and k = 1, ..., K. Without assuming any prespecified distribution, to sufficiently detect active variables at different quantile levels, we provide a suitable estimator of the threshold ρ that separates the sufficient active variables by controlling the FDR of each H_0,j,k.
With the proposed test statistic υ̂_jk, the false discovery proportion for any given ρ is FDP_{k,ρ} = ∑_{j∈A_k^c} I(υ̂_jk ≥ ρ) / max{∑_{j=1}^p I(υ̂_jk ≥ ρ), 1}, and the false discovery rate is FDR_{k,ρ} = E(FDP_{k,ρ}). By Corollary 1 in Section 2.2, under H_0,j,k each υ̂_jk converges in distribution to χ²_1 under conditions (C1)-(C3). Let q_k = p − s_k be the cardinality of A_k^c, and assume s_k/p → 0 as p → ∞. Intuitively, FDR_{k,ρ} can be estimated by FD_{k,ρ}/q_k divided by max{∑_{j=1}^p I(υ̂_jk ≥ ρ), 1}/p. However, separating the null set A_k^c and determining q_k is still intractable. Thus, we estimate the FDR by replacing FD_{k,ρ}/q_k with S_{χ²_1}(ρ), the survival function of the χ²_1 distribution. Hence, for any given ρ, the estimated FDR is defined as FDR̂_{k,ρ} = p·S_{χ²_1}(ρ) / max{∑_{j=1}^p I(υ̂_jk ≥ ρ), 1}. Consequently, similar to the procedures of Benjamini and Hochberg (1995) [27] and Tang et al. (2021) [19] for controlling the FDR at a prespecified level α ∈ (0, 1), we suggest estimating the threshold ρ for screening the sufficient active variables by ρ̂_k = inf{0 < ρ ≤ ρ_0 : FDR̂_{k,ρ} ≤ α}, for the constant ρ_0 given in Condition (C2). In practical implementation, the candidate values of ρ are taken among υ̂_1k, ..., υ̂_pk. The screened set is then defined as Â_{k,α} = {j : FDR̂_{k,υ̂_jk} ≤ α, 1 ≤ j ≤ p}. Define υ̂_lk ≡ arg max_{j∈Â_{k,α}} FDR̂_{k,υ̂_jk}; in other words, υ̂_lk is the threshold ρ at which FDR̂_{k,ρ} is maximized subject to FDR̂_{k,ρ} ≤ α. The proposed FDR control path is summarized as the following Algorithm 2:
Input: Observation sample (X, Y), the number of grids K, and the prespecified level α.
Output: The screened sufficient variable sets Â_{k,α} (k = 1, ..., K).
Step 1: Calculate υ̂_k1, ..., υ̂_kp of Equation (5) for each k = 1, ..., K;
Step 2: Compute FDR̂_{k,ρ} of Equation (11) with ρ taking each value of υ̂_k1, ..., υ̂_kp;
Step 3: For given α, search for the set Â_{k,α} ≡ {j : FDR̂_{k,υ̂_jk} ≤ α, 1 ≤ j ≤ p} in Equation (12);
Step 4: Find υ̂_kl ≡ arg max_{j∈Â_{k,α}} FDR̂_{k,υ̂_jk} and let ρ̂_k = υ̂_kl;
Step 5: Separate the screened sufficient active set Â_{k,α} of Equation (12) by ρ̂_k.
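Under the same χ²_1 null assumption, the FDR path can be sketched as follows: the candidate thresholds are the observed statistics themselves, the FDR estimate uses the χ²_1 survival function, and the most liberal threshold whose estimated FDR stays below α is retained. Function and variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def fdr_select(stats, alpha=0.05):
    """Sketch of the FDR control path: estimate FDR(rho) as
    p * S_chi2_1(rho) / #{j : stat_j >= rho}, evaluated at the observed
    statistics, then keep the smallest threshold with estimated FDR <= alpha."""
    stats = np.asarray(stats, float)
    p = len(stats)
    order = np.sort(stats)[::-1]            # candidate thresholds, descending
    surv = chi2.sf(order, df=1)             # chi-square(1) survival function
    # at threshold order[i], exactly i+1 statistics are selected
    fdr_hat = p * surv / np.arange(1, p + 1)
    ok = np.where(fdr_hat <= alpha)[0]
    if ok.size == 0:
        return np.array([], dtype=int), np.inf
    rho_hat = order[ok[-1]]                 # most liberal valid threshold
    return np.where(stats >= rho_hat)[0], rho_hat
```

Strong signals drive the survival function toward zero at their observed values, so they are retained while the estimated FDR among the selected set stays near α.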
We call the proposed FDR control path QA-SVS-FDR. Its computational cost is of order O(Kp). QA-SVS-FDR is also computationally efficient, and its validity in detecting active variables is guaranteed by the following theorem.
Theorem 4 (Sufficient Screening Property by Controlling FDR). Suppose conditions (C1)-(C3) hold. Then the probability bound below holds, where c_7 is some positive constant and s_k = |A_k| is the true model size, k = 1, ..., K. For a prespecified level α, if s_k = |A_k| = O(n^ς) for some ς < 1/2, the FDR of the proposed multiple testing procedure satisfies lim_{n→∞} FDR_{ρ̂_k}/α = 1, where ρ̂_k is given in Equation (11).
We shall prove Theorem 4 in Appendix A.9. Theorem 4 establishes the sufficient screening property of the estimator with accurate FDR control. Screening by controlling the cardinality with an empirical threshold leaves the FDR non-negligible. Hence, in view of the asymptotic null distribution of the test statistic in Corollary 1, the FDR of QA-SVS-FDR can be controlled accurately at a prespecified level α, since the FDR estimate is approximated sufficiently well for large n.

Remark 5.
Alternatively, if one focuses on selecting sufficient predictors relevant to the response Y by testing H_0,j in Equation (1), a refined version can be considered, with the corresponding FDR estimate FDR̂_{ρ*} defined analogously. As a result, the screened sufficient active variable set is defined as Â_α. Define υ̂_l ≡ arg max_{j∈Â_α} FDR̂_{υ̂_j}. The path of Â_α is summarized in Algorithm 3. Under the given level α, the FDR of testing (3) satisfies lim_{n→∞} FDR_{ρ̂*_α}/α = 1. This conclusion follows directly from Corollary 2, and we omit the proof.
Thus far, we have completely presented the two paths of sufficient variable screening by controlling the false discovery. The two paths have essentially different frameworks: one uses an adaptive threshold and an outlier-detection model to control the false discovery, and the other controls the false discovery rate accurately by estimating it with survival functions under a given prespecified level α. Both paths control the false discovery to sufficiently screen active predictors, which simplifies the iterative two-step sufficient screening procedure of Yuan et al. (2022) [24].

Simulation Studies
In this section, the performance of the proposed procedure is demonstrated via several simulated examples. In practice, the sample-splitting idea is adopted to avoid the mathematical challenges caused by reusing the sample.
The sample {(X_i, Y_i) : i = 1, ..., n} is split into two parts of sizes n_1 and n_2. The proposed sufficient screening procedure consists of two steps: QA-SVS-A, to screen all active variables, and QA-SVS-FD, to control the FD adaptively (QA-SVS-AFD) or to control the FDR accurately (QA-SVS-FDR). The two steps are specified as follows: (1) QA-SVS-A: using the first part {(X_i, Y_i) : i = 1, ..., n_1}, the p covariates are ranked in descending order according to Remark 5, and the minimum model size including all active variables is evaluated. (2) QA-SVS-FD: using the second part {(X_i, Y_i) : i = 1, ..., n_2}, (i) the sufficient predictors are screened according to Equation (10) at different quantile levels, denoted by Â_k^AFD; (ii) given an FDR level α, the threshold ρ̂_k is estimated by Equation (11), and the selected set Â_{k,α} is defined by Equation (12).
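The sample split described above can be sketched as follows; an even random split is assumed here purely for illustration.

```python
import numpy as np

def split_sample(n, frac=0.5, seed=0):
    """Randomly split indices {0, ..., n-1} into two disjoint parts:
    the first for ranking/screening, the second for threshold estimation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n1 = int(frac * n)
    return perm[:n1], perm[n1:]
```

The two index sets are disjoint by construction, so the statistics computed on the second part are independent of the ranking obtained on the first.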

Performance of QA-SVS-A
In this subsection, the variable screening performance of our proposed QA-SVS is compared with SIS (Fan and Lv, 2008) [1], the distance correlation-based screening (DC-SIS; Li et al., 2012) [8], the quantile-adaptive model-free sure independence screening (QA-SIS; He et al., 2013) [9], and the quantile-based correlation screening (QCS; Tang et al., 2021) [19]. The performance of each procedure is evaluated via the 5%, 25%, 50%, 75%, and 95% quantiles of the minimum model size that includes all active variables, based on 100 replications. The closer this size is to the true model size, the better the variable screening performance.
In the simulation, the predictors X = (X_1, ..., X_p)^T are generated from a p-variate normal distribution with mean 0 and covariance matrix Σ = (σ_ij)_{p×p}, where σ_ij = ρ^{|i−j|}. We set ρ = 0 and 0.5 and let the number of quantile grid points be K = 5, 6, ..., 11. To simulate a high-dimensional scenario, we set n = 500 and p = 1000 or 5000 for each scenario. The response variable is sampled from the following models, in which the error term ε follows N(0, 1), independent of X (Scenario 1.1). The quantiles of the minimum model size including all active variables in Scenarios 1.1 and 1.2 with p = 1000 and p = 5000 are shown in Tables 1 and 2. Due to limited space, the simulation results for the remaining scenarios are presented in Appendix B, Tables A1-A4. Under Scenarios 1.5 and 1.6 with interactions, the proposed QA-SVS-S and QCS perform relatively stably, while both behave somewhat poorly when higher-order effects are present. QA-SIS at extremely low or high quantile levels suffers a major setback, but the proposed QA-SVS-S screens robustly. In addition, the performance of QA-SVS-S degrades only slightly when p increases from 1000 to 5000, whereas the other methods degrade more. Furthermore, the results of QA-SVS-S in all scenarios under ρ = 0.5 indicate that correlation among the covariates still permits sufficient screening. The different settings of the number of grid points K show that QA-SVS-S becomes more effective at detecting the active predictors as K increases, whereas QCS exhibits the opposite trend.
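The predictor design above can be reproduced with a short sketch; generating the AR(1) correlation structure sequentially avoids forming the p × p covariance matrix. The function name and seed handling are illustrative.

```python
import numpy as np

def generate_predictors(n=500, p=1000, rho=0.5, seed=1):
    """Draw X ~ N_p(0, Sigma) with AR(1) covariance sigma_ij = rho^|i-j|,
    matching the simulation settings of Section 4."""
    rng = np.random.default_rng(seed)
    if rho == 0:
        return rng.standard_normal((n, p))
    # sequential AR(1) recursion: X_j = rho*X_{j-1} + sqrt(1-rho^2)*eps_j
    x = np.empty((n, p))
    x[:, 0] = rng.standard_normal(n)
    c = np.sqrt(1 - rho**2)
    for j in range(1, p):
        x[:, j] = rho * x[:, j - 1] + c * rng.standard_normal(n)
    return x
```

The recursion yields exactly the stationary AR(1) correlations corr(X_i, X_j) = ρ^{|i−j|} at a cost of O(np) rather than O(p³) for a Cholesky factorization of Σ.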
Based on 100 replications, the results of the QA-SVS-FDR and the QCS procedure are stored in Table 3, and the results with p = 5000 are presented in Appendix B Table A5.
Under Scenarios 2.1-2.4, the proposed QA-SVS-FDR performs as well as QCS-FDR. The proposed QA-SVS-AFD shows the same performance with a small K, whereas it may miss some active predictors as K increases. The three procedures control the empirical FDR under the prespecified level α for most scenarios. As the number of active predictors increases, the F_1-score of the proposed QA-SVS-FD (QA-SVS-AFD and QA-SVS-FDR) improves slightly, e.g., from 0.92 to 0.97, while QCS-FDR shows the opposite trend. Combined with |Â|, we find that our proposed method screens out the null predictors more accurately but may lose some active predictors. With sufficient screening by controlling FDR, our procedure retains active predictors as much as possible. Under Scenarios 2.5 and 2.7, our method works slightly better than QCS; in particular, the FDR and F_1-score of QA-SVS-AFD reach 0 and 1, respectively. Under Scenarios 2.6 and 2.8, the proposed QA-SVS-AFD and QCS-FDR both fail. However, it is worth mentioning that QA-SVS-FDR attains larger |Â| and F_1-score than QCS-FDR, indicating that QA-SVS-FDR is more effective. In addition, our QA-SVS-FD procedure works reasonably well as p increases from 1000 to 5000, where QCS behaves slightly poorly. In summary, our proposed method performs almost as well as, and is often more effective than, QCS-FDR in various practical settings.
Given the high sensitivity of model-free methods to factors that can distort the underlying relationships between the covariates and the response, we suggest reducing this sensitivity by running the QA-SVS procedure with several different numbers of grid points. Different values of K imply different model complexity: a large K can lead to over-fitting, and a small K to under-fitting. Table 3. The result of criteria in all scenarios under p = 1000 with α = 0.05 of Section 4.2.

Real Dataset Research
In the era of rapid development of machine learning and pattern recognition, image recognition technologies are increasingly applied in the medical field; for example, processing lung CT images can identify whether the lung is diseased. Two methods are often used to quantitatively evaluate the severity of emphysema. The first is CT density measurement: based on the digital CT pixel image, compute the patient's average lung density, set a threshold, calculate the proportion of the area below the threshold, and evaluate the emphysema. The second is the percentile density (PD) technique: analyze the attenuation distribution curve of lung density, fix a percentile (commonly 5% or 95%), compute the corresponding point of the percentile density curve, and evaluate the emphysema symptoms [28]. In this section, we apply our proposed method to analyze a lung CT image dataset downloaded from Kaggle in which the lungs are accurately segmented.
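Both emphysema indices described above reduce to simple operations on the vector of lung attenuation values. The sketch below illustrates them; the −950 HU default threshold is a common radiological convention assumed here, not a value taken from this paper.

```python
import numpy as np

def density_mask_ratio(lung_hu, threshold=-950):
    """Proportion of lung pixels whose attenuation (in Hounsfield units)
    falls below a density threshold: the 'density mask' emphysema index."""
    lung_hu = np.asarray(lung_hu, dtype=float)
    return float(np.mean(lung_hu < threshold))

def percentile_density(lung_hu, q=5):
    """q-th percentile of the lung attenuation distribution (PD_q),
    e.g. q = 5 or q = 95 as used for the responses in this section."""
    return float(np.percentile(np.asarray(lung_hu, dtype=float), q))
```

In the analysis of this section, the 5% and 95% PD values computed this way would serve as the continuous responses, with the segmented pixel intensities as covariates.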
A picture of one subject is shown in Appendix C, Figure A1. We regard the 5% and 95% PD data as the corresponding continuous response variables, respectively. For smokers, these values are usually high, indicating that other substances have accumulated in the lungs. The data include 267 instances and 512*512 continuous covariates obtained by stretching the picture pixels.
By giving different values of the quantile grid points K = 2, 3, ..., 6 and considering the FDR threshold under the prespecified level α = 0.05, we obtain different segmentations and extractions. The numbers of selected picture pixels are displayed in Table 4. It is clear that QCS-FDR loses efficacy, and QA-SVS-AFD works only when K ≤ 3. Fortunately, QA-SVS-FDR works effectively for all values of K = 2, 3, ..., 6. Comparing QA-SVS-AFD(K) with QA-SVS-FDR(K) under hypothesis testing (2), the screened active variable set estimated by the rejection region of the QA-SVS-AFD(K) path, which controls the rejection probability at around 1/(512*512), contains too few active variables. The QA-SVS-FDR(K) path selects the active variables sufficiently by testing the null hypothesis of testing (2) under the prespecified FDR level α = 0.05. We illustrate the extraction by plotting the segmented lung CT with the averages of the values of the selected predictors, presented in Appendix C, Figures A2-A6. These results may provide information for measuring important clinical parameters (lung volume, PD, etc.); considering the length of this paper, we do not go further. Table 4. The numbers of selected picture pixels in applications of Section 5.

Conclusions
In this paper, we propose a multiple testing procedure with false discovery control to detect active variables sufficiently. The procedure can be applied with the quantile-adaptive screening method when the dimensionality is ultra-high. Although the QA-SVS procedure is built on a quantile-adaptive marginal screening statistic, by controlling the FD of the marginal structural tests it can screen out sufficient variables through a precise separation of the sufficient variable set. According to the results in this paper, as the number of grid points K grows with n and p, the QA-SVS statistic can capture subtle signals better than QCS, in line with the definition of the sufficient variable. In addition, the convergence rate of the asymptotic null distribution of our proposed procedure exceeds that of QCS for large K. In the simulation studies, we set different values of K to inspect the performance of QA-SVS. Nevertheless, it would be of interest to study a data-driven way to select K; we leave this for future research.

Acknowledgments: Many thanks to the reviewers for their positive feedback, valuable comments, and constructive suggestions that helped improve the quality of this article. Many thanks also to the editors for their great help in coordinating the publication of this paper.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Main Proof
Appendix A.1. Proof of Remark 1. Proof. According to the definition of A_k in Assumption (I), by the multiplicative law of probability and by invertibility, the displayed identities hold. Thus, A_k is the sufficient screening variable index set.
Then, denote q_k = p − s_k, and we have the displayed bound. When β ∈ (2, +∞), ρ = o(n^{1/4}) can be treated as a constant. Hence, we obtain the next display. From the definition of ultra-high dimensional data, p = o(exp{n^α}) with α > 0 and s_k = o(n). If α > 1/2, then according to Equations (A10) and (A11), we conclude that F^c_{|τ_k|}(k) → 1 − e^{−1} a.s. as n → ∞. The number of variables screened into the adaptive FD set follows p Bernoulli trials; then the expectation and variance of the number of false discoveries are E[FD] = q_k · (1/p) = (1 + O(n^{−1/2}))(1 − o(p^{−1})) = 1 + O(n^{−1/2}). Appendix A.8. Proof of Corollary 2. Proof. According to Condition (C2), the definition of ρ̂_0 in Section 3.1, and the bound max_{j∈A_k} |υ̂_jk − υ_jk| ≤ c·n^{−κ} from Theorem 1, the first display holds, and therefore so does the second. To prove FDR̂_{ρ̂_k} → α in probability, under the assumption that q_k/p → 1 as p → ∞ and for any ρ > 0, by Corollary 1 and Hoeffding's inequality it suffices to show convergence to S_{χ²_1}(ρ), the survival function of the χ²_1 distribution. Notice that ∑_{j∈A_k} I(υ̂_jk ≥ ρ) is monotone in ρ and asymptotically converges to s_k, and that S_{χ²_1}(ρ) is continuous and monotone. Then there exists a unique constant 0 < ρ̂_k ≤ C·n^{−β} such that the display holds in probability as n → ∞. Therefore, according to Equations (A12) and (A13), we obtain lim_{n→∞} FDR_{ρ̂_k}/α = 1.