Abstract
Multi-stage sampling designs are often used in household surveys because a sampling frame of elements may not be available or for cost considerations when data collection involves face-to-face interviews. In this context, variance estimation is a complex task as it relies on the availability of second-order inclusion probabilities at each stage. To cope with this issue, several bootstrap algorithms have been proposed in the literature in the context of a two-stage sampling design. In this paper, we describe some of these algorithms and compare them empirically in terms of bias, stability, and coverage probability.
1. Introduction
Many surveys conducted by national statistical offices use stratified multi-stage sampling designs for selecting a sample. Reasons for using multi-stage sampling rather than direct element sampling include the lack of element-level sampling frames and cost considerations when data collection involves face-to-face interviews. Stratified multi-stage sampling designs include some form of stratification, selection of primary sampling units (psu), and subsampling within selected psus. Such designs are especially common in social and health surveys. For general multi-stage sampling designs, unbiased variance estimation is a complex task as it relies on the availability of the second-order inclusion probabilities at each stage. If the first-stage sampling fraction is small, a common variance estimation strategy is to pretend that the psus were selected with replacement and use the customary with-replacement variance estimator. The resulting estimator is generally conservative in the sense that it may suffer from a small positive bias.
Another approach to variance estimation for survey data is bootstrap variance estimation, originally proposed by Efron [] in the context of independent and identically distributed observations. In finite population sampling, bootstrap procedures can be classified into two broad groups. In the first, bootstrap samples are selected from the original sample; e.g., [,] among others. Rao and Wu [] applied a scale adjustment directly to the survey data values so as to recover the usual variance formulae. Rao et al. [] presented a modification of the method of Rao and Wu [], where the scale adjustment is applied to the survey weights rather than to the data values. The second group of procedures consists of first creating a pseudo-population from the original sample. Bootstrap samples are then selected from the pseudo-population using the same sampling design utilized to select the original sample; see [,,,], among others. Many of the aforementioned bootstrap procedures may be implemented by randomly generating bootstrap weights so that the first two (or more) design moments of the sampling error are tracked by the corresponding bootstrap moments; see [,]. These procedures are often referred to as bootstrap weight procedures. For a comprehensive review of bootstrap procedures for survey data, the reader is referred to Mashreghi et al. [].
The goal of this paper is to empirically compare several existing bootstrap algorithms that have been proposed in the literature for two-stage sampling designs. The bootstrap procedures are compared with respect to bias, stability, and coverage probability of confidence intervals. In Section 2 we present the basic setup and discuss some classical variance estimation procedures for two-stage sampling designs. In Section 3, we present some bootstrap algorithms proposed in the case of simple random sampling without replacement in both stages. Bootstrap algorithms for unequal probability sampling designs are described in Section 4. In Section 5, we present the results from a simulation study. We make some final remarks in Section 6.
2. The Setup
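Consider a finite population $U$ consisting of $N$ primary sampling units (psu), $U_1, \ldots, U_N$, of size $M_1, \ldots, M_N$ such that $U = \cup_{i=1}^{N} U_i$. Let $M_0 = \sum_{i=1}^{N} M_i$ be the total number of elements in the population. We are interested in estimating a population total of a survey variable y:
$$t_y = \sum_{i=1}^{N} t_i,$$
where $t_i = \sum_{k \in U_i} y_{ik}$ denotes the ith psu total, $i = 1, \ldots, N$, and $y_{ik}$ denotes the y-value for the kth element in the ith psu. To that end, we select a sample according to a two-stage sampling design:
- (i)
- A sample of psus, $s_I$, of size n, is selected according to a given sampling design with first-order inclusion probabilities, $\pi_{Ii} = P(i \in s_I)$, and with second-order inclusion probabilities, $\pi_{Iij} = P(i \in s_I, j \in s_I)$. Finally, let $\Delta_{Iij} = \pi_{Iij} - \pi_{Ii}\pi_{Ij}$.
- (ii)
- In the ith psu sampled at the first stage, $i \in s_I$, a subsample of the elements, $s_i$, of size $m_i$ is selected according to a given sampling design with first-order inclusion probabilities $\pi_{k|i}$ and second-order inclusion probabilities $\pi_{kl|i}$. Subsampling in a given psu is carried out independently of subsampling in any other psu.
A design-unbiased estimator of $t_y$ is the Horvitz–Thompson estimator given by
$$\hat{t}_y = \sum_{i \in s_I} \frac{\hat{t}_i}{\pi_{Ii}}, \tag{1}$$
where $\hat{t}_i = \sum_{k \in s_i} y_{ik}/\pi_{k|i}$. The estimator (1) can be written as $\hat{t}_y = \sum_{(i,k) \in s} w_{ik}\, y_{ik}$, where $w_{ik} = \pi_{ik}^{-1}$ with $\pi_{ik} = \pi_{Ii}\,\pi_{k|i}$, and $s = \cup_{i \in s_I} s_i$ denotes the sample of elements of size $m = \sum_{i \in s_I} m_i$.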
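The two equivalent forms of the Horvitz–Thompson estimator can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the toy data are assumptions made for illustration only; they do not come from the paper.

```python
def psu_total_estimates(sample, pi_elem):
    """Estimate each sampled psu total by expanding its sampled y-values.

    sample:  dict psu_id -> list of y-values observed in that psu
    pi_elem: dict psu_id -> list of conditional second-stage inclusion
             probabilities, aligned with the y-values
    """
    return {i: sum(y / p for y, p in zip(ys, pi_elem[i]))
            for i, ys in sample.items()}


def ht_total(sample, pi_psu, pi_elem):
    """Two-stage form: estimated psu totals divided by first-stage probabilities."""
    t_hat = psu_total_estimates(sample, pi_elem)
    return sum(t_hat[i] / pi_psu[i] for i in sample)


def ht_total_weighted(sample, pi_psu, pi_elem):
    """Same estimator written as a weighted sum over sampled elements."""
    total = 0.0
    for i, ys in sample.items():
        for y, p in zip(ys, pi_elem[i]):
            weight = 1.0 / (pi_psu[i] * p)  # design weight of the element
            total += weight * y
    return total


# Hypothetical toy data: N = 4 psus, n = 2 selected by SRSWOR;
# psu "a" has M = 4 elements (m = 2 sampled), psu "b" has M = 6 (m = 3 sampled).
sample = {"a": [1.0, 3.0], "b": [2.0, 2.0, 5.0]}
pi_psu = {"a": 2 / 4, "b": 2 / 4}
pi_elem = {"a": [2 / 4] * 2, "b": [3 / 6] * 3}

print(ht_total(sample, pi_psu, pi_elem))           # 52.0
print(ht_total_weighted(sample, pi_psu, pi_elem))  # 52.0
```

Both calls return the same value because the element weight is the inverse of the product of the first-stage and conditional second-stage inclusion probabilities.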
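The design variance of $\hat{t}_y$, denoted by $V_p(\hat{t}_y)$, can be unbiasedly estimated by
$$\hat{V}(\hat{t}_y) = \sum_{i \in s_I}\sum_{j \in s_I} \frac{\Delta_{Iij}}{\pi_{Iij}}\,\frac{\hat{t}_i}{\pi_{Ii}}\,\frac{\hat{t}_j}{\pi_{Ij}} + \sum_{i \in s_I} \frac{\hat{V}_i}{\pi_{Ii}}, \tag{2}$$
where
$$\hat{V}_i = \sum_{k \in s_i}\sum_{l \in s_i} \frac{\Delta_{kl|i}}{\pi_{kl|i}}\,\frac{y_{ik}}{\pi_{k|i}}\,\frac{y_{il}}{\pi_{l|i}}$$
and $\Delta_{kl|i} = \pi_{kl|i} - \pi_{k|i}\pi_{l|i}$. That is, $E_1 E_2\{\hat{V}(\hat{t}_y)\} = V_p(\hat{t}_y)$, where $E_1$ denotes the expectation with respect to the first-stage sampling design, and $E_2$ denotes the expectation with respect to the second-stage sampling design conditionally on $s_I$. In the case of simple random sampling without replacement at both stages, the estimator (2) reduces to
$$\hat{V}_{SRS}(\hat{t}_y) = N^2 (1 - f_1)\,\frac{s_1^2}{n} + \frac{N}{n}\sum_{i \in s_I} M_i^2 (1 - f_{2i})\,\frac{s_{2i}^2}{m_i}, \tag{3}$$
where
$$s_1^2 = \frac{1}{n-1}\sum_{i \in s_I}\left(\hat{t}_i - \frac{1}{n}\sum_{j \in s_I}\hat{t}_j\right)^2$$
and
$$s_{2i}^2 = \frac{1}{m_i - 1}\sum_{k \in s_i}\left(y_{ik} - \bar{y}_i\right)^2,$$
with $\hat{t}_i = M_i\,\bar{y}_i$, $\bar{y}_i = m_i^{-1}\sum_{k \in s_i} y_{ik}$, $f_1 = n/N$, and $f_{2i} = m_i/M_i$.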
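As a worked illustration of the variance estimator (3) for simple random sampling without replacement at both stages, the following Python sketch evaluates its two components on a small hypothetical sample. The function name, argument layout, and data are assumptions for illustration, and each sampled psu is assumed to contain at least two sampled elements.

```python
from statistics import mean, variance


def var_srswor_two_stage(sample, M, N):
    """Variance estimator for SRSWOR at both stages.

    sample: dict psu_id -> list of sampled y-values (at least two per psu)
    M:      dict psu_id -> number of elements in the psu
    N:      number of psus in the population
    """
    n = len(sample)
    f1 = n / N
    # Estimated psu totals: psu size times the psu sample mean.
    t_hat = [M[i] * mean(ys) for i, ys in sample.items()]
    first = N ** 2 * (1 - f1) * variance(t_hat) / n
    second = 0.0
    for i, ys in sample.items():
        m_i = len(ys)
        f2i = m_i / M[i]
        second += M[i] ** 2 * (1 - f2i) * variance(ys) / m_i
    return first + (N / n) * second


# Hypothetical toy data: N = 4 psus, n = 2 sampled psus.
sample = {"a": [1.0, 3.0], "b": [2.0, 2.0, 5.0]}
M = {"a": 4, "b": 6}
print(var_srswor_two_stage(sample, M, N=4))  # 252.0
```

In this example the first-stage component equals 200 and the second-stage component equals 52.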
For general two-stage sampling designs, the computation of (2) is cumbersome as it requires the availability of the second-order inclusion probabilities at each stage. A simplified variance estimator is given by
$$\hat{V}_{SIM}(\hat{t}_y) = \sum_{i \in s_I}\sum_{j \in s_I} \frac{\Delta_{Iij}}{\pi_{Iij}}\,\frac{\hat{t}_i}{\pi_{Ii}}\,\frac{\hat{t}_j}{\pi_{Ij}}. \tag{4}$$
That is, only the first term of (2) is kept. The bias of $\hat{V}_{SIM}(\hat{t}_y)$, which is always negative, is expected to be small provided that the first-stage sampling fraction, $n/N$, is small; see [,].
An alternative simplified variance estimator can be obtained by pretending that the psus are selected with replacement. It is given by
$$\hat{V}_{WR}(\hat{t}_y) = \frac{1}{n(n-1)}\sum_{i \in s_I}\left(\frac{\hat{t}_i}{p_i} - \hat{t}_y\right)^2, \tag{5}$$
where $p_i = \pi_{Ii}/n$, with $p_i$ denoting the probability of selection of the ith psu at any given draw. If the first-stage sampling fraction, $n/N$, is small, we expect (5) to suffer from a small positive bias. Unlike (4), the estimator $\hat{V}_{WR}(\hat{t}_y)$ does not require the availability of the second-order inclusion probabilities $\pi_{Iij}$.
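The with-replacement variance estimator only needs the estimated psu totals and the first-order inclusion probabilities of the sampled psus. A minimal Python sketch follows; the function name and the toy values are assumptions for illustration.

```python
def var_with_replacement(t_hat, pi_psu):
    """With-replacement variance estimator of the estimated total.

    t_hat:  list of estimated psu totals for the sampled psus
    pi_psu: list of first-order inclusion probabilities of those psus
    """
    n = len(t_hat)
    # Per-draw selection probabilities: inclusion probability divided by n.
    p = [pi / n for pi in pi_psu]
    total = sum(t / pi for t, pi in zip(t_hat, pi_psu))
    return sum((t / p_i - total) ** 2 for t, p_i in zip(t_hat, p)) / (n * (n - 1))


# Hypothetical toy data: n = 2 psus out of N = 4 selected with equal probabilities.
t_hat = [8.0, 18.0]
pi_psu = [0.5, 0.5]
v_wr = var_with_replacement(t_hat, pi_psu)
print(v_wr)  # 400.0
```

With equal first-stage probabilities, multiplying this value by one minus the first-stage sampling fraction (here 0.5) gives 200, the first-stage component of the without-replacement estimator for the same data. The with-replacement estimator is larger, which illustrates the small-sampling-fraction argument made above.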
So far, we have considered the case of a population total $t_y$. In practice, it may be of interest to estimate more complex parameters such as distribution functions and quantiles. Let $\theta_N$ be defined as the solution of the following census estimating equation:
$$\sum_{i=1}^{N}\sum_{k \in U_i} u(y_{ik}; \theta) = 0, \tag{6}$$
where $u(\cdot\,;\theta)$ can be either a smooth (i.e., a function that is differentiable and whose derivatives are continuous) or a non-smooth function of $\theta$. When $u(\cdot\,;\theta)$ is smooth, the solution of (6) is called a smooth parameter; otherwise, it is called a non-smooth parameter. Common parameters include: (i) the population mean, obtained with $u(y;\theta) = y - \theta$; (ii) the finite population distribution function at a point $t$, obtained with $u(y;\theta) = 1(y \le t) - \theta$; (iii) the $p$-th population percentile, obtained with $u(y;\theta) = 1(y \le \theta) - p$. The population mean is an example of a smooth parameter, whereas distribution functions and quantiles are examples of non-smooth parameters.
An estimator $\hat{\theta}$ of $\theta_N$ can be obtained by solving the following sample estimating equation, in which each sampled element is weighted by its design weight $w_{ik}$:
$$\sum_{(i,k) \in s} w_{ik}\, u(y_{ik}; \theta) = 0. \tag{7}$$
The variance of $\hat{\theta}$ may be obtained using a first-order Taylor expansion or by using a resampling method such as balanced repeated replication, the jackknife, or the bootstrap; see [] for a discussion of resampling methods. In the remainder of this paper, we confine ourselves to the bootstrap.
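To make the sample estimating equation concrete, the sketch below solves it for a mean and for a quantile using design weights. For a quantile the equation generally has no exact root, so the sketch uses a common convention, assumed here rather than taken from the paper: the smallest observed value at which the weighted empirical distribution function reaches the target level. Function names and data are hypothetical.

```python
def weighted_mean(ys, ws):
    """Root of sum_k w_k * (y_k - theta) = 0."""
    return sum(w * y for y, w in zip(ys, ws)) / sum(ws)


def weighted_quantile(ys, ws, p):
    """Smallest y at which the weighted empirical distribution function is >= p.

    This is the conventional solution of sum_k w_k * (1(y_k <= theta) - p) = 0.
    """
    pairs = sorted(zip(ys, ws))
    threshold = p * sum(ws)
    cumulative = 0.0
    for y, w in pairs:
        cumulative += w
        if cumulative >= threshold:
            return y
    return pairs[-1][0]  # only reached through rounding when p is close to 1


# Hypothetical sample: five elements with equal weights, then unequal weights.
ys = [1.0, 3.0, 2.0, 2.0, 5.0]
equal_w = [4.0] * 5
print(weighted_mean(ys, equal_w))             # 2.6
print(weighted_quantile(ys, equal_w, 0.5))    # 2.0
print(weighted_quantile([10.0, 20.0, 30.0], [1.0, 1.0, 8.0], 0.5))  # 30.0
```

The last call shows how a heavily weighted element pulls the estimated median toward its own value, which is why bootstrap procedures that modify the weights can handle quantiles while procedures that rescale the y-values cannot.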
3. Bootstrap Procedures for Simple Random Sampling without Replacement at Both Stages
In this section, we describe several bootstrap algorithms for a two-stage sampling design with simple random sampling without replacement at both stages.
3.1. The Rescaling Bootstrap Algorithm
Rao and Wu [] proposed a rescaled bootstrap algorithm for both uni-stage and two-stage sampling designs. Because the rescaling factor is applied to the y-values, this method is applicable to smooth statistics but not to the case of non-smooth statistics such as quantiles. The algorithm can be described as follows:
- Step 1.
- Draw a sample of size n psus from , according to simple random sampling with replacement.
- Step 2.
- From each psu selected in Step 1, select a sample of elements, of size according to simple random sampling with replacement. For a psu selected more than once in Step 1, perform independent subsampling.
- Step 3.
- Let be the y-value of the kth bootstrap element in the ith bootstrap psu and be the -value of the ith bootstrap psu and is defined similarly. Let
- Step 4.
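- Compute the bootstrap statistic $\hat{\theta}^*$ using the same formulae that were used to obtain the original point estimator.
- Step 5.
- Repeat Steps 1–4 a large number of times, B, to obtain $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_B$.
- Step 6.
- The bootstrap variance estimator is the variance of $\hat{\theta}^*$ under the resampling scheme. In practice, it is approximated by the Monte Carlo variance of the B bootstrap statistics.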
Rao and Wu [] showed that in the case of a population total, the above algorithm matches the standard variance estimator (3). Rao et al. [] proposed a weighted version of the Rao–Wu method, whereby the rescaling is applied to the sampling weights rather than the y-values; see also []. The method of Rao et al. [] is described in Section 4.
3.2. The Mirror-Match Bootstrap Algorithm
Sitter [] proposed an extension of his mirror-match bootstrap to the case of a two-stage sampling design. In [], the algorithm assumed that the numbers of repetitions at the first and second stages (see below) are integers. It can be described as follows:
- Step 1.
- Choose and draw a sample of size psus from , according to simple random sampling without replacement.
- Step 2.
- Repeat Step 1 times independently to obtain a bootstrap sample of psus of size , where .
- Step 3.
- Choose and draw according to simple random sampling without replacement units within the ith psu obtained in Steps 1 and 2.
- Step 4.
- Repeat Step 3 times independently to obtain a bootstrap sample of size from the ith psu drawn in Step 3, where
- Step 5.
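- Compute the bootstrap statistic $\hat{\theta}^*$ using the same formulae that were used to obtain the original point estimator.
- Step 6.
- Repeat Steps 1–5 a large number of times, B, to obtain $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_B$.
- Step 7.
- The bootstrap variance estimator is the variance of $\hat{\theta}^*$ under the resampling scheme. In practice, it is approximated by the Monte Carlo variance of the B bootstrap statistics.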
Sitter [] showed that in the case of a population total, the above algorithm matches the standard variance estimator (3). If the numbers of repetitions are not integers, a randomization between bracketing integer values is proposed in [].
3.3. The Without-Replacement Bootstrap Algorithm
Sitter [] proposed a pseudo-population bootstrap algorithm, referred to as the without-replacement bootstrap (BWO) method, in the case of uni-stage and two-stage sampling designs. We focus on the latter. In [], the algorithm assumed that the replication factors and the bootstrap sample sizes at both stages (see below) are integers. It can be described as follows:
- Step 1:
- Create a pseudo-population by replicating each psu in times and each unit within the ith psu times. Let be the resulting pseudo-population consisting of psus, , of size , where there exists such that . Let be the total number of elements in the pseudo-population.
- Step 2:
- From the pseudo-population , select a sample of psus, , of size according to simple random sampling without replacement. In each selected psu, select a sample, , of size according to simple random sampling without replacement.
- Step 3:
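- Compute the bootstrap statistic $\hat{\theta}^*$ using the formulae that were used to obtain the original point estimator.
- Step 4:
- Repeat Steps 2 and 3 a large number of times, B, to obtain $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_B$.
- Step 5:
- The bootstrap variance estimator is the variance of $\hat{\theta}^*$ under the resampling scheme. In practice, it is approximated by the Monte Carlo variance of the B bootstrap statistics.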
In the case of the population total (or the population mean), Sitter [] showed that the bootstrap variance estimator reduces to the standard variance estimator provided that and satisfy
and and satisfy
where and , for each i. If the replication factors and the bootstrap sample sizes are not integers, a randomization between bracketing integer values was proposed in []. In Appendix A, we show that, if these four quantities are defined as in (8) and (9), the bootstrap variance estimator does not reduce to the standard variance estimator in (3). We suggest a modification to (8) and (9) so that the bootstrap variance estimator reduces to the standard variance estimator (3). In the simulation study (see Section 5), we show that the modified version of the method of Sitter [] works well in terms of bias and coverage probability of confidence intervals.
3.4. The Bernoulli Bootstrap Algorithm
Funaoka et al. [] proposed two bootstrap procedures, referred to as Bernoulli bootstrap, for stratified three-stage sampling. Here, we consider the special case of two-stage sampling. Funaoka et al. [] proposed a short-cut algorithm and a general algorithm. The general algorithm can handle any combination of sample sizes but is more computationally intensive. The general algorithm can be described as follows:
- Step 1.
- Draw a sample, of size from the original sample of clusters, , according to simple random sampling with replacement. Generate n Bernoulli random variables, , with probabilityFor each , keep the ith cluster in the bootstrap sample and go to Step 2, if , and replace the ith cluster with one randomly selected cluster from , if .
- Step 2.
- For cluster i kept in Step 1, draw a sample, of size from the original sample according to simple random sampling with replacement. Generate Bernoulli random variable, , with probabilityFor each , keep the th element in the bootstrap sample, , if , and replace it with one randomly selected element from , if .
- Step 3.
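- Compute the bootstrap statistic $\hat{\theta}^*$ using the formulae that were used to obtain the original point estimator.
- Step 4.
- Repeat Steps 1–3 a large number of times, B, to obtain $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_B$.
- Step 5.
- The bootstrap variance estimator is the variance of $\hat{\theta}^*$ under the resampling scheme. In practice, it is approximated by the Monte Carlo variance of the B bootstrap statistics.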
Funaoka et al. [] argued that the resulting bootstrap variance estimator is consistent for both smooth and non-smooth parameters.
3.5. The Preston Bootstrap Weights Algorithm
Preston [] proposed a bootstrap weights approach for stratified three-stage sampling. Here, we focus on the special case of two-stage sampling. The algorithm can be described as follows:
- Step 1.
- Draw a sample of size psus from , according to simple random sampling without replacement. Let if the ith psu is selected and otherwise.
- Step 2.
- Define the psu bootstrap weights:
- Step 3.
- Within each of the sample of psus selected in Step 1, draw a simple random sample without replacement, of size Let if the kth element in the ith psu is selected and otherwise. We define the conditional element bootstrap weights:
- Step 4.
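- Compute the bootstrap statistic $\hat{\theta}^*$ using the formulae that were used to obtain the original point estimator, with the original weights replaced by the bootstrap weights.
- Step 5.
- Repeat Steps 1–4 a large number of times, B, to obtain $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_B$.
- Step 6.
- The bootstrap variance estimator is the variance of $\hat{\theta}^*$ under the resampling scheme. In practice, it is approximated by the Monte Carlo variance of the B bootstrap statistics.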
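In the case of the population total, Preston [] showed that the bootstrap variance estimator reduces to the standard variance estimator (3). He suggested that choosing bootstrap sample sizes of $\lfloor n/2 \rfloor$ psus at the first stage and $\lfloor m_i/2 \rfloor$ elements at the second stage will be optimal and lead to non-negative bootstrap weights, where $\lfloor \cdot \rfloor$ denotes the integer part.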
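The first-stage part of a rescaled bootstrap-weights scheme of this kind can be sketched in Python as follows. The rescaling factor `lam` used below is an assumption: it is the choice that makes the bootstrap variance of a total equal the first-stage term of (3) when the psu totals are treated as known, and it is not copied from the paper. The second-stage adjustment is omitted, and all names and data are hypothetical.

```python
import itertools
import math
import random


def psu_multipliers(selected, n, N, n_star):
    """Rescaled first-stage weight multipliers for a given set of resampled psus."""
    f1 = n / N
    lam = math.sqrt(n_star * (1 - f1) / (n - n_star))  # assumed rescaling factor
    return [1 - lam + lam * (n / n_star) * (i in selected) for i in range(n)]


def draw_psu_multipliers(n, N, rng):
    """One bootstrap replicate: resample floor(n/2) psus without replacement."""
    n_star = n // 2
    selected = set(rng.sample(range(n), n_star))
    return psu_multipliers(selected, n, N, n_star)


def exact_first_stage_bootstrap_variance(t_hat, N):
    """Bootstrap variance of the total, by enumerating every possible resample."""
    n = len(t_hat)
    n_star = n // 2
    totals = []
    for selected in itertools.combinations(range(n), n_star):
        a = psu_multipliers(set(selected), n, N, n_star)
        totals.append(sum((N / n) * a_i * t for a_i, t in zip(a, t_hat)))
    centre = sum(totals) / len(totals)
    return sum((x - centre) ** 2 for x in totals) / len(totals)


# Hypothetical psu totals: n = 4 sampled psus out of N = 10.
t_hat = [8.0, 18.0, 11.0, 5.0]
print(exact_first_stage_bootstrap_variance(t_hat, N=10))
print(draw_psu_multipliers(4, 10, random.Random(1)))
```

For these data the enumerated bootstrap variance agrees, up to rounding, with $N^2(1 - f_1)s_1^2/n = 465$. Because the resample size is at most $n/2$, the factor `lam` never exceeds one, so the multipliers of psus that are not resampled remain non-negative.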
4. Bootstrap Procedures for Unequal Probability Sampling Designs
4.1. The Rao-Wu-Yue Bootstrap Weights Algorithm
Rao et al. [] proposed a bootstrap weights approach for stratified multi-stage sampling designs. Unlike the method of Rao and Wu [], it can be applied to estimate the variance of smooth and non-smooth parameters (e.g., quantiles).
- Step 1.
- Select psus according to simple random sampling with replacement from .
- Step 2.
- Define the bootstrap weight as
- Step 3.
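- Compute the bootstrap statistic $\hat{\theta}^*$ using the formulae that were used to obtain the original point estimator, with the original weights replaced by the bootstrap weights.
- Step 4.
- Repeat Steps 1–3 B times to obtain $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_B$.
- Step 5.
- The bootstrap variance estimator is the variance of $\hat{\theta}^*$ under the resampling scheme. In practice, it is approximated by the Monte Carlo variance of the B bootstrap statistics.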
The algorithm of Rao et al. [] leads to consistent variance estimators provided that the first-stage sampling fraction is negligible. Selecting $n - 1$ psus in Step 1 leads to non-negative bootstrap weights.
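A minimal Python sketch of bootstrap weight multipliers of this type is given below. It assumes the commonly used form in which a psu's weight is multiplied by $1 - c + c\,(n/n')\,r_i$, where $n'$ is the number of psus drawn with replacement, $r_i$ is the number of times psu i is drawn, and $c = \sqrt{n'/(n-1)}$. The function names, the seed, the toy data, and the divisor $B - 1$ in the Monte Carlo variance are assumptions for illustration.

```python
import math
import random
from statistics import variance


def rwy_multipliers(n, rng, n_boot=None):
    """Weight multipliers from one with-replacement resample of psus."""
    if n_boot is None:
        n_boot = n - 1  # choice that keeps every multiplier non-negative
    counts = [0] * n
    for _ in range(n_boot):
        counts[rng.randrange(n)] += 1  # psu drawn with replacement
    c = math.sqrt(n_boot / (n - 1))
    return [1 - c + c * (n / n_boot) * r for r in counts]


def rwy_variance_of_total(z, B, rng):
    """Monte Carlo bootstrap variance of an estimated total.

    z: weighted psu totals, i.e. each estimated psu total divided by its
       first-stage inclusion probability, so that the estimate is sum(z).
    """
    n = len(z)
    replicates = []
    for _ in range(B):
        a = rwy_multipliers(n, rng)
        replicates.append(sum(a_i * z_i for a_i, z_i in zip(a, z)))
    return variance(replicates)  # divisor B - 1 (assumed)


rng = random.Random(2022)
z = [16.0, 36.0, 22.0, 10.0, 26.0]       # hypothetical weighted psu totals
v_boot = rwy_variance_of_total(z, B=20000, rng=rng)
v_wr = len(z) * variance(z)              # with-replacement estimator, equal to 490 here
print(v_boot, v_wr)
```

With $n' = n - 1$ the multipliers reduce to $n\,r_i/(n-1)$, and the bootstrap variance of a total targets the with-replacement variance estimator. This is consistent with the requirement of a negligible first-stage sampling fraction.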
4.2. The Pseudo-Population Bootstrap Algorithm
Chauvet [] proposed a pseudo-population bootstrap approach (PPB) in the case of unequal two-stage sampling designs. It can be described as follows:
- Step 1.
- Each unit is duplicated times to create a second-stage pseudo-population denoted by , where denotes the closest integer.
- Step 2.
- Each pair is duplicated times. The population of pairs is completed by selecting a sample in the set by means of sampling design with first-order inclusion probabilities . This leads to the pseudo-population .
- Step 3.
- Select a first-stage bootstrap sample from using the original first-stage sampling design with first-order inclusion probabilities .
- Step 4.
- Select a second-stage bootstrap sample from using the original second-stage sampling design. We set with probability and with probability . This procedure is applied to each pair The union of the ’s leads to the bootstrap sample .
- Step 5.
- Compute using the formulae that were used to obtain the original point estimator.
- Step 6.
- Steps 3–5 are repeated times to obtain the bootstrap statistics . Let
- Step 7.
- Steps 2–6 are repeated times to obtain . The variance of is estimated by
Chauvet [] showed that in the case of high entropy sampling design (e.g., [,,,]) at both stages, the above algorithm leads to a consistent estimator in the context of a population total. In the case of a fixed-size sampling design, Chauvet [] suggested completing the pseudo-population in Step 2, by applying Poisson sampling design with inclusion probabilities .
5. Simulation Study
We conducted a simulation study to assess the performance of the bootstrap procedures described in Section 3 and Section 4 in terms of bias, stability, and coverage rate of confidence intervals based on the t-distribution with degrees of freedom. The simulation study was carried out using the R software. We created two finite populations consisting of primary sampling units. The cluster sizes were generated according to a Poisson distribution with a mean equal to 50. In each population, we generated a survey variable y according to
where
The parameter in (10) was set to 0.1 for Population 1 and 0.3 for Population 2. We were interested in estimating the population total of the y-variable, as well as the finite population median.
From each population, we selected samples according to a two-stage sampling design:
- (i)
- At the first stage, we selected n psus according to two sampling designs: simple random sampling without replacement and inclusion probability-proportional-to-size randomized systematic sampling. The value of n was set to which corresponds to a first-stage sampling fraction, and which corresponds to
- (ii)
- At the second stage, elements within each psu selected at the first stage were selected according to simple random sampling without replacement.
In each sample, we computed the estimator given by (1). Its variance was estimated using the variance estimation procedures listed in Table 1. Except for the procedure of Chauvet [], we used bootstrap samples for all the bootstrap procedures. For the procedure of Chauvet [], we used and ().

Table 1.
Variance estimation procedures for each population parameter/sampling design.
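As a measure of bias of a variance estimator $\hat{V}$, we computed its Monte Carlo percent relative bias
$$RB_{MC}(\hat{V}) = 100 \times \frac{E_{MC}(\hat{V}) - V_{MC}(\hat{\theta})}{V_{MC}(\hat{\theta})},$$
where $E_{MC}(\hat{V})$ denotes the Monte Carlo expectation of $\hat{V}$ and $V_{MC}(\hat{\theta})$ denotes the Monte Carlo variance estimator of $\hat{\theta}$. As a measure of stability of a variance estimator $\hat{V}$, we computed its Monte Carlo percent coefficient of variation given by
$$CV_{MC}(\hat{V}) = 100 \times \frac{\sqrt{V_{MC}(\hat{V})}}{E_{MC}(\hat{V})},$$
where $V_{MC}(\hat{V})$ denotes the Monte Carlo variance estimator of $\hat{V}$. Finally, we computed the Monte Carlo coverage rate of t-based confidence intervals and their corresponding Monte Carlo average length.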
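The Monte Carlo performance measures can be computed from the simulated point estimates and variance estimates as in the Python sketch below. Several details are assumptions rather than the paper's exact definitions: the coefficient of variation is taken as the Monte Carlo standard deviation of the variance estimator divided by its Monte Carlo mean, the Monte Carlo variances use the divisor R − 1 for R simulated samples, and the critical value of the t-distribution is passed in by the caller. All names and data are hypothetical.

```python
import math
from statistics import mean, variance


def mc_summary(theta_hats, var_hats, theta_true, crit):
    """Percent relative bias, percent CV, coverage rate and average CI length.

    theta_hats: point estimates over the simulated samples
    var_hats:   corresponding variance estimates
    theta_true: true value of the parameter
    crit:       critical value of the t-distribution used for the intervals
    """
    v_mc = variance(theta_hats)          # Monte Carlo variance of the point estimator
    e_v = mean(var_hats)                 # Monte Carlo mean of the variance estimator
    rb = 100 * (e_v - v_mc) / v_mc
    cv = 100 * math.sqrt(variance(var_hats)) / e_v
    half_widths = [crit * math.sqrt(v) for v in var_hats]
    covered = [abs(t - theta_true) <= h for t, h in zip(theta_hats, half_widths)]
    coverage = 100 * sum(covered) / len(covered)
    avg_length = 2 * mean(half_widths)
    return rb, cv, coverage, avg_length


# Hypothetical results from five simulated samples.
theta_hats = [9.0, 11.0, 10.0, 12.0, 8.0]
var_hats = [2.0, 3.0, 2.5, 3.5, 1.5]
rb, cv, coverage, avg_length = mc_summary(theta_hats, var_hats, theta_true=10.0, crit=2.0)
print(rb, cv, coverage, avg_length)
```

In this toy example the relative bias is zero because the mean of the variance estimates (2.5) equals the Monte Carlo variance of the point estimates, and every interval covers the true value.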
Table 2, Table 3, Table 4 and Table 5 show the results for the bootstrap methods in the case of SRSWOR/SRSWOR. Table 2 shows the Monte Carlo percent relative bias of the bootstrap variance estimators. For the population total, all the procedures led to small biases for , with an absolute relative bias ranging from 1.08% to 8.66%. For the population median, except for the method of Rao and Wu [], the procedures led to good results, with an absolute relative bias ranging from 0.02% to 16.58%. As expected, the method of Rao and Wu [] led to substantial bias because it cannot be applied to non-smooth statistics. For , the absolute relative bias varied between 2.49% and 10.00% for all bootstrap methods except for the method of Rao et al. [], which suffered from a significant positive bias. This can be explained by the fact that the sampling fraction was not small. For the absolute relative bias varied between 2.60% and 13.20% for all bootstrap methods except for the methods of Rao and Wu [] and Rao et al. [].

Table 2.
Monte Carlo percent RB of the bootstrap variance estimators to estimate the variance of the point estimator based on 3000 samples selected according to SRSWOR/SRSWOR.

Table 3.
Monte Carlo percent CV based on 3000 samples selected according to SRSWOR/SRSWOR.

Table 4.
Coverage rate (CR) of the 95% t-distribution based confidence intervals constructed using bootstrap standard error estimators based on 3000 samples selected according to SRSWOR/SRSWOR.

Table 5.
Average length (AL) of the 95% t-distribution based confidence intervals constructed using bootstrap standard error estimators based on 3000 samples selected according to SRSWOR/SRSWOR.
Table 3 shows the Monte Carlo percent coefficient of variation (CV), which was similar across the bootstrap methods. For the smaller first-stage sample size, the CV varied between 41.2% and 46.0% for the population total, and between 59.0% and 64.3% for the population median (except for the method of Rao and Wu [], which led to a CV of 69.1%). For the larger first-stage sample size, the CV varied between 18.9% and 21.0% for the population total, and between 36.4% and 42.3% for the population median.
Table 4 and Table 5 show the coverage probability and the average length of 95% confidence intervals based on the t-distribution, respectively. All the bootstrap methods led to good coverage and similar average length except the method of Rao and Wu [] for the population median. The coverage rate in all cases, except the method of Rao and Wu [] for the population median, varied between 93.17% and 96.73%.
Table 6, Table 7, Table 8 and Table 9 show the results for the bootstrap methods in the case of randomized IPPS systematic/SRSWOR. Table 6 shows the percent relative bias of the bootstrap variance estimators. The method of Chauvet [] worked generally better than the method of Rao et al. [] in terms of relative bias, especially in the case of the population median. The percent CVs presented in Table 7 were very similar for both methods.

Table 6.
Monte Carlo percent RB of the bootstrap variance estimators to estimate the variance of the point estimator based on 3000 samples selected according to IPPS/SRSWOR.

Table 7.
Monte Carlo percent CV based on 3000 samples selected according to IPPS/SRSWOR.

Table 8.
Coverage rate (CR) of the 95% t-distribution based confidence intervals constructed using bootstrap standard error estimators based on 3000 samples selected according to IPPS/SRSWOR.

Table 9.
Average length (AL) of the 95% t-distribution based confidence intervals constructed using bootstrap standard error estimators based on 3000 samples selected according to IPPS/SRSWOR.
Table 8 and Table 9 respectively show the coverage probability and the average length of the 95 percent confidence intervals based on the t-distribution for both methods. Both bootstrap methods led to good coverage and similar average length. The coverage rate in all cases varied from 93.23% to 96.43%.
6. Final Remarks
The results of the simulation studies suggest that most bootstrap procedures work well in terms of bias and coverage rate of confidence intervals for estimating smooth or non-smooth parameters. An exception is the method of Rao and Wu [] for quantiles and the method of Rao et al. [] for appreciable first-stage sampling fractions. In terms of stability of the variance estimators, there is little difference between the bootstrap procedures. Our results are aligned with those of Saigo [] who empirically compared four bootstrap procedures in the context of stratified three-stage sampling with simple random sampling without replacement at each stage: the procedure of Funaoka et al. [], the mirror match bootstrap of Sitter [], the method of Rao et al. [], and the BWO method of Sitter [].
In this paper, we have compared several bootstrap algorithms in the context of a two-stage sampling design. The algorithms were described in an ideal setup. In practice, the design weights undergo a weighting process that involves a nonresponse adjustment followed by some form of calibration, whose goal is to ensure consistency between survey estimates and known population quantities. When the first-stage sampling fraction is small, the method of Rao et al. [] is typically used in surveys conducted by national statistical offices. To account for unit nonresponse and calibration, the bootstrap weights need to undergo the same weighting process (nonresponse adjustment and calibration) that was applied to the original weights.
The bootstrap may also be used to estimate the variance of imputed estimators in the context of imputation for item nonresponse. If the first-stage sampling fraction is small, one can re-impute the missing values in each bootstrap sample using the same imputation procedure that was utilized in the original sample; see []. The case of non-negligible first-stage sampling fractions requires further research.
We end this paper by pointing out a very recent paper by Beaumont and Émond [], who proposed a bootstrap weight approach for multi-stage sampling designs. It would be interesting to compare this method to the procedures considered in this paper.
Author Contributions
Conceptualization, S.C., D.H. and Z.M.; methodology, S.C., D.H. and Z.M.; software, S.C. and Z.M.; validation, S.C., D.H. and Z.M.; formal analysis, NA; investigation, S.C., D.H. and Z.M.; resources, S.C., D.H. and Z.M.; data curation, NA; writing—original draft preparation, S.C., D.H. and Z.M.; writing—review and editing, S.C., D.H. and Z.M.; visualization, NA; supervision, NA; project administration, S.C., D.H. and Z.M.; funding acquisition, NA. All authors have read and agreed to the published version of the manuscript.
Funding
Sixia Chen is partly supported by the National Institute on Minority Health and Health Disparities at the National Institutes of Health (1R21MD014658-01A1) and the Oklahoma Shared Clinical and Translational Resources (U54GM104938) with an Institutional Development Award (IDeA) from the National Institute of General Medical Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The work of David Haziza and Z.M. is funded by grants of the Natural Sciences and Engineering Research Council of Canada.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
We thank the Associate Editor and three reviewers for their comments that helped improve the overall quality of this paper.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
In this section, we show that, obtaining , , , and from Equations (8) and (9) in the Sitter [] algorithm in Section 3.3, the resulting bootstrap variance estimator does not reduce to the standard variance estimator in the case of the population total .
In the case of , the bootstrap estimator is
where is the y-value for the kth selected element in in Step 2. The bootstrap variance of is
where the subscripts and respectively denote the expectation and the variance with respect to the first-stage and second-stage resampling design in Step 2. Let be the total of y-values in in Step 1. The first and the second component of the bootstrap variance estimator in (A1) respectively are
and
where . In [], it is claimed that the first and the second components of the bootstrap variance estimator respectively are
and
see Equation (3.5) in Section 3.2 in []. This is true only when
In other words, we have to define and which is contradictory to the way Sitter [] defined , , , and using Equations (8) and (9).
In the following, we suggest how the method of Sitter [] can be modified. We first define
To have the equality between the first and the second component of the bootstrap variance estimator and the first and the second component of the standard variance estimator in (3), respectively, we need to have
and
Defining , , , and as in Equations (A3)–(A5), the bootstrap variance estimator reduces to the usual variance estimator in (3). In the simulation study, the method "Modified Sitter" refers to the method of Sitter [] while applying the modified , , , and defined in Equations (A3)–(A5). When , , , or is not integer, we simply rounded it to the closest integer value.
References
- Efron, B. Bootstrap methods: Another look at the jackknife. Ann. Stat. 1979, 7, 1–26.
- Rao, J.N.K.; Wu, C.F.J. Resampling inference with complex survey data. J. Am. Stat. Assoc. 1988, 83, 231–241.
- Sitter, R.R. A resampling procedure for complex survey data. J. Am. Stat. Assoc. 1992, 87, 755–765.
- Rao, J.N.K.; Wu, C.F.J.; Yue, K. Some recent work on resampling methods for complex surveys. Surv. Methodol. 1992, 18, 209–217.
- Gross, S. Median estimation in sample surveys. Proc. Sect. Surv. Res. Methods 1980, 181–184.
- Bickel, P.J.; Freedman, D.A. Asymptotic normality and the bootstrap in stratified sampling. Ann. Stat. 1984, 12, 470–482.
- Booth, J.G.; Butler, R.W.; Hall, P. Bootstrap methods for finite populations. J. Am. Stat. Assoc. 1994, 89, 1282–1289.
- Chauvet, G. Méthodes de Bootstrap en Population Finie. Ph.D. Thesis, École Nationale de Statistique et Analyse de l’Information, Bruz, France, 2007.
- Antal, E.; Tillé, Y. A direct bootstrap method for complex sampling designs from a finite population. J. Am. Stat. Assoc. 2011, 106, 534–543.
- Beaumont, J.F.; Patak, Z. On the generalized bootstrap for sample surveys with special attention to Poisson sampling. Int. Stat. Rev. 2012, 80, 127–148.
- Mashreghi, Z.; Haziza, D.; Léger, C. A survey of bootstrap methods in finite population sampling. Stat. Surv. 2016, 10, 1–52.
- Särndal, C.E.; Swensson, B.; Wretman, J. Model-Assisted Survey Sampling; Springer: New York, NY, USA, 1992.
- Beaumont, J.F.; Béliveau, A.; Haziza, D. Clarifying some aspects of variance estimation in two-phase sampling. J. Surv. Stat. Methodol. 2015, 3, 524–542.
- Wolter, K.M. Introduction to Variance Estimation; Springer Series in Statistics; Springer: New York, NY, USA, 2007.
- Sitter, R.R. Resampling Procedures for Complex Survey Data. Ph.D. Thesis, University of Waterloo, Waterloo, ON, Canada, 1989.
- Sitter, R.R. Comparing three bootstrap methods for survey data. Can. J. Stat. 1992, 20, 135–154.
- Funaoka, F.; Saigo, H.; Sitter, R.R.; Toida, T. Bernoulli bootstrap for stratified multistage sampling. Surv. Methodol. 2006, 32, 151–156.
- Preston, J. Rescaled bootstrap for stratified multistage sampling. Surv. Methodol. 2009, 35, 227–234.
- Hájek, J. Asymptotic theory of rejective sampling with varying probabilities from a finite population. Ann. Math. Stat. 1964, 35, 1491–1523.
- Berger, Y.G. Rate of convergence for asymptotic variance of the Horvitz–Thompson estimator. J. Stat. Plan. Inference 1998, 74, 149–168.
- Matei, A.; Tillé, Y. Evaluation of variance approximations and estimators in maximum entropy sampling with unequal probability and fixed sample size. J. Off. Stat. 2005, 21, 543–570.
- Haziza, D.; Mecatti, F.; Rao, J.N.K. Evaluation of some approximate variance estimators under the Rao–Sampford unequal probability sampling design. Metron 2008, 66, 91–108.
- Saigo, H. Comparing four bootstrap methods for stratified three-stage sampling. J. Off. Stat. 2010, 26, 193–207.
- Shao, J.; Sitter, R.R. Bootstrap for imputed survey data. J. Am. Stat. Assoc. 1996, 91, 1278–1288.
- Beaumont, J.F.; Émond, N. A bootstrap variance estimation method for multistage sampling and two-phase sampling when Poisson sampling is used at the second phase. Stats 2022, 5, 339–357.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).