To derive bounded-width confidence intervals, the (asymptotic) distribution of the empirical Gini index ${\widehat{G}}_{n}$ is needed. Bhattacharya (2007) showed that if $\mathrm{E}\left(\left|X\right||s\right)<\infty $ and if ${n}_{s}\to \infty $ at the same rate for each stratum $s=1,\dots ,S$, then, with $G$ denoting the population Gini index,

$$\sqrt{n}\left({\widehat{G}}_{n}-G\right)\stackrel{d}{\to }N\left(0,{\xi}^{2}\right).$$

Here, ${\xi}^{2}$ denotes the (asymptotic) variance of $\sqrt{n}{\widehat{G}}_{n}$. Because its representation is quite involved, we refer to Bhattacharya (2007) for the specific variance formula. The asymptotic distribution can nevertheless be used to compute $100(1-\alpha )\%$ confidence intervals for the population Gini index whose width does not exceed a pre-specified value $\omega $, that is,

$${\widehat{G}}_{n}\pm {z}_{\alpha /2}\frac{\xi }{\sqrt{n}}$$

and

$$L=2{z}_{\alpha /2}\frac{\xi }{\sqrt{n}}\le \omega .$$

Here, ${z}_{\alpha /2}$ is the $100(1-\alpha /2)$th percentile of the standard normal distribution $N(0,1)$. Thus, the task is to compute the $n$ that guarantees that the width of the confidence interval is bounded by $\omega $, i.e.,

$$n\ge \frac{4{z}_{\alpha /2}^{2}{\xi}^{2}}{{\omega}^{2}}\equiv C. \tag{5}$$

Hence, $C$ denotes the optimal total number of clusters from all strata needed such that $L\le \omega $. Therefore, the optimal number of clusters to be sampled from the ${s}^{\mathrm{th}}$ stratum $(s=1,2,\dots ,S)$ is ${C}_{s}=C{a}_{s}$. Here, the term optimal is used in the sense of the minimum number of clusters needed to meet the requirements, not in the sense of optimal allocation used in sample survey methods (see Cochran 1997). If $C$ is known, one can find the sufficiently narrow confidence interval

$${\widehat{G}}_{C}\pm {z}_{\alpha /2}\frac{\xi }{\sqrt{C}}$$

that satisfies (5).

However, without knowing the underlying distribution of income (or assets or expenditure), the value of ${\xi}^{2}$ is unknown in practice. Thus, the optimal total number of clusters from all $S$ strata, $C$, is also unknown. A supposed value (or a previous survey estimate) of ${\xi}^{2}$ may be used to obtain $C$, but the supposed value may differ from the actual one. Moreover, using previous survey estimates is often not advisable, because they may not apply to the current population: socio-economic conditions may have changed as the distribution of income or expenditure shifts in response to changes in economic policies or circumstances. Due to all these factors, the resulting value of $C$ may differ widely from what it would have been had ${\xi}^{2}$ been known, and it will not guarantee that (5) is satisfied. The (asymptotic) variance ${\xi}^{2}$ of the estimated Gini index must therefore be estimated in an appropriate way. Consistent estimators are discussed below.
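The sample-size reasoning above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the width condition $2{z}_{\alpha /2}\xi /\sqrt{n}\le \omega $ implied by the normal approximation, rounds $C$ and each ${C}_{s}=C{a}_{s}$ up to integers (a rounding choice the text does not specify), and uses hypothetical function names and purely illustrative input values.

```python
import math
from statistics import NormalDist


def required_clusters(xi_sq, omega, alpha=0.05):
    """Smallest integer n with 2 * z_{alpha/2} * xi / sqrt(n) <= omega.

    xi_sq is a supposed value of the asymptotic variance (an assumption here;
    in practice it is unknown and must be estimated).
    """
    if xi_sq <= 0 or omega <= 0 or not 0 < alpha < 1:
        raise ValueError("need xi_sq > 0, omega > 0 and 0 < alpha < 1")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    return math.ceil(4 * z ** 2 * xi_sq / omega ** 2)


def allocate(total_clusters, proportions):
    """Per-stratum counts C_s = C * a_s, rounded up (rounding is an assumption)."""
    return [math.ceil(total_clusters * a) for a in proportions]


def interval_width(xi_sq, n, alpha=0.05):
    """Width 2 * z_{alpha/2} * xi / sqrt(n) of the normal-approximation interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * z * math.sqrt(xi_sq / n)


if __name__ == "__main__":
    # Illustrative values only: xi^2 = 0.25, omega = 0.05, three strata.
    c = required_clusters(xi_sq=0.25, omega=0.05)
    print("C =", c, "allocation =", allocate(c, [0.5, 0.3, 0.2]))

    # If the supposed variance (0.16) understates the true one (0.25),
    # the resulting interval is wider than omega.
    c_wrong = required_clusters(xi_sq=0.16, omega=0.05)
    print("width under true variance:", interval_width(0.25, c_wrong))
```

The last two lines show the problem described in the text: a supposed ${\xi}^{2}$ that is too small yields a $C$ for which the width bound is no longer met.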

#### Estimation of ${\xi}^{2}$

Several articles in statistics and economics journals have proposed estimators of the asymptotic variance of the Gini index estimator under different sampling schemes. Zitikis and Gastwirth (2002) proposed explicit formulas for the asymptotic variance of a general class of the Gini index (i.e., the S-Gini index) for simple random sampling with observations coming from the Exponential and Pareto distributions. We refer to Langel and Tillé (2013) for a discussion of several techniques used in estimating the asymptotic variance of the Gini index for various sampling designs. Under the current framework, Binder and Kovacevic (1995) proposed an estimator of ${\xi}^{2}$ using the empirical variance

of the values

Here, ${\overline{u}}_{s}={n}_{s}^{-1}{\sum}_{{c}_{s}=1}^{{n}_{s}}{u}_{s{c}_{s}}$ denotes the empirical mean of ${u}_{s{c}_{s}}$ and

are weighted placements and averages of the income values obtained from $n$ clusters, respectively. Note that Bhattacharya (2007) proposed an alternative estimator of ${\xi}^{2}$, which is given by

where

However, Hoque and Clarke (2015) showed that the estimators in (6) and (7) are numerically the same, i.e., ${V}_{n,1}^{2}={V}_{n,2}^{2}$. We therefore choose ${V}_{n,1}^{2}$ as a consistent estimator of ${\xi}^{2}$ and drop the second subscript, without loss of generality (i.e., we use ${V}_{n}^{2}$ as the estimator of ${\xi}^{2}$). Having found a consistent estimator of the (asymptotic) variance ${\xi}^{2}$, it follows that the optimal number of clusters $C$ defined in (5) that leads to the bounded-width confidence interval can now be estimated from the data. To this end, different sequential methodologies are discussed in the next section.
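As a rough illustration of the plug-in idea, the sketch below replaces ${\xi}^{2}$ by a given value of ${V}_{n}^{2}$ in the bound $n\ge 4{z}_{\alpha /2}^{2}{\xi}^{2}/{\omega}^{2}$ and compares the result with the number of clusters already sampled. It is not one of the sequential procedures of the next section: the variance estimate is taken as an input rather than computed from (6), and the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def estimated_clusters(v_n_sq, omega, alpha=0.05):
    """Plug-in estimate of C: xi^2 is replaced by the variance estimate v_n_sq."""
    if v_n_sq <= 0 or omega <= 0 or not 0 < alpha < 1:
        raise ValueError("need v_n_sq > 0, omega > 0 and 0 < alpha < 1")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    return math.ceil(4 * z ** 2 * v_n_sq / omega ** 2)


def additional_clusters(n_current, v_n_sq, omega, alpha=0.05):
    """Clusters still missing relative to the plug-in estimate (0 if none)."""
    return max(0, estimated_clusters(v_n_sq, omega, alpha) - n_current)


if __name__ == "__main__":
    # Illustrative values only: 500 clusters sampled so far, V_n^2 = 0.25.
    print(additional_clusters(n_current=500, v_n_sq=0.25, omega=0.05))
```

Because ${V}_{n}^{2}$ changes as more clusters are observed, a single plug-in computation of this kind does not by itself guarantee the width bound, which is why data-driven sequential rules are needed.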