# Estimation of Gini Index within Pre-Specified Error Bound

## Abstract

## 1. Introduction

## 2. Problem Statement and Optimal Sample Size

## 3. The Sequential Estimation Procedure

#### Implementation and Characteristics

**Stage 1:** Compute the pilot sample size $m=\max\{4,\lceil {z}_{\alpha /2}/d\rceil \}$ and draw a random sample of size $m$ from the population of interest. From this pilot sample, estimate ${\xi}^{2}$ by computing ${V}_{m}^{2}$ as given in (11), and check whether $m\ge {({z}_{\alpha /2}/d)}^{2}\left({V}_{m}^{2}+{m}^{-1}\right)$. If $m<{({z}_{\alpha /2}/d)}^{2}\left({V}_{m}^{2}+{m}^{-1}\right)$, go to the next step. Otherwise, stop sampling and set the final sample size ${N}_{d}=m$.

**Stage 2:** Draw one additional observation, independent of the pilot sample, and update the estimate of ${\xi}^{2}$ by computing ${V}_{m+1}^{2}$. Check whether $m+1\ge {({z}_{\alpha /2}/d)}^{2}\left({V}_{m+1}^{2}+{(m+1)}^{-1}\right)$. If $m+1<{({z}_{\alpha /2}/d)}^{2}\left({V}_{m+1}^{2}+{(m+1)}^{-1}\right)$, go to the next step. Otherwise, stop sampling and report the final sample size as ${N}_{d}=m+1$.
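The two stages can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the estimator ${V}_{n}^{2}$ of (11) is not reproduced in this section, so the sketch substitutes a jackknife estimate of ${\xi}^{2}$ (an assumption), and the names `gini`, `jackknife_xi2`, `sequential_gini`, and `draw` are hypothetical. Any other estimator of ${\xi}^{2}$ can be passed in through the `xi2_hat` argument.

```python
import math
import random
from statistics import NormalDist


def gini(xs):
    """Sample Gini index: Gini mean difference (U-statistic) over twice the mean."""
    n = len(xs)
    s = sorted(xs)
    # For sorted data, sum_{i<j} |x_i - x_j| = sum_k (2k - n + 1) * s[k], k = 0..n-1.
    pair_sum = sum((2 * k - n + 1) * v for k, v in enumerate(s))
    gmd = 2.0 * pair_sum / (n * (n - 1))
    return gmd / (2.0 * sum(s) / n)


def jackknife_xi2(xs):
    """Stand-in for V_n^2 (assumption): jackknife estimate of n * Var(G_n)."""
    n = len(xs)
    loo = [gini(xs[:i] + xs[i + 1:]) for i in range(n)]  # leave-one-out estimates
    mean_loo = sum(loo) / n
    return (n - 1) * sum((g - mean_loo) ** 2 for g in loo)


def sequential_gini(draw, d, alpha, xi2_hat=jackknife_xi2):
    """Sample one observation at a time until n >= (z/d)^2 * (xi2_hat + 1/n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)       # z_{alpha/2}
    m = max(4, math.ceil(z / d))                  # pilot sample size (Stage 1)
    sample = [draw() for _ in range(m)]
    n = m
    while n < (z / d) ** 2 * (xi2_hat(sample) + 1.0 / n):
        sample.append(draw())                     # one more observation (Stage 2)
        n += 1
    g = gini(sample)
    return n, g, (g - d, g + d)                   # N_d, estimate, fixed-width interval


if __name__ == "__main__":
    rng = random.Random(0)
    n_d, g_hat, interval = sequential_gini(lambda: rng.expovariate(1.0), d=0.1, alpha=0.1)
    print(n_d, g_hat, interval)
```

The leave-one-out loop recomputes the Gini index $n$ times at every step, which is adequate for a sketch with a moderate $d$ but would be slow for the small error bounds used in practice; an incremental update of the variance estimate would be the natural refinement.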

**Theorem 1.**

- (i) ${N}_{d}/C\stackrel{a.s.}{\to}1$ as $d\downarrow 0$.
- (ii) $P\left({\widehat{G}}_{{N}_{d}}-d<{G}_{F}<{\widehat{G}}_{{N}_{d}}+d\right)\to 1-\alpha$ as $d\downarrow 0$.

**Theorem 2.**

## 4. Simulation Study

## 5. Discussion and Concluding Remarks

#### Possible Extension to Stratified Sampling

## Appendix A

**Lemma A1.**

**Proof.**

**Lemma A2.**

**Proof.**

#### Appendix A.1 Proof of Theorem 1

- (i) The definition of the stopping rule ${N}_{d}$ in (12) yields $$\left(\frac{{z}_{\alpha /2}}{d}\right)^{2}{V}_{{N}_{d}}^{2}\le {N}_{d}\le m\,I({N}_{d}=m)+\left(\frac{{z}_{\alpha /2}}{d}\right)^{2}\left({V}_{{N}_{d}-1}^{2}+{({N}_{d}-1)}^{-1}\right).$$
- (ii) To show that the procedure satisfies the asymptotic consistency property, we derive an Anscombe-type random central limit theorem for the Gini index. This requires the usual central limit theorem for the Gini index to hold, together with the uniform continuity in probability (u.c.i.p.) condition. For details on the u.c.i.p. condition, see [17,30,31,32].
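For part (i), the inequality obtained from the stopping rule can be turned into the limit statement by a short normalisation argument. The following is a sketch only: it assumes that $C={({z}_{\alpha /2}/d)}^{2}{\xi}^{2}$ and that ${V}_{n}^{2}\to {\xi}^{2}$ almost surely as $n\to \infty$, neither of which is restated in this appendix. Dividing the inequality by $C$ gives

$$\frac{{V}_{{N}_{d}}^{2}}{{\xi}^{2}}\le \frac{{N}_{d}}{C}\le \frac{m}{C}\,I({N}_{d}=m)+\frac{{V}_{{N}_{d}-1}^{2}+{({N}_{d}-1)}^{-1}}{{\xi}^{2}}.$$

Because ${N}_{d}\ge m\ge {z}_{\alpha /2}/d$, the stopping time ${N}_{d}\to \infty$ as $d\downarrow 0$, so under the stated assumptions both ${V}_{{N}_{d}}^{2}$ and ${V}_{{N}_{d}-1}^{2}$ tend to ${\xi}^{2}$ almost surely and ${({N}_{d}-1)}^{-1}\to 0$. The pilot size $m$ grows like $1/d$ while $C$ grows like $1/{d}^{2}$, so $m/C\to 0$. Both bounds therefore converge to 1, which gives ${N}_{d}/C\to 1$ almost surely.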

#### Appendix A.2 Proof of Theorem 2

**Lemma A3.**

**Proof.**

**Table 1.** Performance of the proposed sequential procedure when the data are from gamma, log-normal, and Pareto distributions.

$d$ | $\alpha$ | Distribution | $\overline{N}$ | $s(\overline{N})$ | $C$ | $\overline{N}/C$ | $p$ | ${s}_{p}$ |
---|---|---|---|---|---|---|---|---|
0.01 | 0.1 | Gamma | 1283.7450 | 1.7561 | 1267 | 1.0132 | 0.9090 | 0.0064 |
0.02 | 0.05 | Gamma | 469.1650 | 1.0485 | 450 | 1.0426 | 0.9535 | 0.0047 |
0.01 | 0.1 | Lognormal | 1435.3640 | 4.091604 | 1440 | 0.9968 | 0.8965 | 0.0068 |
0.02 | 0.05 | Lognormal | 509.0020 | 2.0538 | 511 | 0.9961 | 0.9430 | 0.0052 |
0.01 | 0.1 | Pareto | 654.5364 | 4.2151 | 686 | 0.9541 | 0.9018 | 0.0063 |
0.02 | 0.05 | Pareto | 244.3330 | 2.0099 | 244 | 1.0014 | 0.9470 | 0.0050 |

