# Estimation of Gini Index within Pre-Specified Error Bound

## Abstract

## 1. Introduction

## 2. Problem Statement and Optimal Sample Size

## 3. The Sequential Estimation Procedure

#### Implementation and Characteristics

**Stage 1:** Compute the pilot sample size $m=\max\{4,\lceil z_{\alpha/2}/d\rceil\}$ and draw a random sample of size $m$ from the population of interest. From this pilot sample, estimate $\xi^{2}$ by computing $V_{m}^{2}$ as given in (11), and check whether $m\ge (z_{\alpha/2}/d)^{2}\left(V_{m}^{2}+m^{-1}\right)$. If the inequality holds, stop and set the final sample size $N_{d}=m$. Otherwise, go to the next stage.

**Stage 2:** Draw one additional observation, independent of the pilot sample, and update the estimate of $\xi^{2}$ by computing $V_{m+1}^{2}$. Check whether $m+1\ge (z_{\alpha/2}/d)^{2}\left(V_{m+1}^{2}+(m+1)^{-1}\right)$. If the inequality holds, stop sampling and report the final sample size $N_{d}=m+1$. Otherwise, go to the next step, in which one more observation is drawn and the same check is repeated with the enlarged sample.
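The stages above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the estimator $V_{n}^{2}$ of (11) is not reproduced in this section, so the sketch substitutes $n$ times a jackknife variance estimate of the Gini estimate as a stand-in; the Gini estimator is assumed to be the U-statistic Gini mean difference divided by twice the sample mean; the data are assumed strictly positive; and the names `gini`, `jackknife_xi2`, `pilot_size`, `sequential_gini`, and the safety cap `max_n` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def _pairwise_abs_sums(xs):
    """For sorted xs, return a[k] = sum_j |xs[k] - xs[j]| using a running prefix sum."""
    n, total, prefix, a = len(xs), sum(xs), 0.0, []
    for k, v in enumerate(xs):
        below = k * v - prefix                          # gaps to the k smaller values
        above = (total - prefix - v) - (n - 1 - k) * v  # gaps to the larger values
        a.append(below + above)
        prefix += v
    return a, total


def gini(x):
    """Assumed estimator: U-statistic Gini mean difference over twice the sample mean."""
    a, total = _pairwise_abs_sums(sorted(x))
    s = sum(a) / 2.0                                    # sum over pairs i < j of |x_i - x_j|
    return s / ((len(x) - 1) * total)


def jackknife_xi2(x):
    """Stand-in for V_n^2 of (11): n times the jackknife variance of the Gini estimate."""
    n, xs = len(x), sorted(x)
    a, total = _pairwise_abs_sums(xs)
    s = sum(a) / 2.0
    # Leave-one-out Gini estimates, each obtained in O(1) from the full-sample sums.
    loo = [(s - a[k]) / ((n - 2) * (total - xs[k])) for k in range(n)]
    mean_loo = sum(loo) / n
    return (n - 1) * sum((g - mean_loo) ** 2 for g in loo)


def pilot_size(alpha, d):
    """Stage 1 pilot size m = max{4, ceil(z_{alpha/2} / d)}; also returns z_{alpha/2}."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return max(4, math.ceil(z / d)), z


def sequential_gini(draw, alpha, d, var_est=jackknife_xi2, max_n=100_000):
    """Sample one observation at a time until n >= (z/d)^2 (V_n^2 + 1/n)."""
    m, z = pilot_size(alpha, d)
    sample = [draw() for _ in range(m)]                 # Stage 1: pilot sample
    while len(sample) < max_n:                          # max_n: hypothetical safety cap
        n = len(sample)
        if n >= (z / d) ** 2 * (var_est(sample) + 1.0 / n):
            break                                       # stopping rule satisfied
        sample.append(draw())                           # Stage 2 onward: one more draw
    g = gini(sample)
    return len(sample), g, (g - d, g + d)


if __name__ == "__main__":
    rng = random.Random(1)
    n_final, g_hat, interval = sequential_gini(
        lambda: rng.expovariate(1.0), alpha=0.10, d=0.10
    )
    print(n_final, round(g_hat, 4), interval)
```

The variance estimator is passed in as `var_est` so that the estimator of (11) can be substituted without touching the stopping loop. Each check re-sorts the sample, which keeps the sketch short at the cost of roughly quadratic total work in the final sample size.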

**Theorem 1.**

- (i) $N_{d}/C\stackrel{a.s.}{\to}1$ as $d\downarrow 0$.
- (ii) $P\left(\widehat{G}_{N_{d}}-d<G_{F}<\widehat{G}_{N_{d}}+d\right)\to 1-\alpha$ as $d\downarrow 0$.

**Theorem 2.**

## 4. Simulation Study

## 5. Discussion and Concluding Remarks

#### Possible Extension to Stratified Sampling

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Appendix A

**Lemma A1.**

**Proof.**

**Lemma A2.**

**Proof.**

#### Appendix A.1 Proof of Theorem 1

- (i) The definition of the stopping rule $N_{d}$ in (12) yields$$\left(\frac{z_{\alpha/2}}{d}\right)^{2}V_{N_{d}}^{2}\le N_{d}\le m\,I(N_{d}=m)+\left(\frac{z_{\alpha/2}}{d}\right)^{2}\left(V_{N_{d}-1}^{2}+(N_{d}-1)^{-1}\right).$$
- (ii) To show that our procedure satisfies the asymptotic consistency property, we derive an Anscombe-type random central limit theorem for the Gini index. This requires the usual central limit theorem for the Gini index together with the uniform continuity in probability (u.c.i.p.) condition. For details about the u.c.i.p. condition, see [17,30,31,32].
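A sketch of how the two-sided bound in part (i) leads to $N_{d}/C\to 1$ may help. It rests on two assumptions that are not shown in this section: that the optimal sample size has the form $C=(z_{\alpha/2}/d)^{2}\xi^{2}$, as the stopping rule suggests, and that $V_{n}^{2}\to \xi^{2}$ almost surely as $n\to \infty$. Dividing the bound by $C$ gives

$$\frac{V_{N_{d}}^{2}}{\xi^{2}}\le \frac{N_{d}}{C}\le \frac{m}{C}\,I(N_{d}=m)+\frac{V_{N_{d}-1}^{2}+(N_{d}-1)^{-1}}{\xi^{2}}.$$

Since $N_{d}\ge m\ge z_{\alpha/2}/d$, the final sample size grows without bound as $d\downarrow 0$, so under the second assumption both $V_{N_{d}}^{2}$ and $V_{N_{d}-1}^{2}$ converge to $\xi^{2}$ almost surely and $(N_{d}-1)^{-1}\to 0$. The pilot size $m$ is of order $1/d$ while $C$ is of order $1/d^{2}$, so $m/C\to 0$. Both the lower and the upper bound therefore converge to 1, which gives the claim.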

#### Appendix A.2 Proof of Theorem 2

**Lemma A3.**

**Proof.**

## References

1. B.C. Arnold. “Inequality measures for multivariate distributions.” Metron 63 (2005): 317–327.
2. R. Andres, and C. Samuel. Inference on Income Inequality and Tax Progressivity Indices: U-Statistics and Bootstrap Methods. ECINEQ working paper 2005-9; Palma, Spain: ECINEQ, 2005.
3. J.A. Bishop, J.P. Formby, and B. Zheng. “Statistical inference and the Sen index of poverty.” Int. Econ. Rev. 38 (1997): 381–387.
4. R. Davidson. “Reliable inference for the Gini index.” J. Econom. 150 (2009): 30–40.
5. J.L. Gastwirth. “The estimation of the Lorenz curve and Gini index.” Rev. Econ. Stat. 54 (1972): 306–316.
6. P. Palmitesta, P. Corrado, and S. Cosimo. “Confidence interval estimation for inequality indices of the Gini family.” Comput. Econ. 16 (2000): 137–147.
7. K. Xu. “U-statistics and their asymptotic results for some inequality and poverty measures.” Econom. Rev. 26 (2007): 567–577.
8. E. Maasoumi. “Empirical analysis of inequality and welfare.” In Handbook of Applied Microeconomics. Edited by S. Schmidt and H. Pesaran. Malden, MA, USA: Blackwell Publishers Inc., 1997.
9. G.B. Dantzig. “On the non-existence of tests of “Student’s” hypothesis having power functions independent of σ.” Ann. Math. Stat. 11 (1940): 186–192.
10. N. Mukhopadhyay, and B.M. de Silva. Sequential Methods and Their Applications. Boca Raton, FL, USA: CRC Press, 2009.
11. P.K. Sen. Sequential Nonparametrics: Invariance Principles and Statistical Inference. New York, NY, USA: Wiley, 1981.
12. W. Hoeffding. “A class of statistics with asymptotically normal distribution.” Ann. Math. Stat. 19 (1948): 293–325.
13. A.J. Lee. U-Statistics: Theory and Practice. New York, NY, USA: CRC Press, 1990.
14. M. Loève. Probability Theory. Princeton, NJ, USA: Van Nostrand, 1963.
15. J.L. Doob. Stochastic Processes. New York, NY, USA: Wiley, 1953.
16. C. Schröder, and S. Yitzhaki. Reasonable Sample Sizes for Convergence to Normality. No. 714, SOEP Papers on Multidisciplinary Panel Data Research. 2014. Available online: http://papers.ssrn.com/sol3/papers.cfm?abstractid=2539096 (accessed on 5 March 2016).
17. R. Sproule. “A Sequential Fixed-Width Confidence Interval for the Mean of a U-Statistic.” Ph.D. Thesis, University of North Carolina, Chapel Hill, NC, USA, 1969.
18. B. Chattopadhyay, and N. Mukhopadhyay. “Two-stage fixed-width confidence intervals for a normal mean in the presence of suspect outliers.” Seq. Anal. 32 (2013): 134–157.
19. M. Langel, and Y. Tillé. “Variance estimation of the Gini index: Revisiting a result several times published.” J. R. Stat. Soc. Ser. A Stat. Soc. 176 (2013): 521–540.
20. M.R. Ransom, and J.S. Cramer. “Income distribution functions with disturbances.” Eur. Econ. Rev. 22 (1983): 363–372.
21. D. Bhattacharya. “Asymptotic inference from multi-stage samples.” J. Econom. 126 (2005): 145–171.
22. D. Bhattacharya. “Inference on inequality from household survey data.” J. Econom. 137 (2007): 674–707.
23. D.A. Binder, and M.S. Kovacevic. “Estimating some measures of income inequality from survey data: An application of the estimating equations approach.” Surv. Methodol. 21 (1995): 137–146.
24. W.G. Cochran. Sampling Techniques. New York, NY, USA: Wiley & Sons, 1977.
25. C.M. Beach, and R. Davidson. “Distribution-free statistical inference with Lorenz curves and income shares.” Rev. Econ. Stud. 50 (1983): 723–735.
26. R. Davidson, and J. Duclos. “Statistical inference for stochastic dominance and for the measurement of poverty and inequality.” Econometrica 68 (2000): 1435–1464.
27. S. Zacks. Stage-Wise Adaptive Designs. New York, NY, USA: Wiley, 2009.
28. C. Damgaard, and J. Weiner. “Describing inequality in plant size or fecundity.” Ecology 81 (2000): 1139–1142.
29. A. Gut. Stopped Random Walks: Limit Theorems and Applications. New York, NY, USA: Springer, 2009.
30. F.J. Anscombe. “Sequential estimation.” J. R. Stat. Soc. Ser. B 15 (1953): 1–29.
31. E. Isogai. “Asymptotic consistency of fixed-width sequential confidence intervals for a multiple regression function.” Ann. Inst. Stat. Math. 38 (1986): 69–83.
32. N. Mukhopadhyay, and B. Chattopadhyay. “A tribute to Frank Anscombe and random central limit theorem from 1952.” Seq. Anal. 31 (2012): 265–277.
33. S.K. De, and B. Chattopadhyay. “Minimum Risk Point Estimation of Gini Index.” Available online: http://arxiv.org/abs/1503.08148 (accessed on 27 March 2015).
34. P.K. Sen, and M. Ghosh. “Sequential point estimation of estimable parameters based on U-statistics.” Sankhyā Indian J. Stat. Ser. A 43 (1981): 331–344.
35. M. Ghosh, N. Mukhopadhyay, and P.K. Sen. Sequential Estimation. New York, NY, USA: Wiley, 1997.

**Table 1.** Performance of the proposed sequential procedure when the data are from gamma, log-normal, and Pareto distributions.

$d$ | $\alpha$ | Distribution | $\overline{N}$ | $s(\overline{N})$ | $C$ | $\overline{N}/C$ | $p$ | $s_{p}$ |
---|---|---|---|---|---|---|---|---|
0.01 | 0.1 | Gamma | 1283.7450 | 1.7561 | 1267 | 1.0132 | 0.9090 | 0.0064 |
0.02 | 0.05 | Gamma | 469.1650 | 1.0485 | 450 | 1.0426 | 0.9535 | 0.0047 |
0.01 | 0.1 | Lognormal | 1435.3640 | 4.091604 | 1440 | 0.9968 | 0.8965 | 0.0068 |
0.02 | 0.05 | Lognormal | 509.0020 | 2.0538 | 511 | 0.9961 | 0.9430 | 0.0052 |
0.01 | 0.1 | Pareto | 654.5364 | 4.2151 | 686 | 0.9541 | 0.9018 | 0.0063 |
0.02 | 0.05 | Pareto | 244.3330 | 2.0099 | 244 | 1.0014 | 0.9470 | 0.0050 |
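As a quick arithmetic check, the $\overline{N}/C$ column of Table 1 can be recomputed from the $\overline{N}$ and $C$ columns. The sketch below only re-derives the ratio from the values printed in the table; the names `TABLE_1` and `ratio_check` and the tolerance `tol` are hypothetical choices for this illustration.

```python
# (distribution, d, alpha, N-bar, C, reported N-bar / C), copied from Table 1
TABLE_1 = [
    ("Gamma", 0.01, 0.10, 1283.7450, 1267, 1.0132),
    ("Gamma", 0.02, 0.05, 469.1650, 450, 1.0426),
    ("Lognormal", 0.01, 0.10, 1435.3640, 1440, 0.9968),
    ("Lognormal", 0.02, 0.05, 509.0020, 511, 0.9961),
    ("Pareto", 0.01, 0.10, 654.5364, 686, 0.9541),
    ("Pareto", 0.02, 0.05, 244.3330, 244, 1.0014),
]


def ratio_check(rows, tol=1e-4):
    """Recompute N-bar / C per row and flag whether it matches the reported ratio."""
    out = []
    for dist, d, alpha, n_bar, c, reported in rows:
        ratio = n_bar / c
        out.append((dist, d, alpha, round(ratio, 4), abs(ratio - reported) <= tol))
    return out


for row in ratio_check(TABLE_1):
    print(row)
```

Every recomputed ratio agrees with the reported one to four decimal places, and all six lie within about 5% of 1, which is the behaviour part (i) of Theorem 1 describes for small $d$.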

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Chattopadhyay, B.; De, S.K. Estimation of Gini Index within Pre-Specified Error Bound. *Econometrics* **2016**, *4*, 30.
https://doi.org/10.3390/econometrics4030030
