Using the GB2 Income Distribution

Abstract: To use the generalized beta distribution of the second kind (GB2) for the analysis of income and other positively skewed distributions, knowledge of estimation methods and the ability to compute quantities of interest from the estimated parameters are required. We review estimation methodology that has appeared in the literature, and summarize expressions for inequality, poverty, and pro-poor growth that can be used to compute these measures from GB2 parameter estimates. An application to data from China and Indonesia is provided.


Introduction
Specification and estimation of parametric income distributions has a long history in economics. Much of the literature on alternative distributions can be accessed through the book by Kleiber and Kotz (2003), and the papers in Chotikapanich (2008). A series of papers by McDonald and his coauthors (McDonald 1984; McDonald and Xu 1995; Bordley et al. 1997; McDonald and Ransom 2008; McDonald et al. 2011) carries details of many of the distributions and the relationships between them. Our focus in this paper is on the generalized beta distribution of the second kind (GB2). It is a four-parameter distribution defined over the support (0, ∞), and obtained by transforming a standard beta random variable defined on (0, 1). As described by McDonald and Xu (1995), it nests many popular three-parameter specifications of income distributions including the generalized gamma, beta2, Singh-Maddala and Dagum distributions. Two-parameter special cases of these distributions include the lognormal, gamma, Weibull, Lomax and Fisk distributions.¹ Parker (1999) describes a model of firm optimizing behavior that leads to a GB2 distribution for earnings. Applications have appeared in Butler and McDonald (1986), Cummins et al. (1990), Feng et al. (2006), Jenkins (2009), Graf and Nedyalkova (2014), and Jones et al. (2014). Biewen and Jenkins (2005) analyze poverty differences using Singh-Maddala and Dagum distributions, with parameters as functions of personal household characteristics, and with their choice between the Singh-Maddala and Dagum distributions based on preliminary estimates of GB2 distributions. Quintano and D'Agostino (2006) use the Dagum distribution and the Biewen-Jenkins methodology to examine the dependence of inequality and poverty on personal characteristics.

¹ McDonald and Xu (1995) and McDonald and Ransom (2008) also consider a five-parameter generalized beta distribution which nests the GB2 and a GB1 distribution.
In an extensive study examining global inequality, Chotikapanich et al. (2012) estimate special case beta2 distributions for 91 countries in 1993 and 2000. In an application involving 10 regions, Hajargasht and Griffiths (2013) find that the GB2 distribution compares favorably with the four-parameter double Pareto-lognormal distribution in terms of goodness-of-fit.
Estimation of a good-fitting parametric income distribution such as the GB2 facilitates further analysis. Once important quantities such as mean income, the Gini coefficient, the Lorenz curve, and the headcount ratio have been expressed in terms of the parameters of the distribution, they can be readily estimated from those parameters. If interest centers on a region which comprises a collection of countries or areas, a GB2 distribution can be estimated for each country/area; inequality, poverty and pro-poor growth for the region can be analyzed by computing estimates of indicators expressed in terms of the parameters of a regional distribution which will be a population-weighted mixture of the GB2 distributions. If only grouped data are available, then estimating a distribution such as the GB2 provides a means for accommodating within-group variation, an important consideration for assessing inequality and poverty.
The purpose of this paper is to collect results on measures for inequality, poverty, and pro-poor growth, expressed as functions of the parameters of the GB2 distribution and its mixtures, and to summarize various methods of estimation that have appeared in the literature for estimating GB2 parameters from single observations or from grouped data. Expressions for the inequality, poverty, and pro-poor growth measures are given in Section 2. Section 3 contains a description of the various estimation techniques. The results from an application to 4 years of data for China and Indonesia are presented in Section 4. Some concluding remarks are offered in Section 5.

Inequality and Poverty Measures from the GB2 Distribution
Throughout we assume that income Y for a given country or area can be represented by a GB2 distribution whose probability density function (pdf) is given by

$$f(y|a,b,p,q) = \frac{a\,y^{ap-1}}{b^{ap}\,B(p,q)\,\left[1+(y/b)^{a}\right]^{p+q}}, \qquad y>0, \tag{1}$$

where a > 0, b > 0, p > 0 and q > 0 are its parameters and $B(p,q)=\int_0^1 t^{p-1}(1-t)^{q-1}\,dt$ is the beta function. The cumulative distribution function (cdf) corresponding to (1) is given by

$$F(y|a,b,p,q) = \frac{1}{B(p,q)}\int_0^{w} t^{p-1}(1-t)^{q-1}\,dt = B(w|p,q), \tag{2}$$

where $w = (y/b)^{a}/\left[1+(y/b)^{a}\right]$. The function B(w|p, q) is the cdf for the normalized beta distribution, defined on the (0, 1) interval, with parameters p and q, and evaluated at w. It is a convenient representation because both it, and its inverse, are commonly included as readily-computed functions in statistical software. Properties of the GB2 distribution and its special cases have been considered extensively by McDonald (1984) and Kleiber and Kotz (2003). Three-parameter special cases, which have been popular in the literature, are the Singh-Maddala distribution² where p = 1, the Dagum distribution where q = 1, and the beta2 distribution where a = 1. Extension to a 5-parameter GB distribution has been considered by McDonald and Xu (1995) and McDonald and Ransom (2008).

² The Singh-Maddala distribution is also commonly known as the Burr distribution, and has been described using a variety of other names. See Kleiber and Kotz (2003, p. 198).
Some further properties of the GB2 distribution are described by Graf and Nedyalkova (2014). In this section, we summarize the main results from the GB2 distribution that are relevant for computing measures of inequality, poverty and pro-poor growth. We envisage a scenario where GB2 distributions have been estimated for a number of countries, or for specific areas within a country such as urban and rural, and the objective is to evaluate inequality and poverty measures using the estimated parameters of the GB2 distributions. As well as evaluation of the measures from single GB2 distributions, we are interested in evaluating them for mixtures that arise when urban and rural GB2 distributions are combined to obtain a distribution for a country, or when country GB2 distributions are combined to obtain the distribution for a region. In most instances, we can express measures in terms of quantities such as beta and gamma functions that are readily computed by available software. Measures whose exact computation proves to be difficult can usually be written in terms of expectations which can be estimated by averaging values of the function over simulated draws from one or more of the GB2 distributions. Key quantities that are used for calculation of many measures, and for estimation of GB2 distributions, are the GB2 moments and moment distribution functions. We begin by giving expressions for them, as well as indicating how the GB2 Lorenz curve can be obtained. We then consider measures for inequality, poverty and pro-poor growth.
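The k-th moment of the GB2 exists for −ap < k < aq and is given by

$$\mu^{(k)} = E\left(Y^{k}\right) = \frac{b^{k}\,B(p+k/a,\;q-k/a)}{B(p,q)} = \frac{b^{k}\,\Gamma(p+k/a)\,\Gamma(q-k/a)}{\Gamma(p)\,\Gamma(q)}, \tag{3}$$

where Γ(·) is the gamma function. The k-th moment distribution function for the GB2 is given by³

$$F_k(y|a,b,p,q) = \frac{1}{\mu^{(k)}}\int_0^{y} t^{k} f(t|a,b,p,q)\,dt = B(w|p+k/a,\;q-k/a) = F(y|a,b,p+k/a,q-k/a).$$

This result, that the GB2's moment distribution functions can be written in terms of its cdf evaluated at different parameter values, is particularly useful for deriving the Lorenz curve and for setting up and computing GMM estimates from grouped data. The Lorenz curve, relating the cumulative proportion of income η to the cumulative proportion of population u, is given by

$$\eta = F_1\left(F^{-1}(u|a,b,p,q)\,\middle|\,a,b,p,q\right) = B\left(B^{-1}(u|p,q)\,\middle|\,p+1/a,\;q-1/a\right),$$

where the function B(·|·, ·) is defined in Equation (2) and $B^{-1}(u|p,q)$ is its inverse.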
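As a minimal sketch of how the moment expression can be evaluated with gamma functions, the following Python fragment computes the k-th GB2 moment on the log scale. The function name `gb2_moment` and the parameter values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import math

def gb2_moment(k, a, b, p, q):
    """k-th moment of GB2(a, b, p, q): b^k * B(p + k/a, q - k/a) / B(p, q)."""
    if not (-a * p < k < a * q):
        raise ValueError("the moment of order k exists only for -a*p < k < a*q")
    # Work with log-gamma values to avoid overflow for large arguments.
    log_moment = (k * math.log(b)
                  + math.lgamma(p + k / a) + math.lgamma(q - k / a)
                  - math.lgamma(p) - math.lgamma(q))
    return math.exp(log_moment)

# Hypothetical beta2 example (a = 1), for which the mean is b * p / (q - 1).
mean = gb2_moment(1, a=1.0, b=2.0, p=3.0, q=4.0)
second = gb2_moment(2, a=1.0, b=2.0, p=3.0, q=4.0)
variance = second - mean ** 2
```

Working on the log scale is a design choice of this sketch; a direct ratio of gamma functions gives the same value when the arguments are moderate.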

Gini Coefficient
The most widely used inequality measure is the Gini coefficient. McDonald (1984) and McDonald and Ransom (2008) use hypergeometric functions to express the Gini coefficient in terms of the GB2 parameters. An algorithm for computing these functions has been proposed by Graf (2009). It has been our experience that it is computationally easier to obtain the Gini coefficient via numerical integration than to evaluate the hypergeometric functions numerically. Another alternative is to estimate the Gini coefficient by simulating from the GB2 distribution. Specifically, noting that the Gini coefficient is given by

$$G = -1 + \frac{2}{\mu}\int_0^{\infty} y\,F(y|\phi)\,f(y|\phi)\,dy,$$

where $\mu = \mu^{(1)}$ is mean income and φ = (a, b, p, q), we can draw observations $(y_1, y_2, \ldots, y_M)$ from f(y|φ) and estimate G from

$$\hat{G} = -1 + \frac{2}{M\mu}\sum_{m=1}^{M} y_m\,F(y_m|\phi).$$

The number of draws M can be made as large as necessary to achieve the desired level of accuracy.

³ See, for example, Butler and McDonald (1989).
To draw observations from f(y|φ), we first draw observations $(w_1, w_2, \ldots, w_M)$ from a standard beta (p, q) distribution, defined on the (0, 1) interval, and then compute $y_m = b\left[w_m/(1-w_m)\right]^{1/a}$. If interest centers on one of the special case distributions where p = 1, q = 1 or a = 1, then closed form expressions in terms of gamma or beta functions are available for the Gini coefficient. They are

Singh-Maddala (p = 1): $G = 1 - \dfrac{\Gamma(q)\,\Gamma(2q-1/a)}{\Gamma(q-1/a)\,\Gamma(2q)}$

Dagum (q = 1): $G = \dfrac{\Gamma(p)\,\Gamma(2p+1/a)}{\Gamma(2p)\,\Gamma(p+1/a)} - 1$

Beta2 (a = 1): $G = \dfrac{2B(2p,\,2q-1)}{p\,B^{2}(p,q)}$

Suppose now we have estimated GB2 income distributions for a number of different areas, such as countries within a region or urban and rural areas within a country, and we are interested in estimating the Gini coefficient for the combined area. The combined income distribution can be written as a population-weighted mixture of the individual GB2 distributions. That is,

$$f(y|\Phi) = \sum_{j=1}^{J}\lambda_j\,f(y|\phi_j), \tag{4}$$

where $\Phi = (\phi_1, \phi_2, \ldots, \phi_J)$, $\lambda_j$ is the proportion of the combined population in area j, and $\phi_j = (a_j, b_j, p_j, q_j)$ is the vector of parameters of the distribution for area j. As noted by Chotikapanich et al. (2007), in this case the Gini coefficient for a combination of J areas can be estimated from

$$\hat{G} = -1 + \frac{2}{M\mu}\sum_{j=1}^{J}\lambda_j\sum_{m=1}^{M} y_{j,m}\,F(y_{j,m}|\Phi),$$

where $F(y|\Phi) = \sum_{j=1}^{J}\lambda_j F(y|\phi_j)$, $\mu = \sum_{j=1}^{J}\lambda_j\mu_j$ is the mean of the combined areas, $\mu_j$ is the mean for area j, and $y_{j,m}$ is the m-th draw from pdf $f(y|\phi_j)$. For the empirical work in this paper we estimated separate distributions for rural and urban areas in China and Indonesia, then combined them.
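The sampling scheme and the simulation-based Gini estimate can be sketched as follows. This is a hedged illustration rather than the authors' code: it draws GB2 values by transforming beta draws as described above, and evaluates the estimator $-1 + 2\sum_m y_m F(y_m|\phi)/(M\mu)$ for a Singh-Maddala distribution (p = 1), where the cdf is available in closed form so that only the Python standard library is needed. All function names, the seed and the parameter values are assumptions made for the example.

```python
import math
import random

def draw_gb2(M, a, b, p, q, rng):
    """Draw M values from GB2(a, b, p, q) by transforming Beta(p, q) draws."""
    draws = []
    for _ in range(M):
        w = rng.betavariate(p, q)
        draws.append(b * (w / (1.0 - w)) ** (1.0 / a))
    return draws

def sm_cdf(y, a, b, q):
    """Closed-form cdf of the Singh-Maddala special case (GB2 with p = 1)."""
    return 1.0 - (1.0 + (y / b) ** a) ** (-q)

def sm_mean(a, b, q):
    """Mean from the GB2 moment formula with p = 1."""
    return b * math.exp(math.lgamma(1.0 + 1.0 / a)
                        + math.lgamma(q - 1.0 / a) - math.lgamma(q))

def sm_gini_closed_form(a, q):
    """Closed-form Gini coefficient for the Singh-Maddala distribution."""
    return 1.0 - math.exp(math.lgamma(q) + math.lgamma(2.0 * q - 1.0 / a)
                          - math.lgamma(q - 1.0 / a) - math.lgamma(2.0 * q))

def gini_by_simulation(draws, cdf, mean):
    """Estimate G as -1 + 2 * sum(y * F(y)) / (M * mean)."""
    total = sum(y * cdf(y) for y in draws)
    return -1.0 + 2.0 * total / (len(draws) * mean)

# Hypothetical parameter values, used only to compare the two routes.
a, b, q = 2.5, 100.0, 2.0
rng = random.Random(12345)
draws = draw_gb2(100_000, a, b, 1.0, q, rng)
gini_sim = gini_by_simulation(draws, lambda y: sm_cdf(y, a, b, q), sm_mean(a, b, q))
gini_exact = sm_gini_closed_form(a, q)
```

For a general GB2 distribution the same estimator applies once `sm_cdf` is replaced by a routine for the regularized incomplete beta function.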

Generalized Entropy Measures
Next we consider the generalized entropy (GE) class of inequality measures, whose expressions in terms of the parameters of the GB2 distribution were provided by Jenkins (2009). The GE index is given by

$$I(\alpha) = \frac{1}{\alpha^{2}-\alpha}\left[\frac{\mu^{(\alpha)}}{\mu^{\alpha}} - 1\right], \tag{5}$$

where, for the GB2 distribution, $\mu^{(\alpha)} = \int_0^{\infty} y^{\alpha} f(y|\phi)\,dy$ is given in (3), and $\mu^{\alpha} = \left(\mu^{(1)}\right)^{\alpha}$. For large positive α, the index I(α) is sensitive to large differences at the top of the distribution; for large negative α, it is sensitive to differences at the bottom end of the distribution. Theoretically, α can range from −∞ to ∞, but values between −1 and 2 are usually considered in applications. Two popular special cases are obtained by taking limits as α → 0 and α → 1. The case where α → 0 is known as the mean logarithmic deviation or Theil(0) (Theil 1967, p. 127). Its general expression, and the result for the GB2 distribution, are

$$I(0) = \int_0^{\infty}\log\left(\frac{\mu}{y}\right) f(y|\phi)\,dy = \log\Gamma(p+1/a) + \log\Gamma(q-1/a) - \log\Gamma(p) - \log\Gamma(q) - \frac{\psi(p)-\psi(q)}{a},$$

where ψ(c) = d log Γ(c)/dc is the digamma function, computable by most software. The index obtained as α → 1 is known as Theil(1) (Theil 1967, p. 96). Its general expression, and result for the GB2 distribution, are

$$I(1) = \int_0^{\infty}\frac{y}{\mu}\log\left(\frac{y}{\mu}\right) f(y|\phi)\,dy = \frac{\psi(p+1/a)-\psi(q-1/a)}{a} - \log\Gamma(p+1/a) - \log\Gamma(q-1/a) + \log\Gamma(p) + \log\Gamma(q).$$

In the event that software is not available to compute the digamma function, draws $(y_1, y_2, \ldots, y_M)$ from f(y|φ) can be used to calculate $\sum_{m=1}^{M}\log(y_m)/M$ and $\sum_{m=1}^{M} y_m\log(y_m)/M$ as estimators for E[log(y)] and E[y log(y)], respectively.
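A short sketch of how the GE index and its two Theil limits might be computed from GB2 parameters is given below. It assumes the standard form $I(\alpha) = [\mu^{(\alpha)}/\mu^{\alpha} - 1]/(\alpha^2-\alpha)$ and uses $E[\log y] = [\psi(p)-\psi(q)]/a + \log b$. Because the Python standard library has no digamma function, ψ is approximated here by a central difference of `math.lgamma`; that substitution, the function names and the parameter values are all assumptions of this example.

```python
import math

def digamma(x, h=1e-5):
    """Central-difference approximation to psi(x) = d log Gamma(x) / dx."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def log_moment(k, a, b, p, q):
    """Log of the k-th GB2 moment (requires -a*p < k < a*q)."""
    return (k * math.log(b) + math.lgamma(p + k / a) + math.lgamma(q - k / a)
            - math.lgamma(p) - math.lgamma(q))

def ge_index(alpha, a, b, p, q):
    """GE index I(alpha) for alpha not equal to 0 or 1."""
    log_ratio = log_moment(alpha, a, b, p, q) - alpha * log_moment(1.0, a, b, p, q)
    return math.expm1(log_ratio) / (alpha * (alpha - 1.0))

def theil0(a, p, q):
    """Mean logarithmic deviation, the limit of I(alpha) as alpha -> 0."""
    return (math.lgamma(p + 1.0 / a) + math.lgamma(q - 1.0 / a)
            - math.lgamma(p) - math.lgamma(q)
            - (digamma(p) - digamma(q)) / a)

def theil1(a, p, q):
    """Theil(1), the limit of I(alpha) as alpha -> 1."""
    return ((digamma(p + 1.0 / a) - digamma(q - 1.0 / a)) / a
            - math.lgamma(p + 1.0 / a) - math.lgamma(q - 1.0 / a)
            + math.lgamma(p) + math.lgamma(q))

# Hypothetical parameter values for illustration only.
a, b, p, q = 2.0, 100.0, 1.5, 2.0
i_half = ge_index(0.5, a, b, p, q)
t0 = theil0(a, p, q)
t1 = theil1(a, p, q)
```

The scale parameter b cancels from every index, so `theil0` and `theil1` do not take it as an argument.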
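The GE index for a mixture of income distributions and its decomposition into within and between group inequality has been considered by Sarabia et al. (2017). To obtain the GE index for a region whose income distribution is a mixture of GB2 distributions, the quantities $\mu^{(\alpha)}$ and $\mu^{\alpha}$, defined in (5) for the GB2 distribution f(y|φ), are replaced by the corresponding moments for the mixture distribution $f(y|\Phi) = \sum_{j=1}^{J}\lambda_j f(y|\phi_j)$ given in (4). For α ≠ 0, 1, the resulting index is

$$I_C(\alpha) = \frac{1}{\alpha^{2}-\alpha}\left[\frac{\sum_{j=1}^{J}\lambda_j\,\mu_j^{(\alpha)}}{\mu_C^{\alpha}} - 1\right], \tag{6}$$

where $\mu_C = \sum_{j=1}^{J}\lambda_j\mu_j$ is the mean of the mixture and $\mu_j^{(\alpha)} = E_j(y^{\alpha})$ is the α-moment with respect to $f(y|\phi_j)$, the distribution of the j-th component. For the case where α = 0, we have

$$I_C(0) = \log\mu_C - \sum_{j=1}^{J}\lambda_j\,E_j(\log y),$$

where, for the GB2 distribution, $E_j(\log y) = \left[\psi(p_j) - \psi(q_j)\right]/a_j + \log b_j$. For the case where α = 1,

$$I_C(1) = \frac{1}{\mu_C}\sum_{j=1}^{J}\lambda_j\,E_j(y\log y) - \log\mu_C,$$

where $E_j(y\log y) = \mu_j\left\{\left[\psi(p_j+1/a_j) - \psi(q_j-1/a_j)\right]/a_j + \log b_j\right\}$.

An attractive feature of the GE index from a mixture is that it decomposes into a GE measure of inequality within the components of the mixture and a GE measure of inequality between components. To establish this decomposition, we write the index for the j-th area as

$$I_j(\alpha) = \frac{1}{\alpha^{2}-\alpha}\left[\frac{\mu_j^{(\alpha)}}{\mu_j^{\alpha}} - 1\right].$$

Substituting this expression into (6) yields

$$I_C(\alpha) = I_{with} + I_{betw}, \qquad I_{with} = \sum_{j=1}^{J}\lambda_j\left(\frac{\mu_j}{\mu_C}\right)^{\alpha} I_j(\alpha), \qquad I_{betw} = \frac{1}{\alpha^{2}-\alpha}\left[\sum_{j=1}^{J}\lambda_j\left(\frac{\mu_j}{\mu_C}\right)^{\alpha} - 1\right],$$

where $I_{with}$ is a weighted average of the inequalities for each area with weights given by $\lambda_j(\mu_j/\mu_C)^{\alpha}$, and $I_{betw}$ is a discrete version of the GE index for the J areas, measuring between inequality. Note that, unless α = 0 or 1, the weights do not sum to 1. When α = 0, the weights are the population shares $\lambda_j$; when α = 1, the weights are the income shares $\lambda_j\mu_j/\sum_{j=1}^{J}\lambda_j\mu_j$. The components for these two cases are

$$I_{with}(0) = \sum_{j=1}^{J}\lambda_j I_j(0), \qquad I_{betw}(0) = \sum_{j=1}^{J}\lambda_j\log\left(\frac{\mu_C}{\mu_j}\right),$$

$$I_{with}(1) = \sum_{j=1}^{J}\frac{\lambda_j\mu_j}{\mu_C} I_j(1), \qquad I_{betw}(1) = \sum_{j=1}^{J}\frac{\lambda_j\mu_j}{\mu_C}\log\left(\frac{\mu_j}{\mu_C}\right).$$

Atkinson Index
The Atkinson index is an inequality index that can be viewed as an ordinal special case of a GE index. It is given by

$$A(\varepsilon) = 1 - \frac{1}{\mu}\left[\mu^{(1-\varepsilon)}\right]^{1/(1-\varepsilon)} \ \text{ for } \varepsilon \neq 1, \qquad A(1) = 1 - \frac{\exp\{E(\log y)\}}{\mu}.$$

The parameter ε reflects the degree of aversion to inequality in a social welfare function. As ε → 0, there is no aversion to inequality, and A(ε) → 0. As ε → ∞, social welfare is increased by redistributing income towards complete equality; A(ε) → 1.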
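To compute A from the parameters of the GB2 distribution, we note that $\mu^{(1-\varepsilon)}$ is given in Equation (3) and $E(\log y) = \left[\psi(p) - \psi(q)\right]/a + \log b$. Alternatively, and for computing $A_C(\varepsilon)$, the Atkinson index for a mixture of GB2 distributions, the relationship between A(ε) and the GE index I(α) can be exploited. With α = 1 − ε, and ε > 0, it is given by

$$A(\varepsilon) = 1 - \left[(\alpha^{2}-\alpha)\,I(\alpha) + 1\right]^{1/\alpha} \ \text{ for } \varepsilon \neq 1, \qquad A(1) = 1 - \exp\{-I(0)\}.$$

Pietra Index
In contrast to the Gini coefficient, which is equal to twice the area between the Lorenz curve and the line of perfect equality, the Pietra index is equal to the maximum distance between the Lorenz curve and the perfect equality line (Kleiber and Kotz 2003), as well as twice the area of the largest triangle within the area between the Lorenz curve and line of perfect equality (Butler and McDonald 1989). Details of these results and an extensive analysis of the Pietra index, generally, and in terms of several distributions and their mixtures, can be found in Sarabia and Jordá (2014). For a single GB2 distribution, we have

$$P = F(\mu|\phi) - F_1(\mu|\phi) = B(v|p,q) - B(v|p+1/a,\;q-1/a), \qquad v = \frac{(\mu/b)^{a}}{1+(\mu/b)^{a}}.$$

For a mixture of distributions, it is given by

$$P_C = \sum_{j=1}^{J}\lambda_j\,B(v_j|p_j,q_j) - \frac{1}{\mu_C}\sum_{j=1}^{J}\lambda_j\mu_j\,B(v_j|p_j+1/a_j,\;q_j-1/a_j), \qquad v_j = \frac{(\mu_C/b_j)^{a_j}}{1+(\mu_C/b_j)^{a_j}},$$

where $\mu_C = \sum_{j=1}^{J}\lambda_j\mu_j$ is the mean of the mixture.

Quintile Share Ratio
Inequality is often also expressed in terms of the ratio of the income share of the richest to the income share of the poorest in the population. Graf and Nedyalkova (2014) consider the quintile share ratio (QSR), which is the ratio of the income share of the richest 20% relative to the income share of the poorest 20%. For the GB2 distribution, it is given by

$$QSR = \frac{1 - B(w_{0.8}|p+1/a,\;q-1/a)}{B(w_{0.2}|p+1/a,\;q-1/a)}, \qquad w_{0.8} = B^{-1}(0.8|p,q), \quad w_{0.2} = B^{-1}(0.2|p,q).$$

Noting that

$$F_1(y|\Phi) = \frac{1}{\mu_C}\sum_{j=1}^{J}\lambda_j\mu_j\,F_1(y|\phi_j),$$

the QSR for a mixture of GB2 distributions can be computed from

$$QSR_C = \frac{\mu_C - \sum_{j=1}^{J}\lambda_j\mu_j\,B(w_{j,0.8}|p_j+1/a_j,\;q_j-1/a_j)}{\sum_{j=1}^{J}\lambda_j\mu_j\,B(w_{j,0.2}|p_j+1/a_j,\;q_j-1/a_j)},$$

where $w_{j,0.8} = (y_{0.8}/b_j)^{a_j}/\left[1+(y_{0.8}/b_j)^{a_j}\right]$ and $w_{j,0.2} = (y_{0.2}/b_j)^{a_j}/\left[1+(y_{0.2}/b_j)^{a_j}\right]$, with $y_{0.2}$ and $y_{0.8}$ being the 20th and 80th percentiles from the mixture distribution. To obtain $y_{0.2}$ and $y_{0.8}$, the mixture distribution function needs to be inverted to obtain its corresponding quantile function, something that is not possible in closed form.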
As alternatives, one can (1) attempt to solve the required equation numerically, or (2) generate a large number of observations from each component in proportion to its population share, combine and sort these observations, and take the 20th and 80th empirical percentiles as estimates.
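Option (1) can be sketched with a simple bisection on the mixture cdf. To keep the example self-contained, the components are assumed to be Singh-Maddala distributions (GB2 with p = 1), whose cdf has a closed form; for general GB2 components the cdf would instead be evaluated with an incomplete beta routine. The population shares and parameter values are hypothetical.

```python
def sm_cdf(y, a, b, q):
    """Closed-form cdf of the Singh-Maddala special case (GB2 with p = 1)."""
    return 1.0 - (1.0 + (y / b) ** a) ** (-q)

def mixture_cdf(y, shares, params):
    """Population-weighted mixture cdf: sum_j lambda_j * F(y | phi_j)."""
    return sum(lam * sm_cdf(y, *phi) for lam, phi in zip(shares, params))

def mixture_quantile(u, shares, params, rel_tol=1e-12):
    """Solve mixture_cdf(y) = u for y by bisection."""
    lo, hi = 0.0, 1.0
    while mixture_cdf(hi, shares, params) < u:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, shares, params) < u:
            lo = mid
        else:
            hi = mid
        if hi - lo <= rel_tol * hi:
            break
    return 0.5 * (lo + hi)

# Hypothetical rural and urban components (a, b, q) and population shares.
shares = [0.6, 0.4]
params = [(2.5, 80.0, 2.0), (3.0, 200.0, 1.8)]
y20 = mixture_quantile(0.2, shares, params)
y80 = mixture_quantile(0.8, shares, params)
```

Bisection is used here because the mixture cdf is monotone, so a bracketed root is always found; a Newton step using the mixture pdf would converge faster but needs safeguards.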

Poverty Measures
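Expressions for several poverty measures in terms of the parameters of the GB2 distribution have been provided by Chotikapanich et al. (2013). The first is the headcount ratio which is simply the proportion of the population with income less than or equal to a poverty line z:

$$H(z) = F(z|\phi) = B(v|p,q), \tag{7}$$

where $v = (z/b)^{a}/\left[1+(z/b)^{a}\right]$. Setting the poverty line at 0.6 times the median gives what Graf and Nedyalkova (2014) term the at-risk-poverty rate (ARPR). It can be calculated from (7) after setting the poverty line at

$$z = 0.6\,F^{-1}(0.5|\phi) = 0.6\,b\left[\frac{B^{-1}(0.5|p,q)}{1-B^{-1}(0.5|p,q)}\right]^{1/a}. \tag{8}$$

A second poverty measure used extensively in the literature is the FGT(α) class of measures (Foster et al. 1984) given by

$$FGT(\alpha) = \int_0^{z}\left(\frac{z-y}{z}\right)^{\alpha} f(y|\phi)\,dy.$$

For integer values of α, this expression can be written in terms of incomplete moments of the GB2 distribution as well as in terms of the income gap ratio, defined as the average amount of money that must be given to each of the poor to bring them up to the poverty line, expressed relative to the poverty line. Working in this direction, we define the k-th incomplete moment for the GB2 distribution, relative to poverty line z, as

$$\mu_z^{(k)} = \frac{1}{F(z|\phi)}\int_0^{z} y^{k} f(y|\phi)\,dy = \mu^{(k)}\,\frac{B(v|p+k/a,\;q-k/a)}{B(v|p,q)}.$$

Defining the income gap ratio as $g(z) = (z-\mu_z)/z$, with $\mu_z = \mu_z^{(1)}$ the mean income of the poor, we have

$$FGT(0) = H(z), \qquad FGT(1) = H(z)\,g(z), \qquad FGT(2) = H(z)\left[g(z)^{2} + \frac{\sigma_z^{2}}{z^{2}}\right],$$

where $\sigma_z^{2} = \mu_z^{(2)} - \mu_z^{2}$ is the variance of the income of the poor. For noninteger values of α, we can simulate values $y_1, y_2, \ldots, y_M$ from the GB2 distribution and use the estimator

$$\widehat{FGT}(\alpha) = \frac{1}{M}\sum_{m=1}^{M}\left(\frac{z-y_m}{z}\right)^{\alpha} I(y_m \le z),$$

where I(·) is an indicator function equal to 1 if its argument is true and zero otherwise.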
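The simulation estimator for FGT(α) averages $((z-y_m)/z)^{\alpha}$ over the draws that fall at or below the poverty line. A minimal sketch follows, with hypothetical parameter values, seed and poverty line; the draws come from a Singh-Maddala distribution (p = 1) only so that the simulated headcount can be compared with a closed-form cdf.

```python
import random

def draw_gb2(M, a, b, p, q, rng):
    """Draw M values from GB2(a, b, p, q) by transforming Beta(p, q) draws."""
    draws = []
    for _ in range(M):
        w = rng.betavariate(p, q)
        draws.append(b * (w / (1.0 - w)) ** (1.0 / a))
    return draws

def fgt_simulated(draws, z, alpha):
    """Average of ((z - y) / z)**alpha over draws with y <= z, divided by M."""
    total = 0.0
    for y in draws:
        if y <= z:
            total += ((z - y) / z) ** alpha
    return total / len(draws)

# Hypothetical Singh-Maddala parameters (p = 1) and poverty line.
a, b, q, z = 2.5, 100.0, 2.0, 50.0
rng = random.Random(2024)
draws = draw_gb2(50_000, a, b, 1.0, q, rng)

headcount = fgt_simulated(draws, z, 0)      # FGT(0), the headcount ratio
poverty_gap = fgt_simulated(draws, z, 1)    # FGT(1) = H * g
severity = fgt_simulated(draws, z, 2)       # FGT(2)
noninteger = fgt_simulated(draws, z, 1.5)   # works for any alpha >= 0

# Sample analogue of the income gap ratio g(z) = (z - mean income of the poor) / z.
poor = [y for y in draws if y <= z]
gap_ratio = 1.0 - (sum(poor) / len(poor)) / z
exact_headcount = 1.0 - (1.0 + (z / b) ** a) ** (-q)
```

In the sample, `poverty_gap` equals `headcount * gap_ratio` up to rounding, mirroring the relation between FGT(1), the headcount ratio and the income gap ratio.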
As an alternative to the income gap ratio $g(z) = (z-\mu_z)/z$, Graf and Nedyalkova (2014) use a concept known as the relative median poverty gap (RMPG). It is defined as the relative gap between a poverty line, which is 0.6 times the median income of the population, and the median income of the poor. Specifically, with z defined as in (8),

$$RMPG = \frac{z - y_{med}^{poor}}{z},$$

where the median of the poor is defined as

$$y_{med}^{poor} = F^{-1}(A/2\,|\,\phi) = b\left[\frac{B^{-1}(A/2|p,q)}{1-B^{-1}(A/2|p,q)}\right]^{1/a},$$

with A being the at-risk-poverty rate (the headcount ratio using the poverty line in (8)).
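Considering the income shortfall in log format leads to the Watts index (Watts 1968), defined as

$$W = \int_0^{z}\left[\log z - \log y\right] f(y|\phi)\,dy = \left[\log\left(\frac{z}{b}\right) - \frac{\psi(p)-\psi(q)}{a}\right] B(v|p,q) - \frac{1}{a}\left[D_p B(v|p,q) - D_q B(v|p,q)\right], \tag{9}$$

where $D_p B(v|p,q)$ and $D_q B(v|p,q)$ are the derivatives of the beta cdf B(v|p, q) with respect to p and q, respectively. These derivatives are available in some software (e.g., EViews); otherwise (9) can be estimated via simulation.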
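The last poverty measure that we describe is the Sen index (Sen 1976) where the poverty gap is weighted by a person's rank in the ordering of the poor. This index is given by

$$S = \frac{2}{z\,H(z)}\int_0^{z}(z-y)\left[H(z) - F(y|\phi)\right] f(y|\phi)\,dy$$
$$\;\; = H(z)\left[g(z) + \left(1-g(z)\right)G(z)\right], \tag{10}$$

where G(z) is the Gini coefficient for the poor given by

$$G(z) = -1 + \frac{2}{\mu_z\,H(z)^{2}}\int_0^{z} y\,F(y|\phi)\,f(y|\phi)\,dy.$$

The last line in (10) shows how the index can be written in terms of the headcount ratio, the aggregate income gap ratio and the inequality of the poor measured using G(z). Expressing S in terms of the parameters of the GB2 distribution is more difficult than it was for the other indices. In (10) we can use H(z) = B(v|p, q) and $g(z) = 1 - \mu_z/z$, but evaluation of G(z) is more troublesome. If we follow the simulation approach and draw M observations $y_m$, m = 1, 2, ..., M from f(y|φ), it can be estimated using

$$\hat{G}(z) = -1 + \frac{2}{M\,\mu_z\,H(z)^{2}}\sum_{m=1}^{M} y_m\,F(y_m|\phi)\,I(y_m \le z).$$

For aggregating poverty over a number of areas each of which has a GB2 distribution, the headcount ratio, FGT, and Watts indexes are simply population-weighted averages of the indexes for each area. That is, using obvious notation,

$$H_C(z) = \sum_{j=1}^{J}\lambda_j H_j(z), \qquad FGT_C(\alpha) = \sum_{j=1}^{J}\lambda_j FGT_j(\alpha), \qquad W_C = \sum_{j=1}^{J}\lambda_j W_j.$$

This result does not hold for the at-risk-poverty rate and the relative median poverty gap where the poverty line is endogenous, nor does it hold for the Sen index, which contains the cdf. For ARPR and RMPG, the median of the mixture is required and RMPG also needs the median of the poor from the mixture distribution. These values can be estimated by simulating observations from the component distributions and ordering them as was suggested for the QSR. For the Sen index for the mixture, we have

$$S_C = H_C(z)\left[g_C(z) + \left(1-g_C(z)\right)G_C(z)\right], \qquad \hat{G}_C(z) = -1 + \frac{2}{M\,\mu_{z,C}\,H_C(z)^{2}}\sum_{j=1}^{J}\lambda_j\sum_{m=1}^{M} y_{j,m}\,F(y_{j,m}|\Phi)\,I(y_{j,m} \le z),$$

where $g_C(z) = 1 - \mu_{z,C}/z$, $\mu_{z,C} = \sum_{j=1}^{J}\lambda_j H_j(z)\,\mu_{z,j}/H_C(z)$ is the mean income of the poor in the mixture, and the $y_{j,m}$ are draws from $f(y|\phi_j)$.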

Measures of Pro-Poor Growth
In addition to examining changes in poverty incidence over time using measures such as the headcount ratio or refinements of it that take into account the severity of the poverty, it is useful to examine whether growth has favored the poor relative to others placed at more favorable points in the income distribution. Following Duclos and Verdier-Chouchane (2010), we consider three such pro-poor measures, namely, measures attributable to Ravallion and Chen (2003), Kakwani and Pernia (2000), and a "poverty equivalent growth rate" (PEGR) suggested by Kakwani et al. (2004).
The first step towards the Ravallion-Chen measure is the construction of a "growth incidence curve" (GIC), which describes the growth rate of income at each percentile u of the distribution. Specifically, if $F_A(y)$ is the income distribution function at time A, and $F_B(y)$ is the distribution function for the new income distribution at a later point B, then

$$GIC(u) = \frac{F_B^{-1}(u) - F_A^{-1}(u)}{F_A^{-1}(u)}.$$

For computing values of GIC(u) from the GB2 distribution, note that

$$F^{-1}(u|\phi) = b\left[\frac{B^{-1}(u|p,q)}{1-B^{-1}(u|p,q)}\right]^{1/a},$$

where $B^{-1}(u|p,q)$ is the quantile function of the standardized beta distribution evaluated at u. When we have a regional distribution or a country distribution, which is a mixture of rural and urban GB2 distributions, it is no longer straightforward to compute the quantile function. In this case, we require $F^{-1}(u|\Phi)$ which is the inverse function of $F(y|\Phi) = \sum_{j=1}^{J}\lambda_j F(y|\phi_j)$. One needs to either solve the resulting nonlinear equation numerically or estimate $F^{-1}(u|\Phi)$ using an empirical distribution function obtained by generating observations from the relevant GB2 distributions in the mixture. We followed the latter approach in our applications.
The GIC can be used in a number of ways. If GIC(u) > 0 for all u, then the distribution at time B first-order stochastically dominates the distribution at time A. If GIC(u) > 0 for all u up to the initial headcount ratio H A , then growth has been absolutely pro-poor. If GIC(u) > (µ B − µ A )/µ A for all u up to the initial headcount ratio H A , that is, the growth rate of income of the poor is greater than the growth rate of mean income (µ), then growth has been relatively pro-poor.
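The checks just described can be sketched by evaluating the GIC on a grid of percentiles up to the initial headcount ratio. The example assumes Singh-Maddala distributions (GB2 with p = 1) at both dates so that the quantile function has the closed form $b[(1-u)^{-1/q}-1]^{1/a}$; the parameter triples (a, b, q), the grid size and the headcount ratio are hypothetical.

```python
import math

def sm_quantile(u, a, b, q):
    """Quantile function of the Singh-Maddala distribution (GB2 with p = 1)."""
    return b * ((1.0 - u) ** (-1.0 / q) - 1.0) ** (1.0 / a)

def sm_mean(a, b, q):
    """Mean from the GB2 moment formula with p = 1."""
    return b * math.exp(math.lgamma(1.0 + 1.0 / a)
                        + math.lgamma(q - 1.0 / a) - math.lgamma(q))

def gic(u, phi_a, phi_b):
    """Growth rate of income at percentile u between distributions A and B."""
    y_a = sm_quantile(u, *phi_a)
    y_b = sm_quantile(u, *phi_b)
    return (y_b - y_a) / y_a

def pro_poor_flags(phi_a, phi_b, headcount_a, n_grid=200):
    """Check GIC(u) > 0 and GIC(u) > mean growth on a grid of u in (0, H_A]."""
    mean_growth = sm_mean(*phi_b) / sm_mean(*phi_a) - 1.0
    rates = [gic(headcount_a * (i + 1) / n_grid, phi_a, phi_b) for i in range(n_grid)]
    absolute = all(r > 0.0 for r in rates)
    relative = all(r > mean_growth for r in rates)
    return absolute, relative, mean_growth

# Hypothetical distributions at times A and B.
phi_a = (2.5, 100.0, 2.0)
phi_b = (3.0, 120.0, 2.0)
absolute, relative, mean_growth = pro_poor_flags(phi_a, phi_b, headcount_a=0.3)

# A pure change in scale shifts every quantile by the same proportion.
neutral_rate = gic(0.1, phi_a, (2.5, 120.0, 2.0))
```

A finite grid only approximates the "for all u" conditions; a finer grid or an analytical argument would be needed where the GIC comes close to the relevant threshold.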
For a single measure of pro-poor growth, Ravallion and Chen suggest using the average growth rate of the income of the poor. It can be expressed as

$$RC = \frac{1}{H_A}\int_0^{H_A} GIC(u)\,du.$$

For a GB2 distribution (not a mixture), this integral can be evaluated numerically. Alternatively, we can generate observations from a GB2 distribution or a mixture and compute

$$\widehat{RC} = \frac{1}{N_1}\sum_{i=1}^{N_1}\frac{y_{B,(i)} - y_{A,(i)}}{y_{A,(i)}},$$

where $y_{A,(i)}$ and $y_{B,(i)}$ are the i-th smallest observations generated from distributions A and B, N is the total number of observations generated from each distribution, and $N_1 = H_A N$.

The Kakwani-Pernia measure compares the change in a poverty index such as the change in the headcount ratio, $H_A - H_B$, with the change that would have occurred with the same growth rate, but with distribution neutrality, $H_A - H_{\tilde{B}}$. Here, $\tilde{B}$ denotes an income distribution that would be obtained if all incomes changed in the same proportion as the change in mean income that occurred when moving from distribution A to distribution B. To obtain $\tilde{B}$ in the context of single GB2 distributions, we can simply change the scale parameter b and leave the parameters a, p and q unchanged. The Lorenz curve and inequality measures obtained from a GB2 distribution depend on a, p and q, but do not depend on b. Thus, we have

$$F_{\tilde{B}}(y) = F\left(y\,\middle|\,a_A,\;\frac{\mu_B}{\mu_A}\,b_A,\;p_A,\;q_A\right).$$

Finding $\tilde{B}$ for a mixture of GB2 distributions, a situation that occurs when we combine rural and urban distributions to find a country distribution, is less straightforward. In this case, the scale parameters in all components of the mixture change and the other parameters are left unchanged. For example, using the superscripts r and u to denote rural and urban, respectively, and $\lambda_A^r, \lambda_A^u$ and $\lambda_B^r, \lambda_B^u$ to denote the respective population proportions at times A and B, we first compute the combined means at times A and B as

$$\mu_A = \lambda_A^r\mu_A^r + \lambda_A^u\mu_A^u, \qquad \mu_B = \lambda_B^r\mu_B^r + \lambda_B^u\mu_B^u.$$

Then, we obtain the distribution function for $\tilde{B}$ as follows:

$$F_{\tilde{B}}(y) = \lambda_A^r\,F\left(y\,\middle|\,a_A^r,\;\frac{\mu_B}{\mu_A}\,b_A^r,\;p_A^r,\;q_A^r\right) + \lambda_A^u\,F\left(y\,\middle|\,a_A^u,\;\frac{\mu_B}{\mu_A}\,b_A^u,\;p_A^u,\;q_A^u\right).$$

Thus, to obtain $\tilde{B}$ we assume that all incomes in the rural and urban sectors increase in the same proportion as their respective mean incomes, and the distributions of income and the population proportions in each of the sectors remain the same.
The Kakwani-Pernia measure is

$$KP = \frac{H_A - H_B}{H_A - H_{\tilde{B}}}.$$

Assuming the growth in mean income has been positive, a value KP > 0 implies the change in the distribution has been absolutely pro-poor, and a value KP > 1 implies the change in distribution has been relatively pro-poor.

The third measure of pro-poor growth is the poverty-equivalent growth rate (PEGR) suggested by Kakwani et al. (2004). In the context of our description of the Kakwani-Pernia measure, it is the growth rate used to construct distribution $\tilde{B}$ such that $H_{\tilde{B}} = H_B$. In other words, it is the growth rate necessary to achieve the observed change in the headcount ratio when distribution neutrality is maintained. In terms of the GB2 distribution, it is the value $g^{*}$ that solves the following equation:

$$F\left(z\,\middle|\,a_A,\;(1+g^{*})\,b_A,\;p_A,\;q_A\right) = H_B.$$

As was the case with previous calculations, for a mixture of GB2 distributions, this procedure is less straightforward. As an alternative, to find an approximate $g^{*}$ for a combined rural-urban distribution, we computed separate growth rates $g^{*}_r$ and $g^{*}_u$ for the two sectors and found a weighted average of them using weights from period B.

Let $g = (\mu_B - \mu_A)/\mu_A$ denote the realized growth rate of mean income. If $g^{*} < g$, then, under distribution neutrality, the growth rate required to achieve the same outcome for the headcount ratio is less than the realized growth rate, implying that the change in the distribution has not favored the poor. Conversely, when $g^{*} > g$, a higher growth rate is required under distributional neutrality to equate the two headcount ratios. In this case, the distributional effect must have favored the poor.
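For single distributions, both the Kakwani-Pernia ratio and the poverty-equivalent growth rate can be computed directly. The sketch below is an illustration under stated assumptions: the distributions at both dates are Singh-Maddala (GB2 with p = 1), so the cdf can be inverted in closed form; the distribution-neutral counterfactual rescales only b by the growth in mean income; and the poverty line and parameters are hypothetical.

```python
import math

def sm_cdf(y, a, b, q):
    """Closed-form cdf of the Singh-Maddala special case (GB2 with p = 1)."""
    return 1.0 - (1.0 + (y / b) ** a) ** (-q)

def sm_mean(a, b, q):
    """Mean from the GB2 moment formula with p = 1."""
    return b * math.exp(math.lgamma(1.0 + 1.0 / a)
                        + math.lgamma(q - 1.0 / a) - math.lgamma(q))

def kakwani_pernia(z, phi_a, phi_b):
    """Return (KP, g): observed over distribution-neutral headcount change."""
    a_a, b_a, q_a = phi_a
    g = sm_mean(*phi_b) / sm_mean(*phi_a) - 1.0
    h_a = sm_cdf(z, *phi_a)
    h_b = sm_cdf(z, *phi_b)
    h_neutral = sm_cdf(z, a_a, b_a * (1.0 + g), q_a)   # only the scale changes
    return (h_a - h_b) / (h_a - h_neutral), g

def pegr(z, phi_a, phi_b):
    """Growth rate g* such that rescaling b_A by (1 + g*) reproduces H_B."""
    a_a, b_a, q_a = phi_a
    h_b = sm_cdf(z, *phi_b)
    # Invert F(z | a_A, b, q_A) = H_B for the scale parameter b.
    scaled_b = z / ((1.0 - h_b) ** (-1.0 / q_a) - 1.0) ** (1.0 / a_a)
    return scaled_b / b_a - 1.0

# Hypothetical distributions (a, b, q) at times A and B, and poverty line z.
phi_a = (2.5, 100.0, 2.0)
phi_b = (3.0, 120.0, 2.0)
z = 50.0
kp, g = kakwani_pernia(z, phi_a, phi_b)
g_star = pegr(z, phi_a, phi_b)
```

When distribution B is a pure rescaling of A, the two headcount changes coincide, so KP equals 1 and the poverty-equivalent growth rate equals the growth rate of mean income.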

Estimation
All the required quantities-the means of the distributions, the density and distribution functions, the Gini coefficients, the poverty measures, and the pro-poor growth measures-depend on the unknown parameters φ j of the GB2 distributions. Potential methods of estimation of these parameters depend on whether the available data are in the form of single observations or are grouped, and, if they are grouped, whether information on group means, as well as the number of observations in each group, is available.

Estimation with Single Observations
For single observations, say a sample of observations $(y_1, y_2, \ldots, y_T)$, maximum likelihood estimation can be used with the log-likelihood given by

$$\log L(\phi) = T\log a + (ap-1)\sum_{t=1}^{T}\log y_t - Tap\log b - T\log B(p,q) - (p+q)\sum_{t=1}^{T}\log\left[1+\left(\frac{y_t}{b}\right)^{a}\right].$$

For samples where sampling weights are available, a pseudo log-likelihood can be maximized to provide consistent parameter estimates, and their precision can be assessed with a sandwich covariance matrix estimator. Details of this estimation procedure are described by Graf and Nedyalkova (2014). With income equivalized over all household members, and sampling weights $w_i$ attached to each household, their pseudo log-likelihood is given by

$$\log L(\phi) = \sum_{i=1}^{h} w_i\,n_i\,\log f(y_i|\phi),$$

where h is the number of households and $n_i$ is the number of persons in household i.
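The log-likelihood and its weighted (pseudo) version can be coded as a sum of log densities. The sketch below evaluates the objective only; maximization would be handed to a numerical optimizer, which is not shown. The function names and the tiny data set are hypothetical, and the `weights` argument is assumed to hold the products $w_i n_i$.

```python
import math

def gb2_logpdf(y, a, b, p, q):
    """Log of the GB2 density at y > 0."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return (math.log(a) + (a * p - 1.0) * math.log(y) - a * p * math.log(b)
            - log_beta - (p + q) * math.log1p((y / b) ** a))

def gb2_loglik(phi, incomes, weights=None):
    """Sum of (optionally weighted) log densities; phi = (a, b, p, q)."""
    if weights is None:
        weights = [1.0] * len(incomes)
    return sum(w * gb2_logpdf(y, *phi) for y, w in zip(incomes, weights))

# Hypothetical data and parameter values, for illustration only.
incomes = [35.0, 80.0, 120.0, 260.0]
phi = (2.0, 100.0, 1.5, 2.0)
unweighted = gb2_loglik(phi, incomes)
weighted = gb2_loglik(phi, incomes, weights=[2.0, 2.0, 2.0, 2.0])
```

In practice the parameters are often optimized on the log scale so that the positivity constraints on a, b, p and q hold automatically; that reparameterization is a common device and is not something described in this paper.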
A further estimation method has been suggested by Graf and Nedyalkova (2014). This method minimizes a weighted sum of squared distances between sample values of the indicators (ARPR, RMPG, QSR, Gini) and the same indicators expressed in terms of the GB2 parameters. This method has some similarities to the grouped data methods of estimation we describe in the next subsection, where a weighted squared distance between empirical and theoretical quantiles and group means is minimized. One difference is that, for using quantiles and group means, an optimal weight matrix can be derived. Deriving an optimal weight matrix for the Graf-Nedyalkova proposal would appear to be a more difficult problem.

Estimation with Grouped Data
Suppose now that the observations $(y_1, y_2, \ldots, y_T)$ have been grouped into N income classes $(x_0, x_1), (x_1, x_2), \ldots, (x_{N-1}, x_N)$ with $x_0 = 0$ and $x_N = \infty$. Let $c_i$ be the proportion of observations in the i-th group, let $\bar{y}_i$ be mean income for the i-th group, and let $\bar{y}$ be overall mean income. In some instances, where income share data for each group $(s_1, s_2, \ldots, s_N)$ are available, the group means may need to be calculated from $\bar{y}_i = s_i\bar{y}/c_i$. Choice of an estimation method depends on how much of the information just described is available. If the $c_i$ and $x_i$ are available, but the $\bar{y}_i$ are not, then the multinomial likelihood is a natural choice. In this case the log-likelihood is given by

$$\log L(\phi) \propto \sum_{i=1}^{N} c_i\,\log\left[F(x_i|\phi) - F(x_{i-1}|\phi)\right].$$

Another possibility is the minimum chi-squared estimator described in McDonald and Ransom (2008).
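A minimal sketch of the grouped-data multinomial log-likelihood is shown below. It assumes class bounds $x_0 = 0 < x_1 < \ldots < x_N = \infty$ and observed class proportions, and it uses the Singh-Maddala special case (p = 1) so that the cdf is available in closed form. The bounds and parameter values are hypothetical, and the "observed" shares are generated from the model itself purely to illustrate that the objective peaks at the generating parameters.

```python
import math

def sm_cdf(y, a, b, q):
    """Closed-form cdf of the Singh-Maddala special case (GB2 with p = 1)."""
    if math.isinf(y):
        return 1.0
    return 1.0 - (1.0 + (y / b) ** a) ** (-q)

def class_probabilities(bounds, a, b, q):
    """F(x_i) - F(x_{i-1}) for consecutive class bounds."""
    cdf = [sm_cdf(x, a, b, q) for x in bounds]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]

def grouped_loglik(phi, bounds, shares, T=1.0):
    """Multinomial log-likelihood kernel: T * sum_i c_i * log(probability_i)."""
    probs = class_probabilities(bounds, *phi)
    return T * sum(c * math.log(k) for c, k in zip(shares, probs) if c > 0.0)

# Hypothetical class bounds and generating parameters (a, b, q).
bounds = [0.0, 40.0, 80.0, 150.0, math.inf]
phi_true = (2.5, 100.0, 2.0)
shares = class_probabilities(bounds, *phi_true)   # stand-in for observed c_i
ll_true = grouped_loglik(phi_true, bounds, shares)
ll_other = grouped_loglik((2.5, 130.0, 2.0), bounds, shares)
```

Because the shares were generated from `phi_true`, any other parameter vector that implies different class probabilities gives a strictly smaller value of the objective.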
For the scenario where one also has data for the group means $\bar{y}_i$, and when the group bounds $x_i$ may or may not be available, estimators based on moment conditions have been suggested by Chotikapanich et al. (2007), Hajargasht et al. (2012) and Griffiths and Hajargasht (2015). To describe the objective functions that are minimized to obtain these estimators, we need the moments of each group up to order 2, expressed in terms of φ and $x = (x_1, x_2, \ldots, x_{N-1})$. Working in this direction, we define

$$k_i = \int_{x_{i-1}}^{x_i} f(y|\phi)\,dy = F(x_i|\phi) - F(x_{i-1}|\phi),$$
$$\mu_i = \int_{x_{i-1}}^{x_i} y\,f(y|\phi)\,dy = \mu\left[F_1(x_i|\phi) - F_1(x_{i-1}|\phi)\right],$$
$$\mu_i^{(2)} = \int_{x_{i-1}}^{x_i} y^{2}\,f(y|\phi)\,dy = \mu^{(2)}\left[F_2(x_i|\phi) - F_2(x_{i-1}|\phi)\right],$$

where $F_1(x_i|\phi)$ and $F_2(x_i|\phi)$ are the moment distribution functions defined in Section 2. Further, Hajargasht et al. (2012) show that the GMM estimator that uses moments for $c_i$ and $\tilde{y}_i = c_i\bar{y}_i$, and the optimal weight matrix, can be written as the minimizer of

$$GMM_1(x,\phi) = \sum_{i=1}^{N} w_{1i}(c_i-k_i)^{2} + \sum_{i=1}^{N} w_{2i}(\tilde{y}_i-\mu_i)^{2} - 2\sum_{i=1}^{N} w_{3i}(c_i-k_i)(\tilde{y}_i-\mu_i), \tag{11}$$

where $w_{1i} = \mu_i^{(2)}/v_i$, $w_{2i} = k_i/v_i$, $w_{3i} = \mu_i/v_i$ and $v_i = k_i\mu_i^{(2)} - \mu_i^{2}$. The function $GMM_1(x,\phi)$ can be minimized with respect to both x and φ, or, if observations on x are available, with respect to φ only. Because the weights depend on (x, φ), a variety of estimators can be used, depending on whether $GMM_1(x,\phi)$ is minimized directly or a two-step or iterative procedure is employed. In a two-step procedure, initial estimates with weights that are not dependent on the parameters are obtained, and then estimates that minimize $GMM_1(x,\phi)$, with weights computed from the initial estimates, are computed. Iterating this process leads to an iterative estimator.
An estimator that uses weights that do not depend on (x, φ), and which is useful for obtaining starting values for a two-step or iterative estimator from (11), is that proposed by Chotikapanich et al. (2007). In contrast to (11), they considered moment conditions for $c_i$ and $\bar{y}_i$ instead of $c_i$ and $\tilde{y}_i = c_i\bar{y}_i$. Although they focused on the special case beta2 distribution, their results also hold for the more general GB2 distribution. The function that they minimized is

$$GMM_2(x,\phi) = \sum_{i=1}^{N}\left(\frac{c_i-k_i}{c_i}\right)^{2} + \sum_{i=1}^{N}\left(\frac{\bar{y}_i-\mu_i/k_i}{\bar{y}_i}\right)^{2}. \tag{12}$$

The weights used for this estimator, $c_i^{-2}$ and $\bar{y}_i^{-2}$, are not optimal, but they have the intuitive appeal of minimizing the sum of squares of percentage errors. Also, computation of the second moment $\mu_i^{(2)}$ is not required. A third GMM estimator is that described by Griffiths and Hajargasht (2015). Like (12), this estimator considers the moment conditions for $c_i$ and $\bar{y}_i$, but uses the optimal weight matrix.⁵ It is given by

$$GMM_3(x,\phi) = \sum_{i=1}^{N}\frac{(c_i-k_i)^{2}}{k_i} + \sum_{i=1}^{N}\frac{k_i^{3}}{v_i}\left(\bar{y}_i-\frac{\mu_i}{k_i}\right)^{2}. \tag{13}$$

Relative to the other optimal weight formulation in (11), this objective function avoids the term with the cross product of the moment conditions.
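To make the ingredients of these objective functions concrete, the sketch below evaluates the class probabilities $k_i$, the partial first moments $\mu_i$, and an objective of the percentage-error form described for (12). It is an illustration under stated assumptions, not the authors' implementation: the regularized incomplete beta function is computed with a textbook continued-fraction routine because the Python standard library does not provide one, and the class bounds and parameters are hypothetical.

```python
import math

def _betacf(x, a, b, max_iter=300, eps=1e-13, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((a + m2 - 1.0) * (a + m2)),
                   -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betainc(x, a, b):
    """Regularized incomplete beta function B(x | a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(x, a, b) / a
    return 1.0 - front * _betacf(1.0 - x, b, a) / b

def gb2_cdf(y, a, b, p, q):
    """GB2 cdf B(w | p, q) with w = (y/b)^a / (1 + (y/b)^a)."""
    if math.isinf(y):
        return 1.0
    t = (y / b) ** a
    return betainc(t / (1.0 + t), p, q)

def group_moments(bounds, a, b, p, q):
    """Class probabilities k_i and partial first moments mu_i."""
    mean = b * math.exp(math.lgamma(p + 1.0 / a) + math.lgamma(q - 1.0 / a)
                        - math.lgamma(p) - math.lgamma(q))
    F = [gb2_cdf(x, a, b, p, q) for x in bounds]
    F1 = [gb2_cdf(x, a, b, p + 1.0 / a, q - 1.0 / a) for x in bounds]
    k = [hi - lo for lo, hi in zip(F[:-1], F[1:])]
    mu = [mean * (hi - lo) for lo, hi in zip(F1[:-1], F1[1:])]
    return k, mu

def gmm2(phi, bounds, c, ybar):
    """Sum of squared percentage errors in class shares and class means."""
    k, mu = group_moments(bounds, *phi)
    return sum(((ci - ki) / ci) ** 2 + ((yi - mi / ki) / yi) ** 2
               for ci, yi, ki, mi in zip(c, ybar, k, mu))

# Hypothetical bounds and parameters; the "data" are generated from the model.
bounds = [0.0, 40.0, 80.0, 150.0, math.inf]
phi_true = (2.0, 100.0, 1.5, 2.0)
k_true, mu_true = group_moments(bounds, *phi_true)
c = k_true
ybar = [m / k for m, k in zip(mu_true, k_true)]
at_truth = gmm2(phi_true, bounds, c, ybar)
elsewhere = gmm2((2.0, 110.0, 1.5, 2.0), bounds, c, ybar)
```

An objective of this kind is a natural source of starting values because its weights do not depend on the parameters; the optimally weighted versions additionally require the second partial moments, obtained in the same way from the second moment distribution function.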

Applications
A major source of data for the cross-country study of income distributions, inequality and poverty is the World Bank PovcalNet website. We used data from China and Indonesia, two Asian countries with relatively large populations. The years considered were 1999, 2005, 2010 and 2013 for China and 1999, 2005, 2010 and 2016 for Indonesia.⁶ The data available are in grouped form comprising population shares and corresponding expenditure shares for a number of classes, together with mean monthly expenditure that has been reported from surveys, and then converted to purchasing power parity (PPP) using the World Bank's 2011 PPP exchange rates for the consumption aggregate for national accounts. Also available are the data on population size. Throughout the paper we use the generic term income distributions, although our example distributions are for expenditure. For both countries, separate data were available for rural and urban populations and so distributions were estimated for each of these components. Data for China were in the form of 20 groups, with the exception of China-rural 1999 (19 groups) and 2005 (17 groups), while those for Indonesia were available in 100 groups. To make the data for both countries relatively consistent for estimation, we aggregated the Indonesian data into 20 groups. The distributions were estimated by minimizing the objective function $GMM_3(x,\phi)$ given in (13). Initial estimates were obtained by minimizing $GMM_2(x,\phi)$, those initial estimates were used to compute the weights for $GMM_3(x,\phi)$, the estimates from $GMM_3(x,\phi)$ were then used to compute a new set of weights, and the process was continued for 10 iterations. Parameterizing the objective function in terms of (a, µ, p, q) instead of (a, b, p, q) facilitated convergence.
Parameter estimates for each of the distributions are presented in Table 1, along with corresponding estimates for mean income and the populations for each region. The density functions for China and Indonesia, obtained as mixtures of the urban and rural densities, are plotted in Figures 1 and 2, respectively. A striking feature of the parameter estimates is the very large estimates for p (and correspondingly small estimates for b) for Indonesia-urban in 2010 and 2016. As p → ∞, the GB2 distribution approaches the 3-parameter inverse generalized gamma distribution,⁷ and so the results suggest this special-case distribution would be adequate for these two cases. Its density function is given by

$$f(y|a,\beta,q) = \frac{a\,\beta^{aq}}{\Gamma(q)\,y^{aq+1}}\exp\left[-\left(\frac{\beta}{y}\right)^{a}\right], \qquad y>0,$$

where β > 0 is a scale parameter. The figures show that, for both countries, there is an improvement over time in the sense that the distribution shifts to the right, and mean income increases, with the most dramatic improvements being from 1999 to 2005, and after 2010.

Inequality measures for the rural and urban areas and their combined distributions are presented in Table 2. We computed the Gini coefficient, the Pietra index, QSR, I(0) and I(1). The within and between urban and rural components for $I_C(0)$ and $I_C(1)$ are reported in Table 3. Tables 4 and 5 contain poverty measures and pro-poor growth measures, respectively. For poverty measures, the headcount, FGT(1), FGT(2) and Sen indices were computed using a poverty line of $57.8 per month, equivalent to $1.9 per day.

⁵ It may be better to describe the estimators that minimize $GMM_2(x,\phi)$ and $GMM_3(x,\phi)$ as minimum distance estimators rather than GMM estimators because the "moment condition" for $\bar{y}_i$ is plim $\bar{y}_i = \mu_i/k_i$, not $E(\bar{y}_i) = \mu_i/k_i$. The asymptotic distribution is the same, however. See, for example, Greene (2012, chp. 13).

⁶ The version of the data that was used was downloaded on 9 March 2018 at http://iresearch.worldbank.org/PovcalNet/povOnDemand.aspx.

⁷ See McDonald and Xu (1995, p. 139).
Pro-poor growth measures RC, KP and PEGR were computed for the combined distributions; the GICs for each time interval are depicted in Figures 3-8. From the tables and figures, we can make the following observations about China.

1. All inequality measures indicate that inequality increased from 1999 to 2010, and then declined from 2010 to 2013. The recent decline is attributable to a decline in rural inequality; there was an increase in urban inequality in the same period. Also, there is no clear conclusion about how rural inequality changed from 1999 to 2005; the Gini and I(1) suggest a slight decrease, whereas QSR, I(0) and Pietra suggest a slight increase.

2. Inequality is much greater in the combined distribution than in its components, reflecting the large discrepancy in mean incomes between the rural and urban areas. Within-group inequality remains greater than between-group inequality, however.

3. The changes in inequality have been accompanied by large increases in mean income and large decreases in poverty. The decline in poverty was particularly dramatic for rural China, where the headcount ratio declined from 57% in 1999 to 3.7% in 2013. Poverty in rural China is uniformly greater than that in urban China.

4. The GICs show that, from 1999 to 2010, growth favored the rich more than the poor, but from 2010 to 2013, growth strongly favored the poor relative to the rich, a result consistent with the decline in inequality over this period. The scalar measures of pro-poor growth are also consistent with this observation. Growth favored the poor in an absolute sense from 1999 to 2010 (0 < RC < g, 0 < KP < 1, PEGR < g), and in a relative sense after 2010 (RC > g, KP > 1, PEGR > g).
Examining the results for Indonesia, we find the following. The GICs show that growth has favored the rich relative to the poor in all time intervals. From 2005 to 2010 the poor fared very badly; the growth rate for the bottom 15% of the population was negative. This period was also one where the growth in mean incomes was low relative to that in the other two periods. The scalar pro-poor growth measures are in line with the conclusions from the GICs. Growth was absolutely but not relatively pro-poor in the first and third time intervals; in the second interval it was not absolutely pro-poor according to the RC measure, and only slightly absolutely pro-poor using the KP measure.
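The threshold rules applied in the observations above (for example, 0 < RC < g versus RC > g, or 0 < KP < 1 versus KP > 1) can be sketched as a small classifier. This is an illustration only: the function name is hypothetical, a positive benchmark is assumed, the lower bound of zero for PEGR is an assumption not stated in the text, and boundary cases (equality) are assigned to the weaker category by convention.

```python
def classify_pro_poor(measure, benchmark):
    """Classify growth from a scalar pro-poor measure.

    Use benchmark = g (growth rate of mean income) for RC or PEGR,
    and benchmark = 1 for KP. Assumes benchmark > 0.
    """
    if benchmark <= 0:
        raise ValueError("this sketch assumes a positive benchmark")
    if measure > benchmark:
        return "relatively pro-poor"
    if measure > 0:
        return "absolutely but not relatively pro-poor"
    return "not pro-poor"
```

With hypothetical values, `classify_pro_poor(0.6, 1.0)` corresponds to a KP value between 0 and 1, while `classify_pro_poor(-0.01, 0.02)` corresponds to a negative RC when mean income grows at 2%.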

Concluding Remarks
Studying income distributions can provide valuable information about important aspects of a society's welfare such as the degree of inequality, the incidence of poverty, and whether there have been improvements in welfare over time. The GB2 is a popular and versatile distribution well suited to this purpose. We have reviewed some of the common indexes for measuring inequality, poverty and pro-poor growth, and described how values for these indexes can be computed from estimates of the parameters of the GB2 distribution. We have also reviewed optimal techniques for estimating the parameters using either single observations or grouped data. It is our hope that bringing all these results together in a single source will facilitate and promote use of the GB2 distribution.