Testing for Stochastic Dominance up to a Common Relative Poverty Line

Abstract: Although a wide array of stochastic dominance tests exists for poverty measurement and identification, they assume that the income distributions have independent poverty lines or a common absolute (fixed) poverty line. We propose a stochastic dominance test for comparing income distributions up to a common relative poverty line (i.e., some fraction of the pooled median). A Monte Carlo study demonstrates its superior performance over existing methods in terms of power. The test is then applied to Canadian household survey data for illustration.


Introduction
The seminal works of (Sen 1976; Foster et al. 1984; Atkinson 1987) have given rise to a growing body of literature on poverty measurement. While earlier works emphasize the identification of poverty, later works highlight the need for accurate and reliable statistical inference (e.g., Kakwani 1993; Davidson and Duclos 2000; Zheng 2001; Mehdi 2017). From a research as well as a policy standpoint, there is always interest in comparing poverty outcomes between different income distributions. However, as pointed out by (Garcia-Gomez et al. 2019), analysts often face the multiplicity of poverty indices problem. Since poverty measures depend on a poverty line, which is an income threshold dividing the poor and non-poor, distributional orderings are sensitive to the choice of poverty line (i.e., rank reversals could occur when switching poverty lines).
To overcome the multiplicity of poverty indices problem, researchers appeal to stochastic dominance, which is a robust distributional ranking method with wide-ranging applications in multiple fields. Rather than relying on estimates from single points in distributions, stochastic dominance examines the entire support by considering the cumulative distribution function (CDF). In the context of poverty comparisons, stochastic dominance permits partial orderings of income distributions by considering all permissible income thresholds within a pre-specified range of poverty lines. If a dominance relation can be established between distributions, debates surrounding the choice of poverty line can effectively be avoided since one distribution would always exhibit lower poverty regardless of the poverty line (see Atkinson 1987).
Several statistical tests of stochastic dominance have been put forth in the literature (e.g., McFadden 1989; Kaur et al. 1994; Anderson 1996; Davidson and Duclos 2000; Barrett and Donald 2003; Thompson and Stengos 2012; Barrett et al. 2016), but the ones geared towards poverty measurement assume either separate poverty lines for each distribution or a common absolute poverty line between distributions. As has been pointed out in the literature, when we are interested in comparing poverty outcomes between subgroups of the same population (e.g., males and females), setting two separate poverty lines may lead to rather incongruous findings. With the increasing usage of relative poverty measures by international organizations (e.g., OECD 2016), we develop the asymptotic framework for testing for stochastic dominance up to a common relative poverty line (i.e., some fraction of the pooled median income level of two distributions).
(Barrett and Donald 2003) propose a test for stochastic dominance based on a one-sided Kolmogorov-Smirnov type test statistic, which considers the supremum of the distances between the CDFs. Our test, on the other hand, is similar to that of (Davidson and Duclos 2000) in the sense that it relies on evaluating a finite number of distances between the distributions throughout the support. Like the tests of (Anderson 1996; Davidson and Duclos 2000; Thompson and Stengos 2012), our proposed test is inconsistent because it examines the CDFs at only a finite number of points, whereas the tests put forth by (McFadden 1989; Kaur et al. 1994; Barrett and Donald 2003) consider either the supremum or the infimum of the distances (see, e.g., Davidson and Duclos 2000; Thompson and Stengos 2012). However, the advantage offered by the former type of test is that it makes use of the covariances between estimates at the different points of the CDFs, which in theory leads to increased statistical power.
The remainder of this article is organized as follows. Section 2 briefly discusses the notion of stochastic dominance and its relation to poverty measures. Section 3 derives the asymptotic framework of our proposed test for stochastic dominance when the poverty line is some fraction of the pooled median income level. Section 4 presents a Monte Carlo study to assess the size and power of the test. Section 5 illustrates the proposed test using Canadian household survey data. Section 6 provides the conclusions.

Stochastic Dominance and Poverty Measurement
Consider two distributions of income (or some other measure of individual welfare), characterized by CDFs $F_A$ and $F_B$, with support contained in the non-negative real line. Following notation similar to that of (Davidson and Duclos 2000), let

$$D^s_A(x) = \frac{1}{(s-1)!}\int_0^{\infty} (x - y)_+^{s-1}\, dF_A(y),$$

where $s \ge 1$, and $(x - y)_+^{s-1} = (x - y)^{s-1} I(y \le x)$, where $I(\cdot)$ is an indicator function that equals 1 if its argument is true, and 0 otherwise. It is straightforward to check that $D^1_A(x) = F_A(x)$. If a poverty line $z$ is established, an individual with income $y$ is said to be poor if $y \le z$ (this follows from the so-called "focus axiom"; see, e.g., (Foster 1984)). Thus, $F_A(z)$ measures the proportion of individuals in subgroup A below the poverty line (also known as the headcount ratio). Let $D^s_B(x)$ be defined analogously. The $D^1_A$ curve is typically referred to as the poverty incidence curve, $D^2_A$ is the poverty deficit curve, and $D^3_A$ is the poverty severity curve (see, e.g., (Ravallion 1994)). Distribution A is said to stochastically dominate B at order $s$ up to poverty line $z$ if $D^s_A(x) \le D^s_B(x)$ for all $x \in [0, z]$. First-order stochastic dominance (i.e., $s = 1$) guarantees dominance at higher orders (i.e., if $D^1_A(x) \le D^1_B(x)$ for all $x \le z$, then $D^s_A(x) \le D^s_B(x)$ for all $x \le z$ and all $s > 1$; see, e.g., Lemma 1, (Davidson and Duclos 2000)).

The notion of stochastic dominance has broader implications for popular classes of poverty indices such as those proposed by (Foster et al. 1984):

$$P_\gamma = \int_0^{\infty} \left[\frac{z - y}{z}\right]^{\gamma} I(y \le z)\, dF(y),$$

where $\gamma \ge 0$ is a poverty "aversion" parameter. Thus, the class of indices is based on the normalized poverty gap of the poor, $(z - y)/z$, that is, the income shortfall as a share of the poverty line. It is easy to see that, when $\gamma = 0$, the index simply becomes the headcount ratio (i.e., $P_0 = F(z)$), which measures the proportion of the population below the poverty line. An implication of distribution A dominating B at first order up to $z$ is that $P_\gamma$ will not only show lower poverty incidence for distribution A for all poverty lines up to $z$, but also lower poverty deficit, lower poverty severity, etc.

An empirical likelihood-based test has also been proposed for comparing poverty measures between two distributions using a poverty line set to some fraction of the pooled median of the combined distribution, but that method cannot detect stochastic dominance since it permits only equality restrictions on the hypotheses.

Consider a population of size $N = N_A + N_B$ composed of $N_A$ individuals from subgroup A and $N_B$ individuals from subgroup B. Checking for restricted stochastic dominance is tantamount to examining the differences between the distributions at all points leading up to $z$. We follow (Davidson and Duclos 2000) and construct test statistics at equidistant grid points up to $z$.
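The dominance curves and the poverty indices defined in this section can be computed directly from a sample. The following is a minimal sketch, assuming an unweighted sample; the function names `dominance_curve` and `fgt_index`, the income values, and the poverty line are hypothetical and chosen only for illustration.

```python
from math import factorial


def dominance_curve(incomes, x, s=1):
    """Empirical D^s(x): average of (x - y)^(s-1) over y <= x, divided by (s-1)!."""
    total = sum((x - y) ** (s - 1) for y in incomes if y <= x)
    return total / (factorial(s - 1) * len(incomes))


def fgt_index(incomes, z, gamma=0):
    """Empirical P_gamma: average of ((z - y)/z)^gamma over the poor (y <= z)."""
    return sum(((z - y) / z) ** gamma for y in incomes if y <= z) / len(incomes)


incomes = [2.0, 5.0, 8.0, 12.0, 20.0]  # hypothetical incomes
z = 10.0                               # hypothetical poverty line

headcount = dominance_curve(incomes, z, s=1)  # poverty incidence at z
deficit = dominance_curve(incomes, z, s=2)    # poverty deficit at z
print(headcount, fgt_index(incomes, z, gamma=0))
print(deficit / z, fgt_index(incomes, z, gamma=1))
```

The two printed pairs agree because, from the definitions above, $P_\gamma = \gamma!\, D^{\gamma+1}(z)/z^{\gamma}$, which is why dominance in the $D^s$ curves carries over to this class of indices.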
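Let the vector of differences between the two distributions at the $J$ grid points be given by $\Delta^s = (\Delta^s_1, \ldots, \Delta^s_J)'$, where $\Delta^s_j = D^s_B(z_j) - D^s_A(z_j)$ and $z_j$ denotes the $j$th grid point.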

Estimation and Inference
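Let $y_A = (y_{A,1}, \ldots, y_{A,n_A})$ and $y_B = (y_{B,1}, \ldots, y_{B,n_B})$ be random iid draws from $F_A$ and $F_B$, respectively, and let $n = n_A + n_B$ be the pooled sample size. Assume that $n \to \infty$ implies $n_A \to \infty$ and $n_B \to \infty$, and that $n_A/N_A$ and $n_B/N_B$ are sufficiently small so that no finite population adjustment is necessary. At the $j$th grid point, $D^s_A$ can be consistently estimated by its sample analogue, $\hat{D}^s_A(x) = \frac{1}{n_A (s-1)!} \sum_{i=1}^{n_A} (x - y_{A,i})_+^{s-1}$, evaluated at the estimated grid point. Using arguments similar to those of ((Zheng 2001), Section 4.2), an asymptotic expression for the $j$th difference can be obtained. Using the Bahadur representation (see, e.g., (Zheng 2001, p. 351)), the difference between the sample quantile and the population quantile of the pooled distribution can be expressed, up to a remainder of order $o_p(n^{-1/2})$, as $[q - \hat{F}(\xi_q)]/f(\xi_q)$, where $\xi_q$ is the population quantile, $\hat{F}$ is the pooled empirical CDF, and $f$ is the pooled density. For instance, if the poverty line is set to 50% ($c = 0.5$) of the pooled median ($q = 0.5$), then some possible grid points could be 10%, 20%, 30%, 40%, and 50% of the pooled median.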
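To make the estimation step concrete for the first-order case, the sketch below computes the common relative poverty line from the pooled sample median, builds equidistant grid points up to that line, and evaluates the estimated differences between the empirical CDFs of B and A. The function names and the two toy samples are hypothetical; only the median ($q = 0.5$) is handled, and sampling weights are ignored.

```python
from statistics import median


def ecdf(sample, x):
    """Empirical CDF of `sample` evaluated at x."""
    return sum(1 for y in sample if y <= x) / len(sample)


def delta_hat(y_a, y_b, c=0.5, n_grid=5):
    """Return the poverty line, the grid, and the first-order differences F_B - F_A."""
    z_hat = c * median(y_a + y_b)  # common relative poverty line
    grid = [z_hat * j / n_grid for j in range(1, n_grid + 1)]
    diffs = [ecdf(y_b, x) - ecdf(y_a, x) for x in grid]
    return z_hat, grid, diffs


# Hypothetical toy samples for the two subgroups.
y_a = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
y_b = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]

z_hat, grid, diffs = delta_hat(y_a, y_b)
print(z_hat)
print(diffs)
```

Because the poverty line is recomputed from the pooled sample, every grid point is itself a random quantity, which is the source of the additional sampling variability that the asymptotic framework has to account for.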
Let the joint population moments of order $2s - 2$ of $y_A$ and $y_B$ be finite and suppose that $F_A$ and $F_B$ are differentiable. Then, $\sqrt{n}(\hat{\Delta}^s - \Delta^s)$ will converge in distribution to a normal random vector with mean vector zero and covariance matrix $\Sigma$. In practice, $\Sigma$ can be consistently estimated by $\hat{\Sigma}$, which will have a typical element of the form $\cdots + n_B \hat{F}_B(y_{(r)})[1 - \hat{F}_B(y_{(r)})]]/[n \hat{f}^2(y_{(r)})]$, $\forall j, k$, where $\hat{f}_A$ is the estimated underlying density function of $F_A$. The estimates for distribution B are just the analogues of those for A.
To test the null hypothesis that distribution A stochastically dominates B at order $s$, we consider $H_0: \Delta^s \ge 0$, where the inequality holds element by element. The alternative hypothesis is simply the negation of $H_0$. Since we are testing multiple inequality restrictions, the relevant statistical inference methods can be found in (Kodde and Palm 1986). (Davidson and Duclos 2000) also use this framework for their hypothesis tests. First, we compute the Wald-type test statistic

$$W = \min_{\Delta \ge 0}\; n(\hat{\Delta}^s - \Delta)' \hat{\Sigma}^{-1} (\hat{\Delta}^s - \Delta),$$

where the right-hand side is a quadratic programming problem. Under the null, $W$ will converge in distribution to a mixture of $\chi^2$ distributions.
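For a small number of grid points, the quadratic programming problem behind the Wald-type statistic can be solved with very simple tools. The sketch below minimizes the quadratic form over the non-negative orthant by cyclic coordinate descent; this solver is an illustrative choice made here, not necessarily the one used by the author, and the function names and inputs are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def wald_statistic(delta_hat, sigma_hat, n, sweeps=500):
    """min over x >= 0 of n * (delta_hat - x)' inv(sigma_hat) (delta_hat - x)."""
    a = invert(sigma_hat)
    k = len(delta_hat)
    x = [max(d, 0.0) for d in delta_hat]  # feasible starting point
    for _ in range(sweeps):
        for i in range(k):
            # Minimize exactly over coordinate i, then clip at zero.
            off = sum(a[i][l] * (delta_hat[l] - x[l]) for l in range(k) if l != i)
            x[i] = max(0.0, delta_hat[i] + off / a[i][i])
    e = [d - xi for d, xi in zip(delta_hat, x)]
    return n * sum(e[i] * a[i][l] * e[l] for i in range(k) for l in range(k))


# Hypothetical inputs: estimated differences and their covariance matrix.
w_null = wald_statistic([0.1, 0.2], [[1.0, 0.0], [0.0, 1.0]], n=100)
w_corr = wald_statistic([-1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]], n=1)
print(w_null, w_corr)
```

When every estimated difference is non-negative, the minimizer coincides with the estimate and the statistic is zero; negative components contribute according to the metric defined by the inverse covariance matrix, which is where the covariances between grid points enter.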
Obtaining critical values is not a straightforward process, so we follow (Davidson and Duclos 2000) and advocate the use of the bootstrap. The procedure is as follows. Given samples $y_A$ and $y_B$, we pool them to obtain $y = (y_A, y_B)$. Then, the bootstrap samples $y^*_A$ and $y^*_B$ are generated by resampling $n_A$ and $n_B$ observations (with replacement) from $y$. Next, using the bootstrap samples, we compute the bootstrap test statistic $W^*$ in the same manner as $W$. After repeating this process a large number of times, the bootstrap p-value is the proportion of times that $W^*$ exceeds $W$. A p-value less than the nominal size of the test should lead to the rejection of $H_0$.
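The pooled resampling scheme described above can be sketched as follows. To keep the example short, the statistic used here is a deliberately simplified stand-in for $W$ (the squared negative part of the CDF difference at the poverty line, scaled by $n$), not the full Wald-type statistic; the function names and toy samples are hypothetical.

```python
import random
from statistics import median


def ecdf(sample, x):
    """Empirical CDF of `sample` evaluated at x."""
    return sum(1 for y in sample if y <= x) / len(sample)


def statistic(y_a, y_b, c=0.5):
    """Simplified stand-in for W, evaluated at the relative poverty line only."""
    z_hat = c * median(y_a + y_b)
    diff = ecdf(y_b, z_hat) - ecdf(y_a, z_hat)
    return (len(y_a) + len(y_b)) * min(diff, 0.0) ** 2


def bootstrap_p_value(y_a, y_b, reps=199, seed=0):
    """Share of bootstrap statistics that exceed the observed statistic."""
    rng = random.Random(seed)
    pooled = y_a + y_b
    w = statistic(y_a, y_b)
    exceed = 0
    for _ in range(reps):
        star_a = rng.choices(pooled, k=len(y_a))  # resample n_A from the pool
        star_b = rng.choices(pooled, k=len(y_b))  # resample n_B from the pool
        if statistic(star_a, star_b) > w:
            exceed += 1
    return exceed / reps


# Hypothetical extreme case: everyone in A is poor and nobody in B is.
p_reject = bootstrap_p_value([1.0] * 50, [10.0] * 50)
# Hypothetical case with identical samples, so the null holds with equality.
same = [float(v) for v in range(1, 51)]
p_equal = bootstrap_p_value(same, list(same))
print(p_reject, p_equal)
```

Resampling both groups from the pooled sample imposes equality of the two distributions, which is the boundary of the null hypothesis of dominance.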
Failure to infer dominance at order $s$ by either distribution may imply that there exists some critical poverty line, $z_s < z$, where the distributions cross and thus a rank reversal occurs. If such a threshold exists, it can be implicitly characterized by $D^s_A(z_s) = D^s_B(z_s)$. (Davidson and Duclos 2000) showed in Theorem 3 that, in that case, $\sqrt{n_B}(\hat{z}_s - z_s)$ will be asymptotically normal with mean zero and an asymptotic variance built from terms such as $\mathrm{var}((z_s - y_A)_+^{s-1})$, which, of course, can be estimated by simply replacing the terms with their respective empirical analogues.
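As a rough illustration of the idea of a critical poverty line in the first-order case, the sketch below scans the pooled observations below the poverty line and reports the first point at which the sign of the difference between the two empirical CDFs flips. This is a crude empirical locator written for this example, not the estimator analyzed in Theorem 3 of (Davidson and Duclos 2000), and all names and data are hypothetical.

```python
def ecdf(sample, x):
    """Empirical CDF of `sample` evaluated at x."""
    return sum(1 for y in sample if y <= x) / len(sample)


def first_crossing(y_a, y_b, z):
    """Smallest pooled observation <= z where sign(F_A - F_B) changes, else None."""
    points = sorted(v for v in set(y_a + y_b) if v <= z)
    prev_sign = 0
    for x in points:
        diff = ecdf(y_a, x) - ecdf(y_b, x)
        sign = (diff > 0) - (diff < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                return x
            prev_sign = sign
    return None


# Hypothetical samples whose empirical CDFs cross below z = 8.
crossing = first_crossing([1, 2, 9, 10, 11], [3, 4, 5, 6, 12], z=8)
# Hypothetical samples with no crossing below z = 10.
no_crossing = first_crossing([1, 2, 3], [4, 5, 6], z=10)
print(crossing, no_crossing)
```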

Simulation Evidence
We now assess the size and power of our proposed test using a series of Monte Carlo experiments. The experiments were carried out using 10,000 independent trials, sample sizes of $n_A = n_B = n/2$, and nominal size set equal to 5%. We consider tests of first-order stochastic dominance (i.e., $s = 1$) for which a A j = 0. The poverty line is set equal to 50% of the pooled median (i.e., $c = q = 0.5$). Five different parametric distributions are considered in assessing the size of the test: gamma, Singh-Maddala, log-normal, unit exponential, and uniform. The CDF of the gamma distribution is given by $F(y) = \gamma(a_2, y/a_1)/\Gamma(a_2)$, where $a_1$ is a scale parameter, $a_2$ is a shape parameter, $\gamma(\cdot, \cdot)$ is the lower incomplete gamma function, and $\Gamma(\cdot)$ is the gamma function. The CDF of the Singh-Maddala distribution is given by $F(y) = 1 - (1 + b_1 y^{b_2})^{-b_3}$, where $b_1$ is a scale parameter, and $b_2$ and $b_3$ are shape parameters. Following (McDonald 1984), we set $a_2 = 2.1557$ for the gamma distribution, and $b_2 = 1.697$ and $b_3 = 8.368$ for the Singh-Maddala distribution, which were used to simulate the 1980 U.S. income distribution. The scale parameters for both distributions are set to unity. For the log-normal distribution, the mean and standard deviation parameters are set to 2.9372 and 0.7797, respectively, which were also used by (McDonald 1984) to simulate the 1980 U.S. income distribution. For the uniform distribution, we follow (Zheng 2001) and specify the support as the unit interval [0, 1]. We consider five grid points set to 10%, 20%, 30%, 40%, and 50% of the pooled median. To assess the size of the test, we generate observations for subgroups A and B from the same distribution. We test the null hypothesis that $\Delta^1 \ge 0$, which is (weakly) true in this case. We consider pooled sample sizes varying from $n = 100$ to $n = 1000$ and utilize 199 bootstrap replications. Table 1 reports the rejection frequencies.
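The five data-generating processes can be simulated with the standard library alone. The sketch below is one possible way to draw the samples, assuming that the two log-normal parameters are the mean and standard deviation on the log scale; that reading, the function names, the seed, and the sample size of 500 are assumptions made for this illustration. The Singh-Maddala draws use the inverse of the CDF given above.

```python
import random


def singh_maddala_cdf(y, b1=1.0, b2=1.697, b3=8.368):
    """Singh-Maddala CDF: 1 - (1 + b1 * y^b2)^(-b3)."""
    return 1.0 - (1.0 + b1 * y ** b2) ** (-b3)


def singh_maddala_quantile(u, b1=1.0, b2=1.697, b3=8.368):
    """Inverse of the Singh-Maddala CDF, used for inverse-transform sampling."""
    return (((1.0 - u) ** (-1.0 / b3) - 1.0) / b1) ** (1.0 / b2)


def draw_sample(name, size, rng):
    """Draw `size` observations from one of the five distributions."""
    if name == "gamma":
        return [rng.gammavariate(2.1557, 1.0) for _ in range(size)]  # shape, scale
    if name == "singh-maddala":
        return [singh_maddala_quantile(rng.random()) for _ in range(size)]
    if name == "log-normal":
        # Assumption: 2.9372 and 0.7797 are log-scale parameters.
        return [rng.lognormvariate(2.9372, 0.7797) for _ in range(size)]
    if name == "exponential":
        return [rng.expovariate(1.0) for _ in range(size)]
    if name == "uniform":
        return [rng.random() for _ in range(size)]
    raise ValueError(f"unknown distribution: {name}")


names = ["gamma", "singh-maddala", "log-normal", "exponential", "uniform"]
rng = random.Random(12345)
samples = {name: draw_sample(name, 500, rng) for name in names}
print({name: round(min(values), 4) for name, values in samples.items()})
```

In a size experiment, two such samples drawn from the same distribution would play the roles of subgroups A and B in each trial.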
Overall, we can conclude that a combined sample size of 1000 observations should be sufficient for achieving asymptotic normality. This is not a very demanding requirement, as typical household survey datasets tend to have thousands of observations.
We focus exclusively on the gamma distribution in assessing the power of the test and consider the tests of (Davidson and Duclos 2000; Barrett and Donald 2003) as benchmarks (see footnote 5). The shape and scale parameters for subgroup A remain set to their original levels from the size simulation. For subgroup B, we vary the parameters such that the CDF of B lies below the CDF of A for all points up to the poverty line. Rejection frequencies based on 199 bootstrap replications are reported in Table 2. The test exhibits excellent power properties, as evidenced by the fact that it outperforms the test of (Barrett and Donald 2003) regardless of sample size. A direct comparison of our test with that of (Davidson and Duclos 2000) cannot really be made since their test is based on the assumption of either an absolute poverty line or poverty lines relative to each distribution (not pooled). Nonetheless, the two tests are similar in terms of power. Note that, in order to enable a somewhat fair comparison, the shape and scale parameters of the two distributions were chosen such that the medians remain the same for both distributions (thus, the pooled median is just the median of either subgroup A or subgroup B). This reduces sampling variability for the test of (Davidson and Duclos 2000) and permits a more valid comparison. The advantage offered by our proposed test is that it accounts for the sampling variance of the common poverty line, which depends on both distributions, while the other two tests do not.
Note: The nominal size of the test is 5% and $\Delta^1_j$ is the $j$th difference between the CDFs of subgroups B and A with grid points set to 10% ($j = 1$), 20% ($j = 2$), 30% ($j = 3$), 40% ($j = 4$), and 50% ($j = 5$) of the pooled median. Both distributions are generated from the gamma distribution. The scale parameter of distribution A is set to unity and its shape parameter is set to 2.1557. For distribution B, the scale is 0.651 and the shape is 3.143 for the first row, while, for the second row, the scale is 0.421 and the shape is 4.688. DD denotes the test of (Davidson and Duclos 2000), and BD denotes the bootstrap test of (Barrett and Donald 2003).
5 For the case of a common relative poverty line, the test statistic of (Barrett and Donald 2003) is based on the supremum of the distances between the censored CDFs, where censoring sets all income values above the poverty line equal to the poverty line.
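The statement that the gamma parameters in the power experiment were chosen to keep the medians equal can be checked by simulation. The sketch below draws from the three parameterizations reported in the note to Table 2 and compares sample medians; the seed, the number of draws, and the variable names are hypothetical choices made for this illustration.

```python
import random
from statistics import median

rng = random.Random(2020)

# (shape, scale) pairs taken from the note to Table 2.
params = {
    "A": (2.1557, 1.0),
    "B, first row": (3.143, 0.651),
    "B, second row": (4.688, 0.421),
}

# Sample median of 20,000 gamma draws for each parameterization.
medians = {
    label: median([rng.gammavariate(shape, scale) for _ in range(20000)])
    for label, (shape, scale) in params.items()
}

for label, value in medians.items():
    print(label, round(value, 3))
```

With equal medians, the pooled median coincides with the median of each subgroup, so the common relative poverty line does not mechanically favor either distribution in the comparison.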

Illustration
In this section, we provide a simple example of how our proposed test can be applied in a real-world scenario. Using data from the 2017 Canadian Income Survey, we compare poverty outcomes among men and women. One of the ways Canada's national statistical agency, Statistics Canada, measures "low income" is through its low income measure (LIM), which sets the low income line equal to 50% of the median adjusted household income (household income is divided by the square root of the household size). 6 The pooled median adjusted household income is determined to be $46,461 implying a poverty line of $23,230.50.
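The arithmetic behind the low income line can be written out in a few lines. This is a minimal sketch of the adjustment and threshold described above; the function names and the example household are hypothetical, and survey weights are ignored.

```python
from math import sqrt


def adjusted_income(household_income, household_size):
    """Household income divided by the square root of household size."""
    return household_income / sqrt(household_size)


def relative_line(pooled_median, fraction=0.5):
    """Poverty line set to a fraction of the pooled median."""
    return fraction * pooled_median


# Hypothetical household of four with an income of $80,000.
example = adjusted_income(80000, 4)

# Low income line implied by the pooled median reported in the text.
line = relative_line(46461)
print(example, line)
```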
The benefit of using scalar poverty indices such as the low income measure is that it allows policy makers to monitor and assess trends across socio-economic groups, regions, and time. The downside is that robust comparisons are not assured since distributional comparisons are not made at points below the low income line. Since our proposed test assumes a common pooled relative poverty line such as Statistics Canada's low income line, we can make robust poverty comparisons between any subgroups of the population while still maintaining that common relative poverty line.
To illustrate our test, we compare poverty outcomes among Canadian men and women by testing for first-order stochastic dominance up to the low income line from the 2017 Canadian Income Survey. The sample consists of 47,800 men and 49,388 women. We consider five grid points set at 10%, 20%, 30%, 40%, and 50% of the median adjusted household income. The headcount ratios at the different points, along with the median and standard deviation (SD) of adjusted household income for men and women, are reported in Table 3. In testing the null that the male income distribution stochastically dominates the female income distribution up to the low income line, we obtain a bootstrap p-value of exactly zero, leading to the rejection of the null. We also obtain a p-value of zero in testing the reverse hypothesis. This suggests that the CDFs of the two distributions cross at least once at some point below the low income line, which leads to the ambiguous conclusion that no dominance could be detected up to the low income line.
Note: $D^1_j$ denotes the headcount ratios at grid points set to 10% ($j = 1$), 20% ($j = 2$), 30% ($j = 3$), 40% ($j = 4$), and 50% ($j = 5$) of the pooled sample median ($46,461).
From Table 3, observe that, according to the low income line, the poverty rate is 11.3% for men and 13.2% for women (note that our estimates vary slightly from the official estimates because we treat negative incomes as zero and ignore the complex sampling scheme of the survey to simplify this exposition). Notice that, at the first three grid points, the poverty rates are lower for women. The trend reverses at the higher grid points, which suggests that the CDFs cross, negating the possibility of first-order dominance by either subgroup.
However, can we make any inference regarding the critical poverty line where the rank reversal occurs? In other words, we want to determine the threshold level of income up to which the women's distribution dominates the men's. We find first-order stochastic dominance of the women's income distribution over the men's up to an adjusted household income level of $15,058, which is 32% of the median. The 95% confidence interval for the critical threshold in our case is $12,731.56 to $17,384.44, or 27% to 37% of the median.
6 The low-income cut-offs (LICOs) and market basket measures (MBMs) are two of the other complementary ways Statistics Canada measures and monitors low income and poverty. Unlike the LIM, the LICO and MBM are absolute poverty lines that vary by region and attempt to account for cost of living. As of 2018, the MBM became Canada's official poverty line. However, the LIM continues to be the most commonly used measure for international poverty comparisons. For detailed information regarding the different measures, see (Statistics Canada 2016).

Conclusions
In this article, we proposed a test for stochastic dominance up to a common relative poverty line. Many of the existing tests of stochastic dominance, in the context of poverty measures, assume either separate poverty lines for each distribution or a common absolute poverty line. It is increasingly the case that relative poverty lines (e.g., 50% of the median income level) are being used in cross-country and group comparison studies.
A series of Monte Carlo experiments validates our asymptotic framework. The proposed test exhibits good size and power properties under varying conditions and, in our simulations, outperforms the test of (Barrett and Donald 2003), which we attribute to the fact that the test utilizes the underlying covariances between estimates at the different points of the distributions. A sample size of 1000 observations appears to be sufficient for achieving asymptotic normality, which is not a very demanding requirement as household surveys tend to have thousands of observations. For illustration purposes, household income data from the 2017 Canadian Income Survey were used to rank poverty outcomes among men and women using the Canadian national statistical agency's low income line.
Funding: This research received no external funding.

Conflicts of Interest:
The author declares no conflict of interest.