A Note on Identification of Bivariate Copulas for Discrete Count Data

Copulas have enjoyed increased usage in many areas of econometrics, including applications with discrete outcomes. However, Genest and Nešlehová (2007) present evidence that copulas for discrete outcomes are not identified, particularly when those discrete outcomes follow count distributions. This paper confirms the Genest and Nešlehová result using a series of simulation exercises. The paper then proceeds to show that those identification concerns diminish if the model has a regression structure such that the exogenous variable(s) generates additional variation in the outcomes and thus more completely covers the outcome domain.


Introduction
The copula approach for constructing joint distributions has gained popularity in recent years in applied econometric studies, including models with discrete outcomes (Van Ophen (1999) [1]; Cameron et al. (2004) [2]; Zimmer and Trivedi (2006) [3]; Bien et al. (2011) [4]; Winkelmann (2012) [5]). While copula researchers have long understood that a multivariate discrete distribution does not possess a unique copula representation (Marshall (1996) [6]), recent research also indicates that any copula applied to discrete data is not identified. The lack of identification of the copula in a model for discrete data, as explained by Genest and Nešlehová (2007) [7], arises when one of the marginal distributions is discontinuous. Although Genest and Nešlehová present findings for other discontinuous settings, this paper focuses on their main emphasis: count outcomes.
We derive motivation from research in the areas of health economics and demography, where, due to count outcomes having small means, the empirical support present in the data is far smaller than the theoretically infinite support of count outcomes. For example, the widely-used Medical Expenditure Panel Survey, published by a unit of the U.S. Department of Health and Human Services, asks respondents their number of hospital discharges in a calendar year. Not surprisingly, because most respondents report zero hospital discharges, the mean number of annual discharges is small (e.g., 0.085 discharges in the 2014 wave of the survey). Reflecting our health economic motivation, the remainder of this paper emphasizes low-mean settings.
This paper shows that the identification problem appears to shrink when the count outcomes more completely cover the outcome domain. We present two ways in which this might occur. First, coverage of the domain improves as the means of the outcome variables become larger. Second, coverage of the domain also improves if the marginal distributions have regression structures, as the addition of covariates changes marginal distributions to conditional distributions.

Background on Bivariate Copulas
A bivariate copula is a two-dimensional cumulative distribution function (cdf) with uniform margins on $[0, 1]$ and support contained in $[0, 1]^2$. For detailed treatments of copulas, see Joe (1997) [8]; McNeil et al. (2005) [9]; Nelsen (2006) [10]; Trivedi and Zimmer (2007) [11]. The practical usefulness of copulas follows from Sklar's (1959) theorem [12], which holds that the copula parameterizes a multivariate distribution in terms of its marginals. Thus, for random variables $y_1$ and $y_2$ with respective marginal distributions $F_1(y_1)$ and $F_2(y_2)$, the bivariate distribution $F(y_1, y_2)$ can be expressed as
$$F(y_1, y_2) = C(F_1(y_1), F_2(y_2); \theta), \tag{1}$$
where, throughout this paper, the copula function $C$ is assumed to be indexed by a scalar-valued dependence parameter $\theta$. Equation (1) provides a fairly general approach to modeling complex joint distributions. By plugging the known marginal distributions $(F_1, F_2)$ into a copula function, the right hand side of Equation (1) provides a parametric representation of the unknown, or difficult to work with, joint distribution on the left hand side. Results in this paper rely on three commonly-employed copulas: the Gaussian, Clayton, and Gumbel copulas.
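For reference, the textbook one-parameter forms of these three copulas can be written as follows. This is a sketch under assumptions: the forms and parameter ranges are the usual ones from the copula literature rather than a transcription of this paper's displays, $F_j$ abbreviates $F_j(y_j)$, and $u_j = -\ln F_j(y_j)$ is used in the Gumbel case only.

```latex
\begin{aligned}
\text{Gaussian:}\quad & C(F_1, F_2; \theta) = \Phi_G\!\left(\Phi^{-1}(F_1), \Phi^{-1}(F_2); \theta\right), && -1 < \theta < 1, \\
\text{Clayton:}\quad  & C(F_1, F_2; \theta) = \left(F_1^{-\theta} + F_2^{-\theta} - 1\right)^{-1/\theta}, && \theta > 0, \\
\text{Gumbel:}\quad   & C(F_1, F_2; \theta) = \exp\!\left(-\left(u_1^{\theta} + u_2^{\theta}\right)^{1/\theta}\right), && \theta \geq 1.
\end{aligned}
```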
In this notation, the symbol $\Phi$ represents the cdf of the standard normal distribution, $\Phi_G(\cdot, \cdot)$ is the standard bivariate normal distribution with Pearson correlation $\theta$, and $u_j = -\ln F_j(y_j)$. The Gaussian copula has a symmetric shape, owing to its reliance on the normal distribution. The Clayton and Gumbel copulas, by contrast, are symmetric in their arguments, but asymmetric in their tail dependence patterns, with Clayton dependence stronger in the lower tail, and Gumbel dependence concentrated in the upper tail. Because magnitudes of dependence parameters are not comparable across copulas, it is standard to convert those to measures of concordance, such as Kendall's $\tau$.
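The conversion between Kendall's $\tau$ and the dependence parameter has a closed form for each of the three copulas. The sketch below assumes the standard relations from the copula literature (Gaussian: $\tau = (2/\pi)\arcsin\theta$; Clayton: $\tau = \theta/(\theta + 2)$; Gumbel: $\tau = 1 - 1/\theta$). The function names are illustrative and do not come from the paper.

```python
import math

def theta_from_tau(copula, tau):
    """Map Kendall's tau (0 < tau < 1) to the dependence parameter theta."""
    if copula == "gaussian":
        return math.sin(math.pi * tau / 2.0)   # inverts tau = (2 / pi) * arcsin(theta)
    if copula == "clayton":
        return 2.0 * tau / (1.0 - tau)         # inverts tau = theta / (theta + 2)
    if copula == "gumbel":
        return 1.0 / (1.0 - tau)               # inverts tau = 1 - 1 / theta
    raise ValueError(f"unknown copula: {copula}")

def tau_from_theta(copula, theta):
    """Inverse mapping, from theta back to Kendall's tau."""
    if copula == "gaussian":
        return 2.0 / math.pi * math.asin(theta)
    if copula == "clayton":
        return theta / (theta + 2.0)
    if copula == "gumbel":
        return 1.0 - 1.0 / theta
    raise ValueError(f"unknown copula: {copula}")

# Dependence parameters implied by the three tau values used in the paper.
for tau in (0.25, 0.50, 0.75):
    row = {name: round(theta_from_tau(name, tau), 4)
           for name in ("gaussian", "clayton", "gumbel")}
    print(tau, row)
```

The same value of $\tau$ maps to very different values of $\theta$ across copulas, which is why results are more naturally compared on the $\tau$ scale.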
With the focus of this paper being count outcomes, the marginals $(F_1, F_2)$ both follow Poisson distributions, a common distributional choice in applied econometric work. (Another common choice is the closely-related negative binomial distribution, which is a Poisson with exchangeable iid heterogeneity. Due to the exchangeable iid nature of that heterogeneity, the main message of this paper also applies to negative binomial marginals.)
A number of approaches to estimating copulas appear in the literature. In fully parametric settings, such as those considered in this paper, one may maximize the full likelihood function, or first maximize the marginal likelihoods and then treat the estimated marginals as given while maximizing the likelihood for $\theta$ (Joe (2005) [13]). Genest et al. (1995) [14], Shih and Louis (1995) [15], and Kim et al. (2007) [16] advocate a two-step approach in which the marginals are estimated nonparametrically using empirical distributions. McNeil et al. (2005) [9] (Chapter 5) discuss an approach that involves first calculating Kendall's $\tau$ and then converting it to $\theta$. This paper opts for the aforementioned full maximum likelihood approach based on the probability mass function (pmf) version of the copula, which can be computed so long as the researcher knows (or assumes) specific forms for the marginal distributions and copula. The pmf is calculated as
$$c(y_1, y_2; \theta) = C(F_1(y_1), F_2(y_2); \theta) - C(F_1(y_1 - 1), F_2(y_2); \theta) - C(F_1(y_1), F_2(y_2 - 1); \theta) + C(F_1(y_1 - 1), F_2(y_2 - 1); \theta). \tag{2}$$
Taking the natural logarithm of expression (2) and summing over all observations then gives the log likelihood function.
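The pmf version of a copula for counts is obtained by differencing the copula cdf over the unit square that ends at each observed pair, and the log likelihood sums the logs of those probabilities. A minimal sketch follows, assuming Poisson marginals with known means and a Clayton copula in its standard form. The function names and the small data set are hypothetical. The Gaussian copula would additionally need a bivariate normal cdf routine, which is omitted here.

```python
import math

def poisson_cdf(y, mu):
    """Pr[Y <= y] for Y ~ Poisson(mu); zero for y < 0."""
    if y < 0:
        return 0.0
    term = total = math.exp(-mu)
    for k in range(1, y + 1):
        term *= mu / k
        total += term
    return min(total, 1.0)

def clayton_cdf(u1, u2, theta):
    """Clayton copula C(u1, u2; theta) for theta > 0; zero on the lower boundary."""
    if u1 <= 0.0 or u2 <= 0.0:
        return 0.0
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)

def copula_pmf(y1, y2, mu1, mu2, theta):
    """Pr[Y1 = y1, Y2 = y2], by rectangle differencing of the copula cdf."""
    def C(a, b):
        return clayton_cdf(poisson_cdf(a, mu1), poisson_cdf(b, mu2), theta)
    return C(y1, y2) - C(y1 - 1, y2) - C(y1, y2 - 1) + C(y1 - 1, y2 - 1)

def log_likelihood(data, mu1, mu2, theta):
    """Sum of log pmf values; the floor guards against log(0) from rounding."""
    return sum(math.log(max(copula_pmf(y1, y2, mu1, mu2, theta), 1e-300))
               for y1, y2 in data)

# Hypothetical toy data: pairs of counts.
data = [(0, 0), (0, 1), (1, 0), (2, 1), (0, 0)]
print(log_likelihood(data, mu1=0.5, mu2=0.5, theta=2.0))
```

In full maximum likelihood the marginal means would be estimated jointly with $\theta$. They are held fixed here only to keep the sketch short.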

Drawbacks of Copulas for Discrete Outcomes
If the margins $(F_1, F_2)$ are continuous, then the corresponding copula in Equation (1) is unique. If $(F_1, F_2)$ are not both continuous, the joint distribution function can always be expressed as in Equation (1), although in such a case the copula lacks uniqueness (see Schweizer and Sklar (1983) [17] (Chapter 6)). This usually does not pose a problem in applied settings, as researchers use copulas because the joint distribution $F(y_1, y_2)$ is either not known or is difficult to work with. Genest and Nešlehová (2007) [7] state "The fact that there exist (infinitely many) copulas for the same discrete joint distribution does not invalidate models of this sort." A much more serious problem is that estimates of the dependence parameter $\theta$ are biased when either $F_1$ or $F_2$ is noncontinuous. Consider two variables $(y_1, y_2)$ that arise from copula $C(\cdot, \cdot; \theta)$. Each observation $(y_{1i}, y_{2i})$, where $i$ indexes observations, can be viewed as arising from a latent pair $(u_{1i}, u_{2i})$, where $y_{1i} = F_1^{-1}(u_{1i})$, $y_{2i} = F_2^{-1}(u_{2i})$, and $(u_1, u_2)$ is a random sample from the copula. When $F_1$ and $F_2$ are continuous, Genest and Nešlehová (2007) [7] show that estimates of dependence are identical for both $(y_1, y_2)$ and $(u_1, u_2)$. Thus, an unbiased estimate of the dependence parameter $\theta$ can be obtained.
However, when $F_1$ or $F_2$ is discontinuous, the marginal distributions have jumps that cause the inverses $F_1^{-1}$ and $F_2^{-1}$ to have plateaus. Genest and Nešlehová (2007) [7] show that those plateaus potentially lead to biased estimates of $\theta$. To illustrate, we borrow from their Definition 1 and Example 1 (pp. 477-479). First, Sklar's Theorem asserts that, when $F_1$ and $F_2$ are continuous, the functions introduced in that definition are the same, which is one of the important foundations of copula inference (Genest and Favre (2007) [18]). The notation $u_{j\leftarrow}$ indicates the limit of $u_j$ as it approaches from above. But if $F_1$ or $F_2$ is discontinuous, then transformations that lead to a unique copula in the continuous case now lead to different objects, some of which are copulas, and some of which are not.
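The plateaus in the inverse of a discrete cdf are easy to see numerically. The sketch below assumes a Poisson marginal with mean 0.085, the Medical Expenditure Panel Survey figure quoted in the Introduction, and evaluates the quantile function on an evenly spaced grid over the unit interval. The function name and the grid are illustrative choices.

```python
import math

def poisson_quantile(u, mu):
    """Smallest count y with Pr[Y <= y] >= u, for 0 <= u < 1 (search capped at 1000)."""
    y = 0
    term = cdf = math.exp(-mu)
    while cdf < u and y < 1000:
        y += 1
        term *= mu / y
        cdf += term
    return y

mu = 0.085
grid = [i / 1000 for i in range(1, 1000)]          # u = 0.001, ..., 0.999
counts = [poisson_quantile(u, mu) for u in grid]

# Share of the unit interval that the quantile function maps to a count of zero;
# in theory this equals Pr[Y = 0] = exp(-mu).
share_zero = counts.count(0) / len(counts)
print(share_zero, math.exp(-mu))
print(sorted(set(counts)))                          # the few distinct counts reached
```

With a mean this small, most of the unit interval collapses onto the single value zero, so very different latent pairs $(u_1, u_2)$ produce the same observed pair of counts.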
Various methods have been proposed to accommodate discrete margins, including Bayesian data augmentation (Smith and Khaled (2012) [19]) and continuous extensions of discrete variables (Denuit and Lambert (2005) [20]).The remainder of this paper illustrates that, in count data settings, the identification problem diminishes if the count outcomes more completely cover the outcome domain, such as when means increase or the model has a regression structure.

"Ties" in Count Variables
For count variables, one way to think about the identification problem is in terms of "ties", where multiple observations of an outcome measure assume the same value (Li et al. (2016) [21]; Pappadà et al. (2016) [22]). Naturally, a count outcome with many ties also tends to have poor coverage of the outcome domain. Denuit and Lambert (2005) [20] provide the formula for the probability of a tie for arbitrary discrete marginals. In that formula, $y_{j,k}$ denotes an observation other than $y_{j,i}$. Re-expressed for count outcomes, the probability that any two independent observations are tied is given by Equation (3). For simplicity, assume that $y_1$ and $y_2$ share the same mean $\mu$. Table 1 evaluates this formula for the three aforementioned copulas, each with dependence set to $\tau$ = 0.25, 0.50, or 0.75, and each with Poisson marginals. (Applying the formula requires replacing the infinities with large finite numbers.) Keeping an eye toward our health economics motivation, the table intentionally focuses on small values for $\mu$. As highlighted by Denuit and Lambert (2005) [20], the probabilities of ties appear to diminish as the means of $y_1$ and $y_2$ increase. And because the partition of the unit interval induced by the quantile functions becomes finer as $\mu$ increases, the lack of identification of $\theta$ likewise should diminish as $\mu$ increases.
The Monte Carlo experiments proceed as follows.
• Step 1: Randomly draw simulated Poisson variates $y_1$ and $y_2$ with means $\mu_1$ and $\mu_2$ from the three aforementioned copulas, each with dependence set to $\tau$ = 0.25, 0.50, or 0.75. The experiments consider sample sizes of N = 100 and N = 2500.
• Step 2: Estimate the copulas using the log likelihood function generated from Equation (2).
• Step 3: Replicate steps 1 and 2 1000 times, and report the mean and standard deviation of θ.
The experiments are then repeated several times after increasing the means, all the while focusing on small-mean settings, in keeping with our health economics motivation.
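The three steps above can be sketched for a single design as follows. This is a simplified illustration rather than the paper's procedure: it uses only the Clayton copula (sampled by conditional inversion), holds the Poisson means at their true values instead of estimating them, replaces numerical optimization with a grid search over $\theta$, and runs far fewer than 1000 replications. All function names, the grid, and the seed are hypothetical.

```python
import math
import random
from collections import Counter

def poisson_cdf(y, mu):
    """Pr[Y <= y] for Y ~ Poisson(mu); zero for y < 0."""
    if y < 0:
        return 0.0
    term = total = math.exp(-mu)
    for k in range(1, y + 1):
        term *= mu / k
        total += term
    return min(total, 1.0)

def poisson_quantile(u, mu):
    """Smallest count y with Pr[Y <= y] >= u (search capped at 1000)."""
    y = 0
    while poisson_cdf(y, mu) < u and y < 1000:
        y += 1
    return y

def clayton_cdf(u1, u2, theta):
    if u1 <= 0.0 or u2 <= 0.0:
        return 0.0
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)

def draw_pair(mu1, mu2, theta, rng):
    """Step 1: draw (u1, u2) from the Clayton copula, then map to Poisson counts."""
    u1 = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    w = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    u2 = (u1 ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    u2 = min(u2, 1.0 - 1e-12)
    return poisson_quantile(u1, mu1), poisson_quantile(u2, mu2)

def log_likelihood(counts, mu1, mu2, theta):
    """Log likelihood from the rectangle-differenced pmf, over tabulated pairs."""
    def C(a, b):
        return clayton_cdf(poisson_cdf(a, mu1), poisson_cdf(b, mu2), theta)
    ll = 0.0
    for (y1, y2), n in counts.items():
        p = C(y1, y2) - C(y1 - 1, y2) - C(y1, y2 - 1) + C(y1 - 1, y2 - 1)
        ll += n * math.log(max(p, 1e-300))
    return ll

def estimate_theta(sample, mu1, mu2, grid):
    """Step 2: pick the grid value of theta with the highest log likelihood."""
    counts = Counter(sample)
    return max(grid, key=lambda th: log_likelihood(counts, mu1, mu2, th))

def run_experiment(mu, tau, n_obs, n_reps, seed=0):
    """Step 3: replicate and report (true theta, mean estimate, std. deviation)."""
    rng = random.Random(seed)
    theta_true = 2.0 * tau / (1.0 - tau)          # Clayton: tau = theta / (theta + 2)
    grid = [0.05 * k for k in range(1, 401)]      # theta from 0.05 to 20
    estimates = []
    for _ in range(n_reps):
        sample = [draw_pair(mu, mu, theta_true, rng) for _ in range(n_obs)]
        estimates.append(estimate_theta(sample, mu, mu, grid))
    mean = sum(estimates) / n_reps
    sd = (sum((e - mean) ** 2 for e in estimates) / (n_reps - 1)) ** 0.5
    return theta_true, mean, sd

print(run_experiment(mu=0.1, tau=0.5, n_obs=100, n_reps=20))
```

Calling `run_experiment` again with a larger `mu` mirrors the paper's repetition of the experiments at increasing means. Because of the simplifications listed above, the numbers it produces are not comparable to those in Tables 2-10.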
Results for this set of experiments appear in the top panels of Tables 2-10. Those results show that copulas for discrete count outcomes fail to capture the true dependence magnitudes at extremely small means, which suggests lack of identification of the dependence parameter in such settings. Only in Experiment 4, where the means are larger than 1, do the estimates of $\theta$ fall closer to their true values. But even in Experiment 4, the Clayton and Gumbel copulas still appear to miss their true values.
Experiments 1−4 confirm the Genest and Nešlehová word of caution regarding copulas applied to discrete outcomes. The experiments also suggest that identification problems diminish as probabilities of ties decrease. However, what recourse do practitioners have who apply copulas to count data in small-mean settings? The following section provides evidence that the introduction of covariates facilitates identification.
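The link between tie probabilities and the size of the means can be illustrated numerically. The sketch below rests on an assumed formalization, which is not necessarily the paper's Equation (3): two independent draws from the joint distribution count as tied if they agree in at least one coordinate, so by inclusion-exclusion the probability is $\sum_{y_1} f_1(y_1)^2 + \sum_{y_2} f_2(y_2)^2 - \sum_{y_1}\sum_{y_2} f(y_1, y_2)^2$, with the infinite sums truncated at a large finite value. It uses a Clayton copula with a common Poisson mean, and the function names are hypothetical.

```python
import math

def clayton_cdf(u1, u2, theta):
    if u1 <= 0.0 or u2 <= 0.0:
        return 0.0
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)

def tie_probability(mu, theta, y_max=60):
    """Assumed tie probability for two independent draws, sums truncated at y_max."""
    # Poisson pmf and cdf on 0..y_max (log form avoids overflow in the factorial).
    pmf = [math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1)) for y in range(y_max + 1)]
    cdf, running = [], 0.0
    for p in pmf:
        running = min(running + p, 1.0)
        cdf.append(running)

    def F(y):
        return cdf[y] if y >= 0 else 0.0

    def C(a, b):
        return clayton_cdf(F(a), F(b), theta)

    # Probability that a single margin ties; identical for both margins
    # because they share the same mean.
    marginal_ties = sum(p * p for p in pmf)

    # Probability that both coordinates tie, from the joint pmf.
    joint_ties = 0.0
    for y1 in range(y_max + 1):
        for y2 in range(y_max + 1):
            p = C(y1, y2) - C(y1 - 1, y2) - C(y1, y2 - 1) + C(y1 - 1, y2 - 1)
            joint_ties += p * p

    return 2.0 * marginal_ties - joint_ties

theta = 2.0   # Clayton parameter corresponding to tau = 0.50
for mu in (0.1, 0.5, 1.0, 2.0):
    print(mu, round(tie_probability(mu, theta), 4))
```

Under this formalization the tie probability falls as the common mean rises, in line with the pattern the text attributes to Denuit and Lambert (2005) [20].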

Identification Through Covariates
This section presents evidence that, even with many ties, copulas applied to count data for which the marginals are conditioned nontrivially upon covariates encounter fewer identification problems. The reason is that, with covariates, the arguments to the copula functions depend on the conditional (expected) means, rather than on the outcome variables alone, and those expected means vary continuously with the covariates.
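A small numerical sketch makes this mechanism concrete. It assumes a Poisson margin with an exponential conditional mean $\exp(\beta_0 + \beta_1 x)$ and a standard normal covariate. The coefficient values, sample size, and seed are hypothetical and are not taken from the paper's experiments. The code counts how many distinct values the copula argument takes across observations, with and without the covariate.

```python
import math
import random

def poisson_cdf(y, mu):
    """Pr[Y <= y] for Y ~ Poisson(mu); zero for y < 0."""
    if y < 0:
        return 0.0
    term = total = math.exp(-mu)
    for k in range(1, y + 1):
        term *= mu / k
        total += term
    return min(total, 1.0)

def poisson_quantile(u, mu):
    """Smallest count y with Pr[Y <= y] >= u (search capped at 1000)."""
    y = 0
    while poisson_cdf(y, mu) < u and y < 1000:
        y += 1
    return y

rng = random.Random(0)
n_obs = 1000
beta0, beta1 = -2.0, 0.5                       # hypothetical coefficients, small mean
x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
mu_i = [math.exp(beta0 + beta1 * xi) for xi in x]
y = [poisson_quantile(min(rng.random(), 1.0 - 1e-12), m) for m in mu_i]

# Without covariates: one common mean, so F(y) takes one value per distinct count.
mu_bar = sum(y) / n_obs
args_without_x = {poisson_cdf(yi, mu_bar) for yi in y}

# With covariates: F(y | x) changes with each observation's conditional mean.
args_with_x = {poisson_cdf(yi, mi) for yi, mi in zip(y, mu_i)}

print("distinct counts:", len(set(y)))
print("distinct copula arguments without x:", len(args_without_x))
print("distinct copula arguments with x:", len(args_with_x))
```

Although the outcomes themselves remain heavily tied, the conditional cdf values spread over many more points of the unit interval, which is the sense in which covariates improve coverage.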
To illustrate, the Monte Carlo experiments in the previous section are modified: the Poisson marginals include a single explanatory variable, denoted x, common to each marginal.We consider

Table 1. Probabilities that any two independent observations are tied, based on Equation (3).