Abstract
This paper provides a new approach to recover relative entropy measures of contemporaneous dependence from limited information by constructing the most entropic copula (MEC) and its canonical form, namely the most entropic canonical copula (MECC). The MECC can be obtained by maximizing Shannon entropy to yield a proper copula such that known dependence structures of the data (e.g., measures of association) are matched to their empirical counterparts. In fact, the problem of maximizing the entropy of copulas is the dual of the problem of minimizing the Kullback-Leibler cross entropy (KLCE) of joint probability densities when the marginal probability densities are fixed. Our simulation study shows that the proposed MEC estimator can potentially outperform many other copula estimators in finite samples.
Keywords:
entropy; relative entropy measure of joint dependence; copula; most entropic copula; canonical; Kullback-Leibler cross entropy
JEL:
C190; C590; C130
1. Introduction
There has been a substantial literature on estimation and inference of relative entropy measures of joint dependence as measures of serial correlation. These particular measures of dependence were first proposed by Joe [1] and extended by Granger and Lin [2]. Relative entropy based measures of dependence have so far received much interest in econometrics because they provide very general concepts for gauging joint dependence; and they can be used for a set of variables that can be a mixture of continuous, ordinal-categorical, and nominal-categorical variables. Interested readers are referred to [3,4,5] for a concise review of important contributions in this area.
Econometricians have recently become interested in the computation of maximum entropy densities (see, e.g., Golan [6], Usta and Kantar [7], and references therein for the background and discussions regarding maximum entropy (ME) densities.) The ME densities are derived by maximization of an information criterion (the level of uncertainty) subject to mass and mean preserving constraints. The justification for using the ME in this context can be found in [8]. Rockinger and Jondeau [9] apply the ME method to determine the ME return distribution which is then utilized to extend Bollerslev’s GARCH into autoregressive conditional skewness and kurtosis. Maasoumi and Racine [10] employ a metric entropy measure of dependence to examine the predictability of asset returns. Hang [11] uses the ME to determine flexible functional forms of regression functions subject to side conditions. Miller and Liu [12] propose a method to recover a joint distribution function by applying the KLCE distance while imposing a required degree of dependence through the joint moments. An example is the normal distribution which is completely characterized by first and second moments. In this case, the minimum KLCE distribution is the multivariate Normal distribution where the dependence is specified through conventional linear correlation.
There has been a great deal of interest in copulas, especially in financial economics, as they have the potential to model and explain asymmetric dependence between random variables separately from their marginal distributions. For example, Patton [13] employs various families of copulas to investigate the inter-relationship between univariate skewnesses, asymmetric dependence between asset returns, and the optimal portfolios of assets. Rodriguez [14] models financial contagion using copulas. Chollete, Heinen, and Valdesogo [15] propose a multivariate regime-switching copula to capture asymmetric dependence and regime-switching in portfolio selection. Ning, Xu, and Wirjanto [16] investigate asymmetric patterns in volatility clustering by employing a semi-parametric copula approach. Detailed discussions of various econometric aspects and applications of copulas in economics and finance can be found, for instance, in the survey papers by Patton [17] and Fan and Patton [18]. A comprehensive treatment of copula theory is presented in the monograph by Nelsen [19].
Given the broad context described above, we propose a theoretical framework to recover relative entropy measures of joint dependence from limited information by constructing a set of the most entropic copulas (MEC’s), which can essentially be done by maximizing Shannon entropy subject to constraints on the uniform marginal distributions and other constraints on the copula-based measures of dependence (or the distance between the MEC and an arbitrary nested copula). In the class of MECs, there exists a simplified form, namely the most entropic canonical copula (MECC). Moreover, it can be shown that the proposed MEC approach and the KLCE approach in Miller and Liu [12] are dual in the sense that they can recover the same joint distribution. Applications of MEC’s to economics include Chu [20], Dempster, Medova, and Yang [21], Friedman and Huang [22], Veremyev, Tsyurmasto, Uryasev, and Rockafellar [23], Zhao and Lin [24].
We shall now discuss the contributions of the current paper in relation to [20]. The similarity between the two papers is that rank correlations are employed as prior information about dependence in order to construct the MECC. This paper differs from [20] in several respects. First, in [20], Carleman’s condition permits constraints on moments to be employed so as to ensure that the MEC satisfies all the properties of a copula while, in the present paper, constraints are explicitly imposed on marginal copula densities. Therefore, the entropy maximization problem defined in [20] is merely a good approximation of the entropy maximization problem in this study. Second, the main problem in [20] is the standard entropy maximization problem, while the main problem in the present paper involves a continuum of constraints on the marginal distributions, which can be written as integrals with varying end-points that need to be smoothed out by using kernels. This kernel smoother can generate MECs with smooth densities, whilst the discrete approximation technique proposed by [21] can only allow for MECs with discrete densities. The feasibility and benefits of the proposed approach to construct MECs will then be demonstrated through a Monte-Carlo simulation study presented in Section 3.
Although our analysis is restricted to the bivariate case, the multivariate case is a straightforward extension. The remainder of the paper is organized in three sections. In Section 2, we formulate and approximate most entropic copulas (MECs). Next, we discuss the link between the MEC and the minimum KLCE density and the extent to which the MEC is more flexible than the KLCE method. We then compute the MEC and the MECC subject to marginal constraints and other constraints on various copula-based dependence measures such as Spearman’s rho and tau. We also outline the large-sample properties of the relevant parameter estimators. We present these results in Theorems 2.1–2.4. A simulation study is presented in Section 3, demonstrating that the MEC fits data well when compared with other competing procedures (e.g., parametric copulas and kernel estimators). Derivation of statistical properties for the proposed copula estimator is rather challenging and will be left for future research. Finally, to facilitate reading of this paper, we collect all materials of a technical flavour in the appendices at the end of this paper.
2. Recovering the Most Entropic Copulas
2.1. Maximum Entropy and Copula
This section provides a brief explanation of entropy and copula. We refer to [25] for a comprehensive review of entropy econometrics and [19] for important results concerning copulas.
Shannon entropy has been used as an information criterion to construct the probability densities for economic or financial variables such as stock returns, income, GDP, etc. (see, inter alia, [26,27,28]). A univariate ME density is generally obtained by maximizing Shannon entropy, $-\int f(x)\log f(x)\,dx$, with respect to $f$ under probability and moment constraints. A bivariate ME density that is closest to a given reference density, say the product of two univariate densities, can be obtained by minimizing the KLCE under joint moment constraints (see, e.g., [1] and [12]):
subject to
where f is a bivariate density, and are some univariate densities, and h is an arbitrary function such that .
The copula is proposed by Sklar [29] as a method to construct joint distributions with given marginals. The advantage of copulas is that dependence between random variables can be parametrically specified entirely independently from their marginals. A bivariate copula is defined as a function $C$ from $[0,1]^2$ to $[0,1]$ with the following properties: (1) for every $u, v \in [0,1]$ it holds that $C(u,0) = C(0,v) = 0$, $C(u,1) = u$, and $C(1,v) = v$; and (2) $C$ is 2-increasing, i.e., $C(u_2,v_2) - C(u_2,v_1) - C(u_1,v_2) + C(u_1,v_1) \ge 0$ for every $u_1, u_2, v_1, v_2 \in [0,1]$ such that $u_1 \le u_2$ and $v_1 \le v_2$ (see, e.g., [19], p. 8). Note that Property (2) always holds if $C$ has a positive density $c$, and Property (1) implies that a copula is a function with Uniform[0,1] marginals. Sklar’s theorem links a copula, $C$, to a joint distribution, $F$, via $F(x,y) = C(F_1(x), F_2(y))$, where $F_1$ and $F_2$ are the marginals.
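The two defining properties can be checked numerically on a grid. The sketch below is only an illustration: it uses the Farlie-Gumbel-Morgenstern copula as a convenient closed-form test case, and the function names, the parameter `alpha`, and the grid size are assumptions made here rather than anything taken from the paper.

```python
def fgm_copula(u, v, alpha=0.5):
    """FGM copula C(u, v) = u v [1 + alpha (1 - u)(1 - v)]; valid for |alpha| <= 1."""
    return u * v * (1.0 + alpha * (1.0 - u) * (1.0 - v))


def satisfies_boundary_conditions(copula, grid, tol=1e-12):
    """Property (1): C(t, 0) = C(0, t) = 0, C(t, 1) = t and C(1, t) = t."""
    return all(
        abs(copula(t, 0.0)) < tol
        and abs(copula(0.0, t)) < tol
        and abs(copula(t, 1.0) - t) < tol
        and abs(copula(1.0, t) - t) < tol
        for t in grid
    )


def is_two_increasing(copula, grid, tol=1e-12):
    """Property (2) on the grid: every grid cell must carry nonnegative mass.

    Any rectangle with corners on the grid is a union of cells, so checking
    adjacent grid points is enough for rectangles aligned with the grid.
    """
    for u1, u2 in zip(grid[:-1], grid[1:]):
        for v1, v2 in zip(grid[:-1], grid[1:]):
            volume = (copula(u2, v2) - copula(u2, v1)
                      - copula(u1, v2) + copula(u1, v1))
            if volume < -tol:
                return False
    return True


grid = [i / 50 for i in range(51)]
valid = satisfies_boundary_conditions(fgm_copula, grid) and is_two_increasing(fgm_copula, grid)
# With alpha = 2 the FGM "density" turns negative near the corners (0, 1) and (1, 0),
# so the same formula is no longer a copula.
invalid = is_two_increasing(lambda u, v: fgm_copula(u, v, alpha=2.0), grid)
```

A grid check of this kind can only reject a candidate; passing it does not prove that a function is 2-increasing everywhere.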
We shall use measures of association and rank correlations to construct the MEC, which we discuss next. Measures of association are, unlike joint moments, invariant under nonlinear transformations of the underlying random variables, and thus they are natural measures of dependence for non-elliptical random variables (see Appendix A for formal definitions of measures of association). A measure of association is, in general, defined as , where h is a bivariate function such that . This measure, based on C, is also referred to as the copula-based measure of dependence. In practice, τ can be estimated by the rank statistic , where represents the ranks of in a sample of size N. An advantage of using rank statistics as nonparametric measures of nonlinear dependence is that they are robust—in the sense that they will be insensitive to contamination and maintain a high efficiency for heavier tailed elliptical distributions as well as for multivariate normal distributions (see, e.g., [30] for a detailed treatment of rank statistics). Examples of include Spearman’s rho and Blest’s rank correlations (see, e.g., [31]), which are summarized in Table 1.
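As a concrete instance of a rank statistic, the following sketch computes Spearman's rank correlation from the ranks of a bivariate sample. It assumes no ties and uses the textbook normalization, which may differ from the normalization adopted in Table 1; the helper names are hypothetical.

```python
def ranks(values):
    """Ranks 1..N of the observations (assumes there are no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho: 12 / (N (N^2 - 1)) * sum_i R_i S_i - 3 (N + 1) / (N - 1)."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    cross = sum(r * s for r, s in zip(rank_x, rank_y))
    return 12.0 * cross / (n * (n * n - 1)) - 3.0 * (n + 1) / (n - 1)


x = [0.1, 0.4, 0.35, 0.8, 0.95]
rho_monotone = spearman_rho(x, [t ** 3 for t in x])   # increasing transform of x
rho_reversed = spearman_rho(x, [-t for t in x])       # decreasing transform of x
```

Because only ranks enter the statistic, any strictly increasing transformation of either variable leaves the value unchanged, which is the invariance property described above.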
Table 1.
Blest’s measures of rank correlation.
Nonetheless, it is worth mentioning that the definition of τ is somewhat restrictive since it does not include Kendall’s tau, for example.1 Moreover, not every rank correlation can be formulated in terms of the above general rank statistic . For instance, the statistic , which was proposed by Gideon and Hollister [32] as a coefficient of rank correlation resistant to outliers even in a small sample, has the form:
where is the value of with the subscript i satisfying , and is the greatest integer notation. In addition, estimates a copula-based measure of dependence, .
In the present paper, we use the bivariate Shannon entropy of a copula, given by
By Sklar’s theorem the Shannon entropy of a copula is then equivalent to the KLCE:
Hence, minimization of the KLCE and maximization of the bivariate Shannon entropy are dual problems. Let denote the MEC. Then, in view of [1], the relative entropy measure of dependence (recovered from limited information) is given by . Generally speaking, a multivariate Shannon entropy can be defined in an obvious way, and this dual relationship holds. However, as pointed out in Friedman and Huang [22], the problem of maximizing a multivariate Shannon entropy of copulas can suffer from the curse of dimensionality because the number of constraints (on the marginal densities) needed for the MEC to satisfy all the properties of a copula increases as the problem involves more dimensions.
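The duality can be made explicit by a change of variables. The derivation below is a sketch under assumed notation: $f$ is a joint density with marginal densities $f_1, f_2$ and distribution functions $F_1, F_2$, $c$ is the copula density, and $W(c)$ denotes the Shannon entropy of the copula. These symbols are chosen here for illustration.

```latex
\begin{align*}
f(x,y) &= c\bigl(F_1(x),F_2(y)\bigr)\,f_1(x)\,f_2(y)
  && \text{(Sklar's theorem in density form)}\\[4pt]
\iint f(x,y)\,\log\frac{f(x,y)}{f_1(x)\,f_2(y)}\,dx\,dy
  &= \iint c\bigl(F_1(x),F_2(y)\bigr)\,\log c\bigl(F_1(x),F_2(y)\bigr)\,f_1(x)\,f_2(y)\,dx\,dy\\
  &= \int_0^1\!\!\int_0^1 c(u,v)\,\log c(u,v)\,du\,dv
  && \bigl(u = F_1(x),\; v = F_2(y)\bigr)\\
  &= -\,W(c).
\end{align*}
```

With the marginals held fixed, the KLCE relative to the product of the marginals therefore equals minus the copula entropy, so minimizing one is the same as maximizing the other.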
2.2. The Most Entropic Copula
We assume for the rest of this paper that the MEC is a differentiable function so that its copula density exists. The bivariate MEC (or the MEC) is obtained by maximizing the bivariate Shannon entropy (2) under the following two constraints: (1) the marginals of are Uniform[0,1]; and (2) the measures of association, defined in Section 2.1, are set equal to the corresponding rank correlations. We call this Problem EM.
subject to
where (4) implies that is a joint density on the unit square; Equations (5) and (6) imply that the marginals of are Uniform[0,1] distributions; Equation (7) imposes a constraint on the joint behavior of U and V. To give an example, let , then the left-hand side of (7) becomes Spearman’s rho and (note that, in what follows, we sometimes omit ‘N’ for brevity) is the rank correlation associated with Spearman’s rho. To give another example, suppose that the true data generating copula, say , belongs to a family, . Given this prior information, to recover a MECC from the data, one may randomly choose a copula, , from , then use it to construct (7) with , where and is an estimate of the difference between the probabilities of concordance and discordance (cf. Appendix A). By doing this, it is expected that some feature of the family could be effectively incorporated into the MECC. Other examples of Equation (7) also include Blest’s coefficients or Gideon and Hollister’s (1987) coefficient, etc. Also note that we may have more than one constraint like (7). It is to be stressed at this point that some versions of the MEC problem may exhibit boundary solutions due to theoretical restrictions on the measures of dependence employed (e.g., the Hoeffding-Fréchet bounds on correlation statistics). Consequently, the large-sample theory stated in Section 2.3 below only holds for interior solutions to the stated problem.2
For future reference, we shall denote by , where is a vector of coefficients, as the MEC [that solves Problem EM]. The MECs (accordingly the MECC) can then be approximated by replacing the continuums of varying end-points in (5) and (6) by sets of definite integrals. We now present an approximate solution to Problem EM in Theorem 2.1 below.
THEOREM 2.1.
The MEC, , can be approximated by an approximator, , as follows:
with
where
and contains the minimal values of the following potential function:
Note that is the standard normal cdf (arising from smoothing indicator functions, , with the Gaussian kernel) and is an arbitrary copula (which may involve a nuisance parameter that needs to be estimated).
In particular, the MEC, , can be symmetrized by letting be equal to () and be a symmetric function.
Proof:
The proof utilizes the standard method of Variational Calculus for maximization of functions in normed linear spaces (see, e.g., [33], p. 129). See Appendix D. ■
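To make the structure of the solution concrete, the following sketch solves a discretized analogue of Problem EM with a single Spearman's rho constraint. It is not the kernel-smoothed approximator of Theorem 2.1: the unit square is replaced by an m-by-m grid of cells, the uniform-marginal constraints are imposed by iterative proportional fitting (which can be viewed as coordinate-wise minimization of the dual), and the dependence multiplier is found by bisection. The grid size, the bracket for the multiplier, the sign convention, and all function names are assumptions made for illustration.

```python
import numpy as np


def fit_uniform_marginals(kernel, tol=1e-12, max_iter=20000):
    """Rescale rows and columns of a positive kernel so each sums to 1/m."""
    m = kernel.shape[0]
    target = np.full(m, 1.0 / m)
    col_scale = np.ones(m)
    for _ in range(max_iter):
        row_scale = target / (kernel @ col_scale)
        col_scale = target / (kernel.T @ row_scale)
        cell_prob = row_scale[:, None] * kernel * col_scale[None, :]
        # Column sums are exact after the update above; check the row sums.
        if np.abs(cell_prob.sum(axis=1) - target).max() < tol:
            break
    return cell_prob


def spearman_rho(cell_prob, midpoints):
    """Discrete analogue of 12 E[UV] - 3 using cell midpoints."""
    return 12.0 * (midpoints @ cell_prob @ midpoints) - 3.0


def discrete_mecc(rho_target, m=20, lo=-10.0, hi=10.0, n_bisect=60):
    """Maximum-entropy cell probabilities of the form a_i * b_j * exp(theta * u_i * v_j).

    Assumes rho_target lies between the values attained at theta = lo and theta = hi.
    """
    midpoints = (np.arange(m) + 0.5) / m
    uv = np.outer(midpoints, midpoints)

    def solve(theta):
        return fit_uniform_marginals(np.exp(theta * uv))

    for _ in range(n_bisect):  # rho is increasing in theta, so bisection applies
        theta = 0.5 * (lo + hi)
        if spearman_rho(solve(theta), midpoints) < rho_target:
            lo = theta
        else:
            hi = theta
    theta = 0.5 * (lo + hi)
    return theta, solve(theta), midpoints


theta_hat, cell_prob, midpoints = discrete_mecc(rho_target=0.3)
```

The exponential form of the cell probabilities mirrors the exponential-family structure of the MEC density: one multiplier per marginal constraint and one per dependence constraint.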
As we can see, the MEC density nests an arbitrary copula, , (cf. Equation (9)). Indeed, the MEC depends on both and , thus no uniqueness is obtained. However, we can obtain a canonical form, which is called the MECC, by setting to zero. This idea of a canonical model can be traced back to Jeffreys3 who proposed to use the principle of simplicity for deductive inference—that is, for any given set of data, there is usually an infinite number of possible laws that will “explain” the data precisely; and the simplest model should be chosen.
It is also worth noting at this point that, like the empirical copula, the MECC is a valid distribution function; however, it satisfies the Uniform[0,1] marginal constraints only asymptotically. In addition, the potential function in the above theorem is a multivariate convex function of Λ, which in general has a unique minimum because it is the product of (positive) univariate convex functions.
We can claim that the MECC, , is equivalent to a maximum likelihood estimator (MLE). Now, we need to verify this claim—given a bivariate sample for , the average maximum log-likelihood function is given by
where is defined in (8),
and
in which and are the ranks of and in the sample, respectively. Assuming that N is greater than n and that n is large enough, in view of (9) with , we obtain the following representation:
where ; the approximation (≈) follows because for every ; and the last equality holds because is set equal to its consistent rank estimator, . Hence, the claim has been verified.
REMARK 2.1.
To compute the MECC, we could use either a Monte-Carlo integration procedure or Gaussian quadratures to approximate the potential function (10) (see Appendix C for further details), and then employ a global optimization technique (for example the stochastic search algorithm proposed by Csendes [34]) to minimize this function.
In general, we can also approximate by using a collection of equally-spaced partitions of the unit interval , and then, a high-order kernel smoothing of the indicator function. This is stated in Theorem 2.2:
THEOREM 2.2.
The MEC, , can be approximated by an approximator, , as follows:
with
where
for some kernel function, , in , where is the space of symmetric, Lebesgue integrable, kernel functions of order, r, (cf. Definition B.1) and contains the minimal values of the following potential function:
Proof:
The proof is very similar to that of Theorem 2.1, combined with Lemma B.1, so we omit the details here. ■
2.3. Large Sample Properties with Unknown Parameters of Dependence
The approximate MECC densities are members of a statistical exponential family parametrized by the Lagrange multipliers. Since the true parameters of dependence in (7) are unknown, a random sample of size N is then used to form their consistent estimates . Therefore, the sampling properties of may be derived from the associated sampling properties of . Let represent the approximate potential function with the dependence parameters Θ as formulated in Section 2, where and denote the minimal values of for and respectively. The Hessian matrices of are and . The following assumptions are maintained
- AS1. , where is some non-empty compact set; is the number of dependence constraints. Further, is also a non-empty and compact set, where is the number of the Lagrange multipliers in . Therefore, the number of marginal constraints is .
- AS2. The map from to is a diffeomorphism (i.e., one-to-one, continuous and onto in both directions).
- AS3. is a strictly convex function of Λ for all Θ and uniformly continuous (in probability) in Θ, i.e.
- AS4. The vector of dependence parameter estimates is asymptotically normal such that , where Ψ is an asymptotic variance-covariance matrix of .
THEOREM 2.3.
In view of AS1–AS4, we obtain
Proof:
See Appendix D. ■
If the dependence constraints are linear in their parameters, i.e., , we can redefine the potential function associated with the constraints of Problem EM as follows:
where , and is the Lagrange multiplier for the constraint .
THEOREM 2.4.
Proof:
Noting that , the proof follows directly from Theorem 2.3. ■
Theorem 2.4 suggests that in general the efficiency of the estimators can be improved by using more marginal constraints. However, adding too many marginal constraints can decrease efficiency since this may increase the probability that the covariances of in are negative. Thus, the Hessian matrix contains some negative elements which may cause the asymptotic variance of to increase overall. Theorems 2.3 and 2.4 can be used to develop tests of hypotheses about the “distance” between the MECC and another copula of the exponential function family.
3. Simulation
In this section, we perform some simulations to investigate the finite-sample properties of the MECC approximators proposed above. We shall address three main issues in these simulations. First, the MECC can outperform the parametric copulas used in this study (the Gaussian copula, Student’s t copula, the Clayton copula, and the Gumbel copula) while its performance remains comparable to other nonparametric estimators (i.e., the “shrinked” local linear (LLS) type kernel copula estimator and the “shrinked” mirror-reflection (MRS) kernel copula estimator proposed by Omelka, Gijbels, and Veraverbeke [36]). Second, an increase in the number of marginal constraints leads to an improvement in the performance of the MECC. Third, the MECC, for the most part, becomes as stable as other parametric copulas as more marginal constraints are utilized.
To accomplish the above objectives, we choose Frank’s copula,
where as the true model whereby samples are generated. (See [37,38] for the statistical properties of Frank’s copula.) This copula is radially symmetric and close to the independence as θ approaches the origin, i.e., . Later, we shall use two values, and , for the true parameter θ; these values, roughly speaking, correspond to the close-to-independence case and the weak dependence case respectively.
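Samples from Frank's copula can be drawn by inverting the conditional distribution of V given U. The sketch below uses the standard closed-form inverse; the parameter value `theta = 2.0`, the seed, and the function names are placeholders chosen here and are not the values used in the simulations.

```python
import numpy as np


def frank_conditional_cdf(v, u, theta):
    """P(V <= v | U = u) under Frank's copula, i.e. the partial derivative dC(u, v)/du."""
    a = np.exp(-theta * u)
    b = np.expm1(-theta * v)   # exp(-theta v) - 1
    g = np.expm1(-theta)       # exp(-theta) - 1
    return a * b / (g + (a - 1.0) * b)


def sample_frank(n, theta, rng):
    """Draw (u, v) by conditional inversion; also return the uniforms w that were inverted."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    a = np.exp(-theta * u)
    g = np.expm1(-theta)
    # Solve frank_conditional_cdf(v, u, theta) = w for v in closed form.
    v = -np.log1p(w * g / (w + (1.0 - w) * a)) / theta
    return u, v, w


rng = np.random.default_rng(0)
u, v, w = sample_frank(5000, theta=2.0, rng=rng)
```

Evaluating `frank_conditional_cdf(v, u, 2.0)` on the simulated pairs returns `w` up to rounding error, which is a quick check that the inversion is consistent with the conditional distribution.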
The simulation procedure is outlined as follows. First, we generate 100 samples of 5000 observations from Frank’s copula for each value of θ. With these samples in hand, we estimate the four commonly used parametric copulas mentioned above by maximum likelihood. We also estimate 12 MECCs (that is, with combinations of marginal constraints and joint moment constraints) by using our proposed method. To gauge the errors of these estimators, we shall use the integrated mean squared error (IMSE):
where is the density of Frank’s copula; and represents an estimate using one of the above-mentioned parametric copulas or a MECC. Next, for each copula, we use the 100 samples of 5000 observations drawn from Frank’s copula to estimate the squared bias and the variance (as the functions of u and v). Both the integrated squared bias () and the integrated variance () are then obtained by evaluating the estimated squared bias ( where denotes the empirical mean calculated using 100 samples) and the estimated variance () at 10000 pseudo-random Uniform [0,1] points, then taking their individual averages, i.e.,
where denotes a sample of 10000 points (drawn from the Uniform [0,1] distribution) whereby both and are evaluated. To gauge the errors of the nonparametric copula estimators, we shall use the expressions for the asymptotic bias and variance given in [36,39]; the optimal bandwidth is obtained by minimizing the integrated asymptotic MSE [39]. We report our simulation results in Table 2.
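The decomposition of the IMSE into integrated squared bias and integrated variance can be computed directly from a matrix of replicated estimates. The sketch below assumes the estimates are stored with one row per replication and one column per evaluation point; the synthetic "estimates" (a constant offset plus noise around the independence copula density) are made up purely to exercise the function.

```python
import numpy as np


def integrated_errors(estimates, truth):
    """Return (integrated squared bias, integrated variance, IMSE).

    estimates: array of shape (n_replications, n_points), density estimates
               evaluated at a common set of points in the unit square.
    truth:     array of shape (n_points,), the true density at those points.
    """
    mean_estimate = estimates.mean(axis=0)
    int_sq_bias = float(np.mean((mean_estimate - truth) ** 2))
    int_var = float(np.mean(estimates.var(axis=0)))  # population variance (ddof=0)
    return int_sq_bias, int_var, int_sq_bias + int_var


rng = np.random.default_rng(1)
n_replications, n_points = 100, 10000
truth = np.ones(n_points)  # independence copula density, equal to 1 everywhere
estimates = truth + 0.1 + 0.05 * rng.standard_normal((n_replications, n_points))
int_sq_bias, int_var, imse = integrated_errors(estimates, truth)
```

With the population variance (`ddof=0`), the sum of the two components equals the average squared error over all replications and points exactly, so the decomposition loses nothing.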
Table 2.
IMSE for the MECC and parametric copulas: Frank copula as the true copula.
First, it can be noticed from Table 2 that the MECCs significantly outperform elliptical copulas (i.e., the Normal copula and Student’s t copula) in terms of Int. Bias$^2$ and IMSE. However, with a small number of marginal constraints the MECCs are mostly less stable than other parametric copulas; the only way to improve the stability (Int. Var.) of the MECCs is to increase the number of marginal constraints. For the close-to-independence case (), the asymmetric copulas (i.e., the Clayton copula and the Gumbel copula) outperform the MECCs. The intuition for these asymmetric copulas to have small Int. Bias$^2$ and Int. Var. is that Frank’s copula, the Clayton copula, and the Gumbel copula all behave like the independence copula for . It is also interesting to note that the MECCs often outperform the LLS and MRS estimators in terms of , whilst these nonparametric estimators outperform the MECCs in terms of . The reason for the existence of non-zero in the LLS and MRS estimators is that the optimal bandwidth (being shrunk close to zero at the corners of the unit square) can keep the bias bounded, but does not completely remove the bias.
Second, when , the data become less independent, leading to a significant increase in Int. Bias$^2$ pertaining to the estimation of the Clayton copula and Gumbel copula by using samples drawn from Frank’s copula. In this case, MECC(4,1), MECC(16,1), MECC(64,1), MECC(4,2), MECC(64,2), and MECC(64,3) all show significant improvements in Int. Bias$^2$ over all the other estimators. It is also important to note at this point that, for a fixed number of marginal constraints, Int. Bias$^2$ and tend to deteriorate as one increases the number of joint moment constraints. To ameliorate this, it suffices to increase the number of marginal constraints as one adds one more joint moment constraint into the MEC problem. Indeed, as shown in Table 2, for one joint moment constraint, one merely needs four marginal constraints to yield MECC(4,1) with minimum Int. Bias$^2$ and IMSE; meanwhile, for two joint moment constraints, one needs to use up to 64 marginal constraints to yield MECC(64,2) with minimum Int. Bias$^2$, Int. Var., and IMSE. Our final observation is that, for a fixed number of moment constraints, an increase in the number of marginal constraints will always lead to a significant reduction in .
Finally, to check the general validity of the obtained simulation results, we also replicate the above simulation study using data generated from Clayton copulas. Table 3 shows that the good performance of the MECCs relative to other copula estimators is still carried over to this case when a sufficient number of marginal constraints is being used.
Table 3.
IMSE for the MECC and parametric copulas: Clayton copula as the true copula.
4. Conclusions
We propose to employ the entropy-maximization principle to recover copulas from limited information regarding contemporaneous dependence between random variables. The main results of this article are twofold. First, we provide an entropy approach to recover relative entropy measures of joint dependence that are independent of marginal distributions by constructing most entropic copulas (MECs) and, in particular, their canonical forms, namely most entropic canonical copulas (MECCs). Second, as a consequence of the MEC, we can construct ME joint distributions with a fixed dependence structure given by a MEC. Our method is shown to incorporate the approach of Miller and Liu [12] and can handle both moment-based and copula-based measures of dependence. Simulation results confirm that the accuracy of the approximate MECC can effectively be improved by increasing the number of side constraints.
Acknowledgments
We would like to express our sincere thanks to the guest editors (Professors Fredj Jawadi, Tony S. Wirjanto, Marc S. Paolella, and Nuttanan Wichitaksorn) and three anonymous referees for many valuable comments and constructive suggestions that helped us to substantially improve this paper.
Author Contributions
Both authors contributed equally to the paper.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix
A. Known Results
DEFINITION A.1
(Adapted and modified from [19], Chapter 5). Let τ denote the difference between the probabilities of concordance and discordance of and , i.e., let
where and are independent vectors of continuous random variables with joint distributions and respectively which have common marginals (of ) and (of ). When and have the same joint distribution function , τ is Kendall’s tau (). The other measures of dependence such as Spearman’s rho and Gini’s gamma can be defined similarly.
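The sample analogue of the difference between the probabilities of concordance and discordance is obtained by counting concordant and discordant pairs. The sketch below is the plain O(N²) version of Kendall's tau for data without ties; the function name and the small data set are illustrative only.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(number of concordant pairs - number of discordant pairs) / number of pairs.

    Assumes continuous data, so ties occur with probability zero.
    """
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Of the 6 pairs, only the pair (2nd, 3rd) is discordant: tau = (5 - 1) / 6.
tau_example = kendall_tau([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```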
THEOREM A.1
(Nelsen [19], Chapter 5). Let and denote the copulas of and () respectively so that and , then
If is the Farlie-Gumbel-Morgenstern (FGM) copula, i.e., then . Note that we choose the FGM copula as a reference copula because it consists of quadrants of which then enter as uniform joint moments.
If then (Spearman’s rho).
If then (Gini’s gamma).
B. Auxiliary Results
DEFINITION B.1.
A kernel function of real order is a symmetric, Lebesgue integrable, function such that
- (i)
- ,
- (ii)
- for and
- (iii)
- where is the integer part of r.
LEMMA B.1.
Let represent a measurable function of such that
as , where is a sequence of positive constants such that as .
- i
- ,
- ii
- ,
- iii
- ,
Proof:
See ([41], p. 362). ■
LEMMA B.2.
Let , let P denote Lebesgue measure, and let . Put
Then , where is a compact dyadic sequence dense in Ω. (Note that a sequence is defined to be dense in an interval if, for every point of the interval, there exists a point of the sequence which is arbitrarily close to it; see [42], p. 515.)
LEMMA B.3.
[DuBois-Reymond’s lemma] Let a function be continuous on the interval . Assuming that the following equality holds for any continuous function with mean value zero (i.e. )
Then, . Vice versa, if then . (See [43], p. 400)
LEMMA B.4.
The indicator function can be approximated by a continuous function , where is given by
has the following properties:
where is Dirac’s delta function. (See [44] (p. 30))
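One common smooth approximation of an indicator, consistent with the Gaussian-kernel smoothing mentioned in Theorem 2.1, replaces it with a rescaled standard normal cdf. The sketch below illustrates that idea only; it is not necessarily the exact function used in Lemma B.4, and the `bandwidth` parameter and function names are assumptions.

```python
import math


def normal_cdf(z):
    """Standard normal cdf expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_indicator(x, a, bandwidth):
    """Smooth approximation of the indicator 1{x <= a}; sharper as bandwidth -> 0."""
    return normal_cdf((a - x) / bandwidth)


# Approximation error at x = 0.45 for the indicator 1{x <= 0.5} (true value 1).
errors = [1.0 - smoothed_indicator(0.45, 0.5, h) for h in (0.1, 0.05, 0.01)]
```

The derivative of the smoothed indicator with respect to `a` is a Gaussian density with standard deviation `bandwidth`, which concentrates at `x` as the bandwidth shrinks; this is the sense in which the derivative of the approximation tends to a Dirac delta.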
C. Approximation of Potential Functions
We now present a Gauss-Legendre quadrature method to approximate the potential function (10) for the MECC. Using affine transformations, with and with , (10) can be rewritten as follows:
where , , and has an obvious meaning.
The function can be expanded into a series of the orthogonal Legendre polynomials, that is,
where are products of two Legendre orthogonal polynomials (see, e.g., [45] for further details of the Legendre polynomials),
and
Now, let , and , be the roots of the polynomials and respectively – are also called the abscissae of the Legendre polynomials – then, choose weights, , satisfying the following M×N relations:
where . We obtain:
Hence,
where is an error term, and are large enough.
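The quadrature step can be sketched with NumPy's Gauss-Legendre nodes and weights. The example maps the nodes from [-1, 1] to [0, 1] by the affine transformation u = (x + 1) / 2 and integrates a generic function over the unit square; the test integrand is a stand-in chosen because its integral is known in closed form, not the potential function (10) itself, and the function name and default node counts are assumptions.

```python
import math

import numpy as np


def gauss_legendre_unit_square(func, n_u=16, n_v=16):
    """Approximate the integral of func(u, v) over [0, 1]^2 by a tensor-product rule."""
    x_u, w_u = np.polynomial.legendre.leggauss(n_u)  # nodes and weights on [-1, 1]
    x_v, w_v = np.polynomial.legendre.leggauss(n_v)
    u = 0.5 * (x_u + 1.0)                            # affine map to [0, 1]
    v = 0.5 * (x_v + 1.0)
    weights = 0.25 * np.outer(w_u, w_v)              # Jacobian of the two affine maps
    values = func(u[:, None], v[None, :])            # evaluate on the node grid
    return float(np.sum(weights * values))


approx = gauss_legendre_unit_square(lambda u, v: np.exp(u + v))
exact = (math.e - 1.0) ** 2
```

For smooth integrands the error decays very quickly in the number of nodes, which is why a moderate number of nodes per dimension is usually sufficient in two dimensions.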
We now present a Quasi-Newton algorithm to minimize (C5).
where is the first-order gradient vector of and is a standard matrix norm.
(1) Given a starting point , a convergence tolerance , an initial step length , an initial inverse Hessian matrix , and the numbers of Gaussian quadratures .
(2) While :
    (search direction).
    Repeat until .
    Stop with (the step length satisfies the Goldstein condition; see, e.g., [46]).
    Set .
(3) Define and .
    Compute the updated Hessian matrix: , where .
end;
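A quasi-Newton iteration of this kind can be sketched as follows. The code is a generic BFGS routine with the standard inverse-Hessian update; for brevity it uses Armijo backtracking for the step length instead of the Goldstein condition named above, and the objective is a small convex toy function with a known minimizer rather than the approximated potential function (C5). All names, tolerances, and the toy objective are assumptions.

```python
import numpy as np


def bfgs_minimize(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimize f by BFGS with Armijo backtracking (sufficient-decrease) line search."""
    x = np.asarray(x0, dtype=float)
    identity = np.eye(x.size)
    inv_hessian = identity.copy()          # initial inverse Hessian approximation
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:        # stop when the gradient is small enough
            break
        direction = -inv_hessian @ g       # search direction
        step = 1.0
        while f(x + step * direction) > f(x) + 1e-4 * step * (g @ direction):
            step *= 0.5                    # backtrack until sufficient decrease
            if step < 1e-16:
                break
        s = step * direction
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        curvature = s @ y
        if curvature > 1e-12:              # update only if the curvature condition holds
            r = 1.0 / curvature
            inv_hessian = ((identity - r * np.outer(s, y)) @ inv_hessian
                           @ (identity - r * np.outer(y, s)) + r * np.outer(s, s))
        x, g = x_new, g_new
    return x


# Toy strictly convex objective: sum_i exp(x_i) - c . x, minimized at x = log(c).
c = np.array([1.0, 2.0, 3.0])
x_min = bfgs_minimize(lambda x: np.sum(np.exp(x)) - c @ x,
                      lambda x: np.exp(x) - c,
                      x0=np.zeros(3))
```

In practice each evaluation of the potential function and its gradient would go through the quadrature approximation, so the cost per iteration is dominated by the number of quadrature nodes.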
To compute the MECC, we used a stochastic search algorithm to minimize (C5) whilst setting .
D. Proofs
Proof of Theorem 2.1:
Since (5)–(6) are continuums of constraints with varying end-points, we need to replace these continuums with sets of definite integrals:
where a and b are arbitrary numbers in . Using a dense dyadic sequence in , (D1) can be approximated by
where and are chosen such that and , where ϵ is small enough. Hence, (D1) is equivalent to
The Lagrangian function of Problem EM can be formulated as follows:
Taking the first derivative of with respect to c leads to
Define , then applying Lemma B.3 to the function
where is an arbitrary copula density such that , we obtain the following representation:
and is a generic constant. Now, by substituting (D6) into (4) the leading term, , is canceled out, then we obtain:
where
The Lagrangian multipliers can be solved out by substituting (D7) into (5), (6), and (7), leading to the following system of equations:
for all . Since (D7) can be rewritten as
we can define the potential function as follows:
Then, (D8) is equivalent to the following system of equations:
for all . Also note that, since the matrix of second-order derivatives of is the covariance matrix of , it is positive definite. It follows that the solutions to (D9) are the minimum values of , which depend on , , and .
Since the potential function and the MEC (D7) are non-smooth, following common practice, they need to be smoothed out. We can obtain their smoothings by using a continuous approximation to the indicator function, , for a sufficiently large n. An application of Lemma B.4 yields
We then immediately obtain:
and
where are the minimum values of (D10). In particular, can be symmetrized by letting for every and letting be a symmetric function.
Finally, to complete this proof, we still need to prove that the MEC approximator, , is 2-increasing. Denote by a rectangle in . Since is a (positive) exponential function, the mass of the rectangle, , is nonnegative. Now, we can obtain the MECs by letting n and become sufficiently large. ■
Proof of Theorem 2.3:
For all , has a unique finite supremum for all T in view of AS1 and AS2. AS2 and implies that . Thus, has a unique interior supremum for a sufficiently large T. Let denote the unique supremum of . In view of AS3, implies .
An application of the mean-value theorem yields:
where . Thus, we have
Another application of the mean-value theorem yields:
where . Thus, we obtain
Since implies , the continuous mapping theorem yields:
Hence, Slutsky’s theorem and AS4 yield
■
References
- H. Joe. “Relative entropy measures of multivariate dependence.” J. Am. Stat. Assoc. 84 (1989): 157–164.
- C. Granger, and J.L. Lin. “Using the mutual information coefficient to identify lags in nonlinear models.” J. Time Ser. Anal. 15 (1994): 371–384.
- V.H. De la Peña, R. Ibragimov, and S. Sharakhmetov. “Characterizations of joint distributions, copulas, information, dependence and decoupling, with applications to time series.” In Optimality, Institute of Mathematical Statistics Lecture Notes—Monograph Series 49. Beachwood, OH, USA: The Institute of Mathematical Statistics, 2006, pp. 183–209.
- A. Golan. “Information and entropy econometrics: Editor’s view.” J. Econom. 107 (2002): 1–15.
- J.M. Sarabia, and E. Gomez-Deniz. “Construction of multivariate distributions: A review of some recent results.” Stat. Oper. Res. Trans. 32 (2008): 3–36.
- A. Golan. “Information and entropy econometrics—Volume overview and synthesis.” J. Econom. 138 (2007): 379–387.
- I. Usta, and Y.M. Kantar. “On the performance of the flexible maximum entropy distributions within partially adaptive estimation.” Comput. Stat. Data Anal. 55 (2011): 2172–2182.
- E.T. Jaynes. “Information theory and statistical mechanics.” Phys. Rev. 106 (1957): 620–630.
- M. Rockinger, and E. Jondeau. “Entropy densities with an application to autoregressive conditional skewness and kurtosis.” J. Econom. 106 (2002): 119–142.
- E. Maasoumi, and J. Racine. “Entropy and predictability of stock market returns.” J. Econom. 107 (2002): 291–312.
- R.K. Hang. “Maximum entropy estimation of density and regression functions.” J. Econom. 56 (1993): 397–400.
- D.J. Miller, and W. Liu. “On the recovery of joint distributions from limited information.” J. Econom. 107 (2002): 259–274.
- A.J. Patton. “On the out-of-sample importance of skewness and asymmetric dependence for asset allocation.” J. Financ. Econom. 2 (2004): 130–168.
- J.C. Rodriguez. “Measuring financial contagion: A copula approach.” J. Empir. Financ. 14 (2007): 401–423.
- L. Chollete, A. Heinen, and A. Valdesogo. “Modeling international financial returns with a multivariate regime-switching copula.” J. Financ. Econom. 7 (2009): 437–480.
- C. Ning, D. Xu, and T.S. Wirjanto. “Is volatility clustering of asset returns asymmetric?” J. Bank. Financ. 52 (2015): 62–76.
- A.J. Patton. “Copula-based models for financial time series.” In Handbook of Financial Time Series. Edited by T. Mikosch, J.-P. Kreiß, R.A. Davis and T.G. Andersen. Berlin, Heidelberg, Germany: Springer-Verlag, 2009, pp. 767–785.
- Y. Fan, and A.J. Patton. “Copulas in econometrics.” Annu. Rev. Econ. 6 (2014): 179–200.
- R.B. Nelsen. An Introduction to Copulas. New York, NY, USA: Springer-Verlag, 1998.
- B. Chu. “Recovering copulas from limited information and an application to asset allocation.” J. Bank. Financ. 35 (2011): 1824–1842.
- M.A.H. Dempster, E.A. Medova, and S.W. Yang. “Empirical copulas for CDO tranche pricing using relative entropy.” Int. J. Theor. Appl. Financ. 10 (2007): 679–701.
- C.A. Friedman, and J. Huang. “Most Entropic Copulas: General Form, and Calibration to High-Dimensional Data in an Important Special Case.” SSRN Electron. J., 2010.
- A. Veremyev, P. Tsyurmasto, S. Uryasev, and R.T. Rockafellar. “Calibrating probability distributions with convex-concave-convex functions: Application to CDO pricing.” Comput. Manag. Sci. 11 (2014): 341–364.
- N. Zhao, and W.T. Lin. “A copula entropy approach to correlation measurement at the country level.” Appl. Math. Comput. 218 (2011): 628–642.
- A. Golan, G. Judge, and D. Miller. Maximum Entropy Econometrics: Robust Estimation with Limited Data. New York, NY, USA: John Wiley & Sons, 1996.
- X. Wu. “Calculation of maximum entropy densities with application to income distribution.” J. Econom. 115 (2003): 347–354.
- X. Wu, and J.M. Perloff. “GMM estimation of a maximum entropy distribution with interval data.” J. Econom. 138 (2007): 532–546.
- A. Zellner, and R.A. Highfield. “Calculation of maximum entropy distributions and approximation of marginal posterior distributions.” J. Econom. 37 (1988): 195–209.
- A. Sklar. Fonctions de Répartition à n Dimensions et Leurs Marges. Paris, France: Publications de l’Institut Statistique de l’Université de Paris, 1959, pp. 229–231.
- J. Hajek, and Z. Sidak. Theory of Rank Tests. New York, NY, USA: Academic Press, 1967.
- C. Genest, and J.-F. Plante. “On Blest’s measure of rank correlation.” Can. J. Stat. 31 (2003): 1–18.
- R.A. Gideon, and R.A. Hollister. “A rank correlation coefficient resistant to outliers.” J. Am. Stat. Assoc. 82 (1987): 656–666.
- J.L. Troutman. Variational Calculus and Optimal Control, 2nd ed. New York, NY, USA; Berlin, Heidelberg, Germany: Springer, 1996.
- T. Csendes. “Nonlinear parameter estimation by global optimization—Efficiency and reliability.” Acta Cybern. 8 (1988): 361–370.
- W. Härdle, M. Müller, S. Sperlich, and A. Werwatz. Nonparametric and Semiparametric Models. Springer Series in Statistics; Berlin, Heidelberg, Germany; New York, NY, USA: Springer-Verlag, 2004.
- M. Omelka, I. Gijbels, and N. Veraverbeke. “Improved kernel estimation of copulas: Weak convergence and goodness-of-fit testing.” Ann. Stat. 37 (2009): 3023–3058.
- C. Genest. “Frank’s family of bivariate distributions.” Biometrika 74 (1987): 549–555.
- R.B. Nelsen. “Properties of a one-parameter family of bivariate distributions with specified marginals.” Commun. Stat. Theory Methods 15 (1986): 3277–3285.
- S.X. Chen, and T.-M. Huang. “Nonparametric estimation of copula functions for dependence modelling.” Can. J. Stat. 35 (2007): 265–282.
- I. Gijbels, and J. Mielniczuk. “Estimating the density of a copula function.” Commun. Stat. Theory Methods 19 (1990): 445–464.
- A. Pagan, and A. Ullah. Nonparametric Econometrics, 1st ed. Themes in Modern Econometrics; Cambridge, UK: Cambridge University Press, 1999.
- A.N. Shiryaev. Probability, 2nd ed. Graduate Texts in Mathematics; New York, NY, USA; Berlin, Heidelberg, Germany: Springer-Verlag, 1995, Volume 95.
- A.D. Ioffe, and V.M. Tihomirov. Theory of Extremal Problems. Edited by J.L. Lions, G. Papanicolaou and R.T. Rockafellar. Studies in Mathematics and Its Applications; Amsterdam, The Netherlands; New York, NY, USA; Oxford, UK: North Holland Publishing Company, 1979, Volume 6.
- Y.A. Kutoyants. Statistical Inference for Ergodic Diffusion Processes. Springer Series in Statistics; London, UK; Berlin, Heidelberg, Germany: Springer-Verlag, 2004.
- M. Abramowitz, and I.A. Stegun. Handbook of Mathematical Functions. New York, NY, USA: Dover Publications, 1972.
- D.P. Bertsekas. Convex Analysis and Optimization. Belmont, MA, USA: Athena Scientific, 2003.
- 1. We are indebted to a referee for pointing this out.
- 2. We are indebted to a referee for suggesting this point to us.
- 3. Jeffreys, H. (1961). Theory of Probability. Oxford: Clarendon, pp. 2–3.
© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).