Abstract
There are many families of bivariate distributions with given marginals. Most families, such as the Farlie–Gumbel–Morgenstern (FGM) and the Ali–Mikhail–Haq (AMH), are absolutely continuous, with an ordinary probability density. In contrast, there are few families with a singular part or a positive mass on a curve. We define a general condition that is useful for detecting the singular part of a distribution. By continuous extension of the bivariate diagonal expansion, we define and study a wide family containing these singular distributions, obtain the probability density, and find the canonical correlations and functions. The set of canonical correlations is described by a continuous function rather than a countable sequence. An application to statistical inference is given.
Keywords:
bivariate copulas; stochastic dependence; correlation functions; singularity along a curve; continuous dimensionality
MSC:
62H20; 60E05
1. Introduction
In probability theory, a copula is a bivariate cumulative distribution function (cdf) with uniform marginals, which captures the dependence properties of two random variables defined on the same probability space.
Constructing copulas is important because they are versatile and allow us to generate bivariate distributions. A copula can model the dependence between two random variables, without the influence of marginal distributions. Copulas have applications in finance, credit risk, insurance, hydrology, physics, psychometrics, quality control, statistics, and other fields. Most copulas have absolutely continuous distributions, but there are copulas containing a singular part. These copulas are useful in situations where there are coincidences between the variables.
Section 2 defines a general family of copulas. Section 3 describes the absolutely continuous and singular parts (which could be non-null) of a copula, showing that this family can deal with copulas with a non-null singular part. In Section 3, a general definition of singularity is introduced, which makes it possible to obtain the probability density with respect to a suitable measure. Section 4 is devoted to the canonical correlation analysis of a copula with a singular part. The concept of singularity is extended in Section 5. Section 6 studies the singularity of general bivariate distributions. An application to Bayesian statistics is proposed in Section 7.
We use the following notations
where and W are copulas. The quotient will have an important role in defining singularities.
W and M are the Fréchet-Hoeffding bounds. Any copula C satisfies
uniformly in Both W and M are singular copulas. The “shuffle of min” is another example of a singular copula. But most distributions are absolutely continuous and there are few probability models with a singular part. This is studied in Section 3.
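The Fréchet–Hoeffding inequality mentioned above can be checked numerically. The following is a minimal sketch, assuming the standard forms W(u, v) = max(u + v − 1, 0), M(u, v) = min(u, v) and the independence copula uv; the helper name `within_bounds` and the grid size are illustrative choices, not part of the original text.

```python
def W(u, v):
    """Lower Frechet-Hoeffding bound (standard form, assumed here)."""
    return max(u + v - 1.0, 0.0)


def M(u, v):
    """Upper Frechet-Hoeffding bound (standard form, assumed here)."""
    return min(u, v)


def Pi(u, v):
    """Independence copula."""
    return u * v


def within_bounds(C, n=51, tol=1e-12):
    """Check W <= C <= M on an n-by-n grid of the unit square.

    This is only a necessary condition for C to be a copula.
    """
    grid = [i / (n - 1) for i in range(n)]
    return all(
        W(u, v) - tol <= C(u, v) <= M(u, v) + tol
        for u in grid
        for v in grid
    )


# The independence copula and a convex mixture of M and Pi pass the check;
# sqrt(u * v) exceeds M at some grid points, so it cannot be a copula.
print(within_bounds(Pi))
print(within_bounds(lambda u, v: 0.3 * M(u, v) + 0.7 * Pi(u, v)))
print(within_bounds(lambda u, v: (u * v) ** 0.5))
```

Passing this check does not prove that a function is a copula, but failing it rules the function out.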
For properties and construction of copulas, see [1,2,3,4,5]. For applications in finance (including copulas with a singular component) and marketing, see [6,7]. For general applications, see [3,8].
Based on [9], we present a general method of generating copulas, putting special emphasis on constructing copulas with a singular part.
2. Correlation Functions and Families
We indicate the unit interval by and the unit square by In all cases, we suppose
Definition 1.
A parametric canonical correlation function is an integrable function
Definition 2.
A quotient function is a two-variable function satisfying
The adjective “canonical” for a correlation function is justified in Section 4. It can also be called a “dependence generator”. For the sake of simplicity, and following the terminology used in [9], we say “correlation function”.
Examples of correlation and quotient functions are
Note that is the quotient of two copulas. In general, for two arbitrary copulas
gives rise to a quotient function.
The definition below is a continuous extension of the diagonal expansion of a bivariate distribution. It is mainly based on [9], but it is presented here as a general family constructed by combining correlation and quotient functions. This family is quite useful for constructing copulas with a singular part.
Definition 3.
Given a correlation function and a quotient function we define the general family of copulas
Properties. Most of them are readily proved.
- Independence copula. If then
- Self-generation. If and where C is any copula, then
- If Q is a quotient function then is also a quotient function.
- If is the quotient of two copulas then
- If C is a copula and then
- Quadrant dependence. Let and be Spearman’s rank correlation and Kendall’s correlation coefficients, respectively. We have: is positive quadrant dependent (PQD) if Then and are positive. is negative quadrant dependent (NQD) if Then and are negative. Using a simplified notation, Spearman’s rank correlation coefficient is given by where d Then if (PQD) and if (NQD). Kendall’s tau is If (PQD) then shows that Analogously, (NQD) implies .
- Fréchet family. If and then
As a useful alternative to Definition 3, we give an equivalent expression for family (1).
Definition 4.
If and Q are correlation and quotient functions, we define the general family of copulas
where is a primitive of
Clearly, from the above property 3,
is also a copula, being related to by
However, (2) may fail to provide a copula for some choices of and Q. For instance, and give But this is not a copula for .
- Examples.
- With and we obtain the FGM copula and
- With and we obtain the AMH copula and
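The two examples above can be explored numerically. The sketch below assumes the standard closed forms of the FGM and AMH copulas (it does not reproduce the correlation and quotient functions that generate them within family (1)) and approximates Spearman's rank correlation by the usual formula 12 ∫∫ (C(u, v) − uv) du dv with a midpoint rule. The function names and the grid size are illustrative.

```python
def fgm(u, v, theta):
    """FGM copula, standard closed form, theta in [-1, 1]."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def amh(u, v, theta):
    """AMH copula, standard closed form, theta in [-1, 1)."""
    return u * v / (1.0 - theta * (1.0 - u) * (1.0 - v))


def spearman_rho(C, n=200):
    """Midpoint-rule approximation of 12 * integral of (C(u, v) - u v)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += C(u, v) - u * v
    return 12.0 * total * h * h


theta = 0.5
rho_fgm = spearman_rho(lambda u, v: fgm(u, v, theta))
rho_amh = spearman_rho(lambda u, v: amh(u, v, theta))

# For the FGM copula the exact value is theta / 3.
print(rho_fgm, theta / 3.0)
# For theta > 0 the AMH copula is PQD, so its rho is positive.
print(rho_amh)
```

Both copulas are absolutely continuous, so neither puts positive mass on the diagonal.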
Remark 1.
In some sense, family (1) is a continuous extension of the diagonal expansion
where The set is the countable sequence of canonical correlations and are the related sequences of canonical variables and functions of U and respectively. This expansion can be obtained integrating the Lancaster diagonal expansion of a bivariate density [8,10,11,12,13,14].
3. Singular Copulas
From the above property 5, the quotient function satisfies
Thus, is an infimum quotient function that may provide a class of copulas with singular parts.
Let us consider the class of copulas constructed from a correlation function and a fixed quotient
Proposition 1.
If is increasing in θ, then the class with is ordered in
Proof.
If for , then
□
Proposition 2.
If is increasing in θ and we suppose then
Proof.
It follows from considering the primitive of □
As a consequence, is a supremum copula for the sub-family generated by and where is fixed and Q may vary.
If is a primitive of an equivalent expression for this supremum family, generated by and achieved in is given by
This class of copulas was (implicitly) introduced in [15] and studied in [16]. We next study this class for different correlation functions.
3.1. Defining Singularity
Let C be a general copula. Suppose that the partial derivatives and exist. Consider the step function
This function is the limit of as with minus the limit of as with If the bivariate distribution is absolutely continuous, then
If the joint distribution of is it is worth noting [4] that, for any
and this partial derivative exists for almost all Therefore, in (4) means that the conditional distribution function of V given has a discontinuity at
Indeed, any copula C defines a measure in which has an absolutely continuous part and a singular part, i.e.,
C is absolutely continuous if whereas C is singular if In short, C has a singular part if there exists a non-empty Borel set with Lebesgue measure but In plain words, the “area” of is zero but the probability is positive. For instance, has a singular part if is a line. See [17,18,19] for further details.
We introduce a class of copulas with singular part.
Definition 5.
Suppose that the cdf of is the copula We say that the joint distribution is M-singular if g defined in (4) satisfies
This means that there is positive probability concentrated on the diagonal of Note that has zero Lebesgue measure, i.e.,
Now we consider the class (2) with Let be the Lebesgue measure on the diagonal Dirac’s delta function is the indicator of the diagonal i.e., if and 0 if Similarly,
Theorem 1 can be proven using Schwartz’s distribution theory [15,20] or by means of the Radon–Nikodym theorem [17,18,21,22]. We present a more accessible proof by proper use of limits and integrals, which can be quite useful in practice. See Appendix A.
Theorem 1.
Suppose that has the joint cdf
Let and be the Lebesgue measures on and the diagonal of , respectively. The probability density of with respect to the measure is given by
Proof.
If the second partial derivative of is given by
Considering defined in (4), we have
as these integrals give the mass in plus the mass on the line from to where The second integral is
This expression may be interpreted by considering
with arbitrarily small. Accordingly, the limit as may be informally understood as a kind of second partial derivative at post-multiplied by
Let us find an explicit expression for If , the partial derivative is
If , the partial derivative is
The difference is and the limit as is □
Proposition 3.
The function is the correlation function. Hence,
Proof.
is a primitive of □
Proposition 4.
The probability of coincidence is
Proof.
and , so we should consider only the mass on □
3.2. Examples of M-Singular Copulas
1. Fréchet copula. This copula is the weighted arithmetic mean of M and
We have The second partial derivative gives the constant We also find The probability density is
and
2. The Cuadras–Augé copula is the weighted geometric mean of M and
Obtaining the derivatives we find
Computing we obtain
3. From the correlation function , we obtain
The probability density is
Also,
4. From , we obtain
The probability density is
where Then,
See [4,19] for more examples of copulas with singular parts.
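The mass on the diagonal in the first two examples can be approximated numerically from the jump of the partial derivative ∂C/∂u across the line u = v. The sketch below assumes the standard forms of the Fréchet mixture θM + (1 − θ)uv and of the Cuadras–Augé copula min(u, v)^θ (uv)^(1−θ). The sign convention of the jump (limit from u < v minus limit from u > v), the finite-difference step and the function names are assumptions of this sketch.

```python
def frechet(u, v, theta):
    """Weighted arithmetic mean of M and the independence copula."""
    return theta * min(u, v) + (1.0 - theta) * u * v


def cuadras_auge(u, v, theta):
    """Weighted geometric mean of M and the independence copula."""
    return min(u, v) ** theta * (u * v) ** (1.0 - theta)


def jump_on_diagonal(C, t, eps=1e-6):
    """Jump of dC/du across the diagonal at (t, t).

    Central differences are taken slightly to the left (u < v) and
    slightly to the right (u > v) of the diagonal, so that neither
    stencil crosses the line u = v.
    """
    h = eps / 2.0
    left = (C(t - eps + h, t) - C(t - eps - h, t)) / (2.0 * h)
    right = (C(t + eps + h, t) - C(t + eps - h, t)) / (2.0 * h)
    return left - right


def diagonal_mass(C, n=2000):
    """Midpoint-rule integral of the jump along the diagonal."""
    h = 1.0 / n
    return sum(jump_on_diagonal(C, (i + 0.5) * h) for i in range(n)) * h


theta = 0.5
mass_frechet = diagonal_mass(lambda u, v: frechet(u, v, theta))
mass_ca = diagonal_mass(lambda u, v: cuadras_auge(u, v, theta))

# Known values: theta for the Frechet mixture and
# theta / (2 - theta) for the Cuadras-Auge copula.
print(mass_frechet, theta)
print(mass_ca, theta / (2.0 - theta))
```

Applied to an absolutely continuous copula such as uv, the same routine returns a value close to zero, since the partial derivative has no jump.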
4. Canonical Analysis of a Copula
Let C be a copula with a cdf of the random vector Consider the kernels and If and are functions of bounded variation, the covariance between and is [23]:
The variance of is
and similarly, var In particular, if and is symmetric in , the correlation coefficient between and is
The notation is justified below
Therefore, we can write the correlation as
Our aim is to find the pairs of canonical functions and correlations for a copula In particular,
is the first canonical correlation. This functional analysis approach is related to seeking the eigenpairs of the symmetric kernel with respect to
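The covariance formula of [23] used at the start of this section, in its usual form cov(α(U), β(V)) = ∫∫ (C(u, v) − uv) dα(u) dβ(v), can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the FGM copula with its standard density, takes α(u) = β(u) = u² (so that dα(u) = 2u du), and compares the kernel integral with the covariance computed directly from the density. The function names and grid size are hypothetical choices.

```python
def fgm(u, v, theta):
    """FGM copula (standard closed form)."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def fgm_density(u, v, theta):
    """Density of the FGM copula."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


def midpoints(n):
    """Midpoints of n equal cells of [0, 1] and the cell width."""
    h = 1.0 / n
    return [(i + 0.5) * h for i in range(n)], h


def cov_direct(theta, n=200):
    """E[a(U) b(V)] - E[a(U)] E[b(V)] with a(u) = b(u) = u**2."""
    pts, h = midpoints(n)
    e_ab = sum(
        u * u * v * v * fgm_density(u, v, theta) for u in pts for v in pts
    ) * h * h
    e_a = sum(u * u for u in pts) * h  # same value for U and V (uniform)
    return e_ab - e_a * e_a


def cov_kernel(theta, n=200):
    """Integral of (C(u, v) - u v) da(u) db(v), with da(u) = 2 u du."""
    pts, h = midpoints(n)
    return sum(
        (fgm(u, v, theta) - u * v) * 2.0 * u * 2.0 * v
        for u in pts
        for v in pts
    ) * h * h


theta = 0.6
# Both quantities approximate theta / 36 for this choice of functions.
print(cov_direct(theta), cov_kernel(theta), theta / 36.0)
```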
Definition 6.
A generalized eigenfunction, eigenvalue, of K with respect to L is a pair that satisfies in the sense that
for all
Clearly, if with is an eigenpair of K with respect to then
This leads us to consider the canonical pairs as eigenpairs.
Definition 7.
For arbitrarily small and , we define
the limit of which is the indicator function i.e.,
Propositions 5–8 contain preliminary results, which are useful to prove Theorem 2.
Proposition 5.
We have and
Proof.
Similarly,
□
Proposition 6.
Suppose that is M-singular. Then,
Proof.
As this reduces to □
Proposition 7.
Suppose that the correlation function generates the copula Consider the symmetric kernels and Then,
Proof.
Clearly, as Then,
where □
Proposition 8.
Suppose that is M-singular. Consider the symmetric kernels Then,
Proof.
From (7) with and taking the limit reduces to
Since and as an equivalent proof follows from
This limit gives □
Theorem 2.
The set of canonical functions and correlations for the M-singular family is where = is the indicator of γ and is the correlation function.
Proof.
As , it is clear that , so
Thus, □
Remark 2.
If is a continuous function, it is worth noting that the set of canonical correlations has the power of the continuum.
Examples of Eigenpairs
- Fréchet copula. We have For a fixed parameter the set of canonical functions and correlations is Note that is an eigenvalue of continuous multiplicity. In fact, any function is eigenfunction. Also note that is the correlation coefficient.
- Cuadras–Augé copula. The correlation function is The set of canonical functions and correlations is Each eigenvalue is simple and we have a continuous set of eigenvalues. Note that is the maximum canonical correlation.
- For the family the correlation function is The set of canonical functions and correlations is
- For the family the correlation function is The set of canonical functions and correlations is
5. Extended Singularity
Let be a random vector with cdf of the copula To define the singularity on the second diagonal of , we consider the joint distribution of
Definition 8.
Suppose that the cdf of is the copula We say that the joint distribution of is W-singular if the distribution of is M-singular.
Proposition 9.
The cdf of a W-singular copula is
where is the primitive of and f is a correlation function.
Proof.
Theorem 3.
Let be a W-singular copula; see (8). The probability density with respect to where is the Lebesgue measure on and is the Lebesgue measure on the diagonal is given by
where
Proof.
Taking into account the step of at the diagonal the proof is quite similar to the one given in Theorem 1. The cdf (8) can be expressed as
where and □
Examples of W-Singular Copulas
- Fréchet. If with , we obtain the weighted average of the lower bound W and The density is
- Cuadras–Augé. If with , then The density is
From and this family reduces to if and to W if
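The passage from M-singular to W-singular copulas can be illustrated with the standard reflection identity: if C is the copula of (U, V), then the copula of (U, 1 − V) is u − C(u, 1 − v). The sketch below applies this identity to the Fréchet mixture of M and uv and compares the result with the weighted average of W and uv on a grid. The names `reflect` and `max_gap` are hypothetical helpers introduced for this illustration.

```python
def W(u, v):
    """Lower Frechet-Hoeffding bound."""
    return max(u + v - 1.0, 0.0)


def frechet_m(u, v, theta):
    """Weighted average of M and the independence copula (M-singular)."""
    return theta * min(u, v) + (1.0 - theta) * u * v


def frechet_w(u, v, theta):
    """Weighted average of W and the independence copula (W-singular)."""
    return theta * W(u, v) + (1.0 - theta) * u * v


def reflect(C):
    """Copula of (U, 1 - V) when C is the copula of (U, V)."""
    return lambda u, v: u - C(u, 1.0 - v)


def max_gap(theta, n=51):
    """Largest grid discrepancy between the reflected M-mixture and the W-mixture."""
    grid = [i / (n - 1) for i in range(n)]
    reflected = reflect(lambda u, v: frechet_m(u, v, theta))
    return max(
        abs(reflected(u, v) - frechet_w(u, v, theta))
        for u in grid
        for v in grid
    )


# The discrepancy is at the level of floating-point rounding.
print(max_gap(0.4))
```

Since the reflection is an involution, applying `reflect` twice recovers the original copula up to rounding.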
6. Bivariate Singular Distributions
Let be a random vector with joint cdf H and univariate marginals From Sklar’s theorem [1,4], there exists a copula such that H can be expressed as
For example, considering the family (2), we have
where is a quotient function.
The diagonal of now becomes the curve with implicit equation The singularity is along this curve and the density is
where and
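As an illustration of how the singularity moves from the diagonal to a curve, the sketch below combines the Fréchet mixture of M and uv with two exponential marginals through Sklar's representation H(x, y) = C(F(x), G(y)). The exponential rates, the sampling scheme (comonotone with probability θ, independent otherwise) and all function names are assumptions made for this example. The fraction of simulated points that fall on the curve F(x) = G(y) should be close to θ.

```python
import math
import random

RATE_X, RATE_Y = 1.0, 2.0  # hypothetical exponential rates


def F(x):
    """Marginal cdf of X (exponential, assumed for illustration)."""
    return 1.0 - math.exp(-RATE_X * x)


def G(y):
    """Marginal cdf of Y (exponential, assumed for illustration)."""
    return 1.0 - math.exp(-RATE_Y * y)


def frechet(u, v, theta):
    """Weighted average of M and the independence copula."""
    return theta * min(u, v) + (1.0 - theta) * u * v


def joint_cdf(x, y, theta):
    """Sklar's representation H(x, y) = C(F(x), G(y))."""
    return frechet(F(x), G(y), theta)


def curve(x):
    """Solution y of F(x) = G(y) for the two exponential marginals."""
    return RATE_X * x / RATE_Y


def sample(theta, rng):
    """Draw (X, Y): V = U with probability theta, independent otherwise."""
    u = rng.random()
    v = u if rng.random() < theta else rng.random()
    x = -math.log1p(-u) / RATE_X  # inverse of F
    y = -math.log1p(-v) / RATE_Y  # inverse of G
    return x, y


def fraction_on_curve(theta, n=20000, seed=0):
    """Share of simulated points lying on the curve F(x) = G(y)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y = sample(theta, rng)
        if math.isclose(y, curve(x), rel_tol=1e-12, abs_tol=1e-15):
            hits += 1
    return hits / n


print(fraction_on_curve(0.3))
```

The curve has zero area in the plane, yet it carries a share of the simulated points close to θ, which is the behaviour described in the text.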
Next, we introduce a non-linear singularity on a general curve i.e., along the points with coordinates
Definition 9.
We say that the bivariate cdf is φ-singular if
satisfies
Thus, if is M-singular, then is -singular, where
There are more constructions of distributions with singular components.
6.1. Regression Family
An alternative construction of -singular distributions is as follows [24]. Suppose that X and Y have the same support . Let be a real function with positive derivative Consider the inverse function
A family of distributions with singular parts is
where, for ,
is a univariate cdf.
Proposition 10.
The family (10) is φ-singular for such that is a cdf. The density with respect to the measure where is the Lebesgue measure on and is the Lebesgue measure on the curve is given by
Proof.
The difference (9) is and the second-order derivative is
Note the stochastic independence if □
This family has an interesting property.
Proposition 11.
Suppose that X and Y have absolutely continuous distributions and the expectations exist. The regression curve is , where
with Hence, is linear in
Proof.
If the regression curve is
We use the change and agree [25] that □
This regression family can be generated by the initial model (2), as a consequence of the self-generation property. Namely, consider Then, and can be expressed as
where
6.2. Another Extension
We may generalize the family (3) by replacing with where is a function such that The extended family is
Proposition 12.
The family is φ-singular. The probability density with respect to the measure where is the Lebesgue measure on the curve is given by
where
Proof.
The partial derivative is
The difference in the limits as is We similarly obtain the second-order partial derivative □
7. An Application to Statistical Inference
Consider two independent binomial distributions and the null hypothesis If and (diagonal of ), then we accept if
The classical approach interprets as fixed parameters and uses the chi-squared test. The Bayesian approach interprets as random quantities and postulates a prior probability distribution. The probability of is The null hypothesis is accepted if has probability Since is a set such that the prior distribution concentrates mass on [26]. Indeed, the prior distribution must be M-singular, in order to assign positive probability to We accept the null hypothesis if the probability of is
This implies that Four suitable correlation functions are
Therefore, has the prior density
with respect to the measure where and the right sides of
are given in Theorem 1.
Once has been chosen, we construct Then, from statistical data, e.g., the frequencies of the events with probabilities the decision can be made using the Bayes factor [27,28]
where L is the likelihood function and reduces to in (where ), and to in
Note the use of averages (Bayes factor) as opposed to the use of an eigenpair (likelihood ratio in the frequentist approach).
If the null hypothesis is this proposal suggests working with W-singular copulas. Of course, all this can be generalized to other comparison tests, with data drawn from normal, exponential, logistic, etc., distributions.
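The Bayes factor computation for the two-binomial test can be sketched as follows. This is a minimal illustration under a strong simplifying assumption: the prior is of Fréchet type, that is, uniform on the diagonal with total mass θ and uniform on the unit square with total mass 1 − θ, which corresponds to a constant correlation function. Under this assumption both averaged likelihoods have closed forms in terms of the beta function. The function names and the sample counts are hypothetical.

```python
from math import comb, exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bayes_factor_01(k1, n1, k2, n2):
    """Bayes factor of H0: p1 = p2 against the alternative.

    Assumes a uniform prior on the diagonal under H0 and a uniform
    prior on the unit square under the alternative.
    """
    # Averaged likelihood on the diagonal: both samples share one p.
    log_m0 = (
        log(comb(n1, k1))
        + log(comb(n2, k2))
        + log_beta(k1 + k2 + 1, n1 + n2 - k1 - k2 + 1)
    )
    # Averaged likelihood on the square: each binomial integrates to 1 / (n + 1).
    log_m1 = -log(n1 + 1) - log(n2 + 1)
    return exp(log_m0 - log_m1)


def posterior_prob_h0(k1, n1, k2, n2, theta):
    """Posterior probability of H0 when the prior mass on the diagonal is theta."""
    b = bayes_factor_01(k1, n1, k2, n2)
    return theta * b / (theta * b + 1.0 - theta)


# Equal observed frequencies favour H0; opposite extremes do not.
print(bayes_factor_01(5, 10, 5, 10))
print(bayes_factor_01(0, 10, 10, 10))
print(posterior_prob_h0(5, 10, 5, 10, theta=0.5))
```

With a non-constant correlation function the mass along the diagonal is no longer uniform, so the numerator would require a weighted integral instead of the closed form used here.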
8. Discussion, Conclusions and Future Work
Starting from a correlation function (dependence generator), we studied several methods for constructing copulas with singular parts. The singularity is defined by a line with equation having a positive probability. If is linear, we obtain singularities related to the Fréchet–Hoeffding bounds M and The function can be non-linear. We study a case in which is increasing. The decreasing and the general cases can be approached by using the extensions proposed in [24]. We obtain the probability density related to the singular part, a function that defines the continuous set of canonical correlations. This set is uncountable rather than countable (Mercer’s theorem [25]).
A uniparametric procedure follows from the direct application of the above models. For instance, if we have dimension we may consider
Then, and are dimensional copulas with singular parts. See an application in [29].
More generally, we can naturally define the family
where is a correlation function and is a dimensional quotient function. For example, See [15,24] for other multivariate families of distributions with singular parts.
An application to Bayesian inference is discussed, showing that the singularity of the prior distribution is implicit in some tests. This approach to hypothesis testing justifies the M-singular copulas. If the null hypothesis is , we should use W-singular copulas.
The properties obtained via integral operators and eigenanalysis on two kernels are useful for symmetric copulas. It is an open question to find the additional conditions for the correlation and quotient functions to ensure that these models provide a copula, and to perform a generalization to non-symmetric copulas [30]. This challenge may be solved by functional singular value decomposition.
Funding
This research received no external funding.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Acknowledgments
I am indebted to three anonymous reviewers for useful comments that improved this article.
Conflicts of Interest
The author declares no conflicts of interest.
Appendix A
The following list of references by topics may be helpful.
Definition of copula: [1] p. 10, [2] p. 12, [4] p. 10.
Families of copulas: [2] chap. 5, [4] chap. 4, [9,15,16].
Absolutely continuous and singular parts: [17] p. 59, [18] p. 247.
Canonical analysis: [3] p. 49, [10] p. 108, [13] p. 582.
Diagonal expansion: [8] p. 41, [9,10] p. 248, [12] chap. 6.
Sklar’s theorem: [1] p. 42, [2] p. 13, [4] p. 17.
Distribution theory: [20] chap. 2.
Radon–Nikodym theorem: [17] p. 63, [18] p. 193, [21] p. 196, [22].
Mercer’s theorem: [25] p. 271.
Dirac delta function: [25] p. 303.
Bayesian inference, Bayes factor: [26,27] p. 153, [28] p. 30.
References
1. Durante, F.; Sempi, C. Principles of Copula Theory; CRC Press: Boca Raton, FL, USA; Chapman and Hall: London, UK; New York, NY, USA, 2016.
2. Joe, H. Multivariate Models and Dependence Concepts; Chapman and Hall: London, UK, 1997.
3. Mardia, K.V. Families of Bivariate Distributions; Charles Griffin: London, UK, 1970.
4. Nelsen, R.B. An Introduction to Copulas, 2nd ed.; Springer: New York, NY, USA, 2006.
5. Drouet Mari, D.; Kotz, S. Correlation and Dependence; Imperial College Press: London, UK, 2004.
6. Cherubini, U.; Gobbi, F.; Mulinacci, S.; Romagnoli, S. Dynamic Copula Methods in Finance; Wiley: New York, NY, USA, 2012.
7. van der Goorbergh, R.W.J.; Genest, C.; Werker, B.J.M. Bivariate option pricing using dynamic copula models. Insur. Math. Econ. 2005, 37, 101–114.
8. Hutchinson, T.P.; Lai, C.D. The Engineering Statistician’s Guide to Continuous Bivariate Distributions; Rumsby Scientific Pub.: Adelaide, Australia, 1991.
9. Cuadras, C.M. Contributions to the diagonal expansion of a bivariate copula with continuous extensions. J. Multivar. Anal. 2015, 139, 28–44.
10. Greenacre, M.J. Theory and Applications of Correspondence Analysis; Academic Press: London, UK, 1983.
11. Lancaster, H.O. The structure of bivariate distributions. Ann. Math. Stat. 1958, 29, 719–736.
12. Lancaster, H.O. The Chisquared Distribution; Wiley: New York, NY, USA, 1969.
13. Rao, C.R. Linear Statistical Inference and Its Applications; Wiley: New York, NY, USA, 1973.
14. Cuadras, C.M. Correspondence analysis and diagonal expansions in terms of distribution functions. J. Stat. Plan. Inference 2002, 103, 137–150.
15. Cuadras, C.M.; Augé, J. A continuous general multivariate distribution and its properties. Commun. Stat.-Theory Methods 1981, A10, 339–353.
16. Durante, F. A new family of symmetric bivariate copulas. C. R. Acad. Sci. Paris Ser. I 2007, 344, 195–198.
17. Ash, R.B. Real Analysis and Probability; Academic Press: New York, NY, USA, 1972.
18. Chow, Y.S.; Teicher, H. Probability Theory: Independence, Interchangeability, Martingales; Springer: New York, NY, USA, 1978.
19. Durante, F.; Fernández-Sánchez, J.; Sempi, C. A note on the notion of singular copula. Fuzzy Sets Syst. 2013, 211, 120–122.
20. Schwartz, L. Théorie des Distributions; Hermann: Paris, France, 1957.
21. Munroe, M.E. Introduction to Measure and Integration; Addison Wesley Pub. Co.: Reading, MA, USA, 1973.
22. Ruiz-Rivas, C.; Cuadras, C.M. Inference properties of a one-parameter curved exponential family of distributions with given marginals. J. Multivar. Anal. 1988, 27, 447–456.
23. Cuadras, C.M. On the covariance between functions. J. Multivar. Anal. 2002, 81, 19–27.
24. Cuadras, C.M. Distributions with Given Marginals and Given Regression Curve; Lecture Notes-Monograph Series; Institute of Mathematical Statistics: Hayward, CA, USA, 1996; Volume 28, pp. 76–83.
25. Thomas, J.B. An Introduction to Applied Probability and Random Processes; Wiley: New York, NY, USA, 1971.
26. Casella, G.; Moreno, E. Assessing robustness of intrinsic tests of independence in two-way contingency tables. J. Am. Stat. Assoc. 2009, 104, 1261–1271.
27. Garthwaite, P.H.; Jolliffe, I.; Jones, B. Statistical Inference; Prentice Hall: London, UK, 1995.
28. Girón, F.J. Bayesian Testing of Statistical Hypotheses; Royal Academy of Sciences, Ed.; Arguval: Málaga, Spain, 2021.
29. Pérez, A.; Prieto-Alaiz, M.; Chamizo, M.; Liebscher, E.; Úbeda-Flores, M. Nonparametric estimation of the multivariate Spearman’s footrule: A further discussion. Fuzzy Sets Syst. 2023, 467, 108489.
30. Marshall, A.W. Copulas, Marginals, and Joint Distributions; Lecture Notes-Monograph Series; Institute of Mathematical Statistics: Hayward, CA, USA, 1996; Volume 28, pp. 213–222.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).