1. Introduction
Descriptions and measurements of correlation and dependence between risks and losses have been important in various fields, such as finance, risk management, and actuarial studies. Bivariate copulas with different dependence structures can model the tail dependence in extreme risks [
1] while remaining independent of the marginal distributions of the risks. Ref. [
2] numerically and graphically illustrated the types of relationships that various copula-based measures of association can detect. Ref. [
3] addressed the mathematics of copula functions, illustrated with financial applications in derivative pricing and credit risk analysis. The seminal paper by [
4] demonstrated their practical applications, such as the estimation of joint-life mortality and multi-decrement models. Extreme or tail losses tend to occur together; see [
5]. Ref. [
6] employed copulas to study the effect of tail dependence and tailedness by quantifying extreme risks. Refs. [
7,
8] applied copula modeling to investigate the increasing hydroclimatic extremes associated with a warming climate; see also [
9] for flood and hydrological models. Overall, copula modeling has been shown to be an effective tool for analyzing dependence structures between variables.
Sklar’s theorem [
10] states that the joint cumulative distribution function (cdf) can be expressed as the copula evaluated at the marginal cdfs and, conversely, that the copula is uniquely determined by the joint cdf and the marginals when the marginals are continuous. For instance, the Gaussian copula is derived from the bivariate Gaussian distribution and can also be used to generate new bivariate probability distributions via (
A1) in the appendix; see [
11] for summaries of the methods of constructing copulas. The necessary preliminaries on bivariate copulas, collected in Appendix A, can be found in [
11,
12].
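As a minimal sketch of Sklar's representation, the snippet below builds a joint cdf by evaluating a copula at two marginal cdfs. The Clayton copula, the exponential marginals, and all parameter values are illustrative assumptions and are not taken from this paper.

```python
import math

def clayton_copula(u, v, theta=2.0):
    """Clayton copula C(u, v), theta > 0 (illustrative choice of copula)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def exp_cdf(x, rate):
    """Exponential marginal cdf (illustrative choice of marginal)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def joint_cdf(x, y, theta=2.0, rate_x=1.0, rate_y=0.5):
    """Sklar's representation H(x, y) = C(F1(x), F2(y))."""
    return clayton_copula(exp_cdf(x, rate_x), exp_cdf(y, rate_y), theta)

# Sending y far into the upper tail recovers the first marginal: H(x, inf) = F1(x).
x = 0.7
print(joint_cdf(x, 1e6), exp_cdf(x, 1.0))
```

The two printed values should agree up to rounding, which reflects the fact that the copula carries the dependence while the marginals are left untouched.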
Most recently, Ref. [
13] extended the traditional one-parameter Archimedean copulas by integrating the log-gamma-generated margins. Ref. [
14] proposed a class of transformations of the bivariate independence copula of the form
where
f is a twice differentiable function. Let
be the independence copula. Generalizing the FGM copula given by
, Refs. [
15,
16] constructed new copulas of the form
where
Q is a perturbation function involving trigonometric, hyperbolic, logarithmic, or exponential functions. Ref. [
7] used a truncation of the log-concave half-logistic distribution function
F as a multiplicative Archimedean generator to construct the copulas of the form
In this paper, we are concerned with the construction of new bivariate copulas via distortion functions. A function
T is called a distortion function if it is continuous and increasing on the unit interval $I = [0, 1]$, with $T(0) = 0$ and $T(1) = 1.$
A new family of copulas born of the distortion is given by
T is termed an admissible distortion if (
1) is a copula. If the initial copula is Archimedean with generator
then
is Archimedean with generator
; see [
17,
18]. Theorem 3.3.3 in [
11] (p. 96) shows that
T is admissible if and only if
T is increasing and convex; see also [
19]. This key result dictates the convexity requirement and opens the door for explorations of admissible distortion functions. Ref. [
20] showed that
T is admissible if
is log-convex and suggested several distortion functions. Ref. [
21] proposed to apply the distortion to the copula function only, marginals only, or both. The induced copula in (
1) is a result of distortions to both the copula and marginals.
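To make the construction concrete, the following sketch applies a distortion to both the copula and its arguments. It assumes the induced copula takes the form $C_T(u, v) = T(C(T^{-1}(u), T^{-1}(v)))$, which is the form consistent with the convexity requirement on T stated above. The helper `distort`, the power distortion $T(t) = t^a$, and the parameter values are hypothetical choices made for illustration only.

```python
import numpy as np

def clayton(u, v, theta):
    """Clayton copula, used here as the base copula C."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def distort(copula, T, T_inv):
    """Assumed form of the induced copula: C_T(u, v) = T(C(T_inv(u), T_inv(v)))."""
    return lambda u, v: T(copula(T_inv(u), T_inv(v)))

a, theta = 1.5, 2.0                       # hypothetical parameter values
T = lambda t: t ** a                      # power distortion, convex for a >= 1
T_inv = lambda t: t ** (1.0 / a)
C_T = distort(lambda u, v: clayton(u, v, theta), T, T_inv)

grid = np.linspace(0.01, 0.99, 99)
U, V = np.meshgrid(grid, grid, indexing="ij")
vals = C_T(U, V)

# Uniform margins: C_T(u, 1) = u.
margin_error = np.max(np.abs(C_T(grid, np.ones_like(grid)) - grid))
# 2-increasing property: every grid rectangle carries non-negative mass.
rect_mass = vals[1:, 1:] - vals[1:, :-1] - vals[:-1, 1:] + vals[:-1, :-1]
# For this particular pair, the result is again Clayton, with parameter theta / a.
clayton_gap = np.max(np.abs(vals - clayton(U, V, theta / a)))
print(margin_error, rect_mass.min(), clayton_gap)
```

The grid check is only a numerical sanity test of the copula properties, not a proof of admissibility. The last quantity illustrates the generator composition for Archimedean base copulas: a power-distorted Clayton copula is a Clayton copula with a rescaled parameter.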
Refs. [
22,
23] constructed new families of copulas via beta and Kumaraswamy cdf distortions. Ref. [
24] employed the unit-Lomax distortion. Ref. [
25] studied families of copulas generated by a unit-Weibull distortion. Ref. [
26] investigated the properties of unit-Gompertz distorted copulas and applied them to analyze the anthropocentric data. The unit-Lomax, unit-Weibull, and unit-Gompertz distortions are derived from an exponential transformation of the Lomax, Weibull, and Gompertz random variables.
Motivated by the fact that the cdf of a continuous random variable with unit interval support meets the definition of a distortion function, we propose a transformation that converts a non-negative random variable to one with unit interval support, which, consequently, establishes a distortion function. The distortion can then be used to generate new families of copulas by distorting existing ones. Similar to the other constructions of new copulas in the literature, the aim is to obtain new families of copulas that may account for a wider range of tail dependence values. Because the distortion cdf brings its own parameters, the distorted copulas carry parameters beyond those of the existing copulas, making them more flexible.
The paper is organized as follows.
Section 2 begins with the proposed mechanism for generating new distortion functions and admissible parameter spaces of distortions to be studied further.
Section 3 provides Archimedean generators for the new families of distorted copulas when the base copulas are independence, Clayton, Frank, and Gumbel. The family of distorted independence copulas is presented to serve as a validation of the results obtained in this paper.
Section 4 and
Section 5 investigate the properties of tail dependence, tail order, concordance order, and Kendall’s tau.
Section 6 contains the numerical results of a simulation study and empirical application, followed by concluding remarks. Copula preliminaries and derivations of tail orders and concordance ordering are included in
Appendix A and
Appendix B.
2. Proposed Method
Let
Y be a non-negative continuous random variable with cdf
Consider the following transformation of the random variable
Y:
The random variable
X has a support of the unit interval, and its cdf is given by
The cdf
G and its quantile or inverse function may both serve as distortion functions to develop new copulas. If the variable
Y assumes any value on the real number line, one may consider a transformation of its absolute value; that is,
, whose cdf is not as straightforward as (
3).
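The idea that a cdf with unit interval support acts as a distortion can be checked numerically. In the sketch below, the map X = 1/(1 + Y) is a hypothetical stand-in for the transformation in (2), which is not reproduced in this text, and the Lomax parametrization and parameter values are likewise assumptions. The snippet compares the closed-form cdf of X with a Monte Carlo estimate and confirms that it increases from 0 to 1 on the unit interval.

```python
import numpy as np

beta, theta = 2.0, 1.5          # hypothetical Lomax scale and shape

def lomax_cdf(y):
    """Generating cdf F of a non-negative Y (Lomax, one common parametrization)."""
    return 1.0 - (1.0 + y / beta) ** (-theta)

def unit_cdf(x):
    """cdf of X = 1 / (1 + Y): P(X <= x) = P(Y >= 1/x - 1) = 1 - F(1/x - 1)."""
    return 1.0 - lomax_cdf(1.0 / x - 1.0)

rng = np.random.default_rng(0)
u = rng.uniform(size=200_000)
y = beta * ((1.0 - u) ** (-1.0 / theta) - 1.0)   # inverse-cdf draws from F
x = 1.0 / (1.0 + y)                              # transformed variable on (0, 1]

grid = np.linspace(0.05, 1.0, 20)
G = unit_cdf(grid)
empirical = np.array([(x <= g).mean() for g in grid])
max_gap = np.max(np.abs(G - empirical))
print(max_gap, G[-1], np.all(np.diff(G) > 0))
```

Whether such a cdf is also convex, and hence admissible, depends on the parameter values, which is what the lemmas in this section address for the distortions studied in the paper.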
Below, we demonstrate the method with F being the Lomax and inverse Lomax distributions, both of which are non-negative, and derive the parameter space on which each of the distortions is convex. All the parameters in the generating distribution function F are assumed to be positive. The prime symbol denotes the derivative of a function.
Example 1. Unit-Lomax (UL) distortion and its quantile (QUL). Let Y be a Lomax or Pareto Type II random variable with a cdf given by where In this case, the transformation in (2) produces the distortion given by where Note that, for example, Equation (5) demonstrates that multiple values of the parameters (β, θ) give rise to the same distortion in (4), and thus the parameters cannot be uniquely identified. To resolve this, we reparametrize and consider the distortion (UL) and its inverse (QUL) given by
Lemma 1. The distortion is convex on I if and
Proof. Let
, and then
For simplicity, the argument notation of
is dropped. Note that
The derivatives are given by
If and , then for all □
Lemma 2. The distortion is convex on I if and
Proof. Since
is the inverse function of
, and
is an increasing function, by (
6), we obtain this lemma. □
Example 2. Unit-inverse Pareto (UIP) distortion and its quantile (QUP). Consider the inverse Pareto random variable, defined to be the reciprocal of a Lomax random variable, with a cdf given by where In this case, (3) gives where For the same reason explained by (5), we propose the distortion (UIP) and its inverse (QUP) given by
Lemma 3. The distortion is convex on I if and
Proof. Let
and
. Then
and
For simplicity, the argument notation of (x) is dropped. The relevant derivatives are given by
If and , then the second derivative for all □
Lemma 4. The distortion is convex on I if and
Proof. This lemma follows since
is the inverse function of
and
is increasing and concave by (
7) when
and
. □
In summary, applying the proposed transformation in (
2) to non-negative Lomax-related random variables, we are bestowed with four new admissible distortions tabulated in
Table 1. Two functional forms are displayed as they would come in handy for calculations. Note that the admissible parameter spaces are only a sufficient condition for
in (
1) to be a copula. The dual power distortion is a special case of
with
The power distortion is a special case of
with
When
, it is important to note that all the distortions in
Table 1 become the identity function. This means that when these distortions are applied to a base copula, the resulting family of copulas offers greater flexibility when fitted to data because it includes the base copula as a special case.
Remark 1. Let Y be a Lomax random variable. We also considered the cdf of , i.e., the Burr distribution, given by as the generating cdf. However, there does not exist a parameter space on which the resulting distortion is convex. Another generating cdf candidate is the exponentiated Lomax cdf of the form , which is more complex and will be investigated in the future.
Remark 2. Instead of (2), the transformation of also gives rise to a distortion cdf. The cdf of is given by, for In this case, when Y has a Lomax distribution in Example 1, we derive the When Y has a Lomax distribution in Example 2, we derive the
Remark 3. There are a multitude of cdfs that may be used as the generating cdf F in (3). One instance is an exponential distribution with mean β. In this case, (3) gives a candidate distortion given by Another example is a generalized Pareto with a pdf given by where , and
4. Tail Dependence Coefficients and Tail Orders
In this section, we investigate the tail dependence coefficients, tail orders, and concordances for the new families of copulas emerging from the four distortions in
Table 1. The lengthy derivations of tail orders are stationed in
Appendix B.
Let
and assume that the lower tail dependence (ltd) coefficient
of the base copula
C and
exist. By definition in (
A3) and L’Hopital’s rule, the ltd coefficient for a
T distortion-induced copula is given by
Since
, with the substitution of
the upper tail dependence (utd) coefficient of
is given by
Below, we assume that the ltd coefficient when and the utd coefficient when for the base copula C. Furthermore, we assume that as and as for some slowly varying functions 𝓁 and at Let the subscript denote a property owner, e.g., the subscript T in is used to denote the ltd coefficient of a -distorted copula.
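The tail dependence coefficients discussed here can be approximated numerically from their limit definitions. The sketch below assumes the standard definitions, $\lambda_L = \lim_{u \to 0^+} C(u, u)/u$ and $\lambda_U = \lim_{u \to 1^-} (1 - 2u + C(u, u))/(1 - u)$, which we take to agree with (A3), and evaluates them near the boundary for the Clayton and Gumbel copulas. The parameter value and the step size `eps` are arbitrary choices.

```python
import math

def clayton_diag(u, theta):
    """Clayton copula on the diagonal, C(u, u)."""
    return (2.0 * u ** -theta - 1.0) ** (-1.0 / theta)

def gumbel_diag(u, theta):
    """Gumbel copula on the diagonal: exp(-(2 (-ln u)^theta)^(1/theta)) = u^(2^(1/theta))."""
    return u ** (2.0 ** (1.0 / theta))

def lower_tail(diag, eps=1e-6):
    """Finite-step approximation of lim C(u, u) / u as u -> 0+."""
    return diag(eps) / eps

def upper_tail(diag, eps=1e-6):
    """Finite-step approximation of lim (1 - 2u + C(u, u)) / (1 - u) as u -> 1-."""
    u = 1.0 - eps
    return (1.0 - 2.0 * u + diag(u)) / (1.0 - u)

theta = 2.0
ltd_clayton = lower_tail(lambda u: clayton_diag(u, theta))   # closed form: 2^(-1/theta)
utd_clayton = upper_tail(lambda u: clayton_diag(u, theta))   # closed form: 0
utd_gumbel = upper_tail(lambda u: gumbel_diag(u, theta))     # closed form: 2 - 2^(1/theta)
print(ltd_clayton, utd_clayton, utd_gumbel)
```

A finite-step evaluation like this is only a rough diagnostic. Coefficients that vanish in the limit can converge slowly, so the closed-form results in the propositions remain the reference.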
Proposition 1 (Unit-Lomax Distortion). Let be the -distorted copula defined in (8), where and Then,
- (i)
and
- (ii)
when and when And
Proof. The tail orders are shown in (
A9) and (
A17) in
Appendix B. The ltd coefficient, by (
14) and L’Hopital’s rule, is given by
since
For the utd coefficient, by (
15) and L’Hopital’s rule, we obtain that
since
□
Proposition 2 (Quantile Unit-Lomax Distortion). Let be the -distorted copula defined in (10), where and Then,
- (i)
and
- (ii)
when and when And
Proof. The tail orders are shown in (
A10) and (
A18) in
Appendix B. By L’Hopital’s rule,
since
By (
15) and L’Hopital’s rule, the utd coefficient is given by
since
□
Proposition 3 (Unit-Inverse Pareto Distortion). Let be the -distorted copula defined in (11), where and Then,
- (i)
and
- (ii)
and
Proof. The tail orders are shown in (
A11) and (
A19) in
Appendix B. For the ltd, with the help of L’Hopital’s rule, we obtain that
The utd coefficient, by (
15) and L’Hopital’s rule, is given by
since
and
□
Proposition 4 (Quantile Unit-Inverse Pareto Distortion). Let be the -distorted copula defined in (13), where and Then,
- (i)
and
- (ii)
If then
Proof. The tail orders are derived in (
A13) and (
A20) in
Appendix B. For the ltd coefficient,
The utd coefficient, by (
15) and L’Hopital’s rule, is given by
since
and
□
The results of the propositions are summarized in
Table 2. The utd coefficients of UL- and QUL-distorted copulas differ from those of base copulas when
, while the ltd coefficients remain unchanged. The UL and QUL distortions render new copulas with upper tail dependence regardless of whether the base copula has it or not. Conversely, the UIP and QUP distortions form copulas with different ltd coefficients when
from the base copula, while the utd coefficients remain the same. The Clayton copula has zero upper tail dependence. However, based on
Table 2, by applying UL and QUL distortions to a Clayton copula, we create a new family of copulas that are more flexible in the sense that they can accommodate upper tail dependence values ranging from 0 to 1. This same conclusion can be applied to the Frank copula.
We next construct the density contour plots for UL- and UIP-distorted copulas in
Figure 1 and
Figure 2. Contour plots and observations for QUL- and QUP-distorted copulas are displayed in
Figure A1 and
Figure A2 in
Appendix B.1, respectively. The joint pdf of a copula is
Note that if a contour plot is elongated along one direction, it signals a strong dependence in that direction. A circular contour shape indicates independence between variables. For example, the Clayton copula has lower tail dependence and no upper tail dependence; therefore, one expects to see an elongated or tightened contour line on the lower left-hand side. Let
r be the parameter in the base copula. When
both the UL and UIP distortions return the base copulas shown in the first column of both figures.
While the Frank copula is tail-independent, the Clayton and Gumbel copulas are characterized by asymmetric tail dependence with only ltd and utd, respectively. As summarized in
Table 2, UL distortions may yield new families of copulas with upper tail dependence. Copulas constructed using the UL distortion exhibit upper tail dependence and maintain the same patterns of lower tail behavior as the base copula. Notably, UL-distorted Clayton copulas have both lower and upper tail dependence.
As revealed in
Table 2 and
Figure 2, UIP-distorted copulas have similar characteristics to the base copulas in the first column. If the base copula has lower tail dependence, the family introduces an additional parameter to the ltd coefficient but retains the upper tail dependence of the base copula.
6. Numerical Results
In this section, we run a simulation study to inspect how the newly minted families of UL distortion copulas perform when fit to data generated from the beloved Clayton, Gumbel, Gaussian, and Frank copulas, and vice versa. Additionally, the copula models are applied to a bivariate dataset consisting of the daily return rates of Amazon and Google stocks.
6.1. A Simulation Study
A general algorithm to generate draws from a bivariate copula
C is the conditional distribution approach, as described by [
19,
24]. It consists of two steps: (i) generate two independent uniform random values
and (ii) solve
for
where
The desired pair is
Using this algorithm, we generated 2000 bivariate pseudo-observations from the Clayton, Gumbel, Gaussian, and Frank copulas. We also simulated the same number of pseudo-observations from the families of UL-distorted copulas, where the four copulas served as base copulas. We do not present the figures for UIP, QUL, and QUP distortions, as the conclusions are similar to those from UL distortion.
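The two-step conditional distribution approach can be sketched for the Clayton copula, for which step (ii) has a closed-form solution. The function names, the seed, and the hand-rolled Kendall's tau are illustrative assumptions; the parameter value 2 is chosen because the Clayton relation tau = theta / (theta + 2) then gives a Kendall's tau of 0.5, matching the setting of the simulation study.

```python
import numpy as np

def sample_clayton(n, theta, rng):
    """Conditional distribution approach for the Clayton copula."""
    # Step (i): two independent uniform draws, kept in (0, 1].
    u = 1.0 - rng.uniform(size=n)
    t = 1.0 - rng.uniform(size=n)
    # Step (ii): closed-form solution of dC(u, v)/du = t for v.
    v = ((t ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0) ** (-1.0 / theta)
    return u, v

def kendall_tau(x, y):
    """Sample Kendall's tau from pairwise concordance signs (no ties assumed)."""
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    return (dx * dy).sum() / (n * (n - 1))

rng = np.random.default_rng(1)
theta = 2.0                       # Clayton: tau = theta / (theta + 2) = 0.5
u, v = sample_clayton(2000, theta, rng)
tau_hat = kendall_tau(u, v)
print(tau_hat)
```

For copulas without a closed-form conditional inverse, step (ii) would instead require a numerical root finder, which this sketch does not cover.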
The values of parameters in the base copulas are selected so they have Kendall’s tau value of 0.5. The UL distortion has parameter values of
and
. We used the pseudo-likelihood estimation method [
12] that maximizes (
17) to fit the base copula and distorted copulas to the data. We then computed the empirical probabilities using the estimated copula models and constructed the probability–probability (PP) plots of the estimated probability distribution against the theoretical one. In the first row of Figure 5, the four base copulas are approximated by their UL-distorted counterparts (the UL-Clayton, UL-Gumbel, UL-Gaussian, and UL-Frank copulas), and vice versa in the second row. The solid black line results from fitting the data to the copula model from which the observations were generated.
As another way of comparison, we also calculated the maximum distance in
Table 3 between the theoretical and empirical probabilities at each data pair of
. Borrowing from the univariate Kolmogorov–Smirnov (KS) test, we use its 95% critical value of 0.03 for a sample size of 2000 as an ad hoc threshold.
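The distance and threshold described above can be sketched as follows. The example uses draws from the independence copula as a stand-in for a fitted model, computes the empirical copula at each data pair, and compares the maximum absolute gap with the asymptotic 95% critical value of the univariate KS test, approximately 1.358 divided by the square root of n. The seed and the choice of the independence copula are assumptions made purely for illustration.

```python
import numpy as np

n = 2000
rng = np.random.default_rng(2)
u, v = rng.uniform(size=n), rng.uniform(size=n)   # draws from the independence copula

# Empirical copula at each data pair: share of points in the lower-left quadrant.
below = (u[None, :] <= u[:, None]) & (v[None, :] <= v[:, None])
empirical = below.mean(axis=1)
theoretical = u * v                               # model probability C(u_i, v_i) = u_i * v_i

max_distance = np.max(np.abs(empirical - theoretical))
ks_critical = 1.358 / np.sqrt(n)                  # asymptotic 95% univariate KS value
print(round(ks_critical, 3), max_distance)
```

Because the KS critical value is derived for univariate samples, using it for a bivariate distance is a heuristic, in line with the ad hoc threshold used in the text.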
Based on
Figure 5 and
Table 3, the Clayton copula, which is represented in red in the second row of
Figure 5, shows greater deviations from the 45-degree line when fit to the data generated from UL-distorted copulas. According to Proposition 1, the UL-distorted copula has a utd coefficient of
, which may be tuned to zero. Furthermore, it has zero ltd when the base copula has zero ltd. The Clayton copula does not have upper tail dependence, so one would expect it to perform poorly when fit to the data generated from the UL-distorted family. In contrast, the UL-Gumbel, for example, appears to do well when fit to the data generated from the Frank and Gaussian copulas. In general, the performance of a copula depends on its tail dependence characteristics, and the results show that the UL-distorted copulas are more flexible as they have extra parameters.
6.2. Empirical Application
We fit the proposed families of copula models to a bivariate dataset of daily return rates on Amazon and Google stocks. Historical data for the daily open, close, high, and low prices, and the adjusted closing price for stocks can be downloaded from Yahoo Finance. The adjusted closing price accounts for any splits and dividend distributions. We downloaded the data for Amazon and Google stocks for the period from January 2014 to December 2023, which amounts to a sample size of 2516 daily data points. The daily net return rates in percentages were then calculated based on the adjusted closing price. To calculate the return rate for today, the difference between today’s price and yesterday’s price is divided by yesterday’s price.
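The return calculation described above amounts to a one-line transformation of the adjusted closing prices. The sketch below uses made-up prices and a hypothetical helper name; it does not download or reproduce the Amazon and Google data.

```python
import numpy as np

def net_returns_pct(adj_close):
    """Daily net return in percent: 100 * (today - yesterday) / yesterday."""
    p = np.asarray(adj_close, dtype=float)
    return 100.0 * (p[1:] - p[:-1]) / p[:-1]

prices = [100.0, 110.0, 99.0]      # made-up adjusted closing prices
returns = net_returns_pct(prices)
print(returns)                     # two returns from three prices
```

A series of n prices yields n - 1 returns, since the first day has no preceding price to compare against.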
Table 4 displays the summary statistics for both variables. The sample Pearson correlation and Kendall’s tau are 0.71 (
p-value < 0.001) and 0.49 (
p-value < 0.001), respectively, both of which are significantly different from zero. Compared to Google, the central tendency measures for Amazon are smaller, but there is no significant difference in means. The Amazon daily return rate is significantly more variable based on the F test and is more skewed judging from the skewness measures and histograms in
Figure 6.
Let
denote the bivariate observations. The pseudo-observations or scaled empirical distributions
are defined to be
and
where
is the indicator function.
Figure 6 contains the scatter plots of
versus
and
versus
Based on the histograms, the return rates for both stocks are concentrated around their center, which is also reflected in the resulting scatter plot. However, the pseudo-observations computed using a scaled empirical distribution are uniformly distributed. Therefore, one would expect a more evenly dispersed scatter plot.
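Pseudo-observations of this kind are rescaled ranks. The sketch below assumes the common scaling by n + 1, which keeps every value strictly inside the unit interval; the scaling constant used in the paper's definition is not reproduced in this text, so this choice is an assumption, as are the function name and the toy data.

```python
import numpy as np

def pseudo_observations(x):
    """Scaled empirical cdf at each point: sum_j 1(x_j <= x_i) / (n + 1)."""
    x = np.asarray(x, dtype=float)
    counts = (x[None, :] <= x[:, None]).sum(axis=1)   # indicator sums, i.e., ranks
    return counts / (len(x) + 1.0)

x = [0.3, -1.2, 2.5, 0.0]          # toy return values
u_hat = pseudo_observations(x)
print(u_hat)
```

Because the output depends on the data only through ranks, any strictly increasing transformation of the returns leaves the pseudo-observations unchanged, which is why no marginal model is needed at this stage.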
The maximum pseudo-likelihood estimation introduced by [
18] is used to estimate the parameters. It maximizes the pseudo-log-likelihood function, i.e., the log-likelihood with the copula functions evaluated at pseudo-observations, given by
where
is the copula pdf in (
A1) and
is the parameter vector.
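A minimal sketch of maximum pseudo-likelihood estimation is given below for a one-parameter Clayton copula. It simulates data, converts them to rank-based pseudo-observations, and maximizes the pseudo-log-likelihood over a parameter grid. The grid search, the seed, the n + 1 rank scaling, and the AIC convention 2k - 2 log L are assumptions made for illustration; a numerical optimizer would normally replace the grid.

```python
import numpy as np

def clayton_log_density(u, v, theta):
    """Log of the Clayton copula density c(u, v; theta), theta > 0."""
    s = u ** -theta + v ** -theta - 1.0
    return (np.log1p(theta) - (theta + 1.0) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(s))

def pseudo_loglik(theta, u_hat, v_hat):
    """Sum of log-densities evaluated at the pseudo-observations."""
    return clayton_log_density(u_hat, v_hat, theta).sum()

def rank_scale(x):
    """Rank-based pseudo-observations scaled by n + 1 (assumes no ties)."""
    return (np.argsort(np.argsort(x)) + 1.0) / (len(x) + 1.0)

# Simulate Clayton data with a known parameter via the conditional approach.
rng = np.random.default_rng(3)
n, true_theta = 2000, 2.0
u = 1.0 - rng.uniform(size=n)
t = 1.0 - rng.uniform(size=n)
v = ((t ** (-true_theta / (1.0 + true_theta)) - 1.0) * u ** -true_theta
     + 1.0) ** (-1.0 / true_theta)

u_hat, v_hat = rank_scale(u), rank_scale(v)
grid = np.linspace(0.1, 6.0, 591)                 # candidate parameter values
loglik = np.array([pseudo_loglik(th, u_hat, v_hat) for th in grid])
theta_hat = grid[np.argmax(loglik)]
aic = 2 * 1 - 2 * loglik.max()                    # one estimated parameter
print(theta_hat, loglik.max(), aic)
```

For a distorted copula, the same recipe applies with the distorted density and a multi-parameter search restricted to the admissible parameter space.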
We did not fit a marginal distribution to each of the net returns. Our primary objective was to compare the new families of the distorted copula with the base copulas. Let
r be the parameter in the base copula.
Table 5 below reports the estimates with the estimated standard error in the parentheses, the maximum pseudo-likelihood (MPL), and the AIC values. All the parameter estimates fall within admissible spaces. Kendall’s tau estimates
were computed by plugging parameter estimates into either the theoretical Kendall’s tau formula or (
16).
Based on the scatter plots, it appears that there is a weak dependence in both the lower and upper tails between the daily net returns of the two stocks.
Table 5 shows that among the base copulas considered, the Gaussian copula performs the best in terms of MPL and AIC, followed by Gumbel. The estimated Kendall’s tau calculated from the Gaussian copula produces the closest match to the sample Kendall’s tau between Google and Amazon.
The Frank copula, which is supposedly suitable for data with weak tail dependence, performs the worst. Note that the Gumbel copula with a parameter value of 1 represents the independence copula, for which Kendall’s tau is equal to 0. It performs better than the Clayton copula, which suggests that there might be a stronger utd than ltd. Furthermore, a distortion copula that can accommodate a wider range of upper tail dependence, e.g., UL-distorted copulas, may do well in fitting this net return dataset.
Table 5 indicates that each UL-distorted copula model outperforms its corresponding base copula. The fitted UL-Clayton copula model has the largest AIC value, with the estimated lower and upper tail dependence coefficients of
and
It is less satisfactory than other UL-distorted copula models, probably due to weak lower tail dependence in the data. Both the UL-distorted Frank and Gaussian copulas have an estimated upper tail dependence coefficient of 0.39 and perform better than the Gaussian copula in terms of MPL and AIC.
The UIP-Clayton and UIP-Frank copulas do not exhibit tail dependence behaviors and their performance is worse than their base copulas. The UL-Gumbel and UIP-Gumbel copulas are the best performers, with UL-Gumbel being slightly better than UIP-Gumbel in terms of MPL and AIC. The UIP-Gumbel model produces an estimated Kendall’s tau closer to the sample Kendall’s tau. Both models have upper tail dependence, but not lower tail dependence.
According to the copula models employed in this application, there is a moderate linear correlation between the daily net returns of Amazon and Google stocks. Additionally, there appears to be asymmetry in the extreme co-movements; that is, joint extremes are more likely for high daily net return values than for low daily net return values.
7. Concluding Remarks
The framework advanced in the paper originates from the fact that a cumulative distribution function with unit interval support is a distortion function. It employs a transformation of a non-negative random variable into a variable with the support of the unit interval. The additional parameters in the distortion allow for more modeling flexibility. As demonstrated in
Section 3.2, distortion of the independence copula creates a new family of copulas that includes the base copula and other existing copulas as its members and accommodates a wider range of tail dependence behaviors that the independence copula would never dream of having.
The tail behavior of a copula model is a crucial factor in determining whether it can adequately fit the data. The use of UL and QUL distortions can morph a family of base copulas without upper tail dependence into a new family of copulas with upper tail dependence. The upper tail dependence coefficient of the UL- and QUL-distorted copulas involves more parameters than that of the base copula. The distortions can ultimately lead to a better accommodation of the upper tail dependence when compared to the base copula. The tail behaviors in the families of the UIP- and QUP-distorted copulas are similar to those in the base copula. However, they can better accommodate the lower tail dependence when compared to the base copulas.
We are not certain whether a more complicated generating cdf or distortion, e.g., one with more than two parameters, would result in a new family of copulas with both upper and lower tail dependence when applied to a base copula with no tail dependence behavior. The framework proposed in this article opens the door to a world of new distortions. Due to the page length limit, further exploration of the concordance ordering of the new family of distorted copulas will be pursued in more detail in future work. The distortions of multivariate copulas of higher dimensions may also be of interest. Unlike the distortions of bivariate copulas, the distortions of multivariate copulas require more care and will be explored in the future.