Article

Portfolio Management of Copula-Dependent Assets Based on P(Y < X) Reliability Models: Revisiting Frank Copula and Dagum Distributions

by Pushpa Narayan Rathie 1,†, Luan Carlos de Sena Monteiro Ozelim 2,*,† and Bernardo Borba de Andrade 1
1 Department of Statistics, University of Brasilia, Brasilia 70910-900, Brazil
2 Department of Civil and Environmental Engineering, University of Brasilia, Brasilia 70910-900, Brazil
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Stats 2021, 4(4), 1027-1050; https://doi.org/10.3390/stats4040059
Submission received: 29 September 2021 / Revised: 21 November 2021 / Accepted: 24 November 2021 / Published: 9 December 2021
(This article belongs to the Section Financial Statistics)

Abstract

Modern portfolio theory indicates that portfolio optimization can be carried out based on the mean-variance model, where returns and risk are represented as the average and variance of the historical data of the stock’s returns, respectively. Several studies have been carried out to find better risk proxies, as variance was not that accurate. On the other hand, fewer papers are devoted to better modelling or characterizing returns. In the present paper, we explore the use of the reliability measure $P(Y<X)$ to choose between portfolios with returns given by the distributions X and Y. Thus, instead of comparing the expected values of X and Y, we will explore the metric $P(Y<X)$ as a proxy parameter for return. The dependence between such distributions shall be modelled by copulas. At first, we derive some general results which allow us to split the value of $P(Y<X)$ as the sum of independent and dependent parts, in general, for copula-dependent assets. Then, to further develop our mathematical framework, we chose the Frank copula to model the dependency between assets. In the process, we derive a new polynomial representation for Frank copulas. To perform a case study, we considered assets whose return distributions follow Dagum distributions or their transformations. We carried out a parametric analysis, indicating the relative effect of the dependency of return distributions on the reliability index $P(Y<X)$. Finally, we illustrate our methodology by performing a comparison between stock returns, which could be used to build portfolios based on the value of the reliability index $P(Y<X)$.

1. Introduction

Portfolio management and optimization consist of selecting a set of assets, and their respective portfolio participation weights, which best satisfy the investor’s ideal risk–return relationship [1]. One of the earliest and most important studies in this direction was performed by Markowitz, who proposed the Mean-Variance (MV) model [2]. The author considered that a suitable proxy variable for returns would be the average of the historical data of the stock’s return, and the proxy variable for risk would be the variance of these returns [3].
Since the seminal work of Markowitz [2], several researchers have studied variations and enhancements of his so-called modern portfolio theory. Most of these studies sought better risk proxies, as variance was not that accurate. For example, Conditional Value-at-Risk (CVaR) was proposed by Uryasev [4] as a risk measure of the portfolio, with the optimization then carried out on that metric.
Other researchers considered a robust optimization approach, where the uncertainty generated by the estimation errors of the parameters used in the optimization was mathematically accounted for in the model [5]. Several review papers were also devoted to exploring portfolio optimization methods, covering the following topics: evolutionary computation in the discovery of rules in algorithmic trading for shares [6]; swarm intelligence research for portfolio optimization [7]; the portfolio optimization problem with the Markowitz mean-variance structure [3]; and 20 years of portfolio optimization based on operational research [8].
Despite these contributions, fewer papers were devoted to better modelling or characterizing returns. In general, researchers were concerned with forecasting the returns, but few alternative proxy parameters have been proposed. The literature reveals that classical strategies, such as the Markowitz MV portfolio, applied with parameters estimated from data normally provide highly volatile portfolio weights. This is primarily due to the difficulty of estimating expected returns with sufficient accuracy [9,10].
To address this issue, several different methods have been reported in the literature. These methods range from combining Capital Asset Pricing Model (CAPM) equilibrium with subjective investor views to estimate returns [9] to applying the Arbitrage Pricing Theory (APT) [11]. Despite these efforts, the problem of getting unstable portfolio weights was not fully solved [10].
The portfolio optimization approaches which are based on the modern portfolio theory not only suffer from this instability in portfolio weights, but also lack a better description of the dependence structure between assets. Markowitz MV portfolio assumes that the financial returns are subject to a joint normal distribution, implying that the dependence between financial returns is fully described by the linear correlation coefficient [2,12].
Empirical studies indicate that the assumption of normality of financial returns distributions does not hold, indicating the need to examine such distributions by considering extra moments, such as the skewness (the third moment of distribution) and the kurtosis (the fourth moment of distribution) of the real-world returns datasets collected [12].
In order to overcome this shortcoming, several authors studied a number of methods to better characterize the dependence structure of assets’ returns, such as exceedance correlation [13] and extreme value theory [14,15,16], and the multivariate Generalized Auto Regressive Conditional Heteroskedasticity (GARCH) model with skewness [17] and/or kurtosis [18,19]. These methods are adequate to model time-varying conditional correlations; however, they cannot reproduce asymmetries in asymptotic tail dependence [12].
An increasingly applied method to account for such tail dependence is to consider the copula theory, which allows one to model the dependence structure of multivariate data without imposing any assumption on marginal distributions [12]. As pointed out by several authors, the advantage of copula lies in separating marginal distributions and dependence structure from joint distribution [12].
Therefore, in the present paper, we explore the use of the reliability measure P ( Y < X ) to choose between portfolios with returns given by the distributions X and Y. Thus, instead of comparing the expected values of X and Y, we will explore the metric P ( Y < X ) as a proxy parameter for return. The dependence between such distributions shall be modelled by copulas.
This paper presents, at first, some general mathematical formulations and theorems which will later support an empirical exploratory analysis of the portfolio optimization problem. For simplicity, such analysis considers dependence as being modelled by Frank copulas and asset prices as following distributions from the Dagum family. More general applications, where multi-asset portfolios are considered, lead to high-dimensional portfolio optimization problems and are out of the scope of the present effort.
The paper is structured as follows: Section 2 presents some preliminary aspects of the reliability index P ( Y < X ) . Section 3, on the other hand, introduces some concepts related to the copula modelling of dependence between random variables. This section also presents general results concerning the reliability of copula-dependent random variables. In order to illustrate our methodology, Section 4 revisits Frank copula and presents some novel results for the polynomial representation of such copula type. Section 5 then presents the analytical formulas hereby developed to explicitly and exactly obtain the reliability index P ( Y < X ) when both X and Y follow Dagum-like distributions whose dependence is modeled by a Frank copula. We finally present a parametric study and a real application of portfolio management of stock data in Section 6. Section 7 presents the conclusions of the present paper. The codes developed and used in the present paper are also presented at the end of the paper.

2. Reliability

When modelling the reliability of a given component whose strength is described by a random variable X subjected to stresses modelled by a random variable Y, the component fails if the stress applied to it exceeds the strength, while the component works whenever Y < X . Thus, P ( Y < X ) is a measure of component reliability [20,21].
Apart from the clear application in problems of physics and engineering, this reliability index is also applicable to the areas of quality control, genetics, psychology, and economics [21]. In particular, the index $P(Y<X)$ can be used as a proxy parameter to examine the probability of inequality of variables, leading to a measure of the difference between two populations as $P(Y<X) = P(Y - X < 0)$.
Both empirical and analytical studies have been carried out to estimate P ( Y < X ) when X and Y are independent variables belonging to the same univariate family of distributions. The reader may check [20] for a comprehensive account of this topic. Recent contributions include calculating this reliability index for Generalized Pareto distributions [22,23], two-parameter exponential functions [24,25], logistic random variables [26] and generalized logistic distributions [27]. More recently, [28] obtained analytical expressions for the reliability index when both the variables compared were Stable random variables. All these contributions rely on the assumption of independent stress and strength variables.
Even though the independence assumption may be a first approximation, fewer studies were concerned about accounting for the dependence between X and Y when calculating the reliability index [21]. Such dependence may become crucial for applications in finance. For example, when X and Y are disposable household income and consumption, P ( Y < X ) is a measure of household financial affordability or P ( X < Y ) measures household financial fragility [29]. Other finance-related applications include excess-of-loss reinsurance models [30] and stock picking [28].
Studies which tried to assess the relationship between X and Y for the evaluation of the reliability index considered bivariate distributions of strength and stress. In this context, a few of the distributions considered are bivariate normal [31], bivariate Pareto [32], bivariate exponential [33], bivariate gamma [34] and bivariate log-normal [35]. A direct shortcoming of this approach is that a bivariate distribution often admits a certain specific form of dependence between margins only and presupposes that both the marginal distributions are of the same type [21].
Copula-based approaches show up as good candidates to better accommodate the dependence structure between the random variables considered [21,29]. The potential of the copula-based approach is clear since a copula function joins margins of any type (parametric and non-parametric distributions) not necessarily belonging to the same family, and captures various forms of dependence (linear, non-linear, tail dependence etc.) [21].
While the idea of relaxing the independence assumption in the stress-strength models may be addressed by involving any family of copulas, in this work we use Frank copulas. We get a closed-form expression for the reliability index by modeling the dependence through Frank copulas when the marginal distributions of stress and strength are chosen as Dagum-like random variables [36]. Frank copula was chosen since it is commonly used in applications and is implemented in most software packages for copula-based modeling [21]. In the next section, we will present some basic formulations linked to the copula modelling of random variables’ dependence.

3. Copulas and Reliability Measures

In short, a two-dimensional copula is a bivariate distribution function whose margins follow Uniform distributions on ( 0 , 1 ) . Sklar’s theorem proves how copulas link joint distribution functions to their one-dimensional margins by proving that any bivariate distribution H ( x , y ) of variables X and Y, with marginal distributions F ( x ) and G ( y ) , can be written as H ( x , y ) = C ( F ( x ) , G ( y ) ) , where C is a copula. Thus any copula, together with any marginal distribution, allows us to construct a joint distribution [21,37].
Copula families depend, in general, on one or more parameters called association parameters, say $\theta$, related to the degree of dependence between margins. For simplicity, we may represent the copulas as $C_\theta$. To further understand the concepts of copulas and their applications, the reader may refer to the monographs in [37,38].
In order to account for the dependence between stress and strength random variables, we calculate the reliability index under the hypothesis that the bivariate distribution of the stress and strength variables is defined by joining the margins $F(x)$ and $G(y)$, of any type, through a copula function $C_\theta(F(x), G(y))$. Moreover, let the joint density function be denoted as $h(x,y) = c_\theta(F(x),G(y))\, f(x)\, g(y)$, where $c_\theta(F(x),G(y)) = \partial^2 C_\theta(F(x),G(y)) / (\partial F(x)\, \partial G(y))$ is the copula density, and $f(x)$, $g(y)$ indicate the marginal density functions.
In this context, the reliability index for dependent random variables $X \ge 0$ and $Y \ge 0$ can be written as [21]:
$$ P(Y<X) = \int_0^\infty \int_0^x c_\theta(F(x),G(y))\, f(x)\, g(y)\, dy\, dx \tag{1} $$
We can rewrite Equation (1) by letting $u = F(x)$ and $v = G(y)$. Then:
$$ P(Y<X) = \int_0^1 \int_0^{G(F^{-1}(u))} c_\theta(u,v)\, dv\, du = \int_0^1 \left.\frac{\partial}{\partial u} C_\theta(u,v)\right|_{v=G(F^{-1}(u))} du \tag{2} $$
Theorem 1 (Splitting Theorem).
The reliability measure $P(Y<X)$, where the dependency between X and Y is described by any copula model, can always be split into the sum of two parts as:
$$ P(Y<X) = R + D_\theta, \tag{3} $$
where R is the value of $P(Y<X)$ when X and Y are independent and $D_\theta$ represents the difference between the values of $P(Y<X)$ in the dependent and independent cases.
Proof. 
For any single-parameter copula, with parameter $\iota$, let us suppose without loss of generality that $\iota = \iota_{ind}$ implies independence between the random variables modeled. Let us denote the copula density as $d_\iota(u,v)$. Whenever $d_\iota(u,v)$ is infinitely differentiable with respect to $\iota$ at a real value $\iota_{ind}$, the Taylor series expansion of $d_\iota(u,v)$ at $\iota = \iota_{ind}$ is:
$$ d_\iota(u,v) = 1 + \sum_{n=1}^{\infty} \frac{1}{n!} \left.\frac{\partial^n}{\partial w^n} d_w(u,v)\right|_{w=\iota_{ind}} (\iota - \iota_{ind})^n = 1 + h(\theta,u,v) \tag{4} $$
The free term in Equation (4) must be 1 because the copula density equals 1 when the random variables are independent, i.e., when $\iota = \iota_{ind}$. By combining Equations (1) and (4) when $c_\theta(u,v) = d_\iota(u,v)$:
$$ P(Y<X) = \int_0^\infty \int_0^x f(x)\, g(y)\, dy\, dx + \int_0^\infty \int_0^x h(\theta, F(x), G(y))\, f(x)\, g(y)\, dy\, dx \tag{5} $$
The first term in Equation (5) can be further simplified as:
$$ R = \int_0^\infty G(x)\, f(x)\, dx, \tag{6} $$
which is the value of $P(Y<X)$ when X and Y are independent [20].
The result for m-parameter copulas is easily obtained by using the Taylor series in several variables. Besides, an alternative proof follows directly from Ruschendorf’s method to represent copulas [39]. □
We can see that Theorem 1 generalizes the results in [21], which only considered the splitting when using the Farlie-Gumbel-Morgenstern copula and its generalizations. Moreover, we should highlight that, regardless of the type of distribution considered, for any strictly increasing continuous function $\beta(x)$:
$$ P(Y<X) = P(\beta(Y) < \beta(X)), \tag{7} $$
which may come in handy when dealing with transformed variables. For a strictly decreasing $\beta(x)$ the inequality is reversed, i.e., $P(Y<X) = P(\beta(Y) > \beta(X))$.
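As a minimal numerical sketch of the splitting in Theorem 1, one may take the Farlie-Gumbel-Morgenstern copula mentioned above, $C_\theta(u,v) = uv[1 + \theta(1-u)(1-v)]$ with $|\theta| \le 1$, for which $\partial C_\theta/\partial u = v + \theta v(1-v)(1-2u)$. Equation (2) then separates exactly into R plus a term that is linear in $\theta$. The exponential margins, the function names and the midpoint rule below are illustrative assumptions and are not taken from the paper.

```python
def midpoint_integral(func, n=4000):
    """Composite midpoint rule for the integral of func over (0, 1)."""
    h = 1.0 / n
    return h * sum(func((k + 0.5) * h) for k in range(n))


def fgm_split(rate_x, rate_y, theta, n=4000):
    """Return (R, D_theta) for exponential margins joined by an FGM copula.

    Assumed margins: F(x) = 1 - exp(-rate_x * x), G(y) = 1 - exp(-rate_y * y),
    so that G(F^{-1}(u)) = 1 - (1 - u) ** (rate_y / rate_x).
    """
    def v_of_u(u):
        return 1.0 - (1.0 - u) ** (rate_y / rate_x)

    # Independent part: R = integral of G(F^{-1}(u)) du.
    r_part = midpoint_integral(v_of_u, n)
    # Dependent part: theta * integral of v (1 - v) (1 - 2u) du.
    d_part = theta * midpoint_integral(
        lambda u: v_of_u(u) * (1.0 - v_of_u(u)) * (1.0 - 2.0 * u), n
    )
    return r_part, d_part


if __name__ == "__main__":
    r_part, d_part = fgm_split(rate_x=1.0, rate_y=2.0, theta=0.5)
    print("R =", r_part, "D_theta =", d_part, "P(Y < X) =", r_part + d_part)
```

For these assumed margins the independent part has the closed form R = rate_y / (rate_x + rate_y), which gives a quick sanity check of the numerical integration.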

4. Revisiting Frank Copula

In order to actually calculate the reliability index for real assets, we need to choose a copula model. As previously indicated, we chose Frank copulas. For this particular type of copula [21,37]:
$$ C_\theta(u,v) = -\frac{1}{\theta} \ln\left[ 1 - \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} \right] \tag{8} $$
where $\theta \in \mathbb{R} \setminus \{0\}$. Moreover [21,37]:
$$ c_\theta(u,v) = \frac{\theta e^{-\theta(u+v)}}{1-e^{-\theta}} \left[ 1 - \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} \right]^{-2} \tag{9} $$
Kendall’s $\tau$ ($\tau_K$) and Spearman’s $\rho$ ($\rho_S$) for a Frank copula are given by [21,37]:
$$ \tau_K = 1 + 4\,(Db_1(\theta) - 1)/\theta \tag{10} $$
and
$$ \rho_S = 12\,(Db_2(\theta) - Db_1(\theta))/\theta + 1 \tag{11} $$
where $Db_k(x)$ is the Debye function, defined as [21,37]:
$$ Db_k(x) = \frac{k}{x^k} \int_0^x \frac{t^k}{e^t - 1}\, dt \tag{12} $$
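Equations (10) to (12) can be evaluated with a few lines of code. The sketch below approximates the Debye function with a composite midpoint rule, which is an assumption made here for simplicity (any quadrature routine would do), and uses hypothetical function names. Spearman’s $\rho$ is taken as $1 + 12(Db_2(\theta) - Db_1(\theta))/\theta$, the form that vanishes as $\theta \to 0$.

```python
import math


def debye(k, x, n=4000):
    """Debye function D_k(x) = (k / x**k) * integral_0^x t**k / (exp(t) - 1) dt.

    A composite midpoint rule is used, so t = 0 is never sampled.
    Works for x > 0 and x < 0 (the step h then carries the sign of x).
    """
    h = x / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        total += t ** k / math.expm1(t)
    return k / x ** k * total * h


def frank_kendall_tau(theta):
    """Kendall's tau of a Frank copula, Equation (10); theta != 0."""
    return 1.0 + 4.0 * (debye(1, theta) - 1.0) / theta


def frank_spearman_rho(theta):
    """Spearman's rho of a Frank copula, Equation (11); theta != 0."""
    return 1.0 + 12.0 * (debye(2, theta) - debye(1, theta)) / theta


if __name__ == "__main__":
    for theta in (-5.0, -1.0, 1.0, 5.0):
        print(theta, frank_kendall_tau(theta), frank_spearman_rho(theta))
```

For small $|\theta|$ the two measures behave like $\theta/9$ and $\theta/6$, respectively, and both are odd functions of $\theta$.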
In order to find exact representations for the integrals in Equation (2), we first need to obtain the series representation for both $C_\theta(u,v)$ and $c_\theta(u,v)$. Thus, let one consider the following identities [40]:
$$ \frac{x}{e^x - 1} = \sum_{n=0}^{\infty} B_n \frac{x^n}{n!}, \quad |x| < 2\pi \tag{13} $$
where $B_n$ is the n-th Bernoulli number. Moreover, the following formula holds as the generating function of Bernoulli polynomials [40]:
$$ \frac{x e^{xt}}{e^x - 1} = \sum_{n=0}^{\infty} B_n(t) \frac{x^n}{n!}, \quad |x| < 2\pi,\ t \in \mathbb{R} \tag{14} $$
where $B_n(t)$ is the n-th Bernoulli polynomial. It is easy to see that $B_n(0) = B_n$. Besides, another generating function of interest is [41]:
$$ \exp\left( \sum_{n=1}^{\infty} \lambda_n \frac{t^n}{n!} \right) = 1 + \sum_{n=1}^{\infty} Y_n(\lambda_1, \ldots, \lambda_n) \frac{t^n}{n!} \tag{15} $$
where $Y_n$ is the n-th complete exponential Bell polynomial. From Bell polynomials theory, the following inversion formula holds [41]:
$$ y_n = Y_n(\lambda_1, \ldots, \lambda_n) = \sum_{k=1}^{n} Y_{n,k}(\lambda_1, \ldots, \lambda_{n-k+1}) \tag{16} $$
where $Y_{n,k}$ is the partial exponential Bell polynomial [41]. Thus:
$$ \lambda_n = \sum_{k=1}^{n} (-1)^{k-1} (k-1)!\, Y_{n,k}(y_1, \ldots, y_{n-k+1}) \tag{17} $$
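The inversion pair in Equations (16) and (17) can be checked with exact rational arithmetic. The sketch below evaluates the partial exponential Bell polynomials through their standard recurrence, $Y_{n,k} = \sum_{i=1}^{n-k+1} \binom{n-1}{i-1} x_i Y_{n-i,k-1}$; the function names are hypothetical and the recursion is not optimized.

```python
from fractions import Fraction
from math import comb, factorial


def partial_bell(n, k, x):
    """Partial exponential Bell polynomial Y_{n,k}(x[0], ..., x[n-k]).

    x[i] holds the (i+1)-th argument of the polynomial.
    """
    if n == 0 and k == 0:
        return Fraction(1)
    if n == 0 or k == 0:
        return Fraction(0)
    return sum(
        comb(n - 1, i - 1) * x[i - 1] * partial_bell(n - i, k - 1, x)
        for i in range(1, n - k + 2)
    )


def complete_from_lambda(lam):
    """Equation (16): y_n = sum_k Y_{n,k}(lambda_1, ..., lambda_{n-k+1})."""
    return [
        sum(partial_bell(n, k, lam) for k in range(1, n + 1))
        for n in range(1, len(lam) + 1)
    ]


def lambda_from_complete(y):
    """Equation (17): lambda_n = sum_k (-1)^(k-1) (k-1)! Y_{n,k}(y_1, ...)."""
    return [
        sum(
            (-1) ** (k - 1) * factorial(k - 1) * partial_bell(n, k, y)
            for k in range(1, n + 1)
        )
        for n in range(1, len(y) + 1)
    ]


if __name__ == "__main__":
    lam = [Fraction(1), Fraction(2), Fraction(-3), Fraction(5, 7)]
    y = complete_from_lambda(lam)
    print(y, lambda_from_complete(y) == lam)
```

Applying Equation (17) to the output of Equation (16) recovers the original sequence, which is the property used later to isolate the coefficients $x_n$.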
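Let us assume, without loss of generality, that:
$$ C_\theta(u,v) = \sum_{n=0}^{\infty} x_n \frac{(-\theta)^n}{n!} \tag{18} $$
From Equation (8), one may state that:
$$ \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} = 1 - \exp\left( \sum_{n=1}^{\infty} x_{n-1} \frac{(-\theta)^n}{(n-1)!} \right) \tag{19} $$
The Taylor expansion of the exponential function is known as:
$$ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} \tag{20} $$
Thus:
$$ (1-e^{-\theta u})(1-e^{-\theta v}) = \sum_{n=1}^{\infty} \frac{(-\theta u)^n}{n!} \sum_{n=1}^{\infty} \frac{(-\theta v)^n}{n!} \tag{21} $$
By performing a change of index in the summations and using the following identity:
$$ \sum_{n=0}^{\infty} a_n x^n \sum_{n=0}^{\infty} b_n x^n = \sum_{n=0}^{\infty} c_n x^n, \quad c_n = \sum_{k=0}^{n} a_k b_{n-k}, \tag{22} $$
the product of series in Equation (21) can be given as:
$$ (1-e^{-\theta u})(1-e^{-\theta v}) = \theta^2 u v \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{u^k v^{n-k}}{(k+1)!\,(n-k+1)!} (-\theta)^n \tag{23} $$
Thus:
$$ \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} = \frac{\theta^2 u v}{1-e^{-\theta}} \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{u^k v^{n-k}}{(k+1)!\,(n-k+1)!} (-\theta)^n \tag{24} $$
By using the generating function for Bernoulli numbers given in Equation (13), one may get:
$$ \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} = \theta u v \sum_{n=0}^{\infty} \frac{B_n}{n!} (-\theta)^n \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{u^k v^{n-k}}{(k+1)!\,(n-k+1)!} (-\theta)^n, \quad |\theta| < 2\pi \tag{25} $$
Equation (25) leads to:
$$ \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} = \theta u v \sum_{n=0}^{\infty} \sum_{r=0}^{n} \sum_{k=0}^{r} \frac{u^k v^{r-k}}{(k+1)!\,(r-k+1)!} \frac{B_{n-r}}{(n-r)!} (-\theta)^n, \quad |\theta| < 2\pi \tag{26} $$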
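Equation (26) can be verified numerically by truncating the triple sum and comparing it with the closed form on the left-hand side. In the sketch below, the Bernoulli numbers (with $B_1 = -1/2$, as implied by Equation (13)) are generated from the recurrence $\sum_{k=0}^{n} \binom{n+1}{k} B_k = 0$; the function names and the truncation order are assumptions made for illustration.

```python
import math
from fractions import Fraction


def bernoulli_numbers(count):
    """First `count` Bernoulli numbers with B_1 = -1/2, as in Equation (13)."""
    b = [Fraction(1)]
    for n in range(1, count):
        # Recurrence: sum_{k=0}^{n} C(n+1, k) B_k = 0, solved for B_n.
        acc = sum(math.comb(n + 1, k) * b[k] for k in range(n))
        b.append(-acc / (n + 1))
    return b


def g_exact(theta, u, v):
    """Left-hand side of Equation (26); theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return num / (-math.expm1(-theta))


def g_series(theta, u, v, terms=25):
    """Right-hand side of Equation (26), truncated after `terms` powers."""
    b = [float(x) for x in bernoulli_numbers(terms)]
    total = 0.0
    for n in range(terms):
        coeff = 0.0
        for r in range(n + 1):
            inner = sum(
                u ** k * v ** (r - k)
                / (math.factorial(k + 1) * math.factorial(r - k + 1))
                for k in range(r + 1)
            )
            coeff += inner * b[n - r] / math.factorial(n - r)
        total += coeff * (-theta) ** n
    return theta * u * v * total


if __name__ == "__main__":
    print(g_exact(1.0, 0.3, 0.7), g_series(1.0, 0.3, 0.7))
```

The agreement degrades as $|\theta|$ approaches $2\pi$, in line with the radius of convergence stated in Equation (26).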
From Equations (19) and (15):
$$ 1 - \exp\left( \sum_{n=1}^{\infty} n x_{n-1} \frac{(-\theta)^n}{n!} \right) = -\sum_{n=1}^{\infty} Y_n(x_0, \ldots, n x_{n-1}) \frac{(-\theta)^n}{n!} \tag{27} $$
From Equations (27) and (26):
$$ \sum_{n=0}^{\infty} \sum_{r=0}^{n} \sum_{k=0}^{r} \frac{u^{k+1} v^{r-k+1}}{(k+1)!\,(r-k+1)!} \frac{B_{n-r}}{(n-r)!} (-\theta)^{n+1} = \sum_{n=1}^{\infty} \sum_{r=0}^{n-1} \sum_{k=0}^{r} \frac{u^{k+1} v^{r-k+1}}{(k+1)!\,(r-k+1)!} \frac{B_{n-1-r}}{(n-1-r)!} (-\theta)^{n} = \sum_{n=1}^{\infty} Y_n(x_0, \ldots, n x_{n-1}) \frac{(-\theta)^n}{n!} \tag{28} $$
A term by term comparison leads to:
$$ Y_n(x_0, \ldots, n x_{n-1}) = n! \sum_{r=0}^{n-1} \sum_{k=0}^{r} \frac{u^{k+1} v^{r-k+1}}{(k+1)!\,(r-k+1)!} \frac{B_{n-1-r}}{(n-1-r)!} = y_n \tag{29} $$
By using the inversion property of Bell Polynomials given in Equations (16) and (17):
$$ n x_{n-1} = \sum_{k=1}^{n} (-1)^{k-1} (k-1)!\, Y_{n,k}(y_1, \ldots, y_{n-k+1}) \tag{30} $$
More explicitly:
$$ x_{n-1} = n^{-1} \sum_{k=1}^{n} (-1)^{k-1} (k-1)!\, Y_{n,k}\left( uv, \ldots, \sum_{r=0}^{n-k} \sum_{l=0}^{r} \frac{u^{l+1} v^{r-l+1} B_{n-k-r}\,(n-k+1)!}{(l+1)!\,(r-l+1)!\,(n-k-r)!} \right) \tag{31} $$
The first few terms of the expansion are:
$$ \begin{aligned} x_0 &= uv \\ x_1 &= -\frac{uv}{2}(u-1)(v-1) \\ x_2 &= \frac{uv}{6}\left(2u^2-3u+1\right)\left(2v^2-3v+1\right) \\ x_3 &= -\frac{uv}{4}(u-1)(v-1)\left[ u^2\left(6v^2-6v+1\right) + u\left(-6v^2+6v-1\right) + (v-1)v \right] \end{aligned} \tag{32} $$
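As a quick check of the polynomial representation, the sketch below compares the four-term truncation of Equation (18) with the closed form in Equation (8). The coefficients are written with the signs that make $C_\theta(u,v) = \sum_n x_n (-\theta)^n / n!$ reproduce Equation (8), namely $x_1 = -uv(u-1)(v-1)/2$ and a leading minus sign in $x_3$; the function names are hypothetical.

```python
import math


def frank_cdf(theta, u, v):
    """Frank copula C_theta(u, v) of Equation (8); theta != 0."""
    g = math.expm1(-theta * u) * math.expm1(-theta * v) / (-math.expm1(-theta))
    return -math.log1p(-g) / theta


def frank_cdf_series(theta, u, v):
    """Four-term truncation of Equation (18) with x_0, ..., x_3."""
    x0 = u * v
    x1 = -u * v * (u - 1) * (v - 1) / 2
    x2 = u * v * (2 * u**2 - 3 * u + 1) * (2 * v**2 - 3 * v + 1) / 6
    bracket = (
        u**2 * (6 * v**2 - 6 * v + 1)
        + u * (-6 * v**2 + 6 * v - 1)
        + (v - 1) * v
    )
    x3 = -u * v * (u - 1) * (v - 1) * bracket / 4
    terms = [x0, x1, x2, x3]
    return sum(x * (-theta) ** n / math.factorial(n) for n, x in enumerate(terms))


if __name__ == "__main__":
    for theta in (-1.0, 0.5, 1.0):
        print(theta, frank_cdf(theta, 0.3, 0.7), frank_cdf_series(theta, 0.3, 0.7))
```

At $u = v = 1/2$ the copula reduces to $\ln\left((1 + e^{\theta/2})/2\right)/\theta$, which provides an independent reference value for both functions.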
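Now, we can find the series for $c_\theta(u,v)$ given in Equation (9). First, a direct application of Equation (14) leads to:
$$ \frac{\theta e^{-\theta(u+v)}}{1-e^{-\theta}} = \sum_{n=0}^{\infty} B_n(u+v) \frac{(-\theta)^n}{n!}, \quad |\theta| < 2\pi \tag{33} $$
On the other hand, Faà di Bruno’s formula states that [41]:
$$ \frac{\partial^n}{\partial x^n} f(g(x)) = \sum_{k=1}^{n} \left.\frac{\partial^k}{\partial y^k} f(y)\right|_{y=g(x)} Y_{n,k}\left( \frac{\partial}{\partial x} g(x), \ldots, \frac{\partial^{n-k+1}}{\partial x^{n-k+1}} g(x) \right) \tag{34} $$
Let:
$$ \left[ 1 - \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} \right]^{-2} = f(g(\theta)) \tag{35} $$
By taking $f(y) = (1-y)^{-2}$ and noticing that $g(\theta)$ has been expressed as a Maclaurin series in Equation (26), it follows that, writing $x = -\theta$:
$$ \left.\frac{\partial^n}{\partial x^n} g(x)\right|_{x=0} = -n! \sum_{r=0}^{n-1} \sum_{k=0}^{r} \frac{u^{k+1} v^{r-k+1}}{(k+1)!\,(r-k+1)!} \frac{B_{n-r-1}}{(n-r-1)!} \tag{36} $$
On the other hand,
$$ \frac{\partial^k}{\partial x^k} f(x) = (k+1)!\,(1-x)^{-2-k} \tag{37} $$
Thus:
$$ f(g(\theta)) = \sum_{n=0}^{\infty} \frac{(-\theta)^n}{n!} \left.\frac{\partial^n}{\partial x^n} f(g(x))\right|_{x=0} \tag{38} $$
where:
$$ \left.\frac{\partial^n}{\partial x^n} f(g(x))\right|_{x=0} = \sum_{k=1}^{n} \left.\frac{\partial^k}{\partial y^k} f(y)\right|_{y=g(0)} Y_{n,k}\left( \left.\frac{\partial}{\partial x} g(x)\right|_{x=0}, \ldots, \left.\frac{\partial^{n-k+1}}{\partial x^{n-k+1}} g(x)\right|_{x=0} \right) \tag{39} $$
which is valid for $n \ge 1$. In order to make Equation (39) also valid for $n = 0$, in this specific case, it suffices to expand the summation range of k from $1, \ldots, n$ to $0, \ldots, n$. Thus:
$$ f(g(\theta)) = \sum_{n=0}^{\infty} \frac{(-\theta)^n}{n!} \times \sum_{k=0}^{n} (k+1)!\, Y_{n,k}\left( -uv, \ldots, -\sum_{r=0}^{n-k} \sum_{l=0}^{r} \frac{u^{l+1} v^{r-l+1} B_{n-k-r}\,(n-k+1)!}{(l+1)!\,(r-l+1)!\,(n-k-r)!} \right) \tag{40} $$
Finally, from Equations (9), (33) and (40), the following holds:
$$ c_\theta(u,v) = \sum_{n=0}^{\infty} B_n(u+v) \frac{(-\theta)^n}{n!} \sum_{n=0}^{\infty} \frac{(-\theta)^n}{n!} \times \sum_{k=0}^{n} (k+1)!\, Y_{n,k}\left( -uv, \ldots, -\sum_{r=0}^{n-k} \sum_{l=0}^{r} \frac{u^{l+1} v^{r-l+1} B_{n-k-r}\,(n-k+1)!}{(l+1)!\,(r-l+1)!\,(n-k-r)!} \right), \quad |\theta| < 2\pi \tag{41} $$
which can be further simplified as:
$$ c_\theta(u,v) = \sum_{n=0}^{\infty} p_n (-\theta)^n = \sum_{n=0}^{\infty} \sum_{w=0}^{n} \sum_{k=0}^{w} Y_{w,k}\left( -uv, \ldots, -\sum_{r=0}^{w-k} \sum_{l=0}^{r} \frac{u^{l+1} v^{r-l+1} B_{w-k-r}\,(w-k+1)!}{(l+1)!\,(r-l+1)!\,(w-k-r)!} \right) \frac{(k+1)!\, B_{n-w}(u+v)}{(n-w)!\, w!} (-\theta)^n, \tag{42} $$
which is valid for $|\theta| < 2\pi$.
The first few terms of the expansion are:
$$ \begin{aligned} p_0 &= 1 \\ p_1 &= -\frac{1}{2} + u + v - 2uv \\ p_2 &= \frac{1}{12}\left(6u^2-6u+1\right)\left(6v^2-6v+1\right) \\ p_3 &= -\frac{1}{12}(2u-1)(2v-1)\left[ u^2\left(12v^2-12v+1\right) + u\left(-12v^2+12v-1\right) + (v-1)v \right] \end{aligned} \tag{43} $$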
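The same kind of check can be made for the density. The sketch below compares the four-term truncation of $c_\theta(u,v) = \sum_n p_n (-\theta)^n$ with Equation (9), using $p_1 = -1/2 + u + v - 2uv$ and a leading minus sign in $p_3$ (the signs for which the series matches Equation (9) to the stated order); the function names are hypothetical.

```python
import math


def frank_density(theta, u, v):
    """Frank copula density c_theta(u, v) of Equation (9); theta != 0."""
    one_minus_e = -math.expm1(-theta)
    g = math.expm1(-theta * u) * math.expm1(-theta * v) / one_minus_e
    return theta * math.exp(-theta * (u + v)) / one_minus_e / (1.0 - g) ** 2


def frank_density_series(theta, u, v):
    """Four-term truncation of Equation (42) with p_0, ..., p_3."""
    p0 = 1.0
    p1 = -0.5 + u + v - 2 * u * v
    p2 = (6 * u**2 - 6 * u + 1) * (6 * v**2 - 6 * v + 1) / 12
    bracket = (
        u**2 * (12 * v**2 - 12 * v + 1)
        + u * (-12 * v**2 + 12 * v - 1)
        + (v - 1) * v
    )
    p3 = -(2 * u - 1) * (2 * v - 1) * bracket / 12
    return sum(p * (-theta) ** n for n, p in enumerate([p0, p1, p2, p3]))


if __name__ == "__main__":
    for theta in (-0.3, 0.3, 1.0):
        print(theta, frank_density(theta, 0.2, 0.6), frank_density_series(theta, 0.2, 0.6))
```

Since each $p_n$ is a polynomial in u and v, the truncated series is itself a polynomial approximation of the Frank density for small $|\theta|$.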
The analytical structure of Frank copulas reveals that this type of copula can be well approximated by polynomial copulas by simply taking as many terms as needed in Equation (42). The series expansions above are valuable to understand the analytical structure of Frank copula models and can be used as approximations when needed.
We can now proceed to obtain exact series representations for the reliability index when Frank copulas are considered.

5. Reliability Calculations

By using the results from Section 4 we can dissect the analytical structure of the equations involved in determining the relative impact of the dependence between X and Y on P ( Y < X ) when Frank copulas are used to model such dependence.
Moreover, in order to numerically investigate how P ( Y < X ) changes when the variables compared are dependent, we must choose which distribution the variables involved follow. In this paper, we chose distributions from the Dagum family.
In the present section, the exact expressions for P ( Y < X ) when both X and Y follow Dagum distributions and have their dependence modelled by Frank copula are presented. These expressions have not been previously obtained in the literature. Moreover, some of the results are obtained in a closed-form in terms of the H-function.
Thus, from Equation (8), it is easy to see that:
$$ \frac{\partial}{\partial u} C_\theta(u,v) = \frac{e^{-\theta u}(1-e^{-\theta v})}{1-e^{-\theta}} \left[ 1 - \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} \right]^{-1} \tag{44} $$
Thus:
$$ P(Y<X) = \int_0^1 \left.\frac{e^{-\theta u}(1-e^{-\theta v})}{1-e^{-\theta}} \left[ 1 - \frac{(1-e^{-\theta u})(1-e^{-\theta v})}{1-e^{-\theta}} \right]^{-1}\right|_{v=G(F^{-1}(u))} du \tag{45} $$
Now, let $e^{-\theta} = t$, thus:
$$ P(Y<X) = \int_0^1 \left.\frac{t^u(1-t^v)}{1-t} \left[ 1 - \frac{(1-t^u)(1-t^v)}{1-t} \right]^{-1}\right|_{v=G(F^{-1}(u))} du \tag{46} $$
When $t < 1$ the Binomial expansion leads to:
$$ P(Y<X) = \int_0^1 \left.\frac{t^u(1-t^v)}{1-t} \sum_{n=0}^{\infty} \frac{(1-t^u)^n (1-t^v)^n}{(1-t)^n}\right|_{v=G(F^{-1}(u))} du = \sum_{n=0}^{\infty} (1-t)^{-n-1} \int_0^1 \left. t^u (1-t^u)^n (1-t^v)^{n+1}\right|_{v=G(F^{-1}(u))} du \tag{47} $$
By using the Binomial expansion again one gets:
$$ \begin{aligned} P(Y<X) &= \sum_{n=0}^{\infty} (1-t)^{-n-1} \int_0^1 \left. t^u \sum_{k=0}^{n} \binom{n}{k} (-t^u)^k \sum_{r=0}^{n+1} \binom{n+1}{r} (-t^v)^r \right|_{v=G(F^{-1}(u))} du \\ &= \sum_{n=0}^{\infty} \sum_{k=0}^{n} \sum_{r=0}^{n+1} \binom{n}{k} \binom{n+1}{r} \frac{(-1)^{k+r}}{(1-t)^{n+1}} \int_0^1 \left. t^{u(k+1)+vr} \right|_{v=G(F^{-1}(u))} du \end{aligned} \tag{48} $$
By considering the series expansion of the Exponential function:
$$ \begin{aligned} \int_0^1 \left. t^{u(k+1)+vr} \right|_{v=G(F^{-1}(u))} du &= \int_0^1 \left. \sum_{l=0}^{\infty} \frac{(u(k+1)+vr)^l (-\theta)^l}{l!} \right|_{v=G(F^{-1}(u))} du \\ &= \int_0^1 \left. \sum_{l=0}^{\infty} \frac{(-\theta)^l}{l!} \sum_{m=0}^{l} \binom{l}{m} (u(k+1))^{l-m} (vr)^m \right|_{v=G(F^{-1}(u))} du \\ &= \sum_{l=0}^{\infty} \frac{(-\theta)^l}{l!} \sum_{m=0}^{l} \binom{l}{m} (k+1)^{l-m} r^m \int_0^1 \left. u^{l-m} v^m \right|_{v=G(F^{-1}(u))} du \end{aligned} \tag{49} $$
For simplicity, one may define $I_{a,b}(\omega_{X,Y})$ as:
$$ I_{a,b}(\omega_{X,Y}) = \int_0^1 \left. u^a v^b \right|_{v=G(F^{-1}(u))} du = \int_0^1 u^a \left( G(F^{-1}(u)) \right)^b du \tag{50} $$
where $\omega_{X,Y}$ represents a vector of parameters which characterize $G(F^{-1}(u))$ and, therefore, X and Y. For simplicity, we may represent $I_{a,b}(\omega_{X,Y})$ as $I_{a,b}$.
From Equations (49) and (50):
$$ P(Y<X) = \sum_{n=0}^{\infty} \sum_{k=0}^{n} \sum_{r=0}^{n+1} \sum_{l=0}^{\infty} \sum_{m=0}^{l} \binom{n}{k} \binom{n+1}{r} \binom{l}{m} \times \frac{(-\theta)^l (-1)^{k+r} (k+1)^{l-m} r^m I_{l-m,m}}{(1-t)^{n+1}\, l!} \tag{51} $$
which is valid for $|\theta| < 1$.
Equation (51) reveals that the reliability measure $P(Y<X)$ for two random variables whose dependence is described by a Frank copula is fully described by $I_{a,b}$ and $\theta$. Here, $I_{a,b}$ contains all the information about the random variables X and Y, since it depends on the function composition $G(F^{-1}(u))$. From Equations (3) and (50), it can be seen that $I_{0,1} = R$.
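The building block $I_{a,b}$ of Equation (50) is a one-dimensional integral and is easy to approximate numerically. The sketch below uses a composite midpoint rule and takes the composition $G(F^{-1}(u))$ as a user-supplied callable; both the function name and the example composition $u/(2-u)$ (which arises, for instance, from the distribution functions $F(z) = z/(z+1)$ and $G(z) = z/(z+2)$) are assumptions made for illustration.

```python
import math


def i_ab(a, b, g_of_f_inv, n=4000):
    """Midpoint-rule approximation of I_{a,b} in Equation (50).

    g_of_f_inv is a callable returning G(F^{-1}(u)) for u in (0, 1).
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += u ** a * g_of_f_inv(u) ** b
    return total * h


if __name__ == "__main__":
    # Identically distributed variables: G(F^{-1}(u)) = u, so I_{a,b} = 1/(a+b+1).
    print(i_ab(2, 3, lambda u: u), 1.0 / 6.0)
    # Example composition u / (2 - u): I_{0,1} = R = 2 ln 2 - 1.
    print(i_ab(0, 1, lambda u: u / (2.0 - u)), 2.0 * math.log(2.0) - 1.0)
```

The second call illustrates the identity $I_{0,1} = R$: it returns the independent part of the reliability index for the chosen pair of margins.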
To actually calculate the value of the reliability index, we need to choose the marginal distributions. As indicated, we chose Dagum random variables.
Let Z be a continuous random variable taking positive values, $z > 0$, whose cumulative distribution function is defined as [36]:
$$ F(z) = \left( 1 + \gamma_2 z^{-\gamma_3} \right)^{-\gamma_1} \tag{52} $$
where $\gamma_i > 0$, $i = 1, 2, 3$, are the distribution parameters. Thus, we say Z follows a Dagum distribution, $Dag(\gamma_1, \gamma_2, \gamma_3)$, and write $Z \sim Dag(\gamma_1, \gamma_2, \gamma_3)$.
One can easily obtain the quantile function for $Z \sim Dag(\gamma_1, \gamma_2, \gamma_3)$ as:
$$ F^{-1}(y) = \left( \left( y^{-1/\gamma_1} - 1 \right) / \gamma_2 \right)^{-1/\gamma_3} \tag{53} $$
where $0 \le y \le 1$.
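A direct transcription of Equations (52) and (53) is given below as a minimal sketch; the function names are hypothetical, and the quantile function is evaluated only for $0 < y < 1$ to avoid the endpoint singularities.

```python
def dagum_cdf(z, g1, g2, g3):
    """Dagum distribution function F(z) of Equation (52); z > 0, g1, g2, g3 > 0."""
    return (1.0 + g2 * z ** (-g3)) ** (-g1)


def dagum_quantile(y, g1, g2, g3):
    """Dagum quantile function F^{-1}(y) of Equation (53); 0 < y < 1."""
    return ((y ** (-1.0 / g1) - 1.0) / g2) ** (-1.0 / g3)


if __name__ == "__main__":
    # Round trip: F(F^{-1}(y)) should return y.
    for y in (0.1, 0.5, 0.9):
        z = dagum_quantile(y, 2.0, 1.5, 3.0)
        print(y, z, dagum_cdf(z, 2.0, 1.5, 3.0))
```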
Thus, if one considers the reliability measure $P(Y<X)$ when $X \sim Dag(\gamma_{1,1}, \gamma_{1,2}, \gamma_{1,3})$ and $Y \sim Dag(\gamma_{2,1}, \gamma_{2,2}, \gamma_{2,3})$, the function $I_{a,b}$ can be calculated as:
$$ I_{a,b} = \int_0^1 u^a \left[ 1 + \gamma_{2,2} \left( \frac{u^{-1/\gamma_{1,1}} - 1}{\gamma_{1,2}} \right)^{\gamma_{2,3}/\gamma_{1,3}} \right]^{-b \gamma_{2,1}} du \tag{54} $$
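Literature [42] reveals that:
$$ (1+x)^{-a} = \frac{1}{\Gamma(a)} H^{1,1}_{1,1}\left[ x \,\middle|\, \begin{matrix} (1-a, 1) \\ (0, 1) \end{matrix} \right] \tag{55} $$
where $H^{m,n}_{p,q}$ stands for the H-function [42], whose definition is presented in the Appendix A.1. Thus:
$$ I_{a,b} = \int_0^1 u^a \frac{1}{\Gamma(b\gamma_{2,1})} H^{1,1}_{1,1}\left[ \gamma_{2,2} \left( \frac{u^{-1/\gamma_{1,1}} - 1}{\gamma_{1,2}} \right)^{\gamma_{2,3}/\gamma_{1,3}} \,\middle|\, \begin{matrix} (1-b\gamma_{2,1}, 1) \\ (0, 1) \end{matrix} \right] du \tag{56} $$
By considering the contour representation of the H-function, the following holds [42]:
$$ \begin{aligned} I_{a,b} &= \frac{1}{\Gamma(b\gamma_{2,1})} \int_0^1 u^a \frac{1}{2\pi i} \int_L \left[ \gamma_{2,2} \left( \frac{u^{-1/\gamma_{1,1}} - 1}{\gamma_{1,2}} \right)^{\gamma_{2,3}/\gamma_{1,3}} \right]^{-s} \Gamma(s)\, \Gamma(b\gamma_{2,1} - s)\, ds\, du \\ &= \frac{1}{\Gamma(b\gamma_{2,1})\,(2\pi i)} \int_L \left( \frac{\gamma_{2,2}}{\gamma_{1,2}^{\gamma_{2,3}/\gamma_{1,3}}} \right)^{-s} \Gamma(s)\, \Gamma(b\gamma_{2,1} - s) \int_0^1 u^a \left( u^{-1/\gamma_{1,1}} - 1 \right)^{-\gamma_{2,3} s/\gamma_{1,3}} du\, ds \end{aligned} \tag{57} $$
In order to calculate the integral in (57), one may notice that:
$$ \int_0^1 u^a \left( u^{-1/\gamma_{1,1}} - 1 \right)^{-\gamma_{2,3} s/\gamma_{1,3}} du = \int_0^1 u^{a + \frac{\gamma_{2,3} s}{\gamma_{1,3} \gamma_{1,1}}} \left( 1 - u^{1/\gamma_{1,1}} \right)^{-\gamma_{2,3} s/\gamma_{1,3}} du \tag{58} $$
By letting $u^{1/\gamma_{1,1}} = y$:
$$ \int_0^1 u^a \left( u^{-1/\gamma_{1,1}} - 1 \right)^{-\gamma_{2,3} s/\gamma_{1,3}} du = \gamma_{1,1} \int_0^1 y^{(a+1)\gamma_{1,1} + \frac{\gamma_{2,3} s}{\gamma_{1,3}} - 1} (1-y)^{-\gamma_{2,3} s/\gamma_{1,3}} dy \tag{59} $$
The Beta function $B(x,y)$ can be defined as [42]:
$$ B(x,y) = \int_0^1 q^{x-1} (1-q)^{y-1} dq = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)} \tag{60} $$
A direct comparison of Equations (59) and (60) indicates:
$$ \int_0^1 u^a \left( u^{-1/\gamma_{1,1}} - 1 \right)^{-\gamma_{2,3} s/\gamma_{1,3}} du = \gamma_{1,1}\, B\left( (a+1)\gamma_{1,1} + \frac{\gamma_{2,3} s}{\gamma_{1,3}},\, 1 - \frac{\gamma_{2,3} s}{\gamma_{1,3}} \right) = \frac{\Gamma\left( (a+1)\gamma_{1,1} + \frac{\gamma_{2,3} s}{\gamma_{1,3}} \right) \Gamma\left( 1 - \frac{\gamma_{2,3} s}{\gamma_{1,3}} \right)}{(a+1)\, \Gamma\left( (a+1)\gamma_{1,1} \right)} \tag{61} $$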
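Equation (61) can be spot-checked numerically for a real value of the exponent. In the sketch below, $c$ stands for $\gamma_{2,3} s / \gamma_{1,3}$ and is restricted to real values with $1 - c > 0$ and $(a+1)\gamma_{1,1} + c > 0$; this is an assumption made only for the check, since in the derivation $s$ runs along a complex contour. The function names and the midpoint rule are likewise illustrative.

```python
import math


def lhs_integral(a, gamma11, c, n=20000):
    """Midpoint rule for integral_0^1 u^a (u^(-1/gamma11) - 1)^(-c) du."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += u ** a * (u ** (-1.0 / gamma11) - 1.0) ** (-c)
    return total * h


def rhs_closed_form(a, gamma11, c):
    """Right-hand side of Equation (61) with c = gamma_{2,3} s / gamma_{1,3}."""
    top = math.gamma((a + 1) * gamma11 + c) * math.gamma(1.0 - c)
    return top / ((a + 1) * math.gamma((a + 1) * gamma11))


if __name__ == "__main__":
    print(lhs_integral(1, 2.0, -0.5), rhs_closed_form(1, 2.0, -0.5))
```

A negative $c$ is used in the example so that the integrand stays bounded near $u = 1$ and the simple quadrature converges quickly.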
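From Equations (57) and (61):
$$ \begin{aligned} I_{a,b} &= \frac{1}{2\pi i} \int_L \left( \frac{\gamma_{2,2}}{\gamma_{1,2}^{\gamma_{2,3}/\gamma_{1,3}}} \right)^{-s} \frac{\Gamma(s)\, \Gamma(b\gamma_{2,1} - s)\, \Gamma\left( (a+1)\gamma_{1,1} + \frac{\gamma_{2,3} s}{\gamma_{1,3}} \right) \Gamma\left( 1 - \frac{\gamma_{2,3} s}{\gamma_{1,3}} \right)}{(a+1)\, \Gamma(b\gamma_{2,1})\, \Gamma\left( (a+1)\gamma_{1,1} \right)}\, ds \\ &= \frac{1}{(a+1)\, \Gamma(b\gamma_{2,1})\, \Gamma\left( (a+1)\gamma_{1,1} \right)} H^{2,2}_{2,2}\left[ \frac{\gamma_{2,2}}{\gamma_{1,2}^{\gamma_{2,3}/\gamma_{1,3}}} \,\middle|\, \begin{matrix} (1 - b\gamma_{2,1}, 1),\ \left( 0, \frac{\gamma_{2,3}}{\gamma_{1,3}} \right) \\ (0, 1),\ \left( (a+1)\gamma_{1,1}, \frac{\gamma_{2,3}}{\gamma_{1,3}} \right) \end{matrix} \right] \\ &= \frac{1}{(a+1)\, \Gamma(b\gamma_{2,1})\, \Gamma\left( (a+1)\gamma_{1,1} \right)} H^{2,2}_{2,2}\left[ \frac{\gamma_{1,2}^{\gamma_{2,3}/\gamma_{1,3}}}{\gamma_{2,2}} \,\middle|\, \begin{matrix} (1, 1),\ \left( 1 - (a+1)\gamma_{1,1}, \frac{\gamma_{2,3}}{\gamma_{1,3}} \right) \\ (b\gamma_{2,1}, 1),\ \left( 1, \frac{\gamma_{2,3}}{\gamma_{1,3}} \right) \end{matrix} \right] \end{aligned} \tag{62} $$
An interesting case shows up when both X and Y are identically distributed, i.e., $\gamma_{1,j} = \gamma_{2,j}$, $j = 1, 2, 3$, and $G(F^{-1}(u)) = u$:
$$ P(Y<X) = \sum_{n=0}^{\infty} (1-t)^{-n-1} \int_0^1 t^u (1 - t^u)^{2n+1} du = \sum_{n=0}^{\infty} -\frac{(1-t)^{n+1}}{2(n+1)\log(t)} = \frac{1}{2} \tag{63} $$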
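The closed value in Equation (63) follows from $\sum_{n \ge 0} (1-t)^{n+1}/(n+1) = -\log(t)$ for $0 < t < 1$. The sketch below simply accumulates the partial sums of the series, assuming $\theta > 0$ so that $t = e^{-\theta} < 1$; the function name and the default number of terms are illustrative choices.

```python
import math


def identical_case_partial_sum(theta, terms=200):
    """Partial sum of the series in Equation (63); requires theta > 0."""
    t = math.exp(-theta)
    log_t = math.log(t)
    return sum(
        -(1.0 - t) ** (n + 1) / (2.0 * (n + 1) * log_t) for n in range(terms)
    )


if __name__ == "__main__":
    for theta in (0.3, 1.0, 5.0):
        print(theta, identical_case_partial_sum(theta, terms=20000))
```

The convergence is geometric with ratio $1 - t$, so more terms are needed as $\theta$ grows and $t$ approaches zero.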
By using the expressions hereby derived, we can illustrate the calculations of the reliability index.

6. Numerical Calculations

From a theoretical point of view, the series obtained in Equation (51) are important to show the analytical structure of the reliability measure P ( Y < X ) when both X and Y follow Dagum distributions whose dependence is modeled by a Frank Copula.
On the other hand, from a numerical point of view, the series in Equation (51) converges well for | θ | < 1 . Around 30 terms are needed on each infinite summation presented in Equation (51) to account for a 4-digit accuracy. The computation, on the other hand, is not fast as it considers nested series with large factorials involved.
For values of $\theta$ outside this range, instead of looking for other series expansions, one may note that the integrand in Equation (46) is quite well behaved. Therefore, simple numerical integration techniques can be employed to get high accuracy results.
If we consider the generalized midpoint integration rule, the integral in (46) can be represented as:
$$ P(Y<X) = \lim_{n \to \infty} \frac{1}{n} \left[ \sum_{k=1}^{n-1} \frac{t^{k/n} \left( 1 - t^{G(F^{-1}(k/n))} \right)}{(1-t) - \left( 1 - t^{k/n} \right) \left( 1 - t^{G(F^{-1}(k/n))} \right)} + \frac{1}{2} \right] \tag{64} $$
where $t = \exp(-\theta)$ and:
$$ G(F^{-1}(x)) = \left[ 1 + \gamma_{2,2} \left( \frac{x^{-1/\gamma_{1,1}} - 1}{\gamma_{1,2}} \right)^{\gamma_{2,3}/\gamma_{1,3}} \right]^{-\gamma_{2,1}} \tag{65} $$
Numerical experiments show that the summation in Equation (64) provides high accuracy results for n around 200. The computation above is quite fast, as only standard functions are involved, and it is valid for all the possible values of $\theta$.
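A minimal sketch of the numerical scheme in Equation (64) is given below. It reads the sum as running over the interior nodes $k/n$, $k = 1, \ldots, n-1$, with the term 1/2 accounting for the endpoint values of the integrand (0 at $u = 0$ and 1 at $u = 1$); this reading, the function names and the parameter tuples are assumptions made for illustration and do not reproduce the authors’ own code.

```python
import math


def g_of_f_inv(x, params_x, params_y):
    """Composition G(F^{-1}(x)) of Equation (65) for two Dagum variables.

    params_x = (g11, g12, g13) for X and params_y = (g21, g22, g23) for Y.
    """
    g11, g12, g13 = params_x
    g21, g22, g23 = params_y
    inner = ((x ** (-1.0 / g11) - 1.0) / g12) ** (g23 / g13)
    return (1.0 + g22 * inner) ** (-g21)


def reliability_frank_dagum(theta, params_x, params_y, n=200):
    """Approximate P(Y < X) via Equation (64); theta != 0."""
    t = math.exp(-theta)
    total = 0.5  # endpoint contribution: (0 + 1) / 2
    for k in range(1, n):
        u = k / n
        v = g_of_f_inv(u, params_x, params_y)
        numerator = t ** u * (1.0 - t ** v)
        denominator = (1.0 - t) - (1.0 - t ** u) * (1.0 - t ** v)
        total += numerator / denominator
    return total / n


if __name__ == "__main__":
    px, py = (1.0, 1.0, 1.0), (1.0, 2.0, 1.0)
    for theta in (-5.0, -1.0, 1.0, 5.0):
        print(theta, reliability_frank_dagum(theta, px, py))
```

Two simple checks are available: identically distributed variables should give a value close to 1/2 for any $\theta$, and a very small $|\theta|$ should approach the independent value R, which equals $2\ln 2 - 1$ for the parameters used in the example.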

6.1. Parametric Study

Now that the computational requirements have been discussed, one may investigate how $D_\theta$ is affected by the parameters of the Dagum random variables modeled, as well as by the value of $\theta$ for the Frank copula used to model the dependency relation between those random variables. Figure 1 and Figure 2 present some results. Additional horizontal axes, placed on top of the plots, present the corresponding values of Kendall’s $\tau$ and Spearman’s $\rho$ according to the values of $\theta$.
Figure 1 reveals that the impact of $\theta$ on the reliability measure $P(Y<X)$ is quite important. Since $P(Y<X)$ ranges from 0 to 1, one may see that the impact of the dependence between the random variables can be as high as 0.08 for the variables considered in Figure 1. For high reliability systems, this may be too much.
Figure 2 gives a sense of the relative importance of $D_\theta$ with respect to R. It can be seen that $D_\theta$ may be almost half of R for some of the cases studied.
One important aspect to be highlighted from the analysis of Figure 1 and Figure 2 is that the sign of $\theta$ does not always represent the relative impact of $D_\theta$ on $P(Y<X)$. Besides, Figure 1 and Figure 2 indicate that there exist inflection points on the curves, which reveals that the sign of $D_\theta$ may change even when one is dealing with strictly positive or strictly negative correlation cases (the dotted curve highlights this behavior).

6.2. Applications to Real Stock Data

In the previous section, we performed a parametric study on how the reliability index between Dagum Random variables may be affected by the dependency parameter θ of Frank copulas.
On the other hand, stock returns cannot be modelled as Dagum random variables as they are not strictly positive. By using (7), it is easy to see that by letting β ( x ) = ln ( x ) , all the formulas hereby proposed are also applicable to the random variables ln ( X ) and ln ( Y ) , where both X and Y follow Dagum distributions. These transformed random variables are called Log–Dagum distributions and have been studied as alternative models for financial data [43].
In order to show the applicability of the expressions obtained, three real stock return data sets have been analysed. The stock return data have been collected using the yfinance Python package.
We highlight that the analysis we will carry out is only illustrative of the real portfolio optimization problem, as the multidimensional multi-asset portfolio problem is reduced here to a single-asset selection. This is not a major drawback, as the general methodological approach proposed remains valid.
As described, each data set has been modelled as a Log–Dagum random variable and the reliability of the type P ( Y < X ) has been calculated. This type of reliability index represents a situation in which a given investor has to choose between two stocks [28]. In the case P ( Y < X ) < 1 / 2 , the reliability analysis indicates that the random variable Y is more profitable, as its returns are greater than the ones of X. On the other hand, when P ( Y < X ) > 1 / 2 the opposite happens. Finally, when P ( Y < X ) = 1 / 2 , there is no difference between both stocks [28].
When using the value of P ( Y < X ) to build multi-asset portfolios, it is important that such measure satisfy transitivity, i.e., given three distributions X, Y and Z, if P ( X > Y ) > 0.5 and P ( Y > Z ) > 0.5 , we should study if we can infer that P ( X > Z ) > 0.5 . If a circular ordering among assets X, Y and Z occurs, the optimization procedure of multi-asset portfolios is considerably affected. Proving transitivity of P ( Y < X ) requires further studies which are out of the scope of the present paper and shall be addressed in future publications. On the other hand, the methodological framework hereby presented is not invalidated. For example, if we consider a finite number of portfolios (which is translated into quantized weights instead of arbitrary weights from 0 to 1), we can perform a brute-force optimization by checking all the possible portfolios. Of course this is not desirable, but can be used as a first approximation.
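To see why transitivity cannot be taken for granted, the following sketch evaluates $P(\cdot > \cdot)$ exactly for three hypothetical discrete distributions (the classic non-transitive dice, used here purely as an illustration and not as a model of stock returns). The three variables are assumed independent, and each list holds equally likely outcomes.

```python
from itertools import product

def prob_greater(a, b):
    """P(A > B) for independent draws from two lists of equally likely values."""
    wins = sum(1 for x, y in product(a, b) if x > y)
    return wins / (len(a) * len(b))

# Hypothetical discrete distributions, not taken from the paper.
X = [2, 2, 4, 4, 9, 9]
Y = [1, 1, 6, 6, 8, 8]
Z = [3, 3, 5, 5, 7, 7]

p_xy = prob_greater(X, Y)  # P(X > Y)
p_yz = prob_greater(Y, Z)  # P(Y > Z)
p_zx = prob_greater(Z, X)  # P(Z > X)
print(p_xy, p_yz, p_zx)
```

All three probabilities equal 5/9, so X is preferred to Y, Y to Z and Z to X: a circular ordering of exactly the kind that would disturb a multi-asset optimization based on pairwise comparisons.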
Python routines have been used to fit copula-dependent Log–Dagum distributions to the data, to estimate the value of Kendall’s $\tau$ from empirical data and to calculate the reliability index $P(Y<X)$. In particular, Kendall’s $\tau$ has been estimated by using the .corr(method='kendall') method from the Pandas library. Moreover, the values of the parameters of the fitted Log–Dagum distributions and the respective value of the Frank copula parameter $\theta$ were obtained as Maximum Likelihood Estimates by using a Python code based on the pycopula package. The codes are given in Appendix A.2.
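As an illustration of the first of these steps, the sketch below estimates Kendall’s $\tau$ with Pandas and inverts the standard Frank-copula relation $\tau(\theta) = 1 - \frac{4}{\theta}\,[1 - D_1(\theta)]$ by root finding. The data are synthetic, the column names are placeholders, and the root-finding bracket is an assumption of this sketch rather than part of the routines in Appendix A.2.

```python
import numpy as np
import pandas as pd
from scipy import integrate, optimize

def debye1(theta):
    """First-order Debye function D_1(theta), for theta > 0."""
    val, _ = integrate.quad(lambda t: t / np.expm1(t), 0.0, theta)
    return val / theta

def kendall_tau_frank(theta):
    """Kendall's tau of a Frank copula with parameter theta > 0."""
    return 1.0 - 4.0 / theta * (1.0 - debye1(theta))

def theta_from_tau(tau, lo=1e-3, hi=200.0):
    """Invert tau -> theta; tau(-theta) = -tau(theta) handles negative tau.

    The bracket [lo, hi] is an assumption: |tau| must lie roughly
    between 1e-4 and 0.98 for the root to be bracketed.
    """
    if tau == 0:
        return 0.0
    root = optimize.brentq(lambda th: kendall_tau_frank(th) - abs(tau), lo, hi)
    return float(np.sign(tau) * root)

# Synthetic, positively dependent "returns" (placeholders for real data).
rng = np.random.default_rng(0)
x = rng.normal(size=500)
returns = pd.DataFrame({"A": x, "B": 0.6 * x + 0.8 * rng.normal(size=500)})

tau_hat = returns.corr(method="kendall").loc["A", "B"]
theta_hat = theta_from_tau(tau_hat)
print(tau_hat, theta_hat)
```

The value of $\theta$ obtained this way is a moment-type estimate; it can serve as a starting point for a likelihood-based fit.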
The stock return data analysed is presented in Table 1 and corresponds to the period from 1 January 2010 to 31 December 2020.
Regarding the reliability evaluation, four approaches are used to compare both theoretical and real data. At first, the reliability index P ( Y < X ) has been estimated from real data by following the procedure [28]:
  • Collect the real stock return data samples from 1 January 2010 to 31 December 2020 using the yfinance Python package for the random variables X and Y;
  • Let $l$ be the length of the collected data. Moreover, let $x_i$ and $y_i$, $i = 1, \dots, l$, denote the elements of the real samples of the random variables X and Y, respectively. Consider the indicator function $I(x, y) = 1 - u(x - y)$, where $u(x) = 0$ for $x < 0$ and $u(x) = 1$ otherwise. The value of $P(X<Y)$ can be estimated as $P(X<Y)_e = \frac{1}{l}\sum_{i=1}^{l} I(x_i, y_i)$, and $P(Y<X)$ is simply $1 - P(X<Y)$.
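The estimator described in the steps above can be sketched in a few lines. The indicator $I(x, y)$ equals 1 when $x < y$ and 0 otherwise, and the counts are averaged over the $l$ paired observations so that the estimate lies in $[0, 1]$; the function name and the sample values below are illustrative.

```python
import numpy as np

def empirical_reliability(x, y):
    """Estimate P(Y < X) from paired samples as 1 - mean(I(x_i, y_i))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must contain the same number of observations")
    p_x_lt_y = np.mean(x < y)   # I(x, y) = 1 - u(x - y) is 1 exactly when x < y
    return 1.0 - p_x_lt_y

# Toy paired observations (hypothetical values).
x = [0.01, -0.02, 0.03, 0.00]
y = [0.02, -0.03, 0.05, 0.00]
print(empirical_reliability(x, y))
```

With this definition, ties ($x_i = y_i$) are counted in favour of $P(Y<X)$, which is immaterial for continuous return distributions.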
For the random variables presented in Table 1, the reliability estimator can be evaluated by following the procedure below [28]:
  • Generate random samples with $10^5$ elements each for the random variables X and Y;
  • Let $x_i$ and $y_i$, $i = 1, \dots, 10^5$, denote the elements of the random samples of the random variables X and Y, respectively. Consider the indicator function $I(x, y) = 1 - u(x - y)$, where $u(x) = 0$ for $x < 0$ and $u(x) = 1$ otherwise. The value of $P(X<Y)$ can be estimated as $P(X<Y)_e = \frac{1}{10^5}\sum_{i=1}^{10^5} I(x_i, y_i)$. Thus, $P(Y<X)$ is simply $1 - P(X<Y)$;
  • Repeat the above process 1000 times and then take the mean value of the $P(Y<X)_e$ values generated as the estimated value of $P(Y<X)$. We also compute the variance of these estimates to check the suitability of the mean estimate.
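The simulation procedure above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors’ routine: pairs $(u, v)$ are drawn from a Frank copula by conditional inversion, the marginals are Dagum with quantile function $F^{-1}(u) = \left((u^{-1/\gamma_1} - 1)/\gamma_2\right)^{-1/\gamma_3}$ (by monotonicity, the same indicator values are obtained for the Log–Dagum variables), and the function names, seed and clipping constant are choices made here.

```python
import numpy as np

def sample_frank(theta, n, rng):
    """Draw n pairs (u, v) from a Frank copula (theta != 0) by conditional inversion."""
    u = rng.random(n)
    w = rng.random(n)
    a = np.exp(-theta * u)
    v = -np.log1p(w * (1.0 - np.exp(-theta)) / (w * (a - 1.0) - a)) / theta
    return u, v

def dagum_ppf(u, g1, g2, g3):
    """Quantile function of a Dagum variable with c.d.f. (1 + g2*x**(-g3))**(-g1)."""
    return ((u ** (-1.0 / g1) - 1.0) / g2) ** (-1.0 / g3)

def mc_reliability(theta, px, py, n=100_000, reps=1000, seed=0):
    """Mean and variance of `reps` Monte Carlo estimates of P(Y < X)."""
    rng = np.random.default_rng(seed)
    eps = 1e-12  # keep the uniforms strictly inside (0, 1)
    estimates = []
    for _ in range(reps):
        u, v = sample_frank(theta, n, rng)
        x = dagum_ppf(np.clip(u, eps, 1.0 - eps), *px)
        y = dagum_ppf(np.clip(v, eps, 1.0 - eps), *py)
        estimates.append(np.mean(y < x))
    return float(np.mean(estimates)), float(np.var(estimates))

# Small illustrative run with hypothetical parameters (far fewer repetitions
# than the 1000 used in the text, to keep the example fast).
mean_est, var_est = mc_reliability(5.0, (1.0, 1.0, 3.0), (1.5, 1.0, 3.0), n=20_000, reps=5)
print(mean_est, var_est)
```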
Finally, Equations (46) and (64) are used to calculate the reliability index for the fitted random variables. In the next subsections, the random variables presented in Table 1 are compared.

6.2.1. Modelling Stock Return Data as Log–Dagum Random Variables

At first, the Maximum Likelihood Estimates (MLE) of the Log–Dagum distributions’ hyper-parameters and of the Frank copula parameter $\theta$ are given in Table 2 and Table 3. It is worth noticing that the pycopula code used maximizes the log-likelihood function over the copula parameter and the hyper-parameters of the marginals jointly. Thus, the estimation procedure yields the copula’s parameter and all estimated hyper-parameters at the same time.
Moreover, the initial values of the marginals’ hyper-parameters used in the optimization procedure were the MLE obtained as if the variables were independent. Besides, the initial value of $\theta$ in the optimization algorithm was set to be the Canonical Maximum Likelihood Estimator of $\theta$, obtained by the pycopula package.
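For readers unfamiliar with the canonical maximum likelihood (CML) step, the sketch below shows one common way to obtain such a starting value: the marginals are replaced by rank-based pseudo-observations and only the Frank copula log-density is maximized over $\theta$. It is written directly with SciPy and does not reproduce the pycopula interface; the function names and search bounds are assumptions of this sketch.

```python
import numpy as np
from scipy import optimize, stats

def frank_log_density(theta, u, v):
    """Log-density of the Frank copula at (u, v); theta = 0 is independence."""
    if abs(theta) < 1e-10:
        return np.zeros_like(u, dtype=float)
    a = np.exp(-theta * u)
    b = np.exp(-theta * v)
    g = -np.expm1(-theta)                       # 1 - exp(-theta)
    # (1 - e^{-theta}) - (1 - a)(1 - b), written in a cancellation-free form.
    denom = a + b - a * b - np.exp(-theta)
    return np.log(theta * g) - theta * (u + v) - 2.0 * np.log(np.abs(denom))

def fit_theta_cml(x, y, bounds=(-50.0, 50.0)):
    """Canonical ML estimate of theta from rank-based pseudo-observations."""
    n = len(x)
    u = stats.rankdata(x) / (n + 1.0)
    v = stats.rankdata(y) / (n + 1.0)
    res = optimize.minimize_scalar(
        lambda th: -np.sum(frank_log_density(th, u, v)),
        bounds=bounds, method="bounded")
    return float(res.x)
```

The resulting value would then be passed, together with the independently fitted marginal hyper-parameters, as the initial point of the joint likelihood maximization described above.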
Figure 3 and Figure 4 show the histograms of the real data and the fitted p.d.f.s for $X_i$, $i = 1, \dots, 3$.
As can be verified from Figure 3 and Figure 4, there is a visual agreement between the real data histograms and the fitted p.d.f.s. Moreover, by using the Scipy Python package, both Spearman’s Rank Correlation and Kendall’s Rank Correlation hypothesis tests indicated the random variables analyzed are dependent. In the next subsection, reliability analyses are carried out.
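A minimal sketch of such rank-correlation tests with SciPy is shown below. The data are synthetic placeholders and the 5% significance level is an assumption made here; the function name is illustrative.

```python
import numpy as np
from scipy import stats

def dependence_tests(x, y, alpha=0.05):
    """Kendall and Spearman tests of the null hypothesis of no association."""
    tau, p_tau = stats.kendalltau(x, y)
    rho, p_rho = stats.spearmanr(x, y)
    return {
        "kendall_tau": tau, "kendall_p": p_tau,
        "spearman_rho": rho, "spearman_p": p_rho,
        "reject_independence": bool(p_tau < alpha and p_rho < alpha),
    }

# Synthetic, positively dependent samples standing in for two return series.
rng = np.random.default_rng(42)
x = rng.normal(size=300)
y = 0.7 * x + rng.normal(scale=0.5, size=300)
result = dependence_tests(x, y)
print(result)
```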

6.2.2. Reliability Analyses

By following the estimation procedures for P ( Y < X ) and using the equations developed in the present paper, the reliability indexes for the stock return data modelled are shown in Table 4.
It can be seen from Table 4 that the real and modelled data behave similarly. The variance of the 1000 estimates of $P(Y<X)_e$ was below $2.5 \times 10^{-6}$ in both cases of the random data estimation procedure. Moreover, the applicability of Equation (64) is illustrated. By taking up to 100 terms, the approximate and exact reliability formulas differ by less than 0.00015, which is sufficient for practical applications.
If the random variables were considered independent, the values of R would be 0.498709 and 0.492342 for $P(X_1 < X_3)$ and $P(X_2 < X_3)$, respectively. The corresponding values of $D_\theta$ are −0.004186 and −0.00508. Even though small, these values amount to almost 1% of the real reliability index and need to be taken into account to better model the stock behavior.
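These figures can be checked directly against the Equation (46) column of Table 4, using the decomposition of the reliability into its independent and dependent parts, $P(Y<X) = R + D_\theta$:

```latex
\[
\begin{aligned}
P(X_1 < X_3) &= 0.498709 - 0.004186 = 0.494523, &
\frac{|D_\theta|}{P(X_1 < X_3)} &= \frac{0.004186}{0.494523} \approx 0.85\%,\\[4pt]
P(X_2 < X_3) &= 0.492342 - 0.00508 = 0.487262, &
\frac{|D_\theta|}{P(X_2 < X_3)} &= \frac{0.00508}{0.487262} \approx 1.04\%.
\end{aligned}
\]
```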
If we were to choose the stocks solely based on the mean values of the historical returns, we would pick the stock of RIO ($X_3$), as the mean values of the returns of BHP ($X_1$), XOM ($X_2$) and RIO ($X_3$) are 0.000319, 0.000073 and 0.000554, respectively. This is the opposite of the conclusion we can draw from comparing the reliability metric, as Table 4 reveals that in both cases we would pick the other stock over RIO. This comes from the fact that both $P(X_1 < X_3)$ and $P(X_2 < X_3)$ are less than 0.5, implying that $X_3$ is not statistically greater than $X_1$ and $X_2$.

7. Conclusions

Using the reliability measure P ( Y < X ) to assess whether a given component with strength X will be able to support stresses given by the random variable Y is not a new concept. On the other hand, the use of such metric to choose between assets whose returns are X and Y has not been extensively explored in the literature.
In the present paper, we describe a mathematical framework to choose between portfolios with returns given by the distributions X and Y. Instead of comparing the expected values of X and Y, we explored the metric P ( Y < X ) as a proxy parameter for return. Since real assets show correlation between each other, we modeled such dependence by copulas.
By splitting the value of the reliability measure $P(Y<X)$ as the sum of independent and dependent parts, we are able to understand how correlation impacts such measure. Then, to further develop our mathematical framework, we chose the Frank copula to model the dependency between assets. Understanding the analytical structure of Frank copulas is important to relate this type of copula to other types presented in the literature. We derive a new polynomial representation for Frank copulas, revealing that such copulas can be well approximated by polynomial copulas by simply taking as many terms as needed in this new representation.
Then, to finish our mathematical derivations, we considered the cases where the marginals follow Dagum distributions or their transformations, since these distributions are powerful models which can account for skewness. Our parametric studies revealed that the Frank copula parameter $\theta$ has a considerable impact on the reliability measure $P(Y<X)$. Since $P(Y<X)$ ranges from 0 to 1, the impact of the dependence between the random variables studied was observed to be as high as 0.08. This can completely change the asset selection, especially when $P(Y<X)$ is around 0.5.
By applying our methodology to real-world stock data, we could perform a stock picking procedure by calculating $P(Y<X)$ when both X and Y represent stock returns. We considered the cases where returns follow Log–Dagum distributions. The same procedure can be carried out to compare portfolios of stocks, enabling one to build more general portfolios based on the value of the reliability index $P(Y<X)$. It is important to highlight that the single-asset stock picking procedure we described is just illustrative of the real multi-asset portfolio optimization problem, which shall be covered in subsequent studies. Moreover, considering the impact on the optimization procedure for multi-asset portfolios, proving that $P(Y<X)$ is transitive is still an open problem which must be further studied.

Author Contributions

Conceptualization, L.C.d.S.M.O.; methodology, P.N.R. and L.C.d.S.M.O.; software, L.C.d.S.M.O.; validation, P.N.R., L.C.d.S.M.O. and B.B.d.A.; formal analysis, P.N.R. and L.C.d.S.M.O.; investigation, L.C.d.S.M.O.; writing—original draft preparation, L.C.d.S.M.O.; writing—review and editing, P.N.R. and B.B.d.A.; supervision, P.N.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding and the APC was kindly waived by the Editorial Office of Stats.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used is available upon request to the corresponding author.

Acknowledgments

The authors acknowledge the support provided by the University of Brasilia (UnB).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. H-Function

The H-function (see [42]) is defined as a complex contour integral whose integrand contains gamma functions:
$$
H^{m,n}_{p,q}\left[ z \,\middle|\, \begin{matrix} (a_1,A_1),\dots,(a_n,A_n),(a_{n+1},A_{n+1}),\dots,(a_p,A_p) \\ (b_1,B_1),\dots,(b_m,B_m),(b_{m+1},B_{m+1}),\dots,(b_q,B_q) \end{matrix} \right] = \frac{1}{2\pi i}\int_{L} \frac{\prod_{j=1}^{m}\Gamma(b_j+B_j s)\,\prod_{j=1}^{n}\Gamma(1-a_j-A_j s)}{\prod_{j=m+1}^{q}\Gamma(1-b_j-B_j s)\,\prod_{j=n+1}^{p}\Gamma(a_j+A_j s)}\, z^{-s}\, ds, \tag{A1}
$$
where $A_j$ and $B_j$ are assumed to be positive quantities and all the $a_j$ and $b_j$ may be complex. The contour $L$ runs from $c - i\infty$ to $c + i\infty$ such that the poles of $\Gamma(b_j + B_j s)$, $j = 1, \dots, m$, lie to the left of $L$ and the poles of $\Gamma(1 - a_j - A_j s)$, $j = 1, \dots, n$, lie to the right of $L$.
By performing the variable change $s \to -r$ and adjusting the contour $L$ to $L^*$, where the integral runs from $c^* - i\infty$ to $c^* + i\infty$, the H-function can be alternatively defined as:
$$
H^{m,n}_{p,q}\left[ z \,\middle|\, \begin{matrix} (a_1,A_1),\dots,(a_n,A_n),(a_{n+1},A_{n+1}),\dots,(a_p,A_p) \\ (b_1,B_1),\dots,(b_m,B_m),(b_{m+1},B_{m+1}),\dots,(b_q,B_q) \end{matrix} \right] = \frac{1}{2\pi i}\int_{L^*} \frac{\prod_{j=1}^{m}\Gamma(b_j-B_j r)\,\prod_{j=1}^{n}\Gamma(1-a_j+A_j r)}{\prod_{j=m+1}^{q}\Gamma(1-b_j+B_j r)\,\prod_{j=n+1}^{p}\Gamma(a_j-A_j r)}\, z^{r}\, dr, \tag{A2}
$$
for which the same restrictions on the parameter domains apply.
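By considering the definition in Equation (A2), the H-function can be expressed in computable form as [42]:
When the poles of $\prod_{j=1}^{m}\Gamma(b_j - B_j r)$ are simple, we have:
$$
H^{m,n}_{p,q}(z) = \sum_{h=1}^{m}\sum_{\nu=0}^{\infty} \frac{\prod_{j=1,\, j\neq h}^{m}\Gamma\!\left(b_j - B_j\frac{b_h+\nu}{B_h}\right)}{\prod_{j=m+1}^{q}\Gamma\!\left(1 - b_j + B_j\frac{b_h+\nu}{B_h}\right)} \times \frac{\prod_{j=1}^{n}\Gamma\!\left(1 - a_j + A_j\frac{b_h+\nu}{B_h}\right)}{\prod_{j=n+1}^{p}\Gamma\!\left(a_j - A_j\frac{b_h+\nu}{B_h}\right)}\, \frac{(-1)^{\nu}\, z^{(b_h+\nu)/B_h}}{\nu!\, B_h} \tag{A3}
$$
for $z \neq 0$ if $\delta > 0$ and for $0 < |z| < D^{-1}$ if $\delta = 0$, where $\delta = \sum_{j=1}^{q} B_j - \sum_{j=1}^{p} A_j$ and $D = \prod_{j=1}^{p} A_j^{A_j} \big/ \prod_{j=1}^{q} B_j^{B_j}$.
When the poles of $\prod_{j=1}^{n}\Gamma(1 - a_j + A_j r)$ are simple, we have
$$
H^{m,n}_{p,q}(z) = \sum_{h=1}^{n}\sum_{\nu=0}^{\infty} \frac{\prod_{j=1,\, j\neq h}^{n}\Gamma\!\left(1 - a_j - A_j\frac{1-a_h+\nu}{A_h}\right)}{\prod_{j=n+1}^{p}\Gamma\!\left(a_j + A_j\frac{1-a_h+\nu}{A_h}\right)} \times \frac{\prod_{j=1}^{m}\Gamma\!\left(b_j + B_j\frac{1-a_h+\nu}{A_h}\right)}{\prod_{j=m+1}^{q}\Gamma\!\left(1 - b_j - B_j\frac{1-a_h+\nu}{A_h}\right)}\, \frac{(-1)^{\nu}\, (1/z)^{(1-a_h+\nu)/A_h}}{\nu!\, A_h} \tag{A4}
$$
for $z \neq 0$ if $\delta < 0$ and for $|z| > D^{-1}$ if $\delta = 0$.
Both representations above apply when the poles of the gamma functions in the numerators of the quotients are simple. When this simplification does not hold, the residue theorem has to be applied. For details about this theorem, one may refer to [44].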
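As a sanity check of the two representations, consider the simplest case $m = 1$, $n = 0$, $p = 0$, $q = 1$ with $(b_1, B_1) = (0, 1)$, for which the contour integral reduces to $\frac{1}{2\pi i}\int_L \Gamma(s)\, z^{-s}\, ds$ and the series over the simple poles of $\Gamma$ reduces to $\sum_{\nu \ge 0} (-1)^{\nu} z^{\nu}/\nu! = e^{-z}$. The sketch below compares both numerically; the contour abscissa $c = 1$, the truncation of the integration range and the function names are choices made here for illustration.

```python
import math
import numpy as np
from scipy import integrate
from scipy.special import gamma

def h_1001_contour(z, c=1.0, cutoff=40.0):
    """(1/(2*pi*i)) * integral of Gamma(s) * z**(-s) along Re(s) = c.

    The range is truncated to |Im(s)| <= cutoff, which is harmless because
    |Gamma(c + i*w)| decays like exp(-pi*|w|/2).
    """
    def integrand(w):
        s = c + 1j * w
        return (gamma(s) * z ** (-s)).real / (2.0 * np.pi)
    val, _ = integrate.quad(integrand, -cutoff, cutoff, limit=200)
    return val

def h_1001_series(z, terms=60):
    """Residue series over the simple poles of Gamma(s) at s = 0, -1, -2, ..."""
    return sum((-1) ** nu * z ** nu / math.factorial(nu) for nu in range(terms))

z = 1.5
print(h_1001_contour(z), h_1001_series(z), math.exp(-z))
```

The same truncated-contour idea is what a numerical routine for a general H-function has to rely on whenever no closed form is available.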

Appendix A.2. Codes

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import yfinance as yf
    from scipy import integrate
    from scipy.optimize import fsolve
    from scipy.special import binom, gamma
    from scipy.stats import genlogistic, kendalltau, mielke, spearmanr

    def RI_Emp(px, py):
        # Monte Carlo estimate of P(Y < X) for independent Dagum variables.
        u1, u2, u3 = px
        v1, v2, v3 = py
        rvals = []
        for n in range(1000):
            X = mielke.rvs(u1*u3, u3, 0, np.power(u2, 1/u3), size=3000)
            Y = mielke.rvs(v1*v3, v3, 0, np.power(v2, 1/v3), size=3000)
            rvals.append(np.mean(Y < X))
        # Mean and variance of the 1000 estimates.
        return [np.mean(rvals), np.var(rvals)]

    def H2222(z, p1, p2, p3):
        # Numerical Mellin-Barnes integral for the H-function used in RI and I_ab.
        gprod = lambda s: np.power(z,-s)*gamma(p3+s)*gamma(1+p2*s)*gamma(-s)*gamma(1-p1-p2*s)
        c = np.max([-p3,-1/p2])/2
        return integrate.quad(lambda w: np.real(gprod(c+1j*w))/(2*np.pi), -np.inf, np.inf)[0]

    def RI(px, py):
        # Reliability R = P(Y < X) for independent Dagum variables.
        u1, u2, u3 = px
        v1, v2, v3 = py
        return H2222(np.power(u2,v3/u3)/v2,1-u1,v3/u3,v1)/(gamma(u1)*gamma(v1))

    def R_FC_Dagum_Emp(theta, size, px, py):
        # Monte Carlo estimate of P(Y < X) for Dagum variables coupled by a Frank copula.
        u1, u2, u3 = px
        v1, v2, v3 = py
        rvals = []
        for n in range(1000):
            un1 = np.random.random(size)
            vn2 = np.random.random(size)
            # Conditional inversion of the Frank copula.
            un2 = -np.power(theta,-1)*np.log(1+(vn2*(1-np.exp(-theta)))/(vn2*(np.exp(-theta*un1)-1)-np.exp(-theta*un1)))
            # Dagum quantile transform of the dependent uniforms.
            rvx = np.power((np.power(un1,-1/u1)-1)/u2,-1/u3)
            rvy = np.power((np.power(un2,-1/v1)-1)/v2,-1/v3)
            rvals.append(np.mean(rvy < rvx))
        # Mean and variance of the 1000 estimates.
        return [np.mean(rvals), np.var(rvals)]

    def I_ab(px, py, a, b):
        u1, u2, u3 = px
        v1, v2, v3 = py
        return H2222(np.power(u2,v3/u3)/v2,1-(a+1)*u1,v3/u3,b*v1)/((a+1)*gamma((a+1)*u1)*gamma(b*v1))

    def R_Series_Dagum(theta, px, py, lgt1, lgt2):
        # Truncated series representation of the reliability.
        u1, u2, u3 = px
        v1, v2, v3 = py
        return sum([binom(n,k)*binom(n+1,r)*binom(l,m)*((-theta)**(l))*((-1)**(k+r))*((k+1)**(l-m))*((r)**(m))*I_ab(px,py,l-m,m)/(((1-np.exp(-theta))**(n+1))*gamma(l+1)) for n in range(lgt1) for k in range(n+1) for r in range(n+2) for l in range(lgt2) for m in range(l+1)])

    def R_Num_App(theta, px, py, m1):
        # Riemann-sum approximation of the reliability integral with m1 subintervals.
        u1, u2, u3 = px
        v1, v2, v3 = py
        t = np.exp(-theta)
        return (1/m1)*(1/2+sum((t**(w/m1))*(1-t**(np.power(1+v2*np.power((np.power(w/m1,-1/u1)-1)/u2,v3/u3),-v1)))/((1-t)-(1-t**(w/m1))*(1-t**(np.power(1+v2*np.power((np.power(w/m1,-1/u1)-1)/u2,v3/u3),-v1)))) for w in range(1,m1)))

    def R_Exact(theta, px, py):
        # Reliability integral evaluated by adaptive quadrature.
        u1, u2, u3 = px
        v1, v2, v3 = py
        t = np.exp(-theta)
        return integrate.quad(lambda w: (t**(w))*(1-t**(np.power(1+v2*np.power((np.power(w,-1/u1)-1)/u2,v3/u3),-v1)))/((1-t)-(1-t**(w))*(1-t**(np.power(1+v2*np.power((np.power(w,-1/u1)-1)/u2,v3/u3),-v1)))),0,1)[0]

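    def Debye(k, x):
        # Debye function D_k(x) = (k / x**k) * int_0^x w**k / (exp(w) - 1) dw.
        return (k/(x**k))*integrate.quad(lambda w:(w**k)/(np.exp(w)-1),0,x)[0]

    def KendallTau(theta):
        # Kendall's tau of the Frank copula.
        return 1-4*(1/theta)*(1-Debye(1,theta))

    def SpearmanRho(theta):
        # Spearman's rho of the Frank copula, in the same sign convention for
        # theta as KendallTau: rho = 1 - (12/theta)*(D_1(theta) - D_2(theta)).
        return 1-12*(1/theta)*(Debye(1,theta)-Debye(2,theta))

    def flatten_l(l):
        return [item for sublist in l for item in sublist]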
    def _plot_theta_curves(p, scale, ylabel, filename):
        # Shared plotting routine: one curve per parameter pair in p, with two
        # extra top axes showing Kendall's tau and Spearman's rho at each theta tick.
        tval = np.linspace(-30.0000001, 30.0000001, num=100)
        lst = []
        for pvs in p:
            px, py = pvs
            ri = RI(px, py)
            lstq = scale(R_Num_App(tval, px, py, 300) - ri, ri)
            lst.append(pd.DataFrame({
                r'$\theta$': tval,
                str(pvs): lstq
                }, columns=[r'$\theta$', str(pvs)]).set_index(r'$\theta$'))
        lf = pd.concat(lst, sort=False, axis=1)
        ax = lf.plot(color='k', style=["-", "--", "-.", ":"])
        ax.set_ylabel(ylabel)
        ax.axhline(linewidth=1, color='k', ls='--', alpha=0.5)
        ax.axvline(linewidth=1, color='k', ls='--', alpha=0.5)
        axXs = ax.get_xticks()
        # Second x axis: Kendall's tau.
        ax2 = ax.twiny()
        ax2.set_xlabel(r"Kendall's $\tau$")
        ax2.set_xticks(axXs)
        ax2.set_xbound(ax.get_xbound())
        ax2.set_xticklabels(["%.3f" % KendallTau(X + 0.00001) for X in axXs])
        # Third x axis, shifted upwards: Spearman's rho.
        ax3 = ax.twiny()
        ax3.xaxis.set_ticks_position("top")
        ax3.xaxis.set_label_position("top")
        ax3.spines["top"].set_position(("axes", 1.20))
        ax3.set_frame_on(True)
        ax3.patch.set_visible(False)
        for sp in ax3.spines.values():
            sp.set_visible(False)
        ax3.spines["top"].set_visible(True)
        ax3.set_xlabel(r"Spearman's $\rho$")
        ax3.set_xticks(axXs)
        ax3.set_xbound(ax.get_xbound())
        ax3.set_xticklabels(["%.3f" % SpearmanRho(X + 0.00001) for X in axXs])
        ax3.figure.savefig(filename, bbox_inches="tight")
        return ax3

    def theta_effect(p):
        # Absolute effect of theta: D_theta = R_theta - R.
        return _plot_theta_curves(p, lambda d, ri: d, r'$D_{\theta}$', 'full_figure.pdf')

    def theta_effect_p(p):
        # Effect of theta as a percentage of the independent reliability R.
        return _plot_theta_curves(p, lambda d, ri: 100*d/ri, r'$100(D_{\theta}-R)/R$', 'full_figure_p.pdf')

    def DagumPDF(x, gamma1, gamma2, gamma3):
        # Dagum p.d.f. expressed through SciPy's Mielke distribution.
        return mielke.pdf(x, gamma1*gamma3, gamma3, 0, np.power(gamma2,1/gamma3))

    def Fit_Dagum(data):
        k, s, mu, sigma = mielke.fit(data, floc=0)
        return [k/s, sigma**s, s]

    def Theta_from_Tau(Tau):
        # Invert Kendall's tau to obtain the Frank copula parameter theta.
        if Tau == 1:
            return 100
        else:
            def f_solv(x):
                return KendallTau(x)-Tau
            return fsolve(f_solv, 0.3)[0]

    def Emp_R(data1, data2):
        # Fraction of paired observations with data1 < data2.
        return sum(1 for i in data1-data2 if i < 0)/len(data1)

    def Fit_LogDagum(data):
        # Log-Dagum fit through SciPy's generalized logistic distribution.
        c, mu, sigma = genlogistic.fit(data)
        return [c, np.exp(mu/sigma), 1/sigma]

    def LogDagumPDF(x, gamma1, gamma2, gamma3):
        return genlogistic.pdf(x, gamma1, np.log(gamma2)/gamma3, 1/gamma3)

References

  1. Milhomem, D.A.; Dantas, M.J.P. Analysis of new approaches used in portfolio optimization: A systematic literature review. Production 2020, 30.
  2. Markowitz, H. Portfolio Selection. J. Financ. 1952, 7, 77–91.
  3. Zhang, Y.; Li, X.; Guo, S. Portfolio selection problems with Markowitz’s mean–variance framework: A review of literature. Fuzzy Optim. Decis. Mak. 2017, 17, 125–158.
  4. Uryasev, S. Conditional value-at-risk: Optimization algorithms and applications. In Proceedings of the IEEE/IAFE/INFORMS 2000 Conference on Computational Intelligence for Financial Engineering (CIFEr) (Cat. No.00TH8520), New York, NY, USA, 28 March 2000; pp. 49–57.
  5. Fabozzi, F.J.; Kolm, P.N.; Pachamanova, D.A.; Focardi, S.M. Robust Portfolio Optimization. J. Portf. Manag. 2007, 33, 40–48.
  6. Hu, Y.; Liu, K.; Zhang, X.; Su, L.; Ngai, E.; Liu, M. Application of evolutionary computation for rule discovery in stock algorithmic trading: A literature review. Appl. Soft Comput. 2015, 36, 534–551.
  7. Ertenlice, O.; Kalayci, C.B. A survey of swarm intelligence for portfolio optimization: Algorithms and applications. Swarm Evol. Comput. 2018, 39, 36–52.
  8. Mansini, R.; Ogryczak, W.; Speranza, M.G. Twenty years of linear programming based portfolio optimization. Eur. J. Oper. Res. 2014, 234, 518–535.
  9. Black, F.; Litterman, R. Global Portfolio Optimization. Financ. Anal. J. 1992, 48, 28–43.
  10. Lindberg, C. Portfolio optimization when expected stock returns are determined by exposure to risk. Bernoulli 2009, 15, 464–474.
  11. Ross, S.A. The arbitrage theory of capital asset pricing. J. Econ. Theory 1976, 13, 341–360.
  12. Boubaker, H.; Sghaier, N. Portfolio optimization in the presence of dependent financial returns with long memory: A copula based approach. J. Bank. Financ. 2013, 37, 361–377.
  13. Ang, A.; Chen, J. Asymmetric correlations of equity portfolios. J. Financ. Econ. 2002, 63, 443–494.
  14. Longin, F.; Solnik, B. Extreme Correlation of International Equity Markets. J. Financ. 2001, 56, 649–676.
  15. Hartmann, P.; Straetmans, S.; Vries, C.G.d. Asset Market Linkages in Crisis Periods. Rev. Econ. Stat. 2004, 86, 313–326.
  16. Beine, M.; Cosma, A.; Vermeulen, R. The dark side of global integration: Increasing tail dependence. J. Bank. Financ. 2010, 34, 184–192.
  17. Harvey, C.R.; Siddique, A. Autoregressive Conditional Skewness. J. Financ. Quant. Anal. 1999, 34, 465–487.
  18. Jondeau, E.; Rockinger, M. Conditional volatility, skewness, and kurtosis: Existence, persistence, and comovements. J. Econ. Dyn. Control 2003, 27, 1699–1737.
  19. Brooks, C.; Burke, P.; Heravi, S.; Persand, G. Autoregressive Conditional Kurtosis. J. Financ. Econom. 2005, 3, 399–421.
  20. Kotz, S.; Lumelskii, Y.; Pensky, M. The Stress-Strength Model and Its Generalizations; World Scientific Publishing Co. Pte Ltd.: Singapore, 2003.
  21. Domma, F.; Giordano, S. A copula-based approach to account for dependence in stress-strength models. Stat. Pap. 2013, 54, 807–826.
  22. Rezaei, S.; Tahmasbi, R.; Mahmoodi, M. Estimation of P(Y<X) for generalized Pareto distribution. J. Stat. Plan. Inference 2010, 140, 480–494.
  23. Wong, A. Interval estimation of P(Y<X) for generalized Pareto distribution. J. Stat. Plan. Inference 2012, 142, 601–607.
  24. Baklizi, A. Estimation of Pr(X<Y) Using Record Values in the One and Two Parameter Exponential Distributions. Commun.-Stat.-Theory Methods 2008, 37, 692–698.
  25. Sengupta, S. Unbiased estimation of P(X>Y) for two-parameter exponential populations using order statistics. Statistics 2011, 45, 179–188.
  26. Ozelim, L.C.S.M.; Otiniano, C.E.G.; Rathie, P.N. On The Linear Combination of N Logistic Random Variables and Reliability Analysis. South East Asian J. Math. Math. Sci. 2016, 12, 19–34.
  27. Ozelim, L.C.S.M.; Rathie, P.N. Linear Combination and Reliability of Generalized Logistic Random Variables. Eur. J. Pure Appl. Math. 2019, 12, 722–733.
  28. Rathie, P.N.; Ozelim, L.C.S.M. Exact and approximate expressions for the reliability of stable Lévy random variables with applications to stock market modelling. J. Comput. Appl. Math. 2017, 321, 314–322.
  29. Domma, F.; Giordano, S. A stress–strength model with dependent variables to measure household financial fragility. Stat. Methods Appl. 2012, 21, 375–389.
  30. Embrechts, P.; Kluppelberg, C.; Mikosch, T. Modelling Extremal Events: For Insurance and Finance (Stochastic Modelling and Applied Probability); Springer: Berlin/Heidelberg, Germany, 2008.
  31. Gupta, R.C.; Subramanian, S. Estimation of reliability in a bivariate normal distribution with equal coefficients of variation. Commun. Stat.-Simul. Comput. 1998, 27, 675–698.
  32. Hanagal, D.D. Note on estimation of reliability under bivariate pareto stress-strength model. Stat. Pap. 1997, 38, 453–459.
  33. Nadarajah, S.; Kotz, S. Reliability for some bivariate exponential distributions. Math. Probl. Eng. 2006, 2006, 041652.
  34. Nadarajah, S. Reliability for some bivariate gamma distributions. Math. Probl. Eng. 2005, 2005, 924843.
  35. Gupta, R.C.; Ghitany, M.E.; Al-Mutairi, D.K. Estimation of reliability from a bivariate log-normal data. J. Stat. Comput. Simul. 2013, 83, 1068–1081.
  36. Dagum, C. Inequality measures between income distributions with applications. Econometrica 1980, 48, 1791–1803.
  37. Nelsen, R.B. An Introduction to Copulas, 2nd ed.; Springer Series in Statistics; Springer: Berlin/Heidelberg, Germany, 2006.
  38. Joe, H. Multivariate Models and Multivariate Dependence Concepts (Chapman & Hall CRC Monographs on Statistics & Applied Probability), 1st ed.; Chapman and Hall/CRC: London, UK, 1997.
  39. Ruschendorf, L. Construction of multivariate distributions with given marginals. Ann. Inst. Stat. Math. 1985, 37, 225–233.
  40. Luke, Y.L. The Special Functions and Their Approximations; Mathematics in Science and Engineering 53-1; Academic Press: Cambridge, MA, USA, 1969; Volume 1.
  41. Comtet, L. Advanced Combinatorics: The Art of Finite and Infinite Expansions; D. Reidel Pub. Co.: Dordrecht, The Netherlands, 1974.
  42. Mathai, A.M.; Haubold, H.J. Special Functions for Applied Scientists; Springer: Berlin/Heidelberg, Germany, 2008.
  43. Domma, F.; Perri, P.F. Some developments on the log–Dagum distribution. Stat. Methods Appl. 2009, 18, 205–220.
  44. Springer, M.D. The Algebra of Random Variables; John Wiley: Hoboken, NJ, USA, 1979.
Figure 1. Values of $D_\theta$ for different random variables.
Figure 2. Percentage values of $D_\theta$ with respect to R for different random variables.
Figure 3. Stocks BHP and RIO.
Figure 4. Stocks XOM and RIO.
Table 1. Stock return data modelled.

| Random Variable | Company Code | Company Name |
| --- | --- | --- |
| $X_1$ | BHP | BHP Group |
| $X_2$ | XOM | Exxon Mobil Corporation |
| $X_3$ | RIO | Rio Tinto Group |

Table 2. MLE for $P(X_1 < X_3)$.

| Random Variables | $\theta$ |
| --- | --- |
| $X_1 \sim LD(0.957, 1.092, 85.078)$ | 12.275 |
| $X_3 \sim LD(1.032, 0.980, 76.521)$ | |

Table 3. MLE for $P(X_2 < X_3)$.

| Random Variables | $\theta$ |
| --- | --- |
| $X_2 \sim LD(0.884, 1.192, 141.362)$ | 4.182 |
| $X_3 \sim LD(0.982, 1.054, 82.934)$ | |

Table 4. Reliability Measures $P(Y < X)$.

| Index | Real Data | Rnd Data | Equation (46) | Equation (64) w/10 Terms | Equation (64) w/100 Terms | Equation (64) w/400 Terms |
| --- | --- | --- | --- | --- | --- | --- |
| $P(X_1 < X_3)$ | 0.495303 | 0.494546 | 0.494523 | 0.495982 | 0.494575 | 0.494528 |
| $P(X_2 < X_3)$ | 0.485549 | 0.487365 | 0.487262 | 0.489479 | 0.487400 | 0.487282 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Rathie, P.N.; de Sena Monteiro Ozelim, L.C.; de Andrade, B.B. Portfolio Management of Copula-Dependent Assets Based on P(Y < X) Reliability Models: Revisiting Frank Copula and Dagum Distributions. Stats 2021, 4, 1027-1050. https://doi.org/10.3390/stats4040059
