Maximum Varma Entropy Distribution with Conditional Value at Risk Constraints

Markowitz's mean-variance model is the pioneering portfolio selection model. It assumes that returns are normally distributed. However, empirical observations of financial markets show that the tails of the return distribution decay more slowly than those of the log-normal distribution and follow a power law. The variance of a portfolio may also be a random variable. In recent years, the maximum entropy method has been widely used to investigate the distribution of portfolio returns, but mean and variance constraints were still used to obtain the Lagrangian multipliers. In this paper, we use Conditional Value at Risk constraints instead of the variance constraint to maximize the entropy of portfolios. Value at Risk is a financial metric that measures the level of financial risk within a portfolio. It is most commonly used by investment banks to determine the extent and occurrence ratio of potential losses in portfolios. Because Value at Risk is a single number that indicates the extent of risk in a given portfolio, it makes risk management relatively simple. It is widely used in investment banks and commercial banks and has become an accepted standard in buying and selling assets. We show that the maximum entropy distribution with Conditional Value at Risk constraints is a power law. Algebraic relations between the Lagrangian multipliers and the Value at Risk constraints are presented explicitly. The Lagrangian multipliers can be fixed exactly by the Conditional Value at Risk constraints.


Introduction
Markowitz's mean-variance model [1] is the pioneering portfolio selection model. It is based on the assumption that the returns of assets follow a normal distribution. However, empirical observations of financial markets show that the tails of the distribution decay more slowly than those of the log-normal distribution (see [2][3][4]). A variety of models have been proposed to improve Markowitz's mean-variance model. There is empirical evidence that volatility is driven by a mean-reverting stochastic process (see [5][6][7][8][9]). It has been suggested that the volatility of an empirical financial market should be a stochastic quantity (see [10][11][12][13][14][15][16][17][18][19]). The autoregressive conditional heteroscedasticity (ARCH) model (see [20]) describes the variance of the current error term or innovation as a function of the actual sizes of the error terms in previous time periods. If an autoregressive moving average model is assumed for the error variance, the model is a generalized autoregressive conditional heteroscedasticity (GARCH) model (see [8,9][21][22][23][24]). GARCH models are commonly employed in modeling financial time series that exhibit time-varying volatility and volatility clustering.
In particular, the power-law tails of empirical observations of financial markets have been studied extensively ([25][26][27][28][29]). Mandelbrot [25] first noticed the scaling properties of financial markets. It has been shown that the price distribution of financial assets follows a power law. Therefore, for distributions that are asymmetrical or non-normal, a different measure of uncertainty is required, one that is more dynamic and general than the variance and does not rely on a specific distribution.
As entropy is a well-known measure of diversity, many scholars have applied it to portfolio selection and asset pricing. Philippatos and Wilson were the first to apply the concept of entropy to portfolio selection [30]. They tried to maximize the expected portfolio return and minimize the portfolio entropy. The concepts of individual entropy, joint entropy, and conditional entropy were introduced. The mean-entropy portfolios are consistent with the full covariance and the single-index models. Much progress has been made in using the entropy of the portfolio weights as a maximization objective to encourage diversification ([31][32][33][34][35][36][37]). Lassance [38] used the Rényi entropy to discuss portfolio optimization. For a detailed review of the applications of entropy, see [39].
In this paper, following [40], we use Value at Risk constraints instead of the variance constraint to maximize the entropy of portfolios. Value at Risk is a financial metric that measures the level of financial risk within a portfolio. It is the loss that is exceeded with a given probability α over a given horizon. For a finite horizon [0, T], we define W(0) as the initial wealth and W(T) as the terminal wealth. If the value at risk horizon coincides with the investment horizon, we can describe the value at risk as [41,42]
$$P\big(W(0) - W(T) \ge \mathrm{VaR}(\alpha)\big) \equiv \alpha.$$
Thus, value at risk is the worst loss over a given time interval at the confidence level $1 - \alpha$. A convenient way to manage value at risk is to keep VaR(α) below a pre-specified level, $\mathrm{VaR}(\alpha) \le W(0) - \underline{W}$, where the floor $\underline{W}$ is specified exogenously. Then, the value at risk constraint can be expressed as
$$P\big(W(T) \ge \underline{W}\big) \ge 1 - \alpha.$$
The metric is most commonly used by investment banks to determine the extent and occurrence ratio of potential losses in portfolios. Because Value at Risk is a single number that indicates the extent of risk in a given portfolio, it makes risk management relatively simple. It is widely used in investment banks and commercial banks and has become an accepted standard in buying and selling assets. In practice, however, financial agents control the magnitude of a loss rather than its probability. To control the magnitude of a loss, one should keep the expected shortfall,
$$E\big[(\underline{W} - W(T))\,\mathbf{1}_{\{W(T) \le \underline{W}\}}\big],$$
below a pre-specified level. This constraint penalizes both a high probability of a loss and a high expected loss given that there is a loss. It is a risk measure of time-T losses and is called the Conditional Value at Risk. A generalized Markowitz mean-variance model with a Value at Risk constraint has been investigated for a long time [43,44]. However, the probability density distribution of returns was still assumed to be normal.
In this paper, we show that the maximum entropy distribution with Value at Risk constraints is a power law. Algebraic relations between the Lagrangian multipliers and Value at Risk constraints are presented explicitly. The Lagrangian multipliers can be fixed exactly by the Value at Risk constraints.

The Shannon Entropy
In a financial market, the return R of a portfolio is a discrete random variable that takes the value $R_i$ with probability $p_i$ ($i = 1, 2, \cdots, n$). The outcomes with returns $R_i$ constitute a complete system of mutually exclusive events $E_1, E_2, \cdots, E_n$ with probabilities $p_1, p_2, \cdots, p_n$, respectively. Entropies are measures of uncertainty for such a system. We can define the Shannon entropy as
$$H_n(p_1, p_2, \cdots, p_n) = -\sum_{i=1}^{n} p_i \ln p_i,$$
with the constraint $\sum_{i=1}^{n} p_i = 1$. If all probabilities are equal, $p_i = 1/n$, the Shannon entropy is maximal. The Shannon entropy therefore satisfies the Gibbs inequality $H_n(p_1, p_2, \cdots, p_n) \le \ln n$. This maximal value is monotonically increasing with n,
$$H_n\!\left(\tfrac{1}{n}, \cdots, \tfrac{1}{n}\right) < H_{n+1}\!\left(\tfrac{1}{n+1}, \cdots, \tfrac{1}{n+1}\right).$$
The Shannon entropy is symmetric under permutation of any two variables $p_j$ and $p_k$. Enlarging the system by an event of probability 0 does not change the Shannon entropy. That is to say, the entropy is expansible,
$$H_{n+1}(p_1, p_2, \cdots, p_n, 0) = H_n(p_1, p_2, \cdots, p_n).$$
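The properties listed above can be checked numerically. The following is a minimal sketch, assuming natural logarithms and a small set of hypothetical probabilities; the function name `shannon_entropy` is introduced here for illustration only.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H = -sum_i p_i ln p_i; terms with p_i = 0 contribute 0."""
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical probability vectors for a four-outcome system.
uniform = [0.25] * 4
skewed = [0.7, 0.1, 0.1, 0.1]

h_uniform = shannon_entropy(uniform)           # the maximal value, ln 4
h_skewed = shannon_entropy(skewed)             # below ln 4 (Gibbs inequality)
h_permuted = shannon_entropy(skewed[::-1])     # symmetry under permutation
h_expanded = shannon_entropy(skewed + [0.0])   # expansibility
```

The zero-probability term is skipped explicitly, which is the usual convention $0 \ln 0 = 0$ and is what makes the expansibility check hold.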
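If the portfolio S has returns $R^S_i$ with probabilities $p_i(R^S_i)$ ($i = 1, 2, \cdots, n$) and the portfolio T has returns $R^T_j$ with probabilities $p_j(R^T_j)$ ($j = 1, 2, \cdots, m$), then the combined portfolio $S \oplus T$ has returns $R^S_i + R^T_j$ with joint probabilities $p_{ij}(R^S_i + R^T_j)$, and
$$H_{nm}(S \oplus T) \le H_n(S) + H_m(T),$$
where $H_{nm}(S \oplus T)$ is the Shannon entropy of the joint probabilities $p_{ij}$, and $H_n(S)$ and $H_m(T)$ are the Shannon entropies of the single portfolios. The information expected from two portfolios is not greater than the sum of the information expected from each single portfolio. This is called subadditivity.
If the two portfolios are independent, we have $p_{ij}(R^S_i + R^T_j) = p_i(R^S_i)\,p_j(R^T_j)$ ($i = 1, 2, \cdots, n$ and $j = 1, 2, \cdots, m$). In this case, the Shannon entropy of the joint probabilities equals the sum of the Shannon entropies of the two single portfolios,
$$H_{nm}(S \oplus T) = H_n(S) + H_m(T).$$
The information expected from two independent portfolios equals the sum of the information expected from each single portfolio. This is called additivity.
Another characterization of the Shannon entropy is recursivity,
$$H_n(p_1, p_2, \cdots, p_n) = H_{n-1}(p_1 + p_2, p_3, \cdots, p_n) + (p_1 + p_2)\, H_2\!\left(\frac{p_1}{p_1 + p_2}, \frac{p_2}{p_1 + p_2}\right).$$
When dealing with a continuous probability density distribution p(x), we can define the Shannon entropy as
$$H(p) = -\int_{-\infty}^{\infty} p(x) \ln p(x)\, dx, \tag{11}$$
with the constraint
$$\int_{-\infty}^{\infty} p(x)\, dx = 1.$$
Furthermore, one can impose the mean $\mu$ and variance $\sigma^2$ of the portfolio as constraints on the probability density distribution of the return p(x),
$$\int_{-\infty}^{\infty} x\, p(x)\, dx = \mu, \qquad \int_{-\infty}^{\infty} (x - \mu)^2\, p(x)\, dx = \sigma^2.$$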
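To maximize the Shannon entropy (11) for a portfolio with mean $\mu$ and variance $\sigma^2$, we construct the Lagrangian
$$L = -\int p(x) \ln p(x)\, dx + \lambda\left(\int p(x)\, dx - 1\right) + \gamma_1\left(\int x\, p(x)\, dx - \mu\right) + \gamma_2\left(\int (x - \mu)^2 p(x)\, dx - \sigma^2\right).$$
The Lagrangian is maximized when its functional variation with respect to the unknown probability density distribution p(x) is zero,
$$\frac{\delta L}{\delta p(x)} = -\ln p(x) - 1 + \lambda + \gamma_1 x + \gamma_2 (x - \mu)^2 = 0.$$
This leads immediately to
$$p(x) = \exp\left[\lambda - 1 + \gamma_1 x + \gamma_2 (x - \mu)^2\right].$$
The conditions of unit total probability, mean, and variance of the portfolio can be integrated explicitly and solved to fix the values of the multipliers $\lambda$, $\gamma_1$, and $\gamma_2$, which gives $\gamma_1 = 0$, $\gamma_2 = -1/(2\sigma^2)$, and $e^{\lambda - 1} = 1/\sqrt{2\pi\sigma^2}$. Thus, we get the maximum Shannon entropy distribution as
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right].$$
It is just the normal distribution.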
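As a sanity check on this result, one can compare closed-form differential entropies of a few distributions that share the same variance. This is a minimal sketch, not part of the derivation: the Laplace and uniform distributions are chosen here only as comparison cases, and the volatility value is hypothetical.

```python
import math

def entropy_normal(sigma):
    """Differential entropy of N(mu, sigma^2): 0.5 * ln(2 * pi * e * sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

def entropy_laplace(sigma):
    """Differential entropy of a Laplace law with variance sigma^2 (scale b = sigma / sqrt(2))."""
    b = sigma / math.sqrt(2.0)
    return 1.0 + math.log(2.0 * b)

def entropy_uniform(sigma):
    """Differential entropy of a uniform law with variance sigma^2 (width = sigma * sqrt(12))."""
    return math.log(sigma * math.sqrt(12.0))

sigma = 0.02  # hypothetical return volatility
entropies = {
    "normal": entropy_normal(sigma),
    "laplace": entropy_laplace(sigma),
    "uniform": entropy_uniform(sigma),
}
largest = max(entropies, key=entropies.get)  # the normal law has the largest entropy
```

None of the three entropies depends on the mean, so only the variance has to be matched for the comparison to be fair.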

The Rényi Entropy
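The Rényi entropy of order α ($\alpha \ge 0$, $\alpha \ne 1$) is defined as
$$H_\alpha(p_1, p_2, \cdots, p_n) = \frac{1}{1 - \alpha} \ln \sum_{i=1}^{n} p_i^{\alpha}.$$
Notice the simple derivative $\frac{d a^t}{dt} = a^t \ln a$. In the limit $\alpha \to 1$, the Rényi entropy behaves as
$$\lim_{\alpha \to 1} H_\alpha = \lim_{\alpha \to 1} \frac{\frac{d}{d\alpha} \ln \sum_{i} p_i^{\alpha}}{\frac{d}{d\alpha}(1 - \alpha)} = -\lim_{\alpha \to 1} \frac{\sum_{i} p_i^{\alpha} \ln p_i}{\sum_{i} p_i^{\alpha}} = -\sum_{i=1}^{n} p_i \ln p_i,$$
where we have used L'Hôpital's rule in the first equality and the constraint $\sum_{i=1}^{n} p_i = 1$ in the third equality. Thus, in the limit $\alpha \to 1$, the Rényi entropy reduces to the Shannon entropy. The Rényi entropy is a generalization of the Shannon entropy.
The Rényi entropy has properties similar to those of the Shannon entropy. It is additive. It attains its maximum value $\ln n$ for $p_i = 1/n$. However, the Rényi entropy contains an additional parameter α, which can be used to make it more or less sensitive to the shape of probability density distributions.
When dealing with a continuous probability density distribution p(x), we can define the Rényi entropy as
$$H_\alpha(p) = \frac{1}{1 - \alpha} \ln \int_{-\infty}^{\infty} p(x)^{\alpha}\, dx, \tag{21}$$
with the constraint
$$\int_{-\infty}^{\infty} p(x)\, dx = 1.$$
Furthermore, one can impose the mean $\mu$ and variance $\sigma^2$ of the portfolio as constraints on the probability density distribution of the return p(x).
To maximize the Rényi entropy (21), we construct the Lagrangian
$$L = \frac{1}{1 - \alpha} \ln \int p(x)^{\alpha}\, dx + \lambda\left(\int p(x)\, dx - 1\right) + \gamma_1\left(\int x\, p(x)\, dx - \mu\right) + \gamma_2\left(\int (x - \mu)^2 p(x)\, dx - \sigma^2\right).$$
The Lagrangian is maximized when its functional variation with respect to the unknown probability density distribution p(x) is zero,
$$\frac{\delta L}{\delta p(x)} = \frac{\alpha}{1 - \alpha}\, \frac{p(x)^{\alpha - 1}}{Z} + \lambda + \gamma_1 x + \gamma_2 (x - \mu)^2 = 0.$$
This leads immediately to
$$p(x)^{\alpha - 1} = \frac{\alpha - 1}{\alpha}\, Z \left[\lambda + \gamma_1 x + \gamma_2 (x - \mu)^2\right],$$
where we have used the notation $Z \equiv \int p(x)^{\alpha}\, dx$. Furthermore, we can rewrite the probability density distribution in the form
$$p(x) = \left[\tilde{\lambda} + \tilde{\gamma}_1 x + \tilde{\gamma}_2 (x - \mu)^2\right]^{\frac{1}{\alpha - 1}},$$
where we have introduced the notations $\tilde{\lambda} \equiv \lambda (\alpha - 1) Z / \alpha$, $\tilde{\gamma}_1 \equiv \gamma_1 (\alpha - 1) Z / \alpha$, and $\tilde{\gamma}_2 \equiv \gamma_2 (\alpha - 1) Z / \alpha$. The parameters $\tilde{\lambda}$, $\tilde{\gamma}_1$, and $\tilde{\gamma}_2$ can be fixed by the conditions of unit total probability, mean, and variance of the portfolio.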
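The limit $\alpha \to 1$ and the maximum at the uniform distribution can be illustrated numerically. The sketch below assumes natural logarithms and uses hypothetical probabilities; the function names are introduced here for illustration.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy -sum_i p_i ln p_i, skipping zero-probability terms."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (alpha >= 0, alpha != 1): ln(sum_i p_i^alpha) / (1 - alpha)."""
    if alpha < 0.0 or alpha == 1.0:
        raise ValueError("alpha must satisfy alpha >= 0 and alpha != 1")
    return math.log(sum(p ** alpha for p in probs if p > 0.0)) / (1.0 - alpha)

probs = [0.5, 0.25, 0.125, 0.125]  # hypothetical probabilities
h_shannon = shannon_entropy(probs)

# Distance between the Renyi and Shannon entropies as alpha approaches 1 from above.
gaps = [abs(renyi_entropy(probs, 1.0 + eps) - h_shannon) for eps in (1e-1, 1e-2, 1e-3)]

# For the uniform distribution the Renyi entropy equals ln n for any admissible order.
h_uniform_order2 = renyi_entropy([0.25] * 4, 2.0)
```

The gaps shrink roughly in proportion to the distance of α from 1, which is the behaviour the L'Hôpital argument predicts.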

Value at Risk
Value at Risk (VaR) is a financial metric that estimates the risk of an investment by measuring the level of financial risk within a portfolio. It is most commonly used by investment banks to determine the extent and occurrence ratio of potential losses in portfolios. VaR gives the probability of losing more than a given amount in a given portfolio. We measure VaR by assessing the amount of a potential loss and the probability that a loss of that amount occurs. Value at Risk is a single number that indicates the extent of risk in a given portfolio, measured either in price units or as a percentage. This makes risk management relatively simple. Value at Risk is widely used in investment banks and commercial banks and has become an accepted standard in buying and selling assets.
One can use a historical method to calculate the VaR for a portfolio. The advantage of the historical method is its simplicity. We take market data for the last few days and calculate the percentage change of each risk factor on each day. We then apply each percentage change to the current market values to produce scenarios for the future value. For each scenario, the portfolio is valued using pricing models. The shortcoming of the historical method is that the empirical return of a portfolio is a stochastic variable whose future behavior cannot be obtained from historical data. That is to say, in principle, one cannot use the historical distribution of returns to calculate the future distribution.
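The final step of the historical method, reading VaR off the empirical distribution of scenario returns, can be sketched as follows. This is a minimal illustration under stated assumptions: the return series is hypothetical, the scenario revaluation with pricing models is skipped, and the quantile convention (the k-th largest loss with k the integer part of αN) is one of several in use.

```python
def historical_var(returns, alpha):
    """Empirical VaR: the loss exceeded or matched in roughly a fraction alpha of past periods."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not returns:
        raise ValueError("returns must not be empty")
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    k = max(1, int(alpha * len(losses)))                  # number of observations in the tail
    return losses[k - 1]

# Hypothetical daily portfolio returns (20 observations).
returns = [0.012, -0.034, 0.005, -0.008, 0.021, -0.051, 0.003, 0.017, -0.015, 0.009,
           -0.002, 0.028, -0.022, 0.011, -0.041, 0.006, 0.014, -0.005, 0.019, -0.011]

var_90 = historical_var(returns, alpha=0.10)  # second-largest loss in the sample
var_95 = historical_var(returns, alpha=0.05)  # largest loss in the sample
```

With so few observations the estimate is driven by one or two extreme days, which illustrates why the historical distribution is a fragile guide to the future one.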
To overcome the shortcoming of the historical method, one can use the parametric method to calculate the VaR for a portfolio. The parametric method in current use is in fact a variance-covariance method. It assumes that the probability density distribution of the return is a normal distribution. To calculate the VaR for a portfolio, one has to estimate the expected return and the standard deviation of the return. If the distribution is almost normal and the mean and variance can be estimated reliably, the parametric method is well suited to risk measurement. However, it is well known that the empirical distribution of portfolio returns is fat tailed, and its tail is close to a power law.
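Under the normality assumption, VaR is simply minus the α-quantile of the return distribution. The sketch below uses Python's standard `statistics.NormalDist`; the return series is hypothetical and the function name `normal_var` is introduced here for illustration.

```python
from statistics import NormalDist, mean, stdev

def normal_var(mu, sigma, alpha):
    """VaR at tail probability alpha when returns are N(mu, sigma^2)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    # The alpha-quantile of the return is negative for small alpha; VaR is reported as a positive loss.
    return -NormalDist(mu, sigma).inv_cdf(alpha)

# Hypothetical daily portfolio returns used to estimate the two parameters.
returns = [0.012, -0.034, 0.005, -0.008, 0.021, -0.051, 0.003, 0.017, -0.015, 0.009,
           -0.002, 0.028, -0.022, 0.011, -0.041, 0.006, 0.014, -0.005, 0.019, -0.011]

mu_hat = mean(returns)
sigma_hat = stdev(returns)  # sample standard deviation

var_95 = normal_var(mu_hat, sigma_hat, alpha=0.05)
var_99 = normal_var(mu_hat, sigma_hat, alpha=0.01)
```

The whole risk estimate rests on two fitted numbers, so a fat-tailed return distribution is invisible to this method by construction.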
The Monte Carlo method has been proposed to deal with the problem encountered in the parametric method. One can calculate Value at Risk by randomly creating a set of scenarios for future returns and using pricing models to estimate the change in value for each scenario. The Monte Carlo method also assumes that there is a known probability distribution for the risk factors. In fact, we do not have the exact distribution.
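The dependence of the Monte Carlo estimate on the assumed distribution can be made concrete with a short simulation. This is a minimal sketch under stated assumptions: returns are drawn directly instead of being produced by pricing models, the volatility is hypothetical, and the Gaussian and Laplace laws are chosen here only as two candidate models with equal variance.

```python
import math
import random

def monte_carlo_var(sample_return, alpha, n_scenarios=100_000, seed=0):
    """Loss level exceeded in roughly a fraction alpha of the simulated scenarios."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    losses = sorted((-sample_return(rng) for _ in range(n_scenarios)), reverse=True)
    k = max(1, int(alpha * n_scenarios))
    return losses[k - 1]

SIGMA = 0.02  # hypothetical return volatility shared by both models

def gaussian_return(rng):
    return rng.gauss(0.0, SIGMA)

def laplace_return(rng):
    # Laplace law with the same variance: scale b = SIGMA / sqrt(2).
    b = SIGMA / math.sqrt(2.0)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * rng.expovariate(1.0 / b)

var_gauss = monte_carlo_var(gaussian_return, alpha=0.01)
var_laplace = monte_carlo_var(laplace_return, alpha=0.01)
```

Both models have the same mean and variance, yet the heavier-tailed one gives a noticeably larger 1% VaR, so the choice of distribution is the decisive input.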
A useful alternative to the VaR is the Conditional Value at Risk (CVaR). CVaR is also known as the expected shortfall, average value at risk, tail VaR, mean excess loss, or mean shortfall. CVaR is the average of the losses that occur beyond the Value at Risk point in a distribution.
In this paper, we use the maximum entropy method to obtain the distribution.
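An empirical CVaR can be computed by averaging the tail losses. The sketch below is an illustration on hypothetical data; it assumes the convention that the tail consists of the k largest losses, with k the integer part of αN, so the VaR observation itself is included in the average.

```python
def var_and_cvar(returns, alpha):
    """Return (VaR, CVaR) at tail probability alpha from a sample of returns."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not returns:
        raise ValueError("returns must not be empty")
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    k = max(1, int(alpha * len(losses)))
    tail = losses[:k]                # the k worst losses
    return tail[-1], sum(tail) / k   # VaR is the smallest tail loss, CVaR the tail average

# Hypothetical daily portfolio returns (20 observations).
returns = [0.012, -0.034, 0.005, -0.008, 0.021, -0.051, 0.003, 0.017, -0.015, 0.009,
           -0.002, 0.028, -0.022, 0.011, -0.041, 0.006, 0.014, -0.005, 0.019, -0.011]

var_90, cvar_90 = var_and_cvar(returns, alpha=0.10)
```

Because CVaR averages over the whole tail, it can never fall below VaR at the same level and it responds to how severe the extreme losses are, not only to how often they occur.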

Entropy Maximisation with VaR Constraints
A well-known generalization of Shannon's entropy is Varma's entropy. This kind of entropy was introduced by Varma [45]. Varma entropy plays an important role as a measure of complexity and uncertainty in different areas. Malhotra, Srivastava, and Taneja used Varma entropy to calibrate the risk-neutral density function [46]. In this section, we follow the method developed in [46].
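Let p(x) be the unknown probability density distribution with the normalization condition
$$\int_{-\infty}^{\infty} p(x)\, dx = 1. \tag{28}$$
The two-parameter Varma entropy is defined as
$$H_\alpha^\beta(p) = \frac{1}{\beta - \alpha} \ln \int_{-\infty}^{\infty} p(x)^{\alpha + \beta - 1}\, dx. \tag{29}$$
The global mean constraint is of the form
$$\int_{-\infty}^{\infty} x\, p(x)\, dx = \mu. \tag{30}$$
In value at risk management, we have the constraints of tail probability and expected shortfall (CVaR). For a threshold K, the tail probability ε is defined as
$$\int_{-\infty}^{K} p(x)\, dx = \epsilon. \tag{31}$$
The expected shortfall $\nu_-$ can be expressed as
$$\int_{-\infty}^{K} x\, p(x)\, dx = \epsilon\, \nu_-. \tag{33}$$
We can rewrite formally the tail probability and expected shortfall as follows:
$$\int_{-\infty}^{\infty} g_{\mathrm{Tail}}(x)\, p(x)\, dx = \epsilon, \qquad g_{\mathrm{Tail}}(x) = \mathbf{1}_{\{x \le K\}},$$
and
$$\int_{-\infty}^{\infty} g_{\mathrm{CVaR}}(x)\, p(x)\, dx = \epsilon\, \nu_-, \qquad g_{\mathrm{CVaR}}(x) = x\, \mathbf{1}_{\{x \le K\}}. \tag{35}$$
Jaynes' maximum entropy principle states that, out of all possible distributions consistent with the constraints, the optimal one has the maximum uncertainty. It is least committed to the information not given to us. In other words, the optimal distribution is the most random and most unbiased. To maximize the Varma entropy (29), we construct the Lagrangian
$$L = H_\alpha^\beta(p) + \lambda_0\left(\int p\, dx - 1\right) + \lambda_1\left(\int x\, p\, dx - \mu\right) + \gamma_{\mathrm{Tail}}\left(\int g_{\mathrm{Tail}}\, p\, dx - \epsilon\right) + \gamma_{\mathrm{CVaR}}\left(\int g_{\mathrm{CVaR}}\, p\, dx - \epsilon\, \nu_-\right).$$
The Lagrangian is maximized when its functional variation with respect to the unknown probability density distribution p(x) is zero,
$$\frac{\delta L}{\delta p(x)} = \frac{\alpha + \beta - 1}{\beta - \alpha}\, \frac{p(x)^{\alpha + \beta - 2}}{Z} + \lambda_0 + \lambda_1 x + \gamma_{\mathrm{Tail}}\, g_{\mathrm{Tail}}(x) + \gamma_{\mathrm{CVaR}}\, g_{\mathrm{CVaR}}(x) = 0,$$
where $Z \equiv \int p(x)^{\alpha + \beta - 1}\, dx$. Thus, we obtain the probability density distribution under the constraints of value at risk as
$$p(x) = \left\{ -\frac{(\beta - \alpha)\, Z}{\alpha + \beta - 1} \left[\lambda_0 + \lambda_1 x + \gamma_{\mathrm{Tail}}\, g_{\mathrm{Tail}}(x) + \gamma_{\mathrm{CVaR}}\, g_{\mathrm{CVaR}}(x)\right] \right\}^{\frac{1}{\alpha + \beta - 2}}.$$
Furthermore, we can rewrite the probability density distribution under the constraints of value at risk in the form
$$p(x) = \left[\tilde{\lambda}_0 + \tilde{\lambda}_1 x + \tilde{\gamma}_{\mathrm{Tail}}\, g_{\mathrm{Tail}}(x) + \tilde{\gamma}_{\mathrm{CVaR}}\, g_{\mathrm{CVaR}}(x)\right]^{\frac{1}{\alpha + \beta - 2}}, \tag{41}$$
where we have introduced the notations $\tilde{\lambda}_0 \equiv -\frac{(\beta - \alpha) Z}{\alpha + \beta - 1} \lambda_0$, $\tilde{\lambda}_1 \equiv -\frac{(\beta - \alpha) Z}{\alpha + \beta - 1} \lambda_1$, $\tilde{\gamma}_{\mathrm{Tail}} \equiv -\frac{(\beta - \alpha) Z}{\alpha + \beta - 1} \gamma_{\mathrm{Tail}}$, and $\tilde{\gamma}_{\mathrm{CVaR}} \equiv -\frac{(\beta - \alpha) Z}{\alpha + \beta - 1} \gamma_{\mathrm{CVaR}}$.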
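The power-law character of the resulting density can be seen numerically. The sketch below is an illustration under explicit assumptions: it takes the density to be a linear combination of 1, x, $g_{\mathrm{Tail}}$, and $g_{\mathrm{CVaR}}$ raised to the power $1/(\alpha + \beta - 2)$, which is the exponent produced by varying an entropy built from $p^{\alpha + \beta - 1}$; it assumes $g_{\mathrm{Tail}}(x) = \mathbf{1}_{\{x \le K\}}$ and $g_{\mathrm{CVaR}}(x) = x\,\mathbf{1}_{\{x \le K\}}$; and all parameter values are hypothetical and not solved from the constraints, so the function is not normalized.

```python
import math

def unnormalized_density(x, alpha, beta, lam0, lam1, gam_tail, gam_cvar, K):
    """[lam0 + lam1*x + gam_tail*g_Tail(x) + gam_cvar*g_CVaR(x)] ** (1 / (alpha + beta - 2))."""
    indicator = 1.0 if x <= K else 0.0  # assumed form of g_Tail; g_CVaR = x * indicator
    base = lam0 + lam1 * x + gam_tail * indicator + gam_cvar * x * indicator
    if base <= 0.0:
        raise ValueError("the bracket must be positive for the density to be defined")
    return base ** (1.0 / (alpha + beta - 2.0))

# Hypothetical parameters with alpha + beta = 1.8, chosen so the bracket stays positive.
params = dict(alpha=0.8, beta=1.0, lam0=1.0, lam1=2.0, gam_tail=-0.2, gam_cvar=-6.0, K=-0.05)

def log_log_slope(x1, x2):
    """Slope of ln p against ln |x| between two points far out in one tail."""
    p1 = unnormalized_density(x1, **params)
    p2 = unnormalized_density(x2, **params)
    return (math.log(p2) - math.log(p1)) / (math.log(abs(x2)) - math.log(abs(x1)))

right_slope = log_log_slope(1e6, 1e7)    # tail above the threshold K
left_slope = log_log_slope(-1e6, -1e7)   # tail below the threshold K
expected_slope = 1.0 / (params["alpha"] + params["beta"] - 2.0)
```

Both tails decay with the same exponent, and the threshold K only changes the prefactor and the location of the kink in the density.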

Lagrangian Multipliers Determination
Substituting the probability density distribution under the constraints of value at risk (41) into the unity condition (28) gives an integral that can be evaluated explicitly. If the two parameters of the Varma entropy satisfy 1 < α + β < 2, the boundary terms vanish at −∞ and ∞, and we obtain the algebraic relation (44). Substituting (41) into the mean constraint (30) and using integration by parts, we again obtain an integral that can be evaluated explicitly. If the two parameters of the Varma entropy satisfy 3/2 < α + β < 2, the boundary terms vanish at −∞ and ∞, and we obtain the algebraic relation (48). Substituting (41) into the tail probability constraint (31) gives an integral that can be evaluated explicitly. If the two parameters of the Varma entropy satisfy 1 < α + β < 2, the boundary term vanishes at −∞, and we obtain the algebraic relation (51). Finally, substituting (41) into the expected shortfall constraint (33) and using integration by parts, we obtain an integral that can be evaluated explicitly. If the two parameters of the Varma entropy satisfy 3/2 < α + β < 2, the boundary term vanishes at −∞, and we obtain the algebraic relation (55). For given parameters α and β of the Varma entropy, as well as the value-at-risk parameters θ = (K, ε, ν−), the algebraic Equations (44), (48), (51), and (55) uniquely determine the parameters $\tilde{\lambda}_0$, $\tilde{\lambda}_1$, $\tilde{\gamma}_{\mathrm{Tail}}$, and $\tilde{\gamma}_{\mathrm{CVaR}}$ in the probability density distribution (41). It should be noticed that, to set up all of the algebraic Equations (44), (48), (51), and (55), we restricted the parameters of the Varma entropy to 3/2 < α + β < 2.
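The two parameter ranges quoted above follow from elementary integrability of a power-law tail. The sketch below is a minimal illustration, assuming the density decays like $|x|^{-s}$ with $s = 1/(2 - \alpha - \beta)$ for large $|x|$: the normalization and tail probability integrals converge when $s > 1$, and the mean and expected shortfall integrals converge when $s > 2$. The function names are introduced here for illustration.

```python
def tail_decay_exponent(alpha, beta):
    """Exponent s such that p(x) ~ |x|**(-s) for large |x|, with s = 1 / (2 - alpha - beta)."""
    total = alpha + beta
    if total >= 2.0:
        raise ValueError("alpha + beta must be below 2 for the density to decay")
    return 1.0 / (2.0 - total)

def constraints_well_defined(alpha, beta):
    """Which constraint integrals converge for the given Varma parameters."""
    s = tail_decay_exponent(alpha, beta)
    return {
        "normalizable": s > 1.0,  # equivalent to alpha + beta > 1
        "finite_mean": s > 2.0,   # equivalent to alpha + beta > 3/2
    }

inside = constraints_well_defined(0.8, 1.0)    # alpha + beta = 1.8
between = constraints_well_defined(0.3, 1.0)   # alpha + beta = 1.3
boundary = constraints_well_defined(0.5, 1.0)  # alpha + beta = 1.5
```

The stricter range 3/2 < α + β < 2 is therefore the one needed for all four constraints to be imposed at once.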

Discussion and Conclusions
In this paper, we used Conditional Value at Risk constraints instead of the variance constraint to maximize the entropy of portfolios. Value at Risk is a financial metric that measures the level of financial risk within a portfolio. It is most commonly used by investment banks to determine the extent and occurrence ratio of potential losses in portfolios. Because it is a single number that indicates the extent of risk in a given portfolio, it makes risk management relatively simple, and it has become an accepted standard in investment banks and commercial banks for buying and selling assets. We showed that the maximum entropy distribution with Conditional Value at Risk constraints is a power law. Algebraic relations between the Lagrangian multipliers and the Value at Risk constraints were presented explicitly. The Lagrangian multipliers can be fixed exactly by the Conditional Value at Risk constraints. Empirical observations of financial markets indeed show that the tails of the distribution decay more slowly than those of the log-normal distribution. In particular, the power-law tails of empirical observations of financial markets have been studied extensively. By using the maximum Varma entropy with CVaR constraints, we obtained a suitable probability density distribution of returns.
Recently, there has been a surge of interest in risk management involving more sophisticated distortion measures [47][48][49]. The distortion risk measure encompasses VaR and CVaR as special cases. Using a distortion function g, we can express the distortion risk measure $\rho_g$ as $\rho_g(X) = \int_0^1 \mathrm{VaR}_s(X)\, d\tilde{g}(s)$, where $\tilde{g}(s) = 1 - g(1 - s)$. It is interesting to discuss the maximum entropy distribution with a distortion risk measure constraint. To obtain a distribution, we need to know the specific form of the distortion function. For example, the mean-CVaR distortion risk measure is defined as [50] $\rho(X) = \theta\, E[X] + \beta\, \mathrm{CVaR}_\alpha(X)$, where θ, β ≥ 0 and α ∈ [0, 1). We can redefine $g_{\mathrm{CVaR}}(x)$ in (35) accordingly, and all of the calculations can be repeated in the same way. One also gets a power law distribution. We thank the anonymous reviewer for drawing our attention to the distortion risk measure.
Author Contributions: C.L. and C.C. performed the formal analysis of the investigation, the methodology, and the software, and wrote the first draft of the paper. Z.C. reviewed and edited the paper. All authors have read and agreed to the published version of the manuscript.