A Communication Theoretic Interpretation of Modern Portfolio Theory Including Short Sales, Leverage and Transaction Costs

: Modern Portfolio Theory is the ground upon which most works in portfolio optimization context ﬁnd their foundations. Many studies attempt to extend the Modern Portfolio Theory to include short sale, leverage and transaction costs, features not considered in Markowitz’s seminal work from 1952. The drawback of such theories is that they complicate considerably the simplicity of the original technique. Here, we propose a simple and uniﬁed method, which takes inspiration from, and shows connections with the matched ﬁlter theory in communications, to evaluate the best portfolio allocation with the possibility of including a leverage factor and short sales. Finally, we extend the presented method to also consider the transaction costs.


Introduction
Let (γ 1,k , γ 2,k , · · · γ N,k ) be the prices of N goods at discrete time instant k. The evolution of the price can be modelled for n = 1, 2, · · · , N and k = 0, 1, · · · as γ n,k+1 = γ n,k (1 + x n,k ) , where γ n,0 > 0 is the initial price of good n and x n,k is the relative price change of the nth good between time k and time k + 1. Thus, if the price of good n raises by 3% between time k and time k + 1, then x n,k = 0.03. The returns x n,k are random variables whose knowledge is only limited at the level of some statistical properties. Thus, it is common practice to diversify the investor's wealth among the N goods, so that the entire capital is not reserved to a single good and that, in the case of a disastrous event (bankrupt), affect only the portion allocated to the unsuccessful good. Markowitz's Modern Portfolio Theory Markowitz (1952), sometimes referred as Mean-Variance Approach, provides a method to choose an optimal allocation of wealth in view of knowledge of few statistical properties of the return variables. In particular, as the name suggests, this approach is based on the evaluations of the first-and second-order moments of x n,k , and the investor can adopt the portfolio that maximizes the expected return given a maximum level of risk accepted. The relevance of this theory is enormous and it has been usually taken as a starting point in a variety of works in the context of portfolio optimization, as shown in Elton and Gruber (1997). Many of these works focus on overcoming some constraints of the original formulation of the Modern Portfolio theory. A point that has been deeply investigated in the past is how to deal with the estimation error that, due to the limited sample size, affects the mean and the variance (see DeMiguel et al. (2009), Kritzman et al. (2010)).
In addition, the mean-variance approach has been compared to competitor approaches in the presence of unavoidable estimation errors (e.g., DeMiguel et al. (2007), Adler and Kritzman (2007)). A criticism that has been moved to the mean-variance approach is about the possibility of describing the distribution of the returns in terms of two parameters only, the mean and the variance (basically, an assumption of Gaussianity). Papers that explore portfolio construction when distribution asymmetries and higher order moments are considered include those by Cremers et al. (2004), Low (2017), and Low et al. (2013Low et al. ( , 2016. Two assumptions impose severe limitations on modern portfolio theory. First, much of the literature assumes that mean and variance are knowable and known. Second, scholars often assume these two parameters alone describe the entire distribution. In addition, modern portfolio theory, as originally laid out, often presumptively forbids the use of short sales and leveraged positions. In other words, conventional works on portfolio design exclude borrowing, either for the purpose of profiting from declines in stock prices (short sales) or for the purpose of increasing the total volume of the investment and correspondingly the resulting profit or loss (leverage). Most studies concerned with these aspects (see Levy (2013a, 2013b) and Pogue (1970)) are based on complex methods and cannot be considered as simple extensions of the original theory.
In this paper, we move back to the original mean-variance approach assuming that mean and variance are known and present a simple extension of Markowitz's theory, inspired by a well known principle in communication theory, that handles simultaneously short-sales and leverage. Finally, we also show that even transaction costs can be taken into account in a straightforward manner. The scope of the paper is thus two-fold: showing a connection between portfolio optimization and matched filter theory in communications, and providing a simple unified presentation of the mentioned important extensions to Markowitz's original theory.

Trader Portfolio
We consider here the financial activity commonly called trading. The distinction that we make between investing and trading, and that we exploit in the present paper, is that in the latter short trading and leveraged trading are allowed. Basically, this means that in trading money is used not merely to buy assets, but to guarantee risky positions that can remain open until the occurrence of a stop loss condition. When the stop loss condition occurs, the position is closed and money used to guarantee it is lost.
The time evolution of a leveraged position on good n is ζ n,k+1 = ζ n,k (1 + s n,k λ n,k x n,k ), where ζ n,k ≥ 0 is the money invested on good n at time k, λ n,k ≥ 0 is the leverage factor (we return on the meaning of λ n,k ≤ 1 later on), and s n,k ∈ {±1} is the long/short indicator. Hence, s n,k = 1 when the position on good n at time k is long, while s n,k = −1 when the position is short. In the following, we assume that meaning that position on good n has been closed at time k + 1 because all money used to guarantee it has been lost. Basically, Equation (2) describes the evolution of a financial product that invests with directionality indicated by s n,k and leverage λ n,k on good n, and that consolidates its value at each iteration. The portfolio is the sum of leveraged positions, that is The evolution of the portfolio for k = 0, 1, · · · , is π k+1 = π k + N ∑ n=1 s n,k λ n,k ζ n,k x n,k .
The iteration terminates at time instant K such that The relative return of the positions taken at time k, which is the focus of the following analysis, is In the following, we consider one step of the iteration and withdraw the time index, thus writing It is convenient to write the relative return of the portfolio as where r n = s n λx n (10) is the relative return of position n with and Note that v n ≥ 0, n = 1, 2, · · · , N, hence v n is the relative money invested on good n and λ is a leverage factor that can be set equal for all goods without affecting the relative return of the portfolio. Although the leverage factor of stocks is often an integer, the trader may obtain any real value of the leverage by mixing positions on the same good with different leverages. Note that leverages between 0 and 1 can also be obtained simply by investing with leverage 1 a fraction λ of the available money while letting a fraction 1 − λ be cash. In the end, the real limit on λ is determined by the maximum leverage available on the financial market, which is in practice so high that we can ignore the limits that it poses.

Portfolio Optimization
Let boldface lowercase characters denote column vectors and boldface uppercase characters denote matrices, and let where the superscript T denotes transposition, be a multivariate random process representing the market behavior with mean vector and covariance matrix where E{·} denotes expectation. In the following, we assume that the above mean vector and covariance matrix are known. For any λ ≥ 0, portfolio optimization requires of finding the relative money allocation vector that maximizes the mean value of the relative return (Equation (9)) with a fixed variance of the relative return. Note that the optimal portfolio will include only positions with that is with because positions with negative expected return will be outperformed by cash. For this reason, without losing generality, we consider in the following only being understood that, when E{x n } ≤ 0, one takes s n = −1 and then puts − x n ← x n , s n = 1.
With these assumptions, for any λ ≥ 0, portfolio optimization requires of determining the vector that maximizes the expected relative return with a fixed variance of the relative return and with the constraints in Equations (13) and (14). This is a classical constrained optimization problem that can be worked out by maximizing the Lagrangian where α ≥ 0 is the Lagrange multiplier and the constraints are Equations (13) and (14). Writing the relative return as one promptly finds that the mean value and the variance of the relative return are and respectively. Hence, the Lagrangian is Assuming that Ψ x,x is definite positive, hence invertible, it is easy to see that the unique maximum of Equation (26) where Imposing Equation (14), one finds which, substituted in Equation (27), leads to Note that, since Ψ x,x is definite positive, its inverse is also definite positive and, since all the entries of µ x are non-negative, all the entries of ω are non-negative, and hence the constraint in Equation (13) is always satisfied. Equations (28) and (30) say that the relative weights of positions in the optimal portfolio are fixed and independent of the risk that the trader wants to take, while the balance between expected return and risk can be tuned by tuning the leverage according to Equations (24) and (25).
The performance of the optimal portfolio is characterized by the following expected relative return and variance of the relative return The unconstrained efficient frontier is a straight line in the first quadrant of the plane (σ r , µ r ) that departs from (σ r = 0, µ r = 0) and that has slope that is the Sharpe ratio (see Sharpe (1966)), a figure that has been object of many studies after the original paper by Sharpe (see, e.g., Bailey and Lopez de Prado (2012) for estimates of the Sharpe ratio when µ x and Ψ x,x cannot be assumed to be known). The entire efficient frontier can be visited by sweeping α from 0 (zero expected return) to ∞ (infinite expected return). Note that including explicitly a risk free position in the portfolio would lead to a covariance matrix that is not invertible, while, with our approach, the risk free position is automatically brought into the portfolio when λ < 1, because a fraction 1 − λ of the portfolio is cash. 1 1 Including the risk free position in the classical portfolio theory is possible, provided that a work around is made to circumvent problems encountered in matrix inversion (see Merton (1972)). Insights on portfolios with and without can be traced back to Tobin (1958).
As shown in North (1963) and Turin (1960), Equations (28) and (33) are very well known in the context of narrowband discrete-time Single Input Multiple Output (SIMO) transmission systems affected by correlated additive noise, e.g. interference. In that context, the transmission channel is characterized by applying an impulse to the input and by collecting the N channel outputs into vector µ x . In addition, channel's output is affected by zero-mean additive noise with covariance matrix Ψ x,x . Vector ω is the matched filter (MF), that is the vector of weights that, applied to the multiple outputs, that is to the goods held in the portfolio, produces through Equation (9) the least noisy version of the input impulse in the mean-square error sense, while µ 2 r /σ 2 r is the Signal-to-Noise Ratio (SNR) after the matched filter. The application of the matched filter in portfolio optimization is not new, as illustrated by Pillai (2009), where however it is shown only how the said communication theory technique can be exploited in investing contexts, without any extension to trading. To put more light on the link between communication theory and the subject treated in this paper, we point out that, in the context of communication theory, the random vector x introduced in Equation (15) can be regarded as the vector signal at the output of a SIMO channel excited by an input impulse of unit amplitude: where µ x is a vector that, in the communication context, is the impulse response of the SIMO channel, and n is a random noise vector having zero mean values and covariance matrix Ψ xx (also the covariance matrix of the noise vector is assumed here to be known). As pointed out by an anonymous referee, the suggestion of regarding the variability in financial returns as the effect of noise (the Brownian motion of gas) dates back to Bachelier's thesis Bachelier (1900). At the receiver side, the received vector is passed through the matched filter (Equation (30)), getting where, in the context of communication, the scaling factor λ is set to so that meaning that r recovers the unit amplitude of the input impulse keeping minimum the mean squared error E{(r − E{r}) 2 }, hence achieving the maximum of Equation (33). Besides this, the matched filter has another interesting property that is well known in the context of communication and information theory. Let a be a random scalar parameter that, in the context of communication, is the information-carrying amplitude, e.g., a ∈ {0, 1}. When an impulse of amplitude a is applied at channel input, the channel output is With Gaussian noise, the matched filter output is a sufficient statistic for the hidden a. This means that r can be seen as a synthetic representation of s that fully preserves the possibility of making inference, for instance, maximum likelihood detection, about the hidden a based on r alone. In a more technical language, this concept is expressed by saying that the information processing inequality I(a, s) ≥ I(a, T(s)), where I(x, y) is Shannon information between x and y, is met with equality when the transformation T(x) is the right hand side of Equation (38). The interested reader is referred to Chapter 2 of Cover and Thomas (2006) for the basics about Shannon information, for the concept of sufficient statistic and for the meaning of the information processing inequality. An example illustrates similarities and differences between the leveraged portfolio and classical Markowitz's portfolios. Consider three goods with For the leveraged portfolio, the optimal money allocation vector is proportional to and the slope of the efficient frontier is µ r σ r = √ 10. Figure 1 reports µ versus σ for the individual goods and the efficient frontiers for the leveraged portfolio and for the Markowitz's portfolios with and without the possibility of holding cash. What happens is that the efficient frontier of the Markowitz's portfolio without cash is equal to the efficient frontier of the leveraged portfolio until because the fraction (1 − λ) of cash of the leveraged portfolio is invested in the non-leveraged portfolio on the low risk stocks, thus leaving the performance of Markowitz's portfolio only slightly worse than the performance of the leveraged portfolio. In the limit, including the possibility of holding cash in Markowitz's portfolio would close the gap between leveraged and non-leveraged portfolios until the inequality in Equation (43) is satisfied. Differences between the two portfolios become bigger and bigger as λ → ∞ because, while the leveraged portfolio continues taking profit from diversification by maintaining the optimal mix between goods thanks to the big leverage, the absence of leverage forces the Markowitz's portfolios to renounce to diversification, leading in the limit to a portfolio composed by only one good, the one that offers the greater expected return, even if, as it happens in the example, this portfolio could be easily outperformed also by a portfolio made by only one good with better µ/σ, simply by leveraging it.

Cost of Transactions
Another relevant feature not considered in Markowitz's original paper is the cost deriving from a transaction on the market. Transaction costs may influence portfolio's optimization, and, for this reason, transaction costs have been widely considered in the subsequent literature (Lobo et al. (2007), Mansini et al. (2003), Speranza (1999, 2005)). Type and structure of transaction costs depend on the financial operator (e.g., broker or bank) and on the type of stocks that the trader uses for her/his operation (e.g., Exchange Trade Commodities, Exchange Traded Funds, or Contract for Difference). Let us consider a general case where the relative cost is of the type λδ n (λ), δ n (λ) ≥ 0, n = 1, 2, · · · , N, where, here again, without losing generality, we assume that the leverage factor is common to all the goods and consider a transaction cost that can depend on the specific good and on the leverage factor. The relative return of the nth position is r n = λ(s n x n − δ n (λ)).
The above inequality means that a position that does not meet the above inequality cannot be in the portfolio, because it is outperformed by cash. A brief comment about the dependence of δ n on λ is in order. Consider the investment in bonds with one-year life emitted by a very trusted central bank. If the time step is one year and the return of the bond is higher than the cost of transaction, then one could borrow money to buy bonds. Actually, this is allowed in our model, because borrowing money simply means increasing λ. However, in a case like the one of this example, borrowing money is a cost that increases the cost of transaction, so that, in the end, the cost of transaction becomes higher than the return of the bond. When this happens it is the dependence of δ n on λ that, through Equation (46), rules out this position from the leveraged portfolio. Actually, a realistic cost model for such a case is where β 1 takes into account costs such as bank commissions that apply when investor's money is employed, that is for λ ≤ 1, while β 2 takes into account the cost of borrowing money, which applies to a fraction (λ − 1) of the investment. If, for λ > 1, then the condition in Equation (46) is violated and the position on good n with leverage λ cannot be included in the portfolio.
The constrained optimization is again Equation (22) with the usual constraints, but now The optimal portfolio and the efficient frontier are found by the same means as in the previous section, simply putting (µ x − δ(λ)) in place of µ x . The only difference is that, when the cost of transactions depends on λ, the optimal portfolio will depend on λ and the efficient frontier will no longer be a straight line.
As an example, we consider again the example of Section 3 where the statistical properties are described by Equations (41) and (42) with the addition of the following transaction costs equal for the three goods. Figure 2 displays a new comparison between Markowitz's portfolios and leveraged portfolio with the inclusion of transaction costs (Equation (50)). The first region corresponds to λ ≤ 1, where the efficient frontier of the leveraged portfolio is a straight line since we are in the first case of Equation (50), where δ(λ) is actually a constant since there is no dependance of λ.
The breakpoint between the straight line and the second region is at λ = 1, where the transaction cost changes and is no longer independent of λ as in the second case of Equation (50). The separation between the second and the third region is at λ = 2, where δ(λ) = E{x n } = 10 −2 and good 1 is ruled out by Equation (46). Finally, as λ → ∞, the efficient frontier approaches a straight line, because δ(λ) approaches the constant 1.5 × 10 −2 . These features are not easily noticed in Figure 2. Thus, in Figure 3, the efficient frontier of the leveraged portfolio is illustrated from a different perspective, where the division in three different regions highlighted by green dots is more visible.

Conclusions
In this work, we propose a method that takes inspiration from matched filter, a known concept in communication theory, and that extends Markowitz's Modern Portfolio Theory to account for short-sales, leverage and transaction costs. The illustrated method differs from the other known techniques in its simplicity and for the universality that allows the trader to simultaneously consider all the extensions to the original theory or just a part of them.