Combining Alphas via Bounded Regression

Kakushadze, Zura

doi:10.3390/risks3040474

by Zura Kakushadze ¹,²

¹ Quantigic® Solutions LLC, 1127 High Ridge Road #135, Stamford, CT 06905, USA

² Business School & School of Physics, Free University of Tbilisi, 240, David Agmashenebeli Alley, Tbilisi 0159, Georgia

Risks 2015, 3(4), 474-490; https://doi.org/10.3390/risks3040474

Submission received: 31 July 2015 / Accepted: 30 October 2015 / Published: 4 November 2015


Abstract

We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.

Keywords: hedge fund; alpha stream; alpha weights; portfolio turnover; investment allocation; weighted regression; diversification; bounds; optimization; factor models

JEL classifications: G00

1. Introduction

With technological advances, there is an ever-increasing number of alpha streams.1 Many of these alphas are ephemeral, with relatively short lifespans. As a result, in practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alpha streams; SCM is singular. Therefore, directly using SCM in, say, alpha portfolio optimization is not an option.2

One approach to circumvent this difficulty is to build a factor model for alpha streams [1]. Because of the utmost secrecy in the alpha business, such factor models must be built in-house; there are no commercial providers of “standardized” factor models for alpha streams. As with factor models for equities, such model building for alphas requires certain nontrivial expertise and time expenditure.

Therefore, in practice, one often takes a simpler path. As was discussed in more detail in [1], one can deform SCM, such that it is nonsingular and, then, use the so-deformed SCM in, say, Sharpe ratio optimization for a portfolio of alphas. For small deformations, this then reduces to a cross-sectional weighted regression of the alpha stream expected returns [1]. The regression weights are the inverse sample variances of the alphas. The columns of the loadings matrix, over which the expected returns are regressed, are nothing but the first K principal components of SCM corresponding to its positive (i.e., non-vanishing) eigenvalues [1].

Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. Thus, if some expected returns are skewed, then, despite non-unit regression weights (which suppress more volatile alphas), the corresponding alpha weights can be larger than desired by diversification considerations. Furthermore, the principal components know nothing about quantities, such as turnover.3 A simple way of obtaining a more “well-rounded” portfolio composition is to set bounds on alpha weights. This is the approach we discuss here.

When individual alpha streams are traded on separate execution platforms, the alpha weights are non-negative. By combining and trading multiple alpha streams on the same execution platform, the framework we adopt here, one saves on transaction costs by internally crossing trades between different alpha streams (as opposed to going to the market).4 Then, the alpha weights can be negative.

When alpha weights can take both positive and negative values, the bounded regression problem simplifies. It boils down to an iterative algorithm that we discuss in Section 2. This algorithm can actually be derived from an optimization algorithm with bounds (in a factor model context) discussed in [2] by taking the regression limit of optimization. We also give R source code for the bounded regression algorithm in Appendix A. Appendix B contains some legalese. We conclude in Section 3, where we also discuss bounded regression with transaction costs following [3].

2. Bounded Regression

2.1. Notations

We have N alphas $α_{i}$, $i = 1, \dots, N$. Each alpha is actually a time series $α_{i}(t_{s})$, $s = 0, 1, \dots, M$, where $t_{0}$ is the most recent time. Below, $α_{i}$ refers to $α_{i}(t_{0})$.

Let $C_{ij}$ be the sample covariance matrix (SCM) of the N time series $α_{i}(t_{s})$. If $M < N$, then only M eigenvalues of $C_{ij}$ are non-zero, while the remainder have “small” values, which are zeros distorted by computational rounding.5
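The rank deficiency described above can be checked numerically. The following is a minimal R sketch on simulated data; the sizes N = 10 and M = 4, the seed, and the relative threshold 1e-10 are arbitrary assumptions made for illustration only.

```r
# Hypothetical sizes: N alphas observed at M + 1 times (s = 0, ..., M), M < N
set.seed(1)
N <- 10
M <- 4

# Simulated alpha time series: rows are times, columns are alphas
alpha.ts <- matrix(rnorm((M + 1) * N), M + 1, N)

# Sample covariance matrix (N x N); it is singular because M < N
C <- cov(alpha.ts)
ev <- eigen(C, symmetric = TRUE)$values

# Count the eigenvalues that are non-zero relative to the largest one;
# the remaining ones are zeros distorted by rounding
n.pos <- sum(ev > 1e-10 * max(ev))
print(n.pos)
```

For generic data, the count of non-zero eigenvalues equals M, so the SCM cannot be inverted directly.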

Alphas $α_{i}$ are combined with weights $w_{i}$. Any leverage is included in the definition of $α_{i}$, i.e., if a given alpha labeled by $j \in \{1, \dots, N\}$ before leverage is $α_{j}^{'}$ (this is a raw, unlevered alpha) and the corresponding leverage is $L_{j} : 1$, then we define $α_{j} \equiv L_{j} α_{j}^{'}$. With this definition, the weights satisfy the condition:

$$\sum_{i = 1}^{N} |w_{i}| = 1 \tag{1}$$

Here, we allow the weights to be negative, as we are interested in the case where the alphas are traded on the same execution platform, and trades between alphas are crossed, so one is actually trading the combined alpha $α \equiv \sum_{i = 1}^{N} α_{i} w_{i}$.

2.2. Weighted Regression

When SCM $C_{ij}$ is singular and no other matrix (e.g., a factor model) to replace it is available, one can deform SCM, such that it is nonsingular and then use the so-deformed SCM in, say, Sharpe ratio optimization for a portfolio of alphas [1]. For small deformations, this reduces to a cross-sectional weighted regression of the alpha stream expected returns [1]. The regression weights $z_{i}$ (not to be confused with the alpha weights $w_{i}$) are the inverse sample variances of the alphas: $z_{i} \equiv 1 / C_{ii}$. The columns of the loadings matrix $Λ_{iA}$, $A = 1, \dots, K$, over which the expected returns are regressed, are nothing but the first K principal components of SCM corresponding to its positive (i.e., non-vanishing) eigenvalues. However, for now, we will keep $Λ_{iA}$ general (e.g., one may wish to include other risk factors in $Λ_{iA}$ [1]).

The weights $w_{i}$ are given by:

$$w_{i} = γ z_{i} ε_{i} \tag{2}$$

where $ε_{i}$ are the residuals of the cross-sectional regression of $α_{i}$ over $Λ_{iA}$ (without the intercept, unless the intercept is subsumed in $Λ_{iA}$, see below) with the regression weights $z_{i}$:

$$ε_{i} = α_{i} - \sum_{j = 1}^{N} z_{j} α_{j} \sum_{A, B = 1}^{K} Λ_{iA} Λ_{jB} Q_{AB}^{-1} \tag{3}$$

where $Q_{AB}^{-1}$ is the inverse of:

$$Q_{AB} \equiv \sum_{i = 1}^{N} z_{i} Λ_{iA} Λ_{iB} \tag{4}$$

and the overall factor γ in Equation (2) is fixed via Equation (1). Note that we have:

$$\forall A \in \{1, \dots, K\} : \sum_{i = 1}^{N} w_{i} Λ_{iA} = 0 \tag{5}$$

Therefore, the weights $w_{i}$ are neutral w.r.t. the risk factors defined by the columns of the loadings matrix $Λ_{iA}$.
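Equations (2) through (5) can be sketched in a few lines of R using lm(). This is only an illustration on simulated data: the sizes, the seed, the choice K = 3, and the use of the first row of the simulated series as the most recent alphas are all assumptions, not part of the original text.

```r
set.seed(2)
N <- 8   # number of alphas (hypothetical)
M <- 4   # M + 1 observations per alpha (hypothetical)
K <- 3   # number of principal components kept, K <= M (hypothetical)

# Simulated alpha time series; the first row plays the role of alpha_i(t_0)
alpha.ts <- matrix(rnorm((M + 1) * N, sd = 0.01), M + 1, N)
alpha <- alpha.ts[1, ]

C <- cov(alpha.ts)       # sample covariance matrix
z <- 1 / diag(C)         # regression weights: inverse sample variances

# Loadings: first K principal components of the SCM
load <- eigen(C, symmetric = TRUE)$vectors[, 1:K, drop = FALSE]

# Weighted regression without intercept; residuals are eps_i of Equation (3)
reg <- lm(alpha ~ -1 + load, weights = z)
eps <- residuals(reg)

# Equation (2), with gamma fixed by the normalization in Equation (1)
w <- z * eps
w <- as.vector(w / sum(abs(w)))

# Equation (5): the weights are neutral w.r.t. every column of the loadings
neutrality <- as.vector(t(load) %*% w)
print(max(abs(neutrality)))
```

The neutrality follows from the normal equations of the weighted least squares fit, so the printed value should be zero up to rounding.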

2.3. Bounds

Since the weights $w_{i}$ can have either sign, we will assume that the lower and upper bounds on the weights:

$$w_{i}^{-} \leq w_{i} \leq w_{i}^{+} \tag{6}$$

satisfy the conditions:

$$w_{i}^{-} \leq 0 \tag{7}$$

$$w_{i}^{+} \geq 0 \tag{8}$$

$$w_{i}^{-} < w_{i}^{+} \tag{9}$$

The last condition is not restrictive: if for some alpha labeled by i, we have $w_{i}^{-} = w_{i}^{+}$, then we can simply set $w_{i} = w_{i}^{-}$ and altogether exclude this alpha from the bounded regression procedure below. Furthermore, if, for whatever reason, we wish to have no upper/lower bound for a given $w_{i}$, we can simply set $w_{i}^{\pm} = \pm 1$.

The bounds can be imposed for diversification purposes: e.g., one may wish to require that no alpha has a weight greater than some fixed (small) percentile ξ, i.e., $|w_{i}| \leq ξ$, so $w_{i}^{\pm} = \pm ξ$. One may also wish to suppress the contributions of high turnover alphas, e.g., by requiring that $|w_{i}| \leq \tilde{ξ}$ if $τ_{i} \geq τ_{*}$, where $τ_{i}$ is the turnover,6 $τ_{*}$ is some cut-off turnover, and $\tilde{ξ}$ is some (small) percentile. Bounds can also be used to limit the weights of low capacity7 alphas, etc.8
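As an illustration of how such bounds might be assembled in practice, here is a minimal R sketch. The turnover values, the caps xi and xi.tilde, and the cut-off tau.star are hypothetical numbers chosen for the example.

```r
# Hypothetical turnovers for N = 6 alphas
turnover <- c(0.05, 0.20, 0.60, 0.10, 0.80, 0.30)

xi <- 0.30         # diversification cap applied to every alpha (hypothetical)
xi.tilde <- 0.10   # tighter cap for high turnover alphas (hypothetical)
tau.star <- 0.50   # cut-off turnover (hypothetical)

# Symmetric bounds: tighter cap where the turnover is at or above the cut-off
cap <- ifelse(turnover >= tau.star, xi.tilde, xi)
upper <- cap
lower <- -cap

# Conditions (7)-(9) on the bounds
stopifnot(all(lower <= 0), all(upper >= 0), all(lower < upper))

# Necessary for the normalization in Equation (1) to be attainable:
# the largest allowed absolute weights must add up to at least 1
stopifnot(sum(pmax(upper, -lower)) >= 1)

print(rbind(lower, upper))
```

The last check is worth keeping in any real setup: if the caps are too tight to add up to 1, no set of weights can satisfy both the bounds and Equation (1).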

2.4. Running a Bounded Regression

So, how do we impose the bounds in the context of a regression? There are two subtleties here. First, we wish to preserve the factor neutrality property (5), which is invariant under the simultaneous rescalings $w_{i} \to ζ w_{i}$ (where ζ is a constant). If we simply set some $w_{i}$ to their upper or lower bounds, this generally will ruin the rescaling invariance, so the property (5) will be lost. Second, we must preserve the normalization condition (1). In fact, it is precisely this normalization condition that allows one to meaningfully set the bounds $w_{i}^{\pm}$, as the regression itself does not fix the overall normalization coefficient γ in Equation (2), owing to the rescaling invariance $w_{i} \to ζ w_{i}$.
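Both subtleties can be seen in a small deterministic example. The R sketch below uses an intercept-only loadings matrix, unit regression weights, and made-up alphas and cap (all assumptions for illustration), and shows that naively clipping the regression weights breaks both Equation (5) and Equation (1).

```r
# Hypothetical expected returns with one dominant alpha (they sum to zero)
alpha <- c(10, -1, -2, -3, -2, -2)
N <- length(alpha)
load <- matrix(1, N, 1)   # intercept only, so K = 1
z <- rep(1, N)            # unit regression weights for simplicity

# Unbounded regression weights, normalized via Equation (1)
w <- z * residuals(lm(alpha ~ -1 + load, weights = z))
w <- as.vector(w / sum(abs(w)))

# Naive approach: clip the weights at +/- xi
xi <- 0.2
w.clip <- pmax(pmin(w, xi), -xi)

# Factor (here: dollar) neutrality before and after clipping
before <- max(abs(t(load) %*% w))
after <- max(abs(t(load) %*% w.clip))

print(c(before = before, after = after, norm.after = sum(abs(w.clip))))
```

Here the first weight is cut from 0.5 to 0.2 while the others are untouched, so the clipped weights no longer sum to zero and their absolute values no longer sum to 1.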

Here, we discuss the bounded regression algorithm. To save space, we skip the detailed derivation as it follows straightforwardly by taking the regression limit of optimization with bounds in the context of a factor model, both of which are discussed in detail in [2].9

Let us define the following subsets of the index $i \in J \equiv \{1, \dots, N\}$:

$$w_{i} = w_{i}^{+}, \quad i \in J^{+} \tag{10}$$

$$w_{i} = w_{i}^{-}, \quad i \in J^{-} \tag{11}$$

$$\bar{J} \equiv J^{+} \cup J^{-} \tag{12}$$

$$\tilde{J} \equiv J \setminus \bar{J} \tag{13}$$

Further, let:

$$\tilde{α}_{i} \equiv γ α_{i} \tag{14}$$

$$y_{A} \equiv \sum_{i \in \tilde{J}} z_{i} \tilde{α}_{i} Λ_{iA} + \sum_{i \in J^{+}} w_{i}^{+} Λ_{iA} + \sum_{i \in J^{-}} w_{i}^{-} Λ_{iA} \tag{15}$$

where γ is to be determined (see below). Then, we have:

$$w_{i} = z_{i} \left(\tilde{α}_{i} - \sum_{A, B = 1}^{K} Λ_{iA} \tilde{Q}_{AB}^{-1} y_{B}\right), \quad i \in \tilde{J} \tag{16}$$

$$\forall i \in J^{+} : z_{i} \left(\tilde{α}_{i} - \sum_{A, B = 1}^{K} Λ_{iA} \tilde{Q}_{AB}^{-1} y_{B}\right) \geq w_{i}^{+} \tag{17}$$

$$\forall i \in J^{-} : z_{i} \left(\tilde{α}_{i} - \sum_{A, B = 1}^{K} Λ_{iA} \tilde{Q}_{AB}^{-1} y_{B}\right) \leq w_{i}^{-} \tag{18}$$

where $\tilde{Q}^{-1}$ is the inverse of the $K \times K$ matrix $\tilde{Q}$:

$$\tilde{Q}_{AB} \equiv \sum_{i \in \tilde{J}} z_{i} Λ_{iA} Λ_{iB} \tag{19}$$

Here, the loadings matrix $Λ_{iA}$ must be such that $\tilde{Q}$ is invertible.10 Furthermore, note that $w_{i}$, $i \in \tilde{J}$ given by Equation (16) together with $w_{i} = w_{i}^{+}$, $i \in J^{+}$ and $w_{i} = w_{i}^{-}$, $i \in J^{-}$ satisfy Equation (5), as they should.

Note that, for a given value of γ, Equation (15) solves for $y_{A}$ given $J^{+}$ and $J^{-}$. On the other hand, Equations (17) and (18) determine $J^{+}$ and $J^{-}$ in terms of $y_{A}$. The entire system is then solved iteratively, where at the initial iteration, one takes $\tilde{J}^{(0)} = J$, so that $J^{+(0)}$ and $J^{-(0)}$ are empty. However, we still need to fix γ. This is done via a separate iterative procedure, which we describe below.

Because we have two iterations, to guarantee (rapid) convergence, the $J^{\pm}$ iteration (that is, for a given value of γ) can be done as follows. Let $\hat{w}_{i}^{(s)}$ be such that:

$$\forall i \in J : w_{i}^{-} \leq \hat{w}_{i}^{(s)} \leq w_{i}^{+} \tag{20}$$

$$\forall A \in \{1, \dots, K\} : \sum_{i = 1}^{N} \hat{w}_{i}^{(s)} Λ_{iA} = 0 \tag{21}$$

At the $(s + 1)$-th iteration, let $w_{i}^{(s + 1)}$ be given by Equation (16) for $i \in \tilde{J}^{(s)}$, with $w_{i}^{(s + 1)} = w_{i}^{\pm}$ for $i \in J^{\pm (s)}$. This solution satisfies Equation (5), but may not satisfy the bounds. Let:

$$q_{i} \equiv w_{i}^{(s + 1)} - \hat{w}_{i}^{(s)} \tag{22}$$

$$h_{i}(t) \equiv \hat{w}_{i}^{(s)} + t q_{i}, \quad t \in [0, 1] \tag{23}$$

Then:

$$\hat{w}_{i}^{(s + 1)} \equiv h_{i}(t_{*}) = \hat{w}_{i}^{(s)} + t_{*} q_{i} \tag{24}$$

where $t_{*}$ is the maximal value of t, such that $h_{i}(t)$ satisfies the bounds. We have:

$$q_{i} > 0 : p_{i} \equiv \min\left(w_{i}^{(s + 1)}, w_{i}^{+}\right) \tag{25}$$

$$q_{i} < 0 : p_{i} \equiv \max\left(w_{i}^{(s + 1)}, w_{i}^{-}\right) \tag{26}$$

$$t_{*} = \min\left(\left.\frac{p_{i} - \hat{w}_{i}^{(s)}}{q_{i}} \,\right|\, q_{i} \neq 0, \; i \in J\right) \tag{27}$$

Now, at each step, instead of Equations (17) and (18), we can define $J^{\pm (s + 1)}$ via:

$$\forall i \in J^{+ (s + 1)} : \hat{w}_{i}^{(s + 1)} = w_{i}^{+} \tag{28}$$

$$\forall i \in J^{- (s + 1)} : \hat{w}_{i}^{(s + 1)} = w_{i}^{-} \tag{29}$$

where $\hat{w}_{i}^{(s + 1)}$ is computed iteratively as above, and we can take $\hat{w}_{i}^{(0)} \equiv 0$ at the initial iteration. Unlike Equations (17) and (18), Equations (28) and (29) add new elements to $J^{\pm}$ one (or a few) element(s) at each iteration.
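A single step of Equations (22) through (29) can be traced with a small R sketch. The numbers below (the current feasible point, the candidate solution of Equation (16), the bounds, and the tolerance) are hypothetical and chosen so that the arithmetic is easy to follow by hand.

```r
# Current feasible point (the initial iteration uses zeros) and bounds
w.hat <- c(0, 0, 0, 0)
upper <- rep(0.4, 4)
lower <- rep(-0.4, 4)

# Hypothetical candidate from Equation (16); it violates the first upper bound
w.new <- c(0.6, -0.2, -0.3, -0.1)

# Equation (22): direction of the step
q <- w.new - w.hat

# Equations (25)-(26): how far each component may travel
p <- ifelse(q > 0, pmin(w.new, upper), pmax(w.new, lower))

# Equation (27): largest t in [0, 1] that keeps every component within bounds
ok <- q != 0
t.star <- min((p[ok] - w.hat[ok]) / q[ok])

# Equation (24): the new feasible point
w.next <- w.hat + t.star * q

# Equations (28)-(29): components sitting at a bound (within a tolerance)
tol <- 1e-6
Jp <- abs(w.next - upper) < tol
Jm <- abs(w.next - lower) < tol

print(t.star)
print(w.next)
```

In this example only the first component limits the step, so $t_{*} = 2/3$ and only the first index enters $J^{+}$. Because both the starting point and the candidate sum to zero, the new point does too, which is how the step preserves neutrality with respect to an intercept.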

The convergence criteria are given by:

$$J^{+ (s + 1)} = J^{+ (s)} \tag{30}$$

$$J^{- (s + 1)} = J^{- (s)} \tag{31}$$

These criteria are based on discrete quantities and are unaffected by computational (machine) precision effects. However, in practice, the equalities in Equations (28) and (29) are understood within some tolerance (or machine precision); see the R code in Appendix A. We will denote the value of $\hat{w}_{i}^{(s + 1)}$ at the final iteration (for a given value of γ) via $\tilde{w}_{i}$.

Finally, γ is determined via another iterative procedure as follows (we use superscript a for the γ iterations to distinguish them from the superscript s for the $J^{\pm}$ iterations):

$$γ^{(a + 1)} = \frac{γ^{(a)}}{\sum_{i = 1}^{N} |\tilde{w}_{i}^{(a)}|} \tag{32}$$

where $\tilde{w}_{i}^{(a)}$ is computed as above for $γ = γ^{(a)}$. To achieve rapid convergence, the initial value $γ^{(0)}$ can be set as follows:

$$γ^{(0)} = \frac{1}{\sum_{i = 1}^{N} z_{i} |ε_{i}|} \tag{33}$$

where $ε_{i}$ are the residuals of the weighted regression (without bounds) given by Equation (3). The convergence criterion for the γ iteration is given by:

$$γ^{(a + 1)} = γ^{(a)} \tag{34}$$

understood within some preset computational tolerance (or machine precision).
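The γ iteration of Equations (32) through (34) can be illustrated in isolation with a deliberately degenerate toy case. The R sketch below assumes there are no factors at all, in which case Equations (16) through (18) reduce to clipping $γ z_{i} α_{i}$ at the bounds; the helper bounded.nofactor(), the input numbers, and the tolerance are hypothetical and stand in for the full $J^{\pm}$ iteration.

```r
# Stand-in for the J+/J- iteration in the no-factor case (an assumption made
# purely for illustration): the bounded solution is the clipped vector
bounded.nofactor <- function(gamma, z, alpha, lower, upper)
    pmax(pmin(gamma * z * alpha, upper), lower)

# Hypothetical inputs
z <- rep(1, 4)
alpha <- c(0.5, -0.3, 0.1, -0.1)
upper <- rep(0.4, 4)
lower <- -upper

# Equation (33); without factors the residuals are the alphas themselves
gamma <- 1 / sum(z * abs(alpha))

# Equation (32), iterated until Equation (34) holds within a tolerance
for(a in 1:1000) {
    w <- bounded.nofactor(gamma, z, alpha, lower, upper)
    gamma.new <- gamma / sum(abs(w))
    if(abs(gamma.new - gamma) < 1e-10)
        break
    gamma <- gamma.new
}

print(gamma)
print(w)
```

In this toy case the first weight is pinned at its upper bound 0.4 and the remaining weights are scaled up until the absolute values sum to 1, which happens at γ = 1.2.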

The R code for the above algorithm with some additional explanatory documentation is given in Appendix A. Note that this code is not written to be “fancy” or optimized for speed or in any other way. Instead, its sole purpose is to illustrate the bounded regression algorithm as it is described above in a simple-to-understand fashion. Some legalese relating to this code is given in Appendix B.

2.5. Application to Stock Portfolios

Above, we discussed the bounded regression algorithm in the context of computing weights for portfolios of alpha streams. However, the algorithm is quite general and, with appropriate notational identifications, can be applied to portfolios of stocks or other suitable instruments. In fact, it can also be applied outside of finance. Here, for the sake of definiteness, we will focus on stock portfolios; in fact, we will assume that they are dollar neutral, so both long and short positions are allowed.11

2.5.1. Establishing Trades

Let us first discuss establishing trades, i.e., we start from nil positions and establish a portfolio of N stocks. Instead of alpha streams, our index $i \in \{1, \dots, N\} \equiv J$ now labels the stocks. We will denote the desired dollar (not share) holdings via $H_{i}$ and the total dollar investment (long plus short) via I:

$$I \equiv \sum_{i = 1}^{N} |H_{i}| \tag{35}$$

Let $w_{i} \equiv H_{i} / I$. These are now our stock weights (analogous to the alpha weights). Then, we have the familiar normalization condition:

$$\sum_{i = 1}^{N} |w_{i}| = 1 \tag{36}$$

However, normally, one imposes bounds on $H_{i}$, not on $w_{i}$. For example, in the case of establishing trades, one may wish to cap the positions, such that: (i) not more than a small percentile ξ of the total dollar investment I is allocated to any given stock; this is a diversification constraint; and (ii) only a small percentile $\tilde{ξ}$ of ADDV (average daily dollar volume) $V_{i}$ is traded; this is a liquidity constraint (see below). In this case, we have the following bounds on the dollar holdings $H_{i}$:

$$H_{i}^{-} \leq H_{i} \leq H_{i}^{+} \tag{37}$$

$$H_{i}^{\pm} = \pm \min\left(ξ I, \tilde{ξ} V_{i}\right) \tag{38}$$

In this case, the upper and lower bounds are symmetrical. In some cases, such as for hard-to-borrow stocks, we may have some $H_{i}^{-} = 0$. In other cases, one may not wish to have a long position in some stocks, etc. We will only assume that $H_{i}^{-} \leq 0$ and $H_{i}^{+} \geq 0$, in line with our discussion above for the bounds on the weights, which are then given by:

$$w_{i}^{\pm} \equiv H_{i}^{\pm} / I \tag{39}$$

The final touch then is that instead of $α_{i}$, one uses some expected returns $E_{i}$ in the case of stocks. The rest goes through exactly as above for a suitably-chosen $Λ_{iA}$.
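The translation from holdings bounds to weight bounds in Equations (38) and (39) is simple arithmetic, sketched below in R. The investment level, the ADDV figures, the caps, and the choice of which stock is hard to borrow are hypothetical.

```r
# Hypothetical inputs
I <- 2e7                        # total investment level, long plus short
V <- c(5e8, 5e7, 1e7, 2e6)      # ADDV per stock, in dollars
xi <- 0.01                      # diversification cap as a fraction of I
xi.tilde <- 0.01                # liquidity cap as a fraction of ADDV

# Equation (38): symmetric dollar bounds
H.cap <- pmin(xi * I, xi.tilde * V)
H.upper <- H.cap
H.lower <- -H.cap

# Suppose the last stock is hard to borrow: no short position allowed
H.lower[4] <- 0

# Equation (39): bounds on the weights
w.upper <- H.upper / I
w.lower <- H.lower / I

print(rbind(w.lower, w.upper))
```

For the two most liquid stocks the diversification cap binds, while for the two least liquid ones the ADDV cap does. The resulting vectors play the role of the upper and lower arguments of the bounded regression.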

2.5.2. Rebalancing Trades

With rebalancing trades, we have the current dollar holdings $H_{i}^{*}$ and the desired dollar holdings $H_{i}$. In this case, one may wish to cap the positions such that: (i) not more than a small percentile ξ of the total dollar investment I is allocated to any given stock; this is the same diversification constraint as above; (ii) only a small percentile $\tilde{ξ}$ of ADDV $V_{i}$ is traded; this is the same liquidity constraint as above; and (iii) not more than a small percentile $ξ^{'}$ of ADDV $V_{i}$ is allocated to any given stock; this is another liquidity constraint stemming from the consideration that, if the portfolio must be liquidated swiftly (e.g., due to an unforeseen event), to mitigate liquidation costs, the positions are capped based on liquidity. Here, $ξ^{'}$ typically can be several times larger than $\tilde{ξ}$; the portfolio can be built up in stages as long as at each stage, the bounds are satisfied. The bounds on $H_{i}$ now read:

$$|H_{i}| \leq \min\left(ξ I, ξ^{'} V_{i}\right) \tag{40}$$

$$|H_{i} - H_{i}^{*}| \leq \tilde{ξ} V_{i} \tag{41}$$

It is more convenient to rewrite these bounds in terms of the traded dollar amounts $D_{i} \equiv H_{i} - H_{i}^{*}$:

$$D_{i}^{-} \leq D_{i} \leq D_{i}^{+} \tag{42}$$

$$D_{i}^{+} = \min\left(\min\left(ξ I, ξ^{'} V_{i}\right) - H_{i}^{*}, \tilde{ξ} V_{i}\right) \geq 0 \tag{43}$$

$$D_{i}^{-} = \max\left(-\min\left(ξ I, ξ^{'} V_{i}\right) - H_{i}^{*}, -\tilde{ξ} V_{i}\right) \leq 0 \tag{44}$$

and we are assuming that $|H_{i}^{*}| \leq \min\left(ξ I, ξ^{'} V_{i}\right)$. Furthermore, we will assume that $H_{i}^{*}$ itself satisfies Equation (5):

$$\forall A \in \{1, \dots, K\} : \sum_{i = 1}^{N} H_{i}^{*} Λ_{iA} = 0 \tag{45}$$

Then, the bounded regression algorithm can be straightforwardly applied to the weights $w_{i}$ and $x_{i}$ defined as follows:

$$w_{i} \equiv H_{i} / I \tag{46}$$

$$x_{i} \equiv D_{i} / I \tag{47}$$

In the $J^{\pm}$ iteration, we now use $x_{i}$ instead of $w_{i}$, while in the γ iteration, we still use $w_{i}$. Then, the rest of the algorithm goes through unchanged. Let us note, however, that the source code given in Appendix A is written with alpha weights in mind, so while it can be adapted to the case of stock portfolios in the case of establishing trades, straightforward modifications are required to accommodate rebalancing trades.
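The rebalancing bounds of Equations (43) and (44) can be computed as in the following R sketch. All inputs (the investment level, ADDV figures, current holdings, and the three caps) are hypothetical numbers picked so that different constraints bind for different stocks.

```r
# Hypothetical inputs
I <- 2e7                          # total investment level
V <- c(5e8, 5e7, 1e7)             # ADDV per stock, in dollars
H.star <- c(1.5e5, -1e5, 2e5)     # current dollar holdings
xi <- 0.01                        # diversification cap (fraction of I)
xi.prime <- 0.05                  # position cap (fraction of ADDV)
xi.tilde <- 0.01                  # trading cap (fraction of ADDV)

# Equation (40): cap on the absolute position
H.cap <- pmin(xi * I, xi.prime * V)

# Assumption stated in the text: current holdings already respect the cap
stopifnot(all(abs(H.star) <= H.cap))

# Equations (43)-(44): bounds on the traded dollar amounts
D.upper <- pmin(H.cap - H.star, xi.tilde * V)
D.lower <- pmax(-H.cap - H.star, -xi.tilde * V)

# Equation (47): the same bounds expressed for x_i = D_i / I
x.upper <- D.upper / I
x.lower <- D.lower / I

print(rbind(D.lower, D.upper))
```

The third stock already sits at its position cap, so its upper trading bound is zero: it can only be sold. For the second stock the position cap binds on the short side, so only 1e5 dollars more can be sold.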

2.5.3. Examples: Intraday Mean Reversion Alphas

To illustrate the use of the algorithm, we have employed it to construct portfolios for intraday mean reversion alphas with the loadings matrix $Λ_{iA}$ in the following five incarnations: (i) intercept only (so $K = 1$); (ii) BICS (Bloomberg Industry Classification System) sectors; (iii) BICS industries; (iv) BICS sub-industries; and (v) the four style factors, prc, mom, hlv and vol, of [60] plus BICS sub-industries. The regression weights are the inverse sample variances: $z_{i} = 1 / C_{ii}$ (see below). In Cases (ii)–(v) above, the intercept is subsumed in the loadings matrix $Λ_{iA}$. Indeed, we have $\sum_{A \in G} Λ_{iA} \equiv 1$, where G is the set of columns of $Λ_{iA}$ corresponding to sectors in Case (ii), industries in Case (iii) and sub-industries in Cases (iv) and (v). Consequently, the resultant portfolios are automatically dollar neutral.

The portfolio construction and backtesting are identical to those in [61], where a more detailed discussion can be found; so, to save space, here, we will only give a brief summary. The portfolios are assumed to be established at the open and liquidated at the close on the same day, so they are purely intraday, and the algorithm of Section 2.5.1 for establishing trades applies. The expected returns $E_{i}$ for each date are taken to be $E_{i} = -R_{i}$, where $R_{i} \equiv \ln\left(P_{i}^{\text{open}} / P_{i}^{\text{close}}\right)$, and for each date, $P_{i}^{\text{open}}$ is today’s open, while $P_{i}^{\text{close}}$ is yesterday’s close adjusted for splits and dividends if the ex-date is today. Therefore, these are intraday mean-reversion alphas.
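The expected returns defined above amount to one line of arithmetic; the R sketch below spells it out with made-up prices (the three price pairs are hypothetical and assumed to be already adjusted for splits and dividends).

```r
# Hypothetical prices for three stocks
P.open <- c(10.20, 49.50, 100.00)        # today's open
P.close.prev <- c(10.00, 50.00, 100.00)  # yesterday's adjusted close

# Overnight log return and the mean-reversion expected return E_i = -R_i
R <- log(P.open / P.close.prev)
E <- -R

print(round(E, 4))
```

A stock that gapped up overnight gets a negative expected return (a short signal), one that gapped down gets a positive one, and an unchanged stock gets zero.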

The universe is top 2000 by ADDV $V_{i}$, where ADDV is computed based on 21-trading day rolling periods. However, the universe is not rebalanced daily, but rather every 21 trading days (see [61] for details). The sample variances $C_{ii}$ are computed based on the same 21-trading day rolling periods and are likewise not updated daily, but rather every 21 trading days, the same as the universe rebalancing (see [61] for details). We run our simulations over a period of five years (more precisely, $252 \times 5$ trading days going back from 5 September 2014, inclusive). The annualized return-on-capital (ROC) is computed as average daily P&L divided by the total (long plus short) intraday investment level I (with no leverage) and multiplied by 252. The annualized Sharpe ratio (SR) is computed as the daily Sharpe ratio multiplied by $\sqrt{252}$. Cents-per-share (CPS) are computed as the total P&L divided by the total shares traded. On each day, the total (establishing plus liquidating) shares traded for each stock are given by $Q_{i} = 2 |H_{i}| / P_{i}^{\text{open}}$ (see [61] for details).
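The three performance measures can be written as short R helpers. This is a sketch under stated assumptions: the function names are hypothetical, the daily Sharpe ratio is taken to be the mean daily P&L over its sample standard deviation (the text does not specify the estimator), and the inputs are toy numbers rather than backtest output.

```r
# Annualized return-on-capital: average daily P&L over investment level, times 252
annualized.roc <- function(pnl, I) 252 * mean(pnl) / I

# Annualized Sharpe ratio: daily Sharpe ratio times sqrt(252)
# (sample standard deviation is an assumption)
annualized.sr <- function(pnl) sqrt(252) * mean(pnl) / sd(pnl)

# Total shares traded on one day: establishing plus liquidating trades
daily.shares <- function(H, P.open) sum(2 * abs(H) / P.open)

# Cents-per-share: total P&L in cents over total shares traded
cents.per.share <- function(pnl, shares) 100 * sum(pnl) / sum(shares)

# Toy inputs: two stocks, three days, the same holdings every day
H <- c(5e4, -5e4)                 # dollar holdings
P.open <- c(50, 25)               # open prices
I <- sum(abs(H))                  # investment level, long plus short
pnl <- c(100, 200, 300)           # daily P&L in dollars
shares <- rep(daily.shares(H, P.open), length(pnl))

roc <- annualized.roc(pnl, I)
sr <- annualized.sr(pnl)
cps <- cents.per.share(pnl, shares)
print(c(ROC = roc, SR = sr, CPS = cps))
```

With these toy inputs, 6000 shares trade per day, so a total P&L of 600 dollars over 18,000 shares gives about 3.33 cents per share.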

For comparison purposes, the results for regressions without bounds are given in Table 1. The results for the bounded regressions, with the bounds on the desired holdings set as:

$$|H_{i}| \leq 0.01 V_{i} \tag{48}$$

so not more than 1% of each stock’s ADDV is bought or sold, are given in Table 2, and the corresponding P&Ls are plotted in Figure 1. Thus, as expected, adding the liquidity bounds has a diversifying effect on the portfolios, so the Sharpe ratios are substantially improved; as usual, this comes at the expense of (slightly) lowering paper ROC and CPS. Note that, even with tight liquidity bounds, the four style factors, prc, mom, hlv and vol, of [60] add value, further validating the four-factor model of [60].

**Table 1.** Simulation results for the 5 alphas via regression without bounds discussed in Section 2.5.3. ROC, return-on-capital; SR, Sharpe ratio; CPS, cents-per-share; BICS, Bloomberg Industry Classification System.

| Alpha | ROC | SR | CPS |
|---|---|---|---|
| Regression: Intercept only | 33.59% | 5.59 | 1.38 |
| Regression: BICS sectors | 39.28% | 7.05 | 1.61 |
| Regression: BICS industries | 42.66% | 8.19 | 1.75 |
| Regression: BICS sub-industries | 45.25% | 9.22 | 1.84 |
| Regression: 4 style factors plus BICS sub-industries | 46.60% | 9.85 | 1.90 |

Figure 1. P&L graphs for the intraday alphas discussed in Section 2.5.3, with a summary in Table 2. Bottom-to-top-performing: (i) regression over intercept only; (ii) regression over BICS sectors; (iii) regression over BICS industries; (iv) regression over BICS sub-industries; and (v) regression over the four style factors, prc, mom, hlv and vol, of [60] plus BICS sub-industries. The investment level is $10 M long plus $10 M short.

**Table 2.** Simulation results for the 5 alphas via bounded regression discussed in Section 2.5.3.

| Alpha | ROC | SR | CPS |
|---|---|---|---|
| Regression: Intercept only | 29.66% | 7.36 | 1.25 |
| Regression: BICS sectors | 35.32% | 9.89 | 1.48 |
| Regression: BICS industries | 39.25% | 12.00 | 1.65 |
| Regression: BICS sub-industries | 42.23% | 14.13 | 1.75 |
| Regression: 4 style factors plus BICS sub-industries | 43.70% | 15.54 | 1.82 |

3. Concluding Remarks

One, but not the only, way to think about bounded regression is as an alternative to optimization with bounds when the latter is not attainable. In fact, as mentioned above, bounded regression is a zero specific risk limit of optimization with bounds in the context of a factor model. Therefore, when a factor model is not available, e.g., in the context of alpha streams, bounded regression can be used in lieu of optimization.

In this regard, one can further augment the bounded regression algorithm we discussed above by including linear transaction costs, as in [3]. A systematic approach is to start with optimization with bounds and linear transaction costs in the context of a factor model, as in [2], and to take a zero specific risk limit. Non-linear transaction costs (impact) in the context of alpha weights can be treated using the approximation discussed in [3] using the spectral model of turnover reduction [62].

Conflicts of Interest

The author declares no conflict of interest.

Appendix

A. The R Code

Below, we give R (R Package for Statistical Computing, http://www.r-project.org) source code for the bounded regression algorithm we discuss in the main text. The entry function is calc.bounded.lm(), which runs the γ iteration loop and calls the function bounded.lm(), which runs the $J^{\pm}$ iteration loop. The args() of calc.bounded.lm() are: ret, which is the N-vector of alphas $α_{i}$ (or, more generally, some other returns); load, which is the $N \times K$ loadings matrix $Λ_{iA}$; weights, which is the N-vector of the regression weights $z_{i}$; upper, which is the N-vector of the upper bounds $w_{i}^{+}$; lower, which is the N-vector of the lower bounds $w_{i}^{-}$; and prec, which is the desired precision with which the output weights $w_{i}$, the N-vector of which calc.bounded.lm() returns, must satisfy the normalization condition Equation (1). Internally, bounded.lm() calls the function calc.bounds(), which computes $\hat{w}_{i}^{(s + 1)}$ in Equation (24) at each iteration. The code is largely self-explanatory. Jp, Jm in bounded.lm() correspond to $J^{\pm}$. One subtlety is that, when restricting $Λ_{iA}$ to $\tilde{J} \subset J$, in the case of binary industry classification (e.g., when $Λ_{iA}$ corresponds to BICS sub-industries, which can be small), the so-restricted $Λ_{iA}$ may have null columns, which must be omitted, and the code below does just that. For non-binary cases, one may wish to augment the code to ensure that the matrix Q, which bounded.lm() builds from load[Jt, take] and w.load[Jt, take], is nonsingular (and, if it is not, then remove the culprit columns in $Λ_{iA}$ or otherwise modify the latter); however, for non-binary $Λ_{iA}$ and generic regression weights, this should not occur, except for special, non-generic cases.


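    # Entry point: runs the gamma iteration, Equations (32)-(34).
    calc.bounded.lm <- function(ret, load, weights, upper, lower, prec = 1e-5)
    {
        # Unbounded weighted regression fixes the initial scale, Equation (33)
        reg <- lm(ret ~ -1 + load, weights = weights)
        x <- weights * residuals(reg)
        ret <- ret / sum(abs(x))
        repeat{
            x <- bounded.lm(ret, load, weights, upper, lower)
            # Stop once the normalization, Equation (1), holds within prec
            if(abs(sum(abs(x)) - 1) < prec)
                break
            ret <- ret / sum(abs(x))
        }
        return(x)
    }

    # Runs the J+/J- iteration for a fixed overall scale (gamma is absorbed in ret).
    bounded.lm <- function(ret, load, weights, upper, lower, tol = 1e-6)
    {
        # Move from the feasible point z toward x and stop at the first
        # bound that is hit, Equations (22)-(27)
        calc.bounds <- function(z, x)
        {
            q <- x - z
            p <- rep(NA, length(x))
            pp <- pmin(x, upper)
            pm <- pmax(x, lower)
            p[q > 0] <- pp[q > 0]
            p[q < 0] <- pm[q < 0]
            t <- (p - z) / q
            t <- min(t, na.rm = TRUE)
            z <- z + t * q
            return(z)
        }

        if(!is.matrix(load))
            load <- matrix(load, length(load), 1)
        n <- nrow(load)
        k <- ncol(load)
        ret <- matrix(ret, n, 1)
        upper <- matrix(upper, n, 1)
        lower <- matrix(lower, n, 1)
        z <- diag(weights)
        w.load <- z %*% load
        w.ret <- z %*% ret
        J <- rep(TRUE, n)
        Jp <- rep(FALSE, n)   # indices at the upper bound, J+
        Jm <- rep(FALSE, n)   # indices at the lower bound, J-
        z <- rep(0, n)        # initial feasible point
        repeat{
            Jt <- J & !Jp & !Jm   # free indices

            # Equation (15); drop = FALSE keeps single rows as matrices
            y <- t(w.load[Jt, , drop = FALSE]) %*% ret[Jt, , drop = FALSE]
            if(any(Jp))
                y <- y + t(load[Jp, , drop = FALSE]) %*% upper[Jp, , drop = FALSE]
            if(any(Jm))
                y <- y + t(load[Jm, , drop = FALSE]) %*% lower[Jm, , drop = FALSE]

            # Omit columns that are null on the free indices
            if(k > 1)
                take <- colSums(abs(load[Jt, , drop = FALSE])) > 0
            else
                take <- TRUE

            # Equation (19) and its inverse
            Q <- t(load[Jt, take, drop = FALSE]) %*% w.load[Jt, take, drop = FALSE]
            Q <- solve(Q)
            v <- Q %*% y[take, , drop = FALSE]

            xJp <- Jp
            xJm <- Jm

            # Equation (16) for free indices; bound values for the rest
            x <- w.ret - w.load[, take, drop = FALSE] %*% v
            x[Jp, ] <- upper[Jp, ]
            x[Jm, ] <- lower[Jm, ]

            z <- calc.bounds(z, x)

            # Equations (28)-(29), understood within the tolerance tol
            Jp <- as.vector(abs(z - upper) < tol)
            Jm <- as.vector(abs(z - lower) < tol)

            # Convergence criteria, Equations (30)-(31)
            if(all(Jp == xJp) && all(Jm == xJm))
                break
        }
        return(z)
    }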

B. Disclaimers

Wherever the context so requires, the masculine gender includes the feminine and/or neuter, and the singular form includes the plural and vice versa. The author of this paper (“Author”) and his affiliates including without limitation Quantigic Solutions LLC (“Author’s Affiliates” or “his Affiliates”) make no implied or express warranties or any other representations whatsoever, including without limitation implied warranties of merchantability and fitness for a particular purpose, in connection with or with regard to the content of this paper including without limitation any code or algorithms contained herein (“Content”).

The reader may use the Content solely at his/her/its own risk and the reader shall have no claims whatsoever against the Author or his Affiliates and the Author and his Affiliates shall have no liability whatsoever to the reader or any third party whatsoever for any loss, expense, opportunity cost, damages or any other adverse effects whatsoever relating to or arising from the use of the Content by the reader including without any limitation whatsoever: any direct, indirect, incidental, special, consequential or any other damages incurred by the reader, however caused and under any theory of liability; any loss of profit (whether incurred directly or indirectly), any loss of goodwill or reputation, any loss of data suffered, cost of procurement of substitute goods or services, or any other tangible or intangible loss; any reliance placed by the reader on the completeness, accuracy or existence of the Content or any other effect of using the Content; and any and all other adversities or negative effects the reader might encounter in using the Content irrespective of whether the Author or his Affiliates is or are or should have been aware of such adversities or negative effects.

The R code included in Appendix A hereof is part of the copyrighted R code of Quantigic® Solutions LLC and is provided herein with the express permission of Quantigic® Solutions LLC. The copyright owner retains all rights, title and interest in and to its copyrighted source code included in Appendix A hereof and any and all copyrights therefor.

References

1. Z. Kakushadze. “Factor Models for Alpha Streams.” J. Invest. Strategies 4 (2014): 83–109.
2. Z. Kakushadze. “Mean-Reversion and Optimization.” J. Asset Manag. 16 (2015): 14–40.
3. Z. Kakushadze. “Combining Alpha Streams with Costs.” J. Risk 17 (2015): 57–78.
4. T. Schneeweis, R. Spurgin, and D. McCarthy. “Survivor Bias in Commodity Trading Advisor Performance.” J. Futures Markets 16 (1996): 757–772.
5. C. Ackermann, R. McEnally, and D. Ravenscraft. “The Performance of Hedge Funds: Risk, Return and Incentives.” J. Financ. 54 (1999): 833–874.
6. S.J. Brown, W. Goetzmann, and R.G. Ibbotson. “Offshore Hedge Funds: Survival and Performance, 1989–1995.” J. Bus. 72 (1999): 91–117.
7. F.R. Edwards, and J. Liew. “Managed Commodity Funds.” J. Futures Markets 19 (1999): 377–411.
8. F.R. Edwards, and J. Liew. “Hedge Funds versus Managed Futures as Asset Classes.” J. Deriv. 6 (1999): 45–64.
9. W. Fung, and D. Hsieh. “A Primer on Hedge Funds.” J. Empir. Financ. 6 (1999): 309–331.
10. B. Liang. “On the Performance of Hedge Funds.” Financ. Anal. J. 55 (1999): 72–85.
11. V. Agarwal, and N.Y. Naik. “On Taking the “Alternative” Route: The Risks, Rewards, and Performance Persistence of Hedge Funds.” J. Altern. Invest. 2 (2000): 6–23.
12. V. Agarwal, and N.Y. Naik. “Multi-Period Performance Persistence Analysis of Hedge Funds.” J. Financ. Quant. Anal. 35 (2000): 327–342.
13. W. Fung, and D. Hsieh. “Performance Characteristics of Hedge Funds and Commodity Funds: Natural vs. Spurious Biases.” J. Financ. Quant. Anal. 35 (2000): 291–307.
14. B. Liang. “Hedge Funds: The Living and the Dead.” J. Financ. Quant. Anal. 35 (2000): 309–326.
15. C.S. Asness, R.J. Krail, and J.M. Liew. “Do Hedge Funds Hedge?” J. Portf. Manag. 28 (2001): 6–19.
16. F.R. Edwards, and M.O. Caglayan. “Hedge Fund and Commodity Fund Investments in Bull and Bear Markets.” J. Portf. Manag. 27 (2001): 97–108.
17. W. Fung, and D. Hsieh. “The Risk in Hedge Fund Strategies: Theory and Evidence from Trend Followers.” Rev. Financ. Stud. 14 (2001): 313–341.
18. B. Liang. “Hedge Fund Performance: 1990–1999.” Financ. Anal. J. 57 (2001): 11–18.
19. A.W. Lo. “Risk Management For Hedge Funds: Introduction and Overview.” Financ. Anal. J. 57 (2001): 16–33.
20. C. Brooks, and H.M. Kat. “The Statistical Properties of Hedge Fund Index Returns and Their Implications for Investors.” J. Altern. Invest. 5 (2002): 26–44.
21. D.-L. Kao. “Battle for Alphas: Hedge Funds versus Long-Only Portfolios.” Financ. Anal. J. 58 (2002): 16–36.
22. G. Amin, and H. Kat. “Stocks, Bonds and Hedge Funds: Not a Free Lunch!” J. Portf. Manag. 29 (2003): 113–120.
23. N. Chan, M. Getmansky, S.M. Haas, and A.W. Lo. “Systemic Risk and Hedge Funds.” In The Risks of Financial Institutions. Edited by M. Carey and R.M. Stulz. Chicago, IL, USA: University of Chicago Press, 2006, Chapter 6; pp. 235–338.
24. H. Markowitz. “Portfolio selection.” J. Financ. 7 (1952): 77–91.
25. A. Charnes, and W.W. Cooper. “Programming with linear fractional functionals.” Nav. Res. Logist. Q. 9 (1962): 181–186.
26. W.F. Sharpe. “Mutual fund performance.” J. Bus. 39 (1966): 119–138.
27. R.C. Merton. “Lifetime portfolio selection under uncertainty: The continuous time case.” Rev. Econ. Stat. 51 (1969): 247–257.
28. S. Schaible. “Parameter-free convex equivalent and dual programs of fractional programming problems.” Z. Oper. Res. 18 (1974): 187–196.
29. M. Magill, and G. Constantinides. “Portfolio selection with transactions costs.” J. Econ. Theory 13 (1976): 245–263.
30. A.F. Perold. “Large-scale portfolio optimization.” Manag. Sci. 30 (1984): 1143–1160.
31. M. Davis, and A. Norman. “Portfolio selection with transaction costs.” Math. Oper. Res. 15 (1990): 676–713.
32. B. Dumas, and E. Luciano. “An exact solution to a dynamic portfolio choice problem under transaction costs.” J. Financ. 46 (1991): 577–595.
33. C.J. Adcock, and N. Meade. “A simple algorithm to incorporate transactions costs in quadratic optimization.” Eur. J. Oper. Res. 79 (1994): 85–94.
34. S. Shreve, and H.M. Soner. “Optimal investment and consumption with transaction costs.” Ann. Appl. Probab. 4 (1994): 609–692.
35. D. Bienstock. “Computational study of a family of mixed-integer quadratic programming problems.” Math. Program. 74 (1996): 121–140.
36. J. Cvitanić, and I. Karatzas. “Hedging and portfolio optimization under transaction costs: A martingale approach.” Math. Financ. 6 (1996): 133–165.
37. A. Yoshimoto. “The mean-variance approach to portfolio optimization subject to transaction costs.” J. Oper. Res. Soc. Jpn. 39 (1996): 99–117.
38. C. Atkinson, S.R. Pliska, and P. Wilmott. “Portfolio management with transaction costs.” Proc. R. Soc. Lond. Ser. A 453 (1997): 551–562.
39. D. Bertsimas, C. Darnell, and R. Soucy. “Portfolio construction through mixed-integer programming at Grantham, Mayo, Van Otterloo and Company.” Interfaces 29 (1999): 49–66.
40. A. Cadenillas, and S.R. Pliska. “Optimal trading of a security when there are taxes and transaction costs.” Financ. Stoch. 3 (1999): 137–165.
41. T.-J. Chang, N. Meade, J.E. Beasley, and Y.M. Sharaiha. “Heuristics for cardinality constrained portfolio optimisation.” Comput. Oper. Res. 27 (2000): 1271–1302.
42. H. Kellerer, R. Mansini, and M.G. Speranza. “Selecting portfolios with fixed costs and minimum transaction lots.” Ann. Oper. Res. 99 (2000): 287–304.
43. R.T. Rockafellar, and S. Uryasev. “Optimization of conditional value-at-risk.” J. Risk 2 (2000): 21–41.
44. J. Gondzio, and R. Kouwenberg. “High-performance computing for asset-liability management.” Oper. Res. 49 (2001): 879–891.
45. H. Konno, and A. Wijayanayake. “Portfolio optimization problem under concave transaction costs and minimal transaction unit constraints.” Math. Program. 89 (2001): 233–250.
46. S. Mokkhavesa, and C. Atkinson. “Perturbation solution of optimal portfolio theory with transaction costs for any utility function.” IMA J. Manag. Math. 13 (2002): 131–151.
47. O.L.V. Costa, and A.C. Paiva. “Robust portfolio selection using linear-matrix inequalities.” J. Econ. Dyn. Control 26 (2002): 889–909.
48. F. Alizadeh, and D. Goldfarb. “Second-order cone programming.” Math. Program. 95 (2003): 3–51.
49. M.J. Best, and J. Hlouskova. “Portfolio selection and transactions costs.” Comput. Optim. Appl. 24 (2003): 95–116.
50. K. Janeček, and S. Shreve. “Asymptotic analysis for optimal investment and consumption with transaction costs.” Financ. Stoch. 8 (2004): 181–206.
51. M.S. Lobo, M. Fazel, and S. Boyd. “Portfolio optimization with linear and fixed transaction costs.” Ann. Oper. Res. 152 (2007): 341–365.
52. R. Zagst, and D. Kalin. “Portfolio optimization under liquidity costs.” Int. J. Pure Appl. Math. 39 (2007): 217–233.
53. M. Potaptchik, L. Tunçel, and H. Wolkowicz. “Large scale portfolio optimization with piecewise linear transaction costs.” Optim. Methods Softw. 23 (2008): 929–952.
54. E. Moro, J. Vicente, L.G. Moyano, A. Gerig, J.D. Farmer, G. Vaglica, F. Lillo, and R.N. Mantegna. “Market impact and trading profile of hidden orders in stock markets.” Phys. Rev. E 80 (2009): 066102.
55. J. Goodman, and D.N. Ostrov. “Balancing small transaction costs with loss of optimal allocation in dynamic stock trading strategies.” SIAM J. Appl. Math. 70 (2010): 1977–1998.
56. M. Bichuch. “Asymptotic analysis for optimal investment in finite time with transaction costs.” SIAM J. Financ. Math. 3 (2012): 433–458.
57. J.E. Mitchell, and S. Braun. “Rebalancing an investment portfolio in the presence of convex transaction costs, including market impact costs.” Optim. Methods Softw. 28 (2013): 523–542.
58. H. Soner, and N. Touzi. “Homogenization and asymptotics for small transaction costs.” SIAM J. Control Optim. 51 (2013): 2893–2921.
59. Z. Kakushadze, and J.K.-S. Liew. “Is It Possible to OD on Alpha?” J. Altern. Invest. 18 (2015): 39–49.
60. Z. Kakushadze. “4-Factor Model for Overnight Returns.” Wilmott Mag. 2015 (2015): 56–62.
61. Z. Kakushadze. “Russian-Doll Risk Models.” J. Asset Manag. 16 (2015): 170–185.
62. Z. Kakushadze. “A Spectral Model of Turnover Reduction.” Econometrics 3 (2015): 577–589.

¹Here “alpha”, following the common trader lingo, generally means any reasonable “expected return” that one may wish to trade on and is not necessarily the same as the “academic” alpha. In practice, the detailed information about how alphas are constructed is often unavailable, e.g., the only data available could be the position data, so “alpha” then is a set of instructions to achieve certain stock holdings by some times $t_{1}, t_{2}, \dots$.
²For a partial list of hedge fund literature, see, e.g., [4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23] and the references therein. For a partial list of portfolio optimization and related literature, see, e.g., [24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58] and the references therein.
³One approach to rectify this is to add a turnover-based factor to the loadings matrix [1].
⁴For a recent discussion, see [59].
⁵Actually, this assumes that there are no N/As in any of the alpha time series. If some or all alpha time series contain N/As in a non-uniform manner and the correlation matrix is computed by omitting such pair-wise N/As, then the resulting correlation matrix may have negative eigenvalues that are not zeros distorted by computational rounding.
⁶Here, the turnover (over a given period, e.g., daily turnover) is defined as the ratio $τ_{i} \equiv D_{i} / I_{i}$ of total dollars $D_{i}$ (long plus short) traded by the alpha labeled by i over the corresponding total dollar holdings $I_{i}$ (long plus short).
⁷By capacity $I_{i}^{*}$ for a given alpha, we mean the value of the investment level $I_{i}$ for which the P&L $P_{i} (I_{i})$ is maximized (considering nonlinear effects of impact).
⁸Since the regression we consider here is weighted with the regression weights $z_{i} = 1 / C_{ii}$, this already controls exposure to alpha volatility, so imposing bounds based on volatility would make a difference only if one wishes to further suppress volatile alphas.
⁹The regression limit of optimization essentially amounts to the limit $ξ_{i}^{2} \equiv η {\tilde{ξ}}_{i}^{2}$, $η \to 0$, ${\tilde{ξ}}_{i}^{2} = \text{fixed}$, where $ξ_{i}$ is the specific (idiosyncratic) risk in the factor model with the factor loadings matrix identified with the regression loadings matrix $Λ_{i A}$ (and the $K \times K$ factor covariance matrix becomes immaterial in the regression limit); see [2] for details.
¹⁰This is the case if the columns of $Λ_{i A}$ are comprised of the first K principal components of SCM $C_{i j}$ corresponding to its positive eigenvalues. However, as mentioned above, here, we keep the loadings matrix general.
¹¹Various generalizations are possible, some more straightforward than others.
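
The point made in footnote 5 can be illustrated with a small sketch. The data below are made up purely for illustration (three short series whose N/As fall on different rows); the only library call used is base R's `cor` with `use = "pairwise.complete.obs"`, which omits N/As pair by pair.

```r
# Made-up data: each pair of columns overlaps on a different set of rows
x1 <- c(1, 2, 3, NA, NA, NA, 1, 2, 3)
x2 <- c(1, 2, 3, 1, 2, 3, NA, NA, NA)
x3 <- c(NA, NA, NA, 1, 2, 3, 3, 2, 1)
x <- cbind(x1, x2, x3)

# Correlations computed pair by pair, omitting pair-wise N/As
cor.mat <- cor(x, use = "pairwise.complete.obs")

# Eigenvalues of the resulting symmetric matrix
ev <- eigen(cor.mat, symmetric = TRUE)$values
print(cor.mat)
print(ev)
```

Here the three pairwise correlations are +1, +1 and -1, a combination that no complete data set could produce, so the smallest eigenvalue is genuinely negative (about -1) rather than a zero distorted by rounding.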

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

