Combining Alphas via Bounded Regression
1. Introduction
2. Bounded Regression
2.1. Notations
2.2. Weighted Regression
2.3. Bounds
2.4. Running a Bounded Regression
2.5. Application to Stock Portfolios
2.5.1. Establishing Trades
2.5.2. Rebalancing Trades
2.5.3. Examples: Intraday Mean Reversion Alphas
Alpha | ROC (return-on-capital) | SR (Sharpe ratio) | CPS (cents-per-share) |
Regression: Intercept only | 33.59% | 5.59 | 1.38 |
Regression: BICS sectors | 39.28% | 7.05 | 1.61 |
Regression: BICS industries | 42.66% | 8.19 | 1.75 |
Regression: BICS sub-industries | 45.25% | 9.22 | 1.84 |
Regression: 4 style factors plus BICS sub-industries | 46.60% | 9.85 | 1.90 |
Alpha | ROC (return-on-capital) | SR (Sharpe ratio) | CPS (cents-per-share) |
Regression: Intercept only | 29.66% | 7.36 | 1.25 |
Regression: BICS sectors | 35.32% | 9.89 | 1.48 |
Regression: BICS industries | 39.25% | 12.00 | 1.65 |
Regression: BICS sub-industries | 42.23% | 14.13 | 1.75 |
Regression: 4 style factors plus BICS sub-industries | 43.70% | 15.54 | 1.82 |
3. Concluding Remarks
Conflicts of Interest
A. The R Code
calc.bounded.lm <- function(ret, load, weights, upper, lower, prec = 1e-5)
{
   # Unbounded weighted regression; rescale ret so that the unbounded
   # weighted residuals would have sum(abs(x)) = 1
   reg <- lm(ret ~ -1 + load, weights = weights)
   x <- weights * residuals(reg)
   ret <- ret / sum(abs(x))

   # Keep rescaling ret until the bounded solution has sum(abs(x)) = 1
   repeat{
      x <- bounded.lm(ret, load, weights, upper, lower)
      if(abs(sum(abs(x)) - 1) < prec)
         break
      ret <- ret / sum(abs(x))
   }
   return(x)
}

bounded.lm <- function(ret, load, weights, upper, lower, tol = 1e-6)
{
   # Step from z toward x, stopping at the first bound that would be crossed
   calc.bounds <- function(z, x)
   {
      q <- x - z
      p <- rep(NA, length(x))
      pp <- pmin(x, upper)
      pm <- pmax(x, lower)
      p[q > 0] <- pp[q > 0]
      p[q < 0] <- pm[q < 0]
      t <- (p - z)/q
      t <- min(t, na.rm = TRUE)
      z <- z + t * q
      return(z)
   }

   # A single loadings column may be passed as a plain vector
   if(!is.matrix(load))
      load <- matrix(load, length(load), 1)

   n <- nrow(load)
   k <- ncol(load)
   ret <- matrix(ret, n, 1)
   upper <- matrix(upper, n, 1)
   lower <- matrix(lower, n, 1)
   z <- diag(weights)
   w.load <- z %*% load
   w.ret <- z %*% ret

   J <- rep(TRUE, n)
   Jp <- rep(FALSE, n)   # elements at the upper bound
   Jm <- rep(FALSE, n)   # elements at the lower bound
   z <- rep(0, n)

   repeat{
      # Jt: elements not currently at a bound
      Jt <- J & !Jp & !Jm
      y <- t(w.load[Jt, ]) %*% ret[Jt, ]
      # Add the contributions of the elements pinned at their bounds
      if(sum(Jp) > 1)
         y <- y + t(load[Jp, ]) %*% upper[Jp, ]
      else if(sum(Jp) == 1)
         y <- y + upper[Jp, ] * matrix(load[Jp, ], k, 1)
      if(sum(Jm) > 1)
         y <- y + t(load[Jm, ]) %*% lower[Jm, ]
      else if(sum(Jm) == 1)
         y <- y + lower[Jm, ] * matrix(load[Jm, ], k, 1)
      # Drop loadings columns that vanish on Jt
      if(k > 1)
         take <- colSums(abs(load[Jt, ])) > 0
      else
         take <- TRUE
      Q <- t(load[Jt, take]) %*% w.load[Jt, take]
      Q <- solve(Q)
      v <- Q %*% y[take]
      xJp <- Jp
      xJm <- Jm
      x <- w.ret - w.load[, take] %*% v
      x[Jp, ] <- upper[Jp, ]
      x[Jm, ] <- lower[Jm, ]
      z <- calc.bounds(z, x)
      Jp <- abs(z - upper) < tol
      Jm <- abs(z - lower) < tol
      # Stop once the sets of bounded elements no longer change
      if(all(Jp == xJp) & all(Jm == xJm))
         break
   }
   return(z)
}
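The opening lines of calc.bounded.lm rely on two properties of the weighted regression: the weighted residuals are orthogonal to every column of load, and they scale linearly with ret, so dividing ret by sum(abs(x)) normalizes them. The following is a minimal sketch that checks both properties on made-up data; n and the random ret, load and weights are illustrative assumptions and are not taken from the paper.

```r
# Illustrative data only: n, ret, load and weights are assumptions
set.seed(1)
n <- 20
ret <- rnorm(n)                        # hypothetical expected returns
load <- cbind(rep(1, n), rnorm(n))     # intercept plus one made-up factor
weights <- 1 / runif(n, 0.5, 2)^2      # hypothetical inverse-variance weights

# Same first steps as in calc.bounded.lm
reg <- lm(ret ~ -1 + load, weights = weights)
x <- weights * residuals(reg)          # weighted residuals

# Property 1: weighted residuals are orthogonal to each column of load
ortho <- as.vector(t(load) %*% x)

# Property 2: after rescaling ret, the weighted residuals have unit gross size
ret.scaled <- ret / sum(abs(x))
reg.scaled <- lm(ret.scaled ~ -1 + load, weights = weights)
x.scaled <- weights * residuals(reg.scaled)
gross <- sum(abs(x.scaled))

print(ortho)
print(gross)
```

Under these assumptions ortho vanishes up to rounding and gross equals 1 up to rounding, which is the starting point that the bounded iteration then adjusts.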
B. Disclaimers
- 1. Here “alpha”, following common trader lingo, generally means any reasonable “expected return” that one may wish to trade on and is not necessarily the same as the “academic” alpha. In practice, detailed information about how alphas are constructed is often unavailable, e.g., the only data available could be the position data, so “alpha” then is a set of instructions to achieve certain stock holdings by certain times.
- 2. For a partial list of hedge fund literature, see, e.g., [4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23] and the references therein. For a partial list of portfolio optimization and related literature, see, e.g., [24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58] and the references therein.
- 3. One approach to rectify this is to add a turnover-based factor to the loadings matrix [1].
- 4. For a recent discussion, see [59].
- 5. Actually, this assumes that there are no N/As in any of the alpha time series. If some or all alpha time series contain N/As in a non-uniform manner and the correlation matrix is computed by omitting such pair-wise N/As, then the resulting correlation matrix may have negative eigenvalues that are not zeros distorted by computational rounding.
- 6. Here, the turnover (over a given period, e.g., daily turnover) is defined as the ratio of total dollars (long plus short) traded by the alpha labeled by i over the corresponding total dollar holdings (long plus short).
- 7. By capacity for a given alpha, we mean the value of the investment level for which the P&L is maximized (considering nonlinear effects of impact).
- 8. Since the regression we consider here is weighted with the regression weights , this already controls exposure to alpha volatility, so imposing bounds based on volatility would make a difference only if one wishes to further suppress volatile alphas.
- 9. The regression limit of optimization essentially amounts to the limit , , , where is the specific (idiosyncratic) risk in the factor model with the factor loadings matrix identified with the regression loadings matrix (and the factor covariance matrix becomes immaterial in the regression limit); see [2] for details.
- 10. This is the case if the columns of the loadings matrix consist of the first K principal components of SCM corresponding to its positive eigenvalues. However, as mentioned above, here, we keep the loadings matrix general.
- 11. Various generalizations are possible, some more straightforward than others.
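
The point of footnote 5 can be illustrated with a small hand-built example. The three series below are hypothetical and are chosen so that each pair overlaps on a different set of rows; cor with use = "pairwise.complete.obs" then gives pairwise correlations of 1, 1 and -1, a combination that no N/A-free data set could produce.

```r
# Hypothetical series with N/As placed non-uniformly (illustration only)
x <- c(1, 2, 3, NA, NA, NA, 1, 2, 3)
y <- c(1, 2, 3, 1, 2, 3, NA, NA, NA)
z <- c(NA, NA, NA, 1, 2, 3, 3, 2, 1)
m <- cbind(x, y, z)

# Each correlation is computed only on the rows where both series are present
cor.pw <- cor(m, use = "pairwise.complete.obs")
ev.pw <- eigen(cor.pw, symmetric = TRUE)$values

# For comparison: a correlation matrix computed from N/A-free random data
set.seed(1)
full <- matrix(rnorm(27), 9, 3)
ev.full <- eigen(cor(full), symmetric = TRUE)$values

print(round(cor.pw, 6))
print(min(ev.pw))     # clearly negative, not a rounding artifact
print(min(ev.full))   # non-negative up to rounding
```

In this sketch the smallest eigenvalue of the pairwise matrix is far from zero, so it cannot be attributed to computational rounding, whereas the matrix built from complete data stays positive semi-definite.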
