There is an important fundamental concept, which can be stated as decoupling of time horizons (or, equivalently, frequencies or wavelengths). In a nutshell, what happens at time horizon $T_1$ (or frequency $f_1$) is not affected by what happens at time horizon $T_2$ (or frequency $f_2$) if $T_1$ and $T_2$ are vastly different. By time horizon we mean the relevant time scales. E.g., the time horizon for a daily close-to-close return is 1 day. In terms of returns, the decoupling can be restated as the returns for long-term horizons being essentially uncorrelated with the returns for short-term horizons.
3.1. Short v. Long Time Horizons
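Here is a simple argument for a single stock (or security). Consider a time interval from time $t_0$ to time $t_M$. Let us divide it into $M$ intervals $[t_{s-1}, t_s]$. For simplicity, we can assume that these intervals are uniform, $t_s - t_{s-1} = \Delta t$, $s = 1, \dots, M$, albeit this is not critical here. Let the stock prices at times $t_s$ be $P(t_s)$. Let us define the return from time $t$ to time $t'$ as
$$R(t, t') = \ln\left(\frac{P(t')}{P(t)}\right)$$
Then we have
$$R(t_0, t_M) = \sum_{s=1}^{M} R(t_{s-1}, t_s)$$
Now let us ask the following question. How correlated is the return $R(t_{M-1}, t_M)$ for the most recent period (i.e., $t_{M-1}$ to $t_M$) with the return $R(t_0, t_M)$ for the entire period (i.e., $t_0$ to $t_M$)? To define a “correlation”, we need multiple observations. So, we hang yet another index onto our returns, call it $\alpha$, where $\alpha = 1, \dots, p$ labels different periods $t^\alpha_0$ to $t^\alpha_M$, each consisting of $M$ periods ($t^\alpha_{s-1}$ to $t^\alpha_s$, $s = 1, \dots, M$). We wish to compute the correlation between the return for the last such period, $R_\alpha(M)$, and the return for the entire such period, $\widetilde{R}_\alpha$; it is $\alpha$ (not $s$) that labels the series (which is a time series) over which the correlation is computed. For simplicity, we can assume that the periods labeled by $\alpha$ are “tightly packed”, i.e., $t^{\alpha+1}_0 = t^\alpha_M$, albeit this is not crucial here. We then have $pM + 1$ time points $t^\alpha_s$ and, consequently, $pM$ returns $R_\alpha(s)$, $s = 1, \dots, M$, $\alpha = 1, \dots, p$, where
$$R_\alpha(s) = R(t^\alpha_{s-1}, t^\alpha_s), \qquad \widetilde{R}_\alpha = R(t^\alpha_0, t^\alpha_M) = \sum_{s=1}^{M} R_\alpha(s)$$
Note that $p$ can be much larger than $M$; in fact, we will assume this to be the case.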
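With the covariance $\mathrm{Cov}(\cdot, \cdot)$ defined as above and computed serially over $\alpha$, for the last-period return $R_\alpha(M)$ and the full-period return $\widetilde{R}_\alpha = \sum_{s=1}^{M} R_\alpha(s)$ we have
$$\mathrm{Cov}\big(R_\alpha(M), \widetilde{R}_\alpha\big) = \sum_{s=1}^{M} C_{Ms}, \qquad \mathrm{Var}\big(\widetilde{R}_\alpha\big) = \sum_{s,s'=1}^{M} C_{ss'}$$
where
$$C_{ss'} = \mathrm{Cov}\big(R_\alpha(s), R_\alpha(s')\big) \approx \sigma^2\, \Psi_{ss'}$$
and we have used the fact that $p \gg M$, which implies that all $M$ variances $C_{ss}$ are approximately the same. We then have
$$\rho = \mathrm{Cor}\big(R_\alpha(M), \widetilde{R}_\alpha\big) \approx \frac{\sum_{s=1}^{M} \Psi_{Ms}}{\sqrt{\sum_{s,s'=1}^{M} \Psi_{ss'}}}$$
where $\Psi_{ss'}$ is the $M \times M$ correlation matrix of returns $R_\alpha(s)$, and $\sigma^2 = C_{ss}$. Since $\Psi_{ss} = 1$, we have
$$\rho \approx \frac{1 + \sum_{s=1}^{M-1} \Psi_{Ms}}{\sqrt{M + 2 \sum_{s<s'} \Psi_{ss'}}}$$
Because we have $p \gg M$, the matrix $\Psi_{ss'}$ is approximately “self-similar” in the sense that all $m \times m$ sub-matrices of $\Psi_{ss'}$ with $s, s' = k, \dots, k + m - 1$, $k = 1, \dots, M - m + 1$ and $m = 1, \dots, M - 1$ (i.e., for each $m$ there are $M - m + 1$ such sub-matrices) are approximately the same. Put differently, $\Psi_{ss'}$ approximately depend only on the difference $s - s'$, and, in fact, only on $|s - s'|$ since $\Psi_{ss'}$ is symmetric. Let $\psi_s = \Psi_{1, s+1}$, $s = 0, \dots, M - 1$. Then we have $\psi_0 = 1$, $\Psi_{ss'} \approx \psi_{|s - s'|}$, and
$$\rho \approx \frac{1 + \sum_{s=1}^{M-1} \psi_s}{\sqrt{M + 2 \sum_{s=1}^{M-1} (M - s)\, \psi_s}}$$
To estimate the correlation $\rho$, we need to make some assumptions about the correlations $\psi_s$. A reasonable assumption is that the correlations $\psi_s$ decay as $s$ grows. E.g., we can assume that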
for some positive
, or, equivalently, that
,
, where
. We then have
and our correlation
reads
We also have
where we have taken into account that
.
So, if
, we have (assuming
M is large)
and for positive
the bound is even tighter.
What about
? First, note that
is still positive—for it to become negative, the argument of the square root in the denominator in Equation (
49) would become negative, which is not possible (see below). However, for negative
a priori it might appear that
need not be small as the denominator in Equation (
49) could become small when
, where
. Nonetheless, this cannot be the case if there is randomness in the returns
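$R_\alpha(s)$. Indeed, with $\sigma^2$ the (approximately common) variance of the individual returns $R_\alpha(s)$ and $\psi_s$ their correlations at lag $s$, we have
$$\mathrm{Var}\big(\widetilde{R}_\alpha\big) \approx \sigma^2 \left[M + 2 \sum_{s=1}^{M-1} (M - s)\, \psi_s\right]$$
So, the argument of the square root in the denominator of Equation (49) is (up to $\sigma^2$) the variance of the return $\widetilde{R}_\alpha$ for the period $t^\alpha_0$ to $t^\alpha_M$. If there is randomness in the returns $R_\alpha(s)$, the variance $\mathrm{Var}(\widetilde{R}_\alpha)$ should scale linearly with $t^\alpha_M - t^\alpha_0$ and, consequently, with $M$. If this variance were of order $\sigma^2$, this would imply that the returns $R_\alpha(s)$ are highly anti-correlated with each other and the entire process is highly deterministic. Put differently, there would be essentially no dispersion in this case. Under normal circumstances, where we have randomness in the returns $R_\alpha(s)$, the variance $\mathrm{Var}(\widetilde{R}_\alpha)$ should be of order $M \sigma^2$. If there are any negative correlations $\psi_s$, they are offset by other positive correlations so that the argument of the square root remains of order $M$, and we have (52).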
The upshot is—this is a generalization of our example above—that quantities with long time horizons have low correlations with quantities with short horizons. What happens, say, at milliseconds gets diluted by the time one gets to, say, month-long horizons—and this dilution is due to the cumulative effect of everything that transpires in between such vastly different time scales. Randomness plays a crucial role in this dilution. If things were deterministic, such dilution would not occur.
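As a numerical illustration of the dilution argument, the following Python sketch does two things. First, it simulates the correlation between the last sub-period return and the full-period return over many windows. Second, it evaluates the closed-form correlation that follows when the correlation between sub-period returns depends only on their lag: writing $\psi_s$ for the lag-$s$ correlation (with $\psi_0 = 1$), the correlation is $\big(\sum_{s=0}^{M-1} \psi_s\big) / \sqrt{M + 2 \sum_{s=1}^{M-1} (M - s)\, \psi_s}$. The simulation assumes i.i.d. Gaussian sub-period log returns, which is a simplification not made in the text, and the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def last_vs_total_correlation(M, p, seed=0):
    """Correlation, over p non-overlapping windows, between the return of the
    last of M sub-periods and the return of the whole window.

    Assumption (not from the text): sub-period log returns are i.i.d. Gaussian.
    """
    rng = random.Random(seed)
    last, total = [], []
    for _ in range(p):
        window = [rng.gauss(0.0, 0.01) for _ in range(M)]
        last.append(window[-1])    # return over the most recent sub-period
        total.append(sum(window))  # log returns add up over the window
    return pearson(last, total)


def rho_from_lag_correlations(psi):
    """Closed-form correlation when sub-period correlations depend only on lag.

    psi[s] is the correlation at lag s, with psi[0] == 1 and len(psi) == M.
    """
    M = len(psi)
    numerator = sum(psi)
    variance = M + 2.0 * sum((M - s) * psi[s] for s in range(1, M))
    return numerator / math.sqrt(variance)


if __name__ == "__main__":
    for M in (4, 25, 100):
        simulated = last_vs_total_correlation(M, p=5000)
        exact = rho_from_lag_correlations([1.0] + [0.0] * (M - 1))
        print(f"M={M:4d}  simulated={simulated:.3f}  uncorrelated case={exact:.3f}")
```

In the uncorrelated case the closed form reduces to $1/\sqrt{M}$, so the simulated values should shrink toward zero as $M$ grows, up to sampling noise.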
3.2. Implication for Risk Factors
A practical implication of the above discussion is that care is needed in choosing which risk factors to use in RM, depending on the time horizons of the strategies for which RM is used. If these horizons are short, then risk factors such as value and growth, whose underlying fundamental data updates quarterly, should not be used, as they add no value in short-holding (a few days, overnight, intraday, etc.) strategies. Here is a simple argument. Consider high frequency trading at, say, millisecond time scales. Does book value make a difference to such trading? The answer is no. What is relevant here is the market microstructure at millisecond time scales (bid, ask, bid and ask sizes, order book depth, hidden liquidity, posting orders fast on different exchanges, whether the trader’s co-location is close to the exchange connectivity hub, etc.). Whether the book value for stock XYZ is $100M or $1B does not directly affect the market microstructure at millisecond time scales.
On the other hand, quantities such as liquidity and market cap do affect market microstructure. E.g., liquidity affects typical bid/ask sizes, print sizes, etc. More precisely, liquidity computed based on, say, 20-trading-day ADDV indirectly relates to such “micro” quantities because of the expected linear scaling of volumes. I.e., even though ADDV is computed using longer horizons, it is a relevant risk factor for shorter horizon strategies precisely because of the aforementioned linear scaling of volumes, which allows an extrapolation from longer to shorter horizons.
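The extrapolation from ADDV to shorter horizons can be sketched in a few lines of Python. This is only an illustration under stated assumptions: dollar volume is taken to accrue uniformly over a 390-minute trading session (real intraday volume profiles are not flat), and the function names `addv` and `expected_dollar_volume` are hypothetical.

```python
def addv(prices, volumes, window=20):
    """Average daily dollar volume over the last `window` trading days."""
    if len(prices) != len(volumes) or len(prices) < window:
        raise ValueError("need at least `window` aligned daily observations")
    dollar = [p * v for p, v in zip(prices[-window:], volumes[-window:])]
    return sum(dollar) / window


def expected_dollar_volume(addv_value, horizon_minutes, session_minutes=390):
    """Linear-in-time extrapolation of daily dollar volume to a shorter horizon.

    Assumption: volume accrues uniformly over the session, so the expected
    dollar volume is proportional to the length of the horizon.
    """
    return addv_value * horizon_minutes / session_minutes


if __name__ == "__main__":
    daily = addv([10.0] * 20, [1000] * 20)
    print(daily, expected_dollar_volume(daily, horizon_minutes=5))
```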
Similarly, volatility is a relevant style factor. Typically, it is computed as historical volatility of, say, close-to-close returns. As an extrapolation (based on the assumption that historically more volatile stocks are also more volatile intraday), one can use this style factor for shorter horizon strategies. Preferably, one can also define a volatility style factor based on shorter horizons (e.g., intraday; see Section 2).
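For concreteness, a historical volatility based on close-to-close returns can be computed as below. This Python sketch uses log returns and the sample standard deviation, both of which are assumptions (the text does not fix either choice), and the function name is hypothetical.

```python
import math


def historical_volatility(closes):
    """Sample standard deviation of close-to-close log returns.

    `closes` is a chronologically ordered list of (adjusted) closing prices.
    """
    returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    if len(returns) < 2:
        raise ValueError("need at least three closing prices")
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(variance)


if __name__ == "__main__":
    print(historical_volatility([100.0, 110.0, 100.0, 110.0, 100.0]))
```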
So, conceptually, if the underlying quantity (e.g., book value or earnings) has a long time horizon (i.e., changes, say, quarterly), then the corresponding risk factors are not relevant for shorter horizon strategies (e.g., those involving overnight returns), unless there is a linear extrapolating argument that reasonably relates such longer term quantities to their shorter term counterparts (as, e.g., in the case of liquidity). More technically, suppose we have $K$ factors we know add value. How do we determine if a new, $(K+1)$-th, factor adds value? Here is a simple method.
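Thus, suppose we have $N$ stocks and we have FLM $\Omega_{iA}$, $i = 1, \dots, N$, $A = 1, \dots, K$. Let $\Omega'_{iA'}$, $i = 1, \dots, N$, $A' = 1, \dots, K'$, $K' = K + 1$, be new FLM once we add a new, $(K+1)$-th, risk factor. (So we have $\Omega'_{iA} = \Omega_{iA}$, $A = 1, \dots, K$.) Let $R_i$ be the returns used in our strategy, i.e., these returns have the time horizon relevant to our strategy. We can run two regressions (without intercept, unless it is included in $\Omega_{iA}$), first $R_i$ over $\Omega_{iA}$, and second $R_i$ over $\Omega'_{iA'}$. In R notations:
$$R \sim -1 + \Omega \tag{54}$$
$$R \sim -1 + \Omega' \tag{55}$$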
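In actuality, $R_i$, $\Omega_{iA}$ and $\Omega'_{iA'}$ are time series: $R_{is}$, $\Omega_{iAs}$, $\Omega'_{iA's}$, where $s$ labels the dates. We can run the above two regressions for each value of $s$ and look at, e.g., two time-series vectors of the regression F-statistic to assess if the new risk factor improves the overall F-statistic. Alternatively, we can pull the matrix $R_{is}$ into a vector $R_\sigma$ whose length equals $N$ times the number of dates (i.e., treat the index pair $(i, s)$ as a single index $\sigma$), and do the same with FLM: $\Omega_{\sigma A}$, $\Omega'_{\sigma A'}$. We can now run two regressions
$$R_\sigma \sim -1 + \Omega_{\sigma A} \tag{56}$$
$$R_\sigma \sim -1 + \Omega'_{\sigma A'} \tag{57}$$
and compare the F-statistic. If $K$ is not large, it is also informative to compare the t-values of the regression coefficients and assess the effect of the new factor.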
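The comparison described above can be sketched in plain Python as follows. This is a minimal illustration, not the code used for the tables: the helper names (`solve`, `ols_no_intercept`, `pool`) and the synthetic data are hypothetical. The F-statistic is computed from the uncentered sum of squares of the fitted values, which is what R's `summary.lm` reports for formulas without an intercept such as `R ~ -1 + Omega`; in particular, for a single-column regression the F-statistic equals the squared t-value.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_no_intercept(y, X):
    """Regress y on the columns of X without adding an intercept.

    Returns (coefficients, t_values, f_statistic).
    """
    n, k = len(X), len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    s2 = rss / (n - k)  # residual variance
    # Diagonal of (X^T X)^{-1}, obtained one unit vector at a time.
    diag = [solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])[j]
            for j in range(k)]
    t_values = [bj / math.sqrt(s2 * d) for bj, d in zip(beta, diag)]
    f_stat = (sum(fi ** 2 for fi in fitted) / k) / s2  # uncentered
    return beta, t_values, f_stat


def pool(panel):
    """Stack per-date lists of rows into one list: (i, s) -> single index."""
    return [row for date_rows in panel for row in date_rows]


if __name__ == "__main__":
    import random
    rng = random.Random(1)
    n = 500
    returns = [0.001 + rng.gauss(0.0, 0.02) for _ in range(n)]
    base = [[1.0] for _ in range(n)]                     # intercept-only FLM
    extended = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(n)]  # plus a candidate factor
    print("F (benchmark):", ols_no_intercept(returns, base)[2])
    print("F (extended): ", ols_no_intercept(returns, extended)[2])
    print("t (extended): ", ols_no_intercept(returns, extended)[1])
```

For the per-date version one would call `ols_no_intercept` once per date and collect the F-statistics into a time series; for the pooled version one would first apply `pool` to the returns and to both loadings matrices.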
For illustrative purposes, we ran such regressions for overnight returns, i.e., returns from the previous close to the open, where the open and the previous close prices are adjusted for splits and dividends. In the case of, say, book value, as a benchmark it suffices to consider a $K = 1$ model, where the sole risk factor is the intercept. Then we add the second risk factor, which is (log of) book (or tangible book, price-to-book, etc.), so $K + 1 = 2$. The regression F-statistic and t-values are given in Table 1 (for regressions (56) and (57)), which shows that the second regression (57) involving (tangible) book value does not improve the statistic over the intercept-only regression. The 1-factor regressions other than the intercept-only regression can be thought of as regressions over “betas”. The log(Prc/Book) case (see Table 1) is the closest to the intercept-only case because the regression of log(Prc) over log(Book) has F-statistic 56,230 and t-value 237.1, i.e., price and book value are highly correlated. As to the 2-factor regressions, (T)Book does not improve the statistic. It is log(Prc) that makes an impact, precisely because prices change daily.
We also ran the (56) and (57) regressions with a $K = 10$ model as a benchmark, where the risk factors are 10 BICS sectors (so $K + 1 = 11$). The results are given in Table 2, which shows that book value does not improve the regression statistic. As above, it is log(Prc) that provides improvement. We also ran the (54) and (55) regressions separately for each date (i.e., without pulling the index pair $(i, s)$ into a single index $\sigma$; see above) with the same $K = 10$ benchmark. The results are given in Table 3 and agree with those in Table 2. Log(Prc), not Book, has impact.
Table 1.
Results for regressions (56) and (57) with the intercept-only 1-factor model as the benchmark. Int = intercept; (T)Book = (tangible) book; Prc = adjusted previous close; RPrc = raw (unadjusted) previous close. Next to Int+log(Prc) we also give Int+log(RPrc) results. We do this because adjusting the previous close introduces a bias of anticipating future splits and/or dividends. However, as can be seen from the Int+log(RPrc) row, this bias is relatively mild and does not affect our conclusions. The blank entries “—” stand for N/As.
Regression/Statistic | F-Statistic | Intercept t-Value | Second Coefficient t-Value |
--- | --- | --- | --- |
Int only | 737.7 | 27.16 | — |
Book only | 237.2 | — | 15.40 |
TBook only | 191.2 | — | 13.83 |
Prc only | 1.34 | — | 1.16 |
Prc/Book only | 12.5 | — | 3.54 |
Prc/TBook only | 3.84 | — | 1.96 |
log(Book) only | 707.5 | — | 26.60 |
log(TBook) only | 583.7 | — | 24.70 |
log(Prc) only | 526.0 | — | 22.94 |
log(Prc/Book) only | 739.1 | — | –27.19 |
log(Prc/TBook) only | 608.7 | — | –24.67 |
Int+Book | 362.5 | 22.08 | 4.10 |
Int+TBook | 297.6 | 20.10 | 4.56 |
Int+(Prc/Book) | 354.3 | 26.38 | –0.66 |
Int+(Prc/TBook) | 287.2 | 23.89 | 0.15 |
Int+Prc | 368.9 | 27.14 | –0.24 |
Int+log(Book) | 354.2 | 0.98 | 0.53 |
Int+log(TBook) | 294.1 | –2.11 | 3.70 |
Int+log(Prc) | 473.9 | 20.53 | –14.48 |
Int+log(RPrc) | 468.7 | 20.18 | –14.12 |
Int+log(Prc/Book) | 394.0 | –6.99 | –8.93 |
Int+log(Prc/TBook) | 329.9 | –7.14 | –9.23 |
Finally, for the (54) and (55) regressions we computed the t-statistic of actual risk factor time series à la Fama and MacBeth [29], both for the $K = 1$ (intercept only) and $K = 10$ (BICS sectors) benchmark factor models. The results are given in Table 4 and Table 5 and agree with those in Table 1, Table 2 and Table 3.
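The t-statistic of a risk factor time series can be sketched as follows in Python. The input is the series of one factor's regression coefficients, one per date, obtained from the per-date regressions. Two points are assumptions rather than statements from the text: the daily statistic is taken to be the mean of the series divided by its sample standard deviation, and annualization uses 252 trading days per year. The function name is hypothetical.

```python
import math


def annualized_factor_t_statistic(daily_coefficients, periods_per_year=252):
    """Annualized t-statistic of one risk factor's coefficient time series.

    Assumptions: daily statistic = mean / sample standard deviation, and the
    annualization factor is sqrt(periods_per_year) with a 252-day year.
    """
    n = len(daily_coefficients)
    if n < 2:
        raise ValueError("need at least two dates")
    mean = sum(daily_coefficients) / n
    variance = sum((c - mean) ** 2 for c in daily_coefficients) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


if __name__ == "__main__":
    print(annualized_factor_t_statistic([0.01, 0.03, -0.02, 0.015, 0.0]))
```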
Table 2.
Results for regressions (56) and (57) with the BICS-sector 10-factor model as the benchmark. S = 10 BICS sectors labeled by S1(30), S2(63), S3(45), S4(30), S5(91), S6(75), S7(42), S8(48), S9(41) and S10(28) (the parentheticals show the number of tickers in each sector); X = the 11th factor (P, B, P/B, log(P), log(B) and log(P/B)); P = adjusted previous close; B = book; F = F-statistic; t = t-value. E.g., in the “Reg:” line “S+(P/B)” means that the returns R are regressed over FLM Ω containing 11 columns corresponding to the 10 sectors S1 through S10 plus the 11th factor X, which is (P/B) in this case. In the S+log(P) column we also give the values when P is taken to be the raw (unadjusted) previous close. We do this because adjusting the previous close introduces a bias of anticipating future splits and/or dividends. However, as can be seen from the S+log(P) column, this bias is relatively mild and does not affect our conclusions.
Reg: | S | S+P | S+B | S+(P/B) | S+log(P) | S+log(B) | S+log(P/B) |
--- | --- | --- | --- | --- | --- | --- | --- |
F | 80.6 | 73.3 | 72.4 | 71.3 | 92.6/91.7 | 71.3 | 77.8 |
t:S1 | 6.40 | 6.40 | 6.16 | 6.40 | 15.07/14.80 | 1.56 | –6.42 |
t:S2 | 6.67 | 6.67 | 5.94 | 6.50 | 15.74/15.44 | 1.19 | –7.08 |
t:S3 | 5.57 | 5.57 | 5.02 | 5.60 | 15.08/14.76 | 1.56 | –7.04 |
t:S4 | 5.40 | 5.40 | 5.08 | 5.41 | 14.13/13.87 | 1.32 | –6.79 |
t:S5 | 13.31 | 13.28 | 11.12 | 13.36 | 19.26/18.93 | 1.80 | –6.38 |
t:S6 | 13.13 | 13.13 | 12.26 | 12.50 | 19.29/18.99 | 1.98 | –6.09 |
t:S7 | 4.97 | 4.97 | 3.25 | 3.74 | 14.40/14.15 | 0.89 | –7.37 |
t:S8 | 6.85 | 6.85 | 6.42 | 6.88 | 15.88/15.58 | 1.35 | –6.80 |
t:S9 | 12.83 | 12.84 | 11.90 | 12.92 | 19.40/19.12 | 2.49 | –5.39 |
t:S10 | 8.63 | 8.63 | 8.21 | 8.87 | 16.17/15.92 | 2.19 | –5.69 |
t:X | — | –0.42 | 3.42 | –0.45 | –14.57/–14.21 | –0.20 | –8.47 |
Table 3.
Results for regressions (54) and (55) with the BICS-sector 10-factor model as the benchmark. The notations are the same as in Table 2, except that F = median F-statistic, and t = median t-value, where F-statistic and t-values are computed based on regressions (54) and (55) for each date, and the median is computed serially over all dates. The meaning of double entries in the S+log(P) column is the same as in Table 2.
Reg: | S | S+P | S+B | S+(P/B) | S+log(P) | S+log(B) | S+log(P/B) |
--- | --- | --- | --- | --- | --- | --- | --- |
F | 13.5 | 12.2 | 12.0 | 12.0 | 12.3/12.3 | 12.1 | 12.2 |
t:S1 | 0.40 | 0.40 | 0.40 | 0.43 | 0.67/0.68 | 0.11 | –0.18 |
t:S2 | 0.61 | 0.61 | 0.62 | 0.63 | 0.85/0.83 | 0.11 | –0.16 |
t:S3 | 0.34 | 0.33 | 0.32 | 0.32 | 0.69/0.63 | 0.11 | –0.20 |
t:S4 | 0.23 | 0.23 | 0.23 | 0.25 | 0.62/0.56 | 0.10 | –0.21 |
t:S5 | 0.91 | 0.91 | 0.76 | 0.88 | 0.90/0.85 | 0.15 | –0.19 |
t:S6 | 0.80 | 0.80 | 0.82 | 0.91 | 0.96/0.92 | 0.14 | –0.14 |
t:S7 | 0.39 | 0.39 | 0.24 | 0.25 | 0.70/0.64 | 0.10 | –0.20 |
t:S8 | 0.54 | 0.54 | 0.57 | 0.57 | 0.76/0.74 | 0.08 | –0.19 |
t:S9 | 0.76 | 0.76 | 0.77 | 0.80 | 1.00/0.98 | 0.18 | –0.12 |
t:S10 | 0.53 | 0.53 | 0.52 | 0.55 | 0.84/0.88 | 0.14 | –0.14 |
t:X | — | –0.02 | 0.13 | –0.04 | –0.55/–0.52 | –0.03 | –0.30 |
Table 4.
Results for regressions (54) and (55) with the intercept-only 1-factor model as the benchmark. The notations are the same as in Table 1, except that the t-statistic here refers to the t-statistic of the corresponding risk factor time series à la Fama and MacBeth [29]. These t-statistics are annualized, i.e., we compute the daily t-statistic and then multiply it by the square root of the number of trading days in a year.
Regression/Statistic | Intercept t-Statistic | Second Coefficient t-Statistic |
--- | --- | --- |
Int only | 0.90 | — |
Int+Book | 0.82 | 2.21 |
Int+(Prc/Book) | 0.90 | –0.69 |
Int+Prc | 0.90 | –0.42 |
Int+log(Book) | 0.23 | 0.32 |
Int+log(Prc) | 1.90 | –3.50 |
Int+log(RPrc) | 1.78 | –3.10 |
Int+log(Prc/Book) | –2.15 | –2.90 |
Table 5.
Results for regressions (54) and (55) with the BICS-sector 10-factor model as the benchmark. The notations are the same as in Table 2, except that “t:✭” refers to the annualized t-statistic of the corresponding risk factor “✭” time series à la Fama and MacBeth [29], same as in Table 4. The meaning of double entries in the S+log(P) column is the same as in Table 2.
Reg: | S | S+P | S+B | S+(P/B) | S+log(P) | S+log(B) | S+log(P/B) |
--- | --- | --- | --- | --- | --- | --- | --- |
t:S1 | 0.76 | 0.76 | 0.74 | 0.78 | 1.76/1.64 | 0.39 | –2.02 |
t:S2 | 0.59 | 0.59 | 0.54 | 0.60 | 1.61/1.50 | 0.28 | –2.24 |
t:S3 | 0.76 | 0.76 | 0.68 | 0.77 | 2.07/1.91 | 0.30 | –2.42 |
t:S4 | 1.09 | 1.09 | 1.01 | 1.10 | 2.32/2.14 | 0.37 | –2.48 |
t:S5 | 0.88 | 0.88 | 0.77 | 0.88 | 1.75/1.65 | 0.45 | –2.02 |
t:S6 | 1.07 | 1.07 | 1.02 | 1.05 | 2.01/1.88 | 0.51 | –1.86 |
t:S7 | 0.83 | 0.83 | 0.56 | 0.66 | 2.14/1.99 | 0.22 | –2.66 |
t:S8 | 0.64 | 0.64 | 0.60 | 0.65 | 1.72/1.59 | 0.32 | –2.07 |
t:S9 | 1.09 | 1.09 | 1.02 | 1.09 | 1.87/1.77 | 0.61 | –1.51 |
t:S10 | 1.17 | 1.17 | 1.13 | 1.23 | 2.04/1.92 | 0.58 | –1.78 |
t:X | — | –0.56 | 2.15 | –0.61 | –3.45/–3.05 | 0.02 | –3.12 |