Market Timing with Moving Averages

Consider using the simple moving average (MA) rule of Gartley to determine when to buy stocks, and when to sell them and switch to the risk-free rate. How is performance affected if the frequency of the observations used in the MA calculations is changed? The empirical results show that, on average, the lower the frequency, the higher the average daily returns, even though volatility is virtually unchanged as the frequency falls. Across all frequencies, from the highest to the lowest, volatility is about 30% lower than the volatility of the buy-and-hold strategy, while the average returns approach the buy-and-hold returns as the frequency falls. A 30% reduction in volatility is also what appears if we invest randomly, half the time in the stock market and half in the risk-free rate.


Introduction
According to the standard investing separation theorem of Tobin [1], investors allocate investments between risk-free and risky assets. If the risk-free rate is low (high), the investors shift their wealth to (from) the risky assets. Fama [2] divided forecasters into two categories, namely macro forecasters (or market timers) and micro forecasters (or security analysts), who try to forecast individual stock returns relative to the market returns.
Merton [3] defined a market timer as an investor who forecasts when stocks will outperform (underperform) the risk-free asset, indicating that, when r m t > r

Literature Review
In efficient markets, investors earn above-average returns only by taking above-average risks (Malkiel [16]). Samuelson [17] concurred with Fama [2] by noting that market efficiency can be divided into micro and macro efficiency. The former concerns the relative pricing of individual stocks, and the latter concerns the pricing of the market as a whole. The CAPM of Sharpe [5] and Lintner [6] argues that beta is a proper measure of systematic risk for stock i, if unexplained changes in the risk-adjusted returns of the stock approximately follow a normal distribution with zero mean.
Black [18] stated that the slope of the security market line (SML) is flatter if there exist restrictions in borrowing, that is, leverage constraints in the model. Starting from Black et al. [19], many studies have reported that the security market line is too flat in US stocks compared with the SML suggested by the CAPM version of Sharpe and Lintner.
Ang et al. [10], Baker et al. [20], and Frazzini and Pedersen [21] found that low-beta stocks outperform high-beta stocks to a statistically significant degree. In fact, Frazzini and Pedersen reported that significant excess profits in US stocks can be achieved by shorting high-beta stocks and buying low-beta stocks with leverage, but that leverage constraints make them disappear. Following Black [18], investors often face leverage constraints, which leads them to place too much weight on risky stocks, and results in a lower required return for high-beta stocks than would be justified by the Sharpe-Lintner CAPM.
Markowitz [8] defined portfolio risk simply as the volatility of portfolio returns. Clarke et al. [22] found that the volatility of stock returns potentially contains a risk factor in addition to the systematic risk captured by the betas of the Sharpe-Lintner CAPM. Moreover, Ang et al. [10] reported that the total volatility of international stock market returns is highly correlated with US stock returns, thereby suggesting a common risk factor for US stocks.
Baker et al. [9] suggested that the low-volatility anomaly is due to irrational investor behavior, mainly because an average fund manager seeks to beat the buy-and-hold strategy by overinvesting in high-beta stocks. The behavioral explanations include preference for lotteries (Barberis and Huang [23]; Kumar [24]; Bali et al. [25]), overconfidence (Ben-David et al. [26]), and representativeness (Daniel and Titman [27]), which means that people assess the probability of a state of the world based on how typical of that state the evidence seems to be (Kahneman and Tversky [28]).
Baker et al. [9] argued that the anomaly is also related to the limits of arbitrage (see also Baker and Wurgler [29]). In fact, the extra costs of shorting prevent investors from taking advantage of overpricing (Hong and Sraer [30]). More importantly, Li et al. [31] reported that the excess returns of low-beta portfolios are due to mispricing in US stocks, indicating that the low-volatility anomaly does not arise from systematic risk in the form of some rational, stock-specific volatility risk factor. They tested the low-volatility anomaly with monthly data from January 1963 to December 2011 in NYSE, NASDAQ, and AMEX stocks.
Market timing is closely related to technical trading rules. Brown and Jennings [32] showed theoretically that using past prices (e.g., the MA rule of Gartley [33]) has value for investors, if equilibrium prices are not fully revealing, and signals from past prices have some forecasting qualities. More importantly, Zhu and Zhou [7] indicated that the MA rules are particularly useful for asset allocation purposes among risk averse investors, when markets are forecastable (quality of signal).
Moskowitz et al. [34] argued that there are significant time series momentum (TSM) effects in financial markets that are not related to the cross-sectional momentum effect (Jegadeesh and Titman [35]). However, TSM is closely related to MA rules, since it gives a buy (sell) signal according to some historical price reference points, whereas MA rules give a buy (sell) signal when the current price moves above (below) the historical average calculated over the chosen rolling window.
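The difference between the two kinds of signal can be made concrete with a toy price series. The sketch below is illustrative only: the function names `ma_signal` and `tsm_signal`, the inclusion of the current price in the MA window, and the use of a single past price as the TSM reference point are assumptions, not definitions taken from the cited papers.

```python
def ma_signal(prices, window):
    """Buy (True) if the latest price is above the average of the last `window` prices."""
    recent = prices[-window:]
    return prices[-1] > sum(recent) / len(recent)


def tsm_signal(prices, lookback):
    """Buy (True) if the latest price is above the price `lookback` periods ago.

    Using one past price as the reference point is an assumption for illustration.
    """
    return prices[-1] > prices[-1 - lookback]


# Toy series: a fall followed by a partial recovery.
toy_prices = [100, 90, 80, 85, 95]
ma_buy = ma_signal(toy_prices, window=4)       # compares 95 with the mean of 90, 80, 85, 95
tsm_buy = tsm_signal(toy_prices, lookback=4)   # compares 95 with 100
```

In this toy series the MA rule already signals a buy while the TSM rule does not, because the average reacts to the recovery before the price regains its earlier reference level.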
Starting from LeRoy [36] and Lucas [37], the literature in financial economics states that financial market returns in efficient markets are partly forecastable when investors are risk averse. This leads to the time-varying risk premia of investors, as noted by Fama [12]. For example, Campbell and Cochrane [38] presented a consumption-based model, which indicates that when the markets are in recession (boom), risk-averse investors require a larger (smaller) risk premium for risky assets. More importantly, Cochrane [11] noted that the forecastability of excess returns may lead to successful market timing rules.
Brock et al. [39] tested different MA lag rules for US stock markets, and found that they gain profits compared with holding cash. On the other hand, Sullivan et al. [40] found that MA rules do not outperform the buy-and-hold strategy if transaction costs are accounted for. Allen and Karjalainen [41] used a genetic algorithm to develop the best ex-ante technical trading rule model using US data, and found some evidence of outperforming the buy-and-hold strategy. Lo et al. [42] found that risk-averse investors benefit from technical trading rules because they reduce the volatility of the portfolio without giving up much return when compared against the buy-and-hold strategy.
More recently, Neely et al. [43] used monthly data from January 1951 to December 2011, and reported that MA rules forecast the risk premia in US stock markets to a statistically significant degree. Marshall et al. [44] found that MA rules give an earlier signal than TSM, suggesting better returns for MA rules, but that both work best outside large market value stocks.
Moskowitz et al. [34] used monthly data from January 1965 to December 2009, and reported that TSM provides significant positive excess returns in futures markets. However, Kim et al. [45] reported that these positive excess returns produced by TSM are due to the volatility scaling factor used by Moskowitz et al.

Model Specification
Consider an overlapping generations economy with a continuum of young and old investors on $[0, 1]$. A young risk-averse investor $j$ invests their initial wealth, $w_t^j$, in infinitely lived risky assets $i = 1, 2, \ldots, I$, and in risk-free assets that produce the risk-free rate of return, $r_f$. A risky asset $i$ pays dividend $D_t^i$, and has $x_i^s$ shares outstanding. Assuming exogenous processes throughout, the aggregate dividend is $D_t$.
A young investor $j$ maximizes their utility from old-age consumption through the optimal allocation of initial resources, $w_t^j$, between risky and risk-free assets. In the maximization problem, $E_t$ is the expectations operator, $P_t$ is the price of one share of aggregate stock, $\nu_j$ is a constant risk-aversion parameter for investor $j$, $\sigma^2$ is the variance of returns for the aggregate stock, and $x_t^j$ is the demand for risky assets by investor $j$. The first-order condition results in the optimal demand for the risky assets, given by Equation (1). Suppose that an investor $j$ is a macro forecaster who allocates their initial wealth, $w_t^j$, between risky stocks and risk-free assets according to their forecast about the return of the risky alternative. Then, Equation (1) says that the investor invests in the risky stocks only if the numerator on the right-hand side is positive.
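The displayed equations for this step are not reproduced in the text. A minimal sketch of the standard CARA-normal derivation that is consistent with the surrounding definitions is given below. The exponential utility form, the budget constraint, and the reading of $\sigma^2$ as the conditional variance of the payoff $P_{t+1} + D_{t+1}$ are assumptions for illustration, not the authors' stated equations.

```latex
\begin{align*}
\max_{x_t^j}\;& E_t\!\left[-e^{-\nu_j w_{t+1}^j}\right]
\quad \text{s.t.} \quad
w_{t+1}^j = x_t^j\,(P_{t+1} + D_{t+1}) + \left(w_t^j - x_t^j P_t\right)(1 + r_f), \\[4pt]
0 &= E_t\!\left(P_{t+1} + D_{t+1}\right) - (1 + r_f)\,P_t - \nu_j\,\sigma^2\,x_t^j, \\[4pt]
x_t^j &= \frac{E_t\!\left(P_{t+1} + D_{t+1}\right) - (1 + r_f)\,P_t}{\nu_j\,\sigma^2}.
\end{align*}
```

Under these assumptions, the last line plays the role of Equation (1): the demand for risky assets is positive only when the expected payoff exceeds what the same outlay would earn at the risk-free rate, which is the numerator condition described in the text.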

Empirical Analysis
This section presents the empirical results for seven frequencies of the trend-chasing MA rules. The data consist of 29 companies included in the Dow Jones Industrial Average (DJIA) index in January 2018. The trading data (daily closing prices) cover 30 years from 1 January 1988 to 31 December 2017. Choosing the current DJIA companies for the last 30 years creates a "survivor bias" in the buy-and-hold results. However, this should not be an issue, as we intend to compare the performance of the alternative MA frequency rules.
The rolling window is 200 trading days. The first rule calculates the MA on every trading day; the second takes into account every 5th trading day (a proxy for the weekly rule); the third, every 22nd trading day (a proxy for the monthly rule); the fourth, every 44th trading day (a proxy for every other month); the fifth, every 66th trading day (a proxy for every third month); the sixth, every 88th trading day (a proxy for every fourth month); and the seventh, every 100th trading day (a proxy for every fifth month).
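As a rough illustration of the seven frequencies, the sketch below averages every k-th closing price inside the 200-day window. The function name `sampled_ma`, the use of NumPy, and the choice to count backwards from the most recent close are assumptions; the text does not state which observations inside the window are picked. The `STEPS` mapping pairs the rule labels used later in the text with the sampling steps listed above.

```python
import numpy as np

# Rule label -> sampling step in trading days (labels as used later in the text).
STEPS = {"MA200": 1, "MAW40": 5, "MA10": 22, "MAD5": 44,
         "MAT4": 66, "MAQ3": 88, "MAC2": 100}


def sampled_ma(prices, step, window=200, n_obs=None):
    """Average of every `step`-th closing price within the last `window` prices.

    Sampling starts at the most recent close and moves backwards (an assumption).
    `n_obs` optionally keeps only the most recent `n_obs` sampled prices,
    e.g. step=1, n_obs=180 for a 180-day daily rule.
    """
    prices = np.asarray(prices, dtype=float)
    if prices.size < window:
        raise ValueError("need at least `window` prices")
    recent = prices[-window:]          # last `window` closes, oldest first
    picked = recent[::-1][::step]      # newest first, then every `step`-th close
    if n_obs is not None:
        picked = picked[:n_obs]
    return picked.mean()


# Number of observations each rule uses inside a 200-day window.
obs_per_rule = {name: np.arange(200)[::step].size for name, step in STEPS.items()}
```

With this sampling, the seven steps use 200, 40, 10, 5, 4, 3, and 2 observations, which lines up with the numbers embedded in the rule labels.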
The trading rule in all cases is a simple crossover rule. When the trend-chasing MA turns lower (higher) than the current daily closing price, we invest in the stock (three-month US Treasury bills) at the closing price of the next trading day. Thus, the trading rule provides a market timing strategy in which we invest all wealth either in stocks (separately, in every stock included in the DJIA) or in the risk-free asset (the three-month US Treasury bill), with the moving average rule advising the timing.
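The timing of the crossover rule can be sketched as follows: a signal observed at one close is traded at the next close, so it earns the return over the day after that. The function name `timing_returns`, the constant daily risk-free return, and the omission of transaction costs are simplifying assumptions for illustration only.

```python
import numpy as np


def timing_returns(prices, ma, rf_daily=0.0):
    """Daily log returns of the all-in/all-out crossover rule.

    prices, ma : equal-length arrays; ma[s] is the moving average known at close s.
    rf_daily   : constant daily risk-free log return (a simplifying assumption).
    Transaction costs are ignored in this sketch.
    """
    prices = np.asarray(prices, dtype=float)
    ma = np.asarray(ma, dtype=float)
    in_stock = ma < prices                 # signal observed at close s
    stock_ret = np.diff(np.log(prices))    # stock_ret[i] = ln(P[i+1] / P[i])
    # Signal at s -> trade at close s+1 -> earns the return from s+1 to s+2,
    # i.e. stock_ret[s+1]; align stock_ret[1:] with the signals in_stock[:-2].
    held = in_stock[:-2]
    return np.where(held, stock_ret[1:], rf_daily)


example = timing_returns([100, 110, 121, 110, 100],
                         [105, 100, 100, 120, 120],
                         rf_daily=0.0001)
```

In the example, the first signal is "out", so the first return is the risk-free return; the next two signals are "in", so the strategy earns the stock's log return on the following days.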
At the first frequency (every trading day), we calculate daily returns for MA200, MA180, MA160, MA140, MA120, MA100, MA80, MA60, and MA40. For example, MA200 is the simple average of the 200 most recent daily closing prices, $X_{t-1} = \frac{1}{200}\sum_{k=1}^{200} P_{t-k}$. At the lowest frequency, where every 100th daily observation is counted, MAC2 is the average of the two closing prices in the window that lie 100 trading days apart. If $X_{t-1} < P_{t-1}$, we buy the stock at the closing price, $P_t$, and earn the daily return on the stock from that close onward.
Tables A1-A7 show that, as the frequency decreases down to the every-fourth-month frequency (MAQ3−MAQ2), average returns tend to increase, and decrease thereafter. In comparison, the biased buy-and-hold strategy produces +0.117 annually with equal weights among all DJIA stocks, with 0.295 annual volatility. A random investment (half the time in the risk-free rate, and half in the equally weighted portfolio from 4 January 1988) produces (0.117 × 0.5 + 0.022 × 0.5) = +0.070 annually, on average, with a (1 − √0.5) = 0.293, or 29.3%, reduction in volatility, indicating 0.209 annual volatility for that portfolio.
The data exclude dividends, but the average annual dividend yield of the DJIA stocks over the last thirty years has been +0.026, so that the biased buy-and-hold strategy produces +0.143 annually with equal weights among DJIA stocks before taxes. Thus, the random investment strategy, which holds stocks half the time and so earns half of the dividend yield (+0.013), produces +0.083 annually, with survivor bias.
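The benchmark arithmetic above can be checked in a few lines. All inputs are the figures quoted in the text; reading 0.022 as the average annual risk-free return is an assumption based on its role in the random-investment formula, and the variable names are illustrative.

```python
import math

buy_hold_return = 0.117    # annual buy-and-hold return before dividends
risk_free_return = 0.022   # second term in the text's formula (assumed risk-free return)
buy_hold_vol = 0.295       # annual buy-and-hold volatility
dividend_yield = 0.026     # average annual dividend yield

# Random timing: in stocks half the time, in the risk-free asset otherwise.
random_return = 0.5 * buy_hold_return + 0.5 * risk_free_return
vol_reduction = 1 - math.sqrt(0.5)            # share of volatility removed
random_vol = buy_hold_vol * math.sqrt(0.5)    # volatility when invested half the time

# Dividends: full yield for buy-and-hold, half of it for random timing.
buy_hold_with_div = buy_hold_return + dividend_yield
random_with_div = random_return + 0.5 * dividend_yield

print(f"{random_return:.4f} {vol_reduction:.4f} {random_vol:.4f}")
print(f"{buy_hold_with_div:.4f} {random_with_div:.4f}")
```

The unrounded random-timing return is 0.0695, which the text reports as +0.070; adding +0.013 to the rounded figure gives the +0.083 quoted for the dividend-inclusive case.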
Appendix A (namely the second column of Tables A1-A7) also reports the annualized average log returns calculated in the largest sample (the full 200 observations) in every category: MA200 +0.065; MAW40 +0.073; MA10 +0.079; MAD5 +0.083; MAT4 +0.089; MAQ3 +0.091; and MAC2 +0.088 after transaction costs and before dividends. Adding +0.013 gives the returns after dividends and before taxes: MA200 +0.078; MAW40 +0.086; MA10 +0.092; MAD5 +0.096; MAT4 +0.102; MAQ3 +0.104; and MAC2 +0.101. These results imply that, starting from the every-fifth-trading-day frequency, a macro forecaster beats the random investment strategy (+0.083) in returns. Figure 1 illustrates the effects of frequency on the return to volatility ratio (the second column in Appendix A, Tables A1-A7).
Figure 2. The straight line again presents the return to volatility ratio of portfolios with random investment in the risk-free rate and the stocks in the DJIA between 4 January 1988 and 31 December 2017. The red crosses plot the average return to volatility ratios, calculated by using a 200-day rolling window, with the following frequencies: daily, every five days, every 22 days, every 44 days, every 66 days, every 88 days, and every 100 days. The averages of every lag are reported in Tables A1-A14, Appendices A and B. Thus, all daily returns from Tables A1-A14 are included.
Comparing Figures 1 and 2, it is clear that using the whole 200-day observation window in the MA rules produces more efficient results in market timing. That is, comparing shorter and longer MA rolling windows, e.g., when the monthly rule uses only the last two monthly observations instead of all ten, average realized returns drop from +0.079 to +0.059 before dividends, while volatility remains approximately unchanged (from 0.211 to 0.207). This suggests that, in both cases, wealth is invested about half the time in the equally weighted DJIA portfolio and half in the risk-free rate, and the MA rules advise the timing. More importantly, Tables A8-A14 in Appendix B show that volatilities across all MA rules vary between 0.202 and 0.227 (a difference of 0.025), whereas Tables A1-A7 in Appendix A show that realized returns vary between 0.033 and 0.096 before dividends (a difference of 0.063).
These results indicate that macro market timing with a 200-day rolling window reduces volatility from 0.295 (buy-and-hold) to between 0.207 and 0.218, while the average annualized returns (dividends included) tend to rise as the MA frequency falls (from +0.078 with all 200 observations to +0.104 with every-fourth-month observations). Thus, the results indicate that MA market timing finds long-term stochastic trends more efficiently than short-term stochastic trends.
The Sharpe ratio of random market timing (half and half) with dividends is 0.292; for MA200 it is 0.271; for MAW40, 0.308; for MA10, 0.332; for MAD5, 0.347; for MAT4, 0.370; for MAQ3, 0.381; and for MAC2, 0.362. Figure 3 shows that when the volatility changes by 1% in the DJIA stocks, the average returns change by 0.39%. Figures 1 and 2 suggest that the theoretical change should be such that, when the volatility changes by 1%, the average returns change by 0.50%, which points to a flatter SML in the data. This suggests strongly that DJIA investors have overweighted high-beta stocks in the last 30 years.
Allen and Karjalainen [41] gave reasons for using a cost of 0.2% per transaction in their sample, but since technological progress has reduced transaction costs since the mid-1990s, 0.1% per transaction should be fair, on average. Nevertheless, a trial with 0.2% transaction costs shows that, for example, the average annualized daily returns become 0.0403 for the MA200−MA40 rules, and 0.0674 for the MA10−MA2 rules. Note that the returns grow 67%, on average, for the MA10−MA2 rules (with about the same volatility) compared with costs of 0.1% per transaction. Note also that the model prohibits short selling, since we only take long positions in stocks or invest in the risk-free rate. Thus, the limits of arbitrage argument of Baker et al. [9] is consistent with our results.
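The reported Sharpe ratios can be cross-checked against the returns and volatilities quoted earlier. The sketch below assumes an annual risk-free return of 0.022 (the value in the random-investment formula) and the usual definition of the Sharpe ratio as excess return over volatility. The implied volatilities are back-of-the-envelope values derived here, not figures reported by the authors.

```python
RISK_FREE = 0.022  # assumed annual risk-free return


def sharpe(annual_return, annual_vol, rf=RISK_FREE):
    """Excess return per unit of volatility."""
    return (annual_return - rf) / annual_vol


# (return after dividends, reported Sharpe ratio), as quoted in the text.
reported = {
    "MA200": (0.078, 0.271),
    "MAW40": (0.086, 0.308),
    "MA10": (0.092, 0.332),
    "MAD5": (0.096, 0.347),
    "MAT4": (0.102, 0.370),
    "MAQ3": (0.104, 0.381),
    "MAC2": (0.101, 0.362),
}

# Random timing: +0.083 return with dividends and 0.209 volatility.
random_sharpe = sharpe(0.083, 0.209)

# Volatility implied by each reported return and Sharpe ratio.
implied_vol = {name: (ret - RISK_FREE) / s for name, (ret, s) in reported.items()}

print(round(random_sharpe, 3))
for name, vol in implied_vol.items():
    print(name, round(vol, 3))
```

Under this assumption, the random-timing ratio rounds to the reported 0.292, and the implied volatilities fall in the 0.207 to 0.218 range quoted for the 200-day window rules.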

Concluding Remarks
The analysis suggests that a macro forecaster can obtain higher returns with equal volatility (30% below that of the buy-and-hold strategy) by reducing the frequency used in MA rules. The return to volatility ratio for risk-averse investors with MA market timing significantly outperforms the random benchmark strategy, when the frequency in the MA rules is reduced. This indicates that the forecasts become more accurate as the time frame becomes longer.
The results suggest that a flatter SML in the CAPM can be explained by the irrational preference of investors for high-beta stocks, as suggested by Baker et al. (2011) and Li et al. (2016), since the empirically efficient frontier of portfolios is flatter than the theoretically efficient SML (random timing) (see Figure 1). In other words, the empirical results suggest that market timing with few past observations (for example, every fourth month) from the 200-day rolling window of daily prices has produced a significantly better return to risk ratio for the equally weighted portfolio of DJIA stocks over the past 30 years than random timing. The finding points to the low-volatility anomaly.
One explanation for the results is that they are due to time-varying risk premiums. This is emphasized by Neely et al. (2014), who claimed that MA rules, in effect, forecast changes in the risk premium. If the results are rational products of time-varying risk premiums, they suggest that investor sensitivity to risk must be extremely high, and that the risk premium is larger (smaller) in downturns (upturns), as suggested by Campbell and Cochrane (1999). As volatility usually rises in downturns and falls in upturns, the results suggest that, when volatility is high, investors as a group tolerate significantly more risk (that is, volatility) than in calmer periods.
Consider the following numerical example. Assume that the risk premium is 0.08 in volatile downturns and 0.04 in calm upturns, and that the variance of returns is 0.09 in downturns and 0.03 in upturns. Then, the risk aversion coefficient must be 0.89 in volatile down periods, and 1.33 in calm up periods. As market timing with MA rules works better in longer periods with few observations, it seems to be more accurate in longer stochastic (up or down) trends.
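The two coefficients in the numerical example follow from dividing the risk premium by the variance, which is what a mean-variance demand function gives when the investor holds one unit of the risky asset. The function name and the unit-holding normalization below are assumptions for illustration.

```python
def implied_risk_aversion(risk_premium, variance, holding=1.0):
    """Risk aversion that makes `holding` the optimal demand:
    holding = risk_premium / (risk_aversion * variance)."""
    return risk_premium / (variance * holding)


down = implied_risk_aversion(0.08, 0.09)   # volatile down periods
up = implied_risk_aversion(0.04, 0.03)     # calm up periods

print(round(down, 2), round(up, 2))
```

The lower coefficient in down periods is the sense in which investors, as a group, tolerate more risk when volatility is high.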

Author Contributions:
This paper is to be attributed in equal parts to the authors.