Persistence in the Realized Betas: Some Evidence from the Stock Market

Abstract: This paper examines the stochastic behaviour of the realized betas in the CAPM for the ten largest companies in terms of market capitalisation included in the U.S. Dow Jones stock market index. Fractional integration methods are applied to estimate their degree of persistence at daily, weekly, and monthly frequencies over the period July 2000–July 2020, using time spans of 1, 3, and 5 years. On the whole, the results indicate that the realized betas are highly persistent and do not exhibit mean-reverting behaviour at the weekly and daily frequencies, whilst there is some evidence of weak mean reversion at the monthly frequency. Our findings confirm the sensitivity of beta estimates to the choice of frequency and time span (the number of observations).


Introduction
The capital asset pricing model (CAPM), initially introduced in the 1960s, is based on the idea that systematic risk is determined by the covariance between the market and individual stock returns. It is still the standard framework taught in finance courses and used by risk-averse investors to select optimal portfolios. Its general description can be found in Sharpe (1964), and there is further discussion in Treynor (1961), Lintner (1965), and Mossin (1966). Fama and MacBeth (1973) made use of this model to analyse the relationship between risk and return in NYSE stocks and documented a positive linkage between the average return and the market beta over the period 1926-1968; however, Fama and French (1992) found that this linear relationship had disappeared in a later sample period.
The CAPM has several limitations and is based on rather restrictive assumptions (see Fernandez 2015, 2019); for instance, it requires investors to have homogeneous expectations (of the returns, volatility, and correlations for every security over the same time horizon). In its standard formulation, which is the one mostly used by practitioners, it is a linear regression whose most important parameter to be estimated is beta, which measures the risk arising from the exposure to market-wide as opposed to idiosyncratic factors. Polls are instead used to predict the market risk and the yield curve for the expected return of the risk-free asset.
Betas are normally predicted using historical data on the assumption that their future behaviour will be similar. Out of 150 finance textbooks surveyed, 80 recommend an estimation method, though they differ in terms of the frequency (daily, weekly, monthly, or annual) and the time span (from 6 months to 25 years) to be used for this purpose. As in Campbell et al. (1997), we found that the most common estimation approach (in 64% of the cases) is based on monthly data over a 5-year period. However, more recently, higher frequency data have often been used, as developments in IT have made computations easier. Table 1 summarises our findings concerning the frequency and the number of observations (the time span) chosen for estimating the realized betas in the textbooks reviewed. It can be seen that, as the frequency increases, the selected time span decreases (on average, monthly estimates are based on a span of 6 years, weekly ones on a span of 3 years, and daily ones on a span of 1.5 years), whilst the number of observations used for the analysis increases.
Among the more recent studies focusing on higher frequency data, Andersen et al. (2003) and Bollerslev et al. (2009) analysed intraday trading using 15-min samples. Damodaran, on his public portal for beta estimation, selected different time periods (5 years and 2 years with weekly returns). Papageorgiou et al. (2016) analysed daily returns over a 1-year period and showed that these estimates outperform those obtained using monthly data over a 5-year period as in Fama and MacBeth (1973). Cenesizoglu et al. (2016) evaluated the accuracy of one-month-ahead beta forecasts (at the monthly, daily, and 30-min frequency) and found that low (high) frequency returns produce the least (most) accurate estimates. Sharma (2016) analysed the conditional variance of various stock indices over 14 years. Bollerslev et al. (2016) investigated how individual stock prices respond to market price movements and jumps using data at the 5-min intraday frequency with 1-year samples and found evidence that the betas associated with intraday discontinuous and overnight returns entail significant risk premiums, while the intraday continuous betas do not. Cenesizoglu and Reeves (2018) used a realized beta estimator based on daily returns over the previous year for holding periods of 1, 3, and 6 months to explain momentum effects.
Thus, an appropriate estimation period and sampling frequency are clearly crucial for obtaining accurate beta forecasts. An important issue is the possibility of time variation in the betas (Andersen et al. 2003), which is not considered by the standard CAPM. Multifactor pricing models including additional empirically motivated factors, such as firm size and book-to-market ratios (Fama and French 1993), have been shown to have a better in-sample fit and to produce more accurate out-of-sample predictions but are often criticised because of the difficulty in interpreting the expanded set of variables in terms of systematic risk.
An interesting question for practitioners is how persistent the realized betas are in the medium and long term. Andersen et al. (2005) applied fractional integration methods to the data for 25 Dow Jones Industrial Average (DJIA) stocks over the period 1962-1999 at intraday frequencies and concluded that the corresponding betas are not very persistent and are best modelled as I(0) mean-reverting processes. The present paper uses a similar modelling framework but computes beta coefficients at the daily, weekly, and monthly frequency over the period July 2000-July 2020 using medium- and long-term window spans of 1, 3, and 5 years for the 10 largest stocks in terms of capitalisation included in the Dow Jones Index. The layout of the paper is as follows: Section 2 provides a brief literature review; Section 3 outlines the fractional integration model used for the analysis; Section 4 describes the data; Section 5 discusses the empirical results; and Section 6 offers some concluding remarks.

Literature Review
To analyse the persistence and long memory properties of the realized CAPM betas, beta time series with different spans and sampling frequencies will be examined. The literature review therefore focuses both on time-varying betas and on long memory studies based on fractional integration, and is accordingly divided into these two subsections.

Time-Varying Betas
A first group of studies is based on the idea that the betas may vary with the conditioning variables, which leads to the concept of the "conditional CAPM", and therefore focuses on time-varying betas. This approach was introduced by Dybvig and Ross (1985a, 1985b). Fama and French (1992) pointed out the inability of the static CAPM to explain the cross section of average returns; more specifically, the robustness of the size effect and the absence of a relationship between the beta and average returns are inconsistent with the CAPM. Fama and French (1993) examined the common risk factors in the returns on stocks and bonds, namely, the factors related to markets, firm size, and the book-to-market ratio. Ferson et al. (1987) developed tests of empirical asset-pricing models allowing market betas to change over time. Ferson and Harvey (1991) analysed the predictable components of monthly common stock and bond portfolio returns. Jagannathan and Wang (1996) argued in favour of time-varying betas on the grounds that the relative risk of a firm's cash flow is likely to change with the business cycle. Wang (2003) used a non-parametric approach to incorporate the conditioning information. Ang and Chen (2007) proposed a conditional CAPM with time-varying betas and market risk premia.
In the last decade, additional factors have been considered. Garleanu and Pedersen (2011) introduced the margin-CAPM model, where high-margin assets require higher returns. Ang and Kristensen (2011) estimated time-varying betas with non-parametric techniques, proposing a conditional CAPM and multifactor models for book-to-market and momentum decile portfolios. Engle and Rangel (2010) and Rangel and Engle (2012) provided evidence that models with volatility and correlation components outperform single-component models. Patton and Verardo (2012) studied the information flow and its impact on the betas, and found that these increase on announcement days by a statistically significant amount. Buss and Vilkov (2012) used forward-looking information from option prices to estimate option-implied correlations. Boubaker and Sghaier (2013) analysed portfolio optimization in the presence of financial returns with long memory. Frazzini and Pedersen (2014) presented a model with leverage and margin constraints that vary across investors and over time. Jayasinghe et al. (2014) estimated the time-varying conditional variance of index returns and found evidence of mean reversion and long memory in the betas. More recently, Fama and French (2015) extended the standard CAPM to a five-factor specification including size, value, profitability, and investment factors. Bali et al. (2017) re-examined the static CAPM of Fama and French (1993) and provided evidence that it has no predictive power; they then introduced a conditional CAPM with a time-varying beta and showed that there is a significant link between the dynamic conditional beta and future stock returns.
Other recent studies have proposed alternative beta estimation methods. Lu and Murray (2017) suggested a "bear beta" model, where the time variation in the probability of future bear market states is priced. Pyun (2019) introduced a new out-of-sample forecasting method for monthly market returns using the Variance Risk Premium (VRP), defined in Bollerslev et al. (2009) as the difference between the objective and the risk-neutral expectations of the forward variance. Bai et al. (2019) proposed a general equilibrium model to quantify the performance of the consumption CAPM. Hollstein et al. (2019) focused on the link between conditional betas and high-frequency data to explain asset pricing anomalies.

Long Memory in Asset Pricing
The second approach, introduced by Bollerslev et al. (1988), focuses on long-run dependence. Following the early contribution of Robinson (1991), many subsequent studies showed the empirical relevance of long memory for asset return volatility (e.g., Ding et al. 1993). Robinson (1995) developed a formal framework for testing long-run dependence in logarithmic volatilities, whilst the FIGARCH model was introduced by Baillie et al. (1996) to analyse exchange rates and used by Bollerslev and Mikkelsen (1996) to examine the U.S. stock market. In both cases, long memory was detected, with the series being modelled as mean-reverting fractionally integrated processes, where the conditional variance decreases at a slow hyperbolic rate. Andersen and Bollerslev (1997) concluded that long memory is an intrinsic feature of returns. Bollerslev and Mikkelsen (1999) provided evidence of mean reversion in the volatility process using fractionally integrated models. Cochran and DeFina (1995) found predictable periodicity in market cycles. Bollerslev and Mikkelsen (1996) concluded that long-run dependence in the U.S. stock market is best modelled as a mean-reverting fractionally integrated process. However, Andersen and Bollerslev (1997) found that this process is very slow for most returns and, as a result, detecting mean reversion is not an easy task. Balvers et al. (2000) pointed out that, if it exists, it can only be detected over long horizons; nevertheless, investors try to discover mean-reverting patterns for forecasting purposes (Jayasinghe et al. 2014). Andersen et al. (2003) analysed the persistence and predictability of the realized betas as well as of the underlying market variances and covariances using intraday data over the period 1962-1999; the latter were found to be highly persistent, fractionally integrated processes, in contrast to the realized betas, which appear to be much less persistent and best modelled as standard stationary I(0) processes. Further, simple AR-type models were shown to outperform other parametric models in terms of their forecasting properties for the integrated volatility. Andersen et al. (2006) pointed out that, in the case of fractional cointegration, it is possible for the betas to be only weakly persistent (short memory, with d ≈ 0) despite the widespread finding that realized variances and covariances exhibit long memory (fractionally integrated, with d > 0).
Regarding the sampling frequency, Bollerslev et al. (2006) found evidence of negative correlations between stock market movements and volatility at the intraday frequency. In particular, 5-min intervals appear to provide better results than daily sampling for assessing volatility asymmetries. Todorov and Bollerslev (2007) looked for a solution to the problem of modelling jumps in the betas using high-frequency data. Morana (2009) improved the realized beta estimator introduced by Andersen et al. (2005, 2006) by allowing for multiple non-orthogonal risk factors. Bollerslev et al. (2011) explored alternative volatility measures to reduce the impact of microstructure noise. Bollerslev et al. (2012) used intraday data for the S&P 500 and the VIX volatility indices and found further evidence that aggregate stock market volatility exhibits long-run dependence, while the volatility risk premium (VRP) is much less persistent. Bollerslev et al. (2013) concluded that market volatility is best described as a long memory fractionally integrated process. Hansen et al. (2014) proposed a GARCH model incorporating realized measures of variances and covariances. Engle (2016) put forward the Dynamic Conditional Beta (DCB) model to estimate regressions with time-varying parameters.
A brief comparison between the most popular market beta estimation techniques can be found in Hollstein and Prokopczuk (2016), who examined the performance of several time-series models and option-implied estimators and suggested using the hybrid methodology of Buss and Vilkov (2012) since it consistently outperforms all other approaches.

Methodology
We analysed persistence in the realized betas by using fractional integration methods to estimate the degree of dependence in the data, which is measured by the differencing parameter d. For our purposes, we define a covariance stationary process {x_t, t = 0, ±1, ...} with mean µ and autocovariances γ_j = E[(x_t − µ)(x_{t+j} − µ)] as integrated of order 0, denoted by I(0), if the infinite sum of the autocovariances is finite:

$$\sum_{j=-\infty}^{\infty} |\gamma_j| < \infty.$$

This type of process, also known as a short memory process, includes not only white noise but also the stationary and invertible AutoRegressive Moving Average (ARMA) model, which is the most frequently employed model for a stationary time series. By contrast, a process displays the property of long memory (so named because of the relevance of observations in the distant past) if the infinite sum of its autocovariances diverges:

$$\sum_{j=-\infty}^{\infty} |\gamma_j| = \infty.$$

An alternative definition, based on the frequency domain, uses the spectral density function, f(λ), which is the Fourier transform of the autocovariances. In this context, a process is said to exhibit long memory if the spectral density function is unbounded at one or more frequencies in the spectrum:

$$f(\lambda) \to \infty \quad \text{as} \quad \lambda \to \lambda^*, \quad \text{for some} \ \lambda^* \in [0, \pi].$$

This category includes many statistical processes, such as the Fractional Gaussian Noise (FGN) model proposed by Mandelbrot and Van Ness (1968).
As for the concept of fractional integration, a process is said to be integrated of order d, denoted by I(d), if d differences are required to make it I(0), i.e.:

$$(1 - B)^d x_t = u_t, \quad t = 0, \pm 1, \ldots, \qquad (1)$$

where B is the backshift operator (B x_t = x_{t−1}), u_t is an I(0) process, and d can be any integer or fractional value. Processes with d higher than 0 are known as long memory processes because of the high degree of dependence between observations far apart in time. The polynomial in B in Equation (1) can be expressed in terms of its binomial expansion, such that:

$$(1 - B)^d = \sum_{j=0}^{\infty} \binom{d}{j} (-B)^j = 1 - dB + \frac{d(d-1)}{2} B^2 - \cdots$$

The parameter d plays a crucial role in this context, as it is a measure of the degree of persistence of the series: the higher d is, the higher the degree of dependence between observations. Following the above expansion, if the differencing parameter d is an integer, x_t depends only on a finite number of previous observations; however, if it is a non-integer, x_t depends on all of its past history. From a statistical perspective, if d is smaller than 0.5, x_t is still covariance stationary, whereas d ≥ 0.5 implies a lack of covariance stationarity. Furthermore, values of d below 1 in (1) support the hypothesis of reversion to the mean, with shocks having transitory effects. Finally, if d ≥ 1, there is a lack of mean reversion, which implies that shocks have permanent effects and that additional policy actions would be needed to restore the previous behaviour.
More specifically, d = 0 implies short memory behaviour, while 0 < d < 0.5 characterises a covariance stationary long memory process; if 0.5 ≤ d < 1, the series is non-stationary but mean-reverting with shocks having long-lasting effects that disappear in the long run; finally, d ≥ 1 implies non-stationarity and a lack of mean reversion.
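To make the binomial expansion above concrete, the sketch below (a minimal Python illustration of fractional differencing in general, not the estimation procedure used in the paper; the function names are our own) computes the weights π_j of (1 − B)^d via the standard recursion π_0 = 1, π_j = π_{j−1}(j − 1 − d)/j and applies them as a truncated filter.

```python
import numpy as np

def frac_diff_weights(d: float, n: int) -> np.ndarray:
    """First n coefficients pi_j of (1 - B)^d = sum_j pi_j B^j."""
    w = np.zeros(n)
    w[0] = 1.0
    for j in range(1, n):
        # recursion equivalent to (-1)^j * binomial(d, j)
        w[j] = w[j - 1] * (j - 1 - d) / j
    return w

def frac_diff(x: np.ndarray, d: float) -> np.ndarray:
    """Apply (1 - B)^d to the series x, truncating at the sample start."""
    w = frac_diff_weights(d, len(x))
    # y_t = sum_{j=0}^{t} pi_j * x_{t-j}
    return np.array([w[: t + 1][::-1] @ x[: t + 1] for t in range(len(x))])
```

For d = 1 the weights collapse to (1, −1, 0, ...), i.e., ordinary first differencing; applying the filter with d and then with −d recovers the original series exactly under this truncation.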
Although fractional integration was already proposed in the early 1980s by Granger (1980, 1981), Granger and Joyeux (1980), and Hosking (1981), it was not until the late 1990s and early 2000s that it became popular in economics and finance (Baillie 1996; Gil-Alana and Robinson 1997; Mayoral 2006; Gil-Alana and Moreno 2012; Abbritti et al. 2016; etc.). In particular, we estimate the differencing parameter d using the Whittle function in the frequency domain (Dahlhaus 1989) by means of a version of the long memory tests of Robinson (1994), which is computationally very attractive. Using this method, we test the null hypothesis:

$$H_0: d = d_0$$

in (1), where x_t can be the errors in a regression model of the form:

$$y_t = \beta' z_t + x_t, \quad t = 0, \pm 1, \ldots,$$

where z_t can contain either exogenous regressors or deterministic terms such as an intercept and/or a linear time trend. The test statistic proposed in Robinson (1994) has several important features. Its limiting distribution is standard normal (N(0, 1)), so we do not need to rely on critical values based on Monte Carlo simulation studies. Moreover, the test statistic and its asymptotic behaviour remain valid for any real value d_0 under the null, including non-stationary cases; thus, preliminary differencing is not required to render the series stationary prior to performing the test.
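Robinson's (1994) parametric test is not available in standard libraries, so as a rough frequency-domain illustration, the sketch below implements the closely related Gaussian semiparametric (local Whittle) estimator of Robinson (1995) instead. The bandwidth rule m = n^0.65 and the grid bounds are our own assumptions, and, unlike the Robinson (1994) test, this estimator is reliable only for moderate degrees of non-stationarity.

```python
import numpy as np

def local_whittle_d(x, m=None):
    """Gaussian semiparametric (local Whittle) estimate of the memory
    parameter d (Robinson 1995); reliable roughly for d up to 1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if m is None:
        m = int(n ** 0.65)                     # bandwidth: an assumed rule of thumb
    lam = 2 * np.pi * np.arange(1, m + 1) / n  # first m Fourier frequencies
    # periodogram of the demeaned series at those frequencies
    I = np.abs(np.fft.fft(x - x.mean())[1 : m + 1]) ** 2 / (2 * np.pi * n)
    log_lam_mean = np.mean(np.log(lam))
    # minimise the local Whittle objective R(d) over a grid
    grid = np.linspace(-0.5, 1.5, 2001)
    vals = [np.log(np.mean(lam ** (2 * d) * I)) - 2 * d * log_lam_mean
            for d in grid]
    return grid[int(np.argmin(vals))]
```

For white noise the estimate should be close to 0, and for a random walk close to 1, mirroring the I(0)/I(1) benchmarks discussed above.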

Data Description
We obtained data on daily, weekly, and monthly returns from the Reuters Eikon database for the ten companies with the highest market capitalisation included in the Dow Jones Industrial Average Index (.DJI) over the period 13 July 2000-14 July 2020; these include Apple Inc. Using the raw data, we constructed daily, weekly, and monthly realized beta series by applying the formula Covariance(Stock, Index)/Variance(Index) over 1-, 3-, and 5-year spans, thus obtaining nine beta measures for each company in order to examine their behaviour in the medium and long term. These are displayed in Figure 1.
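As a sketch of this construction (assuming pandas Series of simple returns aligned on dates; the function name and window sizes are our own, not taken from the paper), the rolling realized beta can be computed as:

```python
import pandas as pd

def realized_beta(stock: pd.Series, index: pd.Series, window: int) -> pd.Series:
    """Rolling realized beta: Cov(stock, index) / Var(index) over `window` obs."""
    cov = stock.rolling(window).cov(index)   # rolling sample covariance
    var = index.rolling(window).var()        # rolling sample variance of the index
    return cov / var
```

For daily returns, windows of roughly 252, 756, and 1260 trading days would approximate the 1-, 3-, and 5-year spans (these counts are assumptions, not figures from the paper).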
Table 2 reports some descriptive statistics (standard deviation, average, and scaled volatility, calculated as the standard deviation over the average) for the series of interest. It can be seen that this volatility coefficient is smaller at the daily frequency than at the weekly and monthly ones and increases when the window span decreases. For instance, over a 5-year span, the average volatility coefficient is equal to 0.274 for the monthly series, 0.146 for the weekly series, and 0.135 for the daily series; the latter increases to 0.163 in the case of the 3-year span and to 0.227 in the case of the 1-year span.
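The scaled volatility reported in Table 2 is simply the coefficient of variation of each beta series; a small helper (our own naming) might look like:

```python
import numpy as np

def volatility_coefficient(beta_series) -> float:
    """Scaled volatility of a realized-beta series: sample std dev / mean."""
    b = np.asarray(beta_series, dtype=float)
    return b.std(ddof=1) / b.mean()
```

For example, a beta series hovering around 1 with deviations of about 0.1 yields a coefficient near 0.1.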

Discussion of the Empirical Results
The model used for analysing the stochastic properties of the constructed series is the following:

$$y_t = \beta_0 + \beta_1 t + x_t, \quad t = 1, 2, \ldots, \qquad (5)$$

where y_t is the observed time series (the realized betas in our case), β_0 and β_1 are unknown coefficients on the intercept (constant) and the linear time trend, and x_t is I(d), where d is estimated from the data. We consider three model specifications, namely, (i) no deterministic terms, i.e., β_0 = β_1 = 0 in (5); (ii) a constant only, i.e., β_1 = 0; and (iii) both a constant and a linear trend, i.e., β_0 and β_1 are estimated. Table 3 reports the estimated values of d, along with their associated 95% confidence bands, under the assumption of white noise errors for all three models. The coefficients in bold are in each case those from the preferred model, which has been selected on the basis of the statistical significance of the other parameters, as indicated by the t-statistics. These are reported in Table 4 together with the corresponding estimates of d. Note: the specifications selected on the basis of the significance of the deterministic terms are shown in bold; the 95% confidence bands for the values of d are shown in parentheses; the terms 1y., 3y., and 5y. stand for 1-, 3-, and 5-year time spans, respectively; and the red values indicate statistical evidence of mean reversion. As can be seen, the preferred specification includes only an intercept in 93% of the cases, whilst in two cases, both over a 5-year span and at the daily frequency, it also includes a linear time trend. In four cases, all for the monthly series over a 1-year span, no deterministic terms are found to be significant. As for the estimated values of d, the unit root null cannot be rejected in most cases. In fact, the average value of d across all cases is very close to 1 (0.989).
Table 5 summarises the estimated values of d with the corresponding volatility coefficients. When using the Fama and MacBeth (1973) "standard" beta measure (based on 5 years of monthly observations), the estimates of d are smaller than 1 in all cases (0.897 on average), which implies weak mean reversion, although there are differences between companies. In general, d tends to be larger at higher frequencies and over longer time spans; in particular, for the daily data (over 1y, 3y, and 5y spans) and the weekly data (over 3y and 5y spans), the average value of d is 1.04, which implies a lack of mean reversion. By contrast, volatility is smaller at higher frequencies and over longer time spans. The bottom rows of Table 5 report the averages over the 10 stocks of both the integration parameter d and the volatility coefficient for each beta measure. Figure 2 provides a scatter diagram for these two parameters, which confirms that d tends to be larger over longer spans and at higher frequencies, whilst the opposite holds for volatility. Figure 3 provides the same type of information for each individual stock, with the same broad picture emerging. To summarise, we find evidence of non-stationary behaviour, with orders of integration equal to or higher than 1, in the daily and weekly series, whilst there is weak evidence of mean reversion (d < 1) at the monthly frequency. In other words, the parameter d is affected by the data frequency. There is also evidence that the window span (1y, 3y, or 5y) has an impact on the parameter d.

Conclusions
In this study, we examined the statistical properties of the realized betas within the framework of the CAPM, using data on the 10 largest capitalised companies in the U.S. Dow Jones index and applying fractional integration and long memory techniques over the medium and long term (1, 3, and 5 years). In particular, we estimated their degree of integration d to measure persistence.
Our results highlight the importance of the choice of frequency and time span (the number of observations) for estimation purposes. We find evidence that longer time spans and higher frequencies correspond to higher estimates of d. When using monthly sampling over a 5-year span as in Fama and MacBeth (1973), the realized betas appear to be characterised by weak mean reversion (d = 0.897 on average), which implies that shocks do not have permanent effects. However, the estimates at the daily frequency over a 5-year span yield larger values of d (1.059 on average), which implies that mean reversion does not occur in these cases. Note that, at higher frequencies, more observations are available, and thus investors might feel more confident about the corresponding estimates.
On the whole, our analysis suggests that the standard practice of estimating the betas as in Fama and MacBeth (1973), using only a 5-year sample period, is questionable given the lack of robustness of the results with regard to the choice of frequency and time span (the number of observations): at higher frequencies, volatility appears to be smaller, but higher values of d are obtained, and thus mean-reverting behaviour is not observed. Our findings also suggest that d may not be a time-invariant parameter. Future work should aim to obtain evidence for other developed stock markets to gain additional insights into the behaviour of the realized betas depending on the sampling frequency and/or window span.
Funding: Gil-Alana and Martín-Valmayor received funds for this research from the MCIN-AEI-FEDER (Government of Spain) under Grant Agreement No PID2020-113691RB-I00 and gratefully acknowledge this financial support from the 'Ministerio de Ciencia e Innovación' (MCIN), the 'Agencia Estatal de Investigación' (AEI), and the 'Fondo Europeo de Desarrollo Regional' (FEDER). All authors declare that there are no competing financial and/or non-financial interests in relation to the work described.
Institutional Review Board Statement: We declare that research articles, relevant literature, and non-research articles are cited appropriately. We have avoided untrue statements about entities (whether individual persons or companies) and descriptions of their behaviour or actions that could be seen as personal attacks or allegations. This research poses no threats to public health or national security (e.g., dual use of research). If necessary, we are prepared to send relevant documentation or data in order to verify the validity of the results presented; sensitive information in the form of confidential or proprietary data is excluded. The authors declare all of the above to ensure that third parties' rights, such as copyright and/or moral rights, are respected.
Informed Consent Statement: I declare on behalf of all authors that the manuscript has not been submitted to more than one publication for simultaneous consideration and that this work is original and has not been published elsewhere in any form or language (partially or in full). Our results are presented clearly and honestly, without fabrication, falsification, or inappropriate data manipulation, and conform to discipline-specific rules for acquiring, selecting, and processing data. No data, text, or theories by others are presented as if they were the authors' own ('plagiarism'). Proper acknowledgements of other works are given (including material that is closely copied, summarized, and/or paraphrased), quotation marks are used for verbatim copying of material, and permissions have been secured for copyrighted material.

Data Availability Statement:
The authors declare that all data supporting the findings of this study are available within the article. In particular, the calculation datasets generated and/or analysed during the current study are available from the corresponding author on request. The results, data, and figures in this manuscript have not been published elsewhere, nor are they under consideration by another publisher, and there are no hyperlinks to publicly archived datasets analysed or generated during the study except for the public information collected.

Figure 1.
Figure 1. Historical beta time series.

Figure 2.
Figure 2. Scatter diagram showing the relationship between the average fractional integration parameter d (x axis) and the volatility coefficient calculated as stdev/average (y axis) for each frequency/sample length.


Figure 3.
Figure 3. Scatter diagram showing the relationship between the fractional integration parameter d (x axis) and the volatility coefficient calculated as stdev/average (y axis) for each frequency/sample length and each stock.

Author Contributions: Conceptualization, formal analysis, investigation, and data curation, M.M.-V.; methodology, software, resources, and validation, L.A.G.-A.; writing-original draft preparation, L.A.G.-A. and M.M.-V.; writing-review, visualization, supervision, and editing, G.M.C.; project administration and funding acquisition, L.A.G.-A. and M.M.-V. All authors have read and agreed to the published version of the manuscript.

Table 1.
Estimation of the realized betas: chosen frequency and number of observations (time span) in finance textbooks.

Table 3.
Estimates of the differencing parameter d.

Table 4.
Estimates of the differencing parameter d, the constant, and the trend coefficient from the selected model specifications.

Table 5.
Estimates of the d parameter with the corresponding volatility coefficient for each beta series.