Temporal Aggregation and Long Memory for Asset Price Volatility

The effects of temporal aggregation and choice of sampling frequency are of great interest in modeling the dynamics of asset price volatility. We show how the squared low-frequency returns can be expressed in terms of the temporal aggregation of a high-frequency series. Based on the theory of temporal aggregation, we provide the link between the spectral density function of the squared low-frequency returns and that of the squared high-frequency returns. Furthermore, we analyze the properties of the spectral density function of realized volatility series, constructed from squared returns with different frequencies under temporal aggregation. Our theoretical results allow us to explain some findings reported recently and uncover new features of volatility in financial market indices. The theoretical findings are illustrated via the analysis of both low-frequency daily Standard and Poor’s 500 (S&P 500) returns from 1928 to 2011 and high-frequency 1-min S&P 500 returns from 1986 to 2007.


Introduction
Long-memory processes, especially the possibility of confusing them with structural changes, are of great interest in the field of time series. Applications of long-memory models are numerous, in particular in relation to stock return volatility in financial markets. Ding et al. (1993) argue that stock return volatility series can be well described by long-memory processes. However, it has also been shown that the estimate of the long-memory parameter, d, is biased away from 0 and the autocovariance function exhibits a slow rate of decay when a stationary short-memory process is contaminated by structural changes in level. In other words, a spurious long-memory process can arise when there are structural changes in a short-memory process. This idea extends that advanced by Perron (1989, 1990), who shows that structural changes and unit roots (d = 1) are easily confused, in the sense that the estimate of the sum of the autoregressive coefficients is biased towards 1 and that tests of the null hypothesis of a unit root are biased towards non-rejection when a stationary process is contaminated by structural changes. Relevant literature on this issue includes Diebold and Inoue (2001), Engle and Smith (1999), Gourieroux and Jasiak (2001), and Granger and Hyung (2004). Perron and Qu (2007, 2010) analyze the properties of the autocorrelation function, the periodogram, and the log-periodogram (LP) estimate of the long-memory parameter for short-memory processes with random level shifts (RLS). They show that the autocorrelation function, the periodogram, and the LP estimates for the log squared daily returns of the Standard and Poor's 500 (S&P 500) during 1928-2002 can be explained by a level shift model with a short-memory component, instead of a long-memory process without level shifts. Lu and Perron (2010) present a method to directly estimate such a random level shifts model. These results will allow us to tackle a second aim and help us explain the following puzzles.
First, the trimmed LP estimates for the log squared daily return series are very close to zero, while they are relatively large (around 0.4) for the log daily realized volatility constructed from high-frequency returns. Second, the trimmed LP estimates for the log squared daily return series are near zero, while they are very large for the log aggregated squared daily returns. Third, the trimmed LP estimates for the log realized volatility series are large using high-frequency return data, while they are close to zero for the disaggregated original high-frequency returns when a large bandwidth is used. The theoretical findings are illustrated through the analysis of both low-frequency daily S&P 500 returns from 1928 to 2011 and high-frequency 1-min S&P 500 returns from 1986 to 2007. We consider the estimation of the long-memory parameter using both the standard and the trimmed LP estimators. Overall, the results indicate that random level shifts are needed to explain the empirical features documented. For the low-frequency data on S&P 500 returns, one cannot infer whether the noise is stationary long memory. On the other hand, a long-memory process appears needed to explain the features related to high-frequency S&P 500 futures. This is in line with the findings of Varneskov and Perron (2018).
The remainder of this paper is structured as follows. Section 2 introduces the stochastic volatility model and the different aggregation processes for the realized volatility and the squared daily returns. Section 3 analyzes the properties of the long-memory parameter estimates with different aggregation levels and with random level shifts. Section 4 focuses on empirical applications of the theoretical findings to S&P500 data with different sampling frequencies. Section 5 contains brief concluding remarks and comments about potential avenues for future research. A mathematical appendix contains the technical derivations.

Alternative Volatility Measures
We first review the stochastic volatility models and the aggregation mechanisms. We then present theoretical results about temporal aggregation in the frequency domain.

Stochastic Volatility Model
It is evident that the number of observations for stock market returns during a fixed period of time is inversely related to the length of each return interval. That is, with prices fully available, a shorter interval means that returns are observed more frequently and therefore more observations can be obtained. High-frequency returns are now widely available since many financial time series are recorded on a tick-by-tick basis, which is virtually continuous. In this paper, high-frequency returns are classified according to the number of time units included in their intervals. To simplify, returns obtained every unit of time are defined as 1-period returns and, similarly, k-period returns denote returns observed every k units of time, where k is an integer larger than 1.
Let r t,n be the nth intraday log-return (i.e., r t,n = log(P t,n ) − log(P t,n−1 ), where P t,n is the price index) at day t, such that r t,n = (1/ √ s)h t,n ε t,n , where n = 1, . . . , s indexes the observations at the highest sampling frequency and t = 1, . . . , T indexes the days. The component h t,n is intended to capture the volatility level.
Assumption 1. We assume that ε t,n is i.i.d. standard normal, and h t,n and ε t,n are mutually independent. It is further assumed that the demeaned squared volatility level y t,n = h 2 t,n − E h 2 t,n is covariance stationary long-memory with integrable spectral density f y (λ).
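The return construction above can be sketched numerically. This is purely illustrative: the paper assumes h t,n is long memory, while the sketch below substitutes a highly persistent log-AR(1) volatility factor as a stand-in, and the parameter values (T, s, the 0.98 persistence) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T, s = 250, 390            # hypothetical: 250 days, 390 intraday intervals per day

# Persistent positive volatility factor h_{t,n} (AR(1) in logs as a proxy for
# the long-memory component assumed in the paper).
h = np.empty(T * s)
h[0] = 1.0
for i in range(1, T * s):
    h[i] = np.exp(0.98 * np.log(h[i - 1]) + 0.05 * rng.standard_normal())

eps = rng.standard_normal(T * s)       # i.i.d. N(0,1), independent of h
r = (1.0 / np.sqrt(s)) * h * eps       # r_{t,n} = (1/sqrt(s)) h_{t,n} eps_{t,n}
r = r.reshape(T, s)                    # row t holds the s intraday returns of day t
```

The 1/√s scaling keeps daily variance comparable across sampling frequencies, matching the model's normalization.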

Remark 1.
This assumption considers the case of y t,n being long memory, with spectral density function f y (λ) ≈ λ −2d as λ → 0.

Temporal Aggregation
According to the classification of high-frequency returns in Section 2.1, a sample of s 1-period returns contains s/k k-period returns. We assume that k is chosen such that s = kS for some integer S. Let r (k) t,p = ∑ pk n=k(p−1)+1 r t,n = ∑ k−1 j=0 L j r t,kp denote the continuously compounded k-period return, for p = 1, . . . , S and t = 1, . . . , T. Here, L is the backshift operator, applied to the second subscript, i.e., Lh t,kp = h t,kp−1 . Therefore, the k-period return r (k) t,p can be written in terms of the volatility components and an i.i.d. standard normal innovation z t,p . The squared k-period return r (k)2 t,p can then be expressed in terms of temporal aggregation, where [n/k] denotes the integer part of n/k and [·] (k) denotes the k-period non-overlapping temporal aggregation of the series x t . Note that in the case of aggregated variables we use a square bracket operator, i.e., [·] (k) . This should not be confused with [·] without a subscript (k), which denotes the integer part. Therefore, the squared daily return is the temporal aggregation of (1/s)h 2 t,n z 2 t over a day. Here, z t is i.i.d. standard normal for t = 1, . . . , T. The realized volatility constructed from the k-period returns r (k) t,p is the temporal aggregation of the squared k-period returns r (k)2 t,p .
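The non-overlapping aggregation operator [·] (k) is simple to implement; a minimal sketch (function name and sample sizes are our own, not the paper's):

```python
import numpy as np

def aggregate(x, k):
    """Non-overlapping k-period temporal aggregation [x]^(k):
    partial sums of k consecutive observations."""
    S = len(x) // k                      # requires s = k*S; any remainder is dropped
    return x[:S * k].reshape(S, k).sum(axis=1)

rng = np.random.default_rng(1)
r1 = rng.standard_normal(330)            # one day of hypothetical 1-min returns
r5 = aggregate(r1, 5)                    # 66 five-minute returns
rv_from_1min = np.sum(r1 ** 2)           # realized volatility from 1-min returns
rv_from_5min = np.sum(r5 ** 2)           # realized volatility from 5-min returns
```

Because log returns are additive, the k-period returns aggregate exactly to the daily return, while the two realized volatility measures differ only through the cross-products lost in aggregation.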

Temporal Aggregation in the Frequency Domain
Assumption 2. We assume that the demeaned squared volatility level y t,n = h 2 t,n − E h 2 t,n is covariance stationary with integrable spectral density f y (λ).

Proposition 1. Under Assumptions 1-2, the spectral density of the squared k-period returns r (k)2 t,p takes the form given in (1).

Proof. See Appendix A.
The first term on the right hand side of (1) corresponds to the spectral density of the temporal aggregation of the k-period demeaned squared volatility level, [y t,n ] (k) . The remaining two terms are constant, induced by the noise component h t , which does not carry any information about the long-memory.
Remark 2. When k = s, we obtain the spectral density of the squared daily returns, given in (2). The first term of (2) decreases, as λ → 0, when s increases; therefore, the spectral density of the demeaned squared daily volatility decreases as s increases. When s is large enough, the spectral density of the squared daily returns f r 2 t (λ) will be dominated by the second and third terms of (2), which implies that the squared daily return series is dominated by noise.
Proposition 2. The spectral density of the realized volatility obtained from k-period returns, r (k) t,p , takes the form given in (3).

Proof. See Appendix A.
Remark 3. When S = s, i.e., when the realized volatility is obtained from 1-period returns r t,n , the spectral density takes the form given in (4). From (4), as λ → 0, the three terms of the spectral density of the realized volatility decrease at the same rate when s increases. Comparing the spectral density of the squared daily returns (2) with that of the realized volatility (4), note that their first terms are identical, corresponding to the spectral density of the temporal aggregation of the demeaned squared volatility level, y t,n = h 2 t,n − E h 2 t,n , over a day. This is the only part that contains information about the long memory; the remaining terms are simply noise. Therefore, both realized volatility and squared daily returns contain the same information about long memory.
A difference between the spectral density of the squared daily returns and that of the realized volatility series occurs only in the second and third terms, which are independent of the value of λ. Furthermore, Var([y t,n ] (s) ) − S Var([h t,n ] (k) ) is positive in general because most financial series have positive autocorrelations even at large lags. Therefore, the difference between the spectral density of the squared daily returns and that of the realized volatility will be positive.

Long-Memory Parameter Estimates across Aggregation Levels
We first describe in Section 3.1 the standard log-periodogram regression, both regular and trimmed as suggested by McCloskey and Perron (2013). Then, in Section 3.2, we show the equivalence of the estimates across aggregation levels.

Log-Periodogram Regressions
A long-memory process typically has a spectral density function which is proportional to λ −2d as λ goes to zero, where d is the memory parameter. The fractionally integrated model, proposed by Granger and Joyeux (1980) and Hosking (1981), is a long-memory generalization of an ARMA model whose autocorrelations decay exponentially. When d ∈ (0, 0.5), the autocorrelations decay slowly, a characteristic of long-memory processes. Various estimators of d have been proposed, among which semiparametric estimators have become widely used as they do not require a distributional assumption on the process generating the difference of order d of the series. A popular semiparametric estimator is the LP regression estimator proposed by Geweke and Porter-Hudak (1983), which uses only frequencies near zero to avoid possible misspecification caused by high frequency movements. The LP regression estimator was analyzed by, among others, Robinson (1995).
The LP regression estimator is based on the spectral characterization of a long-memory process, f (λ) ≈ Gλ −2d as λ → 0, where f is the spectral density function of the process and G is a constant. The periodogram of the time series at λ j is defined as I x (λ j ) = w x (λ j )w x (λ j ) * , where w x (λ j ) is the discrete Fourier transform of {x t } T t=1 evaluated at the Fourier frequency λ j = 2πj/T, and c * denotes the complex conjugate of any complex number c. I x (λ j ) can be viewed as a noisy approximation to f . Therefore, the LP regression is log I x (λ j ) = c − 2d log λ j + e j , for j = l, . . . , m. When l = 1, this is the standard LP regression estimator. We can trim some of the lower frequencies, as in McCloskey and Perron (2013), to obtain consistency and asymptotic normality with the same limiting variance as the standard LP regression estimator, regardless of whether the underlying long/short-memory process is contaminated by level shifts or deterministic trends.
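The LP regression can be sketched in a few lines. This is an illustrative implementation, not the authors' code; the white-noise sanity check and all parameter values are our own.

```python
import numpy as np

def lp_estimate(x, m, l=1):
    """Log-periodogram estimate of d using Fourier frequencies j = l, ..., m.
    l = 1 gives the standard (GPH) estimator; l > 1 trims low frequencies
    as in McCloskey and Perron (2013)."""
    T = len(x)
    lam = 2 * np.pi * np.arange(1, m + 1) / T          # Fourier frequencies
    dft = np.fft.fft(x - x.mean())[1:m + 1]
    I = (np.abs(dft) ** 2) / (2 * np.pi * T)           # periodogram I_x(lambda_j)
    y = np.log(I[l - 1:])
    z = -2 * np.log(lam[l - 1:])                       # regressor -2 log(lambda_j)
    zc = z - z.mean()
    return np.sum(zc * (y - y.mean())) / np.sum(zc ** 2)   # OLS slope = d-hat

# Sanity check on white noise (true d = 0): the estimate should be near zero.
rng = np.random.default_rng(2)
x = rng.standard_normal(5000)
d_hat = lp_estimate(x, m=int(5000 ** 0.5))
```

The OLS slope on the regressor −2 log λ j recovers d directly, since the regression is log I x (λ j ) = c − 2d log λ j + error.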

Equivalence of Estimates across Aggregation Levels
Lemma 1. Under Assumption 1, the spectral densities of the aggregated series and the original squared return series satisfy f [r 2 t ] (k) (λ (k) j ) ≈ k f r 2 t (λ j ) for small Fourier frequency indices j, where λ (k) j denotes the jth Fourier frequency of the aggregated series.

Proof. See Appendix A.
Lemma 1 implies that the spectral densities of the aggregated series and the original squared return series have the same slope near frequency zero. Hence, aggregation does not change the value of the long-memory parameter, consistent with the results of Chambers (1998), Souza (2005) and Hassler (2011).
Corollary 1. The periodogram is a finite sample version of the spectral density; hence, a similar relation holds approximately, i.e., I [r 2 t ] (k) (λ (k) j ) ≈ k I r 2 t (λ j ) for small frequency indices j. A similar result was obtained for stationary long-memory series by Ohanissian et al. (2008).
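The approximate equivalence of the periodograms near frequency zero, after dividing the aggregated periodogram by k as done in the paper's Figure 1, can be checked numerically. The sketch below is illustrative: it uses a persistent AR(1) as a stand-in for the squared volatility series, and all parameter values are hypothetical.

```python
import numpy as np

def periodogram(x, m):
    """Periodogram at the first m Fourier frequencies."""
    T = len(x)
    dft = np.fft.fft(x - x.mean())[1:m + 1]
    return (np.abs(dft) ** 2) / (2 * np.pi * T)

rng = np.random.default_rng(3)
T, k = 20000, 5
x = np.empty(T)
x[0] = 0.0
for i in range(1, T):                           # persistent short-memory proxy
    x[i] = 0.95 * x[i - 1] + rng.standard_normal()

x_agg = x.reshape(T // k, k).sum(axis=1)        # k-period non-overlapping aggregation

I_orig = periodogram(x, 20)
I_agg = periodogram(x_agg, 20) / k              # divide by k, as in the text
ratio = I_agg[:5] / I_orig[:5]                  # roughly 1 near frequency zero
```

At the same small frequency index j, the aggregated series' discrete Fourier transform is nearly identical to the original's, so dividing its periodogram by k reproduces the original periodogram, which is what drives the equal LP estimates across aggregation levels.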
Remark 4. Lemma 1 and Corollary 1 do not depend on the stochastic volatility specification. The validity of the results only requires that the original process be stationary.
Remark 5. As discussed by Souza (2008), a better estimator is not necessarily generated by temporally aggregating a time series, because the same estimate can be obtained when the same bandwidths are used on the original time series, which offers a wider choice of bandwidths. That is, the original time series could provide potentially improved estimates in the sense that it allows for more flexible bandwidth selection.
Remark 6. Microstructure noise is not taken into consideration here. However, adding a microstructure noise process would not change the result in Lemma 1, because Assumption 1 is still satisfied after adding a stationary noise, and the above results hold as long as the time series is stationary.
According to Perron and Qu (2010), the autocorrelation function, the periodogram, and the LP estimate for the log squared daily returns of the S&P 500 can be explained by a simple level shift model with a short-memory component, instead of a long-memory process without level shifts. Lu and Perron (2010) estimate a random level shifts model and find that few level shifts are present, but once these are accounted for, there is no evidence for the existence of long-memory processes, in the sense that little serial correlation is found in the remaining noise. Therefore, random level shifts, which have not been included in Assumption 1, should be considered here to generalize Lemma 1. We consider the random level shift model proposed by Perron and Qu (2010), given in (6), in which level shifts occur according to the i.i.d. Bernoulli process π T,t , with success probability p/T, and have i.i.d. magnitudes η t . It is also assumed that the components π T,t and η t are mutually independent. According to Proposition 3 of Perron and Qu (2010), the limit of the expectation of the periodogram has the form given in (5).

Lemma 2. Under the data generating process (6), the relation of Lemma 1 between the k-period aggregated series and the original random level shift series continues to hold. Therefore, Remarks 4-6 still hold when random level shifts are taken into consideration.
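The spurious long-memory mechanism behind the RLS model can be illustrated with a short simulation. This is a hedged sketch of a process in the spirit of (6), not the authors' specification: the number of shifts, the noise scale, and the seed are all hypothetical choices of ours.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 5000

# Random level shift process: a handful of rare shifts plus i.i.d. white noise
# (so the noise component has no long memory whatsoever).
shift_times = rng.choice(T, size=10, replace=False)   # roughly p/T with p = 10
delta = np.zeros(T)
delta[shift_times] = rng.standard_normal(10)          # i.i.d. shift magnitudes
mu = np.cumsum(delta)                                 # level component mu_t
y = mu + 0.5 * rng.standard_normal(T)                 # observed series

# Standard log-periodogram estimate with m = T^0.5, as in the empirical section.
m = int(T ** 0.5)
lam = 2 * np.pi * np.arange(1, m + 1) / T
I = np.abs(np.fft.fft(y - y.mean())[1:m + 1]) ** 2 / (2 * np.pi * T)
z = -2 * np.log(lam)
zc = z - z.mean()
d_hat = np.sum(zc * (np.log(I) - np.log(I).mean())) / np.sum(zc ** 2)
# d_hat is typically far above zero: the level shifts mimic long memory.
```

The step-like level component contributes periodogram mass of order j^-2 at low frequencies, which the untrimmed regression reads as long memory even though the noise is white; trimming the lowest frequencies is what restores an estimate near zero.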

S&P 500 Volatility
We consider two series of returns related to the S&P 500 data, namely low-frequency and high-frequency returns. The low-frequency data consist of 22,000 daily return observations for the period from 13 August 1928 to 30 December 2011. Among these daily returns, the observations from 13 August 1928 to 30 October 2002 were kindly provided by William Schwert. The source of the data for the period 4 January 1928 through 2 July 1962 is Schwert (1990). From 3 July 1962 to 30 October 2002 it is from the CRSP daily returns file, and the returns for the time period after 30 October 2002 were obtained from the Yahoo Finance website. Because of the need to construct various aggregate measures, the effective initial date for estimation is 13 August 1928. The high-frequency data pertain to S&P 500 futures and include 1-min returns from 7 October 1986 to 2 March 2007, amounting to 5000 trading days in total. These were purchased from http://www.grainmarketresearch.com/. These futures contracts expire within one year after their inception. Specifically, contracts incepted in January, April, July, and October expire in March, June, September, and December, respectively. The cleaned version was provided by Shin Ikeda as described in Appendix A of Ikeda (2015). The span of the data was mostly dictated by data availability, though it conveniently avoids the turbulent period of the Great Recession. In order to eliminate the effect of outliers in the data, we use a logarithmic transformation of the observations. Since there are some zeros in the original high-frequency and daily data, we demean our data first, as in Deo et al. (2006). Other methods were proposed in the literature, e.g., Perron and Qu (2010) and Lu and Perron (2010), who add a small value to the squared returns.

Low Frequency Data
We start our analysis with the low frequency data, i.e., the daily data series. Table 1 shows the LP estimates for the log realized S-day return series, which is calculated by cumulating S neighboring squared daily returns that do not overlap. More specifically, S = 1, 5, 10, and 20 stands for the squared daily returns and the realized weekly (every five business days), biweekly, and monthly (every 20 business days) volatilities, respectively. The columns labelled [log r 2 t ] (S) , log(r 2 t ) and log{[r 2 t ] (S) } refer to the S-period aggregation of the logarithmic transformation of squared daily returns, the logarithmic transformation of the original squared daily returns and the logarithmic transformation of the S-period aggregated squared daily returns, respectively, with S denoting the aggregation level. Both the standard LP (SLP) and trimmed LP (TLP) estimates are presented for purposes of comparison. For each series, the standard LP estimate is computed using m = N 0.5 , while the trimmed one is constructed using (l, m) = (N 0.65 , N 0.9 ), which performs relatively well according to McCloskey and Perron (2013). In all cases, N = T/S. Note that for the original daily returns series log(r 2 t ) we consider LP estimates constructed with different bandwidths to highlight the importance of the bandwidth selection. As the results will show, the empirical estimates are very similar using either the S-period aggregation with bandwidth T/S or the original daily series with the same bandwidth T/S. This is an implication of Lemmas 1-2, so it should hold whether the process is a RLS or a pure long-memory process.

Notes: The table reports the standard log-periodogram regression estimates (SLP) and the trimmed log-periodogram regression estimates (TLP) for the degree of fractional integration in the daily S&P 500 returns data. The sample period is from 13 August 1928 to 30 December 2011. The sample size is 22,000.
The rows labelled S = 1, S = 5, S = 10 and S = 20 refer to the squared daily returns, aggregated 5-day squared daily returns, aggregated 10-day squared daily returns and aggregated 20-day squared daily returns, respectively. The columns labelled [log r 2 t ] (S) , log(r 2 t ) and log{[r 2 t ] (S) } refer to the S -periods aggregation of the logarithmic transformation of squared daily returns, the logarithmic transformation of the original squared daily returns and the logarithmic transformation of the S -periods aggregated squared daily returns, respectively. S denotes the aggregation level. The estimates are based on the bandwidth m = N 0.5 for the log-periodogram regression and (l, m) = (N 0.65 , N 0.9 ) for the trimmed log-periodogram regression. For the estimates of aggregated series, we set N = T/S.
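The bandwidth conventions of Table 1 can be sketched as follows. The helper names are our own, not the paper's, and the white-noise input in the check is purely a placeholder for the log squared return series.

```python
import numpy as np

def lp(x, l, m):
    """Log-periodogram regression of d over Fourier frequencies j = l, ..., m."""
    T = len(x)
    j = np.arange(l, m + 1)
    lam = 2 * np.pi * j / T
    I = np.abs(np.fft.fft(x - x.mean())[j]) ** 2 / (2 * np.pi * T)
    z = -2 * np.log(lam)
    zc = z - z.mean()
    return np.sum(zc * (np.log(I) - np.log(I).mean())) / np.sum(zc ** 2)

def table1_row(log_sq_returns, S):
    """SLP and TLP estimates for the S-period aggregated series
    (hypothetical helper mirroring the Table 1 conventions)."""
    T = len(log_sq_returns)
    x = log_sq_returns[:(T // S) * S].reshape(-1, S).sum(axis=1)   # [log r^2]^(S)
    N = len(x)                                                      # N = T/S
    slp = lp(x, 1, int(N ** 0.5))                                   # standard LP
    tlp = lp(x, int(N ** 0.65), int(N ** 0.9))                      # trimmed LP
    return slp, tlp
```

For each aggregation level S, the effective sample size is N = T/S, so both bandwidth rules shrink mechanically with S; this is exactly why the comparison with the original series must use the same N to be meaningful.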

Remark 7.
Our theory applies only to the cases [log r 2 t ] (S) and log(r 2 t ) but not to log{[r 2 t ] (S) }. However, unreported simulations show that all three measures behave similarly as the aggregation changes. Hence, we conjecture that it would be possible to extend our results to cover the case of log{[r 2 t ] (S) }.

Some results in Table 1 are worth noting. First, with the same bandwidth (N = T/S), the estimates for the log realized daily return series, which is the log aggregated squared daily returns, are approximately equal to, and always a bit larger than, the estimates for the log squared original returns. This finding confirms Corollary 1, i.e., the same long-memory parameter estimate is obtained for both the aggregated time series and the original series when the same bandwidths are used. Figure 1 shows the periodograms of the squared daily return series (left), the 5-period aggregation of the squared daily return series divided by 5 (middle), and the 20-period aggregation of the squared daily return series divided by 20 (right) for frequency indices up to 550. The 550th frequency index corresponds to the frequencies π/20, π/4, and π for the squared daily returns, the 5-period aggregation, and the 20-period aggregation, respectively. Note that they have almost identical patterns near frequency zero. This finding again confirms Lemmas 1-2 and implies the same long-memory parameter estimate for the aggregated and original series when the same bandwidths are used. Second, when S = 1, i.e., for the daily return series, note that the standard and trimmed LP estimates are very different. In particular, the trimmed LP estimates are close to zero when S = 1, indicating the (near) absence of long memory. On the other hand, the standard LP estimate is 0.57. However, for the realized 5-day return series, the standard and trimmed LP estimates are 0.62 and 0.31, respectively.
In addition, a feature of interest is the fact that both the standard and trimmed LP estimates increase as S increases. As shown in Remark 5, the same estimate of the long-memory parameter for the aggregated series should be obtained when using the same bandwidths as those on the original time series. Therefore, the apparent difference could simply be caused by the bandwidth selection. We actually use a relatively small bandwidth for the aggregated series, compared with the original series, because the number of observations in the aggregated series is smaller. The issue of interest is whether the documented feature is more likely to occur with a RLS model or with a pure long-memory process. As discussed in Perron and Qu (2010), the LP estimate increases as m decreases when considering the RLS plus white noise model. Hence, a larger LP estimate is expected for the aggregated series under RLS. In particular, it is expected to be greater than 0.5, i.e., in the non-stationary region. No such increase towards values in the non-stationary region is expected, as the bandwidth decreases, with a pure long-memory process. Hence, these results are more consistent with a RLS being the data-generating process.

High Frequency Data
We now consider high-frequency data, for which the unit of time is one minute. Table 2 shows the LP estimates for the log realized daily volatility constructed from k-period high-frequency data, and for the log squared original returns. Here, k = 1, 5, 30, and 330 correspond to the case of 1-min, 5-min, 30-min, and daily returns, respectively. The columns labelled [log(r (k)2 t,n )] (S) , log[r (k)2 t,n ] and log{[r (k)2 t,n ] (S) } refer to the S-period aggregation of the logarithmic transformation of squared k-min returns, the logarithmic transformation of squared k-min returns and the logarithmic transformation of the realized daily volatility aggregated from squared k-min returns over a day, respectively. S denotes the number of k-min returns per day, with s = kS. Both the standard and trimmed LP estimates are presented for purposes of comparison. For each series, the standard LP estimate is constructed using m = N 0.5 , and the trimmed one using (l, m) = (N 0.65 , N 0.9 ). For the log realized volatility series, we let N = T, so that the number of return observations equals the number of days on which prices are available. However, we let N = TS for the log squared return series, which means that the total number of return observations is equal to the product of the number of days and the number of observations in each day. For comparison purposes, we also include the estimates for the log squared original returns with N = T.

Table 2. Long-memory parameter estimates with high-frequency S&P 500 futures data. Notes: The columns labelled [log(r (k)2 t,n )] (S) , log[r (k)2 t,n ] and log{[r (k)2 t,n ] (S) } refer to the S-period aggregation of the logarithmic transformation of squared k-min returns, the logarithmic transformation of the original squared k-min returns and the logarithmic transformation of the realized daily volatility aggregated from squared k-min returns over a day, respectively. S denotes the number of k-min returns per day and s = kS.
The estimates are based on the bandwidth m = N 0.5 for the log-periodogram regression and (l, m) = (N 0.65 , N 0.9 ) for the trimmed log-periodogram regression. For the estimates of the aggregated log squared k-min returns and the realized volatility, we set N = T. We report results for both N = T × S and N = T for the estimates obtained using the log squared return series. Some interesting results in Table 2 are worth noting. First, similar to the results in Table 1, with the same bandwidth set to N = T, the estimates for the log realized volatility series, which is the aggregated squared returns, are approximately equal to the estimates for the log squared original returns. For instance, when k = 1 and S = 330, the trimmed LP estimate for the log realized volatility is 0.68 while the corresponding estimate for the log squared original returns is 0.62. Figure 2 shows the periodograms of the realized volatility obtained from the 1-min return series (left), s times the periodogram of the squared 1-min return series (middle), and the difference between them (right), with 2500 frequency indices. The 2500th frequency index corresponds to the frequencies π and π/330 for the realized volatility and the squared 1-min returns, respectively. The periodogram of the realized volatility (left) and that of the squared 1-min returns (middle) exhibit very similar values for frequency indices smaller than 1000, especially for frequency indices close to zero. The difference between the two periodograms (right) is small. These results can be explained by the fact that the realized volatility series is here the 330-period non-overlapping aggregation of the squared 1-min return series, so their periodograms exhibit approximately the same values for small frequency indices, as stated in Lemmas 1-2.

Second, for the log realized volatility series, the LP estimates decrease as k increases, i.e., they are smaller when the return interval is longer. With daily returns (k = 330, S = 1), so that the log realized volatility series is the log squared daily return series, the standard and trimmed LP estimates are 0.48 and 0.04. In particular, the trimmed LP estimate is close to zero, indicating the (near) absence of long memory, while the standard LP estimate is large, consistent with a RLS process. Of interest is the fact that for k = 1, 5, 30 all estimates are similar when the same bandwidth is used. This accords with the theoretical result that the estimates are invariant to the aggregation level. When k = 330, i.e., with daily data, the estimates are somewhat smaller. This feature can be explained by the fact that as the aggregation level increases the spectral density function is contaminated by noise, which reduces the estimate; see Remark 2.

Figure 1. The periodogram of the squared daily returns (left), the 5-period aggregation of the squared daily returns divided by 5 (middle), and the 20-period aggregation of the squared daily returns divided by 20 (right) for the daily S&P 500 returns data. The sample period is from 13 August 1928 to 30 December 2011.

Combining the equation for the squared daily returns (2) and that for the realized volatility (4), we know, as discussed in Section 3, that both the realized volatility and the squared daily returns contain the same information about long memory. However, the squared daily returns contain a larger noise component than does the realized volatility. Figure 3 shows the log squared daily return series (left), the log realized volatility obtained from the 1-min return series (middle), and the difference between them (right). We can see that the log squared daily return series (left) exhibits a larger variance than the log realized volatility series (middle).
No pattern is seen in the difference (right); it simply seems to be noise. Figure 4 shows the periodograms of the log squared daily returns (left) and the log realized volatility obtained from 1-min returns (middle), as well as the difference between them (right). Note that the periodogram of the log squared daily returns (left) is much larger than that of the log realized volatility constructed from 1-min returns (middle), except for the first few frequencies near zero. These results are consistent with Equation (5) and with the presence of RLS, given the very large values near frequency zero. For those frequencies near zero, both periodograms show very large values. Similar to what was shown in Figure 3, the difference (right) appears to be caused by a white noise process. In addition, we can see that the white noise process is dominant in the periodogram of the log squared daily return series (left). As shown in Figure 5, similar results occur for the periodograms of the log realized volatility obtained from 110-min returns (left), 30-min returns (middle) and 5-min returns (right). The periodograms exhibit smaller values when higher frequency data are used, except for the first few frequency indices near zero. For these, the periodograms are likely determined by low-frequency contamination, for example random level shifts (Perron and Qu 2010 and McCloskey and Perron 2013). In general, the values of the periodograms are much larger for frequency indices near zero, which can be explained by the fact that the impact of the random level shifts dominates that of the noise process. Third, when larger bandwidths (N = T × S) are used to estimate the memory parameter for the log squared original returns, the trimmed LP estimates are close to zero, indicating the (near) absence of long memory, while the standard LP estimates are near the non-stationary region, regardless of the length of the return intervals.
These results are consistent with those of Perron and Qu (2010), Lu and Perron (2010), McCloskey and Perron (2013) and Varneskov and Perron (2018). This is an important feature, which shows the importance of the bandwidth selection, in particular of selecting a value large enough. The use of the aggregated squared k-min returns allows much more flexibility in the possible choice of the bandwidth, so that we can always use N = TS, which was suggested by Souza (2008) as leading to improved estimates. Such choices are not possible when using the log realized volatility, so that the estimator is much more influenced by the low frequencies, thereby inducing larger estimates, regardless of whether the true process is RLS or pure long memory. Hence, we view the estimates obtained with the aggregated squared k-min returns and a large bandwidth as the most reliable, indicating the near absence of long memory. This is reinforced by the fact that these estimates are (very nearly) the same across aggregation levels, showing robustness to aggregation at any level.

Conclusions
We showed in this paper that the squared low-frequency returns can be expressed in terms of the temporal aggregation of a high-frequency series. We built a bridge between the spectral density function of squared low-frequency and high-frequency returns. Furthermore, we analyzed the properties of the spectral density function of realized volatility, constructed from squared returns with different frequencies under temporal aggregation. The theoretical findings were illustrated through the analysis of both low-frequency daily S&P 500 returns from 1928 to 2011 and high-frequency 1-min S&P 500 returns from 1986 to 2007. Overall, the results indicate that random level shifts are needed to explain the empirical features documented. For the low frequency data on S&P 500 returns, one cannot infer whether the noise is stationary long memory. On the other hand, a long-memory process appears needed to explain the features related to high frequency S&P 500 futures. The differences may be due to the difference in the asset, i.e., futures versus spot returns. It is a feature for which we have no explanation. More work is needed to better understand the differences. This is important for various aspects of financial modeling, including the pricing of options.