Journal of Risk and Financial Management (JRFM)
  • Article
  • Open Access

21 January 2026

When Tracking Error Misleads: Risk Exposure Differences Between ETFs and Their Indices

1 Department of Business and Finance, Institute of Public Administration, Riyadh 1114, Saudi Arabia
2 Department of Business and Economics, Wheaton College, Norton, MA 02766, USA
3 Department of Economics and Finance, The University of New Orleans, New Orleans, LA 70148, USA
* Authors to whom correspondence should be addressed.
This article belongs to the Section Financial Markets

Abstract

We investigate the underlying risk exposures of ETFs compared with their indices using a Principal Component Analysis approach. Then, we test whether ETFs’ tracking errors can capture the risk exposure difference between ETFs and their underlying benchmarks. We document a significant positive relation between tracking error and differences in risk exposure between ETFs and their corresponding indices. Even modest increases in tracking error are associated with economically meaningful divergences in risk exposure between an ETF and its benchmark. These findings suggest that comparisons of tracking error across index ETFs when making investment decisions may be misleading for investors seeking benchmark-consistent risk exposure.

1. Introduction

Recent years have witnessed a significant expansion in the creation and trading of Exchange-Traded Funds (ETFs). Introduced in the early 1990s, ETFs today offer individuals and institutions a diversified investment vehicle across asset types and geographical regions. By the end of 2024, total net assets in index ETFs amounted to USD 8.1 trillion, representing 23.9% of overall fund market net assets, compared with 8% at the end of 2008 (see Note 1). Index ETFs represent a form of passive investment, where managers seek to track rather than outperform a benchmark, in contrast to active management strategies that aim to generate excess returns or alpha (Lettau & Madhavan, 2018). Passive investment strategies rest on the principle of market efficiency, which suggests that investors are unlikely to consistently outperform the market over the long term. Consequently, the primary objective is to replicate overall market performance (Fraś & Rogowski, 2016). The growth of ETFs has come at the expense of traditional active mutual funds (Stambaugh, 2014). This expansion has encouraged fund managers to develop ETFs with diverse investment objectives and tracking methodologies, resulting in an extensive array of options for investors.
Index ETFs aim to replicate the performance of a specific index or benchmark, offering investors high liquidity, low transaction costs, and a transparent means to achieve broad market exposure. Therefore, the extent to which each ETF replicates its underlying index is a performance evaluation criterion for investors (Yadav & Pope, 1994; Chu, 2011). The industry standard for measuring this tracking performance is tracking error (hereafter TE), which measures the consistency with which a fund follows its benchmark (Hougan, 2015). Frino and Gallagher (2001), among others, define TE as the volatility, or standard deviation, of the difference between a portfolio’s return and its benchmark index’s return, which makes it a measure of the fund’s relative risk. However, TE measured in isolation provides limited interpretive value. It captures only the volatility of the return differential and does not indicate whether that differential is positive or negative. Investors therefore tend to compare the TE of one fund with that of another. The lower the TE, the more consistently the fund’s returns follow those of the benchmark over time, and, as a rule of thumb, a TE closer to zero is better than one farther from zero. This can be shown easily, both mathematically and empirically. However, we question how close to zero a TE must be to be considered close, and how far from zero is too far. TE can only partially explain the performance of ETFs, as some degree of TE is unavoidable and necessary (Dorocáková, 2017). Not all index ETFs perfectly replicate their benchmarks, because some indices are more difficult to track accurately. Replicating a benchmark involves costs, and in certain situations, legal and regulatory restrictions may prevent an ETF from fully replicating its index (Madhavan, 2016). Additionally, the expense ratio and transaction costs are significant factors that affect TE (Frino & Gallagher, 2001; Saunders, 2018).
These drawbacks of TE, among others, motivate this study.
We relate two independent strands of literature to propose a novel approach for assessing ETFs’ performance: fund performance evaluation and factor models. In standard asset pricing frameworks, expected returns are determined by exposure to systematic risk factors; otherwise, rational investors would hold only riskless assets (Ang, 2014). Under the CAPM, the asset’s covariance with the market portfolio drives expected returns, while subsequent work extends this insight to multiple common risk factors (Chen et al., 1986; Fama & French, 2015, among others). In the context of delegated portfolio management, Carhart (1997) shows that fund performance reflects exposure to these common factors as well as implementation costs such as expenses and transaction costs. Applied to index-tracking ETFs, this framework implies that funds that replicate their benchmarks less efficiently should exhibit systematic deviations in risk exposure and, consequently, different risk premia than those that closely track their indices. Kim et al. (2023) document that ETFs with different replication strategies have systematically different tracking errors. Accordingly, we hypothesize that ETFs with low tracking error should display factor exposures similar to those of their underlying indices, whereas higher tracking error may reflect meaningful divergence in underlying risk.
This raises two key considerations for investors: how to compare TEs across ETFs, and what threshold of TE deviation should reasonably be regarded as excessive. We primarily aim to investigate the underlying risk exposures of ETFs compared with their indices. Then, we further investigate whether TEs can capture this risk exposure difference. Finally, we highlight the implications of TE variability for ETFs’ returns relative to their benchmarks.
We employ two samples, daily and weekly, of US equity ETFs. Both samples span the period from January 2013 to December 2018. We begin by computing the TE for each ETF. The tracking errors tend to be low overall, with annualized means of around 0.0332 (daily sample) and 0.0224 (weekly sample). Then, for each year, we run a Principal Component Analysis (PCA) on index returns to assess the exposure of ETFs and indices to variations in index returns. We relate the risk exposure of each ETF to that of its underlying index by computing the R2 difference, which represents the risk exposure difference between each ETF and its index. We find that a high percentage of the variation in ETF and index returns can indeed be explained cross-sectionally by the systemic risk of indices in each year. Then, we sort the ETFs in each year according to their TEs and report the absolute R2 differences. We show that, for both samples, ETFs with the highest TEs exhibit greater variations (gaps) in their R2 differences. The pattern persists through each year of our analysis. Thus, we establish that, not surprisingly, a high TE is associated with a greater difference in risk exposure between ETFs and their respective indices. What is surprising is how substantially risk exposure differences increase across ETFs whose TEs appear comparable. In our weekly sample, we find that ETFs in the fourth (highest) quartile of the yearly mean TE exhibit far higher R2 differences (21.517–66.416%) than ETFs in the third quartile of the yearly TE (0.314–0.943%). The same pattern is apparent in our daily sample. Our results suggest that ETFs with a slightly higher TE have a wider risk exposure difference relative to their benchmarks. Therefore, we argue that choosing among ETFs on the basis of differences in their TEs could expose investors to greater differences in risk exposure than the TE differences alone would suggest.
To further investigate the relationship between the TEs and risk differences, we repeat our analysis of the TE and R2 differences and record monthly variables. Then, we analyze the TEs as the dependent variable, regressed on the absolute R2 difference and control variables. We document a statistically significant relationship between the TE and the absolute R2 differences at the 1% level. These results support the premise that even slight variations in TEs can significantly impact the risk exposure of ETFs. Therefore, assessing ETF risk by comparing TE differences across ETFs is not as accurate as one would assume. This suggests that investors should exercise caution when comparing TEs across ETFs to make informed investment decisions.
The contributions of this paper are twofold. First, it advances the general literature on index fund performance evaluation, with a particular focus on ETFs. We extend the existing literature on the determinants of TE and focus on measuring TE in a way that reflects investors’ real risk exposure rather than mechanical index deviations. Second, this paper enriches the well-established literature on factor models by enhancing the understanding of their application in evaluating ETF risk and performance, challenging the conventional assumption that a lower tracking error implies a closer alignment with the benchmark in both return and risk.
The remainder of this paper is organized as follows: Section 2 provides an overview of related studies. Section 3 describes the sample and presents the TEs. In Section 4, we investigate the relationship between TEs and ETFs’ risk exposure in depth. Section 5 presents the conclusion.

3. The Samples and Tracking Errors

3.1. Data

We obtained ETF and index data from the Eikon database (DataStream). For this study, we chose ETFs that are exposed only to US markets, as the underlying risk factors tend to be more homogeneous than in a sample that includes ETFs from international markets. This increases the explanatory power of risk factors. Furthermore, for the same reason, we focused the analysis on equity ETFs. However, the study could also be applicable to non-US ETFs and all types of indexed funds. We then chose ETFs that satisfy the following criteria:
  • Indexed ETFs, where the main objective of their fund managers is to mimic a particular index;
  • Non-leveraged ETFs that mimic the underlying indices one-to-one, as leveraged ETFs aim to multiply or invert the performance of the underlying index.
The sample includes all ETFs that satisfy the criteria and whose financial data are available on DataStream. The benchmarks from DataStream were matched with the prospectuses of each ETF. The final sample includes 259 ETFs that mimic 239 indices. The data span the period from the beginning of 2013 until the end of 2018.
The choice of 2013 as the starting point for the sample was motivated by the availability of data and the fact that approximately 23% of the ETFs that meet the criteria are listed after 2013. We use daily and weekly samples. Daily data tend to be noisy (as we observe in the analysis in the coming sections). However, the choice of weekly data limits the noise. The sample includes ETFs from 11 different industries based on the Eikon database classification.
Table 1 reports the sample distribution by industry, with about one-third of ETFs concentrated in the financial sector. However, we do not expect this concentration to introduce significant bias, as our analysis includes both ETFs and their underlying indices, ensuring that the risk factors account for both. Additionally, the primary goal of indexed ETFs is to mimic their benchmark indices.
Table 1. Sample distribution by industry.

3.2. Tracking Errors and the Underlying Risk Factors

3.2.1. Tracking Error

As noted, passive managers aim to replicate an index. Therefore, if an ETF is managed “perfectly,” its return, on average, equals the dollar return of the index. Passive managers have no incentive to take additional risks to outperform the index. Therefore, large deviations of ETF returns from the index do not serve the best interests of investors.
If an ETF perfectly mirrors its underlying index, the TE should be zero. Among the methods previously discussed for calculating TE, we chose to follow the methodology of Cremers and Petajisto (2009) because it is rigorous and tends to produce more conservative results. We adapted Cremers and Petajisto’s (2009) methodology to calculate the TEs using time series regressions. We regress the return of each ETF on the return of its benchmark in each time period, and then we determine the value of the TE by calculating the standard deviation of the error term, as follows:
$$R_{ETF_{i,t}} = \alpha_{ETF} + \beta_{ETF} R_{Index_{i,t}} + \varepsilon_{ETF_{i,t}} \tag{1}$$
$$\text{Tracking Error } (TE) = \text{St.Dev}\left[\varepsilon_{ETF_{i,t}}\right] \tag{2}$$
where $R_{ETF_{i,t}}$ is the return of ETF $i$ at time $t$, and $R_{Index_{i,t}}$ is the return of index $i$ at time $t$. The tracking error is the standard deviation of the error term in Equation (1). The ETF and index returns are calculated as the first difference of the natural log of the ETF and index levels at each point in time:
$$R_{ETF_{i,t}} = \ln ETF_{i,t} - \ln ETF_{i,t-1} \tag{3}$$
$$R_{Index_{i,t}} = \ln Index_{i,t} - \ln Index_{i,t-1} \tag{4}$$
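As a minimal sketch of Equations (1) to (4), the following Python snippet computes log returns, fits the single-regressor OLS in closed form, and takes the standard deviation of the residuals. The function names, the toy price series, the use of the sample (n − 1) standard deviation, and the square-root-of-time annualization with 252 periods are all assumptions made for illustration, not details taken from the text.

```python
import math

def log_returns(prices):
    """First difference of log prices: r_t = ln(P_t) - ln(P_{t-1})."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]

def tracking_error(etf_prices, index_prices, periods_per_year=252):
    """Annualized residual std. dev. from regressing ETF returns on index returns."""
    y = log_returns(etf_prices)
    x = log_returns(index_prices)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                      # OLS slope
    alpha = mean_y - beta * mean_x        # OLS intercept
    residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
    # Residuals have zero mean because the regression includes an intercept.
    resid_sd = math.sqrt(sum(e ** 2 for e in residuals) / (n - 1))
    # Assumption: annualize by the square root of the number of periods per year.
    return resid_sd * math.sqrt(periods_per_year)

# Hypothetical price series, for illustration only.
index_prices = [100.0, 101.0, 100.5, 102.0, 103.0, 102.5]
perfect_etf = [0.5 * p for p in index_prices]          # scaled copy of the index
noisy_etf = [50.0, 50.6, 50.2, 51.1, 51.4, 51.3]       # imperfect replication

te_perfect = tracking_error(perfect_etf, index_prices)
te_noisy = tracking_error(noisy_etf, index_prices)
```

In this toy setting the scaled copy of the index has a TE of essentially zero, while the imperfectly replicating series has a strictly positive TE. For weekly data, `periods_per_year=52` would be the analogous choice under the same annualization assumption.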
Table 2 reports the sample size and description of the daily and weekly samples. We report the averages of all ETFs and indices that have the maximum number of observations in each particular year.
Table 2. Returns.
Panel (A) of Table 2 presents a description of the daily sample for ETFs and indices. We required the ETFs and indices to have the same number of observations in each year to ensure that all ETFs were exposed to the same variations in indices. The daily mean returns of ETFs and indices tend to be similar each year and have the same sign. Importantly, indices tend to have higher average returns relative to ETFs, which is expected since ETFs have higher return distortion and standard deviations, and they act as lag indicators rather than leading ones. The average annualized returns of all ETFs ranged from −0.097 in 2018 to 0.311 in 2013. The average annualized returns on all indices ranged from −0.095 in 2018 to 0.301 in 2013.
Panel (B) of Table 2 presents a description of the weekly sample for ETFs and indices. The weekly sample shows similar patterns to the daily returns. However, the weekly returns generally exhibit lower volatility than the daily returns, as reflected in their average annualized standard deviation. This reduction largely reflects the higher noise inherent in the daily returns.
Figure 1 illustrates the average daily returns (upper graph) and the weekly returns (lower graph) per year. It shows how closely ETF returns track their indices on average per year, confirming that our samples did not include non-index ETFs and that the matching between ETFs and benchmarks is fairly accurate.
Figure 1. Average Returns. These graphs show the daily (top) and weekly (bottom) average returns of ETFs and indices of our sample per year.
Then, we computed the TEs from Equation (2). Table 3 reports the summary statistics for the annualized tracking errors for both samples. Panel (A) of Table 3 presents the summary statistics for the ETF daily tracking errors. The average annualized tracking error for the daily sample is 0.0332, which ranged from 0.0276 in 2017 to 0.0372 in 2013. Panel (B) presents the summary statistics for the weekly sample. As noted for the returns, the standard deviations of the tracking errors were higher for the daily sample than for the weekly sample. For the weekly sample, the average annualized tracking error is 0.0224, which ranged from 0.0197 in 2017 to 0.0263 in 2013.
Table 3. Tracking errors.

3.2.2. Principal Component Analysis (PCA)

After computing the TEs in the previous section, we aim to determine whether TEs could accurately capture the risk exposure of ETFs relative to their benchmarks. Crucially, we want to test how consistent TE changes are at different levels of ETF risk exposure. We use Principal Component Analysis (PCA) to extract the underlying risk premiums. PCA is the most widely used method for dimensionality reduction (Pukthuanthong et al., 2018; Kozak et al., 2018; Lee et al., 2014). In this study, we extract the underlying risk factors from the indices (benchmarks) and then use risk premiums from the returns on these indices as explanatory variables for ETFs and index returns. We use the R2 from these regressions as indicators of the degree of risk exposure for ETFs and their indices over a period of one year.
The cross-sectional risk factor methodology is applied in five steps:
  • For each year, we select the indices with the maximum number of observations. Then, we form a matrix of index returns, Rt, with dimensions Rt = [n × #Ind], where n is the number of observations and #Ind is the number of indices for that year.
  • We extract the first 10 principal components from the matrix of index returns and store them in a matrix Vt with dimensions Vt = [#Ind × #Components].
  • We multiply the matrix of index returns, Rt, by the principal components matrix, Vt, to obtain the risk factor loadings matrix RLt = Rt × Vt, with dimensions [n × #Components].
  • We regress each ETF and index return on the 10 factors from RLt, as follows:
    $$R_{ETF_{i,t}} = \alpha + \beta_1 PCA_1 + \beta_2 PCA_2 + \dots + \beta_{10} PCA_{10} + \varepsilon_{ETF_{i,t}} \tag{5}$$
    $$R_{Index_{i,t}} = \alpha + \beta_1 PCA_1 + \beta_2 PCA_2 + \dots + \beta_{10} PCA_{10} + \varepsilon_{Index_{i,t}} \tag{6}$$
  • We record the R2 from each regression for each ETF and index each year.
The R2 represents how much of the ETF/index return variations are explained by the variation in the overall index returns. In other words, it represents the underlying risk exposure of ETFs and indices based on the systemic variations in index returns. We apply this methodology to both samples (daily and weekly). We report the summary statistics from the risk analysis in Table 4.
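The five steps can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions rather than the exact procedure: the index returns are demeaned before the eigendecomposition, the components are taken from the sample covariance matrix, and the data are synthetic (20 hypothetical indices driven by three common shocks). The function name `pca_factor_r2` and all parameter values are invented for the example.

```python
import numpy as np

def pca_factor_r2(index_returns, asset_returns, n_components=10):
    """R^2 of each asset regressed on the leading principal-component factors."""
    # Assumption: demean index returns before extracting components.
    R = index_returns - index_returns.mean(axis=0)
    cov = np.cov(R, rowvar=False)                    # [#Ind x #Ind]
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    V = eigvecs[:, order]                            # [#Ind x #Components]
    factors = R @ V                                  # RL = R x V, [n x #Components]
    X = np.column_stack([np.ones(len(factors)), factors])
    coef, *_ = np.linalg.lstsq(X, asset_returns, rcond=None)
    resid = asset_returns - X @ coef
    ss_res = (resid ** 2).sum(axis=0)
    ss_tot = ((asset_returns - asset_returns.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot                     # one R^2 per column

# Synthetic one-year weekly sample, for illustration only.
rng = np.random.default_rng(0)
n_obs, n_idx = 52, 20
common = rng.normal(0.0, 0.02, size=(n_obs, 3))      # three common shocks
loadings = rng.normal(1.0, 0.5, size=(3, n_idx))
index_returns = common @ loadings + rng.normal(0.0, 0.002, size=(n_obs, n_idx))
etf_returns = index_returns + rng.normal(0.0, 0.003, size=(n_obs, n_idx))

r2_index = pca_factor_r2(index_returns, index_returns)   # analogue of Equation (6)
r2_etf = pca_factor_r2(index_returns, etf_returns)       # analogue of Equation (5)
```

Because each regression includes an intercept, every in-sample R2 lies between 0 and 1, and demeaning the index returns does not change the R2 values. Note that the factors are always extracted from the indices, even when the dependent variable is an ETF return.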
Table 4. Summary statistics of R2.
Table 4 presents the average R2s for the ETFs and indices derived from Equations (5) and (6). Panel (A) reports the results for the daily sample, and Panel (B) reports the results for the weekly sample. For the daily sample, the average R2 for ETFs ranges from 80.954% in 2017 to 90.091% in 2015. For the indices, it ranges from 88.313% in 2017 to 93.802% in 2015. For the weekly sample, the average R2 for ETFs ranges from 87.858% in 2017 to 94.335% in 2014; for the indices, it ranges from 91.369% in 2017 to 95.891% in 2016.
Table 4 shows that the daily R2s for the ETFs and indices tend to be noisier than the weekly ones. Additionally, it demonstrates that a significant portion of the variation in ETF and index returns can be explained cross-sectionally by the systemic risk of indices in each particular year: the 10 PCA risk loadings explain the variation in ETF and index returns very well (with medians around 97%). Based on these findings, we believe that the choice of 10 PCA factors is sufficient. Our focus is on the difference between each ETF’s R2 and the R2 of its underlying index rather than on the level of either. Nevertheless, the high R2 values we obtain leave little room for other possible explanatory risk factors.

4. Tracking Errors and ETFs Risk Exposure

In this section, we investigate whether tracking errors can capture the risk exposure difference between ETFs and their underlying indices. For each year, we calculated the R2 differences as follows:
$$R^2_{Difference_{i,t}} = R^2_{ETF_{i,t}} - R^2_{Index_{i,t}} \tag{7}$$
$$AbsR^2_{Difference_{i,t}} = \left| R^2_{ETF_{i,t}} - R^2_{Index_{i,t}} \right| \tag{8}$$
where $R^2_{ETF_{i,t}}$ is the R2 from Equation (5) for ETF (i) in year (t), and $R^2_{Index_{i,t}}$ is the R2 from Equation (6) for index (i) in year (t). Therefore, $R^2_{Difference_{i,t}}$ shows the R2 difference between an ETF and its underlying index. A difference in R2 of zero indicates that the ETF and its underlying index are exposed to the same risk factors. As the R2 difference increases, it implies greater divergence in risk exposures between the ETF and its benchmark index. In our analysis, we examine both the directional difference (Equation (7)) and the absolute difference (Equation (8)). We focus on examining whether ETFs bear risks differently than the indices they mimic. Section 3 presents the results of the PCA and ETF tracking errors. For each year, we sort the ETFs based on the magnitude of their TEs, and we report the difference and absolute difference in R2s, as calculated using Equations (7) and (8), respectively. The following graphs show the relationship between risk exposure and TEs.
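To make Equations (7) and (8) and the sorting step concrete, a small Python sketch is given below. The tickers, TE values, and R2 values are hypothetical numbers chosen for illustration, and the dictionary layout is an assumption of the example.

```python
def r2_differences(records):
    """Add signed and absolute R^2 differences, then sort by TE (highest first)."""
    out = []
    for rec in records:
        diff = rec["r2_etf"] - rec["r2_index"]          # Equation (7)
        out.append({**rec, "r2_diff": diff, "abs_r2_diff": abs(diff)})  # Equation (8)
    return sorted(out, key=lambda r: r["te"], reverse=True)

# Hypothetical ETFs for one year; values are made up for illustration.
sample = [
    {"etf": "AAA", "te": 0.012, "r2_etf": 0.95, "r2_index": 0.96},
    {"etf": "BBB", "te": 0.150, "r2_etf": 0.70, "r2_index": 0.94},
    {"etf": "CCC", "te": 0.020, "r2_etf": 0.97, "r2_index": 0.95},
]
ranked = r2_differences(sample)
```

The signed difference keeps the direction of the gap (the hypothetical "CCC" loads more heavily on the index factors than its own benchmark does), whereas the absolute difference only records its size, which is why both are reported.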
Figure 2, Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7 present two graphs each. The x-axes of all graphs represent the ETFs for the year, sorted by the highest TE (from left to right). Each point on the x-axis represents a particular ETF from the sample. The y-axis on the bottom graph represents the ETF’s mean tracking error for each year. The y-axis on the top graph represents the absolute R2 difference for each ETF from Equation (8). All the figures reported are based on the weekly sample to minimize noise from the daily sample.
Figure 2. Weekly sample 2013.
Figure 3. Weekly sample 2014.
Figure 4. Weekly sample 2015.
Figure 5. Weekly sample 2016.
Figure 6. Weekly sample 2017.
Figure 7. Weekly sample 2018.
Figure 2 shows the relationship between risk exposure and TE for the weekly sample in 2013. It shows that ETFs with the highest tracking errors experience higher variations (gaps) in absolute R2 differences. Hence, there are different risk exposures between ETFs and their benchmarks. Importantly, even ETFs with similar TEs can be exposed to different risk levels. These variations are apparent in all years.
Figure 5 shows that some ETFs experience similar risk exposure despite having different levels of TEs. In contrast, Figure 6 shows the absolute R2 differences are widely scattered among ETFs with the highest TEs.
These patterns raise questions about how TEs respond to changes in the risk exposure of an ETF and its benchmark, which could be misleading when comparing TEs across ETFs.
To illustrate the variability in risk exposure of ETFs relative to their benchmarks, Table 5 reports the results of Equations (7) and (8) for the daily and weekly samples. For the weekly sample, we find that the absolute R2 difference, on average, ranges from 0.820% in 2014 to 3.687% in 2017, indicating considerable variation from one year to the next.
Table 5. R2 and absolute R2 differences.
To determine a possible range for the Abs R2 difference, we ran 10,000 bootstrap simulations of Equations (5) and (6) for each ETF and index. Then, we calculated the Abs R2 differences using Equations (7) and (8). Figure 8 presents the results.
Figure 8. Bootstrapped absolute R2 differences.
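One way a bootstrap of this kind could be set up is sketched below in Python with NumPy. The resampling scheme (drawing observation dates with replacement and re-estimating both regressions on each draw), the function names, the number of draws, and the synthetic inputs are all assumptions of the sketch; it is not a description of the procedure actually used to produce Figure 8.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an OLS regression of y on X with an intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def bootstrap_abs_r2_diff(factors, etf_ret, index_ret, n_boot=1000, seed=0):
    """Bootstrap distribution of |R^2_ETF - R^2_Index| for one ETF/index pair."""
    rng = np.random.default_rng(seed)
    n = len(factors)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        # Assumption: resample observation dates with replacement.
        rows = rng.integers(0, n, size=n)
        r2_etf = r_squared(factors[rows], etf_ret[rows])
        r2_idx = r_squared(factors[rows], index_ret[rows])
        draws[b] = abs(r2_etf - r2_idx)
    return draws

# Synthetic inputs, for illustration only.
rng = np.random.default_rng(1)
n_obs = 60
factors = rng.normal(0.0, 0.02, size=(n_obs, 3))
index_ret = factors @ np.array([1.0, 0.5, -0.3]) + rng.normal(0.0, 0.002, n_obs)
etf_ret = index_ret + rng.normal(0.0, 0.004, n_obs)

draws = bootstrap_abs_r2_diff(factors, etf_ret, index_ret, n_boot=200)
summary = {"mean": draws.mean(), "median": np.median(draws), "std": draws.std(ddof=1)}
```

Summaries such as the mean, median, and standard deviation of `draws` are the kind of statistics that describe a plausible range for the absolute R2 difference.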
From the simulations, we find that the absolute R2 differences have a mean of 4.808%, a median of 1.702%, and a standard deviation of 7.254. In Section 4.2, we test the impact of changes in absolute R2 differences on ETFs’ returns relative to their indices. To further investigate the relationship between R2 differences and the level of TE, we rank the TEs each year and report the third and fourth quartiles of TEs and absolute R2s each year. We report the results in Table 6.
Table 6. R2 and absolute R2 differences by quartiles.
Panel (A) of Table 6 shows that in 2013, the TE value in the third quartile is 0.0138, while it is 0.1744 in the fourth, representing an increase of 1164%. Meanwhile, the absolute R2 difference increases by about 1302%. In the other years of the sample, the differences between the third and fourth quartiles of TEs and absolute R2 differences are even greater. For the weekly sample, on average, the TE increases from the third to the fourth quartile by about 1584%, while the absolute R2 difference increases by about 6680% per year. This indicates that the gap in risk exposure grows proportionally more than the TE does, so a wide divergence in risk exposure can accompany a comparatively smaller increase in TE. Therefore, it is fair to argue that TEs may not capture the difference in risk exposure between ETFs and their underlying indices in a consistent and linear manner. This raises the question of whether the difference in absolute R2 between an ETF and its benchmark is economically significant. We address this issue in Section 4.2.
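As a worked check of the percentage increases quoted above, using only the figures reported in the text (the subscripts Q3 and Q4 are shorthand introduced here for the third and fourth quartiles):

$$\frac{TE_{Q4} - TE_{Q3}}{TE_{Q3}} = \frac{0.1744 - 0.0138}{0.0138} \approx 11.64 = 1164\%$$

Dividing the percentage increase in the absolute R2 difference by the percentage increase in TE gives a rough sense of the disproportion:

$$\frac{1302\%}{1164\%} \approx 1.12 \quad \text{(2013)}, \qquad \frac{6680\%}{1584\%} \approx 4.22 \quad \text{(weekly sample, yearly average)}$$

A ratio above one means that, between the third and fourth quartiles, the risk exposure gap grows proportionally faster than the TE itself. This ratio is only an illustrative summary and is not a statistic reported in Table 6.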

4.1. Tracking Errors and the Risk Differences

To further investigate the relationship between TEs and differences in underlying risk factors, we collect ETFs that had complete daily observations from the beginning of 2011 to the end of 2018. We extend our sample to improve the statistical power of the analysis and ensure more reliable estimates. Then, we calculate the monthly TEs for each ETF from Equation (2).
Additionally, we perform PCA for each month and record the absolute R2 difference from Equation (8) for each ETF. Following the methods of Madhavan (2016) and Chiu et al. (2012), we also include control variables that affect ETF TEs. The control variables are Premiums, Turnover, Spread, the daily returns on the S&P500 (SPXRet), and VIX. These variables are defined in Appendix A. We use three models derived from the following equation:
$$TE_{i,t} = \alpha + \beta_1 AbsR^2_{i,t} + \beta_2 Premiums_{i,t} + \beta_3 Turnover_{i,t} + \beta_4 Spread_{i,t} + \beta_5 SPXRet_t + \beta_6 VIX_t + \varepsilon_{i,t} \tag{9}$$
Table 7 shows a significant and positive relation between TEs and the absolute R2 difference in the basic form of Equation (9). Specifically, the TE changes by about 0.01793 for every 0.01 increase in the absolute R2 difference. The coefficient on the absolute R2 difference remains significant and positive when we add the control variables (Model 2) and when we add a monthly dummy variable (Model 3). The relationship is significant at the 1% level in all three models, and the coefficient ranges from 0.01353 to 0.01793. This finding is in line with the results presented in Figure 2, Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7: even small increases in TE are accompanied by significant differences in risk exposure. Additionally, this finding suggests that one should exercise caution when comparing TEs across ETFs to make investment decisions. Consistent with the literature, ETF premiums, turnover, SPX returns, and VIX have a significant impact on TEs. The number of observations differs between Model 1 and Models 2 and 3 because of missing data for the Premiums variable. The Model 1 results are robust to using the same sample as for Models 2 and 3.
Table 7. Panel regression.
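A pooled version of Equation (9) can be sketched in Python with NumPy as below. The data are synthetic, with a slope on the absolute R2 difference fixed at 0.015 purely for illustration, and the sketch makes several simplifying assumptions that should not be read as the specification behind Table 7: classical (homoskedastic) standard errors, no monthly dummies, and market-wide variables (SPXRet, VIX) drawn independently per observation instead of per date.

```python
import numpy as np

def pooled_ols(y, X, names):
    """Pooled OLS with an intercept; returns {name: (coefficient, t-statistic)}."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    dof = len(y) - X1.shape[1]
    sigma2 = (resid @ resid) / dof
    # Assumption: classical standard errors, sigma^2 * (X'X)^{-1}.
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X1.T @ X1)))
    return {n: (b, b / s) for n, b, s in zip(["const"] + names, coef, se)}

# Synthetic ETF-month observations, for illustration only.
rng = np.random.default_rng(42)
n_obs = 2000
abs_r2 = rng.exponential(0.02, n_obs)
premiums = rng.normal(0.0, 0.001, n_obs)
turnover = rng.uniform(0.0, 0.05, n_obs)
spread = rng.uniform(0.0001, 0.002, n_obs)
spx_ret = rng.normal(0.0, 0.01, n_obs)
vix = rng.normal(15.0, 3.0, n_obs)

true_beta1 = 0.015  # hypothetical slope used to generate the data
te = 0.01 + true_beta1 * abs_r2 + rng.normal(0.0, 0.0002, n_obs)

X = np.column_stack([abs_r2, premiums, turnover, spread, spx_ret, vix])
names = ["AbsR2", "Premiums", "Turnover", "Spread", "SPXRet", "VIX"]
results = pooled_ols(te, X, names)
beta1, t_stat = results["AbsR2"]
```

With this setup, the estimated `beta1` recovers the slope used to simulate the data, and its t-statistic comfortably exceeds the 1% critical value. In an actual panel, clustered or otherwise robust standard errors would be a natural alternative to the classical ones used here.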

4.2. Economic Significance

A pertinent question at this point is: what is the economic impact of changes in the R2 differences between ETFs and their underlying indices? We address this question in Table 8.
Table 8. Economic significance.
Table 8 presents the absolute return difference (average per month) between ETFs and their indices when the absolute R2 difference (as defined in Equation (8)) changes by 100 basis points. Over the period from 2012 to 2018, we find that a 100-basis-point change in absolute R2 difference impacts the return difference by approximately 0.0166% per month on average. Therefore, given the monthly returns (Table 2), we argue that the change in absolute R2 is not only impactful regarding risk but also regarding returns.
Our study offers insights into the importance of recognizing the real risk exposure that investors face when they hold an index ETF, which is hard to judge based on the TE alone. Minor changes in the levels of TE can expose investors to risk factors that exceed the exposure of the benchmarks. This differentiation is important for investors, as their investment goals could be heavily compromised.

5. Conclusions

As passive investing continues to expand, ETFs now account for a substantial share of managed assets, and TE is considered a central metric used by investors, regulators, and asset managers to assess fund tracking quality. Our results suggest that reliance on TE alone may be problematic. If the TE does not accurately reflect differences in underlying risk exposure, this may heavily impact investors’ ability to achieve their investment goals.
Motivated by this concern, we examine the usefulness of TE as a measure of risk when making investment decisions. We use the standard PCA approach to assess the underlying risk factors of ETFs and their benchmarks. This framework allows us to evaluate ETF risk exposure directly, rather than inferring it from TE. We find that cross-sectional variation in ETF returns is largely explained by systematic variation in index risk factors. Importantly, even modest differences in TE are associated with economically meaningful divergences in underlying risk exposure relative to the benchmark. These findings imply that comparing TEs across index ETFs may be misleading for investors seeking benchmark-consistent risk exposure. Rather than reflecting a deliberate risk–return tradeoff, TE in ETFs can mask substantial differences in factor exposure, potentially exposing investors to risks for which they are not adequately compensated. Our results have implications for both academic research and practice. For investors, they highlight the limitations of tracking error as a standalone performance metric. For regulators and market participants, they underscore the importance of transparency regarding index replication and underlying risk exposure. Future research could extend our analysis to non-U.S. markets, alternative asset classes, or different ETF replication strategies to further assess the generalizability of these findings.

Author Contributions

Conceptualization, N.A.; methodology, N.A.; software, N.A.; formal analysis, N.A.; investigation, N.A. and F.J.; data curation, N.A.; writing—original draft preparation, N.A. and F.J.; writing—review and editing, N.A., F.J. and M.K.H.; visualization, N.A. and F.J.; supervision, M.K.H.; project administration, F.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data available on request due to restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Definition of Variables.
| Variable | Description | Source |
|---|---|---|
| Premiums | The ETF premium is calculated as the natural log of the ETF price divided by the net asset value (NAV). | DataStream |
| Turnover | ETF turnover is calculated as the ETF daily trading volume divided by shares outstanding. | DataStream |
| Spread | The ETF bid/ask spread is calculated as the ETF daily bid/ask spread divided by the daily mid-point of the trading price. | DataStream |
| SPX Ret | The daily returns on the S&P500. | DataStream |
| VIX | The VIX index, which measures the market’s expectation of future volatility. It is based on options of the S&P 500 Index (Source: www.cboe.com/vix, accessed on 18 January 2026). | DataStream |
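The three ETF-level control variables can be written as one-line functions, sketched here in Python. The function names and example values are hypothetical. Turnover is read here as share volume over shares outstanding, and the mid-point as the average of bid and ask; both readings are interpretations of the definitions above.

```python
import math

def premium(price, nav):
    """Log premium: ln(price / NAV); zero when the ETF trades at NAV."""
    return math.log(price / nav)

def turnover(volume, shares_outstanding):
    """Daily trading volume as a fraction of shares outstanding (assumed reading)."""
    return volume / shares_outstanding

def relative_spread(bid, ask):
    """Bid/ask spread scaled by the quote mid-point (assumed to be (bid + ask) / 2)."""
    mid = (bid + ask) / 2
    return (ask - bid) / mid

# Hypothetical quotes for a single ETF-day.
example = {
    "premium": premium(price=100.0, nav=100.0),
    "turnover": turnover(volume=1_000_000, shares_outstanding=50_000_000),
    "spread": relative_spread(bid=99.95, ask=100.05),
}
```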

Note

1. According to the Investment Company Institute Fact Book of 2024.

References

  1. Aber, J. W., Li, D., & Can, L. (2009). Price volatility and tracking ability of ETFs. Journal of Asset Management, 10, 210–221.
  2. Ang, A. (2014). Asset management: A systematic approach to factor investing. Oxford University Press.
  3. Bae, K., & Kim, D. (2020). Liquidity risk and exchange-traded fund returns, variances, and tracking errors. Journal of Financial Economics, 138(1), 222–253.
  4. Carhart, M. M. (1997). On persistence in mutual fund performance. The Journal of Finance, 52, 57–82.
  5. Chen, N.-F., Roll, R., & Ross, S. A. (1986). Economic forces and the stock market. Journal of Business, 59, 383–403.
  6. Chiu, J., Chung, H., Ho, K. Y., & Wang, G. H. (2012). Funding liquidity and equity liquidity in the subprime crisis period: Evidence from the ETF market. Journal of Banking & Finance, 36(9), 2660–2671.
  7. Chu, P. K. K. (2011). Study on the tracking errors and their determinants: Evidence from Hong Kong exchange traded funds. Applied Financial Economics, 21(5), 309–315.
  8. Cremers, K. M., & Petajisto, A. (2009). How active is your fund manager? A new measure that predicts performance. The Review of Financial Studies, 22, 3329–3365.
  9. Dorocáková, M. (2017). Comparison of ETF’s performance related to the tracking error. Journal of International Studies, 10(4), 154–165.
  10. Fama, E. F., & French, K. R. (2015). Incremental variables and the investment opportunity set. Journal of Financial Economics, 117, 470–488.
  11. Fraś, A., & Rogowski, W. (2016). The attractiveness of passive forms of investment in Poland. Journal of Management and Financial Sciences, 9(25), 43–60.
  12. Frino, A., & Gallagher, D. R. (2001). Tracking S&P 500 index funds. The Journal of Portfolio Management, 28, 44–55.
  13. Grinold, R., & Kahn, R. (1999). Active portfolio management: A quantitative approach for providing superior returns and controlling risk. McGraw-Hill Professional.
  14. Hougan, M. (2015). Tracking difference, the perfect ETF metric. ETF.com. Available online: https://www.etf.com/sections/blog/tracking-difference-perfect-etf-metric?nopaging=1 (accessed on 23 August 2025).
  15. Kim, J., Cho, H., & Seok, S. (2023). Liquidity risk, return performance, and tracking error: Synthetic vs. physical ETFs. Journal of International Financial Markets, Institutions and Money, 89, 101885.
  16. Kostovetsky, L. (2003). Index mutual funds and exchange-traded funds. The Journal of Portfolio Management, 29, 80–92.
  17. Kozak, S., Nagel, S., & Santosh, S. (2018). Interpreting factor models. The Journal of Finance, 73, 1183–1223.
  18. Lee, H. C., Tseng, Y. C., & Yang, C. J. (2014). Commonality in liquidity, liquidity distribution, and financial crisis: Evidence from country ETFs. Pacific-Basin Finance Journal, 29, 35–58.
  19. Lettau, M., & Madhavan, A. (2018). Exchange-traded funds 101 for economists. Journal of Economic Perspectives, 32(1), 135–154.
  20. Lintner, J. (1965). The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. The Review of Economics and Statistics, 47, 13–37.
  21. Madhavan, A. N. (2016). Exchange-traded funds and the new dynamics of investing. Oxford University Press.
  22. Mossin, J. (1966). Wages, profits, and the dynamics of growth. The Quarterly Journal of Economics, 80, 376–399.
  23. Poterba, J. M., & Shoven, J. B. (2002). Exchange-traded funds: A new investment option for taxable investors. American Economic Review, 92, 422–427.
  24. Pukthuanthong, K., Roll, R., & Subrahmanyam, A. (2018). A protocol for factor identification. The Review of Financial Studies, 32, 1573–1607.
  25. Roll, R. (1992). Industrial structure and the comparative behavior of international stock market indices. The Journal of Finance, 47(1), 3–41.
  26. Saunders, K. T. (2018). Analysis of international ETF tracking error in country-specific funds. Atlantic Economic Journal, 46(2), 151–160.
  27. Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19, 425–442.
  28. Shin, S., & Soydemir, G. (2010). Exchange-traded funds, persistence in tracking errors and information dissemination. Journal of Multinational Financial Management, 20(4–5), 214–234.
  29. Stambaugh, R. F. (2014). Investment noise and trends. The Journal of Finance, 69(4), 1415–1453.
  30. Treynor, J. L. (1961). Market value, time, and risk (Working paper). Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2600356 (accessed on 18 January 2026).
  31. Tzioumis, K., & Barbopoulos, L. G. (2025). Betas distribution and ETF tracking error. SSRN Electronic Journal.
  32. Yadav, P. K., & Pope, P. F. (1994). Stock index futures mispricing: Profit opportunities or risk premia? Journal of Banking & Finance, 18(5), 921–953.
