Article

Duration Rotation in U.S. Treasury Fixed-Income ETFs: Evidence for a “Median” Strategy

Department of Computer Science, Metropolitan College, Boston, MA 02215, USA
* Author to whom correspondence should be addressed.
FinTech 2026, 5(2), 29; https://doi.org/10.3390/fintech5020029
Submission received: 14 January 2026 / Revised: 19 March 2026 / Accepted: 31 March 2026 / Published: 7 April 2026

Abstract

We examine a simple duration-rotation strategy applied to six U.S. Treasury ETFs spanning the full maturity spectrum, using data from 2007 to 2025. At each semi-annual rebalancing date, ETFs are ranked by prior-period return and divided into three equal groups: Winners, Median, and Losers. Contrary to conventional momentum logic, the middle group consistently outperforms. The Median strategy grows USD 100 to USD 199.90 by end-2025, a CAGR of 3.79% against 2.17% for the passive benchmark, with a higher Sharpe ratio (0.606 vs. 0.494) and a shallower maximum drawdown (11.6% vs. 14.4%). Newey–West HAC and Lo (2002) tests confirm statistical significance (p = 0.031 and p = 0.014), and an expanding-window walk-forward procedure yields p = 0.0005 across 27 out-of-sample evaluations from 2012 to 2025. The result is robust to calendar alignment, evaluation endpoint, lookback window, and execution timing, and survives transaction costs by a wide margin. The strategy requires no interest rate forecasts, no proprietary data, and is implementable with standard ETF brokerage access.

Key Findings:
  • A simple rules-based rotation across U.S. Treasury ETFs outperforms passive Buy & Hold by 162 basis points per year, without forecasting interest rates or using proprietary data.
  • Contrary to momentum logic, the middle-ranked group, not the recent winners, delivers the best risk-adjusted returns. The Semi-Annual Median strategy achieves a Sharpe of 0.606 versus 0.494 for the benchmark and a maximum drawdown of 11.6% versus 14.4% in the 2022 rate-hiking cycle.
  • Semi-annual rebalancing is optimal. Higher frequencies amplify transaction noise, while lower frequencies fail to capture the signal. Daily rebalancing produces a negative net CAGR after costs.
  • The outperformance is statistically confirmed by two independent tests calibrated for serially correlated bond returns, Newey–West HAC (p = 0.031) and Lo (2002) Sharpe inference (p = 0.014), and corroborated directionally by a block bootstrap (88% positive draws).
  • An expanding-window walk-forward procedure over 27 genuinely out-of-sample evaluations (2012–2025) yields p = 0.0005, with the adaptive selection rule choosing the Median strategy in 97% of periods.
  • The outperformance survives a 41× cost buffer: the break-even transaction cost is 145 bps against a realistic execution cost of 3.5 bps.

1. Introduction

Treasury bond returns are not created equal across the maturity spectrum. A one-month bill and a 30-year bond both carry the full faith and credit of the U.S. government, yet their price behavior can diverge dramatically as interest rates shift. When the Fed embarks on a tightening cycle, short-duration ETFs barely move while long-duration ETFs can lose a fifth of their value. When markets panic and rates collapse, the opposite plays out. This persistent cross-sectional variation raises a natural question: can a simple, rules-based strategy that rotates across Treasury maturities—without predicting interest rates—generate excess returns over a passive Buy & Hold allocation?
The rise of low-cost Treasury ETFs has made this question practically meaningful for a broad audience. Investors can now take targeted positions in the short end (SHV, SHY), the intermediate segment (IEI, IEF), or the long end (TLH, TLT) of the yield curve with intraday liquidity and minimal friction. What was once a strategy requiring OTC bond desks is today executable in any brokerage account. This structural shift motivates our focus on ETF-based duration rotation as a practical portfolio tool.
The strategy we study is deliberately simple. At the start of each semi-annual period, we rank six iShares Treasury ETFs by their return over the preceding period and divide them into three groups: the top two (Winners), middle two (Median), and bottom two (Losers). We then hold the selected group for the next period and repeat. We evaluate this rule across seven rebalancing frequencies from daily to annual. The six ETFs (SHV, SHY, IEI, IEF, TLH, and TLT) constitute the complete iShares Treasury duration ladder, spanning maturities from one month to thirty years with no gaps or overlapping exposures. The universe size reflects the available product set rather than a selection decision.
The main result inverts the conventional momentum narrative. Buying recent winners is the worst strategy in the Treasury duration space. Holding the middle group, the ETFs that neither led nor lagged the prior period, delivers the best risk-adjusted performance consistently. Under semi-annual rebalancing, the Median strategy grows a USD 100 investment to USD 199.90 by end-2025, compared to USD 144 for Winners, USD 142 for Losers, and USD 162 for the passive benchmark. Its Sharpe ratio of 0.606 is the highest of any configuration tested and its maximum drawdown of 11.6 % is the shallowest among the rotation strategies.
We validate this finding using two statistical tests chosen for the specific properties of bond return data. The primary test is the Newey–West HAC t-test [1,2], which adjusts for serial correlation from persistent interest-rate cycles. The secondary test is the Lo [3] Sharpe ratio inference, which independently asks whether the Sharpe ratio is statistically distinguishable from zero after correcting for autocorrelation bias. Both tests confirm significance at the semi-annual frequency (p = 0.031 and p = 0.014, respectively). Following Harvey et al. [4], who argue that the proliferation of tested configurations inflates false discovery rates, we note that p = 0.031 does not survive Bonferroni correction across the nine strategy-frequency configurations tested; the evidence is, therefore, characterized as strongly suggestive rather than definitive. An expanding-window walk-forward procedure independently confirms the result out-of-sample (p = 0.0005 across 27 evaluations from 2012 to 2025).
This paper makes three contributions to the literature. First, it provides a clean test of duration rotation using purely credit-risk-free instruments, eliminating the credit spread noise that contaminates prior fixed income momentum studies. Second, it demonstrates that the middle tercile—not the extremes—is the optimal selection zone in the Treasury duration space, a finding that has clear intuitive grounding in term premium dynamics. Third, it situates this result within a growing body of evidence suggesting that median-tier selection may be a general principle of rotation strategy design, connecting to recent work on equity sector ETF rotation by [5].
Unlike carry-based timing strategies, which require yield data and explicit return forecasts, the Median strategy uses only realized total returns as the ranking signal, and yet mechanically selects the intermediate maturity zone that ref. [6] identifies as the peak carry-to-duration region—a convergence that explains the result without requiring any prediction.
The paper proceeds as follows. Section 2 reviews the relevant literature. Section 3 describes data, ETF universe, and methodology. Section 4 presents the main performance results. Section 5 reports the statistical tests. Section 6 details robustness analysis. Section 7 analyzes transaction costs. Section 8 discusses economic interpretation and limitations. Section 9 concludes.

How This Work Differs from Prior Studies

Most of what we know about fixed income momentum comes from the corporate bond market. Ref. [7] document momentum concentrated in speculative-grade corporate bonds, while ref. [8] find similar patterns across the corporate bond cross-section. Ref. [9] extend momentum to government bonds in a cross-country framework. Importantly, all of these contexts involve either credit risk, currency exposure, or both—meaning the signal is entangled with factors beyond duration.
Studies that do focus on Treasury markets tend to rely on macro signals rather than ranking mechanisms. Ref. [10] show that yield spreads predict future excess returns on long bonds. Ref. [11] demonstrate that a combination of forward rates forecasts bond excess returns at all maturities. Ref. [12] extend this with macroeconomic factors. These are elegant macro-driven timing approaches, but they require estimating predictive regressions that are notoriously unstable across rate regimes and difficult to implement in real time. Ref. [13] specifically examine Treasury bond momentum and find the effect is much weaker in Treasuries than in high-yield or emerging market bonds—consistent with the view that momentum in sovereign bonds operates through channels other than simple price continuation. Ref. [14] study Treasury ETF portfolio construction but focus on duration-level exposure rather than a cross-sectional ranking mechanism.
The most directly relevant parallel work is by [5], who propose and empirically test a median ETF rotation strategy for S&P 500 sector ETFs using data from 2000 to 2024. Their central finding mirrors ours: periodically ranking sector ETFs by return and holding the middle group—rather than the top or bottom performers—delivers superior total return, lower volatility, and shallower drawdowns relative to both extreme strategies and passive index investment. That a median-tier advantage emerges independently in equity sectors and in Treasury duration rotation suggests it may reflect a general structural feature of ranking-based rotation rather than a coincidence of any particular market.
Our approach differs from all of the above in four specific ways. First, by restricting to Treasury ETFs, we isolate duration as the sole variable—there is no credit risk, no currency risk, and no issuer-specific noise. Second, we apply a ranking-based mechanism rather than predicting anything about the future level of rates or spreads. Third, we use statistical methods—Newey–West HAC inference and Lo (2002) Sharpe testing—specifically calibrated for the autocorrelation structure of bond returns. Fourth, we test across seven rebalancing frequencies, identifying semi-annual rebalancing as empirically optimal rather than assuming it a priori.
Carry-based strategies rank bonds by yield-to-duration ratio or forward rate combinations, requiring either yield data or macro factor estimation. Our strategy ranks solely by realized total return—a signal observable from any price feed. The fact that these two approaches converge on similar holdings (intermediate maturities) in most periods is not a weakness—it is corroborating evidence that the return-ranking mechanism is capturing the same underlying carry premium through a simpler, noisier signal. This simplicity is a feature: it eliminates the estimation error and look-ahead risk inherent in yield-based ranking.

2. Literature Review

The momentum anomaly has been one of the most studied phenomena in asset pricing since ref. [15] documented that U.S. stocks with strong recent performance continued to outperform over the following 3–12 months. The anomaly proved durable enough that ref. [16] listed it as the primary exception to their three-factor model, spurring decades of follow-on research into its origins and limits. The behavioral explanation points to investor under-reaction to news, which allows prices to drift toward their fundamental value gradually. The risk-based explanation suggests momentum captures compensation for bearing time-varying systematic risk. Both interpretations predict the effect should weaken or reverse in environments characterized by sharp trend reversals—precisely the environment of the Treasury market during rate-hiking cycles.
Ref. [9] extended the momentum framework across asset classes—equities, currencies, commodities, and sovereign bonds—finding consistent momentum premia in each. In fixed income, the evidence is more nuanced. Ref. [8] find momentum in the corporate bond cross-section. Ref. [7] show it is concentrated in speculative-grade issues and largely absent in investment-grade and Treasury securities. The near-absence of classic price momentum in Treasuries is not a data artefact; it is a structurally sensible result given that Treasury returns are driven primarily by systematic rate factors with little room for the firm-specific information diffusion that drives equity momentum.
Within the Treasury market itself, the more productive literature concerns carry and term premium timing. Ref. [6] emphasizes that the yield curve slope and the carry-to-duration ratio are the dominant predictors of cross-sectional return variation across maturities. Intermediate-maturity bonds persistently offer the highest carry per unit of duration—a finding that provides the economic underpinning for the Median strategy’s outperformance. Ref. [10] showed that yield spreads embed time-varying risk premia, implying that duration rotation anchored to observable signals can capture these premia. Ref. [11] sharpened this with a tent-shaped forward rate combination that forecasts bond excess returns at all maturities with a single factor.
The practical feasibility of ETF-based strategies has grown significantly over the past two decades. Ref. [17] examines the substitutability of ETFs for conventional mutual funds, finding that ETFs have become genuine alternatives for accessing specific market segments with lower cost and better liquidity. Institutional asset management frameworks emphasize systematic asset allocation, benchmark-based evaluation, and rigorous risk management when implementing portfolio strategies, particularly in the context of diversified ETF portfolios [18].
Ref. [19] documents that bond ETF growth has improved secondary market efficiency in the underlying instruments, compressing bid-ask spreads and improving price discovery. For the six Treasury ETFs in our study, this translates to execution costs so small—between 0.5 and 3.0 bps half-spread—that transaction drag is nearly inconsequential relative to the alpha generated.
The data snooping concern that haunts backtested strategies is carefully addressed by [4], who argue that the documented proliferation of market anomalies implies a high rate of false discoveries and advocate for higher significance thresholds than the conventional 5% level. Ref. [20] apply bootstrap reality checks to technical trading rules, finding that many strategies with impressive backtested histories fail to survive proper multiple-comparison adjustment. These methodological benchmarks inform our conservative framing: we report exact p-values, acknowledge that our primary result (p = 0.031) does not survive Bonferroni correction across all configurations tested, and deliberately avoid overstating the strength of the evidence. Ref. [13] provide the most directly relevant negative result: Treasury bond momentum is much weaker than in high-yield or emerging market bonds, consistent with the view that the return-continuation signal in Treasuries is weak or absent. This motivates our reframing of the problem—we are not testing whether momentum per se exists in Treasuries, but whether a cross-sectional ranking mechanism can be repurposed for duration positioning. The answer, we find, is yes, but with a twist: the optimal zone is the middle of the ranking, not the top.
The most directly comparable contribution is the companion paper by [5], who test an analogous median rotation strategy on S&P 500 sector ETFs over 2000–2024. Their median monthly rotation significantly outperforms both the winner-chasing and loser-fading strategies as well as the passive S&P 500 index on total return, volatility, and maximum drawdown. The convergence of findings across asset classes—equities and Treasuries, sectors and maturities—is the strongest evidence that median-tier selection is not a coincidence but a structural feature of how ranking-based rotation strategies interact with the mean-reversion dynamics of ranked asset groups.
Performance evaluation and attribution are central components of modern portfolio management, where risk-adjusted metrics and benchmark comparisons are used to assess whether observed returns reflect genuine investment skill or simply exposure to systematic factors [21].
The statistical framework we adopt draws on two foundational papers. Refs. [1,2] develop the HAC covariance estimator used in our primary test, with the 1994 paper providing the automatic bandwidth selection rule. Ref. [3] derives the asymptotic variance of the Sharpe ratio under serially correlated returns, providing the theoretical basis for our secondary test. Together, they address the two most relevant dimensions of investment performance—the level of excess return and its risk-adjusted magnitude—under assumptions appropriate for bond return data.

3. Data and Methodology

For Treasury rotation strategies, consider the following six U.S. Treasury Exchange Traded Funds (ETFs), each representing different maturity segments along the yield curve. These ETFs are widely used proxies for short- to long-term Treasury exposures.
  • SHV (iShares Short Treasury Bond ETF)—1–12 months
  • SHY (iShares 1–3 Year Treasury Bond ETF)
  • IEI (iShares 3–7 Year Treasury Bond ETF)
  • IEF (iShares 7–10 Year Treasury Bond ETF)
  • TLH (iShares 10–20 Year Treasury Bond ETF)
  • TLT (iShares 20+ Year Treasury Bond ETF)
All six ETFs have long and consistent trading histories and collectively span the full maturity spectrum of the U.S. Treasury market. Their total return data were obtained from Yahoo Finance and cross-validated with Bloomberg [22,23,24,25,26,27,28,29].

3.1. Benchmark

We compare all strategies against four benchmarks, each answering a distinct question about the source of the Median strategy’s outperformance.
Primary benchmark: LUATTRUU. The Bloomberg U.S. Treasury Total Return Index (LUATTRUU) is a market-value-weighted index covering the full Treasury maturity spectrum. It represents what a passive investor achieves by holding all Treasuries in proportion to outstanding issuance and is used to compute excess returns for the Newey–West HAC test and Information Ratios throughout the paper.
Investable passive benchmark: GOVT ETF. The iShares Core U.S. Treasury Bond ETF (GOVT), managed by BlackRock, tracks the ICE U.S. Treasury Core Bond Index and provides broad, market-cap-weighted Treasury exposure in an ETF wrapper. Including GOVT addresses the concern that LUATTRUU, as an institutional index, may not be replicable by a retail investor. Outperformance over GOVT confirms the advantage holds relative to a directly investable passive alternative.
Duration-matched benchmark: Static IEF/IEI. A 50/50 equal-weight Buy & Hold portfolio in IEF and IEI, rebalanced semi-annually. This benchmark isolates the most direct alternative to the Median strategy: since IEF and IEI are the two ETFs the Median holds most frequently (appearing together in 64% of semi-annual periods), a passive investor could simply hold them permanently. Outperformance over Static IEF/IEI demonstrates that the rotation rule adds value beyond passively anchoring to intermediate maturities.
Rotation control benchmark: Equal-Weight. An equal allocation across all six ETFs (1/6 each), rebalanced semi-annually. This benchmark controls for whether the Median’s advantage reflects rotation skill or simply the benefit of holding a concentrated two-ETF portfolio. Outperformance over Equal-Weight confirms that the selection of the middle two ETFs—rather than diversification across all six—drives the result.
Full statistical tests against all four benchmarks are presented in Section 6.5; cumulative wealth paths are discussed in Section 4.4.
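The two constructed benchmarks (Static IEF/IEI and Equal-Weight) can be sketched as fixed-weight portfolios reset at every rebalancing date. The snippet below is a minimal illustration rather than the paper's implementation: the function names and the per-period return figures are hypothetical, and it assumes that per-period total returns for each ETF are already available.

```python
from typing import Dict, List


def rebalanced_portfolio_returns(period_returns: Dict[str, List[float]],
                                 weights: Dict[str, float]) -> List[float]:
    """Per-period return of a portfolio reset to fixed weights every period."""
    n_periods = len(next(iter(period_returns.values())))
    return [
        sum(w * period_returns[ticker][i] for ticker, w in weights.items())
        for i in range(n_periods)
    ]


def wealth_path(returns: List[float], start: float = 100.0) -> List[float]:
    """Compound per-period returns into a cumulative wealth path."""
    path = [start]
    for r in returns:
        path.append(path[-1] * (1.0 + r))
    return path


# Hypothetical semi-annual total returns, for illustration only.
example_returns = {
    "SHV": [0.01, 0.01], "SHY": [0.01, 0.02], "IEI": [0.02, -0.01],
    "IEF": [0.04, -0.03], "TLH": [0.05, -0.05], "TLT": [0.06, -0.08],
}

# Static IEF/IEI: 50/50, reset each period.
static_returns = rebalanced_portfolio_returns(
    example_returns, {"IEF": 0.5, "IEI": 0.5})

# Equal-Weight: 1/6 in each of the six ETFs, reset each period.
equal_returns = rebalanced_portfolio_returns(
    example_returns, {t: 1.0 / 6.0 for t in example_returns})

static_wealth = wealth_path(static_returns)
equal_wealth = wealth_path(equal_returns)
```

Because both portfolios are reset to target weights at every rebalancing date, each period's portfolio return is simply the weighted average of the constituent returns for that period.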

3.2. Portfolio Construction

The paper defines three distinct rotation strategies:
  • Winners (W): Reinvest in the two ETFs with the highest total returns in the previous rebalancing period.
  • Median (M): Reinvest in the two ETFs whose returns lie in the middle of the performance ranking.
  • Losers (L): Reinvest in the two ETFs with the lowest returns in the previous rebalancing period [9,30,31,32].
To illustrate the ranking and selection process, consider semi-annual rebalancing. Suppose that, during the previous six-month period, total returns were as follows:
SHV = x%, SHY = x%, IEI = x%, IEF = x%, TLH = x%, TLT = x%.
Ranking by performance gives:
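$$\underbrace{\mathrm{TLT} > \mathrm{TLH}}_{\text{Winners}} > \underbrace{\mathrm{IEF} > \mathrm{IEI}}_{\text{Median}} > \underbrace{\mathrm{SHY} > \mathrm{SHV}}_{\text{Losers}}$$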
Accordingly, for the next period:
  • Winners: TLT, TLH (long-term Treasuries)
  • Median: IEF, IEI (intermediate-term Treasuries)
  • Losers: SHY, SHV (short-term Treasuries)
The three rotation rules rank the six Treasury ETFs by prior-period total return and then select two ETFs each period: Winners = top 2, Median = middle 2, Losers = bottom 2. Selected ETFs are held for the next rebalancing interval (here semi-annual), and then the ranking is updated using the most recent past period (no look-ahead). Each chosen ETF receives an equal weight of 50% (two ETFs—50%/50%), so the portfolio is fully invested across the two selected ETFs. This simple, rule-based process keeps the strategy transparent, easy to implement, and replicable by investors.
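The ranking and selection step described above can be sketched in a few lines. This is an illustrative sketch, not the code from the paper's repository: the function names and the return figures are hypothetical, and ties are broken by input order, which is an assumption the text does not specify.

```python
from typing import Dict, List


def select_groups(prior_returns: Dict[str, float]) -> Dict[str, List[str]]:
    """Rank six ETFs by prior-period total return and split them into pairs."""
    if len(prior_returns) != 6:
        raise ValueError("expected exactly six ETFs")
    # Highest prior-period return first; ties keep input order (assumption).
    ranked = sorted(prior_returns, key=prior_returns.get, reverse=True)
    return {
        "Winners": ranked[:2],   # top two
        "Median": ranked[2:4],   # middle two
        "Losers": ranked[4:],    # bottom two
    }


def equal_weights(tickers: List[str]) -> Dict[str, float]:
    """Equal weight across the selected ETFs (50%/50% for two tickers)."""
    return {t: 1.0 / len(tickers) for t in tickers}


# Hypothetical prior six-month total returns, for illustration only.
example = {"SHV": 0.010, "SHY": 0.012, "IEI": 0.020,
           "IEF": 0.030, "TLH": 0.045, "TLT": 0.060}

groups = select_groups(example)
median_weights = equal_weights(groups["Median"])
```

With these illustrative inputs, the Median group is IEF and IEI at 50% each, matching the worked ranking in the text. Only returns from the completed period enter the ranking, so the selection uses no look-ahead information.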

3.3. Re-Balancing Frequencies

We test seven frequencies: Daily, Weekly (1 week), Bi-Weekly (2 weeks), Monthly (1 month), Quarterly (3 months), Semi-Annual (6 months), and Annual (12 months). The base-case backtest aligns all rebalancing dates to December year-ends; sensitivity to this calendar choice is examined in the robustness analysis of Section 6. Transaction costs are applied at each rebalancing event as described in Section 7.
All rebalancing is assumed to execute at the adjusted close price on the final trading day of each rebalancing period. No intraday slippage is modeled beyond the bid-ask spread costs described in Section 7. Adjusted close prices from Yahoo Finance incorporate dividend reinvestment and corporate action adjustments, consistent with the total return convention used by the LUATTRUU benchmark. Cross-validation against Bloomberg terminal data confirmed agreement within 0.1% annually across all six ETFs.
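As a rough illustration of the December-aligned semi-annual calendar, the sketch below picks the last available trading day in each half-year and computes total returns between consecutive period ends from adjusted closes. The helper names and the prices are hypothetical, and the sketch treats the final half-year as complete, which a real backtest would need to check.

```python
from datetime import date
from typing import Dict, List, Tuple


def semiannual_period_ends(trading_days: List[date]) -> List[date]:
    """Last available trading day of each half-year (Jan-Jun, Jul-Dec)."""
    last_by_half: Dict[Tuple[int, int], date] = {}
    for d in sorted(trading_days):
        half = 1 if d.month <= 6 else 2
        last_by_half[(d.year, half)] = d  # later dates overwrite earlier ones
    return [last_by_half[key] for key in sorted(last_by_half)]


def period_returns(adj_close: Dict[date, float], ends: List[date]) -> List[float]:
    """Total return between consecutive period ends, from adjusted closes."""
    return [adj_close[b] / adj_close[a] - 1.0 for a, b in zip(ends, ends[1:])]


# Hypothetical adjusted closes for one ETF, for illustration only.
prices = {
    date(2024, 6, 27): 99.5,
    date(2024, 6, 28): 100.0,
    date(2024, 12, 30): 101.5,
    date(2024, 12, 31): 102.0,
    date(2025, 6, 30): 104.04,
}

ends = semiannual_period_ends(list(prices))
returns = period_returns(prices, ends)
```

Using adjusted closes means distributions are already reflected in the price ratio, which is consistent with the total return convention described above.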

3.4. Risk-Free Rate

We set the risk-free rate to zero for all Sharpe and Sortino calculations. This is intentional. Because the underlying assets are themselves U.S. Treasury instruments (the textbook risk-free asset), subtracting a T-bill rate would double-count the risk-free component already embedded in ETF returns. Setting $r_f = 0$ measures how efficiently each strategy converts Treasury exposure into return above the zero floor, which is the right question for a fixed income rotation strategy. The Information Ratio is always computed relative to the LUATTRUU benchmark rather than to cash.

3.5. Performance Metrics

We evaluate strategy performance using six metrics grouped into three categories: return efficiency, risk magnitude, and wealth accumulation.
Return efficiency: Risk-adjusted performance is measured using the Sharpe ratio, Sortino ratio, and Information ratio. The Sharpe ratio is defined as
$$SR = \frac{\bar{r}_{ann}}{\sigma_{ann}}$$
where $\bar{r}_{ann}$ denotes the annualized mean return and $\sigma_{ann}$ is the annualized standard deviation of returns.
The Sortino ratio replaces total volatility with downside volatility:
$$\mathrm{Sortino} = \frac{\bar{r}_{ann}}{\sigma_{down}}$$
where $\sigma_{down}$ is the standard deviation of negative returns. The downside threshold is set to zero return, consistent with the assumption of $r_f = 0$ used throughout the analysis.
The Information Ratio (IR) is defined as:
$$IR = \frac{\bar{\alpha}_{ann}}{TE_{ann}}$$
where $\bar{\alpha}_{ann}$ is the annualized excess return and $TE_{ann}$ is the annualized tracking error, defined as the annualized standard deviation of the difference between strategy and benchmark returns:
$$TE_{ann} = \sqrt{f} \cdot \sigma\left(R_{strategy,t} - R_{benchmark,t}\right)$$
where f is the annualization factor. Tracking error captures the degree to which the rotation strategy deviates from passive Treasury exposure. The Information Ratio is computed at the holding-period frequency to match the natural decision interval of the strategy.
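The three return-efficiency ratios can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's code: the annualized mean is taken as the arithmetic mean times the number of periods per year, downside volatility is the sample standard deviation of the strictly negative returns, and all function names and sample figures are hypothetical.

```python
import math
import statistics
from typing import List


def sharpe_ratio(returns: List[float], periods_per_year: int) -> float:
    """Annualized mean over annualized volatility, with r_f = 0."""
    mean_ann = statistics.fmean(returns) * periods_per_year
    vol_ann = statistics.stdev(returns) * math.sqrt(periods_per_year)
    return mean_ann / vol_ann


def sortino_ratio(returns: List[float], periods_per_year: int) -> float:
    """Annualized mean over annualized std. dev. of negative returns."""
    negatives = [r for r in returns if r < 0.0]
    if len(negatives) < 2:
        raise ValueError("need at least two negative returns")
    mean_ann = statistics.fmean(returns) * periods_per_year
    down_ann = statistics.stdev(negatives) * math.sqrt(periods_per_year)
    return mean_ann / down_ann


def information_ratio(strategy: List[float], benchmark: List[float],
                      periods_per_year: int) -> float:
    """Annualized mean active return over annualized tracking error."""
    active = [s - b for s, b in zip(strategy, benchmark)]
    alpha_ann = statistics.fmean(active) * periods_per_year
    te_ann = statistics.stdev(active) * math.sqrt(periods_per_year)
    return alpha_ann / te_ann


# Hypothetical semi-annual returns (two periods per year), illustration only.
strategy = [0.02, -0.01, 0.03, -0.02, 0.01, 0.03]
benchmark = [0.01, 0.00, 0.02, -0.01, 0.01, 0.01]

sr = sharpe_ratio(strategy, 2)
sortino = sortino_ratio(strategy, 2)
ir = information_ratio(strategy, benchmark, 2)
```

Other downside-deviation conventions exist (for example, dividing by the full sample size rather than the count of negative returns); the choice changes the Sortino level but not the logic of the comparison.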
Risk magnitude: Total risk is summarized using annualized volatility and maximum drawdown. Annualized volatility is calculated from daily returns:
$$\sigma_{ann} = \sigma_{daily}\sqrt{252}$$
Maximum drawdown (MDD) measures the largest peak-to-trough decline in cumulative portfolio value over the full sample period.
Final Portfolio Value: To provide an intuitive measure of economic impact, we also report the terminal portfolio value at the end of 2025 for a strategy starting with an initial investment of USD 100.
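The risk-magnitude and wealth metrics can be sketched as below. The function names and the sample figures are hypothetical, and the sketch assumes drawdown is measured on a series of cumulative portfolio values.

```python
import math
import statistics
from typing import List


def annualized_volatility(daily_returns: List[float]) -> float:
    """Daily sample standard deviation scaled by sqrt(252)."""
    return statistics.stdev(daily_returns) * math.sqrt(252)


def max_drawdown(values: List[float]) -> float:
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                  # running high-water mark
        worst = max(worst, 1.0 - v / peak)   # decline from that peak
    return worst


def terminal_value(returns: List[float], start: float = 100.0) -> float:
    """Value of an initial investment after compounding per-period returns."""
    value = start
    for r in returns:
        value *= 1.0 + r
    return value


# Hypothetical cumulative values and returns, for illustration only.
path = [100.0, 110.0, 99.0, 120.0, 102.0]
mdd = max_drawdown(path)
final = terminal_value([0.10, -0.10, 0.05])
vol = annualized_volatility([0.001, -0.002, 0.003, 0.000, -0.001])
```

In this example the deepest decline runs from the peak of 120 to 102, so the earlier dip from 110 to 99 does not set the maximum.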

3.6. Implementation

The backtest is implemented in Python 3.12 using NumPy 1.26, Pandas 2.2, SciPy 1.12, and Matplotlib 3.8. Statistical tests are carried out in statsmodels 0.14. The codebase is modular: config.py (ETF universe and parameters), backtest.py (portfolio construction and return calculation), transaction_costs.py (cost framework), statistical_tests.py (Newey–West HAC and Lo 2002 tests), visualizations.py (all figures), and robustness.py (calendar anchor, rolling end-date, and execution-lag analyses). Python version 3.7 code is available on the GitHub repository referenced at the end of the paper.

4. Empirical Results

This analysis begins with an investment of USD 100 in January 2007. By end-2025 the Semi-Annual Median strategy consistently outperforms Winners, Losers, and the LUATTRUU Buy & Hold benchmark across every risk-adjusted metric. Semi-annual rebalancing emerges as the optimal frequency, delivering the best combination of cumulative growth, Sharpe ratio, volatility control, and drawdown protection. Quarterly rebalancing is a strong secondary option, marginally less stable during crisis episodes [33,34,35]. Higher-frequency (weekly, bi-weekly) and lower-frequency (annual) rebalancing both dilute performance—the former through excess transaction noise, the latter through missed signal.

4.1. Cumulative Portfolio Growth

Table A1 reports year-end portfolio values for all strategies under semi-annual rebalancing, starting from USD 100 in 2007. A cross-frequency summary of terminal values appears in Table 1.
The Semi-Annual Median strategy reaches USD 199.90 by end-2025—the highest terminal value across all configurations. The Quarterly Median (USD 198) is close but slightly lower. By contrast, the Winner strategy reaches only USD 144 under semi-annual rebalancing, failing to match even the passive benchmark (USD 162). Weekly Losers reaches USD 257 but, as shown below, at the cost of catastrophic drawdowns and negative Sharpe ratios during stress periods—a risk-return profile that is unacceptable in practice.
Figure 1 illustrates terminal values across all frequency-strategy combinations. The dominance of the Median row is visible across every frequency: Median finishes first or second at Annual, Semi-Annual, Quarterly, Bi-Weekly, and Weekly, while Winners and Losers are highly frequency-sensitive.

4.2. Portfolio Growth Dynamics

Figure 2 traces the semi-annual cumulative wealth paths from 2007 through 2025. Three episodes are particularly informative.
During the 2008–2009 Global Financial Crisis, long-duration ETFs surged as the Fed cut rates aggressively. The Median strategy captured a portion of this rally while avoiding the subsequent reversal that hurt Winners. By end-2011 the Median had compounded to USD 150 versus USD 124 for Winners and USD 139 for Buy & Hold (Table A1 and Table A5).
During the 2020 COVID-19 shock, the Median portfolio limited losses to approximately 11%, shallower than the benchmark’s 15% and far better than the 30%-plus drawdowns suffered by Losers at higher frequencies. Recovery was swift: by end-2020 the Median had reached USD 203 versus USD 176 for Buy & Hold.
The 2022 rate-hiking cycle is the sharpest stress test in the sample. Long-duration ETFs (TLT, TLH) lost over 30% as the Fed raised rates by 525 basis points. The Median’s MDD of 11.6% in 2022 was substantially shallower than that of Winners (22.4%) and Losers (20.9%), reflecting the intermediate-duration positioning that insulates the portfolio from the worst of both rising- and falling-rate regimes. The strategy recovered to USD 199.90 by end-2025, while Buy & Hold reached only USD 162.

4.3. Risk-Adjusted Performance

Tables A1–A3 and Figures 3 and A2 present the full risk-adjusted picture.
Sharpe ratio. Figure 3 shows Sharpe ratios by frequency. The Semi-Annual Median maintains a full-period Sharpe of 0.61, the highest of any configuration tested and above the benchmark’s 0.47. Table A3 shows year-by-year Sharpe ratios: the Median column is green-dominated across stable years and resilient in stress years, while Winners and Losers show sharp reversals. The green-coding convention (best per row) and red-coding (worst per row) make the Median’s consistency immediately visible.
Sortino ratio. Figure A3 mirrors the Sharpe pattern but amplifies the Median’s advantage by isolating downside deviation. The Semi-Annual Median achieves a Sortino of 0.89, compared with 0.45 for the benchmark. This confirms that the Median’s risk advantage is concentrated in the avoidance of large negative returns rather than reduced volatility overall—a distinction that matters for investors with asymmetric loss aversion.
Maximum drawdown. Table A3 reports annual MDD values. The Semi-Annual Median’s worst drawdown over the full sample is 11.6% (2022), compared with 22.4% for Winners and 20.9% for Losers in the same year. The benchmark recorded 14.4% in 2022. Figure 4 shows that across all frequencies, Median drawdowns are systematically shallower than either extreme strategy. This resilience stems from the intermediate-duration positioning: IEF and IEI are less sensitive to rate shocks than TLT/TLH (Winners in rising-rate periods) and less exposed to reversal risk than SHV/SHY (Losers in falling-rate periods).
Volatility. Table A2 reports annualized volatility alongside the Δ Vol (BH−) column, which measures each strategy’s volatility relative to the benchmark. The Semi-Annual Median’s mean annualized volatility of 4.9% (Table A2) is close to the benchmark’s 4.5% and well below Winners’ mean of 6.3%, although slightly above Losers’ mean of 4.6%. The Δ Vol column shows the Median exceeds the benchmark’s volatility by a mean of only 0.4 percentage points across the full sample, confirming that its higher returns do not come at the cost of materially elevated risk. Figure A1 shows the cross-frequency picture.
Information ratio. Figure A2 plots the full-period Information Ratio versus LUATTRUU by frequency. The Median dominates across all frequencies, peaking at 0.44 semi-annually. Winners and Losers show negative IRs at most frequencies, confirming the active return signal is concentrated entirely in the Median tier. The IR declines monotonically at higher frequencies, consistent with signal degradation as noise overwhelms the ranking mechanism.

4.4. Benchmark Comparison

We evaluate the Semi-Annual Median strategy against four benchmarks: LUATTRUU (primary, market-cap-weighted index), GOVT ETF (investable passive ETF), Static IEF/IEI (50/50 passive intermediate-duration exposure), and Equal-Weight (1/6 in each ETF). Full statistical tests are presented in Section 5.2; we summarize the economic picture here.
Figure 5 shows cumulative wealth paths from 2007. The Median (USD 199.90) leads all four benchmarks by end-2025. LUATTRUU (USD 162), GOVT (USD 160), and Equal-Weight (USD 163) cluster closely together; their convergence confirms that the passive Treasury allocation, regardless of vehicle, earns approximately the same return. The Static IEF/IEI benchmark (USD 175) outperforms the broad market benchmarks, consistent with the intermediate-duration carry premium documented by Ilmanen [6]. The Median’s USD 199.90 terminal value exceeds even the Static IEF/IEI benchmark by about USD 25, indicating that the timing dimension (rotating out of IEF/IEI in the 31% of periods where the ranking signal points elsewhere) adds value beyond passive intermediate exposure.
The 2022 stress episode is particularly revealing: all four benchmarks decline sharply, but the Median’s shallower drawdown and faster recovery confirm that its advantage does not depend on the low-rate environment that characterizes most of the sample.

5. Statistical Validation

Bond returns are serially correlated. Interest rate cycles persist for months or years, and conventional t-tests that assume independence understate true uncertainty. All tests in this section use methods designed for auto-correlated return series. We report exact p-values throughout and note where corrections for multiple testing apply.

5.1. Newey–West HAC Excess Return Test

We test whether the strategy’s mean annualized excess return over the benchmark is statistically different from zero:
$$H_0: \mathbb{E}[e(t)] = 0, \qquad e(t) = R_{\text{strategy},t} - R_{\text{benchmark},t}$$
Bond returns exhibit serial correlation due to persistent interest-rate cycles; conventional OLS standard errors are, therefore, biased downward. We employ the Newey and West [1] HAC estimator with Bartlett kernel weights $w(j) = 1 - j/(L+1)$, automatic lag selection $L = 4\,(n/100)^{2/9}$ [2], and test statistic
$$t = \frac{\bar{e}}{\hat{\sigma}_{NW}/\sqrt{n}}, \qquad \hat{\sigma}_{NW}^{2} = \hat{\gamma}(0) + 2\sum_{j=1}^{L} w(j)\,\hat{\gamma}(j)$$
evaluated against the $t(n-1)$ distribution.
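As an illustration only, the test statistic above can be sketched in a few lines of standard-library Python. The function name `newey_west_tstat`, the division of autocovariances by n, and the flooring of the lag rule to an integer are assumptions of this sketch, not details confirmed by the paper.

```python
import math

def newey_west_tstat(excess, lags=None):
    """t-statistic for H0: mean(excess) = 0 using a Bartlett-weighted HAC variance."""
    n = len(excess)
    if lags is None:
        # Lag rule from the text; flooring to an integer is an assumption here.
        lags = int(math.floor(4 * (n / 100) ** (2 / 9)))
    mean = sum(excess) / n
    dev = [x - mean for x in excess]

    def gamma(j):
        # Sample autocovariance at lag j (dividing by n is an assumed convention).
        return sum(dev[t] * dev[t - j] for t in range(j, n)) / n

    var_nw = gamma(0)
    for j in range(1, lags + 1):
        weight = 1 - j / (lags + 1)      # Bartlett kernel w(j)
        var_nw += 2 * weight * gamma(j)
    se = math.sqrt(var_nw / n)           # HAC standard error of the mean
    return mean / se, se, lags
```

With `lags=0` the function reduces to an ordinary t-statistic without serial-correlation adjustment, which is a convenient sanity check before applying it to a real excess-return series.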
Results. Table 2 reports results across three strategies and four rebalancing frequencies. The Semi-Annual Median is the only configuration significant at the 5% level: excess return +1.22%, t = 2.16, p = 0.031, 95% CI [0.11%, 2.32%]. Neither Winners nor Losers achieves significance at any frequency (p > 0.50). Weekly Winners produces a significantly negative excess return (p = 0.004), confirming that momentum is actively harmful in Treasury duration space at short horizons. The Wilcoxon signed-rank test yields p = 0.039 for the Semi-Annual Median, consistent with the parametric result.
Testing nine configurations simultaneously implies a Bonferroni threshold of α = 0.006 ; at p = 0.031 the result does not survive correction. We, therefore, characterize the evidence as strongly suggestive rather than definitive, and note that three independent tests in Section 5.2 and Section 5.3 and the out-of-sample validation in Section 6 collectively corroborate the finding.
Sample size and power. Figure 6 addresses the concern that n = 37 limits statistical power. The implied NW-HAC standard error is 0.56%, giving a minimum detectable excess return of 1.58% at 80% power. This is marginally above the observed 1.22%, placing the test at approximately 58% power. The Quarterly Median (n = 75) produces an identical effect size of 1.22% with p = 0.063, confirming the magnitude estimate is stable: the higher p-value reflects increased ranking noise at quarterly frequency, not a smaller underlying effect. The out-of-sample walk-forward in Section 6.5 yields p = 0.0005 across 27 independent evaluations, providing stronger evidence that does not depend on the size of the in-sample window.
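The power figures quoted above can be approximately reproduced with a normal approximation. The sketch below is a hedged illustration: it backs the standard error out of the reported effect and t-statistic (1.22 / 2.16), uses a two-sided 5% test, ignores the negligible opposite tail, and uses the normal rather than the t(n − 1) distribution. The function names are hypothetical.

```python
from statistics import NormalDist

def min_detectable_effect(se, alpha=0.05, power=0.80):
    """Smallest true effect detectable at the given power (normal approximation)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * se

def approx_power(effect, se, alpha=0.05):
    """Approximate power of a two-sided test, ignoring the far tail."""
    z = NormalDist()
    return z.cdf(effect / se - z.inv_cdf(1 - alpha / 2))

# Standard error implied by the reported excess return and t-statistic (assumption).
se = 1.22 / 2.16                     # about 0.56 percentage points
mde = min_detectable_effect(se)      # about 1.58 percentage points
power_at_observed = approx_power(1.22, se)   # about 0.58
```

Under these assumptions the sketch lands close to the 1.58% minimum detectable effect and 58% power stated in the text.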

5.2. Lo (2002) Sharpe Ratio Inference

We test whether the annualized Sharpe ratio of the Semi-Annual Median strategy is statistically distinguishable from zero:
$$H_0: SR_{\text{ann}} = 0$$
The naive standard error of the Sharpe ratio is understated when returns exhibit serial correlation. Lo [3] derives the asymptotic variance under AR(1) dynamics:
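$$\operatorname{Var}(SR_{\text{ann}}) \approx \frac{1}{n}\left(1 + \frac{1}{2}SR_p^{2}\right) f + \frac{2\rho_1}{1-\rho_1}\cdot\frac{f}{n}$$
where $SR_p$ is the per-period Sharpe ratio, $f$ the number of periods per year, $\rho_1$ the first-order autocorrelation coefficient, and $n$ the sample size. The test statistic is
$$t_{Lo} = \frac{SR_{\text{ann}}}{\sqrt{\operatorname{Var}(SR_{\text{ann}})}}$$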
A Ljung–Box Q ( 1 ) test is reported alongside to confirm whether serial correlation is empirically present.
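A minimal sketch of this calculation follows, implementing the variance expression as it appears in the text. The conversion $SR_p = SR_{\text{ann}}/\sqrt{f}$ and the function name `lo_sharpe_test` are assumptions of this sketch; the autocorrelation value used in the example is a placeholder, not the paper's estimate.

```python
import math

def lo_sharpe_test(sr_ann, n, periods_per_year, rho1):
    """Return (t statistic, variance) for H0: annualized Sharpe ratio = 0."""
    f = periods_per_year
    sr_p = sr_ann / math.sqrt(f)     # per-period Sharpe (assumed sqrt-f scaling)
    # Variance as written in the text: iid term plus AR(1) adjustment.
    var = (1 / n) * (1 + 0.5 * sr_p ** 2) * f + (2 * rho1 / (1 - rho1)) * f / n
    return sr_ann / math.sqrt(var), var

# Semi-annual Median inputs from the text, with rho1 = 0 as a placeholder.
t_iid, var_iid = lo_sharpe_test(0.606, n=37, periods_per_year=2, rho1=0.0)
```

With zero autocorrelation the statistic comes out slightly below the reported t = 2.58; a small negative first-order autocorrelation would raise it, while positive autocorrelation lowers it.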
Results. Table 3 presents results for all three rotation strategies and the Buy & Hold benchmark across all four frequencies. Figure 7 displays the Sharpe ratios with Lo (2002) standard-error bars (left panel) and first-order autocorrelation coefficients (right panel).
Interpretation. The Semi-Annual Median Sharpe ratio of 0.606 is statistically significant (t = 2.58, p = 0.014), independently corroborating the Newey–West excess return test from a risk-adjusted perspective. Crucially, the Quarterly Median is also significant (SR = 0.540, p = 0.026), providing corroboration at a second frequency with a larger sample (n = 75).
Two results require honest discussion. First, the Buy & Hold benchmark also achieves significance at semi-annual ( p = 0.046 ) and quarterly ( p = 0.046 ) frequencies. This is not surprising—Treasury bonds earn a positive risk-adjusted return in any environment where the term premium is positive—and it does not undermine the Median result. What matters is that the Median’s Sharpe of 0.606 exceeds the benchmark’s 0.494 at semi-annual frequency, and 0.540 exceeds 0.489 at quarterly, with the Median achieving significance at a more demanding level ( p = 0.014 vs. p = 0.046 ).
Second, Weekly Losers achieves significance (SR = 0.565, p = 0.008), with the Ljung–Box test flagging autocorrelation (p = 0.014). This is consistent with short-term mean-reversion in Treasury yields: the period’s worst-performing maturity tends to partially recover the following week. The Lo correction is, therefore, most material precisely here, where serial correlation is empirically present.
First-order autocorrelation is small across all configurations ($|\rho_1| < 0.15$) except Annual Winners ($\rho_1 = 0.275$), and the Ljung–Box test does not flag significant autocorrelation at annual or semi-annual frequencies, indicating the iid approximation is broadly reasonable at the primary frequencies of interest.

5.3. Support Test: Block Bootstrap Sharpe Difference

Null Hypothesis.
$$H_0: \text{Sharpe}(\text{Median}) - \text{Sharpe}(\text{benchmark}) = 0$$
Figure 8 presents results from 10,000 circular block bootstrap draws across all strategies and frequencies. The observed Semi-Annual Median Sharpe difference of +0.112 is positive in 88% of bootstrap replications. The 95% confidence interval [−0.073, +0.245] includes zero, indicating that the result does not achieve formal statistical significance under the confidence interval criterion. However, the 88% positive rate provides strong distributional evidence in favor of the Median strategy’s risk-adjusted outperformance.
The cross-strategy pattern in Figure 8 is as informative as the Median result itself. The Winners strategy generates a positive Sharpe difference in only 1% of semi-annual bootstrap draws, with a confidence interval of [−0.480, −0.054] that lies entirely below zero. This indicates that Winners is Sharpe-destroying at the semi-annual frequency rather than merely neutral. The Losers strategy, with 11% positive semi-annual draws, exhibits a similar pattern.
At the weekly frequency, the Winners strategy collapses to 0% positive draws, indicating that the momentum-based allocation in Treasury duration space is penalized most severely at the shortest rebalancing horizon, where rate reversals are sharpest. The Losers strategy at the weekly frequency achieves 48% positive draws, consistent with a mean-reversion narrative but statistically indistinguishable from noise.
Overall, the bootstrap analysis serves three purposes simultaneously: it corroborates the Median strategy’s advantage, confirms that the inferiority of the Winners strategy is not attributable to sampling error, and demonstrates that the high terminal value observed for the weekly Losers strategy is driven by distributional variation rather than a persistent signal.
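For readers who want to see the mechanics, the following is a minimal sketch of a circular block bootstrap for the Sharpe difference. The block length of 4, the zero risk-free rate, the percentile confidence interval, and all function names are assumptions of this sketch; the paper does not state these choices in this section.

```python
import math
import random

def sharpe(returns, periods_per_year):
    """Annualized Sharpe ratio with a zero risk-free rate (assumption)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def circular_block_indices(n, block_len, rng):
    """Resample n indices in blocks that wrap around the end of the series."""
    idx = []
    while len(idx) < n:
        start = rng.randrange(n)
        idx.extend((start + k) % n for k in range(block_len))
    return idx[:n]

def bootstrap_sharpe_diff(strategy, benchmark, periods_per_year=2,
                          block_len=4, draws=10_000, seed=0):
    """Return (ci_low, ci_high, share_positive) for Sharpe(strategy) - Sharpe(benchmark)."""
    rng = random.Random(seed)
    n = len(strategy)
    diffs = []
    for _ in range(draws):
        # The same indices are used for both series to preserve their pairing.
        idx = circular_block_indices(n, block_len, rng)
        s = [strategy[i] for i in idx]
        b = [benchmark[i] for i in idx]
        diffs.append(sharpe(s, periods_per_year) - sharpe(b, periods_per_year))
    diffs.sort()
    ci_low = diffs[int(0.025 * draws)]
    ci_high = diffs[min(draws - 1, int(0.975 * draws))]
    share_positive = sum(d > 0 for d in diffs) / draws
    return ci_low, ci_high, share_positive
```

Resampling both series with the same indices keeps each period's strategy and benchmark returns paired, so the bootstrap distribution reflects the difference rather than two unrelated draws.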

6. Robustness Analysis

Each test below is hypothesis-driven, addressing a specific potential source of fragility. The tests are not exploratory: each states a null hypothesis, applies a defined procedure, and reports a verdict. Start-date and end-date analyses use gross-of-cost returns to isolate the rotation signal; skip-period, lookback, and out-of-sample tests use net-of-cost returns to stress-test the implementable strategy.

6.1. Start-Date Sensitivity

H0: The Median strategy’s outperformance is an artifact of December calendar alignment and does not hold under alternative rebalancing anchors.
Procedure. We re-run the full backtest twelve times, shifting the rebalancing anchor forward by one month each time (January through December). Terminal portfolio values, Sharpe ratios, volatility, and maximum drawdown are recorded for each offset.
Results. Figure 9 presents box-and-whisker distributions across the twelve anchors. The Sharpe panel shows the Semi-Annual Median distribution sitting consistently above the Buy & Hold mean (dashed red line), with interquartile range 0.22–1.40 and mean 0.89. Volatility clusters tightly between 4.7% and 7.4%; maximum drawdown ranges from 2.5% to 9.3%, compact relative to either extreme strategy.
Table 4 reports the granular results. The Median beats Buy & Hold in 10 of 12 start months (83% hit rate). The two failure months, February and August, share the same high-volatility configuration (Sharpe 0.22, volatility 7.4%), consistent with periods of elevated rate uncertainty compressing the ranking signal. Importantly, the Median’s coefficient of variation across the twelve configurations is only 8.8% compared with 17.9% for Winners and 16.2% for Losers, confirming structural stability rather than calendar luck.
Verdict: Reject H0. The outperformance is not calendar-specific.

6.2. Rolling End-Date Persistence

H0: The Median’s rank-1 position among rotation strategies is an artifact of the specific 2025 end-date.
Procedure. We fix the start at 2007 and roll the evaluation endpoint from 2015 through 2025 one year at a time. At each endpoint, we record terminal values for all strategies and rank the Median among the three rotation strategies.
Results. Figure 10 shows the Median maintains the highest terminal value among all rotation strategies at every one of the 11 rolling endpoints. The bottom panel confirms: Median rank 1 in 100% of endpoints. The gap widens over time, survives the 2022 hiking cycle, and is not attributable to any particular macro regime.
Verdict: Reject H0. The result is not end-date specific.

6.3. Execution-Lag Robustness

H0: The strategy requires precise same-period execution; a one-period execution lag eliminates outperformance over Buy & Hold.
Procedure. The skip-period variant ranks ETFs using the return from one period earlier, simulating a full semi-annual execution delay. All results are net of 3.5 bps round-trip transaction costs.
Results. Figure 11 and Table 5 report the comparison. At semi-annual frequency the execution lag costs 74 bps of CAGR (3.79% → 3.05%) and reduces Sharpe from 0.606 to 0.560. Critically, the skip-1 CAGR of 3.05% still exceeds the Buy & Hold CAGR of 2.17%, so outperformance survives the lag. At quarterly frequency, the skip-1 CAGR is identical to the base case (3.69% vs. 3.69%), suggesting additional return smoothing benefits that frequency. At annual frequency, the skip-1 Sharpe actually improves from 0.408 to 0.426.
Verdict: Do not fully reject H0 at semi-annual frequency—execution timing costs 74 bps of CAGR and investors who cannot execute promptly should expect reduced performance. However, outperformance over Buy & Hold survives at all four frequencies tested.

6.4. Lookback Sensitivity

H0: The Median advantage depends on the one-period lookback window and does not hold under alternative lookback lengths.
Procedure. We re-run the backtest using 1-, 2-, and 3-period cumulative lookback windows across all four frequencies, recording net-of-cost Sharpe and CAGR for each of the twelve strategy-frequency combinations.
Results. Figure 12 presents the results as heat maps. The Median row is unambiguously the darkest row in both the Sharpe panel (left) and CAGR panel (right) across every frequency column. At semi-annual frequency, the Median achieves its peak Sharpe of 0.606 and CAGR of 3.79%—the highest values in their respective panels. Winners’ Sharpe peaks at 0.298 and CAGR at 2.14% across all twelve cells; Losers peaks at 0.565 Sharpe (weekly only) and CAGR at 3.96% (weekly only).
The Weekly Losers result (Sharpe 0.565, CAGR 3.96%) is the only cell approaching the Median’s performance. This reflects short-term mean-reversion in Treasury yields (the week’s worst performer tends to partially recover the following week) and is a distinct mechanism from the intermediate-maturity carry advantage that drives the Semi-Annual Median. Winners at weekly (−1.43% CAGR, Sharpe −0.119) is the only negative cell in either panel, confirming that momentum-chasing is structurally harmful in Treasury duration space at short horizons.
Verdict: Reject H0. The Median advantage holds across all lookback windows tested.

6.5. Walk-Forward Out-of-Sample Validation

H0: The Median strategy’s in-sample outperformance does not persist when evaluated on data not used for strategy identification.
Procedure. We implement an expanding-window walk-forward procedure. Beginning with a minimum training window of 10 semi-annual periods (2007–2011), we identify at each step the strategy with the highest Sharpe ratio using only data available at that point in time, then hold that strategy for the subsequent period without re-estimation. Strategy selection uses the training-window Sharpe ratio—the paper’s primary performance metric—ensuring consistency between in-sample evaluation and out-of-sample selection. This yields 27 sequential out-of-sample evaluations from June 2012 through June 2025, covering the COVID shock and the 2022–2023 rate-hiking cycle.
Two wealth paths are tracked. The adaptive path follows the mechanical selection rule (which could select Winners, Median, or Losers at each step). The always-Median path holds Median throughout, providing a clean OOS benchmark for the strategy identified in-sample.
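The expanding-window selection rule described above can be sketched as follows. This is an illustrative reading of the procedure, assuming per-period returns for each strategy are already available; the function names and the use of an unannualized Sharpe ratio for ranking are assumptions (annualization does not change the ordering within one frequency).

```python
import statistics

def period_sharpe(returns):
    """Per-period Sharpe ratio with a zero risk-free rate (assumption)."""
    return statistics.mean(returns) / statistics.stdev(returns)

def walk_forward(strategy_returns, min_train=10):
    """Expanding-window selection.

    strategy_returns maps a strategy name to its list of per-period returns
    (all lists the same length). At each step t, the strategy with the best
    Sharpe ratio over periods [0, t) is held for period t.
    """
    names = list(strategy_returns)
    n = len(strategy_returns[names[0]])
    picks, adaptive_oos = [], []
    for t in range(min_train, n):
        # Only data strictly before period t is used, so there is no look-ahead.
        best = max(names, key=lambda k: period_sharpe(strategy_returns[k][:t]))
        picks.append(best)
        adaptive_oos.append(strategy_returns[best][t])
    return picks, adaptive_oos
```

With 37 semi-annual periods and a 10-period minimum training window, the loop produces 37 − 10 = 27 out-of-sample evaluations, matching the count in the text. The always-Median path is simply `strategy_returns["Median"][min_train:]`.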
Results. Figure 13 presents both panels. The always-Median path reaches USD 133.24 by June 2025 against Buy & Hold’s USD 116.79 from a USD 100 OOS base—a CAGR of 2.15% versus 1.20%. The adaptive path reaches USD 130.42, marginally below always-Median because the first OOS period selected Losers; from the second OOS period onward, the adaptive rule selected Median in every single evaluation. The bottom panel makes this visible: 26 of 27 bars sit on the Median row, with one Losers bar at the far-left extreme.
Table 6 summarizes the OOS performance statistics. The always-Median strategy achieves an annualized OOS excess return of +0.94% over the benchmark, with an OOS NW-HAC p-value of 0.0005, strongly significant across 27 independent evaluations. The OOS Sharpe of 0.481 is substantially above the benchmark’s 0.272. The OOS maximum drawdown of 11.6% matches the in-sample figure, confirming that the risk profile documented in-sample is a stable structural property rather than a favorable draw from a volatile distribution.
The OOS p-value of 0.0005 is substantially stronger than the in-sample result ( p = 0.031 , Section 5.1). This is the expected direction: the OOS test operates on 27 periods with no parameter uncertainty, while the in-sample test operates on 37 periods but carries the full burden of multiple-testing and selection effects. The convergence of in-sample and OOS evidence—same strategy, same frequency, consistent effect size, strongly significant in both settings—constitutes the core empirical finding of this paper.
Verdict: Reject H0. The Median strategy’s outperformance persists out-of-sample across a 13-year evaluation window covering three distinct macroeconomic regimes.

6.6. Summary

Across five independent tests, the Median strategy passes four outright and produces a qualified pass on the fifth. No test identifies a configuration under which the strategy underperforms the passive benchmark.
The start-date and end-date tests confirm the result is not a calendar or endpoint artifact: Median beats Buy & Hold in 10 of 12 calendar anchors and ranks first at every rolling endpoint from 2015 through 2025. The lookback test confirms the advantage is not tied to the one-period ranking window. The execution-lag test is the qualified pass: outperformance survives a one-period delay at all frequencies, though at a cost of 74 basis points at the primary semi-annual frequency.
The walk-forward out-of-sample test is the most important. Over 27 independent evaluations from June 2012 through June 2025, the always-Median strategy yields an OOS NW-HAC p-value of 0.0005, substantially stronger than the in-sample result, with the adaptive selection rule choosing Median in 26 of 27 periods (96%). This result does not depend on the in-sample sample size, the calendar alignment, or any specification choice tested above. It is the primary answer to the in-sample selection concern raised in Section 5.

7. Transaction Cost Analysis

A rotation strategy’s gross outperformance is necessary but not sufficient: it must survive the friction of periodic rebalancing. This section quantifies transaction costs, tests whether the Median strategy’s excess return survives after cost deduction, and identifies the rebalancing frequency at which cost drag begins to erode the advantage.

7.1. Cost Assumptions

Three cost components are relevant for exchange-traded Treasury ETFs: bid–ask spreads, brokerage commissions, and market impact. Brokerage commissions have been effectively zero at major U.S. brokerages since 2019. Market impact is negligible given the deep liquidity of the ETFs in our universe. The analysis, therefore, centres on bid–ask spreads as the primary friction source.
Half-spread estimates reflect representative median quoted spreads over 2020–2024, cross-referenced with iShares fund-level reporting. Short-maturity ETFs carry tighter spreads owing to lower duration risk and higher trading volume; long-maturity ETFs carry modestly wider spreads. Table 7 reports the full schedule.
The equally weighted average round-trip cost of 3.5 bps serves as the base-case assumption. The post-cost portfolio value evolves as:
$$V_t = V_t^{-} \times (1 - \tau_t c) \times (1 + R_{g,t})$$
where $V_t^{-}$ is portfolio value immediately before rebalancing, $\tau_t \in \{0, 0.5, 1\}$ is turnover (0 if the same pair is held, 0.5 if one ETF is replaced, 1 if both are replaced), and $c = 0.00035$ is the round-trip cost.
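A short sketch of this recursion is given below. It reads $R_{g,t}$ as the gross return earned over period t after the rebalance, which is an interpretation rather than something stated explicitly; the function name and the example inputs are hypothetical.

```python
def net_wealth_path(gross_returns, turnovers, cost=0.00035, start=100.0):
    """Compound wealth, charging turnover * cost at each rebalance.

    gross_returns: per-period gross returns (assumed reading of R_{g,t}).
    turnovers: per-period turnover values in {0, 0.5, 1}.
    """
    value = start
    path = []
    for r, tau in zip(gross_returns, turnovers):
        value = value * (1 - tau * cost) * (1 + r)   # cost first, then return
        path.append(value)
    return path

# Hypothetical two-period example: replace both ETFs, then replace one.
example_path = net_wealth_path([0.02, -0.01], [1, 0.5])
```

Because the cost term scales with turnover, a period in which the same pair is held (turnover 0) compounds at the gross return with no drag.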

7.2. Turnover and Cost Drag

Table 8 reports turnover statistics across the 38 semi-annual periods. The Median strategy has the lowest trade frequency (60.5%) and average turnover per period (39.5%) of the three strategies, reflecting the structural stability of the intermediate maturity tier. Total cumulative cost over the full sample is USD 0.81 from a USD 100 starting portfolio—the lowest of any rotation strategy.

7.3. Net-of-Cost Performance

Table 9 presents gross and net performance metrics side by side. The Median strategy loses only 3 bps of CAGR after costs (3.81% gross vs. 3.78% net) and 0.004 of Sharpe ratio (0.610 vs. 0.606). The maximum drawdown widens by only 6 bps (11.59% to 11.65%). All gross advantages over the benchmark are preserved net of costs.
Figure 14 plots the net-of-cost cumulative wealth paths. The Median’s separation from all other strategies is visible from 2012 onward and widens through the 2022 stress episode, confirming that cost drag does not materially alter the return profile. Figure 15 shows cumulative dollar cost drag over time: the Median’s USD 0.81 terminal drag is the lowest of the three strategies, and the gap versus Winners (USD 1.11) reflects the Median’s lower portfolio turnover.

7.4. Scenario Analysis

Table 10 applies three cost multipliers to the 3.5 bps base case. The Median strategy outperforms the benchmark under all three scenarios. Even under the high-cost scenario (7.0 bps), the net CAGR of 3.76% represents a 111 bps advantage over the benchmark. Figure 16 visualises the results; the Median bar clears the Buy & Hold reference line comfortably in every scenario while Winners and Losers remain below it.

7.5. Break-Even Analysis

The break-even cost is the round-trip transaction cost at which the Median strategy’s net CAGR equals that of the passive benchmark. Figure 17 shows the Median CAGR declining linearly as costs rise, crossing the benchmark line at 145 bps—41 times the realistic base-case cost of 3.5 bps. This margin provides substantial protection against execution slippage, bid–ask widening in stress conditions, or institutional-scale market impact.
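One way to locate such a break-even point numerically is bisection on the round-trip cost, sketched below under the same cost recursion as in Section 7.1. The function names, the search bracket, and the synthetic inputs in the usage note are assumptions for illustration; they are not the paper's data.

```python
def net_cagr(gross_returns, turnovers, cost, periods_per_year=2):
    """Annualized net growth rate after charging turnover * cost each period."""
    value = 1.0
    for r, tau in zip(gross_returns, turnovers):
        value *= (1 - tau * cost) * (1 + r)
    years = len(gross_returns) / periods_per_year
    return value ** (1 / years) - 1

def break_even_cost(gross_returns, turnovers, benchmark_cagr,
                    upper=0.05, tol=1e-10, periods_per_year=2):
    """Round-trip cost at which net CAGR equals the benchmark CAGR."""
    def gap(c):
        return net_cagr(gross_returns, turnovers, c, periods_per_year) - benchmark_cagr
    if gap(0.0) <= 0:
        return 0.0                      # no advantage even before costs
    if gap(upper) > 0:
        raise ValueError("break-even cost lies above the search bracket")
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid                    # still ahead of benchmark: cost can rise
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, with a synthetic series of ten 2% semi-annual returns, 50% turnover each period, and a 3% benchmark CAGR, `break_even_cost([0.02] * 10, [0.5] * 10, 0.03)` returns roughly 0.0100, i.e., about 100 bps.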

7.6. Cross-Frequency Cost Comparison

Transaction costs scale non-linearly with rebalancing frequency. Table 11 shows that semi-annual rebalancing achieves the highest net alpha (1.13%) with only 2.9 bps of CAGR drag. Monthly and weekly rebalancing both fail to beat the benchmark net of costs, and daily rebalancing produces a negative net CAGR (−0.07%) as cost drag of 340 bps overwhelms gross returns. This confirms that semi-annual is not merely the optimal gross frequency but the optimal net frequency as well.

8. Discussion

The intuition for the Median effect is cleaner than it might first appear. Treasury ETF rankings reflect recent movements along the yield curve: the top-ranked ETF has recently benefited most from falling rates at its maturity, and the bottom-ranked has been hurt most. Holding the top two (Winners) is a bet that this rate movement continues—a momentum bet. Holding the bottom two (Losers) is a bet on reversal. Holding the middle two is neither: it is a bet on the part of the yield curve that is neither trending strongly nor reversing strongly.
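The ranking rule behind these three bets can be sketched in a few lines. The tickers SHY and SHV and all return values below are hypothetical placeholders used only to fill out a six-ETF example; the function name is likewise an assumption.

```python
def tercile_groups(prior_returns):
    """Split a ranked universe into Winners, Median, and Losers of equal size.

    prior_returns maps a ticker to its return over the prior period.
    """
    ranked = sorted(prior_returns, key=prior_returns.get, reverse=True)
    k = len(ranked) // 3                 # two ETFs per group for a six-ETF universe
    return {
        "Winners": ranked[:k],           # momentum bet
        "Median": ranked[k:2 * k],       # neither momentum nor reversal
        "Losers": ranked[2 * k:],        # reversal bet
    }

# Hypothetical prior-period returns after a fall in long-term rates.
example = tercile_groups({
    "TLT": 0.060, "TLH": 0.050, "IEF": 0.030,
    "IEI": 0.020, "SHY": 0.010, "SHV": 0.005,
})
```

In this made-up period the long-duration funds lead and the short-duration funds lag, so the Median group holds IEF and IEI, the intermediate pair discussed in the following paragraphs.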
Carry-based strategies rank bonds by yield-to-duration ratio or forward rate combinations, requiring either yield data or macro factor estimation. The Median strategy ranks solely by realized total return, a signal observable from any price feed. The fact that these two approaches converge on similar holdings (intermediate maturities in approximately 69% of semi-annual periods) is not a weakness; it is corroborating evidence that the return-ranking mechanism captures the same underlying carry premium through a simpler, noisier signal. Table 12 confirms this directly: IEF appears in the Median bucket in 86% of semi-annual periods and IEI in 69%, while TLT and TLH, the long-duration instruments, appear in the Median bucket in only 6% of periods each. This simplicity is a feature: it eliminates the estimation error and look-ahead risk inherent in yield-based ranking.
This concentration is not coincidental. Ilmanen [6] demonstrates that intermediate-maturity Treasuries (roughly 5–10 year duration) persistently offer the highest carry per unit of duration, the so-called “carry-to-duration sweet spot”, in non-trending rate environments. Table 12 shows that the Median strategy’s return-ranking rule selects IEF and IEI together in 64% of semi-annual periods and selects at least one of them in 92% of periods. A carry-based investor who directly computed yield-to-duration ratios for these six ETFs and selected the top two would arrive at approximately the same holdings in most periods. The key difference is that our strategy does this without any yield data: it identifies the carry-optimal maturity zone through realized total returns alone.
This convergence resolves the apparent puzzle of why a simple return-ranking rule outperforms in a market where classic price momentum is known to be weak [13]. The strategy is not a momentum strategy: it does not chase the recent winner. It is a carry-recovery strategy: it mechanically reselects intermediate maturities that temporarily underperform long-duration ETFs during rate rallies and underperform short-duration ETFs during rate spikes, but which persistently earn the highest carry per unit of risk over full cycles. The return-ranking signal is a noisy but effective proxy for carry-to-duration positioning, which is why the strategy outperforms across lookback windows, calendar anchors, and rebalancing frequencies, all of which produce the same intermediate-maturity tilt.
Ilmanen [6] provides the theoretical anchor: intermediate maturities persistently offer the highest carry per unit of duration in non-trending rate environments. That is precisely what the Median captures. The benchmark comparison confirms this: the Static IEF/IEI portfolio, a passive 50/50 IEF and IEI Buy & Hold, outperforms the broad market benchmarks, consistent with the intermediate-duration carry premium. The Median’s additional margin above the SIB reflects the timing dimension of the rotation rule operating on top of the structural carry advantage.
The 2022 rate shock makes the mechanism vivid: long-duration ETFs (TLT, TLH) lost over 30%, while the IEI/IEF pair, the core of the Median in most periods, fell far less. The Median’s maximum drawdown of 11.6% in 2022 was substantially shallower than that of Winners (22.4%) and Losers (20.9%). This resilience is not luck: it reflects the structural property that intermediate maturities are insulated from both the momentum crashes that hurt Winners and the mean-reversion traps that hurt Losers when rate regimes shift.
The Median strategy’s advantage persists when evaluated against the investable GOVT ETF, confirming the effect is not specific to the Bloomberg index. It also persists against the Equal-Weight benchmark, confirming the advantage reflects rotation skill rather than concentrated two-ETF exposure.
We note the convergence with Puppala et al. [5], who find the identical pattern, middle group wins, in S&P 500 sector ETFs over 2000–2024. That median-tier selection emerges independently in equity sectors and Treasury duration rotation, across different time periods and asset classes, suggests it may reflect a general structural property: in any ranked universe where extreme groups face asymmetric reversion and crash risks, the middle group offers the most consistent return by avoiding both extremes.

Limitations

Sample size. The 18-year sample yields 37 semi-annual observations. The primary p-value of 0.031 does not survive Bonferroni correction across the nine strategy-frequency configurations tested (adjusted threshold: α = 0.006). We characterize the in-sample evidence as strongly suggestive rather than statistically definitive. The walk-forward out-of-sample procedure partially addresses this concern (27 independent evaluations from 2012 to 2025 yield p = 0.0005), but a longer prospective sample remains the only complete resolution.
Rate regime dependence. The sample spans three identifiable environments: the post-crisis low-rate era (2008–2021), the COVID shock (2020), and rapid tightening (2022–2023). The Median’s advantage is most pronounced in the low-rate era. In a persistently rising rate environment the optimal holding might shift toward the short end of the duration spectrum. The 2022 stress episode provides useful evidence (the Median’s 11.6% drawdown was substantially shallower than Winners’ 22.4% and Losers’ 20.9%), but a longer multi-regime sample would be needed for confidence about performance across full rate cycles.
Execution assumptions. We assume trades execute at end-of-period adjusted close prices with no market impact. For retail-scale portfolios in liquid Treasury ETFs this is a reasonable approximation. For institutional-scale implementation in the hundreds of millions, market impact during rebalancing windows could exceed our modeled costs. The 145 bps break-even cost provides a substantial buffer against this concern.
Timing sensitivity. A one-period execution lag reduces CAGR by 74 bps at the semi-annual frequency. The strategy continues to outperform the passive benchmark under the skip-period specification, but investors who cannot execute promptly should expect reduced performance. At quarterly frequency, the lag has no material effect (CAGR delta: 0 bps), and at annual frequency, the skip-period Sharpe marginally improves, suggesting timing sensitivity is specific to the primary semi-annual implementation.
ETF universe and instrument-set dependence. The six iShares Treasury ETFs used in this study are not a sample from a larger universe—they are the universe. No other iShares product provides pure, non-overlapping Treasury duration exposure across the full maturity spectrum. The small N reflects the product set, not a selection decision.
Three avenues for future replication are noted. First, parallel ETF families (Vanguard VGSH/VGIT/VGLT, SPDR SPTS/SPTI/SPTL) offer comparable duration exposures with histories long enough for this strategy to be tested, and we expect similar results given the near-perfect correlation with their iShares equivalents. Second, U.S. Treasury futures contracts, spanning the 2-, 5-, 10-, and 30-year tenors, provide a longer history and higher liquidity, though the four-instrument universe maps less cleanly onto a three-tercile rotation structure. Third, non-U.S. sovereign bond ETFs exist for several G10 markets, but their available histories are short and liquidity profiles differ materially from U.S. Treasuries, making direct comparison unreliable at this stage. We flag this as the most important open question for follow-on research: if the carry-to-duration mechanism is structural rather than market-specific, the Median advantage should appear in other sovereign yield curves as additional ETF history accumulates.

9. Conclusions

We set out to answer a simple question: Does it matter which Treasury maturities you hold, and can a purely mechanical rotation rule capture that difference? The answer is yes, but with a result that cuts against the grain of standard momentum thinking. The best strategy is not to chase the recent winner—it is to hold the middle.
Over 2007–2025, the Semi-Annual Median strategy grew USD 100 to USD 199.90, a CAGR of 3.79% that beats the Bloomberg U.S. Treasury Total Return Index by 162 basis points per year. It did so with lower volatility, a shallower drawdown (11.6% versus 14.4% for the benchmark in 2022), and a higher Sharpe ratio (0.606 versus 0.494) than any competing rotation approach. Net of 3.5 bps round-trip transaction costs the advantage shrinks by just 3 basis points, a margin made vivid by the 145 bps break-even cost that is 41 times the realistic execution cost.
We validate the result with two tests specifically calibrated for serially correlated bond returns. The Newey–West HAC test yields a mean annualized excess return of +1.22% (t = 2.16, p = 0.031). The Lo [3] Sharpe test independently confirms the risk-adjusted advantage (p = 0.014), while the Quarterly Median also achieves Lo significance (p = 0.026), corroborating the finding at a second frequency with a larger sample. Both primary tests converge on the same strategy and the same frequency, providing triangulating evidence from different statistical angles. We acknowledge that p = 0.031 does not survive Bonferroni correction across the nine configurations tested and characterize the in-sample evidence as strongly suggestive rather than definitive.
The most direct answer to the in-sample concern comes from the walk-forward out-of-sample validation. Over 27 genuinely independent evaluations from June 2012 through June 2025, data that played no role in strategy identification, the always-Median strategy achieves a CAGR of 2.15% against the benchmark’s 1.20%, an OOS Sharpe of 0.481 against 0.272, and an OOS NW-HAC p-value of 0.0005. The expanding-window selection rule chose Median in 26 of 27 OOS periods (96%) without any look-ahead, confirming that the in-sample Sharpe advantage reliably predicted OOS superiority. This result does not depend on the calendar alignment, the specific end-date, or the in-sample sample size.
Five robustness tests further reinforce the finding. The strategy beats Buy & Hold in 10 of 12 calendar-anchor configurations (CV of only 8.8%), ranks first among rotation strategies at every rolling endpoint from 2015 through 2025, survives a one-period execution lag at all frequencies, dominates all twelve lookback-window cells, and outperforms all four passive benchmarks, including a duration-matched Static IEF/IEI portfolio. No test identifies a configuration under which the strategy underperforms the passive benchmark.
The result connects to a growing empirical pattern. In a parallel study of S&P 500 sector ETFs over 2000–2024, Puppala et al. [5] report the same pattern: ranking ETFs by return and holding the middle group outperforms both extremes on total return, volatility, and drawdown. That median-tier selection emerges independently in equity sectors and Treasury maturities, across different time periods and asset classes, suggests it may reflect a general structural feature of ranking-based rotation rather than a sample coincidence.
Two limitations remain. First, the strategy’s in-sample advantage is most pronounced in the low-rate era that characterizes most of the 2007–2021 period; the 2022–2023 tightening cycle provides a partial regime test, but a longer multi-regime sample would sharpen this inference. Second, the results are specific to iShares Treasury products; whether the pattern holds for Vanguard equivalents or individual Treasury securities is untested.
The two most productive extensions for future work are cross-market testing in non-U.S. sovereign yield curves—where different rate regimes would provide sharper regime robustness—and integration with macro signals such as yield curve slope or Fed expectations, which could make the rotation regime-conditional rather than purely mechanical.
For practitioners, the takeaway is pragmatic. A simple semi-annual rotation into the middle two Treasury ETFs by recent return has historically produced better risk-adjusted performance than chasing winners, fading losers, or simply buying the market—and the out-of-sample evidence suggests this is not merely a backtest artefact. The strategy is easy to implement, cheap to run, and transparent in its logic. Whether it continues to work in the decades ahead is an empirical question that only time can answer.
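The ranking step behind this rule is short enough to sketch. The example below is a minimal illustration, not the authors' implementation: the function name and the prior-period returns are hypothetical, and how the two Median ETFs are weighted is left to the reader because this section does not specify it.

```python
def bucket_by_prior_return(prior_returns):
    """Rank tickers by prior-period return and split them into three equal groups."""
    ranked = sorted(prior_returns, key=prior_returns.get, reverse=True)
    if len(ranked) % 3 != 0:
        raise ValueError("number of tickers must be divisible by three")
    k = len(ranked) // 3
    return {
        "Winners": ranked[:k],
        "Median": ranked[k:2 * k],
        "Losers": ranked[2 * k:],
    }


# Hypothetical prior half-year returns for the six ETFs, for illustration only.
prior = {"SHV": 0.010, "SHY": 0.012, "IEI": 0.020, "IEF": 0.025, "TLH": 0.035, "TLT": 0.040}
buckets = bucket_by_prior_return(prior)
print(buckets["Median"])  # prints: ['IEF', 'IEI']
```

At each rebalancing date the portfolio would then hold the two tickers in the Median bucket until the next ranking.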

Author Contributions

A.M., S.P. and E.P. contributed equally to this work. Conceptualization: E.P.; Methodology: A.M. and S.P.; Software: A.M. and S.P.; Data Curation: A.M. and S.P.; Investigation: A.M., S.P. and E.P.; Formal Analysis: A.M., S.P. and E.P.; Visualization: A.M. and S.P.; Writing, Original Draft Preparation: A.M., S.P. and E.P.; Writing, Review and Editing: A.M., S.P. and E.P.; Project Administration and Supervision: E.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was conducted without any external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data and code are available at https://github.com/treasury-rotation-research/treasury-etf-rotation (accessed on 8 March 2026).

Acknowledgments

We thank the Department of Computer Science at Boston University Metropolitan College for their support.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

Appendix A. Annualized Returns for U.S. Treasury Rotation (Semi-Annual Frequency)

Table A1. Cumulative Portfolio Value (USD) of Treasury Rotation Strategies (2007–2025).
| Year | Winners | Median | Losers | Buy & Hold |
|---|---|---|---|---|
| 2007 | 108.2 | 108.2 | 108.2 | 109.0 |
| 2008 | 123.4 | 123.8 | 128.7 | 124.0 |
| 2009 | 106.1 | 118.6 | 126.7 | 119.6 |
| 2010 | 106.3 | 124.6 | 144.0 | 126.6 |
| 2011 | 124.4 | 150.0 | 148.1 | 139.0 |
| 2012 | 128.1 | 154.1 | 148.3 | 141.8 |
| 2013 | 124.5 | 152.3 | 132.3 | 137.9 |
| 2014 | 136.5 | 161.6 | 146.0 | 144.9 |
| 2015 | 132.2 | 163.7 | 151.3 | 146.1 |
| 2016 | 133.6 | 165.6 | 152.2 | 147.6 |
| 2017 | 136.5 | 168.7 | 159.7 | 151.0 |
| 2018 | 134.3 | 172.2 | 162.4 | 152.3 |
| 2019 | 147.9 | 187.4 | 167.1 | 162.7 |
| 2020 | 171.5 | 203.3 | 170.3 | 175.8 |
| 2021 | 169.5 | 198.8 | 161.8 | 171.7 |
| 2022 | 135.5 | 180.7 | 138.1 | 150.3 |
| 2023 | 136.5 | 187.5 | 148.7 | 156.4 |
| 2024 | 140.8 | 191.1 | 139.6 | 157.3 |
| 2025 | 144.1 | 199.9 | 142.9 | 162.3 |
| Summary Statistics | | | | |
| Min | 106.1 | 108.2 | 108.2 | 109.0 |
| Max | 171.5 | 203.3 | 170.3 | 175.8 |
| Median | 134.3 | 165.6 | 148.7 | 147.6 |
| Mean | 134.8 | 164.9 | 147.3 | 146.7 |
| Std Dev | 18.0 | 28.0 | 16.6 | 17.8 |
Notes: For Sharpe ratios, higher values indicate better performance. For maximum drawdown (MDD), values closer to zero (less negative) indicate better downside risk control. Green denotes best and red denotes worst within each row.
Table A2. Annualized Volatility with Semi-Annual Re-balancing.
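| Year | B&H | Winners | Winners ΔVol (BH) | Median | Median ΔVol (BH) | Losers | Losers ΔVol (BH) |
|---|---|---|---|---|---|---|---|
| 2007 | 4.1 | 4.8 | 0.7 | 4.8 | 0.7 | 4.8 | 0.7 |
| 2008 | 7.1 | 10.4 | 3.4 | 7.8 | 0.8 | 7.3 | 0.2 |
| 2009 | 6.3 | 12.8 | 6.5 | 7.4 | 1.1 | 9.3 | 3.0 |
| 2010 | 4.7 | 10.3 | 5.6 | 5.2 | 0.5 | 8.2 | 3.4 |
| 2011 | 5.0 | 8.6 | 3.6 | 10.5 | 5.6 | 7.3 | 2.3 |
| 2012 | 3.3 | 11.0 | 7.7 | 3.9 | 0.6 | 0.3 | −3.1 |
| 2013 | 3.0 | 2.9 | −0.1 | 3.4 | 0.4 | 10.5 | 7.5 |
| 2014 | 2.7 | 6.2 | 3.5 | 3.8 | 1.1 | 5.5 | 2.8 |
| 2015 | 4.2 | 9.2 | 5.0 | 4.4 | 0.2 | 7.6 | 3.3 |
| 2016 | 3.9 | 10.0 | 6.1 | 4.2 | 0.3 | 0.5 | −3.4 |
| 2017 | 3.0 | 5.2 | 2.2 | 3.3 | 0.3 | 5.7 | 2.7 |
| 2018 | 3.0 | 5.6 | 2.6 | 2.6 | −0.4 | 5.1 | 2.0 |
| 2019 | 4.2 | 9.2 | 5.0 | 5.3 | 1.1 | 0.7 | −3.5 |
| 2020 | 6.2 | 18.4 | 12.2 | 5.2 | −1.0 | 0.7 | −5.5 |
| 2021 | 4.1 | 1.0 | −3.0 | 3.4 | −0.6 | 12.2 | 8.2 |
| 2022 | 7.3 | 12.4 | 5.2 | 6.9 | −0.3 | 13.7 | 6.4 |
| 2023 | 6.9 | 12.3 | 5.4 | 6.7 | −0.3 | 11.5 | 4.6 |
| 2024 | 5.2 | 2.2 | −2.9 | 4.4 | −0.7 | 12.8 | 7.6 |
| 2025 | 4.9 | 0.8 | −4.1 | 5.4 | 0.4 | 12.7 | 7.8 |
| Summary Statistics | | | | | | | |
| Min | 2.7 | 0.8 | −4.1 | 2.6 | −1.0 | 0.3 | −5.5 |
| Max | 7.3 | 18.4 | 12.2 | 10.5 | 5.6 | 13.7 | 8.2 |
| Median | 4.2 | 9.2 | 3.6 | 4.8 | 0.4 | 7.3 | 2.8 |
| Std Dev | 1.5 | 4.6 | 4.0 | 1.9 | 1.4 | 4.5 | 4.1 |

Mean (only four values are given; their column alignment is unclear): 4.5, 6.3, 4.9, 4.6.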
Notes: For Sharpe ratios, higher values indicate better performance. For maximum drawdown (MDD), values closer to zero (less negative) indicate better downside risk control. Green denotes best and red denotes worst within each row.
Table A3. Sharpe Ratios and Maximum Drawdowns (MDD)—Semi-Annual Rebalancing (2007–2025).
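| Year | Sharpe BH | Sharpe Winners | Sharpe Median | Sharpe Losers | MDD (%) BH | MDD (%) Winners | MDD (%) Median | MDD (%) Losers |
|---|---|---|---|---|---|---|---|---|
| 2007 | 2.0 | 1.7 | 1.7 | 1.7 | −2.3 | −2.9 | −2.9 | −2.9 |
| 2008 | 1.9 | 1.3 | 1.7 | 2.3 | −4.6 | −6.8 | −5.6 | −3.2 |
| 2009 | −0.6 | −1.1 | −0.5 | −0.1 | −6.4 | −17.9 | −7.6 | −7.3 |
| 2010 | 1.3 | 0.1 | 1.0 | 1.6 | −4.1 | −12.4 | −5.8 | −3.5 |
| 2011 | 1.9 | 1.9 | 1.8 | 0.4 | −2.8 | −5.4 | −6.2 | −4.1 |
| 2012 | 0.4 | 0.3 | 0.7 | 0.6 | −2.5 | −7.6 | −2.9 | −0.2 |
| 2013 | −0.9 | −0.9 | −0.3 | −1.0 | −4.5 | −4.9 | −2.8 | −14.0 |
| 2014 | 1.9 | 1.5 | 1.6 | 1.8 | −1.3 | −3.9 | −1.6 | −2.5 |
| 2015 | 0.1 | −0.3 | 0.3 | 0.5 | −3.6 | −11.2 | −3.7 | −4.2 |
| 2016 | 0.3 | 0.2 | 0.3 | 1.1 | −5.8 | −14.5 | −6.4 | −0.5 |
| 2017 | 0.8 | 0.4 | 0.6 | 0.9 | −1.8 | −4.3 | −2.4 | −3.6 |
| 2018 | 0.3 | −0.3 | 0.8 | 0.4 | −2.5 | −6.2 | −2.0 | −5.9 |
| 2019 | 1.6 | 1.1 | 1.6 | 4.2 | −2.7 | −6.9 | −2.6 | −0.3 |
| 2020 | 1.3 | 0.9 | 1.6 | 2.9 | −5.0 | −13.8 | −3.3 | −0.3 |
| 2021 | −0.5 | −1.1 | −0.6 | −0.4 | −4.4 | −1.2 | −3.2 | −13.7 |
| 2022 | −1.8 | −1.7 | −1.4 | −1.1 | −14.4 | −22.4 | −11.6 | −20.9 |
| 2023 | 0.7 | 0.1 | 0.6 | 0.7 | −7.1 | −17.1 | −5.9 | −7.8 |
| 2024 | 0.1 | 1.4 | 0.5 | −0.4 | −4.3 | −1.8 | −4.6 | −11.4 |
| 2025 | 1.5 | 6.0 | 1.8 | 0.5 | −2.6 | −0.3 | −2.5 | −8.4 |
| Summary Statistics | | | | | | | | |
| MIN | −1.8 | −1.7 | −1.4 | −1.1 | −14.4 | −22.4 | −11.6 | −20.9 |
| MAX | 2.0 | 6.0 | 1.8 | 4.2 | −1.3 | −0.3 | −1.6 | −0.2 |
| MED | 0.7 | 0.3 | 0.7 | 0.6 | −4.1 | −6.8 | −3.3 | −4.1 |
| STD | 1.1 | 1.7 | 0.9 | 1.3 | 2.9 | 6.3 | 2.5 | 5.6 |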
Notes: For Sharpe ratios, higher values indicate better performance. For maximum drawdown (MDD), values closer to zero (less negative) indicate better downside risk control. Green denotes best and red denotes worst within each row.
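As a reminder of how a maximum drawdown is obtained from a wealth path, here is a minimal sketch. The function name is ours, and the input is the year-end Median column of Table A1; the within-year drawdowns in Table A3 presumably rest on higher-frequency data that is not reproduced in this appendix, so the year-end figure below is not expected to match them exactly.

```python
def max_drawdown(values):
    """Largest peak-to-trough decline of a wealth path, as a negative fraction."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                 # running high-water mark
        worst = min(worst, v / peak - 1.0)  # decline relative to that mark
    return worst


# Year-end Median portfolio values, 2007-2025, from Table A1.
median_path = [108.2, 123.8, 118.6, 124.6, 150.0, 154.1, 152.3, 161.6, 163.7, 165.6,
               168.7, 172.2, 187.4, 203.3, 198.8, 180.7, 187.5, 191.1, 199.9]
mdd = max_drawdown(median_path)
print(f"{mdd:.1%}")  # prints: -11.1%
```

On year-end values the trough is end-2022 (180.7) measured against the end-2020 peak (203.3).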

Appendix B. Supporting Graphs

Figure A1. Annualized volatility by strategy and rebalancing frequency. The Semi-Annual Median maintains volatility close to the benchmark across all years. Winners and Losers exhibit sharp volatility spikes during crisis periods, particularly 2008–2009 and 2020.
Figure A2. Full-period Information Ratio vs. LUATTRUU by rebalancing frequency. Median IR peaks at semi-annual frequency and declines at higher frequencies, consistent with signal degradation. Winners and Losers exhibit negative IRs at most frequencies.
Figure A3. Annualized Sortino ratio by strategy and rebalancing frequency. The pattern mirrors the Sharpe analysis; the Median strategy’s advantage is amplified by its avoidance of large drawdowns relative to both extreme strategies and the benchmark.

Appendix C. Transaction Costs (2009–2025)

Table A4. Annual Portfolio Values by Strategy (Net of Transaction Costs).
| Year | Winners | Median | Losers | BH | Med − BH |
|---|---|---|---|---|---|
| 2008 | 114.56 | 113.61 | 119.10 | 113.74 | −0.13 |
| 2009 | 98.45 | 108.84 | 117.23 | 109.68 | −0.84 |
| 2010 | 98.52 | 114.25 | 133.19 | 116.12 | −1.87 |
| 2011 | 115.22 | 137.53 | 136.88 | 127.51 | +10.02 |
| 2012 | 118.63 | 141.26 | 137.08 | 130.06 | +11.21 |
| 2013 | 115.28 | 139.53 | 122.18 | 126.48 | +13.05 |
| 2014 | 126.28 | 148.04 | 134.83 | 132.88 | +15.16 |
| 2015 | 122.26 | 149.93 | 139.64 | 134.00 | +15.93 |
| 2016 | 123.53 | 151.60 | 140.44 | 135.38 | +16.22 |
| 2017 | 126.13 | 154.47 | 147.23 | 138.52 | +15.95 |
| 2018 | 124.05 | 157.60 | 149.70 | 139.70 | +17.90 |
| 2019 | 136.56 | 171.42 | 153.94 | 149.28 | +22.14 |
| 2020 | 158.36 | 185.97 | 156.90 | 161.23 | +24.74 |
| 2021 | 156.44 | 181.84 | 149.02 | 157.48 | +24.36 |
| 2022 | 124.97 | 165.19 | 127.14 | 137.86 | +27.33 |
| 2023 | 125.84 | 171.43 | 136.81 | 143.44 | +27.99 |
| 2024 | 129.75 | 174.71 | 128.36 | 144.27 | +30.44 |
| 2025 | 132.74 | 182.64 | 131.42 | 148.92 | +33.72 |
| Summary | 132.74 | 182.64 | 131.42 | 148.92 | 16/18 years > BH |
Notes: For Sharpe ratios, higher values indicate better performance. For maximum drawdown (MDD), values closer to zero (less negative) indicate better downside risk control. Green denotes best and red denotes worst within each row.

Appendix D. Semi-Annual ETF Memberships (2009–2025)

Table A5. Strategy Components—Semi-Annual (2009–2025). Each half-year lists two ETFs per bucket.
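| Year | Winners H1 | Winners H2 | Median H1 | Median H2 | Losers H1 | Losers H2 |
|---|---|---|---|---|---|---|
| 2008 | TLH, TLT | IEI, IEF | IEI, IEF | SHY, TLH | SHV, SHY | SHV, TLT |
| 2009 | TLH, TLT | SHY, SHV | IEI, IEF | IEF, IEI | SHV, SHY | TLT, TLH |
| 2010 | SHY, IEI | TLH, TLT | IEF, SHV | IEI, IEF | TLT, TLH | SHV, SHY |
| 2011 | SHY, IEI | TLH, IEF | IEF, SHV | TLT, IEI | TLT, TLH | SHV, SHY |
| 2012 | TLH, TLT | TLH, TLT | IEI, IEF | IEI, IEF | SHV, SHY | SHV, SHY |
| 2013 | IEI, IEF | SHY, SHV | SHV, SHY | IEF, IEI | TLT, TLH | TLT, TLH |
| 2014 | SHV, SHY | TLH, TLT | IEF, IEI | IEI, IEF | TLT, TLH | SHV, SHY |
| 2015 | TLH, TLT | SHY, IEI | IEI, IEF | IEF, SHV | SHV, SHY | TLT, TLH |
| 2016 | TLH, TLT | TLH, TLT | IEI, IEF | IEI, IEF | SHY, SHV | SHV, SHY |
| 2017 | SHY, SHV | TLH, TLT | IEF, IEI | IEI, IEF | TLT, TLH | SHV, SHY |
| 2018 | TLH, TLT | SHY, SHV | IEF, SHV | IEF, IEI | IEI, SHY | TLT, TLH |
| 2019 | TLH, IEF | TLH, TLT | TLT, IEI | IEI, IEF | SHV, SHY | SHV, SHY |
| 2020 | TLH, TLT | TLH, TLT | IEI, IEF | IEI, IEF | SHY, SHV | SHV, SHY |
| 2021 | IEI, SHY | SHY, SHV | IEF, SHV | IEF, IEI | TLT, TLH | TLT, TLH |
| 2022 | TLH, TLT | SHY, SHV | SHV, IEF | IEF, IEI | IEI, SHY | TLT, TLH |
| 2023 | SHY, SHV | TLH, TLT | IEF, IEI | IEF, SHV | TLT, TLH | SHY, IEI |
| 2024 | SHY, IEI | SHY, SHV | IEF, SHV | IEF, IEI | TLT, TLH | TLT, TLH |
| 2025 | SHV, SHY | IEI, IEF | IEF, IEI | SHY, TLH | TLT, TLH | TLT, SHV |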

References

  1. Newey, W.K.; West, K.D. A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica 1987, 55, 703–708.
  2. Newey, W.K.; West, K.D. Automatic Lag Selection in Covariance Matrix Estimation. Rev. Econ. Stud. 1994, 61, 631–653.
  3. Lo, A.W. The Statistics of Sharpe Ratios. Financ. Anal. J. 2002, 58, 36–52.
  4. Harvey, C.R.; Liu, Y.; Zhu, H. …and the Cross-Section of Expected Returns. Rev. Financ. Stud. 2016, 29, 5–68.
  5. Puppala, S.; Malhotra, A.; Sinha, S.; Pinsky, E. Comparing ‘Winners,’ ‘Median,’ and ‘Losers’ Rotation Strategies with S&P 500 Sector ETFs. J. Invest. 2026.
  6. Ilmanen, A. Expected Returns: An Investor’s Guide to Harvesting Market Rewards; John Wiley & Sons: Hoboken, NJ, USA, 2011.
  7. Jostova, G.; Nikolova, S.; Philipov, A.; Stahel, C.W. Momentum in Corporate Bond Returns. Rev. Financ. Stud. 2013, 26, 1649–1693.
  8. Gebhardt, W.R.; Hvidkjaer, S.; Swaminathan, B. The Cross-Section of Expected Corporate Bond Returns. J. Financ. Econ. 2005, 75, 85–114.
  9. Asness, C.S.; Moskowitz, T.J.; Pedersen, L.H. Value and Momentum Everywhere. J. Financ. 2013, 68, 929–985.
  10. Campbell, J.Y.; Shiller, R.J. Yield Spreads and Interest Rate Movements: A Bird’s Eye View. Rev. Econ. Stud. 1991, 58, 495–514.
  11. Cochrane, J.H.; Piazzesi, M. Bond Risk Premia. Am. Econ. Rev. 2005, 95, 138–160.
  12. Ludvigson, S.C.; Ng, S. Macro Factors in Bond Risk Premia. Rev. Financ. Stud. 2009, 22, 5027–5067.
  13. Nadler, D.; Schmidt, A.B. Treasury Bond Momentum and Its Economic Underpinnings. J. Portf. Manag. 2019, 45, 149–159.
  14. Tsai, P.; Zhen, W. Treasury ETF Portfolio Construction and Duration Management. J. Fixed Income 2020, 30, 28–45.
  15. Jegadeesh, N.; Titman, S. Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency. J. Financ. 1993, 48, 65–91.
  16. Fama, E.F.; French, K.R. Multifactor Explanations of Asset Pricing Anomalies. J. Financ. 1996, 51, 55–84.
  17. Agapova, A. Conventional Mutual Index Funds versus Exchange-Traded Funds. J. Financ. Mark. 2011, 14, 323–343.
  18. Basile, I.; Ferrari, P. (Eds.) Asset Management and Institutional Investors: Methods and Tools for Asset Allocation, Portfolio Management and Performance Evaluation; Springer: Berlin/Heidelberg, Germany, 2024.
  19. Dannhauser, C.D. The Impact of Innovation: Evidence from Corporate Bond ETFs. J. Financ. Econ. 2017, 125, 537–560.
  20. Sullivan, R.; Timmermann, A.; White, H. Data-Snooping, Technical Trading Rule Performance, and the Bootstrap. J. Financ. 1999, 54, 1647–1691.
  21. Bacon, C.R. Practical Portfolio Performance Measurement and Attribution, 1st ed.; The Wiley Finance Series; John Wiley & Sons: Hoboken, NJ, USA, 2013.
  22. iShares. SHV: iShares Short Treasury Bond ETF Factsheet. 2024. Available online: https://www.ishares.com/us/products/239466/ishares-short-treasury-bond-etf (accessed on 10 March 2026).
  23. iShares. SHY: iShares 1–3 Year Treasury Bond ETF Factsheet. 2024. Available online: https://www.ishares.com/us/products/239452/ishares-13-year-treasury-bond-etf (accessed on 10 March 2026).
  24. iShares. IEI: iShares 3–7 Year Treasury Bond ETF Factsheet. 2024. Available online: https://www.ishares.com/us/products/239455/ishares-37-year-treasury-bond-etf (accessed on 10 March 2026).
  25. iShares. IEF: iShares 7–10 Year Treasury Bond ETF Factsheet. 2024. Available online: https://www.ishares.com/us/products/239456/ishares-710-year-treasury-bond-etf (accessed on 10 March 2026).
  26. iShares. TLH: iShares 10–20 Year Treasury Bond ETF Factsheet. 2024. Available online: https://www.ishares.com/us/products/239453/ishares-1020-year-treasury-bond-etf (accessed on 10 March 2026).
  27. iShares. TLT: iShares 20+ Year Treasury Bond ETF Factsheet. 2024. Available online: https://www.ishares.com/us/products/239454/ishares-20-year-treasury-bond-etf (accessed on 10 March 2026).
  28. Bloomberg. LUATTRUU: Bloomberg U.S. Treasury Total Return Index Description. 2025. Available online: https://professional.content.cirrus.bloomberg.com/professional2023/products/indices/fixed-income/ (accessed on 10 March 2026).
  29. Yahoo-Finance. Historical Total Return Data for U.S. Treasury ETFs (SHV, SHY, IEI, IEF, TLH, TLT). 2025. Available online: https://finance.yahoo.com/quote (accessed on 10 March 2026).
  30. Baltas, N.; Kosowski, R. Momentum Strategies in Futures Markets and Trend-Following Funds. SSRN Electron. J. 2013.
  31. Han, Y.; Yang, K.; Zhou, G. A New Anomaly: The Cross-Sectional Profitability of Technical Analysis. J. Financ. Quant. Anal. 2013, 48, 1433–1461. Available online: https://www.jstor.org/stable/43303847 (accessed on 10 March 2026).
  32. Goyal, A.; Welch, I. A Comprehensive Look at the Empirical Performance of Equity Premium Prediction. Rev. Financ. Stud. 2008, 21, 1455–1508.
  33. Gorton, G. Slapped by the Invisible Hand: The Panic of 2007; Oxford University Press: Oxford, UK, 2010.
  34. Haddad, V.; Moreira, A.; Muir, T. When Selling Becomes Viral: Disruptions in Debt Markets in the COVID-19 Crisis and the Fed’s Response. Rev. Financ. Stud. 2021, 34, 5309–5351.
  35. Gousgounis, E.; Mixon, S.; Tuzun, T.; Vega, C. Market Liquidity in Treasury Futures Market During March 2020. Fed. Reserve Staff. Rep. 2020.
Figure 1. Terminal portfolio values by strategy and rebalancing frequency (USD 100 initial investment, 2007–2025). The Median strategy produces the highest or second-highest terminal value at every frequency. Weekly Losers’ high terminal value (USD 257) is accompanied by catastrophic drawdowns; see Table A3.
Figure 2. Cumulative wealth paths, semi-annual rebalancing (2007–2025, USD 100 base). The Median strategy (black) compounds smoothly across all three macro regimes: the 2008–2009 GFC, the 2020 COVID shock, and the 2022 rate-hiking cycle. Winners and Losers exhibit sharp divergences from the benchmark during each stress episode.
Figure 3. Annualized Sharpe ratio by strategy and rebalancing frequency. The Semi-Annual Median achieves the highest Sharpe of any strategy-frequency combination. Weekly Winners produces a negative Sharpe, confirming momentum is actively harmful in Treasury duration space at short horizons.
Figure 4. Maximum drawdown by strategy and rebalancing frequency. Semi-Annual Median drawdowns are consistently shallower than both extreme strategies and, in most years, shallower than the benchmark. The 2022 rate-hiking cycle shows the starkest divergence: Median 11.6%, Winners 22.4%, Losers 20.9%.
Figure 5. Cumulative wealth paths, Semi-Annual Median strategy vs. four benchmarks (2007–2025, USD 100 base, net of costs). Median (USD 199.90) leads all benchmarks. The USD 25 gap over Static IEF/IEI (USD 175) isolates the timing contribution beyond passive intermediate exposure. LUATTRUU, GOVT, and Equal-Weight converge near USD 160–163, confirming passive Treasury allocation earns similar returns regardless of vehicle.
Figure 6. Power analysis and frequency corroboration. Left: observed Median excess return (bars) vs. MDE at 80% power (dashed); amber = below MDE. Right: NW-HAC p-values at Semi-Annual (n = 37, solid) and Quarterly (n = 75, hatched) for all three strategies; dashed red = α = 0.05. Median p rises from 0.031 to 0.063 when n doubles, consistent with a frequency-specific signal, not a false positive.
Figure 7. Lo (2002) Sharpe significance and return autocorrelation. Left: annualized Sharpe ratios with Lo (2002) standard-error bars for the Median strategy (black) and Buy & Hold benchmark (red); stars indicate significance at the 5% level. Right: first-order autocorrelation coefficient ρ₁ by frequency; gold border indicates Ljung–Box p < 0.10.
Figure 8. Bootstrap Test—Bars show mean Sharpe differences vs. Buy & Hold with 95% bootstrap CIs; percentages denote the fraction of positive draws across 10,000 replications.
Figure 9. Median strategy risk metrics across 12 start-month anchors (gross of costs). Each box = distribution over 12 calendar offsets; dashed red = Buy & Hold mean. The Sharpe distribution sits above the benchmark across all three frequencies; volatility and drawdown distributions are compact, confirming the risk profile does not depend on calendar alignment.
Figure 10. Rolling end-date persistence, Semi-Annual strategy (December-aligned, 2007–varying end year). Top: terminal portfolio values for all strategies and benchmark. Bottom: Median rank among three rotation strategies at each endpoint. Median ranks first at every evaluation year from 2015 through 2025.
Figure 11. Skip-period robustness: Median strategy CAGR (left) and Sharpe (right) under base-case vs. one-period execution lag, net of transaction costs. At quarterly frequency the skip-1 result is identical to the base case; at annual frequency Sharpe marginally improves under the lag.
Figure 12. Lookback Sensitivity: Annualized Sharpe (net, left) and CAGR % (net, right) across strategy groups and rebalancing frequencies.
Figure 13. Walk-forward out-of-sample validation, Semi-Annual strategy. Top: cumulative OOS wealth paths from a USD 100 base (June 2012–June 2025, 27 periods). Bottom: strategy selected at each OOS step by the expanding-window Sharpe rule; Median is selected in 97% of periods. Statistics box reports CAGR, Sharpe, MDD, annualized excess return, and NW-HAC p-value for both paths.
Figure 14. Net-of-cost portfolio value, semi-annual rebalancing (2007–2025, USD 100 base). The Median strategy (black) maintains a consistent lead over all alternatives throughout the sample. The narrow gap between the Median gross and net lines confirms that cost drag is negligible relative to gross outperformance.
Figure 15. Cumulative transaction cost drag (USD) from a USD 100 initial portfolio, semi-annual rebalancing. The Median strategy accumulates the lowest cost burden (USD 0.81 by end-2025) owing to its lower turnover. Winners incur the highest drag (USD 1.11), reflecting more frequent ETF substitution as the ranking leader changes.
Figure 16. Net CAGR under three cost scenarios, semi-annual rebalancing. Dashed red line = Buy & Hold. The Median strategy clears the benchmark by more than 110 bps under every scenario including doubled costs. Winners and Losers remain below the benchmark in all scenarios.
Figure 17. Break-even cost analysis: Median strategy vs. Buy & Hold. The green region indicates cost levels at which the Median outperforms; the red region indicates benchmark dominance. Break-even occurs at 145 bps—41 times the 3.5 bps base-case execution cost.
Table 1. Terminal Portfolio Values by Rebalancing Frequency (USD 100 initial investment, 2007–2025).
| Frequency | Winners | Median | Losers | B&H |
|---|---|---|---|---|
| Annual | 146 | 167 | 132 | 162 |
| Semi-Annual | 144 | 200 | 142 | 162 |
| Quarterly | 141 | 198 | 142 | 162 |
| Monthly | 173 | 161 | 148 | 162 |
| Bi-Weekly | 182 | 193 | 114 | 162 |
| Weekly | 93 | 168 | 257 | 162 |
Buy & Hold (LUATTRUU) = USD 162
Note: Green color denotes the highest portfolio value.
Table 2. Newey–West HAC Excess Return Test.
| Frequency | Strategy | n | Lag | Excess % | NW t | NW p | 95% CI (%) |
|---|---|---|---|---|---|---|---|
| Annual | Winners | 18 | 2 | 0.05 | 0.05 | 0.963 | [−2.14, 2.25] |
| Annual | Median | 18 | 2 | 0.78 | 1.26 | 0.207 | [−0.43, 1.98] |
| Annual | Losers | 18 | 2 | −0.36 | −0.24 | 0.814 | [−3.31, 2.60] |
| Semi-Ann | Winners | 37 | 3 | −0.26 | −0.22 | 0.824 | [−2.59, 2.06] |
| Semi-Ann | Median | 37 | 3 | +1.22 | 2.16 | 0.031 | [0.11, 2.32] |
| Semi-Ann | Losers | 37 | 3 | −0.57 | −0.51 | 0.613 | [−2.80, 1.65] |
| Quarterly | Winners | 75 | 3 | −0.54 | −0.48 | 0.631 | [−2.75, 1.67] |
| Quarterly | Median | 75 | 3 | +1.22 | +1.86 | 0.063 | [−0.07, 2.51] |
| Quarterly | Losers | 75 | 3 | −0.38 | −0.29 | 0.768 | [−2.94, 2.17] |
| Weekly | Winners | 991 | 6 | −3.68 | −2.86 | 0.004 | [−6.19, −1.16] |
| Weekly | Median | 991 | 6 | −0.36 | −0.83 | 0.408 | [−1.23, 0.50] |
| Weekly | Losers | 991 | 6 | +1.53 | +1.33 | 0.185 | [−0.73, 3.79] |

Excess % = annualized mean excess return over LUATTRUU. Bartlett kernel, automatic lag selection. Semi-Annual Median (bold in the original) is the only configuration significant at 5% in the positive direction; Weekly Winners is significant in the negative direction. Bonferroni threshold across nine primary configurations: α = 0.006. Wilcoxon: p = 0.039 for Semi-Annual Median.
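The Bonferroni threshold quoted in the note is simply the nominal level divided by the number of configurations. The short sketch below makes the comparison explicit; the variable names are ours, and the p-values are the NW p entries for the Annual, Semi-Annual, and Quarterly rows of Table 2.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# NW p-values of the nine primary configurations (Annual, Semi-Annual, Quarterly).
primary_p = {
    ("Annual", "Winners"): 0.963, ("Annual", "Median"): 0.207, ("Annual", "Losers"): 0.814,
    ("Semi-Ann", "Winners"): 0.824, ("Semi-Ann", "Median"): 0.031, ("Semi-Ann", "Losers"): 0.613,
    ("Quarterly", "Winners"): 0.631, ("Quarterly", "Median"): 0.063, ("Quarterly", "Losers"): 0.768,
}

threshold = bonferroni_threshold(0.05, len(primary_p))
survivors = [key for key, p in primary_p.items() if p < threshold]

print(round(threshold, 3))  # prints: 0.006
print(survivors)            # prints: []
```

No primary configuration clears the corrected threshold, which is why the in-sample evidence is described as suggestive rather than definitive.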
Table 3. Lo (2002) Sharpe Ratio Significance.
| Freq. | Strategy | n | SR | ρ₁ | LB p | Lo t | Lo p | Sig. |
|---|---|---|---|---|---|---|---|---|
| Annual | Winners | 18 | 0.298 | 0.275 | 0.248 | 1.62 | 0.125 | No |
| Annual | Median | 18 | 0.408 | 0.029 | 0.907 | 1.71 | 0.106 | No |
| Annual | Losers | 18 | 0.205 | 0.058 | 0.803 | 0.91 | 0.374 | No |
| Annual | Buy & Hold | 18 | 0.413 | 0.094 | 0.710 | 1.83 | 0.085 | No |
| Semi-Ann | Winners | 37 | 0.238 | 0.149 | 0.352 | 1.18 | 0.246 | No |
| Semi-Ann | Median | 37 | 0.606 | 0.036 | 0.824 | 2.58 | 0.014 | Yes |
| Semi-Ann | Losers | 37 | 0.294 | 0.046 | 0.776 | 1.31 | 0.199 | No |
| Semi-Ann | Buy & Hold | 37 | 0.494 | 0.004 | 0.980 | 2.07 | 0.046 | Yes |
| Quarterly | Winners | 75 | 0.251 | 0.064 | 0.571 | 1.15 | 0.252 | No |
| Quarterly | Median | 75 | 0.540 | −0.013 | 0.906 | 2.27 | 0.026 | Yes |
| Quarterly | Losers | 75 | 0.234 | 0.036 | 0.753 | 1.04 | 0.300 | No |
| Quarterly | Buy & Hold | 75 | 0.489 | −0.027 | 0.809 | 2.03 | 0.046 | Yes |
| Weekly | Winners | 991 | −0.119 | 0.010 | 0.763 | −0.52 | 0.600 | No |
| Weekly | Median | 991 | 0.418 | 0.026 | 0.414 | 1.87 | 0.062 | No |
| Weekly | Losers | 991 | 0.565 | 0.078 | 0.014 | 2.66 | 0.008 | Yes |
| Weekly | Buy & Hold | 991 | 0.574 | 0.060 | 0.060 | 2.66 | 0.008 | Yes |

SR = annualized Sharpe ratio. ρ₁ = first-order autocorrelation. LB p = Ljung–Box Q(1) p-value. Bold rows in the original (Semi-Annual and Quarterly Median) = Median significant at 5%. The benchmark is also significant at these frequencies; the Median’s Sharpe is higher in both cases. Weekly Losers is significant due to mean-reversion at short horizons; Ljung–Box flags autocorrelation (p = 0.014), suggesting the Lo correction is material here.
Table 4. Semi-Annual Median Performance by Start Month (2007–2025, gross of costs).
| Month | Winners | Median | Losers | B&H | Vol (%) | Sharpe | MDD (%) | Med − BH |
|---|---|---|---|---|---|---|---|---|
| Jan | 205.0 | 175.7 | 115.5 | 167.8 | 5.3 | 0.74 | −5.3 | +7.9 |
| Feb | 159.1 | 158.6 | 157.5 | 167.6 | 7.4 | 0.22 | −9.2 | −9.0 |
| Mar | 127.4 | 164.9 | 189.6 | 169.4 | 5.3 | 0.27 | −4.8 | −4.5 |
| Apr | 134.0 | 160.8 | 181.6 | 170.1 | 4.7 | 1.41 | −3.0 | −9.3 |
| May | 138.1 | 185.6 | 157.4 | 170.1 | 5.1 | 1.00 | −3.0 | +15.5 |
| Jun | 144.1 | 199.9 | 142.9 | 170.1 | 5.4 | 1.82 | −2.6 | +29.8 |
| Jul | 195.9 | 171.5 | 114.1 | 167.8 | 5.3 | 0.74 | −5.3 | +3.7 |
| Aug | 154.9 | 156.7 | 151.9 | 164.9 | 7.4 | 0.22 | −9.2 | −8.2 |
| Sep | 120.9 | 160.4 | 187.7 | 163.6 | 5.3 | 0.27 | −4.8 | −3.2 |
| Oct | 128.3 | 153.0 | 178.9 | 162.6 | 4.7 | 1.41 | −3.0 | −9.6 |
| Nov | 130.3 | 170.2 | 144.4 | 158.3 | 5.1 | 1.00 | −3.0 | +11.9 |
| Dec | 133.8 | 199.9 | 132.3 | 162.3 | 5.4 | 1.82 | −2.6 | +37.6 |
Median beats B&H in 10 of 12 months (83%). CV: Median 8.8%, Winners 17.9%, Losers 16.2%.
Table 5. Skip-Period Robustness: Median Strategy (Net of Costs).
| Frequency | Base CAGR | Skip CAGR | Δ CAGR | Base Sharpe | Skip Sharpe | Δ Sharpe |
|---|---|---|---|---|---|---|
| Annual | 2.90% | 2.84% | −0.06 | 0.408 | 0.426 | +0.018 |
| Semi-Annual | 3.79% | 3.05% | −0.74 | 0.606 | 0.560 | −0.046 |
| Quarterly | 3.69% | 3.69% | 0.00 | 0.540 | 0.537 | −0.003 |
| Weekly | 2.14% | 2.51% | +0.37 | 0.418 | 0.494 | +0.076 |
Buy & Hold CAGR = 2.17%. Semi-Annual base-case CAGR of 3.79% falls to 3.05% under execution lag, but remains 88 bps above the benchmark. Quarterly and Annual are materially unaffected.
Table 6. Walk-Forward Out-of-Sample Performance Summary (June 2012–June 2025, 27 periods).
| Strategy | CAGR (%) | Sharpe | MDD (%) | Sortino | Excess (%) | NW p |
|---|---|---|---|---|---|---|
| Adaptive walk-forward | 1.99 | 0.447 | −11.6 | 0.549 | 0.77 | 0.006 |
| Always-Median | 2.15 | 0.481 | −11.6 | 0.592 | 0.94 | 0.0005 |
| Buy & Hold | 1.20 | 0.272 | −15.1 | | | |
Excess (%) = annualized OOS excess return over Buy & Hold. NW p = Newey–West HAC two-sided p-value on the OOS excess return series. Median selected in 97% of OOS periods by the adaptive rule. OOS window is entirely independent of the in-sample data used for strategy identification in Section 5.
Table 7. Bid–Ask Spread Assumptions by ETF.
Table 7. Bid–Ask Spread Assumptions by ETF.
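| ETF | Maturity Segment | Half-Spread (bps) | Round-Trip (bps) |
|---|---|---|---|
| SHV | 1–12 months | 0.5 | 1.0 |
| SHY | 1–3 years | 1.0 | 2.0 |
| IEI | 3–7 years | 1.5 | 3.0 |
| IEF | 7–10 years | 2.0 | 4.0 |
| TLH | 10–20 years | 3.0 | 6.0 |
| TLT | 20+ years | 2.5 | 5.0 |
| Average | | 1.75 | 3.5 |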
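As a quick check on Table 7, the sketch below recomputes the round-trip spreads and the averages from the half-spreads. It also shows one plausible way to price a single rotation trade, assuming the investor pays the half-spread of the ETF sold plus the half-spread of the ETF bought. That cost model and the helper name `switch_cost_usd` are assumptions for illustration and may differ from the paper's exact cost accounting.

```python
# Half-spread assumptions in basis points, copied from Table 7.
half_spread_bps = {
    "SHV": 0.5, "SHY": 1.0, "IEI": 1.5,
    "IEF": 2.0, "TLH": 3.0, "TLT": 2.5,
}

# A round trip (buy then sell the same ETF) pays the half-spread twice.
round_trip_bps = {etf: 2.0 * h for etf, h in half_spread_bps.items()}

avg_half = sum(half_spread_bps.values()) / len(half_spread_bps)
avg_round_trip = sum(round_trip_bps.values()) / len(round_trip_bps)


def switch_cost_usd(notional: float, sell_etf: str, buy_etf: str) -> float:
    """Assumed model: half-spread on the sale plus half-spread on the purchase."""
    total_bps = half_spread_bps[sell_etf] + half_spread_bps[buy_etf]
    return notional * total_bps / 10_000.0


print(avg_half, avg_round_trip)
print(switch_cost_usd(100.0, "TLT", "IEF"))
```

Under this assumed model, rotating USD 100 from TLT into IEF costs 4.5 bps of the traded amount.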
Table 8. Turnover and Transaction Cost Summary—Semi-Annual Rebalancing (2007–2025).
| Strategy | Periods | Trade Freq. (%) | Avg Turn./Period (%) | Total Cost (USD) |
|---|---|---|---|---|
| Winners | 38 | 71.1 | 64.5 | 1.11 |
| Median | 38 | 60.5 | 39.5 | 0.81 |
| Losers | 38 | 57.9 | 52.6 | 1.03 |
Total Cost = cumulative dollar cost drag from a USD 100 initial portfolio. Median’s lower turnover reflects the structural persistence of intermediate-maturity holdings across periods.
Table 9. Gross vs. Net Performance—Semi-Annual Rebalancing.
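| Strategy | Final Value | CAGR (%) | Vol. (%) | Sharpe | MDD (%) |
|---|---|---|---|---|---|
| Winners (Gross) | 144.12 | 1.99 | 10.35 | 0.243 | −23.46 |
| Winners (Net) | 142.89 | 1.95 | 10.35 | 0.238 | −23.53 |
| Median (Gross) | 199.90 | 3.81 | 6.51 | 0.610 | −11.59 |
| Median (Net) | 198.86 | 3.78 | 6.51 | 0.606 | −11.65 |
| Losers (Gross) | 142.91 | 1.95 | 7.33 | 0.299 | −18.88 |
| Losers (Net) | 141.91 | 1.91 | 7.34 | 0.294 | −18.97 |
| Buy & Hold | 162.34 | 2.6 | 5.52 | 0.494 | −15.06 |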
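The link between the Final Value and CAGR columns of Table 9 can be sketched with the standard compound-growth formula. The 18.5-year horizon used here is an assumption, chosen because it corresponds to 37 half-year periods. With that horizon, the recomputed CAGRs land close to the tabulated ones, though small rounding differences are to be expected.

```python
def cagr(final_value: float, initial_value: float, years: float) -> float:
    """Compound annual growth rate as a decimal fraction."""
    return (final_value / initial_value) ** (1.0 / years) - 1.0


INITIAL_USD = 100.0
YEARS = 18.5  # assumption: 37 half-year periods

# Gross final values (USD) copied from Table 9.
final_values = {
    "Winners (Gross)": 144.12,
    "Median (Gross)": 199.90,
    "Losers (Gross)": 142.91,
}

for name, fv in final_values.items():
    print(f"{name}: {100.0 * cagr(fv, INITIAL_USD, YEARS):.3f}%")
```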
Table 10. Scenario Analysis: Net CAGR Under Alternative Cost Assumptions.
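| Scenario | Cost (bps) | Winners Net CAGR (%) | Median Net CAGR (%) | Losers Net CAGR (%) | Median vs. B&H (bps) |
|---|---|---|---|---|---|
| Low (0.5×) | 1.75 | 1.97 | 3.80 | 1.93 | +115 |
| Base (1.0×) | 3.50 | 1.95 | 3.78 | 1.91 | +113 |
| High (2.0×) | 7.00 | 1.90 | 3.76 | 1.87 | +111 |

Buy & Hold CAGR (%): 2.6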
Cost drag on Median CAGR: Low = 1 bp, Base = 3 bps, High = 5 bps. Even under doubled costs, the Median advantage over Buy & Hold exceeds 110 bps.
Table 11. Cross-Frequency Cost Comparison—Median Strategy.
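| Frequency | Periods | Cum. Turnover | Cost (USD) | Gross CAGR (%) | Net CAGR (%) | Drag (bps) |
|---|---|---|---|---|---|---|
| Annual | 19 | 700 | 0.35 | 2.91 | 2.90 | 1.4 |
| Semi-Annual | 38 | 1500 | 0.81 | 3.81 | 3.78 | 2.9 |
| Quarterly | 76 | 2400 | 1.37 | 3.73 | 3.68 | 4.6 |
| Monthly | 228 | 8100 | 3.95 | 2.56 | 2.40 | 15.4 |
| Bi-Weekly | 497 | 17,625 | 9.53 | 3.54 | 3.21 | 33.6 |
| Weekly | 992 | 34,575 | 16.07 | 2.80 | 2.15 | 65.3 |
| Daily | 6940 | 181,525 | 66.62 | 3.33 | −0.07 | 339.9 |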
Gross and Net CAGR (%). Drag = gross minus net CAGR in basis points. Monthly, Weekly, and Daily fail to beat Buy & Hold net of costs. Semi-annual achieves the best net alpha at the lowest cost drag among all frequencies that outperform.
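The drag definition in the note can be checked directly against the rounded CAGRs in Table 11. The sketch below recomputes gross minus net in basis points and lists the frequencies whose net CAGR exceeds the benchmark. The benchmark value of 2.6% is taken as printed in Table 9. Because the tabulated CAGRs are rounded to two decimals, the recomputed drag can differ from the reported drag by up to about one basis point.

```python
# (gross CAGR %, net CAGR %, reported drag in bps), copied from Table 11.
table11 = {
    "Annual": (2.91, 2.90, 1.4),
    "Semi-Annual": (3.81, 3.78, 2.9),
    "Quarterly": (3.73, 3.68, 4.6),
    "Monthly": (2.56, 2.40, 15.4),
    "Bi-Weekly": (3.54, 3.21, 33.6),
    "Weekly": (2.80, 2.15, 65.3),
    "Daily": (3.33, -0.07, 339.9),
}

BENCHMARK_CAGR = 2.6  # Buy & Hold CAGR (%) as printed in Table 9


def drag_bps(gross_pct: float, net_pct: float) -> float:
    """Cost drag: gross minus net CAGR, converted from percent to basis points."""
    return (gross_pct - net_pct) * 100.0


for freq, (gross, net, reported) in table11.items():
    print(f"{freq}: recomputed {drag_bps(gross, net):.1f} bps, reported {reported} bps")

# Frequencies that still beat the benchmark after costs.
beats = [freq for freq, (_, net, _) in table11.items() if net > BENCHMARK_CAGR]
print(beats)
```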
Table 12. ETF Selection Frequency by Strategy Bucket (Semi-Annual, 2007–2025, n = 37 half-year periods).
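| ETF | Maturity Segment | Winners Count | Winners % | Median Count | Median % | Losers % |
|---|---|---|---|---|---|---|
| SHV | 1–12 mo | 10 | 27.8 | 9 | 25.0 | 47.2 |
| SHY | 1–3 yr | 15 | 41.7 | 3 | 8.3 | 50.0 |
| IEI | 3–7 yr | 8 | 22.2 | 25 | 67.6 | 8.3 |
| IEF | 7–10 yr | 5 | 13.9 | 31 | 83.8 | 0.0 |
| TLH | 10–20 yr | 18 | 50.0 | 2 | 5.6 | 44.4 |
| TLT | 20+ yr | 16 | 44.4 | 2 | 5.6 | 50.0 |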
IEF appears in the Median bucket in 86% of semi-annual periods; IEI in 69%. Both IEF and IEI are selected simultaneously in 64% of periods. TLT and TLH (long-duration) appear in the Median bucket in only 6% of periods each; SHY (short-duration) in 8%. The return-ranking mechanism, therefore, mechanically concentrates holdings in the intermediate maturity zone without any knowledge of yields or duration.
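The concentration effect described in the note can be illustrated with a small sketch of the ranking step: sort the six ETFs by prior-period return and cut the ranking into three groups of two. The return figures below are hypothetical and chosen only so that returns are ordered by duration, as tends to happen when rates move in one direction. The function name `split_into_buckets` is likewise illustrative.

```python
from typing import Dict, List


def split_into_buckets(prior_returns: Dict[str, float]) -> Dict[str, List[str]]:
    """Rank ETFs by prior-period return and split into three equal groups."""
    ranked = sorted(prior_returns, key=prior_returns.get)  # worst first
    size = len(ranked) // 3
    return {
        "Losers": ranked[:size],
        "Median": ranked[size:2 * size],
        "Winners": ranked[2 * size:],
    }


# Hypothetical half-year returns (%), not data from the paper.
rates_fall = {"SHV": 0.5, "SHY": 1.0, "IEI": 2.5, "IEF": 4.0, "TLH": 6.0, "TLT": 8.0}
rates_rise = {"SHV": 0.4, "SHY": -0.5, "IEI": -2.0, "IEF": -4.0, "TLH": -7.0, "TLT": -9.0}

falling = split_into_buckets(rates_fall)
rising = split_into_buckets(rates_rise)
print("Rates fall:", falling)
print("Rates rise:", rising)
```

In both hypothetical regimes, the short and long ends swap between Winners and Losers while IEI and IEF stay in the middle group, which is the pattern Table 12 reports.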
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Malhotra, A.; Puppala, S.; Pinsky, E. Duration Rotation in U.S. Treasury Fixed-Income ETFs: Evidence for a “Median” Strategy. FinTech 2026, 5, 29. https://doi.org/10.3390/fintech5020029
