Article

How Much Risk in U.S. Government Bond Markets Is Transmitted to Their Canadian Counterparts? †

Bank of Canada, Ottawa, ON K1A 0G9, Canada
*
Author to whom correspondence should be addressed.
The views expressed are those of the authors and do not necessarily reflect those of the Bank of Canada.
Risks 2026, 14(6), 133; https://doi.org/10.3390/risks14060133
Submission received: 8 April 2026 / Revised: 29 May 2026 / Accepted: 4 June 2026 / Published: 12 June 2026

Abstract

We address this question by jointly modeling the distributional dynamics of the U.S. and Canadian term premia. Our approach combines a flexible marginal specification—the Skewed Generalized Error Distribution—with a flexible bivariate copula (BB7) to capture evolving cross-market dependence. We illustrate the usefulness of this framework by examining December 2024, a period marked by a sharp rise in the U.S. term premium, and track how the forecasted joint distributions evolved throughout this episode. We document a striking change in conditional tail dependence between U.S. and Canadian term premia over this period. While term premia serve as a motivating application, our framework is applicable to a broad class of asset prices and macro-financial variables.

1. Introduction

To what extent are tail events shared across sovereign bond markets? This question is particularly relevant for Canada. Canadian and U.S. sovereign bond markets are highly integrated, and developments in U.S. yields frequently spill over to Canadian yields. As a result, episodes of sharp repricing in U.S. term premia raise a natural risk-management question: Do such shocks propagate internationally as global shocks, or do they remain largely country-specific? Addressing this question requires tools that go beyond point forecasts or linear correlation measures and instead characterize how risks are transmitted across countries in the tails of the joint distribution.
The term premium itself, the compensation investors require for bearing duration risk, is a latent variable inferred from an underlying term-structure model. A large literature has developed approaches for extracting term-premium estimates from observed yields, including the affine model of Kim and Wright (2005), the regression-based decomposition of Adrian et al. (2013), the canonical Gaussian factor model of Joslin et al. (2011), and the tractable specification with a non-negative short rate of Feunou et al. (2022). These models disagree in detail but agree on the broad time-series picture: term premia are volatile and persistent and respond heterogeneously to monetary-policy news, supply shocks, and global risk sentiment. We rely on the Feunou et al. (2022) estimates throughout the empirical analysis. Because term-premium estimates are model-implied latent variables rather than directly observed prices, we treat them as inputs to a modular, two-stage density-forecasting framework and discuss the implications of upstream measurement uncertainty in Section 4.1. A parallel literature has documented that cross-country bond-market spillovers are substantial, time-varying, and often concentrated in stress episodes (Diebold and Yilmaz 2012) and that the joint distribution of asset returns exhibits asymmetric tail dependence that linear correlation measures fail to capture (Christoffersen et al. 2012; Patton 2006). Our framework brings these threads together by jointly modeling the cross-country term-premium distribution with flexible marginals and an asymmetric copula.
In this paper, we develop a framework for forecasting and analyzing the joint distribution of U.S. and Canadian term premia that is explicitly designed to capture skewness and tail thickness (Theodossiou 2015, 2018). Cross-country dependence is modeled using a copula, which permits asymmetric upper- and lower-tail dependence.
The contribution of this paper is twofold. Methodologically, we provide a transparent and tractable approach for constructing multi-horizon joint density forecasts of the Canadian and U.S. term premia. This is done using flexible parametric marginal densities and an asymmetric copula function. Empirically, we show that joint tail risk between U.S. and Canadian term premia is time-varying and can weaken precisely during periods of market stress—an insight that is not available from standard Gaussian or correlation-based approaches and is directly relevant for risk monitoring and policy analysis.
Advantages of our approach include the following:
  • Accommodating empirically relevant departures from Gaussianity in term-premium data—such as asymmetry, heavy tails, and tail dependence;
  • Exploiting the high-frequency nature of financial-market data to:
    Link observed data to time-varying distributional features;
    Generate multi-horizon forecasts of the joint distribution; and
    Assess the likelihood of cross-country spillovers during extreme events in real time.
Using daily data from December 2024, we show that although joint bond-market risk increased, short-horizon spillover risk from the U.S. to Canada declined sharply. December 2024 was a period during which U.S. term premia rose sharply over a short window, while the Canadian increase was more muted. A correlation-based analysis would suggest persistent comovement between U.S. and Canadian bond markets throughout this period. In contrast, our joint density approach reveals a pronounced weakening of upper-tail dependence: extreme U.S. term-premium outcomes became substantially less informative about extreme Canadian outcomes. This decoupling suggests that the repricing episode was driven largely by U.S.-specific factors, consistent with contemporaneous commentary emphasizing domestic fiscal and supply-related pressures.
From a risk-management and policy perspective, this finding implies that measures based on correlations or unconditional comovement can substantially overstate cross-border spillover risk during stress episodes, whereas our joint density approach is able to provide a nuanced measure of dependence at a distributional level. While we use U.S. and Canadian term premia as a motivating application, our framework is generic and can be applied to a wide range of asset prices and macro-financial variables.
Commonly used approaches to model joint dynamics often rely on Gaussian assumptions, which are ill-suited for capturing the skewness, excess kurtosis, and asymmetric dependence commonly observed in financial data, particularly during periods of stress. To address these limitations, some strands of the literature have introduced non-Gaussian elements, such as skewed-t innovations (Hansen 1994; Patton 2004), or copula-based approaches to model nonlinear dependence across variables (Patton 2006). While these methods represent important advances, their use in macro-financial risk assessment—in particular, for the construction of joint, multi-horizon density forecasts of term premia—remains limited.
Our approach aims to address several limitations of existing distributional modeling methods. A prominent framework for the modeling of the distribution of economic variables was proposed by Adrian et al. (2019), relying on quantile regressions. A key limitation of quantile regression in this context is the difficulty of constructing a full conditional distribution, particularly in a dynamic setting. Moreover, quantile regressions are known to be relatively “data-hungry” (Plagborg-Møller et al. 2020), which can be problematic when distributional characteristics evolve rapidly, as is often the case in high-frequency financial data.
An alternative way to model the joint distribution is to fit a parametric multivariate distribution directly. Two natural benchmarks are the multivariate skewed-t distribution and the family of dynamic conditional correlation (DCC) models of Engle (2002). Multivariate parametric distributions can impose restrictive structure on both the margins and the dependence: the multivariate skewed t forces a common tail-thickness parameter across margins and delivers tail dependence that is symmetric between the upper and lower tails by construction, while DCC-type models specify time variation in linear conditional correlations under Gaussian or Student-t innovations and do not directly model asymmetric tail dependence. This can lead to mis-specification when variables exhibit heterogeneous marginal characteristics or asymmetric dependence—precisely the features documented for U.S. and Canadian term premia in our application. A copula approach, by contrast, exploits Sklar’s theorem to separate the modeling of individual marginal behavior from cross-dependence, allowing each variable’s distribution to flexibly accommodate its own skewness, kurtosis, and tail behavior, while the copula captures nonlinear and possibly asymmetric tail dependence between variables (Patton 2004, 2006).
The remainder of this paper is organized as follows. Section 2 introduces the distributional framework. Section 3 describes the forecasting methodology. Section 4 presents the empirical results for U.S. and Canadian term premia. Section 5 concludes this work.

2. Model Framework

Accurately characterizing the joint distribution of term premia requires a framework that can accommodate skewness, heavy tails, and asymmetric dependence, particularly during periods of market stress. Financial innovations exhibit pronounced non-Gaussian features, including time-varying skewness and excess kurtosis, as well as stronger dependence in the tails than in the center of the distribution. To capture these features in a parsimonious yet flexible manner, we adopt a two-stage distributional framework that combines the skewed generalized error distribution (SGED) to model each marginal distribution (Theodossiou 2015, 2018) with a BB7 (Joe–Clayton) copula to model cross-country dependence.
The SGED is a four-parameter distribution that allows skewness and tail thickness to be modeled independently of the location and scale parameters. An additional practical advantage of the SGED is that it admits a moment-generating function under standard parameterizations, facilitating mapping from empirically estimated moments to distributional parameters.
Periods of financial stress are often characterized by nonlinear and asymmetric dependence, with co-movements strengthening during extreme events. To model this dependence structure separately from the marginals, we employ a copula-based approach. Specifically, we use the BB7 copula, which allows for asymmetric upper- and lower-tail dependence. This feature is well suited to the modeling of international term-premium linkages, where downside risks (e.g., flight-to-safety episodes) may propagate differently than upside shocks. The BB7 copula nests a wide range of dependence patterns, making it helpful for both inference and forecasting.
Combining SGED marginals with a BB7 copula yields a flexible joint distribution that can capture skewness, heavy tails, and asymmetric tail dependence in a unified framework. This separation of marginal behavior from dependence allows each component to be modeled and forecasted separately.

2.1. Skewed Generalized Error Distribution

The SGED is a member of a family of continuous probability distributions that permits skewness and kurtosis and nests other familiar densities, including the normal distribution. The general form of the SGED density for a random variable $y$ takes four parameters $(\mu, \sigma, \lambda, p)$:
\[
f_{\mathrm{SGED}}(y; \mu, \sigma, \lambda, p) = \frac{p}{2 v \sigma\, \Gamma(1/p)} \exp\left\{ -\left( \frac{|y - \mu + m|}{v \sigma \left[ 1 + \lambda\, \mathrm{sgn}(y - \mu + m) \right]} \right)^{p} \right\},
\]
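where
\[
m = \frac{\lambda\, v\, \sigma\, 2^{2/p}\, \Gamma\!\left( \tfrac{1}{2} + \tfrac{1}{p} \right)}{\sqrt{\pi}}
\]
ensures that the mean is $\mu$ and
\[
v = \sqrt{ \frac{\pi\, \Gamma(1/p)}{\pi (1 + 3\lambda^{2})\, \Gamma(3/p) - 16^{1/p} \lambda^{2}\, \Gamma\!\left( \tfrac{1}{2} + \tfrac{1}{p} \right)^{2} \Gamma(1/p)} }.
\]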
The first two moments, i.e., the mean and standard deviation, are governed by μ and σ . The third and fourth moments (skewness and kurtosis) are governed by λ and p. Theodossiou (2015) provides tables that summarize how ( λ , p ) map into skewness and kurtosis.
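As an illustration, the density above can be evaluated in a few lines of Python. This is a minimal sketch rather than the authors' code: the function names `sged_pdf` and `moments_by_quadrature` are hypothetical, and the constants $v$ and $m$ are written in the standard mean-and-variance normalization of the SGED (kernel scale $v\sigma$, with $\Gamma(3/p)$ and $16^{1/p}$ entering $v$), which is an assumption about the intended parameterization. The quadrature helper only checks numerically that the density integrates to one with mean $\mu$ and standard deviation $\sigma$.

```python
import math

def sged_pdf(y, mu=0.0, sigma=1.0, lam=0.0, p=2.0):
    """SGED density normalized to have mean mu and standard deviation sigma."""
    g1 = math.gamma(1.0 / p)
    g3 = math.gamma(3.0 / p)
    gh = math.gamma(0.5 + 1.0 / p)
    # v rescales the kernel so that the variance equals sigma**2 (assumed normalization)
    v = math.sqrt(math.pi * g1 /
                  (math.pi * (1 + 3 * lam**2) * g3
                   - 16 ** (1.0 / p) * lam**2 * gh**2 * g1))
    # m shifts the mode so that the mean equals mu
    m = lam * v * sigma * 2 ** (2.0 / p) * gh / math.sqrt(math.pi)
    z = y - mu + m
    sign = 1.0 if z >= 0 else -1.0
    scale = v * sigma * (1 + lam * sign)
    return p / (2 * v * sigma * g1) * math.exp(-(abs(z) / scale) ** p)

def moments_by_quadrature(mu, sigma, lam, p, half_width=40.0, n=80001):
    """Trapezoid-rule estimates of total mass, mean and standard deviation."""
    lo = mu - half_width * sigma
    step = 2 * half_width * sigma / (n - 1)
    mass = mean = second = 0.0
    for i in range(n):
        y = lo + i * step
        weight = step * (0.5 if i in (0, n - 1) else 1.0)
        f = sged_pdf(y, mu, sigma, lam, p) * weight
        mass += f
        mean += y * f
        second += y * y * f
    return mass, mean, math.sqrt(second - mean**2)
```

With `lam=0` and `p=2` the function reduces to the normal density, which gives a quick sanity check on the constants.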
While other well-established parametric distributions exist for the modeling of skewness and kurtosis (notably, the skewed t of Hansen (1994)), we employ the SGED for several reasons. Both distributions belong to the family of skewed, generalized t distributions, but the SGED offers greater flexibility. As demonstrated in Feunou et al. (2016), for a given level of kurtosis, the SGED can achieve higher levels of skewness than the skewed-t distribution. More importantly for our application, the SGED places greater probability mass in the tails compared to a moment-matched skewed-t distribution (see Note 1). We provide more details on the shape and quantiles of these distributions in Appendix C.

2.2. Joint Distribution via Copula

Up to this point, we have focused on the marginal distributions of individual variables. We take a step further and estimate their joint distribution to evaluate the risk of joint events. If the two distributions are fully independent, then the probability of a joint event is simply the product of the two marginal probabilities. However, in our application, there is strong dependence between the variables, so we combine the two marginal densities using a copula. This offers a way to separate margins from the dependence structure and build more flexible multivariate distributions.
We utilize a BB7 copula (also known as a Joe–Clayton copula), which permits asymmetric tail dependence between the two variables, to link our individual marginal SGED functions into a joint distribution. Specifically, we use the symmetrized Joe–Clayton copula of Patton (2006). Let $F_1(\cdot \mid \mathcal{F}_t)$ and $F_2(\cdot \mid \mathcal{F}_t)$ denote the conditional SGED forecast cumulative distribution functions of $y_{1,t}$ and $y_{2,t}$, and define the probability integral transforms $u_1 = F_1(y_{1,t})$ and $u_2 = F_2(y_{2,t})$, which are uniformly distributed when the marginals are correctly specified; the BB7 copula is
\[
C_{JC}(u_1, u_2 \mid \tau^U, \tau^L) = 1 - \left( 1 - \left\{ \left[ 1 - (1 - u_1)^{\kappa} \right]^{-\gamma} + \left[ 1 - (1 - u_2)^{\kappa} \right]^{-\gamma} - 1 \right\}^{-1/\gamma} \right)^{1/\kappa}, \tag{1}
\]
with $\kappa = 1 / \log_2(2 - \tau^U)$ and $\gamma = -1 / \log_2(\tau^L)$, where $\tau^U$ and $\tau^L$ are parameters that govern the upper- and lower-tail dependence between the variables, respectively.
The joint density can then be written as
\[
f(y_{1,t}, y_{2,t}) = f_{\mathrm{SGED}}(y_{1,t}) \cdot f_{\mathrm{SGED}}(y_{2,t}) \cdot c(u_1, u_2; \tau^U, \tau^L),
\]
where $c(\cdot)$ is the copula density associated with the BB7 copula. This representation separates marginal distributional features (including skewness and kurtosis) from the dependence structure and, in particular, allows us to model asymmetric upper- and lower-tail spillovers.
The BB7 specification is well suited to the term-premium application for two qualitative reasons. First, in contrast to the Gaussian copula, which exhibits asymptotic tail independence, the BB7 copula admits non-zero tail dependence parameters ($\tau^U$ and $\tau^L$), so joint extreme outcomes need not vanish in the limit. Second, in contrast to the Student-t copula, which delivers tail dependence that is symmetric between the upper and lower tails, the BB7 copula permits $\tau^U \neq \tau^L$, so, for example, flight-to-safety dynamics (joint lower-tail outcomes) can propagate at a different intensity than joint upside surprises in yield risk. This asymmetric flexibility is the key feature we exploit in the empirical analysis and is one we cannot reproduce with a Gaussian, Student-t, or multivariate skewed-t specification.
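The tail-dependence parameterization of the Joe–Clayton copula can be checked numerically. The sketch below is illustrative only: the function name `joe_clayton_cdf` is hypothetical, it implements the plain (non-symmetrized) Joe–Clayton form with $\kappa = 1/\log_2(2 - \tau^U)$ and $\gamma = -1/\log_2(\tau^L)$, and it assumes inputs strictly inside $(0, 1]$ and tail parameters strictly inside $(0, 1)$.

```python
import math

def joe_clayton_cdf(u1, u2, tau_u, tau_l):
    """Joe-Clayton (BB7) copula CDF parameterized by upper/lower tail dependence."""
    kappa = 1.0 / math.log2(2.0 - tau_u)
    gamma = -1.0 / math.log2(tau_l)
    a = (1.0 - (1.0 - u1) ** kappa) ** (-gamma)
    b = (1.0 - (1.0 - u2) ** kappa) ** (-gamma)
    inner = (a + b - 1.0) ** (-1.0 / gamma)
    return 1.0 - (1.0 - inner) ** (1.0 / kappa)

def lower_tail_ratio(q, tau_u, tau_l):
    """C(q, q) / q, which approaches tau_l as q -> 0."""
    return joe_clayton_cdf(q, q, tau_u, tau_l) / q

def upper_tail_ratio(q, tau_u, tau_l):
    """(1 - 2q + C(q, q)) / (1 - q), which approaches tau_u as q -> 1."""
    return (1.0 - 2.0 * q + joe_clayton_cdf(q, q, tau_u, tau_l)) / (1.0 - q)
```

Evaluating the two ratios close to the corners of the unit square recovers the chosen $\tau^L$ and $\tau^U$, which is the property that lets the tail-dependence measures be used directly as copula parameters.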

2.3. Estimation of Parameters

Because the SGED marginals and BB7 copula form a flexible family that nests a wide range of distributional shapes and dependence patterns, we maintain this functional form throughout and focus on forecasting its time-varying parameters. In this setting, producing a density forecast is equivalent to producing forecasts of the parameter vector that characterizes the predictive distribution. This approach is closely related to methods that specify explicit laws of motion for distributional moments rather than for the underlying distribution itself (Harvey and Siddique 1999; León et al. 2005).
Time variation in the predictive distribution is induced through observable, ex post characteristics of the forecast errors. Specifically, we first construct standardized forecast errors from a point-forecast model and then compute realized measures of skewness and kurtosis from them. The realized skewness and kurtosis summarize the empirical shape of the forecast-error distribution at time t.
In the following section, we describe how these realized moments are constructed using horizon-specific dynamic equations and are subsequently mapped into the parameters of the SGED and the copula.

3. Forecasting Methodology

Our approach proceeds in two steps. First, we obtain point forecasts and construct forecast errors at each horizon. Second, we map the resulting empirical second through fourth moments into SGED parameters and specify simple, estimable laws of motion for each parameter (including the copula tail-dependence parameters). The remainder of this section details these steps and shows how they combine to deliver multi-horizon forecasts of the full joint distribution.

3.1. Point Forecast and Forecast Errors

We introduce the forecasting model by considering two variables of interest, $y_{1,t}$ and $y_{2,t}$, which serve as placeholders for any macroeconomic or financial series for which a density forecast is desired. Throughout Section 3.1, Section 3.2 and Section 3.3, the marginal specification is described for a single representative series ($y_{i,t}$, $i \in \{1, 2\}$), and we suppress the country index ($i$) when no ambiguity arises; the cross-country dependence structure is re-introduced explicitly in Section 3.4. Throughout, $E_t[\cdot]$ denotes the conditional expectation under the information set ($\mathcal{F}_t$) available at time $t$, which contains lagged values of $y_{i,t}$ and the predictors ($x_t$).
Our approach follows a two-step procedure. In the first step, we obtain point forecasts for $y_t$, given the predictors ($x_t$). In principle, any forecasting model can be used, provided it yields forecast errors with the regularity properties stated below. We therefore leave the first-stage point-forecasting model in functional form, as denoted by $f$:
\[
E_t[y_{t+h}] = f(x_t),
\]
where x t is a set of predictors deemed useful for forecasting, potentially including lagged values of y t . A key advantage of this framework is the conditional independence of the second stage from the first, given the forecast errors. This modularity allows for flexible substitution of forecasting models in the first stage. The point-forecast specification we use for the term-premium application—a shared local-level VARX—is detailed in Section 4.1 and Appendix A.
We interpret the point forecast $E_t[y_{t+h}]$ as the mean of the predictive distribution:
\[
\mu_t^{(h)} \equiv E_t[y_{t+h}],
\]
where $E_t[y_{t+h}]$ is the h-step-ahead point forecast from the first-stage model. The forecast error at horizon $h$ is defined as
\[
e_{t+h} = y_{t+h} - E_t[y_{t+h}].
\]
Note that the forecast errors ($e_{t+h}$) are implicitly a function of the forecast horizon so that $e_{t+h} \equiv e^{(h)}_{t+h}$; we suppress this horizon argument for notational clarity wherever no ambiguity arises. We impose the following minimal regularity conditions on the forecast error process:
1. Finite Fourth Moments: Each component of the forecast error vector has finite fourth moments:
\[
E[e_t^4] < \infty.
\]
2. Weak Dependence and Stationarity: The process $\{e_t\}$ is assumed to be covariance-stationary and serially uncorrelated:
\[
E[e_t] = 0, \qquad \mathrm{Cov}(e_t, e_{t-k}) = 0 \ \text{for } k \neq 0.
\]
We do not require the errors to be independently and identically distributed, allowing for conditional heteroskedasticity and non-Gaussian features.
These assumptions are standard in most financial econometric applications. Rather than estimating higher moments directly, we treat functions of standardized forecast errors as noisy proxies for time variation in skewness and kurtosis, which are subsequently mapped into SGED shape parameters.

3.2. Modeling Volatility and Higher Moments

The forecast errors ($e_{t+h}$) capture the stochastic component of the point forecast, and we exploit information contained in these errors to construct the predictive density. We adopt the standard multiplicative decomposition, i.e.,
\[
e_{t+h} = \sigma_t^{(h)} u_{t+h},
\]
where $\sigma_t^{(h)}$ is the conditional scale of the forecast error known at time $t$ and $u_{t+h}$ is a standardized innovation with zero mean and unit variance. This decomposition allows higher-order moments to be derived from the innovations ($u_{t+h}$).

3.2.1. Realized Volatility

Here and throughout, a superscript r denotes a realized (ex post) quantity computed directly from the forecast errors, as distinct from the model-based forecast of that quantity introduced below. We first construct an ex post realized volatility proxy using an exponentially weighted moving average (EWMA) of squared forecast errors, i.e.,
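\[
\left( \sigma_t^{(r,h)} \right)^2 = \alpha \left( \sigma_{t-1}^{(r,h)} \right)^2 + (1 - \alpha)\, e_t^2,
\]
where $\alpha = 0.94$ (see Note 2). The corresponding realized log volatility is defined as
\[
\ell_t^{(r,h)} \equiv \log \sigma_t^{(r,h)} = \log \sqrt{ \left( \sigma_t^{(r,h)} \right)^2 }.
\]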
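The EWMA recursion is straightforward to sketch in code. The example below is a minimal illustration with hypothetical function names; the start-up value of the recursion is not stated in the text, so seeding it with the first squared error (or a user-supplied `init`) is an assumption.

```python
import math

def ewma_variance(errors, alpha=0.94, init=None):
    """EWMA of squared forecast errors; returns one realized variance per date."""
    # Assumption: the recursion is seeded with the first squared error unless init is given.
    var = errors[0] ** 2 if init is None else init
    path = []
    for e in errors:
        var = alpha * var + (1.0 - alpha) * e * e
        path.append(var)
    return path

def realized_log_vol(errors, alpha=0.94, init=None):
    """Realized log volatility: half the log of the EWMA variance."""
    return [0.5 * math.log(v) for v in ewma_variance(errors, alpha, init)]
```

With `alpha = 0.94` each new squared error receives a weight of 0.06, so the proxy reacts gradually to a single large forecast error rather than jumping.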

3.2.2. Forecast Volatility

Rather than modeling realized variance directly, we work with realized log volatility, which is unbounded on the real line, whereas the variance must remain positive. We model realized log volatility using a predictive regression as follows:
\[
\ell_{t+h}^{(r,h)} = \beta_2^{(h)} X_t + \rho_2^{(h)} \ell_t^{(r,h)} + \nu_{2,t+h}.
\]
After estimating the above regression, the resulting ex ante volatility forecast is obtained by exponentiating the fitted value, i.e.,
\[
\sigma_t^{(h)} \equiv \exp\left( \hat{\ell}_{t+h}^{(r,h)} \right) = \exp\left( \hat{\beta}_2^{(h)} X_t + \hat{\rho}_2^{(h)} \ell_t^{(r,h)} \right),
\]
which serves as the conditional SGED scale parameter of the predictive distribution at horizon $h$.

3.3. Higher Moments and SGED Parameters

Given $\sigma_t^{(h)}$, the standardized innovation is defined as
\[
u_{t+h} = \frac{e_{t+h}}{\sigma_t^{(h)}}
\]
and forms the basis for the construction of realized (ex post) proxies for skewness and kurtosis, which are subsequently forecast and mapped into the time-varying SGED shape parameters. Again, we suppress the horizon index when no ambiguity arises.
We define empirical, realized proxies for skewness and kurtosis ($s_t^{(r,h)}$ and $k_t^{(r,h)}$) as functions of the standardized innovations and utilize the EWMA approach as per the realized volatility, i.e.,
\[
s_t^{(r,h)} = \alpha\, s_{t-1}^{(r,h)} + (1 - \alpha)\, u_t^3,
\]
\[
k_t^{(r,h)} = \alpha\, k_{t-1}^{(r,h)} + (1 - \alpha)\, u_t^4.
\]
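The same smoothing applies to the third and fourth powers of the standardized innovations. The following sketch uses a hypothetical function name and assumes Gaussian start-up values (skewness 0, kurtosis 3), since the initialization is not specified in the text.

```python
def ewma_higher_moments(u, alpha=0.94, s0=0.0, k0=3.0):
    """EWMA proxies for skewness and kurtosis from standardized innovations u."""
    # Assumption: the recursions start from the Gaussian values (0, 3).
    s, k = s0, k0
    path = []
    for x in u:
        s = alpha * s + (1.0 - alpha) * x**3
        k = alpha * k + (1.0 - alpha) * x**4
        path.append((s, k))
    return path
```

A single innovation of -2 moves the skewness proxy by 0.06 times -8, which shows how quickly one large negative surprise tilts the implied shape.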
The SGED admits a mapping from its shape parameters $(\lambda, p)$ to skewness and kurtosis:
\[
\mathcal{M} : (\lambda, p) \mapsto (s, k),
\]
where $s(\lambda, p)$ and $k(\lambda, p)$ denote the skewness and kurtosis implied by $(\lambda, p)$. The SGED does not admit a closed-form moment inversion ($\mathcal{M}^{-1}$) mapping $(s, k)$ into $(\lambda, p)$, so we use numerical methods to invert the realized moments into SGED parameters. We let $g(\cdot) = (g_1(\cdot), g_2(\cdot))$ denote the numerical mapping from third and fourth moments to SGED parameters. Given standardized forecast errors ($u_t$), we obtain the following:
\[
\lambda_t^{(r,h)} = g_1\left( s_t^{(r,h)}, k_t^{(r,h)} \right),
\]
\[
p_t^{(r,h)} = g_2\left( s_t^{(r,h)}, k_t^{(r,h)} \right).
\]
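One simple way to realize the numerical inversion is a grid search over the admissible shape region. The sketch below is an assumption-laden stand-in for the unspecified mapping $g(\cdot)$: the function names are hypothetical, the forward map uses the raw moments of the two-piece SGED kernel (derived under the standard parameterization), the search bounds $(-0.95, 0.95)$ and $(0.5, 9.5)$ are taken from the text, and the equal weighting of skewness and kurtosis errors is an arbitrary choice.

```python
import math

def sged_skew_kurt(lam, p):
    """Skewness and kurtosis implied by SGED shape parameters (lam, p)."""
    g1 = math.gamma(1.0 / p)

    def raw(k):
        # k-th raw moment of the unit-scale, uncentred two-piece kernel
        sides = (1 + lam) ** (k + 1) + (-1) ** k * (1 - lam) ** (k + 1)
        return math.gamma((k + 1.0) / p) / (2.0 * g1) * sides

    m1, m2, m3, m4 = raw(1), raw(2), raw(3), raw(4)
    var = m2 - m1**2
    c3 = m3 - 3 * m1 * m2 + 2 * m1**3
    c4 = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4
    return c3 / var**1.5, c4 / var**2

def invert_moments(s, k, lam_step=0.01, p_step=0.05):
    """Grid search for the (lam, p) whose implied (skew, kurt) is closest to (s, k)."""
    best, best_dist = None, float("inf")
    n_lam = int(round(1.9 / lam_step))
    n_p = int(round(9.0 / p_step))
    for i in range(1, n_lam):
        lam = -0.95 + i * lam_step
        for j in range(1, n_p):
            p = 0.5 + j * p_step
            s_hat, k_hat = sged_skew_kurt(lam, p)
            dist = (s_hat - s) ** 2 + (k_hat - k) ** 2
            if dist < best_dist:
                best, best_dist = (lam, p), dist
    return best
```

A grid is slow but transparent; a production implementation would more plausibly use a root-finder or a precomputed lookup table, and realized moment pairs outside the attainable SGED region would simply be mapped to the nearest boundary point.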

Forecasting SGED Shape Parameters

The SGED shape parameters are constrained to lie in bounded regions, i.e., $\lambda \in (\lambda_{\min}, \lambda_{\max})$ and $p \in (p_{\min}, p_{\max})$. To work on an unrestricted scale, we apply a smooth, monotone logit-type transformation (the inverse of the sigmoid function) that maps each parameter to the real line.
Specifically, the transformed parameters are defined as
\[
\tilde{\lambda}_t^{(r,h)} \equiv \log\left( \frac{\lambda_t^{(r,h)} - \lambda_{\min}}{\lambda_{\max} - \lambda_t^{(r,h)}} \right), \tag{15}
\]
\[
\tilde{p}_t^{(r,h)} \equiv \log\left( \frac{p_t^{(r,h)} - p_{\min}}{p_{\max} - p_t^{(r,h)}} \right) \tag{16}
\]
and are unconstrained on the real line. We set $p_{\min}$ and $p_{\max}$ to 0.5 and 9.5, respectively, and $\lambda_{\min}$ and $\lambda_{\max}$ to −0.95 and 0.95.
We then model the transformed shape parameters using predictive regressions with horizon-specific coefficients, i.e.,
\[
\tilde{\lambda}_{t+h}^{(r,h)} = \beta_3^{(h)} X_t + \rho_3^{(h)} \tilde{\lambda}_t^{(r,h)} + \nu_{3,t+h},
\]
\[
\tilde{p}_{t+h}^{(r,h)} = \beta_4^{(h)} X_t + \rho_4^{(h)} \tilde{p}_t^{(r,h)} + \nu_{4,t+h}.
\]
The SGED shape parameters are obtained by applying the sigmoid transformation (the inverse of the map above) to the fitted transformed states. Specifically, let $\hat{\tilde{\lambda}}_{t+h}^{(r,h)}$ and $\hat{\tilde{p}}_{t+h}^{(r,h)}$ denote the fitted values from the predictive regressions. The corresponding horizon-h forecasts of the SGED shape parameters are then given by
\[
\lambda_t^{(h)} \equiv \lambda_{\min} + \frac{\lambda_{\max} - \lambda_{\min}}{1 + \exp\left( -\hat{\tilde{\lambda}}_{t+h}^{(r,h)} \right)}, \tag{19}
\]
\[
p_t^{(h)} \equiv p_{\min} + \frac{p_{\max} - p_{\min}}{1 + \exp\left( -\hat{\tilde{p}}_{t+h}^{(r,h)} \right)}. \tag{20}
\]
Our fitted SGED function then takes the following form:
\[
f_{\mathrm{SGED}}\left( y_{t+h};\, \mu_t^{(h)}, \sigma_t^{(h)}, \lambda_t^{(h)}, p_t^{(h)} \right).
\]
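The bounded-to-unbounded transformation and its inverse form a simple round trip, sketched below with hypothetical function names. The bounds are those given in the text; the code assumes the input lies strictly inside the interval.

```python
import math

def to_real_line(x, lo, hi):
    """Map x in (lo, hi) to the real line via the log-odds of its relative position."""
    return math.log((x - lo) / (hi - x))

def to_bounded(z, lo, hi):
    """Inverse map: send any real z back into (lo, hi) via the logistic function."""
    return lo + (hi - lo) / (1.0 + math.exp(-z))

# Round trip for the two SGED shape parameters with the bounds stated in the text.
lam_tilde = to_real_line(0.3, -0.95, 0.95)
p_tilde = to_real_line(1.5, 0.5, 9.5)
lam_back = to_bounded(lam_tilde, -0.95, 0.95)
p_back = to_bounded(p_tilde, 0.5, 9.5)
```

Because the regression is fitted on the transformed scale, any fitted value, however extreme, maps back to an admissible shape parameter; a transformed value of zero corresponds to the midpoint of the interval.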

3.4. Joint Distributional Parameters

The BB7 copula shown in Equation (1) takes two parameters that are functions of the upper and lower tail-dependence measures ($\tau^U$ and $\tau^L$).
We approximate the time-varying $\tau_t^L$ and $\tau_t^U$ nonparametrically, using rolling-window quantile dependence measures of the standardized forecast errors ($u_t$), which allows the copula parameters to evolve over time.
Let $u_{1,t}$ and $u_{2,t}$ denote the standardized one-step-ahead forecast errors of the two series at time $t$ from a given forecast horizon. Again, we drop the implicit dependence on the forecast horizon from our notation below for clarity. For each day ($t$), we construct a rolling subsample with a length of 21 trading days (approximately one month), i.e.,
\[
\tilde{u}_{i,t} = \left\{ u_{i,s} : s = t - 20, \ldots, t \right\},
\]
and compute empirical measures of quantile dependence as in Patton (2006). Specifically, for quantile levels of $q^L = 0.25$ and $q^U = 0.75$, we define the realized dependence at the given quantiles as
\[
\tau_t^{(r,L)} = \frac{1}{21\, q} \sum_{s=t-20}^{t} \mathbf{1}\left\{ u_{1,s} \le q,\; u_{2,s} \le q \right\}, \qquad 0 < q \le \tfrac{1}{2},
\]
\[
\tau_t^{(r,U)} = \frac{1}{21\, (1 - q)} \sum_{s=t-20}^{t} \mathbf{1}\left\{ u_{1,s} > q,\; u_{2,s} > q \right\}, \qquad \tfrac{1}{2} < q < 1.
\]
The resulting series ($\tau_t^{(r,L)}$, $\tau_t^{(r,U)}$) provide rolling-window estimates of lower- and upper-tail dependence between the two forecast errors.
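The rolling quantile-dependence measures amount to counting joint exceedances in the most recent window. The sketch below uses a hypothetical function name and assumes that the inputs have already been mapped to the unit interval (for example, through the fitted marginal CDFs or empirical ranks), so that comparing them with the quantile levels 0.25 and 0.75 is meaningful.

```python
def quantile_dependence(u1, u2, q_low=0.25, q_high=0.75, window=21):
    """Lower and upper quantile dependence over the last `window` paired observations.

    Assumption: u1 and u2 hold values on the unit interval (probability
    integral transforms), not raw standardized errors.
    """
    pairs = list(zip(u1, u2))[-window:]
    n = len(pairs)
    both_low = sum(1 for a, b in pairs if a <= q_low and b <= q_low)
    both_high = sum(1 for a, b in pairs if a > q_high and b > q_high)
    tau_low = both_low / (n * q_low)
    tau_high = both_high / (n * (1.0 - q_high))
    return tau_low, tau_high
```

In short windows the estimates can equal 0 or reach 1 and above, so some clipping into the open unit interval would be needed before a log-odds transformation; how the source handles such cases is not stated.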
As per Equations (15) and (16), we transform the realized tail-dependence measures onto the real line to obtain unconstrained counterparts ($\tilde{\tau}_t^{(r,L)}$ and $\tilde{\tau}_t^{(r,U)}$) and, again, specify a law of motion for them:
\[
\tilde{\tau}_{t+h}^{(r,L)} = \beta_5^{(h)} X_t + \rho_5^{(h)} \tilde{\tau}_t^{(r,L)} + \nu_{5,t+h},
\]
\[
\tilde{\tau}_{t+h}^{(r,U)} = \beta_6^{(h)} X_t + \rho_6^{(h)} \tilde{\tau}_t^{(r,U)} + \nu_{6,t+h}.
\]
The fitted values ($\hat{\tilde{\tau}}_t^{(r,L)}$ and $\hat{\tilde{\tau}}_t^{(r,U)}$) are then transformed as per Equations (19) and (20) to be constrained between 0 and 1, yielding the copula dependence parameters $\tau_t^L$ and $\tau_t^U$ that enter
\[
C_{JC}\left( u_{1,t}, u_{2,t} \mid \tau_t^U, \tau_t^L \right),
\]
thereby defining our copula function for a given forecast horizon ($h$).

3.5. Estimation and Calibration

Estimation proceeds in two layers. First, for each horizon (h), we estimate the point-forecasting model ( f ( x t ) ) and obtain forecast errors ( e t + h ). In the second stage, the law of motion for each realized moment or tail-dependence measure is estimated using OLS.
This second stage is carried out separately for each horizon (h) following a local-projection approach in the spirit of Jordà (2005). Rather than imposing a single dynamic structure across horizons, we allow the conditional mean, volatility, skewness, kurtosis, and copula tail-dependence parameters to evolve according to horizon-specific predictive regressions. This yields a distributional local-projection framework in which the entire predictive distribution is forecast directly at each horizon, without requiring iterative multi-step simulation or parametric restrictions linking short- and long-horizon dynamics.
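A horizon-specific predictive regression of this kind can be sketched with ordinary least squares on a constant, one predictor, and the current value of the state. The code below is illustrative only: the function names are hypothetical, a single scalar predictor stands in for the vector $X_t$, and the inclusion of an intercept is an assumption.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (m[r][n] - tail) / m[r][r]
    return sol

def direct_projection(z, x, h):
    """OLS of z[t+h] on (1, x[t], z[t]); returns [const, beta, rho] for horizon h."""
    rows = [(1.0, x[t], z[t]) for t in range(len(z) - h)]
    target = [z[t + h] for t in range(len(z) - h)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(3)]
    return solve(xtx, xty)

def project(coefs, x_t, z_t):
    """h-step-ahead fitted value of the state given today's predictor and state."""
    const, beta, rho = coefs
    return const + beta * x_t + rho * z_t
```

Running `direct_projection` once per horizon and per state (log volatility, transformed shape parameters, transformed tail dependence) gives a separate coefficient set for each, with no link imposed across horizons.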

4. Application: US–Canada Term-Premium Risk

The empirical objective of this section is twofold. First, we illustrate how the distributional framework of Section 2 and Section 3 operates in practice using daily five-year term-premium data for the United States and Canada. Second, we use the framework to characterize cross-country risk during the December 2024 episode—a period in which the U.S. term premium rose sharply while its Canadian counterpart rose more modestly—and assess whether the rise in U.S. tail risk translated into elevated Canadian tail risk or, instead, reflected a reallocation of risk toward more idiosyncratic U.S. outcomes. We treat the analysis as an illustrative case study rather than a horse race against alternative multivariate models; complementary statistical diagnostics, a Gaussian-copula benchmark, and placebo windows are introduced alongside the main results to gauge the strength of the empirical signal.

4.1. Data and Point-Forecast Model

We study the joint distribution of U.S. and Canadian five-year bond-term premia. For this analysis, the term premium is treated as observed and is measured using the estimates from the term structure model of Feunou et al. (2022) (see Note 3).
Term premia are not directly observed prices but model-implied latent variables: they are constructed by decomposing observed nominal yields into an expectation component (the path of future short rates anticipated by the market) and a residual term premium that reflects compensation for duration risk. A well-known consequence is that term-premium estimates are sensitive to the underlying term-structure specification, and different identifying assumptions can yield economically meaningful differences across the Kim and Wright (2005), Joslin et al. (2011), Adrian et al. (2013), and Feunou et al. (2022) frameworks. Our two-stage density-forecasting framework is modular with respect to the first-stage point forecast and, in particular, with respect to the upstream term-premium estimate: any well-defined daily series can be substituted as input, and a robustness check against alternative term-premium constructions (e.g., Adrian et al. 2013) is a natural follow-up. We emphasize that the second-stage marginal and copula objects estimated below inherit this upstream sensitivity: because the dependent variables themselves differ across term-structure specifications, the BB7 upper-tail dependence parameter ($\tau^U$) and the conditional spillover probabilities reported in Section 4 should be read as conditional on the Feunou et al. (2022) term-premium measure, and a fully normalized cross-estimator comparison is left to future work. We treat the first-stage term-premium estimate as observed rather than passing its uncertainty into the predictive density, so the marginal fit and the dependence dynamics stay separate.
The role of the point-forecast model in our framework is not to maximize point-forecast accuracy per se but to generate economically meaningful forecast errors whose distribution can be modeled and forecast. To this end, we exploit the strong long-run relationship between U.S. and Canadian term premia. As in most integrated sovereign bond markets, the two series are cointegrated, implying the existence of a stable long-run equilibrium relationship. While term premia may diverge in the short run in response to country-specific shocks, they tend to comove over longer horizons.
Our forecasting model is therefore built around a cointegrated two-variable system with a shared stochastic level, the shared local-level VARX described in detail in Appendix A. The model proceeds in three steps. First, we identify the long-run relationship between U.S. and Canadian term premia using the Engle–Granger approach and extract the associated common-trend direction. This common component is modeled as a random walk and captures slow-moving variation shared across both markets (Stock and Watson 1988); because the common trend follows a random walk, its forecast at any horizon is simply its current level, which we estimate using a Kalman filter.
Second, subtracting the estimated common level from each series yields stationary deviations that reflect short- and medium-run dynamics; these deviations are modeled using a horizon-specific VARX that includes lagged term premia and a set of high-frequency market-based exogenous predictors (listed in Appendix B). Third, forecasts of the stationary component are recombined with the projected common level to form the overall point forecast. This construction ensures consistency with the cointegrating relationship while isolating the forecast errors that are subsequently used to construct density forecasts. The three-step decomposition and the closed-form expression for the h-step-ahead forecast are given in Appendix A.
Because the point-forecast model is estimated as a direct h-step projection, its forecast errors overlap and are serially correlated, so downstream regressions on rolling-window quantities use Newey–West standard errors with a bandwidth of at least h. Summary statistics for the shared local-level VARX (the Engle–Granger cointegration coefficient and the horizon-specific point-forecast diagnostics $R^2_\Delta$, RMSE, and Theil U, for both the U.S. and Canadian equations) are reported in Table 1 at the end of this section.
Figure 1 illustrates the U.S. and Canadian five-year term premia, together with the estimated common stochastic level. The two series exhibit pronounced co-movement over the sample, consistent with a strong long-run relationship. The extracted common trend captures the slow-moving component shared across both markets, while higher-frequency deviations reflect short-run, country-specific dynamics.
Table 1 summarizes the in-sample fit of the first-stage shared-trends model. Panel A reports the Engle–Granger cointegrating regression ($y_{\mathrm{US},t} = c + \theta\, y_{\mathrm{CA},t} + u_t$) estimated on the daily level series. Panel B reports, for each forecast horizon ($h \in \{7, 21, 63, 126, 252\}$ trading days), the effective sample size, the root mean squared forecast error (in basis points), the $R^2$ of the model in change space ($R^2_\Delta = 1 - \mathrm{RMSE}^2_{\mathrm{model}} / \mathrm{RMSE}^2_{\mathrm{RW}}$, where the random-walk benchmark uses h-step level changes), and the Theil U statistic (model RMSE divided by random-walk RMSE; values below one indicate the model improves on the random-walk benchmark). We assess fit in change space ($R^2_\Delta$) rather than reporting a conventional level $R^2$: because term premia are highly persistent, a level-fit $R^2$ would be mechanically close to one and uninformative about genuine forecast skill.
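The Panel B diagnostics can be illustrated with a minimal Python sketch. The function names, the dictionary layout of the forecasts, and the toy numbers are all hypothetical and serve only to show how the change-space $R^2_\Delta$ and the Theil U relate to each other; this is not the code behind Table 1.

```python
import math

def rmse(errors):
    """Root mean squared error of a sequence of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def change_space_fit(y, forecasts, h):
    """Return (rmse_model, theil_u, r2_delta) for h-step-ahead forecasts.

    y         : list of realized levels
    forecasts : dict mapping forecast origin t -> forecast of y[t + h]
    """
    model_err = [y[t + h] - f for t, f in forecasts.items()]
    # Random-walk benchmark: the forecast of y[t + h] is y[t],
    # so its error is the h-step level change.
    rw_err = [y[t + h] - y[t] for t in forecasts]
    rmse_model, rmse_rw = rmse(model_err), rmse(rw_err)
    theil_u = rmse_model / rmse_rw
    r2_delta = 1.0 - theil_u ** 2
    return rmse_model, theil_u, r2_delta

# Toy illustration with made-up numbers (not the paper's data).
y = [0.30, 0.32, 0.35, 0.33, 0.38, 0.41]
forecasts = {0: 0.34, 1: 0.34, 2: 0.37, 3: 0.40}  # hypothetical h = 2 forecasts
rmse_model, theil_u, r2_delta = change_space_fit(y, forecasts, h=2)
```

Because $R^2_\Delta = 1 - U^2$ under these definitions, a Theil U slightly above one implies a slightly negative change-space $R^2$; for example, $U = 1.03$ corresponds to $R^2_\Delta \approx -0.06$.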
The Engle–Granger slope is $\hat{\theta} = 1.06$ with a t-statistic of 195 and $R^2 = 0.85$, confirming the long-run U.S.–Canada cointegrating relationship exploited by the shared-trend construction. In Panel B, the shared-trend specification adds value relative to the random-walk benchmark in the medium-horizon range relevant to the density-forecasting application: the Theil U falls below one at horizons of one month and beyond for the U.S. equation and at the one- and three-month horizons for the Canadian equation. At the one-week horizon, the Theil U is close to one for both equations (1.03 for the U.S. and 1.02 for Canada), so the shared local-level VARX produces forecasts essentially as good as the random-walk benchmark at this short horizon; the shared-trend component carries most of the short-run predictive content, and the VARX contribution is small. At the longest horizons (six months to one year), the Canadian equation’s residuals are larger than those of a random-walk forecast in change space; we interpret this as residual idiosyncratic variation in Canadian term premia at long horizons that the shared-trend specification is not designed to capture. The model is calibrated based on deviations from the shared U.S.–Canada trend, not to maximize point-forecast accuracy at all horizons; the density forecasts below use the seven-trading-day horizon, where the forecast errors have desirable distributional properties as documented in Section 2.

4.2. Distributional Properties of Forecast Errors

Using forecast errors from the shared-trends regression described above, we construct empirical distributions of forecast errors across horizons. Figure 2 displays the unconditional forecast-error distributions at selected horizons for the U.S. and Canadian term premia, together with a Gaussian benchmark.
At the one-day horizon, a normal approximation provides a reasonable description of the empirical error distribution. At longer horizons—ranging from one week to three months—the kernel density estimates (solid blue lines) deviate noticeably from the Gaussian benchmark (red dashed line), exhibiting pronounced skewness and excess kurtosis. In particular, for the U.S. term premium at the one-week, one-month, and three-month horizons and for the Canadian term premium at the one-month and three-month horizons, the distributions display clear right-heavy tails.
It is important to emphasize that these are unconditional empirical distributions constructed using forecast errors from the full sample. As such, they average over periods of relatively calm market conditions and episodes of heightened volatility. In real time, forecast-error distributions may exhibit even stronger departures from Gaussianity during periods of market stress. These empirical features motivate the use of flexible parametric distributions such as the SGED that can accommodate time-varying skewness and tail risk in density forecasting.

4.3. Density Calibration

Having documented departures from Gaussianity in the unconditional forecast-error distributions, we now assess whether the SGED-based predictive density we estimate in Section 2 and Section 3 delivers calibrated forecasts relative to a natural parametric alternative, the skewed t of Hansen (1994). Calibration is a property of the full predictive density rather than of any single moment, and the standard diagnostic is the probability integral transform (PIT) of Diebold et al. (1997): if the predictive CDF is correctly specified, the realized observation evaluated through the forecast CDF is uniformly distributed on $[0, 1]$. Systematic clustering of PIT values near 0 or 1 indicates miscalibrated tail probabilities, while interior deviations from uniformity indicate over- or under-dispersion.
Figure 3 reports interior PIT histograms for the SGED and skewed-t specifications at the one-week-ahead horizon for both U.S. and Canadian term premia.4 The skewed-t results, in the bottom row, exhibit a more pronounced U shape than the SGED in the top row, indicating that the skewed-t predictive density understates the probability of adverse forecast errors. The SGED histograms are visibly closer to uniform. A simple summary statistic, the range between the highest- and lowest-density interior bins, is 0.370 for the SGED versus 0.736 for the skewed t in the U.S. case and 0.361 versus 0.450 for Canada, indicating that the SGED predictive density is closer to a calibrated forecast than the skewed t at the one-week horizon for both series. Formal uniformity tests (Kolmogorov–Smirnov, Anderson–Darling, and Berkowitz tests) are a natural complement to this visual diagnostic; we discuss this further in Appendix D and treat the formal testing exercise as a follow-up to the present paper.
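As a rough illustration of the PIT diagnostic and the interior-bin range statistic, the following Python sketch may help. It rests on two assumptions not stated in the text: "interior" is taken to mean all equal-width bins except the first and last, and a Gaussian CDF stands in for the SGED and skewed-t predictive CDFs. The function names are hypothetical.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Gaussian CDF, used here only as a stand-in predictive CDF."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def pit_values(realized, cdfs):
    """Evaluate each realization through its own forecast CDF."""
    return [cdf(y) for y, cdf in zip(realized, cdfs)]

def interior_bin_range(pits, n_bins=10):
    """Highest-minus-lowest histogram density over the interior bins.

    Assumption: 'interior' means every equal-width bin except the first and last.
    """
    counts = [0] * n_bins
    for u in pits:
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    # Normalize counts so that each bin equals 1.0 under exact uniformity.
    density = [c * n_bins / len(pits) for c in counts]
    interior = density[1:-1]
    return max(interior) - min(interior)

# Toy usage with made-up realizations and a common stand-in forecast CDF.
realized = [-0.5, 0.1, 1.2]
pits = pit_values(realized, [normal_cdf] * len(realized))
```

Under this definition, a perfectly calibrated forecast would give a statistic near zero in large samples, so smaller values indicate a flatter interior histogram.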
These calibration results lend support to our choice of the SGED as the marginal distribution in the joint density framework and complement the theoretical skewness–kurtosis-bound argument summarized in Appendix C. They also bear on the interpretation of the conditional-spillover results in Section 4.5: the conditional upper-tail probabilities reported there are computed from a marginal specification whose PIT-based calibration appears reasonable, which suggests that the documented decline in conditional dependence over December 2024 is unlikely to have been driven primarily by tail-probability miscalibration of the underlying SGED marginals.

4.4. Joint Density Estimation

One benefit of our framework is that it links high-frequency market developments to both the marginal distributional parameters and the cross-country dependence structure in real time. This allows us to track not only changes in overall uncertainty but also shifts in the balance of risks facing the joint U.S.–Canadian bond market. We illustrate this feature by examining December 2024, a period characterized by a sharp increase in the U.S. term premium and, to a lesser extent, the Canadian counterpart.
Between 6 December and 17 December 2024, the U.S. term premium increased by approximately 21 basis points, rising from 29 bp to 50 bp, while the Canadian term premium rose by 9 basis points, from 43 bp to 52 bp. Market commentary at the time emphasized U.S.-specific fiscal and supply-side concerns following the U.S. election, among other potential explanations for the sharp increase. This divergence provides a natural setting to assess whether rising U.S. term-premium risk translated into elevated joint tail risk for Canada or, instead, reflected a reallocation of risk toward more idiosyncratic U.S. outcomes.
Figure 4 shows the evolution of U.S. and Canadian term premia over December 2024. While both series increase over the month, the rise in the U.S. term premium is noticeably steeper. This divergence suggests a potential rebalancing of risks across the two markets.
Figure 5 displays one-day-ahead and seven-day-ahead joint density forecasts for the U.S. and Canadian term premia on two representative trading days in December 2024. Each panel shows the full bivariate predictive distribution implied by the SGED marginals and the BB7 copula. Across both horizons, the joint densities retain a stable upward-sloping orientation, indicating persistent positive dependence between the two markets.
Importantly, however, the shape of the joint distribution evolves over time. By 27 December, the joint density shifts upward and becomes more concentrated in the upper-right quadrant. This reflects a reallocation of probability mass toward scenarios in which both term premia are elevated. In risk terms, the balance of outcomes shifts away from downside or symmetric risks and toward joint upside surprises. The widening of the contours therefore reflects not merely higher overall uncertainty but a change in the composition of risk facing the joint bond market.
Taken together, these joint density forecasts indicate that December 2024 was characterized by a shift in the balance of joint risks rather than a uniform increase in uncertainty. Probability mass moved toward scenarios involving higher term premia in both countries, reflecting heightened upside risk in global bond markets. At the same time, joint densities alone do not distinguish between common risk rebalancing and directional spillovers from the United States to Canada. To assess whether elevated U.S. term-premium risk translated into stronger cross-border transmission, we next examine conditional distributions.

4.5. Conditional Spillover Risk

While joint density forecasts characterize the overall balance of risks facing the U.S. and Canadian bond markets, they do not, by themselves, imply spillovers or directional transmission. Conditional distributions provide a sharper and more policy-relevant measure of cross-country exposure by quantifying how risks in one market translate into risks in the other. In this section, we use the estimated joint distribution to assess the likelihood that elevated U.S. term-premium outcomes spill over into Canada over a seven-day (one-week) forecast horizon during December 2024.
Figure 6 illustrates the seven-day-ahead conditional density of the Canadian term premium, given that the U.S. term premium exceeds a high threshold ($\mathrm{US} \geq 0.8$), together with the unconditional Canadian density, for three representative dates in December 2024. On 2 December, the conditional Canadian density is sharply concentrated at high values, indicating that extreme U.S. term-premium realizations were highly informative about Canadian outcomes. By 16 December, this conditional concentration weakens, and by 30 December, the conditional density shifts closer to the unconditional distribution. This visual evidence suggests a progressive decline in spillover strength over the month, even as U.S. term-premium risk remained elevated at short horizons.
To quantify these changes, Table 2 reports conditional upper-tail probabilities in the form of $\Pr_{t+h \mid t}(\mathrm{CA} \geq x \mid \mathrm{US} \geq x)$ for a range of thresholds and dates, computed for the seven-day-ahead ($h = 7$) joint distribution. These probabilities provide a direct measure of spillover risk: higher values imply that extreme U.S. term-premium outcomes are more likely to be accompanied by extreme Canadian outcomes over the subsequent week. We use the one-week horizon because its tighter predictive distribution provides the sharpest visualization of the within-month decoupling pattern. The first-stage point-forecast diagnostics in Table 1 characterize the underlying VARX, while the second-stage density forecasts that drive Table 2 are evaluated separately through the PIT calibration diagnostics in Section 4.3.
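To make the construction of such conditional probabilities concrete, the sketch below evaluates a conditional exceedance probability from a BB7 (Joe–Clayton) copula in Python. It works on the uniform scale, so `u` and `v` are assumed to be the U.S. and Canadian marginal predictive CDFs evaluated at the common threshold $x$; the function names and the parameter values in the example are hypothetical and are not the estimates behind Table 2.

```python
def bb7_cdf(u, v, theta, delta):
    """BB7 (Joe-Clayton) copula CDF for theta >= 1, delta > 0, and 0 < u, v < 1."""
    a = (1.0 - (1.0 - u) ** theta) ** (-delta)
    b = (1.0 - (1.0 - v) ** theta) ** (-delta)
    return 1.0 - (1.0 - (a + b - 1.0) ** (-1.0 / delta)) ** (1.0 / theta)

def cond_exceedance(u, v, theta, delta):
    """Pr(V >= v | U >= u) on the uniform (PIT) scale."""
    # Joint upper-orthant probability via inclusion-exclusion.
    joint_upper = 1.0 - u - v + bb7_cdf(u, v, theta, delta)
    return joint_upper / (1.0 - u)

def tail_dependence(theta, delta):
    """Closed-form (lower, upper) tail-dependence coefficients of the BB7 copula."""
    return 2.0 ** (-1.0 / delta), 2.0 - 2.0 ** (1.0 / theta)

# Hypothetical parameter values: stronger versus weaker upper-tail dependence.
strong = cond_exceedance(0.9, 0.9, theta=3.0, delta=1.0)
weak = cond_exceedance(0.9, 0.9, theta=1.1, delta=1.0)
```

In this parameterization, a fall in `theta` lowers the upper-tail coefficient $2 - 2^{1/\theta}$ without changing the lower-tail coefficient $2^{-1/\delta}$, which is one way a decline in conditional upper-tail probabilities of the kind reported in Table 2 can arise.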
Early in December, conditional probabilities are extremely high across thresholds. On 2 December, the probability that the Canadian term premium exceeds the 0.7 threshold conditional on the U.S. doing so is roughly 94%, and even at the more demanding 0.8 threshold, the conditional probability is close to 98%; by 9 December, these readings rise further, with the 0.7 conditional approaching 99%. This indicates a tightly linked upper tail, consistent with strong short-horizon spillovers during the initial phase of the U.S. term-premium run-up and with the historical view of U.S. and Canadian sovereign bond markets as close substitutes subject to common global risk factors.
As December progresses, however, conditional spillover risk declines sharply. By 16 December, the conditional probability at the 0.7 threshold falls to roughly 57%, and by 30 December, it falls to 44%. The decline is even more pronounced at the 0.8 threshold, where the conditional probability collapses from 96% on 9 December to 9% on 16 December and remains in the 13–20% range through month-end. This indicates that extreme U.S. term-premium outcomes became progressively less informative about extreme Canadian outcomes over a one-week horizon. Importantly, this decoupling occurs even as the U.S. term premium continues to rise, highlighting a shift toward more idiosyncratic U.S.-specific risk.
From a risk-management perspective, these results indicate that one-week-ahead Canadian tail exposure became substantially less sensitive to U.S. tail events, even as short-horizon U.S. term-premium risk intensified.

4.6. Cross-Episode Comparison

To address the concern that the December 2024 evidence might reflect a one-off configuration of the framework rather than a structural feature of the U.S.–Canadian term-premium relationship, we replicate the conditional upper-tail probability calculation across two additional U.S. term-premium repricing episodes: the May–September 2013 taper tantrum, during which U.S. rates rose sharply following Federal Reserve communication about asset purchases, and the August–October 2023 yield rally, during which U.S. term-premium estimates rose alongside concerns about Treasury issuance and fiscal supply. Both episodes share with December 2024 the characteristic that the U.S. term premium drifted upward over the window; they differ in shock origin (monetary, supply, and fiscal) and in macroeconomic context.
Table 3 reports the conditional probabilities for the three episodes on five weekly Monday dates each, with thresholds in each panel set to percentiles of the U.S. five-year term premium observed within that episode’s window. Calibrating thresholds to each episode’s own distribution makes the conditioning event a comparable upper-tail event across episodes, despite different absolute term-premium levels.
The three panels reveal distinct conditional-spillover dynamics across upper-tail rallies of different origins. In the 2013 taper tantrum (Panel A), the conditional probabilities were essentially zero in May and June, when the U.S. term premium had not yet reached the percentile region that ultimately defined the episode’s upper tail. As the rally proceeded and the U.S. term premium rose into that region, conditional probabilities rose as well: by September, the Canadian term premium had an 11–59% probability of also exceeding the period’s upper-tail thresholds, conditional on the U.S. doing so. This is a coupling pattern in levels: Canada is progressively pulled into the upper tail along with the U.S.
The 2023 yield rally (Panel B) shows generally elevated coupling at the lower tail-percentile thresholds (p75–p85), with substantial intra-episode variation. The August through October pattern is broadly stable in the upper 60–90 percent range when conditioning is feasible, indicating that during the 2023 rally, the Canadian term premium tracked U.S. tail outcomes relatively closely.
In contrast, December 2024 (Panel C) shows a sharply different pattern: conditional probabilities began extremely high (94–99% on 2 and 9 December) and declined progressively through the month, falling to 38–48% by 30 December. The decline is broadly shared across all five thresholds in the panel, although it is not strictly monotone within the month (conditional probabilities rose into 9 December and ticked up again on 23 December before reaching their month-end lows). Because the thresholds here correspond to the upper tail of the December 2024 U.S. term-premium distribution, this pattern reflects a specific feature of the December episode: as the U.S. term premium drifted upward over the four weeks, the Canadian term premium did not keep pace at the same level, so the joint probability of both reaching the period’s upper-tail region declined.
Taken together, the three panels indicate that the conditional-spillover dynamics documented for December 2024 are not a generic property of U.S.–Canadian term-premium repricings: the 2013 and 2023 episodes display either rising-coupling or stable-coupling patterns rather than the pronounced, though not strictly monotone, level decoupling we observe in December 2024. The framework distinguishes these episode types from each other and from a static correlation-based view, supporting the use of joint distributional forecasting as a tool for monitoring how cross-country tail risk shifts during structurally distinct shocks.

5. Conclusions

This paper develops a distributional framework for the forecasting and monitoring of cross-country bond-market risk that moves beyond point forecasts and correlation-based measures. By combining flexible skewed marginal distributions with a time-varying copula, our approach produces full predictive joint distributions for the U.S. and Canadian term premia and allows for a direct assessment of both joint risk and directional spillover risk at short horizons.
Empirically, we document two key findings using high-frequency forecasts for December 2024. First, joint density forecasts reveal a clear rebalancing of risks toward elevated term-premium outcomes in both countries, indicating an increase in upside risk rather than a symmetric rise in uncertainty. Second, conditional distributions show that short-horizon spillover risk from the United States to Canada declined markedly over the same period. Although U.S. term-premium risk intensified, extreme U.S. outcomes became progressively less informative about Canadian tail outcomes over a one-week horizon. This decoupling highlights the importance of distinguishing joint risk from spillover risk when assessing cross-market exposure.
From a risk-management and policy perspective, these results have important implications. Standard correlation-based indicators would suggest persistent comovement between U.S. and Canadian bond markets throughout this episode, masking the sharp decline in conditional tail dependence documented here. In contrast, the distributional framework developed in this paper provides a real-time, forward-looking assessment of how risks are distributed across markets and how stress in one market transmits—or fails to transmit—to another. Such information is particularly relevant for policymakers and market participants concerned with stress testing, financial stability, and cross-border exposure.
Our findings are broadly consistent with three strands of the prior literature. First, the spillover-index literature initiated by Diebold and Yilmaz (2012) has shown that the direction and magnitude of cross-market transmission can vary substantially over time, including discrete declines during shock episodes whose origins are concentrated in one country. Our December 2024 finding, that conditional upper-tail dependence fell sharply even as U.S. term-premium risk intensified, fits this pattern and reinforces the point that spillover intensity and shock magnitude can move in opposite directions. Second, the copula-based contagion literature, in particular the work of Christoffersen et al. (2012), has documented that international diversification benefits depend on asymmetric tail dependence and can change abruptly under stress; the BB7-based decomposition we use here is a methodological refinement that allows the upper- and lower-tail dependence channels to evolve independently, which proved to be the relevant empirical feature in the December 2024 episode. Third, the cross-country bond-market literature has emphasized that U.S. monetary and fiscal news spills over to G7 sovereign yields heterogeneously across episodes; our results contribute a distributional rather than first-moment characterization of how a U.S.-specific repricing episode interacts with a closely integrated counterpart.
Several practical implications follow from these findings. For risk managers holding cross-border sovereign-bond exposure, the framework provides a short-horizon, density-level decomposition of joint risk into a common rebalancing component and a directional spillover component, which is directly informative for value-at-risk and expected-shortfall calculations under non-Gaussian assumptions. For central banks and supervisory authorities, the conditional-spillover diagnostics offer a near-real-time signal of whether stress in a peer market is likely to map into domestic tail risk; the December 2024 evidence suggests that such mappings are not invariant and can weaken during what appear, ex ante, to be highly correlated episodes. For policy stress testing, the joint density forecasts and their conditional slices can be embedded directly in scenario design, providing a coherent alternative to correlation-driven approaches that, as we document, may substantially overstate cross-border tail exposure during U.S.-specific shocks.
While the analysis focuses on the U.S. and Canadian term premia, the framework is general and can be applied to other asset classes, countries, or horizons. Future work could extend the approach to incorporate additional macro-financial drivers, explore longer forecast horizons, or embed the distributional forecasts directly into policy stress-testing exercises. We view the present results as an illustrative case study supported by formal statistical diagnostics rather than as a definitive benchmark of model superiority over all alternatives; a systematic horse race against multivariate skewed-t, dynamic conditional correlation, and quantile-regression approaches across multiple stress episodes is left to future work. Within these limits, the December 2024 evidence indicates that joint and spillover risk can diverge sharply over short windows and that a distributional view of cross-market exposure provides information that linear-correlation indicators do not.

Author Contributions

Conceptualization, B.F.; Methodology, B.F.; Software, R.H.; Validation, R.H.; Formal analysis, B.F. and R.H.; Investigation, R.H.; Data curation, R.H.; Writing—original draft, R.H.; Writing—review and editing, B.F. and J.-S.F.; Visualization, R.H.; Supervision, B.F. and J.-S.F.; Project administration, B.F. and J.-S.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data used in the study come from a combination of public sources and licensed proprietary data. Publicly available data are obtained from the Federal Reserve Economic Data (FRED; https://fred.stlouisfed.org/), while other series are sourced from Bloomberg under license. In addition, the term premium series is based on Bank of Canada calculations and is not publicly distributed. Data were accessed on 4 September 2025.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. TP Mean Forecasting Model: Shared Local-Level VARX

This appendix describes the term-premium point-forecast model implemented in the Term Premium Forecasting Application. The model is designed for a cointegrated pair of term premia and combines (i) a shared stochastic level extracted from the cointegrated system and (ii) a stationary VARX using exogenous regressors.
Let
$$Y_t = \begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix}, \qquad t = 1, \ldots, T,$$
denote the two-dimensional vector of term premia (U.S. and Canada). We are interested in h-step-ahead forecasts ($Y_{t+h \mid t} = E_t[Y_{t+h}]$) for horizons $h$ that will yield $e_{t+h}$ as per Equation (3).

Appendix A.1. Step 1: Cointegration and Shared Common Trend

We first estimate a simple Engle–Granger cointegration relation between the two term premia by regressing $y_{1,t}$ on $y_{2,t}$:
$$y_{1,t} = c + \theta\, y_{2,t} + u_t.$$
This yields the following cointegrating vector:
$$\beta = \begin{pmatrix} 1 \\ -\theta \end{pmatrix}.$$
The common-trend direction ($w$) is chosen to be orthogonal to $\beta$, that is, $w^\top \beta = 0$. A convenient choice is
$$w \propto \begin{pmatrix} \theta \\ 1 \end{pmatrix}, \qquad w = \frac{1}{\lVert (\theta, 1) \rVert} \begin{pmatrix} \theta \\ 1 \end{pmatrix},$$
so that $w$ is normalized to unit length. The scalar “common-trend” process is then defined as
$$y^c_t = w^\top Y_t, \qquad t = 1, \ldots, T.$$
When $\Delta Y_t$ is stationary with finite covariance, the Granger Representation Theorem (Engle and Granger 1987) delivers a Beveridge–Nelson decomposition of $Y_t$ into random-walk and stationary components, so the common trend coincides with the long-horizon conditional expectation, i.e.,
$$y^c_t = \lim_{h \to \infty} E_t\, y_{i,t+h},$$
where $i \in \{1, 2\}$ indexes the two countries’ term premia.
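Step 1 can be sketched in a few lines of Python. The function names and the toy series are hypothetical, and the sketch covers only the levels regression and the construction of $w$; a full Engle–Granger procedure would also test the regression residuals for a unit root, which is omitted here.

```python
import math

def engle_granger_ols(y1, y2):
    """OLS of y1 on a constant and y2; returns (c, theta)."""
    n = len(y1)
    m1, m2 = sum(y1) / n, sum(y2) / n
    sxy = sum((b - m2) * (a - m1) for a, b in zip(y1, y2))
    sxx = sum((b - m2) ** 2 for b in y2)
    theta = sxy / sxx
    return m1 - theta * m2, theta

def common_trend_direction(theta):
    """Unit-length w orthogonal to the cointegrating vector beta = (1, -theta)."""
    norm = math.hypot(theta, 1.0)
    return (theta / norm, 1.0 / norm)

# Toy series (made-up numbers): y1 is exactly 0.1 + 1.06 * y2.
y2 = [0.20, 0.25, 0.40, 0.35, 0.50]
y1 = [0.1 + 1.06 * v for v in y2]
c, theta = engle_granger_ols(y1, y2)
beta = (1.0, -theta)
w = common_trend_direction(theta)
# Scalar common-trend process y^c_t = w' Y_t.
common_trend = [w[0] * a + w[1] * b for a, b in zip(y1, y2)]
```

With a slope near one, $w$ places almost equal weight on the two series, so the common trend is close to a scaled average of the U.S. and Canadian term premia.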

Appendix A.2. Step 2: Local-Level Model for the Common Trend

We model the common trend ($y^c_t$) as a local-level (random-walk plus noise) process:
$$\mu_t = \mu_{t-1} + \eta_t, \qquad y^c_t = \mu_t + \varepsilon_{c,t},$$
with
$$\eta_t \sim N(0, q), \qquad \varepsilon_{c,t} \sim N(0, r),$$
independent over time and across shocks. The $(q, r)$ hyperparameters are chosen to deliver mild smoothing of the common level.
We run a Kalman filter and smoother for this univariate local-level model to obtain the following smoothed estimates:
$$\hat{\mu}_t = E[\mu_t \mid y^c_{1:T}], \qquad t = 1, \ldots, T.$$
Since $\mu_t$ follows a random walk, the h-step-ahead forecast of the level component is
$$E_t[\mu_{t+h}] = \hat{\mu}_t.$$
We map the scalar level back into the two-dimensional space along the common-trend direction ($w$):
$$\hat{\boldsymbol{\mu}}_t = \hat{\mu}_t\, \iota,$$
where $\iota = [1, 1]^\top$ so that $\boldsymbol{\mu}_t$ is a $2 \times 1$ vector representing the shared stochastic level component in each series, consistent with the cointegration-implied common-trend direction.
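A minimal Python sketch of the filtering recursion for this local-level model is given below. It implements only the forward Kalman filter, not the smoother used in the text, and the diffuse-style initialization (`mu0`, `p0`), the function names, and the example values of `q` and `r` are assumptions made for illustration.

```python
def local_level_filter(y, q, r, mu0=0.0, p0=1e6):
    """Kalman filter for mu_t = mu_{t-1} + eta_t, y_t = mu_t + eps_t.

    q, r : variances of eta_t and eps_t; (mu0, p0) is a diffuse-style prior.
    Returns the filtered means E[mu_t | y_1, ..., y_t].
    """
    mu, p, filtered = mu0, p0, []
    for obs in y:
        p_pred = p + q                # predict: a random-walk state keeps its mean
        gain = p_pred / (p_pred + r)  # Kalman gain
        mu = mu + gain * (obs - mu)   # update with the new observation
        p = (1.0 - gain) * p_pred
        filtered.append(mu)
    return filtered

def forecast_level(filtered, h):
    """h-step-ahead level forecast: the latest filtered level, for any h >= 1."""
    if h < 1:
        raise ValueError("h must be a positive integer")
    return filtered[-1]

# Toy usage with made-up common-trend observations.
level = local_level_filter([0.40, 0.42, 0.41, 0.45], q=0.01, r=0.1)
level_vector = [forecast_level(level, 7)] * 2  # scalar level times iota = [1, 1]'
```

The ratio `q / r` governs the degree of smoothing: a smaller ratio makes the estimated level respond more slowly to new observations.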

Appendix A.3. Step 3: Stationary Deviations and VARX

We define the stationary deviations from the shared level as follows:
$$\tilde{Y}_t = Y_t - \boldsymbol{\mu}_t.$$
These deviations are approximately stationary by construction: the common non-stationary component is removed via the local level. Note that we run this forecasting model for each forecast horizon, so coefficients are horizon-specific (local-projection style). We omit the horizon index from the notation for clarity.
We model the h-step-ahead deviations as a VARX with auto-regressive lags:
$$\tilde{Y}_{t+h} = \phi_0 + \sum_{\ell \in L} \Phi^{(\ell)}\, \tilde{Y}_{t-\ell} + \Gamma X_t + \varepsilon_{t+h}, \tag{A2}$$
where
  • $\phi_0$ is an intercept;
  • $\Phi^{(\ell)}$ is a $2 \times 2$ coefficient matrix on the lag ($\tilde{Y}_{t-\ell}$);
  • $X_t$ is a vector of $K$ predictors at time $t$;
  • $\Gamma$ is a $2 \times K$ matrix of coefficients on $X_t$;
  • $\varepsilon_{t+h} \sim N(0, \Sigma_\varepsilon)$ is an innovation with covariance matrix $\Sigma_\varepsilon$.
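The direct (local-projection style) estimation of this specification can be sketched with ordinary least squares in Python. The sketch assumes NumPy is available, uses only the contemporaneous deviation as the autoregressive term for brevity, and introduces hypothetical function names; the actual lag set and estimation details are not specified in this appendix.

```python
import numpy as np

def fit_direct_varx(dev, X, h):
    """OLS fit of dev[t + h] on [1, dev[t], X[t]] for a horizon h >= 1.

    dev : (T, 2) array of deviations from the shared level
    X   : (T, K) array of exogenous predictors
    Returns B with shape (1 + 2 + K, 2): rows hold the intercept,
    the transposed Phi, and the transposed Gamma.
    """
    T = dev.shape[0]
    Z = np.column_stack([np.ones(T - h), dev[:-h], X[:-h]])  # regressors dated t
    target = dev[h:]                                          # outcomes dated t + h
    B, *_ = np.linalg.lstsq(Z, target, rcond=None)
    return B

def forecast_direct_varx(B, dev_t, x_t):
    """Point forecast of the h-step-ahead deviation from information at t."""
    z = np.concatenate([[1.0], dev_t, x_t])
    return z @ B
```

Because the regression is re-estimated for each horizon, the coefficients are horizon-specific, and the full term-premium forecast is obtained by adding the current level estimate back to the forecast deviation.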

Appendix A.4. Step 4: Forecast Construction

Using the VARX specification for stationary deviations introduced in (A2), the h-step-ahead forecast of the deviation from the shared level is obtained by replacing population parameters with their estimated counterparts:
$$\hat{\tilde{Y}}_{t+h \mid t} = \hat{\phi}_0 + \sum_{\ell \in L} \hat{\Phi}^{(\ell)}\, \tilde{Y}_{t-\ell} + \hat{\Gamma} X_t.$$
To recover the full term-premium forecast, we reintroduce the shared level component. The resulting point forecast at horizon $h$ is
$$\hat{Y}_{t+h \mid t} = \hat{\boldsymbol{\mu}}_{t+h \mid t} + \hat{\tilde{Y}}_{t+h \mid t},$$
where, under the random-walk specification of the local-level model,
$$\hat{\boldsymbol{\mu}}_{t+h \mid t} = \hat{\boldsymbol{\mu}}_t.$$
The corresponding forecast error is therefore given by
$$\hat{e}_{t+h} = Y_{t+h} - \hat{Y}_{t+h \mid t}.$$
Thus, the h-step-ahead term-premium forecast combines:
1. The shared long-run trend extracted from the cointegration-implied common component via the local-level model; and
2. Horizon-specific VARX dynamics governing stationary deviations from this trend.
The forecast path ($\hat{Y}_{t+h \mid t}$) defines the conditional mean of the predictive distribution, while the forecast errors ($\hat{e}_{t+h}$) form the input to the SGED-based density forecasting stage described in the main text. Summary statistics for this first-stage model (the Engle–Granger cointegrating coefficient and horizon-specific point-forecast diagnostics) are reported in Table 1 in Section 4.

Appendix B. TP Exogenous Variables

Table A1. Predictors and their economic links to the term premium.

| Variable | Economic Interpretation and Link to the Term Premium |
| --- | --- |
| MOVE | Measures implied volatility in U.S. Treasury markets; higher rate volatility increases the required compensation for bearing duration risk, raising the term premium. |
| CreditSpreads | Reflect the state of credit risk and financial conditions; widening spreads signal heightened risk aversion and often coincide with higher term premia. |
| MPU | Monetary policy uncertainty; greater uncertainty about the policy rate path increases interest-rate risk and therefore elevates the term premium. |
| 5-Year Inflation Swaps | Reflect the market’s price for inflation over five years. |
Figure A1. Local projection coefficients across horizons and distributional moments. The left column reports estimates for the U.S. term premium (TPUS), and the right column reports estimates for the Canadian term premium (TPCA). Rows correspond to the conditional mean (top), log volatility (middle), and transformed skewness (bottom) of the predictive distribution. Each line shows the horizon-specific response of the corresponding moment to a one-unit change in the indicated predictor. The coefficients illustrate how the influence of macro-financial predictors varies across moments and horizons and differs between U.S. and Canadian term premia. While predictors affect the U.S. and Canadian point forecasts differently—for example, inflation-swap measures are associated with a decline in the U.S. term premium but an increase in the Canadian term premium—the higher-order moments exhibit more similar patterns across countries. The first-moment regressions are intended solely to generate meaningful forecast errors; consequently, the estimated coefficients should not be interpreted as causal effects.

Appendix C. SGED and Skewed-t Distribution

The SGED and the skewed-t distributions both belong to the skewed generalized t family but differ in the range of skewness–kurtosis combinations they can represent. For unimodal distributions, skewness and kurtosis are jointly constrained: only certain pairs are admissible, and as skewness rises for a fixed kurtosis, this bound eventually binds. The skewed-t distributionreaches the bound at a lower level of skewness than the SGED, so for a given kurtosis, the SGED can accommodate greater asymmetry and place more probability mass in one tail without inflating overall kurtosis. This wider admissible region was established by Feunou et al. (2016) and Kerman and McDonald (2013); the shape parameter-to-moment tables presented by Theodossiou (2015)provide the quantile detail that earlier versions of this appendix tabulated directly.
Figure A2 illustrates the practical consequence: it plots SGED and skewed-t densities with identical skewness (1.0) and kurtosis (4.0), together with the standard normal for reference. Although the two densities share their first four moments, the SGED exhibits heavier and more flexible tail behavior, notably in the lower tail. Because of this greater flexibility for a given level of kurtosis, we adopt the SGED as the marginal distribution in our joint density-forecasting framework.
Figure A2. Comparison of skewed-t and SGED density functions. The two densities have identical skewness (1.00) and kurtosis (4.00). The standard normal (skewness = 0, kurtosis = 3) is also plotted for reference.

Appendix D. Empirical SGED and Skewed-t Fit

The probability-integral-transform (PIT) calibration results, including the interior PIT histograms for the SGED and skewed-t specifications and the highest-minus-lowest interior-bin diff statistics, are now reported in Section 4.3 of the main text (Figure 3). This appendix retains only the supporting methodological background and the formal goodness-of-fit discussion.
The PIT diagnostic of Diebold et al. (1997) assesses the calibration of a predictive density: if the forecast CDF is correctly specified, the realized observation evaluated through it is uniformly distributed on [0, 1], so systematic clustering near 0 or 1 signals biased tail probabilities, and a U- or hump-shaped histogram signals under- or over-dispersion. Because the PIT is derived from the full predictive distribution, it jointly evaluates the calibration of the mean, variance, skewness, and tail behavior of the forecast density and is, for this reason, widely used to compare competing distributional specifications (Ganics 2018).
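As a rough illustration of the PIT logic, the sketch below simulates Gaussian outcomes, evaluates them through a correctly specified and an over-dispersed forecast CDF, and computes a highest-minus-lowest interior-bin statistic for each. The Gaussian setup, the ten-bin histogram, and the reading of "interior" as all bins except the two outermost are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2), used here as a stand-in forecast CDF."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interior_diff(pits, n_bins=10):
    """Range of histogram densities over the interior bins.

    Assumption: 'interior' means every bin except the first and the last.
    """
    counts = [0] * n_bins
    for u in pits:
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    densities = [c * n_bins / len(pits) for c in counts]
    interior = densities[1:-1]
    return max(interior) - min(interior)


random.seed(0)
outcomes = [random.gauss(0.0, 1.0) for _ in range(20000)]

# Correct forecast density: PITs should be close to uniform.
diff_ok = interior_diff([normal_cdf(y) for y in outcomes])

# Over-dispersed forecast density (sigma too large): hump-shaped PITs.
diff_wide = interior_diff([normal_cdf(y, sigma=2.0) for y in outcomes])

print(f"diff (correct): {diff_ok:.3f}   diff (over-dispersed): {diff_wide:.3f}")
```

A smaller statistic indicates a flatter interior histogram; the over-dispersed forecast piles PIT values around 0.5 and therefore produces a much larger range.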

On Formal Goodness-of-Fit Testing

The visual PIT diagnostics in Section 4.3 can be supplemented by formal goodness-of-fit tests of the null hypothesis that the PITs are uniformly distributed on [0, 1], such as the Kolmogorov–Smirnov (KS) statistic, the Anderson–Darling (AD) statistic, and the Berkowitz (2001) likelihood-ratio test on the inverse-normal transform of the PITs. The three tests are complementary: the KS statistic is sensitive to deviations near the median, the AD statistic is more sensitive to tail behavior, and the Berkowitz test exploits the normality of the inverse-normal transform under the null hypothesis. A complete formal-testing exercise would require care in handling the dependence induced by overlapping h-step forecasts (which makes asymptotic null distributions inapplicable without correction or a bootstrap), and we view this as a natural extension that we leave for follow-up work. Within the present paper, the visual PIT diagnostics suffice to motivate the SGED specification over the skewed-t alternative at the relevant forecast horizon.
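For concreteness, the KS statistic against the uniform null can be computed in a few lines. This is only a sketch of the statistic itself: the function name is hypothetical, and no p-value is reported because, as noted above, the standard i.i.d. null distribution does not apply to overlapping h-step forecasts without a correction or a bootstrap.

```python
def ks_uniform_stat(pits):
    """Kolmogorov-Smirnov distance between the empirical CDF of the
    PITs and the Uniform(0, 1) CDF."""
    u = sorted(pits)
    n = len(u)
    # Largest gap with the empirical CDF above / below the uniform CDF.
    d_plus = max((i + 1) / n - x for i, x in enumerate(u))
    d_minus = max(x - i / n for i, x in enumerate(u))
    return max(d_plus, d_minus)


# Evenly spread PITs are as close to uniform as a sample of 100 can be.
even = [(i + 0.5) / 100 for i in range(100)]
# Squaring pushes mass toward zero, mimicking a miscalibrated forecast.
skewed = [v * v for v in even]

print(ks_uniform_stat(even), ks_uniform_stat(skewed))
```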

Appendix E. Numerical SGED Inversion

Estimation of flexible skewed distributions by direct likelihood optimization is unreliable because the likelihood surface is irregular and sensitive to interaction between the skewness and tail-shape parameters (Azzalini and Salehi 2020). We therefore avoid local optimization and instead invert the realized moments on a fixed grid over the SGED shape parameters, i.e.,
$\lambda \in \{-0.95, -0.8, -0.6, -0.4, -0.2, -0.1, -0.05, -0.025, 0, 0.025, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.95\}$,
$p \in \{0.5 : 0.1 : 2, 3, 4, 5, 6, 7, 8, 9.5\}$.
For each grid point, we evaluate the implied theoretical skewness and kurtosis, resulting in a discrete moment-to-parameter map; at each step, we select the grid point whose moments best match those of the density-forecast residuals. This grid-based strategy follows Azzalini and Salehi (2020).
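The grid inversion can be sketched as follows. Several choices here are assumptions rather than details confirmed by the text: the SGED is written in a mode-centred, unit-scale form with density proportional to exp(-|x|^p / (1 + λ sign(x))^p), for which raw moments are available in closed form; the λ grid is taken to be symmetric around zero; and "best match" is implemented as the smallest squared Euclidean distance in (skewness, kurtosis) space. Function and variable names are hypothetical.

```python
import math


def sged_skew_kurt(lam, p):
    """Skewness and kurtosis of the assumed mode-centred SGED.

    Both are location- and scale-invariant, so unit scale is enough.
    """
    g1 = math.gamma(1.0 / p)

    def raw(k):
        # E[x^k], combining the right (1 + lam) and left (1 - lam) halves.
        sides = (1 + lam) ** (k + 1) + (-1) ** k * (1 - lam) ** (k + 1)
        return sides * math.gamma((k + 1) / p) / (2.0 * g1)

    m1, m2, m3, m4 = (raw(k) for k in range(1, 5))
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2


# Fixed grid over the shape parameters (symmetry in lambda is assumed).
_half = [0.025, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.95]
LAMBDAS = [-v for v in reversed(_half)] + [0.0] + _half
PS = [round(0.5 + 0.1 * i, 1) for i in range(16)] + [3, 4, 5, 6, 7, 8, 9.5]

# Discrete moment-to-parameter map: (lambda, p, skewness, kurtosis).
TABLE = [(lam, p, *sged_skew_kurt(lam, p)) for lam in LAMBDAS for p in PS]


def invert_moments(skew, kurt):
    """Return the grid point (lambda, p) whose moments are closest."""
    best = min(TABLE, key=lambda r: (r[2] - skew) ** 2 + (r[3] - kurt) ** 2)
    return best[0], best[1]


print(invert_moments(0.0, 3.0))   # Gaussian moments
print(invert_moments(-0.5, 4.5))  # left-skewed, fat-tailed target
```

Because the map is precomputed once, each inversion is a table lookup with no optimizer involved, which is what makes the approach robust to an irregular likelihood surface. In practice, one might rescale skewness and kurtosis before measuring distance so that neither moment dominates the match.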

Notes

1
Both the SGED and the skewed-t distribution are nested by the five-parameter skewed generalized t distribution.
2
We adopt the standard RiskMetrics daily decay parameter of α = 0.94 for the EWMA volatility filter (J.P. Morgan 1996), which is widely used in empirical finance and macroeconomics; see also Andersen et al. (2003) and Engle (2002).
3
This model extends the Gaussian dynamic term-structure framework by introducing a nonlinear transformation of the short rate to account for the zero lower bound.
4
Full discussions of the PIT methodology and additional diagnostics are presented in Appendix D.
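The EWMA filter mentioned in note 2 follows the usual RiskMetrics recursion, in which the variance estimate equals α times the previous estimate plus (1 − α) times the previous squared observation. A minimal sketch is given below; the function name, the choice to seed the recursion with the first squared observation, and the toy input series are assumptions for illustration only.

```python
def ewma_variance(x, alpha=0.94):
    """RiskMetrics-style EWMA variance.

    out[t] is the variance estimate available before observing x[t]:
    out[t] = alpha * out[t-1] + (1 - alpha) * x[t-1]**2.
    """
    if not x:
        return []
    var = x[0] ** 2  # assumed initialisation: first squared observation
    out = [var]
    for prev in x[:-1]:
        var = alpha * var + (1.0 - alpha) * prev ** 2
        out.append(var)
    return out


# Hypothetical forecast errors; the spike at index 2 raises later estimates
# and then decays at rate alpha.
errors = [0.01, -0.02, 0.05, 0.00, -0.01]
vols = [v ** 0.5 for v in ewma_variance(errors)]
print(vols)
```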

References

  1. Adrian, Tobias, Nina Boyarchenko, and Domenico Giannone. 2019. Vulnerable growth. American Economic Review 109: 1263–89.
  2. Adrian, Tobias, Richard K. Crump, and Emanuel Moench. 2013. Pricing the term structure with linear regressions. Journal of Financial Economics 110: 110–38.
  3. Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Paul Labys. 2003. Modeling and forecasting realized volatility. Econometrica 71: 579–625.
  4. Azzalini, Adelchi, and Mahdi Salehi. 2020. Some computational aspects of maximum likelihood estimation of the skew-t distribution. In Computational and Methodological Statistics and Biostatistics: Contemporary Essays in Advancement. Berlin: Springer, pp. 3–28.
  5. Berkowitz, Jeremy. 2001. Testing density forecasts, with applications to risk management. Journal of Business & Economic Statistics 19: 465–74.
  6. Christoffersen, Peter, Vihang Errunza, Kris Jacobs, and Hugues Langlois. 2012. Is the potential for international diversification disappearing? A dynamic copula approach. The Review of Financial Studies 25: 3711–51.
  7. Diebold, Francis X., and Kamil Yilmaz. 2012. Better to give than to receive: Predictive directional measurement of volatility spillovers. International Journal of Forecasting 28: 57–66.
  8. Diebold, Francis X., Todd A. Gunther, and Anthony Tay. 1997. Evaluating Density Forecasts. Cambridge: National Bureau of Economic Research.
  9. Engle, Robert. 2002. Dynamic conditional correlation: A simple class of multivariate generalized autoregressive conditional heteroskedasticity models. Journal of Business & Economic Statistics 20: 339–50.
  10. Engle, Robert F., and Clive W. J. Granger. 1987. Co-integration and error correction: Representation, estimation, and testing. Econometrica: Journal of the Econometric Society 55: 251–76.
  11. Feunou, Bruno, Jean-Sébastien Fontaine, Anh Le, and Christian Lundblad. 2022. Tractable term structure models. Management Science 68: 8411–29.
  12. Feunou, Bruno, Mohammad R. Jahan-Parvar, and Roméo Tédongap. 2016. Which parametric model for conditional skewness? The European Journal of Finance 22: 1237–71.
  13. Ganics, Gergely. 2018. Optimal density forecast combinations. SSRN.
  14. Hansen, Bruce E. 1994. Autoregressive conditional density estimation. International Economic Review 35: 705–30.
  15. Harvey, Campbell R., and Akhtar Siddique. 1999. Autoregressive conditional skewness. Journal of Financial and Quantitative Analysis 34: 465–87.
  16. Jordà, Òscar. 2005. Estimation and inference of impulse responses by local projections. American Economic Review 95: 161–82.
  17. Joslin, Scott, Kenneth J. Singleton, and Haoxiang Zhu. 2011. A new perspective on Gaussian dynamic term structure models. The Review of Financial Studies 24: 926–70.
  18. J.P. Morgan. 1996. Riskmetrics—Technical Document. Technical Report. New York: J.P. Morgan.
  19. Kerman, Sean C., and James B. McDonald. 2013. Skewness–kurtosis bounds for the skewed generalized t and related distributions. Statistics & Probability Letters 83: 2129–34.
  20. Kim, Don H., and Jonathan H. Wright. 2005. An Arbitrage-Free Three-Factor Term Structure Model and the Recent Behavior of Long-Term Yields and Distant-Horizon Forward Rates. Technical Report Finance and Economics Discussion Series 2005–33; Washington: Board of Governors of the Federal Reserve System.
  21. León, Ángel, Gonzalo Rubio, and Gregorio Serna. 2005. Autoregresive conditional volatility, skewness and kurtosis. The Quarterly Review of Economics and Finance 45: 599–618.
  22. Patton, Andrew J. 2004. On the out-of-sample importance of skewness and asymmetric dependence for asset allocation. Journal of Financial Econometrics 2: 130–68.
  23. Patton, Andrew J. 2006. Modelling asymmetric exchange rate dependence. International Economic Review 47: 527–56.
  24. Plagborg-Møller, Mikkel, Lucrezia Reichlin, Giovanni Ricco, and Thomas Hasenzagl. 2020. When is growth at risk? Brookings Papers on Economic Activity 2020: 167–229.
  25. Stock, James H., and Mark W. Watson. 1988. Testing for common trends. Journal of the American Statistical Association 83: 1097–107.
  26. Theodossiou, Panayiotis. 2015. Skewed generalized error distribution of financial assets and option pricing. Multinational Finance Journal 19: 223–66.
  27. Theodossiou, Panayiotis. 2018. Skewed generalized t and nested probability distributions: Specification and moments. SSRN.
Figure 1. U.S. and Canadian five-year term premia and estimated common stochastic trend. The figure plots the observed term premia, together with the shared long-run component extracted from the cointegrated system using a local level model. The common trend captures slow-moving movements shared across both markets, while deviations from this component represent transitory, country-specific fluctuations.
Figure 2. Unconditional distributions of term-premium forecast errors at different horizons relative to a normal distribution. The first row shows the U.S. term premium, and the second row shows the Canadian term premium. Solid blue lines denote kernel density estimates, with red dashed lines indicating the corresponding normal densities.
Figure 3. Interior PIT histograms for the seven-day-ahead forecasts of the U.S. and Canadian term premia under the SGED and skewed-t specifications. Under correct density calibration, PIT values should be uniformly distributed, implying a flat histogram on the interior of (0, 1). The skewed-t histograms exhibit a larger dispersion between the highest and lowest interior bins, indicating more pronounced departures from uniformity than the SGED. The diff statistic reported in each panel title is the range between the highest- and lowest-density interior bins.
Figure 4. U.S. and Canadian five-year term premia over December 2024. The U.S. term premium rises sharply relative to the Canadian series, particularly in mid-December, motivating a joint distributional analysis of cross-country risk.
Figure 5. Joint density forecasts for the U.S. and Canadian term premia, evaluated on 13 December and 27 December 2024: (a) 13 December, one-day-ahead; (b) 27 December, one-day-ahead; (c) 13 December, seven-day-ahead; (d) 27 December, seven-day-ahead. The distribution shifts upward over time, indicating a rebalancing of joint risks toward elevated term-premium outcomes.
Figure 6. Seven-day-ahead conditional and unconditional density forecasts for the Canadian term premium, given elevated U.S. term-premium outcomes ($\mathrm{US} \geq 0.8$). The conditional density becomes progressively less concentrated relative to the unconditional density over December 2024, indicating weakening short-horizon spillover risk.
Table 1. Summary statistics for the first-stage shared-trends point-forecast model.
Panel A: Engle–Granger cointegration regression

| Parameter | Estimate | Std. Error | t-Statistic | $R^2$ | N |
|---|---|---|---|---|---|
| c (intercept) | −0.3257 | 0.0056 | | 0.8505 | 6692 |
| θ (slope) | 1.0623 | 0.0054 | 195.07 | | |

Panel B: Horizon-specific VARX point-forecast summary

| Horizon | N | TPUS RMSE (bp) | TPUS $R^2_{\Delta}$ | TPUS Theil U | TPCA RMSE (bp) | TPCA $R^2_{\Delta}$ | TPCA Theil U |
|---|---|---|---|---|---|---|---|
| h = 7 | 6664 | 0.1311 | −0.0481 | 1.0254 | 0.0788 | −0.0305 | 1.0168 |
| h = 21 | 6650 | 0.2107 | 0.0948 | 0.9529 | 0.1291 | 0.0843 | 0.9584 |
| h = 63 | 6608 | 0.3291 | 0.1339 | 0.9321 | 0.2233 | 0.0120 | 0.9956 |
| h = 126 | 6545 | 0.4500 | 0.1025 | 0.9489 | 0.3263 | −0.1253 | 1.0625 |
| h = 252 | 6419 | 0.6226 | 0.0292 | 0.9869 | 0.4909 | −0.3456 | 1.1619 |
Table 2. Conditional upper-tail probabilities for the seven-day-ahead (one-week) joint distribution of the U.S. and Canadian term premia. Each entry reports $\Pr_{t+7 \mid t}(\mathrm{CA} \geq \text{Threshold} \mid \mathrm{US} \geq \text{Threshold})$ in percent. Declining probabilities at higher thresholds indicate weakening short-horizon spillover risk over December 2024.

| Threshold | 2 December | 9 December | 16 December | 23 December | 30 December |
|---|---|---|---|---|---|
| 0.4 | 99.99 | 99.93 | 95.43 | 99.21 | 100.00 |
| 0.5 | 97.27 | 96.70 | 77.01 | 89.05 | 95.99 |
| 0.6 | 92.94 | 96.47 | 64.72 | 80.56 | 75.64 |
| 0.7 | 94.33 | 99.42 | 56.68 | 59.66 | 44.45 |
| 0.8 | 97.66 | 95.67 | 9.32 | 20.26 | 13.14 |
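To make the reported quantity concrete, the sketch below computes a conditional upper-tail probability from a BB7 (Joe–Clayton) copula. The inputs u and v stand for the marginal CDF values of the threshold under the U.S. and Canadian forecast densities; the parameter values and marginal CDF values are hypothetical and are not the paper's estimates. The copula formula is the standard BB7 form with θ ≥ 1 and δ > 0, whose upper-tail dependence coefficient is 2 − 2^(1/θ).

```python
def bb7_cdf(u, v, theta, delta):
    """BB7 (Joe-Clayton) copula C(u, v), for 0 < u, v <= 1."""
    a = (1.0 - (1.0 - u) ** theta) ** (-delta)
    b = (1.0 - (1.0 - v) ** theta) ** (-delta)
    return 1.0 - (1.0 - (a + b - 1.0) ** (-1.0 / delta)) ** (1.0 / theta)


def cond_upper_tail(u, v, theta, delta):
    """Pr(CA > x | US > x) with u = F_US(x) < 1 and v = F_CA(x)."""
    joint_exceed = 1.0 - u - v + bb7_cdf(u, v, theta, delta)
    return joint_exceed / (1.0 - u)


# Hypothetical inputs: the threshold sits at the 80th percentile of the
# U.S. forecast density and the 70th percentile of the Canadian one.
theta, delta = 1.8, 0.6
prob = cond_upper_tail(0.80, 0.70, theta, delta)
print(f"Pr(CA > x | US > x) = {100 * prob:.2f}%")

# As both arguments approach 1, the ratio approaches the upper-tail
# dependence coefficient 2 - 2**(1/theta).
print(cond_upper_tail(0.999, 0.999, theta, delta), 2 - 2 ** (1 / theta))
```

Because u and v are recomputed from the marginal forecast densities at each date, the same copula parameters can yield very different conditional probabilities when the threshold moves relative to the two predictive distributions.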
Table 3. Conditional upper-tail probabilities ($\Pr_{t+7 \mid t}[\mathrm{CA} \geq x \mid \mathrm{US} \geq x]$) in percent across three U.S. term-premium repricing episodes. In each panel, the threshold (x) is set to the indicated percentile of the realized U.S. five-year term premium during the episode window. For example, in Panel C (December 2024), the threshold ranges from x = 0.69 percentage points at the 75th percentile (the top row, p75) to x = 0.73 at the 95th percentile (the bottom row, p95); each cell then reports the model-implied probability that the Canadian term premium also exceeds the corresponding x, conditional on the U.S. exceeding it. Threshold ranges in percentage points: Panel A (2013) $x \in [0.77, 0.92]$; Panel B (2023) $x \in [0.44, 0.68]$; Panel C (2024) $x \in [0.69, 0.73]$.

Panel A: May–September 2013 taper tantrum (US monetary)

| Threshold | 20-May-2013 | 17-Jun-2013 | 15-Jul-2013 | 19-Aug-2013 | 16-Sep-2013 |
|---|---|---|---|---|---|
| p75 | 0.00 | 0.00 | 16.03 | 57.74 | 59.13 |
| p80 | 0.00 | 0.00 | 5.17 | 26.99 | 41.84 |
| p85 | 0.00 | 0.00 | 1.63 | 11.13 | 27.96 |
| p90 | 0.00 | 0.00 | 0.56 | 4.55 | 18.61 |
| p95 | 0.00 | 0.00 | 0.16 | 1.40 | 11.02 |

Panel B: August–October 2023 yield rally (US fiscal/supply)

| Threshold | 07-Aug-2023 | 28-Aug-2023 | 18-Sep-2023 | 16-Oct-2023 | 30-Oct-2023 |
|---|---|---|---|---|---|
| p75 | 85.27 | 99.95 | 0.07 | 86.06 | 78.59 |
| p80 | 21.84 | 99.91 | 0.00 | 76.98 | 65.71 |
| p85 | 2.55 | 99.48 | 0.00 | 66.78 | 50.92 |
| p90 | 0.01 | 71.63 | 0.00 | 43.29 | 19.62 |
| p95 | 0.00 | 23.78 | 0.00 | 32.80 | 9.75 |

Panel C: December 2024 fiscal repricing (US fiscal)

| Threshold | 02-Dec-2024 | 09-Dec-2024 | 16-Dec-2024 | 23-Dec-2024 | 30-Dec-2024 |
|---|---|---|---|---|---|
| p75 | 93.98 | 99.38 | 57.36 | 62.78 | 47.65 |
| p80 | 93.98 | 99.38 | 57.36 | 62.78 | 47.65 |
| p85 | 94.33 | 99.42 | 56.68 | 59.66 | 44.45 |
| p90 | 94.70 | 99.33 | 55.97 | 56.25 | 41.17 |
| p95 | 95.08 | 99.21 | 55.21 | 52.54 | 37.84 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Feunou, B.; Fontaine, J.-S.; Hill, R. How Much Risk in U.S. Government Bond Markets Is Transmitted to Their Canadian Counterparts? Risks 2026, 14, 133. https://doi.org/10.3390/risks14060133
