USV-Affine Models Without Derivatives: A Bayesian Time-Series Approach

Molibeli, Malefane; van Vuuren, Gary

doi:10.3390/jrfm18070395


by Malefane Molibeli ¹,* and Gary van Vuuren ²,³

¹ School of Economics and Finance, University of the Witwatersrand, Johannesburg 2000, South Africa

² Centre for Business Mathematics and Informatics, North-West University, Potchefstroom Campus, Potchefstroom 2520, South Africa

³ National Institute for Theoretical and Computational Sciences (NITheCS), Pretoria 0001, South Africa

* Author to whom correspondence should be addressed.

J. Risk Financial Manag. 2025, 18(7), 395; https://doi.org/10.3390/jrfm18070395

Submission received: 24 June 2025 / Revised: 13 July 2025 / Accepted: 14 July 2025 / Published: 17 July 2025

(This article belongs to the Section Financial Markets)


Abstract

We investigate affine term structure models (ATSMs) with unspanned stochastic volatility (USV). Our aim is to test their ability to generate accurate cross-sectional behavior and time-series dynamics of bond yields. Comparing the restricted models and those with USV, we test whether they produce both reasonable estimates of the short rate variance and a good cross-sectional fit. In principle, risk-neutral dynamics in ATSMs should be estimated jointly from time-series and options data. Due to the scarcity of derivative data in emerging markets, we estimate the model using only the time series of bond yields. A Bayesian estimation approach combining Markov Chain Monte Carlo (MCMC) and the Kalman filter is employed to recover the model parameters and filter out latent state variables. We further incorporate macro-economic indicators and GARCH-based volatility as external validation of the filtered latent volatility process. The $A_{1} (4)$ USV model performs better both in and out of sample, even though the tension between time series and cross-section remains unresolved. Our findings suggest that even without derivative instruments, it is possible to identify and interpret risk-neutral dynamics and volatility risk using observable time-series data.

Keywords: parameter and model identifiability; stochastic volatility; MCMC; unspanned stochastic volatility

1. Introduction

Interest rate volatility is increasingly recognized as a distinguishing feature of ATSMs and an independent driver of time-varying risk premia in fixed-income markets (Duffee, 2002). Typically, volatility is unobservable and potentially unspanned by the prices of traded bonds, raising significant challenges for both modeling and empirical inference. In particular, the inability to observe volatility directly, especially in markets where derivatives trade sparsely because of a lack of liquidity, necessitates models in which volatility is treated as a latent factor. Such models must be capable of distinguishing between spanned and unspanned components of risk, a subtle but crucial distinction for pricing, hedging, and risk management.

Unspanned stochastic volatility (USV) refers to volatility risk that influences the dynamics of yields and risk premia but cannot be perfectly hedged or inferred from the cross-section of bond prices alone; see Duffee (2002). It results from a tension between fitting the cross-section of yields and matching the time-series dynamics of bond prices; see Collin-Dufresne and Goldstein (2002), Collin-Dufresne et al. (2008), Piazzesi (2010), and Riva (2024), among others. This introduces fundamental incompleteness into the fixed-income market, making it impossible to replicate all sources of risk through portfolios made up solely of bonds, which suggests the presence of portfolio volatility risk (Collin-Dufresne & Goldstein, 2002). The presence of USV can explain persistent empirical puzzles such as the failure of ATSMs to match observed option-implied volatilities or the dynamic behavior of the market price of risk (MPR) (Duffee, 2002; Bikbov & Chernov, 2004; Collin-Dufresne et al., 2009).

One of our central objectives is to evaluate the role of model restrictions in enabling or suppressing the presence of USV. In doing so, we test affine models with USV, building on the maximal models of Dai and Singleton (2000) and Collin-Dufresne et al. (2008), which are characterized by a greater number of identifiable parameters. The representation that we follow leads to specifications that are more flexible than the canonical form identified by Dai and Singleton (2000). These are referred to as the maximal $A_{M} (N)$ form of ATSMs, which in some cases have more identifiable parameters. Our focus is on the impact of parameter restrictions on these maximal ATSMs, namely the three-factor $A_{1} (3)$ and four-factor $A_{1} (4)$ models (Collin-Dufresne et al., 2008). Subsequently, we compare and contrast the $A_{1} (3)$ USV and $A_{1} (4)$ USV models in terms of their time-series and yield curve dynamics.

As in most emerging markets, we anticipate that market data constraints are significant because derivative trading is illiquid. In such markets, data on derivatives may be sparse and irregular. Riva (2024) points out that this approach to testing for the presence of USV does not rely on derivatives market data to estimate the underlying risk factors under the physical measure. We consider the yield curve itself to reveal information about latent volatility. While the cross-section of yields provides insight about the first moment, higher-order dynamics, such as yield curvature, term premium variation, and changes in the covariance structure of yields, may be indirectly embedded within the latent volatility. Given our choice of models, we test whether the structures permit the identification of volatility as a latent but influential factor, even in the absence of explicit derivative prices.

There is a challenge in understanding the ‘dual role’ of interest rate volatility within ATSMs. Collin-Dufresne et al. (2008) reported significant discrepancies when comparing the stochastic volatility from non-Gaussian, square-root ATSMs to the observed state variable proxy of volatility. We explore the dual role of interest rate variance by considering the following aspects. First, we establish whether the three-factor $A_{1} (3)$ model produces time-varying volatility and whether the variance is also a factor for the yield curve. The $A_{1} (3)$ model was the preferred specification in Dai and Singleton (2000) for its ability to characterize the unconditional volatility and for its flexible correlation structure. Second, there is a need to investigate the joint dynamics of level, slope, curvature and volatility.

The market price of risk specification should be chosen carefully to ensure that the tension in the simultaneous fitting of certain cross-sectional and time-series properties in the ATSMs is circumvented. The three-factor ATSMs with completely affine specification by Dai and Singleton (2000) were found to be poor in preserving the affine time-series and cross-sectional properties of bond prices. On the other hand, the essentially affine specification allows greater flexibility in fitting the time-varying price of interest rate risk over time, while preserving the affine time-series and cross-sectional properties of bond prices. This is despite the need for a trade-off between time-varying conditional variances and the flexibility in fitting time-varying market price of risk (Duffee, 2002).

Estimation methods face difficulties in explaining bond yields under USV. Whereas USV can match some aspects of the time series and cross-section of yields, it is unclear whether USV restrictions will affect the model’s ability to capture the cross-section and time series of yields (Collin-Dufresne et al., 2009). Econometric methods, including the generalized method of moments (GMM) and simulated methods of estimation (SME), among others, are unsuitable for investigating models with USV (see Collin-Dufresne et al., 2009, and references therein). Following their approach, we consider Bayesian Markov Chain Monte Carlo (MCMC) with a Kalman filter integrated into it. The Kalman filter is preferred for its ability to address nonlinear processes. It is used here only as a computational device to evaluate the likelihood given a path of the latent part $V_{t}$ of the state variable.

Ultimately, we test the time series and term structure dynamics using multiple volatility representations: stochastic volatility (SV) from the models, GARCH-type dynamics, and macro variables such as oil shocks and FX volatility. This comparative approach helps us understand the robustness of results across different volatility specifications and provides evidence for or against the empirical importance of USV in fixed-income markets. To test model performance and goodness-of-fit, we apply both root-mean-square error (RMSE) and bias tests, together with the pairwise comparison test of Diebold and Mariano (2002) (DM).

The rest of this paper is structured as follows: In Section 2 we review the relevant literature. Section 3 covers the model establishment. Section 4 discusses a data collection method. In Section 5, Section 6 and Section 7 we discuss the model implementation and analysis of results. In Section 8 we conclude.

2. Literature Review

There are ongoing attempts to address dynamic term structure models (DTSMs) that exhibit USV. The presence of USV renders fixed-income markets incomplete, which presents challenges for the pricing and hedging of bonds and their derivatives. In recent studies, linear-rational square-root (LRSQ) models are being evaluated for their ability to capture cross-sectional and time-series dynamics simultaneously. Hansen (2025) specifies a five-factor model exhibiting USV, where two of the factors are not spanned by the yield curve. The specification confirms the presence of USV in conditional yield variance and bond risk premia, which in turn are also linked to macro-economic uncertainty. Andreasen et al. (2025) test the predictive power of the spread between short- and long-term U.S. government yields using both the quadratic term structure and the autoregressive gamma-zero models in comparison to LRSQ. Their results provide evidence that LRSQ allows USV, accurately fitting the cross-section of yields and capturing the time-series properties of the yield variance.

Specifically, the focus on ATSMs suggests other alternatives to the trade-off between fitting the cross-section of yields and capturing their time-series behavior. Riva (2024) considers the process where a quadratic variation of yields is tied to the cross-section of average yields. He implements arbitrary portfolios of bonds to test the presence of USV using jump-robust estimators of the diffusive variance of Nelson-Siegel factors. His approach follows Andersen and Benzoni (2010) on quadratic diffusive and affine jump-diffusive models, although they found them to be incapable of accommodating the observed yield volatility dynamics. Their model exhibits evidence for USV across all factors, which includes oil shocks and tax policy shocks, also driving part of the USV, although a greater part of the variation still requires some explanation.

Collin-Dufresne et al. (2009) study the “dual role” of the short rate variance as predicted by the stochastic volatility models. They provide evidence that the three-factor ATSMs explain the shape of the yield curve but fail to explain both the quadratic time-variation in the spot rate as estimated from GARCH and the implied variance from options. Evidence from the four-factor model with USV suggests that the model generates both realistic short rate volatility estimates and a good cross-sectional fit. The study suggests further that short rate volatility cannot be extracted from the cross-section of bond prices. They propose an alternative representation of the ATSMs of Dai and Singleton (2000) with four characteristics: (1) state variables have physical interpretations such as level, slope and curvature; (2) affine dynamics and tractability are preserved; (3) models are econometrically identifiable; and (4) necessary and sufficient conditions lead to parameter restrictions under which the model exhibits USV (Collin-Dufresne et al., 2004). In the companion paper, Collin-Dufresne and Goldstein (2002) derive the conditions under which the restricted model exhibits USV. The same paper provides evidence on the additional explanatory power of latent variables for explaining the time-series and cross-sectional behavior of bond prices. The presence of USV implies that both bond and derivative prices are required for the estimation of parameters. Finally, USV presents challenges for how fixed-income derivatives should be hedged.

Bikbov and Chernov (2004) explore the differences among Gaussian, stochastic volatility, and USV models in the context of conditional volatility, which is an inherent feature of ATSMs. They follow the approach of Collin-Dufresne and Goldstein (2002) and the restrictions that ensure that models exhibit USV. They estimate the models using both Eurodollar futures and options data. Gaussian and stochastic volatility models match the conditional mean and volatility of the term structure well. Both the yield curve dynamics and options data are required for the differences between models to be distinguished. USV models fail to resolve the tension between futures and options fits. Additional factors, such as jumps, should be considered to improve model performance. Heidari and Wu (2002) maintain USV by allowing a clear separation between factors dedicated to term structure fitting and those for options fitting; see Bikbov and Chernov (2004) and references therein. Similarly, they also report the tension in pricing and a parallel tension in matching lower and higher-order moments, which points toward adding jump components. Ang et al. (2006) use a term structure model with inflation and economic growth factors, together with latent variables, to investigate the effect of macro variables on bond prices and the yield curve dynamics. PCA and variance decomposition show that macro factors explain up to 85% of the variation in bond yields. They explain movements at the short end and middle of the yield curve, while latent factors still account for most of the movement at the long end of the yield curve. They conclude that incorporating macro factors in a term structure model improves forecasts.

Another contributing factor to the tension between the time-series dynamics and yield curve properties, and hence the presence of USV, is the misspecification of the MPR. Duffee (2002) proposes the essentially affine model for the MPR, which is found to allow greater flexibility in fitting variations in the price of interest rate risk over time, while retaining the affine time-series and cross-sectional properties of bond prices. Cheridito et al. (2007) study a no-arbitrage specification for the MPR, with the improved fit coming from the time-series rather than the cross-sectional features of the yield curve. Separate sets of parameters, one for the risk-neutral measure and one for the physical measure, add flexibility to their specification. The specification relieves the tension between matching the time-series behavior of the interest rate process and matching the cross-sectional shape of the yield curve.

Collin-Dufresne et al. (2004) describe the concepts of identification, identifiability, and maximality. These are central to the local and global identifiability of models and parameters, which determines whether an economic interpretation exists. Identification deals with how and whether the state vector and parameter vector can be inferred from a particular data set. Identifiability deals with whether the state vector and parameter vector can be inferred from observing all conceivable financial data, as frequently as necessary. Maximality concerns the most general model within $A_{M} (N)$, as discussed by Dai and Singleton (2000), that is identifiable given sufficiently informative data. Affine models are written in terms of latent variables with no clear economic meaning independent of the model and parameters, leading to representations that are locally but not globally identifiable and often meaningless. To circumvent these challenges, latent state variables must be rotated to observable state variables so that they have economic meaning. Collin-Dufresne et al. (2008) propose a representation in which the state vector is written in terms of theoretically observable state variables that have unambiguous economic interpretations, namely infinitesimal maturity yields and their quadratic covariations. In contrast to the invariant transformation rotations of Duffie and Kan (1996) and Dai and Singleton (2000), their representation accommodates USV (Singleton, 2006).

Finally, the issue of parameter uncertainty and model flexibility compels us to consider a probabilistic method of estimation, MCMC. We are also concerned about the global identifiability of our models and parameters, and therefore their economic meaning. López-Pérez et al. (2025) perform a comparative study of MCMC, the particle filter, and the Kalman filter. Key challenges are the discretization bias and the presence of latent factors. While all procedures are computationally demanding, they find the Kalman filter to be a computationally efficient estimation method for goodness-of-fit testing procedures. Extending their process to stochastic volatility and applying it to observed data suggests that the volatility depends on an additional factor that varies independently of the short rate level.

Aït-Sahalia et al. (2024) propose an efficient and flexible method to compute the maximum likelihood estimators of continuous-time models when part of the state vector is latent, considering stochastic volatility and term structure models. In contrast to MCMC, their approach relies on closed-form approximations to estimate parameters and simultaneously infer the filtered distribution of the latent states conditional on observations. The computational complexity of these methods may present some challenges to our intended purpose. MCMC enables us to estimate the parameters under assumed prior parameter distributions and conditional likelihood functions, and to determine posterior distributions of parameter samples and latent variables with reasonable precision.

In state-space models, the use of MCMC methods introduces sampling noise due to the discreteness of the generated samples and the inherent gaps between each pair of measurements. Elerian et al. (2001) propose an estimation framework that relies on the introduction of latent auxiliary data to complete the missing diffusion between each pair of measurements. This data augmentation is also referred to as smoothing. Stroud et al. (2003) propose a simulation-based smoothing technique and the auxiliary mixture model. The auxiliary mixture model consists of state-dependent weights and efficient block sampling algorithms to jointly update all unobserved states given latent mixture indicators (see Collin-Dufresne et al., 2009, and references therein). The same blocking approach is followed by Collin-Dufresne et al. (2009), with the parameter vector broken into three blocks: those affecting the dynamics under the risk-neutral measure, the risk-premium parameters, and the measurement error standard deviations.

In estimating the USV models, bond prices alone may not be sufficient to identify all the parameters (Bikbov & Chernov, 2004; Collin-Dufresne & Goldstein, 2002). In contrast, Collin-Dufresne et al. (2009) adopts the “time-series only” approach and augments the observed yields with PCA factors. The estimated parameters are fitted to the restricted models that produce USV. Regression analysis is conducted between the short rate volatility produced by these models and the market at-the-money implied volatility to detect the sensitivity of USV to options implied volatility. Our study follows their approach by estimating such parameters from “time series-only”, augmenting with the PCA factors. Instead of regressing the short rate volatility against the options implied volatility, we consider the macro variables as a proxy for the implied volatility and test their sensitivity to the USV. This approach may also be considered to be an alternative practice, particularly in the emerging markets where market options data may be limited due to illiquidity and sparsity of trades.

3. Model Establishment

ATSMs are described in terms of an N-dimensional Markov state process $X_{t}$ whose dynamics under the risk-neutral measure $Q$ are written as:

d X_{t} = K^{Q} (θ^{Q} - X_{t}) d t + Σ \sqrt{S_{t}} d W_{t}^{Q}

(1)

where $θ^{Q}$ is the long-term mean of the process $X_{t}$ under the risk-neutral measure, $W_{t}^{Q}$ is a vector of independent Brownian motions, $K^{Q}$ and $Σ$ are $N \times N$ matrices, and $S_{t}$ is a diagonal matrix with the ith diagonal element given by

S_{i i, t} = α_{i} + β_{i}^{T} X_{t}

(2)

The spot rate is an affine function of $X_{t}$ and is written as

r_{t} = δ_{0} + δ_{1}^{T} X_{t}

(3)

where $X_{t}$ is the Markov N-dimensional state vector at time t, $δ_{0}$ is a scalar and $δ_{1}$ is an N-dimensional vector.

Provided that the parameters are admissible, it is known from Duffie and Kan (1996) that the price of a zero-coupon bond has the solution

P_{t} (τ) = e^{A (τ) - B {(τ)}^{T} X_{t}}

(4)

where $τ \equiv T - t$, and $A (τ)$ and $B (τ)$ satisfy the following ordinary differential equations (ODEs), also known as Riccati equations

\frac{d A (τ)}{d τ} = - θ^{Q^{T}} K^{Q^{T}} B (τ) + \frac{1}{2} \sum_{i = 1}^{N} {[Σ^{T} B (τ)]}_{i}^{2} α_{i} - δ_{0}

(5)

\frac{d B (τ)}{d τ} = - K^{Q^{T}} B (τ) - \frac{1}{2} \sum_{i = 1}^{N} {[Σ^{T} B (τ)]}_{i}^{2} β_{i} + δ_{1}

(6)

A solution to these ODEs is found through numerical integration, starting from the initial conditions $A (0) = 0$ and $B (0) = 0_{N \times 1}$.

By inverting (4), the corresponding yield $y_{t} (τ)$ is computed as

y_{t} (τ) = - \frac{\log P_{t} (τ)}{τ} = - \frac{A (τ)}{τ} + \frac{B {(τ)}^{T} X_{t}}{τ}

(7)
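As a minimal sketch of the numerical integration mentioned above, the following Python code integrates the Riccati system for $A (τ)$ and $B (τ)$ with SciPy and converts the result to yields using the bond-price convention in (4). The function names (`riccati_rhs`, `affine_yields`), the convention that column i of `beta` holds $β_{i}$, and the one-factor Vasicek-type parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.integrate import solve_ivp


def riccati_rhs(tau, ab, K_Q, theta_Q, Sigma, alpha, beta, delta0, delta1):
    """Right-hand side of the Riccati system; ab stacks [A, B_1, ..., B_N]."""
    B = ab[1:]
    sb2 = (Sigma.T @ B) ** 2                       # [Sigma^T B]_i^2 for each factor i
    dA = -theta_Q @ K_Q.T @ B + 0.5 * sb2 @ alpha - delta0
    dB = -K_Q.T @ B - 0.5 * beta @ sb2 + delta1    # column i of beta is beta_i (assumption)
    return np.concatenate(([dA], dB))


def affine_yields(taus, X, K_Q, theta_Q, Sigma, alpha, beta, delta0, delta1):
    """Yields y(tau) = (-A(tau) + B(tau)^T X) / tau for sorted, positive maturities."""
    taus = np.asarray(taus, dtype=float)
    start = np.zeros(len(delta1) + 1)              # A(0) = 0 and B(0) = 0
    sol = solve_ivp(
        riccati_rhs, (0.0, taus.max()), start, t_eval=taus,
        args=(K_Q, theta_Q, Sigma, alpha, beta, delta0, delta1),
        rtol=1e-10, atol=1e-12,
    )
    A, B = sol.y[0], sol.y[1:]
    return (-A + B.T @ X) / taus


# Hypothetical one-factor Gaussian example (alpha = 1, beta = 0, r = X).
kappa, theta, sigma, r0 = 0.5, 0.06, 0.02, 0.03
taus = np.array([0.25, 1.0, 5.0, 10.0])
yields = affine_yields(
    taus, np.array([r0]), np.array([[kappa]]), np.array([theta]),
    np.array([[sigma]]), np.array([1.0]), np.array([[0.0]]), 0.0, np.array([1.0]),
)
```

In the one-factor Gaussian case the system has a known closed form, which makes it a convenient sanity check before moving to the three- and four-factor specifications.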

We adopt the framework of Collin-Dufresne et al. (2009), who use the classification scheme of Dai and Singleton (2000) for the $A_{1} (3)$ and $A_{1} (4)$ models, each with $M = 1$ state variable driving the conditional volatility. They choose a representation where, for a state vector X, each state variable has a clear economic interpretation. They derive the state variables below using a Taylor series expansion of the yield with respect to maturity $τ$

Y_{t} (τ) = Y_{t}^{0} + τ Y_{t}^{1} + \frac{1}{2} τ^{2} Y_{t}^{2} + \dots

(8)

They assign the state vector $X = [r, μ^{Q}, V]$ for the three-factor model. An additional state variable $θ^{Q}$, being three times the curvature, is introduced in the four-factor model so that the state vector becomes $X = [r, μ^{Q}, θ^{Q}, V]$. These state variables are defined as follows:

r_{t} = Y_{t} (0)

(9)

μ_{t}^{Q} = 2 \frac{\partial Y_{t} (τ)}{\partial τ} |_{τ = 0}

(10)

θ_{t}^{Q} = 3 \frac{\partial^{2} Y_{t} (τ)}{\partial τ^{2}} |_{τ = 0}

(11)

V_{t} = \frac{1}{d t} {(d r_{t})}^{2}

(12)
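To make definitions (9) to (11) concrete, the sketch below reads rough proxies for $r_{t}$, $μ_{t}^{Q}$ and $θ_{t}^{Q}$ off a quadratic fitted to short-maturity yields. The polynomial fit and the function name `short_end_state_variables` are assumptions made purely for illustration; in the paper these state variables are treated as latent and filtered, not fitted in this way.

```python
import numpy as np


def short_end_state_variables(taus, yields):
    """Approximate (r, mu_Q, theta_Q) from a quadratic fit Y(tau) ~ c0 + c1*tau + c2*tau^2.

    At tau = 0 the fitted curve has level c0, first derivative c1 and
    second derivative 2*c2, so (9)-(11) give r = c0, mu_Q = 2*c1, theta_Q = 3*(2*c2).
    """
    c0, c1, c2 = np.polynomial.polynomial.polyfit(taus, yields, 2)
    r = c0
    mu_q = 2.0 * c1
    theta_q = 3.0 * (2.0 * c2)
    return r, mu_q, theta_q


# Hypothetical short-end yield curve that is exactly quadratic in maturity.
taus = np.array([0.25, 0.5, 1.0, 2.0, 3.0])
curve = 0.05 + 0.01 * taus - 0.002 * taus**2
r, mu_q, theta_q = short_end_state_variables(taus, curve)
```

With real data the fit is only as good as the short end of the observed curve, which is one reason a filtering approach is attractive when few short maturities are quoted.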

Following Collin-Dufresne et al. (2008), alternative ways of representing (1) were evaluated to identify a form for the diffusion matrix that is more general and identifiable than $Σ \sqrt{S_{t}}$. Here, the stochastic components of the model are expressed in terms of a covariance matrix rather than as Itô diffusions, which is found to be more suitable for introducing parameters that have clear economic interpretations.

3.1. The $A_{1} (3)$ Model

Model $A_{1} (3)$ is a three-factor ATSM that allows for stochastic volatility of interest rates. Its canonical form, expressed by a state vector $S = [x, r, μ_{1}]$, can be rotated into three state variables: the short rate, the risk-neutral drift of the short rate, and the variance of the short rate. This results in an observable state vector $X = [r, μ^{Q}, V]$, regardless of whether USV is present; see Singleton (2006). The risk-neutral dynamics for the state vector X can be written as

Instantaneous mean:

\frac{1}{d t} E^{Q} [d X_{t}] = [\begin{matrix} μ_{t}^{Q} \\ m_{0} + m_{r} r_{t} + m_{μ} μ_{t}^{Q} + m_{V} V_{t} \\ γ_{V} - K_{V} V_{t} \end{matrix}]

(13)

where:

$m_{0}$ = a constant term
$m_{r}$ = feedback from the short rate $r_{t} = δ_{0} + δ_{1}^{⊤} X_{t}$
$m_{μ}$ = feedback from $μ_{t}^{Q}$, the slope of the yield curve
$m_{V}$ = volatility feedback
$γ_{V}$ = the long-term mean or level of the volatility process
$K_{V}$ = the mean-reversion speed of the volatility process

and covariance matrix:

\frac{1}{d t} \text{cov} (d X_{t}, d X_{t}^{T}) \equiv Ω_{t} = Ω_{0} + Ω_{V} (V_{t} - \underline{V})

(14)

where $\underline{V}$ is set to be a lower bound for $V_{t}$. Admissibility requirements are also met provided that $Ω_{0}$ and $Ω_{V}$ are positive semidefinite and positive definite, respectively, where

Ω_{0} = [\begin{matrix} \underline{V} & c_{r μ} & 0 \\ c_{r μ} & σ_{μ} & 0 \\ 0 & 0 & 0 \end{matrix}] \quad \text{and} \quad Ω_{V} = [\begin{matrix} 1 & c_{r μ} & c_{r V} \\ c_{r μ} & σ_{μ} & c_{μ V} \\ c_{r V} & c_{μ V} & σ_{V} \end{matrix}]

(15)
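The admissibility conditions on $Ω_{0}$ and $Ω_{V}$ can be checked numerically through the eigenvalues of each matrix. The sketch below builds $Ω_{t}$ from (14) and tests both conditions; the helper names and every numerical value are hypothetical, and the entries of the two matrices are treated as separate parameters, which is an assumption consistent with the count of eight covariance parameters.

```python
import numpy as np


def omega_t(omega_0, omega_v, v_t, v_lower):
    """Instantaneous covariance matrix Omega_t = Omega_0 + Omega_V * (V_t - V_lower)."""
    return omega_0 + omega_v * (v_t - v_lower)


def is_positive_semidefinite(m, tol=1e-12):
    """True if all eigenvalues of the symmetric matrix m are >= -tol."""
    return bool(np.all(np.linalg.eigvalsh(m) >= -tol))


def is_positive_definite(m, tol=1e-12):
    """True if all eigenvalues of the symmetric matrix m are > tol."""
    return bool(np.all(np.linalg.eigvalsh(m) > tol))


# Hypothetical parameter values, chosen only to illustrate the check.
v_lower = 1e-4
omega_0 = np.array([[v_lower, 1e-5, 0.0],
                    [1e-5,    4e-4, 0.0],
                    [0.0,     0.0,  0.0]])
omega_v = np.array([[1.0, 0.2,  0.1],
                    [0.2, 0.5,  0.05],
                    [0.1, 0.05, 0.3]])

admissible = is_positive_semidefinite(omega_0) and is_positive_definite(omega_v)
cov_now = omega_t(omega_0, omega_v, v_t=3e-4, v_lower=v_lower)
```

Because the top-left entries of $Ω_{0}$ and $Ω_{V}$ are $\underline{V}$ and 1, the variance of the short rate in `cov_now` equals $V_{t}$, which is the normalization that gives $V_{t}$ its interpretation.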

There are six parameters in the drift and eight in the covariance matrix, resulting in 14 risk-neutral parameters in total. This implies that the model is a Q-maximal model because of the greater number of parameters, according to Dai and Singleton (2000) and Collin-Dufresne et al. (2008).

3.2. Market Price of Risk

The pricing kernel, which is also known as the stochastic discount factor (SDF), is defined as

\frac{d Λ_{t}}{Λ_{t}} = - r_{t} d t - λ_{t}^{⊤} d W_{t}

(16)

The essentially affine form of Duffee (2002) specifies the market price of risk (MPR) $λ_{t}$ as a nonlinear function of volatility

λ_{t} = - \sqrt{S_{t}} λ_{1} + \sqrt{S_{t}^{- 1}} λ_{2}

(17)

where $S_{t}$ is a non-negative function of the volatility state $V_{t}$. This structure introduces both direct and inverse sensitivity to volatility in the risk premia. It captures effects such as increased compensation in high-volatility environments, and asymmetric or leverage effects through $\sqrt{S_{t}^{- 1}}$.

3.3. Model $A_{1} (3)$ Under Physical Measure

No-arbitrage conditions require that the risk-neutral measure $Q$ and the physical (or historical) measure $P$ be equivalent. This implies that the volatility, or diffusion, part of the SDE is the same under both measures and that only the drift is affected by the risk premia. The drift of the $A_{1} (3)$ state vector under the physical measure $P$ is specified as

\frac{1}{d t} E^{P} [d X_{t}] = [\begin{matrix} λ_{r 0} + λ_{r r} r_{t} + (1 + λ_{r μ}) μ_{t}^{Q} + λ_{r V} V_{t} \\ (m_{0} + λ_{μ 0}) + (m_{r} + λ_{μ r}) r_{t} + (m_{μ} + λ_{μ μ}) μ_{t}^{Q} + (m_{V} + λ_{μ V}) V_{t} \\ (γ_{V} + λ_{V 0}) - (K_{V} - λ_{V V}) V_{t} \end{matrix}]

(18)

See Appendix A for a detailed derivation.

3.4. The $A_{1} (4)$ Model

In (11) we defined the state variable $θ_{t}^{Q}$, which represents the curvature of the yield curve at short maturities. Adding this to the state vector results in $X = [r_{t}, μ_{t}^{Q}, θ_{t}^{Q}, V_{t}]$. The result is a four-factor model with risk-neutral dynamics represented by the instantaneous mean and covariance matrix as

\frac{1}{d t} E^{Q} [d X_{t}] = [\begin{matrix} μ_{t}^{Q} \\ θ_{t}^{Q} + V_{t} \\ a_{0} + a_{r} r_{t} + a_{μ} μ_{t}^{Q} + a_{θ} θ_{t}^{Q} + a_{V} V_{t} \\ γ_{V} - K_{V} V_{t} \end{matrix}]

(19)

and

\frac{1}{d t} \text{cov} (d X_{t}, d X_{t}^{T}) \equiv Ω_{t} = Ω_{0} + Ω_{V} (V_{t} - \underline{V})

(20)

where

Ω_{0} = [\begin{matrix} \underline{V} & c_{r μ} & c_{r θ} & 0 \\ c_{r μ} & σ_{μ} & c_{μ θ} & 0 \\ c_{r θ} & c_{μ θ} & σ_{θ} & 0 \\ 0 & 0 & 0 & 0 \end{matrix}] \quad \text{and} \quad Ω_{V} = [\begin{matrix} 1 & c_{r μ} & c_{r θ} & c_{r V} \\ c_{r μ} & σ_{μ} & c_{μ θ} & c_{μ V} \\ c_{r θ} & c_{μ θ} & σ_{θ} & c_{θ V} \\ c_{r V} & c_{μ V} & c_{θ V} & σ_{V} \end{matrix}]

(21)

Similar to the $A_{1} (3)$ model, admissibility requirements are met provided that $Ω_{0}$ and $Ω_{V}$ are positive semidefinite and positive definite, respectively. We note that the model has a total of 22 free risk-neutral parameters, qualifying it as a maximal $A_{1} (4)$ model.

3.5. Parameter Restrictions

Collin-Dufresne et al. (2004) describe the necessary and sufficient conditions for parameter restrictions under which an $A_{1} (3)$ model can display USV. A model exhibits USV if the state variables driving volatility risk cannot be hedged away by trading bonds alone. These conditions must also lead to the admissibility of the model (Collin-Dufresne et al., 2004).

3.5.1. $A_{1} (3)$

Imposing the following parameter restrictions characterizes the $A_{1} (3)$ model as a model with USV (Collin-Dufresne et al., 2004)

m_{r} = - 2 c_{V}^{2}

m_{μ} = 3 c_{V}

m_{V} = 1

σ_{V}^{2} = c_{V}^{2}

3.5.2. $A_{1} (4)$

Collin-Dufresne et al. (2009) propose an extension to the $A_{1} (3)$ restrictions and derive the following conditions for the $A_{1} (4)$ model to display USV

a_{r} = - 2 c_{r μ}^{2} (3 c_{r μ} - a_{θ})

a_{μ} = 7 c_{r μ}^{2} - 3 c_{r μ} a_{θ}

a_{V} = 3 c_{r μ}

σ_{μ} = c_{r μ}^{2}

σ_{θ} = c_{r μ}^{4}

c_{r θ} = c_{r μ}^{2}

c_{μ θ} = c_{r μ}^{3}
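The $A_{1} (4)$ restrictions listed above reduce seven parameters to functions of $c_{r μ}$ and $a_{θ}$. A small helper makes this dependence explicit; the function name, dictionary keys and input values below are hypothetical and serve only to illustrate the arithmetic of the restrictions as stated.

```python
def a14_usv_restrictions(c_r_mu, a_theta):
    """Map the free parameters (c_r_mu, a_theta) to the USV-restricted A1(4) parameters."""
    c = c_r_mu
    return {
        "a_r": -2.0 * c**2 * (3.0 * c - a_theta),
        "a_mu": 7.0 * c**2 - 3.0 * c * a_theta,
        "a_V": 3.0 * c,
        "sigma_mu": c**2,
        "sigma_theta": c**4,
        "c_r_theta": c**2,
        "c_mu_theta": c**3,
    }


# Illustrative inputs only; they are not estimates from the paper.
restricted = a14_usv_restrictions(c_r_mu=0.5, a_theta=-1.0)
```

In a sampler, a mapping of this kind would typically be applied after each draw of the free parameters so that the restricted ones never need to be sampled separately.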

3.6. Estimation Strategy

Bayesian estimation procedures for continuous-time finance are well documented in many sources, including Stroud et al. (2003), Collin-Dufresne et al. (2009) and references therein, and Johannes and Polson (2010). We aim to infer the posterior distribution of unknown parameters given observed data. This process can be computationally challenging, especially when the posterior has a complex or high-dimensional structure. Two powerful tools that can simplify these challenges are data augmentation and a Gibbs-like posterior sampler. Data augmentation is a technique by which a latent auxiliary variable is introduced to fill the missing diffusion between each observed pair of measurements; see Elerian et al. (2001). The Gibbs sampler is the simplest algorithm of this kind: it samples iteratively from the complete conditionals of $(Θ, X)$, where $Θ$ is a parameter vector and X a latent variable (Johannes & Polson, 2010).

Collin-Dufresne et al. (2009) consider two data augmentation approaches. The first augments the data with unobservable high-frequency observations, which enables the use of the Euler approximation and provides a Gaussian density that is easy to work with. This introduces discretization bias, especially when the time steps h are not sufficiently small and the true continuous-time model deviates from the linear or Gaussian assumptions underlying Euler methods. The second augments the observed yield data with the theoretically observable term structure factors, the state variables X. Its advantage over the former is that it treats the main sources of variation in the yield curve as observable proxies for the latent state process. This approach assumes that the latent factors driving the term structure are extracted directly from cross-sectional yield data. It results in a much simpler posterior structure and avoids discretization altogether, enabling more stable and efficient inference.

We let $P = \{P_{1}, P_{2}, . . ., P_{T}\}$ represent a time series of principal components of the yields $Y_{t}$. The posterior $p (Θ | P)$ is approximated using PCA-augmented data as

p (Θ | P) \propto p (P | Θ) p (Θ)

where $Θ$ represents a vector of parameters.

The first term on the right-hand side is the likelihood function, and the second is the prior distribution of the parameters. The likelihood function is intractable, making it difficult to evaluate the posterior directly, so MCMC techniques are required. The observable data

P

are augmented with the term structure factor data

X = \{X_{1}, X_{2}, . . ., X_{T}\}

. These factors are theoretically observable as they can be interpreted independently of the model being considered. In practice, they may not be directly observed, as only a finite number of yield maturities is available. Uncertainty in these state variables is therefore integrated out using a Gibbs-like posterior sampler that alternates between drawing from

p (Θ | P, X)

and

p (X | P, Θ)

(Collin-Dufresne et al., 2009).
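The alternation between the two conditionals can be illustrated with a minimal sketch on a toy Gaussian model. The model, the flat prior, and every name below (`gibbs_toy`, `mu`, `s2`) are illustrative assumptions and not the sampler used in this paper: the latent states here are i.i.d. draws rather than affine diffusions, so both conditionals are available in closed form.

```python
import numpy as np

def gibbs_toy(P, s2=0.25, n_iter=2000, seed=0):
    """Gibbs-like sampler for a toy model (an assumption, not the ATSM):
    X_t ~ N(mu, 1), P_t = X_t + e_t, e_t ~ N(0, s2), flat prior on mu."""
    rng = np.random.default_rng(seed)
    T = len(P)
    mu = 0.0
    draws = np.empty(n_iter)
    post_var = 1.0 / (1.0 + 1.0 / s2)          # Var(X_t | P_t, mu)
    for k in range(n_iter):
        # Step 1: draw the latent states from p(X | P, mu)
        post_mean = post_var * (mu + P / s2)
        X = post_mean + np.sqrt(post_var) * rng.standard_normal(T)
        # Step 2: draw the parameter from p(mu | X)
        mu = X.mean() + rng.standard_normal() / np.sqrt(T)
        draws[k] = mu
    return draws

# Synthetic observations with true mu = 2 and measurement-error variance 0.25
rng = np.random.default_rng(1)
P_obs = 2.0 + rng.standard_normal(500) + 0.5 * rng.standard_normal(500)
mu_draws = gibbs_toy(P_obs)[500:]              # discard burn-in draws
```

In this toy setting the posterior mean of `mu` should sit close to the sample mean of the observations, which gives a simple check that the two blocks are wired together correctly.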

True dynamics from SDE (1) are then approximated as1

(X_{t + h} - X_{t}) \sim N (h (a + b X_{t}), h Ω_{t})

(22)

where:

$h = d t$ is the discretization step (e.g., $\frac{1}{52}$ for weekly data),
$a + b X_{t}$ is the drift under $P$ ,
$Ω_{t}$ is the state-dependent instantaneous covariance matrix which depends on $V_{t}$ .
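As a minimal sketch of the Euler approximation (22), the function below draws one transition of the state vector over a step h. The two-factor inputs are hypothetical values chosen only for illustration, and the covariance matrix is held fixed here, whereas in the model it depends on the volatility state.

```python
import numpy as np

def euler_step(x, a, b, omega, h, rng):
    """Draw X_{t+h} given X_t = x, with X_{t+h} - X_t ~ N(h (a + b x), h * omega)."""
    mean = x + h * (a + b @ x)
    chol = np.linalg.cholesky(h * omega)       # requires omega positive definite
    return mean + chol @ rng.standard_normal(len(x))

# Hypothetical two-factor inputs (assumptions, not estimates from the paper)
rng = np.random.default_rng(0)
a = np.array([0.02, 0.01])
b = np.array([[-0.5, 0.0], [0.1, -0.3]])
x0 = np.array([0.07, 0.04])
omega = np.diag([0.0004, 0.0001])              # fixed here for simplicity
h = 1.0 / 52.0                                 # weekly step

draws = np.array([euler_step(x0, a, b, omega, h, rng) for _ in range(20000)])
```

The sample mean and covariance of `draws` should approach `x0 + h * (a + b @ x0)` and `h * omega`, respectively.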

The likelihood function is then specified in terms of the relationship between data and the state vector as

P_{t} = \text{PC loadings} \times Y_{t}

(23)

It is clear from (7) that there is a linear relation between the principal components and state variables. Adding a Gaussian error vector

ϵ_{t} \sim N (0, Ω_{T})

, we obtain

P_{t} = K + L X_{t} + ϵ_{t}

(24)

Equation (24) serves as the “measurement equation” of a state–space model, which we use in Kalman filtering to estimate the latent factor

V_{t}

.

When full conditionals

(Θ, X)

are not tractable for a Gibbs sampler, Metropolis–Hastings (MH) steps within the Gibbs sampler become an alternative technique. The MH algorithm generates a Markov chain with a target posterior density

π (X)

. At each iteration t:

Propose $X^{*} \sim p (X^{*} | X^{(t)})$ .
Accept $X^{*}$ with the following probability:

α = \min \left(1, \frac{π (X^{*}) \, p (X^{(t)} | X^{*})}{π (X^{(t)}) \, p (X^{*} | X^{(t)})}\right)

(25)

where:

X^{*}

represents a proposed state or candidate in the MH algorithm

p (X^{*} | X^{(t)})

is the proposal distribution that defines how the candidate state

X^{*}

is chosen given the current-state

X^{(t)}

.

α

is the acceptance probability in the MH algorithm. It determines whether the proposed state

X^{*}

is accepted as the new state

X^{(t + 1)}

. If

X^{*}

is accepted, the chain moves to

X^{*}

; otherwise, it stays at

X^{(t)}

.
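The accept/reject rule in (25) can be sketched with a random-walk proposal. This is a minimal illustration under stated assumptions: the proposal is symmetric, so the proposal ratio in (25) cancels, and the target is a hypothetical N(3, 1) density rather than any conditional from the model. The function name `metropolis_hastings` and its arguments are illustrative.

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_iter=20000, step=1.0, seed=0):
    """Random-walk MH for a scalar target known up to a constant."""
    rng = np.random.default_rng(seed)
    x = x0
    chain = np.empty(n_iter)
    accepted = 0
    for t in range(n_iter):
        # Propose X* from a symmetric Gaussian random walk around X^(t)
        x_star = x + step * rng.standard_normal()
        # With a symmetric proposal, alpha = min(1, pi(X*) / pi(X^(t)))
        log_alpha = min(0.0, log_target(x_star) - log_target(x))
        if np.log(rng.uniform()) < log_alpha:
            x = x_star                         # accept: the chain moves to X*
            accepted += 1
        chain[t] = x                           # reject: the chain stays put
    return chain, accepted / n_iter

# Hypothetical target: N(3, 1), specified through its log density
chain, acc_rate = metropolis_hastings(lambda x: -0.5 * (x - 3.0) ** 2, x0=0.0)
```

Working with log densities avoids numerical underflow when the target takes very small values, which is common for likelihood-based posteriors.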

There are several challenges and limitations for MCMC, such as computational burden, weak parameter identification, and poor mixing. The latter is caused by high correlation between the parameters and latent states, resulting in slow convergence. Johannes and Polson (2010) recommend the use of artificial data to check the efficiency and convergence of the algorithm. Stroud et al. (2003) conducted a comparative analysis of three alternative MCMC algorithms in terms of their autocorrelation functions. The algorithm with the fastest decay in autocorrelation was found to converge faster than the alternatives.

Appendix B briefly discusses the Bayesian algorithm and workflow, as adapted from the Collin-Dufresne et al. (2009) approach. The algorithm, posterior structure, and convergence diagnostics are documented there and are supplemented by a workflow diagram and pseudo-code in Appendix C and Appendix D. A three-block approach is followed, with parameters broken into

ϕ^{Q}

,

ϕ^{Λ}

, and

ϕ^{λ}

for risk-neutral measure, measurement error standard deviations, and risk-premium parameters, respectively.

4. Data Collection

We use a sample of weekly SA government treasury bond prices spanning the period October 2013 to September 2024, with maturities of 3 months and 5, 10, 12, 20, 25, and 30 years. For out-of-sample analysis, we use data for the period October 2023–September 2024 with the same maturities. The data were retrieved from the Thomson Reuters database. We extract the approximated zero-coupon yields by inverting the bond pricing equation. Specifically, for a bond with maturity

τ

and price P, we compute the implied yield using the simplified inversion Formula (7).

Traditionally, zero-coupon yields are bootstrapped from swap rates, LIBOR, or any of its replacements. Swap-based bootstrapping depends on credit assumptions, interpolation, and curve construction techniques that may not reflect the true market dynamics. These instruments are derived, not directly traded, and do not represent actual cash instruments. In contrast, coupon bond prices that are observed in the market contain rich maturity and liquidity information, and are directly priced by supply–demand dynamics. We therefore use coupon bond data as the basis for our term structure modeling.

To ensure internal consistency and eliminate arbitrage opportunities, we employ a no-arbitrage ATSM where zero-coupon bond prices satisfy

P (t, T) = E_{t} \left[\exp \left(- \int_{t}^{T} r_{s} \, d s\right)\right]

(26)

Assuming an affine structure for the short rate and state variables

X_{t}

, the price takes the form of (4). Here,

X_{t}

is a vector of latent state variables, and

A (τ)

and

B (τ)

are functions satisfying Riccati differential equations, which ensure closed-form and arbitrage-free bond pricing.

We simulate the evolution of interest rates using the SDE (1). At time t, given each simulation path, bond prices are computed, followed by zero yields. These zero yields produce a time series of term structures

y (t, τ)

, each curve reflecting the simulated latent state at time t for a maturity

τ

.

To extract the structure from the evolving yield curves, we apply PCA. The zero-yield data are decomposed into orthogonal components to capture the majority of the variance with a few statistical factors: level, slope, and curvature. Table 1 presents the PCA results for our zero-yield data. The first three components explain a cumulative 99.97% of the variance, which agrees with the empirical evidence of Litterman et al. (1991); see Figure 1 below.
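The decomposition can be sketched as an eigendecomposition of the sample covariance matrix of the yield panel. The panel below is synthetic, built from three hypothetical factors plus small noise, and is an assumption used only to make the example runnable; it is not the SA bond data, and `pca_yields` is an illustrative name.

```python
import numpy as np

def pca_yields(Y):
    """PCA of a (T x N) yield panel via the covariance eigendecomposition.
    Returns loadings (N x N), scores (T x N) and cumulative variance shares."""
    Yc = Y - Y.mean(axis=0)                         # demean each maturity
    eigval, eigvec = np.linalg.eigh(np.cov(Yc, rowvar=False))
    order = np.argsort(eigval)[::-1]                # sort by descending variance
    eigval, eigvec = eigval[order], eigvec[:, order]
    scores = Yc @ eigvec                            # principal components P_t
    cum_var = np.cumsum(eigval) / eigval.sum()
    return eigvec, scores, cum_var

# Synthetic panel (assumption): level, slope and curvature factors plus noise
rng = np.random.default_rng(0)
tau = np.array([0.25, 5, 10, 12, 20, 25, 30])       # maturities in years
u = tau / tau.max()
T = 500
level = 0.010 * rng.standard_normal((T, 1))
slope = 0.005 * rng.standard_normal((T, 1))
curv = 0.003 * rng.standard_normal((T, 1))
Y = 0.09 + level + slope * u + curv * u * (1 - u) + 1e-5 * rng.standard_normal((T, 7))

loadings, scores, cum_var = pca_yields(Y)
```

Here `cum_var[2]` gives the share of variance explained by the first three components, the quantity reported in Table 1 for the actual data.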

We also compute the FX volatility and oil shock using USDZAR and Brent crude prices, respectively, for the period October 2013 to September 2024.

5. Scenario Determination

We apply the ATSM to extract the interest rate volatility from a cross-section of bond prices. ATSM represents both the yield curve term structure and the time series of bond prices over several trading periods. This multi-period and cross-sectional nature of the interest rates presents several challenges, including the volatility and trading risk, risk premia, and market completeness. At the heart of term structure modeling is the need to study both the term structure and time series simultaneously. ATSMs have been successfully used in this area as they are characterized by tractability, closed-form solutions, efficient approximation, and closed-form moment conditions for empirical analysis (Collin-Dufresne et al., 2004) and references therein. Our study is based on the following choice of scenarios, some of which are already supported by the empirical evidence:

To test whether state vectors and parameter vectors have a unique economic interpretation. This requires a suitable representation among ATSMs where latent state vectors are translated into observable factors. Both the state vector and model parameter vector should be globally identifiable so that their values can be compared directly across different countries, periods, and even models.
Among the three and four-factor stochastic volatility models, evaluate their capability to break the dual role of predicting the variance of the short rate and simultaneously a linear combination of yields and the quadratic variation of the spot rate. There is empirical evidence that the $A_{1} (3)$ model is unable to play the dual role (Collin-Dufresne et al., 2004). We compare the $A_{1} (4)$ USV with $A_{1} (3)$ USV.
To determine whether estimation can be based on time series only. In the absence of option price data, we test the macro variables as sources of variation and a substitute for option data when estimating USV models.

Among several estimation approaches, we proceed with the Bayesian approach, which combines the MCMC and Kalman filter. This probabilistic approach is more suitable where there are parameter uncertainty and model flexibility issues.

6. Model Implementation

We use (4) to invert zero yields from the SA government treasury bond. The zero yields are further transformed into a nonlinear state–space form. The purpose is to implement the three-factor and four-factor affine models to extract the short rate variance. The final outcome is to determine whether the affine models can explain simultaneously the time series and cross-sectional properties of bond prices. It should be noted that the ATSMs that we selected are based on the maximal forms of both

A_{1} (3)

and

A_{1} (4)

.

Collin-Dufresne et al. (2008) propose an alternative approach to those of Duffie and Kan (1996) (DK) and Dai and Singleton (2000) to rotate from latent to observable state vectors. Whereas DK’s technique involves inverting the term structure with respect to latent factors and cannot be implemented, Collin-Dufresne et al. (2008) write these term structure dynamics in a nonlinear state–space form. The model parameters are then estimated by Bayesian and MCMC methods.

It is noted that the volatility state variable

V_{t}

does not enter the bond pricing equation for those models exhibiting USV. A four-factor model with a state vector

X = [r, μ_{t}, θ_{t}, V_{t}]

effectively becomes a three-factor model by excluding the state variable

V_{t}

.

V_{t}

is, therefore, an additional volatility factor that is free to explain the time-series patterns (Collin-Dufresne et al., 2004). USV can also be regarded as a latent component

V_{t}

that influences the risk premia and yield variance but is not directly priced by the cross-section of bond yields.

Despite the exclusion of

V_{t}

from the bond price equation, it continues to influence the following, thus fulfilling the conditions for USV:

The conditional variance of state transitions $Σ (X_{t + 1} | X_{t})$
The market price of risk via $Λ_{t} = - \sqrt{S_{t}} λ_{1} + \sqrt{S_{t}^{- 1}} λ_{2}$

Since

V_{t}

is not spanned by the cross-section of bond yields, it must be identified through time-series variation and external instruments.

In our empirical implementation, we filter

V_{t}

using the observed term structure and, as a robustness check, compare it against the variation in macro signals such as oil shocks and GARCH-based FX volatility. This, we believe, also assists in ascertaining the identifiability and economic relevance of the USV factor in markets where options data are limited due to sparse trading.

We consider a state–space model:

\begin{matrix} y_{t} & = K + L X_{t} + ϵ_{t}, ϵ_{t} \sim N (0, Σ_{ϵ}) \end{matrix}

(27)

\begin{matrix} X_{t + 1} & = μ + Φ X_{t} + η_{t}, η_{t} \sim N (0, Q (V_{t})) \end{matrix}

(28)

X_{t}

is an unknown state vector, and

y_{t}

is the observed data vector. Together, Equations (27) and (28) form the state–space system to which the Kalman filter is applied. We apply the Kalman filter within the MCMC procedure to extract the posterior distribution of

V_{t}

.
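A minimal sketch of the Kalman recursion for the system (27) and (28) follows. It assumes a constant state-noise covariance Q, whereas the model lets Q depend on the volatility state, and it runs on a simulated one-factor example; the function name `kalman_filter` and all numerical values are illustrative assumptions.

```python
import numpy as np

def kalman_filter(y, K, L, Sigma_eps, mu, Phi, Q, x0, P0):
    """Filter X_t in y_t = K + L X_t + eps_t, X_{t+1} = mu + Phi X_t + eta_t."""
    T, n = y.shape[0], len(x0)
    x_filt = np.empty((T, n))
    x_pred, P_pred = x0, P0
    loglik = 0.0
    for t in range(T):
        # Update step: combine the prediction with the observation y_t
        v = y[t] - (K + L @ x_pred)                  # innovation
        S = L @ P_pred @ L.T + Sigma_eps             # innovation covariance
        G = P_pred @ L.T @ np.linalg.inv(S)          # Kalman gain
        x_upd = x_pred + G @ v
        P_upd = (np.eye(n) - G @ L) @ P_pred
        loglik += -0.5 * (len(v) * np.log(2 * np.pi)
                          + np.linalg.slogdet(S)[1]
                          + v @ np.linalg.solve(S, v))
        x_filt[t] = x_upd
        # Prediction step: propagate through the transition equation
        x_pred = mu + Phi @ x_upd
        P_pred = Phi @ P_upd @ Phi.T + Q
    return x_filt, loglik

# Simulated example (assumption): one latent factor, two observed series
rng = np.random.default_rng(0)
T = 300
mu, Phi, Q = np.array([0.0]), np.array([[0.95]]), np.array([[0.01]])
K, L = np.zeros(2), np.array([[1.0], [0.5]])
Sigma_eps = 0.01 * np.eye(2)
x_true = np.zeros((T, 1))
for t in range(1, T):
    x_true[t] = mu + Phi @ x_true[t - 1] + 0.1 * rng.standard_normal(1)
y = K + x_true @ L.T + 0.1 * rng.standard_normal((T, 2))

x_filt, loglik = kalman_filter(y, K, L, Sigma_eps, mu, Phi, Q, np.zeros(1), np.eye(1))
```

The accumulated `loglik` is the Gaussian likelihood of the observations given the parameters, which is the quantity an MCMC step over the parameters would evaluate.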

7. Analysis of Results

This section presents the results from the two competing models

A_{1} (3)

and

A_{1} (4)

, both with USV, focusing on both in-sample and out-of-sample performance. We begin with the analysis of the posterior distribution of the parameters to assess the point estimates and associated intervals for estimation consistency. This is followed by an assessment of the models’ ability to fit the yield curve in-sample and, thereafter, an evaluation of their forecasting performance out-of-sample. Subsequently, we examine more closely the time-series behavior of the key latent factor (volatility), the term structure implications, and various sensitivity and robustness diagnostics.

7.1. Posterior Distributions of Key Parameters

Table 2 and Table 3 report the parameter point estimates for risk-neutral and risk premia, respectively. We applied the discretization value

h = \frac{1}{52}

, which produced reasonable estimates. For each point estimate, based on mean values, there are corresponding credible interval bounds, which appear consistent across the two models. We observe that

A_{1} (4)

USV exhibits substantially narrower posterior distributions across most parameters compared to

A_{1} (3)

USV. This suggests that

A_{1} (4)

USV constrains the parameter estimates better, potentially due to improved model specification. In contrast, the broader intervals in

A_{1} (3)

USV may indicate underfitting and that its structure might not capture the complexity in the data sufficiently. The tighter credible intervals in

A_{1} (4)

USV should be interpreted as increased certainty.

There may be a concern over a possible underfitting observed in

A_{1} (4)

USV, suggesting that a higher-dimensional ATSM, such as a five-factor model, might improve the fit. It should be noted, as stated by Collin-Dufresne et al. (2004), that additional factors do not necessarily translate into materially different yield curve dynamics. The four-factor USV model effectively behaves like a three-factor model in terms of the yield curve, since the USV factor does not impact bond prices directly. Therefore, increasing the number of factors may not always lead to meaningful gains in model performance, though this remains an avenue for future research.

In addition, for a sample of risk-neutral drift parameters, which includes the physical

γ_{V}^{P}

and

K_{V}^{P}

, we analyzed the posterior plots for consistency.2 Figure 2 and Figure 3 plot the histograms and kernel densities for drift parameters for models

A_{1} (3)

USV and

A_{1} (4)

USV, respectively. As expected, none of the histograms is multimodal; all are unimodal. This is consistent with convergence and good mixing of the MCMC algorithm.

To gain more insight into the effectiveness of the Bayesian inference obtained through MCMC sampling, we used the arviz.summary() function of the ArviZ Python package, version 0.22 (Kumar et al., 2019), to generate Table 4 and Table 5. These tables present posterior summaries for key model parameters obtained via MCMC sampling for models

A_{1} (3)

USV and

A_{1} (4)

USV, respectively. For each parameter, we report the posterior mean, standard deviation, and 95% Highest Density Interval (HDI), along with diagnostic statistics including Monte Carlo standard errors (MCSE), effective sample sizes (ESS), and the Gelman–Rubin convergence statistic (

\hat{R}

). All

\hat{R}

values are equal to 1, indicating satisfactory convergence. Effective sample sizes (ESS) were used to assess sampling efficiency and convergence. In Table 4, bulk ESS values ranged from 975 to 2550 across parameters, with all but one parameter exceeding 2000. The lowest bulk ESS (975) is still well above common diagnostic thresholds

\geq 400

, which is indicative of adequate sampling.

Tail ESS values ranged from 674 to 1500, demonstrating reliable estimation of posterior distribution tails and credible intervals. A similar trend for bulk ESS and tail ESS is also observed in Table 5. These values suggest that the Markov chains mixed well and that the posterior estimates are statistically stable. Notably, all point estimates lie well within their respective HDI bounds, consistent with stable posterior inference.
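To illustrate what the Gelman–Rubin statistic measures, the sketch below computes a basic split version of it by hand with NumPy. It is a simplified stand-in, assumed here for exposition, and not the rank-normalized statistic that ArviZ reports; the chains are synthetic and `split_rhat` is an illustrative name.

```python
import numpy as np

def split_rhat(chains):
    """Basic split R-hat for draws of shape (n_chains, n_draws)."""
    n_chains, n_draws = chains.shape
    half = n_draws // 2
    # Split each chain in two so that within-chain drift is also detected
    split = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = split.shape[1]
    W = split.var(axis=1, ddof=1).mean()             # within-chain variance
    B = n * split.mean(axis=1).var(ddof=1)           # between-chain variance
    var_plus = (n - 1) / n * W + B / n               # pooled variance estimate
    return np.sqrt(var_plus / W)

# Synthetic chains (assumption): four well-mixed chains, then four shifted ones
rng = np.random.default_rng(0)
good = rng.standard_normal((4, 1000))
bad = good + np.arange(4)[:, None]                   # chains centred at 0, 1, 2, 3

rhat_good = split_rhat(good)
rhat_bad = split_rhat(bad)
```

Values close to 1 indicate that the between-chain and within-chain variances agree, while chains stuck in different regions push the statistic well above 1.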

7.2. Yield Curve Fit

In this section, we evaluate the in-sample and out-of-sample model performance by way of the root-mean-square error (RMSE) and bias for each maturity. Model-fitted yields are derived from the point estimates in Table 2 and Table 3. Similar to Collin-Dufresne et al. (2009), we avoid integrating over the full posterior distribution because of the computational burden. The posterior sampler is therefore rerun only for the state variables, with parameter values held fixed.

Using the short rate observed during the short term while assuming the constant yield to maturity, we compute the discount rate for each maturity, which we then apply to convert the observed yields into par bonds. From the par bonds, we compute the zero yields using (7). This exercise is necessary so that we may compare both the model-fitted yields and the observed ones with zero yields.

Table 6 presents the yield curve fit analysis results in terms of RMSE and bias, both in and out-sample3. The fitting performance of the two models is evaluated across maturities of 0.25, 5, 10, 12, 20, 25, and 30 years. We applied the Diebold–Mariano (DM) test to compare forecast errors

ϵ_{t} = Y_{t} - \hat{Y_{t}}

and detect statistically significant differences, where

Y_{t}

and

\hat{Y_{t}}

represent the observed and forecast yields, respectively.
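A minimal sketch of a DM-type comparison is given below. It assumes squared-error loss, a one-step horizon with no autocorrelation correction, and an asymptotic normal reference distribution; the exact variant used for Table 6 may differ. The error series are synthetic and `dm_test` is an illustrative name.

```python
import math
import numpy as np

def dm_test(e1, e2):
    """DM statistic and two-sided p-value for equal squared-error loss.
    A positive statistic means the first model has the larger average loss."""
    d = np.asarray(e1) ** 2 - np.asarray(e2) ** 2    # loss differential
    n = len(d)
    dm = d.mean() / math.sqrt(d.var(ddof=1) / n)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))    # 2 * (1 - Phi(|dm|))
    return dm, p_value

# Synthetic forecast errors (assumption): model 1 is noisier than model 2
rng = np.random.default_rng(0)
err_model_1 = 2.0 * rng.standard_normal(200)
err_model_2 = 1.0 * rng.standard_normal(200)

dm_stat, p_val = dm_test(err_model_1, err_model_2)
```

The sign of the statistic depends on the order in which the two error series are passed, so the ordering should be stated whenever a DM statistic is reported.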

In-sample, model

A_{1} (4)

USV provides a significantly better fit to the yield curve in terms of RMSE, with a DM test statistic of 6.1233 and p-value of 0.0009. This suggests that

A_{1} (4)

USV captures the cross-sectional variation in yields more accurately. Model

A_{1} (3)

USV exhibits a significantly lower forecast bias, particularly at longer maturities, with a DM statistic of −8.00 and p-value of 0.0002, suggesting that the model aligns more closely with observed yields on average, despite being less precise.

Out-sample, the results are more nuanced. Model

A_{1} (4)

USV exhibits lower RMSE at shorter maturities, with a statistically significant difference observed at the first maturity. Across the full maturity spectrum, the difference in RMSE between the models is not statistically significant with a DM statistic of 0.9999 and p-value of 0.3559. A similar pattern is seen in forecast bias with model

A_{1} (3)

USV consistently showing lower bias, but the overall difference is again not significant with a DM statistic of 0.9571 and p-value of 0.3755.

Finally, the results suggest that

A_{1} (4)

USV fits the in-sample yield curve more accurately, whereas

A_{1} (3)

USV exhibits less bias and potentially more stable performance out-of-sample. There is a trade-off between precision and bias, and the appropriate balance depends on the application, as we also discuss under the volatility regression below. Although previous studies have associated model misspecification with high autocorrelation in residuals (Collin-Dufresne et al., 2009), we did not perform an autocorrelation analysis in this study. Future research could incorporate this test to further assess model adequacy.

7.3. Time-Series Dynamics

In this section, we evaluate the properties of model-implied4 state variables. To estimate them, we run a posterior sampler by holding the parameters in Table 2 and Table 3 fixed. Smoothed estimates are obtained by averaging the results of the draws of state variables (Collin-Dufresne et al., 2009). Both models produce a time series of state variables

X_{t}

, including a filtered volatility

V_{t}

. From these model-implied state variables, we compute various correlations against the actual yield time-series variables. We compute a 26-week rolling standard deviation of the log returns of the actual 3-month yields. For comparison, we repeat the computation for the 12-year and 30-year maturities. Among the yield curve variables, the slope was computed from the 30-year and 3-month yields as

(Y_{30 y} - Y_{0.25 y})

, and the curvature as

(Y_{30 y} - 2 Y_{12 y} + Y_{0.25 y})

. Other variables include the GARCH(1,1) volatility of the same 3-month log returns data used for computing the 26-week rolling standard deviation, FX volatility for the SA Rand Dollar currency pair, and Brent crude volatility.
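The rolling volatility, slope, and curvature measures described above can be sketched as follows. The yield series is synthetic and the function names are illustrative assumptions; only the 26-week window and the slope and curvature definitions come from the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_vol(yields, window=26):
    """Rolling standard deviation of log changes in a positive yield series."""
    log_ret = np.diff(np.log(yields))
    return sliding_window_view(log_ret, window).std(axis=1, ddof=1)

def slope_curvature(y_short, y_mid, y_long):
    """Slope = long - short; curvature = long - 2 * mid + short."""
    slope = y_long - y_short
    curvature = y_long - 2.0 * y_mid + y_short
    return slope, curvature

# Synthetic weekly 3-month yield path (assumption), kept strictly positive
rng = np.random.default_rng(0)
y_3m = 0.07 * np.exp(np.cumsum(0.01 * rng.standard_normal(100)))
vol_26w = rolling_vol(y_3m)

slope, curvature = slope_curvature(0.07, 0.10, 0.11)
```

Each entry of `vol_26w` uses 26 consecutive weekly log changes, so a series of 100 yields produces 74 rolling estimates.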

Table 7 compares models

A_{1} (3)

USV and

A_{1} (4)

USV in terms of the various correlations between model-implied variables and the actual time-series variables for 3-month, 12-year, and 30-year maturities. The

A_{1} (3)

model shows strong alignment with the yield curve decomposition literature of Litterman et al. (1991), capturing over 99% of the variance in the yield data. All three components (level, slope, and curvature) show correlations above 96% with their empirical counterparts, notably curvature at 98.9%, consistent with theoretical expectations. Figure 1 illustrates how the first three components contribute 99.97% of total variation. The remaining 0.03% of variance is therefore negligible.

The

A_{1} (4)

USV model introduces a marginal component explaining only 0.03% of variance. Including this factor raises the average yield fit to

r_{t} = 1.0

due to overfitting, but significantly worsens the curvature correlation—down to 0.733, likely due to numerical instability introduced by a non-meaningful fourth factor. Although the

A_{1} (4)

USV model slightly improves average yield, it does so at the cost of degrading the structural interpretation of key term structure components.

The

A_{1} (4)

USV model exhibits a larger correlation of 0.550 between the model-implied volatility and the 26-week rolling volatility at the short end, although the correlation decreases at the mid and long end. The

A_{1} (3)

USV model, on the other hand, exhibits very low or even negative correlations of 0.054, 0.002, and −0.037 for the short, mid, and long term, respectively. The GARCH(1,1) results are broadly similar, except that, surprisingly, a slight increase in correlations is observed with the

A_{1} (4)

USV model.

The left panel of Figure 4 plots the 26-week rolling volatility and the model-implied volatility

V_{t}

over the period 2016–2024. Subplot (a) shows the inability of the

A_{1} (3)

USV model volatility to track the variation in the 26-week rolling short rate volatility; instead, the model exhibits a flat pattern. The

A_{1} (4)

USV volatility, on the other hand, closely tracks the 26-week rolling short rate volatility.

As a robustness check, we compare the correlations between the model-implied volatility

V_{t}

and FX volatility, and Brent crude. Correlations for the 3-month display a positive relationship with the model-implied volatility

V_{t}

of 0.408 for the SA Rand Dollar and 0.216 for the Brent crude volatilities. The correlations are weaker at the mid and long end of the yield curve. For the yield curve factors level, slope, and curvature, there is generally a positive relationship with the model-implied volatility

V_{t}

.

The curvature factor exhibits negative correlation with model-implied volatility for both models, with

A_{1} (3)

USV showing a less negative correlation of −0.119 compared with −0.421 for the

A_{1} (4)

USV model. The negative correlations between the model-implied volatility and the curvature factor, a yield curve component, confirm the existence of the tension between time-series dynamics and yield curve dynamics.

A_{1} (4)

USV fits the yield curve more accurately, as demonstrated by stronger sensitivities to level, slope, and curvature factors and a slightly better

R^{2}

at short maturities. This suggests it effectively captures the dominant yield dynamics at the front end, where traditional factors drive movements. USV restrictions on

A_{1} (3)

appear to reduce performance slightly at the short end, likely due to overfitting or noise from volatility interactions, but they provide more stable and potentially more robust behavior over the long term.

A_{1} (3)

USV captures the persistent stochastic volatility effects and improves out-of-sample stability, despite a lower in-sample

R^{2}

.

A_{1} (4)

USV exhibits stronger level, slope, and curvature loadings at the short end, while

A_{1} (3)

USV shows nuanced volatility-related dynamics impacting longer maturities. This highlights that USV dynamics interact differently with yield curve factors across maturities. Ultimately,

A_{1} (4)

USV excels at capturing short-term yield behavior, whereas

A_{1} (3)

USV’s richer structure better reflects long-term volatility features, suggesting a trade-off between in-sample fit and out-of-sample stability. These findings underscore the importance of carefully modeling factor-volatility interactions in yield curve models, especially when testing for USV.

7.4. Volatility Forecasting and Regression

In this section, we analyze the in-sample and out-of-sample performance using two volatility proxies: the absolute one-week change in yields

E [∥ Δ Y ∥]

and realized volatility

\hat{σ}

. We use the same data for parameter point estimates from Table 2 and Table 3 to extract our weekly forecasts.

Following the methodology adopted in Collin-Dufresne et al. (2009), we implement a rolling-window approach using a two-year or approximately 104-week estimation window to construct one-week-ahead forecasts of the state variable. At each step, model parameters obtained from a posterior sampler are held fixed. The rolling process is used to re-estimate the latent state variables based on incoming data. This recursive structure allows for dynamic updating of state variables while maintaining stable parameter estimates. Forecasts are generated for both a 104-week in-sample period and a 45-week out-of-sample period. This design enables robust comparison of forecast accuracy across different subsamples and simulates a realistic forecasting scenario.

To generate each forecast, we first estimate the current values of the latent state variables using only information available up to time t. This is conducted in a manner consistent with the previous section, except that in the forecasting context, no data beyond time t is used in the state filtering step. Importantly, the parameter estimates used in this process are fixed at their posterior modes or means, obtained from the full sample estimation. Given these current-state estimates, we simulate 10,000 paths of the model forward one week to generate the predictive distribution of the state variables at time

t + 1

. From this simulated distribution, we then construct a forecast distribution for each yield and volatility proxy of interest per maturity.

The second proxy, realized volatility, is given as

\hat{σ}_{t, τ} = \sqrt{\sum_{i = 1}^{N} (Δ Y_{t, i, τ})^{2}}

(29)

where

Y_{t, i, τ}

is the

τ

-maturity yield on the ith week following the observation at t.
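Equation (29) can be sketched directly: realized volatility is the square root of the sum of squared weekly yield changes over the N weeks following t. The input path below is a hypothetical example, and `realized_vol` is an illustrative name.

```python
import numpy as np

def realized_vol(y_path):
    """Realized volatility from a path y_t, y_{t+1}, ..., y_{t+N} of one maturity."""
    dy = np.diff(np.asarray(y_path, dtype=float))    # the N weekly changes
    return float(np.sqrt(np.sum(dy ** 2)))

# Hypothetical three-point path: changes of +0.01 and -0.02
rv = realized_vol([0.10, 0.11, 0.09])
```

For this path the result is the square root of 0.01² + 0.02², that is, the square root of 0.0005.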

Traditionally, realized volatility is better estimated using high-frequency return data, such as daily, to increase the certainty over the reliability and accuracy of the forecast. In this study, we use weekly yield data to construct both proxies for volatility. Our choice is motivated both by data availability and by the structure of the domestic bond market, where SA government treasury bonds are traded during weekly auctions. It is our view that these auctions anchor the pricing of yields, meaning that weekly data capture economically meaningful shifts in investor expectations, interest rate risk, and macro-economic signals. Intermediate level frequencies, such as weekly or monthly, are better than both high-frequency and low-frequency levels in producing sign dependence in volatility, which is the source of forecastability in asset returns (Christoffersen & Diebold, 2002).

7.4.1. Forecasting and Model Performance

We report both in-sample and out-of-sample RMSEs in Table 8. In each case, the DM test is employed to assess the statistical significance of differences in forecast accuracy between models. The in-sample RMSE of

∥ Δ Y ∥

exhibits a mixed picture. For the first three maturities, model

A_{1} (4)

USV shows a greater reduction in RMSEs than

A_{1} (3)

USV, indicating an improved fit. For the remaining four maturities,

A_{1} (3)

USV performs better. Despite this maturity-dependent performance, the overall DM test yields a test statistic of 0.053 with a p-value of 0.959, indicating that the observed differences are not statistically significant at conventional levels.

Turning to the in-sample RMSE of the estimated volatility series

\hat{σ}

,

A_{1} (4)

USV consistently outperforms

A_{1} (3)

USV across all maturities. The DM statistic of 3.976 and a p-value of 0.007 support this result with significance at the 1% level. This suggests that

A_{1} (4)

USV more accurately captures the in-sample volatility dynamics of yields.

Out-sample,

A_{1} (4)

USV again shows improved performance in forecasting

∥ Δ Y ∥

, with uniformly lower RMSEs. The DM statistic of 2.404 and a p-value of 0.053 indicate statistical significance at the 10% level, narrowly missing the 5% threshold. This result supports

A_{1} (4)

USV’s superior ability to forecast short-term volatility in yield changes. Out-sample performance of the volatility forecasts is even more decisive.

A_{1} (4)

USV demonstrates markedly lower RMSEs in forecasting

\hat{σ}

with a DM statistic of 6.791 and a p-value of 0.001. This result is highly significant and highlights the robustness of

A_{1} (4)

USV in capturing the underlying volatility dynamics beyond the estimation window.

It is worth noting a tension between time-series volatility performance and cross-sectional yield curve dynamics. While

A_{1} (4)

USV consistently outperforms

A_{1} (3)

USV in forecasting volatility, both in-sample and out-sample, the initial inconsistency across maturities in the RMSE of

∥ Δ Y ∥

reflects a possible trade-off. Specifically, models that are optimized for volatility dynamics may not always align perfectly with those tailored for fitting the cross-section of the yield curve.

This tension is not surprising. Yield curve models often prioritize cross-sectional fit at a point in time, while volatility models emphasize dynamic consistency and predictive power over time. The results here suggest that

A_{1} (4)

USV, potentially incorporating richer dynamics or additional latent volatility structure, sacrifices some cross-sectional fit at longer maturities, as seen in maturities of 12, 20, 25, and 30 years, but gains substantially in time-series forecasting accuracy.

Ultimately, the choice between models depends on the application. For pricing and hedging, cross-sectional accuracy may dominate. For risk management and forecasting, superior volatility dynamics as achieved by

A_{1} (4)

USV are likely to be more valuable.

We also notice that both models match the unconditional volatility of yield changes, with

A_{1} (4)

USV exhibiting a more refined pattern. The right panel of Figure 4 displays the unconditional volatility plotted against maturity. The standard deviation of model-fitted yield changes

Δ Y_{t}

is shown in blue and lies within the light-gray band of model-implied yield volatilities as determined by the point estimates. The model-fitted unconditional volatility lies close to the mean and median of the distribution. In the bottom plot (d), we note that

A_{1} (4)

USV appears to be an improvement on

A_{1} (3)

USV in the top plot (b). In

A_{1} (4)

USV, the model-fitted unconditional volatility (blue line) matches the pattern of the distribution average (gray line) better when compared to

A_{1} (3)

USV. Both plots exhibit a snake-shaped curve as discussed by Piazzesi (2010) for US Treasury yields and swaps. In the improved plot (d) for

A_{1} (4)

USV, the pattern commences with the back of the snake, rises to a hump towards the 5-year maturity, drops towards the 10-year maturity, and thereafter decreases in a stable manner towards longer maturities. High volatility between the 3-month and 5-year maturities, termed the back of the snake by Piazzesi (2010), is due to reactions to monetary policy events, changes in liquidity premia, and macro-economic dynamics, among others. It is also a key to the factor correlations.

7.4.2. Regression

To evaluate the extent to which model-derived volatility $V_{t}$ can be explained by both term structure dynamics and exogenous volatility sources, we estimate the following linear regression model:

V_{t} = α + β^{T} X_{t} + γ_{i} Σ_{m_{i, t}} + ϵ_{t}

(30)

where $X_{t}$ denotes the yield curve factors derived from either the three-factor model with level, slope, and curvature or a four-factor model including latent volatility $V_{t}$ in the term structure model. $Σ_{m_{i, t}}$ represent the M observable macro variable proxies at time t for $i = 1, \dots, M$, and $γ_{i}$ is the coefficient for the i-th macro variable.
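The regression in (30) can be sketched with ordinary least squares as follows. This is an illustration only: it assumes the macro term enters as a sum of $γ_{i}$ times each proxy, the data are synthetic, and the function name `fit_volatility_regression` is hypothetical.

```python
import numpy as np

def fit_volatility_regression(V, X, M):
    """OLS fit of V_t on an intercept, yield-curve factors X_t and macro proxies M_t."""
    T = V.shape[0]
    design = np.column_stack([np.ones(T), X, M])
    coef, *_ = np.linalg.lstsq(design, V, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((V - fitted) ** 2)
    ss_tot = np.sum((V - V.mean()) ** 2)
    k = X.shape[1]
    return {
        "alpha": coef[0],
        "beta": coef[1:1 + k],       # loadings on level, slope, curvature
        "gamma": coef[1 + k:],       # loadings on macro proxies
        "r2": 1.0 - ss_res / ss_tot,
    }

# Synthetic example (assumption): known coefficients plus small noise.
rng = np.random.default_rng(0)
T = 500
X = rng.normal(size=(T, 3))          # stand-ins for level, slope, curvature
M = rng.normal(size=(T, 2))          # stand-ins for macro proxies
V = (0.5 + X @ np.array([0.8, -0.2, -0.4])
     + M @ np.array([0.3, 0.1]) + 0.1 * rng.normal(size=T))
result = fit_volatility_regression(V, X, M)
```

The returned `r2` is the quantity compared across specifications in the regression tables.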

Table 9 presents the results of volatility regressions for the 3-month and 30-year volatilities. The GARCH model exhibits moderate explanatory power, with $R^{2}$ values between 0.221 and 0.477, suggesting economically meaningful relationships in both short- and long-term volatility regressions, regardless of the inclusion of PCA factors. Applying PCA to both models, $A_{1} (3)$ USV and $A_{1} (4)$ USV, significantly improves model performance, increasing $R^{2}$ values from approximately 0.022 to 0.428, with minimal differences observed between the two. Since both models achieve similar explanatory power after PCA, with $R^{2} \approx 0.43$, $R^{2}$ alone cannot be used to select the better model.

The stochastic volatility models $A_{1} (3)$ USV and $A_{1} (4)$ USV exhibit weak explanatory power in their original specification, with $R^{2}$ values of 0.022 and 0.027, but performance improves markedly after applying PCA, to $R^{2}$ values of 0.428 and 0.427, indicating that latent yield curve factors significantly enhance the models’ ability to capture volatility dynamics. Despite the similar post-PCA fit, $A_{1} (3)$ USV demonstrates a much stronger negative relationship between volatility and the yield curve factors, with a $β$ coefficient of −6.876, while $A_{1} (4)$ USV exhibits a modest positive relationship, with a $β$ of 0.810, suggesting fundamentally different volatility structures. The estimated loadings on the level, slope, and curvature components further reveal that $A_{1} (4)$ USV is more responsive to changes in yield curve shape, especially in terms of slope (−0.173) and curvature (−0.364). These findings imply that while both models satisfy the USV condition by showing significant latent factor influence, $A_{1} (4)$ USV may better align with USV theory by isolating volatility innovations not captured by the term structure, though $A_{1} (3)$ USV captures a stronger overall volatility response. In the second regression, $A_{1} (3)$ USV and $A_{1} (4)$ USV exhibit betas of −2.144 and 0.302, respectively, suggesting that $A_{1} (4)$ USV is still positively related to realized volatility and aligns better with it. The drop in the $β$ coefficient from 0.810 to 0.302 suggests only that $A_{1} (4)$ USV performs better in the short term than in the longer term.

The purpose of this regression is to evaluate the extent to which the stochastic volatility models capture information embedded in the yield curve, and whether they span the term structure or exhibit USV properties. Two key insights emerge from the 3-month and 30-year maturity regressions. First, time-series information, extracted through PCA, proves substantially more informative than raw cross-sectional data. In their unaugmented forms, both models exhibit weak explanatory power, with $R^{2}$ of approximately 0.022–0.027, indicating that volatility is poorly explained by contemporaneous yield levels alone. However, once the latent factors of level, slope, and curvature are introduced, explanatory power improves substantially, with $R^{2}$ increasing to 0.428 for the 3-month volatility forecast and up to 0.221 for the 30-year forecasts. This highlights the importance of dynamic yield curve behavior in driving interest rate volatility.

Second, although both models are influenced by latent yield curve factors, they diverge in structure and responsiveness. $A_{1} (3)$ USV displays a strong contemporaneous volatility response, with large negative coefficients on latent factors. However, it shows limited sensitivity to yield curve shape in the long-term regression. This implies that $A_{1} (3)$ USV primarily reacts to broad level shifts, lacking a nuanced view of the term structure, particularly over longer horizons.

7.4.3. Market Price of Risk

Figure 5 plots a comparison of MPR components from the $A_{1} (3)$ and $A_{1} (4)$ USV models. The $A_{1} (3)$ model yields relatively stable and low-magnitude MPRs, while the $A_{1} (4)$ USV model exhibits consistently higher values and greater variability, suggesting that the additional factor introduces a more pronounced risk compensation mechanism. This difference, largely a result of model specification and scale, highlights how additional factors can amplify perceived risk premia. Given that both sets of MPRs are derived from Bayesian posterior point estimates under the risk-neutral measure, the results reflect not just data fit but also the influence of prior beliefs. Overall, the 4-factor model implies a richer, possibly more realistic, structure for pricing risk.

8. Conclusions

We assume that restricted ATSMs with USV can be effectively estimated through Bayesian methods in emerging markets using only time-series data. As a result, we do not follow the traditional joint approach of estimation using both bond price and options data. We evaluate the sensitivity of interest rate volatility to oil shocks and exchange rate volatility. We test how the restricted models $A_{1} (3)$ USV and $A_{1} (4)$ USV respond to the tension between cross-sectional and time-series dynamics of bond prices.

The tension between time series and yield curve dynamics does not go away completely. Evidence of this is the negative correlation between curvature and both volatility and variance. Model $A_{1} (4)$ USV performs better than $A_{1} (3)$ USV in capturing the time-series dynamics. There is a necessary trade-off between time-series and yield curve fitting, which depends on the application: option pricing, portfolio management, risk management and hedging, or policy formulation.

Model flexibility and parameter uncertainty issues are reduced by the Bayesian MCMC estimation strategy. The result is that parameters become locally and globally identifiable and economically meaningful, such that they are easily compared to parameters used in data from other countries.

Our study did not assess the autocorrelation of residuals in bias and RMSE due to computational burden. We did not factor the macro variables into a joint modeling but only used them as a robustness test mechanism. Future research should consider a joint modeling of macro variables as a means to compensate for the role of options data in a joint estimation and evaluate their impact on parameter estimation in the context of Bayesian modeling.

Author Contributions

Conceptualization, M.M. and G.v.V.; methodology, G.v.V.; software, M.M.; validation, M.M. and G.v.V.; formal analysis, M.M. and G.v.V.; investigation, M.M. and G.v.V.; resources, M.M. and G.v.V.; data curation, M.M. and G.v.V.; writing—original draft preparation, M.M.; writing—review and editing, G.v.V.; visualization, M.M.; supervision, G.v.V.; project administration, G.v.V.; funding acquisition, None. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the authors upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ATSM	Affine Term Structure Models
BIC	Bayesian Information Criterion
BDFS	Balduzzi P, Das SR, Foresi S
DM	Diebold–Mariano
DK	Duffie and Kan
DTSM	Dynamic Term Structure Models
ESS	Effective Sample Size
FX	Foreign Exchange
GMM	Generalized Method of Moments
HDI	Highest Density Interval
MCSE	Monte Carlo Standard Error
MCMC	Markov Chain Monte Carlo
LRSQ	Linear-Rational Square-Root
MH	Metropolis–Hastings
ODE	Ordinary Differential Equation
PCA	Principal Component Analysis
RMSE	Root-Mean-Square Error
SA	South African
SME	Simulated Method of Estimation
SDE	Stochastic Differential Equation
SV	Stochastic Volatility
USV	Unspanned Stochastic Volatility
USDZAR	SA Rand Dollar

Appendix A. Derivation of the Physical Measure Drift

It is assumed that the market price of risk $Λ_{t}$ takes the “essentially affine” form of Duffee (2002) in the state variables and is written as

Λ_{t} = \sqrt{S_{t}} λ_{1} + \sqrt{S_{t}^{- 1}} λ_{2} X_{t}

where $λ_{1} \in R^{N}$ and $λ_{2} \in R^{N \times N}$ such that

λ_{1} = [\begin{matrix} λ_{r 0} \\ λ_{μ 0} \\ λ_{V 0} \end{matrix}], λ_{2} = [\begin{matrix} λ_{r r} & λ_{r μ} & λ_{r V} \\ λ_{μ r} & λ_{μ μ} & λ_{μ V} \\ λ_{V r} & λ_{V μ} & λ_{V V} \end{matrix}]

(A1)

To derive the drift under the physical measure $P$, we invoke Girsanov’s theorem. First, the link between the Brownian motions under the measures Q and P is established by

W_{t}^{Q} = W_{t}^{P} + \int_{0}^{t} Λ_{s} d s

(A2)

Taking expectations and dividing by $d t$ on both sides, we find

\frac{1}{d t} E^{P} [d X_{t}] = \frac{1}{d t} E^{Q} [d X_{t}] + Λ_{t}

(A3)

Substituting the first term on the right-hand side of (A3) with (13) we obtain the following

\frac{1}{d t} E^{P} [d X_{t}] = [\begin{matrix} m_{0} + m_{r} r_{t} + m_{μ} μ_{t}^{Q} + m_{V} V_{t} \\ μ_{t}^{Q} \\ g_{V} - k_{V} V_{t} \end{matrix}] + (\sqrt{S_{t}} λ_{1} + \sqrt{S_{t}^{- 1}} λ_{2} [\begin{matrix} r_{t} \\ μ_{t}^{Q} \\ V_{t} \end{matrix}])

(A4)

We now compute each part of the sum explicitly:

Step 1: Risk-neutral drift

μ^{Q} (X_{t}) = [\begin{matrix} m_{0} + m_{r} r_{t} + m_{μ} μ_{t}^{Q} + m_{V} V_{t} \\ μ_{t}^{Q} \\ g_{V} - k_{V} V_{t} \end{matrix}]

Step 2: Market price of risk term

Let us denote:

Λ_{t} = Λ_{1} + Λ_{2}

(A5)

where

Λ_{1} = \sqrt{S_{t}} λ_{1} = [\begin{matrix} \sqrt{S_{t}} λ_{r 0} \\ \sqrt{S_{t}} λ_{μ 0} \\ \sqrt{S_{t}} λ_{V 0} \end{matrix}], Λ_{2} = \sqrt{S_{t}^{- 1}} λ_{2} X_{t} = \sqrt{S_{t}^{- 1}} \cdot [\begin{matrix} λ_{r r} r_{t} + λ_{r μ} μ_{t}^{Q} + λ_{r V} V_{t} \\ λ_{μ r} r_{t} + λ_{μ μ} μ_{t}^{Q} + λ_{μ V} V_{t} \\ λ_{V r} r_{t} + λ_{V μ} μ_{t}^{Q} + λ_{V V} V_{t} \end{matrix}]

Assuming consistent scaling (or setting

\sqrt{S_{t}} = 1

and

\sqrt{S_{t}^{- 1}} = 1

without loss of generality, as is common in affine models), we obtain:

Λ_{t} = [\begin{matrix} λ_{r 0} + λ_{r r} r_{t} + λ_{r μ} μ_{t}^{Q} + λ_{r V} V_{t} \\ λ_{μ 0} + λ_{μ r} r_{t} + λ_{μ μ} μ_{t}^{Q} + λ_{μ V} V_{t} \\ λ_{V 0} + λ_{V r} r_{t} + λ_{V μ} μ_{t}^{Q} + λ_{V V} V_{t} \end{matrix}]

(A6)

Step 3: Adding the terms to obtain a physical drift

Now, summing $μ^{Q} (X_{t})$ and $Λ_{t}$:

\frac{1}{d t} E^{P} [d X_{t}] = [\begin{matrix} m_{0} + λ_{r 0} + (m_{r} + λ_{r r}) r_{t} + (m_{μ} + λ_{r μ}) μ_{t}^{Q} + (m_{V} + λ_{r V}) V_{t} \\ μ_{t}^{Q} + λ_{μ 0} + λ_{μ r} r_{t} + λ_{μ μ} μ_{t}^{Q} + λ_{μ V} V_{t} \\ g_{V} + λ_{V 0} - (k_{V} - λ_{V V}) V_{t} + λ_{V r} r_{t} + λ_{V μ} μ_{t}^{Q} \end{matrix}]

(A7)

To simplify notation and group terms, define:

\begin{matrix} {\tilde{m}}_{0} = m_{0} + λ_{r 0}, {\tilde{m}}_{r} = m_{r} + λ_{r r}, {\tilde{m}}_{μ} = m_{μ} + λ_{r μ}, {\tilde{m}}_{V} = m_{V} + λ_{r V} \\ {\tilde{g}}_{V} = g_{V} + λ_{V 0}, {\tilde{k}}_{V} = k_{V} - λ_{V V} \end{matrix}

Finally, the drift under the physical measure $P$ becomes

\frac{1}{d t} E^{P} [d X_{t}] = [\begin{matrix} {\tilde{m}}_{0} + {\tilde{m}}_{r} r_{t} + {\tilde{m}}_{μ} μ_{t}^{Q} + {\tilde{m}}_{V} V_{t} \\ (1 + λ_{μ μ}) μ_{t}^{Q} + λ_{μ 0} + λ_{μ r} r_{t} + λ_{μ V} V_{t} \\ {\tilde{g}}_{V} - {\tilde{k}}_{V} V_{t} + λ_{V r} r_{t} + λ_{V μ} μ_{t}^{Q} \end{matrix}]

(A8)
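As a numerical check of the algebra in Steps 1–3, the following sketch evaluates the physical drift in two ways: as the risk-neutral drift plus $Λ_{t}$ (with the square-root scaling set to 1, as assumed above), and through the grouped tilde parameters of (A8). All parameter values and function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def q_drift(x, m0, mr, mmu, mV, gV, kV):
    """Risk-neutral drift of X_t = (r_t, mu_t, V_t) as in Step 1."""
    r, mu, V = x
    return np.array([m0 + mr * r + mmu * mu + mV * V, mu, gV - kV * V])

def p_drift(x, q_params, lam1, lam2):
    """Physical drift = risk-neutral drift + Lambda_t, with sqrt(S_t) set to 1."""
    return q_drift(x, *q_params) + lam1 + lam2 @ x

def p_drift_grouped(x, q_params, lam1, lam2):
    """The same drift written with the grouped (tilde) parameters of (A8)."""
    m0, mr, mmu, mV, gV, kV = q_params
    r, mu, V = x
    m0_t, mr_t = m0 + lam1[0], mr + lam2[0, 0]
    mmu_t, mV_t = mmu + lam2[0, 1], mV + lam2[0, 2]
    gV_t, kV_t = gV + lam1[2], kV - lam2[2, 2]
    return np.array([
        m0_t + mr_t * r + mmu_t * mu + mV_t * V,
        (1 + lam2[1, 1]) * mu + lam1[1] + lam2[1, 0] * r + lam2[1, 2] * V,
        gV_t - kV_t * V + lam2[2, 0] * r + lam2[2, 1] * mu,
    ])

# Hypothetical values, for illustration only.
q_params = (0.01, -0.5, 1.0, 0.2, 0.02, 0.8)
lam1 = np.array([0.10, -0.05, 0.03])
lam2 = np.array([[0.20, 0.00, 0.10],
                 [0.00, -0.30, 0.05],
                 [0.02, 0.01, -0.40]])
x = np.array([0.07, 0.01, 0.04])
drift_sum = p_drift(x, q_params, lam1, lam2)
drift_grouped = p_drift_grouped(x, q_params, lam1, lam2)
```

Both functions return the same vector, which confirms that the grouping of terms is purely notational.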

Appendix B. Bayesian Estimation Algorithm

This appendix outlines the Bayesian estimation procedure for the ATSM, implemented via a Metropolis–Hastings (MH)-within-Gibbs sampler. The algorithm follows the block sampling structure proposed in Collin-Dufresne et al. (2009); see also Singleton (2006) and Johannes and Polson (2010) for a detailed discussion of MCMC.

Appendix B.1. Latent State Structure and Volatility

The latent state vector $X_{t}$ is partitioned as:

X_{t} = [\begin{matrix} X_{t}^{0} \\ V_{t} \end{matrix}], where X_{t}^{0} = [\begin{matrix} r_{t} \\ μ_{t} \end{matrix}]

(A9)

Here, $V_{t}$ is the latent stochastic volatility component, constrained below by $\underline{V}$. We define the deviation variable:

U_{t} = max (V_{t} - \underline{V}, 0)

(A10)

Sampling is conducted over $U_{t}$, from which $V_{t} = \underline{V} + U_{t}$ is recovered.

Appendix B.2. Model Discretization

The continuous-time dynamics of the state vector under the physical measure $P$ are approximated using the Euler–Maruyama scheme in (22).
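For readers unfamiliar with the scheme, a generic Euler–Maruyama step can be sketched as below. This is not a reproduction of (22): the drift and diffusion functions, their coefficients, and the flooring of $V_{t}$ at zero inside the square root are assumptions made only so that the example runs.

```python
import numpy as np

def euler_maruyama_path(x0, drift, diffusion, dt, n_steps, rng):
    """Simulate X_{t+dt} = X_t + drift(X_t) dt + diffusion(X_t) sqrt(dt) eps_t."""
    x = np.empty((n_steps + 1, len(x0)))
    x[0] = x0
    for t in range(n_steps):
        shock = rng.normal(size=len(x0))
        x[t + 1] = x[t] + drift(x[t]) * dt + diffusion(x[t]) @ shock * np.sqrt(dt)
    return x

def example_drift(x):
    # Hypothetical affine drift for (r_t, mu_t, V_t).
    r, mu, V = x
    return np.array([-0.5 * r + mu, -0.2 * mu, 0.02 - 0.5 * V])

def example_diffusion(x):
    # Hypothetical volatility-scaled diffusion; V is floored at 0 before the root.
    root_v = np.sqrt(max(x[2], 0.0))
    return np.diag([root_v, 0.01, 0.1 * root_v])

rng = np.random.default_rng(5)
path = euler_maruyama_path(np.array([0.07, 0.0, 0.04]), example_drift,
                           example_diffusion, dt=1.0 / 52.0, n_steps=520, rng=rng)
```

With weekly data, `dt = 1/52` matches the sampling frequency; a finer grid reduces discretization bias at a higher computational cost.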

Appendix B.3. State–Space Representation

The model is cast in the following discrete-time state–space form:

P_{t} = K + L X_{t} + ε_{t}, \quad ε_{t} \sim N (0, Σ_{ε})

(A11)

X_{t + 1} = m + F X_{t} + η_{t}, \quad η_{t} \sim N (0, Q (V_{t}))

(A12)

where:

$P_{t} \in R^{N}$ : observed yield vector by PC loading, see (23),
$X_{t} \in R^{d}$ : latent state,
$K \in R^{N}, L \in R^{N \times d}$ : observation equation parameters,
$m \in R^{d}, F \in R^{d \times d}$ : state equation parameters,
$Q (V_{t})$ : state-dependent transition covariance matrix,
$Σ_{ε}$ : observation noise covariance.
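Given the state–space form (A11)–(A12), one Kalman update-and-predict step can be sketched as follows. This is a textbook recursion rather than the authors' implementation; the covariance is named `C` to avoid a clash with the observation vector $P_{t}$, and the state-dependent $Q (V_{t})$ is assumed to be evaluated by the caller and passed in as `Q`.

```python
import numpy as np

def kalman_step(x_pred, C_pred, y, K, L, Sigma_eps, m, F, Q):
    """One update-then-predict step for y_t = K + L X_t + eps, X_{t+1} = m + F X_t + eta."""
    innovation = y - (K + L @ x_pred)
    S = L @ C_pred @ L.T + Sigma_eps             # innovation covariance
    gain = np.linalg.solve(S, L @ C_pred).T      # C_pred L^T S^{-1}
    x_filt = x_pred + gain @ innovation
    C_filt = C_pred - gain @ L @ C_pred
    x_next = m + F @ x_filt                      # one-step-ahead state prediction
    C_next = F @ C_filt @ F.T + Q
    return x_filt, C_filt, x_next, C_next

# Scalar example with values chosen so the result is easy to check by hand.
one = np.eye(1)
x_filt, C_filt, x_next, C_next = kalman_step(
    x_pred=np.zeros(1), C_pred=one, y=np.ones(1),
    K=np.zeros(1), L=one, Sigma_eps=one,
    m=np.zeros(1), F=one, Q=0.1 * one,
)
```

In the scalar example the gain is 0.5, so the filtered mean moves halfway from the prediction towards the observation.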

Appendix B.4. Prior Distributions

The parameter vector is partitioned into three blocks:

$ϕ^{Q}$ : risk-neutral dynamics parameters,
$ϕ^{λ}$ : risk premia parameters,
$ϕ^{Λ}$ : measurement error variances.

Priors are assigned as:

\begin{matrix} ϕ^{Q} & \sim N (μ_{Q}, Σ_{Q}) \\ ϕ^{λ} & \sim N (μ_{λ}, Σ_{λ}) \\ σ_{ε, i}^{2} & \sim Inverse - Gamma (a_{ε}, b_{ε}), i = 1, \dots, N \\ X_{0} & \sim N (μ_{0}, Σ_{0}) \end{matrix}

Appendix B.5. Posterior Structure

Let $P_{1 : T}$ denote the observed data expressed in terms of the linear relation between the yields and principal components and augmented with the term structure factor data $X_{1 : T}$ (Collin-Dufresne et al., 2009). The posterior distribution is:

\begin{aligned} p (U_{1 : T}, X_{1 : T}^{0}, ϕ^{Λ}, ϕ^{λ}, ϕ^{Q} ∣ P_{1 : T}) \propto{} & p (P_{1 : T} ∣ X_{1 : T}^{0}, U_{1 : T}, ϕ^{Λ}) \cdot p (X_{1 : T}^{0} ∣ U_{1 : T}, ϕ^{Q}) \cdot p (U_{1 : T} ∣ ϕ^{Q}) \\ & \cdot p (ϕ^{Λ}) \cdot p (ϕ^{λ}) \cdot p (ϕ^{Q}) \end{aligned}

Appendix B.6. Sampling Algorithm

We implement a Gibbs sampler with Metropolis–Hastings steps embedded for non-conjugate blocks. The full parameter space is divided into:

$U_{1 : T}$ (sampled in blocks),
$ϕ^{Λ}$ (measurement error),
$ϕ^{λ}$ (risk premia),
$ϕ^{Q}$ and $X_{0}$ (risk-neutral dynamics and initial condition),
$X_{1 : T}^{0}$ (Gaussian latent states).

Step 1: Sampling U_t for t ∈ {1, 1 + h, 1 + 2h,…, T}

For each t, the target density follows from the model’s Markov structure:

p (U_{t} ∣ \cdot) \propto p (U_{t + h}, X_{t + h}^{0} ∣ U_{t}, X_{t}^{0}, ϕ^{Q}) \cdot p (U_{t} ∣ U_{t - h}, X_{t - h}^{0}, X_{t}^{0}, ϕ^{Q})

The second term is Gaussian and serves as the proposal in a random-walk MH algorithm. At iteration s:

Propose ${\tilde{U}}_{t} \sim N (U_{t}^{(s - 1)}, σ_{prop}^{2})$ , enforce ${\tilde{U}}_{t} \geq 0$
Evaluate acceptance ratio:

$α = min (1, \frac{p (U_{t + h}, X_{t + h}^{0} ∣ {\tilde{U}}_{t}, X_{t}^{0}) \cdot p ({\tilde{U}}_{t} ∣ \cdot)}{p (U_{t + h}, X_{t + h}^{0} ∣ U_{t}^{(s - 1)}, X_{t}^{0}) \cdot p (U_{t}^{(s - 1)} ∣ \cdot)})$
Accept ${\tilde{U}}_{t}$ with probability $α$ , otherwise retain $U_{t}^{(s)} = U_{t}^{(s - 1)}$
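The random-walk MH update with a non-negativity constraint can be sketched on a toy target as follows. The target density here is a simple exponential, not the model's conditional density, and rejecting negative proposals outright is one possible way (an assumption of this sketch) to enforce the constraint.

```python
import numpy as np

def rw_mh_nonneg(log_target, u0, prop_sd, n_iter, rng):
    """Random-walk Metropolis-Hastings on [0, inf); negative proposals are rejected."""
    draws = np.empty(n_iter)
    u, logp = u0, log_target(u0)
    accepted = 0
    for s in range(n_iter):
        u_new = u + prop_sd * rng.normal()
        if u_new >= 0.0:                                  # enforce the constraint
            logp_new = log_target(u_new)
            if np.log(rng.uniform()) < logp_new - logp:   # MH accept/reject
                u, logp = u_new, logp_new
                accepted += 1
        draws[s] = u                                      # keep old value on rejection
    return draws, accepted / n_iter

# Toy target (assumption): exponential density with rate 2, whose mean is 0.5.
rng = np.random.default_rng(2)
draws, acc_rate = rw_mh_nonneg(lambda u: -2.0 * u, u0=0.5, prop_sd=1.0,
                               n_iter=20000, rng=rng)
```

The proposal standard deviation `prop_sd` plays the role of $σ_{prop}$ and is the quantity tuned to reach a target acceptance rate.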

Step 2: Sample ϕ^Λ (Measurement Error)

Conditional on $X_{1 : T}^{0}, U_{1 : T}, ϕ^{Q}$
If priors are inverse-gamma, a conjugate update is available
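A minimal sketch of the conjugate update, assuming independent Gaussian measurement errors with a diagonal covariance and an Inverse-Gamma$(a_{ε}, b_{ε})$ prior on each variance: the conditional posterior is then Inverse-Gamma with shape $a_{ε} + T / 2$ and scale $b_{ε}$ plus half the sum of squared residuals. The function name and the synthetic residuals are hypothetical.

```python
import numpy as np

def draw_measurement_variances(resid, a_eps, b_eps, rng):
    """Draw sigma^2_{eps,i} from its Inverse-Gamma conditional posterior.

    resid has shape (T, N): observation residuals P_t - (K + L X_t).
    """
    T = resid.shape[0]
    shape = a_eps + 0.5 * T
    rate = b_eps + 0.5 * np.sum(resid ** 2, axis=0)
    # If G ~ Gamma(shape, scale=1/rate), then 1/G ~ Inverse-Gamma(shape, rate).
    return 1.0 / rng.gamma(shape, 1.0 / rate)

# Synthetic residuals (assumption) with known standard deviations 0.1 and 0.2.
rng = np.random.default_rng(3)
true_sd = np.array([0.1, 0.2])
resid = rng.normal(scale=true_sd, size=(2000, 2))
sigma2_draw = draw_measurement_variances(resid, a_eps=2.0, b_eps=0.01, rng=rng)
```

Because the update is conjugate, this block needs no accept/reject step and no tuning.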

Step 3: Sample ϕ^λ (Risk Premia)

Conditional on $X_{1 : T}^{0}, U_{1 : T}, ϕ^{Λ}, ϕ^{Q}$
Use MH step if non-conjugate
Enforce admissibility and stationarity

Step 4: Sample ϕ^Q and X₀

Conditional on $X_{1 : T}^{0}, U_{1 : T}, ϕ^{Λ}, ϕ^{λ}$
Sample $ϕ^{Q}$ via MH or conjugate draw, enforce stationarity
Sample $X_{0} \sim N (μ_{0}^{*}, Σ_{0}^{*})$ via conditional normal update

Step 5: Sample $X_{1 : T}^{0}$ (Gaussian Latent States)

Conditional on all parameters and $U_{1 : T}$
Use Forward-Filtering Backward-Sampling (FFBS) or Kalman smoother

Step 6: Iterate

Repeat Steps 1–5 for $s = 1, \dots, S$
Store posterior draws; discard burn-in period

Appendix B.7. Convergence Diagnostics and Tuning

Acceptance Rate: A reasonable rate is 20–40% in Metropolis–Hastings blocks. Proposal variances should be adjusted accordingly.
Trace Plots and Autocorrelation: They are used to monitor the mixing behavior of key parameters and latent states.
Gelman–Rubin Statistic $\hat{R}$ : For multiple chains, verify $\hat{R} < 1.1$ for all monitored parameters to ensure convergence.
Effective Sample Size (ESS): It is computed for each parameter to assess how many independent draws are available:

$E S S = \frac{S}{1 + 2 \sum_{k = 1}^{\infty} ρ_{k}}$

(A13)

where $ρ_{k}$ is the autocorrelation at lag k, and S is the total number of post-burn-in samples.
Highest Density Intervals (HDIs): Use posterior quantiles or kernel density estimates to compute 90% or 95% HDIs:

${HDI}_{α} = \{θ : p (θ ∣ data) \geq c\}$

(A14)

where c is the density threshold that ensures coverage probability $α$ .
Burn-in and Thinning: Discard the first B iterations (e.g., $B = 2000$), where B is the number of burn-in iterations. Thinning is optional but may reduce autocorrelation in stored draws. For a detailed discussion of convergence diagnostics, see Roy (2020); for software applications, see Kumar et al. (2019).
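Two of the diagnostics above, the effective sample size in (A13) and the Gelman–Rubin statistic, can be sketched in plain NumPy as follows. The infinite sum in (A13) must be truncated in practice; stopping at the first non-positive autocorrelation is an assumption of this sketch, and the $\hat{R}$ shown is the classic between/within-chain version. Dedicated libraries such as ArviZ provide more refined estimators.

```python
import numpy as np

def effective_sample_size(chain, max_lag=None):
    """ESS = S / (1 + 2 * sum_k rho_k), truncated at the first non-positive rho_k."""
    x = np.asarray(chain, dtype=float)
    S = x.size
    x = x - x.mean()
    var = np.dot(x, x) / S
    max_lag = S // 2 if max_lag is None else max_lag
    rho_sum = 0.0
    for k in range(1, max_lag):
        rho = np.dot(x[:-k], x[k:]) / (S * var)   # lag-k autocorrelation
        if rho <= 0.0:
            break
        rho_sum += rho
    return S / (1.0 + 2.0 * rho_sum)

def gelman_rubin(chains):
    """Classic R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()          # within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)        # between-chain variance
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)

# Synthetic chains (assumption): independent draws and a persistent AR(1) chain.
rng = np.random.default_rng(4)
iid = rng.normal(size=(4, 5000))
ar = np.zeros(5000)
for t in range(1, 5000):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()
ess_iid = effective_sample_size(iid[0])
ess_ar = effective_sample_size(ar)
rhat = gelman_rubin(iid)
```

The persistent chain yields a much smaller ESS than the independent one, which is the pattern that motivates tuning proposal variances or thinning.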

Appendix B.8. Alternative Samplers

More efficient sampling algorithms exist. For instance:

Hamiltonian Monte Carlo (HMC) or No-U-Turn Sampler (NUTS):
− They use gradient information to improve sampling efficiency
− They are particularly useful for high-dimensional latent blocks
Particle MCMC or Particle Gibbs:
− It is well-suited to nonlinear and non-Gaussian state–space models

We opt for the MH-within-Gibbs algorithm for transparency, ease of constraint enforcement, and alignment with prior literature.

Appendix C. Workflow Diagram: Yield Curve Modeling and Inference

Appendix D. Algorithm: Yield Curve Inference via PCA, Kalman and MCMC

Algorithm A1 MCMC Algorithm with Block Sampling of $U_{t}$ and Kalman Filtering

Require: Observed data Y, initial values $Φ = \{ϕ^{Q}, ϕ^{Λ}, ϕ^{λ}\}$, initial state path $X_{1 : T}^{0}$, and deviation path $U_{1 : T}$
Ensure: Posterior samples of $Φ$, $X_{1 : T}^{0}$, and $U_{1 : T}$
1: for s = 1 to S do
2: for each $t \in \{1, 1 + h, 1 + 2 h, \dots, T\}$ do
3: Propose ${\tilde{U}}_{t} \sim N (U_{t}^{(s - 1)}, σ_{prop}^{2})$, truncated below at 0
4: Compute MH acceptance probability:

$α = min (1, \frac{p (U_{t + h}, X_{t + h}^{0} ∣ {\tilde{U}}_{t}, X_{t}^{0}) \cdot p ({\tilde{U}}_{t} ∣ \cdot)}{p (U_{t + h}, X_{t + h}^{0} ∣ U_{t}^{(s - 1)}, X_{t}^{0}) \cdot p (U_{t}^{(s - 1)} ∣ \cdot)})$
5: Draw $u \sim Uniform (0, 1)$
6: if $u > α$ or constraint violations (e.g., Feller condition, stationarity) then
7: Reject draw; set $U_{t}^{(s)} = U_{t}^{(s - 1)}$
8: else
9: Accept draw; set $U_{t}^{(s)} = {\tilde{U}}_{t}$
10: end if
11: end for
12: Draw $ϕ^{Λ} \sim p (ϕ^{Λ} ∣ Y, X^{0}, U_{1 : T})$
13: Draw $ϕ^{λ} \sim p (ϕ^{λ} ∣ Y, X^{0}, U_{1 : T})$
14: Sample latent states $X_{1 : T}^{0}$:
● Conditional on $U_{1 : T}$, the system is linear-Gaussian
● Apply Kalman filter and smoother to draw $X_{1 : T}^{0}$ from its conditional posterior
15: Draw $ϕ^{Q} \sim p (ϕ^{Q} ∣ Y, X_{1 : T}^{0}, U_{1 : T})$
16: Store current samples: $Φ$, $X_{1 : T}^{0}$, $U_{1 : T}$
17: end for
18: return Posterior draws of $\{ϕ^{Q}, ϕ^{Λ}, ϕ^{λ}\}$, $X_{1 : T}^{0}$, $U_{1 : T}$

Notes

1	Collin-Dufresne et al. (2009) decompose the state variable X into $X_{t} = [\begin{matrix} X_{t}^{0} \\ V_{t} \end{matrix}]$, where $X^{0}$ includes all the state variables $r_{t}$, $μ_{t}$, and $θ_{t}$, but excludes $V_{t}$. The reason is that $V_{t}$ only affects the factor covariance matrix. They condition on the entire path of $V$ and write $X_{t}^{0}$ and $P$ in linear-Gaussian state–space form. The draws involving V can be done using a relatively inefficient MH step.
2	Risk-neutral parameters $K_{V}$ and $γ_{V}$ are not identifiable under USV, hence they are replaced by $γ_{V}^{P} = γ_{V} + λ_{V 0}$ , and $K_{V}^{P} = K_{V} - λ_{V V}$ (Collin-Dufresne et al., 2009).
3	We use the Diebold and Mariano (2002) test to assess whether forecast $A_{1} (3)$ USV significantly outperforms $A_{1} (4)$ USV in terms of bias and RMSE. The global DM test statistic evaluates the null hypothesis of equal predictive accuracy across the full forecast horizon. Significance is indicated as follows: *** p-value $< 0.01$, ** p-value $< 0.05$, * p-value $< 0.1$. In addition to the global DM statistic, we compute standardized per-point loss differentials $d_{i}$ to highlight localized forecast performance differences. Each per-point z-score is defined as $z_{i} = \frac{d_{i} - \bar{d}}{s_{d}}$, where $d_{i}$ is the pointwise difference in forecast losses, $\bar{d}$ is the mean loss difference, and $s_{d}$ is the sample standard deviation. Significance levels per point are marked by: *** $|z| > 2.756$, ** $|z| > 1.960$, * $|z| > 1.645$.
4	By model-implied volatility we refer to the output from both the $A_{1} (3)$ and $A_{1} (4)$ USV models. Readers should not confuse this with the implied volatility surface (option-implied volatility).

References

Aït-Sahalia, Y., Li, C., & Li, C. X. (2024). Maximum likelihood estimation of latent Markov models using closed-form approximations. Journal of Econometrics, 240(2), 105008.
Andersen, T. G., & Benzoni, L. (2010). Do bonds span volatility risk in the US Treasury market? A specification test for affine term structure models. The Journal of Finance, 65(2), 603–653.
Andreasen, M. M., Jørgensen, K., & Meldrum, A. (2025). Bond risk premiums at the zero lower bound. Journal of Econometrics, 247, 105939.
Ang, A., Piazzesi, M., & Wei, M. (2006). What does the yield curve tell us about GDP growth? Journal of Econometrics, 131(1–2), 359–403.
Bikbov, R., & Chernov, M. (2004). Term structure and volatility: Lessons from the Eurodollar markets. SSRN Electronic Journal.
Cheridito, P., Filipović, D., & Kimmel, R. L. (2007). Market price of risk specifications for affine models: Theory and evidence. Journal of Financial Economics, 83(1), 123–170.
Christoffersen, P., & Diebold, F. X. (2002, January 2). Financial asset returns, market timing, and volatility dynamics. Available online: https://users.nber.org/~confer/2003/si2003/papers/efww/diebold.pdf (accessed on 25 June 2025).
Collin-Dufresne, P., & Goldstein, R. S. (2002). Do bonds span the fixed income markets? Theory and evidence for unspanned stochastic volatility. The Journal of Finance, 57(4), 1685–1730.
Collin-Dufresne, P., Goldstein, R. S., & Jones, C. S. (2008). Identification of maximal affine term structure models. The Journal of Finance, 63(2), 743–795.
Collin-Dufresne, P., Goldstein, R. S., & Jones, C. S. (2009). Can interest rate volatility be extracted from the cross section of bond yields? Journal of Financial Economics, 94(1), 47–66.
Collin-Dufresne, P., Jones, C., & Goldstein, R. (2004). Can interest rate volatility be extracted from the cross section of bond yields? An investigation of unspanned stochastic volatility. Available online: https://www.epfl.ch/labs/sfi-pcd/wp-content/uploads/2021/07/Can-Interest-Rate-Volatility-be-Extracted-from-the-Cross-Section-of-Bond-Yields.pdf (accessed on 25 June 2025).
Dai, Q., & Singleton, K. J. (2000). Specification analysis of affine term structure models. The Journal of Finance, 55(5), 1943–1978.
Diebold, F. X., & Mariano, R. S. (2002). Comparing predictive accuracy. Journal of Business & Economic Statistics, 20(1), 134–144.
Duffee, G. R. (2002). Term premia and interest rate forecasts in affine models. The Journal of Finance, 57(1), 405–443.
Duffie, D., & Kan, R. (1996). A yield-factor model of interest rates. Mathematical Finance, 6(4), 379–406.
Elerian, O., Chib, S., & Shephard, N. (2001). Likelihood inference for discretely observed nonlinear diffusions. Econometrica, 69(4), 959–993.
Hansen, J. W. (2025). Unspanned stochastic volatility in the linear-rational square-root model: Evidence from the Treasury market. Journal of Banking & Finance, 171, 107354.
Heidari, M., & Wu, L. (2002, September 10). Term structure of interest rates, yield curve residuals, and the consistent pricing of interest rate derivatives. Available online: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=c708679c36cd86175f128f6fdd5f33ffdc61e959 (accessed on 25 June 2025).
Johannes, M., & Polson, N. (2010). MCMC methods for continuous-time financial econometrics. In Handbook of financial econometrics: Applications (pp. 1–72). Elsevier.
Kumar, R., Carroll, C., Hartikainen, A., & Martin, O. (2019). ArviZ a unified library for exploratory analysis of Bayesian models in Python. Journal of Open Source Software, 4(33), 1143.
Litterman, R. B., Scheinkman, J., & Weiss, L. (1991). Volatility and the yield curve. The Journal of Fixed Income, 1(1), 49–53.
López-Pérez, A., Febrero-Bande, M., & González-Manteiga, W. (2025). Estimation and specification test for diffusion models with stochastic volatility. Statistical Papers, 66(2), 40.
Piazzesi, M. (2010). Affine term structure models. In Handbook of financial econometrics: Tools and techniques (pp. 691–766). Elsevier.
Riva, R. (2024). How much unspanned volatility can different shocks explain? Available at SSRN 4878175. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4878175 (accessed on 25 June 2025).
Roy, V. (2020). Convergence diagnostics for Markov chain Monte Carlo. Annual Review of Statistics and Its Application, 7(1), 387–412.
Singleton, K. J. (2006). Empirical dynamic asset pricing: Model specification and econometric assessment. Princeton University Press.
Stroud, J. R., Müller, P., & Polson, N. G. (2003). Nonlinear state-space models with state-dependent variances. Journal of the American Statistical Association, 98(462), 377–386.

Figure 1. Cumulative explained variance in percentages is plotted against the number of principal components. The green dotted line with dots and blue line represent the cumulative variance for models $A_{1} (3)$ and $A_{1} (4)$ fitted yields, respectively. A gray dotted line is the maximum level of 99.97% beyond which the remaining 0.03% represented by a blue line is negligible.

Figure 2. A sample of posterior parameter histograms for the $A_{1} (3)$ USV model. Each subplot depicts a distribution for each parameter in this order (a) $m_{0}$ (b) $m_{r}$ (c) $m_{μ}$ (d) $m_{V}$ (e) $K_{V}^{P}$. These are the risk-neutral drift parameters except for the $γ_{V}^{P}$ and $K_{V}^{P}$. The magenta line represents a theoretical Gaussian density plot. The histograms exhibit a unimodal shape in most cases, which confirms a successful convergence and mixing.

Figure 3. A sample of posterior parameter histograms for the $A_{1} (4)$ USV model. Each subplot depicts a distribution for each parameter in this order (a) $m_{0}$ (b) $m_{r}$ (c) $m_{μ}$ (d) $m_{θ}$ (e) $m_{V}$ (f) $γ_{V}^{P}$ (g) $K_{V}^{P}$. Specifically, these are the risk-neutral drift parameters except for the $γ_{V}^{P}$ and $K_{V}^{P}$. For each histogram, the magenta line represents a theoretical Gaussian density plot. The histograms exhibit a unimodal shape in most cases, which confirms a successful convergence and mixing.

Figure 4. Conditional and unconditional volatility are plotted with the probability distributions of parameters. (a) $A_{1} (3)$ USV model-implied conditional volatility $V_{t}$ in blue, plotted with a 26-week rolling volatility in green. (b) $A_{1} (3)$ USV model-implied unconditional volatility in blue, shown within the 95% confidence bounds and plotted with a 26-week rolling volatility in green. (c) $A_{1} (4)$ USV model-implied conditional volatility $V_{t}$ in blue, plotted with a 26-week rolling volatility in green. (d) $A_{1} (4)$ USV model-implied unconditional volatility in blue, shown within the 95% confidence bounds and plotted with a 26-week rolling volatility in green. Conditional volatilities are plotted against the time period 2016–2024, while the unconditional volatilities are plotted against the maturities.

Figure 5. Market price of risk in basis points is plotted against time for the period 2013–2024. Sub-figure (a) plots the model $A_{1} (3)$ USV and sub-figure (b) plots $A_{1} (4)$ USV. They are based on the proportional relationship between $\sqrt{S_{t}}$ and the essentially affine $Λ = λ_{1} + λ_{2}$.

Table 1. Principal components of the zero-coupon yields for the SA government treasury bond over maturities 3 months, 5, 10, 12, 20, 25, and 30 years. The second row from the bottom provides details about explained variances for individual components, which, by observation, are declining from 66.06% for the first component, followed by 30.14% for the second one, thereafter dropping to almost zero. The last row represents the cumulative percentages, confirming that the first three components contribute more to variation than the remaining ones.

	Principal Components
	1	2	3	4	5	6	7
3-month	−0.10	0.86	−0.33	0.13	−0.13	−0.33	−0.06
5-year	0.02	0.36	−0.08	−0.50	0.44	0.58	0.28
10-year	0.17	−0.03	0.04	−0.37	−0.51	−0.25	0.71
12-year	0.30	−0.11	−0.28	0.52	0.52	−0.18	0.51
20-year	0.60	0.00	−0.37	0.21	−0.43	0.50	−0.15
25-year	0.45	0.34	0.80	0.20	0.05	0.02	0.01
30-year	0.56	−0.08	−0.16	−0.50	0.27	−0.46	−0.36
Explained Variance (%)	66.06	30.14	3.77	0.03	0	0	0
Cumulative Variance (%)	66.06	96.2	99.97	100	100	100	100
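
As a quick check on Table 1, the cumulative row can be rebuilt from the explained-variance row with a running sum. The sketch below only re-adds the percentages printed in the table; the helper name `cumulative` is illustrative and not from the paper.

```python
# Explained-variance shares (%) copied from the second-to-last row of Table 1.
explained = [66.06, 30.14, 3.77, 0.03, 0.0, 0.0, 0.0]


def cumulative(shares):
    """Running total of the variance shares, rounded to two decimals."""
    total, out = 0.0, []
    for share in shares:
        total += share
        out.append(round(total, 2))
    return out


cumulative_variance = cumulative(explained)
print(cumulative_variance)
```

The first three entries of the result correspond to the 66.06%, 96.2% and 99.97% reported in the last row of the table.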

Table 2. Posterior distribution for key risk-neutral drift and covariance parameters for model $A_{1} (3)$ compared to $A_{1} (4)$. The parameters are fitted to the weekly approximated zero yields for the period October 2013 to September 2024. Both models are distinguished from the unrestricted maximal model by restricted parameters. These restricted parameters are shown as underlined and are a function of normal parameters. The table consists of point estimates, based on mean values of the posterior parameter distribution, and confidence interval bounds (in square brackets). Confidence interval bounds for the point estimates are computed as the 2.5th and 97.5th percentiles of the posterior distribution.

Parameter	$A_{1}$ (3) USV	$A_{1}$ (4) USV
$m_{0}$	0.0012 [0.0009, 0.0015]	0.0010 [0.0010, 0.0010]
$m_{r}$	−0.3588 [−0.4171, −0.3005]	−0.0780 [−0.1147, −0.0414]
$m_{μ}$	−0.8077 [−1.1711, −0.4443]	−0.0576 [−0.0676, −0.0476]
$m_{θ}$		−0.1113 [−0.1242, −0.0984]
$m_{V}$	−0.3152 [−1.0949, 0.4645]	−0.0196 [−0.0203, −0.0189]
$γ_{V}^{P}$	0.0009 [0.0009, 0.0009]	0.0020 [0.0019, 0.0021]
$K_{V}^{P}$	0.4457 [0.4414, 0.4500]	1.0450 [1.0129, 1.0770]
$σ_{r}$	0.0071 [−0.0048, 0.0189]	0.0258 [−0.0230, 0.0746]
$σ_{μ}$	0.0816 [−0.1823, 0.3454]	0.0190 [0.0016, 0.0365]
$σ_{θ}$		0.0101 [0.0098, 0.0104]
$σ_{V}$	0.0003 [−0.0007, 0.0013]	0.0001 [−0.0000, 0.0002]
$\underline{V}$	$1.0 \times 10^{-6}$ [$1.0 \times 10^{-6}$, $1.0 \times 10^{-6}$]	0.0001 [0.0001, 0.0001]
$c_{r μ}$	−0.4622 [−0.5098, −0.4146]	−0.3086 [−0.3226, −0.2945]
$c_{r θ}$		−0.0503 [−0.0524, −0.0482]
$c_{μ θ}$		−0.0507 [−0.0523, −0.0491]
$c_{r V}$	−0.0507 [−0.0533, −0.0481]	−0.0973 [−0.1017, −0.0929]
$c_{μ V}$	−0.2237 [−0.2456, −0.2018]	−0.0951 [−0.0989, −0.0914]
$c_{θ V}$		0.1043 [0.0992, 0.1094]
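
The point estimates and bracketed bounds in Table 2 are described as posterior means and 2.5th/97.5th percentiles. A minimal sketch of that summary follows, assuming a plain list of posterior draws for one parameter. The draws are synthetic placeholders rather than the paper's MCMC output, and the names `percentile` and `summarize` are hypothetical.

```python
import random
import statistics


def percentile(sorted_draws, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted draws."""
    pos = (len(sorted_draws) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_draws) - 1)
    frac = pos - lo
    return sorted_draws[lo] * (1.0 - frac) + sorted_draws[hi] * frac


def summarize(draws):
    """Posterior mean with 2.5th and 97.5th percentile bounds."""
    ordered = sorted(draws)
    return (statistics.fmean(ordered),
            percentile(ordered, 2.5),
            percentile(ordered, 97.5))


# Hypothetical draws standing in for one parameter's chain (not the paper's data).
rng = random.Random(0)
draws = [rng.gauss(-0.36, 0.03) for _ in range(5000)]
mean, lower, upper = summarize(draws)
print(f"{mean:.4f} [{lower:.4f}, {upper:.4f}]")
```

The printed string follows the same "estimate [lower, upper]" layout used in the table cells.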

Table 3. Posterior distribution for key risk-premium parameters for model $A_{1} (3)$ compared to $A_{1} (4)$. The parameters are fitted to the weekly approximated zero yields for the period October 2013 to September 2024. The table consists of point estimates, based on mean values of the posterior parameter distribution, and confidence interval bounds (in square brackets). Confidence interval bounds for the point estimates are computed as the 2.5th and 97.5th percentiles of the posterior distribution.

Parameter	$A_{1}$ (3) USV	$A_{1}$ (4) USV
$λ_{r 0}$	−0.0122 [−0.0144, −0.0100]	−0.0102 [−0.0104, −0.0099]
$λ_{r r}$	−0.0506 [−0.0543, −0.0469]	−0.0478 [−0.0490, −0.0465]
$λ_{r μ}$	−0.0475 [−0.0524, −0.0427]	−0.0495 [−0.0510, −0.0480]
$λ_{μ 0}$	−0.0098 [−0.0107, −0.0089]	−0.0102 [−0.0105, −0.0099]
$λ_{μ r}$	0.0947 [0.0872, 0.1021]	0.0495 [0.0478, 0.0512]
$λ_{μ μ}$	0.2783 [0.2518, 0.3047]	0.1025 [0.0961, 0.1089]
$λ_{μ V}$	47.4552 [42.8097, 52.1008]	10.2594 [9.7983, 10.7205]
$λ_{V 0}$	−0.0000 [−0.0000, −0.0000]	−0.0001 [−0.0001, −0.0001]
$λ_{V V}$	0.0579 [0.0499, 0.0659]	0.0941 [0.0890, 0.0992]
$λ_{r V}$	−10.5910 [−11.5637, −9.6183]	−1.9671 [−2.0473, −1.8868]
$λ_{r θ}$	0.0516 [0.0485, 0.0548]	0.0201 [0.0196, 0.0206]
$λ_{μ θ}$		0.0201 [0.0197, 0.0205]
$λ_{θ θ}$		0.0193 [0.0183, 0.0203]
$λ_{θ r}$		0.0208 [0.0201, 0.0215]
$λ_{θ μ}$		0.0211 [0.0200, 0.0221]
$λ_{θ V}$		0.0193 [0.0181, 0.0204]

Table 4. Estimation results for a sample of drift parameters of the $A_{1} (3)$ model. The diagnostics are reported to assess the reliability of the posterior parameter estimates.

	Mean	Std	HDI (2.5%)	HDI (97.5%)	MCSE (Mean)	MCSE (Std)	ESS (Bulk)	ESS (Tail)	$\hat{R}$
$m_{0}$	0.001	0	0.001	0.001	0	0	2326	1346	1
$m_{r}$	−0.284	0.059	−0.398	−0.171	0.001	0.001	2160	1106	1
$m_{μ}$	−0.299	0.063	−0.421	−0.177	0.001	0.001	2555	1429	1
$m_{V}$	−0.099	0.003	−0.105	−0.093	0	0	2312	1495	1
$K_{V}^{P}$	0.428	0.008	0.411	0.445	0	0	2113	1149	1
$γ_{V}^{P}$	0.007	0.005	0	0.016	0	0	975	674	1
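
The $\hat{R}$ column summarizes agreement between chains, with values near 1 indicating that the chains sample the same distribution. The sketch below implements the classic Gelman-Rubin potential scale reduction factor as an illustration. It is a simpler statistic than the rank-normalized split version that many current MCMC packages report, and the source does not say which variant was used. The chains are synthetic, and `gelman_rubin` is a hypothetical helper name.

```python
import random
import statistics


def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Synthetic chains (assumption): four well-mixed chains and two stuck chains.
rng = random.Random(1)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
stuck = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 3.0)]

print(round(gelman_rubin(mixed), 3), round(gelman_rubin(stuck), 3))
```

Chains that target the same distribution give a value close to 1, while chains centred at different locations give a value well above it.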

Table 5. Estimation results for a sample of drift parameters of the $A_{1} (4)$ model. The diagnostics are reported to assess the reliability of the posterior parameter estimates.

	Mean	Std	HDI (2.5%)	HDI (97.5%)	ESS (Bulk)	ESS (Tail)	$\hat{R}$
$m_{0}$	0.093	0	0.093	0.094	3775	1808	1
$m_{r}$	−0.227	0.01	−0.247	−0.209	2626	1503	1
$m_{μ}$	−0.051	0	−0.051	−0.051	2549	1383	1
$m_{θ}$	−0.101	0	−0.101	−0.100	2241	1407	1
$m_{V}$	−0.200	0	−0.201	−0.199	3182	1548	1
$γ_{V}^{P}$	−0.018	0.009	−0.036	−0.001	2589	1617	1
$K_{V}^{P}$	−0.032	0.011	−0.052	−0.010	2162	1650	1

Table 6. In-sample and out-sample zero yields $Y_{t}$ are compared to model-fitted estimates $\hat{Y_{t}}$ for the 0.25, 5, 10, 12, 20, 25, and 30-year maturities. The fifth column reflects the statistical significance at 5% and 1%, labelled with ** and *, respectively. Bias is determined in terms of the null hypothesis that bias is zero, while RMSE is based on a pairwise analysis of the models. The inequality sign provides the direction of a loss.

Maturity	$A_{1} (3)$ USV	Loss Direction	$A_{1} (4)$ USV	Significance
	In-sample RMSE ¹
0.25	0.0703	<	0.0683	*
5	0.0868	<	0.0856
10	0.0958	<	0.0946
12	0.0994	<	0.0984
20	0.1045	<	0.1040
25	0.1053	<	0.1043
30	0.1046	<	0.1040
	In-sample bias ²
0.25	0.0660		0.0661
5	0.0841		0.0842
10	0.0938		0.0939
12	0.0978		0.0979
20	0.1035		0.1036
25	0.1039		0.1041	**
30	0.1035		0.1036
	Out-sample RMSE ³
0.25	15.3984	<	0.0864	**
5	0.2219	<	0.0896
10	0.0780	>	0.1012
12	0.0612	>	0.1095
20	0.0389	>	0.1233
25	0.0505	>	0.1241
30	0.0256	>	0.1232
	Out-sample bias ⁴
0.25	−12.0832		0.0848	**
5	0.0855		0.0879
10	0.0078		0.0996
12	−0.0549		0.1079
20	−0.0369		0.1218
25	−0.0495		0.1227
30	−0.0201		0.1218

¹ DM t-statistic = 6.1233, p-value = 0.0009 → Reject $H_{0}$. Better model: $A_{1} (4)$ USV (based on lower RMSE).

² DM t-statistic = −8.0000, p-value = 0.0002 → Reject $H_{0}$. Better model: $A_{1} (3)$ USV is significantly less biased.

³ DM t-statistic = 0.9999, p-value = 0.3559 → Fail to reject $H_{0}$. Better model: $A_{1} (4)$ USV (based on lower RMSE).

⁴ DM t-statistic = 0.9571, p-value = 0.3755 → Fail to reject $H_{0}$. Better model: $A_{1} (4)$ USV (based on lower bias).
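
The source does not spell out how the DM statistics in these footnotes are computed. One reading, which is an assumption here, is a paired t-statistic on the seven per-maturity loss differentials rather than a Diebold-Mariano test on a time series of forecast losses with a long-run variance estimate. Under that assumption, the rounded in-sample bias columns of Table 6 give the −8.0000 of footnote ². The other footnoted statistics need not match when recomputed from rounded table entries. The helper `paired_t` is hypothetical.

```python
import math
import statistics

# In-sample bias columns of Table 6, maturities 0.25 to 30 years.
bias_a13 = [0.0660, 0.0841, 0.0938, 0.0978, 0.1035, 0.1039, 0.1035]
bias_a14 = [0.0661, 0.0842, 0.0939, 0.0979, 0.1036, 0.1041, 0.1036]


def paired_t(loss_a, loss_b):
    """t-statistic of the mean loss differential loss_a - loss_b."""
    diffs = [a - b for a, b in zip(loss_a, loss_b)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / std_err


t_bias = paired_t(bias_a13, bias_a14)
print(round(t_bias, 4))
```

A negative statistic means the first model's loss is smaller on average, which agrees with the footnote naming $A_{1} (3)$ USV as the less biased model in sample.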

Table 7. Model-implied volatility proxy for the weekly yield returns computed as a 26-week rolling standard deviation. Initially, the rolling statistic is computed for the 3-month (short-term) yields, then compared to the 12-year (mid-term) and 30-year (long-term) yields. The model-implied volatility is derived as the filtered $V_{t}$ from each model. Other time-series variables for the same maturities are the GARCH, Brent crude, USDZAR, and the three PCA factors: level, slope, and curvature. The Spearman correlation is calculated between the model-implied volatility and these variables.

	$A_{1} (3)$			$A_{1} (4)$
Maturities	0.25	12	30	0.25	12	30
Actual vs. Model Average Yield	0.964	0.964	0.964	1.000	1.000	1.000
Actual vs. Model Slope	0.974	0.974	0.974	0.919	0.933	0.933
Actual vs. Model Curvature	0.989	0.989	0.989	0.733	0.749	0.749
Rolling vs. Model Volatility	0.054	0.002	−0.037	0.550	0.157	−0.049
SA Rand Dollar vs. Model Volatility	0.007	−0.024	0.058	0.408	0.122	0.613
Brent Crude vs. Model Volatility	0.050	0.064	0.000	0.216	0.055	0.478
GARCH vs. Model Volatility	0.026	−0.040	−0.055	0.128	0.145	−0.019
Curvature vs. Model Volatility	−0.119	0.034	−0.034	−0.421	−0.049	−0.297
Curvature vs. Model Variance	−0.119	0.034	−0.034	−0.421	−0.049	−0.297
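
The two ingredients of Table 7, a 26-week rolling standard deviation and a Spearman rank correlation, can be sketched with the standard library alone. The input series are synthetic placeholders, not the SA yield data or the filtered $V_{t}$, and the function names are illustrative assumptions.

```python
import random
import statistics


def rolling_std(series, window=26):
    """Sample standard deviation over each trailing window of observations."""
    return [statistics.stdev(series[i - window:i])
            for i in range(window, len(series) + 1)]


def ranks(values):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Synthetic weekly yield changes and a stand-in volatility series (assumptions).
rng = random.Random(2)
weekly_changes = [rng.gauss(0.0, 0.1) for _ in range(120)]
realized = rolling_std(weekly_changes, window=26)
model_vol = [v + rng.gauss(0.0, 0.01) for v in realized]
print(round(spearman(realized, model_vol), 3))
```

Because the rank correlation depends only on ordering, it is insensitive to the scale difference between a filtered variance state and a realized standard deviation.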

Table 8. In-sample and out-sample one-week forecasts of two different volatility proxies, $E [∥ Δ Y ∥]$ and $\hat{σ}$, for the 0.25, 5, 10, 12, 20, 25, and 30-year maturities. The fifth column reflects the statistical significance at 5%, labelled with **. Bias is determined in terms of the null hypothesis that bias is zero, while RMSE is based on a pairwise analysis of the models. The inequality sign provides the direction of a loss.

Maturity	$A_{1} (3)$ USV	Loss Direction	$A_{1} (4)$ USV	Significance
	In-Sample RMSE of weekly $∥ Δ Y ∥$ (bps) ⁵
0.25	131.83	<	37.53	**
5	50.38	<	25.22
10	39.66	<	20.54
12	31.97	>	45.51
20	27.15	>	79.48
25	27.50	>	81.28
30	26.69	>	78.63
	In-Sample RMSE of $\hat{σ}$ (bps) ⁶
0.25	44.68	<	39.64	**
5	28.42	<	25.92
10	26.91	<	25.22
12	26.92	<	24.97
20	27.75	<	24.72
25	26.85	<	23.90
30	27.10	<	24.26
	Out-Sample RMSE of weekly $∥ Δ Y ∥$ (bps) ⁷
0.25	18.35	<	14.87
5	33.54	<	33.22
10	41.45	<	41.39
12	45.38	<	44.91
20	56.44	<	49.60
25	57.51	<	50.00
30	58.16	<	50.18
	Out-Sample RMSE of $\hat{σ}$ (bps) ⁸
0.25	15.60	<	14.41
5	29.58	<	28.56
10	38.52	<	37.26
12	41.57	<	40.24
20	39.71	<	38.18
25	40.57	<	38.88
30	40.56	<	38.82

⁵ DM t-statistic = 0.0527, p-value = 0.9597 → Fail to reject $H_{0}$. Better model: $A_{1} (4)$ USV (based on lower RMSE).

⁶ DM t-statistic = 2.4035, p-value = 0.0530 * → Fail to reject $H_{0}$. Better model: $A_{1} (4)$ USV (based on lower RMSE).

⁷ DM t-statistic = 6.7912, p-value = 0.0005 → Reject $H_{0}$. Better model: $A_{1} (4)$ USV (based on lower RMSE).

⁸ DM t-statistic = 6.7912, p-value = 0.0005 → Reject $H_{0}$. Better model: $A_{1} (4)$ USV (based on lower RMSE).

Table 9. Regressions for weekly realized volatilities $\hat{σ}$ of the 3-month and 30-year maturities. These regressions are performed on different variables. Systematic risk is reflected in the $β$ coefficient of each variable, with standard errors in square brackets. For each model, there are two rows: the first without PCA factors and the second with the PCA factors level, slope, and curvature. These regressions were performed on the time-series variables for the period 2013 to 2024. The top panel shows the regression results for the 3-month weekly realized volatility, and the bottom panel those for the 30-year weekly realized volatility.

Variable	Intercept( $α$ )	Volatility( $β$ )	$R^{2}$	Level	Slope	Curvature
		3-Month Yield Volatilities
GARCH(1,1)	0.202 [0.015]	0.395 [0.036]	0.240
	0.210 [0.013]	0.228 [0.036]	0.477	−0.025 [0.005]	−0.112 [0.009]	−0.279 [0.025]
$A_{1} (3)$ USV	1.196 [0.2619]	−8.473 [2.576]	0.027
	0.977 [0.203]	−6.876 [1.999]	0.428	−0.027 [0.005]	−0.132 [0.009]	−0.334 [0.024]
$A_{1} (4)$ USV	0.069 [0.090]	0.810 [0.274]	0.022
	0.696 [0.126]	−1.321 [0.3956]	0.427	−0.043 [0.007]	−0.173 [0.016]	−0.364 [0.025]
		30-Year Yield Volatilities
GARCH(1,1)	0.068 [0.007]	0.472 [0.045]	0.221
	0.067 [0.007]	0.447 [0.045]	0.241	−0.005 [0.002]	−0.005 [0.004]	−0.023 [0.009]
$A_{1} (3)$ USV	0.350 [0.085]	−2.144 [0.835]	0.017
	0.317 [0.084]	−1.862 [0.8262]	0.062	−0.007 [0.002]	−0.006 [0.004]	−0.032 [0.010]
$A_{1} (4)$ USV	0.038 [0.025]	0.302 [0.080]	0.035
	−0.293 [0.046]	1.393 [0.150]	0.221	0.010 [0.003]	0.037 [0.006]	−0.032 [0.009]
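
The first row for each model in Table 9 is a univariate regression of realized volatility on a volatility measure, reporting $α$, $β$ and $R^{2}$. A minimal least-squares sketch of that specification follows. It assumes a single regressor, omits standard errors and the PCA controls, and uses made-up numbers rather than the paper's series; `ols` is a hypothetical helper.

```python
import statistics


def ols(x, y):
    """Least-squares fit of y = alpha + beta * x, returning (alpha, beta, R^2)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    ss_res = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return alpha, beta, 1.0 - ss_res / ss_tot


# Made-up regressor and realized-volatility values (assumptions, not paper data).
model_volatility = [0.10, 0.12, 0.15, 0.11, 0.18, 0.20, 0.16]
realized_volatility = [0.24, 0.25, 0.27, 0.26, 0.28, 0.29, 0.26]
alpha, beta, r_squared = ols(model_volatility, realized_volatility)
print(f"alpha={alpha:.3f} beta={beta:.3f} R2={r_squared:.3f}")
```

A low $R^{2}$ in such a fit, as in the rows without PCA factors, indicates that the regressor alone explains little of the variation in realized volatility.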

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
