Forecasting the Term Structure of Interest Rates with Dynamic Constrained Smoothing B-Splines

Abstract: The Nelson–Siegel framework published by Diebold and Li created an important benchmark and originated several works in the literature on forecasting the term structure of interest rates. However, these frameworks were built on top of a parametric curve model that may fit sensible term structure shapes poorly, which affects forecast results. We propose DCOBS, a dynamic constrained smoothing B-splines yield curve model with no-arbitrage restrictions. Even though DCOBS may provide more volatile forward curves than parametric models, they are still more accurate than those from Nelson–Siegel frameworks. DCOBS has been evaluated on ten years of US Daily Treasury Yield Curve Rates, and it is consistent with stylized facts of yield curves. DCOBS has great predictive power, especially in short- and medium-term forecasts, and has shown greater stability and lower root mean square errors than an Arbitrage-Free Nelson–Siegel model.


Introduction
Forecast methods applied to a term structure of interest rates are important tools not only for banks and financial firms, or governments and policy makers, but for society itself, helping to understand the movements of markets and flows of money. Several works have been done during the past few decades in order to predict the dynamics of term structure of interest rates. This paper presents a dynamic version of the constrained smoothing B-splines model to forecast the yield curve with no-arbitrage restrictions.
A complete term structure of interest rates does not exist in the real world. Observable market data are discrete points that relate interest rates to maturity dates. Since it is unlikely that there will be an available contract in the market for every maturity needed by practitioners, a continuous curve model is necessary. The importance of these models is crucial for pricing securities, for instance. The first modeling technique that comes to mind is interpolation. With interpolation, one can indeed obtain an adherent fit, but it can easily lead to unstable curves since market data are subject to many sources of disturbance.
The literature describes two approaches for estimating the term structure of interest rates: a statistical approach and an equilibrium approach. The equilibrium approach makes use of theories that describe the overall economy in terms of state variables and its implications on short-term interest rates Cox et al. (1985); Duffie and Kan (1996); Vasicek (1977). In the statistical approach, the construction of the yield curve relies on data observed in the market Heath et al. (1992); Hull and White (1990). This observed data can be smoothed with parametric or nonparametric methods. Parametric methods have functional forms and their parameters can have economic interpretations such as a Nelson-Siegel model Nelson and Siegel (1987) or the Svensson model Svensson (1994). One advantage is that restrictions on parameters can be added so it copes with convenient economic theories such as the arbitrage-free set. However, its functional form makes parametric methods less flexible to fit observed data. This lack of adherence to data can make its practical usage inappropriate, especially in asset pricing and no-arbitrage applications due to misspecification Laurini and Moura (2010). The model can produce yield curves with theoretical integrity but without reflecting the reality. On the other hand, nonparametric methods do not assume any particular functional form and consequently they are very flexible and can be very robust if combined with appropriate conditions. After almost 50 years since the publication of the first yield curve models McCulloch (1971), just recently the yield curve dynamics became an essential topic. With the publication of the Dynamic Nelson-Siegel (DNS) model by Diebold and Li (2003), the subject became established. Even though the dynamics of term structure play a vital role in macroeconomic studies, Diebold and Li argued that until then little attention had been paid to forecasting term structures. They gave two reasons for this lack of interest. 
Firstly, they stated that no-arbitrage models had little to say about term structure dynamics. Secondly, based on the work of Duffee (2002), they assumed that affine equilibrium models (see footnote 1) forecast poorly. Therefore, there was a belief that the dynamics of yield curves could not be forecast with parsimonious models.
In order to challenge this idea, Diebold and Li proposed the DNS model using a Nelson-Siegel yield curve fitting to forecast its dynamics. This model became very popular among financial market users and even central banks around the world. It is parsimonious and stable. In addition, the Nelson-Siegel model imposes some desired economic properties such as discount function approaching zero as maturity evolves and its factors representing short-, medium-, and long-term behaviors.
In practice, the forecast results of DNS are remarkable, but, despite its theoretical and empirical success, DNS does not impose restrictions against arbitrage opportunities. Consequently, practitioners could be exposed to critical financial risks, as the pricing of assets that depend on interest rates relies on arbitrage-free theory. In order to mitigate these risks, Christensen et al. (2011) introduced a class of Arbitrage-Free Nelson-Siegel (AFNS) models. They are affine term structure models that keep the DNS structure and incorporate no-arbitrage restrictions. Tourrucôo et al. (2016) list several appealing features of AFNS. Namely, they keep the desired economic properties of the three-factor model of the original DNS structure. They also ensure the absence of arbitrage opportunities with a simpler structure than the affine arbitrage-free models published previously by Duffie and Kan (1996) and Duffee (2002). This is achieved by adding a yield-adjustment term, described by a system of ordinary differential equations, to the Nelson-Siegel yield curve model. Tourrucôo et al. (2016) argue that, in long forecast horizons, the AFNS model with uncorrelated factors delivers the most accurate forecasts. Their conclusion is that no-arbitrage is indeed helpful, but only for longer forecasting horizons.
Barzanti and Corradi (2001) published earlier work on the use of constrained smoothing B-splines to overcome some difficulties in estimating term structures of interest rates with ordinary cubic splines. They computed the B-spline coefficients as a least squares problem. Laurini and Moura (2010), however, proposed constrained smoothing B-splines with a different methodology. This methodology was initially proposed by He and Shi (1998) and He and Ng (1999) as a general tool to smooth data with certain qualitative properties such as monotonicity and concavity or convexity constraints.
Roughly, the methodology builds the yield curve as an L1 projection of a smooth function into the space of B-splines. It is achieved by estimating a conditional median function as described in the quantile regression theory of Koenker and Bassett (1978). A great advantage is that, being a conditional median function, it is robust to outliers. In addition, its formulation as a linear programming problem allows us to impose several constraints without a substantial increase in computational costs.

Footnote 1: The expression "affine term structure model" describes any arbitrage-free model in which bond yields are affine (constant-plus-linear) functions of some state vector x. For further reading, we recommend Piazzesi (2010).
Our present work proposes DCOBS, a dynamic constrained smoothing B-splines model to forecast the term structure of interest rates. DCOBS describes the coefficients of the yield curve model proposed by Laurini and Moura (2010) as processes evolving over time. Even though the constrained smoothing B-splines specification provides full automation in knot mesh selection, we could not use it in a dynamic framework setting. In order to build a common ground and observe curve shapes evolving over time, knots were fixed to capture short-, medium-, and long-term behavior according to observed data. These knots were distributed equally in the dataset, so each daily curve had the same number of coefficients, and it was possible to run a statistical regression. DCOBS has shown great predictability in the short term and remained stable in the long term.
In Sections 2-4, we present a brief introduction to the fundamental concepts of dynamic Nelson-Siegel models. In Section 5, we introduce the DCOBS model. Section 6 presents the dataset used for fitting and forecasting the US Daily Treasury Yield Curve Rates. In Section 7, we study the outputs from a time series of fitted yield curves. Finally, in Section 8, we close the work with our conclusions.
The main contributions of this paper are:
• A complete formulation of no-arbitrage constrained smoothing B-splines in terms of objective functions and linear constraint equations;
• A dynamic framework of constrained smoothing B-splines (DCOBS) described as AR(1) processes;
• A software program that fits several curve fitting models, including no-arbitrage constrained smoothing B-splines.

Term Structure of Interest Rates
In this paper, interest rates are treated as a multidimensional variable that represents the return on investment expressed by three related quantities: spot rate, forward rate, and the discount value.
Each of these quantities depends on several economic, political, and social factors, such as the supply and demand of money and the expectation of its future value, risk and trust perception, and the consequences of political acts. The term structure of interest rates is a valuable tool not only for banks and financial firms, or governments and policy makers, but also for society itself, helping to understand the movements of markets and flows of money.
It is assumed that fixed income government bonds can be considered risk-free, so we can define a special type of yield, the spot interest rate \(s(\tau)\). This function is the return of a fixed income zero-coupon risk-free bond that expires in \(\tau\) periods. Today's price of such a financial instrument whose future value is $1.00, assuming that its interest rate is continuously compounded, is given by the discount function \(d(\tau)\), represented by

\[ d(\tau) = e^{-s(\tau)\,\tau}. \tag{1} \]

The relationship between the discount value and the spot rate can be recovered by

\[ s(\tau) = -\frac{\ln d(\tau)}{\tau}. \]
Based on the available bonds in the market with different maturities, it is possible to plan at an instant a financial transaction that will take place in another future instant, starting at the maturity of the shorter bond and expiring at the maturity of the longer bond. The interest rate of this future transaction is called the forward rate.
Consider a forward contract traded at the present day at \(\tau_P = 0\). This contract arranges an investment in the future that starts at the settlement date at time \(\tau\). This investment will be kept until the maturity date, at time \(\tau_M > \tau\). Then, the implied continuously compounded forward rate is related to the spot rate according to

\[ f(\tau, \tau_M) = \frac{\tau_M\, s(\tau_M) - \tau\, s(\tau)}{\tau_M - \tau}. \]

The instantaneous forward rate or short rate \(f(\tau)\) is defined by

\[ f(\tau) = \lim_{\tau_M \to \tau} f(\tau, \tau_M). \]

That is, the short rate \(f(\tau)\) is the forward rate for a forward contract with an infinitesimal investment period after the settlement date. The forward rate can be seen as the marginal increase in the total return from a marginal increase in the length of the investment Svensson (1994). The spot rate \(s(\tau)\) is defined by

\[ s(\tau) = \frac{1}{\tau} \int_0^{\tau} f(x)\, dx. \tag{2} \]

Note that the spot rate is the average of the instantaneous forward rates with settlement between the trade date 0 and the maturity date \(\tau\). From (1) and (2), the discount function and forward rate may be written as

\[ d(\tau) = \exp\left(-\int_0^{\tau} f(x)\, dx\right), \qquad f(\tau) = -\frac{d'(\tau)}{d(\tau)}. \]

A yield curve is a function of the interest rates of bonds that share the same properties except for their maturities. A yield curve of spot rates is called the term structure of interest rates Cox et al. (1985).
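The relations among the spot rate, the discount function, and the implied forward rate can be checked numerically. The following is a minimal sketch under continuous compounding; the function names and the quoted rates (2% for one year, 3% for two years) are hypothetical and chosen only for illustration.

```python
import math

def discount(spot, tau):
    """Discount factor d(tau) = exp(-s(tau) * tau), continuous compounding."""
    return math.exp(-spot * tau)

def spot_from_discount(d, tau):
    """Inverse relation s(tau) = -ln d(tau) / tau."""
    return -math.log(d) / tau

def forward(spot_settle, tau, spot_mat, tau_m):
    """Implied forward rate between settlement tau and maturity tau_m > tau."""
    return (tau_m * spot_mat - tau * spot_settle) / (tau_m - tau)

# Hypothetical quotes: 2% spot for 1 year, 3% spot for 2 years.
d1 = discount(0.02, 1.0)
d2 = discount(0.03, 2.0)
f12 = forward(0.02, 1.0, 0.03, 2.0)  # rate for the second year, about 4%

# Investing for one year and rolling into the forward contract
# gives the same discount factor as the two-year bond.
rolled = d1 * math.exp(-f12 * 1.0)
```

In this example `rolled` and `d2` coincide up to rounding, which is the consistency that the forward rate definition encodes.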
Yield curves of coupon-bearing bonds are not equivalent to yield curves of zero-coupon bonds with same maturity dates Svensson (1994). Therefore, yield curves for coupon-bearing bonds should not be used as direct representations of the term structure of interest rates.
In the real world, the term structure of interest rates has a discrete representation. Using interpolation techniques, we can represent the term structure of interest rates in a continuous way. Such a continuous representation provides a valuable tool for calculating the spot rate at any given interval.
As pointed out by Diebold and Li (2003), the classical approaches to model the term structure of interest rates are equilibrium models and no-arbitrage models.
Equilibrium models Cox et al. (1985); Duffie and Kan (1996); Vasicek (1977) construct the term structure of interest rates from economic variables to model a stochastic process for the short rate dynamic. Then, spot rates can be obtained under risk premium assumptions, that is, considering what investors expect as an extra return relative to risk-free bonds.
On the other hand, no-arbitrage models focus on perfectly fitting the term structure of interest rate on observed market spot rates so that there is no arbitrage opportunity. A major contribution to no-arbitrage models was given by Hull and White (1990) and Heath et al. (1992).
In the work of Diebold and Li (2003), neither the equilibrium model nor the no-arbitrage model is used to model the term structure of interest rates. Instead, they use the exponential components framework of Nelson and Siegel (1987). They do so because they claim their model produces encouraging results for out-of-sample forecasting. In addition, at that time, little attention had been paid in the research on both no-arbitrage and equilibrium models to the dynamics and forecasting of interest rates. Diebold and Li (2003) proposed a dynamic version of the Nelson-Siegel yield curve model, which is presented in the next section.

Dynamic Nelson-Siegel
The Nelson-Siegel yield curve model is

\[ NS(\tau) = \beta_1 L_1(\tau) + \beta_2 L_2(\tau) + \beta_3 L_3(\tau), \tag{3} \]

where \(L_1(\tau) = 1\), \(L_2(\tau) = \frac{1 - e^{-\lambda\tau}}{\lambda\tau}\), \(L_3(\tau) = \frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau}\), and the parameter \(\lambda\) is a constant that governs the exponential decay rate of the curve. A yield curve is fitted according to the Nelson-Siegel model to relate yields and maturities of available contracts for a specific day. We will refer to such a yield curve as the Nelson-Siegel static yield curve. Let \(\theta\) be \(\{\beta_1, \beta_2, \beta_3\}\). The curves are fitted by constructing a simplex solver which computes appropriate values for \(\theta\) to minimize the distance between \(NS(\tau)\) and market data points. The coefficients \(\beta_1\), \(\beta_2\), and \(\beta_3\) are interpreted as three latent dynamic factors. The loading on \(\beta_1\) is constant and does not change in the limit; then, \(\beta_1\) can be viewed as a long-term factor. The loading on \(\beta_2\) starts at 1 and decays quickly to zero; then, \(\beta_2\) can be viewed as a short-term factor. Finally, the loading on \(\beta_3\) starts at zero, increases, and decays back to zero; then, \(\beta_3\) can be viewed as a medium-term factor.
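The behavior of the three loadings can be seen by evaluating them directly. The sketch below assumes the standard Nelson-Siegel loadings; the function names, the coefficient values, and the decay parameter `lam = 0.0609` (with maturities in months) are illustrative assumptions, not values estimated in this paper.

```python
import math

def ns_loadings(tau, lam):
    """Return (L1, L2, L3) at maturity tau > 0 for decay parameter lam > 0."""
    x = lam * tau
    l2 = (1.0 - math.exp(-x)) / x   # starts at 1, decays to 0
    l3 = l2 - math.exp(-x)          # starts at 0, humps, decays to 0
    return 1.0, l2, l3

def ns_yield(tau, betas, lam):
    """Nelson-Siegel yield: b1 * L1 + b2 * L2 + b3 * L3."""
    b1, b2, b3 = betas
    l1, l2, l3 = ns_loadings(tau, lam)
    return b1 * l1 + b2 * l2 + b3 * l3

# Illustrative parameters (hypothetical): level 5%, slope -2%, curvature 1%.
betas = (0.05, -0.02, 0.01)
lam = 0.0609
curve = [(tau, ns_yield(tau, betas, lam)) for tau in (3, 12, 36, 120, 360)]
```

At very short maturities the yield approaches the sum of the first two coefficients, and at very long maturities it approaches the first coefficient, which matches the short-term and long-term readings of the factors.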
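The Dynamic Nelson-Siegel (DNS) model is defined by

\[ NS_t(\tau) = \beta_{1,t} L_1(\tau) + \beta_{2,t} L_2(\tau) + \beta_{3,t} L_3(\tau) + \varepsilon_t, \]

where the coefficients \(\beta_{i,t}\) are AR(1) processes defined by

\[ \beta_{i,t} = c_i + \phi_i\, \beta_{i,t-1} + \eta_{i,t}, \qquad i = 1, 2, 3. \]

The parameters \(c_i\) and \(\phi_i\) are estimated by maximum likelihood for the ARIMA model. The coefficients \(\beta_{i,t}\) are predicted as AR(1) over a dataset of T daily market observations. Furthermore, \(\varepsilon_t \sim N(0, \sigma^2)\) and \(\eta_{i,t} \sim N(0, \sigma_i^2)\) are independent errors. Since the yield curve model depends only on \(\beta_{1,t}\), \(\beta_{2,t}\), and \(\beta_{3,t}\), forecasting the yield curve is equivalent to forecasting \(\beta_{1,t}\), \(\beta_{2,t}\), and \(\beta_{3,t}\).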
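The forecasting step for a single coefficient series can be sketched as follows. This is a simplified illustration: it estimates the AR(1) parameters by ordinary least squares rather than by the maximum likelihood ARIMA estimation used in the paper, and the function names and the toy series are hypothetical.

```python
def fit_ar1(series):
    """Estimate c and phi in x_t = c + phi * x_{t-1} + eta_t by least squares.

    Assumption: least squares stands in for maximum likelihood here.
    """
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    sxx = sum((a - mean_prev) ** 2 for a in x_prev)
    sxy = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    phi = sxy / sxx
    c = mean_next - phi * mean_prev
    return c, phi

def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion to get point forecasts for 1..steps ahead."""
    path = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        path.append(value)
    return path

# Toy series generated exactly by x_t = 1 + 0.5 * x_{t-1}, starting at 0.
series = [0.0, 1.0, 1.5, 1.75, 1.875]
c, phi = fit_ar1(series)
path = forecast_ar1(series[-1], c, phi, steps=2)
```

Forecasting the yield curve then amounts to running this recursion for each coefficient and plugging the predicted coefficients into the curve equation.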
Equivalently, the factors for long-term, short-term, and medium-term can also be interpreted, respectively, in terms of level, slope, and curvature of the model. Diebold and Li (2003) use these interpretations to claim that the historical stylized facts of the term structure of interest rates can be replicated by fitting the three factors, which means that the model can replicate yield curve geometric shapes.
For the US market, Diebold and Li (2003) show that the DNS model outperforms traditional benchmarks such as the random walk model, even though Vicente and Tabak (2008) state that the model does not outperform a random walk for short-term forecasts (one-month ahead).

Arbitrage-Free Nelson-Siegel
The Arbitrage-Free Nelson-Siegel (AFNS) static model for daily yield curve fitting was derived by Christensen et al. (2011) from the standard continuous-time affine arbitrage-free formulation of Duffie and Kan (1996). The AFNS model almost matches the NS model, except for the yield-adjustment term \(-\frac{C(\tau,\tau_M)}{\tau_M-\tau}\). In fact, the definition of the AFNS static model in Christensen et al. (2011) is given by

\[ AFNS(\tau) = \beta_1 L_1(\tau) + \beta_2 L_2(\tau) + \beta_3 L_3(\tau) - \frac{C(\tau,\tau_M)}{\tau_M-\tau}. \]

The AFNS model built by Christensen et al. (2011) considers the mean levels of the state variable under the Q-measure at zero, i.e., \(\theta^Q = 0\). Considering a general volatility matrix \(\Sigma\) (not related to the dynamic model for forecasting the yield curve), Christensen et al. (2011) show that an analytical form of the yield-adjustment term can be derived as [equation missing in source]. They also estimated the general volatility matrix for maturities measured in years as \(\hat{\Sigma}\) = [matrix missing in source].

Note that the adjustment term \(C(\tau, \tau_M)\) is time-independent. In other words, it is a deterministic function that depends only on the maturity of the bond. Thus, let the auxiliary function \(\Gamma(\tau)\) denote the yield-adjustment term, sign included, as a function of maturity. As in the Dynamic Nelson-Siegel model, the Dynamic AFNS model describes the AFNS static model evolving over time. The Dynamic AFNS model is defined by

\[ AFNS_t(\tau) = \beta_{1,t} L_1(\tau) + \beta_{2,t} L_2(\tau) + \beta_{3,t} L_3(\tau) + \Gamma(\tau) + \varepsilon_t, \tag{7} \]

where the loadings \(L_1(\tau)\), \(L_2(\tau)\), and \(L_3(\tau)\) are the usual functions of (3) and the coefficients \(\beta_{i,t}\) are autoregressive processes described by

\[ \beta_{i,t} = c_i + \phi_i\, \beta_{i,t-1} + \eta_{i,t}, \]

where the parameters \(c_i\) and \(\phi_i\) and the coefficients \(\beta_{i,t}\) are estimated and predicted as described in Section 3. If \(\beta_{i,t}\) is a linear function of \(\beta_{j,t}\) where \(i \neq j\), or there is a cointegration, the component \(\beta_{i,t}\) can be predicted as a linear function of \(\beta_{j,t}\), and the model can be simplified.

Dynamic Constrained Smoothing B-Splines
Constrained Smoothing B-Splines is a methodology first proposed by He and Shi (1998) and then formalized by He and Ng (1999) as a proper algorithm. Constrained Smoothing B-Splines extends smoothing splines to conditional quantile function estimation and then formulates the model as a linear programming problem that can incorporate constraints such as monotonicity, convexity, and boundary conditions. Laurini and Moura (2010) applied this methodology as a static model to fit daily yield curves along with no-arbitrage constraints. The estimation of the daily term structure of interest rates is set to be a conditional median estimation that is robust to outliers. This model produces yield curves as an L1 projection into the space of B-splines. The flexible nature of B-splines and the arbitrage-free constraints make the model a powerful tool that balances financial meaning and adherence to data while avoiding overfitting.
Our main contribution is the proposal of the Dynamic Constrained Smoothing B-splines (DCOBS) model that describes the static model evolving over time.
DCOBS is estimated by the Penalized Least Absolute Deviation

\[ \min_{\theta} \sum_{i=1}^{n} \left| y_i - \sum_{j=1}^{C} a_j B_j(\tau_i) \right| + \Lambda \max_{\tau} \left| \sum_{j=1}^{C} a_j B_j''(\tau) \right|, \tag{8} \]

where n is the number of contracts available in the reference day, \(y_i\) are market yields of the contracts, C = N + m is the number of coefficients, N is the number of internal knots, \(\theta = (a_1, \ldots, a_C)\) is the coefficient vector to be estimated, \(B_j\) are the B-spline basis functions, and \(\tau_i\) are distinct maturities of the contracts. As in Laurini and Moura (2010), the model is configured with m = 3 as the order for a quadratic B-spline basis. The selection of the smoothing parameter \(\Lambda\) is automated with the generalized cross validation (Leave-One-Out GCV) method of Fisher et al. (1995). The formulation in (8) can be rewritten as

\[ \min_{\theta} \sum_{i=1}^{n} \left| y_i - \sum_{j=1}^{C} a_j B_j(\tau_i) \right| + \Lambda \max_{k} \left| \sum_{j=1}^{C} a_j B_j''(t_k) \right|, \]

where k = 1, ..., N and \(t_k\) is an internal knot position. The static model defined by (8) can be implemented as an equivalent linear programming problem that minimizes the objective function z such that [equation missing in source]. Each yield observed in the market will produce five linear constraint equations: two constraints for fitting the curve, one constraint for smoothing, and two constraints for no-arbitrage conditions.

The fitting constraints are [equation missing in source]. The smoothing constraint is [equation missing in source]. Finally, the no-arbitrage constraints are [equation missing in source].

The resulting fitted yield curve \(\hat{s}\) is a conditional median function represented by quadratic smoothing B-splines. Now, we propose the Dynamic Constrained Smoothing B-Splines model by

\[ \hat{s}_t(\tau) = \sum_{j=1}^{C} a_{j,t} B_j(\tau), \]

where the coefficients \(a_{j,t}\) are autoregressive processes described by

\[ a_{j,t} = c_j + \phi_j\, a_{j,t-1} + \eta_{j,t}, \]

where the parameters \(c_j\) and \(\phi_j\) and the coefficients \(a_{j,t}\) are estimated and predicted as described in Sections 3 and 4. If \(a_{i,t}\) is a linear function of \(a_{j,t}\) where \(i \neq j\), or there is a cointegration, the component \(a_{i,t}\) can be predicted as a linear function of \(a_{j,t}\), and the model can be simplified.
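The idea of solving a least absolute deviation problem as a linear program can be illustrated with a stripped-down example. The sketch below is not the paper's formulation: it omits the smoothing penalty, the B-spline basis, and the no-arbitrage constraints, and it uses a plain design matrix `X` in place of the basis values \(B_j(\tau_i)\). The function name `lad_fit` and the data are hypothetical; `scipy.optimize.linprog` is assumed to be available.

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(X, y):
    """Minimize sum_i |y_i - X_i a| by linear programming.

    Each residual is split as u_i - v_i with u_i, v_i >= 0, so the
    absolute value becomes the linear cost u_i + v_i.
    """
    n, p = X.shape
    # Decision vector: [a (p free coefficients), u (n), v (n)].
    cost = np.concatenate([np.zeros(p), np.ones(2 * n)])
    # Equality constraints: X a + u - v = y.
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]

# Points on the line y = 1 + 2x, with one gross outlier at x = 2.
x = np.arange(5, dtype=float)
y = 1.0 + 2.0 * x
y[2] = 25.0
X = np.column_stack([np.ones_like(x), x])
coef = lad_fit(X, y)  # median fit ignores the outlier
```

The fit recovers the line through the four clean points, which illustrates the robustness to outliers of a conditional median. Further linear restrictions, such as smoothing or shape constraints, would enter the same program as additional constraint rows.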
The DCOBS model extends the static model and extrapolates the temporal axis creating a surface of fitted curves. Figure 1 displays a visual idea of the differences between each yield curve model and the superiority of fitting of DCOBS over AFNS.

Descriptive Analysis
This work used 10 years (2007-2017) of public historical data of US Daily Treasury Yield Curve Rates. The data were partitioned so that 2007-2016 was the sample data set and 2017 was the test data set. A US federal holidays dataset was built for auxiliary calculations of business days.
The sample data set had 27,544 data points in 2504 reference dates spanning between 2 January 2007 and 30 December 2016. The term structure horizon spans between 22 and 7920 business days. The sample data set originated 5008 yield curves, with 2504 AFNS and 2504 DCOBS.
By the nature of parametric models, there is no challenge in relating the coefficients in a time-dependent process. Thus, Arbitrage-Free Nelson-Siegel yield curves are fitted using the raw sample data.
On the other hand, nonparametric B-spline models depend on the positions of their knots and data points. Therefore, a two-step normalization procedure was applied so that the coefficients could be related in a time-dependent process. The first step normalizes the horizon length. The second step normalizes the data point positions.
In the first step, the Nelson-Siegel model is applied to extrapolate the horizon and calculate the yields on the boundaries of the term structure. The largest curve was picked from the dataset with a horizon of 7920 days.
For the second step, an auxiliary DCOBS curve was built with knots equally distributed across the horizon. With the resulting fitted curve, we calculated the normalized term structure by evaluating the auxiliary curve at the points (0, s(132), s(594), s(1320), s(7920)). These knots were selected based on observed data and the overall fitting quality they produced.
In this analysis, 2504 yield curves were generated by our computational program with each of the AFNS and DCOBS methods. For each curve, the AFNS method produced three coefficients while DCOBS produced five coefficients. The resulting DCOBS yield curves had a lower Root Mean Square Error than AFNS in every year of the sample data set, as shown in Table 1. The difference in fit between the two methods can be seen in Figure 1 for the yield curve on 2 January 2008.

Results
The time series for AFNS coefficients can be seen in Figure 2. Coefficients β 1,t and β 3,t may be cointegrated, so we ran a two-step Engle-Granger cointegration test Engle and Granger (1987). The linear regression of β 3,t explained by β 1,t returned an intercept of −0.10 and a coefficient of 0.84. Applying the Augmented Dickey-Fuller unit root test to the regression residuals yielded a statistic of −2.68. Comparing this statistic with the critical values for the cointegration test of Engle and Yoo (1987) leads us to reject the unit root hypothesis, that is, the residuals are stationary. The conclusion is that there is cointegration between β 1,t and β 3,t .
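The two steps of the Engle-Granger procedure can be sketched on simulated data. The sketch below is a simplification: the second step uses a plain Dickey-Fuller regression without lagged differences, whereas the paper applies the augmented version. All function names and the simulated series are hypothetical.

```python
import math
import random

def ols(x, y):
    """Return (intercept, slope) of the least squares line y = a + b x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def df_statistic(e):
    """t-statistic of rho in delta e_t = rho * e_{t-1} + error (no lags)."""
    lagged = e[:-1]
    delta = [b - a for a, b in zip(e[:-1], e[1:])]
    sxx = sum(a * a for a in lagged)
    rho = sum(a * d for a, d in zip(lagged, delta)) / sxx
    resid = [d - rho * a for a, d in zip(lagged, delta)]
    sigma2 = sum(r * r for r in resid) / (len(resid) - 1)
    return rho / math.sqrt(sigma2 / sxx)

def engle_granger(x, y):
    """Step 1: regress y on x. Step 2: unit root statistic on the residuals."""
    intercept, slope = ols(x, y)
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, df_statistic(residuals)

# Simulated cointegrated pair: x is a random walk, y = 0.5 + 2 x + noise.
rng = random.Random(0)
x = [0.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
y = [0.5 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
intercept, slope, stat = engle_granger(x, y)
```

Because the residuals come from an estimated regression, the statistic has to be compared with cointegration-specific critical values such as those of Engle and Yoo (1987), not with the standard Dickey-Fuller tables.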
In economic terms, the cointegration describes a strong relationship between long- and medium-term contracts, which can be a result of a political measure or some market characteristics that stimulated the issuance of long-term contracts based on the price of medium-term contracts and vice versa.
The time series for β 1,t and β 2,t in (7) are modeled as AR(1) processes with one order of differencing. The estimated φ for β 1,t is −0.04 and for β 2,t is 0.13.
As with the AFNS coefficient time series, the DCOBS estimated coefficients a 2,t and a 3,t seem to cointegrate in Figure 3, as do coefficients a 4,t and a 5,t , so we ran a two-step Engle-Granger cointegration test. The linear regression of a 3,t explained by a 2,t returned an estimated intercept of −7.61 and a coefficient of 8.60. The linear regression of a 5,t explained by a 4,t returned an estimated intercept of 0.11 and an estimated coefficient of 0.43. Applying the Augmented Dickey-Fuller unit root test to the regression residuals yielded a statistic of −5.48 for estimated coefficients a 2,t and a 3,t . The same test yielded a statistic of −2.41 for estimated coefficients a 4,t and a 5,t . Comparing these statistics with the critical values in Engle and Yoo (1987) leads us to reject the unit root hypothesis, that is, the residuals are stationary. The conclusion is that there is cointegration between a 2,t and a 3,t as well as between a 4,t and a 5,t . Since B-spline coefficients have a more local behavior, these cointegrations give a more detailed analysis than the AFNS model. They reveal the binding between short- and medium-term contracts on one side and medium- and long-term contracts on the other side. As in the economic interpretation of the AFNS model, this is an important feature of the model because it shows investors and policy makers how strongly the supply and demand for one type of contract can influence the price of another type of contract.
The time series for a 2,t and a 4,t are modeled as AR(1) processes with one order of differencing. The estimated φ for a 2,t is 0.02 and for a 4,t is −0.03. Then, a 3,t and a 5,t are linear functions of a 2,t and a 4,t , respectively.
The time series modeled as AR(1) processes above were used to make the out-of-sample forecast with a horizon of 250 business days, the number of business days in the 2017 test data set. Three reference dates were considered for evaluation: 1 month (short-term), 6 months (medium-term), and 12 months (long-term). Figure 4 shows short-, medium-, and long-term forecasts for AFNS and DCOBS curves. In a 1-month forecast, DCOBS fits the term structure well at both the short and the long end of the curve, as the curve follows the data points. AFNS shows heavy instability at the beginning of the curve in all forecasts, although it performs well at long maturities.
We compared both forecast techniques using the Diebold-Mariano accuracy test Diebold and Mariano (1995) with the alternative hypothesis that DCOBS outperforms AFNS. As stated before, DCOBS outperforms AFNS in the short-term prediction. The absolute value of the Diebold-Mariano statistic for a one-month forecast is greater than 1.96, so the null hypothesis that both techniques have the same accuracy is rejected. On the other hand, for 6-month and 12-month forecasts, the absolute value of the Diebold-Mariano statistic stays lower than 1.96, which means that both techniques may have the same predictive accuracy. Table 2 shows the root mean square errors of the forecasts, and Table 3 shows the Diebold-Mariano statistics.
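The statistic compared with 1.96 above can be sketched in a few lines. This is a minimal version under stated assumptions: squared-error loss and no correction for autocorrelation in the loss differential, which is a simplification that fits one-step-ahead forecasts; multi-step forecasts would need a long-run variance estimate instead. The function name and the error values are hypothetical.

```python
import math

def diebold_mariano(errors_a, errors_b):
    """Diebold-Mariano statistic with squared-error loss.

    Positive values mean forecast A has the larger average loss.
    Assumption: the loss differential is serially uncorrelated.
    """
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((di - mean_d) ** 2 for di in d) / n
    return mean_d / math.sqrt(var_d / n)

# Hypothetical forecast errors of two models on the same four dates.
errors_a = [2.0, 1.0, 2.0, 1.0]
errors_b = [1.0, 0.0, 1.0, 0.0]
dm = diebold_mariano(errors_a, errors_b)
```

Under the null hypothesis of equal accuracy the statistic is asymptotically standard normal, so an absolute value above 1.96 rejects equal accuracy at the 5% level in a two-sided test.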

Conclusions
In this work, we have proposed DCOBS, a methodology for forecasting the dynamics of the term structure of interest rates extending the Constrained Smoothing B-Splines curve model.
The results have shown the great predictive power of the DCOBS model in the short and medium term, which is extremely important for traders and other financial market specialists. In comparison, the AFNS model has shown poor fitting, as seen in Figure 1, and a lack of stability at the beginning of the curves. Even though the accuracy of DCOBS in the medium and long term is statistically equivalent to that of AFNS, the stability of DCOBS can certainly be explored in future work to improve its predictive quality.
Finally, DCOBS can be a powerful tool to be applied in other areas like biology, physics, earth sciences, etc.

Funding: This research was funded by FAPESP 18/04654-9.

Conflicts of Interest:
The authors declare no conflict of interest.