# Deep Arbitrage-Free Learning in a Generalized HJM Framework via Arbitrage-Regularization

## Abstract

## 1. Introduction

## 2. The Arbitrage-Regularization Problem

**Assumption 1.**

- (i) ${\beta}_{t}$ is an ${\mathbb{R}}^{d}$-valued diffusion process which is the unique strong solution to $${\beta}_{t}={\beta}_{0}+{\int}_{0}^{t}\alpha (s,{\beta}_{s})\,ds+{\int}_{0}^{t}\sigma (s,{\beta}_{s})\,d{W}_{s},$$ where $\sigma (s,\beta )\sigma {(s,\beta )}^{\top}$ is a continuous function of $\beta$ for any fixed $s\ge 0$.
- (ii) The stochastic differential equation (10) has a unique ${\mathbb{R}}^{d}$-valued solution for each ${\beta}_{0}\in {\mathbb{R}}^{d}$.
- (iii) For every $u\in \mathcal{U}$, ${\left\{{S}_{t}(\cdot ,\cdot ;u)\right\}}_{t\in [0,\infty )}$ is a non-anticipative functional in ${\mathbb{C}}_{b}^{1,2}$ verifying the following "predictable-dependence" condition of Fournie (2010): $${S}_{t}({x}_{t},{v}_{t};u)={S}_{t}({x}_{t},{v}_{{t}_{-}};u),$$ for all $t\in [0,\infty )$ and all $(x,v)\in D([0,t];{\mathbb{R}}^{d})\times D([0,t];{S}_{+}^{d})$, where ${S}_{+}^{d}$ is the set of $d\times d$ positive semi-definite matrices with real coefficients.
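Assumption 1(i) posits that the factor process ${\beta}_{t}$ solves a stochastic differential equation with drift $\alpha$ and diffusion coefficient $\sigma$. As a minimal sketch of what such a process looks like numerically, the following simulates one path with an Euler-Maruyama discretization. The scheme, the function name `euler_maruyama`, and the coefficients `ou_drift` and `const_vol` are illustrative assumptions and do not come from the source.

```python
import math
import random


def euler_maruyama(beta0, alpha, sigma, horizon, n_steps, seed=0):
    """Simulate one path of d(beta) = alpha(s, beta) ds + sigma(s, beta) dW.

    alpha(s, beta) returns a length-d list, sigma(s, beta) a d x d nested list.
    The Euler-Maruyama scheme is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    d = len(beta0)
    dt = horizon / n_steps
    path = [list(beta0)]
    for k in range(n_steps):
        s = k * dt
        beta = path[-1]
        drift = alpha(s, beta)
        vol = sigma(s, beta)
        # Independent Brownian increments with variance dt.
        dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(d)]
        nxt = [
            beta[i] + drift[i] * dt + sum(vol[i][j] * dw[j] for j in range(d))
            for i in range(d)
        ]
        path.append(nxt)
    return path


# Hypothetical coefficients: mean-reverting drift and constant diagonal volatility.
def ou_drift(s, beta):
    return [-0.5 * b for b in beta]


def const_vol(s, beta):
    return [[0.1, 0.0], [0.0, 0.2]]


path = euler_maruyama([1.0, -1.0], ou_drift, const_vol, horizon=1.0, n_steps=250)
print(path[-1])
```

With the constant `const_vol` above, $\sigma {\sigma}^{\top}$ is trivially continuous in $\beta$, so this toy choice is compatible with the continuity requirement in (i).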

## 3. Main Results

**Proposition 1.**

**Definition 1.**

**Remark 1.**

**Assumption 2.**

- (i) For every $\varphi \in \mathcal{H}$ and ${\mathbb{P}}_{N}^{\star}$-a.e. $\omega \in \Omega $, the function $(t,u)\mapsto {\mathsf{\Lambda}}_{t}^{u}\left(\varphi \right)\left(\omega \right)$ is continuous on $\mathcal{H}$,
- (ii) $\left\{\varphi \in \mathcal{H}:\,(\forall u\in \mathcal{U})\,{S}_{t}({\varphi}_{t}^{u},{\left[\left[{\varphi}^{u}\right]\right]}_{t};u)\text{ is a }{\mathbb{P}}_{N}^{\star}\text{-local-martingale}\right\}$ is closed and non-empty.

**Theorem 1.**

- (i) Equation (8) admits a minimizer on $\mathcal{H}$,
- (ii) $\underset{\lambda \uparrow \infty ;\,\lambda \ge 2}{\lim}\,\underset{\varphi \in \mathcal{H}}{\inf}\,\ell (\phi -\varphi )+{\mathrm{AF}}^{\lambda}\left(\varphi \right)=\underset{\varphi \in \mathcal{H}}{\min}\,\ell (\phi -\varphi )+{\iota}_{\mathcal{H}}\left(\varphi \right),$
- (iii) If for every $\lambda \ge 2$, ${\mathrm{AF}}^{\lambda}$ is lower-semi-continuous on $\mathcal{H}$, then $$\underset{\lambda \uparrow \infty ;\,\lambda \ge 2}{\lim}\,\underset{\varphi \in \mathcal{H}}{\operatorname{argmin}}\,\ell (\phi -\varphi )+{\mathrm{AF}}^{\lambda}\left(\varphi \right)\in \underset{\varphi \in \mathcal{H}}{\operatorname{argmin}}\,\ell (\phi -\varphi )+{\iota}_{\mathcal{H}}\left(\varphi \right),$$ where ${\iota}_{\mathcal{H}}$ is defined on $\mathcal{H}$ as $${\iota}_{\mathcal{H}}\left(\varphi \right)\triangleq \begin{cases}0 & \text{if }(\forall u\in \mathcal{U})\,{S}_{t}({\varphi}_{t}^{u},{\left[\left[{\varphi}^{u}\right]\right]}_{t};u)\text{ is a }{\mathbb{P}}_{N}^{\star}\text{-local-martingale},\\ \infty & \text{otherwise.}\end{cases}$$
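Parts (ii) and (iii) of Theorem 1 say that, as $\lambda \uparrow \infty$, minimizers of the penalized problem approach minimizers of the hard-constrained problem. The following toy sketch shows the same mechanism in one dimension. It assumes a quadratic loss and a penalty of the form $\lambda {(x-c)}^{2}$, where the single point $c$ stands in for the arbitrage-free set. This penalty, the helper names, and the ternary search are illustrative assumptions and are not the paper's ${\mathrm{AF}}^{\lambda}$ or its optimizer.

```python
def penalized_objective(x, target, feasible_point, lam):
    """Toy analogue of loss + penalty: (x - target)^2 + lam * (x - feasible_point)^2."""
    loss = (x - target) ** 2
    penalty = lam * (x - feasible_point) ** 2
    return loss + penalty


def minimize_convex_1d(f, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def make_objective(target, feasible_point, lam):
    """Bind the parameters so the objective depends on x only."""
    return lambda x: penalized_objective(x, target, feasible_point, lam)


target, feasible_point = 1.0, 0.0  # hypothetical values
minimizers = {}
for lam in [2.0, 20.0, 200.0, 2000.0]:
    objective = make_objective(target, feasible_point, lam)
    minimizers[lam] = minimize_convex_1d(objective, -5.0, 5.0)

for lam, x_star in minimizers.items():
    print(lam, x_star)
```

In this toy problem the minimizer has the closed form $(a+\lambda c)/(1+\lambda )$ for target $a$, so it moves toward the feasible point $c$ as $\lambda$ grows, which is the behaviour the theorem formalizes in the function-space setting.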

**Assumption 3.**

- (i) $\underset{0\le t}{\sup}\,\underset{i=1,\cdots ,n}{\max}\,\operatorname{ess\,sup}\left|{S}_{t}({\widehat{\varphi}}_{t}^{{u}_{i}},{\left[\left[{\widehat{\varphi}}^{{u}_{i}}\right]\right]}_{t};{u}_{i})-{S}_{t}({\varphi}_{t}^{{u}_{i}},{\left[\left[{\varphi}^{{u}_{i}}\right]\right]}_{t};{u}_{i})\right|$
- (ii) $m<\underset{0\le t}{\inf}\,\underset{i=1,\cdots ,n}{\inf}\,\operatorname{ess\,inf}\left|{S}_{t}({\varphi}_{t}^{{u}_{i}},{\left[\left[{\varphi}^{{u}_{i}}\right]\right]}_{t};{u}_{i})\right|$

**Proposition 2.**

## 4. Arbitrage-Regularization for Bond Pricing

**Remark 2.**

**Theorem 2.**

- (i) For every $\lambda \in [\frac{2}{3},1)$ there exists an element ${\varphi}^{\lambda}$ in $\mathcal{H}$ minimizing $${\int}_{0}^{\infty}{\int}_{0}^{u}{\int}_{\beta \in {\mathbb{R}}^{d}}{e}^{-\left|t\right|-{\left|u\right|}^{\kappa}-{\parallel \beta \parallel}^{\kappa}}{\left(\right)}^{\phi}p,$$ where ${\mathsf{\Lambda}}_{t}^{u}\left(\varphi \right)$ is defined by $${\mathsf{\Lambda}}^{u}\left(\varphi \right)\triangleq {\left(\right)}^{{\phi}_{0}}p$$ where ${\left({\zeta}_{i,j}\right)}_{i,j=1}^{d}=\zeta $, ${\left({\zeta}_{k;i,j}\right)}_{i,j=1}^{d}={\zeta}_{k}$, ${\left({\gamma}_{i}\right)}_{i=1}^{d}=\gamma $, and ${\left({\gamma}_{k;i}\right)}_{i=1}^{d}={\gamma}_{k}$, for $k=1,\cdots ,d$.
- (ii) The following inclusion holds: $$\underset{\lambda \uparrow \infty ;\,\lambda \ge 2}{\lim}\,{\varphi}^{\lambda}\in \underset{\varphi \in \mathcal{H}}{\operatorname{argmin}}\,{\int}_{0}^{\infty}{\int}_{0}^{u}{\int}_{\beta \in {\mathbb{R}}^{d}}{e}^{-\left|t\right|-{\left|u\right|}^{\kappa}-{\parallel \beta \parallel}^{\kappa}}{\left(\right)}^{\phi}p\,d\beta \,dt\,du+{\iota}_{\mathcal{H}}\left(\varphi \right),$$ where ${\iota}_{\mathcal{H}}$ is as in (16).

**Corollary 1.**

#### 4.1. A Deep Learning Approach to Arbitrage-Regularization

#### 4.2. Numerical Implementations

##### 4.2.1. Model 1: The Dynamic Nelson-Siegel Model (Practitioner Model)

##### 4.2.2. Model 2: dPCA (Machine-Learning Model)

## 5. Discussion

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
A-Reg | Arbitrage-Regularization |
AFNS | Arbitrage-Free Nelson-Siegel Model |
A-Reg(dPCA) | Arbitrage-Regularized Dynamic Principal Component Analysis Model |
A-Reg(dNS) | Arbitrage-Regularized Nelson-Siegel Model |
dPCA | Dynamic Principal Component Analysis Model |
NFLVR | No Free Lunch with Vanishing Risk |
dNS | Dynamic Nelson-Siegel Model |

Symbol | Description |
---|---|
${\left[Y\right]}_{t}$ | Quadratic variation of the process ${Y}_{t}$ |
${\left[\left[Y\right]\right]}_{t}$ | Local quadratic variation of the process ${Y}_{t}$ |
AF | Arbitrage-Penalty |
${\beta}_{t}$ | Stochastic factor process |
${\beta}^{ENET}$ | Solution to Elastic-Net regularized regression problem of Hastie et al. (2015) |
${\mathbb{C}}_{b}^{1,2}$ | Twice-Continuously Differentiable Boundedness-preserving Path-Functionals |
$\mathcal{D}$ | Horizontal Derivative of Fournie (2010) |
$D{\theta}^{i}$ | Weak-Derivative in the sense of measures of $\theta $ |
$\lvert D{\theta}^{i}\rvert$ | Total variation of $D{\theta}^{i}$ |
${\mathbb{E}}_{\mathbb{P}}$ | The expectation under the probability measure $\mathbb{P}$ |
$\nabla$ | Vertical Derivative of Fournie (2010) |
$\varphi $ | Reference factor model for latent process |
${\varphi}_{t}^{u}$ | Abbreviation for $\varphi (t,{\beta}_{t},u)$ |
$\mathcal{H}$ | Hypothesis class of functions for factor model |
$\ell$ | Loss Function |
${L}_{\nu \otimes \mu}^{p}(I\times {\mathbb{R}}^{d}\times \mathcal{U})$ | Space of (equivalence classes of) $p$-integrable functions with respect to $\nu \otimes \mu $ |
$\lambda $ | Meta-parameter for arbitrage-penalty problem |
$\mathsf{\Lambda}$ | Process given by (13) |
$\mu $ | Borel probability measure on ${\mathbb{R}}^{d}\times \mathcal{U}$ |
${\mathcal{NN}}_{N,h,n+1}^{\rho}$ | Feed-forward neural networks with activation function $\rho $, depth $N+1$, and height $h$ |
$\nu $ | Borel probability measure on $I$ |
$\nu \otimes \mu $ | Product (probability) measure of $\nu $ and $\mu $ |
$\mathbb{P}$ | Real-World Probability Measure |
${\mathbb{P}}^{\star}$ | Equivalent Probability Measure to $\mathbb{P}$ |
${\mathbb{P}}_{N}$ | Martingale measure for the numéraire |
$\mathbb{Q}$ | Equivalent Martingale Probability Measure to $\mathbb{P}$ |
${\mathbb{R}}^{d}$ | $d$-dimensional Euclidean space |
${r}_{t}$ | Short-rate |
${S}_{t}$ | Non-anticipative path-dependent functional in ${\mathbb{C}}_{b}^{1,2}$ encoding latent factor process into ${\left(\right)}_{{X}_{t}}$ |
${S}_{+}^{d}$ | $d\times d$ symmetric positive semi-definite matrices |
$\mathcal{U}$ | Borel subset of ${\mathbb{R}}^{D}$ which indexes the large financial market |
${W}_{i}$ | Affine function between Euclidean spaces |
$\mathcal{X}$ | Banach subspace of ${L}_{\nu \otimes \mu}^{p}(I\times {\mathbb{R}}^{d}\times \mathcal{U})$ admitting a continuous embedding into ${C}^{1,2,2}(I\times {\mathbb{R}}^{d}\times \mathcal{U})$ |
${\left(\right)}_{{X}_{t}}$ | Large financial market |

## Appendix A. Background

#### Appendix A.1. Arbitrage-Theory

#### Appendix A.2. Functional Itô Calculus

**Definition A1.**

**Theorem A1.**

**Proposition A1.**

#### Appendix A.3. Background on Γ-Convergence

**Theorem A2.**

**Theorem A3.**

- (i) (Lower Semicontinuity): $\underset{n\to \infty}{\Gamma \text{-}\lim}\,{\ell}_{n}$ is lower semicontinuous on $X$,
- (ii) (Stability Under Continuous Perturbation): If $g:X\to \mathbb{R}$ is continuous, then $$\underset{n\to \infty}{\Gamma \text{-}\lim}\,({\ell}_{n}+g)=\left(\underset{n\to \infty}{\Gamma \text{-}\lim}\,{\ell}_{n}\right)+g,$$
- (iii) (Stability Under Relaxation): Let ${\left\{{\tilde{\ell}}_{n}\right\}}_{n\in \mathbb{N}}$ be a sequence of functions from $X$ to $\mathbb{R}\cup \{\infty \}$ satisfying ${\ell}_{n}^{lsc}\le {\tilde{\ell}}_{n}\le {\ell}_{n}$ for every $n\in \mathbb{N}$. Then $$\underset{n\to \infty}{\Gamma \text{-}\lim}\,{\tilde{\ell}}_{n}=\underset{n\to \infty}{\Gamma \text{-}\lim}\,{\ell}_{n},$$ where ${\ell}^{lsc}$ is the largest lower semi-continuous function dominated by $\ell$, point-wise.

**Theorem A4.**

**Definition A2.**

- (i) ${\ell}^{lsc}\left(x\right)\le \underset{n\in \mathbb{N}}{\liminf}\,{\ell}_{n}\left({x}_{n}\right)$ for every net ${\left\{{x}_{n}\right\}}_{n\in \mathbb{N}}$ converging to $x$ in $(X,d)$,
- (ii) ${\ell}^{lsc}\left(x\right)\ge \underset{n\in \mathbb{N}}{\liminf}\,{\ell}_{n}\left({y}_{n}\right)$ for some net ${\left\{{y}_{n}\right\}}_{n\in \mathbb{N}}$ converging to $x$ in $(X,d)$.
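A small worked example may help in reading conditions (i) and (ii). It is a standard toy case offered as an illustration, not taken from the source: take $X=\mathbb{R}$ with the usual metric, the hypothetical sequence ${\ell}_{n}$ below, and the candidate limit $\ell$, which plays the role of the left-hand side in (i) and (ii).

$${\ell}_{n}\left(x\right)=n{x}^{2},\qquad \ell \left(x\right)=\begin{cases}0 & \text{if }x=0,\\ \infty & \text{otherwise.}\end{cases}$$

For (i): if ${x}_{n}\to x\ne 0$, then $\left|{x}_{n}\right|\ge \left|x\right|/2$ for all large $n$, so ${\ell}_{n}\left({x}_{n}\right)\ge n{x}^{2}/4\to \infty =\ell \left(x\right)$. If ${x}_{n}\to 0$, then ${\ell}_{n}\left({x}_{n}\right)\ge 0=\ell \left(0\right)$ for every $n$. For (ii): the constant choice ${y}_{n}=x$ gives ${\ell}_{n}\left(0\right)=0=\ell \left(0\right)$ when $x=0$, and the inequality is trivial when $x\ne 0$ because $\ell \left(x\right)=\infty$. The limit is an indicator-type function of $\{0\}$, which mirrors how a growing penalty becomes the hard constraint ${\iota}_{\mathcal{H}}$ in Theorem 1.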

## Appendix B. Proofs

**Proof of Proposition 1.**

**Proof of Theorem 1.**

**Proof of Proposition 2.**

**Proof of Theorem 2.**

**Proof of Corollary 1.**

## References

- Bachelier, Louis. 1900. Théorie de la spéculation. Annales Scientifiques de l’É.N.S. 17: 21–86.
- Bain, Alan, and Dan Crisan. 2009. Fundamentals of stochastic filtering. In Stochastic Modelling and Applied Probability. New York: Springer, vol. 60.
- Björk, Tomas. 2009. Arbitrage Theory in Continuous Time, 3rd ed. Oxford: Oxford University Press.
- Björk, Tomas, and Bent Jesper Christensen. 1999. Interest rate dynamics and consistent forward rate curves. Mathematical Finance 9: 323–48.
- Braides, Andrea. 2002. Γ-convergence for beginners. In Oxford Lecture Series in Mathematics and Its Applications. Oxford: Oxford University Press, vol. 22.
- Brown, R. C., and Bohumir Opic. 1992. Embeddings of weighted Sobolev spaces into spaces of continuous functions. Proceedings of the Royal Society of London. Series A 439: 279–96.
- Carmona, René. 2014. Statistical Analysis of Financial Data in R, 2nd ed. Springer Texts in Statistics. New York: Springer.
- Chen, Hung-Ching Justin, and Malik Magdon-Ismail. NN-OPT: Neural Network for Option Pricing Using Multinomial Tree. In International Conference on Neural Information Processing. New York: Springer, pp. 360–369.
- Chen, Luyang, Markus Pelger, and Jason Zhu. 2019. Deep learning in asset pricing. Available online: https://ssrn.com/abstract=3350138 (accessed on 15 April 2020).
- Christensen, Jens H. E., Francis X. Diebold, and Glenn D. Rudebusch. 2011a. The affine arbitrage-free class of Nelson-Siegel term structure models. Journal of Econometrics 164: 4–20.
- Cont, Rama, and David-Antoine Fournié. 2013. Functional Itô calculus and stochastic integral representation of martingales. Annals of Probability 41: 109–33.
- Cuchiero, Christa. 2011. Affine and Polynomial Processes. Ph.D. thesis, ETH Zurich, Zürich, Switzerland.
- Cuchiero, Christa, Irene Klein, and Josef Teichmann. 2016. A new perspective on the fundamental theorem of asset pricing for large financial markets. Theory of Probability and Its Applications 60: 561–79.
- Dal Maso, Gianni. 1993. An introduction to Γ-convergence. In Progress in Nonlinear Differential Equations and Their Applications. Boston: Birkhäuser Boston, Inc., vol. 8.
- De Giorgi, Ennio. 1975. Sulla convergenza di alcune successioni d’integrali del tipo dell’area. Rend. Mat. 8: 277–94.
- Delbaen, Freddy, and Walter Schachermayer. 1994. A general version of the fundamental theorem of asset pricing. Mathematische Annalen 300: 463–520.
- Delbaen, Freddy, and Walter Schachermayer. 1998. The fundamental theorem of asset pricing for unbounded stochastic processes. Mathematische Annalen 312: 215–50.
- Devin, Siobhán, Bernard Hanzon, and Thomas Ribarits. 2010. A Finite-Dimensional HJM Model: How Important is Arbitrage-Free Evolution? International Journal of Theoretical and Applied Finance 13: 1241–63.
- Diebold, Francis X., and Glenn D. Rudebusch. 2013. Yield Curve Modeling And Forecasting. Princeton: Princeton University Press.
- Dupire, Bruno. 2009. Functional Itô Calculus. Bloomberg Portfolio Research Paper No. 2009-04-FRONTIERS. New York: Bloomberg L.P.
- Elworthy, K. David. 1982. Stochastic Differential Equations on Manifolds. London Mathematical Society Lecture Note Series; Cambridge: Cambridge University Press, vol. 70.
- Filipović, Damir. 2000. Exponential-polynomial families and the term structure of interest rates. Bernoulli 6: 1081–107.
- Filipović, Damir. 2001. Consistency Problems for Heath-Jarrow-Morton Interest Rate Models. Lecture Notes in Mathematics. Berlin: Springer, vol. 1760.
- Filipović, Damir. 2009. Term-Structure Models. A Graduate Course. Springer Finance. Berlin: Springer.
- Filipović, Damir, Stefan Tappe, and Josef Teichmann. 2010. Term structure models driven by Wiener processes and Poisson measures: Existence and positivity. SIAM Journal on Financial Mathematics 1: 523–54.
- Filipović, Damir, and Josef Teichmann. 2004. On the geometry of the term structure of interest rates. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 460: 129–67.
- Focardi, Matteo. 2012. Γ-convergence: A tool to investigate physical phenomena across scales. Mathematical Methods in the Applied Sciences 35: 1613–58.
- Fontana, Claudio. 2014. A note on arbitrage, approximate arbitrage and the fundamental theorem of asset pricing. Stochastics: An International Journal of Probability and Stochastic Processes 86: 922–31.
- Fournie, David-Antoine. 2010. Functional Ito Calculus and Applications. Ph.D. thesis, Columbia University, New York, NY, USA.
- Gelenbe, Erol. 1989. Random neural networks with negative and positive signals and product form solution. Neural Computation 1: 502–10.
- Gonon, Lukas, Lyudmila Grigoryeva, and Juan-Pablo Ortega. 2020. Approximation bounds for random neural networks and reservoir systems. arXiv:2002.05933.
- Guasoni, Paolo. 2006. No arbitrage under transaction costs, with fractional Brownian motion and beyond. Mathematical Finance 16: 569–82.
- Harrison, J. Michael, and David M. Kreps. 1979. Martingales and arbitrage in multiperiod securities markets. Journal of Economic Theory 20: 381–408.
- Hastie, Trevor, Robert Tibshirani, and Martin Wainwright. 2015. Statistical Learning with Sparsity: The LASSO and Generalizations. Boca Raton: CRC Press.
- Heath, David, Robert Jarrow, and Andrew Morton. 1992. Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation. Econometrica 60: 77–105.
- Hornik, Kurt. 1991. Approximation capabilities of multilayer feedforward networks. Neural Networks 4: 251–57.
- Jaeger, Herbert, and Harald Haas. 2004. Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science 304: 78–80.
- Jazaerli, Samy, and Yuri F. Saporito. 2017. Functional Itô calculus, Path-Dependence and the Computation of Greeks. Stochastic Processes and their Applications 127: 3997–4028.
- Kratsios, Anastasis. 2019a. Deep Arbitrage-Free Regularization. Available online: https://github.com/AnastasisKratsios/Deep_Arbitrage_Free_Regularization/ (accessed on 8 April 2020).
- Kratsios, Anastasis. 2020. The Universal Approximation Property: Characterizations, Existence, and a Canonical Topology for Deep-Learning. arXiv:1910.03344.
- Meucci, Attilio. 2005. Risk and Asset Allocation. Springer Finance. Berlin: Springer.
- Musiela, Marek, and Marek Rutkowski. 1997. Martingale Methods in Financial Modelling. In Applications of Mathematics. Berlin: Springer, vol. xii, p. 512.
- Nelson, Charles R., and Andrew F. Siegel. 1987. Parsimonious modeling of yield curves. Journal of Business 60: 473–89.
- Pang, Tao, and Azmat Hussain. 2015. An application of functional Ito’s formula to stochastic portfolio optimization with bounded memory. Paper presented at 2015 Proceedings of the Conference on Control and Its Applications, Paris, France, July 8–10; pp. 159–166.
- Rahimi, Ali, and Benjamin Recht. 2008. Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems. Vancouver: NIPS, pp. 1177–1184.
- Schweizer, Martin. 1995. On the minimal martingale measure and the Föllmer-Schweizer decomposition. Stochastic Analysis and Applications 13: 573–99.
- Shreve, Steven E. 2004. Stochastic Calculus for finance II: Continuous-Time Models. Berlin: Springer Science & Business Media, vol. 11.
- Tibshirani, Robert. 1996. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society. Series B (Methodological) 58: 267–88.
- Zou, Hui, Trevor Hastie, and Robert Tibshirani. 2006. Sparse principal component analysis. Journal of Computational and Graphical Statistics 15: 265–86.

Sample Availability: Code is available at: https://github.com/AnastasisKratsios.

**Figure 1.** Day-ahead predictions as a function of $\tilde{\lambda}$ across given maturities for the A-Reg(dNS) model. (**a**) Average day-ahead predicted yield curves. (**b**) Estimated MSE of day-ahead bond price predictions.

**Figure 2.** Day-ahead predictions as a function of $\tilde{\lambda}$ across given maturities for the A-Reg(dPCA) model. (**a**) Average day-ahead predicted yield curves. (**b**) Estimated MSE of day-ahead bond price predictions.

Model ∖ Maturity | 0.5 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
Vasiček | 3.155$\times {10}^{-1}$ | 4.323$\times {10}^{-1}$ | 3.622$\times {10}^{-1}$ | 1.950$\times {10}^{-1}$ | 5.730$\times {10}^{-2}$ |
dPCA | 2.526$\times {10}^{-1}$ | 4.349$\times {10}^{-1}$ | 4.176$\times {10}^{-1}$ | 2.526$\times {10}^{-1}$ | 9.261$\times {10}^{-2}$ |
A-Reg(dPCA) | 8.066$\times {10}^{-1}$ | 6.943$\times {10}^{-1}$ | 5.110$\times {10}^{-1}$ | 2.755$\times {10}^{-1}$ | 9.588$\times {10}^{-2}$ |
dNS | 4.513$\times {10}^{-2}$ | 1.479$\times {10}^{-1}$ | 2.134$\times {10}^{-1}$ | 1.477$\times {10}^{-1}$ | 5.968$\times {10}^{-2}$ |
AFNS | 3.729$\times {10}^{-1}$ | 5.315$\times {10}^{-1}$ | 4.414$\times {10}^{-1}$ | 2.436$\times {10}^{-1}$ | 8.301$\times {10}^{-2}$ |
A-Reg(dNS) | 2.903$\times {10}^{-2}$ | 9.514$\times {10}^{-2}$ | 1.601$\times {10}^{-1}$ | 1.235$\times {10}^{-1}$ | 6.482$\times {10}^{-2}$ |

Model ∖ Maturity | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|
Vasiček | 7.735$\times {10}^{-3}$ | 1.996$\times {10}^{-4}$ | 1.024$\times {10}^{-3}$ | 1.480$\times {10}^{-3}$ | 1.348$\times {10}^{-3}$ |
dPCA | 2.193$\times {10}^{-2}$ | 3.326$\times {10}^{-3}$ | 3.119$\times {10}^{-4}$ | 1.897$\times {10}^{-5}$ | 8.097$\times {10}^{-7}$ |
A-Reg(dPCA) | 2.221$\times {10}^{-2}$ | 3.340$\times {10}^{-3}$ | 3.123$\times {10}^{-4}$ | 1.898$\times {10}^{-5}$ | 8.099$\times {10}^{-7}$ |
dNS | 1.972$\times {10}^{-2}$ | 8.313$\times {10}^{-3}$ | 5.323$\times {10}^{-3}$ | 3.925$\times {10}^{-3}$ | 2.998$\times {10}^{-3}$ |
AFNS | 1.840$\times {10}^{-2}$ | 3.225$\times {10}^{-3}$ | 1.633$\times {10}^{-3}$ | 2.084$\times {10}^{-3}$ | 2.708$\times {10}^{-3}$ |
A-Reg(dNS) | 3.579$\times {10}^{-2}$ | 2.236$\times {10}^{-2}$ | 1.523$\times {10}^{-2}$ | 1.050$\times {10}^{-2}$ | 7.308$\times {10}^{-3}$ |

Model ∖ Maturity | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|
Vasiček | 1.108$\times {10}^{-3}$ | 9.002$\times {10}^{-4}$ | 7.382$\times {10}^{-4}$ | 6.125$\times {10}^{-4}$ | 5.135$\times {10}^{-4}$ |
dPCA | 2.578$\times {10}^{-8}$ | 6.328$\times {10}^{-10}$ | 1.433$\times {10}^{-11}$ | 2.607$\times {10}^{-13}$ | 4.179$\times {10}^{-15}$ |
A-Reg(dPCA) | 2.579$\times {10}^{-8}$ | 6.328$\times {10}^{-10}$ | 1.433$\times {10}^{-11}$ | 2.607$\times {10}^{-13}$ | 4.179$\times {10}^{-15}$ |
dNS | 2.381$\times {10}^{-3}$ | 1.969$\times {10}^{-3}$ | 1.686$\times {10}^{-3}$ | 1.484$\times {10}^{-3}$ | 1.337$\times {10}^{-3}$ |
AFNS | 3.407$\times {10}^{-3}$ | 4.163$\times {10}^{-3}$ | 4.918$\times {10}^{-3}$ | 5.603$\times {10}^{-3}$ | 6.164$\times {10}^{-3}$ |
A-Reg(dNS) | 5.215$\times {10}^{-3}$ | 3.827$\times {10}^{-3}$ | 2.885$\times {10}^{-3}$ | 2.229$\times {10}^{-3}$ | 1.761$\times {10}^{-3}$ |

Model ∖ Maturity | 15 | 16 | 17 | 18 | 19 |
---|---|---|---|---|---|
Vasiček | 4.342$\times {10}^{-4}$ | 3.698$\times {10}^{-4}$ | 3.169$\times {10}^{-4}$ | 2.729$\times {10}^{-4}$ | 2.360$\times {10}^{-4}$ |
dPCA | 6.714$\times {10}^{-17}$ | 9.566$\times {10}^{-19}$ | 1.426$\times {10}^{-20}$ | 1.819$\times {10}^{-22}$ | 2.749$\times {10}^{-24}$ |
A-Reg(dPCA) | 6.714$\times {10}^{-17}$ | 9.566$\times {10}^{-19}$ | 1.426$\times {10}^{-20}$ | 1.818$\times {10}^{-22}$ | 2.746$\times {10}^{-24}$ |
dNS | 1.225$\times {10}^{-3}$ | 1.138$\times {10}^{-3}$ | 1.069$\times {10}^{-3}$ | 1.012$\times {10}^{-3}$ | 9.639$\times {10}^{-4}$ |
AFNS | 6.577$\times {10}^{-3}$ | 6.867$\times {10}^{-3}$ | 7.115$\times {10}^{-3}$ | 7.453$\times {10}^{-3}$ | 8.052$\times {10}^{-3}$ |
A-Reg(dNS) | 1.422$\times {10}^{-3}$ | 1.171$\times {10}^{-3}$ | 9.831$\times {10}^{-4}$ | 8.406$\times {10}^{-4}$ | 7.316$\times {10}^{-4}$ |

Model ∖ Maturity | 20 | 21 | 22 | 23 | 24 |
---|---|---|---|---|---|
Vasiček | 2.049$\times {10}^{-4}$ | 1.784$\times {10}^{-4}$ | 1.558$\times {10}^{-4}$ | 1.364$\times {10}^{-4}$ | 1.196$\times {10}^{-4}$ |
dPCA | 3.816$\times {10}^{-26}$ | 5.254$\times {10}^{-28}$ | 8.047$\times {10}^{-30}$ | 9.958$\times {10}^{-32}$ | 1.336$\times {10}^{-33}$ |
A-Reg(dPCA) | 3.781$\times {10}^{-26}$ | 4.847$\times {10}^{-28}$ | 3.015$\times {10}^{-30}$ | 2.684$\times {10}^{-30}$ | 1.452$\times {10}^{-29}$ |
dNS | 9.228$\times {10}^{-4}$ | 8.866$\times {10}^{-4}$ | 8.542$\times {10}^{-4}$ | 8.247$\times {10}^{-4}$ | 7.976$\times {10}^{-4}$ |
AFNS | 9.097$\times {10}^{-3}$ | 1.075$\times {10}^{-2}$ | 1.310$\times {10}^{-2}$ | 1.611$\times {10}^{-2}$ | 1.960$\times {10}^{-2}$ |
A-Reg(dNS) | 6.480$\times {10}^{-4}$ | 5.838$\times {10}^{-4}$ | 5.349$\times {10}^{-4}$ | 4.984$\times {10}^{-4}$ | 4.814$\times {10}^{-4}$ |

Model ∖ Maturity | 25 | 26 | 27 | 28 | 29 |
---|---|---|---|---|---|
Vasiček | 1.051$\times {10}^{-4}$ | 9.254$\times {10}^{-5}$ | 8.160$\times {10}^{-5}$ | 7.205$\times {10}^{-5}$ | 6.371$\times {10}^{-5}$ |
dPCA | 2.067$\times {10}^{-35}$ | 2.814$\times {10}^{-37}$ | 3.639$\times {10}^{-39}$ | 5.371$\times {10}^{-41}$ | 7.459$\times {10}^{-43}$ |
A-Reg(dPCA) | 9.846$\times {10}^{-29}$ | 1.102$\times {10}^{-27}$ | 2.108$\times {10}^{-26}$ | 6.986$\times {10}^{-25}$ | 3.979$\times {10}^{-23}$ |
dNS | 7.722$\times {10}^{-4}$ | 7.484$\times {10}^{-4}$ | 7.257$\times {10}^{-4}$ | 7.041$\times {10}^{-4}$ | 6.835$\times {10}^{-4}$ |
AFNS | 2.323$\times {10}^{-2}$ | 2.660$\times {10}^{-2}$ | 2.929$\times {10}^{-2}$ | 3.097$\times {10}^{-2}$ | 3.147$\times {10}^{-2}$ |
A-Reg(dNS) | 4.911$\times {10}^{-4}$ | 5.288$\times {10}^{-4}$ | 6.011$\times {10}^{-4}$ | 7.214$\times {10}^{-4}$ | 9.138$\times {10}^{-4}$ |

Model ∖ Maturity | 30 |
---|---|
Vasiček | 6.371$\times {10}^{-5}$ |
dPCA | 7.459$\times {10}^{-43}$ |
A-Reg(dPCA) | 3.979$\times {10}^{-23}$ |
dNS | 6.835$\times {10}^{-4}$ |
AFNS | 3.147$\times {10}^{-2}$ |
A-Reg(dNS) | 9.138$\times {10}^{-4}$ |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kratsios, A.; Hyndman, C.
Deep Arbitrage-Free Learning in a Generalized HJM Framework via Arbitrage-Regularization. *Risks* **2020**, *8*, 40.
https://doi.org/10.3390/risks8020040
