Towards Generative Interest-Rate Modeling: Neural Perturbations Within the Libor Market Model

Knezevic, Anna

doi:10.3390/jrfm19010082


by

Anna Knezevic

School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University, Kent Street, Bentley, Perth, WA 6102, Australia

J. Risk Financial Manag. 2026, 19(1), 82; https://doi.org/10.3390/jrfm19010082

Submission received: 30 November 2025 / Revised: 5 January 2026 / Accepted: 12 January 2026 / Published: 21 January 2026

(This article belongs to the Special Issue Quantitative Finance in the Era of Big Data and AI)


Abstract

This study proposes a neural-augmented Libor Market Model (LMM) for swaption surface calibration that enhances expressive power while maintaining the interpretability, arbitrage-free structure, and numerical stability of the classical framework. Classical LMM parametrizations, based on exponential decay volatility functions and static correlation kernels, are known to perform poorly in sparsely quoted and long-tenor regions of swaption volatility cubes. Machine learning–based diffusion models offer flexibility but often lack transparency, stability, and measure-consistent dynamics. To reconcile these requirements, the present approach embeds a compact neural network within the volatility and correlation layers of the LMM, constrained by structural diagnostics, low-rank correlation construction, and HJM-consistent drift. Empirical tests across major currencies (EUR, GBP, USD) and multiple quarterly datasets from 2024 to 2025 show that the neural-augmented LMM consistently outperforms the classical model. Improvements of approximately 7–10% in implied volatility RMSE and 10–15% in PV RMSE are observed across all datasets, with no deterioration in any region of the surface. These results reflect the model’s ability to represent cross-tenor dependencies and surface curvature beyond the reach of classical parametrizations, while remaining economically interpretable and numerically tractable. The findings support hybrid model designs in quantitative finance, where small neural components complement robust analytical structures. The approach aligns with ongoing industry efforts to integrate machine learning into regulatory-compliant pricing models and provides a pathway for future generative LMM variants that retain an arbitrage-free diffusion structure while learning data-driven volatility geometry.

Keywords:

libor market model; generative AI; swaption calibration; neural volatility surfaces

1. Introduction

The Libor Market Model (LMM), also known as the Brace–Gatarek–Musiela model, remains one of the most well-established frameworks for interest rate option pricing (Andersen & Piterbarg, 2010; Brace et al., 1997; Rebonato, 2002). Its appeal lies in the explicit modeling of discretely compounded forward rates, the transparent separation of volatility and correlation parameters (James & Webber, 2000; Karlsson et al., 2017), and the interpretability of its structural assumptions (Henry-Labordère, 2008; Hull, 2022). These features have made the model a staple for pricing and hedging swaptions across trading desks and risk management environments.

Despite its wide adoption, the classical LMM is limited by its reliance on parametric volatility and correlation specifications. Such parametrizations often fail to reproduce the full swaption volatility cube observed in modern markets, in particular its curvature and cross-sectional structure along the expiry and tenor dimensions, and the misfit is most pronounced in sparsely quoted and long-tenor regions (Hagan et al., 2002; Obłój, 2007; Rebonato, 2004). Extensions such as SABR–LMM hybrids (Rebonato & White, 2009) address some smile features but introduce additional assumptions and can remain insufficient for jointly matching expiry and tenor structures.

Recent advances in machine learning have reopened the question of whether established financial models can be enhanced without sacrificing their interpretability. Neural stochastic differential equations (Chen et al., 2018; Kidger et al., 2021) and deep learning approaches to high-dimensional PDEs and BSDEs (Han et al., 2018; Huré et al., 2019) provide powerful tools for approximating complex diffusions, while physics-informed neural networks integrate structural PDE constraints directly into training (Berg & Nystrom, 2018; Raissi et al., 2019). However, most such methods sacrifice transparency or introduce latent representations whose financial interpretation is unclear (so-called black-box approaches). For option-pricing applications, where calibration stability, risk factor interpretability, and the traceability of sensitivities are essential, full replacement of classical structures is rarely acceptable.

Within this context, the present work views neural augmentation not as a substitute for the LMM’s analytical foundation but as a constrained overlay. Neural networks are introduced solely to parameterize volatility and correlation structures within the established forward-rate dynamics, and only under diagnostics designed to enforce model-consistent properties (Alaya et al., 2021; Horváth et al., 2021; Liu et al., 2019). This preserves the interpretability of the diffusion and drift structure while allowing greater expressiveness in matching empirical swaption surfaces. Such a design is aligned with the aims of the Special Issue, which emphasizes applications of modern machine learning tools that enhance, rather than replace, domain-specific models.

Although the industry now calibrates swaptions using OIS discounting and benchmarks such as EURIBOR, SONIA, and SOFR, the purpose of the SOFR discussion here is not to recast the LMM for backward-looking compounded rates. Rather, it illustrates that several calibration difficulties encountered when applying LMM-style models to SOFR (e.g., synthetic term construction, normal-volatility quoting conventions) reflect structural misalignment between forward-looking and backward-looking benchmarks. The proposed neural augmentation addresses certain deficiencies of classical parametrizations that arise in both IBOR and SOFR settings, without compromising the LMM’s interpretability.

The introduction of neural components raises natural questions regarding the necessity of classical extensions such as SABR overlays or jump-diffusion terms. In this framework, neither is included. SABR is excluded because the neural parametrization provides, in principle, sufficient functional expressiveness to generate strike-dependent shapes. This study calibrates only to ATM quotes because OTM data are sparse and inconsistent, and the smile plots in Appendix A are illustrative rather than calibrated (Richert & Buch, 2022). Jumps are excluded because, under swaption-only calibration, jump intensity and diffusive volatility are not separately identifiable in a stable manner (Belomestny & Schoenmakers, 2009; Glasserman & Kou, 2003; Glasserman & Merener, 2003; Steinrücke et al., 2015). Avoiding these components prevents parameter proliferation and maintains the clarity of the diffusion-based interpretation.

The objective of this study is twofold: (i) to improve the calibration of the LMM to the ATM slice of the swaption volatility cube and (ii) to do so while preserving the model’s transparent structure. Neural augmentation is therefore treated not as flexibility for its own sake but as a targeted mechanism to address long-standing calibration deficiencies without altering the economic interpretation of the model. Subsequent sections detail the construction of the neural parametrizations, the diagnostics used to enforce structural consistency, and the empirical performance relative to classical specifications.

2. Materials and Methods

2.1. Market Data and Instruments

2.1.1. Yield Curves

The empirical analysis is based on Bloomberg swaption volatility cubes for USD-SOFR, EUR-EURIBOR, and GBP-SONIA, together with the corresponding discount and projection curves (Andersen & Piterbarg, 2010; Hull, 2022; Rebonato, 2002). For each valuation date, input CSV files provide par yields for the overnight-indexed swap (OIS) curves used for discounting and for the IBOR-linked swap curves used for projecting forward rates. These are converted into zero-coupon discount factors

P (0, T)

through a standard bootstrapping routine external to the present framework (Hull, 2022; James & Webber, 2000; Rebonato, 2002). The resulting discount function is represented numerically by a callable interpolant

D F_{0} (T)

, which returns

P (0, T)

for maturities T.

Given a tenor structure

{T_{0}, \dots, T_{N}}

with accrual factors

δ_{i} = T_{i + 1} - T_{i}

, forward rates at time 0 are defined by

L_{i} (0) = \frac{1}{δ_{i}} (\frac{P (0, T_{i})}{P (0, T_{i + 1})} - 1), i = 0, \dots, N - 1 .

(1)

This is consistent with standard LMM conventions (Andersen & Piterbarg, 2010; Brace et al., 1997; Rebonato, 2002). The par swap rates

S_{0}

needed for swaption pricing are computed from the same discount curve using

\begin{matrix} A_{0} & = \sum_{j = k}^{k + m - 1} δ_{j} P (0, T_{j + 1}), \end{matrix}

(2)

\begin{matrix} S_{0} & = \frac{P (0, T_{k}) - P (0, T_{k + m})}{A_{0}}, \end{matrix}

(3)

for a swap starting at

T_{k}

with m accrual periods. Computing both quantities from the same discount curve ensures internal consistency between the forward-rate and swap-rate inputs (Andersen & Piterbarg, 2010; Brace et al., 1997; Hull, 2022; Rebonato, 2002).
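As an illustration of Equations (1)–(3), the following minimal sketch computes forward rates, the annuity, and the par swap rate from a discount-factor callable. The function names, the flat 3% curve, and the semi-annual grid are assumptions made for this example and are not taken from the study's code.

```python
import math

def forward_rates(tenors, df):
    """Eq. (1): simple forward rates L_i(0) from discount factors P(0, T_i)."""
    rates = []
    for i in range(len(tenors) - 1):
        delta = tenors[i + 1] - tenors[i]
        rates.append((df(tenors[i]) / df(tenors[i + 1]) - 1.0) / delta)
    return rates

def annuity_and_par_rate(tenors, df, k, m):
    """Eqs. (2)-(3): annuity A_0 and par rate S_0 for a swap from T_k with m periods."""
    annuity = sum((tenors[j + 1] - tenors[j]) * df(tenors[j + 1])
                  for j in range(k, k + m))
    par_rate = (df(tenors[k]) - df(tenors[k + m])) / annuity
    return annuity, par_rate

# Hypothetical flat 3% continuously compounded curve (illustrative only).
def flat_df(t):
    return math.exp(-0.03 * t)

grid = [0.5 * i for i in range(11)]                      # T_0 = 0, ..., T_10 = 5 years
L0 = forward_rates(grid, flat_df)
A0, S0 = annuity_and_par_rate(grid, flat_df, k=2, m=4)   # 1Y into 2Y swap
```

On a flat curve with a uniform grid, all forward rates coincide and the par swap rate equals that common forward rate, which gives a quick sanity check of the implementation.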

2.1.2. Swaption Volatility Surface

The calibration target is the ATM swaption volatility surface for USD-SOFR, EUR-EURIBOR, and GBP-SONIA. The empirical study is restricted to ATM quotes because reliable OTM/smile grids are comparatively sparse and inconsistent across dates and currencies; full-smile calibration is left to future work. Market inputs are provided as normal (Bachelier) ATM-implied volatilities

σ_{N}

, quoted in basis points on an expiry–tenor grid for each currency (Hull, 2022; Rebonato, 2002). Expiry and swap-tenor labels (e.g., 6M, 1Y, 5Y) are parsed into year fractions, and the union of all expiries and underlying payment dates is merged with the yield–curve pillars to define the simulation tenor array (Andersen & Piterbarg, 2010; Rebonato, 2002).

For each grid point, an annuity-consistent conversion from normal to Black ATM volatility is performed. Let F denote the ATM forward swap rate for expiry

T_{exp}

and underlying swap annuity

A_{0}

. Under the normal model, the per-annuity price of an ATM payer swaption is

{PV}_{ATM}^{N} = \frac{σ_{N} \sqrt{T_{exp}}}{\sqrt{2 π}} .

(4)

Under the Black model with volatility

σ_{B}

, the corresponding per-annuity price is

{PV}_{ATM}^{B} (σ_{B}) = F [Φ (\frac{1}{2} σ_{B} \sqrt{T_{exp}}) - Φ (- \frac{1}{2} σ_{B} \sqrt{T_{exp}})],

(5)

where

Φ

is the standard normal CDF. For each grid point,

σ_{B}

is obtained by numerically solving

{PV}_{ATM}^{B} (σ_{B}) = {PV}_{ATM}^{N},

(6)

using a robust root-finding procedure with adaptive bracketing and safe fallbacks (Hagan et al., 2002; Hull, 2022; Rebonato, 2002; Rebonato & White, 2009). The resulting Black-ATM surface

σ_{ATM}^{mkt} (T_{exp}, τ)

constitutes the primary calibration target.
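The conversion in Equations (4)–(6) can be sketched as follows. Plain bisection with an expanding upper bracket stands in here for the robust root-finder mentioned above; the function names, tolerances, and example inputs are assumptions of this sketch.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pv_atm_normal(sigma_n, t_exp):
    """Eq. (4): per-annuity ATM price under the normal model."""
    return sigma_n * math.sqrt(t_exp) / math.sqrt(2.0 * math.pi)

def pv_atm_black(sigma_b, fwd, t_exp):
    """Eq. (5): per-annuity ATM price under the Black model."""
    h = 0.5 * sigma_b * math.sqrt(t_exp)
    return fwd * (norm_cdf(h) - norm_cdf(-h))

def normal_to_black_atm(sigma_n, fwd, t_exp, tol=1e-12, max_iter=200):
    """Solve Eq. (6) for sigma_B by bisection (requires fwd > 0)."""
    target = pv_atm_normal(sigma_n, t_exp)
    if target >= fwd:
        # The Black ATM price is bounded above by F, so no root exists.
        raise ValueError("normal ATM price exceeds the Black upper bound F")
    lo, hi = 0.0, 1.0
    while pv_atm_black(hi, fwd, t_exp) < target:   # expand the bracket
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if pv_atm_black(mid, fwd, t_exp) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical quote: 80 bp normal vol, 3% forward swap rate, 5Y expiry.
sigma_b = normal_to_black_atm(sigma_n=0.0080, fwd=0.03, t_exp=5.0)
```

Note that the quoted basis-point volatility is converted to decimal units (80 bp = 0.0080) before the solve. The result is close to the first-order approximation of the normal volatility divided by the forward rate.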

2.2. Classical LIBOR Market Model Specification

2.2.1. Forward Dynamics

Let

{T_{0}, \dots, T_{N}}

be a fixed tenor structure with associated forward rates

L_{i} (t)

for accrual periods

[T_{i}, T_{i + 1}]

. Under the forward measure

Q^{T_{i + 1}}

and numeraire

P (t, T_{i + 1})

, the LMM specifies

d L_{i} (t) = σ_{i} (t) L_{i} (t) d W_{i}^{(i + 1)} (t),

(7)

where

σ_{i} (t)

is the instantaneous volatility and

W_{i}^{(i + 1)}

is a Brownian motion under

Q^{T_{i + 1}}

(Andersen & Piterbarg, 2010; Brace et al., 1997; Rebonato, 2002). When written under a common terminal measure

Q^{T_{N}}

, the coupled dynamics take the form

d L_{i} (t) = μ_{i} (t) L_{i} (t) d t + σ_{i} (t) L_{i} (t) d W_{i} (t),

(8)

with drift given by the HJM-style no-arbitrage condition

μ_{i} (t) = - \sum_{j = i + 1}^{N - 1} \frac{δ_{j} L_{j} (t) σ_{i} (t) ρ_{i j} σ_{j} (t)}{1 + δ_{j} L_{j} (t)},

(9)

and instantaneous correlations

E [d W_{i} (t) d W_{j} (t)] = ρ_{i j} d t .

(10)

2.2.2. Functional Volatility and Correlation

The classical benchmark uses low-dimensional functional forms:

\begin{matrix} σ_{i}^{cl} (t) & = a exp (- b τ_{i} (t)), τ_{i} (t) = max (T_{i} - t, 0), \end{matrix}

(11)

\begin{matrix} ρ_{i j}^{cl} & = exp (- β | i - j |), \end{matrix}

(12)

with parameters

(a, b, β)

to be calibrated. This structure reduces the number of free parameters and reflects the empirical decay of volatilities and correlations across maturities (James & Webber, 2000; Karlsson et al., 2017; Rebonato, 2002, 2004).
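A minimal sketch of the classical parametrization in Equations (11) and (12) is given below. The parameter values are hypothetical and serve only to illustrate the shapes involved.

```python
import math

def classical_vol(a, b, t, tenors):
    """Eq. (11): sigma_i(t) = a * exp(-b * max(T_i - t, 0))."""
    return [a * math.exp(-b * max(T - t, 0.0)) for T in tenors]

def classical_corr(beta, n):
    """Eq. (12): rho_ij = exp(-beta * |i - j|)."""
    return [[math.exp(-beta * abs(i - j)) for j in range(n)] for i in range(n)]

# Hypothetical parameter values, for illustration only.
reset_dates = [0.5, 1.0, 1.5, 2.0]
sigma = classical_vol(a=0.20, b=0.10, t=0.25, tenors=reset_dates)
rho = classical_corr(beta=0.05, n=len(reset_dates))
```

With b > 0 the volatility decays in time to maturity, and with β > 0 the correlation decays in index distance, so three parameters determine the full volatility vector and correlation matrix.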

2.3. Neural Parametrization of Volatility and Correlation

Architecture

To enhance calibration power while preserving interpretability, the classical parametrization is overlaid with a compact neural network. The network

f_{θ}

takes as input the current time t and forward-rate vector

L (t) = (L_{0} (t), \dots, L_{N - 1} (t))

and outputs perturbations to the classical volatility and a low-rank representation of the correlation structure:

(Δ σ (t), B (t)) = f_{θ} (t, L (t)),

(13)

where

Δ σ (t) \in R^{N}

and

B (t) \in R^{N \times r}

for a small rank r (typically

r \in {2, 3}

). The effective correlation is then constructed from the low-rank factor as follows (the volatility map is detailed in Section 3.1.3):

\begin{matrix} \tilde{C} (t) & = B (t) B {(t)}^{⊤}, \end{matrix}

(14)

\begin{matrix} D (t) & = diag (\sqrt{max (diag (\tilde{C} (t)), ε_{d})}), \end{matrix}

(15)

\begin{matrix} C_{0} (t) & = D {(t)}^{- 1} \tilde{C} (t) D {(t)}^{- 1}, \end{matrix}

(16)

\begin{matrix} ρ (t) & = \frac{1}{2} (C_{0} (t) + C_{0} {(t)}^{⊤}) + ε I . \end{matrix}

(17)

Here,

ρ (t)

is obtained through the diagonal rescaling of

\tilde{C} (t)

to (approximately) unit diagonal, followed by symmetrization and a small diagonal jitter term to improve conditioning; PSD is ensured by construction via

\tilde{C} (t) = B (t) B {(t)}^{⊤}

(Henry-Labordère, 2008; Rebonato, 2004).

The network itself is a shallow multi-layer perceptron with small hidden layers (8–16 units), smooth activations, and explicit output clipping,

Δ σ_{i} (t) \in [\underline{Δ}, \bar{Δ}], B_{i j} (t) \in [\underline{b}, \bar{b}],

(18)

to avoid extreme values and improve numerical stability.
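The construction in Equations (14)–(17), together with the output clipping of Equation (18), can be sketched in a few lines. The clipping bounds of ±2, the example factor matrix, and the function names are assumptions of this sketch; the default values for the two epsilon terms are borrowed from Section 3.1.4.

```python
import math

def clip(x, lo, hi):
    """Eq. (18): clamp a raw network output to [lo, hi]."""
    return max(lo, min(hi, x))

def low_rank_correlation(B, eps_d=1e-8, eps=1e-5):
    """Eqs. (14)-(17): correlation matrix from an N x r factor matrix B."""
    n = len(B)
    # Eq. (14): C~ = B B^T (positive semidefinite by construction)
    C = [[sum(x * y for x, y in zip(B[i], B[j])) for j in range(n)]
         for i in range(n)]
    # Eq. (15): D = diag(sqrt(max(diag(C~), eps_d)))
    d = [math.sqrt(max(C[i][i], eps_d)) for i in range(n)]
    # Eq. (16): C0 = D^{-1} C~ D^{-1}
    C0 = [[C[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]
    # Eq. (17): symmetrize and add diagonal jitter
    return [[0.5 * (C0[i][j] + C0[j][i]) + (eps if i == j else 0.0)
             for j in range(n)] for i in range(n)]

# Hypothetical raw factor output with N = 3 forwards and rank r = 2.
raw_B = [[1.0, 0.2], [0.9, 0.5], [0.7, 4.0]]
B = [[clip(x, -2.0, 2.0) for x in row] for row in raw_B]
rho = low_rank_correlation(B)
```

Because of the jitter term, the diagonal equals 1 + ε rather than exactly 1, which is the sense in which the text describes the diagonal as approximately unit.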

Jump components are not modeled in the final specification; jump-related heads are disabled and no Poisson or jump-amplitude parameters enter the SDEs.

2.4. Monte Carlo Simulation and Swaption Pricing

2.4.1. Path Simulation

Both classical and neural-augmented LMMs are simulated under the terminal measure

Q^{T_{N}}

using an Euler–Maruyama scheme with log-normal updates to preserve the positivity of forward rates (Andersen & Piterbarg, 2010; Brace et al., 1997; Glasserman & Merener, 2003; Rebonato, 2002). For a time step of size

Δ t

, the update for forward rate

L_{i} (t)

is

L_{i} (t + Δ t) = L_{i} (t) exp ([μ_{i} (t) - \frac{1}{2} σ_{i} {(t)}^{2}] Δ t + σ_{i} (t) \sqrt{Δ t} {(Γ (t) ϵ_{t})}_{i}),

(19)

where

ϵ_{t} \sim N (0, I)

, and

Γ (t)

is the Cholesky factor of the correlation matrix

ρ (t)

after normalization/symmetrization and jitter.

Simulation was fully vectorized: a fixed number of time steps

n_{steps}

and Monte Carlo paths

n_{paths}

were used, with quantities stored in rank-3 tensors

(time, path, forward)

. Classical and neural simulations shared the same numerical grid to permit direct comparison.
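For concreteness, a single-path version of the update in Equation (19) is sketched below in plain Python. It is not the vectorized implementation described above. The drift is written with the minus sign of the conventional terminal-measure formulation, which is an assumption of this sketch, and all function names and inputs are hypothetical.

```python
import math
import random

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(g[i][k] * g[j][k] for k in range(j))
            g[i][j] = math.sqrt(s) if i == j else s / g[j][j]
    return g

def terminal_drift(L, sigma, rho, delta):
    """Drift under Q^{T_N}; the minus sign is the standard convention (assumed here)."""
    n = len(L)
    return [-sum(delta[j] * L[j] * sigma[i] * rho[i][j] * sigma[j]
                 / (1.0 + delta[j] * L[j]) for j in range(i + 1, n))
            for i in range(n)]

def log_euler_step(L, sigma, rho, delta, dt, rng):
    """One log-normal Euler step of Eq. (19) for a single path."""
    n = len(L)
    gamma = cholesky(rho)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = [sum(gamma[i][k] * eps[k] for k in range(i + 1)) for i in range(n)]
    mu = terminal_drift(L, sigma, rho, delta)
    return [L[i] * math.exp((mu[i] - 0.5 * sigma[i] ** 2) * dt
                            + sigma[i] * math.sqrt(dt) * z[i])
            for i in range(n)]

# Hypothetical inputs: four forwards at 3%, 20% vol, semi-annual accruals.
n = 4
rho = [[math.exp(-0.1 * abs(i - j)) for j in range(n)] for i in range(n)]
L_next = log_euler_step([0.03] * n, [0.2] * n, rho, [0.5] * n,
                        dt=0.25, rng=random.Random(0))
```

The exponential form keeps every simulated forward rate strictly positive, and the last forward rate has zero drift because it is a martingale under the terminal measure.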

2.4.2. Swaption Pricing and Implied Volatility

For each expiry–tenor pair

(T_{exp}, τ)

present in the market surface, the model-implied ATM payer swaption price was computed from simulated paths. Let

S (T_{exp})

denote the simulated par swap rate at the option expiry, constructed from the simulated discount factors. The payoff on each path is

Π = max (S (T_{exp}) - S_{0}, 0) A_{0},

(20)

discounted back to time 0 using the simulated or initial discount curve. The Monte Carlo estimator of the present value is

{\hat{PV}}^{MC} = \frac{1}{n_{paths}} \sum_{k = 1}^{n_{paths}} Π^{(k)} .

(21)

An implied Black-ATM volatility

σ_{ATM}^{model} (T_{exp}, τ)

is then recovered by inverting the Black-ATM pricing formula with

F = S_{0}

and annuity

A_{0}

, ensuring comparability with the market surface and consistency with standard practice (Andersen & Piterbarg, 2010; Hull, 2022; Rebonato, 2002).
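The estimator in Equations (20) and (21) and the subsequent inversion can be sketched as follows. At the money, the Black price reduces to A_0 F (2Φ(σ√T/2) − 1), so the implied volatility is available in closed form; using this shortcut instead of a numerical solver is a choice made for this sketch, and the function names and sample inputs are hypothetical.

```python
import math
from statistics import NormalDist

def mc_payer_swaption_pv(swap_rates_at_expiry, strike, annuity):
    """Eqs. (20)-(21): average of max(S(T_exp) - S_0, 0) * A_0 over paths."""
    payoffs = [max(s - strike, 0.0) * annuity for s in swap_rates_at_expiry]
    return sum(payoffs) / len(payoffs)

def black_atm_implied_vol(pv, fwd, annuity, t_exp):
    """Invert PV = A * F * (2 * Phi(sigma * sqrt(T) / 2) - 1) for sigma."""
    x = pv / (annuity * fwd)
    if not 0.0 <= x < 1.0:
        raise ValueError("price outside the attainable Black ATM range")
    return 2.0 * NormalDist().inv_cdf(0.5 * (1.0 + x)) / math.sqrt(t_exp)

# Hypothetical simulated swap rates on four paths (illustrative only).
pv_mc = mc_payer_swaption_pv([0.02, 0.03, 0.04, 0.05], strike=0.03, annuity=2.0)
```

A round trip through the forward pricing formula recovers the input volatility, which is a convenient unit test before the routine is applied to Monte Carlo prices.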

2.5. Calibration Objectives and Diagnostics

2.5.1. Vega-Weighted Swaption Data Loss

The primary calibration objective is a vega-weighted least squares error between market and model-implied ATM Black volatilities over all valid surface points (Hagan et al., 2002; Rebonato, 2002; Rebonato & White, 2009):

L_{data} = \frac{1}{N_{cells}} \sum_{(T_{exp}, τ)} {(\frac{{\hat{PV}}^{MC} (T_{exp}, τ) - {PV}^{mkt} (T_{exp}, τ)}{{Vega}^{mkt} (T_{exp}, τ)})}^{2} .

(22)

Here,

{PV}^{mkt}

is the market price implied by the market Black-ATM volatility, and

{Vega}^{mkt}

is the corresponding Black vega. This normalizes errors in “volatility points” and emphasizes regions where the market is most sensitive.

To reduce computational cost, a mini-batch strategy is employed: at each training step, only a small random subset of surface cells is used to estimate

L_{data}

. Over the course of training, all cells are visited repeatedly.
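A minimal sketch of the loss in Equation (22), including the mini-batch variant, is shown below. The dictionary keys, the example cells, and the batch size are hypothetical. The vega expression is the standard ATM Black vega, A F √T φ(σ√T/2).

```python
import math
import random

def black_atm_vega(fwd, annuity, sigma, t_exp):
    """ATM Black vega: A * F * sqrt(T) * phi(d1) with d1 = sigma * sqrt(T) / 2."""
    d1 = 0.5 * sigma * math.sqrt(t_exp)
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return annuity * fwd * math.sqrt(t_exp) * pdf

def data_loss(cells, batch_size=None, rng=None):
    """Eq. (22) over all cells, or over a random mini-batch if batch_size is set."""
    batch = cells if batch_size is None else rng.sample(cells, batch_size)
    return sum(((c["pv_model"] - c["pv_mkt"]) / c["vega_mkt"]) ** 2
               for c in batch) / len(batch)

# Hypothetical surface cells whose price error equals one vol point times vega.
cells = []
for t_exp, annuity in [(1.0, 1.9), (2.0, 4.5), (5.0, 8.0)]:
    vega = black_atm_vega(fwd=0.03, annuity=annuity, sigma=0.25, t_exp=t_exp)
    cells.append({"pv_mkt": 0.01, "pv_model": 0.01 + 0.01 * vega, "vega_mkt": vega})

full = data_loss(cells)
mini = data_loss(cells, batch_size=2, rng=random.Random(0))
```

In this constructed example every cell is off by exactly one volatility point in vega-normalized terms, so both the full and the mini-batch loss equal 10^{-4}.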

2.5.2. Structural Regularization and Diagnostics

In addition to the data loss, a light structural regularizer is introduced. Rather than computing a full second-order pricing PDE residual, a first-order proxy penalizes rapid time variation and large gradients of the neural outputs, following ideas from physics-informed and BSDE-based deep learning for pricing and control (Berg & Nystrom, 2018; Han et al., 2018; Huré et al., 2019; Raissi et al., 2019):

L_{struct} = E_{(t, L)} [{∥ \partial_{t} σ (t, L) ∥}^{2} + {∥ \nabla_{L} σ (t, L) ∥}^{2} + {∥ σ (t, L) ∥}^{2}],

(23)

where expectations are approximated using states visited along simulated paths. This encourages smoothness in time and state space without incurring the numerical overhead and instability of nested automatic differentiation.

The following diagnostics are tracked but not directly optimized:

The minimum eigenvalue of $ρ (t)$ over time and paths (PSD check);
The fraction of simulated forward rates that become negative (positivity check);
Deviations from martingale conditions for discounted swap rates;
Gradient norms of network parameters and incidence of NaN/Inf values.

These statistics are used to tune regularization weights and learning rates but are not explicitly included in

L_{total}

.

2.5.3. Total Objective

The overall loss used for neural calibration is

L_{total} = λ_{data} L_{data} + λ_{struct} L_{struct},

(24)

with

λ_{data} ≫ λ_{struct}

to prioritize market fit while maintaining a minimal level of smoothness and stability.

2.6. Pricing Error Diagnostics

In the ATM setting used throughout the empirical study,

K = S_{0}

at each expiry–tenor grid point. Model prices are compared against market-implied Black prices:

{PV}^{Black} = {DF}_{pay} \cdot Δ \cdot BlackCall (F, K, σ_{mkt}, T) .

(25)

Pricing errors are evaluated both in absolute basis points and as vega-weighted deviations:

\begin{matrix} Absolute error (bp) & = 10^{4} \cdot ({PV}^{model} - {PV}^{market}) \end{matrix}

(26)

\begin{matrix} Vega-weighted error & = \frac{{PV}^{model} - {PV}^{market}}{BlackVega (F, K, T, σ_{mkt})} . \end{matrix}

(27)

Model-implied volatilities

\hat{σ}

are obtained by inverting the Black formula:

{PV}^{model} = {DF}_{pay} \cdot Δ \cdot BlackCall (F, K, \hat{σ}, T) .

(28)

The implied volatility error is computed as

IV error = {\hat{σ}}^{model} - σ^{market}

(29)

and is reported separately for the classical and neural models.

2.7. Bucketed RMSE Analysis

To assess calibration quality across the swaption surface, we group swaption quotes into two-dimensional buckets defined by expiry and underlying swap tenor. Specifically,

Expiry buckets: $T \in [0, 1], (1, 2], (2, 5], (5, 10], (10, 30]$ ;
Tenor buckets: $τ \in [1, 2], (2, 5], (5, 10], (10, 30]$ .

Within each bucket, we compute the root mean squared error (RMSE) for implied volatility errors and for price errors:

\begin{matrix} RMSE (σ) & = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {({\hat{σ}}_{i} - σ_{i}^{mkt})}^{2}} \end{matrix}

(30)

\begin{matrix} RMSE (bp) & = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {({PV}_{i}^{model} - {PV}_{i}^{market})}^{2}} \cdot 10^{4} \end{matrix}

(31)

where N is the number of quotes in the bucket,

{\hat{σ}}_{i}

is the model-implied volatility, and

σ_{i}^{mkt}

is the corresponding market-implied volatility. Price RMSE is reported in basis points by scaling with

10^{4}

.
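The bucketing and Equations (30) and (31) can be sketched as follows. The tuple layout of the quotes and the example values are assumptions of this sketch; for price errors, the resulting RMSE would be multiplied by 10^4 to express it in basis points.

```python
import math
from bisect import bisect_left
from collections import defaultdict

EXPIRY_EDGES = [1.0, 2.0, 5.0, 10.0, 30.0]   # upper edges of [0,1], (1,2], ...
TENOR_EDGES = [2.0, 5.0, 10.0, 30.0]          # upper edges of [1,2], (2,5], ...

def bucket_index(x, upper_edges):
    """Index of the right-closed bucket containing x."""
    return bisect_left(upper_edges, x)

def bucketed_rmse(quotes):
    """quotes: iterable of (expiry, tenor, model_value, market_value)."""
    squared = defaultdict(list)
    for expiry, tenor, model, market in quotes:
        key = (bucket_index(expiry, EXPIRY_EDGES), bucket_index(tenor, TENOR_EDGES))
        squared[key].append((model - market) ** 2)
    return {key: math.sqrt(sum(v) / len(v)) for key, v in squared.items()}

# Hypothetical implied-volatility quotes (illustrative only).
quotes = [
    (0.5, 1.0, 0.21, 0.20),
    (1.0, 2.0, 0.19, 0.20),
    (5.0, 10.0, 0.25, 0.22),
]
rmse_by_bucket = bucketed_rmse(quotes)
```

Using `bisect_left` on the upper edges reproduces the right-closed intervals listed above, so a 2Y expiry falls in (1, 2] rather than (2, 5].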

2.8. Drop-One Expiry Jackknife

To test robustness to surface segmentation, we evaluate calibration accuracy under a leave-one-expiry-out protocol. For each expiry bucket

T_{drop}

, we recompute the RMSE after excluding all swaptions with expiry

T_{drop}

:

Jackknife {RMSE}_{- T_{drop}} = RMSE (\{IV errors for all swaptions with T \neq T_{drop}\}) .

(32)

This diagnostic highlights whether the reported improvements are concentrated in (or driven by) a particular expiry region, rather than being broadly distributed across the surface.
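A minimal sketch of Equation (32) is given below, assuming that the implied-volatility errors are available as (expiry label, error) pairs. The labels and error values are hypothetical.

```python
import math

def rmse(errors):
    """Root mean squared error of a list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def drop_one_expiry_rmse(iv_errors):
    """Eq. (32): map each dropped expiry to the RMSE over the remaining quotes."""
    expiries = sorted({expiry for expiry, _ in iv_errors})
    return {drop: rmse([e for expiry, e in iv_errors if expiry != drop])
            for drop in expiries}

# Hypothetical errors in which the 1Y expiry dominates the misfit.
iv_errors = [("1Y", 0.03), ("1Y", -0.03), ("5Y", 0.01), ("10Y", -0.01)]
jackknife = drop_one_expiry_rmse(iv_errors)
```

In this example, dropping the 1Y expiry lowers the RMSE sharply while dropping any other expiry raises it, which is the pattern the diagnostic is designed to reveal when errors are concentrated in one region.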

2.9. Statistical Significance Tests

Paired statistical tests were applied to determine whether neural calibration improves pricing accuracy in a statistically meaningful way:

Paired t-test: Tests the mean of the paired error differences across all instruments.
Wilcoxon signed-rank test: A non-parametric paired test based on the ranks of the absolute error differences.
Cohen’s d: Measures the effect size of the improvement:

$d = \frac{{\bar{X}}_{classic} - {\bar{X}}_{neural}}{s}, s = \text{pooled std. dev.}$
Proportion improved: The fraction of instruments for which the neural model’s error is lower than the classical model’s error.
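Three of these statistics can be sketched with the standard library alone, as shown below. Working on absolute errors, the equal-weight pooled standard deviation, and the example values are assumptions of this sketch. The p-value of the t-test and the Wilcoxon test are omitted because they require distribution tables or a statistics package such as SciPy.

```python
import math
from statistics import mean, stdev

def paired_summary(err_classic, err_neural):
    """Paired t statistic, Cohen's d (pooled std. dev.), and proportion improved."""
    a = [abs(e) for e in err_classic]
    b = [abs(e) for e in err_neural]
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))     # n - 1 degrees of freedom
    pooled = math.sqrt(0.5 * (stdev(a) ** 2 + stdev(b) ** 2))
    cohen_d = (mean(a) - mean(b)) / pooled
    prop_improved = sum(1 for d in diffs if d > 0) / n
    return {"t": t_stat, "cohen_d": cohen_d, "prop_improved": prop_improved}

# Hypothetical per-instrument errors (illustrative only).
classic = [0.020, 0.015, 0.030, 0.025]
neural = [0.018, 0.016, 0.024, 0.020]
summary = paired_summary(classic, neural)
```

Positive values of the t statistic and of Cohen's d indicate that the classical errors are larger on average, that is, that the neural model improves the fit.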

2.10. Summary Metrics

All diagnostics contribute to headline metrics, including

Overall implied volatility RMSE: ${IV}_{RMSE}$ ;
Overall pricing RMSE in basis points: ${BP}_{RMSE}$ ;
Percentage improvement: $\% Improvement = \frac{{RMSE}_{classic} - {RMSE}_{neural}}{{RMSE}_{classic}} \times 100$ ;
Statistical test p-values and effect sizes.

Together, these diagnostics validate the neural LMM’s ability to improve fit and maintain robustness across expiries, tenors, and calibration conditions.

2.11. Training Procedure and Numerical Safeguards

2.11.1. Optimization Scheme

The neural parameters

θ

are initialized around the classical solution

(a, b, β)

and optimized using a stochastic gradient method with a conservative learning rate, similar in spirit to other neural extensions of the LMM (Horváth et al., 2021; Liu et al., 2019; Sridi & Bilokon, 2023). Training proceeds in micro-batches: for each time step in the simulation grid, a small subset of paths (e.g., 8) and a mini-batch of swaption surface cells are used to compute

L_{total}

and its gradient. The number of gradient updates per time step is adaptively bounded based on recent loss levels, with explicit caps to avoid excessive computation.

2.11.2. Stability Mechanisms

Several numerical safeguards are employed—gradient clipping, output clipping, correlation projection, and finite-value checks—to maintain the stability of the neural-augmented dynamics and avoid numerical arbitrage (Horváth et al., 2021; Kidger et al., 2021; Marshall et al., 2024):

Gradient clipping: Global norm clipping is applied to parameter gradients to prevent exploding updates.
Value clipping: Neural outputs are clipped to predefined bounds before constructing volatilities and correlation factors.
Correlation projection: The correlation output is normalized to unit scale, symmetrized, and diagonally jittered to ensure stable Cholesky factorization.
Numerics checks: All intermediate tensors involved in loss computation are passed through finite-value checks; NaNs or Infs trigger diagnostic flags rather than silent failure.
Shared code path: Classical and neural simulations share the same simulation and pricing routines, with the neural block deactivated when $f_{θ}$ is absent, ensuring that the classical benchmark remains a stable point of reference.

These choices collectively convert the initially overparameterized and numerically fragile prototype into a tractable, interpretable, and computationally manageable neural-augmented LMM suitable for swaption surface calibration.

3. Discussion

3.1. Computational Environment and Performance

3.1.1. Hardware and Software Configuration

All experiments were conducted on a single workstation with the following specifications:

CPU: Intel® Core™ i7-8700K @ 3.70 GHz (6 cores/12 threads);
System Memory: 16 GB RAM;
GPU: NVIDIA GeForce RTX 2080 SUPER (8 GB VRAM);
Driver/CUDA: NVIDIA driver 440.33.01, CUDA 10.2;
Operating Environment: Docker container (tensorflow/tensorflow:2.12.0-gpu-jupyter);
Deep Learning Framework: TensorFlow 2.12.0.

GPU utilization during experiments remained within nominal thermal and power envelopes, with no concurrent GPU workloads active. All reported timings correspond to wall-clock measurements on this single-GPU system.

3.1.2. Model Configuration and Runtime Parameters

The numerical experiments used the following simulation and calibration setup:

Number of Monte Carlo paths: $n_{paths} = 200$ ;
Time step: $Δ t = 0.25$ ;
Number of time steps: $n_{steps}$ (model-dependent);
Mini-batch size: equal to the number of surface grid cells;
Optimization: gradient-based calibration using automatic differentiation.

Initialization and Parameterization Details (Directly from the Code)

The model is a compact MLP with a shared trunk, two heads (volatility and correlation), and a scalar mixing gate:

[t, L (t)] \in R^{N + 1} \overset{Dense (4, tanh)}{\to} h (t, L),

h (t, L) \overset{vol head : Dense (8, ReLU) \to Dense (N)}{\to} vol params,

h (t, L) \overset{corr head : Dense (4, ReLU) \to Dense (N (N + 1) / 2)}{\to} triangular params,

h (t, L) \overset{Dense (1, sigmoid)}{\to} w (t, L) .

All computations are in float32 (inputs are explicitly cast in call, and T_forw is stored as tf.float32). Dense-layer initializers are the TensorFlow/Keras defaults (kernel: Glorot/Xavier uniform; bias: zeros), since no custom initializers are specified in the class.

3.1.3. Volatility Map (Fixed by Code)

The code implements a tenor-decaying baseline

τ_{i} (t) = {(T_{i}^{forw} - t)}^{+}, σ_{i}^{base} (t) = a_{vol} exp (- b_{decay} τ_{i} (t)),

and a small multiplicative perturbation produced by the network:

u_{i} (t, L) = tanh (vol_head {(h (t, L))}_{i}) \in [- 1, 1], σ_{i} (t) = σ_{i}^{base} (t) exp (s_{pert} u_{i} (t, L)),

with

s_{pert} \equiv perturb_scale = 0.15

. For numerical stability, the output volatility is explicitly clipped in the code to

σ_{i} (t) \in [10^{- 6}, 3.0] .
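The volatility map above can be sketched in plain Python as follows. The raw head outputs stand in for the network's vol_head values, and the baseline parameters are hypothetical. The scale 0.15 and the clipping bounds are the values quoted in the text.

```python
import math

def perturbed_vol(a_vol, b_decay, t, t_forw, head_out,
                  perturb_scale=0.15, lo=1e-6, hi=3.0):
    """Baseline a*exp(-b*tau) times exp(scale * tanh(raw)), then clipped to [lo, hi]."""
    out = []
    for T, raw in zip(t_forw, head_out):
        base = a_vol * math.exp(-b_decay * max(T - t, 0.0))
        sigma = base * math.exp(perturb_scale * math.tanh(raw))
        out.append(min(max(sigma, lo), hi))
    return out

# Hypothetical baseline parameters and raw head outputs (illustrative only).
t_forw = [0.5, 1.0, 1.5]
vols = perturbed_vol(a_vol=0.20, b_decay=0.10, t=0.0, t_forw=t_forw,
                     head_out=[50.0, 0.0, -50.0])
```

Because tanh is bounded, the multiplicative factor lies between exp(−0.15) ≈ 0.861 and exp(0.15) ≈ 1.162, so the network can move each volatility by roughly 14–16% around the baseline regardless of how large the raw outputs become.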

3.1.4. Correlation Map

The neural correlation candidate is SPSD by construction. The correlation head outputs a vector that is reshaped into a lower-triangular matrix

T (t, L)

(via tfp.math.fill_triangular). The diagonal is forced to be strictly positive,

T_{i i} (t, L) \leftarrow softplus (T_{i i} (t, L)) + 10^{- 4},

and the SPSD covariance is formed as

Σ^{nn} (t) = T (t, L) T {(t, L)}^{⊤} .

This is converted to a correlation matrix through diagonal rescaling:

C^{nn} (t) = D {(t)}^{- 1} Σ^{nn} (t) D {(t)}^{- 1}, D (t) = diag (\sqrt{max (diag (Σ^{nn} (t)), 10^{- 8})}) .

Finally, symmetry and conditioning are enforced by a code-level step that is applied twice, once here and once after the mixing step described below:

C \leftarrow \frac{1}{2} (C + C^{⊤}) + 10^{- 5} I .

Importantly, PSD is ensured by the

T T^{⊤}

construction and correlation normalization.

The learned correlation is anchored to an exponential baseline

C_{i j}^{base} = exp (- β_{opt} | i - j |),

and mixed with the neural candidate using a sigmoid gate with an explicit cap,

w (t, L) = corr_mix (h (t, L)) \cdot w_{max}, w_{max} \equiv w_cap = 0.15,

so that

w (t, L) \in [0, 0.15]

in the provided code. The final correlation is

C (t) = (1 - w (t, L)) C^{base} + w (t, L) C^{nn} (t),

followed by the symmetrization/jitter step above.
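The anchored mixing can be sketched as follows. The gate logit stands in for the pre-sigmoid output of the corr_mix layer, and the example matrices are hypothetical. The cap of 0.15 and the jitter of 10^{-5} are the values quoted in the text.

```python
import math

def mix_correlation(c_base, c_nn, gate_logit, w_cap=0.15, jitter=1e-5):
    """Blend baseline and neural correlations with a capped sigmoid weight."""
    w = w_cap / (1.0 + math.exp(-gate_logit))        # sigmoid scaled into [0, w_cap]
    n = len(c_base)
    c = [[(1.0 - w) * c_base[i][j] + w * c_nn[i][j] for j in range(n)]
         for i in range(n)]
    # Symmetrization and diagonal jitter, as in the step above.
    mixed = [[0.5 * (c[i][j] + c[j][i]) + (jitter if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return mixed, w

# Hypothetical inputs: exponential baseline and an identity neural candidate.
n = 3
c_base = [[math.exp(-0.1 * abs(i - j)) for j in range(n)] for i in range(n)]
c_nn = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
c_mixed, w = mix_correlation(c_base, c_nn, gate_logit=0.0)
```

Since the result is a convex combination of two positive semidefinite matrices plus a positive diagonal term, it remains positive definite, and with a zero logit the gate sits at half the cap (w = 0.075).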

3.1.5. Overfitting Mitigation Under Mini-Batch Calibration (Clarified Using the Code Design)

Overfitting is mitigated primarily by structural constraints that are explicit in the implementation:

Low capacity: The shared representation comprises only four units, and the heads are small (eight and four hidden units), limiting function complexity.
Bounded, small volatility deviations: Perturbations are bounded by tanh and scaled by perturb_scale $= 0.15$ , and then volatility is clipped to $[10^{- 6}, 3.0]$ . This prevents the network from fitting idiosyncratic quotes by extreme local volatility distortions.
PSD-by-construction correlation with anchored mixing: The correlation candidate is constrained via $T T^{⊤}$ and correlation normalization, and its impact is restricted by a capped mixing weight $w (t, L) \in [0, 0.15]$ (with w_cap non-trainable in the provided code). Thus, the learned correlation can only deviate modestly from the interpretable exponential baseline unless the cap is explicitly relaxed.
Numerical regularization: Small diagonal jitter ( $10^{- 5} I$ ) is added to the correlation output, improving conditioning and reducing sensitivity to mini-batch noise.

3.1.6. Runtime Performance

Figure 1 compares the wall-clock runtimes (in seconds) of the classical and neural implementations under identical hardware conditions.

For example, for Q2 2024, the end-to-end computational overhead of the neural approach corresponds to an approximate slowdown factor of

Slowdown \approx \frac{0.218 + 20.141}{0.602 + 16.086} = \frac{20.359}{16.688} \approx 1.22 \times,

relative to the classical calibration–simulation pipeline. For some runs, the classical pipeline is slightly slower because of BLAS threading and CPU state effects. Despite the increased runtime, the neural formulation enables substantially richer model expressiveness and end-to-end differentiability, which are not available in the classical setting.

3.2. Model Comparison on OOS and Long-Tenor Holdout

For each dataset (currency–quarter), we compared a classic model against a neural model using the root mean squared error (RMSE) measured in volatility points. Given targets

y_{i}

and predictions

{\hat{y}}_{i}

on a test set of size n,

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}} .

(33)

A lower RMSE indicates better out-of-sample fit.

3.2.1. Per-Row 80/20 Masked-Holdout Evaluation

Figure 2 shows the masked-holdout RMSE computed on a fixed 20% subset of tenor cells selected independently within each expiry row (dataset-specific mask). For each dataset d, we denote the classic and neural test errors by

{RMSE}_{classic}^{(d)}

and

{RMSE}_{neural}^{(d)}

. Bars are shown side-by-side to make absolute error differences visually comparable across datasets.

3.2.2. Long-Tenor Holdout Evaluation

Figure 3 evaluates both models on a long-tenor holdout subset defined by a tenor cutoff of

D_{holdout}^{(d)} = {(x_{i}, y_{i}) \in D^{(d)} : T_{end} (i) > T_{0}}, T_{0} = 8.5 .

(34)

RMSE is then computed on

D_{holdout}^{(d)}

using Equation (33). This isolates performance on the long end of the surface, where extrapolation risk is typically higher.

3.2.3. Relative Improvement Summary

To summarize the benefit of the neural model, Figure 4 plots the percentage improvement versus the classic baseline:

Δ_{%}^{(d)} = \frac{{RMSE}_{classic}^{(d)} - {RMSE}_{neural}^{(d)}}{{RMSE}_{classic}^{(d)}} \times 100 .

(35)

Positive

Δ_{%}^{(d)}

indicates that the neural model reduces error; negative values would indicate degradation. Two bars per dataset are shown: one for the OOS per-row test set and one for the long-tenor holdout test set.

3.2.4. Classical vs. Neural Scatter

Figure 5 plots $({RMSE}_{classic}^{(d)}, {RMSE}_{neural}^{(d)})$ for each dataset $d$. The diagonal line $y = x$ marks equal performance:

{RMSE}_{neural}^{(d)} = {RMSE}_{classic}^{(d)} .

(36)

Points below the diagonal correspond to ${RMSE}_{neural}^{(d)} < {RMSE}_{classic}^{(d)}$ (neural better), while points above indicate the opposite. The distance to the diagonal provides a visual cue for effect size.
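The visual cue can be made numerical: the perpendicular distance from a point $(x, y)$ to the line $y = x$ is $|x - y| / \sqrt{2}$. The sketch below returns a signed version that is positive when the neural model is better. The function name and the sample values are hypothetical.

```python
import math

def signed_gap_to_diagonal(rmse_classic, rmse_neural):
    """Signed perpendicular distance to y = x.

    Positive when the point lies below the diagonal (neural RMSE lower).
    """
    return (rmse_classic - rmse_neural) / math.sqrt(2.0)

# Hypothetical pair of test RMSE values for one dataset.
gap = signed_gap_to_diagonal(0.20, 0.18)
print(gap > 0)
```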

3.2.5. Interpretation Note

Absolute RMSE levels differ across currencies/quarters due to underlying market regime and dataset scale. For a cross-dataset comparison of model gain, Equation (35) (percentage reduction) is the most comparable metric, while Figure 2 and Figure 3 emphasize absolute error magnitudes.

3.3. Implied Volatility Error

Primary metrics (headline IV/PV RMSE):

Currency	Year	Quarter	IV RMSE (Classic)	IV RMSE (Neural)	$Δ$ IV (%)	PV RMSE bp (Classic)	PV RMSE bp (Neural)	$Δ$ PV (%)	n
EUR	2024	Q2	0.216	0.200	7.69	289.8	258.3	10.87	213
EUR	2024	Q3	0.216	0.195	9.70	306.3	265.2	13.44	213
EUR	2024	Q4	0.231	0.207	10.21	293.7	251.7	14.32	213
EUR	2025	Q2	0.172	0.155	10.13	277.0	236.1	14.74	213
GBP	2024	Q2	0.160	0.148	7.63	230.3	201.9	12.33	213
GBP	2024	Q3	0.159	0.147	7.41	229.8	202.4	11.93	213
GBP	2024	Q4	0.147	0.137	6.69	225.7	202.1	10.49	213
GBP	2025	Q2	0.133	0.120	9.50	238.4	205.9	13.64	213
USD	2024	Q2	0.175	0.160	8.63	253.8	222.8	12.23	213
USD	2024	Q3	0.187	0.170	8.96	267.5	231.1	13.61	213
USD	2024	Q4	0.160	0.145	9.22	258.9	226.2	12.61	213
USD	2025	Q2	0.155	0.141	8.56	259.1	226.6	12.52	213

Secondary metrics (vega-weighted vol-point RMSE):

Currency	Year	Quarter	VW Vol-Pts RMSE (Classic)	VW Vol-Pts RMSE (Neural)	$Δ$ VW (%)	n
EUR	2024	Q2	0.213	0.197	7.17	213
EUR	2024	Q3	0.211	0.192	8.92	213
EUR	2024	Q4	0.226	0.205	9.42	213
EUR	2025	Q2	0.170	0.154	9.63	213
GBP	2024	Q2	0.159	0.147	7.39	213
GBP	2024	Q3	0.158	0.146	7.17	213
GBP	2024	Q4	0.146	0.137	6.50	213
GBP	2025	Q2	0.132	0.120	9.26	213
USD	2024	Q2	0.173	0.159	8.31	213
USD	2024	Q3	0.185	0.169	8.55	213
USD	2024	Q4	0.158	0.144	8.90	213
USD	2025	Q2	0.153	0.141	8.25	213

The empirical results of this study demonstrate that neural augmentation of an OIS-discounted, EURIBOR/SONIA/SOFR-referenced LMM framework yields robust and systematic improvements in swaption surface calibration relative to classical parametric specifications (Brace et al., 1997; Horváth et al., 2021; Rebonato, 2002). Across all currencies examined (EUR, GBP, USD) and multiple quarterly datasets from 2024 to 2025, the neural-augmented LMM achieved reductions in implied volatility RMSE of approximately 7–10% and reductions in PV RMSE of 10–15% relative to the classical exponential decay volatility and correlation parameterization (Andersen & Piterbarg, 2010; Hull, 2022; Rebonato, 2002). These gains were observed uniformly across surface regions, including sparsely quoted and long-dated tenors, indicating that the improvement is neither an artifact of local overfitting nor a consequence of sensitivity to specific surface segments (Karlsson et al., 2017; Rebonato, 2004). The consistency of these results reinforces the central thesis of this work: neural perturbations of the LMM volatility and correlation structures can outperform classical specifications while maintaining interpretability, numerical discipline, and arbitrage-aware dynamics (Andersen & Piterbarg, 2010; Henry-Labordère, 2008; Horváth et al., 2021).

From the perspective of prior research, these findings lie at the intersection of classical analytical models (e.g., SABR and HJM-type formulations) and modern machine learning approaches such as neural SDEs and PINN-based pricing networks (Hagan et al., 2002; Obłój, 2007; Rebonato, 2002). SABR remains widely used for EURIBOR-, SONIA-, and SOFR-referenced swaptions, but its single-factor structure and cross-maturity parameter inconsistency—documented extensively in both the pre- and post-LIBOR literature—limit its robustness in sparse-tenor and long-dated regions (Hagan et al., 2002; Obłój, 2007). The neural-augmented LMM does not require smile data and produces a globally consistent volatility–correlation structure, thereby addressing a well-known limitation of SABR (Hagan et al., 2002; Obłój, 2007; Richert & Buch, 2022).

Machine learning diffusion models offer greater flexibility but often sacrifice interpretability and require nontrivial arbitrage enforcement (Chen et al., 2018; Kidger et al., 2021; Raissi et al., 2019). Pure neural SDEs introduce latent state coordinates and non-economic drift structures, which complicate model validation, stress testing, and risk sensitivity analysis (Horváth et al., 2021; Kidger et al., 2021). In contrast, the present hybrid approach integrates a compact neural network into the LMM’s volatility and correlation layers while preserving HJM drift consistency, log-normal forward-rate dynamics, and the established economic interpretation of forward rates (Andersen & Piterbarg, 2010; Brace et al., 1997; Rebonato, 2002). This ensures that neural components enhance—rather than replace—the structured dynamics of the LMM.

The improvements observed here arise precisely in those regions where classical exponential decay parametrizations are known to underperform: mid-to-long expiries and long-dated underlying swaps (Karlsson et al., 2017; Rebonato, 2004). These regions of the OIS-discounted swaption surface—particularly long-dated SONIA and SOFR tenors—suffer from structural sparsity and elevated uncertainty, making them historically difficult for rigid parametric specifications (Hagan et al., 2002; Rebonato, 2004). The neural-augmented LMM captures subtle maturity-dependent curvature and cross-tenor geometry that classical forms cannot express. The low-rank factor structure for correlation perturbations introduces flexibility without violating PSD or requiring full-matrix calibration, which is historically unstable (Henry-Labordère, 2008; Rebonato, 2004).

Although numerical safeguards (correlation normalization with symmetrization/jitter, log-normal updates, gradient clipping) appear in the methodology, they do not alter the model’s economic structure; they simply preserve numerical stability in the presence of neural perturbations. The backbone LMM remains intact, demonstrating the usefulness of the classical model as a scaffold for neural refinement.
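The combination of a low-rank correlation perturbation with symmetrization, jitter, and normalization can be sketched as follows. This is an illustrative construction under stated assumptions, not the paper's implementation: the additive form `base + factors @ factors.T`, the function name, the jitter size, and the exponential-decay base matrix are all hypothetical choices.

```python
import numpy as np

def stabilized_correlation(base_corr, factors, jitter=1e-8):
    """Perturb a correlation matrix with a low-rank term and re-normalize.

    base_corr : (n, n) PSD correlation matrix (classical kernel).
    factors   : (n, k) low-rank loadings, with k much smaller than n.
    """
    # Adding B B^T keeps the matrix PSD because B B^T is itself PSD.
    perturbed = base_corr + factors @ factors.T
    # Symmetrize to remove floating-point asymmetry.
    sym = 0.5 * (perturbed + perturbed.T)
    # A small diagonal jitter keeps the matrix away from singularity.
    sym = sym + jitter * np.eye(sym.shape[0])
    # Rescale so that the diagonal is exactly one again.
    scale = 1.0 / np.sqrt(np.diag(sym))
    return sym * np.outer(scale, scale)

# Hypothetical 5-tenor example with a rank-2 perturbation.
tenors = np.arange(1.0, 6.0)
base = np.exp(-0.1 * np.abs(tenors[:, None] - tenors[None, :]))
rng = np.random.default_rng(0)
factors = 0.05 * rng.standard_normal((5, 2))
corr = stabilized_correlation(base, factors)
print(np.allclose(np.diag(corr), 1.0), np.linalg.eigvalsh(corr).min() > 0)
```

The final rescaling is a congruence transformation by a positive diagonal matrix, so it restores the unit diagonal without affecting positive semidefiniteness. Only `n * k` extra parameters are introduced, which avoids calibrating a full matrix.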

Although this work was calibrated to OIS-discounted swaption cubes referencing EURIBOR, SONIA, and SOFR—the post-LIBOR standard—the findings retain relevance for any forward-rate-based model under multi-curve discounting. The difficulty of adapting traditional LMM structures to backward-looking compounded benchmarks highlights the potential for neural overlays to mitigate parametric rigidity (Alaya et al., 2021). Although a full SOFR-specific extension is beyond the present scope, the success of neural perturbations here suggests promising directions for multi-curve or backward-looking frameworks.

In summary, the neural-augmented LMM improves calibration accuracy across EURIBOR-, SONIA-, and SOFR-referenced swaption surfaces while preserving the interpretability and arbitrage-aware structure expected of modern OIS-based interest-rate models (Horváth et al., 2021; Liu et al., 2019). The hybrid design balances interpretability with expressive power, offering a reproducible and regulatorily acceptable path toward machine learning–enhanced interest-rate modeling.

4. Conclusions

This work proposes and evaluates a neural-augmented extension of an OIS-discounted LMM referencing EURIBOR/SONIA/SOFR, enhancing swaption surface calibration without compromising interpretability or structural guarantees. Unlike full neural diffusion models that replace the underlying structure, the present method preserves forward-rate dynamics, HJM drift consistency, and log-normal evolution, modifying only the volatility and correlation layers through a compact, regularized neural network (Andersen & Piterbarg, 2010; Brace et al., 1997; Hagan et al., 2002; Horváth et al., 2021; Kidger et al., 2021).

Empirical evaluation across the EUR-EURIBOR, GBP-SONIA, and USD-SOFR swaption datasets from 2024 to 2025 demonstrated systematic improvements over classical exponential decay formulations (Hull, 2022; Rebonato, 2002). Improvements of approximately 7–10% in implied volatility RMSE (6.69–10.21% across datasets) and 10–15% in PV RMSE (10.49–14.74%) were observed across all datasets, with no deterioration in any region of the surface. This suggests that neural perturbations address structural deficiencies in classical parametrizations, especially in long-dated or sparsely quoted regions (Karlsson et al., 2017; Rebonato, 2004).

These findings underline the value of hybrid classical–neural designs in OIS-discounted settings, where the transparency of the LMM coexists with the expressive power needed to match modern EURIBOR, SONIA, and SOFR volatility geometry (Andersen & Piterbarg, 2010; Henry-Labordère, 2008). This structure aligns with model-risk requirements and integrates naturally into existing trading-desk calibration pipelines.

The observed improvements in Table 1 (additional charts are provided in Appendix A) suggest broader implications for OIS-based markets, where backward-looking compounded benchmarks such as SOFR, SONIA, and €STR introduce modeling challenges that neural perturbations may help mitigate (Liu et al., 2019; Richert & Buch, 2022). Future work primarily concerns two directions: broader empirical validation once consistent non-ATM swaption grids are available, and application of the same constrained-overlay mechanism on top of richer classical backbones (e.g., multi-factor LMM variants) rather than changes to the core methodology (Kidger et al., 2021; Liu et al., 2019; Richert & Buch, 2022). Hybrid designs such as the one presented here offer a path toward next-generation interest rate models that remain both interpretable and data-driven.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Swaption data is available from Bloomberg under the following tickers: SWAPTION VOLATILITY CUBE USD-SOFR/EUR-EURIBOR/GBP-SONIA.

Conflicts of Interest

The author declares no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

ATM	At-the-money;
BSDE	Backward stochastic differential equation;
CDF	Cumulative distribution function;
DF	Discount factor;
EUR	Euro area currency (euro);
EURIBOR	Euro Interbank Offered Rate;
FRA	Forward-rate agreement;
GAN	Generative adversarial network;
GBP	British pound sterling;
HJM	Heath–Jarrow–Morton model;
IBOR	Interbank Offered Rate (generic benchmark family);
IV	Implied volatility;
LMM	Libor market model;
MC	Monte Carlo;
MLP	Multi-layer perceptron;
OIS	Overnight indexed swap (discounting curve);
OLS	Ordinary least squares;
PDF	Probability density function;
PINN	Physics-informed neural network;
PV	Present value;
PDE	Partial differential equation;
PSD	Positive semidefinite;
Q	Risk-neutral probability measure (when written as $Q$);
RMSE	Root mean squared error;
SABR	Stochastic Alpha Beta Rho volatility model;
SDE	Stochastic differential equation;
SOFR	Secured Overnight Financing Rate;
SONIA	Sterling Overnight Index Average;
USD	United States dollar;
VW	Vega-weighted.

Appendix A. Charts

Appendix A.1. USD Surface Comparison

Appendix A.2. EUR Surface Comparison

Appendix A.3. GBP Surface Comparison

Appendix A.4. USD Correlation Difference

Appendix A.5. EUR Correlation Difference

Appendix A.6. GBP Correlation Difference

Appendix A.7. USD Model Implied Smile

Appendix A.8. EUR Model Implied Smile

Appendix A.9. GBP Model Implied Smile

Appendix A.10. USD NN Volatility Output Across Time and Tenor

Appendix A.11. EUR NN Volatility Output Across Time and Tenor

Appendix A.12. GBP NN Volatility Output Across Time and Tenor

Appendix A.13. USD Correlation Mix Weight

Appendix A.14. EUR Correlation Mix Weight

Appendix A.15. GBP Correlation Mix Weight

Appendix A.16. USD Volatility Term Structure

Appendix A.17. EUR Volatility Term Structure

Appendix A.18. GBP Volatility Term Structure

References

Alaya, M. B., Kebaier, A., & Sarr, D. (2021). Deep calibration of interest rates model. arXiv, arXiv:2110.15133. Available online: https://arxiv.org/abs/2110.15133 (accessed on 28 November 2025).
Andersen, L. B., & Piterbarg, V. V. (2010). Interest rate modeling, Volume III: Products and risk management. Atlantic Financial Press.
Belomestny, D., & Schoenmakers, J. (2009). A jump-diffusion Libor model and its robust calibration (Tech. Rep. No. RQUF-2008-0135). Weierstrass Institute for Applied Analysis and Stochastics (WIAS). Available online: https://www.wias-berlin.de/people/schoenma/RQUF-2008-0135_Final.pdf (accessed on 28 November 2025).
Berg, J., & Nyström, K. (2018). A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing, 317, 28–41.
Brace, A., Gatarek, D., & Musiela, M. (1997). The market model of interest rate dynamics. Mathematical Finance, 7(2), 127–147.
Chen, R. T., Rubanova, Y., Bettencourt, J., & Duvenaud, D. K. (2018). Neural ordinary differential equations. Advances in Neural Information Processing Systems, 31, 7. Available online: https://user.eng.umd.edu/~austin/ence688p.d/papers-reading/NN-Ordinary-Differential-Equations2018.pdf (accessed on 28 November 2025).
Glasserman, P., & Kou, S. (2003). The term structure of simple forward rates with jump risk. Mathematical Finance, 13(3), 383–410. Available online: https://www.columbia.edu/~sk75/GlassKou.pdf (accessed on 28 November 2025).
Glasserman, P., & Merener, N. (2003). Numerical solution of jump-diffusion LIBOR market models. Finance and Stochastics, 7(1), 1–27. Available online: https://link.springer.com/article/10.1007/s007800200076 (accessed on 28 November 2025).
Hagan, P. S., Kumar, D. K., Lesniewski, A. S., & Woodward, D. E. (2002). Managing smile risk. Wilmott Magazine, 1, 84–108.
Han, J., Jentzen, A., & E, W. (2018). Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences, 115(34), 8505–8510.
Henry-Labordère, P. (2008). Analysis, geometry, and modeling in finance: Advanced methods in option pricing. Chapman and Hall/CRC.
Horváth, B., Muguruza, A., & Tomas, J. (2021). Deep learning volatility: A deep neural network perspective on pricing and calibration in (rough) volatility models. Quantitative Finance, 21(11), 1731–1749.
Hull, J. C. (2022). Options, futures, and other derivatives (11th ed.). Pearson Education.
Huré, C., Pham, H., & Warin, X. (2019). Deep neural networks algorithms for stochastic control problems on finite horizon: Numerical applications. ESAIM: Proceedings and Surveys, 65, 32–50.
James, J., & Webber, N. (2000). Interest rate modelling. Wiley.
Karlsson, P., Pilz, K. F., & Schlögl, E. (2017). Calibrating a market model with stochastic volatility to commodity and interest rate risk. Quantitative Finance, 17(6), 907–925.
Kidger, P., Foster, J., Li, X., Oberhauser, H., & Lyons, T. (2021). Neural SDEs as infinite-dimensional GANs. arXiv, arXiv:2106.01845.
Liu, S., Borovykh, A., Grzelak, L. A., & Oosterlee, C. W. (2019). A neural network-based framework for financial model calibration. Journal of Mathematics in Industry, 9, 1–24.
Marshall, N., Xiao, K. L., Agarwala, A., & Paquette, E. (2024). The dynamics of SGD with gradient clipping in high dimensions. arXiv, arXiv:2406.11733.
Obłój, J. (2007). Fine-tune your smile: Correction to Hagan et al.’s formula. arXiv, arXiv:0708.0998.
Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707.
Rebonato, R. (2002). Modern pricing of interest-rate derivatives: The LIBOR market model and beyond. Princeton University Press.
Rebonato, R. (2004). Volatility and correlation: The perfect hedger and the fox. John Wiley & Sons.
Rebonato, R., & White, M. (2009). Linking caplets and swaptions prices in the LMM–SABR model. The Journal of Computational Finance, 13(2), 1–43.
Richert, I., & Buch, R. (2022). Interpolation of missing swaption volatility data using Gibbs sampling on variational autoencoders. arXiv, arXiv:2204.10400.
Sridi, A., & Bilokon, P. (2023). Applying deep learning to calibrate stochastic volatility models. arXiv, arXiv:2309.07843. Available online: https://arxiv.org/abs/2309.07843 (accessed on 28 November 2025).
Steinrücke, M., Zagst, R., & Swishchuk, A. (2015). The Markov-switching jump-diffusion LIBOR market model. Quantitative Finance, 15(3), 455–476.

Figure 1. Classic vs Neural implementation runtime.

Figure 2. Per-row 80/20 masked holdout: test RMSE comparison between classic and neural models.

Figure 3. Long-tenor holdout ($T_{end} > 8.5$): test RMSE comparison between classical and neural models.

Figure 4. Relative improvement of neural vs. classical models on test RMSE (%): OOS per-row and long-tenor holdout. Positive values indicate lower RMSE for the neural model.

Figure 5. Scatter plot of OOS per-row test RMSE: neural vs. classical. The diagonal $y = x$ indicates equal performance; points below the line favor the neural model.

Table 1. Summary of calibration improvements from neural-augmented LMM relative to the classical model.

Dataset	IV RMSE Improvement (%)	PV RMSE Improvement (%)
EUR 2024 Q2	7.69	10.87
EUR 2024 Q3	9.70	13.44
EUR 2024 Q4	10.21	14.32
EUR 2025 Q2	10.13	14.74
GBP 2024 Q2	7.63	12.33
GBP 2024 Q3	7.41	11.93
GBP 2024 Q4	6.69	10.49
GBP 2025 Q2	9.50	13.64
USD 2024 Q2	8.63	12.23
USD 2024 Q3	8.96	13.61
USD 2024 Q4	9.22	12.61
USD 2025 Q2	8.56	12.52

