Research on Financial Default Model with Stochastic Intensity Using Filtered Likelihood Method

Xiangdong Liu; Jiahui Wu; Xianglong Li

doi:10.3390/math11143061

,

and

School of Economics, Jinan University, Guangzhou 510632, China

^*

Author to whom correspondence should be addressed.

Mathematics2023, 11(14), 3061;https://doi.org/10.3390/math11143061

This article belongs to the Special Issue Advances in Mathematical Modelling and Statistical Methods for Risk Management

Version Notes

Order Reprints

Abstract

This paper investigates the financial default model with stochastic intensity by incomplete data. On the strength of the process-designated point process, the likelihood function of the model in the parameter estimation can be decomposed into the factor likelihood term and event likelihood term. The event likelihood term can be successfully estimated by the filtered likelihood method, and the factor likelihood term can be calculated in a standardized manner. The empirical study reveals that, under the filtered likelihood method, the first model outperforms the other in terms of parameter estimation efficiency, convergence speed, and estimation accuracy, and has a better prediction effect on the default data in China’s financial market, which can also be extended to other countries, which is of great significance in the default risk control of financial institutions.

Keywords:

incomplete observable data; financial default markets; marked point process; self-exciting process; filtered likelihood method

MSC:

91G30; 91G40; 91G80

1. Introduction

Generalized from stochastic processes, marked point processes have been widely applied in fields such as informatics, physics, and finance. A marked point process can be used to describe common bond defaults in financial markets [1], where the marker variable

U_{n}

can be seen as the cumulative value of the default amount from zero to the corresponding moment [2]. Financial default models have been studied extensively by foreign academics. For example, Gordy et al. [3] developed the notion of rating-based bank capital risk and presented a single financial default model affected by visible systematic risk. Das et al. [4] and Duffie et al. [5] examined US financial default data from 1979 to 2004 and discovered that, in addition to observable systematic risk, all enterprises are vulnerable to unobservable systematic risk. Based on this, we present a model of the financial default with unobservable latent factors and an O-U process for studying the effect of latent variables on default intensity. Our study reveals that the model with latent variables is more accurate in depicting default events than the model with simply observable systematic risk. Azizpour et al. [6] also discovered that cluster contagion plays a crucial role in the dynamics of model parameters, and they put forward a financial default model that includes cluster contagion and latent variables. In this model, the latent variables are described by the conventional CIR model, and the cluster contagion is described by Hawkes’ standard self-excited process model [7]. The financial default model with the self-excited term outperforms the classic model when it comes to describing default events. Meanwhile, Chen et al. [8] applied the financial default model with the cluster contagion factor to give a derivation of the CDS pricing strategy based on this model, and proved that the financial default model with the cluster contagion factor also has better performance in the portrayal of default events in the supply chain system. Wang et al. [9] investigated how the non-financial sector affects the financial system, extended default clustering estimates to the non-financial sector using data on listed companies, and documented significant time-varying and cross-sectional changes in default clustering across different non-financial sectors in China. Finally, it was found that stock market volatility, fixed asset investment and credit-GDP gaps are key determinants of changes in time-series default clusters. Li et al. [10] proposed a simplified model for the embedded credit risk of corporate bonds. They specified the default risk rate as an affine function of a set of influencing variables and, in order to capture the clustering properties in certain extreme cases, employed a Hawkes jump-diffusion process to model these variables to derive a semi-analytical pricing formula for default bonds. Kai Xing et al. [11] found that macro indicators and NBER recession indicators are significant in explaining and predicting default clusters, even with a lag of 3 months.

The explanatory variables determining the default intensity in different financial default models are often different, and hence the estimation methods of the model parameters often differ from each other. On this premise, Ogata [12] proposed a point-process default model based on the experience of each state under complete data and proved the asymptotic property and convergence of its parameter estimation method on this basis. Fernando et al. [13] investigated the situation in which the model’s default intensity is based on the proportional risk equation with complete data. However, in the default model generated from incomplete observable data (partially observable or unobservable explanatory variables), the discrete information flow of partially observable variables and the path dependence of endogenous variables cause the filtering problem of discontinuous information flow due to the interaction of explanatory variables and point-process paths in the intensity parameters. Most researchers utilize interpolation to make the trajectories of some observable variables continuous in the filtering problem. For example, Lando [14] extended the COX model to give parameter estimates for a financial default model based on default intensity with latent variables under incomplete observable data. In previous studies, it was found that using the interpolation method to solve the likelihood function does not lead to a closed-form solution, and the properties of the resulting parameter estimation method are not well described. Thus, it is reasonable to attempt methods other than interpolation to solve the filtering equations controlling the maximum likelihood estimation. Elliott et al. [15] used a stochastic EM technique to solve the filtering equations and provide asymptotic estimates of the filtering control equations. Giesecke et al. [16] suggested a method for estimating the default intensity based on filtered likelihood estimation and obtained an analytical solution for parameter estimation. The filtered likelihood technique outperforms other methods for solving the filtered equations in terms of parameter estimation accuracy and asymptotic efficiency in numerical simulations. This method can be seen as a good answer to the default model’s parameter estimation problem with incomplete observable data.

Clusters of corporate defaults can have catastrophic economic effects, such as a sudden drop in consumption [17]. The default behavior is one of the most important indicators for credit rating review in the financial market. However, financial default modeling and estimation methods in China are still in their infancy, relying primarily on basic models and Monte Carlo simulation approaches. The filtered likelihood estimation approach is rarely employed in China since there is a lack of the effective creation of explanatory variables affecting financial default events. Studying financial default, particularly bond default, is critical both theoretically and practically for avoiding financial risk and establishing a stable financial environment. Based on incomplete observable data, a filtered likelihood estimation method is utilized in this research to estimate the parameters of the marked point process under the influence of self-excitation and jump-diffusion processes on the default intensity [18]. We also compare and analyze the parameter estimation and model-fitting effects of the two models, as well as investigating and analyzing China’s financial default problem and development trend, using financial default data from 2014 to 2020 [19], and find that our first model has better results in portraying the small sample data of the Chinese bond default market. Other references are also of great help to this paper [20,21,22,23,24].

2. Explanation of Related Concepts

2.1. The CIR Process

The O-U process is a stochastic process, also known as the Ornstein–Uhlenbeck process, which describes a stochastic process with a tendency to regress to the mean. Originally invented by Ornstein and Uhlenbeck in the 1930s, the O-U process is widely used in finance, physics, chemistry, biology, and other fields. The O-U process is usually described by the following differential equation:

d X (t) = θ (μ - X (t)) d t + σ d W (t)

where

X (t)

is the value of the stochastic process at time t, and

θ, μ, σ

, and

W (t)

represent the regression intensity, mean, standard deviation, and Wiener process, respectively. The Wiener process is a continuous, constantly changing stochastic process, which is the noise source of the O-U process and represents the stochastic fluctuations in the system.

In the O-U process, the random variable

X (t)

fluctuates up and down around the mean value

μ

, but when

X (t)

deviates from the mean, the regression force

θ

tends to pull it back to the mean. Thus, the O-U process has the property of regression to the mean, which makes it useful in describing random walks (i.e., random wanderings).

The O-U process and the CIR process are solutions of stochastic differential equations, and they both have the property of regression to the mean, which can be used to describe the evolution of stochastic processes, and therefore they both have a wide range of applications in the field of finance. In fact, the CIR process can be regarded as an extension of the O-U process, and there are certain connections and similarities between them. Specifically, the CIR process can be obtained by applying some transformations to the O-U process. Suppose we have a random variable

X (t)

that conforms to the O-U process, which can be expressed as

d X (t) = θ (μ - X (t)) d t + σ d W (t)

where

θ

is the regression intensity,

μ

is the mean,

σ

is the standard deviation, and

W (t)

is the Wiener process. We can transform it as follows:

\begin{matrix} r (t) & = X {(t)}^{2} \\ d r (t) & = 2 X (t) d X (t) + {(d X (t))}^{2} \end{matrix}

Substituting the differential equation for

X (t)

into the above equation, we obtain

d r (t) = 2 X (t) (θ (μ - X (t)) d t + σ d W (t)) + σ^{2} {(d W (t))}^{2}

The simplification gives

d r (t) = 2 θ (μ - X (t)) X (t) d t + σ^{2} d W (t)

This equation is very similar to the form of the equation for the CIR process, and by simply considering

r (t)

as the random variable of the CIR process,

κ = 2 θ

,

θ = 2 θ μ / σ^{2}

, and

σ = σ

, we can obtain the form of the equation for the CIR process as follows

d r (t) = κ (θ - r (t)) d t + σ \sqrt{r (t)} d W (t)

Therefore, the CIR process can be regarded as an extension of the O-U process, an extended model based on the O-U process, which is widely used in finance for interest rate modeling, option pricing, etc.

2.2. The Cox Process

The Cox model is a statistical model for analyzing survival data, which was originally proposed by David Cox in 1972 and is widely used in survival analysis in medicine, public health, finance, etc. The Cox model is a semi-parametric model that can handle right-hand truncated survival data and does not require any assumptions about the time distribution. The ratio comparison analyzes the effect of a factor on survival time by comparing the risk ratio of two groups on that factor. The basic form of the Cox model is as follows:

h (t | X) = h_{0} (t) exp (β_{1} X_{1} + β_{2} X_{2} + \dots + β_{p} X_{p})

where

h (t | X)

represents the risk function at time t given the covariates X,

h_{0} (t)

is the underlying risk function, and

β_{1}, β_{2}, \dots, β_{p}

are the coefficients of the covariates

X_{1}, X_{2}, \dots, X_{p}

. The core of the Cox model is the assumption that the risk function can be decomposed into an exponential function of the underlying risk function and covariates, which allows the Cox model to handle a variety of different risk factors and does not require any assumptions to be made about the underlying risk function.

2.3. The Hawkes Process

The Hawkes process (Hawkes process) is a stochastic process, which was proposed by Hawkes in 1971 to describe the stochastic process of a class of events occurring. The Hawkes process is often used to describe the occurrence and evolution of events, such as social networks, financial markets, and earthquakes. The Hawkes process has the property of self-excitation, that is, the occurrence of one event affects the occurrence probability of subsequent events.

The basic form of the Hawkes process is as follows:

λ (t) = μ (t) + \sum_{i = 1}^{N (t)} g (t - t_{i})

where

λ (t)

represents the intensity function at time t,

μ (t)

is the underlying intensity function,

N (t)

is the event count in the time interval

[0, t]

, and

g (t - t_{i})

is the activation function that affects the probability of subsequent events. The central idea of the Hawkes process is that when an event occurs, it has an effect on the probability of occurrence of subsequent events, and this effect can be described by the activation function

g (t - t_{i})

. The activation function

g (t - t_{i})

represents the effect of an event occurring at time

t_{i}

on an event after time t, and it is usually a non-negative function with respect to the time difference

t - t_{i}

.

The model parameters of the Hawkes process can be estimated by the maximum likelihood estimation, which can be implemented using the Hawkes package in the R language. The model parameters of the Hawkes process include the underlying intensity function

μ (t)

and the activation function

g (t - t_{i})

, etc. By studying these parameters, the pattern of event occurrence can be obtained, and the occurrence of future events can be predicted.

In conclusion, the Hawkes process is a stochastic process used to describe the occurrence of events, which has the property of self-excitation and can be used to describe the occurrence and evolution of events such as social networks, financial markets, and earthquakes. The estimation and prediction of the Hawkes process model parameters can be achieved by the maximum likelihood estimation and analysis of the activation function.

3. Financial Default Marked Point Process Model

A complete probability space

(Ω, F, P)

and a right continuous and complete filtration

{(F_{t})}_{t \geq 0}

are fixed, and a marked point process

(T_{n}, U_{n})

is conducted. The

T_{n}

is a strictly increasing sequence of positive stopping times that represent the arrival times of events. The

U_{n}

is

F_{T_{n -}}

measurable mark variables that take values in a subset

R_{U}

of the Euclidean space. The marked point processes

(T_{n}, U_{n})

provide information about the arrival of events. One of the random processes

N_{t}

is also called the counting process used to calculate the number of event arrivals:

N_{t} = \sum_{n \geq 1}^{} I_{(T_{n} \leq t)}

(1)

Because

N_{t}

contains the same information as the stop-time sequence, it can act as a replacement for the stop-time sequence.

N_{t}

has a right continuous time-dependent violation strength

λ_{t}

:

λ_{t} = Λ (V; α)

(2)

where

Λ

is a function determined by the parameter

α

,

V_{t}

denotes a Markov process taking values in

R_{V}

. The initial value of this process and the transfer probability are determined by the parameter

γ

. The conditional distribution function

π_{t}

of the marker variable

U_{n}

is determined by the parameter

β

:

\begin{matrix} Λ : R_{v} \to (0, \infty) \\ π_{t} (d_{u}) = π (u, V_{t}; β) d_{u} \end{matrix}

(3)

After the initial values and the corresponding state transfer probabilities are given, the intensity of the default event arrival

λ_{t}

and the process of the explanatory variable

V_{t}

are dynamically changed. Here,

V_{t}

and

N_{t}

are interacting. In particular,

N_{t}

can appear as a constituent element in

V_{t}

, or

V_{t}

and

N_{t}

have the same jump or self-excited behavior. Specifically, the structure of the explanatory variable stochastic process

V_{t}

can be set as

(X, Y, Z)

, where X represents the observable explanatory variable and the path of X in

(0, t)

can take values in

R_{X}

, Y represents the partially observable explanatory variable and the path of Y in

(0, t)

takes values in

R_{Y}

, and Z represents the unobservable latent variable and the path of Z in

(0, t)

takes values in

R_{Z}

, which is shown in Equation (4). Order

τ

is the time of the first default:

\begin{matrix} N_{τ} = \{N_{t} : 0 \leq t \leq τ\} \\ U_{τ} = \{\sum_{n \leq N_{t}} U_{n} : 0 \leq t \leq τ\} \\ X_{τ} = \{X_{t} : 0 \leq t \leq τ\} \\ Y_{τ} = \{Y_{t}^{j} : 0 \leq t \leq τ, t \in s^{j}\}, j \in \{1 \dots d_{y}\}, s = ⋂_{j = 1}^{d_{y}} s_{j} \neq \emptyset, d_{y} > 0, d_{z} > 0 \end{matrix}

(4)

Since the default intensity of financial default events arrival is influenced by macro- and micro-economic factors, observable systematic risk variables, and unobservable systematic risk variables, the structure of the default intensity is constructed as shown in Equation (4) in the setting of the default intensity:

λ_{t} = \sum_{i = 1}^{d} a_{i} X_{i, t} + Y_{t} + Z_{t}

(5)

4. Filter Likelihood Method

The filtered likelihood asymptotic method was proposed by Giesecke et al. [16]. On the one hand, it solves the problems of difficulty in describing the nature of the solution results and difficulty in controlling the convergence speed when the traditional interpolation method is used for parameter estimation of the violation intensity of the marked point process. On the other hand, it also solves the problems of path dependence and discontinuous information flow when the partially observable variable

Y_{t}

is calculated under incomplete data. The filtered likelihood method is superior in terms of parameter accuracy, parameter convergence speed and asymptotic efficiency. Therefore, the filtered likelihood method is used as the parameter estimation method for the financial default intensity model.

First let the parameters be estimated for the explanatory variables in the point process with default intensity as

θ = (α, β, γ) \in Θ

. The true unknown parameters

θ^{0} \in Θ^{0}

,

P_{θ}

and

E_{θ}

denote the probability and expectation operators, respectively. Then, the likelihood function

L_{τ} (θ)

of the data

D_{τ}

under the parameter

θ

comes from the Radon–Nikodym derivative of the

P_{θ}

distribution. Under incomplete data, the likelihood function is a projection of

\frac{d P_{θ}}{d P_{θ_{0}}}

, so we have

L_{τ} (θ) = E_{θ_{0}} [\frac{d P_{θ}}{d P_{θ_{0}}} | D_{τ}] .

(6)

Set the random variable

M_{τ} (θ)

and the distribution function

π^{*} (U_{n})

of the marker term. When

E_{θ} [M_{t} (θ)] = 1

in Equation (7) satisfies the condition, a new measure

P_{θ}^{*}

can be obtained. The counting process in the point process under the new measure obeys the standard Poisson distribution, and the distribution function of the marker term is also independent of

θ

[16]:

M_{τ} (θ) = \prod_{n \leq N_{τ}} \frac{π^{*} (U_{n})}{π (U_{n}, V_{T_{n} -;}; β)} * exp (- \int_{0}^{τ} log (Λ (V_{s -}; α)) d N_{s} - \int_{0}^{τ} (1 - Λ (V_{s}; α)) d s) .

(7)

Therefore, when the parameters are estimated under incomplete data,

L_{τ} (θ)

can be calculated by splitting it into the product of the point process term and the explanatory variable term. See Theorem 1 and Equation (8):

L_{τ} (θ) = l o g L_{F}^{*} (θ; τ) + l o g L_{E}^{*} (θ; τ) .

(8)

In the calculation of Equation (8), a more standardized solution for the term

l o g L_{F}^{*} (θ; τ)

is already available [8,14]. However, it is more difficult to calculate the direct solution for

l o g L_{E}^{*} (θ; τ)

. Thus, the likelihood asymptotic is performed using the filtered likelihood method for the term

ε (θ; τ, g)

in the following Equation (9):

ε (θ; τ, g) = E_{θ^{*}} [\frac{g (V_{τ}, τ)}{M_{t (θ)}} | D_{τ}] .

(9)

According to Theorem 1, when

g (V_{τ}, τ) = 1

,

l o g L_{E}^{*} (θ; τ)

can be calculated. Therefore, we divide the state space by

m_{s} \in N

and the time interval by

m_{T} \in N

to obtain

E^{I} (θ; τ, g)

as its likelihood approximation. The specific result is shown in (10):

E^{I} (θ; τ, g) = \sum_{i_{0} = 1}^{m_{s}} \dots \sum_{i_{m_{t}} = 1}^{m_{s}} p_{θ, 0}^{I} (i_{0}) \times \prod_{j = 1}^{m_{T}} W_{θ, j}^{I} (i_{j}) p_{θ, j}^{I} (i_{j}, i_{j - 1}) .

(10)

p_{θ, 0}^{I} (i_{0})

and

W_{θ, j}^{I} (i_{j})

in Equation (10) are given in a more complicated form ([14]) and are not given here.

E^{I} (θ; τ, 1)

can be viewed as an approximation of

ε (θ; τ, 1)

based on the multidimensional rectangular orthogonal Leberg integral. In the calculation of

E^{I} (θ; τ, 1)

, if the

m_{s}^{m_{T}}

orbits (realizations) are multiplied, finding the parameter estimate that maximizes the probability of all states occurring is difficult to achieve. Therefore, the filtered likelihood algorithm is used to adjust the construction order of the likelihood function, and the transfer probability matrix is constructed from the transfer probabilities p of all states at each moment t. The

l o g L_{E}^{*} (θ; τ)

term is solved by the dot product of the matrix and the multiplication of the matrix, and the local optimal solution of the log-likelihood function is finally solved by the simplex method.

First the vector

ξ

is constructed, where

ξ (i)

is the component of

ξ

,

1 \leq i \leq m_{s}

. Next, since the asymptotic essence of

E^{I} (θ; τ, 1)

is the integral of all realization paths based on a sufficiently small time interval and the division of the state space. For each initial state of

p_{θ, 0}^{I} (i_{0})

with a probability of that, the integral of the realization path term that should be obtained is shown in (11):

\sum_{k = 1}^{m_{s}} p_{θ, 0}^{I} (i) \times p_{θ, j}^{I} (k, i) .

(11)

Therefore, the matrix calculation is used, and the specific calculation process is divided into two steps.

First, the explanatory variables transfer probability matrix is constructed as shown in (12). The integral of the realized path is calculated up to moment

m_{T}

. Then

ξ

is updated:

\begin{array}{l} Q_{j} = [\begin{matrix} q_{j} (1, 1) & \dots & q_{j} (1, m_{s}) \\ ⋮ & ⋮ & ⋮ \\ q_{j} (m_{s}, 1) & \dots & q_{j} (m_{s}, m_{s}) \end{matrix}] \\ q_{j} (k, l) = W_{θ, j}^{I} (k) \times P_{θ, j}^{I} (k, l) \\ k, l \in 1 \dots m_{s} \\ j \in 1, \dots, m_{T} - 1 \\ ξ \leftarrow Q_{j} * ξ \end{array}

(12)

Next, the operation in Equation (13) is performed to construct the probability function matrix at moment

m_{T}

and accordingly calculate the integral of the realized path at moment

m_{T}

and update it for

ξ

:

\begin{matrix} Q_{m_{T}} = [\begin{matrix} q_{m_{T}} (1, 1) & \dots & q_{m_{T}} (1, m_{s}) \\ ⋮ & ⋮ & ⋮ \\ q_{m_{T}} (m_{s}, 1) & \dots & q_{m_{T}} (m_{s}, m_{s}) \end{matrix}] \\ q_{j} (k, l) = W_{θ, m_{T}}^{I} (k) \times P_{θ, m_{T}}^{I} (k, l) \\ k, l \in 1 \dots m_{s} \\ ξ \leftarrow Q_{m_{T}} * ξ \end{matrix}

(13)

The final summation of the realized path integrals for all initial states yields an approximation

E^{I} (θ, τ, 1)

for

ε (θ; τ, 1)

, which is shown in Equation (14).

E^{I} (θ; τ, 1) = \sum_{i = 1}^{m_{s}} ξ (i)

(14)

After the algorithm is constructed, the state space and the time interval are divided into

m_{s}

and

m_{T}

intervals. Then, the unobservable explanatory variables

Z_{t}

and the partially observable explanatory variables

Y_{t}

are transformed into matrices (the expressions are available in the subsequent sections). The orbits (realizations) of the point process are combined and logarithmically manipulated, which are also combined with the log-likelihood function

l o g L_{F}^{*} (θ; τ)

for the explanatory variables part of the processed incomplete data to derive the filtered likelihood asymptotic

ε (θ; τ, g)

for the point process events and the stopping time part. The local optimal solution of the log-likelihood function is found via the downhill simplex method. The parameter estimates obtained by the filter likelihood method can be obtained.

5. Properties and Proofs of Filtered Likelihood Estimation

Since it is known that it is difficult to solve

L_{τ} (θ)

directly after the Radon–Nikodym transformation, the random variable

M_{τ} (θ)

and the distribution function

π^{*} (U_{n})

of the marker term are set. The likelihood function can be calculated in two parts when the condition

E_{θ} [M_{τ} (θ)] = 1

is satisfied. The sum of the two independently calculated parts, i.e., the event term

l o g L_{E}^{*} (θ; τ)

and the factor term

l o g L_{F}^{*} (θ; τ)

, which is positively correlated with the initial maximum likelihood function

L_{τ} (θ)

, i.e., the direction of the optimal solution parameters, is the same. The following theorem applies:

Theorem 1.

Assume that the following conditions hold:

(1) On parameter space Θ,

E_{θ} [M_{τ} (θ)] = 1

is satisfied for all

θ \in Θ

;

(2) For all

θ \in Θ

, given the event data

(N_{τ}, U_{τ})

under measure

P_{θ}^{*}

, the conditional likelihood function

l o g L_{f}^{*} (θ; τ)

for the factor data

(X_{τ}, Y_{τ})

exists and is strictly positive a.s., then the likelihood function is satisfied:

\begin{matrix} L_{τ} (θ) \propto E_{θ}^{*} [\frac{1}{M_{τ} (θ)} ∣ D_{τ}] \cdot log L_{F}^{*} (θ; τ) . \end{matrix}

(15)

The above theorem gives a method for estimating the posterior probability and the posterior mean of the filtered likelihood based on incomplete data. The results of the parameter estimation of the filtered likelihood have the same optimization direction as the original maximum likelihood function, which is proved as follows.

Proof of Theorem 1.

From condition (1) and the definition of the measure

P_{θ}^{*}

, we have

d P_{θ}^{*} = M_{τ} (θ) d P_{θ}

, so for any

A \in D_{τ}

, the measure

P_{θ}^{*}

and the measure

P_{θ}

can be transformed as in (16):

\begin{matrix} P_{θ, τ} (A) = E_{θ} [1_{A}] = E_{θ}^{*} [1_{A} E_{θ}^{*} [\frac{1}{M_{τ} (θ)} ∣ D_{τ}]] = \int_{A} E_{θ}^{*} [\frac{1}{M_{τ} (θ)} ∣ D_{τ}] d P_{θ, τ}^{*} . \end{matrix}

(16)

According to the Radon–Nikodym theorem, the measure

P_{θ}^{*}

is absolutely continuous with respect to measure

P_{θ}

under the Radon–Nikodym derivative, so Equation (17) is established:

\begin{matrix} \frac{d P_{θ, τ}}{d P_{θ, τ}^{*}} = E_{θ}^{*} [\frac{1}{M_{τ} (θ)} ∣ D_{τ}] \end{matrix}

(17)

Then, according to the Bayesian formula, the measure of the data under

P_{θ}^{*}

is equal to the product of the measure of the event term

(N_{τ}, U_{τ})

under

P_{θ}^{*}

and the posterior conditional probability of the factor data

(X_{τ}, Y_{τ})

under the given event term

\begin{matrix} P_{θ, τ}^{*} & = & P_{F, θ, τ}^{*} P_{E, θ, τ}^{*} . \end{matrix}

(18)

Since the event term

(N_{τ}, U_{τ})

does not depend on the parameter

θ

under the measure

P_{θ}^{*}

, the counting process N is normalized. Also, the density function

π^{*}

of the labeled term under the measure

P_{θ}^{*}

does not depend on parameters

θ

and

θ_{0}

, so Equation (19) holds:

\begin{matrix} \frac{P_{E, θ, τ}^{*}}{P_{E, θ_{0}, τ}^{*}} \propto 1 . \end{matrix}

(19)

Equation (20) can be obtained by condition (2):

\begin{matrix} \frac{P_{F, θ, τ}^{*}}{P_{F, θ_{0}, τ}^{*}} = L_{F}^{*} (θ; τ) . \end{matrix}

(20)

Since

P_{θ}^{*}

is in the form of a product of measures as shown in Equation (18), it is obtained by combining Equation (19) with (20) and using Fubini’s theorem:

\begin{matrix} \frac{d P_{θ, τ}^{*}}{d P_{θ_{0}, τ}^{*}} = \frac{d P_{F, θ, τ}^{*} \cdot d P_{E, θ, τ}^{*}}{d P_{F, θ_{0}, τ}^{*} \cdot d P_{E, θ_{0}, τ}^{*}} = \frac{d P_{F, θ, τ}^{*}}{d P_{F, θ_{0}, τ}^{*}} \cdot \frac{d P_{E, θ, τ}^{*}}{d P_{E, θ_{0}, τ}^{*}} \propto L_{F}^{*} (θ; τ) . \end{matrix}

(21)

In summary, the filtered likelihood function has the same optimization direction as the original likelihood function, i.e.,

\begin{matrix} L_{τ} (θ) \propto E_{θ}^{*} [\frac{1}{M_{τ} (θ)} ∣ D_{τ}] \cdot log L_{F}^{*} (θ; τ) . \end{matrix}

(22)

□

6. Two Intensity Parameter Models

The default intensity can usually be expressed in the form of Equation (23), where p represents the arrival rate of a default event in time h. We multiply it by the number of defaults, divide it by the time interval h, and take the limit, the average arrival rate of defaults in a given time [8]:

\begin{matrix} λ_{t} = lim_{h \to 0} \frac{p [(N_{t + h} - N_{t}) \geq 1]}{h} \end{matrix}

(23)

The intensity parameter

λ

is influenced by, partly, measurable macro variables (e.g., GDP, PPI, and CPI), systemic risk factors, and endogenous variables, such as default events. Therefore,

λ

can provide a better understanding of the scale and frequency of financial default events at a macro level, which is of high reference value for the evaluation of the macro-financial environment [6].

6.1. The First Model

Ignoring the contagion factor caused by the cluster default in traditional financial default models often leads to an overestimation of the impact of unobservable systematic risk variables on default intensity, resulting in large errors in model parameter estimation. Furthermore, ignoring cluster default contagion results in extreme values of the marker amount at the time of the financial default event’s arrival, causing a high level of bias in the model’s representation of the financial default event. The probability of another financial default occurring immediately after a marked large-volume financial default increases significantly. Therefore, it is appropriate to include a self-excitation term in the model to characterize the effect of the contagion factor on the default intensity and marker amount, as well as its decaying effect with increasing lags. Moreover, it is necessary to study the optimization of the explanatory variables’ parameter settings and the model-fitting effect under extreme values [25]. As a result, the first model studied in this paper is shown in (24):

\begin{matrix} Λ (V; α) = α_{1} exp (Y_{t}) + α_{2} X_{t} + α_{3} Z_{t} \\ Y_{t} = e^{α_{4} V_{t} + γ_{1} w_{t}} \\ X_{t} = γ_{2} \sum_{T_{n} \leq t}^{d} e^{- γ_{3} (t - T_{n})} U_{n} \\ d Z_{t} = (γ_{4} - γ_{5} Z_{t}) d t + γ_{6} \sqrt{Z_{t}} d w_{t} . \end{matrix}

(24)

The COX model, CIR model, and Hawkes models are used to treat the model’s parameter settings, accordingly. According to the analysis of Azizpour et al. [6], among the variables such as short-term treasury yields, bond credit ratings, and broad market indices, only the GDP variable passed the significance test. Thus, the

V_{t}

variable in the partially observable term

Y_{t}

in the first model was selected, considering only the effect of the GDP factor. Also, for the form of the composition of the effect of the partially observable variable

Y_{t}

on the intensity parameter, the standard proportional risk COX model is used. As for the control of the systematic risk variable

Z_{t}

, the standard mean reversion CIR model is chosen. Since the CIR model has better performance in reflecting the treasury yield curve, which has a relatively good effect in portraying the macroeconomic boom, it makes good economic sense to use this model to portray the unobservable systematic risk variables. In the construction of the fully observable explanatory variable

X_{t}

, a self-exciting term factor with the Hawkes model is added to portray the contagious effect of the cluster event because the effect of a high default event on subsequent default stops is a gradually decaying positive effect. Accordingly, the model is set

U_{n} = m a x (0, l o g u_{n})

where

l o g u_{n}

is monotonically increasing, representing the effect of the size of the default event marker (amount) on it, while the model front of the exponential term

e^{- k (t - T_{n})}

is monotonically decreasing, depicting the half decay of the default amount impact over time. According to Azizpour et al. [6], it is found that as the value of the default marker increases, the probability of the occurrence of a cluster event rises, along with the probability of another default event rising. Also, as

T_{n}

increases, the effect of the occurrence of a default event at moment

T_{n} - 1

gradually decays. The same Azizpour et al. [6] also found that ignoring the effect of the contagious factor

X_{t}

in the model causes problems, such as overestimating the effect of the unobservable variable

Z_{t}

. Therefore, the settings in the model make better economic sense [26].

6.2. The Second Model

In the setup of the intensity parameter model based on incomplete data, the partially observable term

Y_{t}

and the unobservable explanatory variable

Z_{t}

are crucial. Since the unobservable systematic risk variable contains stochastic components, the inclusion of Brownian motion at the stochastic wandering scale in the inscription of

Z_{t}

can better fit the stochastic nature of the model. As for the setting of the partially observable term

Y_{t},

using only the diffusion process to describe it would ignore the jump characteristics of the

Y_{t}

term time series. Therefore, we consider adding the jump process to the diffusion, i.e., the

Y_{t}

term is constructed as a diffusion process with a jump. The second model is constructed as shown in (25):

Λ (V; α) = α_{1} Y_{t} + α_{2} Z_{t} + α_{3} e^{- α_{4} X_{t}^{3}} X_{t}^{1} .

(25)

This model is the one used in Giesecke et al. [16] for simulations with the filtered likelihood method. It can be considered an extension of the first model. Each variable in this model satisfies the stochastic differential equation shown in (26), where

V_{t} = (Y_{t}, Z_{t}, X_{t}^{1}, X_{t}^{2}, X_{t}^{3})

:

d V_{t} = μ (V_{t}; γ) d t + σ (V_{t}; γ) d W_{t} + R (V_{t -}; γ) d N_{t} .

(26)

and the parameters

μ

,

σ

, and R to be estimated for the stochastic differential equation, as shown in (27):

\begin{matrix} μ (V_{t}; γ) = (\begin{matrix} (γ_{1} + \frac{γ_{2}^{2}}{2}) (v_{1} - γ_{3} e^{- γ_{4} v_{5}} v_{4}) - γ_{3} γ_{4} e^{- γ_{4} v_{5}} v_{4} \\ (γ_{5} + \frac{γ_{6}^{2}}{2}) v_{2} \\ 0 \\ 0 \\ 1 \end{matrix}), \\ σ (V_{t}; γ) = (\begin{matrix} γ_{2} (v_{1} - γ_{3} e^{- γ_{4} v_{5}} v_{4}) & 0 \\ 0 & γ_{6} v_{2} \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{matrix}), \\ R (V_{t -}; γ) = {(\begin{matrix} γ_{3} & 0 & e^{γ_{7} v_{5}} & e^{γ_{8} v_{5}} & 0 \end{matrix})}^{T} . \end{matrix}

(27)

Bringing the corresponding parameters in (27) into the stochastic differential Equation (26) leads to the analytical solution of the explanatory variables in the intensity parameter model as shown in Equation (28):

\begin{matrix} Y_{t} = V_{0}^{1} e^{(γ_{1} t + γ_{2} w_{t}^{1})} + γ_{3} \int_{0}^{t} e^{- γ_{4} (t - u)} d N_{u}, \\ Z_{t} = V_{0}^{2} e^{(γ_{5} t + γ_{6} w_{t}^{2})}, \\ X_{t}^{1} = \int_{0}^{t} e^{- α_{4} u} d N_{u}, \\ X_{t}^{2} = \int_{0}^{t} e^{- γ_{4} u} d N_{u}, \\ X_{t}^{3} = t . \end{matrix}

(28)

where

Y_{t}

is a diffusion process with jumps, which are driven by the counting process

N_{u}

, which is used to describe the effect of the self-excitation caused by the contagion term on the financial default event, and

Z_{t}

is the geometric Brownian motion. The fully observable variables in the model are

X_{t}^{1}

,

X_{t}^{2}

,

X_{t}^{3}

, where

X_{t}^{1}

,

X_{t}^{2}

are two pure jump processes that integrate over the counting process, and

X_{t}^{3}

is a one-dimensional variable t representing the stopping time of the default event. According to the constructed form of the pair of jump processes in Equation (28), the three elements

X_{t}^{1}

,

X_{t}^{2}

, and

X_{t}^{3}

in the fully observable vector are combined to obtain

α_{3} \int_{0}^{t} e^{- α_{4} (u - t)} d N_{u}

, and with the combination

Z_{t}

, one can obtain a partially observable variable of the form shown in

Y_{t}

.

A comparison with the first model reveals that the two models are similar in the form of the fully observable variable

X_{t}

, while the second model lacks the self-excitation effect caused by the cluster contagion factor due to the neglect of the marker term

U_{n}

. In dealing with the latent variable

Z_{t}

, the components of the two models differ significantly: the first model uses the standard mean reversion CIR model, which has only a numerical solution to the stochastic differential equation, while the second model uses the geometric Brownian motion, which has a clearer analytical solution. The second model uses the summation of

X_{t}

and

Y_{t}

for the partial observable vector, while the first model uses the COX risk ratio form for the GDP variable to depict the effect of macro variables on the default intensity.

In the following, the algorithm for filtering the likelihood function of this model is processed to obtain the closed expression for

P_{θ, j}^{I} (k, l)

in Equation (12). From Equation (28), we can see that

Y_{t}

consists of a geometric Brownian motion with log-normal probability density function and a pure jump process.

Z_{t}

also follows a log-normal distribution, and the model uses a random variable with log-normal conditional probability distribution to characterize the transfer probability of the variable. Also, the variable is logarithmicized to have mean

γ_{5} (τ^{j} - τ^{j - 1})

and standard deviation

γ_{6 \sqrt{τ^{j} - τ^{j - 1}}}

at each fixed interval

τ^{j} - τ^{j - 1}

. Therefore, its transfer probability is shown in (29), and the treatment of the

Z_{t}

transfer probability is described in more detail in Section 6.2:

\begin{matrix} P_{θ}^{*} [Y_{τ^{j}} \in A_{Y}^{k_{y}} ∣ D_{τ}, Y_{τ^{j - 1}} = y^{l_{y}}] \\ = \int_{A_{Y}^{k_{y}}} \frac{ϕ (log \frac{f_{Y} (Y_{s^{j}}, s^{j})}{f_{Y} (u, τ^{j})}; γ_{1}^{j} (1 - ρ^{j}), γ_{2}^{j} \sqrt{1 - ρ^{j}}) ϕ (log \frac{f_{Y} (u, τ^{j})}{f_{Y} (y^{l_{y}}, τ^{j - 1})}; γ_{1}^{j} ρ^{j}, γ_{2}^{j} \sqrt{ρ^{j}})}{f_{Y} (u, τ^{j}) ϕ (log \frac{f_{Y} (Y_{s^{j}}, s^{j})}{f_{Y} (y^{l_{y}, τ^{j - 1}})}; γ_{1}^{j}, γ_{2}^{j})} d u, \\ P_{θ} [Z_{τ^{j}} \in A_{Z}^{k_{z}} ∣ Z_{τ^{j - 1}} = z^{l_{z}}] \\ = \int_{A_{Z}^{k_{z}}} \frac{1}{u} ϕ (log \frac{u}{z^{l_{z}}}; γ_{5} (τ^{j} - τ^{j - 1}); γ_{6} \sqrt{τ^{j} - τ^{j - 1}}) d u, \end{matrix}

(29)

and

\begin{matrix} P_{θ, j}^{I} (k, l) = P_{θ}^{*} [Y_{τ^{j}} \in A_{Y}^{k_{y}} ∣ D_{τ}, Y_{τ^{j - 1}} = y^{l_{y}}] \times P_{θ} [Z_{τ^{j}} \in A_{Z}^{k_{z} Z_{τ^{j - 1}}} = z^{l_{z}}] . \end{matrix}

(30)

where the parameters,

γ_{1}^{j}

is

γ_{1} (s^{j} - τ^{j - 1})

,

γ_{2}^{j}

is

γ_{2} \sqrt{s^{j} - τ^{j - 1}}

,

ρ^{j}

is

\frac{τ^{j} - τ^{j - 1}}{s^{j} - τ^{j - 1}}

,

ϕ (\cdot; a, b)

is the density of a normal distribution with mean a and standard deviation b.

A_{y}^{k_{y}}

and

B_{y}^{k_{y}}

denote the division interval, where the variable

Y_{t}

and

Z_{t}

are transferred to the next state. The function

f_{Y} (y, t)

is a jump process that can be expressed as

y - γ_{3} \int_{0}^{t} e^{- γ_{4} (t - u)} d N_{u}

. Also, the model

Y_{t}

is constructed in a form more similar to that of the model obtained by combining

X_{t}

with

Z_{t}

. And in the application,

Y_{t}

is a partially observable variable in months, so

A_{y}^{k_{y}}

can be taken at 30-day time intervals.

6.3. Nature of Parameter Estimation for Both Models

Theorem 2

(Convergence Theorem for Parameter Estimation). Suppose the conditions of Theorem 1 in Giesecke et al. [16] hold, and also that

s u p_{θ \in Θ} L_{F}^{*} (θ; τ) < \infty

holds almost everywhere on the parameter space Θ, then, if there is a parameter

{\hat{θ}}_{τ} \in Θ

, the likelihood function maximizes the following equation

\begin{matrix} L_{τ} (θ) & = & ε (θ; τ; 1) L_{F}^{*} (θ; τ) . \end{matrix}

(31)

Also, for any

n \in N

, the parameter estimation

{\hat{θ}}_{τ}^{n}

obtained under the n division maximizes the likelihood function

L_{τ}^{n}

in (32):

\begin{matrix} L_{τ}^{n} (θ) & = & E^{I^{n}} (θ; τ; 1) L_{F}^{*} (θ; τ) . \end{matrix}

(32)

Then, for any fixed

τ > 0

,

{lim}_{n \to \infty} L_{τ}^{n} ({\hat{θ}}_{τ}^{n}) = L_{τ} ({\hat{θ}}_{τ})

holds almost everywhere, and also for any data

D_{τ}

,

{\hat{θ}}_{τ}^{n}

converges to the parameter estimation

{\hat{θ}}_{τ}

such that the likelihood function achieves the global optimum. This theorem illustrates that the parameters obtained under the filtered likelihood method of the first model estimates under the filtered likelihood method of the first model has the property of global convergence.

Proof of Theorem 2.

First taking

ε > 0

, it follows from Giesecke et al. [13] that there exists

n_{0} \in N

that depends only on

τ

. Equation (33) holds for all

n \geq n_{0}

and

θ \in Θ

:

\begin{matrix} |E^{I^{n}} (θ; τ; g) - ε (θ; τ; g)| \leq ε . \end{matrix}

(33)

Since

m a x_{θ \in Θ} L_{F}^{*} (θ; τ) < \infty

holds almost everywhere, we have that

\begin{matrix} |L_{τ} (θ) - L_{τ}^{n} (θ)| \leq ϵ max_{θ \in Θ} L_{F}^{*} (θ; τ) < \infty \end{matrix}

(34)

holds almost everywhere. Definition

\tilde{ϵ} = ϵ {max}_{θ \in Θ} L_{F}^{*} (θ; τ)

since

{\hat{θ}}_{τ}^{n}

is the parameter estimation that maximizes

L_{τ}^{n}

. Therefore, the value space

Θ

can be expressed by Equation (35) under n division:

\begin{matrix} {\hat{θ}}_{τ}^{n} \in \{θ \in Θ : L_{τ} ({\hat{θ}}_{τ}) - \tilde{ϵ} \leq L_{τ}^{n} (θ) \leq L_{τ} ({\hat{θ}}_{τ}) + \tilde{ϵ}\} . \end{matrix}

(35)

Therefore, when

ε \to \infty

,

{lim}_{n \to \infty} L_{τ}^{n} ({\hat{θ}}_{τ}^{n}) = L_{τ} ({\hat{θ}}_{τ})

holds, and the proof is complete. □

Theorem 3

(Asymptotic properties of parameter estimation). Assume that all the conditions of Theorem 2 hold, and the following conditions also hold:

(1) The exact maximum likelihood estimates of any of the original parameters are consistent and asymptotically normal, i.e., on the probability space

P_{θ_{0}}

, when

τ \to \infty

, there is

{\hat{θ}}_{τ} \to θ_{0}

. Also on the probability space

P_{θ_{0}}

when

τ \to 0

we have

\sqrt{τ} ({\hat{θ}}_{τ} - θ_{0}) \to N (0, Σ_{0})

, where

Σ_{0}

is:

Σ_{0} = I (θ_{0}^{- 1}) + 2 ξ I {(θ_{0})}^{- 1} Σ_{E F} I {(θ_{0})}^{- 1}

.

(2) The parameter estimation of the maximum likelihood function

L_{τ (θ)}

to obtain the global optimal solution does not reach the boundary of the parameter space Θ.

(3) For any

n > 1

, the Hessian matrix

▽^{2} L_{τ}^{n}

of the asymptotic likelihood function is positive definite in the neighborhood of the filtered likelihood estimate

{\hat{θ}}_{τ}^{n}

, while for any

q > \frac{1}{2}

, when

τ \mapsto n (τ)

, let the sequence satisfy the following equation

\begin{matrix} Ψ_{τ}^{n (τ), 1} + Ψ_{τ}^{n (τ), 2} & = & O (τ^{- q}) . \end{matrix}

(36)

Then when

τ \mapsto n (τ)

, the sequence of parameter estimation

{({\hat{θ}}_{τ}^{n (τ)})}_{τ > 0}

derived from the discretization

I^{n (τ)}

satisfies

(1) On the probability space

P_{θ_{0}}

, when

τ \to \infty

we have

{\hat{θ}}_{τ}^{n (τ)} \to θ_{0}

(2) Under the

P_{θ_{0}}

distribution, when

τ \to \infty

we have

\sqrt{τ} ({\hat{θ}}_{τ}^{n (τ)} - θ_{0}) \to N (0, Σ_{0})

. Here

Σ_{0}

is the same as in condition (1).

Proof of Theorem 3.

For any fixed

τ > 0

, it follows from Theorem 2 that there is a limit to the filter estimate

{\hat{θ}}_{τ}^{n}

when n tends to infinity. Let this limit be

{\hat{θ}}_{τ}^{\infty}

. If

{\hat{θ}}_{τ}

is the global optimal solution of the likelihood function, i.e.,

{\hat{θ}}_{τ} = a r g m a x_{θ \in Θ} L_{τ} (θ)

, then we have

\begin{matrix} L_{τ} ({\hat{θ}}_{τ}^{\infty}) = L_{τ} ({\hat{θ}}_{τ}) . \end{matrix}

(37)

Since

{\hat{θ}}_{τ}

is a globally optimal solution that does not take values at the boundary of the parameter space

Θ

, then the first-order derivative of

L_{τ} ({\hat{θ}}_{τ}^{\infty})

is 0, i.e.,

\begin{matrix} ▽ L_{τ} ({\hat{θ}}_{τ}^{\infty}) = 0 . \end{matrix}

(38)

Meanwhile, from the asymptotic normality of the global optimal parameter of

{\hat{θ}}_{τ}

as shown in the following equation, the second-order derivative of the limiting parameter

{\hat{θ}}_{τ}^{\infty}

can be obtained. The Taylor expansion of

L_{τ} ({\hat{θ}}_{τ}^{\infty})

can be performed around

θ_{0}

\begin{matrix} \sqrt{τ} ({\hat{θ}}_{τ}^{n (τ)} - θ_{0}) \to N (0, Σ_{0}) . \end{matrix}

(39)

According to Giesecke et al. [15], the series of parameter estimation

{\hat{θ}}_{τ}^{n (τ)}

generated by dividing n when

τ \mapsto n (τ)

and the parameter estimation

{\hat{θ}}_{τ}^{\infty}

that lead to the global optimization of the likelihood function satisfy the following equation:

\begin{matrix} |L_{τ}^{n (τ)} ({\hat{θ}}_{τ}^{n (τ)}) - L_{τ}^{n (τ)} ({\hat{θ}}_{τ}^{\infty})| = & O (τ^{- q}) . \end{matrix}

(40)

For each parameter

\tilde{θ}

in the domain of

{\hat{θ}}_{τ}^{\infty}

, there is a second-order Taylor expansion of the great likelihood function

L_{τ} ({\hat{θ}}_{τ}^{\infty})

around the point

θ_{0}

satisfying the following equation:

\begin{matrix} L_{τ} ({\hat{θ}}_{τ}^{\infty}) - L_{τ}^{n (τ)} ({\hat{θ}}_{τ}^{n (τ)}) & = & \nabla^{2} L_{τ}^{n (τ)} (\tilde{θ}) ({\hat{θ}}_{τ}^{n (τ)} - {\hat{θ}}_{τ}^{\infty}) . \end{matrix}

(41)

Since for any

n > 1

, the Hessian matrix

\nabla^{2} L_{τ}^{n (τ)}

of the asymptotic likelihood function is positive definite over the domain of the filtered likelihood estimate

{\hat{θ}}_{τ}^{n (τ)}

, it follows that the second-order derivative of the likelihood function is finite, i.e.,

\begin{matrix} ∥{(\nabla^{2} L_{τ}^{n (τ)} (\tilde{θ}))}^{- 1}∥ < \infty . \end{matrix}

(42)

So when

τ \to \infty

, and

q > \frac{1}{2}

holds, we have the following conclusion:

\begin{matrix} \sqrt{τ} |{\hat{θ}}_{τ}^{n (τ)} - {\hat{θ}}_{τ}^{\infty}| \leq ∥{(\nabla^{2} L_{τ}^{n (τ)} (\tilde{θ}))}^{- 1}∥ |L_{τ}^{n (τ)} ({\hat{θ}}_{τ}^{n (τ)}) - L_{τ}^{n (τ)} ({\hat{θ}}_{τ}^{\infty})| = O (τ^{\frac{1}{2} - q}) \to 0 . \end{matrix}

(43)

□

7. Model Comparison and Empirical Analysis

Due to the late development of China’s financial market and the emergence of financial defaults (bond defaults) in actual practice only in recent years, there are few sample data available for the study of financial default models and parameter estimation. As China’s structural deleveraging continues to advance, the financing environment as a whole has tightened, making it more difficult for enterprises to raise funds, and the tight credit environment has exacerbated the structural problems in the fund transmission process, while the declining risk appetite of investors in the bond market has further increased the difficulty of financing for issuers with low credit quality to the extent that the number of bond defaults increased in the second half of 2018 as well as in 2019. In this chapter, we use the bond default data of China from 5 March 2014 to 24 June 2020 in the Wind Database to study the trend of quarterly cumulative default amount and daily default amount, the correlation of the quarterly average default arrivals and quarterly GDP data, as well as the correlation of daily default marker amount and daily average default arrival rate by using R language software to analyze them. We also study their normality, lagging and spikes. A comparative analysis of the first model and the second model is also conducted. The coefficient of determination

R^{2}

, the AIC criterion, the degree of model fit, and the significance analysis of model parameters are utilized to derive a model that is more consistent with the portrayal of China’s bond default data in the case of small samples. Since the default data are tagged with large amounts, the tagged amounts are reduced by a factor of 10,000 to facilitate observation and analysis. Figure 1 first shows the trend of quarterly cumulative default amount and daily default amount.

Figure 1. Trend graph of quarterly cumulative default amount (bar) and daily default amount.

7.1. Model Variable Study

From Figure 2, the average default intensity arrival rate (day) fluctuates in the interval [0, 25]. Due to the distribution of the average default arrival rate exhibiting characteristics similar to a normal distribution, it is reasonable to assume that there is an unobserved systematic risk factor

Z_{t}

that affects the average default event rate that can be fitted by the geometric Brownian motion

W_{t}

under a random walk scale. The results of the J-B test on the true value of the average intensity of financial default arrival parameter

λ_{t}

show that p-value

< 0.05

, indicating that the series test of the average arrival times of default rejects the normality hypothesis, as shown in the Table 1. Combined with the peak and thick tail shape of the average default arrival rate image, it shows that the extreme value phenomenon of the distribution of the average occurrence rate of financial default events is significant. The probability of the occurrence of financial default events is relatively high. At the same time, the influence of contagion factors is obvious in Figure 2. After the default event occurs, the probability of another default event increases. Therefore, the default event is not completely determined by the random walk scale. Under the Brownian motion

W_{t}

interpretation, there is also a strong contagion factor.

Figure 2. Trend of the average number of default arrivals.

Table 1. J-B test result.

In Figure 3, the solid line depicts quarterly GDP statistics, whereas the dotted line depicts quarterly average default arrivals. The link between GDP and the quarterly average default arrival rate reveals a p-value of 0.05, as shown in the Table 2. As a result, it seems appropriate to include GDP, a partially observable quantity, in the explanation of the intensity parameter model, implying that the macro-financial environment has a greater impact on the financial default arrival rate. The macroeconomic environment substantially impacts the pace at which financial defaults arrive. It can be utilized to fit the underlying pattern features of changes in the intensity parameter. Defaults tend to climb in tandem with the general financial environment’s volume, but the average growth rate of defaults is more moderate or even declining.

Figure 3. Average default arrivals (quarterly) vs. GDP data (quarterly) trend graph.

Table 2. Results of the correlation test between GDP and quarterly average default arrival rate.

Figure 4 shows that when a contagion event occurs, the default amount

U_{n}

(dashed line) and the average daily default arrivals (solid line) have a fairly consistent trend and periodicity. Therefore, it is reasonable to use

l o g U_{n}

to build a Hawkes model that simulates the effect of the contagion factor on the average default arrival rate at the random travel scale using a model with the half-decayed self-excitation effect. Although the self-excited effect accounts for a small proportion of the model description, at moment T when the contagion effect causes extreme marker values, the self-excited term can better describe the fluctuation of the average default arrival rate affected by infectious factors. Thus, the setting of the self-excited term can better describe the model in extreme value situations.

Figure 4. Default marker (amount) vs. average default arrival rate trend.

7.2. Comparative Analysis of the Model

The filtered likelihood algorithm is used to estimate the parameters of the model. Compared with the Monte Carlo method, the filtered likelihood algorithm greatly reduces the computational burden of estimating the a posteriori mean of the intensity parameter likelihood function; compared with the EM algorithm, the filtered likelihood algorithm enhances the efficiency of parameter estimation without increasing the estimation error so that the model can converge better and has a better fit. The parameters are

θ = [α_{1}, α_{2}, α_{3}, α_{4}, γ_{1}, γ_{2}, γ_{3}, γ_{4}, γ_{5}, γ_{6}]

. Since the unobservable variable

Z_{t}

in the second model is inscribed by a random variable whose conditional probability distribution is log-normal with respect to its transfer probability, the logarithmization of this variable yields its mean

γ_{5} (τ^{j} - τ^{j - 1})

and standard deviation

γ_{6} \sqrt{τ^{j} - τ^{j - 1}}

at each fixed interval

τ^{j} - τ^{j - 1}

. In the case where the state is

Z^{l^{z}}

at moment

τ^{j - 1}

is known, the next moment state is the transfer probability obtained by integrating over all states of

\frac{u}{Z^{l^{z}}}

as shown in (44):

P_{θ} [Z_{τ^{j}} \in A_{z}^{k_{z}} ∣ Z_{τ^{j}} = z^{l_{z}}] = \int_{A_{z}^{k_{z}} ∣} \frac{1}{u} ϕ (log \frac{u}{z^{l_{z}}}; γ_{5} (τ^{j} - τ^{j - 1}), γ_{6} \sqrt{τ^{j} - τ^{j - 1}}) d u

(44)

where

ϕ (*; a, b)

is a normally distributed probability density function with mean a and standard deviation b, and

A_{z}^{k_{z}}

is a subinterval after partitioning the

Z_{t}

state space. Also, after we analyze the unobservable variables of the first model, it is evident that since the factors set in the first model to capture the variables of systematic risk are standard mean-reverting CIR models, the initial values of the model are set to obey, and the shape parameter

\frac{2 k z}{σ^{2}}

is a non-centered chi-square distribution with scale parameter

\frac{σ^{2}}{2 k}

. Meanwhile, it is easy to know through a calculation that its initial values can be set to the shape parameter

v = 4.111

and scale parameter

λ = 0.497

. Therefore, with the conditional transfer probability under time division

m^{T}

, state division

m^{s}

can be calculated by Equation (45):

p_{θ} (t, y ∣ 0, x) = c e^{- u - v} {(\frac{u}{v})}^{q / 2} I_{q} (2 \sqrt{u v}), x, y \in R_{+},

(45)

where, for parameters, c is

\frac{2 k}{σ^{2} (1 - e^{- k t})}

and q is

\frac{2 k z}{σ^{2}} - 1

; for variables, u is

c x e^{- k t}

and v is

c y

; and

I_{q} (x)

is the q-order first class modified Sebel function

\sum_{i = 0}^{\infty} {(\frac{x}{2})}^{2 i + q} \frac{1}{i! Γ (i + q + 1)}

.

The asymptotic estimation of the intensity parameter likelihood function using the filtered likelihood algorithm and the locally optimal parameter estimation of the second model, and the first model likelihood function obtained by the simplex method, as shown in Table 3. For the design of the initial values of the model, on the one hand, we refer to the relevant literature, and on the other hand, we consider the weight of the parameters in the model. For example, parameters

γ_{5}

and

γ_{6}

of the latent variable

Z_{t}

in the first model are set to consider the variation of

Z_{t}

per unit time t and the variation of Brownian motion

W_{t}

relative to

Z_{t}

at each moment, respectively. From the analysis of the parameters, we can see that the contagious factor

X_{t}

and the unobservable latent variable

Z_{t}

have a positive influence on the model but the proportion is relatively small, while the partially observable explanatory variable

Y_{t}

has a positive influence on the model by decaying the period with a relatively large proportion.

Table 3. Financial default model parameter estimation.

The right panel of Figure 5 shows the fit of the second model to the average default arrival rate. The figure shows that the second model better describes the overall trend of the average default arrival rate and has a better fit for the contagion factors of default events. The left panel of Figure 5 shows the fit of the first model to the average default arrival rate. It can be seen that the first model reflects the overall trend of the average default arrival rate better than the second model. The first model has a better fit for the “spikes thick tail” of the default arrival rate distribution because it takes into account the influence of the contagion factors depicted by the Hawkes model represented by

e^{- κ (t - T_{n}) log U_{n}}

. The fit also matches the moment of default, the change in decay and the magnitude of the event. Combined with the higher values of the coefficient of determination

R^{2}

and AIC of the first model, it can be concluded that the first model is a better fit for the stochastic average default intensity arrival rate of financial default events in China in the case of small samples, as shown in the Table 4.

Figure 5. Default marker (amount) vs. average default arrival rate trend.

Table 4. Comparison of financial default model values.

8. Conclusions and Recommendations

From a macro perspective, financial default data can indicate the stability and risk level of the national financial environment. The timing of financial defaults and the regularity of the accumulated default amount can be used as a benchmark for the stability and future predictions of the financial market’s development. Two default intensity parameters models are investigated in this paper. Because the first model accounts for the self-excitation impact of cluster default factors, it does not overstate the influence of latent variables on the model and hence has a better fit for small samples of data, such as China’s bond default market. In this paper, we estimate the default intensity parameters in these two models based on the filtered likelihood method with great likelihood, which enhances the interpretation of the parameter solution and the efficiency of the model parameter solution, and, to some extent, reduces the estimation errors caused by the nature of the filtered likelihood method [27]. The default intensity model can be improved in the future study by selecting relevant explanatory factors that are more appropriate for China’s national conditions, allowing it to play a stronger role in the early warning and prediction of China’s financial default market.

Author Contributions

Conceptualization, X.L. (Xiangdong Liu); Methodology, X.L. (Xianglong Li); Software, J.W. The theoretical proof and empirical analysis of this article were written by three authors. All authors have read and agreed to the published version of the manuscript.

Funding

The research is supported Support by National Natural Science Foundation of China (No. 71471075), Fundamental Research Funds for the Central University (No. 19JNLH09), Innovation Team Project in Guangdong Province, P.R. China (No. 2016WCXTD004) and Industry -University- Research Innovation Fund of Science and Technology Development Center of Ministry of Education, P.R. China (No. 2019J01017).

Data Availability Statement

Not Applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Shreve, S. Stochastic Calculus for Finance-Continuous Times Models; Springer: New York, NY, USA, 2003; pp. 125–251. [Google Scholar]
Bremnud, P. Point Process and Queues-Martingale Dynamics; Springer: New York, NY, USA, 1980; pp. 150–233. [Google Scholar]
Gordy, M.B. A Risk-Factor Model Foundation for Ratings-Based Bank Capital Rules. Journey Financ. Intermediat. 2003, 12, 199–232. [Google Scholar] [CrossRef]
Das, S.; Duffie, D.; Kapadia, N.; Saita, L. Common Failings: How Corporate Defaults are Correlated. J. Financ. 2006, 62, 93–117. [Google Scholar] [CrossRef]
Duffie, D.; Saita, L.; Wang, K. Multi-period corporate default prediction with stochastic covariates. J. Financ. Econ. 2007, 83, 635–665. [Google Scholar] [CrossRef]
Azizpour, S.; Giesecke, K.; Schwenkler, G. Exploring the sources of default Clustering. J. Financ. Econ. 2018, 129, 154–183. [Google Scholar] [CrossRef]
Hawkes, A.G. Spectra of some self-exciting and mutually exciting point processes. Biometrika 1971, 58, 83–90. [Google Scholar] [CrossRef]
Chen, Y.S.; Zou, H.W.; Cai, L.X. Credit Default Swap Pricing Based on Supply Chain Default Transmission. South China Financ. 2018, 8, 33–42. [Google Scholar]
Wang, X.; Hou, S.; Shen, J. Default clustering of the nonfinancial sector and systemic risk: Evidence from China. Econ. Model. 2021, 96, 196–208. [Google Scholar] [CrossRef]
Li, C.; Ma, Y.; Xiao, W.L. Pricing defaultable bonds under Hawkes jump-diffusion processes. Financ. Res. Lett. 2022, 47, 102738. [Google Scholar]
Xing, K.; Luo, D.; Liu, L. Macroeconomic conditions, corporate default, and default clustering. Econ. Model. 2023, 118, 106079. [Google Scholar] [CrossRef]
Ogata, Y. The Asymptotic Behavior Of MaxiumLikehood Estimators for Stationary Point Processes. Ann. Inst. Stat. Math. 1978, 30, 243–261. [Google Scholar] [CrossRef]
Fernando, B.P.W.; Sritharan, S.S. Nonlinear Filtering of Stochastic Navier-Stokes Equation with Ito-Levy Noise. Stoch. Anal. Appl. 2013, 31, 381–426. [Google Scholar] [CrossRef]
Lando, D. On Cox Processes and Credit Risky Securities. Rev. Deriv. Res. 1998, 2, 99–120. [Google Scholar] [CrossRef]
Elliott, R.J.; Chuin, C.; Siu, T.K. The Discretization Filter: On filtering and estimation of a threshold stochastic volatility model. Appl. Math. Comput. 2011, 218, 61–75. [Google Scholar]
Giesecke, K.; Schwenkler, G. Filtered likelihood for point processes. J. Econom. 2018, 204, 33–53. [Google Scholar] [CrossRef]
Gourieroux, C.; Monfort, A.; Mouabbi, S.; Renne, J.P. Disastrous defaults. Rev. Financ. 2021, 25, 1727–1772. [Google Scholar] [CrossRef]
Duffie, D.; Eckner, A.; Horel, G. Frailty Correlated Default. J. Financ. 2009, 64, 2089–2123. [Google Scholar] [CrossRef]
Chakrabarty, B.; Zhang, G. Credit contagion channels: Market microstructure evidence from Lehman Brothers’ bankruptcy. Financ. Manag. 2012, 41, 320–343. [Google Scholar] [CrossRef]
Collin-Dufresn, P.; Goldstein, R.S.; Martin, J.S. The Determinants of Credit Spread Changes. J. Financ. 2001, 68, 2177–2207. [Google Scholar] [CrossRef]
Liu, X.D.; Jin, X.J. Parameter estimation via regime switching model for high frequency data. J. Shenzhen Univ. (Sci. Eng.) 2018, 35, 432–440. [Google Scholar] [CrossRef]
Liu, X.D.; Wang, X.R. Semi-Markov regime switching interest rate term structure models—Based on minimal Tsallis entropy martingale measure. Syst. Eng.-Theory Pract. 2017, 37, 1136–1143. [Google Scholar]
Xu, Y.X.; Wang, H.W.; Zhang, X. Application of EM algorithm to eatimate hyper parameters of the random parameters of Wiener process. Syst. Eng. Electron. 2015, 37, 707–712. [Google Scholar]
Giesecke, K.; Schwenkler, G. Simulated likelihood estimators for discretely observed jump–diffusions. J. Econom. 2019, 213, 297–320. [Google Scholar] [CrossRef]
Duffie, D.; Pan, J.; Singleton, K.J. Transform Analysis and Asset Pricing for Affine Jump-diffusions. Econometrica 2000, 68, 1343–1376. [Google Scholar] [CrossRef]
Hou, Z.T.; Ma, Y.; Liu, L. Estimation of the stationary distribution parameters based on the forward equation. Acta Math. Sci. 2016, 36, 997–1009. [Google Scholar]
Wang, S.W.; Wen, C.L. The Wavelet Packet Maximum Likelihood Estimation Method of Long Memory Process Parameters. J. Henan Univ. (Nat. Sci.) 2006, 2, 79–84. [Google Scholar]

Figure 1. Trend graph of quarterly cumulative default amount (bar) and daily default amount.

Figure 2. Trend of the average number of default arrivals.

Figure 3. Average default arrivals (quarterly) vs. GDP data (quarterly) trend graph.

Figure 4. Default marker (amount) vs. average default arrival rate trend.

Figure 5. Default marker (amount) vs. average default arrival rate trend.

Table 1. J-B test result.

	Value
Skew	6.78
Kurtosis	66.84
Jarque–Bera	67,966.17
p-value	0.00

Table 2. Results of the correlation test between GDP and quarterly average default arrival rate.

	Value
t	3.98
df	23
p-value	0.00
cor	0.64

Table 3. Financial default model parameter estimation.

Parameter	The First Model Initial Value	The First Model Parameter Estimation	The Second Model Initial Value	The Second Model Parameter Estimation
$α_{1}$	4.0	3.0	3.0	3.0
$α_{2}$	3.0	3.0	3.0	3.0
$α_{3}$	2.0	2.0	2.0	2.0
$α_{4}$	0.2	2.5	2.5	−0.125
$γ_{1}$	0.5	−0.5	−0.5	0.5
$γ_{2}$	2.0	1.5	1.0	0.2
$γ_{3}$	1.0	0	0.2	5.0
$γ_{4}$	4.0	5.0	5.0	0.2
$γ_{5}$	−0.5	−0.125	−0.125	0.4
$γ_{6}$	0.53	0.5	0.5	0.32

Table 4. Comparison of financial default model values.

	The First Model	The Second Model
AIC	785	744
BIC	776	753
$R^{2}$	0.108	0.325
MLE	503	396

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Research on Financial Default Model with Stochastic Intensity Using Filtered Likelihood Method

Abstract

1. Introduction

2. Explanation of Related Concepts

2.1. The CIR Process

2.2. The Cox Process

2.3. The Hawkes Process

3. Financial Default Marked Point Process Model

4. Filter Likelihood Method

5. Properties and Proofs of Filtered Likelihood Estimation

6. Two Intensity Parameter Models

6.1. The First Model

6.2. The Second Model

6.3. Nature of Parameter Estimation for Both Models

7. Model Comparison and Empirical Analysis

7.1. Model Variable Study

7.2. Comparative Analysis of the Model

8. Conclusions and Recommendations

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Article Metrics

Citations

Article Access Statistics