How Much Do Negative Probabilities Matter in Option Pricing?: A Case of a Lattice-Based Approach for Stochastic Volatility Models

Tseng, Chung-Li; Miao, Daniel Wei-Chung; Chung, San-Lin; Shih, Pai-Ta

doi:10.3390/jrfm14060241

Open AccessArticle

How Much Do Negative Probabilities Matter in Option Pricing?: A Case of a Lattice-Based Approach for Stochastic Volatility Models

¹

UNSW Business School, The University of New South Wales, Sydney, NSW 2052, Australia

²

Graduate Institute of Finance, National Taiwan University of Science and Technology, Taipei 106335, Taiwan

³

Department of Finance, National Taiwan University, Taipei 10617, Taiwan

^*

Author to whom correspondence should be addressed.

J. Risk Financial Manag. 2021, 14(6), 241; https://doi.org/10.3390/jrfm14060241

Submission received: 12 April 2021 / Revised: 21 May 2021 / Accepted: 23 May 2021 / Published: 30 May 2021

(This article belongs to the Special Issue Volatility Modelling and Forecasting)

Download

Browse Figures

Versions Notes

Abstract

:

In this paper, we focus on two-factor lattices for general diffusion processes with state-dependent volatilities. Although it is common knowledge that branching probabilities must be between zero and one in a lattice, few methods can guarantee lattice feasibility, referring to the property that all branching probabilities at all nodes in all stages of a lattice are legitimate. Some practitioners have argued that negative probabilities are not necessarily ‘bad’ and may be further exploited. A theoretical framework of lattice feasibility is developed in this paper, which is used to investigate how negative probabilities may impact option pricing in a lattice approach. It is shown in this paper that lattice feasibility can be achieved by adjusting a lattice’s configuration (e.g., grid sizes and jump patterns). Using this framework as a benchmark, we find that the values of out-of-the-money options are most affected by negative probabilities, followed by in-the-money options and at-the-money options. Since legitimate branching probabilities may not be unique, we use an optimization approach to find branching probabilities that are not only legitimate but also can best fit the probability distribution of the underlying variables. Extensive numerical tests show that this optimized lattice model is robust for financial option valuations.

Keywords:

finance; trinomial tree; two-factor model; stochastic volatility; lattice feasibility

1. Introduction

The lattice (or tree) approach is a popular one for valuing derivative securities, as it is normally simple to implement and has an intuitive appeal. The lattice approach involves discrete approximation to the diffusion processes followed by the underlying variables. It is especially useful for valuing American options where early exercise is possible. Since its introduction by Cox et al. (1979), the lattice approach has undergone several extensions in the past few decades to accommodate increasingly complex derivative valuations. To name a few, those significant models include Rendleman and Bartter (1979), Boyle (1986, 1988), (H&W) Hull and White (1988, 1990, 1993, 1994), Chung and Shih (2007), Beliaeva and Nawalkha (2010), and Akyildirim et al. (2014).

In a lattice, each link (branch) connecting two lattice nodes at two consecutive time periods is associated with a branching probability. A legitimate branching probability must be between zero and one. Researchers know and follow this basic rule in developing lattice-based methods. However, to the best of our knowledge, few methods can guarantee lattice feasibility, referring to the property that all branching probabilities at all nodes in all stages of a lattice are legitimate. With lattice feasibility, a lattice constructs a discrete time financial market that is arbitrage free. It is well known that lattice feasibility is easier to achieve when there is only one underlying variable, while two-factor lattice feasibility is harder to meet, especially when the correlation between two underlying uncertainties is high.

The term lattice feasibility was coined by Tseng and Lin (2007). The authors employed the trinomial lattice proposed by H&W (Hull and White 1990) to value real options involving two underlying correlated uncertainties, each with a constant volatility. They found that each lattice configuration implies a maximum correlation of the two underlying variables that the lattice can approximate without incurring negative probabilities, and this maximum correlation may be enhanced by varying the size of its lattice grids. After optimizing the lattice configuration, Tseng and Lin (2007) also showed that the trinomial lattice proposed by H&W (Hull and White 1990) cannot accommodate a correlation beyond

4 / \sqrt{35} \approx 0.676

without incurring negative probabilities. The authors further showed that the popular two-factor interest rate tree proposed by H&W (Hull and White 1994) for valuing interest rate derivatives can only guarantee lattice feasibility when the correlation is no greater than 0.2. This means that negative probabilities may occur far more often than we know in using lattices to value derivatives in real practice.

Since it is not unusual to encounter negative probabilities when using lattices for option pricing, some practitioners have argued that negative probabilities may not necessarily be ‘bad’ and may be further exploited (e.g., Burgin and Meissner 2012; Haug 2007). Consider a price node in a trinomial tree, where the sum of three branching probabilities must be equal to one. Given an abnormality where one branching probability becomes negative or exceeds unity, the other two probabilities must be adjusted accordingly to offset the abnormality. However, the expected payoff at this price node may reveal no sign of abnormality. From this perspective, it is not surprising that some practitioners have reported that some finite difference/finite element models can still produce stable and consistent outputs even with negative probabilities (Zvan et al. 2001). How much does it really matter if one allows negative probabilities in a lattice for option pricing?

To investigate the impact of negative probabilities in option valuations, we focus on using a two-factor lattice to represent general diffusion processes such as the Heston stochastic volatility (SV) model (Heston 1993). In the Heston model, the dynamics of the volatility process is assumed to follow the CIR process (Cox et al. 1985) used to describe the interest rate dynamics. The analytical tractability of the CIR process leads to explicit solutions for some bond pricing problems (e.g., Kouritzin 2000; Maghsoodi 1996; among others). When the CIR process is incorporated as the second dimension of the Heston model, the resultant two-factor lattice is more general and is far from trivial. To observe the impact that comes from negative probabilities, we need to develop a lattice model that can guarantee lattice feasibility and can be used as a benchmark. Under the same lattice framework, with every parameter fixed but branching probabilities, we can then observe how negative probabilities influence valuations.

An important alternative to the lattice method for the options pricing under the Heston model is the Monte Carlo (MC) method. Since the MC method generates stochastic paths, not lattices, to model the evolution of the underlying uncertainties, there is no issue of lattice feasibility or negative probability. However, the generations of the stochastic paths under the Heston model are not straightforward, and there have been a number of discussions on this issue. Examples along this line include Broadie and Kaya (2006) and Andersen (2007). When the MC method is applied to the pricing of American options, the Least-Squares Monte Carlo (LSMC) method is a standard approach proposed by Longstaff and Schwartz (2001). Though a lot of progresses have been made using the MC method to price American options, one also needs to resort to its variations, such as different techniques on resampling or branching (e.g., see a recent paper by Kouritzin and Mackay 2020).

Researchers have proposed lattices for stochastic volatility models. Recent papers include (Akyildirim et al. 2014; Beliaeva and Nawalkha 2010; Costabile et al. 2012; Ruckdeschel et al. 2013). All lattice approaches consider matching two conditional marginal moments for each underlying variable at all nodes, and the correlation is dealt with either by matching the cross moment of the variables or using variable transformation to decorrelate them. Special attention must be paid to avoid negative branching probabilities that are more likely to occur when the correlation is high. One popular approach is to truncate branching probabilities that are negative or exceed unity to bring them to be between zero and one. While truncating branching probabilities may not exactly match the moments, Akyildirim et al. (2014) show that in their approach the matching error may be negligible and prove the convergence of their approach to the underlying processes. In this paper, we take the standard approach by matching the two marginal moments and the cross moment of the two underlying variables. We show that branching probabilities can be guaranteed to be between zero and one by adjusting the configuration of the lattice for a given (fixed) time step, and no probability truncation is needed.

To manage stochastic volatilities, we extend the lattice parameters from the grid size to include a jump size. With this change, the lattice configuration can be optimized to guarantee lattice feasibility even if both state variables are highly correlated with the correlation close to one. As will be shown later, this newly introduced parameter, the jump size, has the effect of refining the grid size. As opposed to traditional lattice approaches which perform lattice refinement on time space, our method can also perform refinement on the state space of the underlying variables even when the time step is not especially small. The consequence is better fitting of the underlying processes and faster convergence.

Numerical tests show that lattice feasibility has a direct impact on option pricing. We find that the values of out-of-the-money (OTM) options are most affected by negative probabilities, followed by in-the-money (ITM) options and at-the-money (ATM) options. Although negative probabilities matter less in some situations, the resulted distortion of the underlying probability distribution is in general hard to predict and exploit. Despite the importance of lattice feasibility, our numerical tests also show that lattice feasibility alone may not be sufficient to guarantee accurate valuation, especially when the time step of the lattice is not especially small. Since legitimate branching probabilities may not be unique, we propose an optimization approach to find branching probabilities that are not only legitimate but also can best fit the probability distribution of the underlying variables.

The rest of this paper is structured as follows. To lay the foundation of lattice feasibility for approximating two general, correlated diffusion processes, a general one-factor lattice model is first considered in Section 2, with the CIR model used as an example to show in detail how the lattice can be constructed. Section 3 considers the lattice for two general and correlated diffusion processes including the Heston SV model (Heston 1993), and derives the lattice feasibility conditions. We analyze the impact of lattice feasibility on options valuation in Section 4 and conduct extensive numerical tests, including pricing European options and American options in Section 5. This paper concludes in Section 6. All proofs of propositions and theorems are given in the Appendix A of this paper.

2. General One-Factor Trinomial Lattice

We consider the following general Ito process:

d Y_{t} = μ (Y_{t}) d t + σ (Y_{t}) d W_{t},

(1)

where

W_{t}

is a Wiener process and the volatility

σ (\cdot)

is a state-dependent function. In this paper, we assume that

σ (\cdot)

is lower bounded by a constant

σ^{min} > 0

, which is a small number close to zero. That is,

σ (y) \geq σ^{min} > 0, \forall y .

(2)

For any process whose volatility function does not satisfy (2), denoted as

\begin{matrix} d Y_{t} = μ (Y_{t}) d t + α (Y_{t}) d W_{t}, \end{matrix}

(3a)

where

α (y) \geq 0

(e.g.,

α (y) = ξ \sqrt{y}

in the CIR model), one can work with the following counterpart:

\begin{matrix} d Y_{t} = μ (Y_{t}) d t + max (α (Y_{t}), σ^{min}) d W_{t}, \end{matrix}

(3b)

whose volatility function satisfies (2). In (3b), the volatility is assumed to be bounded below by

σ^{min}

for the convenience of subsequent treatments. For a sufficiently small

σ^{min}

, the effect brought about by this lower bound is insignificant and the difference between the two models is negligible, as long as the drift

μ (\cdot)

is nonzero when the volatility is close to zero with the process on the brink of being absorbed. How to determine the lower bound

σ^{min}

in the setting of (3b) will be addressed later.

We will propose a trinomial lattice to approximate (1) based on H&W (Hull and White 1993). In their model, the time horizon is divided into intervals of equal length

Δ t

, and the process can only take on values that are multiples of

Δ y

. At time t, a typical lattice node

Y_{t} = y

branches to nodes

y + (k + h) Δ y

,

y + k Δ y

, and

y + (k - h) Δ y

at the next stage with respective branching probabilities

p_{u}

,

p_{m}

, and

p_{d}

, where the (text) subscripts represent upward, middle, and downward branches, respectively. As will be shown later, the jump size h depends on y and, therefore, the lattice may not recombine at all nodes.

The lattice size

Δ y = c σ^{s} \sqrt{Δ t}

is predetermined using the same method as when volatility is constant, where c and

σ^{s}

are constant. Since

σ (y)

is a function,

σ^{s}

is a surrogate constant volatility, which is set to be a small number no greater than

σ^{min}

:

0 < σ^{s} \leq σ^{min} .

(4)

The branching factor k is chosen such that

k Δ y

(middle branch) approximates the expected price deviation

μ (y) Δ t

. That is, k is the nearest integer of

μ (y) Δ t / Δ y

:

k (y) \equiv ⌊μ (y) Δ t / Δ y + 0.5⌋,

(5)

where

⌊ \cdot ⌋

is the floor function that maps to the nearest integer less than or equal to the operand. Let the mismatch between

μ (y) Δ t / Δ y

and

k (y)

in (5) be denoted by

ϵ

, defined as follows:

ϵ (y) \equiv μ (y) Δ t / Δ y - k (y) .

(6)

Apparently from (5),

| ϵ | \leq 0.5, \forall y .

(7)

The interpretation of (7) is that the middle branch may miss the expected price deviation by no more than

0.5 Δ y

. Note that both

ϵ

and k are functions of y and t, and may vary from node to node.

The jump size h is determined such that

h σ^{s}

approximates

σ (y)

. Likewise, let h be the nearest integer of

σ (y) / σ^{s}

:

x (y) \equiv σ (y) / σ^{s},

(8)

h (y) \equiv ⌊σ (y) / σ^{s} + 0.5⌋ = ⌊x (y) + 0.5⌋ .

(9)

Depending on how small

σ^{s}

is chosen, there is a minimum value of h incurred in (9) and is denoted by

\underset{̲}{h}

:

\underset{̲}{h} = min_{y} h (y) = ⌊σ^{min} / σ^{s} + 0.5⌋ .

(10)

Given

σ^{min}

, one can either choose a

σ^{s}

to determine the integer

\underset{̲}{h}

, as in (10); or vice versa. In our design, one determines

\underset{̲}{h}

first, followed by

σ^{s}

using the following formula:

σ^{s} = \frac{σ^{min}}{max (\underset{̲}{h} - 0.5, 1)} = \{\begin{matrix} σ^{min}, & if \underset{̲}{h} = 1, \\ σ^{min} / (\underset{̲}{h} - 0.5), & if \underset{̲}{h} \geq 2 . \end{matrix}

(11)

As

\underset{̲}{h}

increases,

Δ y

decreases, thereby indicating the presence of more refined lattice grids.

There may exist a mismatch between

σ (y) / σ^{s}

and

h (y)

in (9). This time, we measure their difference by their ratio and see how far it is away from 1. The ratio

γ

is defined as follows:

γ (y) \equiv \frac{σ (y) / σ^{s}}{h (y)} = \frac{x (y)}{h (y)} .

(12)

It is clear when h is high that

γ

is close to one. Therefore, the range of

γ (y)

is determined by the lower bound of

h (y)

, which is

\underset{̲}{h}

. It can be shown that the following relations hold:

\begin{matrix} 1 - 0.5 / \underset{̲}{h} \leq γ (y) \leq 1 + 0.5 \underset{̲}{h}, \forall y, & if \underset{̲}{h} \geq 2, \end{matrix}

(13a)

\begin{matrix} 1 \leq γ (y) \leq 1 + 0.5 \underset{̲}{h}, \forall y, & if \underset{̲}{h} = 1, \end{matrix}

(13b)

and

max (\underset{̲}{h} - 0.5, 1) \leq x (y) \leq \underset{̲}{h} + 0.5, \forall y .

(14)

Given any arbitrary lattice node y, use (5) and (9) to determine the value of k and h, respectively. One can then solve the branching probabilities

p_{u}

,

p_{m}

, and

p_{d}

at y, such that these three branches match the mean and variance of the price deviation. They can be expressed in terms of c,

ϵ (y)

,

γ (y)

, and

h (y)

:

\begin{matrix} p_{u} & = & \frac{1}{2} (\frac{ϵ^{2}}{h^{2}} + \frac{ϵ}{h} + \frac{γ^{2}}{c^{2}}), \end{matrix}

(15a)

\begin{matrix} p_{d} & = & \frac{1}{2} (\frac{ϵ^{2}}{h^{2}} - \frac{ϵ}{h} + \frac{γ^{2}}{c^{2}}), \end{matrix}

(15b)

\begin{matrix} p_{m} & = & 1 - (\frac{ϵ^{2}}{h^{2}} + \frac{γ^{2}}{c^{2}}) . \end{matrix}

(15c)

Proposition 1.

(One-Factor Lattice Feasibility) Given c and

\underset{̲}{h}

, for the branching probabilities (15a)–(15c) to be legitimate for all

| ϵ | \leq 0.5

and γ satisfying (13a)–(13b) for all y, the following condition must hold:

\sqrt{\frac{\underset{̲}{h} + 0.5}{\underset{̲}{h} - 0.5}} \leq c \leq max (\sqrt{3}, \sqrt{2 \underset{̲}{h} - 1}) .

(16)

Proposition 1 states how the values of the two key parameters c and

\underset{̲}{h}

should coordinate to achieve lattice feasibility. This proposition states that, if a lattice is configured such that its values for c and

\underset{̲}{h}

meet (16), the lattice will not have negative probabilities in any branches using (15a)–(15c). When

\underset{̲}{h} = 1

, the only feasible c is

\sqrt{3}

, which is an interesting number for c. When

c = \sqrt{3}

, each typical trinomial branch approximates the underlying normal distribution well (matching up to the 5th moment when

ϵ = 0

). However, when

\underset{̲}{h} \geq 2

, c becomes flexible and its feasible value spreads over a bigger range.

2.1. Effects of Grid Refinement

As pointed in the previous section, introducing

\underset{̲}{h}

has an effect of grid refinement. Basically, if

σ (y)

has a smaller

σ^{min}

, a higher

\underset{̲}{h}

will be required, which leads to smaller

σ^{s}

and

Δ y

. Since

| μ (y) Δ t - k (y) Δ y | = | ϵ (y) | Δ y \leq 0.5 Δ y

, a smaller

Δ y

means that the mean of the price change is better approximated. Traditionally, a discrete lattice approximates the underlying continuous processes better and better as the time step

Δ t

decreases, which may be viewed as lattice refinement on the time space. By increasing the value of

\underset{̲}{h}

, we introduce a refinement on the state space of the underlying variable, even when

Δ t

is not especially small. Therefore, grid refinement has an effect on better convergence.

2.2. Weak Convergence of the One-Factor Lattice

In the proposed trinomial lattice, the jump size

h (y)

varies from node to node due to the state-dependent volatility. As a result, the branches may not recombine in an easily predictable way as in the constant volatility case. Therefore, it is not immediately clear that such a lattice would converge to the underlying process. Next, we show that our proposed lattice indeed converges weakly to the underlying diffusion process in (1).

Proposition 2.

Let

Y_{t}^{Δ t}

denote the trinomial lattice of the one-factor process

Y_{t}

defined in (1) with

Δ y = c σ^{s} \sqrt{Δ t}

, and

k (y)

and

h (y)

defined by (5) and (9), respectively. Suppose that

Y_{0}^{Δ t} = Y_{0}

and the following conditions

\begin{matrix} 0 \leq p_{u}, p_{m}, p_{d} \leq 1 \end{matrix}

(17a)

\begin{matrix} E [(Y_{t + Δ t}^{Δ t} - y) | Y_{t}^{Δ t} = y] & = & μ (y) Δ t \end{matrix}

(17b)

\begin{matrix} E [{(Y_{t + Δ t}^{Δ t} - y)}^{2} | Y_{t}^{Δ t} = y] & = & σ^{2} (y) Δ t + μ^{2} (y) Δ t^{2}, \end{matrix}

(17c)

hold for all lattice node y and for all time interval

Δ t

. As

Δ t \to 0

,

Y_{t}^{Δ t}

converges weakly to the diffusion process

Y_{t}

.

2.3. Estimating $σ^{min}$ for CIR Model and Feller Condition

In this section, we use the Cox et al. (1985) model as an example to illustrate the applicability of the proposed framework. Consider

d Y_{t} = - λ (Y_{t} - m) d t + ξ \sqrt{Y_{t}} d W_{t},

(18)

where

λ

, m, and

ξ

are positive constants. The CIR model (18) does not meet (1) and (2), but (3a). This is because the lower bound of the square root volatility function is zero. To meet (3b), assume that

σ^{min} = ε > 0

is infinitesimal. In this section, we show how to find an approximate lower bound

{\hat{σ}}^{min} \geq ε

for the volatility function. Equivalently, we would find a

y^{min}

such that

{\hat{σ}}^{min} = σ (y^{min}) = ξ \sqrt{y^{min}} \geq ε

, i.e.,

y^{min} \geq ε^{2} / ξ^{2}

.

To do this, we must exploit the mean reverting (MR) property of the drift function of (18). Our approach is to find a

y^{min}

, hopefully much greater than

ε^{2} / ξ^{2}

, such that

y^{min}

is at a level where

Y_{t}

would revert upward so that the whole lattice

Y_{t + 1}

remains to be capped from below by it. Since the lattice is capped from below by

y^{min}

, no other volatility value in the lattice is smaller than

{\hat{σ}}^{min}

.

Note that this reverting level

y^{min}

depends on

Δ t

and decreases to 0 as

Δ t \to 0

. This should not be a problem because, in practice, lattices yield satisfactory valuation results at some finite

Δ t

where

{\hat{σ}}^{min}

hopefully is still much greater than

ε

.

As mentioned above, we would like to see the entire lattice to be well contained in the

R^{+}

domain. With

{\hat{σ}}^{min} = ξ \sqrt{y^{min}}

and using (11), we have

Δ y = ξ c \sqrt{Δ t} \sqrt{y^{min}} / max (\underset{̲}{h} - 0.5, 1) .

(19)

We want to find a

y^{min} > 0

such that

⌊\frac{y}{Δ y}⌋ + k (y) - h (y) \geq ⌊\frac{y^{min}}{Δ y}⌋, \forall y \geq y^{min}

(20)

holds. In (20),

⌊ y / Δ y ⌋

may be viewed as the index of the lattice node of y (with the index equal to 0 at

y = 0

), and from this node,

(k (y) - h (y)) Δ y

corresponds to the price change of the downward branch. Therefore, the left-hand side of (20) represents the index of this lattice node mapped onto from y at the next time period, which should be bounded below by the node corresponding to

y^{min}

. Note in (20) that

y^{min}

is a continuous variable, which may not match the value of some discrete lattice node. If

y^{min} \geq Δ y

, we are done; otherwise, we check if

k (Δ y) - h (Δ y) \geq 0

(21)

in order to ensure that the lattice will revert upward at

y = Δ y

.

An important fact is that a

y^{min} > 0

may not necessarily exist for any arbitrary process parameters. For example, it has been shown that

2 λ m \geq ξ^{2}

must hold for y in (18) to be bounded below by zero, which is known as Feller condition (Feller 1951). Next, we shall show that, if the Feller condition is met, then

y^{min} > 0

exists and the entire lattice can be well contained in

R^{+}

.

Proposition 3.

If (i)

1 - λ Δ t > 0

, and (ii)

4 λ m > ξ^{2} c^{2} / (1 - λ Δ t)

both hold, then

y^{min} > 0

exists and the entire lattice can be well contained in

R^{+}

.

It is clear that condition (i) of Proposition 3 can be easily satisfied by reducing

Δ t

, but condition (ii) may not. Compared with the Feller condition

λ m \geq 0.5 ξ^{2}

, condition (ii) has a similar form and may be viewed as the discrete version of Feller condition in our proposed lattice approach.

For a finite

Δ t > 0

, if condition (ii) is violated, to rectify it, it is probably more effective to reduce the value of c (to its lower bound by increasing

\underset{̲}{h}

) than to reduce

Δ t

. This demonstrates another application of grid refinement. As a limit, our approach requires

λ m > 0.25 ξ^{2}

to hold (by setting

Δ t = 0

and

c = 1

) so that

y^{min} > 0

exists. This condition is slightly looser than the Feller condition due to the fact that it is easier to bound the process of y from below in the (approximate) discrete domain than in the continuous one. This also means that, whenever the Feller condition is met, a Cox et al. (1985) process can always be approximated by the proposed lattice.

Proposition 4.

Given a Cox et al. (1985) process (18) that satisfies the Feller condition, there exists a feasible lattice configuration

(Δ t, \underset{̲}{h}, c)

such that

y^{min} > 0

and the lattice can be well contained in

R^{+}

.

3. Two-Factor Trinomial Lattice

In this section, we consider a more general two-factor model:

\begin{matrix} d Y_{1, t} & = & μ_{1} (Y_{1, t}, Y_{2, t}) d t + σ_{1} (Y_{1, t}, Y_{2, t}) d W_{1, t} \end{matrix}

(22a)

\begin{matrix} d Y_{2, t} & = & μ_{2} (Y_{1, t}, Y_{2, t}) d t + σ_{2} (Y_{1, t}, Y_{2, t}) d W_{2, t}, \end{matrix}

(22b)

where

W_{1, t}

and

W_{2, t}

are Wiener processes with an instantaneous correlation

ρ

. Following the treatment described in (3a)–(3b) in Section 2, we assume that there exist

σ_{l}^{min} > 0

,

l = 1, 2

such that

σ_{l} (y_{1}, y_{2}) \geq σ_{l}^{min} > 0, \forall y_{1}, y_{2}, l = 1, 2 .

(23)

When both individual one-factor lattices are integrated, we shall apply the same convention to all relevant notations, such as

σ^{s}

, k, h,

\underset{̲}{h}

, and c, by adding a subscript

l = 1, 2

to reflect the index of the process that each notation represents.

Our idea of using two-factor trinomial lattices to approximate (22a)–(22b) follows the one proposed by Hull and White (1994): obtain a one-factor trinomial lattice for each state variable first, then integrate both lattices to one such that nine branches (

3 \times 3

) emanating from each lattice node. Let

Δ y_{l} = c_{l} σ_{l}^{s} \sqrt{Δ t}

,

l = 1, 2

, a definition extended directly from that in the one-factor case in Section 2. Assume the one-factor discrete price increment, branching factor for

y_{l}

, and branch jump size are

Δ y_{l}

,

k_{l}

, and

h_{l}

,

l = 1, 2

, respectively, as defined in Section 2. An example is given in Figure 1, where a node

(y_{1, t}, y_{2, t})

at time t is shown in the left panel. The first one calculates

k_{l}

and

h_{l}

using (5) and (9), respectively, for each factor l to determine the three nodes

y_{l, t + 1}^{u}

,

y_{l, t + 1}^{m}

, and

y_{l, t + 1}^{d}

at time

t + 1

that each factor l will map into. The nine nodes of their combinations at

t + 1

are shown in the right panel.

The main task is to solve the nine branching probabilities while matching the first two moments of each factor and their correlation. We further define

k_{l u} \equiv k_{l} + h_{l}

,

k_{l m} \equiv k_{l}

, and

k_{l d} \equiv k_{l} - h_{l}

,

l = 1, 2

. At each node, to determine the nine branching probabilities

p_{i j}, i, j \in Ω

, where

Ω \equiv {u, m, d}

, one solves the following linear system, denoted by

(Q_{0})

:

(Q_{0})

\begin{matrix} \sum_{i, j \in Ω} p_{i j} k_{1 i} Δ y_{1} & = & μ_{1} Δ t, \end{matrix}

(24a)

\begin{matrix} \sum_{i, j \in Ω} p_{i j} k_{2 j} Δ y_{2} & = & μ_{2} Δ t, \end{matrix}

(24b)

\begin{matrix} \sum_{i, j \in Ω} p_{i j} k_{1 i}^{2} Δ y_{1}^{2} & = & μ_{1}^{2} Δ t^{2} + σ_{1}^{2} Δ t, \end{matrix}

(24c)

\begin{matrix} \sum_{i, j \in Ω} p_{i j} k_{2 j}^{2} Δ y_{2}^{2} & = & μ_{2}^{2} Δ t^{2} + σ_{2}^{2} Δ t, \end{matrix}

(24d)

\begin{matrix} \sum_{i, j \in Ω} p_{i j} k_{1 i} k_{2 j} Δ y_{1} Δ y_{2} & = & ρ σ_{1} σ_{2} Δ t + μ_{1} μ_{2} Δ t^{2}, \end{matrix}

(24e)

\begin{matrix} \sum_{i, j \in Ω} p_{i j} & = & 1, \end{matrix}

(24f)

\begin{matrix} p_{i j} & \geq & 0, \forall i, j \in Ω \end{matrix}

(24g)

At each node of the trinomial lattice, it is required to solve

(Q_{0})

to determine the branching probabilities, which has nine variables, six linear equations, and nine nonnegativity inequalities. In optimization terminology,

(Q_{0})

is said to be feasible if a (feasible) solution exists that meets all constraints. Otherwise,

(Q_{0})

is said to be infeasible, which means that some probabilities must be negative or greater than 1. Lattice feasibility refers to the condition where

(Q_{0})

is feasible for all possible nodes in a lattice. In the next proposition, we will show that lattice feasibility is a necessary condition for weak convergence of the proposed two-factor lattice.

Proposition 5.

Let

(Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t})

denote the trinomial lattice of the two-factor process

(Y_{1, t}, Y_{2, t})

defined in (22a)–(22b) with

Δ y_{l} = c_{l} σ_{l}^{s} \sqrt{Δ t}

, and

k_{l} (y)

and

h_{l} (y)

defined by (5) and (9), respectively, for

l = 1, 2

. Suppose that

(Y_{1, 0}^{Δ t}, Y_{2, 0}^{Δ t}) = (Y_{1, 0}, Y_{2, 0})

and the following conditions

\begin{matrix} 0 \leq p_{i j} \leq 1, i, j \in Ω \end{matrix}

(25a)

\begin{matrix} E [(Y_{l, t + Δ t}^{Δ t} - y_{l}) | (Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t}) = (y_{1}, y_{2})] & = & μ_{l} Δ t, l = 1, 2 \end{matrix}

(25b)

\begin{matrix} E [{(Y_{l, t + Δ t}^{Δ t} - y_{l})}^{2} | (Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t}) = (y_{1}, y_{2})] & = & σ_{l}^{2} Δ t + μ_{l}^{2} Δ t^{2}, l = 1, 2 \end{matrix}

(25c)

\begin{matrix} E [(Y_{1, t + Δ t}^{Δ t} - y_{1}) (Y_{2, t + Δ t}^{Δ t} - y_{2}) | (Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t}) = (y_{1}, y_{2})] & = & ρ σ_{1} σ_{2} Δ t + μ_{1} μ_{2} Δ t^{2} \end{matrix}

(25d)

hold for all lattice node

(y_{1}, y_{2})

and for all time interval

Δ t

. As

Δ t \to 0

,

(Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t})

converges weakly to the diffusion process

(Y_{1, t}, Y_{2, t})

.

3.1. Feasibility of the General Lattice

Consider the nine branches that emanated from a fixed node, where the one-factor branching probabilities are assumed to be

{{\tilde{p}}_{l u}, {\tilde{p}}_{l m}, {\tilde{p}}_{l d}}

,

l = 1, 2

. These probabilities are denoted with a tilde to indicate that they are not variables. In addition, assume at this node that the corresponding error factors of the branching and the jump sizes are

ϵ_{l}

and

γ_{l}

,

l = 1, 2

. The branching probabilities can be obtained as follows (cf. (15a)–(15c)):

\begin{matrix} {\tilde{p}}_{l u} & = & \frac{1}{2} (\frac{ϵ_{l}^{2}}{h_{l}^{2}} + \frac{ϵ_{l}}{h_{l}} + \frac{γ_{l}^{2}}{c_{l}^{2}}) \end{matrix}

(26a)

\begin{matrix} {\tilde{p}}_{l d} & = & \frac{1}{2} (\frac{ϵ_{l}^{2}}{h_{l}^{2}} - \frac{ϵ_{l}}{h_{l}} + \frac{γ_{l}^{2}}{c_{l}^{2}}) \end{matrix}

(26b)

\begin{matrix} {\tilde{p}}_{l m} & = & 1 - (\frac{ϵ_{l}^{2}}{h_{l}^{2}} + \frac{γ_{l}^{2}}{c_{l}^{2}}) \end{matrix}

(26c)

The key step is to rewrite

(Q_{0})

to the following equivalent form

(Q)

, in terms of

{{\tilde{p}}_{l i}}

,

i \in Ω

, and

ϵ_{l}, γ_{l}

,

l = 1, 2

:

(Q (ϵ_{1}, x_{1}, γ_{1}, h_{1}, ϵ_{2}, x_{2}, γ_{2}, h_{2}))

\begin{matrix} \sum_{j \in Ω} p_{u j} & = & {\tilde{p}}_{1 u} \end{matrix}

(27a)

\begin{matrix} \sum_{j \in Ω} p_{d j} & = & {\tilde{p}}_{1 d} \end{matrix}

(27b)

\begin{matrix} \sum_{i \in Ω} p_{i u} & = & {\tilde{p}}_{2 u} \end{matrix}

(27c)

\begin{matrix} \sum_{i \in Ω} p_{i d} & = & {\tilde{p}}_{2 d} \end{matrix}

(27d)

\begin{matrix} p_{uu} + p_{dd} - p_{ud} - p_{du} & = & ρ \frac{γ_{1} γ_{2}}{c_{1} c_{2}} + \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}} \end{matrix}

(27e)

\begin{matrix} \sum_{i, j \in Ω} p_{i j} & = & 1 \end{matrix}

(27f)

\begin{matrix} p_{i j} & \geq & 0, \forall i, j \in Ω . \end{matrix}

(27g)

To maintain legibility, we denote

θ_{l} \equiv (ϵ_{l}, x_{l}, γ_{l}, h_{l}) \in Θ_{l} ({\underset{̲}{h}}_{l})

,

l = 1, 2

, where

Θ_{l} ({\underset{̲}{h}}_{l}) \equiv {θ_{l} | (7), (14), (13 a) - (13 b), \forall h_{l} \geq {\underset{̲}{h}}_{l}}, l = 1, 2 .

(28)

In (28),

Θ_{l} ({\underset{̲}{h}}_{l})

covers all possible nodes in

Y_{l}

,

l = 1, 2

.

(Q)

is denoted with

(θ_{1}, θ_{2})

to emphasize that it is a ‘local’ problem associated with some specific lattice node (as

{\tilde{p}}_{l u}

and

{\tilde{p}}_{l d}

are functions of

θ_{l}

,

l = 1, 2

). Equations (27a)–(27d) intend to match the means and variances of the price deviations of the two factors, which have been met by the marginal probabilities

{{\tilde{p}}_{1 j}}

and

{{\tilde{p}}_{2 j}}

,

j \in Ω

. Equation (27e) is derived from (24e) with some algebra.

Consider an initial solution

p_{i j} = {\tilde{p}}_{1 i} {\tilde{p}}_{2 j}, i, j \in Ω

, which satisfies (27a)–(27f) but (27e), unless

ρ = 0

. To determine the range of

ρ

on the right-hand side (RHS) of (27e) that

(Q)

is feasible, consider the following two linear programs:

\begin{matrix} R^{min} (θ_{1}, θ_{2}) \equiv min_{p_{i j}, i, j \in Ω} {p_{uu} + p_{dd} - p_{ud} - p_{du} | (Q) excluding (27 e)} \end{matrix}

(29a)

\begin{matrix} R^{max} (θ_{1}, θ_{2}) \equiv max_{p_{i j}, i, j \in Ω} {p_{uu} + p_{dd} - p_{ud} - p_{du} | (Q) excluding (27 e)} \end{matrix}

(29b)

For a given set of

(θ_{1}, θ_{2})

, it is clear that

(Q)

is feasible, if, and only if, the RHS of (27e) is between

R^{min}

and

R^{max}

, i.e.,

R^{min} (θ_{1}, θ_{2}) \leq ρ \frac{γ_{1} γ_{2}}{c_{1} c_{2}} + \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}} \leq R^{max} (θ_{1}, θ_{2}) .

(30)

Therefore, for

θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})

,

l = 1, 2

, we have

c_{1} c_{2} \cdot max_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} (R^{min} (θ_{1}, θ_{2}) - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \leq ρ \leq c_{1} c_{2} \cdot min_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} (R^{max} (θ_{1}, θ_{2}) - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}),

(31)

Given the values of

c_{1}

and

c_{2}

, using (31), the range of the correlation

ρ

between the two factors for which the lattice can guarantee feasible branching probabilities at all nodes and all stages can be identified. Next, we will show that (31) is symmetric such that its upper bound is the negative of its lower bound.

Proposition 6.

For any arbitrary

θ_{l} = (ϵ_{l}, x_{l}, γ_{l}, h_{l}) \in Θ_{l} ({\underset{̲}{h}}_{l})

,

l = 1, 2

, the following equality holds:

min_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} (R^{max} (θ_{1}, θ_{2}) - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) = - max_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} (R^{min} (θ_{1}, θ_{2}) - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}), l = 1, 2 .

(32)

Since (31) is symmetric, we can focus on solving

R^{max}

. The closed-form expression for

R^{max}

has been derived in Tseng and Lin (2007), which is duplicated in the following proposition for the sake of clarity.

Proposition 7.

R^{max} = min {R_{1}^{max}, R_{2}^{max}, R_{3}^{max}, R_{4}^{max}, R_{5}^{max}, R_{6}^{max}}

, where

\begin{matrix} R_{1}^{max} & = & {\tilde{p}}_{1 u} + {\tilde{p}}_{1 d} \\ R_{2}^{max} & = & {\tilde{p}}_{2 u} + {\tilde{p}}_{2 d} \\ R_{3}^{max} & = & {\tilde{p}}_{1 u} + {\tilde{p}}_{2 d} \\ R_{4}^{max} & = & 1 - ({\tilde{p}}_{1 d} - {\tilde{p}}_{2 d}) - ({\tilde{p}}_{2 u} - {\tilde{p}}_{1 u}) \\ R_{5}^{max} & = & {\tilde{p}}_{2 u} + {\tilde{p}}_{1 d} \\ R_{6}^{max} & = & 1 - ({\tilde{p}}_{2 d} - {\tilde{p}}_{1 d}) - ({\tilde{p}}_{1 u} - {\tilde{p}}_{2 u}), \end{matrix}

where

{\tilde{p}}_{l i}

,

l = 1, 2

,

i \in Ω

are from (26a)–(26c).

Note that

R_{l}^{max}

,

l = 1, \dots, 6

are also functions of

θ_{1}

and

θ_{2}

. It can be seen that

R_{1}^{max}

,

R_{3}^{max}

, and

R_{4}^{max}

reduce to

R_{2}^{max}

,

R_{5}^{max}

, and

R_{6}^{max}

, and vice versa, respectively, by exchanging the factor indices 1 and 2.

Using the result from Proposition 6 to find the upper bound of (31), one needs to solve

min_{θ_{1}, θ_{2}} min_{i = 1, \dots, 6} \frac{1}{γ_{1} γ_{2}} \{R_{i}^{max} - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}\}

(33)

for

θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})

,

l = 1, 2

. It would be easier to solve if one could switch the two minimization operators in (33) as follows:

min_{i = 1, \dots, 6} \{min_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} (R_{i}^{max} - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}})\} .

(34)

It turns out that both (33) and (34) are equivalent, which is shown in the following proposition.

Proposition 8.

Let

f_{1}, \dots, f_{n} : R^{n} \to R

be n continuous functions and

D \subseteq R^{n}

, where

n \in N

is finite. Then,

min_{z \in D} min {f_{1} (z), \dots, f_{n} (z)} = min {min_{z \in D} f_{1} (z), \dots, min_{z \in D} f_{n} (z)} .

(35)

Next, we state the main theorem of this section.

Proposition 1.

(Two-Factor Lattice Feasibility) Given a lattice configuration

(c_{1}, c_{2}, {\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

,

(Q)

is feasible for all

θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})

,

l = 1, 2

, if, and only if,

| ρ | \leq \bar{ρ} (c_{1}, c_{2}, {\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

, where

\bar{ρ} (c_{1}, c_{2}, {\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2}) \equiv c_{1} c_{2} \cdot min {w_{1}, w_{2}, w_{3}, w_{4}},

(36)

and

\begin{matrix} w_{1} & = & min_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} (\frac{ϕ_{1}^{2}}{h_{1}^{2}} - \frac{ϕ_{1}}{2 h_{1} h_{2}} + \frac{γ_{1}^{2}}{c_{1}^{2}}), ϕ_{1} (h_{1}, h_{2}) \equiv min (\frac{h_{1}}{4 h_{2}}, \frac{1}{2}), \end{matrix}

(37a)

\begin{matrix} w_{2} & = & min_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} (\frac{ϕ_{2}^{2}}{h_{2}^{2}} - \frac{ϕ_{2}}{2 h_{1} h_{2}} + \frac{γ_{2}^{2}}{c_{2}^{2}}), ϕ_{2} (h_{1}, h_{2}) \equiv min (\frac{h_{2}}{4 h_{1}}, \frac{1}{2}), \\ w_{3} & = & min_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} [\frac{1}{8} (ϕ_{3}^{2} - 1) + \frac{1}{2} (\frac{γ_{1}^{2}}{c_{1}^{2}} + \frac{γ_{2}^{2}}{c_{2}^{2}})], \end{matrix}

(37b)

\begin{matrix} ϕ_{3} (h_{1}, h_{2}) \equiv max (1 - \frac{1}{h_{1}} - \frac{1}{h_{2}}, 0), \end{matrix}

(37c)

\begin{matrix} w_{4} & = & min_{θ_{1}, θ_{2}} \frac{1}{γ_{1} γ_{2}} (1 - \frac{1}{2 h_{1}}) (1 - \frac{1}{2 h_{2}}) . \end{matrix}

(37d)

Theorem 1 is general, but, unfortunately, no explicit functional expressions for

w_{1}, \dots, w_{4}

are available because each of them requires solving a two-dimensional global minimum of a discontinuous function. However, given a set of lattice parameters

(c_{1}, c_{2}, {\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

, using numerical methods, such as exhaustive search, one can easily obtain the numerical values of

w_{1}, \dots, w_{4}

.

From the perspective of minimizing computational requirements, one would prefer smaller values of

({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

so that the grid size may not become too small. To achieve that, next, we try to maximize

\bar{ρ}

over

c_{1}

and

c_{2}

for a given

({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

pair:

ρ_{max} ({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2}) = max_{c_{1}, c_{2}} {\bar{ρ} (c_{1}, c_{2}, {\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2}) | c_{1}, c_{2} subject to (16)} .

(38)

Using numerical methods, Table 1 shows the values of

ρ_{max}

for all 100 pairs of

({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

for

1 \leq {\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2} \leq 10

. Note that, since

ρ_{max}

is symmetric, Table 1 only displays half of the pairs with

{\underset{̲}{h}}_{1} \geq {\underset{̲}{h}}_{2}

. From Table 1, it is clear that

ρ_{max} ({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

is an increasing function in

{\underset{̲}{h}}_{1}

and

{\underset{̲}{h}}_{2}

. If one considers increasing

{\underset{̲}{h}}_{1}

by 1 to be as difficult in terms of computational effort as increasing

{\underset{̲}{h}}_{2}

by 1, then the diagonal elements where (

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2}

) seem to be the most efficient choices.

When

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2}

, to obtain the optimal solutions

(c_{1}^{*}, c_{2}^{*})

for achieving

ρ_{max}

in (38), it turns out that

w_{1} = w_{2}

in Theorem 1 are the smallest elements in the minimum operator in (36), so a symmetrical optimal solution

c_{1}^{*} = c_{2}^{*}

is obtained. Using numerical methods, the values of

ρ_{max}

and the corresponding (

c_{1}^{*}, c_{2}^{*})

are summarized in Table 2.

As an example, if

ρ = - 0.8

, using the optimized

(c_{1}^{*}, c_{2}^{*})

from Table 2, one can use

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2} = 5

by using

(c_{1}, c_{2}) = (1.1145, 1.1145)

obtained from the Appendix A. Note that the results presented in this section are for general two-factor lattices. If the two underlying diffusion processes have some special structures, e.g., the class of square root volatility models, then

ρ_{max}

may be further increased so that the required

({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

that guarantees lattice feasibility may be lowered. In the next section, we focus on the Heston SV model, for which explicit functional expressions for

w_{1}, \dots, w_{4}

are available.

3.2. Lattice for the Heston SV Model

Consider the Heston SV model (Heston 1993) as follows:

\begin{matrix} d Y_{1, t} & = & μ d t + \sqrt{Y_{2, t}} d W_{1, t} \end{matrix}

(39a)

\begin{matrix} d Y_{2, t} & = & - λ (Y_{2, t} - m) d t + ξ \sqrt{Y_{2, t}} d W_{2, t}, \end{matrix}

(39b)

where

Y_{1, t} = ln S_{t}

is the logarithm of the stock price;

μ

,

λ

, m, and

ξ

are positive constants.

Since both volatilities in (39a) and (39b) are functions of

Y_{2}

, we focus on the MR process of

Y_{2}

. As mentioned in Section 2.2, if the Feller condition is satisfied,

y_{2}^{min} > 0

exists for (39b). Based on

y_{2}^{min}

, we can identify positive

σ_{2}^{min}

and

σ_{2}^{s}

. Once

σ_{2}^{s}

is obtained, we set

σ_{1}^{s} = σ_{2}^{s} / ξ .

(40)

Proposition 9.

To implement the Heston SV model given in (39a)–(39b) using the proposed trinomial lattice and (40), if

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2}

, then

h_{1} = h_{2}

,

x_{1} = x_{2}

, and

γ_{1} = γ_{2}

at all nodes.

Using Proposition 9, we will show that Theorem 1 has an explicit functional form for

\bar{ρ}

if we set

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2}

. Since

μ_{1}

is a constant,

ϵ_{1}

is a fixed constant for all nodes. We denote it as

{\tilde{ϵ}}_{1}

. Without loss of generality, we assume

{\tilde{ϵ}}_{1} = 0

. To make

{\tilde{ϵ}}_{1} = 0

, from (5), we need to have

\frac{μ_{1} Δ t}{c_{1} σ_{1}^{s} \sqrt{Δ t}} = \frac{μ_{1} \sqrt{Δ t}}{c_{1} σ_{1}^{s}} = the nearest integer = k_{1} .

(41)

This can be achieved by adjusting the value of

c_{1}

,

σ_{1}^{min}

, or

Δ t

. Adjusting

c_{1}

and/or

Δ t

is more straightforward than adjusting

σ_{1}^{min}

. However, since

c_{1}

is a parameter for lattice configuration, we recommend adjusting the value of

Δ t

slightly to eliminate the remainder of (41) in order to make

{\tilde{ϵ}}_{1} = 0

.

With

ϵ_{1} = 0

and

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2} = \underset{̲}{h}

, using (31), the condition for lattice feasibility can be significantly simplified as follows:

max_{θ_{1}, θ_{2}} \frac{1}{γ^{2}} (R^{min} (θ_{1}, θ_{2})) \leq \frac{ρ}{c_{1} c_{2}} \leq min_{θ_{1}, θ_{2}} \frac{1}{γ^{2}} (R^{max} (θ_{1}, θ_{2})),

(42)

for

θ_{l} \in Θ_{l} (\underset{̲}{h})

,

l = 1, 2

.

Theorem 2.

(Lattice feasibility for the Heston SV) Pertaining to the Heston SV model in (39a)–(39b), assume

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2} = \underset{̲}{h}

and

ϵ_{1} = 0

. Given a lattice configuration

(c_{1}, c_{2}, \underset{̲}{h})

,

(Q)

is feasible for all

θ_{l} \in Θ_{l} (\underset{̲}{h})

,

l = 1, 2

, if, and only if,

ρ \leq {\bar{ρ}}^{H} (c_{1}, c_{2}, \underset{̲}{h})

, where

{\bar{ρ}}^{H} (c_{1}, c_{2}, \underset{̲}{h}) = c_{1} c_{2} \cdot min {w_{1}^{H}, w_{2}^{H}, w_{3}^{H}, w_{4}^{H}},

(43)

where

\begin{matrix} w_{1}^{H} & = & \frac{1}{c_{1}^{2}} \end{matrix}

(44a)

\begin{matrix} w_{2}^{H} & = & \frac{1}{c_{2}^{2}} \end{matrix}

(44b)

\begin{matrix} w_{3}^{H} & = & \frac{1}{2} (\frac{1}{c_{1}^{2}} + \frac{1}{c_{2}^{2}}) - \frac{1}{4 (max (\underset{̲}{h}, 2) - 0.5)}, \end{matrix}

(44c)

\begin{matrix} w_{4}^{H} & = & \frac{\underset{̲}{h} (\underset{̲}{h} - 0.5)}{{(\underset{̲}{h} + 0.5)}^{2}} \end{matrix}

(44d)

Next, we try to maximize

{\bar{ρ}}^{H}

by adjusting the lattice configuration

(c_{1}, c_{2})

while maintaining lattice feasibility. Let

ρ_{max}^{H} (\underset{̲}{h}) = max_{c_{1}, c_{2}} {{\bar{ρ}}^{H} (c_{1}, c_{2}, \underset{̲}{h}) | c_{1}, c_{2} subject to (16)} .

(45)

Using numerical methods, the solution of (45) is obtained as follows. When

\underset{̲}{h} \leq 2

,

ρ_{max}^{H}

is achieved at the lower bounds of

c_{1}

and

c_{2}

, where

c_{1} = c_{2} = \sqrt{(\underset{̲}{h} + 0.5) / (\underset{̲}{h} - 0.5)}

, and

ρ_{max}^{H}

is determined by

w_{3}^{H}

. When

\underset{̲}{h} \geq 3

,

ρ_{max}^{H}

is achieved at

c_{1} = \sqrt{(\underset{̲}{h} + 0.5) / (\underset{̲}{h} - 0.5)}

, the lower bound, and

c_{2}

at the point where

w_{3}^{H} = w_{4}^{H}

. That is,

ρ_{max}^{H} (\underset{̲}{h}) = c_{1}^{H} (\underset{̲}{h}) c_{2}^{H} (\underset{̲}{h}) \frac{\underset{̲}{h} (\underset{̲}{h} - 0.5)}{{(\underset{̲}{h} + 0.5)}^{2}}, \underset{̲}{h} \geq 3,

(46)

where

\begin{matrix} c_{1}^{H} (\underset{̲}{h}) & = & \sqrt{\frac{\underset{̲}{h} + 0.5}{\underset{̲}{h} - 0.5}} \end{matrix}

(47)

\begin{matrix} c_{2}^{H} (\underset{̲}{h}) & = & \sqrt{\frac{{(\underset{̲}{h} + 0.5)}^{2} (\underset{̲}{h} - 0.5)}{{\underset{̲}{h}}^{3} - {\underset{̲}{h}}^{2} + 1.25 \underset{̲}{h}}} . \end{matrix}

(48)

It can be seen that

ρ_{max}^{H} (\underset{̲}{h})

approaches 1 as

\underset{̲}{h}

increases. The value of

ρ_{max}^{H} (\underset{̲}{h})

for

\underset{̲}{h} = 1, \dots, 10

is given in Table 3, along with the corresponding

(c_{1}^{H}, c_{2}^{H})

. Note that

ρ_{max}^{H} (\underset{̲}{h})

is symmetric in

c_{1}

and

c_{2}

. Therefore, another solution for

ρ_{max}^{H} (\underset{̲}{h})

is to switch

c_{1}^{H}

and

c_{2}^{H}

.

Back to the previous example, if

ρ = - 0.8

, now it only requires

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2} = 3

with

(c_{1}, c_{2}) =

(1.1832,1.1866) or (1.1866, 1.1832) to achieve lattice feasibility, a reduction from

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2} = 5

of the optimized general model.

4. Impact of Lattice Infeasibility on Option Valuation

In this section, we investigate how much lattice infeasibility could impact option valuation. Consider the following SV model of Heston (1993) where the stock price (

S_{t}

) and the instantaneous variance (

V_{t}

) under the risk neutral measure are defined as follows:

\begin{matrix} d ln S_{t} & = & (r - q - \frac{V_{t}}{2}) d t + \sqrt{V_{t}} d W_{1, t}, \end{matrix}

(49a)

\begin{matrix} d V_{t} & = & λ (m - V_{t}) d t + ξ \sqrt{V_{t}} d W_{2, t}, \end{matrix}

(49b)

where r is the constant risk-free rate, q is the dividend yield, and

d W_{1, t}

and

d W_{2, t}

are Wiener processes such that

d W_{1, t} \cdot d W_{2 . t} = ρ d t

.

4.1. An Optimization Perspective

The problem

(Q)

contains six linear equations with nine variables and a nonnegativity constraint. Since

(Q)

has no linear independence of the linear equations, these linear equations have infinitely many solutions (sets of branching probabilities). Therefore, the key is the nonnegativity constraint, which requires a solution to be a set of legitimate probabilities. When the term ‘lattice feasibility’ was coined by Tseng and Lin (2007), there was an implication of using optimization to determine branching probabilities. The idea was to add an objective function to be optimized subject to

(Q)

. Since not all feasible solutions contribute the same to the objective function, using optimization would find not only a feasible solution, but an optimal solution. The objective function in this context refers to the quality of fitting the probability distribution of the underlying uncertainties. The authors in Tseng and Lin (2007) used the following objective function:

min_{p_{i j}} \sum_{i, j \in Ω} {(p_{i j} - {\tilde{p}}_{1 i} {\tilde{p}}_{2 j})}^{2},

(50)

where

p_{i j}

,

i, j \in Ω

are subject to

(Q)

; and

{\tilde{p}}_{1 i}

and

{\tilde{p}}_{2 j}

are marginal probabilities obtained from (26a)–(26c). Tseng and Lin (2007) showed that doing so best fits the probability distribution of the underlying variables in some measure.

The optimization problem in (50) is a standard quadratic programming (QP) problem with linear constraints. Apparently, the optimal solution of (50) is

p_{i j} = {\tilde{p}}_{1 i} {\tilde{p}}_{2 j}

,

i, j \in Ω

when

ρ = 0

. When

ρ \neq 0

, this optimization problem finds the solution that has the least deviation from the solution of the uncorrelated case. At each node, their approach requires solving a simple optimization problem to determine branching probabilities for

(Q)

. We adopt this approach in this paper and refer to it as Best-Fit. Although solving a (QP) at each node may seem cumbersome, we have developed an iterative method to identify the binding constraints at optimality. Once the binding constraints are identified, the corresponding feasible solution is the optimal one due to the convexity of the QP. This approach is very efficient as it usually takes a few trials to correctly identify the binding constraints.

On the other hand, we need a counterpart method for determining branching probabilities that works well in practice but may not guarantee lattice feasibility. We consider the popular lattice approach proposed by Hull and White (1994) (denoted by H&W) as this counterpart method. Note that the comparison is conducted in the same lattice framework such that the lattice structure is exactly the same except for their respective methods of determining branching probabilities at each node: the first is by Best-Fit and the second by H&W.

When

ρ < 0

, Hull and White (1994) suggests with

δ = ρ / 36

:

\begin{matrix} [\begin{matrix} p_{du} & p_{mu} & p_{uu} \\ p_{dm} & p_{mm} & p_{um} \\ p_{dd} & p_{md} & p_{ud} \end{matrix}] = [\begin{matrix} {\tilde{p}}_{1 d} {\tilde{p}}_{2 u} & {\tilde{p}}_{1 m} {\tilde{p}}_{2 u} & {\tilde{p}}_{1 u} {\tilde{p}}_{2 u} \\ {\tilde{p}}_{1 d} {\tilde{p}}_{2 m} & {\tilde{p}}_{1 m} {\tilde{p}}_{2 m} & {\tilde{p}}_{1 u} {\tilde{p}}_{2 m} \\ {\tilde{p}}_{1 d} {\tilde{p}}_{2 d} & {\tilde{p}}_{1 m} {\tilde{p}}_{2 d} & {\tilde{p}}_{1 u} {\tilde{p}}_{2 d} \end{matrix}] + [\begin{matrix} - δ & - 4 δ & + 5 δ \\ - 4 δ & + 8 δ & - 4 δ \\ + 5 δ & - 4 δ & - δ \end{matrix}]; \end{matrix}

(51a)

and similarly when

ρ < 0

,

\begin{matrix} [\begin{matrix} p_{du} & p_{mu} & p_{uu} \\ p_{dm} & p_{mm} & p_{um} \\ p_{dd} & p_{md} & p_{ud} \end{matrix}] = [\begin{matrix} {\tilde{p}}_{1 d} {\tilde{p}}_{2 u} & {\tilde{p}}_{1 m} {\tilde{p}}_{2 u} & {\tilde{p}}_{1 u} {\tilde{p}}_{2 u} \\ {\tilde{p}}_{1 d} {\tilde{p}}_{2 m} & {\tilde{p}}_{1 m} {\tilde{p}}_{2 m} & {\tilde{p}}_{1 u} {\tilde{p}}_{2 m} \\ {\tilde{p}}_{1 d} {\tilde{p}}_{2 d} & {\tilde{p}}_{1 m} {\tilde{p}}_{2 d} & {\tilde{p}}_{1 u} {\tilde{p}}_{2 d} \end{matrix}] + [\begin{matrix} + 5 δ & - 4 δ & - δ \\ - 4 δ & + 8 δ & - 4 δ \\ - δ & - 4 δ & + 5 δ \end{matrix}] . \end{matrix}

(51b)

4.2. Numerical Comparisons: Best-Fit vs. H&W

It is well known that the feasibility of

(Q)

becomes harder to meet when the value of the instantaneous correlation

ρ

is high. If one allows some branching probabilities to be negative, the corresponding probability distribution(s) of the price and/or the volatility is distorted. Depending on the degree of the distortion, there may be some impact on the option valuation. We use both approaches (Best-Fit and H&W) to value a European call option under the Heston SV model. The parameters are taken from Table 1 of Ball and Roma (1994) with

r = q = 0

,

λ = 8

,

m = V_{0} = 0.1225,

and

ξ = 0.8

.

With

S_{0} = $ 100

, three cases, corresponding to three different strike prices,

K =

$80, $100, and $120, are tested with the correlation

ρ

ranging between −0.8 and 0.8. The result is summarized in Table 4. For the lattice, we consider

n = 100

(with T = 6 months) to make sure that both methods (Best-Fit and H&W) can best fit the probability distribution of the underlying variables.

The last two columns of Table 4 show the percentage of lattice nodes (from

t = 0

to T) that each has at least one outgoing branch with a probability either negative or exceeding unity. Note that this should not be confused with the situation where a price node at some stage has a negative probability to prevail. A lattice node may receive many incoming branches from other nodes in the previous stage. While some of the incoming branches may have negative probabilities, it is unlikely that the resultant probability for reaching this node is negative.

Basically, the option prices using Best-Fit are very close to the exact values, within one cent. It can be seen that Best-Fit maintains lattice feasibility for all

ρ

values tested, while H&W meets the feasibility condition only when

| ρ | \leq 0.3

. When

0.3 < | ρ | \leq 0.7

, there is only a very small number of nodes containing branches with negative probabilities. However, when

| ρ | = 0.8

, almost all the nodes (99.79%) violate the feasibility condition. This justifies our selection of H&W as the counterpart, as this method indeed provides very good approximations of branching probabilities that solve

(Q)

. Figure 2 displays the percentage deviation of the obtained option prices from the exact ones by changing

ρ

using Best-Fit and H&W.

It can be seen from Figure 2 that H&W obtains a precise value only when there is no correlation. As

| ρ |

increases, the option value obtained by H&W starts to deviate from the exact values. The deviation is roughly a piecewise linear function of

ρ

, whose slope doubles when

| ρ | > 0.6

. On the other hand, the deviations of Best-Fit are largely confined within 0.6 cents (less than 0.1%). Comparing both methods, we make the following three observations for H&W’s method:

Consider the OTM option and the ITM option when $| ρ | \leq 0.3$ . Though lattice feasibility is fully maintained, the error persists for $ρ \neq 0$ . Therefore, lattice feasibility alone cannot guarantee good valuation, especially with a finite time step $Δ t$ . The valuation error would converge to zero only when $Δ t$ is sufficiently small. Therefore, lattice feasibility is merely a necessary condition for accurate valuations. From this perspective, the optimization approach for finding a feasible and optimal set of branching probabilities makes sense.
Consider the ATM option. Its pricing errors are relatively small for all $ρ$ values tested. Even when $ρ = | 0.8 |$ , where infeasibility occurs at almost all nodes, the error does not seem too bad (0.3% when $ρ = 0.8$ ; 1.3% when $ρ = - 0.8$ ). This shows that sometimes negative probabilities seem to matter less.
Lattice infeasibility means that there are some distortions on the probability distribution of the underlying uncertainties. The effect can be overvaluation (e.g., OTM and ATM options when $ρ < 0$ ) or undervaluation (e.g., ITM option when $ρ < 0$ ) of the options.

In Figure 3, we plot the exact probability density functions (PDF) of the (logarithm of the) stock price (dotted curve) at different correlation levels, and the PDF approximated by the lattice using H&W (solid curve) and Best-Fit for branching probabilities. The exact PDF is obtained from a standard integration scheme of characteristic functions (e.g., see Rough 2013). The probability distributions are taken when

ρ = \pm 0.3

,

\pm 0.6

, and

\pm 0.8

. At

ρ = \pm 0.3

, both Best-Fit and H&W achieve lattice feasibility (with the first two moments of price and volatility deviations matched). There is no visible distortion in the PDF from Figure 3, yet its OTM option price still has an error of more than 3% (12 cents). This indicates that the optimization of (50) subtly improves the approximation of the tail distribution.

When

| ρ | > 0.3

, the discrepancy between the exact PDF and the PDF approximated by H&W becomes visible. Using

ρ = \pm 0.8

in Figure 3 as an example, where the discrepancy is most distinct, the exact PDF and the one approximated by Best-Fit are still visually indistinguishable. In general, when

ρ > 0.3

(or

ρ < - 0.3

), it can be seen that, using H&W, the price distribution is distorted such that it is slightly less skewed to the right (or left) with a bigger (or smaller) mode. In Figure 3, the three strike prices tested corresponding to OTM, ATM, and ITM options are also identified. When

ρ > 0.3

(or

ρ < - 0.3

), it can be seen that using this distorted price distribution to value a European call option undervalues (or overvalues) the OTM and ATM ones, but overvalues (or undervalues) the ITM option. We make the following three additional observations:

The price of an OTM option is directly impacted by lattice infeasibility as its value is determined by the tail distribution, which is the part of the probability distribution that is most affected by negative probabilities.
For an ITM option, a wider part of the probability distribution becomes relevant, which tends to involve both tails and the central part, making the overall effects hard to predict.
For an ATM option, the distorted tail part seems less important because the less distorted central part dominates the valuation.

To sum up, since, in reality, how the underlying distribution is distorted by lattice infeasibility is unknown a priori, it seems unlikely that one could exploit negative probabilities. However, if one really hopes for negative probabilities to work to their advantage, it is less likely to happen on OTM options, but more probable on ATM options.

5. Performance of the Best-Fit Lattice

In this section, we provide more numerical results for a lattice equipped with lattice feasibility and the Best-Fit method for branching probabilities. We continue to focus on the Heston SV model in (49a)–(49b). Since this approach guarantees feasible branching probabilities that also fit the underlying probability distribution well, with no surprise, all results indicate that such an approach is very reliable for accurate option valuations.

5.1. European Options Valuation

To make a comprehensive analysis of pricing errors, we compare the option prices obtained by the proposed lattice model with the exact solutions of European call options of various specifications. These specifications are drawn from combinations of the following factors:

S_{0} = 100

,

q = 2 %

(dividend yield),

T = 0.5

years (time to maturity),

m = 0.04

,

λ \in {2, 4, 8}

,

ξ \in {0.3, 0.6, 0.9}

,

K \in {90, 100, 110}

(strike price),

r \in {3 %, 5 %, 7 %}

,

V_{0} \in {0.02, 0.04, 0.08}

, and

ρ \in {- 0.5, - 0.25, 0, 0.25, 0.5}

. Those combinations that do not meet the Feller condition (i.e.,

2 λ m \geq ξ^{2}

) are excluded. After that, 540 sets remain for the testing. The accuracy measure used in this paper is the root mean squared error (RMSE), defined as follows:

RMSE = \sqrt{\frac{1}{540} \sum_{i = 1}^{540} e_{i}^{2}},

(52)

where the error

e_{i}

is the difference between the exact value of the i-th European call option and the estimated option value using the proposed lattice model.

From Table 5, it can be seen that the pricing errors of the proposed lattice model are generally small compared with the exact option prices even if the number of time steps

(N)

is small. For instance, when

N = 10

, the RMSE of the proposed trinomial method for pricing European call options is $0.0061, which is far smaller than the bid-ask spreads observed in the market. Moreover, Table 5 indicates that the rate of convergence of the proposed method is of order

O (1 / N)

.

5.2. Convergence and Complexity

Consider another test case taken from Table 1 of Ball and Roma (1994) with

r = q = 0

,

ρ = 0

,

λ = 4

,

m = V_{0} = 0.09,

and

ξ = 0.4

. In addition, assume

T = 0.5

months,

S_{0} = K = $ 100

. Using this example, we also investigate the convergence pattern of the option prices obtained by the proposed lattice approach and the exact value ($8.3595) when the number of time steps increases, along with its computational requirement. The results are presented in Figure 4. As seen in the upper portion of Figure 4, the option prices obtained by the proposed lattice model do converge to the exact price. We have investigated the convergence pattern for more than twenty cases, and the results are similar to those shown in Figure 4. In the lower portion, we show the CPU times required for obtaining the option values. The CPU times (in seconds) are measured on a personal computer with Intel Core 2 Duo processor E8400 of 3 GHz. We also record the number of lattice nodes in the final stage. Both the CPU times and the node numbers are displayed in logarithmic scales on both the horizontal and vertical axes. The purpose is to check whether the computational complexity is exponential. A linear behavior in a log-log graph, such as the lower portion of Figure 4, indicates that the complexity is polynomial. It is estimated for this particular instance that the CPU time is approximately of order

O (N^{4})

and the number of nodes in the final stage is of order

O (N^{3})

, where N is the number of steps. The result indicates that, as N increases, most branches do recombine, which prevents the number of lattice nodes from growing exponentially with N. In this example, when

N = 30

, the option value is already within 0.2% of the exact value using about three seconds of CPU time and about 40,000 nodes in the final stage. When

N = 45

, the pricing error of the proposed approach is within 0.1%, using about 21 s of CPU time and 160,000 lattice nodes in the final stage. The number of lattice nodes involved in the proposed approach is indeed much greater than that of using traditional recombining lattices. However, the computation using the proposed lattice model can be managed to be quite efficient.

5.3. American Options Valuation

One unique benefit for the lattice approach is its ability to value American options. We use the proposed lattice model (Best-Fit) to value American put option under the Heston SV model with parameters: strike

K =

$100.

λ = 3.0

,

m = 0.04

,

r = 0.05

,

q = 0

, and

ξ = 0.1

. The results are summarized in Table 6. We compare the result with that reported in Beliaeva and Nawalkha (2010), denoted by ‘B&N Tree’ in the same table. We also apply the control variate (CV) technique to value the American put as follows:

American Put (CV) = American Put (Best-Fit) +

(European Put (Closed-Form) - European Put (Best-Fit))

Each of the last two columns of Table 6 indicates the difference, labeled as ‘error’, between an obtained American put option prices (by either the proposed lattice model or the B&N Tree) and that by the CV technique. It can be seen that most errors are smaller than one cent, and, in most cases, the differences obtained by the Best-Fit model are smaller than the corresponding ones reported in Beliaeva and Nawalkha (2010). Furthermore, the approach that achieves a smaller error is highlighted in bold in each comparison.

A more detailed comparison is summarized in Table 7. There are 36 cases tested in Table 6 (corresponding to 36 rows), which can be divided into several categories given in Table 7. The number of wins (in terms of a lower error) and the percentage of winning are recorded, along with the corresponding RMSE of each category. Overall, the Best-Fit model achieves smaller errors in 61% of all 36 test cases; our RMSE (0.0081) is only 44% of that (0.0181) by the B&N Tree (see Table 6). In all categories summarized in Table 7, the Best-Fit model outperforms the B&N tree approach in all categories either by the number of case wins or RMSE. In terms of the correlation

ρ

, Table 7 shows that the Best-Fit model performs especially well when the correlation is high (

ρ = - 0.7

) such that the Best-Fit model does better in 67% of cases, and the RMSE is only 1/3 of the B&N Tree approach. When

| ρ |

is high, lattice feasibility becomes harder to meet. Since the proposed lattice model maintains lattice feasibility and can best fit the underlying probability distribution for all

ρ

, the results in Table 7 show that it indeed performs relatively well when

| ρ |

is high.

To summarize Table 7, it is fair to say that the Best-Fit model performs better especially when

| ρ |

is high, T is greater than one month, and/or for the options that are ITM or OTM.

6. Conclusions

In this paper, we focus on two-factor lattices for general diffusion processes where volatilities can be state-dependent, including stochastic volatility models. For a lattice approach, although it is common knowledge that branching probabilities must be between zero and one, few methods can guarantee all branching probabilities of all nodes in all stages are always legitimate. We refer to this property as lattice feasibility. Since it is not unusual to encounter negative probabilities, some practitioners have argued that negative probabilities are not necessarily ‘bad’ and may be further exploited. We have developed a theoretical framework of lattice feasibility to investigate how negative probabilities may impact option pricing in a lattice approach. We have shown that lattice feasibility can be achieved by adjusting a lattice’s configuration (e.g., grid sizes).

Failing to meet lattice feasibility means that some branching probabilities in a lattice are negative or exceed unity, which implies distortions on the probability distribution of the underlying variables. Depending on the distortion, the accuracy of options pricing may be affected. We have found that out-of-the-money options are most affected, followed by in-the-money options and at-the-money options. It has also been observed that negative probabilities indeed matter less in some situations. Since, in reality, how an underlying probability distribution is distorted by lattice infeasibility is unknown a priori, it seems unlikely that one could exploit negative probabilities consistently as some practitioners may hope.

Although lattice feasibility is a necessary condition for weak convergence of approximating the underlying diffusion processes, our numerical tests also show that lattice feasibility alone may not be sufficient to guarantee accurate valuation, especially when the time step of the lattice is not especially small. Since legitimate branching probabilities may not be unique, we use an optimization approach to find branching probabilities that are not only legitimate but also can best fit the probability distribution of the underlying variables. Extensive numerical tests show that this optimized lattice model is a reliable and robust approach for financial option valuations.

Author Contributions

Conceptualization, C.-L.T.; methodology, C.-L.T. and D.W.-C.M.; software, C.-L.T.; analysis, C.-L.T., D.W.-C.M., S.-L.C. and P.-T.S.; writing—original draft preparation, C.-L.T.; writing—review and editing, C.-L.T., D.W.-C.M., S.-L.C. and P.-T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Proof of Proposition 1

From (15a)–(15c), apparently

p_{u} \geq 0, p_{d} \geq 0, p_{m} \geq 0 \Leftrightarrow - {(\frac{ϵ}{h})}^{2} + \frac{| ϵ |}{h} \leq \frac{x^{2}}{c^{2} h^{2}} \leq 1 - {(\frac{ϵ}{h})}^{2}

(A1)

For

p_{u}

,

p_{m}

, and

p_{d}

to be legitimate for all

ϵ \in [- 0.5, 0.5]

, we consider

max_{| ϵ | \leq 0.5} (- {(\frac{ϵ}{h})}^{2} + \frac{| ϵ |}{h}) \leq \frac{x^{2}}{c^{2} h^{2}} \leq min_{| ϵ | \leq 0.5} (1 - {(\frac{ϵ}{h})}^{2}) .

(A2)

The upper bound of (A2) is

1 - 1 / (2 h^{2})

achieved at

ϵ = 0.5

. The lower bound is achieved at

ϵ = \pm 0.5

. Thus,

\frac{h}{2} - \frac{1}{4} \leq (\frac{x}{c})^{2} \leq h^{2} - \frac{1}{4},

(A3)

From (9), it is clear that the middle term of (A3) is within the following interval:

(\frac{x}{c})^{2} \in \{\begin{matrix} [1 / c^{2}, 1 . 5^{2} / c^{2}), & if h = 1, \\ [{(h - 0.5)}^{2} / c^{2}, {(h + 0.5)}^{2} / c^{2}), & if h \geq 2 . \end{matrix}

(A4)

For both (A3) and (A4) to hold, we must have for

h \in N

and

\begin{matrix} \frac{{(h + 0.5)}^{2}}{c^{2}} & \leq & h^{2} - \frac{1}{4} \end{matrix}

(A5a)

\begin{matrix} max (\frac{1}{c^{2}}, \frac{{(h - 0.5)}^{2}}{c^{2}}) & \geq & \frac{h}{2} - \frac{1}{4} . \end{matrix}

(A5b)

Thus,

\frac{h + 0.5}{h - 0.5} \leq c^{2} \leq max (1, {(h - 0.5)}^{2}) {(\frac{h}{2} - \frac{1}{4})}^{- 1}

(A6)

Plugging

h = 1, 2, 3, \dots

to (A6) gives the following ranges of

c^{2}

:

\begin{matrix} 3 \leq c^{2} \leq 4, & if h = 1 \\ 5 / 3 \leq c^{2} \leq 3, & if h = 2 \\ 7 / 5 \leq c^{2} \leq 5, & if h = 3 \\ 9 / 7 \leq c^{2} \leq 7, & if h = 4 \\ \dots \end{matrix}

When

\underset{̲}{h} = 1

, apparently

c^{2}

must be 3. When

h \geq 2

, the upper bound of

c^{2}

increases with h, while the lower bound decreases. Therefore, for

h \geq \underset{̲}{h}

,

\underset{̲}{h}

sets the bounds. That is,

\sqrt{(\underset{̲}{h} + 0.5) / (\underset{̲}{h} - 0.5)} \leq c \leq \sqrt{2 \underset{̲}{h} - 1}

for

\underset{̲}{h} \geq 2

.

Appendix A.2. Proof of Proposition 2

The proof is based on Durrett (1996) where the conditions for weak convergence from Markov chains to diffusion processes are given. To present our proof, we first give the following lemma which is taken from Lemma 8.2 of (Durrett 1996, p. 306) and adapted to our one-factor case. For convenience, we introduce the following definitions for the one-factor lattice

Y_{t}^{Δ t}

.

\begin{matrix} α^{Δ t} (y) & = & {(Δ t)}^{- 1} E [(Y_{t + Δ t}^{Δ t} - y) | Y_{t}^{Δ t} = y], \\ β^{Δ t} (y) & = & {(Δ t)}^{- 1} E [{(Y_{t + Δ t}^{Δ t} - y)}^{2} | Y_{t}^{Δ t} = y], \\ γ_{p}^{Δ t} (y) & = & {(Δ t)}^{- 1} E [| Y_{t + Δ t}^{Δ t} {- y |}^{p} | Y_{t}^{Δ t} = y] . \end{matrix}

They are concerned with the first, second, and higher (absolute) moments of the change in lattice across

Δ t

.

Lemma A1.

Suppose that

Y_{0}^{Δ t} = Y_{0}

. If there exists a

p \geq 2

and for all

R < \infty

, we have

\begin{matrix} (i) & lim_{Δ t ↓ 0} sup_{| y | \leq R} | α^{Δ t} (y) - μ (y) | = 0, \\ (i i) & lim_{Δ t ↓ 0} sup_{| y | \leq R} | β^{Δ t} (y) - σ^{2} (y) | = 0, \\ (i i i) & lim_{Δ t ↓ 0} sup_{| y | \leq R} γ_{p}^{Δ t} (y) = 0, \end{matrix}

then the one-factor lattice

Y_{t}^{Δ t}

converges weakly to

Y_{t}

as

Δ t \to 0

.

To prove Proposition 2, we need to check the above three conditions. Note that conditions (i) and (ii) are true because (17a)–(17c) hold for all y. Then, we check condition (iii) with

p = 4

and have

\begin{matrix} γ_{4}^{Δ t} (y) & = & {(Δ t)}^{- 1} E [| Y_{t + Δ t}^{Δ t} {- y |}^{4} | Y_{t}^{Δ t} = y] \\ = & {(Δ t)}^{- 1} (p_{u} {| (k + h) Δ y |}^{4} + p_{m} {| k Δ y |}^{4} + p_{d} {| (k - h) Δ y |}^{4}) \\ \leq & {(Δ t)}^{- 1} {(| (k + h) Δ y |}^{4} + {| k Δ y |}^{4} + {| (k - h) Δ y |}^{4}) \\ \leq & {(Δ t)}^{- 1} 3 {(k + h)}^{4} {(Δ y)}^{4} . \end{matrix}

Note that

Δ y = c σ^{s} \sqrt{Δ t}, k (y) = ⌊\frac{μ (y) Δ t}{Δ y} + 0.5⌋, h (y) = ⌊\frac{σ (y)}{σ^{s}} + 0.5⌋ .

Fix an arbitrarily large

R < \infty

. For any y such that

| y | \leq R

(thus

μ (y)

and

σ (y)

are finite) and for a sufficiently small

Δ t

, we have

| k (y) | \leq sup_{| y | \leq R} | \frac{μ (y)}{c σ^{s}} | \sqrt{Δ t} + 0.5 \leq 1, | h (y) | \leq sup_{| y | \leq R} | \frac{σ (y)}{σ^{s}} | + 1 = M < \infty .

Putting these together, we have

sup_{| y | \leq R} γ_{4}^{Δ t} (y) \leq 3 c^{4} {(σ^{s})}^{4} {(M + 1)}^{4} Δ t \to 0 as Δ t \to 0 .

Thus, condition (iii) is satisfied.

Appendix A.3. Proof of Proposition 3

Before we prove Proposition 3, some preparation needs to be done. Equation (20) is equivalent to:

\begin{matrix} ⌊\frac{y}{Δ y}⌋ + ⌊\frac{- λ (y - m) Δ t}{Δ y} + 0.5⌋ - ⌊\frac{ξ \sqrt{y}}{σ^{s}} + 0.5⌋ - ⌊\frac{y^{min}}{Δ y}⌋ \\ = & ⌊\frac{y}{Δ y}⌋ + ⌊\frac{- λ (y - m) Δ t}{Δ y} + 0.5⌋ - ⌊\frac{ξ c \sqrt{Δ t} \sqrt{y}}{Δ y} + 0.5⌋ - ⌊\frac{y^{min}}{Δ y}⌋ \geq 0 \end{matrix}

(A7a)

Likewise, (21) is equivalent to:

\begin{matrix} ⌊\frac{- λ (y - m) Δ t}{Δ y} + 0.5⌋ - ⌊\frac{ξ c \sqrt{Δ t} \sqrt{y}}{Δ y} + 0.5⌋ \geq 0 \end{matrix}

(A7b)

Since dealing with the floor functions is cumbersome, the following lemma enables us to consider the sufficient conditions without the floor functions that imply (A7a) and (A7b).

Lemma A2.

Suppose

a_{1}, \dots, a_{4} \in R

. The following two statements are true:

(a): If $a_{1} + a_{2} - a_{3} - a_{4} \geq 1$ , then $⌊ a_{1} ⌋ + ⌊ a_{2} + 0.5 ⌋ - ⌊ a_{3} + 0.5 ⌋ - ⌊ a_{4} ⌋ \geq 0$ .
(b): If $a_{2} - a_{3} \geq 0$ , then $⌊ a_{2} + 0.5 ⌋ - ⌊ a_{3} + 0.5 ⌋ \geq 0$ .

Proof.

Let

n_{1} = ⌊ a_{1} ⌋

,

n_{2} = ⌊ a_{2} + 0.5 ⌋

,

n_{3} = ⌊ a_{3} + 0.5 ⌋

, and

n_{4} = ⌊ a_{4} ⌋

. This implies that

a_{1} \in [n_{1}, n_{1} + 1)

,

a_{2} \in [n_{2} - 0.5, n_{2} + 0.5)

,

a_{3} \in [n_{3} - 0.5, n_{3} + 0.5)

, and

a_{4} \in [n_{4}, n_{4} + 1)

.

(a): We have $a_{1} + a_{2} - a_{3} - a_{4} \in (n_{1} + n_{2} - n_{3} - n_{4} - 2, n_{1} + n_{2} - n_{3} - n_{4} + 2)$ . If $n_{1} + n_{2} - n_{3} - n_{4} < 0$ , i.e., $n_{1} + n_{2} - n_{3} - n_{4} \leq - 1$ , then $a_{1} + a_{2} - a_{3} - a_{4} < 1$ . This means that $a_{1} + a_{2} - a_{3} - a_{4} \geq 1$ implies $n_{1} + n_{2} - n_{3} - n_{4} \geq 0$ .
(b): Similarly, $a_{2} - a_{3} \in (n_{2} - n_{3} - 1, n_{2} - n_{3} + 1)$ . If $n_{2} - n_{3} < 0$ , i.e., $n_{2} - n_{3} \leq - 1$ , then $a_{2} - a_{3} < 0$ . This means that, if $a_{2} - a_{3} \geq 0$ , then $n_{2} - n_{3} \geq 0$ .

□

Using Lemma A2, we consider the sufficient condition of (A7a) as follows:

\frac{y}{Δ y} + \frac{- λ (y - m) Δ t}{Δ y} - \frac{ξ c \sqrt{Δ t} \sqrt{y}}{Δ y} - \frac{y^{min}}{Δ y} \geq 1,

(A8)

which is equivalent to

(1 - λ Δ t) y - ξ c \sqrt{Δ t} \sqrt{y} + (λ m Δ t - y^{min} - Δ y) \geq 0 .

(A9)

We want to show that

y^{min} > 0

exists such that (A9) holds for all

y \geq 0

. Then, using Lemma A2(a), this implies that (20) holds for all

y \geq 0

.

For (A7b), we consider the following sufficient condition using Lemma A2(b):

\frac{λ (m - Δ y) Δ t}{Δ y} - \frac{ξ c \sqrt{Δ y} \sqrt{Δ t}}{Δ y} = \frac{- Δ t}{Δ y} (Δ y + \frac{ξ c}{\sqrt{Δ t}} \sqrt{Δ y} - λ m) \geq 0,

(A10)

which is equivalent to

Δ y + \frac{ξ c}{\sqrt{Δ t}} \sqrt{Δ y} \leq λ m .

(A11)

Lemma A3.

The following two statements are true:

(a): Given $a, b, d \in R$ , if $a > 0$ , $b > 0$ , $d > 0$ , and $b^{2} \leq 4 a d$ , then $a y - b \sqrt{y} + d \geq 0$ for $\forall y \geq 0$ .
(b): If $α_{1} \geq 0$ and $α_{2} > 0$ , then $y + α_{1} \sqrt{y} \leq α_{2}$ has a solution $y > 0$ .

Proof.

(a): Given $a, b, d \in R$ , $a > 0$ , $b > 0$ , $d > 0$ , and $b^{2} \leq 4 a d$ , consider $f (y) = a^{2} y^{2} + (2 a d - b^{2}) y + d^{2}$ . Note that $f (y)$ is a quadratic convex function. Consider two cases: (I) If $b^{2} \leq 2 a d$ , then $f^{'} (0) = 2 a d - b^{2} \geq 0$ ; $f (y)$ is increasing and is positive for $\forall y \geq 0$ . (II) If $2 a d < b^{2} \leq 4 a d$ , f has a local minimum at $y^{*} = (b^{2} - 2 a d) / (2 a^{2}) > 0$ with objective value $f (y^{*}) = b^{2} (4 a d - b^{2}) / (4 a^{2}) \geq 0$ . Both cases imply that $f (y) \geq 0$ , $\forall y \geq 0$ . Since $a > 0$ and $d > 0$ , $f (y) = a^{2} y^{2} + (2 a d - b^{2}) y + d^{2} \geq 0, \forall y \geq 0$ implies that $a y + d \geq b \sqrt{y}, \forall y \geq 0$ .
(b): When $y = 0$ , the left-hand-side of $y + α_{1} \sqrt{y} = 0 < α_{2}$ . Therefore, by continuity, there exists some $y > 0$ to satisfy the inequality $y + α_{1} \sqrt{y} \leq α_{2}$ .

□

Now, we are ready to prove Proposition 3 using Lemmas A2 and A3. Let

a = 1 - λ Δ t

,

b = ξ c \sqrt{Δ t}

,

d = λ m Δ t - y_{min} - Δ y

, where

Δ y

is from (19). From Lemma A3(a), if

a = 1 - λ Δ t > 0

, which is condition (i) of Proposition 3, and

b^{2} \leq 4 a d

, then (A9) holds for

\forall y \geq 0

, which further implies that (20) holds for

\forall y \geq 0

. Now, consider the following equivalent statements:

\begin{matrix} b^{2} \leq 4 a d \end{matrix}

(A12)

\begin{matrix} \Leftrightarrow & ξ^{2} c^{2} Δ t \leq 4 (1 - λ Δ t) (λ m Δ t - y_{min} - \frac{ξ c}{max (\underset{̲}{h} - 0.5, 1)} \sqrt{y_{min}} \sqrt{Δ t}) \end{matrix}

(A13)

\begin{matrix} \Leftrightarrow & y^{min} + \frac{ξ c \sqrt{Δ t}}{max (\underset{̲}{h} - 0.5, 1)} \sqrt{y^{min}} \leq Δ t (λ m - \frac{ξ^{2} c^{2}}{4 (1 - λ Δ t)}) . \end{matrix}

(A14)

Using Lemma A3(b),

y^{min} > 0

exists if the right-hand side of (A14) is strictly positive. That is,

λ m > \frac{ξ^{2} c^{2}}{4 (1 - λ Δ t)},

(A15)

which is given by condition (ii) of Proposition 3. Suppose a feasible

y^{min} = {\tilde{y}}_{a} > 0

is found that satisfies (A14). Using Lemma A2(a) and Lemma A3(a), we conclude

y^{min} > 0

exists such that (20) holds for

\forall y \geq 0

.

Using

y^{min} = {\tilde{y}}_{a}

to evaluate

Δ y

from (19), if

{\tilde{y}}_{a} \geq Δ y

, we are done. Otherwise,

k (Δ y) - h (Δ y) \geq 0

is imposed as given in (21), which is reduced to (A11). Using Lemma A3(b), it is clear that

Δ y > 0

exists to ensure (A11). However, to ensure that

y^{min} < Δ y

, from (19), the following upper bound of

Δ y

must also be imposed:

Δ y < ξ^{2} c^{2} Δ t / max {(\underset{̲}{h} - 0.5, 1)}^{2}

(A16)

Using Lemma A3(b) again, it is clear that a

Δ y > 0

exists that also meets (A16). Since

Δ y

is a function of

y^{min}

from (19), we conclude that a

y^{min} > 0

exists, say

y^{min} = {\tilde{y}}_{b}

, such that (21) holds. Since

{\tilde{y}}_{b} < {\tilde{y}}_{a}

, it is clear that

{\tilde{y}}_{a}

also meets (A14). Therefore, both (A7a) and (A7b) hold with

y^{min} = {\tilde{y}}_{b}

.

To sum up, given conditions (i) and (ii) of Proposition 3,

y^{min} > 0

exists (e.g., equal to

min ({\tilde{y}}_{a}, {\tilde{y}}_{b}) > 0

). This means that the entire lattice can be well contained in

R^{+}

.

Appendix A.4. Proof of Proposition 4

Given Feller’s condition

λ m \geq 0.5 ξ^{2}

, we want to show there exists a lattice configuration

(Δ t, \underset{̲}{h}, c)

such that

c^{2} / (1 - λ Δ t) \leq 2

. To do so, we choose a small

Δ t > 0

such that

λ Δ t < 0.5

, i.e.,

2 (1 - λ Δ t) > 1

. We also choose a big

\underset{̲}{h}

such that

c^{2}

is very close to 1 while still being greater than 1 based on (16), and furthermore

2 (1 - λ Δ t) \geq c^{2} > 1

. This would imply

1 - λ Δ t > 0

, which is condition (i) of Proposition 3; and

λ m \geq 0.5 ξ^{2} \geq \frac{c^{2}}{4 (1 - λ Δ t)} ξ^{2},

(A17)

which is condition (ii) of Proposition 3. The result follows from Proposition 3.

Appendix A.5. Proof of Proposition 5

The proof for the two-factor case is in the same spirit as the one-factor case treated in Section A.2. For the two-factor lattice

(Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t})

, define

\begin{matrix} α_{l}^{Δ t} (y_{1}, y_{2}) & = & {(Δ t)}^{- 1} E [(Y_{l, t + Δ t}^{Δ t} - y_{l}) | (Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t}) = (y_{1}, y_{2})], l = 1, 2, \\ β_{l}^{Δ t} (y_{1}, y_{2}) & = & {(Δ t)}^{- 1} E [{(Y_{l, t + Δ t}^{Δ t} - y_{l})}^{2} | (Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t}) = (y_{1}, y_{2})], l = 1, 2, \\ δ^{Δ t} (y_{1}, y_{2}) & = & {(Δ t)}^{- 1} E [(Y_{1, t + Δ t}^{Δ t} - y_{1}) (Y_{2, t + Δ t}^{Δ t} - y_{2}) | (Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t}) = (y_{1}, y_{2})], \\ γ_{p}^{Δ t} (y_{1}, y_{2}) & = & {(Δ t)}^{- 1} E [{({(Y_{1, t + Δ t}^{Δ t} - y_{1})}^{2} + {(Y_{2, t + Δ t}^{Δ t} - y_{2})}^{2})}^{\frac{p}{2}} | (Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t}) = (y_{1}, y_{2})] . \end{matrix}

With the above definitions, we may restate Lemma 8.2 of Durrett (1996) for the two-factor case as follows.

Lemma A4.

Suppose that

(Y_{1, 0}^{Δ t}, Y_{2, 0}^{Δ t}) = (Y_{1, 0}, Y_{2, 0})

. If there exists a

p \geq 2

and for all

R < \infty

, we have

\begin{matrix} (i) & lim_{Δ t ↓ 0} sup_{| (y_{1}, y_{2}) | \leq R} | α_{l}^{Δ t} (y_{1}, y_{2}) - μ_{l} (y_{1}, y_{2}) | = 0, l = 1, 2, \\ (i i) & lim_{Δ t ↓ 0} sup_{| (y_{1}, y_{2}) | \leq R} | β_{l}^{Δ t} (y_{1}, y_{2}) - σ_{l}^{2} (y_{1}, y_{2}) | = 0, l = 1, 2, \\ lim_{Δ t ↓ 0} sup_{| (y_{1}, y_{2}) | \leq R} | δ^{Δ t} (y_{1}, y_{2}) - ρ (y_{1}, y_{2}) σ_{1} (y_{1}, y_{2}) σ_{2} (y_{1}, y_{2}) | = 0, \\ (i i i) & lim_{Δ t ↓ 0} sup_{| (y_{1}, y_{2}) | \leq R} γ_{p}^{Δ t} (y_{1}, y_{2}) = 0, \end{matrix}

then

(Y_{1, t}^{Δ t}, Y_{2, t}^{Δ t})

converges weakly to

(Y_{1, t}, Y_{2, t})

as

Δ t \to 0

.

The proof of Proposition 5 requires the above three conditions. Conditions (i) and (ii) are true because (25a)–(25d) hold. To check condition (iii) with

p = 4

, we observe that

\begin{matrix} γ_{4}^{Δ t} (y_{1}, y_{2}) & = & {(Δ t)}^{- 1} \sum_{i, j \in Ω} p_{i j} {[{((k_{1} + n_{1} h_{1}) Δ y_{1})}^{2} + {((k_{2} + n_{2} h_{2}) Δ y_{2})}^{2}]}^{2} \\ \leq & {(Δ t)}^{- 1} 9 {[{((k_{1} + h_{1}) Δ y_{1})}^{2} + {((k_{2} + h_{2}) Δ y_{2})}^{2}]}^{2}, \end{matrix}

where

n_{1}, n_{2} \in {- 1, 0, 1}

. Using an argument similar to Section A.2, we conclude that, for a given

R < \infty

, there exists a constant

M < \infty

such that

γ_{4}^{Δ t} (y_{1}, y_{2}) \leq M Δ t

for all lattice nodes with

| (y_{1}, y_{2}) | < R

. This shows that condition (iii) is satisfied.

Appendix A.6. Proof of Proposition 6

(i): First, observe from (15a)–(15c) that replacing $ϵ$ by $- ϵ$ is equivalent to switching $p_{u}$ and $p_{d}$ . Consider the optimization problem of $R^{max}$ with a feasible solution of ${p_{i j}}, i, j \in Ω$ , as shown in Figure A1. The objective function is $p_{uu} + p_{dd} - p_{ud} - p_{du}$ , which is the sum of the northwest and southeast corner elements minus the sum of the other two corner elements. It can be seen that switching the first and the third columns (i.e., replacing ${\tilde{p}}_{2 u}$ and ${\tilde{p}}_{2 d}$ ), and the first and the third rows (i.e., replacing ${\tilde{p}}_{1 u}$ and ${\tilde{p}}_{1 d}$ ), yields the same objective value. This implies that $R^{max} (ϵ_{1}, x_{1}, γ_{1}, h_{1}, ϵ_{2}, x_{2}, γ_{2}, h_{2}) = R^{max} (- ϵ_{1}, x_{1}, γ_{1}, h_{1}, - ϵ_{2}, x_{2}, γ_{2}, h_{2})$ .

Figure A1. Interpretation of $(Q)$ .

Figure A1. Interpretation of $(Q)$ .
(ii): Continuing the argument in (i), if one only switches the first and the third columns or the first and the third rows, the objective value will still be the same but with a different sign. Therefore, $R^{max} (ϵ_{1}, x_{1}, γ_{1}, h_{1}, ϵ_{2}, x_{2}, γ_{2}, h_{2}) = - R^{min} (ϵ_{1}, x_{1}, γ_{1}, h_{1}, - ϵ_{2}, x_{2}, γ_{2}, h_{2}) = - R^{min} (- ϵ_{1}, x_{1}, γ_{1}, h_{1}, ϵ_{2}, x_{2}, γ_{2}, h_{2})$ .
(iii): From (ii), it is clear that

$\begin{matrix} min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (R^{max} (ϵ_{1}, x_{1}, γ_{1}, h_{1}, ϵ_{2}, x_{2}, γ_{2}, h_{2}) - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A18a)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (- R^{min} (- ϵ_{1}, x_{1}, γ_{1}, h_{1}, ϵ_{2}, x_{2}, γ_{2}, h_{2}) - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A18b)

$\begin{matrix} = & - max_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (R^{min} (- ϵ_{1}, x_{1}, γ_{1}, h_{1}, ϵ_{2}, x_{2}, γ_{2}, h_{2}) - \frac{(- ϵ_{1}) ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A18c)

$\begin{matrix} = & - max_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (R^{min} (ϵ_{1}, x_{1}, γ_{1}, h_{1}, ϵ_{2}, x_{2}, γ_{2}, h_{2}) - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A18d)

In (A18d), we use the fact that $ϵ_{1} \in [- 0.5, 0.5]$ ; thus, replacing $- ϵ_{1}$ by $ϵ_{1}$ will not affect the result of the optimization.

Appendix A.7. Proof of Proposition 7

See the proof of Lemma 2 in Tseng and Lin (2007).

Appendix A.8. Proof of Proposition 8

Let

f_{0} (z) \equiv min (f_{1} (z), \dots, f_{n} (z))

, and

z^{i} \equiv arg {min}_{z \in D} f_{i} (z)

,

i = 0, \dots, n

. For simplicity, we assume

f_{1} (z^{1}) \leq f_{i} (z^{i})

,

\forall i = 1, \dots, n

. We want to show

f_{0} (z^{0}) = f_{1} (z^{1})

.

By the definition of

z^{i}

,

i = 1, \dots, n

, we have

f_{1} (z^{1}) = min {f_{1} (z^{1}), \dots, f_{n} (z^{1})} = f_{0} (z^{1})

. Thus,

f_{1} (z^{1}) \geq f_{0} (z^{0})

. Furthermore, we have

f_{i} (z^{i}) \leq f_{i} (z^{0})

,

i = 1, \dots, n

. Thus,

min {f_{1} (z^{1}), \dots, f_{n} (z^{n})} \leq min {f_{1} (z^{0}), \dots, f_{n} (z^{0})}

. This means that

f_{1} (z^{1}) \leq f_{0} (z^{0})

. Thus, we conclude that

f_{0} (z^{0}) = f_{1} (z^{1})

.

Appendix A.9. Proof of Theorem 1

We consider the six cases of

R_{i}^{max}

,

i = 1, \dots, 6

, given in Proposition 7.

(i): $R_{1}^{max} = {\tilde{p}}_{1 u} + {\tilde{p}}_{1 d}$ . From (31), we consider the following optimization problem:

$\begin{matrix} min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (R_{1}^{max} - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A19a)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (\frac{γ_{1}^{2}}{c_{1}^{2}} + \frac{ϵ_{1}^{2}}{h_{1}^{2}} - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A19b)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (\frac{γ_{1}^{2}}{c_{1}^{2}} + \frac{ϕ_{1}^{2}}{h_{1}^{2}} - \frac{ϕ_{1}}{2 h_{1} h_{2}}) \end{matrix}$

(A19c)

$\begin{matrix} = & w_{1}, \end{matrix}$

(A19d)

where, in (A19c), we optimize the objective function over $ϵ_{1}$ and $ϵ_{2}$ first, with $γ_{l} > 0$ and $h_{l} > 0$ being fixed. The optimal solution is achieved at $(ϵ_{1}, ϵ_{2}) = (ϕ_{1}, 1 / 2)$ , with $ϕ_{1} = min (h_{1} / (4 h_{2}), 1 / 2)$ .
(ii): This is a symmetric case for case (i) by exchanging the factor indices, 1 and 2.
(iii): $R_{3}^{max} = {\tilde{p}}_{1 u} + {\tilde{p}}_{2 d}$ , we have

$\begin{matrix} min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (R_{3}^{max} - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A20a)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} [\frac{1}{2} (\frac{γ_{1}^{2}}{c_{1}^{2}} + \frac{ϵ_{1}^{2}}{h_{1}^{2}} + \frac{ϵ_{1}}{h_{1}}) + \\ \frac{1}{2} (\frac{γ_{2}^{2}}{c_{2}^{2}} + \frac{ϵ_{2}^{2}}{h_{2}^{2}} - \frac{ϵ_{2}}{h_{2}}) - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}] \end{matrix}$

(A20b)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} [\frac{1}{2} ({(\frac{ϵ_{1}}{h_{1}} - \frac{ϵ_{2}}{h_{2}} + \frac{1}{2})}^{2} - \frac{1}{4}) + \\ \frac{1}{2} (\frac{γ_{1}^{2}}{c_{1}^{2}} + \frac{γ_{2}^{2}}{c_{2}^{2}})] \end{matrix}$

(A20c)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} [\frac{1}{8} (ϕ_{3}^{2} - 1) + \frac{1}{2} (\frac{γ_{1}^{2}}{c_{1}^{2}} + \frac{γ_{2}^{2}}{c_{2}^{2}})] \end{matrix}$

(A20d)

$\begin{matrix} = & w_{3} . \end{matrix}$

(A20e)

Again, in (A20d), we optimize over $(ϵ_{1}, ϵ_{2})$ first, and the optimal solution is achieved at $(ϵ_{1}, ϵ_{2}) = (- 0.5, 0.5)$ with $ϕ_{3} = max {1 - 1 / h_{1} - 1 / h_{2}, 0}$ .
(iv): $R_{4}^{max} = 1 - ({\tilde{p}}_{1 d} - {\tilde{p}}_{2 d}) - ({\tilde{p}}_{2 u} - {\tilde{p}}_{1 u})$ . Repeat the same process and we have

$\begin{matrix} min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (R_{4}^{max} - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A21a)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (1 + \frac{ϵ_{1}}{h_{1}} - \frac{ϵ_{2}}{h_{2}} - \frac{ϵ_{1} ϵ_{2}}{h_{1} h_{2}}) \end{matrix}$

(A21b)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (1 + \frac{ϵ_{1}}{h_{1}}) (1 - \frac{ϵ_{2}}{h_{2}}) \end{matrix}$

(A21c)

$\begin{matrix} = & min_{θ_{l} \in Θ_{l} ({\underset{̲}{h}}_{l})} \frac{1}{γ_{1} γ_{2}} (1 - \frac{1}{2 h_{1}}) (1 - \frac{1}{2 h_{2}}) \end{matrix}$

(A21d)

$\begin{matrix} = & w_{4}, \end{matrix}$

(A21e)

where in (A21a) we used the fact that the optimal solution for $(ϵ_{1}, ϵ_{2})$ is $(- 1 / 2, 1 / 2)$ .
(v): For cases associated with $R_{5}^{max}$ and $R_{6}^{max}$ , they are symmetric counterparts of (iii) and (iv) by exchanging the factor indices 1 and 2. This obtains the same lower bounds as in (iii) and (iv).
Summarizing all four possible cases above, we conclude that

$\bar{ρ} = c_{1} c_{2} \cdot min {w_{1}, w_{2}, w_{3}, w_{4}} .$

(A22)

The proof is completed.

Appendix A.10. Proof of Proposition 9

From (11) and (40), we have

σ_{1}^{s} = σ_{2}^{s} / ξ

. It follows that at any node

x_{1} = \frac{\sqrt{y_{2}}}{σ_{1}^{s}} = \frac{ξ \sqrt{y_{2}}}{σ_{2}^{s}} = x_{2},

(A23)

and

h_{1} = ⌊ x_{1} + 0.5 ⌋ = ⌊ x_{2} + 0.5 ⌋ = h_{2} .

(A24)

Thus,

γ_{1} = \frac{x_{1}}{h_{1}} = \frac{x_{2}}{h_{2}} = γ_{2} .

(A25)

Appendix A.11. Proof of Theorem 2

This proof is similar to that of Theorem 1. Six cases corresponding to

R_{1}^{max}

to

R_{6}^{max}

will be discussed.

(i): $R_{1}^{max} = {\tilde{p}}_{1 u} + {\tilde{p}}_{1 d}$ . Consider

$min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (R_{1}^{max} (θ_{1}, θ_{2})) = min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (\frac{x^{2}}{c_{1}^{2} h^{2}}) = \frac{1}{c_{1}^{2}}$

This corresponds to $w_{1}^{H}$ .
(ii): $R_{2}^{max} = {\tilde{p}}_{2 u} + {\tilde{p}}_{2 d}$ . Consider

$min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (R_{2}^{max} (θ_{1}, θ_{2})) = min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (\frac{x^{2}}{c_{2}^{2} h^{2}} + \frac{ϵ_{2}^{2}}{h^{2}}) = \frac{1}{c_{2}^{2}},$

where the minimum is achieved at $ϵ_{2} = 0$ . This corresponds to $w_{2}^{H}$ .
(iii): $R_{3}^{max} = {\tilde{p}}_{1 u} + {\tilde{p}}_{2 d}$ , we have

$\begin{matrix} min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (R_{3}^{max} (θ_{1}, θ_{2})) \end{matrix}$

$\begin{matrix} = & min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (\frac{1}{2} (\frac{x^{2}}{c_{1}^{2} h^{2}} + \frac{x^{2}}{c_{2}^{2} h^{2}}) + \frac{1}{2} {(- \frac{ϵ_{2}}{h} + \frac{1}{2})}^{2} - \frac{1}{8}) \end{matrix}$

(A26)

$\begin{matrix} = & \frac{1}{2} (\frac{1}{c_{1}^{2}} + \frac{1}{c_{1}^{2}}) + min_{x, h} \frac{h^{2}}{x^{2}} (\frac{1}{2} {(- \frac{1}{2 h} + \frac{1}{2})}^{2} - \frac{1}{8}) \end{matrix}$

(A27)

$\begin{matrix} = & \frac{1}{2} (\frac{1}{c_{1}^{2}} + \frac{1}{c_{1}^{2}}) + \frac{{\hat{\underset{̲}{h}}}^{2}}{{(\hat{\underset{̲}{h}} - 0.5)}^{2}} (\frac{1}{2} {(- \frac{1}{2 \hat{\underset{̲}{h}}} + \frac{1}{2})}^{2} - \frac{1}{8}) \end{matrix}$

(A28)

where in the minimization of (A26) over $ϵ_{2}$ , the minimum is achieved at $ϵ_{2} = 0.5$ . To solve the optimization in (A27), treat the objective function as a one-dimensional function of $x \geq 1$ , which is a continuous variable, with h plugged in as $⌊ x + 0.5 ⌋$ from (9). It can be seen that the objective function is discontinuous. Using computing tools to display the one-dimensional function, subject to $x \geq max (\underset{̲}{h} - 0.5, 1)$ from (14) and $h \geq \underset{̲}{h}$ , the global minimum is achieved at $(h, x) = (\hat{\underset{̲}{h}}, \hat{\underset{̲}{h}} - 0.5)$ , where $\hat{\underset{̲}{h}} = max (\underset{̲}{h}, 2)$ .
(iv): $R_{4}^{max} = 1 - ({\tilde{p}}_{1 d} - {\tilde{p}}_{2 d}) - ({\tilde{p}}_{2 u} - {\tilde{p}}_{1 u})$ .

$\begin{matrix} min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (R_{4}^{max} (θ_{1}, θ_{2})) = min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (1 - \frac{ϵ_{2}}{h}) \end{matrix}$

(A29)

$\begin{matrix} = min_{x, h} \frac{h^{2}}{x^{2}} (1 - \frac{1}{2 h}) = \frac{{\underset{̲}{h}}^{2}}{{(\underset{̲}{h} + 0.5)}^{2}} (1 - \frac{1}{2 \underset{̲}{h}}) \end{matrix}$

(A30)

where, in the minimization of (A29) over $ϵ_{2}$ , the minimum is achieved at $ϵ_{2} = 0.5$ . Using computing tools to solve the one-dimensional minimization in (A30), the global minimum is achieved at $h = \underset{̲}{h}$ and $x = \underset{̲}{h} + 0.5 - δ,$ where $δ > 0$ is infinitesimal. The existence of an infinitesimal $δ$ is to enforce the relation $h = ⌊ x + 0.5 ⌋$ . However, for obtaining the minimum value, which will just serve a bound in our purpose, one essentially can just plug $(h, x) = (\underset{̲}{h}, \underset{̲}{h} + 0.5)$ into (A29) to find the value of the minimum.
(v): $R_{5}^{max} = {\tilde{p}}_{2 u} + {\tilde{p}}_{1 d}$

$\begin{matrix} min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (R_{5}^{max} (θ_{1}, θ_{2})) \end{matrix}$

$\begin{matrix} = & min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (\frac{1}{2} (\frac{x^{2}}{c_{1}^{2} h^{2}} + \frac{x^{2}}{c_{2}^{2} h^{2}}) + \frac{1}{2} {(\frac{ϵ_{2}}{h} + \frac{1}{2})}^{2} - \frac{1}{8}) \end{matrix}$

(A31)

$\begin{matrix} = & \frac{1}{2} (\frac{1}{c_{1}^{2}} + \frac{1}{c_{1}^{2}}) + min_{x, h} \frac{h^{2}}{x^{2}} (\frac{1}{2} {(\frac{- 1}{2 h} + \frac{1}{2})}^{2} - \frac{1}{8}) \\ = & \frac{1}{2} (\frac{1}{c_{1}^{2}} + \frac{1}{c_{1}^{2}}) + \frac{{\hat{\underset{̲}{h}}}^{2}}{{(\hat{\underset{̲}{h}} - 0.5)}^{2}} (\frac{- (\hat{\underset{̲}{h}} - 0.5)}{4 {\hat{\underset{̲}{h}}}^{2}}), \end{matrix}$

(A32)

where, in the minimization of (A31) over $ϵ_{2}$ , the minimum is achieved at $ϵ_{2} = - 0.5$ . Following the steps described in (iii) to minimize (A32) as a one-dimensional problem using computing tools, the global minimum is achieved at $(h, x) = (\hat{\underset{̲}{h}}, \hat{\underset{̲}{h}} - 0.5)$ , where $\hat{\underset{̲}{h}} = max (\underset{̲}{h}, 2)$ . This term corresponds to $w_{3}^{H}$ . It can be seen that $w_{3}^{H}$ is smaller than the term in (A28). Thus, $R_{3}^{max}$ does not contribute to ${\bar{ρ}}^{H}$ .
(vi): $R_{6}^{max} = 1 - ({\tilde{p}}_{2 d} - {\tilde{p}}_{1 d}) - ({\tilde{p}}_{1 u} - {\tilde{p}}_{2 u})$ .

$\begin{matrix} min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (R_{6}^{max} (θ_{1}, θ_{2})) = min_{x, h, ϵ_{2}} \frac{h^{2}}{x^{2}} (1 + \frac{ϵ_{2}}{h}) \end{matrix}$

(A33)

$\begin{matrix} = min_{x, h} \frac{h^{2}}{x^{2}} (1 - \frac{1}{2 h}) = \frac{{(\underset{̲}{h})}^{2}}{{(\underset{̲}{h} + 0.5)}^{2}} (1 - \frac{1}{2 \underset{̲}{h}}) \end{matrix}$

(A34)

where, in the minimization in (A33) over $ϵ_{2}$ , the minimum is achieved at $ϵ_{2} = - 0.5$ . Using computing tools to solve the one-dimensional minimization in (A34), the global minimum is achieved at $h = \underset{̲}{h}$ and $x = \underset{̲}{h} + 0.5 - δ,$ where $δ > 0$ is infinitesimal. Like in (iv), one can simply plug $(h, x) = (\underset{̲}{h}, \underset{̲}{h} + 0.5)$ into (A34). This corresponds to $w_{4}^{H}$ , which is always smaller than the term in (A30). Thus, $R_{4}^{max}$ does not contribute to ${\bar{ρ}}^{H}$ .

References

Akyildirim, Erdinç, Yan Dolinsky, and H. Mete Soner. 2014. Approximating stochastic volatility by recombinant trees. The Annals of Applied Probability 24: 2176–205. [Google Scholar] [CrossRef]
Andersen, Leif B. G. 2007. Efficient Simulation of the Heston Stochastic Volatility Model. Available online: http://ssrn.com/abstract=946405 (accessed on 21 May 2021).
Ball, Clifford, and Antonio Roma. 1994. Stochastic volatility option pricing. Journal of Financial and Quantitative Analysis 29: 589–607. [Google Scholar] [CrossRef]
Beliaeva, Natalia A., and Sanjay K. Nawalkha. 2010. A simple approach to price American options under the Heston stochastic volatility model. Journal of Derivatives 17: 25–43. [Google Scholar] [CrossRef]
Boyle, Phelim P. 1986. Option valuation using a three-jump process. International Options Journal 3: 7–12. [Google Scholar]
Boyle, Phelim P. 1988. A lattice framework for option pricing with two state variables. Journal of Financial and Quantitative Analysis 23: 1–12. [Google Scholar] [CrossRef]
Broadie, Mark, and O¨zgu¨r Kaya. 2006. Exact simulation of stochastic volatility and other affine jump diffusion processes. Operations Research 54: 217–31. [Google Scholar] [CrossRef] [Green Version]
Burgin, Mark, and Gunter Meissner. 2012. Negative Probabilities in Financial Modeling. Wilmott Magazine 58: 60–65. [Google Scholar] [CrossRef]
Chung, San-Lin, and Pai-Ta Shih. 2007. Generalized Cox-Ross-Rubinstein binomial models. Management Science 53: 508–20. [Google Scholar] [CrossRef]
Costabile, Massimo, Ivar Massabo‘, and Emilio Russo. 2012. A forward shooting grid method for option pricing with stochastic volatility. Journal of Derivatives 20: 67–78. [Google Scholar] [CrossRef]
Cox, John C., Jonathan E. Ingersoll, and Stephen A. Ross. 1985. A theory of the term structure of interest rates. Econometrica 53: 385–408. [Google Scholar] [CrossRef]
Cox, John C., Stephen Ross, and Mark Rubinstein. 1979. Option pricing: A simplified approach. Journal of Financial Economics 7: 229–64. [Google Scholar] [CrossRef]
Durrett, Richard. 1996. Stochastic Calculus: A Practical Introduction. Boca Raton: CRC Press Inc. [Google Scholar]
Feller, William. 1951. Two singular diffusion problems. Annals of Mathematics 54: 173–82. [Google Scholar] [CrossRef]
Haug, Espen Gaarder. 2007. Why so negative to negative probabilities. In Derivatives Models on Models. New York: John Wiley & Sons, Chapter 14. [Google Scholar]
Heston, Steven L. 1993. A closed-form solutions for options with stochastic volatility with application to bond and currency options. The Review of Financial Studies 6: 327–43. [Google Scholar] [CrossRef] [Green Version]
Hull, John C., and Alan White. 1988. The use of control variate technique in option-pricing. Journal of Financial and Quantitative Analysis 23: 237–51. [Google Scholar] [CrossRef]
Hull, John C., and Alan White. 1990. Valuing derivative securities using the explicit finite difference method. Journal of Financial and Quantitative Analysis 25: 87–100. [Google Scholar] [CrossRef]
Hull, John C., and Alan White. 1993. One-factor interest-rate models and the valuation of interest-rate derivative securities. Journal of Financial and Quantitative Analysis 28: 235–54. [Google Scholar] [CrossRef]
Hull, John C., and Alan White. 1994. Numerical procedures for implementing term structure models II: Two-factor models. Journal of Derivatives 2: 37–49. [Google Scholar] [CrossRef]
Kouritzin, Michael A. 2000. Exact infinite dimensional filters and explicit solutions. In Stochastic Models. Edited by Luis G. Gorostiza and B. Gail Ivanoff. Providence: American Mathematical Society, pp. 265–82. [Google Scholar]
Kouritzin, Michael A., and Anne Mackay. 2020. Branching particle pricers with Heston examples. International Journal of Theoretical and Applied Finance 23: 2050003. [Google Scholar] [CrossRef] [Green Version]
Longstaff, Francis A., and Eduardo S. Schwartz. 2001. Valuing American options by simulation: A simple least-squares approach. Review of Financial Studies 14: 113–47. [Google Scholar] [CrossRef] [Green Version]
Maghsoodi, Yoosef. 1996. Solutions of the extended CIR term structure and bond option valuation. Mathematical Finance 6: 89–109. [Google Scholar] [CrossRef]
Rendleman, Richard J., and Brit J. Bartter. 1979. Two-state option pricing. Journal of Finance 34: 1093–110. [Google Scholar] [CrossRef]
Rough, Fabrice D. 2013. The Heston Model and Its Extension in Matlab and C#. Hoboken: Wiley. [Google Scholar]
Ruckdeschel, Peter, Tilman Sayer, and Alexander Szimayer. 2013. Pricing American options in the Heston model: A close look on incorporating correlation. Journal of Derivatives 20: 9–29. [Google Scholar] [CrossRef]
Tseng, Chung-Li, and Kyle Lin. 2007. A framework using two-factor price lattices for generation asset valuation. Operations Research 55: 234–51. [Google Scholar] [CrossRef] [Green Version]
Zvan, R., Peter A. Forsyth, and K. R. Vetzal. 2001. Negative Coefficients in Two Factor Option Pricing Models. Working Paper. Available online: https://cs.uwaterloo.ca/~paforsyt/posmesh3.pdf (accessed on 21 May 2021).

Figure 1. An example of node branching of our proposed two-factor lattice.

Figure 2. Deviation (%) of the values of European call options from the exact ones using Best-Fit and H&W for branching probabilities.

Figure 3. Illustration of the underlying distributions of

ln S_{T}

.

Figure 3. Illustration of the underlying distributions of

ln S_{T}

.

Figure 4. Convergence pattern of the option prices and the computational requirement.

Table 1. The 100 values of

ρ_{max} ({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

for

1 \leq {\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2} \leq 10

.

Table 1. The 100 values of

ρ_{max} ({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})

for

1 \leq {\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2} \leq 10

.

$({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})$	1	2	3	4	5	6	7	8	9	10
1	0.3333	0.4957	0.5544	0.5866	0.6076	0.6222	0.6319	0.6343	0.6362	0.6376
2		0.6399	0.6892	0.7156	0.7327	0.7450	0.7539	0.7610	0.7668	0.7714
3			0.7401	0.7661	0.7834	0.7947	0.8050	0.8112	0.8170	0.8212
4				0.7949	0.8124	0.8235	0.8321	0.8393	0.8450	0.8499
5					0.8302	0.8414	0.8504	0.8571	0.8633	0.8682
6						0.8551	0.8628	0.8707	0.8751	0.8798
7							0.8735	0.8797	0.8856	0.8898
8								0.8878	0.8919	0.8961
9									0.8991	0.9033
10										0.9084

Table 2. The values of

ρ_{max}

and the optimal

(c_{1}^{*}, c_{2}^{*})

when

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2}

.

Table 2. The values of

ρ_{max}

and the optimal

(c_{1}^{*}, c_{2}^{*})

when

{\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2}

.

${\underset{̲}{h}}_{1} = {\underset{̲}{h}}_{2}$	$ρ_{max} ({\underset{̲}{h}}_{1}, {\underset{̲}{h}}_{2})$	$(c_{1}^{}, c_{2}^{})$
1	0.3333	(1.7321, 1.7321)
2	0.6397	(1.3340, 1.3340)
3	0.7400	(1.2052, 1.2052)
4	0.7949	(1.1469, 1.1469)
5	0.8320	(1.1145, 1.1145)
6	0.8551	(1.0931, 1.0931)
7	0.8735	(1.0792, 1.0792)
8	0.8878	(1.0686, 1.0686)
9	0.8991	(1.0602, 1.0602)
10	0.9084	(1.0543, 1.0543)

Table 3. The values of

ρ_{max}^{H}

(with

{\tilde{ϵ}}_{1} = 0

).

Table 3. The values of

ρ_{max}^{H}

(with

{\tilde{ϵ}}_{1} = 0

).

$\underset{̲}{h}$	$ρ_{max}^{H} (\underset{̲}{h})$	$(c_{1}^{H}, c_{2}^{H})$	$\underset{̲}{h}$	$ρ_{max}^{H} (\underset{̲}{h})$	$(c_{1}^{H}, c_{2}^{H})$
1	0.5000	(1.7321, 1.7321)	6	0.9453	(1.0871, 1.1133)
2	0.7222	(1.2910, 1.2910)	7	0.9549	(1.0742, 1.0989)
3	0.8596	(1.1832, 1.1866)	8	0.9616	(1.0646, 1.0877)
4	0.9065	(1.1339, 1.1564)	9	0.9667	(1.0572, 1.0787)
5	0.9308	(1.1055, 1.1319)	10	0.9705	(1.0513, 1.0714)

Table 4. Impact of lattice infeasibility on options pricing.

	In-the-Money (ITM)			At-the-Money (ATM)			Out-of-the-Money (OTM)			Infeasibility (%)
$ρ$	Exact	Best-Fit	H&W	Exact	Best-Fit	H&W	Exact	Best-Fit	H&W	Best-Fit	H&W
0.8	21.5729	21.5704	21.9341	9.8203	9.8259	9.7892	4.3276	4.3337	3.9417	0	99.79
0.7	21.6637	21.6620	21.9489	9.8140	9.8194	9.7859	4.2401	4.2454	3.9227	0	1.03
0.6	21.7499	21.7490	21.9495	9.8060	9.8107	9.7854	4.1509	4.1553	3.9221	0	0.21
0.5	21.8320	21.8313	21.9926	9.7965	9.8010	9.7776	4.0600	4.0640	3.8665	0	0.21
0.4	21.9104	21.9100	22.0347	9.7853	9.7903	9.7694	3.9672	3.9710	3.8103	0	0.11
0.3	21.9854	21.9853	22.0757	9.7726	9.7779	9.7606	3.8725	3.8760	3.7534	0	0.00
0.2	22.0571	22.0575	22.1158	9.7584	9.7640	9.7514	3.7757	3.7790	3.6958	0	0.00
0.1	22.1260	22.1267	22.1549	9.7428	9.7485	9.7417	3.6769	3.6798	3.6375	0	0.00
0.0	22.1920	22.1931	22.1931	9.7256	9.7315	9.7315	3.5759	3.5784	3.5784	0	0.00
−0.1	22.2555	22.2569	22.2304	9.7070	9.7129	9.7210	3.4726	3.4747	3.5186	0	0.00
−0.2	22.3165	22.3183	22.2669	9.6870	9.6928	9.7101	3.3668	3.3684	3.4579	0	0.00
−0.3	22.3752	22.3773	22.3026	9.6655	9.6712	9.6987	3.2584	3.2596	3.3964	0	0.00
−0.4	22.4317	22.4341	22.3375	9.6425	9.6481	9.6869	3.1472	3.1480	3.3339	0	0.11
−0.5	22.4861	22.4888	22.3717	9.6181	9.6234	9.6745	3.0330	3.0334	3.2706	0	0.21
−0.6	22.5386	22.5414	22.4051	9.5922	9.5969	9.6617	2.9157	2.9156	3.2063	0	0.21
−0.7	22.5891	22.5925	22.4052	9.5647	9.5700	9.6619	2.7949	2.7940	3.2057	0	1.03
−0.8	22.6379	22.6415	22.4165	9.5357	9.5410	9.6579	2.6703	2.6687	3.1832	0	99.79

Table 5. The accuracy of the proposed Best-Fit model for pricing European call options under the Heston SV model.

No. of Time Steps $(N)$	10	20	30	40
RMSE	0.0061	0.0027	0.0024	0.0013

Table 6. American put price computed using Best-Fit model and control variate (CV) technique, compared with the result reported in Beliaeva and Nawalkha (2010).

$S_{0}$	$ρ$	$V_{0}$	T	Best-Fit	Best-Fit CV	B&N Tree	B&N Tree CV	Euro Put	Error	Error
				$(N = 50)$	$(N = 50)$	$(N = 50)$	$(N = 50)$	True Value	(Best-Fit)	(B&N Tree)
90	−0.1	0.04	0.0833	10.0000	9.9996	10.0000	10.0001	9.6699	0.0004	0.0001
90	−0.7	0.04	0.0833	10.0000	10.0010	10.0000	9.9996	9.6533	0.0010	0.0004
100	−0.1	0.04	0.0833	2.1253	2.1262	2.1257	2.1249	2.0950	0.0009	0.0008
100	−0.7	0.04	0.0833	2.1279	2.1274	2.1250	2.1263	2.0971	0.0006	0.0013
110	−0.1	0.04	0.0833	0.1092	0.1090	0.1087	0.1090	0.1083	0.0002	0.0003
110	−0.7	0.04	0.0833	0.1259	0.1275	0.1282	0.1273	0.1265	0.0016	0.0009
90	−0.1	0.16	0.0833	10.7103	10.7109	10.7195	10.7099	10.5957	0.0006	0.0096
90	−0.7	0.16	0.0833	10.6858	10.6829	10.6709	10.6800	10.5668	0.0030	0.0091
100	−0.1	0.16	0.0833	4.2187	4.2166	4.2275	4.2163	4.1859	0.0021	0.0112
100	−0.7	0.16	0.0833	4.2196	4.2156	4.2184	4.2147	4.1852	0.0040	0.0037
110	−0.1	0.16	0.0833	1.1721	1.1667	1.1667	1.1669	1.1608	0.0053	0.0002
110	−0.7	0.16	0.0833	1.1962	1.1944	1.2176	1.1942	1.1882	0.0018	0.0234
90	−0.1	0.04	0.2500	10.1653	10.1654	10.1678	10.1663	9.6430	0.0002	0.0015
90	−0.7	0.04	0.2500	10.1197	10.1208	10.1162	10.1207	9.5698	0.0010	0.0045
100	−0.1	0.04	0.2500	3.4750	4.4765	3.4736	3.4727	3.3684	0.0014	0.0009
100	−0.7	0.04	0.2500	3.4848	3.4816	3.4803	3.4801	3.3770	0.0032	0.0002
110	−0.1	0.04	0.2500	0.7725	0.7729	0.7722	0.7730	0.7584	0.0003	0.0008
110	−0.7	0.04	0.2500	0.8429	0.8417	0.8462	0.8412	0.8259	0.0012	0.0050
90	−0.1	0.16	0.2500	12.1895	12.1773	12.1898	12.1752	11.8933	0.0122	0.0146
90	−0.7	0.16	0.2500	12.1256	12.1135	12.0626	12.1061	11.8287	0.0121	0.0435
100	−0.1	0.16	0.2500	6.5014	6.4937	6.5027	6.4936	6.3755	0.0077	0.0091
100	−0.7	0.16	0.2500	6.4986	6.4898	6.4855	6.4908	6.3735	0.0089	0.0053
110	−0.1	0.16	0.2500	3.0985	3.0900	3.0874	3.0900	3.0451	0.0085	0.0026
110	−0.7	0.16	0.2500	3.1550	3.1453	3.1731	3.1479	3.1011	0.0097	0.0252
90	−0.1	0.04	0.5000	10.6472	10.6466	10.6431	10.6422	9.8582	0.0006	0.0009
90	−0.7	0.04	0.5000	10.5627	10.5636	10.5575	10.5609	9.7572	0.0009	0.0034
100	−0.1	0.04	0.5000	4.6481	4.6492	4.6445	4.6439	4.4126	0.0011	0.0006
100	−0.7	0.04	0.5000	4.6685	4.6636	4.6672	4.6610	4.4312	0.0050	0.0062
110	−0.1	0.04	0.5000	1.6812	1.6812	1.6804	1.6813	1.6220	0.0001	0.0009
110	−0.7	0.04	0.5000	1.7906	1.7872	1.7950	1.7856	1.7240	0.0034	0.0094
90	−0.1	0.16	0.5000	13.3245	13.3089	13.2938	13.2944	12.7057	0.0155	0.0006
90	−0.7	0.16	0.5000	13.2393	13.2230	13.0958	13.1731	12.6171	0.0162	0.0773
100	−0.1	0.16	0.5000	8.0210	8.0041	7.9928	7.9943	7.6974	0.0168	0.0015
100	−0.7	0.16	0.5000	8.0174	8.0001	7.9432	7.9810	7.6965	0.0174	0.0378
110	−0.1	0.16	0.5000	4.5590	4.5430	4.5221	4.5383	4.3942	0.0160	0.0162
110	−0.7	0.16	0.5000	4.6343	4.6188	4.5972	4.6118	4.4716	0.0155	0.0146
								RMSE	0.0080	0.0181

Table 7. Comparison summary of the results in Table 6.

		No. of Wins (%)		RMSE
Category	No. of Cases	Best-Fit	B&N Tree	Best-Fit	B&N Tree
Overall	36	22 (61%)	14 (39%)	0.0080	0.0181
$ρ = - 0.1$	18	10 (56%)	8 (44%)	0.0078	0.0066
$ρ = - 0.7$	18	12 (67%)	6 (33%)	0.0082	0.0247
$V_{0} > m$	18	11 (61%)	7 (39%)	0.0112	0.0254
$V_{0} \leq m$	18	11 (61%)	7 (39%)	0.0018	0.0033
In-the-money (ITM)	12	9 (75%)	3 (25%)	0.0082	0.0263
At-the-money (ATM)	12	5 (42%)	7 (58%)	0.0081	0.0120
Out-of-the-money (OTM)	12	8 (67%)	4 (33%)	0.0077	0.0122
$T = 1 / 12$ yr	12	6 (50%)	6 (50%)	0.0023	0.0085
$T = 1 / 4$ yr	12	8 (67%)	4 (33%)	0.0072	0.0156
$T = 1 / 2$ yr	12	8 (67%)	4 (33%)	0.0116	0.0259

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Tseng, C.-L.; Miao, D.W.-C.; Chung, S.-L.; Shih, P.-T. How Much Do Negative Probabilities Matter in Option Pricing?: A Case of a Lattice-Based Approach for Stochastic Volatility Models. J. Risk Financial Manag. 2021, 14, 241. https://doi.org/10.3390/jrfm14060241

AMA Style

Tseng C-L, Miao DW-C, Chung S-L, Shih P-T. How Much Do Negative Probabilities Matter in Option Pricing?: A Case of a Lattice-Based Approach for Stochastic Volatility Models. Journal of Risk and Financial Management. 2021; 14(6):241. https://doi.org/10.3390/jrfm14060241

Chicago/Turabian Style

Tseng, Chung-Li, Daniel Wei-Chung Miao, San-Lin Chung, and Pai-Ta Shih. 2021. "How Much Do Negative Probabilities Matter in Option Pricing?: A Case of a Lattice-Based Approach for Stochastic Volatility Models" Journal of Risk and Financial Management 14, no. 6: 241. https://doi.org/10.3390/jrfm14060241

APA Style

Tseng, C.-L., Miao, D. W.-C., Chung, S.-L., & Shih, P.-T. (2021). How Much Do Negative Probabilities Matter in Option Pricing?: A Case of a Lattice-Based Approach for Stochastic Volatility Models. Journal of Risk and Financial Management, 14(6), 241. https://doi.org/10.3390/jrfm14060241

Article Menu

How Much Do Negative Probabilities Matter in Option Pricing?: A Case of a Lattice-Based Approach for Stochastic Volatility Models

Abstract

1. Introduction

2. General One-Factor Trinomial Lattice

2.1. Effects of Grid Refinement

2.2. Weak Convergence of the One-Factor Lattice

2.3. Estimating σ min for CIR Model and Feller Condition

3. Two-Factor Trinomial Lattice

3.1. Feasibility of the General Lattice

3.2. Lattice for the Heston SV Model

4. Impact of Lattice Infeasibility on Option Valuation

4.1. An Optimization Perspective

4.2. Numerical Comparisons: Best-Fit vs. H&W

5. Performance of the Best-Fit Lattice

5.1. European Options Valuation

5.2. Convergence and Complexity

5.3. American Options Valuation

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Appendix A

Appendix A.1. Proof of Proposition 1

Appendix A.2. Proof of Proposition 2

Appendix A.3. Proof of Proposition 3

Appendix A.4. Proof of Proposition 4

Appendix A.5. Proof of Proposition 5

Appendix A.6. Proof of Proposition 6

Appendix A.7. Proof of Proposition 7

Appendix A.8. Proof of Proposition 8

Appendix A.9. Proof of Theorem 1

Appendix A.10. Proof of Proposition 9

Appendix A.11. Proof of Theorem 2

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

2.3. Estimating $σ^{min}$ for CIR Model and Feller Condition