Forest of Stochastic Trees: A Method for Valuing Multiple Exercise Options

Reesor, R. Mark; Marshall, T. James

doi:10.3390/jrfm13050095


by R. Mark Reesor ¹,* and T. James Marshall ²

¹ Department of Mathematics, Wilfrid Laurier University, Waterloo, ON N2L 3C5, Canada
² Bank of Montreal, Toronto, ON M5X 1A1, Canada
* Author to whom correspondence should be addressed.

J. Risk Financial Manag. 2020, 13(5), 95; https://doi.org/10.3390/jrfm13050095

Submission received: 31 August 2019 / Revised: 1 May 2020 / Accepted: 3 May 2020 / Published: 13 May 2020

(This article belongs to the Special Issue Computational Finance)


Abstract

We present the Forest of Stochastic Trees (FOST) method for pricing multiple exercise options by simulation. The proposed method uses stochastic trees in place of binomial trees in the Forest of Trees algorithm originally proposed to value swing options, hence extending that method to allow for a multi-dimensional underlying process. The FOST can also be viewed as extending the stochastic tree method for valuing (single exercise) American-style options to multiple exercise options. The proposed valuation method results in positively- and negatively-biased estimators for the true option value. We prove the sign of the estimator bias and show that these estimators are consistent for the true option value. This method is of particular use in cases where there is potentially a large number of assets underlying the contract and/or the underlying price process depends on multiple risk factors. Numerical results are presented to illustrate the method.

Keywords: Monte Carlo; multiple exercise options; dynamic programming; stochastic optimal control

1. Introduction

The broad class of stochastic optimal control problems includes many important applications in management sciences and quantitative finance, such as development of natural resources, project initiation or abandonment, maintenance scheduling, land use decisions, and valuation and hedging of complex contracts. Very few problems have closed-form solutions. For example, the Black–Scholes–Merton formula for pricing European-style options does not have an American-style option analog. Approaches such as binomial lattice methods, partial differential equation (PDE) methods, variational inequalities and integral equations have been adopted for pricing these types of derivatives. However, all of these methods are limited in the number of sources of uncertainty and the dimensionality of the underlying asset that can be practically incorporated, leaving Monte Carlo (MC) as a general method of solution.

Multiple exercise options (MEOs) are generalizations of American-style options as they provide the holder more than one exercise right and sometimes control over one or more other variables, such as the amount exercised. Similar to pricing American-style options, MEO valuation is a problem in stochastic optimal control. For American-style options, the solution provides both a value and optimal exercise rule—a stopping time. For MEOs the solution gives both a value and optimal exercise policy. In cases in which the holder controls only the exercise times, the exercise policy is a sequence of stopping times. For MEOs in which the holder controls the exercise times and amounts, the exercise policy is a paired sequence of stopping times and exercise amounts. The policy generalizes to other control variables. Multiple exercise option valuation algorithms are generalizations of those used for pricing American-style options. Most of the work in the literature has focused on swing options in which the holder controls the exercise times and amounts. Hence, we discuss MEO valuation in the context of swing options and note that the FOST method and the FOST price estimators’ properties apply to general MEOs.

The Forest of (Binomial/Trinomial) Trees method of Lari et al. (2001) and Jaillet et al. (2004) extends the binomial method of Cox et al. (1979) for pricing American-style options to value swing options. As with most tree-based methods, the Forest of Trees is not able to handle high-dimensional underlying processes. In a separate line of work, the Stochastic Tree method of Broadie and Glasserman (1997) extends the binomial method for pricing American-style options to allow for a high-dimensional underlying asset.

In this paper, we replace the binomial and trinomial trees of Lari et al. (2001) and Jaillet et al. (2004) with the stochastic trees of Broadie and Glasserman (1997), hence creating the Forest of Stochastic Trees (FOST) method for valuing multiple exercise options. The FOST can be thought of as a generalization of two different methodologies; specifically, it extends

The Forest of Trees method of Lari et al. (2001) and Jaillet et al. (2004) for valuing multiple exercise options on a single asset to allow for high-dimensional underlying assets and processes; and
The Stochastic Tree method of Broadie and Glasserman (1997) for valuing high-dimensional American-style options (single exercise right) to options having multiple exercise rights and additional controls (e.g., volume control).

Visually these generalizations complete the final entry in the two-by-two table shown in Table 1.

We construct high- and low-biased FOST estimators that are analogous to those from the Stochastic Tree. The main theoretical results, presented in Section 2.2 and Section 2.3 and proven in Appendix B, are listed below.

The high FOST estimator has positive bias.
The low FOST estimator has negative bias.
On any given realization the high FOST estimator is at least as big as the low FOST estimator.
The high and low FOST estimators are asymptotically unbiased.

The remainder of this section provides a literature review. Section 2 describes the Forest of Stochastic Trees estimators in detail; specifies results giving estimator properties (e.g., biasedness and convergence); provides numerical results showing the method is effective at pricing; and supplies discussion on improving computational performance through parallel processing. Section 3 summarizes and concludes the paper. A list of the notation used is given in Appendix A and proofs of the paper’s main theoretical contributions on estimator properties are given in Appendix B.

Literature Review

Multiple exercise options arise in many different areas and the structure of these contracts is typically tailored to particular clients/needs, in contrast to standardized derivatives such as interest rate swaps and exchange-traded commodity futures. A non-exhaustive list of examples of MEOs includes (i) tolling agreements used in the steel Kim et al. (2019) and electricity Deng and Oren (2006) sectors; (ii) chooser flexible caps, which are exotic interest rate derivatives Meinshausen and Hambly (2004); (iii) valuation and control of energy production and storage facilities Chen and Forsyth (2007); Ludkovski and Carmona (2010); Thompson et al. (2009); and (iv) swing options Calvo-Garrido et al. (2017); Jaillet et al. (2004); Lari et al. (2001); Wilhelm and Winter (2008).

Valuation methods for MEOs are extensions of those used for American-style options. There are continuous-time solutions to both the American-style and multiple exercise option valuation problems; these are computed by solving a system of Hamilton–Jacobi–Bellman quasi-variational inequalities Korn et al. (2005). These methods give more accurate and stable price and sensitivity estimates than those computed using simpler tools (e.g., trees). However, these methods are quite complex mathematically and break down in higher dimensions.

In this article, we focus on the mathematically simpler time-discretized version of the valuation problem. Discrete-time tree-based methods for valuing American-style options Cox et al. (1979) have been extended to MEOs via the Forest of Trees Jaillet et al. (2004); Lari et al. (2001). Techniques for pricing American-style options using solutions of PDEs have been modified to MEOs Calvo-Garrido et al. (2017); Chen and Forsyth (2007); Thompson et al. (2009); Wilhelm and Winter (2008). These methods for MEOs inherit properties similar to the corresponding methods for single-exercise options. One crucial property is that these methods fail as the dimensionality of the problem increases.

Monte Carlo is the obvious tool to overcome the curse of dimensionality, as the rate of convergence of Monte Carlo estimators is independent of the dimension. Tilley (1993) was the first to show that the forward-in-time Monte Carlo approach could be used to solve the backward-in-time dynamic programming problem arising from valuation of an American-style option. Since this seminal paper, numerous other methods for the Monte Carlo valuation of American-style options have appeared. These include methods that attempt to parameterize the exercise region Barraquand and Martineau (1995) and those that discretize the state space Bally et al. (2005). Methods that parameterize the early-exercise region have been extended to value multiple exercise options by parameterizing the set of exercise level curves Ibáñez (1996). Similarly, state space aggregation methods have been used for multiple exercise option valuation Ben Latifa et al. (2016). These approaches, however, also suffer from the curse of dimensionality and do not easily generalize to arbitrary payoffs and underlying price processes.

Monte Carlo methods that do not break down with the dimensionality and that accommodate general payoff and price processes include those that solve the optimal stopping-time problem through estimation of the hold or continuation value. These include the stochastic tree and mesh techniques of Broadie and Glasserman (1997, 2004) and the regression-based approach first appearing in Carriere (1996) and then subsequently generalized in Longstaff and Schwartz (2001). For each of these valuation techniques, high- and low-biased estimators are easily generated, along with a hybrid interleaving estimator that has properties of both. Duality-based methods solve the optimal control problem in the dual space by approximating an optimal martingale, typically by regression Andersen and Broadie (2004); Haugh and Kogan (2004).

Least-squares Monte Carlo has been modified for the pricing of swing options in Barrera et al. (2006) and Meinshausen and Hambly (2004). Although increased dimensionality does not degrade the performance of these methods, they suffer from other drawbacks. In least-squares Monte Carlo methods one must select a set of basis functions on which to run regressions to estimate continuation values. In general, only a complete (infinite) set of basis functions results in continuation value estimators that are consistent for the true option value. In practice, of course, a finite set of basis functions is used, which introduces an approximation error: the continuation value estimators are then consistent for the value of the finite-basis approximation and not for the true option value Clement et al. (2001); Stentoft (2004).

Duality methods have been extended to MEOs Bender (2011); Chandramouli and Haugh (2012); Gyurko et al. (2015); Meinshausen and Hambly (2004). Duality methods rely on having a sub-optimal exercise policy that produces a low-biased estimate from which the solution to the dual problem can be approximated to yield a high-biased estimate. Typically, regression-based methods are used to estimate the sub-optimal exercise policy Chandramouli and Haugh (2012); Gyurko et al. (2015); Meinshausen and Hambly (2004), implying that the above-noted issues of least-squares Monte Carlo persist when pricing MEOs. Policy iteration methods such as Bender (2011) yield approximations of the time-0 value at each iteration of the dynamic program. As with the pricing of American-style options, this method is advantageous because it removes the requirement to calculate nested conditional expectations prior to the time-0 value being approximated.

The stochastic mesh of Broadie and Glasserman (2004) has been extended to MEOs via the Forest of Stochastic Meshes (FOSM) Marshall (2012); Marshall and Reesor (2011). High and low biased FOSM estimators are derived Marshall and Reesor (2011) and their properties shown Marshall (2012), similar to the work presented here for the Forest of Stochastic Trees estimators.

2. Results

We consider the valuation of multiple exercise options as a stochastic optimal control problem with three relevant state variables: the underlying variable ($S$), the number of exercise rights remaining ($N$), and the usage level ($U$), assuming some volume control. At each exercise opportunity and given $(S, N, U)$, the current values of the state variables, the holder must choose between

Exercising $u$ units plus continuing with an option having $N - 1$ remaining exercise rights and usage level $U + u$; and
Continuing with an option having $N$ exercise rights and usage level $U$ (i.e., no exercise).

Note that with volume control the payoff from exercising $u$ units changes with $u$ (as does the continuation value of the option). Thus, the holder chooses the value-maximizing $u$ when deciding to exercise. Also note that with $N = 1$ and $u$ constrained to be 1, this is an American-style option.

We work with the time-discretized problem and use dynamic programming to solve for the optimal exercise policy and the corresponding optimal value. In all variables, let the subscript $i$ denote time $t_{i}$ and let $U_{i}$ be the time-$t_{i}$ set of admissible volume choices, which includes the zero volume choice (i.e., hold). The recursive equations for the dynamic program are

$$H_{i} (S_{i}, N_{i + 1}, U_{i + 1}) = E [B_{i + 1} (S_{i + 1}, N_{i + 1}, U_{i + 1}) \mid Z_{i}] \tag{1}$$

and

$$B_{i} (S_{i}, N_{i}, U_{i}) = \max_{u \in U_{i}} [h_{i} (S_{i}, N_{i}, U_{i}, u) + H_{i} (S_{i}, N_{i} - I_{\{u \neq 0\}}, U_{i} + u)], \tag{2}$$

with the terminal conditions

$$H_{m} (S_{m}, N_{m}, U_{m}) = \tilde{\phi} (U_{m}) \tag{3}$$

and

$$B_{m} (S_{m}, N_{m}, U_{m}) = \max_{u \in U_{m}} [h_{m} (S_{m}, N_{m}, U_{m}, u) + H_{m} (S_{m}, N_{m} - I_{\{u \neq 0\}}, U_{m} + u)], \tag{4}$$

where $H_{i} (S, N, U)$ and $B_{i} (S, N, U)$ are the time-$t_{i}$, state-$Z_{i}$ continuation and option values, respectively, $h_{i} (S, N, U, u)$ is the payoff from exercising $u$ units with $h_{i} (S, N, U, 0) = 0$, $Z_{i}$ is the time-$t_{i}$ information set generated by the paths of $(S, N, U)$, $I$ is an indicator function and $\tilde{\phi} (\cdot)$ is a cumulative usage penalty term. Estimator properties and their proofs are given for this multiple exercise option setup. However, the dynamic program and estimator properties can be stated and proven for alternative specifications provided there is a finite number of exercise rights and usage levels. For example, a swing option contract may specify a certain number of up and down swing rights, $N_{u}$ and $N_{d}$. An up swing right allows the holder to take more than the baseline amount of the underlying asset while a down swing right allows the holder to take less. Another variation is to allow for multiple rights to be exercised at each opportunity where each right corresponds to a fixed volume amount Bender and Schoenmakers (2006); Meinshausen and Hambly (2004).
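The recursion in Equations (1)–(4) can be sketched on a lattice, in the spirit of the Forest of Trees. The sketch below is illustrative only and rests on several assumptions that are not taken from the paper: the conditional expectation is evaluated on a Cox–Ross–Rubinstein binomial lattice, a per-step discount factor is applied to the continuation value, there is a single type of exercise right, and the payoff is $u \times |S - K|$. The function name `forest_of_trees_value` and all parameter names are hypothetical.

```python
import math
from functools import lru_cache


def forest_of_trees_value(S0, n_rights, volumes, m, dt, r, delta, sigma, K,
                          penalty=lambda U: 0.0):
    """Exact dynamic program over (node, N, U) on a binomial lattice.

    Assumptions (illustrative): CRR lattice, exercise dates i = 0..m,
    payoff u * |S - K|, penalty(U) plays the role of the usage penalty.
    """
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    p = (math.exp((r - delta) * dt) - down) / (up - down)  # risk-neutral up probability
    disc = math.exp(-r * dt)                               # assumed one-step discount factor

    def h(S, u):
        return u * abs(S - K)

    @lru_cache(maxsize=None)
    def B(i, j, n, U):
        """Option value at time index i after j up-moves, with n rights left and usage U."""
        S = S0 * up**j * down**(i - j)
        # Only the zero volume (hold) is admissible once the rights are used up.
        choices = (0,) + tuple(volumes) if n > 0 else (0,)
        best = -math.inf
        for u in choices:
            n2, U2 = n - (u != 0), U + u
            if i == m:
                cont = penalty(U2)                         # terminal condition, Eq. (3)
            else:
                cont = disc * (p * B(i + 1, j + 1, n2, U2)
                               + (1 - p) * B(i + 1, j, n2, U2))  # Eq. (1)
            best = max(best, h(S, u) + cont)               # Eqs. (2) and (4)
        return best

    return B(0, 0, n_rights, 0)
```

Each distinct $(N, U)$ pair reuses the same price lattice, which is what makes the collection of trees a forest; memoization via `lru_cache` keeps every (node, $N$, $U$) value from being recomputed.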

2.1. Forest of Stochastic Trees

The FOST generalizes the stochastic tree method for valuing American-style options to the valuation of multiple exercise options and extends the Forest of Trees method to handle a high-dimensional underlying asset. This is done by replacing the binomial/trinomial trees with stochastic trees in the framework of Lari et al. (2001) and Jaillet et al. (2004), hence giving the FOST. The stochastic tree is constructed exactly as described in Broadie and Glasserman (1997) and the tree is replicated multiple times, with one replication corresponding to each possible $(N, U)$ combination. This is analogous to the Forest of Trees in which the same underlying binomial/trinomial tree is replicated for each possible $(N, U)$ combination.

The dynamic program is approximately solved by replacing the continuation values in Equations (1) and (3) with stochastic tree-type estimators. As with the original stochastic tree technique, high- and low-biased option value estimators are constructed by using the analogous high- and low-biased estimators, respectively, on each stochastic tree in the forest. The recursive equations for the high estimator are

$$\hat{H}_{i} (S_{i}^{j}, N_{i + 1}, U_{i + 1}) = \frac{1}{b} \sum_{k = 1}^{b} \hat{V}_{i + 1} (S_{i + 1}^{k}, N_{i + 1}, U_{i + 1}) \tag{5}$$

and

$$\hat{V}_{i} (S_{i}^{j}, N_{i}, U_{i}) = \max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + \hat{H}_{i} (S_{i}^{j}, N_{i} - I_{\{u \neq 0\}}, U_{i} + u)], \tag{6}$$

with the terminal conditions

$$\hat{V}_{m} (S_{m}^{j}, N_{m}, U_{m}) = \max_{u \in U_{m}} [h_{m} (S_{m}^{j}, N_{m}, U_{m}, u) + \tilde{\phi} (U_{m} + u)], \tag{7}$$

where $\hat{H}_{i} (S, N, U)$ and $\hat{V}_{i} (S, N, U)$ are the time-$t_{i}$, state-$Z_{i}$ continuation and option value estimators, respectively, $h_{i} (S, N, U, u)$ (with $h_{i} (S, N, U, 0) = 0$) is the time-$t_{i}$, state-$Z_{i}$ payoff from exercising $u$ units, $b$ is the branching factor, $I$ is an indicator function and $\tilde{\phi} (U_{m} + u)$ is a global usage penalty term. The superscript $j = (j_{0}, j_{1}, \dots, j_{i})$ indicates the specific node within a given stochastic tree, and the superscript $k$ on $S_{i + 1}^{k}$ is shorthand for the child node $(j, k)$.

Figure 1 is a diagram of a section of a Forest of Stochastic Trees with two volume choices, $u_{1}$ and $u_{2}$. It illustrates the nodes in the forest which need to be considered when making an exercise decision given state $(N, U)$. The three choices are no exercise, exercise $u_{1}$ units, and exercise $u_{2}$ units.

The low estimator is similarly defined using the low estimator on each stochastic tree via the dynamic program,

$$\hat{g}_{i l} (S_{i}^{j}, N_{i}, U_{i}, u) = h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + \frac{1}{b - 1} \sum_{\substack{k = 1 \\ k \neq l}}^{b} \hat{v}_{i + 1} (S_{i + 1}^{k}, N_{i} - I_{\{u \neq 0\}}, U_{i} + u), \tag{8}$$

$$\hat{H}_{i l} (S_{i}^{j}, N_{i}, U_{i}) = \max_{u \in U_{i}} [\hat{g}_{i l} (S_{i}^{j}, N_{i}, U_{i}, u)], \tag{9}$$

$$\hat{v}_{i l} (S_{i}^{j}, N_{i}, U_{i}) = h_{i} (S_{i}^{j}, N_{i}, U_{i}, \hat{u}^{*}) + \hat{v}_{i + 1} (S_{i + 1}^{l}, N_{i} - I_{\{\hat{u}^{*} \neq 0\}}, U_{i} + \hat{u}^{*}), \tag{10}$$

and

$$\hat{v}_{i} (S_{i}^{j}, N_{i}, U_{i}) = \frac{1}{b} \sum_{l = 1}^{b} \hat{v}_{i l} (S_{i}^{j}, N_{i}, U_{i}), \tag{11}$$

where $\hat{H}_{i l} (S_{i}^{j}, N_{i}, U_{i})$ is the $l$-th leave-one-out hold value estimator and $\hat{u}^{*}$ is the estimated optimal exercise amount, i.e., the maximizer in Equation (9), which depends on $i$ and $l$. The terminal conditions associated with this dynamic programming scheme are

$$\hat{v}_{m} (S_{m}^{j}, N_{m}, U_{m}) = \max_{u \in U_{m}} [h_{m} (S_{m}^{j}, N_{m}, U_{m}, u) + \tilde{\phi} (U_{m} + u)], \tag{12}$$

where $\tilde{\phi} (U_{m} + u)$ is a cumulative usage penalty term.
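The high and low recursions can be sketched together in a few lines for a single underlying asset. This is a minimal illustration under stated assumptions, not the authors' implementation: prices follow a geometric Brownian motion, a per-step discount factor `disc` multiplies the continuation estimates (the recursions above do not show discounting explicitly), there is one type of exercise right with payoff $u \times |S - K|$, and exercise dates are $i = 0, \dots, m$. The name `fost_estimates` and its parameters are hypothetical.

```python
import numpy as np


def fost_estimates(S0, n_rights, volumes, m, b, dt, r, delta, sigma, K, rng,
                   penalty=lambda U: 0.0):
    """One valuation of the forest; returns (high, low) at state (n_rights, U = 0).

    Requires b >= 2 so that the leave-one-out average is defined.
    """
    disc = np.exp(-r * dt)                     # assumed one-step discount factor
    choices = [0] + list(volumes)              # zero volume means "hold"

    def admissible(n):
        return choices if n > 0 else [0]

    def payoff(S, u):
        return u * abs(S - K)

    def children(S):
        z = rng.standard_normal(b)             # b independent successors of this node
        return S * np.exp((r - delta - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)

    def node(S, i, states):
        """Map each state (n, U) to its (high, low) estimate at this node."""
        if i == m:                             # terminal conditions, Eqs. (7) and (12)
            out = {}
            for n, U in states:
                v = max(payoff(S, u) + penalty(U + u) for u in admissible(n))
                out[(n, U)] = (v, v)
            return out
        nxt = {(n - (u != 0), U + u) for n, U in states for u in admissible(n)}
        kids = [node(Sk, i + 1, nxt) for Sk in children(S)]  # same branches for every (n, U)
        out = {}
        for n, U in states:
            us = admissible(n)
            succ = [(n - (u != 0), U + u) for u in us]
            h = np.array([payoff(S, u) for u in us])
            hi = np.array([[kid[s][0] for kid in kids] for s in succ])  # shape (len(us), b)
            lo = np.array([[kid[s][1] for kid in kids] for s in succ])
            high = float(np.max(h + disc * hi.mean(axis=1)))            # Eqs. (5) and (6)
            loo = (lo.sum(axis=1, keepdims=True) - lo) / (b - 1)        # leave-one-out means
            u_star = np.argmax(h[:, None] + disc * loo, axis=0)         # Eqs. (8) and (9)
            low = float(np.mean(h[u_star] + disc * lo[u_star, np.arange(b)]))  # Eqs. (10), (11)
            out[(n, U)] = (high, low)
        return out

    return node(S0, 0, {(n_rights, 0)})[(n_rights, 0)]
```

In use, one would average the outputs of `fost_estimates` over $R$ independent calls to obtain the point estimates. The work per valuation grows like $b^{m}$ times the number of reachable $(N, U)$ states, so the sketch is practical only for small $m$.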

2.2. Estimator Bias

In order to justify using the high- and low-biased estimators to construct upper and lower option price bounds, respectively, we prove that the high estimator is always positively biased and that the low estimator is always negatively biased. In addition we include a comparison of the estimators which orders their values on any realization of the simulated forest.

The theorems that follow are direct extensions of those in Broadie and Glasserman (1997). Below, the branching factor, $b$, appears as an argument in the estimators. For example, $\hat{V}_{0} (b, S_{0}, N_{0}, U_{0})$ refers to the time-0, state-$Z_{0}$ high-biased estimator with a stochastic tree branching factor of $b$. This argument has been suppressed to this point for convenience. We begin with the theorem regarding the bias of the high estimator.

Theorem 1.

(High estimator bias) The high estimator is biased high, i.e.,

$$E [\hat{V}_{0} (b, S_{0}, N_{0}, U_{0})] \geq B_{0} (S_{0}, N_{0}, U_{0}) \tag{13}$$

for all $b$.

Similarly, the result stating the bias of the low estimator follows.

Theorem 2.

(Low estimator bias) The low estimator is biased low, i.e.,

$$E [\hat{v}_{0} (b, S_{0}, N_{0}, U_{0})] \leq B_{0} (S_{0}, N_{0}, U_{0}) \tag{14}$$

for all $b$.

Finally, an ordering result for the high and low estimators is stated in Theorem 3.

Theorem 3.

(Comparison of Estimators) On every realization of the forest the low estimator is less than or equal to the high estimator. That is,

$$\hat{v}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) \leq \hat{V}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) \tag{15}$$

with probability one for all $j$, $i$.

Theorems 1–3 are proven in Appendix B.
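As a one-step intuition for Theorems 1 and 2 (an illustrative sketch, not the Appendix B argument), fix a node and a state, write $a_{u}$ for the exercise payoff from volume $u$, and assume for this step only that the successor values are unbiased, so that each continuation estimate has mean $H(u)$ and the exact value is $B = \max_{u} (a_{u} + H(u))$. The symbols $a_{u}$, $H(u)$ and $\hat{v}^{\,l}$ are shorthand introduced here.

```latex
\begin{align*}
E\Big[\max_{u}\big(a_{u} + \hat{H}(u)\big)\Big]
  &\ge \max_{u}\big(a_{u} + E[\hat{H}(u)]\big)
   = \max_{u}\big(a_{u} + H(u)\big) = B
  && \text{(high: Jensen's inequality)} \\
E\big[a_{\hat{u}^{*}} + \hat{v}^{\,l}(\hat{u}^{*})\big]
  &= \sum_{u} P(\hat{u}^{*} = u)\,\big(a_{u} + H(u)\big)
   \le \max_{u}\big(a_{u} + H(u)\big) = B
  && \text{(low: decision independent of branch } l\text{)}
\end{align*}
```

The second line uses the fact that $\hat{u}^{*}$ is chosen from the $b - 1$ branches other than $l$, so it is independent of the branch-$l$ value used to evaluate that decision.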

2.3. Estimator Convergence

An advantage of the stochastic tree method over some other MC valuation methods is that its estimators are consistent for the true option value. This property also holds for the FOST estimators. In this section we state two theorems: one for the consistency of the high estimator, and the other for the consistency of the low estimator. Here convergence is in probability to the true option value and, as above, the argument $b$ that appears with the estimators refers to the branching factor, with convergence being shown as $b \to \infty$. Before stating the result, define $\bar{V}_{0} (b, S_{0}, N_{0}, U_{0})$ as the average of $R$ independent replications of $\hat{V}_{0} (b, S_{0}, N_{0}, U_{0})$.

Theorem 4.

(High estimator convergence) Suppose $E [{|h_{i} (S_{i}, N_{i}, U_{i})|}^{p'}] < \infty$ for all $t_{i}$ and some $p' > 1$. Then $\bar{V}_{0} (b, S_{0}, N_{0}, U_{0})$ converges to $B_{0} (S_{0}, N_{0}, U_{0})$ in $p$-norm for any $0 < p < p'$ as $b \to \infty$. This holds for an arbitrary number of repeated valuations of the forest, $R$. In particular, $\bar{V}_{0} (b, S_{0}, N_{0}, U_{0})$ converges to $B_{0} (S_{0}, N_{0}, U_{0})$ in probability and is thus a consistent estimator of the option value.

This result implies that

$$E [\hat{V}_{0} (b, S_{0}, N_{0}, U_{0})] \to B_{0} (S_{0}, N_{0}, U_{0}) \tag{16}$$

as $b \to \infty$. Hence the estimator is asymptotically unbiased.

Theorem 5.

(Low estimator convergence) Suppose that

$$\begin{aligned} P [\, & h_{i} (S_{i}, N_{i}, U_{i}, u^{1}) + H_{i} (S_{i}, N_{i} - I_{\{u^{1} \neq 0\}}, U_{i} + u^{1}) \\ & \neq h_{i} (S_{i}, N_{i}, U_{i}, u^{2}) + H_{i} (S_{i}, N_{i} - I_{\{u^{2} \neq 0\}}, U_{i} + u^{2}) \,] = 1, \end{aligned}$$

for $u^{1}, u^{2} \in U_{i}$, $u^{1} \neq u^{2}$ and all $i$. Then Theorem 4 also holds for the low estimator.

The additional condition imposed in Theorem 5 is analogous to that used in Theorem 3 of Broadie and Glasserman (1997). This condition says that, with probability one, the optimal exercise policy is never indifferent between the choices of volumes to exercise (including $u = 0$). As in Broadie and Glasserman (1997), imposing this condition simplifies the analysis of the estimator. Theorems 4 and 5 are proven in Appendix B.

2.4. Numerical Results

Pricing swing contracts involves a large number of parameters and in this section we provide some results which illustrate the validity of our method across a variety of specifications. We assume that the underlying assets follow a risk neutral stochastic process, there are no transaction costs and other than penalties, there are no other constraints considered. We also assume a constant risk free rate of interest and the volatilities of all assets are known constant functions of time.

The option swing rights may be exercised at discrete times up to and including expiry and the volume choices given are in discrete amounts. That is, anytime the holder chooses to exercise a right, they must choose from a finite list of possible volume amounts. The rationale behind allowing all time steps to be exercise opportunities is the exponential growth in computational time caused by adding intermediate non-exercise times. However, the method can easily be modified to incorporate these extra time steps. As previously mentioned penalties can be implemented globally and are based on the net volume exercised during the contract.

2.4.1. Single Dimension

Beginning with the one-dimensional case, we have based our simulations on an underlying asset with a risk-neutralized price process that satisfies the following stochastic differential equation,

$$d S_{i} = S_{i} [(r - \delta) \, d t + \sigma \, d Z_{i}]. \tag{17}$$

In this equation, $r$ is the riskless interest rate, $Z_{i}$ is a standard Brownian motion process, $\sigma$ is a constant volatility parameter and the underlying asset itself pays a continuous dividend yield $\delta$. The parameter values for the underlying asset are specified as $r = 0.05$, $\delta = 0.1$, and $\sigma = 0.2$.
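For reference, Equation (17) can be sampled without discretization error because it describes a geometric Brownian motion with a lognormal transition. The sketch below assumes that exact lognormal step; the helper name `gbm_step` is hypothetical, and the defaults are the parameter values quoted above.

```python
import numpy as np


def gbm_step(S, dt, r=0.05, delta=0.1, sigma=0.2, rng=None):
    """Advance prices S by one step of length dt under Equation (17).

    Uses the exact lognormal transition of geometric Brownian motion:
    S_next = S * exp((r - delta - sigma^2 / 2) * dt + sigma * sqrt(dt) * z).
    """
    rng = np.random.default_rng() if rng is None else rng
    S = np.asarray(S, dtype=float)
    z = rng.standard_normal(S.shape)           # one standard normal draw per price
    drift = (r - delta - 0.5 * sigma**2) * dt
    return S * np.exp(drift + sigma * np.sqrt(dt) * z)
```

Under this transition the expected price after one step is $S e^{(r - \delta) \Delta t}$, which provides a simple sanity check on simulated branches.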

The swing options considered have both up and down swing rights and the payoff upon exercise is

$$u \times \max [\max (S - K_{u}, 0), \max (K_{d} - S, 0), 0], \tag{18}$$

where $u$ is the volume exercised, $S$ is the price of the underlying asset at the exercise time, and $K_{u}$ and $K_{d}$ are the up and down swing strike prices, respectively. For the examples considered here, we set $K_{u} = K_{d} = K$, which simplifies the payoff function to

$$u \times \max (S - K, K - S). \tag{19}$$
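A direct transcription of Equation (18) makes the simplification to Equation (19) easy to check numerically. The function name `swing_payoff` is hypothetical; with equal strikes the result coincides with $u \times |S - K|$.

```python
def swing_payoff(S, u, K_up, K_down):
    """Exercise payoff of Equation (18) for volume u at price S."""
    return u * max(max(S - K_up, 0.0), max(K_down - S, 0.0), 0.0)


# With K_up = K_down = K the payoff reduces to u * max(S - K, K - S), Equation (19).
for S in (25.0, 40.0, 57.5):
    assert swing_payoff(S, 60, 40.0, 40.0) == 60 * max(S - 40.0, 40.0 - S)
```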

For all examples in Section 2.4.1 the option expiry is 3.0 years and the options have both up and down swing rights with a common strike price $K_{u} = K_{d} = 40.0$. In examples where the holder controls the amount exercised, a list of volume choices is given. The volume choices are consecutive integer multiples of a base amount and the up swing and down swing volumes have the same magnitude. For comparison purposes the results in this subsection include a binomial value which is calculated using the Forest of Trees Lari et al. (2001).

All simulations in this subsection were completed on the SHARCNet cluster Whale. Whale is located at the University of Waterloo and consists of Opteron 2.2 GHz processors (four per node) with a Gigabit Ethernet interconnect. Timing results listed below are given in total CPU time accumulated, which is approximately equal to (program runtime) × (number of processors used).

Example 1.

(Illustration of Bias and Convergence) The swing option in this example has one up and one down swing right, three exercise opportunities, exercise volume of 60 units of the underlying and there is no penalty. The initial price is USD 40. Here we illustrate the effect of the branching factor $b$ on the value estimates. Specifically, we perform $R$ repeated valuations of the FOST with a branching factor of $b$ and hold the total sample size fixed using the relation $R = 32{,}000 \left(\frac{10}{b}\right)$. Figure 2 plots the FOST estimates versus branching factor (the estimates are the averages of the $R$ repeated valuations). Taking the Binomial estimate as the true option value, it is clear that the high estimator overestimates the true price while the low estimator underestimates the true price. Furthermore, as the branching factor increases, the high estimator decreases and the low estimator increases to the true option price, clearly illustrating estimator convergence. Estimator standard errors are approximately 0.07% of estimator value.
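The relation between $R$ and $b$ in Example 1 keeps the product $R \times b$ constant at 320,000. The short sketch below tabulates it; the helper name `replications` and the particular branching factors listed are illustrative assumptions, since the values of $b$ used for Figure 2 are not stated in the text.

```python
def replications(b, base=32_000, ref_b=10):
    """Number of repeated valuations R = 32,000 * (10 / b) for branching factor b."""
    return base * ref_b // b


# Hypothetical branching factors; each row shows b, R and the fixed product R * b.
for b in (5, 10, 20, 40):
    R = replications(b)
    print(b, R, R * b)
```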

Example 2.

(Effect of Usage Penalty and Initial Asset Price) In this example, the option has two up and two down swing rights and, upon exercise, there are three volume choices: 20, 40 and 60 units of the asset. Should the final net volume exercised exceed 90 units or be below −90 units, a penalty is imposed. The penalty is calculated by multiplying the terminal asset price by ten times the excess usage above 90 units or below −90 units. To see the effect of the penalty on option value, we also turn the penalty term off and value the corresponding option with no penalty. The initial asset value ranges from USD 20 to USD 60 in steps of USD 10. There are $m = 5$ exercise opportunities and we use a branching factor of $b = 20$ with $R = 4000$ repeated valuations.
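The penalty rule of Example 2 can be written as a small function. This is a sketch under two assumptions that go beyond the text: the penalty enters the option value with a negative sign, and it is passed the terminal asset price explicitly. The name `usage_penalty` and its parameter names are hypothetical.

```python
def usage_penalty(net_volume, terminal_price, limit=90.0, multiplier=10.0):
    """Contribution of the usage penalty to the option value (non-positive).

    The excess is the amount by which the final net volume exceeds +limit
    or falls below -limit; the penalty is terminal_price * multiplier * excess.
    """
    excess = max(abs(net_volume) - limit, 0.0)
    return -terminal_price * multiplier * excess


# Net volume of 100 units at a terminal price of 40: excess 10, penalty 4000.
print(usage_penalty(100.0, 40.0))
```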

The pricing results are presented in Table 2. In each row of the table, we see that the high and low estimators bound the true price. Unsurprisingly, imposing a penalty on the cumulative volume reduces the option value. Furthermore, as the initial stock price increases, the increase in an up swing right’s value is more than the decrease in a down swing right’s value. The opposite is true as the initial stock price decreases. The end result is that as the initial stock price moves away from being at-the-money, the option value increases.

The average computing time per row (not including the binomial forest valuation) was 5.6 h for the cases with usage penalty and was 1.1 h without usage penalty. The reduction in runtime for the case with no penalty can be described as follows. If there are no constraints (e.g., penalty, storage) on the option holder then upon exercise it is always optimal to choose the maximum amount. Therefore with no penalty this option is equivalent to that of an otherwise identical swing option with no volume choices and an exercise volume of 60 units. The latter has fewer trees in its forest and is therefore quicker to evaluate. In Table 2 we have chosen to exploit this as a convenient way to save computational time. For the binomial method run times were on the order of a few seconds.

Example 3.

(Effect of Number of Exercise Rights) In this example we illustrate that the option value increases with the number of exercise rights and compare the swing option value with that of a corresponding basket of American options. The option has $m = 5$ exercise opportunities, an exercise volume of 60 units, and there is no usage penalty. Additionally, we set the initial price to $S_{0} = 40$ and use a branching factor of $b = 20$ with $R = 4000$ repeated valuations. We consider options having an equal number of up and down swing rights. Table 3 gives the option price estimates for $N_{u} = N_{d} = 1, 3, 5$ along with prices computed using the Forest of Trees. First, notice that the high and low estimates bound the true option value from the binomial model. Next, note that with $N_{u} = N_{d} = 5$ exercise rights, the high and low estimates are exactly the same. In this case the number of exercise opportunities is equal to the numbers of up and down swing rights and, since the up and down swing strikes are equal, exactly one of these rights will be exercised at each opportunity (see Equation (19)). This makes both the exercise payoff and the continuation value estimates exactly the same at all times and along all branches for both the low and high estimators, yielding identical prices.

Second, the option value increases with the number of swing rights. However, the price increases by a factor that is less than the increase in the number of swing rights. For example, when the number of rights increases by a factor of 3 (e.g., going from $N_{u} = N_{d} = 1$ to $N_{u} = N_{d} = 3$) the option value increases by a factor of 2.5, and when the number of rights increases by a factor of $\frac{5}{3}$ (e.g., going from $N_{u} = N_{d} = 3$ to $N_{u} = N_{d} = 5$), the option value increases by a factor of 1.2. This result matches the intuition that a swing option with a given number of rights is worth no more than a basket of American put and call options with otherwise identical parameters and $K_{d} \leq K_{u}$.

For the case of one up and one down swing right the basket of American options would contain a single call corresponding to the up swing right and a single put corresponding to the down swing right, with equal strike prices for the call and put. Changing the number of up and down swing rights invokes a change in the corresponding number of call and put options in the basket. Figure 3 shows a comparison between the values of a basket of American options and a swing option with a comparable number of exercise rights. The value of the basket is linear in the number of exercise rights and the swing option value is below the value of the corresponding basket when the number of rights is greater than one. This follows from the restriction that only one swing right may be exercised at any exercise opportunity whereas all American options of a particular type could be exercised at a given time. In the case of one up and one down right the two are equal since it would never be optimal to exercise both the put and call style rights at the same time. As the number of rights increases, the difference in values increases due to the swing option restriction allowing only a single right to be exercised at each opportunity. The low-biased estimator is used for both the basket and swing option values in Figure 3.

2.4.2. Calibrated Forward Curve

In this example, price movements are simulated from the trinomial-tree model described in Section 4 of Jaillet et al. (2004). This is a 1-factor model with mean reversion that is seasonally adjusted and calibrated to the forward curve. The option we value is Example (a) of Section 4.2 in Jaillet et al. (2004). This option is a swing option with two up rights, each right allowing the holder to take delivery of either 1 or 2 MMBtu of natural gas. We simplify to four exercise opportunities and 4 months until expiry. Upon exercise the holder gets

max (U_{i} (A_{i} S_{i} - K), 0)

(20)

where

U_{i}

is the volume chosen,

A_{i}

is the seasonality factor and

S_{i}

is the deseasonalized spot price.
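A minimal sketch of the exercise payoff in Equation (20) follows. The function names and the numerical inputs in the usage note are hypothetical and are not taken from Jaillet et al. (2004). The helper `best_volume` ranks volume choices by immediate payoff only; the valuation itself also accounts for the continuation value.

```python
def exercise_payoff(volume, seasonality, spot, strike):
    """Payoff max(U_i * (A_i * S_i - K), 0) from Equation (20)."""
    return max(volume * (seasonality * spot - strike), 0.0)


def best_volume(volumes, seasonality, spot, strike):
    """Return the (volume, payoff) pair with the largest immediate payoff."""
    return max(
        ((u, exercise_payoff(u, seasonality, spot, strike)) for u in volumes),
        key=lambda pair: pair[1],
    )
```

For instance, `best_volume((1, 2), 1.1, 3.0, 3.0)` selects the volume 2 because the seasonally adjusted price exceeds the strike, so the larger volume gives the larger immediate payoff.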

Figure 4 plots the option price estimates, including 95% confidence intervals, against branching factor. The number of repeated valuations used to generate prices with a branching factor of b is

R = \frac{160, 000}{b}

. We see that, with a branching factor of only 8, the confidence intervals for the high- and low-biased estimators begin to overlap, and they quickly become almost indistinguishable for higher branching factors, a numerical illustration of both estimators’ consistency. Additionally, the high and low estimators converge to the true price computed using the trinomial model. We note that the serial computational times (for a single FOST valuation) for branching factors of 8 and 32 were approximately 4.5 and 110 s, respectively, using a 2.1 GHz Core 2 Duo processor. The pricing results shown in Figure 4 are consistent with the results in Jaillet et al. (2004), but we note that the valuation method in that publication breaks down in higher dimensions and in cases where the inclusion of more risk factors is desirable.
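The bookkeeping behind these estimates can be sketched as follows. This is an illustration only: the function names are hypothetical, and the normal-approximation interval with multiplier 1.96 is an assumption about how a 95% confidence interval would typically be formed from R iid repeated valuations.

```python
import math
import statistics


def repeated_valuations(total_budget, branching_factor):
    """Number of repeated valuations R = total_budget / b, as an integer."""
    return total_budget // branching_factor


def confidence_interval_95(estimates):
    """Normal-approximation 95% interval for the mean of iid estimates."""
    mean = statistics.fmean(estimates)
    std_err = statistics.stdev(estimates) / math.sqrt(len(estimates))
    half_width = 1.96 * std_err
    return mean - half_width, mean + half_width
```

With a budget of 160,000, `repeated_valuations(160000, 8)` gives 20,000 repeated valuations and `repeated_valuations(160000, 32)` gives 5,000.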

2.4.3. Five Dimensions

Due to the computationally intensive nature of this method it only becomes truly useful in cases where PDE or tree based methods fail. In this subsection we present high-dimensional versions of the examples presented in Section 2.4.1. The asset prices are assumed to follow a risk neutralized correlated geometric Brownian motion described by the stochastic differential equations,

d S_{i}^{k} = S_{i}^{k} [(r - δ^{k}) d t + σ^{k} d Z_{i}^{k}], k = 1, \dots, d,

(21)

where

Z_{i}^{k}

is a standard Brownian motion process and the instantaneous correlation between

Z^{k}

and

Z^{s}

is

ρ^{k s}

. Here the parameter values are specified as

r = 0.05

,

δ^{k} = δ = 0.1

,

σ^{k} = σ = 0.2

for all k, and

ρ^{k s} = 0

for all

k \neq s

. In addition all assets have the same initial value,

S_{0}

, and we take the number of assets to be

d = 5

.
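As an illustration of how successor prices could be generated at a node under Equation (21), the sketch below uses the exact lognormal update for geometric Brownian motion. It assumes uncorrelated assets, which matches the parameter choice of zero correlation above; the function names, the time step `dt`, and the use of Python's `random` module are assumptions made for the example.

```python
import math
import random


def gbm_step(prices, r, delta, sigma, dt, rng):
    """Advance independent GBM prices by dt using the exact lognormal update."""
    drift = (r - delta - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    return [s * math.exp(drift + vol * rng.gauss(0.0, 1.0)) for s in prices]


def branch(prices, b, r, delta, sigma, dt, rng):
    """Generate b successor price vectors from one node of a stochastic tree."""
    return [gbm_step(prices, r, delta, sigma, dt, rng) for _ in range(b)]
```

A call such as `branch([40.0] * 5, 8, 0.05, 0.1, 0.2, 1.0, random.Random(0))` returns eight price vectors of length five. Correlated assets would require correlated normal draws, which this sketch does not cover.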

The swing options considered have both up and down swing rights and the payoff upon exercise is

u \times max [max_{k = 1, \dots, d} S^{k} - K_{u}, K_{d} - max_{k = 1, \dots, d} S^{k}, 0],

(22)

where

S^{k}

is the price of the kth underlying asset at the exercise time. This payoff is an extension of the example given in Broadie and Glasserman (1997) and Broadie and Glasserman (2004) for single-exercise American-style options. For the examples considered here, we set

K_{u} = K_{d} = K

which simplifies the payoff function to

u \times max (max_{k = 1, \dots, d} S^{k} - K, K - max_{k = 1, \dots, d} S^{k}) .

(23)
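The payoff on the maximum of the asset prices can be written as a short function. This is a sketch with a hypothetical function name; it reads Equation (22) as the largest of the up payoff, the down payoff, and zero, scaled by the exercised volume.

```python
def max_asset_payoff(volume, prices, k_up, k_down):
    """Volume times the best of the up payoff, the down payoff, and zero."""
    top = max(prices)
    return volume * max(top - k_up, k_down - top, 0.0)
```

When the two strikes coincide, the result equals the volume times the absolute difference between the maximum price and the strike, which is the simplification in Equation (23).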

As in Section 2.4.1, the option expiry is 3.0 years and the options have both up and down swing rights with strike prices

K_{u} = K_{d} = 40.0

. In examples where the holder controls the amount exercised, a list of volume choices is given. Note that we present results from our FOST methodology without comparisons to other methods as there is no generally accepted benchmark for the examples considered here.

Example 4.

(Illustration of Bias and Convergence) This 5-dimensional example corresponds with Example 1. The swing option has one up and one down swing right, three exercise opportunities, exercise volume of 60 units of the underlying and there is no penalty. The initial price is USD 40. We perform R repeated valuations of the FOST with a branching factor of b and hold the total sample size fixed using the relation

R = 32000 (\frac{10}{b})

. This results in standard errors ≈ 0.09% of option value. Figure 5 plots the FOST estimates versus branching factor, showing that the high estimator overestimates the option price while the low estimator underestimates the price. Furthermore, as the branching factor increases, the high estimator decreases and the low estimator increases and they appear to be converging to the same value, illustrating estimator convergence. These findings are consistent with those in Example 1.

Example 5.

(Effect of Usage Penalty and Initial Asset Price) This 5-dimensional example corresponds with Example 2, with the option specifications the same as presented there, modulo the adjustment to the payoff function to five dimensions. The pricing results are in Table 4. The effects of a usage penalty and initial stock price are qualitatively the same as in the 1-dimensional results. We note that the option value estimates in this example are higher than those in Example 2 due to the payoffs depending on the maximum of the five asset prices. The computing times for the 5-dimensional case are similar to those for the 1-dimensional asset.

Example 6.

(Effect of Number of Exercise Rights) This 5-dimensional example corresponds with Example 3, with the option specifications the same as presented there, modulo the adjustment to the payoff function to five dimensions. The results are given in Table 5 and Figure 6. The results, intuition, and interpretation are qualitatively the same as the 1-dimensional results.

2.5. Algorithmic Enhancement via Parallel Processing

One method for enhancing the computational efficiency of this algorithm is to take advantage of multi-processor computing techniques. The simplest and most obvious implementation is to parallelize across repeated valuations of the forest, resulting in serial farming of the repeated valuations. Since each repeated valuation results in an iid random value for the option estimate, the results may be generated independently of one another, removing the need for communication between processors. This method is simple and effective. We state here, without numerical evidence, that it results in a near perfect speed up without the need for expensive interconnections. With this method the minimum run time that can be achieved is determined by the number of processors available, the number of repeated valuations necessary for the desired accuracy and the run time of a single forest.
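The serial-farming idea can be sketched in a few lines. This is not the implementation used for the reported results: `toy_valuation` is a hypothetical placeholder for a single forest valuation, and the thread-based executor is chosen only to keep the example simple. For CPU-bound work in CPython, a process-based executor such as `concurrent.futures.ProcessPoolExecutor` would be the natural substitute.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def toy_valuation(seed):
    """Placeholder for one forest valuation: returns one iid estimate."""
    rng = random.Random(seed)
    return 10.0 + rng.gauss(0.0, 1.0)


def farm_valuations(valuation, seeds, executor_cls=ThreadPoolExecutor, workers=4):
    """Run independent valuations in parallel and average the results."""
    with executor_cls(max_workers=workers) as pool:
        estimates = list(pool.map(valuation, seeds))
    return statistics.fmean(estimates), estimates
```

Each worker receives its own seed, so the valuations share no state and need no communication until the final average is taken.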

A variation on the aforementioned parallel implementation is to parallelize the FOST computations internally within the forest. In the results shown in Figure 7 the FOST algorithm has been modified so that the computation of the individual trees within the forest is done using multiple processors. Here we have begun the parallelization after the first time step by dividing up the computation of the remaining subtrees across different processors. Upon completion, the results are gathered and the option value at the initial time step is determined. In Figure 7 we see that this method results in a near perfect speed up due to the small ratio of communication time versus computational time. This implementation may be combined with serial farming resulting in further computational time efficiency. This is discussed more fully in Marshall et al. (2011).

In Figure 7 the swing option is identical to the one in Example 4 and pricing is done with a branching factor

b = 160

. The computational times were generated using the SHARCNET cluster Hound, which comprises 2.2 GHz Opteron processors with 4 GB per core and InfiniBand interconnections. Run times are normalized to the run time of a single processor.

3. Discussion and Conclusion

The FOST can be thought of as a generalization of two existing pricing methodologies. First, it generalizes the Forest of Trees method to a high-dimensional underlying. Second, it generalizes the Stochastic Tree pricing method for single-exercise right options to multiple exercise rights. We construct high and low FOST estimators analogous to those defined for the Stochastic Tree. We prove properties regarding FOST estimator bias, ordering, and convergence and present numerical results as illustrations.

In related work, we have replaced the binomial/trinomial trees in the Forest of Trees method with Stochastic Meshes (Broadie and Glasserman 2004), creating the Forest of Stochastic Meshes (Marshall and Reesor 2011). This avoids the exponential growth in computing time with the number of exercise opportunities experienced by the FOST. Another avenue of future work involves algorithmic enhancement. In Section 2.5 we discussed the use of parallel processing to reduce computing time. Two alternatives to this are variance reduction and bias reduction. There are some standard variance reduction methods (e.g., antithetic variates, control variates) that could be used to produce more efficient estimators. The bias reduction technique given in Whitehead et al. (2012) for Stochastic Tree estimators successfully reduces the branching factor required to obtain a desired accuracy for an American option value. This technique can be extended to correct the bias in FOST estimators and we have preliminary evidence of its effectiveness (Marshall 2012). Combinations of variance reduction, bias reduction, and parallel processing can be investigated to further improve the algorithm’s performance.

Author Contributions

R.M.R. and T.J.M. contributed equally to all aspects of this paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the Natural Sciences and Engineering Research Council of Canada grant number RGPIN-2017-05441.

Acknowledgments

The authors thank SHARCNET for computational resources and technical support; particular acknowledgement goes to both Tyson Whitehead and Baolai Ge. We also thank graduate students and faculty in the Financial Mathematics group at Western University, in particular, Matt Davison, Adam Metzler, Rogemar Mamon, and Lars Stentoft.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. The views represented herein are the authors’ own views and do not necessarily represent the views of Bank of Montreal or its affiliates and are not a product of its research.

Abbreviations

The following abbreviations are used in this manuscript:

MEO	Multiple Exercise Option
PDE	Partial Differential Equation
FOST	Forest of Stochastic Trees
FOSM	Forest of Stochastic Meshes
MC	Monte Carlo

Appendix A. Nomenclature

In this appendix we describe the notation used in this paper.

If X is a random variable, we write

∥X∥

for the p-norm

{(E [{|X|}^{p}])}^{1 / p}

of X. The conditional p-norm of X on

Z_{i}

,

{(E [{|X|}^{p} | Z_{i}])}^{1 / p}

, is denoted

{∥X∥}_{Z_{i}}

. Here we include a summary of notation used for the proofs contained in this appendix as well as all subsequent appendices.

Time is indexed by i for $t_{i}$ , $i = 0, 1, \dots, m$ .
R is the number of repeated valuations of the forest.
b is the branching factor.
$S_{i}^{\mathbf{j}}$ is the spot price vector at time $t_{i}$ for branch $\mathbf{j} = \{j_{0}, j_{1}, \dots, j_{i}\}$ . For convenience we may suppress the bold superscript if there is no ambiguity in doing so; in these cases $S_{i + 1}^{j}$ refers to the time- $t_{i + 1}$ price along the branch path $\mathbf{j} = \{j_{0}, j_{1}, \dots, j_{i}, j\}$ .
$Z_{i}$ represents the time- $t_{i}$ history of the set of state variables $(S_{i}^{j}, N_{i}, U_{i})$ , where we suppress the branching history index.
${\hat{V}}_{i} (b, S_{i}^{j}, N_{i}, U_{i})$ is the time- $t_{i}$ , state- $Z_{i}$ high estimator.
${\hat{v}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i})$ is the time- $t_{i}$ , state- $Z_{i}$ leave one out low biased estimator which does not include node l at time- $t_{i + 1}$ .
${\hat{H}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u)$ is the time- $t_{i}$ , state- $Z_{i}$ leave one out hold value estimator for exercising u units which does not include node l at time- $t_{i + 1}$ ,

$\begin{matrix} {\hat{H}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u) & = \frac{1}{b - 1} \sum_{\substack{k = 1 \\ k \neq l}}^{b} D_{i + 1} {\hat{v}}_{i + 1}^{k} (b, S_{i + 1}^{\mathbf{k}}, N_{i} - I_{\{u \neq 0\}}, U_{i} + u), \\ \mathbf{k} & = \{\mathbf{j}, k\} \end{matrix}$
${\hat{g}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u) = h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + {\hat{H}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u)$
${\hat{v}}_{i} (b, S_{i}^{j}, N_{i}, U_{i})$ is the time- $t_{i}$ , state- $Z_{i}$ low estimator

${\hat{v}}_{i} = \frac{1}{b} \sum_{l = 1}^{b} {\hat{v}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i})$
$N_{i}$ is the time- $t_{i}$ number of exercise rights remaining.
$U_{i}$ is the time- $t_{i}$ cumulative volume.
$U_{i}$ is the time- $t_{i}$ discretized set of available volume choices,

$U_{i} = \{u_{0}, u_{1}, u_{2}, \dots, u_{z} : z \in N\},$

where $u_{0} = 0$ .
u is the time- $t_{i}$ volume exercised. Here $u \in U_{i}$ .
$D_{i + 1}$ is the discount factor from $t_{i + 1}$ to $t_{i}$ .
$h_{i} (S_{i}^{j}, N_{i}, U_{i}, u)$ is the time- $t_{i}$ , state- $Z_{i}$ payoff from exercising u units with $h_{i} (S_{i}^{j}, N_{i}, U_{i}, 0) = 0$ .
$H_{i} (S_{i}^{j}, N_{i}, U_{i})$ is the time- $t_{i}$ , state- $Z_{i}$ true hold value,

$H_{i} (S_{i}^{j}, N_{i}, U_{i}) = E [D_{i + 1} B_{i + 1} (S_{i + 1}^{k}, N_{i + 1}, U_{i + 1}) | Z_{i}]$
$B_{i} (S_{i}^{j}, N_{i}, U_{i})$ is the time- $t_{i}$ , state- $Z_{i}$ true option value,

$B_{i} (S_{i}^{j}, N_{i}, U_{i}) = max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + H_{i} (S_{i}^{j}, N_{i} - I_{{u \neq 0}}, U_{i} + u)]$

where $I_{{A}}$ is the indicator function for set A.
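To make the estimator definitions concrete, the following sketch computes the high and low estimators at a single node from already-computed discounted child values. The data layout is an assumption made for the example: `payoffs[u]` stands for the exercise payoff under volume choice u (with index 0 meaning no exercise), and `children[k][u]` stands for the discounted estimator value at child k after choosing u. The function names are hypothetical.

```python
def high_estimate(payoffs, children):
    """max over u of h(u) plus the average discounted child value under u."""
    b = len(children)
    return max(
        payoffs[u] + sum(child[u] for child in children) / b
        for u in range(len(payoffs))
    )


def low_estimate(payoffs, children):
    """Average over l of the leave-one-out estimates."""
    b = len(children)
    total = 0.0
    for l in range(b):
        others = [children[k] for k in range(b) if k != l]

        def g(u):
            # Payoff plus leave-one-out hold value: the decision ignores child l.
            return payoffs[u] + sum(child[u] for child in others) / (b - 1)

        u_star = max(range(len(payoffs)), key=g)
        # The chosen decision is then valued on the left-out child l only.
        total += payoffs[u_star] + children[l][u_star]
    return total / b
```

With `payoffs = [0.0, 2.0]` and `children = [[5.0, 1.0], [1.0, 4.0], [3.0, 2.0]]`, the high estimate is 13/3 and the low estimate is 8/3. Separating the decision from the value it is applied to is what produces the downward bias of the low estimator.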

Appendix B. Proofs of Main Results and Lemmas

Appendix B.1. Proofs of Main Results

In this section we prove the main results of the paper, Theorems 1–5, presented in Section 2.

Proof of Theorem 1.

Here we prove the more general statement that

E [{\hat{V}}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) | Z_{i}] \geq B_{i} (S_{i}^{j}, N_{i}, U_{i})

for

i = 0, 1, \dots, m

. The proof proceeds by backward induction. At expiry the inequality holds trivially since

{\hat{V}}_{m} (b, S_{m}^{j}, N_{m}, U_{m}) = B_{m} (S_{m}^{j}, N_{m}, U_{m})

so that

E [{\hat{V}}_{m} (b, S_{m}^{j}, N_{m}, U_{m}) | Z_{m}] \geq B_{m} (S_{m}^{j}, N_{m}, U_{m})

. We now assume the inductive hypothesis,

E [{\hat{V}}_{i + 1} (b, S_{i + 1}^{j}, N_{i + 1}, U_{i + 1}) | Z_{i + 1}] \geq B_{i + 1} (S_{i + 1}^{j}, N_{i + 1}, U_{i + 1})

, and proceed to the time-

t_{i}

case. We have,

\begin{matrix} E [{\hat{V}}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) | Z_{i}] \\ = E [max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{k}, N_{i} - I_{{u \neq 0}}, U_{i} + u)] | Z_{i}] \\ \geq max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + E [D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{k}, N_{i} - I_{{u \neq 0}}, U_{i} + u) | Z_{i}]] \\ = max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + E [D_{i + 1} E [{\hat{V}}_{i + 1} (b, S_{i + 1}^{k}, N_{i} - I_{{u \neq 0}}, U_{i} + u) | Z_{i + 1}] | Z_{i}]] \\ \geq max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + E [D_{i + 1} B_{i + 1} (S_{i + 1}^{k}, N_{i + 1}, U_{i + 1}) | Z_{i}]] \\ = max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + H_{i} (S_{i}^{j}, N_{i}, U_{i})] \\ = B_{i} (S_{i}^{j}, N_{i}, U_{i}), \end{matrix}

where the first equality comes from the definition of the high estimator, the first inequality comes from the conditional Jensen’s inequality, the second equality uses the tower law and the fact that

D_{i + 1}

is

Z_{i}

-measurable, and the second inequality invokes the inductive hypothesis. Here

N_{i + 1} = N_{i} - I_{{u^{*} \neq 0}}

and

U_{i + 1} = U_{i} + u^{*}

, where

u^{*}

is the value-maximizing volume choice. □
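The Jensen step, that the expectation of a maximum is at least the maximum of the expectations, also holds for sample averages. The sketch below illustrates it on a small hand-made sample. The function names and the data in the usage note are hypothetical; each row is one draw and each column one volume choice.

```python
def mean_of_max(samples):
    """Average over draws of the best value across choices."""
    return sum(max(row) for row in samples) / len(samples)


def max_of_mean(samples):
    """Best choice by average value across draws."""
    n_choices = len(samples[0])
    return max(
        sum(row[u] for row in samples) / len(samples) for u in range(n_choices)
    )
```

For `samples = [[1.0, 3.0], [4.0, 0.0], [2.0, 2.0]]` the mean of the maxima is 3 while the maximum of the means is 7/3, in line with the direction of the high estimator's bias.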

Proof of Theorem 2.

As with the proof of the bias of the high estimator we prove the more general statement that

E [{\hat{v}}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) | Z_{i}] \leq B_{i} (S_{i}^{j}, N_{i}, U_{i})

for

i = 0, 1, \dots, m

by backward induction. Again at expiry the inequality holds trivially since

{\hat{v}}_{m} (b, S_{m}^{j}, N_{m}, U_{m}) = B_{m} (S_{m}^{j}, N_{m}, U_{m})

. We now assume the inductive hypothesis,

E [{\hat{v}}_{i + 1} (b, S_{i + 1}^{j}, N_{i + 1}, U_{i + 1}) | Z_{i + 1}] \leq B_{i + 1} (S_{i + 1}^{j}, N_{i + 1}, U_{i + 1})

. We also note that since the

{\hat{v}}_{i l}

’s are iid we have that,

E [{\hat{v}}_{i} | Z_{i}] = E [{\hat{v}}_{i l} | Z_{i}]

. In what follows we define

{\hat{u}}_{l}^{*} \in U_{i}

to be the volume choice which maximizes a particular

{\hat{v}}_{i l}

. That is,

{\hat{u}}_{l}^{*} = arg max_{u \in U_{i}} [{\hat{g}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u)] .

(A1)

Note that

{\hat{g}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u)

is conditionally independent of

{\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i + 1}, U_{i + 1}, u)

given

Z_{i}

and subsequently

{\hat{u}}_{l}^{*}

is also independent of

{\hat{v}}_{i + 1, l}

given

Z_{i}

since it is a function of

{\hat{g}}_{i l}

.

Now,

\begin{matrix} E [{\hat{v}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}) | Z_{i}] = E [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i}, U_{i}) I_{{{\hat{u}}_{l}^{*} = 0}} | Z_{i}] \\ + E [(h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{1}) + D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{1})) I_{{{\hat{u}}_{l}^{*} = u_{1}}} | Z_{i}] \\ + \dots + E [(h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{z}) + D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{z})) I_{{{\hat{u}}_{l}^{*} = u_{z}}} | Z_{i}] \\ = E [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i}, U_{i}) | Z_{i}] P ({\hat{u}}_{l}^{*} = 0 | Z_{i}) + h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{1}) P ({\hat{u}}_{l}^{*} = u_{1} | Z_{i}) \\ + E [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{1}) | Z_{i}] P ({\hat{u}}_{l}^{*} = u_{1} | Z_{i}) \\ + \dots + h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{z}) P ({\hat{u}}_{l}^{*} = u_{z} | Z_{i}) \\ + E [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{z}) | Z_{i}] P ({\hat{u}}_{l}^{*} = u_{z} | Z_{i}) \\ = E [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i}, U_{i}) | Z_{i}] p_{0} \\ + h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{1}) p_{1} + E [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{1}) | Z_{i}] p_{1} \\ + \dots + h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{z}) p_{z} + E [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{z}) | Z_{i}] p_{z} \end{matrix}

where in the second equality we have used the conditional independence of

{\hat{g}}_{i l}

and

{\hat{v}}_{i + 1, l}

. Here

p_{0} = P ({\hat{u}}_{l}^{*} = 0 | Z_{i})

and

p_{j} = P ({\hat{u}}_{l}^{*} = u_{j} | Z_{i})

for

1 \leq j \leq z

and

p_{0} + \dots + p_{z} = 1

. Thus, using the tower law, we have,

\begin{matrix} E [{\hat{v}}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) | Z_{i}] = E [{\hat{v}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}) | Z_{i}] \\ = E [D_{i + 1} E [{\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i}, U_{i}) | Z_{i + 1}] | Z_{i}] p_{0} \\ + h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{1}) p_{1} + E [D_{i + 1} E [{\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{1}) | Z_{i + 1}] | Z_{i}] p_{1} \\ + \dots + h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{z}) p_{z} + E [D_{i + 1} E [{\hat{v}}_{i + 1, l} (b, S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{z}) | Z_{i + 1}] | Z_{i}] p_{z} \\ \leq E [D_{i + 1} B_{i + 1} (S_{i + 1}^{k}, N_{i}, U_{i}) | Z_{i}] p_{0} \\ + h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{1}) p_{1} + E [D_{i + 1} B_{i + 1} (S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{1}) | Z_{i}] p_{1} \\ + \dots + h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{z}) p_{z} + E [D_{i + 1} B_{i + 1} (S_{i + 1}^{k}, N_{i} - 1, U_{i} + u_{z}) | Z_{i}] p_{z} \\ = H_{i} (S_{i}^{j}, N_{i}, U_{i}) p_{0} + (h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{1}) + H_{i} (S_{i}^{j}, N_{i} - 1, U_{i} + u_{1})) p_{1} \\ + \dots + (h_{i} (S_{i}^{j}, N_{i}, U_{i}, u_{z}) + H_{i} (S_{i}^{j}, N_{i} - 1, U_{i} + u_{z})) p_{z} \\ \leq max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + H_{i} (S_{i}^{j}, N_{i} - I_{{u \neq 0}}, U_{i} + u)] \\ = B_{i} (S_{i}^{j}, N_{i}, U_{i}) \end{matrix}

where the first inequality follows from the inductive hypothesis and the remaining steps follow from the definitions for

B_{i}

and

H_{i}

. □

Proof of Theorem 3.

At expiry we have that

{\hat{v}}_{m} (b, S_{m}^{j}, N_{m}, U_{m}) = {\hat{V}}_{m} (b, S_{m}^{j}, N_{m}, U_{m}) = B_{m} (S_{m}^{j}, N_{m}, U_{m})

so the relation holds trivially. We now take the inductive hypothesis to be

{\hat{v}}_{i + 1} (b, S_{i + 1}^{j}, N_{i + 1}, U_{i + 1}) \leq {\hat{V}}_{i + 1} (b, S_{i + 1}^{j}, N_{i + 1}, U_{i + 1})

for

j_{i + 1} = 1, \dots, b

. Using the

{\hat{g}}_{i l}

as defined above we first consider the case where for a given tree,

{\hat{u}}_{l}^{*} = arg max_{u \in U_{i}} [{\hat{g}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u)],

(A2)

is the same for all l (i.e.,

{\hat{u}}_{l}^{*} = {\hat{u}}^{*}

, for all l).

Then,

\begin{matrix} {\hat{v}}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) = \frac{1}{b} \sum_{l = 1}^{b} {\hat{v}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}) \\ = \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{*}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{*} \neq 0}}, U_{i} + {\hat{u}}^{*})] \\ \leq \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{*}) + D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{*} \neq 0}}, U_{i} + {\hat{u}}^{*})] \\ = h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{*}) + \frac{1}{b} \sum_{l = 1}^{b} [D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{*} \neq 0}}, U_{i} + {\hat{u}}^{*})] \\ \leq max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + \frac{1}{b} \sum_{l = 1}^{b} [D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u)]] \\ = {\hat{V}}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) \end{matrix}

where the first inequality comes from the inductive hypothesis and the remaining relations come from the parameter definitions.

Next consider the case where the low estimator gives two different estimated optimal exercise amounts,

{\hat{u}}^{1}

,

{\hat{u}}^{2}

, across the b branches, where

{\hat{u}}^{1} \neq {\hat{u}}^{2}

. That is,

{\hat{u}}_{l}^{*} = {\hat{u}}^{1}

or

{\hat{u}}_{l}^{*} = {\hat{u}}^{2}

for all

l = 1, \dots, b

. As above we take

{\hat{u}}_{l}^{*}

to be the optimal exercise amount determined by the l-th leave one out estimator, then,

\begin{matrix} {\hat{v}}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) = \frac{1}{b} \sum_{l = 1}^{b} {\hat{v}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}) \\ = \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}_{l}^{*}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}_{l}^{*} \neq 0}}, U_{i} + {\hat{u}}_{l}^{*})] \\ = \frac{1}{b} \sum_{l = 1}^{b} \{[h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})] I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{1}}} \\ + [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})] I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{2}}}\} \\ = \frac{(\frac{1}{b} \sum_{l = 1}^{b} I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{1}}}) \times (\frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})] I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{1}}})}{\frac{1}{b} \sum_{l = 1}^{b} I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{1}}}} \\ + \frac{(\frac{1}{b} \sum_{l = 1}^{b} I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{2}}}) \times (\frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})] I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{2}}})}{\frac{1}{b} \sum_{l = 1}^{b} I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{2}}}} \\ = p \times \frac{\frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})] I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{1}}}}{\frac{1}{b} \sum_{l = 1}^{b} I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{1}}}} \\ + (1 - p) \times \frac{\frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{v}}_{i + 1} 
(b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})] I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{2}}}}{\frac{1}{b} \sum_{l = 1}^{b} I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{2}}}}, \end{matrix}

where

p = \frac{1}{b} \sum_{l = 1}^{b} I_{{{\hat{u}}_{l}^{*} = {\hat{u}}^{1}}}

.

Without loss of generality, suppose that

{\hat{u}}_{l}^{*} = {\hat{u}}^{1}

for

l = 1, \dots, k

and

{\hat{u}}_{l}^{*} = {\hat{u}}^{2}

for

l = k + 1, \dots, b

. Then the above ratios become

\frac{\sum_{l = 1}^{k} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})]}{k}

(A3)

and

\frac{\sum_{l = k + 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})]}{b - k},

(A4)

respectively. Now for any

i^{*} \leq k < j^{*} \leq b

we have

{\hat{g}}_{i i^{*}} (b, S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) > {\hat{g}}_{i j^{*}} (b, S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1})

which from the definition of

{\hat{g}}_{i l}

, implies that

D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{i^{*}}, N_{i}, U_{i} + {\hat{u}}^{1}) \leq D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{j^{*}}, N_{i}, U_{i} + {\hat{u}}^{1}) .

Therefore,

max_{1 \leq a \leq k} [D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{a}, N_{i}, U_{i} + {\hat{u}}^{1})] \leq min_{k + 1 \leq a \leq b} [D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{a}, N_{i}, U_{i} + {\hat{u}}^{1})] .

This implies that the ratio in Equation (A3) satisfies

\begin{matrix} \frac{1}{k} \sum_{l = 1}^{k} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})] \\ \leq \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})], \end{matrix}

and similarly the ratio in Equation (A4) satisfies

\begin{matrix} \frac{1}{b - k} \sum_{l = k + 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})] \\ \leq \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})] . \end{matrix}

Therefore

\begin{matrix} {\hat{v}}_{i} (b, S_{i}^{j}, N_{i}, U_{i}) \leq p \times \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})] \\ + (1 - p) \times \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})] \\ \leq p \times \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})] \\ + (1 - p) \times \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})] \\ \leq max [\frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{1}) + D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{1} \neq 0}}, U_{i} + {\hat{u}}^{1})], \\ \frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, {\hat{u}}^{2}) + D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{{\hat{u}}^{2} \neq 0}}, U_{i} + {\hat{u}}^{2})]] \\ \leq max_{u \in U_{i}} [\frac{1}{b} \sum_{l = 1}^{b} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u)]] \\ = {\hat{V}}_{i} (S_{i}^{j}, N_{i}, U_{i}), \end{matrix}

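where the first inequality follows from the bounds on Equations (A3) and (A4), the second inequality comes from the inductive hypothesis, the third inequality is an application of Jensen’s inequality (a convex combination is bounded above by its largest term), the fourth inequality comes from maximizing over a larger set, and the final equality is the definition of the high-biased estimator.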

For the cases where the low estimator gives

z^{*}

distinct estimated optimal exercise amounts,

{\hat{u}}^{1}, \dots, {\hat{u}}^{z^{*}}

, across the b branches,

z^{*} = 3, \dots, z

, arguments similar to those given above (for 2 distinct estimated optimal exercise amounts) show that,

{\hat{v}}_{i} (S_{i}^{j}, N_{i}, U_{i}) \leq {\hat{V}}_{i} (S_{i}^{j}, N_{i}, U_{i}) .

Since we restrict the number of volume choices to be finite, the theorem is proven. □

Prior to proving Theorems 4 and 5 we first state the following preliminary result.

Lemma A1.

If

∥h_{i} (S_{i}, N_{i}, U_{i}, u)∥ < \infty

for all

t_{i}

, for some

p \geq 1

, then the following are true for all

0 \leq t_{i} \leq t_{k} \leq t_{m}

:

\begin{matrix} {∥B_{k} (S_{k}, N_{k}, U_{k})∥}_{Z_{i}} & < \infty \end{matrix}

(A5)

\begin{matrix} sup_{b} {∥{\hat{V}}_{k} (b, S_{k}, N_{k}, U_{k})∥}_{Z_{i}} & < \infty \end{matrix}

(A6)

\begin{matrix} sup_{b} {∥{\hat{v}}_{k} (b, S_{k}, N_{k}, U_{k})∥}_{Z_{i}} & < \infty \end{matrix}

(A7)

The proof of this lemma can be found in the appendix. A second preliminary result is

Lemma A2.

Let

a_{1}, \dots, a_{n}, b_{1}, \dots, b_{n}, c_{1}, \dots, c_{n}

be real numbers. Then,

A_{n} \equiv |max (a_{1} + b_{1}, \dots, a_{n} + b_{n}) - max (a_{1} + c_{1}, \dots, a_{n} + c_{n})| \leq 2 \sum_{i = 1}^{n} |b_{i} - c_{i}| \equiv B_{n} .

(A8)

Its proof can also be found in the appendix. We now prove Theorems 4 and 5.
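The inequality in Lemma A2 can be checked numerically on random inputs. The sketch below is only a sanity check and does not replace the proof. The function names, the sampling range, and the tolerance are assumptions, and the right-hand side sums over the n available terms.

```python
import random


def lemma_a2_sides(a, b, c):
    """Return (A_n, B_n) from Equation (A8) for equal-length sequences."""
    lhs = abs(max(x + y for x, y in zip(a, b)) - max(x + y for x, y in zip(a, c)))
    rhs = 2.0 * sum(abs(y - w) for y, w in zip(b, c))
    return lhs, rhs


def check_lemma_a2(trials, n, seed=0):
    """Check A_n <= B_n on random inputs; returns True if no violation is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = ([rng.uniform(-10, 10) for _ in range(n)] for _ in range(3))
        lhs, rhs = lemma_a2_sides(a, b, c)
        if lhs > rhs + 1e-12:
            return False
    return True
```

In the proof of Theorem 4 the role of the a terms is played by the exercise payoffs, while the b and c terms are the estimated and true hold values for each volume choice.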

Proof of Theorem 4.

Here we take

R = 1

and state that if the convergence holds for a single realization of the forest then it will hold for the mean of any number of realizations due to the independence of each repeated valuation. Here we prove by backward induction the more general statement

∥ {\hat{V}}_{i} (b, S_{i}, N_{i}, U_{i}) - B_{i} (S_{i}, N_{i}, U_{i}) ∥_{Z_{i}} \to 0

for any generic node in a given tree and for all

i = 0, \dots, m

. At expiry the relation holds trivially since at

t_{i} = t_{m}

we have that

{\hat{V}}_{m} (b, S_{m}, N_{m}, U_{m}) = B_{m} (S_{m}, N_{m}, U_{m})

. The inductive hypothesis is taken to be

∥ {\hat{V}}_{i + 1} (b, S_{i + 1}, N_{i + 1}, U_{i + 1}) - B_{i + 1} (S_{i + 1}, N_{i + 1}, U_{i + 1}) ∥_{Z_{i + 1}} \to 0

.

Now,

\begin{matrix} {∥{\hat{V}}_{i} (b, S_{i}, N_{i}, U_{i}) - B_{i} (S_{i}, N_{i}, U_{i})∥}_{Z_{i}} \\ = ∥max_{u \in U_{i}} [h_{i} (S_{i}, N_{i}, U_{i}, u) + \frac{1}{b} \sum_{j = 1}^{b} D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{j}, N_{i} - I_{{u \neq 0}}, U_{i} + u)] \\ - {max_{u \in U_{i}} [h_{i} (S_{i}, N_{i}, U_{i}, u) + H_{i} (S_{i}, N_{i} - I_{{u \neq 0}}, U_{i} + u)]∥}_{Z_{i}} \\ = ∥|max_{u \in U_{i}} [h_{i} (S_{i}, N_{i}, U_{i}, u) + \frac{1}{b} \sum_{j = 1}^{b} D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{j}, N_{i} - I_{{u \neq 0}}, U_{i} + u)] \\ - {max_{u \in U_{i}} [h_{i} (S_{i}, N_{i}, U_{i}, u) + H_{i} (S_{i}, N_{i} - I_{{u \neq 0}}, U_{i} + u)]|∥}_{Z_{i}} \\ \leq {∥2 \sum_{k = 0}^{z} |\frac{1}{b} \sum_{j = 1}^{b} D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{j}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k}) - H_{i} (S_{i}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k})|∥}_{Z_{i}} \\ \leq 2 \sum_{k = 0}^{z} {∥\frac{1}{b} \sum_{j = 1}^{b} D_{i + 1} {\hat{V}}_{i + 1} (b, S_{i + 1}^{j}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k}) - H_{i} (S_{i}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k})∥}_{Z_{i}} \\ \leq 2 \sum_{k = 0}^{z} {∥\frac{1}{b} \sum_{j = 1}^{b} D_{i + 1} [{\hat{V}}_{i + 1} (b, S_{i + 1}^{j}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k}) - B_{i + 1} (S_{i + 1}^{j}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k})]∥}_{Z_{i}} \\ + 2 \sum_{k = 0}^{z} {∥\frac{1}{b} \sum_{j = 1}^{b} D_{i + 1} B_{i + 1} (S_{i + 1}^{j}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k}) - H_{i} (S_{i}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k})∥}_{Z_{i}} \\ = 2 \sum_{k = 0}^{z} (E_{k} + C_{k}) \end{matrix}

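where the first equality comes from the definitions of the estimator and the true value, the third step is a result of Lemma A2, and the fourth step comes from a generalization of the triangle inequality. The fifth step adds and subtracts the average of the $D_{i + 1} B_{i + 1}$ terms and applies the triangle inequality once more. In the final step we rewrite the expression for convenience in what follows, writing $E_{k}$ for the k-th norm in the first sum and $C_{k}$ for the k-th norm in the second.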

First we deal with the

C_{k}

’s. Given

Z_{i}

we have that

D_{i + 1} B_{i + 1} (S_{i + 1}^{j}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k})

for

j = 1, \dots, b

and

k = 0, \dots, z

are iid with means of

H_{i} (S_{i}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k})

and finite p-norms. Then by Theorem I.4.1 of Gut (1988) we have that all

C_{k}

’s in the above expression go to zero.

Next we consider the

E_{k}

’s. Here we have, by the properties of p-norms and the fact that the terms being averaged are iid, that

E_{k} \leq {∥{\hat{V}}_{i + 1} (b, S_{i + 1}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k}) - B_{i + 1} (S_{i + 1}, N_{i} - I_{{u_{k} \neq 0}}, U_{i} + u_{k})∥}_{Z_{i}}

since

E_{k}

is bounded by the p-norm of any one of the terms being averaged. By the inductive hypothesis

{∥{\hat{V}}_{i + 1} (b, S_{i + 1}, N_{i + 1}, U_{i + 1}) - B_{i + 1} (S_{i + 1}, N_{i + 1}, U_{i + 1})∥}_{Z_{i + 1}} \to 0,

where

N_{i + 1} = N_{i} - I_{{u_{k} \neq 0}}

and

U_{i + 1} = U_{i} + u_{k}

.

Also by a standard condition for uniform integrability (see Gut (1988) p. 178) we have that

{∥{\hat{V}}_{i + 1} (b, S_{i + 1}, N_{i + 1}, U_{i + 1}) - B_{i + 1} (S_{i + 1}, N_{i + 1}, U_{i + 1})∥}_{Z_{i}} \to 0,

(A9)

provided

sup_{b} E [{|{\hat{V}}_{i + 1} (b, S_{i + 1}, N_{i + 1}, U_{i + 1}) - B_{i + 1} (S_{i + 1}, N_{i + 1}, U_{i + 1})|}^{p + ϵ} | Z_{i}] < \infty

for some

ϵ > 0

. From Lemma A1 we know that

sup_{b} E [{|{\hat{V}}_{i + 1} (b, S_{i + 1}, N_{i + 1}, U_{i + 1})|}^{p + ϵ} | Z_{i}] < \infty

and that

E [{|B_{i + 1} (S_{i + 1}, N_{i + 1}, U_{i + 1})|}^{p + ϵ} | Z_{i}] < \infty .

Thus (A9) holds for each

k = 0, \dots, z

and hence the result is proven. □
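The convergence statement of Theorem 4 can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a single remaining exercise right on a put, one time step to expiry, and geometric Brownian motion, so that the continuation value $H_{i}$ is available in closed form. The function names and all parameter values (`s`, `k`, `r`, `sigma`, `t`, the branching grid) are hypothetical choices made for this illustration only. The high estimator is the maximum of the immediate payoff and the branch average, and the script reports its empirical bias and 2-norm error against the true value as b grows.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(s, k, r, sigma, t):
    """Closed-form European put value, used here as the true continuation value."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s * norm_cdf(-d1)


def high_estimate(rng, b, s, k, r, sigma, t):
    """One-step high estimator: max(exercise payoff, average over b branches)."""
    disc = math.exp(-r * t)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(b):
        s_next = s * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += disc * max(k - s_next, 0.0)
    return max(max(k - s, 0.0), total / b)


def study_high(branchings=(4, 16, 64, 256), reps=1000, seed=1):
    """Return (b, empirical bias, empirical 2-norm error) for each branching factor."""
    rng = random.Random(seed)
    # Hypothetical parameters, chosen so exercise and continuation values are close.
    s, k, r, sigma, t = 36.0, 40.0, 0.05, 0.2, 1.0
    true_value = max(k - s, bs_put(s, k, r, sigma, t))
    rows = []
    for b in branchings:
        errs = [high_estimate(rng, b, s, k, r, sigma, t) - true_value
                for _ in range(reps)]
        bias = sum(errs) / reps
        l2 = math.sqrt(sum(e * e for e in errs) / reps)
        rows.append((b, bias, l2))
    return rows


if __name__ == "__main__":
    for b, bias, l2 in study_high():
        print(f"b={b:4d}  bias={bias:+.4f}  2-norm error={l2:.4f}")
```

Under these assumptions the bias should stay positive while the 2-norm error shrinks as b increases, which is the behaviour the theorem describes for p = 2.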

Proof of Theorem 5.

As with the proof of Theorem 4 we proceed by backward induction. Again at expiry the relation holds trivially since

{\hat{v}}_{m} (b, S_{m}, N_{m}, U_{m}) = B_{m} (S_{m}, N_{m}, U_{m})

. The inductive hypothesis is taken to be

∥ {\hat{v}}_{i + 1} (b, S_{i + 1}, N_{i + 1}, U_{i + 1}) - B_{i + 1} (S_{i + 1}, N_{i + 1}, U_{i + 1}) ∥_{Z_{i + 1}} \to 0

.

Let

{\hat{g}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u)

be as defined at the start of Appendix A and note that, with probability one,

\begin{matrix} h_{i} (S_{i}^{j}, N_{i}, U_{i}, u^{1}) + H_{i} (S_{i}^{j}, N_{i} - I_{{u^{1} \neq 0}}, U_{i} + u^{1}) \\ \neq h_{i} (S_{i}^{j}, N_{i}, U_{i}, u^{2}) + H_{i} (S_{i}^{j}, N_{i} - I_{{u^{2} \neq 0}}, U_{i} + u^{2}), \end{matrix}

for all

u^{1}, u^{2} \in U_{i}

,

u^{1} \neq u^{2}

.

Before proceeding, we stop to make three claims:

(i) ${∥\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) - H_{i} (S_{i}^{j}, N_{i} - I_{{u \neq 0}}, U_{i} + u)∥}_{Z_{i}} \to 0$
(ii) ${∥{\hat{g}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u) - [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + H_{i} (S_{i}^{j}, N_{i} - I_{{u \neq 0}}, U_{i} + u)]∥}_{Z_{i}} \to 0$
(iii) ${∥I_{{{\hat{u}}_{l}^{*} = u}} - I_{{u^{*} = u}}∥}_{Z_{i}} \to 0$
for all $u \in U_{i}$, where

$\begin{matrix} {\hat{u}}_{l}^{*} = arg max_{u \in U_{i}} [{\hat{g}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u)] \text{ and } \\ u^{*} = arg max_{u \in U_{i}} [h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + H_{i} (S_{i}^{j}, N_{i} - I_{{u \neq 0}}, U_{i} + u)] \end{matrix}$

The proof of item (i) is the same as the proof of the corresponding step in Theorem 4. Since the estimators in (i) and (ii) differ only in the omission of one term in

{\hat{g}}_{i l}

, similar arguments prove that (ii) also holds.

Now for (iii), if

{\hat{u}}_{l}^{*} = u^{*}

then the result holds trivially. Now suppose that

{\hat{u}}_{l}^{*} = u \neq u^{*}

for some

u \in U_{i}

. Then,

\begin{matrix} {∥I_{{{\hat{u}}_{l}^{*} = u}} - I_{{u^{*} = u}}∥}_{Z_{i}} = {∥I_{{{\hat{u}}_{l}^{*} = u}}∥}_{Z_{i}} = {[P ({\hat{u}}_{l}^{*} = u | Z_{i})]}^{\frac{1}{p}} \\ = {[P ({\hat{g}}_{i l} (b, S_{i}^{j}, N_{i}, U_{i}, u) \geq h_{i} (S_{i}^{j}, N_{i}, U_{i}, u) + H_{i} (S_{i}^{j}, N_{i} - I_{{u \neq 0}}, U_{i} + u))]}^{1 / p} \\ \to 0 . \end{matrix}

The limit follows because (ii) holds and convergence in p-norm implies convergence in probability. Thus (iii) is proven.

Now proceeding from the definition of the low estimator and the true option value for all

u \in U_{i}

,

\begin{matrix} {∥{\hat{v}}_{i} (b, S_{i}, N_{i}, U_{i}) - B_{i} (S_{i}, N_{i}, U_{i})∥}_{Z_{i}} \\ = {∥\frac{1}{b} \sum_{l = 1}^{b} {\hat{v}}_{i l} (b, S_{i}, N_{i}, U_{i}) - B_{i} (S_{i}, N_{i}, U_{i})∥}_{Z_{i}} \\ = ∥\frac{1}{b} \sum_{l = 1}^{b} (D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i}, U_{i}) I_{{{\hat{u}}_{l}^{*} = 0}} \\ + [h_{i} (S_{i}, N_{i}, U_{i}, u_{1}) + D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - 1, U_{i} + u_{1})] I_{{{\hat{u}}_{l}^{*} = u_{1}}} \\ + \dots + {[h_{i} (S_{i}, N_{i}, U_{i}, u_{z}) + D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - 1, U_{i} + u_{z})] I_{{{\hat{u}}_{l}^{*} = u_{z}}}) - B_{i} (S_{i}, N_{i}, U_{i})∥}_{Z_{i}} \\ \leq {∥\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i}, U_{i}) I_{{{\hat{u}}_{l}^{*} = 0}} - H_{i} (S_{i}, N_{i}, U_{i}) I_{{u^{*} = 0}}∥}_{Z_{i}} \\ + {∥\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - 1, U_{i} + u_{1}) I_{{{\hat{u}}_{l}^{*} = u_{1}}} - H_{i} (S_{i}, N_{i} - 1, U_{i} + u_{1}) I_{{u^{*} = u_{1}}}∥}_{Z_{i}} \\ + \dots + {∥\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - 1, U_{i} + u_{z}) I_{{{\hat{u}}_{l}^{*} = u_{z}}} - H_{i} (S_{i}, N_{i} - 1, U_{i} + u_{z}) I_{{u^{*} = u_{z}}}∥}_{Z_{i}} \\ + {∥h_{i} (S_{i}, N_{i}, U_{i}, u_{1}) I_{{{\hat{u}}_{l}^{*} = u_{1}}} - h_{i} (S_{i}, N_{i}, U_{i}, u_{1}) I_{{u^{*} = u_{1}}}∥}_{Z_{i}} \\ + \dots + {∥h_{i} (S_{i}, N_{i}, U_{i}, u_{z}) I_{{{\hat{u}}_{l}^{*} = u_{z}}} - h_{i} (S_{i}, N_{i}, U_{i}, u_{z}) I_{{u^{*} = u_{z}}}∥}_{Z_{i}} \end{matrix}

(A10)

where the inequality in the third step is due to a generalization of the triangle inequality.

The immediate consequence of claim (iii) above is that all terms in Equation (A10) with the form

{∥h_{i} (S_{i}, N_{i}, U_{i}, u) I_{{{\hat{u}}_{l}^{*} = u}} - h_{i} (S_{i}, N_{i}, U_{i}, u) I_{{u^{*} = u}}∥}_{Z_{i}} \to 0

for all

u \in U_{i}

. Thus

\sum_{k = 0}^{z} {∥h_{i} (S_{i}, N_{i}, U_{i}, u_{k}) I_{{{\hat{u}}_{l}^{*} = u_{k}}} - h_{i} (S_{i}, N_{i}, U_{i}, u_{k}) I_{{u^{*} = u_{k}}}∥}_{Z_{i}} \to 0 .

It remains to show that the other terms in (A10) converge to zero in the p-norm. Take any one of these terms, that is, fix a

u \in U_{i}

; we now show that the corresponding term converges to zero in the p-norm.

\begin{matrix} {∥\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{{\hat{u}}_{l}^{*} = u}} - H_{i} (S_{i}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{u^{*} = u}}∥}_{Z_{i}} \\ = ∥\frac{1}{b} \sum_{l = 1}^{b} [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{{\hat{u}}_{l}^{*} = u}} - D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{u^{*} = u}}] \\ + {\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{u^{*} = u}} - H_{i} (S_{i}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{u^{*} = u}}∥}_{Z_{i}} \\ \leq {∥\frac{1}{b} \sum_{l = 1}^{b} [D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{{\hat{u}}_{l}^{*} = u}} - D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{u^{*} = u}}]∥}_{Z_{i}} \\ + {∥\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{u^{*} = u}} - H_{i} (S_{i}, N_{i} - I_{{u \neq 0}}, U_{i} + u) I_{{u^{*} = u}}∥}_{Z_{i}} \\ \leq {∥D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u)∥}_{Z_{i}} \cdot {∥I_{{{\hat{u}}_{l}^{*} = u}} - I_{{u^{*} = u}}∥}_{Z_{i}} \\ + {∥I_{{u^{*} = u}}∥}_{Z_{i}} \cdot {∥\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) - H_{i} (S_{i}, N_{i} - I_{{u \neq 0}}, U_{i} + u)∥}_{Z_{i}}, \end{matrix}

where the first step comes from adding and subtracting the same term, the second comes from applying the triangle inequality and the third step comes from factoring out common terms.

Now by (iii),

{∥I_{{{\hat{u}}_{l}^{*} = u}} - I_{{u^{*} = u}}∥}_{Z_{i}} \to 0,

by (i),

{∥\frac{1}{b} \sum_{l = 1}^{b} D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u) - H_{i} (S_{i}, N_{i} - I_{{u \neq 0}}, U_{i} + u)∥}_{Z_{i}} \to 0,

and we note that

\begin{matrix} {∥I_{{u^{*} = u}}∥}_{Z_{i}} < \infty and \\ {∥D_{i + 1} {\hat{v}}_{i + 1, l} (b, S_{i + 1}^{l}, N_{i} - I_{{u \neq 0}}, U_{i} + u)∥}_{Z_{i}} < \infty, \end{matrix}

by (A7).

Hence we have proven the consistency of the low-biased estimator. □
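The mechanics behind claim (iii) and the low estimator can be sketched in the same one-step setting. This is an illustrative sketch under stated assumptions, not the paper's code: a single exercise right on a put, one step to expiry, geometric Brownian motion, and hypothetical parameter values and function names. For each branch l the exercise decision uses the average of the other b - 1 branches (mirroring the omission of one term in the decision estimate), and the value is then read off branch l itself. Ties are resolved in favour of exercise, which is an arbitrary choice here since ties occur with probability zero.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(s, k, r, sigma, t):
    """Closed-form European put value, used here as the true continuation value."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s * norm_cdf(-d1)


def low_estimate(rng, b, s, k, r, sigma, t):
    """One-step low estimator; returns (estimate, fraction of exercise decisions)."""
    h = max(k - s, 0.0)
    disc = math.exp(-r * t)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    xs = [disc * max(k - s * math.exp(drift + vol * rng.gauss(0.0, 1.0)), 0.0)
          for _ in range(b)]
    total = sum(xs)
    value = 0.0
    exercised = 0
    for x in xs:
        others = (total - x) / (b - 1)  # decision uses every branch except this one
        if h >= others:
            value += h                  # exercise now
            exercised += 1
        else:
            value += x                  # hold, valued on the left-out branch
    return value / b, exercised / b


def study_low(branchings=(4, 16, 64, 256), reps=1000, seed=2):
    """Return (b, bias, 2-norm error, exercise frequency) for each branching factor."""
    rng = random.Random(seed)
    # Hypothetical parameters for which holding is the optimal action.
    s, k, r, sigma, t = 38.0, 40.0, 0.05, 0.2, 1.0
    true_value = max(k - s, bs_put(s, k, r, sigma, t))
    rows = []
    for b in branchings:
        errs, rates = [], []
        for _ in range(reps):
            v, rate = low_estimate(rng, b, s, k, r, sigma, t)
            errs.append(v - true_value)
            rates.append(rate)
        bias = sum(errs) / reps
        l2 = math.sqrt(sum(e * e for e in errs) / reps)
        rows.append((b, bias, l2, sum(rates) / reps))
    return rows


if __name__ == "__main__":
    for b, bias, l2, rate in study_low():
        print(f"b={b:4d}  bias={bias:+.4f}  2-norm error={l2:.4f}  "
              f"wrong-decision rate={rate:.4f}")
```

With these parameters the optimal action is to hold, so the reported exercise frequency is the empirical probability that the estimated action differs from the optimal one. It should fall towards zero as b grows, in line with claim (iii), while the estimator's bias is negative for small b.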

Appendix B.2. Lemma Proofs

Proof of Lemma A1.

If every

h_{i} (S_{i}, N_{i}, U_{i}, u)

has finite p-th moment, then each

{∥h_{k} (S_{k}, N_{k}, U_{k}, u)∥}_{Z_{i}}

is finite. Since the max, discounting, and conditional expectation operators preserve finiteness of moments, it follows that

{∥B_{k} (S_{k}, N_{k}, U_{k})∥}_{Z_{i}}

and

{∥H_{k} (S_{k}, N_{k}, U_{k})∥}_{Z_{i}}

are also finite.

Proceeding to (A6), fix

t_{i}

and proceed by backward induction on

t_{k}

from

t_{m}

to

t_{i}

. At expiry (A6) follows from (A5). Then for

t_{k} < t_{m}

,

\begin{matrix} sup_{b} {∥{\hat{V}}_{k} (b, S_{k}, N_{k}, U_{k})∥}_{Z_{i}} = sup_{b} {∥max_{u \in U_{k}} [h_{k} (S_{k}, N_{k}, U_{k}, u) + {\hat{H}}_{k} (b, S_{k}, N_{k} - I_{{u \neq 0}}, U_{k} + u)]∥}_{Z_{i}} \\ = sup_{b} {∥max_{u \in U_{k}} [h_{k} (S_{k}, N_{k}, U_{k}, u) + \frac{1}{b} \sum_{j = 1}^{b} D_{k + 1} {\hat{V}}_{k + 1} (b, S_{k + 1}^{j}, N_{k} - I_{{u \neq 0}}, U_{k} + u)]∥}_{Z_{i}} \\ \leq {∥h_{k} (S_{k}, N_{k}, U_{k}, 0)∥}_{Z_{i}} + sup_{b} {∥\frac{1}{b} \sum_{j = 1}^{b} D_{k + 1} {\hat{V}}_{k + 1} (b, S_{k + 1}^{j}, N_{k}, U_{k})∥}_{Z_{i}} \\ + {∥h_{k} (S_{k}, N_{k}, U_{k}, u_{1})∥}_{Z_{i}} + sup_{b} {∥\frac{1}{b} \sum_{j = 1}^{b} D_{k + 1} {\hat{V}}_{k + 1} (b, S_{k + 1}^{j}, N_{k} - 1, U_{k} + u_{1})∥}_{Z_{i}} \\ + \dots + {∥h_{k} (S_{k}, N_{k}, U_{k}, u_{r})∥}_{Z_{i}} + sup_{b} {∥\frac{1}{b} \sum_{j = 1}^{b} D_{k + 1} {\hat{V}}_{k + 1} (b, S_{k + 1}^{j}, N_{k} - 1, U_{k} + u_{r})∥}_{Z_{i}} \\ \leq sup_{b} {∥{\hat{V}}_{k + 1} (b, S_{k + 1}, N_{k}, U_{k})∥}_{Z_{i}} + {∥h_{k} (S_{k}, N_{k}, U_{k}, u_{1})∥}_{Z_{i}} + sup_{b} {∥{\hat{V}}_{k + 1} (b, S_{k + 1}, N_{k} - 1, U_{k} + u_{1})∥}_{Z_{i}} \\ + \dots + {∥h_{k} (S_{k}, N_{k}, U_{k}, u_{r})∥}_{Z_{i}} + sup_{b} {∥{\hat{V}}_{k + 1} (b, S_{k + 1}, N_{k} - 1, U_{k} + u_{r})∥}_{Z_{i}}, \end{matrix}

where

h_{k} (S_{k}, N_{k}, U_{k}, 0) = 0

. This is the sum of a finite number of terms, each of which is finite.

For (iii) the proof is similar to that of (ii). □

Proof of Lemma A2.

In order to prove Lemma A2 we proceed by induction by considering the cases for

n = 1

and

n = 2

. For

n = 1

,

A_{1} = |max (a_{1} + b_{1}) - max (a_{1} + c_{1})| = |b_{1} - c_{1}| \leq 2 |b_{1} - c_{1}| = B_{1}

therefore

A_{1} \leq B_{1}

. Now, for

n = 2

A_{2} = |max (a_{1} + b_{1}, a_{2} + b_{2}) - max (a_{1} + c_{1}, a_{2} + c_{2})|

Consider the following,

(i) $a_{1} + b_{1} > a_{2} + b_{2}$
(a)
$a_{1} + c_{1} > a_{2} + c_{2}$
Then

$A_{2} = | a_{1} + b_{1} - a_{1} - c_{1} | = | b_{1} - c_{1} | \leq B_{2}$

(b)
$a_{1} + c_{1} < a_{2} + c_{2}$
Note that conditions (i) and (b) imply that

$b_{2} - b_{1} < a_{1} - a_{2} < c_{2} - c_{1}$

(A11)

and we have

$\begin{matrix} A_{2} & = | a_{1} + b_{1} - a_{2} - c_{2} | \\ = | a_{1} + b_{1} - c_{1} + c_{1} - a_{2} - c_{2} | \\ \leq | b_{1} - c_{1} | + | (a_{1} - a_{2}) - (c_{2} - c_{1}) | \\ \leq | b_{1} - c_{1} | + | (b_{2} - b_{1}) - (c_{2} - c_{1}) | \\ = | b_{1} - c_{1} | + | b_{2} - c_{2} - (b_{1} - c_{1}) | \\ \leq 2 | b_{1} - c_{1} | + | b_{2} - c_{2} | \\ \leq 2 | b_{1} - c_{1} | + 2 | b_{2} - c_{2} | \\ = B_{2} \end{matrix}$

where the first inequality comes from the triangle inequality, the second comes from Inequality (A11) and the third inequality comes from another application of the triangle inequality.
(ii) $a_{2} + b_{2} > a_{1} + b_{1}$
(a)
$a_{2} + c_{2} > a_{1} + c_{1}$
Then

$A_{2} = | a_{2} + b_{2} - a_{2} - c_{2} | = | b_{2} - c_{2} | \leq B_{2}$

(b)
$a_{2} + c_{2} < a_{1} + c_{1}$
Note that conditions (ii) and (b) imply that

$b_{1} - b_{2} < a_{2} - a_{1} < c_{1} - c_{2}$

(A12)

and we have

$\begin{matrix} A_{2} & = | a_{2} + b_{2} - a_{1} - c_{1} | \\ = | a_{2} + b_{2} - c_{2} + c_{2} - a_{1} - c_{1} | \\ \leq | b_{2} - c_{2} | + | (a_{2} - a_{1}) - (c_{1} - c_{2}) | \\ \leq | b_{2} - c_{2} | + | (b_{1} - b_{2}) - (c_{1} - c_{2}) | \\ = | b_{2} - c_{2} | + | b_{1} - c_{1} - (b_{2} - c_{2}) | \\ \leq 2 | b_{2} - c_{2} | + | b_{1} - c_{1} | \\ \leq 2 | b_{2} - c_{2} | + 2 | b_{1} - c_{1} | \\ = B_{2} \end{matrix}$

where again the first inequality comes from the triangle inequality, the second comes from Inequality (A12) and the third inequality comes from another application of the triangle inequality.

Therefore

A_{2} \leq B_{2}

.

Now assume that the inductive hypothesis

A_{n} \leq B_{n}

is true. We need to show that

A_{n + 1} \leq B_{n + 1}

. First define

i_{n}

and

j_{n}

such that

a_{i_{n}} + b_{i_{n}} = max (a_{1} + b_{1}, \dots, a_{n} + b_{n})

and

a_{j_{n}} + c_{j_{n}} = max (a_{1} + c_{1}, \dots, a_{n} + c_{n})

respectively. Now,

\begin{matrix} A_{n + 1} = |max (a_{1} + b_{1}, \dots, a_{n} + b_{n}, a_{n + 1} + b_{n + 1}) - max (a_{1} + c_{1}, \dots, a_{n} + c_{n}, a_{n + 1} + c_{n + 1})| \\ = |max (a_{i_{n}} + b_{i_{n}}, a_{n + 1} + b_{n + 1}) - max (a_{j_{n}} + c_{j_{n}}, a_{n + 1} + c_{n + 1})| \end{matrix}

Consider the following,

(i) $a_{i_{n}} + b_{i_{n}} > a_{n + 1} + b_{n + 1}$
(a)
$a_{j_{n}} + c_{j_{n}} > a_{n + 1} + c_{n + 1}$

$\begin{matrix} A_{n + 1} & = | a_{i_{n}} + b_{i_{n}} - a_{j_{n}} - c_{j_{n}} | \\ \leq 2 \sum_{i = 1}^{n} | b_{i} - c_{i} | \\ \leq 2 \sum_{i = 1}^{n + 1} | b_{i} - c_{i} | \\ = B_{n + 1} \end{matrix}$

where the first inequality comes from the inductive hypothesis.
(b)
$a_{j_{n}} + c_{j_{n}} < a_{n + 1} + c_{n + 1}$
By the definitions of $i_{n}$ and $j_{n}$ and (b) we have

$a_{i_{n}} + c_{i_{n}} \leq a_{j_{n}} + c_{j_{n}} < a_{n + 1} + c_{n + 1} .$

This combined with (i) gives

$b_{n + 1} - b_{i_{n}} < a_{i_{n}} - a_{n + 1} < c_{n + 1} - c_{i_{n}}$

(A13)

Then

$\begin{matrix} A_{n + 1} & = | a_{i_{n}} + b_{i_{n}} - a_{n + 1} - c_{n + 1} | \\ = | a_{i_{n}} + b_{i_{n}} - c_{i_{n}} + c_{i_{n}} - a_{n + 1} - c_{n + 1} | \\ \leq | b_{i_{n}} - c_{i_{n}} | + | (a_{i_{n}} - a_{n + 1}) - (c_{n + 1} - c_{i_{n}}) | \\ \leq | b_{i_{n}} - c_{i_{n}} | + | (b_{n + 1} - b_{i_{n}}) - (c_{n + 1} - c_{i_{n}}) | \\ = | b_{i_{n}} - c_{i_{n}} | + | b_{n + 1} - c_{n + 1} - (b_{i_{n}} - c_{i_{n}}) | \\ \leq 2 | b_{i_{n}} - c_{i_{n}} | + | b_{n + 1} - c_{n + 1} | \\ \leq B_{n + 1} \end{matrix}$

where again the first inequality comes from the triangle inequality, the second comes from Inequality (A13) and the third inequality comes from another application of the triangle inequality.
(ii) $a_{i_{n}} + b_{i_{n}} < a_{n + 1} + b_{n + 1}$
(a)
$a_{j_{n}} + c_{j_{n}} < a_{n + 1} + c_{n + 1}$

$\begin{matrix} A_{n + 1} & = | a_{n + 1} + b_{n + 1} - a_{n + 1} - c_{n + 1} | \\ = | b_{n + 1} - c_{n + 1} | \\ \leq 2 \sum_{i = 1}^{n + 1} | b_{i} - c_{i} | \\ = B_{n + 1} \end{matrix}$

(b)
$a_{j_{n}} + c_{j_{n}} > a_{n + 1} + c_{n + 1}$
By the definitions of $i_{n}$ and $j_{n}$ and (ii) we have

$a_{j_{n}} + b_{j_{n}} \leq a_{i_{n}} + b_{i_{n}} < a_{n + 1} + b_{n + 1} .$

This combined with (b) gives

$b_{j_{n}} - b_{n + 1} < a_{n + 1} - a_{j_{n}} < c_{j_{n}} - c_{n + 1}$

(A14)

Then

$\begin{matrix} A_{n + 1} & = | a_{n + 1} + b_{n + 1} - a_{j_{n}} - c_{j_{n}} | \\ = | a_{n + 1} + b_{n + 1} - c_{n + 1} + c_{n + 1} - a_{j_{n}} - c_{j_{n}} | \\ \leq | b_{n + 1} - c_{n + 1} | + | (a_{n + 1} - a_{j_{n}}) - (c_{j_{n}} - c_{n + 1}) | \\ \leq | b_{n + 1} - c_{n + 1} | + | (b_{j_{n}} - b_{n + 1}) - (c_{j_{n}} - c_{n + 1}) | \\ = | b_{n + 1} - c_{n + 1} | + | b_{j_{n}} - c_{j_{n}} - (b_{n + 1} - c_{n + 1}) | \\ \leq 2 | b_{n + 1} - c_{n + 1} | + | b_{j_{n}} - c_{j_{n}} | \\ \leq B_{n + 1} \end{matrix}$

where again the first inequality comes from the triangle inequality, the second comes from Inequality (A14) and the third inequality comes from another application of the triangle inequality.

Therefore

A_{n + 1} \leq B_{n + 1}

and the Lemma is proven. □
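The inequality of Lemma A2, as used in the proof above with $A_{n} = |max_{i} (a_{i} + b_{i}) - max_{i} (a_{i} + c_{i})|$ and $B_{n} = 2 \sum_{i = 1}^{n} |b_{i} - c_{i}|$, is easy to spot-check numerically. The sketch below is only a sanity check on random inputs, not a substitute for the proof. The function names, the sampling range, and the trial count are hypothetical choices.

```python
import random


def lemma_a2_sides(a, b, c):
    """Return (A_n, B_n) for equal-length sequences a, b, c."""
    lhs = abs(max(x + y for x, y in zip(a, b)) - max(x + z for x, z in zip(a, c)))
    rhs = 2.0 * sum(abs(y - z) for y, z in zip(b, c))
    return lhs, rhs


def check_lemma_a2(trials=10000, max_n=6, seed=0):
    """Check A_n <= B_n on random inputs and return the largest observed A_n / B_n."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        n = rng.randint(1, max_n)
        a = [rng.uniform(-10.0, 10.0) for _ in range(n)]
        b = [rng.uniform(-10.0, 10.0) for _ in range(n)]
        c = [rng.uniform(-10.0, 10.0) for _ in range(n)]
        lhs, rhs = lemma_a2_sides(a, b, c)
        assert lhs <= rhs + 1e-12, (a, b, c)
        if rhs > 0.0:
            worst = max(worst, lhs / rhs)
    return worst


if __name__ == "__main__":
    print("largest observed A_n / B_n:", check_lemma_a2())
```

The observed ratio should not exceed one half, up to rounding. This is consistent with the elementary bound $|max_{i} x_{i} - max_{i} y_{i}| \leq max_{i} |x_{i} - y_{i}|$ and suggests that the factor 2 in the lemma is conservative, although it is all the proofs above require.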

References

Andersen, Leif, and Mark Broadie. 2004. Primal-Dual Simulation Algorithm for Pricing Multidimensional American Options. Management Science 50: 1222–34.
Bally, Vlad, Gilles Pagès, and Jacques Printems. 2005. A Quantization Tree Method for Pricing and Hedging Multidimensional American Options. Mathematical Finance 15: 119–68.
Barraquand, Jerome, and Didier Martineau. 1995. Numerical Valuation of High Dimensional Multivariate American Securities. Journal of Financial and Quantitative Analysis 30: 383–405.
Barrera-Esteve, Christophe, Florent Bergeret, Charles Dossal, Emmanuel Gobet, Asma Meziou, Remi Munos, and Damien Reboul-Salze. 2006. Numerical Methods for the Pricing of Swing Options: A Stochastic Control Approach. Methodology and Computing in Applied Probability 8: 517–40.
Ben Latifa, Imene, Joseph Frederic Bonnans, and Mohamed Mnif. 2016. Numerical methods for an optimal multiple stopping problem. Stochastics and Dynamics 16.
Bender, Christian. 2011. Dual pricing of multi-exercise options under volume constraints. Finance and Stochastics 15: 1–26.
Bender, Christian, and John Schoenmakers. 2006. An Iterative Algorithm for Multiple Stopping: Convergence and Stability. Advances in Applied Probability 38: 729–49.
Broadie, Mark, and Paul Glasserman. 1997. Pricing American-style securities using simulation. Journal of Economic Dynamics and Control 21: 1323–52.
Broadie, Mark, and Paul Glasserman. 2004. A stochastic mesh method for pricing high-dimensional American options. The Journal of Computational Finance 7: 35–72.
Calvo-Garrido, Maria del Carmen, Matthias Ehrhardt, and Carlos Vázquez. 2017. Pricing swing options in electricity markets with two stochastic factors using a partial differential equation approach. The Journal of Computational Finance 20: 81–107.
Carriere, Jacques. 1996. Valuation of Early-Exercise Price of Options Using Simulations and Nonparametric Regression. Insurance: Mathematics and Economics 19: 19–30.
Chandramouli, Shyam S., and Martin B. Haugh. 2012. A unified approach to multiple stopping and duality. Operations Research Letters 40: 258–64.
Chen, Zhuliang, and Peter A. Forsyth. 2007. A semi-Lagrangian approach for natural gas storage valuation and optimal operation. SIAM Journal on Scientific Computing 30: 339–68.
Clément, Emmanuelle, Damien Lamberton, and Philip Protter. 2001. An analysis of a least-squares regression algorithm for American option pricing. Finance and Stochastics 6: 449–71.
Cox, John C., Stephen A. Ross, and Mark Rubinstein. 1979. Option Pricing: A Simplified Approach. Journal of Financial Economics 7: 229–63.
Deng, Shijie, and Shmuel S. Oren. 2006. Electricity derivatives and risk management. Energy 31: 940–53.
Gut, Allan. 1988. Stopped Random Walks. New York: Springer.
Gyurkó, Lajos Gergely, Ben M. Hambly, and Jan Hendrick Witte. 2015. Monte Carlo methods via a dual approach for some discrete time stochastic control problems. Mathematical Methods of Operations Research 81: 109–35.
Haugh, Martin B., and Leonid Kogan. 2004. Pricing American Options: A Duality Approach. Operations Research 52: 258–70.
Ibáñez, Alfredo. 1996. Valuation by Simulation of Contingent Claims with Multiple Early Exercise Opportunities. Mathematical Finance 19: 19–30.
Jaillet, Patrick, Ehud I. Ronn, and Stathis Tompaidis. 2004. Valuation of Commodity-Based Swing Options. Management Science 50: 909–21.
Kim, Dong-Hyun, Eul-Bum Lee, In-Hyeo Jung, and Douglas Alleman. 2019. The Efficacy of the Tolling Model’s Ability to Improve Project Profitability on International Steel Plants. Energies 12: 1221.
Dahlgren, Martin, and Ralf Korn. 2005. The Swing Option on the Stock Market. The International Journal of Theoretical and Applied Finance 8: 123–29.
Lari-Lavassani, Ali, Mohamadreza Simchi, and Antony Ware. 2001. A Discrete Valuation of Swing Options. Canadian Applied Mathematics Quarterly 9: 35–73.
Longstaff, Francis A., and Eduardo S. Schwartz. 2001. Valuing American Options by Simulation: A Simple Least-squares Approach. The Review of Financial Studies 14: 113–47.
Ludkovski, Michael, and Rene Carmona. 2010. Valuation of energy storage: An optimal switching approach. Quantitative Finance 10: 359–74.
Marshall, T. James. 2012. Valuation of Multiple Exercise Options. Ph.D. thesis, Western University, London, ON, Canada.
Marshall, T. James, and R. Mark Reesor. 2011. Forest of Stochastic Meshes: A Method for Valuing High Dimensional Swing Options. Operations Research Letters 39: 17–21.
Marshall, T. James, R. Mark Reesor, and Matthew Cox. 2011. Simulation Valuation of Multiple Exercise Options. Paper presented at the 2011 Winter Simulation Conference, Phoenix, AZ, USA, December 11–14.
Meinshausen, Nicolai, and Ben M. Hambly. 2004. Monte Carlo Methods For the Valuation of Multiple-Exercise Options. Mathematical Finance 14: 557–83.
Stentoft, Lars. 2004. Assessing the Least-Squares Monte-Carlo Approach to American Option Valuation. Review of Derivatives Research 7: 129–68.
Thompson, Matt, Matt Davison, and Henning Rasmussen. 2009. Natural gas storage valuation and optimization: A real options application. Naval Research Logistics 56: 226–38.
Tilley, James A. 1993. Valuing American Options in a Path Simulation Model. Transactions of the Society of Actuaries 45: 83–104.
Whitehead, Tyson, R. Mark Reesor, and Matt Davison. 2012. A bias-reduction technique for Monte Carlo pricing of early-exercise options. Journal of Computational Finance 15.
Wilhelm, Martina, and Christoph Winter. 2008. Finite element valuation of swing options. Journal of Computational Finance 11: 107–32.

Figure 1. Section of a Forest of Trees with $N$ = number of exercise rights remaining, $U$ = usage level, and three exercise choices: no exercise, exercise $u_{1}$ units, and exercise $u_{2}$ units.

Figure 2. Option value estimates (USD) vs. log branching factor (b) with a single underlying asset. The option has one up and one down swing right, 3 exercise opportunities, exercise volume of 60 units of the underlying and there is no usage penalty. The initial price is USD 40. The number of repeated valuations $R = 32000 (\frac{10}{b})$ results in standard errors ≈ 0.07% of option value.
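The repetition count in the Figure 2 caption shrinks in proportion to the branching factor. One plausible reading, offered here as an assumption rather than something the caption states, is that the variance of a single valuation falls roughly like 1/b, so holding b times R fixed keeps the standard error of the averaged estimate roughly level across the plotted points. The short sketch below evaluates the schedule and shows how a standard error would be computed from repeated valuations. The grid of branching factors and both function names are hypothetical.

```python
import math


def repetitions(b, base_reps=32000, base_b=10):
    """Number of repeated valuations R = 32000 * (10 / b), rounded to an integer."""
    return round(base_reps * base_b / b)


def standard_error(estimates):
    """Sample standard deviation of the repeated valuations divided by sqrt(R)."""
    n = len(estimates)
    mean = sum(estimates) / n
    var = sum((x - mean) ** 2 for x in estimates) / (n - 1)
    return math.sqrt(var / n)


if __name__ == "__main__":
    for b in (10, 20, 40, 80, 160):  # hypothetical grid of branching factors
        r = repetitions(b)
        print(f"b={b:4d}  R={r:6d}  b*R={b * r}")
```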

Figure 3. Basket of American calls and puts and swing option values versus the number of exercise rights using a single underlying asset. Parameter values used are exercise volume of 60 units, $S_{0} = 40$, $b = 20$, $R = 4000$, $m = 5$, and no usage penalty.

Figure 4. Option-value estimates versus (log) branching factor. Approximate pointwise 95% confidence intervals for each estimate are given by the vertical bars. The option and underlying model in this example are from Jaillet et al. (2004), with the Trinomial price given by their Forest of Trinomial Trees method.

Figure 5. Option value estimates (USD) vs. log branching factor (b) with a five-dimensional underlying. The option has one up and one down swing right, three exercise opportunities, exercise volume of 60 units and there is no usage penalty. The number of repeated valuations $R = 32000 (\frac{10}{b})$ results in standard errors ≈ 0.09% of option value.

Figure 6. Basket of American calls and puts and swing option values versus the number of exercise rights using a 5-dimensional underlying asset. Parameter values used are exercise volume of 60 units, $S_{0} = 40$, $b = 20$, $R = 4000$, $m = 5$, and no usage penalty.

Figure 7. Normalized runtime using MPI versus number of CPUs ($n_{p}$). The option is identical to the no volume choice swing option with a five-dimensional underlying considered in Section 2.4.3. The branching factor used is $b = 160$.

Table 1. Table showing that the Forest of Stochastic Trees extends (i) the Forest of Trees method to a high-dimensional underlying; and (ii) the Stochastic Tree method to multiple exercise options.

	1-Dimensional Asset	High-Dimensional Asset
Single exercise American option	Binomial Tree	Stochastic Tree
	Cox et al. (1979)	Broadie and Glasserman (1997)
Multiple exercise option	Forest of Trees	Forest of Stochastic Trees
	Lari-Lavassani et al. (2001); Jaillet et al. (2004)	Marshall and Reesor

Table 2. Swing option values as a function of initial asset price and usage penalty with a single underlying asset. Parameter values used are $N_{u} = N_{d} = 2$, $U_{i} = \{20, 40, 60\}$, $b = 20$, $R = 4000$, $m = 5$, $U_{\min} = -90$, and $U_{\max} = 90$.

$S_{0}$	Penalty	High	Error	Low	Error	Binomial
60	ON	2271.153	1.418	2240.319	1.378	2259.845
60	OFF	2422.781	1.576	2392.872	1.523	2411.844
50	ON	1445.468	0.844	1408.843	0.904	1429.645
50	OFF	1542.053	0.978	1503.963	0.980	1526.055
40	ON	1018.104	0.859	968.793	1.044	989.651
40	OFF	1156.591	0.911	1134.093	0.903	1145.801
30	ON	1345.556	1.205	1309.214	1.280	1326.266
30	OFF	1562.347	1.316	1532.854	1.343	1546.055
20	ON	2189.531	0.905	2147.623	1.018	2157.976
20	OFF	2443.877	0.924	2402.192	1.034	2412.354
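The rows of Table 2 can be combined into a conservative interval for the option value by widening the low estimate downward and the high estimate upward. The sketch below does this with the numbers from the table. It assumes that the two Error columns are standard errors of the High and Low estimates and uses z = 1.96 for an approximate 95% level; both are assumptions of this illustration, as are the function names. It then checks whether the binomial benchmark in the last column falls inside each interval.

```python
# Rows of Table 2: (S0, penalty, high, high_error, low, low_error, binomial)
TABLE_2 = [
    (60, "ON", 2271.153, 1.418, 2240.319, 1.378, 2259.845),
    (60, "OFF", 2422.781, 1.576, 2392.872, 1.523, 2411.844),
    (50, "ON", 1445.468, 0.844, 1408.843, 0.904, 1429.645),
    (50, "OFF", 1542.053, 0.978, 1503.963, 0.980, 1526.055),
    (40, "ON", 1018.104, 0.859, 968.793, 1.044, 989.651),
    (40, "OFF", 1156.591, 0.911, 1134.093, 0.903, 1145.801),
    (30, "ON", 1345.556, 1.205, 1309.214, 1.280, 1326.266),
    (30, "OFF", 1562.347, 1.316, 1532.854, 1.343, 1546.055),
    (20, "ON", 2189.531, 0.905, 2147.623, 1.018, 2157.976),
    (20, "OFF", 2443.877, 0.924, 2402.192, 1.034, 2412.354),
]


def conservative_interval(high, high_err, low, low_err, z=1.96):
    """Interval [low - z * low_err, high + z * high_err]."""
    return low - z * low_err, high + z * high_err


def benchmark_coverage(rows, z=1.96):
    """For each row, report whether the binomial value lies inside the interval."""
    flags = []
    for s0, penalty, high, high_err, low, low_err, binomial in rows:
        lower, upper = conservative_interval(high, high_err, low, low_err, z)
        flags.append((s0, penalty, lower <= binomial <= upper))
    return flags


if __name__ == "__main__":
    for s0, penalty, inside in benchmark_coverage(TABLE_2):
        print(f"S0={s0}  penalty={penalty:3s}  binomial inside interval: {inside}")
```

For these ten rows the binomial value already lies between the Low and High point estimates, so the check passes for any non-negative z.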

Table 3. Swing option values as a function of the number of exercise rights with a single underlying asset. Parameter values used are exercise volume of 60 units, $S_{0} = 40$, $b = 20$, $R = 4000$, $m = 5$, and no usage penalty.

$N_{u} = N_{d}$	High	Error	Low	Error	Binomial
1	630.054	0.449	605.394	0.453	617.832
3	1573.237	1.449	1559.517	1.437	1567.344
5	1852.788	2.128	1852.788	2.128	1852.627

Table 4. Swing option values as a function of moneyness and penalties with a five-dimensional underlying asset. Parameter values used are $N_{u} = N_{d} = 2$, $U_{i} = \{20, 40, 60\}$, $b = 20$, $R = 4000$, $m = 5$, $U_{\min} = -90$, and $U_{\max} = 90$.

$S_{0}$	Penalty	High	Error	Low	Error
60	ON	3577.280	2.864	3517.297	2.845
60	OFF	3832.050	2.286	3772.123	2.856
50	ON	2246.657	2.280	2197.957	2.259
50	OFF	2479.081	2.341	2431.065	2.318
40	ON	1221.847	1.595	1189.610	1.564
40	OFF	1257.171	1.499	1226.370	1.467
30	ON	1105.831	0.453	1087.851	0.447
30	OFF	1209.179	0.393	1196.255	0.391
20	ON	1937.615	0.445	1930.860	0.472
20	OFF	2177.194	0.489	2177.031	0.513

Table 5. Swing option values as a function of the number of exercise rights with a five-dimensional underlying asset. Parameter values used are base volume = 60 units, $S_{0} = 40$, $b = 20$, $R = 4000$, $m = 5$, and no usage penalty.

$N_{u} = N_{d}$	High	Error	Low	Error
1	683.144	0.741	652.481	0.721
3	1728.947	2.279	1709.497	2.248
5	2087.495	3.114	2087.495	3.114

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Reesor, R.M.; Marshall, T.J. Forest of Stochastic Trees: A Method for Valuing Multiple Exercise Options. J. Risk Financial Manag. 2020, 13, 95. https://doi.org/10.3390/jrfm13050095

