Optimal Exploitation of a General Renewable Natural Resource under State and Delay Constraints

Gaïgi, M’hamed; Kharroubi, Idris; Lim, Thomas

doi:10.3390/math8112053

Open AccessArticle

Optimal Exploitation of a General Renewable Natural Resource under State and Delay Constraints

by

M’hamed Gaïgi

¹,

Idris Kharroubi

² and

Thomas Lim

^3,*

¹

Ecole Nationale d’Ingénieurs de Tunis-LAMSIN, Université de Tunis El Manar, Tunis 2092, Tunisie

²

Sorbonne Université, Université de Paris, CNRS, Laboratoire de Probabilités, Statistiques et Modélisations (LPSM), 75005 Paris, France

³

Ecole Nationale Supérieure d’Informatique pour l’Industrie et l’Entreprise, Laboratoire de Mathématiques et Modélisation d’Evry, CNRS UMR 8071, 91037 Evry, France

^*

Author to whom correspondence should be addressed.

Mathematics 2020, 8(11), 2053; https://doi.org/10.3390/math8112053

Submission received: 11 September 2020 / Revised: 5 November 2020 / Accepted: 9 November 2020 / Published: 18 November 2020

(This article belongs to the Special Issue Stochastic Optimization Methods in Economics, Finance and Insurance)

Download

Browse Figures

Versions Notes

Abstract

:

In this work, we study an optimization problem arising in the management of a natural resource over an infinite time horizon. The resource is assumed to evolve according to a logistic stochastic differential equation. The manager is allowed to harvest the resource and sell it at a stochastic market price modeled by a geometric Brownian process. We assume that there are delay constraints imposed on the decisions of the manager. More precisely, starting harvesting order and selling order are executed after a delay. By using the dynamic programming approach, we characterize the value function as the unique solution to an original partial differential equation. We complete our study with some numerical illustrations.

Keywords:

impulse control; natural resource; optimal harvesting; execution delay; viscosity solutions; state constraints

MSC:

Classification (2010): 93E20; 62L15; 49L20; 49L25; 92D25

1. Introduction

In the recent decades, the management of natural resources has become a major issue. Indeed, for many countries, natural resources ensure regular incomes, allowing for economic growth and development. In particular, the seeking of high short-term profits can lead to an overconsumption of the natural resources and therefore to their exhaustion (see, e.g., [1]). Hence, the question of the sustainability of such natural resources is crucial.

As a consequence, many countries have imposed restrictions on the exploitation of natural resources so as to avoid their depletion. One of the repercussions of these constraints is the non-immediacy of the decisions: the actions of the natural resources managers are executed after some delay. Moreover, the harvests are limited in time, and sometimes, we have a lag constraint between two harvests. The aim of this work is to model these delays and to study their effects on the gain and the behavior of the natural resource managers.

We suppose that the resource manager can act by two type of interventions: starting and stopping the harvesting of the natural resource. We therefore model the strategy by a double sequence

{(d_{i}, s_{i})}_{i \geq 1}

where

d_{i}

and

s_{i}

are stopping times representing respectively the time of the i-th decision to start and stop harvesting. Therefore, we assume

s_{i} \geq d_{i}

for

i \geq 1

.

Such a formulation naturally appears in decision-making problems in economics and finance. In many cases, managers face technical and regulatory delays, which may be significant. Thus, these delays need to be taken into account in the way of acting (see for example [2,3]). In our case, we consider the management of a natural resource when we have constraints and lags. We first suppose that there is a minimum time

δ

between the end of an action and the beginning of the following one. This constraint can be written on the strategy

{(d_{i}, s_{i})}_{i \geq 1}

as

d_{i + 1} \geq s_{i} + δ

for

i \geq 1

. We also suppose that we have two kind of lags. The first one appears for starting orders: the harvest of the natural resource starts after a given fixed delay ℓ. This delay represents the time needed to access the natural resource. The second kind of lag, denoted by m, corresponds to the time between the end of the harvest and the date when the manager sells the harvest; this lag can be due to the drying, the packaging, the transport, the time to find a counterpart to buy the harvest, etc.

Hence, our modeling takes into account the non-immediacy of both the harvest and its sale. As a result, the corresponding optimal strategies will be more practical and will lead to economic and environmental policies that are more effective than those suggested in the classic literature.

We assume that without any intervention of the manager, the natural resource abundance evolves according to a stochastic logistic diffusion model. Such a logistic dynamics is classic in the modeling of populations’ evolution; see for example [4]. If we denote by

X^{α}

the controlled resource abundance process and by P its price process, the problem of the manager turns into a maximization of the expected total profit on an infinite horizon of the form:

\begin{matrix} E [\sum_{i \geq 1} f (d_{i}, s_{i}, P_{s_{i} + m}, {(X_{t}^{α})}_{d_{i} + ℓ \leq t \leq s_{i}})], \end{matrix}

over the strategies

α = {(d_{i}, s_{i})}_{i \geq 1}

satisfying the previous constraints.

From a mathematical point of view, control problems with delay were studied in [5,6] where there was only one kind of intervention. In our model, we consider two kinds of interventions, which are moreover related by a constraint. In the paper [7], the authors also considered two kinds of interventions. However, there is no constraint linking them, and only one of them is lagged. Furthermore, the state variable (the resource abundance) is a physical quantity. We therefore have the additional state constraint restricting strategies to those in which the remaining abundance is nonnegative.

Control problems under state constraints without delay have been intensely studied in the literature (see for example [8] for the study of optimal portfolio management under liquidity constraints), and the classical approach to deal with such problems is to consider the notion of constrained viscosity solutions introduced in [9,10]. In this work, we adapt these techniques to a state constraints control problem with delay. Using a dynamic programming approach, we characterize the associated value function as the unique solution in the viscosity sense to an original partial differential equation (PDE). The novelty of the PDE lies in the different forms it takes on several regions of the space.

We then test the applicability of our approach by computing numerically the optimal strategies on some examples. Our numerical tests show that the optimal strategies heavily depend on the delay parameters. In particular, the effective optimal strategies are different from the naive optimal strategies, i.e., without delays. This illustrates the contribution of our approach to identifying optimal solutions for the management of natural resources.

The paper is organized as follows. We define the model and formulate our stochastic control problem in Section 2. In Section 3, we derive the partial differential equations associated with the control problem. Then, we characterize the value function as the unique viscosity solution of a Hamilton–Jacobi–Bellman equation. Finally, in Section 4, we compute numerically the value function and the associated optimal policy via an iterative procedure based on a quantization method. We further enrich our studies with numerical illustrations.

2. Problem Formulation

2.1. The Model

Let

Ω = C (R_{+}, R^{2})

be the space of continuous functions from

R_{+}

to

R^{2}

. We define on

Ω

the

σ

-algebra

F

generated by the coordinate functions

ω \in Ω \mapsto ω_{t}

, for

t \in R_{+}

, and we endow

(Ω, F)

with the Wiener measure

P

. By an abuse of notation, we still denote by

F

the

P

-completed

σ

-algebra. We define on the probability space

(Ω, F, P)

the two

R

-valued processes B and W by:

\begin{matrix} B_{t} (ω) = ω_{t}^{1} & and & W_{t} (ω) = ω_{t}^{2}, \end{matrix}

for

t \in R_{+}

and

ω = {(ω_{t}^{1}, ω_{t}^{2})}_{t \geq 0}

. We then denote by

F = {(F_{t})}_{t \geq 0}

the complete filtration generated by

(W, B)

.

We consider a resource that evolves according to the classical logistic stochastic differential equation if there is no harvesting:

\begin{matrix} d X_{t} & = & η X_{t} (λ - X_{t}) d t + γ X_{t} d B_{t}, \end{matrix}

where

η

,

λ

, and

γ

are three positive constants. The constant

η λ

corresponds to the intrinsic rate of population growth, and

1 / λ

is the carrying capacity of the environment. A manager can harvest the resource under some conditions. We denote by

α : = {(d_{i}, s_{i})}_{i \geq 1}

a harvesting strategy, which is described as follows.

$d_{i}$ is the time at which the manager gives the order to harvest. The harvest starts only at time $d_{i} + ℓ$ , with ℓ a positive constant representing the delay.
$s_{i}$ is the time when the harvest is stopped.

In the following, we will only consider the set

A

of admissible strategies such that

{(d_{i})}_{i \geq 1}

and

{(s_{i})}_{i \geq 1}

are two increasing sequences of

F

-stopping times satisfying:

\begin{matrix} 0 & \leq s_{i} - d_{i} & \leq K, \end{matrix}

(1)

and

\begin{matrix} s_{i} + δ & \leq & d_{i + 1}, \end{matrix}

(2)

for any

i \geq 1

, where

δ

and K are positive constants with

ℓ < K

.

We assume that in the harvesting time, the manager harvests the quantity

g (x)

by the time unit where x is the quantity of the available resource, and g is a function satisfying the following conditions.

(Hg) g is an increasing function from

R_{+}

to

R_{+}

such that

g (0) = 0

, and there exist two positive constants

a_{min}

and

a_{max}

such that

a_{min} x \leq g (x) \leq a_{max} x

for any

x \in R_{+}

.

Moreover, the manager must pay a cost when he harvests during a period

Δ t

, and this cost is

f (Δ t)

where f is an increasing function from

R_{+}

to

R_{+}

such that

f (0) = 0

.

Finally, after any harvest time, the manager sells at time

s_{i} + m

, with m a positive constant, the harvested resource. We denote by

P^{p}

the price of the resource, and we suppose that it evolves according to the following stochastic differential equation:

\begin{matrix} d P_{t}^{p} & = & P_{t}^{p} (μ_{t} d t + σ_{t} d W_{t}), \\ P_{0}^{p} & = & p, \end{matrix}

where

μ

and

σ

are positive bounded

F

-adapted processes and p is the price at time 0. We assume that

m < δ

.

We can sum up all the constraints with the following graph.

The state A corresponds to the state where the manager can decide to start a harvest. The state B corresponds to the harvesting time. The state C corresponds to the moment of sale.

The variable

d_{i}

(resp.

s_{i}

) corresponds to the time when the manager decides to leave the state A (resp. B). The time to go from the state A to the state B is ℓ. This means that the time between the order to harvest and the start of the harvest is ℓ. We cannot stay more than

K - ℓ

in the state B, which means that the harvesting time cannot be more than

K - ℓ

. The time to go from the state B to the state C is m. This means that the manager must wait m after the harvest to sell this production. The time to go from the state C to the state A is

δ - m

, which means that the minimum time between the sale and the next order to harvest is

δ - m

.

If the manager follows an admissible strategy

α = {(d_{i}, s_{i})}_{i \geq 1}

, then the quantity of available resource

X_{t}^{x, α}

at time t evolves with the following stochastic differential equation:

\begin{matrix} d X_{t}^{x, α} & = & η X_{t}^{x, α} (λ - X_{t}^{x, α}) d t + γ X_{t}^{x, α} d B_{t} - \sum_{i \geq 1} g (X_{t}^{x, α}) 1_{d_{i} + ℓ \leq t \leq s_{i}} d t, \end{matrix}

(3)

with

X_{0}^{x, α} = x

.

2.2. The Value Function

The objective of the manager is to optimize the expected profit over an infinite horizon. The associated value function is then given by:

\begin{matrix} V (x, p) & = & sup_{α \in A} E [\sum_{i \geq 1} e^{- β (s_{i} + m)} (G_{i}^{α} - C_{i}^{α})], \end{matrix}

(4)

where

β

is a positive constant corresponding to the discount factor,

G_{i}^{α}

and

C_{i}^{α}

corresponds to the gain, and the cost for the i-th harvest associated with the strategy

α \in A

:

\begin{matrix} C_{i}^{α} & = & f (s_{i} - d_{i} - ℓ), \end{matrix}

(5)

and:

\begin{matrix} G_{i}^{α} & = & P_{s_{i} + m}^{p} \int_{d_{i} + ℓ}^{s_{i}} g (X_{t}^{x, α}) d t . \end{matrix}

(6)

3. PDE Characterization

3.1. Extension of the Value Function

In order to provide an analytic characterization of the value function V defined by (4), we need to extend the definition of this control problem to general initial conditions. Indeed, the delays imposed on the manager make the state process non-Markov. To overcome this issue, we introduce new variables keeping in mind the time spent from the previous decision. More precisely, we consider a gain function

J (x, p, θ, ρ, y, α)

from

E : = R_{+} \times R_{+}^{*} \times D \times A (Θ)

to

R

, with x representing the size of the available resource at the initial time, p the price of the resource,

θ

the time from the last decision of the manager (start a harvest or stop a harvest),

ρ

the time from which the manager has decided to harvest the last time, y the quantity of the harvest until now associated with this harvest, and

α

the strategy. We introduce some notation to simplify the formulae:

z : = (x, p)

,

Z : = R_{+} \times R_{+}^{*}

and

Θ : = (θ, ρ, y)

. We also introduce the following sets:

\begin{matrix} D_{0} & : = & {(θ, ρ, y) \in R_{+}^{3}, y \geq 0, 0 \leq θ = ρ \leq K}, \\ D_{1} & : = & {(θ, ρ, y) \in R_{+}^{3}, y \geq 0, 0 \leq θ < ρ \land m and ρ - K \leq θ \leq ρ - ℓ}, \\ D_{2} & : = & {(θ, ρ, y) \in R_{+}^{3}, y \geq 0, m \leq θ < ρ and ρ - K \leq θ \leq ρ - ℓ}, \end{matrix}

and

D : = D_{0} \cup D_{1} \cup D_{2}

.

The gain function J is given for any state

(z, Θ) \in Z \times D

and strategy

α \in A (Θ)

by:

\begin{matrix} J (z, Θ, α) & : = & E [\sum_{j \geq 1} e^{- β (s_{j} + m)} (G_{j} - C_{j})], \end{matrix}

(7)

where

G_{1}

and

C_{1}

are defined by:

\begin{matrix} G_{1} & = & [y + \int_{{(ℓ - θ)}^{+}}^{s_{1}} g (X_{s}^{x, α}) d s] P_{s_{1} + m}^{p} 1_{D_{0}} + y P_{m - θ}^{p} 1_{D_{1}}, \\ C_{1} & = & f (s_{1} + θ - ℓ) 1_{D_{0}} + f (ρ - θ - ℓ) 1_{D_{1}} . \end{matrix}

For any

j \geq 2

,

C_{j}

and

G_{j}

are defined by (5) and (6).

To define the set of admissible strategies

A (Θ)

, we first introduce the set of admissible strategies

A_{i} (Θ)

defined for any

Θ \in D_{i}

with

i \in {0, 1, 2}

:

\begin{matrix} A_{0} (Θ) & : = & {{(d_{i}, s_{i})}_{i \geq 1}, where d_{1} = - θ, s_{1} is a stopping time valued in [{(ℓ - θ)}^{+}, K - θ] and \\ {(d_{i}, s_{i})}_{i \geq 2} \in A with d_{2} \geq s_{1} + δ}, \end{matrix}

\begin{matrix} A_{1} (Θ) & : = & \{{(d_{i}, s_{i})}_{i \geq 1}, where (d_{1}, s_{1}) = (- ρ, - θ) and {(d_{i}, s_{i})}_{i \geq 2} \in A with d_{2} \geq δ - θ\}, \end{matrix}

\begin{matrix} A_{2} (Θ) & : = & \{{(d_{i}, s_{i})}_{i \geq 1}, where (d_{1}, s_{1}) = (- ρ, - θ) and {(d_{i}, s_{i})}_{i \geq 2} \in A with d_{2} \geq {(δ - θ)}^{+}\} . \end{matrix}

Finally, we define the set

A (Θ)

by

A (Θ) = A_{i} (Θ)

when

Θ \in D_{i}

with

i \in {0, 1, 2}

.

We can now define the extended value function v by:

\begin{matrix} v (z, Θ) & : = & sup_{α \in A (Θ)} E [J (z, Θ, α)], \end{matrix}

for any

(z, Θ) \in Z \times D

.

3.2. Dynamic Programming Principle

To characterize the value function v by a PDE, we use the approach by dynamic programming principle. The value function v satisfies the following equalities, which depend on the set in which

Θ

lives.

Theorem 1.

For any

z \in Z

and

Θ \in D_{0}

, we have:

\begin{matrix} v (z, Θ) & = & sup_{{(ℓ - θ)}^{+} \leq s_{1} \leq K - θ} E [e^{- β s_{1}} v (X_{s_{1}}^{x, α}, P_{s_{1}}^{p}, 0, s_{1} + θ, y + \int_{{(ℓ - θ)}^{+}}^{s_{1}} g (X_{s}^{x, α}) d s)] . \end{matrix}

For any

z \in Z

and

Θ \in D_{1}

, we have:

\begin{matrix} v (z, Θ) & = & E [e^{- β (m - θ)} [y P_{m - θ}^{p} - f (ρ - θ - ℓ)] + e^{- β (m - θ)} v (X_{m - θ}^{x, α}, P_{m - θ}^{p}, m, ρ + m - θ, 0))] . \end{matrix}

For any

z \in Z

and

Θ \in D_{2}

, we have:

\begin{matrix} v (z, Θ) & = & sup_{{(δ - θ)}^{+} \leq d_{2}} E [e^{- β d_{2}} v (X_{d_{2}}^{x, α}, P_{d_{2}}^{p}, 0, 0, y)] . \end{matrix}

The proof of this theorem is postponed to Appendix A, Appendix B and Appendix C.

3.3. Growth Property

We now impose the following assumption on the coefficients:

\begin{matrix} μ + η λ - β & < & 0 . \end{matrix}

(8)

We then get the following growth property for the value function v. This one will be useful to characterize v as the unique viscosity solution of a PDE system.

Proposition 1.

The value function v satisfies the following growth condition: there exist two positive constants

C_{1}

and

C_{2}

such that:

\begin{matrix} y p - C_{1} & \leq & v (z, Θ) \leq C_{2} {(1 + | x |}^{2} + {| p |}^{2}), \end{matrix}

(9)

for any

z = (x, p) \in Z

and

Θ = (θ, ρ, y) \in D

.

Proof.

We first prove the left inequality. If

Θ \in D_{0}

, we can consider the strategy that consists of stopping the harvest as soon as possible and never harvesting after that, so:

\begin{matrix} v (z, Θ) & \geq & E [y P_{{(ℓ - θ)}^{+} + m}^{p} - f ({(θ - ℓ)}^{+})] \\ \geq & y p - f (K - ℓ) . \end{matrix}

If

Θ \in D_{1}

, we can consider the strategy that consists of selling the harvest and never harvesting after that, and we get:

\begin{matrix} v (z, Θ) & \geq & E [y P_{m - θ}^{p} - f (ρ - θ - ℓ)] \\ \geq & y p - f (K - ℓ) . \end{matrix}

If

Θ \in D_{2}

, we can consider the strategy that consists of starting the harvest as soon as possible, stopping this as soon as possible, and never harvesting after that, so:

\begin{matrix} v (z, Θ) & \geq & E [y P_{{(δ - θ)}^{+} + ℓ + m}^{p} - f (0)] \\ \geq & y p - f (K - ℓ) . \end{matrix}

Hence, the left inequality holds with

C_{1} = f (K - ℓ)

.

We now prove the right inequality. For that, we introduce the process

{\bar{X}}^{x}

defined by

{\bar{X}}_{0}^{x} = x

and:

\begin{matrix} d {\bar{X}}_{t}^{x} & = & η {\bar{X}}_{t}^{x} (λ - {\bar{X}}_{t}^{x}) d t + γ {\bar{X}}_{t}^{x} d B_{t} . \end{matrix}

Using the closed formula of the logistic diffusion (see, e.g., in [11]), we have:

\begin{matrix} {\bar{X}}_{t}^{x} & = & \frac{e^{(η λ - \frac{γ^{2}}{2}) t + γ B_{t}}}{\frac{1}{x} + η \int_{0}^{t} e^{(η λ - \frac{γ^{2}}{2}) u + γ B_{u}} d u} \\ \leq & x e^{η λ T} e^{- \frac{γ^{2}}{2} t + γ B_{t}}, \end{matrix}

which implies the following inequality:

\begin{matrix} sup_{0 \leq t \leq T} E [{\bar{X}}_{t}^{x}] & \leq & x e^{η λ T} . \end{matrix}

(10)

We now consider any strategy

α = {(d_{i}, s_{i})}_{i \geq 1} \in A (Θ)

. Since the cost function f is positive, we get:

\begin{matrix} J (z, Θ, α) & \leq & \sum_{i \geq 1} e^{- β (s_{i} + m)} E [P_{s_{i} + m}^{p} \int_{d_{i} + ℓ}^{s_{i}} g ({\bar{X}}_{s}^{x}) d s] . \end{matrix}

From Assumption (Hg), we have:

\begin{matrix} J (z, Θ, α) & \leq & \sum_{i \geq 1} e^{- β (s_{i} + m)} E [P_{s_{i} + m}^{p}] \int_{d_{i} + ℓ}^{s_{i}} a_{max} E [{\bar{X}}_{s}^{x}] d s \\ \leq & C p \sum_{i \geq 1} e^{(μ - β) s_{i}} a_{max} sup_{0 \leq t \leq s_{i}} E [{\bar{X}}_{s}^{x}] . \end{matrix}

From Inequality (10), we get (C is a generic constant, which can be modified):

\begin{matrix} J (z, Θ, α) & \leq & C p x \sum_{i \geq 1} e^{(μ + η λ - β) s_{i}} . \end{matrix}

From Inequality (8) and all the constraints about

d_{i}

and

s_{i}

, we know that

s_{i} \geq K i + δ (i - 1) + m

for any

i \in N^{*}

, so we get:

\begin{matrix} J (z, Θ, α) & \leq & C p x \sum_{i \geq 1} e^{(μ + η λ - β) (K i + δ (i - 1) + m)} \\ \leq & C p x \sum_{i \geq 1} e^{(μ + η λ - β) (K + δ) i} \\ \leq & C p x, \end{matrix}

which implies:

\begin{matrix} v (z, Θ) & \leq & C p x . \end{matrix}

□

3.4. Viscosity Properties and Uniqueness

We now consider all the cases to get the PDEs satisfied by the value function v, which is derived from the dynamic programming relation:

If $Θ \in D_{0}$ with $θ \in [0, ℓ)$ , that means the manager has given the order to harvest, but this has not yet started, which implies:

$\begin{matrix} β v - L^{0} v & = & 0, \end{matrix}$

(11)

with $L^{0} ψ = η x (λ - x) \partial_{x} ψ + μ p \partial_{p} ψ + \frac{{| γ x |}^{2}}{2} \partial_{x x} ψ + \frac{{| σ p |}^{2}}{2} \partial_{p p} ψ + \partial_{θ} ψ + \partial_{ρ} ψ$ for any function $ψ \in C^{2} (Z \times D)$ .
If $Θ \in D_{0}$ with $θ \in [ℓ, K)$ , that means the manager harvests, and he/she can decide to stop this, which implies:

$\begin{matrix} min (β v - L^{1} v, v - M_{1} v) & = & 0, \end{matrix}$

(12)

with $L^{1} ψ = (η x (λ - x) - g (x)) \partial_{x} ψ + μ p \partial_{p} ψ + g (x) \partial_{y} ψ + \frac{{| γ x |}^{2}}{2} \partial_{x x} ψ + \frac{{| σ p |}^{2}}{2} \partial_{p p} ψ + \partial_{θ} ψ$ $ψ \in C^{2} (Z \times D)$ , and the operator $M_{1}$ is defined for any function $v \in C^{2} (Z \times D_{0})$ by:

$\begin{matrix} M_{1} v (x, p, θ, θ, y) & = & v (x, p, 0, θ, y) . \end{matrix}$
If $Θ \in D_{0}$ with $θ = K$ , the manager must stop the harvest, so we have:

$\begin{matrix} v (x, p, K, K, y) & = & v (x, p, 0, K, y) . \end{matrix}$

(13)
If $Θ \in D_{1}$ with $θ \in [0, m)$ , that means the manager has finished harvesting, but he/she has not yet sold his/her harvest, which implies:

$\begin{matrix} β v - L^{0} v & = & 0 . \end{matrix}$

(14)
If $Θ \in D_{1}$ with $θ = m$ , that means the manager sells his/her harvest, which implies:

$\begin{matrix} lim_{θ \to m^{-}} v (x, p, θ, ρ, y) & = & y p - f (ρ - m - ℓ) + v (x, p, m, ρ, 0) . \end{matrix}$

(15)
If $Θ \in D_{2}$ with $θ \in [m, δ)$ , then the manager can do nothing, which implies:

$\begin{matrix} β v - L^{0} v & = & 0 . \end{matrix}$

(16)
If $Θ \in D_{2}$ with $θ \geq δ$ , then the manager can decide to start a harvest:

$\begin{matrix} min (β v - L^{0} v, v - M_{2} v) & = & 0 . \end{matrix}$

(17)

The operator $M_{2}$ is defined for any function $v \in C^{2} (Z \times D_{2})$ by:

$\begin{matrix} M_{2} v (x, p, θ, ρ, y) & = & v (x, p, 0, 0, y) . \end{matrix}$

As usual, we do not have any regularity property on the value function v. We therefore work with the notion of the viscosity solution.

Definition 1

(Viscosity solution to (11)–(17)). A locally bounded function w defined on

Z \times D

is a viscosity supersolution (resp. subsolution) if:

for any $(z, Θ) \in Z \times D_{0}$ and $φ \in C^{2} (Z \times D_{0})$ such that:

$\begin{matrix} (w_{*} - φ) (z, Θ) & = & min_{Z \times D_{0}} (w_{*} - φ) \\ (r e s p . (w^{*} - φ) (z, Θ) & = & max_{Z \times D_{0}} (w^{*} - φ)) \end{matrix}$

we have

$\begin{matrix} β φ (z, Θ) - L^{0} φ (z, Θ) & \geq & 0 i f θ \in [0, l) \\ (r e s p . β φ (z, Θ) - L^{0} φ (z, Θ) & \leq & 0) \\ min (β φ (z, Θ) - L^{1} φ (z, Θ), w_{*} (z, Θ) - M_{1} w_{*} (z, Θ)) & \geq & 0 i f θ \in [l, K) \\ (r e s p . min (β φ (z, Θ) - L^{1} φ (z, Θ), w^{*} (z, Θ) - M_{1} w^{*} (z, Θ)) & \leq & 0) \end{matrix}$
for any $z \in Z$ and $y \in R_{+}$

$\begin{matrix} w_{*} (x, p, K, K, y) & \geq & w_{*} (x, p, 0, K, y) \\ (r e s p . w^{*} (x, p, K, K, y) & \leq & w^{*} (x, p, 0, K, y)) \end{matrix}$
for any $(z, Θ) \in Z \times D_{1}$ with $θ \in [0, m)$ and $φ \in C^{2} (Z \times D_{1})$ such that:

$\begin{matrix} (w_{*} - φ) (z, Θ) & = & min_{Z \times D_{1}} (w_{*} - φ) \\ (r e s p . (w^{*} - φ) (z, Θ) & = & max_{Z \times D_{1}} (w^{*} - φ)) \end{matrix}$

we have:

$\begin{matrix} β φ (z, Θ) - L^{0} φ (z, Θ) & \geq & 0 \\ (r e s p . β φ (z, Θ) - L^{0} φ (z, Θ) & \leq & 0) \end{matrix}$
for any $z \in Z$ , $ρ \in [ℓ + m, K + m]$ and $y \in R_{+}$ :

$\begin{matrix} w_{*} (x, p, m^{-}, ρ, y) & \geq & y p - f (ρ - ℓ) + w_{*} (x, p, m, ρ, 0) \\ (r e s p . w^{*} (x, p, m^{-}, ρ, y) & \leq & y p - f (ρ - ℓ) + w^{*} (x, p, m, ρ, 0)) \end{matrix}$
for any $(z, Θ) \in Z \times D_{2}$ and $φ \in C^{2} (Z \times D_{2})$ such that:

$\begin{matrix} (w_{*} - φ) (z, Θ) & = & min_{Z \times D_{2}} (w_{*} - φ) \\ (r e s p . (w^{*} - φ) (z, Θ) & = & max_{Z \times D_{2}} (w^{*} - φ)) \end{matrix}$

we have:

$\begin{matrix} β φ (z, Θ) - L^{0} φ (z, Θ) & \geq & 0 i f θ \in [m, δ) \\ (r e s p . β φ (z, Θ) - L^{0} φ (z, Θ) & \leq & 0) \\ min (β φ (z, Θ) - L^{0} φ (z, Θ), w_{*} (z, Θ) - M_{2} w_{*} (z, Θ)) & \geq & 0 i f θ \geq δ \\ (r e s p . min (β φ (z, Θ) - L^{0} φ (z, Θ), w^{*} (z, Θ) - M_{2} w^{*} (z, Θ)) & \leq & 0) \end{matrix}$

A locally bounded function w defined on

Z \times D

is said to be a viscosity solution to (11)–(17) if it is a supersolution and a subsolution to (11)–(17).

The next result provides the viscosity properties of the value function v.

Theorem 2

(Viscosity characterization). The value function v is the unique viscosity solution to (11)–(17) satisfying the growth condition (9). Moreover, v is continuous on

Z \times D

.

The proof of this theorem is postponed in Appendix B.

4. Numerical Results

Unfortunately, we are not able to provide an explicit solution for the HJB Equations (11)–(17). We therefore propose in this section a scheme to approximate the solution v.

4.1. The Discrete Problem

In the following, we introduce the numerical tools to solve the HJB equations related to the value function v. We use a numerical backward scheme based on the optimal quantization mixed with an iterative procedure. The convergence of the solution of the numerical scheme towards the solution of the HJB equation, when the time-space step on a bounded grid goes to zero, can be shown using the standard monotonicity, stability, and consistency arguments. We refer to [12,13] for numerical schemes of the same form.

Each HJB equation of the form

min (β v - L^{i} v, v - h_{i}) = 0

, with

i \in {0, 1}

, will be approximated as follows:

\begin{matrix} v^{n + 1} (x, p, θ, ρ, y) & = & max \{E [v^{n + 1} (X_{Δ}^{i}, P_{Δ}, θ + Δ, ρ + Δ, Y_{Δ}^{i})], h_{i}^{n}\}, \end{matrix}

where:

\begin{matrix} X_{Δ}^{i} & = & x + (η x (λ - x) - i g (x)) Δ + γ x \sqrt{Δ} ξ_{k}, \\ P_{Δ} & = & p exp ((μ - σ^{2} / 2) Δ + σ \sqrt{Δ} ξ_{l}), \\ Y_{Δ}^{i} & = & y + i g (x) Δ . \end{matrix}

The constant

Δ

represents the time step, and the index n stands for the iteration procedure steps, which are stopped when the error between two consecutive iterations becomes smaller than a given stopping criterion

ε

. The random variables

ξ_{k}

and

ξ_{l}

represent the quantization of two independent normally distributed random variables.

Remark 1.

Recall that the optimal quantization technique consists of approximating the expectation

E [f (Z)]

, where Z is a normal distributed variable and f is a given real function, by:

\begin{matrix} E [f (ξ)] & = & \sum_{k \in ξ (Ω)} f (k) P (ξ = k) . \end{matrix}

The distribution of the discrete variable ξ is known for a fixed

N : = c a r d (ξ (Ω))

, and the approximation is optimal as the

L^{2}

-error between ξ and Z is of order

1 / N

(see [14]). The optimal grid

ξ (Ω)

and the associated weights

P (ξ = k)

can be downloaded from the website: http://www.quantize.maths-fi.com/downloads.

4.2. Numerical Interpretations

The numerical computation is done using the following set of data:

$η = 0.1, λ = 2, γ = 0.2, μ = 0.2, σ = 0.1, β = 0.5$ .
$δ = K = 0.4828, l = m = 0.2069$ .
Penalty function: $f (x) = 0.1 \times x$ .
Gain function: $g (x) = x$ .

Figure 1. The shape of the value function v for $(x, p, θ, ρ, y) \in D_{0}$ in the plane of x.

We plot the shape of the value function v for a fixed state

(p, θ, ρ, y)

in the plane of x s.t.

(x, p, θ, ρ, y) \in D_{0}

and

θ \in [0, ℓ)

. We can see that, as expected, v is increasing with respect to x. We can remark about three cases in this figure:

$x \in [1.5, 1.7]$ : in this case the, value function is increasing since if we have $1.5 \leq x < x^{'} \leq 1.7$ at the initial time, then we will have $X_{ℓ - θ}^{x} < X_{ℓ - θ}^{x^{'}} < 2$ a.s., and the bigger is the resource when we harvest, the more we can harvest since the function g is increasing;
$x \in [1.7, 2]$ : in this case, the value function is constant since if we have $1.7 \leq x < x^{'} \leq 2$ at the initial time then we will have $X_{ℓ - θ}^{x} = X_{ℓ - θ}^{x^{'}} = 2$ a.s., so we harvest exactly the same quantity in the two cases;
$x > 2$ : in this case, the value function is increasing since if we have $2 \leq x < x^{'}$ at the initial time, then the resource decreases, but we will have $X_{ℓ - θ}^{x} < X_{ℓ - θ}^{x^{'}}$ a.s.

The value function increases with respect to x, which is natural since the greater the resource is, the more we can harvest as the function g is increasing. We can also see that the value function becomes concave after

x = 2

, which is due to the resource’s mean-reverting nature. Indeed, if the quantity of the resource is greater than two, because the drift is negative, the resource would necessarily decrease, reducing the harvest. The oscillations observed when x is small are due to the delay ℓ.

Figure 2. The shape of the value function v for $(x, p, θ, ρ, y) \in D_{0}$ in the plane of x for different values of ℓ.

We plot the shape of the value function v for a fixed state

(p, θ, ρ, y)

in the plane of x s.t.

(x, p, θ, ρ, y) \in D_{0}

and

θ \in [0, ℓ)

for different values of ℓ. We can see that, when changing the delay time ℓ, the change point of the value function’s monotony is also shifted:

1.6

for

ℓ = 0.1379

,

1.7

for

ℓ = 0.2069

, and

1.8

for

ℓ = 0.2759

. Indeed, as the figure shows, as the delay decreases, the change point of the monotony approaches zero and will likely disappear when there is no delay, leading to a perfect concave function. In fact, wasting time waiting to harvest because of the delay will lead the manager to skip the increasing period of the resource. The manager will start the real harvesting when the population is dropping due to the mean-reverting parameter

λ

; thus, the value function will decrease.

Figure 3. The optimal policy for $(x, p, θ, ρ, y) \in D_{0}$ in the plane of x.

We plot the optimal decision that the manager would make for a fixed state

(p, θ, ρ, y)

in the plane of x s.t.

(x, p, θ, ρ, y) \in D_{0}

and

θ \in [ℓ, K]

. As we can see, the optimal decision that the manager should make is to start harvesting if the resource x is over a given level; otherwise, he/she should stop and sell the harvest. This is due to the cost f, which penalizes him/her as long as the harvesting is ongoing. In fact, if the population is not large enough, he/she would not be able to cover his/her loss.

Figure 4. The optimal policy for $(x, p, θ, ρ, y) \in D_{0}$ in the plane of $θ$ .

We plot the optimal decision that the manager would make for four fixed states

(x, p, y)

(P 1, P 2, P 3, P 4)

in the plane of

θ

s.t.

(x, p, θ, ρ, y) \in D_{0}

. The state

P 1

represents the case where x and y are both low; state

P 2

is for x low and y high,

P 3

for x high and y low, and final state

P 4

for x and y both high. Decision 1 stands for starting the harvest, and Decision 2 stands for stopping it. As we can see, the optimal decision that the manager should make in state

P 1

(resp.

P 2

), knowing that he/she has already spent

θ

time since the starting decision, is to stop harvesting if

θ \leq θ_{0}

(resp.

θ \leq θ_{1}

) where

θ_{0} ≃ 0.34

(resp.

θ_{1} ≃ 0.38

). We can explain this as follows: on the one hand, in the case where

θ \leq θ_{0}

(resp.

θ \leq θ_{1}

), due to the cost of harvesting and the fact that we are in state

P 1

(resp.

P 2

) where the population is low, the manager prefers to immediately stop harvesting and sell the harvest; otherwise, he/she will likely lose money. On the other hand, if

θ \geq θ_{0}

(resp.

θ \geq θ_{1}

), i.e., the manager has already given the order to harvest since a given period of time, it is optimal for him/her to harvest for the purpose of covering the cost due to the large time spent harvesting. We can also note that this last window of time is larger for state

P 1

in comparison with the one of state

P 2

. Indeed, in state

P 2

, the manager has already harvested more than in state

P 1

, so he/she can stop harvesting sooner.

Concerning states

P 3

and

P 4

, where the population is high, obviously, the optimal decision for the manager is to harvest at all times and stop harvesting when

θ = K = 0.4828

.

Figure 5. The shape of the value function v for $(x, p, θ, ρ, y) \in D_{1}$ in the plane of p.

We plot the value function v for a fixed state

(x, θ, ρ, y)

in the plane of p s.t.

(x, p, θ, ρ, y) \in D_{1}

and

θ \in [0, m]

. As expected, v is nondecreasing w.r.t. p. The more expensive the resource is, the more the manager takes benefits.

Figure 6. The optimal policy for $(x, p, θ, ρ, y) \in D_{2}$ in the plane of x.

We plot the optimal decision that the manager would make for a fixed state

(p, θ, ρ, y)

in the plane of x s.t.

(x, p, θ, ρ, y) \in D_{2}

and

θ \in [δ, θ_{m a x}]

. As we can see, the optimal decision that the manager should make, knowing that he/she has already sold the harvest, is to start harvesting over a certain level of x under the mean-reverting barrier

λ

so that the population grows enough to cover the harvesting costs and take benefits.

4.3. Conclusions

Our modeling takes into account the non-immediacy of both the harvest and its sale, described by the time delays ℓ and m. The optimum strategies commented on previously illustrate the effect of those delays on the manager’s actions. In the classical literature, many studies suggesting optimal harvesting policies presume that the natural resource is immediately available, which is not the case in general. Consequently, the proposed policies would not be feasible and would lead to a sub-optimal use of the resource, in the best case scenario. The ecological and economic effects may thus be consequential.

For example, a modeling of fisheries that does not involve delays may lead to a harvesting strategy that would likely deplete the fish population, leading to extinction.

In fact, if the population is at a high level at the initial time, it is then likely to decline due to the logistic nature of the dynamics. As a result, if the time required to reach the harvest region is not taken into account, the best approach would be to harvest massively, thus causing a drastic degradation of the fish population.

We may make the same reasoning when it comes to selling the crop. Assume that we are in a position to sell our harvest immediately after harvesting and neglect the time required to return to land. In that case, if the price of fish falls, we would suffer losses, and we would not be able to amortize the costs of fishing.

Author Contributions

Designed the theoretical framework, M.G., I.K. and T.L.; methodology, M.G., I.K. and T.L.; formal analysis, M.G., I.K. and T.L.; Designed the experiments, M.G.; performed the experiments, M.G.; writing an original draft preparation, M.G., I.K. and T.L.; writing a review and editing, M.G. and T.L. All authors read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Dynamic Programming Principle

We introduce some notations in this part to alleviate the proofs. We first denote by

T

the set of

F

-stopping time. For

ω, ω^{'} \in Ω

and

t \geq 0

, we set:

\begin{matrix} {(^{t} ω_{s})}_{s \geq 0} = {(ω_{s \land t})}_{s \geq 0} & and & {(ω \oplus_{t} ω_{s}^{'})}_{s \geq 0} = {(ω_{s} 1_{s \leq t} + (ω_{s}^{'} - ω_{t}^{'} + ω_{t}) 1_{s > t})}_{s \geq 0} . \end{matrix}

For any

(z, Θ) \in Z \times D

and

α \in A (Θ)

we define

Z^{z, α}

as the two-dimensional process

(X^{x, α}, P^{p})

. For any

t \geq 0

, we denote by

Θ (t, α)

the triple

(θ_{t}, ρ_{t}, y_{t})

where

θ_{t}

corresponds to the time from the last decision of the manager before t (this order can be an order to start a harvest or an order to stop a harvest),

ρ_{t}

the time from which the manager has given the last order to harvest before t and

y_{t}

is the harvested quantity until time t.

For

τ \in T

and

α = {(d_{i}, s_{i})}_{i \geq 1} \in A (Θ)

, we define the shifted (random) strategy

α^{τ}

by:

\begin{matrix} α^{τ} (ω) & = & {{(d_{i} (ω \oplus_{τ (ω)} ω^{'}) - τ (ω), s_{i} (ω \oplus_{τ (ω)} ω^{'}) - τ (ω))}_{i \geq κ (τ, α) (ω)}, ω^{'} \in Ω} \end{matrix}

(A1)

with

\begin{matrix} κ (τ, α) (ω) & : = & sup {i \geq 1, d_{i} (ω) \leq τ (ω)} \end{matrix}

for all

ω \in Ω

.

Before proving the dynamic programming principle, we need the following results.

Lemma A1.

For any

ϑ \in T

,

(z, Θ) \in Z \times D

and

α = {(d_{i}, s_{i})}_{i \geq 1} \in A (Θ)

, we have the following properties.

Consistency of the admissible strategies: $Θ (ϑ, α) \in D$ and $α^{ϑ} \in A (Θ (ϑ, α))$ $P$ -a.s.
Consistency of the gain function:

$\begin{matrix} J (z, Θ, α) & = & E [\sum_{i = 1}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i}^{α} - C_{i}^{α})] + E [e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), α^{ϑ})] . \end{matrix}$

Proof.

These properties are direct consequences of the dynamics of

Z^{z, α}

and of the definitions of J and

A

.

We now turn to the proof of the dynamic programming principle. Unfortunately, we have not enough information on the value function v to directly prove these results. In particular, we do not know the measurability of v and this prevents us from computing expectations involving v as in the dynamic programming principle. We therefore provide weaker dynamic programing principles involving the envelopes

v^{*}

and

v_{*}

as in [15] where:

\begin{matrix} v^{*} (z, Θ) & = & {lim^{¯}}_{\begin{matrix} (z^{'}, Θ^{'}) \to (z, Θ) \\ (z^{'}, Θ^{'}) \in E \\ θ^{'} \to θ^{+} \\ ρ^{'} \to ρ^{+} \end{matrix}} v (z^{'}, Θ^{'}), \end{matrix}

and:

\begin{matrix} v_{*} (z, Θ) & = & {lim_{̲}}_{\begin{matrix} (z^{'}, Θ^{'}) \to (z, Θ) \\ (z^{'}, Θ^{'}) \in E \\ θ^{'} \to θ^{+} \\ ρ^{'} \to ρ^{+} \end{matrix}} v (z^{'}, Θ^{'}) . \end{matrix}

We recall that in general,

v_{*} \leq v \leq v^{*}

. Since we get the continuity of v at the end, these results implies the dynamic programming principle. □

Proposition A1.

For any

Θ \in D_{0}

and

z \in Z

, we have:

\begin{matrix} v (z, Θ) \geq sup_{α \in A (Θ)} sup_{ϑ \in T} E [e^{- β ϑ} v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < s_{1} + m} \\ + [e^{- β (s_{1} + m)} ((y + \int_{{(ℓ - θ)}^{+}}^{s_{1}} g (X_{s}^{x, α}) d s) P_{s_{1} + m}^{p} - f (s_{1} + θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{s_{1} + m \leq ϑ}] . \end{matrix}

For any

Θ \in D_{1}

and

z \in Z

, we have:

\begin{matrix} v (z, Θ) \geq sup_{α \in A (Θ)} sup_{ϑ \in T} E [e^{- β ϑ} v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < m - θ} \\ + [e^{- β (m - θ)} (y P_{m - θ}^{p} - f (ρ - θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{m - θ \leq ϑ}] . \end{matrix}

For any

Θ \in D_{2}

and

z \in Z

, we have:

\begin{matrix} v (z, Θ) \geq sup_{α \in A (Θ)} sup_{ϑ \in T} E [e^{- β ϑ} v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < d_{2}} + [\sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{ϑ \geq d_{2}}] . \end{matrix}

Proof.

Let

i \in {0, 1, 2}

,

z \in Z

and

Θ \in D_{i}

,

α \in A (Θ)

and

ϑ \in T

. By definition of the value function v, for any

ε > 0

and

ω \in Ω

, there exists

α^{ε, ω} = {(s_{k}^{ε, ω}, d_{k}^{ε, ω})}_{k \geq 1} \in A (Θ (ϑ (ω), α)

, which is

ε

-optimal at

(Z_{ϑ}^{z, α}, Θ (ϑ, α)) (ω)

, i.e.,

\begin{matrix} v (Z_{ϑ (ω)}^{z, α} (ω), Θ (ϑ (ω), α (ω))) - ε & \leq & J (Z_{ϑ (ω)}^{z, α} (ω), Θ (ϑ (ω), α (ω)), α^{ε, ω}) . \end{matrix}

(A2)

By a measurable selection theorem (see e.g., Theorem 82 in the appendix of Chapter III in [16]), there exists a sequence of stopping times

{\bar{α}}^{ε} = {({\bar{s}}_{k}^{ε}, {\bar{d}}_{k}^{ε})}_{k \geq 1}

s.t.

{\bar{s}}_{k}^{ε} (ω) = s_{k}^{ε, ω} (ω)

and

{\bar{d}}_{k}^{ε} (ω) = d_{k}^{ε, ω} (ω)

for a.a.

ω \in Ω

.

We now define by concatenation the control strategy

\bar{α}

consisting of the impulse control components of

α

on

[0, ϑ)

, and the impulse control components

({\bar{α}}^{ε} + ϑ)

on

[ϑ, \infty)

. More precisely,

α

is given by:

\begin{matrix} \bar{α} (ω) & = & {(s_{k} (ω), d_{k} (ω))}_{1 \leq k < κ (ϑ, α) (ω)} \cup {({\bar{s}}_{k}^{ε} (ω) + ϑ (ω), {\bar{d}}_{k}^{ε} (ω) + ϑ (ω))}_{κ (ϑ, α) (ω) \leq k} . \end{matrix}

By definition of the shift given in (A1), we have:

\begin{matrix} {\bar{α}}^{ϑ} (ω) & = & {{({\bar{s}}_{k}^{ε} (ω \oplus_{ϑ (ω)} ω^{'}), {\bar{d}}_{k}^{ε} (ω \oplus_{ϑ (ω)} ω^{'}))}_{k \geq 1}, ω^{'} \in Ω} \\ = & {{\bar{α}}^{ϑ, ε} (ω \oplus_{ϑ (ω)} ω^{'}), ω^{'} \in Ω} . \end{matrix}

From Lemma A1 (ii) and the definition of the performance criterion we get the following equalities.

If $z \in Z$ and $Θ \in D_{0}$ , then we have:

$\begin{matrix} J (z, Θ, \bar{α}) & = & E [e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), {\bar{α}}^{ϵ}) 1_{ϑ < s_{1} + m} \\ + [e^{- β (s_{1} + m)} ((y + \int_{{(ℓ - θ)}^{+}}^{s_{1}} g (X_{s}^{x, α}) d s) P_{s_{1} + m}^{p} - f (s_{1} + θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), {\bar{α}}^{ε}) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{s_{1} + m \leq ϑ}] . \end{matrix}$
If $z \in Z$ and $Θ \in D_{1}$ , then we have:

$\begin{matrix} J (z, Θ, \bar{α}) & = & E [e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), {\bar{α}}^{ϵ}) 1_{ϑ < m - θ} \\ + [e^{- β (m - θ)} (y P_{m - θ}^{p} - f (ρ - θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), {\bar{α}}^{ε}) + e^{- β (s_{κ (ϑ, α)} + n)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{m - θ \leq ϑ}] . \end{matrix}$
If $z \in Z$ and $Θ \in D_{2}$ , then we have:

$\begin{matrix} J (z, Θ, \bar{α}) & = & E [e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), {\bar{α}}^{ϵ}) 1_{ϑ < d_{2}} + [\sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), {\bar{α}}^{ε}) + e^{- β (s_{κ (ϑ, α)} + n)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{d_{2} \leq ϑ}] . \end{matrix}$

Together with (A2), this implies if

z \in Z

and

Θ \in D_{0}

, we have:

\begin{matrix} v (z, Θ) & \geq & J (z, Θ, \bar{α}) \\ \geq & E [e^{- β ϑ} (v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) - ε) 1_{ϑ < s_{1} + m} \\ + [e^{- β (s_{1} + m)} ((y + \int_{{(ℓ - θ)}^{+}}^{s_{1}} g (X_{s}^{x, α}) d s) P_{s_{1} + m}^{p} - f (s_{1} + θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} (v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) - ε) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{s_{1} + m \leq ϑ}] . \end{matrix}

If

z \in Z

and

Θ \in D_{1}

, we have:

\begin{matrix} v (z, Θ) & \geq & J (z, Θ, \bar{α}) \\ \geq & E [e^{- β ϑ} (v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) - ε) 1_{ϑ < m - θ} \\ + [e^{- β (m - θ)} (y P_{m - θ}^{p} - f (ρ - θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i})) \\ + e^{- β ϑ} (v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) - ε) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{m - θ \leq ϑ}] . \end{matrix}

If

z \in Z

and

Θ \in D_{2}

, we have:

\begin{matrix} v (z, Θ) & \geq & J (z, Θ, \bar{α}) \\ \geq & E [e^{- β ϑ} (v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) - ε) 1_{ϑ < d_{2}} + [\sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} (v_{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) - ε) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{d_{2} \leq ϑ}] . \end{matrix}

Since

ε

,

ϑ

and

α

are arbitrarily chosen, we get the result. □

Proposition A2.

For all

z \in Z

and

Θ \in D_{0}

, we have:

\begin{matrix} v (z, Θ) \leq sup_{α \in A (Θ)} inf_{ϑ \in T} E [e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < s_{1} + m} \\ + [e^{- β (s_{1} + m)} ((y + \int_{{(ℓ - θ)}^{+}}^{s_{1}} g (X_{s}^{x, α}) d s) P_{s_{1} + m}^{p} - f (s_{1} + θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{s_{1} + m \leq ϑ}] . \end{matrix}

For all

z \in Z

and

Θ \in D_{1}

, we have:

\begin{matrix} v (z, Θ) \leq sup_{α \in A (Θ)} inf_{ϑ \in T} E [e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < m - θ} \\ + (e^{- β (m - θ)} (y P_{m - θ}^{p} - f (ρ - θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}) 1_{m - θ \leq ϑ}] . \end{matrix}

For all

z \in Z

and

Θ \in D_{2}

, we have:

\begin{matrix} v (z, Θ) \leq sup_{α \in A (Θ)} inf_{ϑ \in T} E [e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < d_{2}} + [\sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}) 1_{ϑ \geq d_{2}}] . \end{matrix}

Proof.

Fix

z \in Z

and

Θ \in D_{0}

,

α \in A (Θ)

and

ϑ \in T

. From Lemma A1, the definition of the performance criterion, we get:

\begin{matrix} J (z, Θ, α) & = & E [e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), α^{ϑ}) 1_{ϑ < s_{1} + m} \\ + [e^{- β (s_{1} + m)} ((y + \int_{{(ℓ - θ)}^{+}}^{s_{1}} g (X_{s}^{x, α}) d s) P_{s_{1} + m}^{p} - f (s_{1} + θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), α^{ϑ}) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{s_{1} + m \leq ϑ}] \\ \leq & E [e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < s_{1} + m} \\ + [e^{- β (s_{1} + m)} ((y + \int_{{(ℓ - θ)}^{+}}^{s_{1}} g (X_{s}^{x, α}) d s) P_{s_{1} + m}^{p} - f (s_{1} + θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{s_{1} + m \leq ϑ}] \end{matrix}

since

ϑ

and

α

are arbitrary, we obtain the required inequality.

If

z \in Z

and

Θ \in D_{1}

, we get:

\begin{matrix} J (z, Θ, α) & = & E [e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), α^{ϑ}) 1_{ϑ < m - θ} \\ + [e^{- β (m - θ)} (y P_{m - θ}^{p} - f (ρ - θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), α^{ϑ}) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{m - θ \leq ϑ}] \\ \leq & E [e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < m - θ} \\ + [e^{- β (m - θ)} (y P_{m - θ}^{p} - f (ρ - θ - ℓ)) + \sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{m - θ \leq ϑ}] \end{matrix}

since

ϑ

and

α

are arbitrary, we obtain the required inequality.

If

z \in Z

and

Θ \in D_{2}

, we get:

\begin{matrix} J (z, Θ, α) & = & E [e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), 2, α^{ϑ}) 1_{ϑ < d_{2}} + [\sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i}) \\ + e^{- β ϑ} J (Z_{ϑ}^{z, α}, Θ (ϑ, α), α^{ϑ}) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{d_{2} \geq ϑ}] \\ \leq & E [e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) 1_{ϑ < d_{2}} + [\sum_{i = 2}^{κ (ϑ, α) - 1} e^{- β (s_{i} + m)} (G_{i} - C_{i})) \\ + e^{- β ϑ} v^{*} (Z_{ϑ}^{z, α}, Θ (ϑ, α)) + e^{- β (s_{κ (ϑ, α)} + m)} (G_{κ (ϑ, α)} - C_{κ (ϑ, α)}) 1_{D_{2}}] 1_{d_{2} \leq ϑ}] \end{matrix}

since

ϑ

and

α

are arbitrary, we obtain the required inequality. □

Appendix B. Viscosity Properties

We first prove the viscosity supersolution property. Fix $i \in {0, 1, 2}$ , and let $\bar{z} \in Z$ and $\bar{Θ} \in D_{i}$ , $φ \in C^{2} (Z \times D_{i})$ such that:

$\begin{matrix} (v_{*} - φ) (\bar{z}, \bar{Θ}) & = & min_{Z \times D_{i}} (v_{*} - φ) = 0 . \end{matrix}$

(A3)

If $i = 0$ and $\bar{θ} \geq ℓ$ , we can take the immediate control $s_{1} = 0$ so we obtain by Theorem 1:

$\begin{matrix} v (\bar{x}, \bar{p}, \bar{θ}, \bar{θ}, \bar{y}) & \geq & v (\bar{x}, \bar{p}, 0, \bar{θ}, \bar{y}) = M_{1} v (\bar{x}, \bar{p}, \bar{θ}, \bar{θ}, \bar{y}) . \end{matrix}$

(A4)

If $i = 2$ and $\bar{θ} \geq δ$ , we can take the immediate control $d_{2} = 0$ so we obtain by Theorem 1:

$\begin{matrix} v (\bar{x}, \bar{p}, \bar{θ}, \bar{ρ}, \bar{y}) & \geq & v (\bar{x}, \bar{p}, 0, 0, \bar{y}) = M_{2} v (\bar{x}, \bar{p}, \bar{θ}, \bar{ρ}, \bar{y}) . \end{matrix}$

(A5)

From the definition of $v_{*}$ , there exists a sequence ${(z_{n}, Θ_{n})}_{n \in N}$ of $Z \times D_{i}$ such that:

$\begin{matrix} (z_{n}, Θ_{n}, v (z_{n}, Θ_{n})) & \underset{n \to + \infty}{⟶} & (\bar{z}, \bar{Θ}, v_{*} (\bar{z}, \bar{Θ})) . \end{matrix}$

We define $γ_{n} : = v (z_{n}, Θ_{n}) - v_{*} (\bar{z}, \bar{Θ}) - φ (z_{n}, Θ_{n}) + φ (\bar{z}, \bar{Θ})$ . By continuity of $φ$ , we get $γ_{n} \to 0$ as $n \to \infty$ .
Applying Proposition A1 with $ϑ = h_{n} = \sqrt{| γ_{n} |}$ . We have for n large enough:
- if $i = 0$ :
  
  $\begin{matrix} v (z_{n}, Θ_{n}) & \geq & E [e^{- β h_{n}} v_{*} (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n})) 1_{h_{n} < s_{1}^{n} + m} \\ + [e^{- β (s_{1}^{n} + m)} ((y + \int_{{(ℓ - θ_{n})}^{+}}^{K - θ_{n}} g (X_{s}^{x_{n}, α^{0, n}}) d s) P_{s_{1}^{n} + m}^{p_{n}} - f (s_{1}^{n} + θ_{n} - ℓ)) \\ + e^{- β h_{n}} v_{*} (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n}))] 1_{s_{1}^{n} + m \leq h_{n}}] \end{matrix}$
  
  where $α^{0, n}$ is the strategy $(d_{1}^{n}, s_{1}^{n}) = (- θ_{n}, K - θ_{n})$ and then the manager follows the optimal strategy after this date,
- if $i = 1$ :
  
  $\begin{matrix} v (z_{n}, Θ_{n}) & \geq & E [e^{- β h_{n}} v_{*} (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n})) 1_{h_{n} < m} \\ + 1_{h_{n} \geq m} [e^{- β (s_{1}^{n} + m)} (y P_{s_{1}^{n} + m}^{p_{n}} - f (s_{1}^{n} + θ_{n} - ℓ)) + e^{- β h_{n}} v_{*} (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n}))]] \end{matrix}$
  
  where $α^{1, n}$ is the strategy $(d_{1}^{n}, s_{1}^{n}) = (- ρ_{n}, - θ_{n})$ and then the manager follows the optimal strategy after this date,
- if $i = 2$ :
  
  $\begin{matrix} v (z_{n}, Θ_{n}) & \geq & E [e^{- β h_{n}} v_{*} (Z_{h_{n}}^{z_{n}, α^{2, n}}, Θ (h_{n}, α^{2, n})) 1_{h_{n} < δ} \\ + 1_{h_{n} \geq δ} e^{- β h_{n}} v_{*} (Z_{h_{n}}^{z_{n}, α^{2, n}}, Θ (h_{n}, α^{2, n}))] \end{matrix}$
  
  where $α^{2, n}$ is the strategy $(d_{2}^{n}, s_{2}^{n}) = ({(δ - θ_{n})}^{+}, {(δ - θ_{n})}^{+} + K)$ and then the manager follows the optimal strategy after this date.
We get for n large enough from (A3) and the three previous inequalities:

$\begin{matrix} γ_{n} + φ (z_{n}, Θ_{n}) & \geq & E [e^{- β h_{n}} (v_{*} (Z_{h_{n}}^{z_{n}, α^{i, n}}, Θ (h_{n}, α^{i, n}))] \\ \geq & E [e^{- β h_{n}} (φ (Z_{h_{n}}^{z_{n}, α^{i, n}}, Θ (h_{n}, α^{i, n}))] . \end{matrix}$

Applying Itô’s formula, we get:
- if $i = 0$ and $\bar{θ} < ℓ$ :
  
  $\begin{matrix} \frac{1}{h_{n}} E [\int_{0}^{h_{n}} e^{- β s} (β φ (Z_{s}^{z_{n}, α^{0, n}}, Θ (s, α^{0, n})) - L^{0} φ (Z_{s}^{z_{n}, α^{0, n}}, Θ (s, α^{0, n}))) d s] & \geq & - \sqrt{| γ_{n} |}, \end{matrix}$
- if $i = 0$ and $ℓ \leq \bar{θ} < K$ :
  
  $\begin{matrix} \frac{1}{h_{n}} E [\int_{0}^{h_{n}} e^{- β s} (β φ (Z_{s}^{z_{n}, α^{0, n}}, Θ (s, α^{0, n})) - L^{1} φ (Z_{s}^{z_{n}, α^{0, n}}, Θ (s, α^{0, n}))) d s] & \geq & - \sqrt{| γ_{n} |}, \end{matrix}$
- if $i = 1$ and $\bar{θ} < m$ :
  
  $\begin{matrix} \frac{1}{h_{n}} E [\int_{0}^{h_{n}} e^{- β s} (β φ (Z_{s}^{z_{n}, α^{1, n}}, Θ (s, α^{1, n})) - L^{0} φ (Z_{s}^{z_{n}, α^{1, n}}, Θ (s, α^{1, n}))) d s] & \geq & - \sqrt{| γ_{n} |}, \end{matrix}$
- if $i = 2$ and $\bar{θ} \geq m$ :
  
  $\begin{matrix} \frac{1}{h_{n}} E [\int_{0}^{h_{n}} e^{- β s} (β φ (Z_{s}^{z_{n}, α^{2, n}}, Θ (s, α^{2, n})) - L^{0} φ (Z_{s}^{z_{n}, α^{2, n}}, Θ (s, α^{2, n}))) d s] & \geq & - \sqrt{| γ_{n} |} . \end{matrix}$
  
  Sending n to ∞, we get the supersolution property from the mean value theorem.
We turn to the viscosity subsolution. Fix $i \in {0, 1, 2}$ , and let $\bar{z} \in Z$ and $\bar{Θ} \in D_{i}$ , $φ \in C^{2} (Z \times D_{i})$ such that:

$\begin{matrix} (v^{*} - φ) (\bar{z}, \bar{Θ}) & = & max_{Z \times D_{i}} (v^{*} - φ) (z, Θ) = 0 . \end{matrix}$

If $v^{*} (\bar{z}, \bar{Θ}) \leq M_{1} v^{*} (\bar{z}, \bar{Θ})$ for $\bar{Θ} \in D_{0}$ with $\bar{θ} \geq ℓ$ , and $v^{*} (\bar{z}, \bar{Θ}) \leq M_{2} v^{*} (\bar{z}, \bar{Θ})$ for $\bar{Θ} \in D_{2}$ with $\bar{θ} \geq δ$ , then the subsolution inequality holds trivially. Consider now the case $v^{*} (\bar{z}, \bar{Θ}) > M_{1} v^{*} (\bar{z}, \bar{Θ})$ for $\bar{Θ} \in D_{0}$ with $\bar{θ} \geq ℓ$ , and $v^{*} (\bar{z}, \bar{Θ}) > M_{2} v^{*} (\bar{z}, \bar{Θ})$ for $\bar{Θ} \in D_{2}$ with $\bar{θ} \geq δ$ , and argue by contradiction by assuming on the contrary that:
- if $\bar{z} \in Z$ , $\bar{Θ} \in D_{0}$ and $\bar{θ} < ℓ$ :
  
  $\begin{matrix} r & : = & β φ (\bar{z}, \bar{Θ}) - L^{0} φ (\bar{z}, \bar{Θ}) > 0, \end{matrix}$
- if $\bar{z} \in Z$ , $\bar{Θ} \in D_{0}$ and $\bar{θ} \geq ℓ$ :
  
  $\begin{matrix} r & : = & β φ (\bar{z}, \bar{Θ}) - L^{1} φ (\bar{z}, \bar{Θ}) > 0, \end{matrix}$
- if $\bar{z} \in Z$ , $\bar{Θ} \in D_{1}$ :
  
  $\begin{matrix} r & : = & β φ (\bar{z}, \bar{Θ}) - L^{0} φ (\bar{z}, \bar{Θ}) > 0, \end{matrix}$
- if $\bar{z} \in Z$ , $\bar{Θ} \in D_{2}$ :
  
  $\begin{matrix} r & : = & β φ (\bar{z}, \bar{Θ}) - L^{0} φ (\bar{z}, \bar{Θ}) > 0 . \end{matrix}$
By continuity of $φ$ and its derivatives, there exists some $Δ_{0} > 0$ s.t. for all $0 < Δ \leq Δ_{0}$ , we have:
- if $\bar{z} \in Z$ , $\bar{Θ} \in D_{0}$ and $\bar{θ} < ℓ$ :
  
  $\begin{matrix} β φ (z, Θ) - L^{0} φ (z, Θ) > r / 2, \forall (z, Θ) \in B ((\bar{z}, \bar{Θ}), Δ) \cap E^{0} with θ < ℓ, \end{matrix}$
- if $\bar{z} \in Z$ , $\bar{Θ} \in D_{0}$ and $\bar{θ} \geq ℓ$ :
  
  $\begin{matrix} β φ (z, Θ) - L^{1} φ (z, Θ) > r / 2, \forall (z, Θ) \in B ((\bar{z}, \bar{Θ}), Δ) \cap E^{0} with θ \geq ℓ, \end{matrix}$
- if $\bar{z} \in Z$ , $\bar{Θ} \in D_{1}$ :
  
  $\begin{matrix} β φ (z, Θ) - L^{0} φ (z, Θ) > r / 2, \forall (z, Θ) \in B ((\bar{z}, \bar{Θ}), Δ) \cap E^{1}, \end{matrix}$
- if $\bar{z} \in Z$ , $\bar{Θ} \in D_{2}$ :
  
  $\begin{matrix} β φ (z, Θ) - L^{0} φ (z, Θ) > r / 2, \forall (z, Θ) \in B ((\bar{z}, \bar{Θ}), Δ) \cap E^{1} . \end{matrix}$
From the definition of $v^{*}$ , there exists a sequence ${(z_{n}, Θ_{n})}_{n \in N}$ of $B ((\bar{z}, \bar{Θ}), Δ / 2) \cap E^{0}$ with $θ_{n} < ℓ$ (resp. $θ_{n} \geq ℓ$ ) if $\bar{θ} < ℓ$ (resp. $\bar{θ} \geq ℓ$ ) such that:

$\begin{matrix} (z_{n}, Θ_{n}, v (z_{n}, Θ_{n})) & \overset{n \to \infty}{\to} & (\bar{z}, \bar{Θ}, v^{*} (\bar{z}, \bar{Θ})), \end{matrix}$

and there exists a sequence ${(z_{n}, Θ_{n})}_{n \in N}$ of $B ((\bar{z}, \bar{Θ}), Δ / 2) \cap E^{1}$ (resp. $B ((\bar{z}, \bar{Θ}), Δ / 2) \cap E^{2}$ ) if $\bar{θ} < m$ (resp. $\bar{θ} \geq m$ ) such that:

$\begin{matrix} (z_{n}, Θ_{n}, v (z_{n}, Θ_{n})) & \overset{n \to \infty}{\to} & (\bar{z}, \bar{Θ}, v^{*} (\bar{z}, \bar{Θ})) . \end{matrix}$

By Theorem 1 we can find for each $n \in N$ a control $α^{i, n} \in A (Θ_{n})$ such that for all $h_{n} \in T$ :

$\begin{matrix} v (z_{n}, Θ_{n}) \leq E [e^{- β h_{n}} (v (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n})) 1_{h_{n} < s_{1}^{0, n} + n}) \\ + [e^{- β (s_{1}^{0, n} + m} ((y_{n} + \int_{{(ℓ - θ)}^{+}}^{s_{1}^{0, n}} g (X_{s}^{x_{n}, α^{0, n}}) d s) P_{s_{1}^{0, n} + n}^{p_{n}} - f (s_{1}^{0, n} + θ - ℓ)) + \sum_{i = 2}^{κ (h_{n}, α^{0, n}) - 1} e^{- β (s_{i}^{0, n} + m)} (G_{i} - C_{i}) \\ + e^{- β h_{n}} v (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n}))] 1_{s_{1}^{0, n} + n \leq h_{n}}] + \frac{1}{n}, \end{matrix}$

and for all $(z, Θ) \in E^{1}$ , we have:

$\begin{matrix} v (z_{n}, Θ_{n}) \leq E [e^{- β h_{n}} v (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n})) 1_{h_{n} < m - θ_{n}} \\ + [e^{- β (m - θ_{n})} (y_{n} P_{m - θ_{n}}^{p_{n}} - f (ρ_{n} - θ_{n} - ℓ)) + \sum_{i = 2}^{κ (h_{n}, α^{1, n}) - 1} e^{- β (s_{i}^{1, n} + n)} (G_{i} - C_{i}) \\ + e^{- β h_{n}} v (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n}))] 1_{ℓ - θ_{n} \leq h_{n}}] + \frac{1}{n}, \end{matrix}$

and for all $(z, Θ) \in E^{2}$ , we have:

$\begin{matrix} v (z_{n}, Θ_{n}) \leq E [e^{- β h_{n}} v (Z_{h_{n}}^{z_{n}, α^{2, n}}, Θ (h_{n}, α^{2, n})) 1_{h_{n} < d_{2}^{n}} \\ + [\sum_{i = 2}^{κ (h_{n}, α^{2, n}) - 1} e^{- β (s_{i}^{1, n} + m)} (G_{i} - C_{i}) + e^{- β h_{n}} v (Z_{h_{n}}^{z_{n}, α^{2, n}}, Θ (h_{n}, α^{2, n}))] 1_{d_{2}^{n} \leq h_{n}}] + \frac{1}{n} . \end{matrix}$

We now choose $h_{n} : = τ_{n}^{0} \land s_{1}^{0, n}$ where $τ_{n}^{0} : = inf {s \geq 0, (Z_{s}^{z_{n}, α^{0, n}}, Θ (s, α^{0, n})) \notin B ((z_{n}, Θ_{n}), Δ / 2)}$ . Therefore, we get:

$\begin{matrix} v (z_{n}, Θ_{n}) & \leq & E [e^{- β h_{n}} (v (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n})) 1_{τ_{n}^{0} < s_{1}^{0, n}} \\ + h_{1}^{*} (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n})) 1_{τ_{n}^{0} \geq s_{1}^{0, n}})] + \frac{1}{n} \end{matrix}$

(A6)

Now, since $h_{1}^{*} (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n})) 1_{τ_{n}^{0} \geq s_{1}^{0, n}} < φ (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n})) 1_{τ_{n}^{0} \geq s_{1}^{0, n}}$ and $v \leq v^{*} \leq φ$ on $E^{0}$ , we get:

$\begin{matrix} φ (z_{n}, Θ_{n}) + γ_{n} & \leq & E [e^{- β h_{n}} φ (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n}))] + \frac{1}{n} . \end{matrix}$

Applying Itô’s formula to $e^{- β h_{n}} φ (Z_{h_{n}}^{z_{n}, α^{0, n}}, Θ (h_{n}, α^{0, n}))$ between 0 and $h_{n}$ , we then get:

$\begin{matrix} γ_{n} & \leq & - \frac{r}{2} E [h_{n}] + \frac{1}{n} . \end{matrix}$

This implies:

$\begin{matrix} lim_{n \to \infty} E [h_{n}] & = & 0 . \end{matrix}$

(A7)

On the other hand, we have by (A6):

$\begin{matrix} v (z_{n}, Θ_{n}) & \leq & sup_{(z^{'}, Θ^{'}) \in B ((z_{n}, Θ_{n}), Δ / 2)} v_{0} (z^{'}, Θ^{'}) P [τ_{n}^{0} < s_{1}^{0, n}] \\ + sup_{(z^{'}, Θ^{'}) \in B ((z_{n}, Θ_{n}), Δ / 2)} h_{1}^{*} (z^{'}, Θ^{'}) P [τ_{n}^{0} \geq s_{1}^{0, n}] + \frac{1}{n} . \end{matrix}$

From (A7), we then get sending n to infinity and $Δ$ to zero:

$\begin{matrix} v^{*} (\bar{z}, \bar{Θ}) & \leq & M_{1} v^{*} (\bar{z}, \bar{Θ}) . \end{matrix}$

Concerning the proof for $D_{1}$ and $D_{2}$ , we consider two cases: the case $\bar{θ} < m$ and the case $\bar{θ} \geq m$ .
We start with the case $\bar{θ} \geq m$ . In this case we consider $h_{n} = d_{2}^{1, n} \land τ_{n}^{1}$ where $τ_{n}^{1} : = inf {s \geq 0, (Z_{s}^{z_{n}, α^{1, n}}, d (s, α^{1, n})) \notin B ((z_{n}, Θ_{n}), Δ / 2)}$ . Therefore, we have:

$\begin{matrix} v_{2} (z_{n}, Θ_{n}) & \leq & E [e^{- β h_{n}} (h_{2}^{*} (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n})) 1_{d_{2}^{1, n} \leq τ_{n}^{1}} \\ + v_{2} (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n})) 1_{d_{2}^{1, n} > τ_{n}^{1}})] + \frac{1}{n} \end{matrix}$

(A8)

Now, since $h_{2}^{*} (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n})) 1_{d_{2}^{1, n} \leq τ_{n}^{1}} < φ (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n})) 1_{d_{2}^{1, n} \leq τ_{n}^{1}}$ and $v_{2} \leq v_{2}^{*} \leq φ$ on $E^{1}$ , we get:

$\begin{matrix} φ (z_{n}, Θ_{n}) + γ_{n} & \leq & E [e^{- β h_{n}} φ (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n}))] + \frac{1}{n} . \end{matrix}$

Applying Itô’s formula to $e^{- β h_{n}} φ (Z_{h_{n}}^{z_{n}, α^{1, n}}, Θ (h_{n}, α^{1, n}))$ between 0 and $h_{n}$ , we then get:

$\begin{matrix} γ_{n} & \leq & - \frac{r}{2} E [h_{n}] + \frac{1}{n} . \end{matrix}$

This implies:

$\begin{matrix} lim_{n \to \infty} E [h_{n}] & = & 0 . \end{matrix}$

(A9)

On the other hand, we have by (A8):

$\begin{matrix} v_{2} (z_{n}, Θ_{n}) & \leq & sup_{(z^{'}, Θ^{'}) \in B ((z_{n}, Θ_{n}), Δ / 2)} v_{2} (z^{'}, Θ^{'}) P [τ_{n}^{1} < d_{2}^{1, n}] \\ + sup_{(z^{'}, Θ^{'}) \in B ((z_{n}, Θ_{n}), Δ / 2)} h_{2}^{*} (z^{'}, Θ^{'}) P [τ_{n}^{1} \geq d_{2}^{1, n}] + \frac{1}{n} . \end{matrix}$

From (A9), we then get sending n to infinity and $Δ$ to zero:

$\begin{matrix} v_{2}^{*} (\bar{z}, \bar{d}) & \leq & h_{2}^{*} (\bar{z}, \bar{d}), \end{matrix}$

which is a contradiction.
The case $\bar{θ} < m$ where $i = 1$ , is analogous to the previous case and we also obtain a contradiction.

Appendix C. Uniqueness

The uniqueness of v as a viscosity solution to (11), (12), (14), (16) and (17) satisfying (9) follows from the following comparison result.

Proposition A3.

Let

\underset{̲}{w} : Z \times D \to R

a viscosity subsolution to (11), (12), (14), (16), and (17) such that:

\begin{matrix} \underset{̲}{w} (x, p, K, K, y) & \geq & g (x, p, 0, K, y) \\ lim_{θ \to m^{-}} \underset{̲}{w} (x, p, θ, ρ, y) & \geq & y p - f (ρ - m - ℓ) + g (x, p, m, ρ, 0) \end{matrix}

(A10)

and

\bar{w} : Z \times D \to R

a viscosity supersolution to (11), (12), (14), (16), and (17) such that:

\begin{matrix} \bar{w} (x, p, K, K, y) & \leq & g (x, p, 0, K, y) \\ lim_{θ \to m^{-}} \bar{w} (x, p, θ, ρ, y) & \leq & y p - f (ρ - m - ℓ) + g (x, p, m, ρ, 0) . \end{matrix}

(A11)

Suppose there exist two positive constants

C_{1}

and

C_{2}

such that:

\begin{matrix} \underset{̲}{w} (z, Θ) & \leq & C_{2} {(1 + | x |}^{2} + {| p |}^{2}) \\ \bar{w} (z, Θ) & \geq & y p - C_{1}, \end{matrix}

(A12)

for all

(z, Θ) \in Z \times D

with

z = (x, p)

and

Θ = (θ, ρ, y)

. Then

\underset{̲}{w} \leq \bar{w}

on

Z \times D

. In particular, there exists at most a unique viscosity solution w to (11), (12), (14), (16), (17), (A10), and (A11) satisfying (A12) and w is continuous on

Z \times D

.

The proof follows from the classical argument of doubling variable for proving the comparison between a sub and a super solution. We therefore omit it and refer to [8] Theorem 5.2 for a detailed proof that can be easily adapted to our PDE.

References

Lande, R.; Engen, S.; Sæther, B.-E. Optimal harvesting of fluctuating populations with a risk of extinction. Am. Nat. 1995, 145, 728–745. [Google Scholar] [CrossRef]
Alvarez, L.; Keppo, J. The impact of delivery lags on irreversible investment under uncertainty. Eur. J. Oper. Res. 2002, 136, 173–180. [Google Scholar] [CrossRef]
Bar-Ilan, A.; Strange, W. Investment lags. Am. Econ. Rev. 1996, 8, 610–622. [Google Scholar]
Alvarez, L.; Shepp, L. Optimal harvesting of stochastically fluctuating populations. J. Math. Biol. 1998, 37, 155–177. [Google Scholar] [CrossRef]
Bruder, B.; Pham, H. Impulse control problem on finite horizon with execution delay. Stoch. Process. Appl. 2009, 119, 1436–1469. [Google Scholar] [CrossRef] [Green Version]
Oksendal, B.; Sulem, A. Optimal Stochastic Impulse Control with Delayed Reaction. Appl. Math. Optim. 2008, 58, 253–298. [Google Scholar] [CrossRef] [Green Version]
Kharroubi, I.; Lim, T.; Vath, V.L. Optimal Exploitation of a Resource with Stochastic Population Dynamics and Delayed Renewal. J. Math. Anal. Appl. 2019, 477, 627–656. [Google Scholar] [CrossRef] [Green Version]
Mnif, M.; Pham, H. A model of optimal portfolio selection under liquidity risk and price impact. Financ. Stochastics 2007, 11, 51–90. [Google Scholar]
Soner, H.M. Optimal Control with State-Space Constraints I. SIAM J. Control. Optim. 1986, 24, 552–561. [Google Scholar] [CrossRef]
Soner, H.M. Optimal Control with State-Space Constraints II. SIAM J. Control. Optim. 1986, 24, 1110–1122. [Google Scholar] [CrossRef]
Skiadas, C.H. Exact Solutions of Stochastic Differential Equations: Gompertz, Generalized Logistic and Revised Exponential. Methodol. Comput. Appl. Probab. 2010, 12, 261–270. [Google Scholar] [CrossRef]
Gaïgi, M.; Vath, V.L.; Mnif, M.; Toum, S. Numerical Approximation for a Portfolio Optimization Problem Under Liquidity Risk and Costs. Appl. Math. Optim. 2016, 74, 163–195. [Google Scholar] [CrossRef]
Guilbaud, F.; Mnif, M.; Pham, H. Numerical methods for an optimal order execution problem. J. Comput. Financ. 2013, 16, 3–45. [Google Scholar] [CrossRef] [Green Version]
Pagès, G.; Pham, H.; Printems, J. Optimal quantization methods and applications to numerical problems in finance. In Handbook on Numerical Methods in Finance; Rachev, S., Ed.; Birkhäuser: Boston, MA, USA, 2004; pp. 253–298. [Google Scholar]
Bouchard, B.; Touzi, N. Weak dynamic programming principle for viscosity solutions. SIAM J. Control. Optim. 2011, 49, 948–962. [Google Scholar] [CrossRef] [Green Version]
Dellacherie, C.; Meyer, P.A. Probabilités et Potentiel, I–IV; Hermann: Paris, France, 1975. [Google Scholar]

Figure 1. The shape of the value function v sliced in the plane of x.

Figure 2. The shape of the value function v sliced in the plane of x for different values of ℓ.

Figure 3. The optimal policy for

(x, p, θ \in [ℓ, K], ρ, y) \in D_{0}

in the plane of x.

Figure 3. The optimal policy for

(x, p, θ \in [ℓ, K], ρ, y) \in D_{0}

in the plane of x.

Figure 4. The optimal policy for

(x, p, θ, ρ, y) \in D_{0}

in the plane of

θ

.

Figure 4. The optimal policy for

(x, p, θ, ρ, y) \in D_{0}

in the plane of

θ

.

Figure 5. The value function v for

(x, p, θ \in [0, m], ρ, y) \in D_{1}

in the plane of p.

Figure 5. The value function v for

(x, p, θ \in [0, m], ρ, y) \in D_{1}

in the plane of p.

Figure 6. The optimal policy for

(x, p, θ \in [δ, θ_{m a x}], ρ, y) \in D_{2}

in the plane of x.

Figure 6. The optimal policy for

(x, p, θ \in [δ, θ_{m a x}], ρ, y) \in D_{2}

in the plane of x.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Gaïgi, M.; Kharroubi, I.; Lim, T. Optimal Exploitation of a General Renewable Natural Resource under State and Delay Constraints. Mathematics 2020, 8, 2053. https://doi.org/10.3390/math8112053

AMA Style

Gaïgi M, Kharroubi I, Lim T. Optimal Exploitation of a General Renewable Natural Resource under State and Delay Constraints. Mathematics. 2020; 8(11):2053. https://doi.org/10.3390/math8112053

Chicago/Turabian Style

Gaïgi, M’hamed, Idris Kharroubi, and Thomas Lim. 2020. "Optimal Exploitation of a General Renewable Natural Resource under State and Delay Constraints" Mathematics 8, no. 11: 2053. https://doi.org/10.3390/math8112053

APA Style

Gaïgi, M., Kharroubi, I., & Lim, T. (2020). Optimal Exploitation of a General Renewable Natural Resource under State and Delay Constraints. Mathematics, 8(11), 2053. https://doi.org/10.3390/math8112053

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Optimal Exploitation of a General Renewable Natural Resource under State and Delay Constraints

Abstract

1. Introduction

2. Problem Formulation

2.1. The Model

2.2. The Value Function

3. PDE Characterization

3.1. Extension of the Value Function

3.2. Dynamic Programming Principle

3.3. Growth Property

3.4. Viscosity Properties and Uniqueness

4. Numerical Results

4.1. The Discrete Problem

4.2. Numerical Interpretations

4.3. Conclusions

Author Contributions

Funding

Conflicts of Interest

Appendix A. Dynamic Programming Principle

Appendix B. Viscosity Properties

Appendix C. Uniqueness

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI