
Article

# Payoff Distribution in a Multi-Company Extraction Game with Uncertain Duration

by 1,*,† and
1 Faculty of Applied Mathematics and Control Processes, St. Petersburg State University, St. Petersburg 198504, Russia
2 MEMOTEF, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2018, 6(9), 165; https://doi.org/10.3390/math6090165
Received: 25 July 2018 / Revised: 24 August 2018 / Accepted: 31 August 2018 / Published: 11 September 2018

## Abstract

A nonrenewable resource extraction game model is analyzed in a differential game theory framework with random duration. If the cumulative distribution function (c.d.f.) of the final time is discontinuous, the related subgames are differentiated based on the position of the initial instant with respect to the jump. We investigate properties of optimal trajectories and of imputation distribution procedures if the game is played cooperatively.

## 1. Introduction

Modern mathematical game theory solves problems of modeling, research and analysis of various conflict-controlled processes. Of particular interest are the processes developing over time [1]. Differential games allow us to describe such dynamic processes in the sense of a conflict.
In a differential game of extraction, the standard scenario involves a dynamic competition among players (or, more precisely, companies) which exert effort aimed at extracting a natural resource. If the resource does not regenerate over time, such as natural gas or earth minerals, it is called exhaustible or nonrenewable.
Economic literature has been dealing with effects and characteristics of exhaustible resource extraction since 1817, when Ricardo [2] addressed the issue in his essay The principles of political economy and taxation. In the 20th century, the debate was relaunched by Hotelling [3], and then subsequently a vast stream of static and dynamic models was conceived and developed over the years (see, for example [4]).
If we focus only on models described through differential games, the basic framework includes a population of companies extracting the same resource. Their strategic variables are the extraction effort levels, which directly affect their respective payoffs: each payoff increases with the extracted quantity. The state variables, on the other hand, represent the stocks of resources, which are depleted over time by extraction. In the simplest representation, there is a unique resource and all companies aim to extract as much of it as possible. To describe a more realistic economic behavior, a key element was introduced in the economic literature: the random duration of the game.
The seminal paper on this extension of the standard optimal control problem is due to Yaari [5] in 1965. At the same time, in Russia, in 1966, Petrosyan and Murzov [6] first studied differential zero-sum games with terminal payoff at random time horizon. Subsequently, further studies have been provided: in the work of Boukas et al. [7] in 1990, an optimal control problem with random duration was studied in general terms. Cooperative differential games with random time horizon were first studied by Petrosyan and Shevkoplyas [8] in 2000, whereas the concept of time consistency in differential games with prescribed duration was introduced in [9].
Such a concept is particularly relevant because most literature treats stability of the cooperative solutions in static cooperative settings. On the other hand, stable cooperation in the problem is a key requirement when the scenario is dynamic as well. In cooperative differential games, cooperating players wish to establish a dynamically stable (time-consistent) cooperative agreement (e.g., the dynamic versions of the Shapley Value, core, etc.).
Time consistency implies that, as cooperation evolves, cooperating partners are guided by the same optimality principle at each instant of time and hence do not have any incentive to deviate from the previously adopted cooperative behavior.
After Petrosyan’s seminal paper in 1977, this topic was actively developed by a number of researchers. In a paper by Jorgensen et al. [10], the problem of time-consistency and agreeability of the solution in the linear-state class of differential games was investigated. In a paper by Petrosjan and Zaccour [11], a similar problem of ecological management was studied, as well as in the more recent paper by Zaccour [12] and the book by Petrosyan and Yeung [13]. Recently, the notion of time consistency was extended to the case of discrete games (see, e.g., [14]). An extension of the time consistency problem to differential games with random duration was first undertaken in [8]; further results were subsequently obtained in [15,16,17,18,19]. In [20], a hybrid differential game with random time horizon was considered in which the probability distribution can change over time (see also [21] for a general treatment of hybrid differential games). Differential games in which the time horizon is a discrete random variable, and the corresponding time-consistency problem, were considered recently in [22]. The notion of time consistency for multistage games with vector payoffs was introduced in [23]. The regularization of a cooperative solution for the core and the Shapley value was carried out for a multistage game with random time horizon in [24]. The present contribution belongs to this line of research.
In this paper, we intend to propose a description and an analysis of a scenario which differs from the previous treatments: the random variable which indicates the stopping time of extraction has a c.d.f. which is not continuous over the whole time interval. Specifically, we assume that there is a jump at an internal point, and we carry out an analysis which is differentiated based on the initial time of the game, i.e., before or after the jump. This formulation can represent any situation in which the distribution of the random variable is affected by external factors such as a Parliament bill which makes an extraction technique illegal. An example may be provided by the controversial fracking process for gas extraction.
In this setting, standard models take into account an oligopolistic competition among firms, where each firm aims to maximize its own profit. However, there exist some different approaches in the literature which also involve the possibility of cooperation among agents.
Because of the depletion of oil and gas resources on the mainland, the active development of oil-and-gas fields on continental shelves is to begin in the near future. Today, there are about seventy developing and potential oil-and-gas fields on continental shelves of Azerbaijan, Canada, Kazakhstan, Mexico, Norway, Russia, Saudi Arabia, the USA, etc. For example, today the firms which are involved in the development of Sakhalin oil-and-gas fields (Russia) are Gazprom, Shell, Mitsui, and Mitsubishi.
Moreover, oil and gas exploitation in the Arctic is a key issue nowadays, especially relevant for Canada, Denmark, Norway, Russia and the USA. We believe that the economic success of developing Arctic fields requires cooperation among the participating countries. Collaboration in the Arctic is important at least in the sense that an accident at one borehole could lead to serious problems or a complete stoppage of resource exploitation for all neighbors. Thus, the involved countries have to collaborate to provide security for oil and gas exploitation in the Arctic, otherwise environmental disasters and huge economic losses for all participants might occur. This is the main motivation to consider the cooperative form of the non-renewable resource extraction game.
However, despite all the above, oil and gas extraction on a continental shelf is a high-risk economic activity, and a reconsideration of existing models of non-renewable resource extraction is required. A stochastic framework may be useful in the sense that it increases the validity of models (see, for example, [25]). Typically, game-theoretical models with infinite or fixed time horizon are used for modeling renewable or exhaustible resource exploitation. Although they provide numerous insights into equilibrium and stability, such an approach is not very realistic. Namely, the contract date is never equal to the real period of field exploitation, because either exploitation is terminated prematurely by an accident or by unprofitability, or the period of exploitation is extended.
Here, we specifically consider the occurrence of a cooperative game structure, where companies agree on a collective strategy to maximize the aggregate payoff. The agreement establishes that, after maximization, the total payoff is supposed to be redistributed among the cooperating firms. As in standard theory of cooperative games, the distribution of the total worth is the problem to be addressed (see, for example, [8]). In a differential game, the total worth simply corresponds to the sum of the integral payoffs of all players, and the distribution of the total worth has to be implemented by using a suitable solution concept. Our main focus is on the cooperative setup, where we describe the determination of an IDP (imputation distribution procedure, which was first introduced by Petrosyan in [9]), which is a dynamic way to attribute players their respective shares gained in the game. We also determine the relations to explicitly calculate IDPs in the above different cases, also discussing the issue of time consistency. Finally, we outline a complete example where N companies compete over extraction of a unique exhaustible resource, comparing the results in the non-cooperative and cooperative scenarios.
The paper is organized as follows. Section 2 introduces the notation of the game, whose non-cooperative setup is exposed. The cooperative setup is proposed in Section 3, where the main findings, including a theorem which establishes the existence of a time-consistent imputation, are laid out in detail. In Section 4, we propose a model to employ the above-mentioned procedure. Section 5 concludes and proposes some possible future developments.

## 2. Notation and Non-Cooperative Setup

#### 2.1. Problem Statement

Consider the following standard notation for the N-player differential game $\Gamma^T(t_0, x_0)$, starting at initial time instant $t_0$ and at initial state $x_0$:
• $u_{11} \in U_{11}, u_{12} \in U_{12}, \dots, u_{1M} \in U_{1M}, \dots, u_{N1} \in U_{N1}, \dots, u_{NM} \in U_{NM}$ are the extraction effort levels of the N companies involved in extracting M exhaustible resources. More precisely, $u_{ij}$ is the effort exerted by firm i to extract resource j. The only requirement for the control sets $U_{ij}$, for $i = 1, \dots, N$, $j = 1, \dots, M$, concerns the non-negativity of effort levels, so we can assume $U_{ij} \subseteq \mathbb{R}_+$ for all $i, j$. (We do not impose any other constraint on either the control sets or the state set, thus admitting any possible level. Because such sets are not compact in principle, maximum points may fail to exist, hence the choice of the payoff functions is crucial to have an equilibrium structure.)
• $x(t) = (x_1(t), \dots, x_M(t))$ is the state vector indicating the quantities of the exhaustible resources available to be extracted by the companies. We assume $x \in X \subseteq \mathbb{R}_+^M$.
• The M dynamic constraints of the game are given by:
$\dot{x}(t) = g(x(t), u_{11}(t), \dots, u_{NM}(t)), \qquad x(t_0) = x_0 \in \mathbb{R}_+^M, \qquad (1)$
where $x \in \mathbb{R}_+^M$, $u_{ij} \in U_{ij} \subseteq \mathbb{R}_+$, and $g : \mathbb{R}^M \times \mathbb{R}^{NM} \to \mathbb{R}^M$ is a vector-valued function. The state equations in Equation (1) are ODEs whose solutions satisfy the standard existence and uniqueness requirements (these requirements are automatically satisfied when dealing with a linear-quadratic structure such as the one we consider in Section 4).
• The interval over which the game is played is $[t_0, T] \subset \mathbb{R}_+$, where $t_0 \ge 0$ and $T < \infty$.
• The final instant of the game, i.e., the exact time at which all companies stop the extraction, is described by the random variable $\hat{t} \in [t_0, T]$. The cumulative distribution function (c.d.f.) of $\hat{t}$ is given by $F_p(t)$, which is assumed to have a break (jump) of size $p > 0$. The jump occurs at instant $t_1 \in [t_0, T]$, i.e., it can be described as follows (Figure 1):
$F_p(t) = \begin{cases} F(t), & t \in [t_0, t_1), \\ F(t) + p, & t \in [t_1, T], \end{cases}$
where $F(t)$ is a sufficiently regular function. By construction, there exists $q > 0$ such that $F(T) = q$ and $p + q = 1$.
• The instantaneous payoff of the i-th player at the moment $\tau \in [t_0, T]$ is defined as $h_i(x(\tau), u_{i1}(\tau), \dots, u_{iM}(\tau))$. To shorten the notation, we write
$h_i(x(\tau), u_{i1}(\tau), \dots, u_{iM}(\tau)) = h_i(\tau).$
The i-th related integral function is:
$H_i(t) = \int_{t_0}^{t} h_i(\tau)\, d\tau.$
• The i-th objective function is represented by the following integral payoff to be maximized:
$K_i(t_0, x_0, u_{11}, \dots, u_{NM}) = \int_{t_0}^{T} \int_{t_0}^{t} h_i(\tau)\, d\tau\, dF_p(t). \qquad (3)$
Transforming the double-integral functional in Equation (3) into the standard form used in dynamic programming is important for the further study of the game (see also [26]).
Proposition 1.
The integral payoff in Equation (3) has the following form:
$K_i(t_0, x_0, u_{11}, \dots, u_{NM}) = \int_{t_0}^{T} h_i(t)\,(1 - F(t))\, dt - p \int_{t_1}^{T} h_i(t)\, dt. \qquad (4)$
Proof.
Keeping in mind that $H_i(t_0) = 0$, $F_p(t_0) = 0$, $F_p(T) = 1$, the payoffs $K_i(\cdot)$ can be rearranged by a simple manipulation:
$K_i(t_0, x_0, u_{11}, \dots, u_{NM}) = \int_{t_0}^{t_1} H_i(t)\, dF_p(t) + \int_{t_1}^{T} H_i(t)\, dF_p(t) =$
$= H_i(t) F_p(t) \Big|_{t_0}^{t_1} - \int_{t_0}^{t_1} h_i(t) F_p(t)\, dt + H_i(t) F_p(t) \Big|_{t_1}^{T} - \int_{t_1}^{T} h_i(t) F_p(t)\, dt =$
$= \int_{t_0}^{T} h_i(t)\,(1 - F_p(t))\, dt = \int_{t_0}^{t_1} h_i(t)\,(1 - F(t))\, dt + \int_{t_1}^{T} h_i(t)\,(1 - F(t) - p)\, dt =$
$= \int_{t_0}^{T} h_i(t)\,(1 - F(t))\, dt - p \int_{t_1}^{T} h_i(t)\, dt.$
☐
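Proposition 1 can be checked numerically on a toy instance. The sketch below is only an illustration: the linear continuous part $F$, the jump size, the time points and the payoff stream $h$ are all assumptions chosen for convenience, not quantities from the model. It compares the Stieltjes integral of Equation (3), written as a continuous part plus an atom of mass $p$ at $t_1$, with the single-integral form of Proposition 1.

```python
import math

# Illustrative parameters (assumptions, not taken from the paper)
T0, T1, T = 0.0, 1.0, 2.0    # t_0, jump instant t_1, horizon T
P = 0.3                      # jump size p
Q = 1.0 - P                  # q = F(T)

def F(t):
    """Continuous part of the c.d.f.: linear, with F(t_0) = 0 and F(T) = q."""
    return Q * (t - T0) / (T - T0)

def f(t):
    """Density of the continuous part, f = F'."""
    return Q / (T - T0)

def h(t):
    """Hypothetical instantaneous payoff along a fixed trajectory."""
    return 2.0 + math.cos(t)

def H(t):
    """H(t) = integral of h from t_0 to t (closed form for this h)."""
    return 2.0 * (t - T0) + math.sin(t) - math.sin(T0)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if a == b:
        return 0.0
    step = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * step)
    return total * step / 3.0

# Equation (3): integral of H against F_p = continuous part + atom of mass p at t_1
lhs = simpson(lambda t: H(t) * f(t), T0, T) + P * H(T1)

# Proposition 1: survival-weighted integral minus the jump correction
rhs = simpson(lambda t: h(t) * (1.0 - F(t)), T0, T) - P * simpson(h, T1, T)

print(lhs, rhs)
```

The two printed values should agree up to quadrature error; the agreement relies on $F(T) + p = 1$, which is why $q$ is set to $1 - p$ in the sketch.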
It is helpful to provide some justification for this model. Namely, this problem statement is intended to capture a common situation in which certain events happen at fixed time instants and can be decisive for the game to stop or to proceed.
For instance, political activity or controversy may affect the situation: suppose that the Parliament passes a bill, or the outcome of a referendum establishes a rule, that would seriously impede or forbid the extraction activity (for example, a prohibition of the fracking process). Obviously, companies know that the decision will be taken on a certain day and they can also estimate the probability of a negative outcome. Hence, it can be readily embedded into the ex-ante estimation of the terminal time probability distribution. Furthermore, the interpretation of such a scenario can also be extended towards other dynamic models involving environmental aspects. For example, even settings where the objective is pollution reduction can be affected by temporary shocks which modify the p.d.f. of some relevant variable: if the state variable is the pollution stock and we have a p.d.f. of its diffusion over the environment ex ante, a natural event may cause a jump in the distribution and, consequently, the need for a change of strategy. Applications in other fields (such as insurance theory) can be hypothesized as well, but that goes far beyond the scope of our paper.
Back to our modeling, the jump in the probability distribution can also occur at the initial time, and this implies that there is a finite probability that the game does not start at all. Such a situation can be very interesting from the theoretical point of view as this corresponds to a non-proper probability function, i.e., a situation that was never addressed before in the literature.
Finally, an interesting interpretation can be attached to the c.d.f. $F_p(t)$: basically, $p \in [0, 1)$, suggesting that it can represent the probability that the jump occurs. Namely, if $t_1 = t_0$ the game stops immediately after the start, and since $F(t_1) = 0$, $p = 1$. On the other hand, p decreases as time goes on, because $F(\cdot)$ is increasing: if $t_1 = T$, no jump occurs and $F(T) = 1$, so $p = 0$.

#### 2.2. Problem Statement for a Subgame

A key notion in dynamic (differential) games is that of a subgame [13], which takes a non-standard form here because of the stochastic duration of the game.
Let the game evolve along the trajectory $\tilde{x}(t)$. To better identify subgames of $\Gamma^T(t_0, \tilde{x})$, we distinguish two main cases, which differ in the payoff flows: the subgame starts either before the jump instant $t_1$ or at or after $t_1$.
Subgame starting at $\theta < t_1$: Consider a subgame $\Gamma^T(\theta, \tilde{x})$ such that $\theta \in [t_0, t_1)$. The conditional c.d.f. in the considered subgame takes the following form:
$F_\theta^p(t) = \frac{F_p(t) - F_p(\theta)}{1 - F_p(\theta)},$
where
$F_\theta^p(t) = \begin{cases} \dfrac{F(t) - F(\theta)}{1 - F(\theta)}, & t \in [\theta, t_1), \\ \dfrac{F(t) + p - F(\theta)}{1 - F(\theta)}, & t \in [t_1, T]. \end{cases}$
Therefore, recalling that $q = 1 - p$, the expected integral payoff accruing to player i in this subgame is given by the following formula:
$K_i(\theta, \tilde{x}, u_{11}, \dots, u_{NM}) = \int_{\theta}^{T} h_i(t)\,(1 - F_\theta^p(t))\, dt =$
$= \int_{\theta}^{t_1} h_i(t) \left( 1 - \frac{F(t) - F(\theta)}{1 - F(\theta)} \right) dt + \int_{t_1}^{T} h_i(t) \left( 1 - \frac{F(t) + p - F(\theta)}{1 - F(\theta)} \right) dt =$
$= \frac{1}{1 - F(\theta)} \int_{\theta}^{t_1} h_i(t)\,(1 - F(t))\, dt + \frac{1}{1 - F(\theta)} \int_{t_1}^{T} h_i(t)\,(q - F(t))\, dt =$
$= \frac{1}{1 - F(\theta)} \left[ \int_{\theta}^{t_1} h_i(t)\,(1 - F(t))\, dt + \int_{t_1}^{T} h_i(t)\,(1 - F(t))\, dt - p \int_{t_1}^{T} h_i(t)\, dt \right] =$
$= \frac{1}{1 - F(\theta)} \left[ \int_{\theta}^{T} h_i(t)\,(1 - F(t))\, dt - p \int_{t_1}^{T} h_i(t)\, dt \right].$
Subgame starting at $\hat{\theta} \ge t_1$: Consider a subgame $\Gamma^T(\hat{\theta}, \tilde{x})$ such that $\hat{\theta} \in [t_1, T]$. The conditional cumulative distribution function in the considered subgame takes the following form:
$F_{\hat{\theta}}^p(t) = \frac{F(t) - F(\hat{\theta})}{1 - p - F(\hat{\theta})}.$
Therefore, player i’s expected integral payoff is provided by the formula:
$K_i(\hat{\theta}, \tilde{x}, u_{11}, \dots, u_{NM}) = \frac{1}{1 - p - F(\hat{\theta})} \int_{\hat{\theta}}^{T} h_i(t)\,(1 - p - F(t))\, dt.$
Thus, we have proved the following proposition.
Proposition 2.
The expected integral payoff of player i in the subgame $\Gamma^T(\theta, \tilde{x})$, $\theta \in [t_0, T]$, has the following form:
$K_i(\theta, \tilde{x}, u_{11}, \dots, u_{NM}) = \begin{cases} \dfrac{1}{1 - F(\theta)} \left[ \displaystyle\int_{\theta}^{t_1} h_i(t)\,(1 - F(t))\, dt + \int_{t_1}^{T} h_i(t)\,(1 - F(t))\, dt - p \int_{t_1}^{T} h_i(t)\, dt \right], & \text{if } \theta \in [t_0, t_1); \\ \dfrac{1}{1 - p - F(\theta)} \displaystyle\int_{\theta}^{T} h_i(t)\,(1 - p - F(t))\, dt, & \text{if } \theta \in [t_1, T]. \end{cases}$
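As a sanity check of Proposition 2, the following sketch evaluates the subgame payoff in two ways on a toy instance: through the two-branch formula above, and directly as the conditional expectation of $H_i(\hat{t}) - H_i(\theta)$ given that the game has not ended by $\theta$. The linear $F$, the payoff stream $h$ and all numerical values are illustrative assumptions, and `payoff_prop2`, `payoff_direct` and `survival` are hypothetical helper names.

```python
import math

# Illustrative parameters (assumptions, not taken from the paper)
T0, T1, T = 0.0, 1.0, 2.0    # t_0, jump instant t_1, horizon T
P = 0.3                      # jump size p
Q = 1.0 - P                  # q = F(T)

def F(t):
    return Q * (t - T0) / (T - T0)          # continuous part of the c.d.f.

def f(t):
    return Q / (T - T0)                     # its density

def h(t):
    return 2.0 + math.cos(t)                # hypothetical payoff stream

def H(t):
    return 2.0 * (t - T0) + math.sin(t) - math.sin(T0)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if a == b:
        return 0.0
    step = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * step)
    return total * step / 3.0

def survival(theta):
    """P(t_hat > theta) = 1 - F_p(theta), with F_p right-continuous at t_1."""
    return (1.0 - F(theta)) if theta < T1 else (Q - F(theta))

def payoff_prop2(theta):
    """Two-branch formula of Proposition 2 (first branch in compact form)."""
    if theta < T1:
        value = (simpson(lambda t: h(t) * (1.0 - F(t)), theta, T)
                 - P * simpson(h, T1, T))
    else:
        value = simpson(lambda t: h(t) * (Q - F(t)), theta, T)
    return value / survival(theta)

def payoff_direct(theta):
    """Conditional expectation of H(t_hat) - H(theta), given t_hat > theta."""
    cont = simpson(lambda t: (H(t) - H(theta)) * f(t), theta, T)
    atom = P * (H(T1) - H(theta)) if theta < T1 else 0.0
    return (cont + atom) / survival(theta)

thetas = [0.0, 0.5, 1.0, 1.5]               # two points on each side of the jump
gaps = [abs(payoff_prop2(th) - payoff_direct(th)) for th in thetas]
print(gaps)
```

The point $\theta = T$ is deliberately excluded, since the survival probability vanishes there and both expressions become indeterminate.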

#### 2.3. Open-loop Nash equilibrium

To find the equilibrium in the non-cooperative setup of the game, we use the definition of a time-consistent Nash equilibrium from [13], adapted to the new problem statement as defined in Section 2.1. Let us consider the case $M = 1$ (the definition can be easily extended to the case with several resources).
Definition 1.
A set of strategies $u_1^*(s), u_2^*(s), \dots, u_N^*(s)$ is said to constitute a Nash equilibrium solution for the N-person differential game (Equations (1)–(4)) if the following inequalities are satisfied for all $u_i(s) \in U_i$, $i = 1, \dots, N$, $s \in [t_0, T]$:
$\begin{array}{l} K_1(s, x^*(s), u_1^*, u_2^*, \dots, u_N^*) \ge K_1(s, x^*(s), u_1, u_2^*, \dots, u_N^*), \\ K_2(s, x^*(s), u_1^*, u_2^*, \dots, u_N^*) \ge K_2(s, x^*(s), u_1^*, u_2, u_3^*, \dots, u_N^*), \\ \qquad \vdots \\ K_N(s, x^*(s), u_1^*, u_2^*, \dots, u_N^*) \ge K_N(s, x^*(s), u_1^*, u_2^*, \dots, u_{N-1}^*, u_N); \end{array}$
where
$\dot{x}^*(s) = g(x^*(s), u_1^*(s), u_2^*(s), \dots, u_N^*(s)), \qquad x^*(t_0) = x_0.$
The set of strategies $u_1^*(s), u_2^*(s), \dots, u_N^*(s)$ is then called a Nash equilibrium of the game.

## 3. Main Results in the Cooperative Setup

Suppose that the game $\Gamma^T(t_0, x_0)$ is played in a cooperative scenario. Generally speaking, cooperation means that a group of companies agrees to form a coalition before starting the game. In this case, we assume that such a group is the grand coalition, i.e., the totality of the involved players. Clearly, any dynamic model in which players form coalitions that are subgroups of the grand coalition deserves special attention as well, but it is outside the scope of this paper (for the construction of the value functions in cooperative differential games, see, for example, [27,28]).
From now on, to simplify the notation and to reconcile the ongoing discussion with a standard case, we assume a unique exhaustible resource, which is extracted by N different companies, hence $M = 1$ and $u_1, \dots, u_N$ are the effort levels. The cooperating players decide to use optimal strategies $u_1^*, \dots, u_N^*$, which are defined as the strategies maximizing the sum of all payoffs, i.e.,
$(u_1^*, \dots, u_N^*) = \arg\max_{u \in U_1 \times \cdots \times U_N} \sum_{i=1}^{N} K_i(t_0, x_0, u_1, \dots, u_N).$
As is standard in cooperative games, all players in the coalition jointly agree on a distribution method to share the total payoff. It is possible that, in some instant, the solution of the current game is not optimal according to the optimality principle which was initially selected, meaning that the optimality principle may lose time-consistency. Because we are investigating a dynamic setting, it is necessary to define and to determine an imputation distribution procedure which is supposed to be compliant with the payoff in the form of Equation (4).
Before proceeding, we briefly recall the notion of imputation: in an N-player cooperative game, an imputation is a distribution $\xi = (\xi_1, \dots, \xi_N)$ among players such that the sum of its coordinates is equal to the value of the grand coalition and each $\xi_i$ assigns to the i-th player a quantity which is not smaller than the one she would achieve by playing as a singleton. In other words, if N is the set of players and $v : 2^N \to \mathbb{R}$ is the characteristic function of the game, $\xi$ is an imputation if $\xi_1 + \cdots + \xi_N = v(N)$ and $\xi_i \ge v(\{i\})$ for all $i = 1, \dots, N$. The first property is called efficiency and guarantees that the imputation is a method of distribution of the total gain among all players (for an exhaustive overview of cooperative games, see [29]). Different imputations are usually employed in cooperative games, because not all solution concepts fit all models. However, the most useful one seems to be the Shapley value, first introduced by Nobel laureate L.S. Shapley in [30] in 1953, which has been utilized in a huge number of economic and financial applications. (An extensive treatment of the Shapley value and of other relevant solution concepts can be found in [29].)
Definition 2.
Given an imputation $\xi = (\xi_1, \dots, \xi_N) \in \mathbb{R}_+^N$ in a game $\Gamma^T(t_0, x^*)$, such that for all $i = 1, \dots, N$ we have that:
$\xi_i = \int_{t_0}^{T} (1 - F(\tau))\, \beta_i(\tau)\, d\tau - p \int_{t_1}^{T} \beta_i(\tau)\, d\tau,$
then the vector function $\beta(t) = (\beta_1(t), \dots, \beta_N(t)) \in \mathbb{R}_+^N$ is called an imputation distribution procedure (IDP).
The next definition states the property of time-consistency for imputations.
Definition 3.
An imputation $\xi = (\xi_1, \dots, \xi_N) \in \mathbb{R}_+^N$ in a game $\Gamma^T(t_0, x^*)$ is time-consistent if there exists an IDP $\beta(t) = (\beta_1(t), \dots, \beta_N(t)) \in \mathbb{R}_+^N$ such that:
1.
for all $\theta \in [t_0, t_1)$ the vector $\xi^\theta = (\xi_1^\theta, \dots, \xi_N^\theta)$, where
$\xi_i^\theta = \frac{1}{1 - F(\theta)} \left[ \int_{\theta}^{T} \beta_i(t)\,(1 - F(t))\, dt - p \int_{t_1}^{T} \beta_i(t)\, dt \right],$
for all $i = 1, \dots, N$, belongs to the same optimality principle in the subgame $\Gamma^T(\theta, x^*)$, i.e., $\xi^\theta$ is an imputation in $\Gamma^T(\theta, x^*)$;
2.
for all $\hat{\theta} \in [t_1, T]$ the vector $\hat{\xi}^{\hat{\theta}} = (\hat{\xi}_1^{\hat{\theta}}, \dots, \hat{\xi}_N^{\hat{\theta}})$, where
$\hat{\xi}_i^{\hat{\theta}} = \frac{1}{1 - p - F(\hat{\theta})} \int_{\hat{\theta}}^{T} \beta_i(t)\,(1 - p - F(t))\, dt,$
for all $i = 1, \dots, N$, belongs to the same optimality principle in the subgame $\Gamma^T(\hat{\theta}, x^*)$, i.e., $\hat{\xi}^{\hat{\theta}}$ is an imputation in $\Gamma^T(\hat{\theta}, x^*)$.
The next step consists in determining a relation between $\xi$ and $\beta$. Here as well, we have to distinguish the cases when the subgame starts before or after the jump at instant $t_1$. Firstly, we prove a lemma which is helpful to reformulate the imputation $\xi$. The subsequent proposition then explicitly outlines the forms of the IDPs of the game.
Lemma 1.
If $t_0 \le \theta \le t_1 \le \hat{\theta} \le T$, for all $i = 1, \dots, N$, the coordinates of the imputation $\xi$ can be written as follows:
$\xi_i = \int_{t_0}^{\theta} \beta_i(t)\,(1 - F(t))\, dt + (1 - F(\theta))\, \xi_i^\theta, \qquad (6)$
$\xi_i = \int_{t_0}^{t_1} \beta_i(t)\,(1 - F(t))\, dt + \int_{t_1}^{\hat{\theta}} \beta_i(t)\,(q - F(t))\, dt + (q - F(\hat{\theta}))\, \xi_i^{\hat{\theta}}. \qquad (7)$
Proof.
We can write the following:
$\xi_i = \int_{t_0}^{T} \beta_i(t)\,(1 - F(t))\, dt - p \int_{t_1}^{T} \beta_i(t)\, dt =$
$= \int_{t_0}^{\theta} \beta_i(t)\,(1 - F(t))\, dt + \int_{\theta}^{T} \beta_i(t)\,(1 - F(t))\, dt - p \int_{t_1}^{T} \beta_i(t)\, dt =$
$= \int_{t_0}^{\theta} \beta_i(t)\,(1 - F(t))\, dt + (1 - F(\theta))\, \xi_i^\theta,$
which is Equation (6).
For the second case, we can write the following:
$\xi_i = \int_{t_0}^{t_1} \beta_i(t)\,(1 - F(t))\, dt + \int_{t_1}^{T} \beta_i(t)\,(q - F(t))\, dt =$
$= \int_{t_0}^{t_1} \beta_i(t)\,(1 - F(t))\, dt + \int_{t_1}^{\hat{\theta}} \beta_i(t)\,(q - F(t))\, dt + \int_{\hat{\theta}}^{T} \beta_i(t)\,(q - F(t))\, dt =$
$= \int_{t_0}^{t_1} \beta_i(t)\,(1 - F(t))\, dt + \int_{t_1}^{\hat{\theta}} \beta_i(t)\,(q - F(t))\, dt + (q - F(\hat{\theta}))\, \xi_i^{\hat{\theta}},$
which is Equation (7). ☐
Proposition 3.
If $\theta \in [t_0, t_1)$, then for all $i = 1, \dots, N$, the i-th coordinate of the IDP is given by:
$\beta_i(\theta) = \frac{f(\theta)}{1 - F(\theta)}\, \xi_i^\theta - (\xi_i^\theta)'.$
If $\theta \in [t_1, T]$, then for all $i = 1, \dots, N$, the i-th coordinate of the IDP is given by:
$\beta_i(\theta) = \frac{f(\theta)}{q - F(\theta)}\, \xi_i^\theta - (\xi_i^\theta)'.$
Proof.
When $\theta \in [t_0, t_1)$, we can differentiate Equation (6) with respect to $\theta$, writing $f = F'$, thus obtaining:
$0 = \beta_i(\theta)\,(1 - F(\theta)) - f(\theta)\, \xi_i^\theta + (1 - F(\theta))\,(\xi_i^\theta)'.$
Then, solving for $\beta_i(\theta)$ yields:
$\beta_i(\theta) = \frac{f(\theta)}{1 - F(\theta)}\, \xi_i^\theta - (\xi_i^\theta)'.$
When $\hat{\theta} \in [t_1, T)$, we can differentiate Equation (7) with respect to $\hat{\theta}$, thus obtaining:
$0 = \beta_i(\hat{\theta})\,(q - F(\hat{\theta})) - f(\hat{\theta})\, \xi_i^{\hat{\theta}} + (q - F(\hat{\theta}))\,(\xi_i^{\hat{\theta}})'.$
Then, solving for $\beta_i(\hat{\theta})$ yields:
$\beta_i(\hat{\theta}) = \frac{f(\hat{\theta})}{q - F(\hat{\theta})}\, \xi_i^{\hat{\theta}} - (\xi_i^{\hat{\theta}})'.$
☐
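The relation between the subgame imputation and the IDP in Proposition 3 can be illustrated numerically. In the sketch below, a payment rate $\beta$ is fixed in advance, the subgame shares $\xi^\theta$ are computed from the formulas of Definition 3, and the IDP is then recovered from $\xi^\theta$ and a finite-difference approximation of its derivative. The linear $F$, the specific $\beta$, the step size `eps` and the helper names are assumptions made for illustration only.

```python
# Illustrative parameters (assumptions, not taken from the paper); t_0 = 0
T1, T = 1.0, 2.0             # jump instant t_1 and horizon T
P = 0.3                      # jump size p
Q = 1.0 - P                  # q = F(T)

def F(t):
    return Q * t / T                        # continuous part of the c.d.f.

def f(t):
    return Q / T                            # its density, f = F'

def beta(t):
    return 1.0 + t                          # hypothetical payment rate (the "true" IDP)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if a == b:
        return 0.0
    step = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * step)
    return total * step / 3.0

def xi(theta):
    """Subgame share generated by beta, following the two cases of Definition 3."""
    if theta < T1:
        value = (simpson(lambda t: beta(t) * (1.0 - F(t)), theta, T)
                 - P * simpson(beta, T1, T))
        return value / (1.0 - F(theta))
    return simpson(lambda t: beta(t) * (Q - F(t)), theta, T) / (Q - F(theta))

def idp_from_imputation(theta, eps=1e-4):
    """Proposition 3: beta = hazard * xi - xi', with xi' from a central difference."""
    dxi = (xi(theta + eps) - xi(theta - eps)) / (2.0 * eps)
    surv = (1.0 - F(theta)) if theta < T1 else (Q - F(theta))
    return f(theta) / surv * xi(theta) - dxi

# One evaluation point on each side of the jump, away from t_1 and T
errors = [abs(idp_from_imputation(th) - beta(th)) for th in (0.4, 1.5)]
print(errors)
```

The evaluation points are kept at a distance larger than `eps` from $t_1$, because the central difference must not straddle the jump, where $\xi^\theta$ changes its functional form.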
The above results can be collected as follows:
Theorem 1.
Let the imputation $\xi(t, x^*(t), T)$ of the game $\Gamma^T(t_0, x^*)$ be an absolutely continuous function of t, $t \in [t_0, T]$. If the IDP has one of the following forms:
1.
if $\tau \in [t_0, t_1)$,
$\beta_i(\tau) = \frac{f(\tau)}{1 - F(\tau)}\, \xi_i(\tau, x^*(\tau), T) - \xi_i'(\tau, x^*(\tau), T), \qquad (10)$
2.
if $\tau \in [t_1, T]$,
$\beta_i(\tau) = \frac{f(\tau)}{1 - p - F(\tau)}\, \xi_i(\tau, x^*(\tau), T) - \xi_i'(\tau, x^*(\tau), T), \qquad (11)$
then $\xi(t_0, x_0, T)$ is a time-consistent imputation with IDP given by either Equation (10) or (11).
The problem of stable cooperation in differential games with random duration where the c.d.f. is continuous (without any jumps) was studied in [8,16,18]. Setting $p = 0$ in our model, the obtained results coincide with those in the above-mentioned works. Moreover, the new results also cover fully deterministic models. Namely, for the problem with prescribed duration, setting $f(\tau) = 0$ in Equations (10) and (11) yields the results published in [9]. For the problem with constant discounting, see [11] and Equation (10) with $\frac{f(\tau)}{1 - F(\tau)} = \lambda$.
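The constant-discounting case can be made concrete with an exponential distribution, for which the hazard rate $f/(1 - F)$ is constant. The following sketch assumes $F(t) = 1 - e^{-\lambda t}$ with a hypothetical rate `LAM` and no jump ($p = 0$); `idp_constant_discount` is an illustrative helper, not a function from any of the cited works.

```python
import math

LAM = 0.5  # hypothetical constant hazard (discount) rate lambda

def F(t):
    return 1.0 - math.exp(-LAM * t)         # exponential c.d.f.

def f(t):
    return LAM * math.exp(-LAM * t)         # its density

def hazard(t):
    return f(t) / (1.0 - F(t))              # equals lambda for every t

def idp_constant_discount(xi_value, xi_derivative, lam=LAM):
    """Equation (10) when f/(1 - F) = lam: beta = lam * xi - xi'."""
    return lam * xi_value - xi_derivative

rates = [hazard(t) for t in (0.0, 0.5, 1.0, 3.0)]
print(rates)
```

In this special case the IDP depends on the distribution only through $\lambda$, which plays the role of a discount rate.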

## 4. An Example

We are going to consider a simple model of common-property nonrenewable resource extraction published in [31] in 2000, and then further investigated in successive papers (e.g., [15,32]).
Here as well, $M = 1$, that is, we have a unique state variable $x(t)$ indicating the stock of a nonrenewable resource at time t. The companies’ strategic variables $u_i(t)$, for $i = 1, \dots, N$, denote the rates of extraction, or extraction efforts, at time t. The state equation has the form:
$\dot{x}(t) = -\sum_{i=1}^{N} u_i(t), \qquad (12)$
and the initial condition, i.e., the amount of resource at time $t_0$, is $x(t_0) = x_0$. The differential Equation (12) is the most standard and simple dynamics in nonrenewable resource extraction games, where all players jointly extract and deplete the resource with the same intensity. When the involved resource is renewable, it also regenerates at a growth rate $\delta$, hence a positive linear term in the state variable also appears in Equation (12), and the model must be treated differently (see for example [33] or the survey [34]).
Back to the model, we suppose that the game ends at the random time instant $\hat{t}$, a random variable having exponential distribution $F(t)$ on the interval $[t_0, t_1]$ (Figure 2), i.e., we are investigating the first case, before the jump in the distribution. We also assume that the jump takes place at the end of the interval $[t_0, T]$, i.e., $t_1 = T$. Hence, the discontinuity occurs at the terminal time. The c.d.f. of the random variable $\hat{t}$ is given by:
$F(t) = e^{-t_0} \left( 1 - e^{-(t - t_0)} \right),$
which turns into $F(t) = 1 - e^{-t}$ for $t_0 = 0$. From now on, we consider this case, i.e., $t_0 = 0$.
Note that we can provide the complete formulation of the discontinuous c.d.f. as in the previous section:
$F_p(t) = \begin{cases} 1 - e^{-t}, & t \in [0, t_1), \\ 1 - e^{-t} + e^{-T}, & t \in [t_1, T], \end{cases}$
meaning that, in this case, $p = e^{-T}$.
In this game, each player i has a utility function
$h_i(x(t), u_i(t)) = k_i u_i(t) - \frac{1}{2} u_i(t)^2 - \delta_i x(t),$
where $k_i$ and $\delta_i$ are positive constants depending on the specific scenario and on the companies’ characteristics.
The expected integral payoff of player i is (to lighten the notation, we omit redundant arguments whenever possible):
$K_i = \int_{0}^{t_1} \left( k_i u_i(t) - \frac{1}{2} u_i(t)^2 - \delta_i x(t) \right) e^{-t}\, dt.$
We now derive the open-loop equilibrium trajectories of the state and controls in the noncooperative form of the game using Pontryagin’s maximum principle, which is one of the two major procedures for determining equilibrium structures in differential games [31]. In this model, this method is suitable, because the open-loop trajectories are easily visualized in $K_i(\cdot)$. Each company aims to solve the following problem:
$\max_{u_i} \int_{0}^{t_1} \left( k_i u_i(t) - \frac{1}{2} u_i(t)^2 - \delta_i x(t) \right) e^{-t}\, dt.$
Each player has a Hamiltonian function of the form:
$H i ( · ) = − ψ i ( t ) ∑ j = 1 n u j ( t ) + k i u i ( t ) − 1 2 u i ( t ) 2 − δ i x ( t ) e − t ,$
where $ψ i ( t )$ is the i-th adjoint variable attached by company i to the resource dynamics or, in line with a standard economic interpretation, the related shadow price.
Differentiating each Hamiltonian with respect to $u i$ and then equating to 0 yields the first order conditions:
$∂ H i ∂ u i = − ψ i ( t ) + ( k i − u i ( t ) ) e − t = 0 ,$
then, solving for $u i ( t )$, we obtain:
$u i ( t ) = k i − ψ i ( t ) e t .$
The second order conditions hold, because for all $i = 1, \dots, N$:
$\frac{\partial^2 H_i}{\partial u_i^2} = -e^{-t} < 0.$
The adjoint equations with the corresponding transversality conditions read:
$\dot{\psi}_i(t) = \delta_i e^{-t}, \qquad \psi_i(t_1) = 0,$
hence the optimal costates are $\psi_i^*(t) = \delta_i \left( e^{-t_1} - e^{-t} \right)$, for all $i = 1, \dots, N$.
Plugging $\psi_i^*(t)$ into the FOCs yields the optimal controls, i.e.,
$u_i^*(t) = k_i - \delta_i \left( e^{t - t_1} - 1 \right).$
To determine the optimal state $x^*(t)$, it suffices to substitute Equation (14) into the state dynamics in Equation (12) and subsequently integrate both sides, employing the initial condition:
$\dot{x}(t) = \left( e^{t - t_1} - 1 \right) \sum_{j=1}^{N} \delta_j - \sum_{j=1}^{N} k_j, \qquad x(0) = x_0,$
so the optimal stock of resource amounts to:
$x^*(t) = x_0 - t \sum_{j=1}^{N} \left( \delta_j + k_j \right) + \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1}.$
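The noncooperative solution can be sanity-checked numerically. The Python sketch below evaluates the first order condition at the optimal control and compares a finite-difference derivative of the optimal stock with the total extraction rate, assuming the state dynamics $\dot{x} = -\sum_j u_j$ implied by the Hamiltonian. Only $N$, $\sum_j \delta_j$, $t_1$ and the $k_j$ come from the numerical example of this section; the split of the $\delta_j$ across companies, the value of $x_0$, and all function names are assumptions made for illustration.

```python
import math

# Only N, sum(delta), t1 and k come from the text; the split of delta
# across firms and the initial stock x0 are assumptions for illustration.
k = [1.0, 2.0, 3.0]
delta = [2.0e-5, 2.3e-5, 2.6e-5]
t1, x0 = 20.0, 200.0
N = len(k)

def psi(i, t):
    """Costate psi_i^*(t) = delta_i (e^{-t1} - e^{-t})."""
    return delta[i] * (math.exp(-t1) - math.exp(-t))

def u_star(i, t):
    """Open-loop control u_i^*(t) = k_i - delta_i (e^{t - t1} - 1)."""
    return k[i] - delta[i] * (math.exp(t - t1) - 1.0)

def x_star(t):
    """Noncooperative stock x^*(t)."""
    return (x0 - t * (sum(delta) + sum(k))
            + sum(delta) * (math.exp(t) - 1.0) * math.exp(-t1))

def foc_residual(i, t):
    """dH_i/du_i at u_i^*: -psi_i + (k_i - u_i) e^{-t}; should vanish."""
    return -psi(i, t) + (k[i] - u_star(i, t)) * math.exp(-t)

def state_residual(t, h=1e-5):
    """Central-difference dx^*/dt plus total extraction; ~0 if x' = -sum_j u_j."""
    dx = (x_star(t + h) - x_star(t - h)) / (2.0 * h)
    return dx + sum(u_star(i, t) for i in range(N))

for t in (1.0, 10.0, 19.0):
    worst_foc = max(abs(foc_residual(i, t)) for i in range(N))
    print(f"t = {t:4.1f}  FOC residual = {worst_foc:.1e}  "
          f"state residual = {state_residual(t):.1e}")
```

Both residuals should be zero up to rounding and discretization error; a residual of visibly larger size would point to a mismatch between the controls, the costates and the stock.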
Now, we are going to take into account a cooperative version of the game, that is a scenario where all companies agree to play strategies such that their aggregate payoff is maximized. The sum of all payoffs is:
$\sum_{j=1}^{N} K_j = \sum_{j=1}^{N} \int_0^{t_1} \left( k_j u_j(t) - \frac{1}{2} u_j(t)^2 - \delta_j x(t) \right) e^{-t} \, dt.$
The approach for the determination of the open-loop equilibrium structure is analogous to the one adopted in the noncooperative case. From now on, we are going to use the notation $u_i^C(t)$, $x^C(t)$ to avoid confusion with the previous quantities. The cooperative optimal controls and the corresponding resource stock are:
$u_i^C(t) = k_i - \sum_{j=1}^{N} \delta_j \left( e^{t - t_1} - 1 \right),$
$x^C(t) = x_0 - t \sum_{j=1}^{N} \left( N \delta_j + k_j \right) + N \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1}.$
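For completeness, the intermediate steps of the cooperative case can be sketched as follows. This is a reconstruction under the assumption that the joint problem is solved with a single costate, denoted here by $\psi^C$ (a symbol introduced only for this sketch), attached to the same state dynamics as in the noncooperative case.

```latex
% Joint Hamiltonian of the grand coalition (psi^C is notation assumed for this sketch)
H^C(\cdot) = -\psi^C(t) \sum_{j=1}^{N} u_j(t)
  + \sum_{j=1}^{N} \left( k_j u_j(t) - \frac{1}{2} u_j(t)^2 - \delta_j x(t) \right) e^{-t}

% First order conditions, i = 1, ..., N
\frac{\partial H^C}{\partial u_i} = -\psi^C(t) + \left( k_i - u_i(t) \right) e^{-t} = 0
  \quad \Longrightarrow \quad u_i(t) = k_i - \psi^C(t) e^{t}

% Adjoint equation and transversality condition
\dot{\psi}^C(t) = -\frac{\partial H^C}{\partial x} = \sum_{j=1}^{N} \delta_j e^{-t},
  \qquad \psi^C(t_1) = 0
  \quad \Longrightarrow \quad
  \psi^C(t) = \sum_{j=1}^{N} \delta_j \left( e^{-t_1} - e^{-t} \right)

% Cooperative controls
u_i^C(t) = k_i - \sum_{j=1}^{N} \delta_j \left( e^{t - t_1} - 1 \right)
```

The only difference from the noncooperative case is that the costate internalizes the sum of all $\delta_j$ rather than the individual $\delta_i$. Summing the controls and integrating the state dynamics from $x(0) = x_0$ then produces the factor $N$ in the cooperative stock.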
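The comparison between the resource stocks in the two scenarios can be illustrated by a simple inequality, highlighting that the noncooperative resource stock exceeds the cooperative one (Figure 3 and Figure 4). Namely, at all $t \in [t_0, t_1]$, we have that:
$\begin{aligned} & x^*(t) \ge x^C(t) \\ \Updownarrow \ & x_0 - t \sum_{j=1}^{N} \left( \delta_j + k_j \right) + \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1} \ge x_0 - t \sum_{j=1}^{N} \left( N \delta_j + k_j \right) + N \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1} \\ \Updownarrow \ & t (N - 1) \sum_{j=1}^{N} \delta_j \ge (N - 1) \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1} \\ \Updownarrow \ & e^{t_1} \ge \frac{e^{t} - 1}{t}. \end{aligned}$
At $t = 0$ the two stocks coincide, and for $t \in (0, t_1]$ such an estimate always holds, because
$e^{t_1} \ge e^{t} \ge \frac{e^{t} - 1}{t},$
where the last inequality is equivalent to $1 - e^{-t} \le t$.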
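The ordering of the two stocks can also be checked on a grid. The Python sketch below evaluates both closed-form stocks and their difference, which reduces to $(N-1) \sum_j \delta_j \left[ t - (e^{t} - 1) e^{-t_1} \right]$. As before, the individual $\delta_j$, the value of $x_0$ and the function names are assumptions; only $N$, $\sum_j \delta_j$, $t_1$ and the $k_j$ are taken from the numerical example.

```python
import math

# Only N, sum(delta), t1 and k come from the text; the split of delta
# across firms and the initial stock x0 are assumptions for illustration.
k = [1.0, 2.0, 3.0]
delta = [2.0e-5, 2.3e-5, 2.6e-5]
t1, x0 = 20.0, 200.0
N, S, K_sum = len(k), sum(delta), sum(k)

def x_noncoop(t):
    """Noncooperative stock x^*(t)."""
    return x0 - t * (S + K_sum) + S * (math.exp(t) - 1.0) * math.exp(-t1)

def x_coop(t):
    """Cooperative stock x^C(t)."""
    return x0 - t * (N * S + K_sum) + N * S * (math.exp(t) - 1.0) * math.exp(-t1)

def gap_closed_form(t):
    """x^*(t) - x^C(t) = (N - 1) * S * [t - (e^t - 1) e^{-t1}]."""
    return (N - 1) * S * (t - (math.exp(t) - 1.0) * math.exp(-t1))

# Uniform grid on [0, t1].
grid = [t1 * j / 200 for j in range(201)]
gaps = [x_noncoop(t) - x_coop(t) for t in grid]

print("smallest gap on the grid:", min(gaps))
print("gap at t1:", gaps[-1])
```

With these illustrative values the gap is zero at $t = 0$ and positive afterwards, in line with the inequality above. Its small size reflects the small value of $\sum_j \delta_j$ used in the simulation.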
An investigation of a suitable IDP requires the definition of an imputation in this model. If we choose an egalitarian distribution, we can define the shares of the imputation as fractions of the total payoff equally divided by the number of players, i.e.,
$\xi_i = \frac{\max_{u} \sum_{j=1}^{N} K_j(x_0, u_1, \dots, u_N)}{N} = \frac{1}{N} \sum_{j=1}^{N} \int_{t_0}^{t_1} \left( k_j u_j^C(t) - \frac{1}{2} u_j^C(t)^2 - \delta_j x^C(t) \right) e^{-t} \, dt.$
The case we are taking into account is the first one in the previous section, i.e., $\theta \in [t_0, t_1]$, where constant $D = 0$. Furthermore, the exponential c.d.f. at hand has a relevant property: since $f(t) = e^{-t}$, the ratio $\frac{f(t)}{1 - F(t)} = 1$, hence Equation (8) for IDP takes the form:
$\beta_i(\theta) = \xi_i^{\theta} - \left( \xi_i^{\theta} \right)'.$
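Evaluating $h_i(\cdot)$ at the cooperative optimal controls and state yields:
$\begin{aligned} h_i^*(t) &= k_i \left[ k_i - \sum_{j=1}^{N} \delta_j \left( e^{t - t_1} - 1 \right) \right] - \frac{1}{2} \left[ k_i - \sum_{j=1}^{N} \delta_j \left( e^{t - t_1} - 1 \right) \right]^2 - \delta_i \left[ x_0 - t \sum_{j=1}^{N} \left( N \delta_j + k_j \right) + N \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1} \right] \\ &= k_i^2 - k_i \sum_{j=1}^{N} \delta_j \left( e^{t - t_1} - 1 \right) - \frac{1}{2} \left[ k_i^2 - 2 k_i \sum_{j=1}^{N} \delta_j \left( e^{t - t_1} - 1 \right) + \left( \sum_{j=1}^{N} \delta_j \right)^2 \left( e^{t - t_1} - 1 \right)^2 \right] - \delta_i x_0 + t \delta_i \sum_{j=1}^{N} \left( N \delta_j + k_j \right) - N \delta_i \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1} \\ &= \frac{k_i^2}{2} - \frac{1}{2} \left( \sum_{j=1}^{N} \delta_j \right)^2 \left( e^{t - t_1} - 1 \right)^2 - \delta_i x_0 + t \delta_i \sum_{j=1}^{N} \left( N \delta_j + k_j \right) - N \delta_i \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1}. \end{aligned}$
By employing $h_i^*(t)$ in $K_i(\cdot)$, we can determine the expression of the expected integral payoff of company $i$ for a subgame starting at $\theta \in [0, t_1]$:
$\begin{aligned} K_i^*(\theta) &= \frac{1}{e^{-\theta}} \int_{\theta}^{t_1} \left[ \frac{k_i^2}{2} - \frac{1}{2} \left( \sum_{j=1}^{N} \delta_j \right)^2 \left( e^{t - t_1} - 1 \right)^2 - \delta_i x_0 + t \delta_i \sum_{j=1}^{N} \left( N \delta_j + k_j \right) - N \delta_i \sum_{j=1}^{N} \delta_j \left( e^{t} - 1 \right) e^{-t_1} \right] e^{-t} \, dt \\ &= \left( \frac{k_i^2}{2} - \delta_i x_0 \right) \left( 1 - e^{\theta - t_1} \right) - \left( \sum_{j=1}^{N} \delta_j \right)^2 \frac{2 (\theta - t_1) e^{\theta - t_1} + 1 - e^{2\theta - 2 t_1}}{2} \\ &\quad + \delta_i \sum_{j=1}^{N} \left( N \delta_j + k_j \right) \left[ \theta + 1 - (t_1 + 1) e^{\theta - t_1} \right] - N \delta_i \sum_{j=1}^{N} \delta_j \left[ (t_1 - \theta) e^{\theta} + e^{\theta - t_1} - 1 \right] e^{-t_1}. \end{aligned}$
Subsequently, we have to determine $\left( K_i^*(\theta) \right)'$ by a simple differentiation:
$\begin{aligned} \left( K_i^*(\theta) \right)' &= - \left( \frac{k_i^2}{2} - \delta_i x_0 \right) e^{\theta - t_1} - \left( \sum_{j=1}^{N} \delta_j \right)^2 \left[ (1 + \theta - t_1) e^{\theta - t_1} - e^{2 (\theta - t_1)} \right] \\ &\quad + \delta_i \sum_{j=1}^{N} \left( N \delta_j + k_j \right) \left[ 1 - (t_1 + 1) e^{\theta - t_1} \right] - N \delta_i \sum_{j=1}^{N} \delta_j \left[ (t_1 - \theta - 1) e^{\theta} + e^{\theta - t_1} \right] e^{-t_1}. \end{aligned}$
Finally, employing the found forms for $K_i^*(\theta)$ and $\left( K_i^*(\theta) \right)'$, we get:
$K_i^*(\theta) - \left( K_i^*(\theta) \right)' = \frac{k_i^2}{2} - \delta_i x_0 - \frac{1}{2} \left( \sum_{j=1}^{N} \delta_j \right)^2 \left( e^{\theta - t_1} - 1 \right)^2 + \theta \delta_i \sum_{j=1}^{N} \left( N \delta_j + k_j \right) - N \delta_i \sum_{j=1}^{N} \delta_j \left( e^{\theta} - 1 \right) e^{-t_1}.$
Thus, the IDP takes the form
$\begin{aligned} \beta_i(\theta) &= \xi_i^{\theta} - \left( \xi_i^{\theta} \right)' = \frac{\sum_{j=1}^{N} K_j^*(\theta)}{N} - \left( \frac{\sum_{j=1}^{N} K_j^*(\theta)}{N} \right)' = \frac{1}{N} \sum_{j=1}^{N} \left( K_j^*(\theta) - \left( K_j^*(\theta) \right)' \right) \\ &= \frac{1}{N} \sum_{j=1}^{N} \left[ \frac{k_j^2}{2} - \delta_j x_0 - \frac{1}{2} \left( \sum_{l=1}^{N} \delta_l \right)^2 \left( e^{\theta - t_1} - 1 \right)^2 + \theta \delta_j \sum_{l=1}^{N} \left( N \delta_l + k_l \right) - N \delta_j \sum_{l=1}^{N} \delta_l \left( e^{\theta} - 1 \right) e^{-t_1} \right]. \end{aligned}$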
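A numerical cross-check of the IDP is possible without relying on the closed forms. Since $K_i^*(\theta) = e^{\theta} \int_{\theta}^{t_1} h_i^*(t) e^{-t} \, dt$, the Leibniz rule gives $K_i^*(\theta) - (K_i^*(\theta))' = h_i^*(\theta)$, so under the egalitarian imputation the IDP should coincide with the average instantaneous cooperative utility. The Python sketch below tests this with Simpson quadrature and a central difference. The individual $\delta_j$, the value of $x_0$, the quadrature settings and the function names are assumptions made for illustration.

```python
import math

# Only N, sum(delta), t1 and k come from the text; the split of delta
# across firms and the initial stock x0 are assumptions for illustration.
k = [1.0, 2.0, 3.0]
delta = [2.0e-5, 2.3e-5, 2.6e-5]
t1, x0 = 20.0, 200.0
N, S = len(k), sum(delta)

def u_coop(i, t):
    """Cooperative control u_i^C(t)."""
    return k[i] - S * (math.exp(t - t1) - 1.0)

def x_coop(t):
    """Cooperative stock x^C(t)."""
    return x0 - t * (N * S + sum(k)) + N * S * (math.exp(t) - 1.0) * math.exp(-t1)

def h_coop(i, t):
    """Instantaneous utility of firm i along the cooperative path."""
    u = u_coop(i, t)
    return k[i] * u - 0.5 * u * u - delta[i] * x_coop(t)

def K_coop(i, theta, n=4000):
    """Subgame payoff e^theta * int_theta^t1 h_i(t) e^{-t} dt (composite Simpson, n even)."""
    step = (t1 - theta) / n
    def g(t):
        return h_coop(i, t) * math.exp(-(t - theta))
    total = g(theta) + g(t1)
    for m in range(1, n):
        total += (4.0 if m % 2 else 2.0) * g(theta + m * step)
    return total * step / 3.0

def xi(theta):
    """Egalitarian imputation share in the subgame starting at theta."""
    return sum(K_coop(j, theta) for j in range(N)) / N

def idp(theta, eps=1e-4):
    """beta(theta) = xi(theta) - xi'(theta), derivative by central difference."""
    return xi(theta) - (xi(theta + eps) - xi(theta - eps)) / (2.0 * eps)

for theta in (5.0, 10.0, 15.0):
    mean_h = sum(h_coop(j, theta) for j in range(N)) / N
    print(f"theta = {theta:4.1f}  IDP = {idp(theta):.8f}  mean utility = {mean_h:.8f}")
```

The two printed columns should agree up to the quadrature and finite-difference error. Because the check uses only the definitions of $h_i$, $u_i^C$ and $x^C$, it is independent of the algebra in the closed-form expressions.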
Figure 5, which was created with Matlab R2016a, portrays a sketch of the behavior of the imputation and of the IDP over time. The numerical simulation was performed for the following parameters: $N = 3$, $\sum_{j=1}^{3} \delta_j = 0.000069$, $t_1 = 20$, $k_1 = 1$, $k_2 = 2$, and $k_3 = 3$.
In this figure, we can see that the imputation is equal to the integral of the IDP multiplied by the discount probability factor, i.e., $\xi_i^{\theta} = e^{\theta} \int_{\theta}^{t_1} \beta_i(t) e^{-t} \, dt$.

## 5. Conclusions and Further Developments

We proposed an analysis of a class of extraction differential games with uncertain duration possibly involving a discontinuous c.d.f. for the random variable indicating the duration of the game. Then, we focused our attention on the cooperative aspects of the game to identify the appropriate IDP and applied such a theory to a standard nonrenewable resource extraction model.
There are a number of possible improvements, both from theoretical and applied viewpoints, regarding the feedback information structure of such a class of games, the solution concepts (e.g., the Shapley value, the Banzhaf value, and the core) to be employed, models representing scenarios other than the extraction of an exhaustible resource, and models of processes with more complex and realistic c.d.f.s. All of them are left for future research.

## Author Contributions

Conceptualization, E.G. and A.M.; Methodology, E.G.; Validation, E.G., A.M. and A.P.; Formal Analysis, E.G., A.M. and A.P.; Investigation, E.G., A.M. and A.P.; Writing—Original Draft Preparation, E.G., A.M. and A.P.; Writing—Review & Editing, A.M.; Visualization, E.G., A.M. and A.P.

## Funding

Ekaterina Gromova acknowledges grant 17-11-01079 from the Russian Science Foundation.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Isaacs, R. Differential Games; John Wiley and Sons: New York, NY, USA, 1965.
2. Ricardo, D. On the Principles of Political Economy and Taxation; John Murray: London, UK, 1817.
3. Hotelling, H. The economics of exhaustible resources. J. Polit. Econ. 1931, 39, 137–175.
4. Tsiropoulou, E.E.; Vamvakas, P.; Katsinis, G.K.; Papavassiliou, S. Combined Power and Rate Allocation in Self-Optimized Multi-Service Two-Tier Femtocell Networks. Comput. Commun. 2015, 72, 38–48.
5. Yaari, M.E. Uncertain lifetime, life insurance, and the theory of the consumer. Rev. Econ. Stud. 1965, 32, 137–150.
6. Petrosyan, L.A.; Murzov, N.V. Game-theoretic problems of mechanics. Litovsk. Math. Sb. 1966, 7, 423–433.
7. Boukas, E.K.; Haurie, A.; Michael, P. An optimal control problem with a random stopping time. J. Optim. Theory Appl. 1990, 64, 471–480.
8. Petrosyan, L.A.; Shevkoplyas, E.V. Cooperative solutions for games with random duration. Game Theory Appl. 2003, 9, 125–139.
9. Petrosyan, L.A. Time-consistency of solutions in multi-player differential games. Vestn. Leningr. State Univ. Math. 1977, 4, 46–52.
10. Jorgensen, S.; Martin-Herran, G.; Zaccour, G. Agreeability and Time Consistency in Linear-State Differential Games. J. Optim. Theory Appl. 2003, 119, 49–63.
11. Petrosjan, L.A.; Zaccour, G. Time-consistent Shapley value allocation of pollution cost reduction. J. Econ. Dyn. Control 2003, 27, 381–398.
12. Zaccour, G. Time consistency in cooperative differential games: A tutorial. Inf. Syst. Oper. Res. 2008, 46, 81.
13. Yeung, D.W.K.; Petrosyan, L.A. Subgame Consistent Cooperation; Springer: New York, NY, USA, 2016.
14. Reddy, P.V.; Shevkoplyas, E.V.; Zaccour, G. Time-consistent Shapley value for games played over event trees. Automatica 2013, 49, 1521–1527.
15. Kostyunin, S.; Palestini, A.; Shevkoplyas, E.V. On a nonrenewable resource extraction game played by asymmetric firms. J. Optim. Theory Appl. 2014, 163, 660–673.
16. Marin-Solano, J.; Shevkoplyas, E.V. Non-constant discounting and differential games with random time horizon. Automatica 2011, 47, 2626–2638.
17. Parilina, E.M.; Zaccour, G. Node-Consistent Shapley Value for Games Played over Event Trees with Random Terminal Time. J. Optim. Theory Appl. 2017, 175, 236–254.
18. Shevkoplyas, E.V. Stable cooperation in differential games with random duration. Control Soc. Econ. Syst. 2010, 2, 79–105.
19. Shevkoplyas, E.V. The Hamilton-Jacobi-Bellman equation for a class of differential games with random duration. Autom. Remote Control 2014, 75, 959–970.
20. Gromov, D.; Gromova, E. Differential games with random duration: A hybrid systems formulation. Contrib. Game Theory Manag. 2014, 7, 104–119.
21. Gromov, D.; Gromova, E. On a Class of Hybrid Differential Games. Dyn. Games Appl. 2017, 7, 266–288.
22. Malakhova, A.P.; Gromova, E.V. Strongly Time-Consistent Core in Differential Games with Discrete Distribution of Random Time Horizon. Math. Appl. 2018, 46, 197–209.
23. Kuzyutin, D.; Nikitina, M. Time consistent cooperative solutions for multistage games with vector payoffs. Op. Res. Lett. 2017, 45, 269–274.
24. Gromova, E.; Plekhanova, T. On the regularization of a cooperative solution in a multistage game with random time horizon. Discret. Appl. Math. 2018.
25. Feliz, R.A. The optimal extraction rate of a natural resource under uncertainty. Econ. Lett. 1993, 43, 231–234.
26. Gromova, E.V.; Malakhova, A.P.; Tur, A.V. On the conditions on the integral payoff function in the games with random duration. Contrib. Game Theory Manag. 2017, 10, 94–99.
27. Reddy, P.V.; Zaccour, G. A friendly computable characteristic function. Math. Soc. Sci. 2016, 82, 18–25.
28. Gromova, E.V.; Petrosyan, L.A. On an approach to constructing a characteristic function in cooperative differential games. Autom. Remote Control 2017, 78, 1680–1692.
29. Owen, G. Game Theory, 3rd ed.; Academic Press: New York, NY, USA, 1995.
30. Shapley, L.S. A Value for n-person Games. In Contributions to the Theory of Games; Kuhn, H.W., Tucker, A.W., Eds.; Princeton University Press: Princeton, NJ, USA, 1953; Volume II, pp. 307–317.
31. Dockner, E.J.; Jorgensen, S.; Long, N.V.; Sorger, G. Differential Games in Economics and Management Science; Cambridge University Press: Cambridge, UK, 2000.
32. Rubio, S.J. On the coincidence of feedback Nash equilibria and Stackelberg equilibria in economic applications of differential games. J. Optim. Theory Appl. 2006, 128, 203–220.
33. Jorgensen, S.; Yeung, D.W. Stochastic differential game model of a common property fishery. J. Optim. Theory Appl. 1996, 90, 381–403.
34. Van Long, N. Dynamic games in the economics of natural resources: a survey. Dyn. Games Appl. 2011, 1, 115–148.
Figure 1. An example of a c.d.f. $F_p(t)$ in the interval $[t_0, T]$.
Figure 2. The exponential c.d.f. $F(t) = 1 - e^{-(t - t_0)}$ in the interval $[t_0, t_1]$.
Figure 3. Comparison between deterministic and stochastic settings for state $x^*(t)$.
Figure 4. Comparison between Nash equilibrium and cooperative equilibrium for $x(t)$.
Figure 5. IDP and Imputation for one player.