Article

A Differential Game with Random Time Horizon and Discontinuous Distribution

1
Faculty of Applied Mathematics and Control Processes, St. Petersburg State University, 199034 St Petersburg, Russia
2
Department of Mathematics, St. Petersburg School of Physics, Mathematics, and Computer Science, National Research University Higher School of Economics (HSE), Soyuza Pechatnikov ul. 16, 190008 St. Petersburg, Russia
3
Krasovskii Institute of Mathematics and Mechanics (IMM UB RAS), 620108 Yekaterinburg, Russia
*
Author to whom correspondence should be addressed.
Mathematics 2020, 8(12), 2185; https://doi.org/10.3390/math8122185
Submission received: 13 November 2020 / Revised: 1 December 2020 / Accepted: 5 December 2020 / Published: 8 December 2020
(This article belongs to the Special Issue Statistical and Probabilistic Methods in the Game Theory)

Abstract

One class of differential games with random duration is considered. It is assumed that the duration of the game is a random variable taking values in a given finite interval. The cumulative distribution function (CDF) of this random variable is assumed to be discontinuous, with two jumps on the interval. As a result, the player's payoff takes the form of a sum of integrals over different but adjacent time intervals. Moreover, the first interval corresponds to zero probability of the game ending, so the payoff on this interval is deterministic. A method of constructing the optimal solution for the cooperative scenario of such games is proposed. The results are illustrated by an example of a differential game of investment in the public stock of knowledge.

1. Introduction

Dynamic processes with many participants are well described by the differential game theory framework (for differential games, see [1], also [2] in Russian, and their applications in economics in [3]). Meanwhile, many natural processes (such as ecological dynamics) involve random components [4]. Thus, the assumption of a random time horizon allows the construction of models even closer to reality [5]. Therefore, differential game models with random duration are of particular importance. This type of model was initially introduced in [6], where a differential game of pursuit with terminal payoff and stochastic terminal time was considered; the first problem formulation with an integral payoff function and a random time horizon with a continuous distribution function was considered in [7] and later extended to the case of a differential game with random time horizon and discounting in [8].
In this work, one class of differential games with a random time horizon is considered. It is assumed that the duration of the game is a random variable with a discontinuous cumulative distribution function (CDF). Particular classes of discontinuous models were formulated in [9] for the case with only one jump and in [10] for a step-form CDF. In this paper, the discontinuous structure of the CDF implies that the game cannot be interrupted until a certain moment, after which it can stop either within a continuous interval or at certain discrete time instants. Moreover, the first interval corresponds to zero probability of the game ending, which results in a deterministic payoff on this time interval. This formulation leads to the player's payoff taking the form of a sum of integrals over different but adjacent time intervals, and the problem can be considered in terms of hybrid differential games [11,12]. For more examples of games with regime switching, see, e.g., [13,14,15].
A method of solving such problems using dynamic programming [1], the Pontryagin maximum principle [16], and parametrization was first suggested in [17] for a class of differential games with a random time horizon and a continuous but composite CDF, and was further generalized in [11]. Based on [11], a number of games with payoffs defined on adjacent time intervals were solved in [18,19].
In this paper, a method to construct optimal controls for the introduced class of games with a discontinuous CDF is proposed. The method relies on treating the connecting trajectory points at the edges of the intervals as numeric parameters and applying the Pontryagin maximum principle [16] on every interval. The numeric values of these points are then obtained by maximizing the total payoff via a dynamic programming approach. The method can also be used in the non-cooperative scenario to find Nash equilibria and can further be exploited for a wide class of differential games with integral payoffs defined on different time intervals.
The results obtained in this work are illustrated by a game-theoretical model of investment in the public stock of knowledge [3] (see also [20]), modified according to the described stochastic framework. The considered model refers to unstable market conditions, which bring a random nature to the process.
The paper is organized as follows. Section 2 is dedicated to the game formulation and the main assumptions made to construct a feasible solution. Section 3 contains the specification of the stock investment model; the optimization approach for the considered game formulation is also proposed there. In Section 4.1, the calculations for the adjacent time intervals are given; all details necessary to reproduce this result are written in Appendix A. Section 4.2 provides the dynamic programming approach to determine the numeric values of the state at the end points of the intervals. The limiting case under the assumption of no jumps in the considered CDF is investigated in Section 5.1; another limiting case, with a step CDF, is described in Section 5.2. The last section presents a numeric example for the stock investment model.

2. Problem Formulation

Consider a differential game of $n$ participants (players) $\Gamma_T(t_0,x_0)$. Let $N=\{1,\dots,n\}$ be the set of players. Assume that the game starts at the initial moment $t_0$ from the initial state $x_0$, and that the duration of the game is a random variable with the particular cumulative distribution function (CDF) described below. The following assumptions are made:
  • the game is played over the interval $[t_0,T]\subset\mathbb{R}_+$, where $t_0\ge 0$ and $T$ is a random variable taking values in $[t_0,T_2]$, $T_2<\infty$, with the discontinuous CDF
$$F(\tau)=\begin{cases}0,&\tau<\bar T-\delta,\\ \varphi(\tau),&\bar T-\delta\le\tau<\bar T+\delta,\\ 1-p_1-p_2,&\bar T+\delta\le\tau<T_1,\\ 1-p_2,&T_1\le\tau<T_2,\\ 1,&\tau\ge T_2,\end{cases}\tag{1}$$
    where $\varphi(\tau)$ is assumed to be an absolutely continuous non-decreasing function, $\varphi(\bar T-\delta)=0$, $\varphi(\bar T+\delta)=1-p_1-p_2$, $p_1>0$, $p_2>0$, $p_1+p_2\le 1$.
    Thus, the CDF of the random variable is discontinuous with two jumps occurring on the bounded interval. An example of such a CDF is given in Figure 1;
  • the dynamic constraints of the game are given by
$$\dot x = g(x(t),u_1(t),\dots,u_n(t)),\quad x\in\mathbb{R}^h,\quad x(t_0)=x_0,\tag{2}$$
    where the state equations (2) are ODEs whose solutions satisfy the standard existence and uniqueness requirements; in particular, the function $g(x(t),u_1(t),\dots,u_n(t))$ in (2) is differentiable on $[t_0,T_2]$;
  • the controls u i ( t ) are open-loop strategies;
  • the controls u i ( t ) belong to the sets of admissible controls U i , which consist of all measurable functions on the interval [ t 0 , T 2 ] , taking values in the set of admissible control values U i , which are in turn convex compact subsets of R k ;
  • the instantaneous payoff of the $i$-th player at the moment $\tau\in[t_0,T_2]$ is defined as $h_i(x(\tau),u_1(\tau),\dots,u_n(\tau))$. To shorten the notation, we write
$$h_i(x(\tau),u_1(\tau),\dots,u_n(\tau))=h_i(x(\tau),u(\tau)),$$
    where $u(\tau)=\{u_1(\tau),\dots,u_n(\tau)\}$;
  • in the deterministic case, the integral payoff is
$$J_i(x_0,t_0,T_f,u)=\int_{t_0}^{T_f}h_i(x(t),u(t))\,dt,\quad i=\overline{1,n},\tag{3}$$
    where $T_f\in[t_0,T_2]$ is a known terminal moment of the game;
  • in the case of a random time horizon, the mathematical expectation of the integral payoff is considered. Thus, the $i$-th player's payoff functional is
$$K_i(x_0,t_0,T_2,u)=\mathbb{E}(J_i)=\int_{t_0}^{T_2}\int_{t_0}^{t}h_i(x(\tau),u(\tau))\,d\tau\,dF(t),\quad i=\overline{1,n}.\tag{4}$$
It was shown previously [10] that (4) can be rewritten as
$$K_i(x_0,t_0,T_2,u)=\int_{t_0}^{T_2}h_i(x(t),u(t))\bigl(1-F(t)\bigr)\,dt,\quad i=\overline{1,n}.\tag{5}$$
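The identity behind (5) can be illustrated numerically. The sketch below uses an illustrative CDF with a continuous part and two atoms (not the model's CDF) and a constant payoff rate $h\equiv 1$, so that the expected payoff is simply $\mathbb{E}(T)$; it checks that integrating the survival function $1-F$ gives the same value as integrating $t\,dF$ directly:

```python
import numpy as np

# Survival-function identity: E[∫_0^T h(t) dt] = ∫_0^{T2} h(t)(1 - F(t)) dt.
# Illustrative CDF on [0, 2]: uniform mass 0.5 on [0, 1],
# atom 0.3 at t = 1.5, atom 0.2 at t = 2.0.
def F(t):
    t = np.asarray(t, dtype=float)
    out = np.clip(t, 0.0, 1.0) * 0.5            # continuous part
    out = out + 0.3 * (t >= 1.5) + 0.2 * (t >= 2.0)
    return out

h = lambda t: 1.0 + 0.0 * t                      # h ≡ 1, so the payoff equals T

# Left-hand side: E[T] = ∫ t dF = ∫_0^1 t*0.5 dt + 1.5*0.3 + 2.0*0.2
lhs = 0.5 * 0.5 + 1.5 * 0.3 + 2.0 * 0.2

# Right-hand side: ∫_0^2 h(t)(1 - F(t)) dt by a midpoint Riemann sum
t = np.linspace(0.0, 2.0, 200001)
mid = 0.5 * (t[:-1] + t[1:])
rhs = float(np.sum(h(mid) * (1.0 - F(mid)) * np.diff(t)))

assert abs(lhs - rhs) < 1e-3
```

The atoms enter the left-hand side as point masses, while on the right-hand side they appear as downward jumps of the survival function; both computations give $\mathbb{E}(T)=1.1$ here.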
Proposition 1.
For the considered game formulation, the payoff is a sum of functionals defined over four time intervals:
$$K_i(x_0,t_0,T_1,T_2,u)=\int_{t_0}^{\bar T-\delta}h_i(x(t),u(t))\,dt+\int_{\bar T-\delta}^{\bar T+\delta}h_i(x(t),u(t))\bigl(1-\varphi(t)\bigr)\,dt+(p_1+p_2)\int_{\bar T+\delta}^{T_1}h_i(x(t),u(t))\,dt+p_2\int_{T_1}^{T_2}h_i(x(t),u(t))\,dt.\tag{6}$$
Proof. 
The CDF of the random variable in the described game formulation is a piecewise function on the four time intervals. Observe that, due to the zero probability of the game ending on the first interval $[t_0,\bar T-\delta]$, the payoff of the player on this interval takes a deterministic form. The payoff on the other intervals is nothing but the mathematical expectation of the integral payoff corresponding to the particular form of the CDF on the interval. Thus, according to (5) and the proof in [10], the total payoff of the player can be written as the sum of the four adjacent integrals (6). □
Obviously, the problem can be easily modified for the more general case with jumps at $T_1,T_2,\dots,T_n$.

3. Model Example

Consider a model example within the differential game formulation described in the previous section. Namely, consider the so-called stock investment model described in [3] (see also [21]). There, $n$ individuals invest in the stocks related to a particular industry. The state variable $x(t)$ corresponds to the number of stocks held at the moment $t$, and $u_i(t)$ is the investment strategy of agent $i$ at time $t$. The game dynamics take the form of an accumulation process
$$\dot x(t)=\sum_{i=1}^{n}u_i(t),\quad x\in\mathbb{R},\quad u_i\in U_i\subset\mathbb{R},\quad x(t_0)=x_0.\tag{7}$$
Assume that the instantaneous payoff function is linear-quadratic, as each agent derives linear utility from the consumption of the stock:
$$h_i(x(t),u(t))=q_ix(t)-r_iu_i^2(t),\quad q_i>0,\ r_i>0.\tag{8}$$
Assume also that the continuous part of the CDF of the game corresponds to a uniform distribution:
$$F(\tau)=\begin{cases}0,&\tau<\bar T-\delta,\\ (1-p_1-p_2)\dfrac{\tau-\bar T+\delta}{2\delta},&\bar T-\delta\le\tau<\bar T+\delta,\\ 1-p_1-p_2,&\bar T+\delta\le\tau<T_1,\\ 1-p_2,&T_1\le\tau<T_2,\\ 1,&\tau\ge T_2.\end{cases}\tag{9}$$
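The structure of (9) can be made concrete with a short sketch: the continuous part spreads mass $1-p_1-p_2$ uniformly over $[\bar T-\delta,\bar T+\delta]$, and the two jumps at $T_1$ and $T_2$ carry the atom probabilities $p_1$ and $p_2$. The parameter values below are illustrative (they match the numeric example of Section 6):

```python
import numpy as np

# Model CDF (9): uniform mass 1 - p1 - p2 on [T̄ - δ, T̄ + δ],
# plus atoms p1 at T1 and p2 at T2.
Tbar, delta, T1, T2 = 10.0, 3.0, 70.0, 75.0
p1, p2 = 0.1, 0.2

def F(tau):
    if tau < Tbar - delta:
        return 0.0
    if tau < Tbar + delta:
        return (1 - p1 - p2) * (tau - Tbar + delta) / (2 * delta)
    if tau < T1:
        return 1 - p1 - p2
    if tau < T2:
        return 1 - p2
    return 1.0

# The jump sizes at T1 and T2 recover the atom probabilities
assert abs(F(T1) - F(T1 - 1e-9) - p1) < 1e-6
assert abs(F(T2) - F(T2 - 1e-9) - p2) < 1e-6

# F is non-decreasing on a fine grid (a valid CDF)
grid = np.linspace(0.0, 80.0, 8001)
vals = [F(t) for t in grid]
assert all(a <= b + 1e-12 for a, b in zip(vals, vals[1:]))
```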
To further simplify the investigation, assume that the initial moment of the game is equal to zero: t 0 = 0 .
Consider the cooperative form of the game, i.e., assume that all participants decide to cooperate and unite their efforts to maximize the total payoff:
$$\sum_{i=1}^{n}K_i(x_0,t_0,T_1,T_2,u)\to\max_{u}.$$
The optimization problem can be solved using a parametric method over four separate intervals. On every interval, we use the Pontryagin maximum principle [16].
We start by introducing three as yet undetermined values of the state at the respective switching instants, $x_1,x_2,x_3\in\mathbb{R}$. Let $I_i$ denote the $i$-th interval and $I_i(x_j)$ the total payoff on interval $i$, which depends on the numeric parameter $x_j$. We obtain the following connected optimization problems:
$I_1\colon [0,\bar T-\delta]$, both ends fixed $\{x_0,x_1\}$;
$I_2\colon [\bar T-\delta,\bar T+\delta]$, both ends fixed $\{x_1,x_2\}$;
$I_3\colon [\bar T+\delta,T_1]$, both ends fixed $\{x_2,x_3\}$;
$I_4\colon [T_1,T_2]$, only the left end fixed $\{x_3\}$.
To obtain the numeric values of the parameters $x_1,x_2,x_3$, the following algorithm based on the dynamic programming principle can be used:
$$I_1(x_1)+I_2(x_1,x_2)+I_3(x_2,x_3)+I_4(x_3)\to\max_{x_1,x_2,x_3}$$
  • $I_3(x_2,x_3)+I_4(x_3)\to\max_{x_3}$ – compute $x_3$ parameterized by $x_2$;
  • $I_2(x_1,x_2)+\max_{x_3}\{I_3(x_2,x_3)+I_4(x_3)\}\to\max_{x_2}$ – compute $x_2$ parameterized by $x_1$, using the previously obtained expression for $x_3$ as a function of $x_2$;
  • $I_1(x_1)+\max_{x_2}\{I_2(x_1,x_2)+\max_{x_3}\{I_3(x_2,x_3)+I_4(x_3)\}\}\to\max_{x_1}$ – compute $x_1$.
Thus, all three numeric values for x 1 , x 2 , x 3 can be unambiguously obtained.
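The backward elimination above can be sketched on a toy objective with the same coupling pattern. The concave quadratics below are illustrative stand-ins for the interval payoffs $I_1,\dots,I_4$ (they are not the actual payoffs of the game); the sketch checks that stagewise elimination of $x_3$, then $x_2$, then $x_1$ reproduces the joint maximizer:

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

# Hypothetical stand-in payoffs with the coupling pattern
# I1(x1) + I2(x1, x2) + I3(x2, x3) + I4(x3).
I1 = lambda x1: -(x1 - 1.0) ** 2
I2 = lambda x1, x2: -(x2 - x1 - 2.0) ** 2
I3 = lambda x2, x3: -(x3 - x2 - 3.0) ** 2
I4 = lambda x3: -0.5 * (x3 - 7.0) ** 2

# Backward elimination, as in the algorithm:
def V3(x2):  # max over x3 of I3 + I4 for fixed x2
    res = minimize_scalar(lambda x3: -(I3(x2, x3) + I4(x3)))
    return -res.fun, res.x

def V2(x1):  # max over x2 of I2 + V3 for fixed x1
    res = minimize_scalar(lambda x2: -(I2(x1, x2) + V3(x2)[0]))
    return -res.fun, res.x

stage1 = minimize_scalar(lambda x1: -(I1(x1) + V2(x1)[0]))
x1s = stage1.x
x2s = V2(x1s)[1]
x3s = V3(x2s)[1]

# Cross-check against direct joint maximization over (x1, x2, x3)
joint = minimize(
    lambda v: -(I1(v[0]) + I2(v[0], v[1]) + I3(v[1], v[2]) + I4(v[2])),
    x0=np.zeros(3))
assert np.allclose([x1s, x2s, x3s], joint.x, atol=1e-3)
```

Since each stage objective is concave in the eliminated variable, the stagewise maxima are well defined and the two approaches agree.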

4. Computations

4.1. Intervals Calculations

In this section, the expressions for the optimal control and the optimal trajectory are obtained under the assumption that the three numeric parameters $x_1,x_2,x_3$ are given. Detailed calculations for all four intervals are presented in Appendix A. For every interval, the Pontryagin maximum principle (with one end or both ends of the interval fixed) is used.
For the interval $I_1$, we obtain the following expressions for the optimal trajectory and controls (here and below, $\hat q=\sum_{i=1}^n q_i$):
$$x^*(t)\big|_{I_1}=\frac{(x_1-x_0)\,t}{\bar T-\delta}+\frac{1}{4}\hat q\,t(\bar T-\delta-t)\sum_{i=1}^n\frac{1}{r_i}+x_0,\qquad u_i^*(t)\big|_{I_1}=\frac{x_1-x_0}{r_i\sum_{j=1}^n\frac{1}{r_j}(\bar T-\delta)}+\frac{\hat q(\bar T-\delta)}{4r_i}-\frac{\hat q\,t}{2r_i}.\tag{10}$$
For the interval $I_2$, we have the following expressions for the optimal trajectory and controls:
$$x^*(t)\big|_{I_2}=\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{2(1-p_1-p_2)^2}\left(\delta^2-\frac{\chi(t)^2}{4}\right)+\frac{2(x_2-x_1)(1-p_1-p_2)-\hat q\sum_{i=1}^n\frac{1}{r_i}\,\delta^2(1+p_1+p_2)}{2\ln(p_1+p_2)\,(1-p_1-p_2)}\,\ln\frac{\chi(t)}{2\delta}+x_1,$$
$$u_i^*(t)\big|_{I_2}=\frac{\hat q\,\chi(t)}{4r_i(1-p_1-p_2)}-\frac{2(x_2-x_1)(1-p_1-p_2)-\hat q\,\delta^2\sum_{j=1}^n\frac{1}{r_j}(1+p_1+p_2)}{2r_i\sum_{j=1}^n\frac{1}{r_j}\,\chi(t)\ln(p_1+p_2)},\tag{11}$$
where $\chi(t)=2\delta-(1-p_1-p_2)(t-\bar T+\delta)$.
For the interval $I_3$, we obtain:
$$x^*(t)\big|_{I_3}=\sum_{i=1}^n\frac{1}{r_i}\left(\frac{(x_3-x_2)(t-\bar T-\delta)}{\sum_{j=1}^n\frac{1}{r_j}(T_1-\bar T-\delta)}+\frac{\hat q(T_1-\bar T-\delta)(t-\bar T-\delta)}{4}-\frac{\hat q(t-\bar T-\delta)^2}{4}\right)+x_2,\qquad u_i^*(t)\big|_{I_3}=\frac{x_3-x_2}{r_i\sum_{j=1}^n\frac{1}{r_j}(T_1-\bar T-\delta)}+\frac{\hat q(T_1+\bar T+\delta-2t)}{4r_i}.\tag{12}$$
For the interval $I_4$, we obtain:
$$x^*(t)\big|_{I_4}=-\hat q\sum_{i=1}^n\frac{1}{4r_i}(t-T_1)(t+T_1-2T_2)+x_3,\qquad u_i^*(t)\big|_{I_4}=\frac{\hat q(T_2-t)}{2r_i}.\tag{13}$$
The expressions for the optimal controls on the intervals, (10)–(13), are written under the assumption that these controls belong to the sets of admissible controls $U_i$. Otherwise, the solution lies on the boundary of the compact set.
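The closed-form expressions can be checked for internal consistency. The sketch below verifies, for illustrative parameter values, that the interval $I_4$ expressions in (13) satisfy the boundary condition $x(T_1)=x_3$ and the state equation $\dot x=\sum_i u_i$ (the sign convention in the trajectory is the one that makes both hold):

```python
import numpy as np

# Illustrative parameters (matching the numeric example of Section 6);
# x3 is an arbitrary hypothetical connection point.
q = np.array([1.0, 3.0, 6.0]); r = np.array([20.0, 1.0, 4.0])
qhat = q.sum(); S = (1.0 / r).sum()         # q̂ and sum of 1/r_i
T1, T2, x3 = 70.0, 75.0, 100.0

def x_I4(t):   # optimal trajectory on I4, with x(T1) = x3
    return -qhat * S / 4.0 * (t - T1) * (t + T1 - 2 * T2) + x3

def u_I4(t):   # optimal controls on I4
    return qhat * (T2 - t) / (2 * r)

# Boundary condition at the left end of the interval
assert abs(x_I4(T1) - x3) < 1e-12

# State equation x' = sum_i u_i, checked by central finite differences
for t in np.linspace(T1, T2, 11)[1:-1]:
    eps = 1e-5
    deriv = (x_I4(t + eps) - x_I4(t - eps)) / (2 * eps)
    assert abs(deriv - u_I4(t).sum()) < 1e-5
```

Since the trajectory is quadratic in $t$, the central difference is exact up to rounding, so the check is tight.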

4.2. Computation of the Parameters $x_1$, $x_2$, $x_3$

As described above, the connecting trajectory points $x_1,x_2,x_3$ at the boundaries of the intervals are obtained by the following algorithm:
$$I_1(x_1)+I_2(x_1,x_2)+I_3(x_2,x_3)+I_4(x_3)\to\max_{x_1,x_2,x_3}$$
Using the dynamic programming optimality principle:
  • $I_3(x_2,x_3)+I_4(x_3)\to\max_{x_3}$ – compute $x_3$ parameterized by $x_2$;
  • $I_2(x_1,x_2)+\max_{x_3}\{I_3(x_2,x_3)+I_4(x_3)\}\to\max_{x_2}$ – compute $x_2$ parameterized by $x_1$, using the previously obtained expression for $x_3$ as a function of $x_2$;
  • $I_1(x_1)+\max_{x_2}\{I_2(x_1,x_2)+\max_{x_3}\{I_3(x_2,x_3)+I_4(x_3)\}\}\to\max_{x_1}$ – compute $x_1$.
Each expression is linear-quadratic in $x_1$, $x_2$, or $x_3$, respectively, so the optimal values are easily obtained by setting the first derivative to zero. As a result, we obtain three linear equations; solving this system yields
$$x_1=\frac{\hat q(\bar T-\delta)\left(p_1(T_1-\bar T)+p_2(T_2-\bar T)+\frac{\bar T+\delta}{2}\right)\sum_{i=1}^n\frac{1}{r_i}}{2}+x_0,$$
$$x_2=\frac{\hat q\,\delta^2\sum_{i=1}^n\frac{1}{r_i}}{2(1-p_1-p_2)^2}\Bigl((p_1+p_2)^2\bigl(2\ln(p_1+p_2)-1\bigr)+1\Bigr)-\frac{\hat q\,\delta\sum_{i=1}^n\frac{1}{r_i}\,\ln(p_1+p_2)}{1-p_1-p_2}\bigl(p_1(T_1-\bar T-\delta)+p_2(T_2-\bar T-\delta)\bigr)+x_1,$$
$$x_3=\sum_{i=1}^n\frac{1}{r_i}\,\frac{\hat q(T_1-\bar T-\delta)}{2(p_1+p_2)}\left(\frac{(p_1+p_2)(T_1-\bar T-\delta)}{2}+p_2(T_2-T_1)\right)+x_2.\tag{14}$$
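As a quick sanity check, the closed forms (14) can be evaluated numerically. The sketch below transcribes them for the parameter values of the numeric example in Section 6 (the transcription is our reading of (14) and should be checked against the paper) and verifies that the connection points are increasing, as expected for an accumulating stock:

```python
import math
import numpy as np

# Parameters of the numeric example (Section 6)
q = np.array([1.0, 3.0, 6.0]); r = np.array([20.0, 1.0, 4.0])
qhat = q.sum(); S = (1.0 / r).sum()          # q̂ = 10, S = sum 1/r_i = 1.3
x0, Tbar, delta, T1, T2 = 20.0, 10.0, 3.0, 70.0, 75.0
p1, p2 = 0.1, 0.2
s = p1 + p2

# Connection points from (14), as transcribed here
x1 = qhat * (Tbar - delta) * (p1 * (T1 - Tbar) + p2 * (T2 - Tbar)
                              + (Tbar + delta) / 2) * S / 2 + x0
x2 = (qhat * delta**2 * S / (2 * (1 - s)**2) * (s**2 * (2 * math.log(s) - 1) + 1)
      - qhat * delta * S * math.log(s) / (1 - s)
        * (p1 * (T1 - Tbar - delta) + p2 * (T2 - Tbar - delta))
      + x1)
x3 = S * qhat * (T1 - Tbar - delta) / (2 * s) \
     * (s * (T1 - Tbar - delta) / 2 + p2 * (T2 - T1)) + x2

# The stock of knowledge grows: connection points must be increasing
assert x0 < x1 < x2 < x3
```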

5. Analysis of the Limiting Cases

5.1. Assumption of No Jumps in the CDF

Consider a limiting case of the game formulated in the paper. Namely, suppose that the probabilities $p_1,p_2$, which are responsible for the discontinuous structure of the CDF, tend to zero:
$$p_1\to 0,\quad p_2\to 0.$$
Under this assumption, the CDF becomes a continuous function and the game cannot end after $\bar T+\delta$. Thus, we disregard the intervals $I_3$, $I_4$.
The dynamics and the instantaneous payoff function of the game remain the same:
$$\dot x(t)=\sum_{i=1}^{n}u_i(t),\quad x\in\mathbb{R},\quad u_i\in U_i\subset\mathbb{R},\quad x(0)=x_0,\qquad h_i(x(t),u(t))=q_ix(t)-r_iu_i^2(t).$$
The CDF of the game takes the form
$$F(\tau)=\begin{cases}0,&\tau<\bar T-\delta,\\ \dfrac{\tau-\bar T+\delta}{2\delta},&\bar T-\delta\le\tau<\bar T+\delta,\\ 1,&\tau\ge\bar T+\delta.\end{cases}$$
This formulation of the stock investment game was investigated in [18]. Consider the cooperative form of the game:
$$\sum_{i=1}^n K_i(x_0,t_0,\bar T+\delta,u)=\sum_{i=1}^n\int_{t_0}^{\bar T-\delta}h_i(x(t),u(t))\,dt+\sum_{i=1}^n\int_{\bar T-\delta}^{\bar T+\delta}h_i(x(t),u(t))\left(1-\frac{t-\bar T+\delta}{2\delta}\right)dt\to\max.$$
The optimization problem is solved using the same parametric method, but with only one parameter $x_1\in\mathbb{R}$:
$I_1\colon [0,\bar T-\delta]$, both ends fixed $\{x_0,x_1\}$;
$I_2\colon [\bar T-\delta,\bar T+\delta]$, only the left end fixed $\{x_1\}$.
To obtain the numeric value of the parameter $x_1$, the following maximization problem must be solved:
$$I_1(x_1)+I_2(x_1)\to\max_{x_1}$$
The solution of this game is given in the form of the optimal control, the optimal trajectory, and the numeric value of $x_1$. Using the Pontryagin maximum principle, it is not difficult to obtain the following expressions (denoted by the superscript A):
$$x_1^A=\frac{\hat q(\bar T^2-\delta^2)\sum_{i=1}^n\frac{1}{r_i}}{4}+x_0,\qquad u_i(t)\big|_{I_1}^A=\frac{\hat q(\bar T-t)}{2r_i},\qquad u_i(t)\big|_{I_2}^A=\frac{\hat q(\bar T+\delta-t)}{4r_i},\tag{15}$$
$$x(t)\big|_{I_1}^A=\left(\frac{x_1^A-x_0}{\sum_{i=1}^n\frac{1}{r_i}\,(\bar T-\delta)}+\frac{\hat q(\bar T-\delta)}{4}-\frac{\hat q\,t}{4}\right)t\sum_{i=1}^n\frac{1}{r_i}+x_0,\qquad x(t)\big|_{I_2}^A=\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{4}\left((\bar T+\delta)t+\frac{(\bar T-\delta)^2}{2}-\frac{t^2}{2}\right)+x_0.\tag{16}$$
On the other hand, we can use the solution (10), (11), (14) of the initial game described in Section 4 and set $p_1=0$, $p_2=0$.
The connection point $x_1$ in (14), as $p_1$ and $p_2$ tend to 0, coincides with the result obtained in (15):
$$\lim_{p_1\to0,\;p_2\to0}\;\frac{\hat q(\bar T-\delta)\left(p_1(T_1-\bar T)+p_2(T_2-\bar T)+\frac{\bar T+\delta}{2}\right)\sum_{i=1}^n\frac{1}{r_i}}{2}+x_0=\frac{\hat q(\bar T^2-\delta^2)\sum_{i=1}^n\frac{1}{r_i}}{4}+x_0=x_1^A.$$
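This limit can also be checked numerically: evaluating $x_1$ from (14) at very small $p_1,p_2$ should reproduce $x_1^A$. The sketch below uses the parameter values of the numeric example in Section 6 ($\hat q=10$, $\sum_i 1/r_i=1.3$); the transcription of the two formulas is our reading of (14) and (15):

```python
# Illustrative parameters: q̂ = sum q_i, S = sum 1/r_i
qhat, S = 10.0, 1.3
Tbar, delta, T1, T2, x0 = 10.0, 3.0, 70.0, 75.0, 20.0

def x1(p1, p2):       # connection point x1 from (14), as transcribed here
    return qhat * (Tbar - delta) * (p1 * (T1 - Tbar) + p2 * (T2 - Tbar)
                                    + (Tbar + delta) / 2) * S / 2 + x0

# x1^A from (15) for the no-jump game
x1A = qhat * (Tbar**2 - delta**2) * S / 4 + x0

# The limit p1, p2 -> 0 recovers the reduced-game value
assert abs(x1(1e-9, 1e-9) - x1A) < 1e-4
```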
Similarly, the optimal control on the first interval (10), as $p_1$ and $p_2$ tend to 0, coincides with the control obtained for the reduced model (15):
$$u_i^*(t)\big|_{I_1}=\frac{x_1-x_0}{r_i\sum_{j=1}^n\frac{1}{r_j}(\bar T-\delta)}+\frac{\hat q(\bar T-\delta)}{4r_i}-\frac{\hat q\,t}{2r_i}=\frac{\hat q(\bar T^2-\delta^2)\sum_{j=1}^n\frac{1}{r_j}}{4r_i\sum_{j=1}^n\frac{1}{r_j}(\bar T-\delta)}+\frac{\hat q(\bar T-\delta)}{4r_i}-\frac{\hat q\,t}{2r_i}=\frac{\hat q(\bar T-t)}{2r_i}=u_i^*(t)\big|_{I_1}^A.$$
The optimal control on the second interval (11), as $p_1$ and $p_2$ tend to 0, can be rewritten using the expression for $x_2$ obtained from (14):
$$x_2-x_1=\frac{\hat q\,\delta^2\sum_{i=1}^n\frac{1}{r_i}}{2(1-p_1-p_2)^2}\Bigl((p_1+p_2)^2\bigl(2\ln(p_1+p_2)-1\bigr)+1\Bigr)-\frac{\hat q\,\delta\sum_{i=1}^n\frac{1}{r_i}\,\ln(p_1+p_2)}{1-p_1-p_2}\bigl(p_1(T_1-\bar T-\delta)+p_2(T_2-\bar T-\delta)\bigr),$$
$$\lim_{p_1\to0,\;p_2\to0}(x_2-x_1)=\frac{\hat q\,\delta^2\sum_{i=1}^n\frac{1}{r_i}}{2}.$$
Then we obtain the coincidence of the optimal controls of the limiting case (15) and the initial game (11):
$$\lim_{p_1\to0,\;p_2\to0}\left[\frac{\hat q\,\chi(t)}{4r_i(1-p_1-p_2)}-\frac{2(x_2-x_1)(1-p_1-p_2)-\hat q\,\delta^2\sum_{j=1}^n\frac{1}{r_j}(1+p_1+p_2)}{2r_i\sum_{j=1}^n\frac{1}{r_j}\,\chi(t)\ln(p_1+p_2)}\right]=\frac{\hat q(\bar T+\delta-t)}{4r_i}=u_i(t)\big|_{I_2}^A.$$
The optimal trajectory on the first interval (10), as $p_1$ and $p_2$ tend to 0, also gives the result in (16):
$$x^*(t)\big|_{I_1}=\frac{(x_1-x_0)\,t}{\bar T-\delta}+\frac{1}{4}\hat q\,t(\bar T-\delta-t)\sum_{i=1}^n\frac{1}{r_i}+x_0=x(t)\big|_{I_1}^A.$$
The optimal trajectory on the second interval, as $p_1$ and $p_2$ tend to 0, coincides with the corresponding result in (16):
$$x^*(t)\big|_{I_2}=\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{2(1-p_1-p_2)^2}\left(\delta^2-\frac{\chi(t)^2}{4}\right)+\frac{2(x_2-x_1)(1-p_1-p_2)-\hat q\sum_{i=1}^n\frac{1}{r_i}\,\delta^2(1+p_1+p_2)}{2\ln(p_1+p_2)\,(1-p_1-p_2)}\,\ln\frac{\chi(t)}{2\delta}+x_1,$$
where $\chi(t)=2\delta-(1-p_1-p_2)(t-\bar T+\delta)$. We have
$$\lim_{p_1\to0,\;p_2\to0}x^*(t)\big|_{I_2}=\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{4}\left((\bar T+\delta)t+\frac{(\bar T-\delta)^2}{2}-\frac{t^2}{2}\right)+x_0=x(t)\big|_{I_2}^A.$$
As we have shown, the solutions of the initial game and the reduced game coincide. Thus, the solution of the main game formulation described in Section 4 can be considered as the generalization of the game described in this Section.

5.2. Assumption of a Piece-Wise Constant CDF

Consider another limiting case of the game formulated in the paper. Now suppose that the continuous part $\varphi(\tau)$ of (1) is identically zero, so that $F(\tau)=0$ on the whole interval $[t_0,T_1)$. This is equivalent to the condition
$$p_1+p_2=1.$$
In this case, the CDF corresponds to a discrete random variable $T$ and becomes a step function: the game ends at time $T_1$ with probability $p_1$ and at time $T_2$ with probability $p_2$. Thus, the payoff on the interval $[t_0,T_1]$ is deterministic, while on $[T_1,T_2]$ its expected value is taken.
The dynamics and the instantaneous payoff function of the game remain the same:
$$\dot x(t)=\sum_{i=1}^{n}u_i(t),\quad x\in\mathbb{R},\quad u_i\in U_i\subset\mathbb{R},\quad x(0)=x_0,\qquad h_i(x(t),u(t))=q_ix(t)-r_iu_i^2(t).$$
The CDF of the game takes the step form
$$F(\tau)=\begin{cases}0,&\tau<T_1,\\ 1-p_2,&T_1\le\tau<T_2,\\ 1,&\tau\ge T_2.\end{cases}$$
This formulation of a differential game with a discrete random time horizon was investigated in [22]. Consider the cooperative form of the game:
$$\sum_{i=1}^n K_i(x_0,t_0,T_1,T_2,u)=\sum_{i=1}^n\int_{t_0}^{T_1}h_i(x(t),u(t))\,dt+p_2\sum_{i=1}^n\int_{T_1}^{T_2}h_i(x(t),u(t))\,dt\to\max.$$
The optimization problem is solved using the parametric method, but with only one parameter $x_3\in\mathbb{R}$:
$I_{1,2,3}\colon [0,T_1]$, both ends fixed $\{x_0,x_3\}$;
$I_4\colon [T_1,T_2]$, only the left end fixed $\{x_3\}$.
To obtain the numeric value of the parameter $x_3$, the following maximization problem must be solved:
$$I_{1,2,3}(x_3)+I_4(x_3)\to\max_{x_3}$$
The solution of this game is given in the form of the optimal control, the optimal trajectory, and the numeric value of $x_3$. Using the Pontryagin maximum principle, we obtain the following expressions (denoted by the superscript B):
$$x_3^B=\hat q\sum_{i=1}^n\frac{1}{r_i}\,\frac{T_1}{2}\left((T_2-T_1)p_2+\frac{T_1}{2}\right)+x_0.$$
Note that the optimal control and the optimal trajectory are the same over all three intervals $I_1,I_2,I_3$:
$$u_i(t)\big|_{I_1,I_2,I_3}^B=\frac{\hat q}{2r_i}\bigl(T_1p_1+T_2p_2-t\bigr),$$
$$x(t)\big|_{I_1,I_2,I_3}^B=\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{2}\left(T_1p_1t+T_2p_2t-\frac{t^2}{2}\right)+x_0.\tag{17}$$
For the interval $I_4$, we obtain:
$$u_i(t)\big|_{I_4}^B=\frac{\hat q(T_2-t)}{2r_i},\qquad x(t)\big|_{I_4}^B=-\hat q\sum_{i=1}^n\frac{1}{4r_i}(t-T_1)(t+T_1-2T_2)+x_3^B.$$
From the expression (17) for the optimal trajectory, we can obtain the following points:
$$x_1^B=x(\bar T-\delta)\big|_{I_1,I_2,I_3}^B=\hat q\sum_{i=1}^n\frac{1}{r_i}\,\frac{\bar T-\delta}{2}\left(p_1T_1+p_2T_2-\frac{\bar T-\delta}{2}\right)+x_0,$$
$$x_2^B=x(\bar T+\delta)\big|_{I_1,I_2,I_3}^B=\hat q\sum_{i=1}^n\frac{1}{r_i}\,\frac{\bar T+\delta}{2}\left(p_1T_1+p_2T_2-\frac{\bar T+\delta}{2}\right)+x_0.$$
On the other hand, we can use the solution (10)–(14) of the initial game described in Section 4 and calculate the limit as $(p_1+p_2)\to1$. Below, we use L'Hôpital's rule to compute the limits.
For the connection point $x_1$, as $(p_1+p_2)\to1$, we obtain:
$$\lim_{(p_1+p_2)\to1}\;\frac{\hat q(\bar T-\delta)\left(p_1(T_1-\bar T)+p_2(T_2-\bar T)+\frac{\bar T+\delta}{2}\right)\sum_{i=1}^n\frac{1}{r_i}}{2}+x_0=\hat q\sum_{i=1}^n\frac{1}{r_i}\,\frac{\bar T-\delta}{2}\left(p_1T_1+p_2T_2-\frac{\bar T-\delta}{2}\right)+x_0=x_1^B.$$
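This limit can again be verified numerically: evaluating $x_1$ from (14) with $p_1+p_2$ slightly below 1 should reproduce $x_1^B$. The sketch below uses the parameter values of the numeric example in Section 6 together with a hypothetical split $p_1=0.4$, $p_2=0.6$; the transcriptions of the two formulas are our reading of (14) and of the step-CDF solution:

```python
# Illustrative parameters: q̂ = sum q_i, S = sum 1/r_i
qhat, S = 10.0, 1.3
Tbar, delta, T1, T2, x0 = 10.0, 3.0, 70.0, 75.0, 20.0

def x1(p1, p2):       # connection point x1 from (14), as transcribed here
    return qhat * (Tbar - delta) * (p1 * (T1 - Tbar) + p2 * (T2 - Tbar)
                                    + (Tbar + delta) / 2) * S / 2 + x0

def x1B(p1, p2):      # x1^B of the step-CDF game (valid for p1 + p2 = 1)
    return qhat * S * (Tbar - delta) / 2 \
           * (p1 * T1 + p2 * T2 - (Tbar - delta) / 2) + x0

# Approach p1 + p2 = 1 along the fixed split (0.4, 0.6)
eps = 1e-9
assert abs(x1(0.4 * (1 - eps), 0.6 * (1 - eps)) - x1B(0.4, 0.6)) < 1e-4
```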
Similarly, for $x_2$ and $x_3$ we have:
$$\lim_{(p_1+p_2)\to1}\left[\frac{\hat q\,\delta^2\sum_{i=1}^n\frac{1}{r_i}}{2(1-p_1-p_2)^2}\Bigl((p_1+p_2)^2\bigl(2\ln(p_1+p_2)-1\bigr)+1\Bigr)-\frac{\hat q\,\delta\sum_{i=1}^n\frac{1}{r_i}\,\ln(p_1+p_2)}{1-p_1-p_2}\bigl(p_1(T_1-\bar T-\delta)+p_2(T_2-\bar T-\delta)\bigr)+x_1\right]=\hat q\sum_{i=1}^n\frac{1}{r_i}\,\frac{\bar T+\delta}{2}\left(p_1T_1+p_2T_2-\frac{\bar T+\delta}{2}\right)+x_0=x_2^B,$$
$$\lim_{(p_1+p_2)\to1}\left[\sum_{i=1}^n\frac{1}{r_i}\,\frac{\hat q(T_1-\bar T-\delta)}{2(p_1+p_2)}\left(\frac{(p_1+p_2)(T_1-\bar T-\delta)}{2}+p_2(T_2-T_1)\right)+x_2\right]=\hat q\sum_{i=1}^n\frac{1}{r_i}\,\frac{T_1}{2}\left((T_2-T_1)p_2+\frac{T_1}{2}\right)+x_0=x_3^B.$$
The optimal controls on the first interval, as $(p_1+p_2)\to1$ and with $x_1=x_1^B$, coincide for the limiting and the initial cases:
$$u_i^*(t)\big|_{I_1}=\frac{x_1-x_0}{r_i\sum_{j=1}^n\frac{1}{r_j}(\bar T-\delta)}+\frac{\hat q(\bar T-\delta)}{4r_i}-\frac{\hat q\,t}{2r_i}=\frac{\hat q}{2r_i}\bigl(T_1p_1+T_2p_2-t\bigr)=u_i(t)\big|_{I_1,I_2,I_3}^B.$$
The optimal control on the second interval, as $(p_1+p_2)\to1$, is rewritten using the expression for $x_2=x_2^B$, so again we have:
$$\lim_{(p_1+p_2)\to1}\left[\frac{\hat q\,\chi(t)}{4r_i(1-p_1-p_2)}-\frac{2(x_2-x_1)(1-p_1-p_2)-\hat q\,\delta^2\sum_{j=1}^n\frac{1}{r_j}(1+p_1+p_2)}{2r_i\sum_{j=1}^n\frac{1}{r_j}\,\chi(t)\ln(p_1+p_2)}\right]=\frac{\hat q}{2r_i}\bigl(T_1p_1+T_2p_2-t\bigr)=u_i(t)\big|_{I_1,I_2,I_3}^B.$$
For the third interval, keeping in mind that $x_3$ depends on $p_1$ and $p_2$, we obtain:
$$\lim_{(p_1+p_2)\to1}\left[\frac{x_3-x_2}{r_i\sum_{j=1}^n\frac{1}{r_j}(T_1-\bar T-\delta)}+\frac{\hat q(T_1+\bar T+\delta-2t)}{4r_i}\right]=\frac{\hat q}{2r_i}\bigl(T_1p_1+T_2p_2-t\bigr)=u_i(t)\big|_{I_1,I_2,I_3}^B.$$
For the last interval, the expressions are identical:
$$u_i^*(t)\big|_{I_4}=\frac{\hat q(T_2-t)}{2r_i}=u_i(t)\big|_{I_4}^B.$$
The optimal trajectory on the first interval, as $(p_1+p_2)\to1$ and with $x_1=x_1^B$, takes the form:
$$x^*(t)\big|_{I_1}=\frac{(x_1-x_0)\,t}{\bar T-\delta}+\frac{1}{4}\hat q\,t(\bar T-\delta-t)\sum_{i=1}^n\frac{1}{r_i}+x_0=\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{2}\left(T_1p_1t+T_2p_2t-\frac{t^2}{2}\right)+x_0=x(t)\big|_{I_1,I_2,I_3}^B.$$
The optimal trajectory on the second interval, as $(p_1+p_2)\to1$ and with $x_2=x_2^B$, where $\chi(t)=2\delta-(1-p_1-p_2)(t-\bar T+\delta)$, satisfies:
$$\lim_{(p_1+p_2)\to1}\left[\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{2(1-p_1-p_2)^2}\left(\delta^2-\frac{\chi(t)^2}{4}\right)+\frac{2(x_2-x_1)(1-p_1-p_2)-\hat q\sum_{i=1}^n\frac{1}{r_i}\,\delta^2(1+p_1+p_2)}{2\ln(p_1+p_2)\,(1-p_1-p_2)}\,\ln\frac{\chi(t)}{2\delta}+x_1\right]=\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{2}\left(T_1p_1t+T_2p_2t-\frac{t^2}{2}\right)+x_0=x(t)\big|_{I_1,I_2,I_3}^B.$$
We also obtain the optimal trajectory on the third interval, as $(p_1+p_2)\to1$ and with $x_3=x_3^B$:
$$\lim_{(p_1+p_2)\to1}\left[\sum_{i=1}^n\frac{1}{r_i}\left(\frac{(x_3-x_2)(t-\bar T-\delta)}{\sum_{j=1}^n\frac{1}{r_j}(T_1-\bar T-\delta)}+\frac{\hat q(T_1-\bar T-\delta)(t-\bar T-\delta)}{4}-\frac{\hat q(t-\bar T-\delta)^2}{4}\right)+x_2\right]=\frac{\hat q\sum_{i=1}^n\frac{1}{r_i}}{2}\left(T_1p_1t+T_2p_2t-\frac{t^2}{2}\right)+x_0=x(t)\big|_{I_1,I_2,I_3}^B.$$
On the last interval, we have
$$x^*(t)\big|_{I_4}=-\hat q\sum_{i=1}^n\frac{1}{4r_i}(t-T_1)(t+T_1-2T_2)+x_3=x(t)\big|_{I_4}^B.$$
As can be seen, the results coincide, just as in the limiting case with no jumps. Thus, the solution of the main game formulation described in Section 4 can be considered a generalization of the game described in this section as well.

6. Numeric Example

This section is devoted to a particular numeric example of the stock investment game considered in the paper. Assume the following parameter values:
  • $n=3$ – number of players;
  • $x_0=20$ – initial number of stocks;
  • $t_0=0$ – initial time;
  • $\bar T=10$, $\delta=3$, $T_1=70$, $T_2=75$ – time structure parameters;
  • $q_1=1$, $q_2=3$, $q_3=6$, $r_1=20$, $r_2=1$, $r_3=4$ – coefficients of the payoff functions;
  • $p_1=0.1$, $p_2=0.2$ – probabilities of the game stopping at $T_1$ and $T_2$, respectively.
The cumulative distribution function (CDF) is defined according to (9).
Assume also that the set of admissible control values corresponds to the interval
$$U_i=[0,U],\quad U<\infty.$$
The optimal control of the first player is presented in Figure 2. This function is a piecewise function defined on four separate intervals. The optimal control’s values belong to the predefined set of admissible control values.
The graphs for the other two players differ in minor details and show a similar picture with the controls belonging to the admissible set.
The optimal trajectory is presented in Figure 3. The trajectory is a continuous function defined on all four time subintervals. The bold black dots represent the connectivity points x 1 , x 2 , x 3 calculated in the previous section.
The evolution of the trajectory in Figure 3 is consistent with the intuition that it should be a non-decreasing continuous function. In contrast, the optimal investments shown in Figure 2 need not be monotone or continuous, but must belong to the compact set of admissible values, which the figure confirms.

7. Conclusions

In this work, a special class of differential games with random duration and a discontinuous CDF was studied. A method to construct the optimal solution based on the consideration of separate adjacent time intervals was proposed. Analytical formulas for the optimal control of every player and for the optimal trajectory of the game were obtained, and a numeric example was given and illustrated in the form of graphs.
In addition to the general formulation of the optimal solution, we considered a number of special cases and showed that all of them agree with the general formulation and can be derived from it. This supports the validity of our findings and extends the class of problems that can be addressed within this framework.
Future research will focus, in particular, on the non-cooperative form of the game and on the time-consistency problem for the cooperative form. In addition, we are going to investigate the Stackelberg equilibrium [23] for the considered game formulation.

Author Contributions

Conceptualization, E.G.; Formal analysis, A.Z.; Investigation, A.Z. and A.T.; Methodology, E.G. and A.T.; Project administration, A.Z.; Supervision, E.G. and A.T.; Validation, A.Z. and A.T.; Writing—original draft, A.Z. and E.G.; Writing—review & editing, E.G. All authors have read and agreed to the published version of the manuscript.

Funding

The work by E. Gromova on the formulation and general solution of the problem was supported by a grant from the Russian Science Foundation (project 17-11-01093), while the development of the analytical methods used for obtaining the solution, performed by A. Tur, was supported by RFBR under research project 18-00-00727 (18-00-00725).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Interval I1

Consider the cooperative form of the game. We need to solve the following maximization problem:
$$\int_0^{\bar T-\delta}\sum_{i=1}^n\bigl(q_ix(t)-r_iu_i^2(t)\bigr)\,dt\to\max_{u_i}.\tag{A1}$$
Using the Pontryagin maximum principle under the assumption that both ends of the trajectory on the interval $I_1$ are fixed, $x(0)=x_0$, $x(\bar T-\delta)=x_1$, we construct the Hamiltonian for (A1):
$$H(x(t),u(t),\psi(t))=\psi(t)\sum_{i=1}^n u_i(t)+\sum_{i=1}^n\bigl(q_ix(t)-r_iu_i^2(t)\bigr).\tag{A2}$$
From the equation based on the first derivative of (A2), we obtain the expression for the optimal control; the second derivative confirms that this control corresponds to the maximum:
$$u_i^*(t) = \frac{\psi(t)}{2r_i}, \qquad \frac{\partial^2 H}{\partial u_i^2} = -2r_i < 0.$$
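Since the Hamiltonian is quadratic and concave in each $u_i$, this first-order condition is easy to check numerically. The following Python sketch uses purely illustrative values of $\psi$, $x$, $q_i$, $r_i$ (not taken from the paper's example) and confirms by grid search that $H$ peaks at $u_i = \psi/(2r_i)$:

```python
# Illustrative values only (not from the paper's example).
psi, x = 1.5, 2.0
q = [1.0, 2.0]   # q_i
r = [0.5, 1.0]   # r_i

def H(u):
    # Hamiltonian for fixed psi and x
    return psi * sum(u) + sum(qi * x - ri * ui**2 for qi, ri, ui in zip(q, r, u))

analytic = [psi / (2 * ri) for ri in r]   # u_i* = psi / (2 r_i)

# maximize H coordinate-wise over a fine grid and compare with u_i*
grid = [k / 1000.0 for k in range(-3000, 3001)]
for i in range(len(r)):
    best = max(grid, key=lambda v: H([v if j == i else analytic[j]
                                      for j in range(len(r))]))
    assert abs(best - analytic[i]) < 1e-3
print("first-order condition confirmed")
```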
The differential equation for the adjoint variable $\psi(t)$ takes the form
$$\frac{\partial\psi}{\partial t} = -\frac{\partial H}{\partial x} = -\sum_{i=1}^n q_i = -\hat{q}, \qquad \hat{q} = \sum_{i=1}^n q_i.$$
Thus, we derive
$$\psi(t) = \psi_0 - \hat{q}t, \qquad \psi(0) = \psi_0.$$
Finally, the optimal control on the interval takes the form
$$u_i^*(t) = \frac{\psi_0 - \hat{q}t}{2r_i}. \tag{A3}$$
Using expression (A3), we can rewrite the dynamic equation as
$$\dot{x}(t) = \sum_{i=1}^n u_i(t) = \frac{\psi_0 - \hat{q}t}{2}\left(\frac{1}{r_1}+\dots+\frac{1}{r_n}\right), \qquad x(0) = x_0, \quad x(\bar{T}-\delta) = x_1.$$
Simple integration gives the form of the optimal trajectory on the interval:
$$x(t) = \frac{2\psi_0 t - \hat{q}t^2}{4}\left(\frac{1}{r_1}+\dots+\frac{1}{r_n}\right) + x_0.$$
Using the condition at the right end of the interval, $x(\bar{T}-\delta) = x_1$, we can obtain the value of $\psi_0$. Thus,
$$\psi_0 = \frac{2(x_1-x_0)}{\left(\frac{1}{r_1}+\dots+\frac{1}{r_n}\right)(\bar{T}-\delta)} + \frac{\hat{q}(\bar{T}-\delta)}{2}.$$
Finally,
$$x^*(t)\big|_{I_1} = \frac{(x_1-x_0)\,t}{\bar{T}-\delta} + \frac{1}{4}\hat{q}\,t\,(\bar{T}-\delta-t)\left(\frac{1}{r_1}+\dots+\frac{1}{r_n}\right) + x_0, \tag{A4}$$
$$u_i^*(t)\big|_{I_1} = \frac{x_1-x_0}{r_i\left(\frac{1}{r_1}+\dots+\frac{1}{r_n}\right)(\bar{T}-\delta)} + \frac{\hat{q}(\bar{T}-\delta)}{4r_i} - \frac{\hat{q}t}{2r_i}. \tag{A5}$$
It is important to note that the optimal control (A5) must be checked for membership in the set of admissible controls $U_i$ on the considered time interval. If it does not belong to this set, the solution must be sought on the boundary of the set.
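The closed-form expressions for $I_1$ can be sanity-checked numerically. The sketch below (with illustrative parameter values, not those of the paper's example) verifies that the reconstructed trajectory meets both boundary conditions and that its finite-difference derivative equals $\sum_i u_i^*(t)$:

```python
# Numerical check for interval I1 (illustrative parameters only).
n = 2
q = [1.0, 2.0]            # q_i
r = [0.5, 1.0]            # r_i
x0, x1 = 1.0, 3.0         # boundary states x(0) = x0, x(T_bar - delta) = x1
T_bar, delta = 4.0, 1.0   # so I1 = [0, T_bar - delta] = [0, 3]

q_hat = sum(q)                   # q_hat = q_1 + ... + q_n
S = sum(1.0 / ri for ri in r)    # 1/r_1 + ... + 1/r_n
T_end = T_bar - delta

def x_opt(t):
    # optimal trajectory x*(t) on I1, Eq. (A4)
    return (x1 - x0) * t / T_end + 0.25 * q_hat * t * (T_end - t) * S + x0

def u_opt(i, t):
    # optimal control u_i*(t) on I1, Eq. (A5)
    return ((x1 - x0) / (r[i] * S * T_end)
            + q_hat * T_end / (4 * r[i]) - q_hat * t / (2 * r[i]))

# boundary conditions
assert abs(x_opt(0.0) - x0) < 1e-12
assert abs(x_opt(T_end) - x1) < 1e-12

# dynamics x_dot = sum_i u_i, checked by central finite differences
h = 1e-6
for t in (0.5, 1.5, 2.5):
    x_dot = (x_opt(t + h) - x_opt(t - h)) / (2 * h)
    assert abs(x_dot - sum(u_opt(i, t) for i in range(n))) < 1e-5
print("I1 formulas consistent")
```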

Appendix A.2. Interval I2

Similarly to the previous interval, consider the cooperative form of the game; the maximization problem to be solved now includes the trace of the random component:
$$\int_{\bar{T}-\delta}^{\bar{T}+\delta}\sum_{i=1}^n\left(q_i x(t) - r_i u_i^2(t)\right)\left(1-(1-p_1-p_2)\frac{t-\bar{T}+\delta}{2\delta}\right)dt \to \max_{u_i}. \tag{A6}$$
Using the Pontryagin maximum principle under the assumption that both ends of the trajectory on the interval $I_2$ are fixed, $x(\bar{T}-\delta) = x_1$, $x(\bar{T}+\delta) = x_2$, we write the Hamiltonian for (A6):
$$H(x(t),u(t),\psi(t)) = \psi(t)\sum_{i=1}^n u_i(t) + \sum_{i=1}^n\left(q_i x(t) - r_i u_i^2(t)\right)\left(1-(1-p_1-p_2)\frac{t-\bar{T}+\delta}{2\delta}\right). \tag{A7}$$
From the equation based on the first derivative of (A7), we obtain the expression for the optimal control; the second derivative confirms that this control corresponds to the maximum:
$$u_i^*(t) = \frac{\psi(t)\,\delta}{r_i\left(\delta - t + \bar{T} + (p_1+p_2)(t-\bar{T}+\delta)\right)},$$
$$\frac{\partial^2 H}{\partial u_i^2} = -2r_i\left(1-(1-p_1-p_2)\frac{t-\bar{T}+\delta}{2\delta}\right) < 0.$$
The differential equation for the adjoint variable $\psi(t)$ takes the form
$$\frac{\partial\psi}{\partial t} = -\frac{\partial H}{\partial x} = -\hat{q}\left(1-(1-p_1-p_2)\frac{t-\bar{T}+\delta}{2\delta}\right), \qquad \hat{q} = \sum_{i=1}^n q_i,$$
$$\psi(t) = -\hat{q}\int_{\bar{T}-\delta}^{t}\left(1-(1-p_1-p_2)\frac{s-\bar{T}+\delta}{2\delta}\right)ds + \psi_1, \qquad \psi(\bar{T}-\delta) = \psi_1.$$
Solving the above,
$$\psi(t) = -\hat{q}(t-\bar{T}+\delta) + \frac{\hat{q}(1-p_1-p_2)(t-\bar{T}+\delta)^2}{4\delta} + \psi_1,$$
$$u_i^*(t) = \frac{-\hat{q}(t-\bar{T}+\delta) + \frac{\hat{q}(1-p_1-p_2)(t-\bar{T}+\delta)^2}{4\delta} + \psi_1}{2r_i\left(1-(1-p_1-p_2)\frac{t-\bar{T}+\delta}{2\delta}\right)}.$$
Finally,
$$u_i^*(t) = \frac{-\hat{q}(t-\bar{T}+\delta)\left(4\delta-(1-p_1-p_2)(t-\bar{T}+\delta)\right) + 4\psi_1\delta}{4r_i\left(2\delta-(1-p_1-p_2)(t-\bar{T}+\delta)\right)}.$$
The dynamic equation reads
$$\dot{x}(t) = \frac{-\hat{q}(t-\bar{T}+\delta)\left(4\delta-(1-p_1-p_2)(t-\bar{T}+\delta)\right) + 4\psi_1\delta}{4\left(2\delta-(1-p_1-p_2)(t-\bar{T}+\delta)\right)}\left(\frac{1}{r_1}+\dots+\frac{1}{r_n}\right),$$
$$x(\bar{T}-\delta) = x_1, \qquad x(\bar{T}+\delta) = x_2.$$
By solving
$$x(t) = \int_{\bar{T}-\delta}^{t}\frac{-\hat{q}(s-\bar{T}+\delta)\left(4\delta-(1-p_1-p_2)(s-\bar{T}+\delta)\right) + 4\psi_1\delta}{4\left(2\delta-(1-p_1-p_2)(s-\bar{T}+\delta)\right)}\sum_{i=1}^n\frac{1}{r_i}\,ds + x_1,$$
we obtain
$$x(t) = \sum_{i=1}^n\frac{1}{4r_i}\left[\frac{2\hat{q}\delta}{(1-p_1-p_2)^2}\left(2\delta\ln\frac{\chi(t)}{2\delta}-\chi(t)+2\delta\right) - \frac{\hat{q}(t-\bar{T}+\delta)^2}{2} - \frac{4\psi_1\delta}{1-p_1-p_2}\ln\frac{\chi(t)}{2\delta}\right] + x_1,$$
where $\chi(t) = 2\delta-(1-p_1-p_2)(t-\bar{T}+\delta)$.
We can find $\psi_1$ from the condition $x(\bar{T}+\delta) = x_2$:
$$\psi_1 = \frac{\hat{q}\delta}{1-p_1-p_2} + \frac{\hat{q}\delta(1+p_1+p_2)}{2\ln(p_1+p_2)} - \frac{(x_2-x_1)(1-p_1-p_2)}{\delta\ln(p_1+p_2)\sum_{i=1}^n\frac{1}{r_i}}.$$
Overall, substituting $\psi_1$ and simplifying,
$$x^*(t)\big|_{I_2} = \frac{\hat{q}\sum_{i=1}^n\frac{1}{r_i}}{2(1-p_1-p_2)^2}\left(\delta^2-\frac{\chi^2(t)}{4}\right) + \frac{2(x_2-x_1)(1-p_1-p_2) - \hat{q}\sum_{i=1}^n\frac{1}{r_i}\,\delta^2(1+p_1+p_2)}{2(1-p_1-p_2)\ln(p_1+p_2)}\,\ln\frac{\chi(t)}{2\delta} + x_1,$$
$$u_i^*(t)\big|_{I_2} = \frac{\hat{q}\,\chi(t)}{4r_i(1-p_1-p_2)} - \frac{2(x_2-x_1)(1-p_1-p_2) - \hat{q}\,\delta^2\sum_{j=1}^n\frac{1}{r_j}\,(1+p_1+p_2)}{2r_i\sum_{j=1}^n\frac{1}{r_j}\,\chi(t)\ln(p_1+p_2)}. \tag{A8}$$
Similarly, the optimal control (A8) must be checked for membership in the set of admissible controls $U_i$ on the considered time interval. If it does not belong to this set, the solution must be sought on the boundary of the set.
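Because the formulas on $I_2$ involve $\chi(t)$, the logarithmic terms, and the constant $\psi_1$, a numerical cross-check is useful. The sketch below (illustrative parameters only, not those of the paper's example; it requires $p_1+p_2<1$) verifies the boundary conditions and the state dynamics for the reconstructed $x^*(t)|_{I_2}$ and $u_i^*(t)|_{I_2}$:

```python
import math

# Illustrative parameters only (not the paper's example); p1 + p2 < 1 required.
q = [1.0, 2.0]; r = [0.5, 1.0]
p1, p2 = 0.2, 0.3
T_bar, delta = 4.0, 1.0      # I2 = [T_bar - delta, T_bar + delta] = [3, 5]
x1, x2 = 3.0, 5.0            # boundary states on I2

q_hat = sum(q)
S = sum(1.0 / ri for ri in r)
kappa = 1.0 - p1 - p2        # shorthand for 1 - p1 - p2
p = p1 + p2
lnp = math.log(p)            # negative, since p < 1

def chi(t):
    return 2 * delta - kappa * (t - T_bar + delta)

def x_opt(t):
    # optimal trajectory x*(t) on I2
    c = chi(t)
    return (q_hat * S / (2 * kappa**2)) * (delta**2 - c**2 / 4) \
        + (2 * (x2 - x1) * kappa - q_hat * S * delta**2 * (1 + p)) \
          / (2 * kappa * lnp) * math.log(c / (2 * delta)) + x1

def u_opt(i, t):
    # optimal control u_i*(t) on I2, Eq. (A8)
    c = chi(t)
    return q_hat * c / (4 * r[i] * kappa) \
        - (2 * (x2 - x1) * kappa - q_hat * delta**2 * S * (1 + p)) \
          / (2 * r[i] * S * c * lnp)

# boundary conditions at both ends of I2
assert abs(x_opt(T_bar - delta) - x1) < 1e-9
assert abs(x_opt(T_bar + delta) - x2) < 1e-9

# dynamics x_dot = sum_i u_i, checked by central finite differences
h = 1e-6
for t in (3.2, 4.0, 4.7):
    x_dot = (x_opt(t + h) - x_opt(t - h)) / (2 * h)
    assert abs(x_dot - sum(u_opt(i, t) for i in range(len(r)))) < 1e-5
print("I2 formulas consistent")
```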

Appendix A.3. Interval I3

Consider again the cooperative form of the game. We solve the following maximization problem:
$$\int_{\bar{T}+\delta}^{T_1}\sum_{i=1}^n(p_1+p_2)\left(q_i x(t) - r_i u_i^2(t)\right)dt \to \max_{u_i}. \tag{A9}$$
Using the Pontryagin maximum principle under the assumption that both ends of the trajectory on the interval $I_3$ are fixed, $x(\bar{T}+\delta) = x_2$, $x(T_1) = x_3$, we write the Hamiltonian for (A9):
$$H(x(t),u(t),\psi(t)) = \psi(t)\sum_{i=1}^n u_i(t) + \sum_{i=1}^n(p_1+p_2)\left(q_i x(t) - r_i u_i^2(t)\right). \tag{A10}$$
From the equation based on the first derivative of (A10), we obtain the expression for the optimal control; the second derivative confirms that this control corresponds to the maximum:
$$u_i^*(t) = \frac{\psi(t)}{2r_i(p_1+p_2)}, \qquad \frac{\partial^2 H}{\partial u_i^2} = -2r_i(p_1+p_2) < 0.$$
The differential equation for the adjoint variable $\psi(t)$ takes the form
$$\frac{\partial\psi}{\partial t} = -\frac{\partial H}{\partial x} = -\sum_{i=1}^n q_i(p_1+p_2) = -\hat{q}(p_1+p_2), \qquad \hat{q} = \sum_{i=1}^n q_i.$$
Taking the integral
$$\psi(t) = -\int_{\bar{T}+\delta}^{t}\hat{q}(p_1+p_2)\,ds + \psi_2, \qquad \psi(\bar{T}+\delta) = \psi_2,$$
we obtain
$$\psi(t) = -\hat{q}(p_1+p_2)(t-\bar{T}-\delta) + \psi_2.$$
The expression for the optimal control takes the form
$$u_i^*(t) = -\frac{\hat{q}(t-\bar{T}-\delta)}{2r_i} + \frac{\psi_2}{2r_i(p_1+p_2)}.$$
From the dynamic equation
$$\dot{x}(t) = \sum_{i=1}^n u_i(t) = \frac{1}{2(p_1+p_2)}\sum_{i=1}^n\frac{1}{r_i}\left(-\hat{q}(p_1+p_2)(t-\bar{T}-\delta) + \psi_2\right),$$
we obtain
$$x(t) = \int_{\bar{T}+\delta}^{t}\frac{1}{2(p_1+p_2)}\sum_{i=1}^n\frac{1}{r_i}\left(-\hat{q}(p_1+p_2)(s-\bar{T}-\delta) + \psi_2\right)ds + x_2,$$
$$x(t) = \sum_{i=1}^n\frac{1}{r_i}\left(\frac{\psi_2(t-\bar{T}-\delta)}{2(p_1+p_2)} - \frac{\hat{q}(t-\bar{T}-\delta)^2}{4}\right) + x_2.$$
From the condition $x(T_1) = x_3$, we can obtain
$$\psi_2 = \frac{2(p_1+p_2)(x_3-x_2)}{\sum_{i=1}^n\frac{1}{r_i}\,(T_1-\bar{T}-\delta)} + \frac{\hat{q}(p_1+p_2)(T_1-\bar{T}-\delta)}{2}.$$
Overall,
$$x^*(t)\big|_{I_3} = \sum_{i=1}^n\frac{1}{r_i}\left(\frac{(x_3-x_2)(t-\bar{T}-\delta)}{\sum_{j=1}^n\frac{1}{r_j}\,(T_1-\bar{T}-\delta)} + \frac{\hat{q}(T_1-\bar{T}-\delta)(t-\bar{T}-\delta)}{4} - \frac{\hat{q}(t-\bar{T}-\delta)^2}{4}\right) + x_2,$$
$$u_i^*(t)\big|_{I_3} = \frac{x_3-x_2}{r_i\sum_{j=1}^n\frac{1}{r_j}\,(T_1-\bar{T}-\delta)} + \frac{\hat{q}(T_1+\bar{T}+\delta-2t)}{4r_i}. \tag{A11}$$
Similarly, the optimal control (A11) must be checked for membership in the set of admissible controls $U_i$ on the considered time interval. If it does not belong to this set, the solution must be sought on the boundary of the set.
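The $I_3$ formulas admit the same kind of numerical sanity check as the previous intervals. The sketch below (illustrative parameters only, not those of the paper's example) verifies both boundary conditions and the dynamics:

```python
# Numerical check for interval I3 = [T_bar + delta, T1] (illustrative parameters).
q = [1.0, 2.0]; r = [0.5, 1.0]
T_bar, delta, T1 = 4.0, 1.0, 7.0
x2, x3 = 5.0, 6.0            # boundary states x(T_bar + delta) = x2, x(T1) = x3

q_hat = sum(q)
S = sum(1.0 / ri for ri in r)
a, b = T_bar + delta, T1     # left and right ends of I3

def x_opt(t):
    # optimal trajectory x*(t) on I3
    s = t - a
    return (x3 - x2) * s / (b - a) + q_hat * S * (b - a) * s / 4 \
        - q_hat * S * s**2 / 4 + x2

def u_opt(i, t):
    # optimal control u_i*(t) on I3, Eq. (A11)
    return (x3 - x2) / (r[i] * S * (b - a)) + q_hat * (b + a - 2 * t) / (4 * r[i])

# boundary conditions
assert abs(x_opt(a) - x2) < 1e-12
assert abs(x_opt(b) - x3) < 1e-12

# dynamics x_dot = sum_i u_i, checked by central finite differences
h = 1e-6
for t in (5.5, 6.0, 6.5):
    x_dot = (x_opt(t + h) - x_opt(t - h)) / (2 * h)
    assert abs(x_dot - sum(u_opt(i, t) for i in range(len(r)))) < 1e-5
print("I3 formulas consistent")
```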

Appendix A.4. Interval I4

Consider again the cooperative form of the game on the last interval. We solve the following maximization problem:
$$\int_{T_1}^{T_2}\sum_{i=1}^n p_2\left(q_i x(t) - r_i u_i^2(t)\right)dt \to \max_{u_i}. \tag{A12}$$
Using the Pontryagin maximum principle under the assumption that only the left end of the trajectory on $I_4$ is fixed, $x(T_1) = x_3$, we write the Hamiltonian for (A12):
$$H(x,u,\psi) = \psi\sum_{i=1}^n u_i(t) + \sum_{i=1}^n p_2\left(q_i x(t) - r_i u_i^2(t)\right). \tag{A13}$$
From the equation based on the first derivative of (A13), we obtain the expression for the optimal control; the second derivative confirms that this control corresponds to the maximum:
$$u_i^*(t) = \frac{\psi(t)}{2r_i p_2}, \qquad \frac{\partial^2 H}{\partial u_i^2} = -2r_i p_2 < 0.$$
The differential equation for the adjoint variable $\psi(t)$ takes the form
$$\frac{\partial\psi}{\partial t} = -\frac{\partial H}{\partial x} = -\sum_{i=1}^n q_i p_2 = -\hat{q}p_2, \qquad \hat{q} = \sum_{i=1}^n q_i.$$
Solving
$$\psi(t) = -\int_{T_1}^{t}\hat{q}p_2\,ds + \psi_3, \qquad \psi(T_1) = \psi_3,$$
with the transversality condition $\psi(T_2) = 0$ (the right end of the trajectory is free), we obtain
$$\psi(t) = -\hat{q}p_2(t-T_2).$$
The expression for the optimal control takes the form
$$u_i^*(t) = -\frac{\hat{q}(t-T_2)}{2r_i}.$$
Let us substitute the controls into the dynamic equation. We obtain
$$\dot{x}(t) = \sum_{i=1}^n u_i(t) = -\hat{q}(t-T_2)\sum_{i=1}^n\frac{1}{2r_i}, \qquad x(T_1) = x_3.$$
Thus,
$$x(t) = -\int_{T_1}^{t}\hat{q}(s-T_2)\sum_{i=1}^n\frac{1}{2r_i}\,ds + x_3, \qquad x(t) = -\hat{q}\sum_{i=1}^n\frac{1}{4r_i}\,(t-T_1)(t+T_1-2T_2) + x_3.$$
Overall, we get
$$x^*(t)\big|_{I_4} = -\hat{q}\sum_{i=1}^n\frac{1}{4r_i}\,(t-T_1)(t+T_1-2T_2) + x_3, \qquad u_i^*(t)\big|_{I_4} = -\frac{\hat{q}(t-T_2)}{2r_i}. \tag{A14}$$
Similarly, the optimal control (A14) must be checked for membership in the set of admissible controls $U_i$ on the considered time interval. If it does not belong to this set, the solution must be sought on the boundary of the set.
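On the last interval the right end of the trajectory is free, so in addition to the boundary condition at $T_1$ and the dynamics, one can check that the transversality condition $\psi(T_2)=0$ makes the optimal controls vanish at $T_2$. A sketch with illustrative parameters (not those of the paper's example):

```python
# Numerical check for interval I4 = [T1, T2], free right end (illustrative parameters).
q = [1.0, 2.0]; r = [0.5, 1.0]
T1, T2 = 7.0, 9.0
x3 = 6.0                      # state value at t = T1

q_hat = sum(q)
S4 = sum(1.0 / (4 * ri) for ri in r)   # sum_i 1/(4 r_i)

def x_opt(t):
    # optimal trajectory x*(t) on I4, Eq. (A14)
    return -q_hat * S4 * (t - T1) * (t + T1 - 2 * T2) + x3

def u_opt(i, t):
    # optimal control u_i*(t) on I4, Eq. (A14)
    return -q_hat * (t - T2) / (2 * r[i])

# left boundary condition
assert abs(x_opt(T1) - x3) < 1e-12
# transversality: psi(T2) = 0 implies the controls vanish at the free right end
assert all(abs(u_opt(i, T2)) < 1e-12 for i in range(len(r)))

# dynamics x_dot = sum_i u_i, checked by central finite differences
h = 1e-6
for t in (7.5, 8.0, 8.5):
    x_dot = (x_opt(t + h) - x_opt(t - h)) / (2 * h)
    assert abs(x_dot - sum(u_opt(i, t) for i in range(len(r)))) < 1e-5
print("I4 formulas consistent")
```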

References

  1. Basar, T.; Olsder, G. Dynamic Noncooperative Game Theory; SIAM: New York, NY, USA, 1999. [Google Scholar]
  2. Petrosyan, L.; Danilov, N. Cooperative Differential Games and Their Applications; Izd. Tomskogo University: Tomsk, Russia, 1982. [Google Scholar]
  3. Dockner, E.J.; Jørgensen, S.; Long, N.V.; Sorger, G. Differential Games in Economics and Management Science. In Cambridge Books; Cambridge University Press: Cambridge, UK, 2000. [Google Scholar]
  4. Yeung, D.; Petrosyan, L. Cooperative Stochastic Differential Games; Springer Science+Business Media: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  5. Yaari, M. Uncertain lifetime, life insurance, and the theory of the consumer. Rev. Econ. Stud. 1965, 32, 137–150. [Google Scholar] [CrossRef]
  6. Petrosyan, L.; Murzov, N. Game-theoretic problems of mechanics. Litovsk. Math. Sb. 1966, 7, 423–433. [Google Scholar]
  7. Petrosyan, L.; Shevkoplyas, E. Cooperative differential games with stochastic time. Vestn. Petersburg Univ. Math. 2000, 33, 18–23. [Google Scholar]
  8. Marin-Solano, J.; Shevkoplyas, E. Non-constant discounting and differential games with random time horizon. Automatica 2011, 47, 2626–2638. [Google Scholar] [CrossRef]
  9. Gromova, E.; Malakhova, A.; Palestini, A. Payoff Distribution in a Multi-Company Extraction Game with Uncertain Duration. Mathematics 2018, 6, 165. [Google Scholar] [CrossRef] [Green Version]
  10. Gromova, E.; Tur, A. On the form of integral payoff in differential games with random duration. In Proceedings of the 2017 XXVI International Conference on Information, Communication and Automation Technologies (ICAT), Sarajevo, Bosnia-Herzegovina, 26–28 October 2017; pp. 1–6. [Google Scholar]
  11. Gromov, D.; Gromova, E. On a Class of Hybrid Differential Games. Dyn. Games Appl. 2017, 7, 266–288. [Google Scholar] [CrossRef]
  12. Reddy, P.V.; Schumacher, J.M.; Engwerda, J.C. Analysis of optimal control problems for hybrid systems with one state variable. SIAM J. Control. Optim. 2020, 58, 3262–3292. [Google Scholar] [CrossRef]
  13. Bonneuil, N.; Boucekkine, R. Optimal transition to renewable energy with threshold of irreversible pollution. Eur. J. Oper. Res. 2016, 248, 257–262. [Google Scholar] [CrossRef] [Green Version]
  14. Elliott, R.J.; Siu, T.K. A stochastic differential game for optimal investment of an insurer with regime switching. Quant. Financ. 2011, 11, 365–380. [Google Scholar] [CrossRef]
  15. Reddy, P.; Schumacher, J.; Engwerda, J. Optimal management with hybrid dynamics—The shallow lake problem. In Mathematical Control Theory I; Springer: Berlin/Heidelberg, Germany, 2015; pp. 111–136. [Google Scholar]
  16. Pontryagin, L.; Boltyanskii, V.; Gamkrelidze, R.; Mishchenko, E. The Mathematical Theory of Optimal Processes; Interscience: New York, NY, USA, 1962. [Google Scholar]
  17. Gromov, D.; Gromova, E. Differential games with random duration: A hybrid systems formulation. Contrib. Game Theory Manag. 2014, 7, 104–119. [Google Scholar]
  18. Tur, A.V.; Magnitskaya, N.G. Feedback and Open-Loop Nash Equilibria in a Class of Differential Games with Random Duration. Contrib. Game Theory Manag. 2020, 13, 415–426. [Google Scholar] [CrossRef]
  19. Gromova, E.; Magnitskaya, N. Solution of the differential game with hybrid structure. Contrib. Game Theory Manag. 2019, 12, 159–176. [Google Scholar]
  20. Feichtinger, G.; Jørgensen, S. Differential game models in management science. Eur. J. Oper. Res. 1983, 14, 137–155. [Google Scholar] [CrossRef]
  21. Jørgensen, S.; Zaccour, G. Developments in differential game theory and numerical methods: Economic and management applications. Comput. Manag. Sci. 2007, 4, 159–181. [Google Scholar] [CrossRef]
  22. Malakhova, A.P.; Gromova, E.V. Strongly Time-Consistent Core in Differential Games with Discrete Distribution of Random Time Horizon. Math. Appl. 2018, 46, 197–209. [Google Scholar] [CrossRef]
  23. Abdel-Wahab, O.; Bentahar, J.; Otrok, H.; Mourad, A. Resource-Aware Detection and Defense System Against Multi-Type Attacks in the Cloud: Repeated Bayesian Stackelberg Game. IEEE Trans. Dependable Secur. Comput. 2019. [Google Scholar] [CrossRef]
Figure 1. An example of the discrete distribution of the random time horizon.
Figure 2. The optimal control of the first player on the four time subintervals.
Figure 3. The optimal trajectory on the four time subintervals, connected at the points $x_1$, $x_2$, $x_3$ (bold black dots), respectively.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

