Abstract
A class of differential games with random duration is considered. The duration of the game is assumed to be a random variable taking values in a given finite interval. The cumulative distribution function (CDF) of this random variable is assumed to be discontinuous, with two jumps on the interval. It follows that a player's payoff takes the form of a sum of integrals over different but adjacent time intervals. In addition, the probability that the game ends on the first interval is zero, which results in a terminal payoff on this interval. A method for constructing the optimal solution in the cooperative scenario of such games is proposed. The results are illustrated by a differential game of investment in the public stock of knowledge.
1. Introduction
Dynamic processes with many participants are well described within the framework of differential game theory (e.g., for differential games see [], also [] in Russian, and their applications in economics in []). Meanwhile, many natural processes (such as ecological dynamics) involve random components []. Assuming that the time horizon of the process is random therefore allows constructing models even closer to reality [], which makes differential game models with random duration particularly important. This type of model was initially introduced in [], where a differential game of pursuit with terminal payoff and stochastic terminal time was considered; the first problem formulation with an integral payoff function and a random time horizon with a continuous distribution function was considered in [] and later extended to the case of a differential game with random time horizon and discounting in [].
In this work, a class of differential games with a random time horizon is considered. It is assumed that the duration of the game is a random variable with a discontinuous cumulative distribution function (CDF). Particular classes of discontinuous models were formulated in [] for the case with only one jump and in [] for the step form of the CDF. In this paper, the discontinuous structure of the CDF implies that the game cannot be interrupted before a certain point and may then stop either within a continuous interval or at certain discrete time instants. Moreover, the probability that the game ends on the first interval is zero, which results in a terminal payoff on this time interval. This game formulation implies that the player's payoff takes the form of a sum of integrals over different but adjacent time intervals, so the problem can be considered in terms of hybrid differential games [,]. For more examples of games with regime switching see, e.g., [,,].
A solution method for this wide class of problems, based on dynamic programming [], the Pontryagin maximum principle [] and parametrization, was first suggested in [] for a class of differential games with a random time horizon and a continuous but composite CDF, and was further generalized in []. Based on [], a number of games with payoffs defined on adjacent time intervals were solved in [,].
In this paper, a method to construct optimal controls for the introduced class of games with a discontinuous CDF is proposed. The method treats the connecting trajectory points at the edges of the intervals as numeric parameters and applies the Pontryagin maximum principle [] on every interval. The numeric values of these points are obtained by maximizing the total payoff using a dynamic programming approach. The method can also be used in a non-cooperative game scenario to find Nash equilibria, and it can be further applied to a wide class of differential games with integral payoffs defined on different time intervals.
The results obtained in this work are illustrated by the game-theoretical model of investment in the public stock of knowledge [] (also see []), modified according to the described stochastic framework. The considered model refers to unstable market conditions, which bring a random nature to the process.
The paper is organized as follows. Section 2 is devoted to the game formulation and the main assumptions made to construct a feasible solution. Section 3 specifies the stock investment model and proposes the optimization approach for the game formulation considered in the paper. In Section 4.1, the calculations for the adjacent time intervals are given; all details necessary to reproduce this result are given in Appendix A. Section 4.2 provides the dynamic programming approach to determine the numerical values of the state at the end points of the intervals. The limiting case in which the considered CDF has no jumps is investigated in Section 5.1, and the limiting case of a step CDF is described in Section 5.2. Section 6 presents a numeric example for the stock investment model.
2. Problem Formulation
Consider a differential game of n participants (players): . Let be the set of players. Assume that the game starts at the initial moment from the initial state ; the duration of the game is a random variable with the cumulative distribution function (CDF) described below. The following assumptions are made:
- the interval over which the game is played is , where and T are random variables defined on the interval , . The random variable corresponds to the discontinuous CDF where is assumed to be an absolutely continuous non-decreasing function, , , . The CDF of the random variable is assumed to be discontinuous with two jumps occurring on the bounded interval. An example of such a CDF is shown in Figure 1;
Figure 1. The example of discrete distribution of random time horizon.
- the controls are open-loop strategies;
- the controls belong to the sets of admissible controls , which consist of all measurable functions on the interval , taking values in the set of admissible control values , which are in turn convex compact subsets of ;
- the instantaneous payoff of the i-th player at the moment is defined as . To shorten the notation, we write where ;
- in the deterministic case the integral payoff is where is a known moment of the end of the game, ;
- in the case of random time horizon, the mathematical expectation of the integral payoff is considered. Thus, the i-th player’s integral functional is:
It was shown previously [] that (4) can be written as follows:
Proposition 1.
For the considered game formulation, the payoff is a sum of functionals defined over four time intervals:
Proof.
The CDF of the random variable in the described game formulation is a piecewise function on the four time intervals. It is easy to see that, because the probability of the game ending on the first interval is zero, the payoff of the player on this interval takes a deterministic form. The payoff on each of the other intervals is the mathematical expectation of the integral payoff corresponding to the particular form of the CDF on that interval. Thus, according to (5) and the proof in [], the total payoff of the player can be written as the sum of four integrals over adjacent intervals (6). □
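The survival-function identity E[∫_0^T h(t) dt] = ∫ (1 − F(t)) h(t) dt, which holds for any distribution of T (including discontinuous ones), can be checked numerically. The sketch below is only an illustration under stated assumptions: the switching instants, jump sizes, the linear continuous part of the CDF, and the payoff rate h are hypothetical values and are not taken from the paper.

```python
import random

# Hypothetical parameters (assumptions, not the paper's data).
T1, T2, T3 = 1.0, 2.0, 3.0   # switching instants
P1, P2 = 0.3, 0.2            # jump sizes of the CDF at T2 and T3
C = 1.0 - P1 - P2            # probability mass of the continuous part on [T1, T2)


def cdf(t):
    """Assumed CDF: zero, then linear growth, a jump at T2, flat, a jump at T3."""
    if t < T1:
        return 0.0
    if t < T2:
        return C * (t - T1) / (T2 - T1)
    if t < T3:
        return C + P1
    return 1.0


def sample_T(rng):
    """Draw the random duration by inverting the CDF above."""
    u = rng.random()
    if u < C:
        return T1 + (T2 - T1) * u / C
    if u < C + P1:
        return T2
    return T3


def h(t):
    """Hypothetical instantaneous payoff rate."""
    return 1.0 + t


def H(t):
    """Integral of h over [0, t]."""
    return t + 0.5 * t * t


def survival_integral(n=30000):
    """Midpoint rule for the integral of (1 - F(t)) * h(t) over [0, T3]."""
    dt = T3 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += (1.0 - cdf(t)) * h(t)
    return total * dt


def monte_carlo(n=200000, seed=0):
    """Monte Carlo estimate of the expected integral payoff E[H(T)]."""
    rng = random.Random(seed)
    return sum(H(sample_T(rng)) for _ in range(n)) / n


if __name__ == "__main__":
    print("survival-function integral:", survival_integral())
    print("Monte Carlo expectation:   ", monte_carlo())
```

On the first interval the weight 1 − F(t) equals one, so the integrand there is the undiscounted payoff rate, which matches the deterministic form of the payoff on that interval noted in the proof.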
Obviously, the problem can be easily modified for the more general case with jumps at .
3. Model Example
Consider a model example within the framework of the differential game formulation described in the previous section. Namely, consider the so-called stock investment model described in [] (also see []). There, n individuals invest in the stocks related to a particular industry. The state variable corresponds to the number of stocks held at the moment t, and is the investment strategy of agent i at time t. The game dynamics take the form of an accumulation process
Assume that the instantaneous payoff function is linear-quadratic, since each agent derives linear utility from the consumption of the stock
Assume also that the CDF of the game duration partly corresponds to a uniform distribution
To further simplify the investigation, assume that the initial moment of the game is equal to zero: .
Consider the cooperative form of the game. Assume that all participants decided to cooperate and, thus, unite their efforts to maximize the total payoff.
The optimization problem can be solved using a parametric method on the four separate intervals. On every interval, we use the Pontryagin maximum principle [].
We start by introducing three yet undefined values of the state at the respective switching instants:
Let denote the i-th interval and let be the total payoff on interval i, which depends on the numeric parameter . We thus obtain the following connected optimization problems:
To obtain the numeric value of every parameter , the following algorithm based on the dynamic programming principle can be used:
- compute parameterized by ;
- compute parameterized by , using the previously obtained expression for , which depends on ;
- compute .
Thus, all three numeric values for can be unambiguously obtained.
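The backward computation of the connecting points can be sketched as follows. This is a minimal illustration, not the paper's computation: the interval payoffs `J` and `J4`, the constants, and the search bounds are hypothetical concave quadratics chosen so that the answer is easy to verify by hand, and a golden-section search stands in for the closed-form maximization.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # golden-ratio factor


def argmax(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:              # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
        else:                    # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
    return 0.5 * (a + b)


# Hypothetical data (assumptions for illustration only).
X0 = 0.0                 # initial state
STEPS = (1.0, 1.0, 1.0)  # preferred state increments on intervals 1..3
TARGET = 7.0             # preferred state level on the last interval
LO, HI = 0.0, 10.0       # search bounds for the connecting points


def J(i, a, b):
    """Hypothetical payoff on interval i (1..3) with endpoint states a and b."""
    return -(b - a - STEPS[i - 1]) ** 2


def J4(x3):
    """Hypothetical payoff on the last interval, which has a free right end."""
    return -(x3 - TARGET) ** 2


def best_x3(x2):
    """Step 1: the optimal third connecting point as a function of x2."""
    return argmax(lambda x3: J(3, x2, x3) + J4(x3), LO, HI)


def V3(x2):
    x3 = best_x3(x2)
    return J(3, x2, x3) + J4(x3)


def best_x2(x1):
    """Step 2: the optimal second connecting point as a function of x1."""
    return argmax(lambda x2: J(2, x1, x2) + V3(x2), LO, HI)


def V2(x1):
    x2 = best_x2(x1)
    return J(2, x1, x2) + V3(x2)


def solve():
    """Step 3: maximize over x1, then recover x2 and x3 by substitution."""
    x1 = argmax(lambda x1: J(1, X0, x1) + V2(x1), LO, HI)
    x2 = best_x2(x1)
    x3 = best_x3(x2)
    return x1, x2, x3


if __name__ == "__main__":
    print(solve())
```

For these hypothetical payoffs the first-order conditions can be solved by hand and give the connecting points 2, 4 and 6, which the nested search reproduces up to the search tolerance.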
4. Computations
4.1. Intervals Calculations
In this section, the expressions for the optimal control and optimal trajectory are obtained under the assumption that the three numeric parameters are given. The detailed calculations for all four intervals are presented in Appendix A. On every interval, the Pontryagin maximum principle is used, with either one or both endpoints of the trajectory fixed.
For the interval , we obtain the following expressions for the optimal trajectory and controls:
For the interval , we have the following expressions for the optimal trajectory and controls:
where
For the interval , we obtain:
For the interval , we obtain:
4.2. Computation of the Parameters
As stated above, the connecting trajectory points at the boundaries of the intervals are obtained by the following algorithm, which uses the dynamic programming principle of optimality:
- compute parameterized by ;
- compute parameterized by , using the previously obtained expression for , which depends on ;
- compute .
Every expression is linear-quadratic with respect to , respectively. Thus, the optimal values are easily obtained from the first-order conditions. As a result, we obtain three linear equations, and after solving the system of these three equations, we obtain
5. Analysis of the Limiting Cases
5.1. Assumption of no Jumps in CDF
Consider a limiting case of the game formulated in the paper. Namely, suppose that the probabilities responsible for the discontinuous structure of the CDF tend to zero:
Under this assumption, the CDF becomes a continuous function and the game cannot end after . Thus, we disregard the intervals .
The dynamics and the instantaneous payoff function of the game remain the same:
The CDF of the game takes the form
This formulation of the stock investment game was investigated in []. Consider the cooperative form of the game:
The optimization problem is solved using the parametric method, but with only one parameter .
To obtain the numeric value of the parameter , the following maximization problem must be solved:
The solution of this game is given in the form of the optimal control, the optimal trajectory and the numeric value of . Using the Pontryagin maximum principle, it is not difficult to obtain the following expressions (denoted by the superscript A):
On the other hand, we can use the solution (10), (11) and (14) of the initial game described in Section 4 and set .
The connection point (14) under the assumption that and tend to 0 coincides with the result obtained in (15):
Similarly, the optimal control on the first interval (10) under the assumption that and tend to 0 also coincides with the control obtained for the reduced model (15):
The optimal control on the second interval (11) under the assumption that and tend to 0 can be rewritten using the expression for obtained from (14):
The optimal trajectory on the first interval (10) under the assumption that and tend to 0 also gives the result (16):
The optimal trajectory on the second interval under the assumption that and tend to 0 coincides with the corresponding result in (16):
where
We have
As we have shown, the solutions of the initial game and the reduced game coincide. Thus, the solution of the main game formulation described in Section 4 can be considered a generalization of the game described in this section.
5.2. Assumption of a Piece-Wise Constant CDF
Consider another limiting case of the game formulated in the paper. Now, suppose that the continuous part of (1) takes only zero values on the whole interval . This assumption is equivalent to the following condition:
In this case, the CDF corresponds to a discrete random variable T and becomes a step function with probabilities and of the game ending at times and , respectively. Thus, there is a guaranteed terminal payoff on the interval and an expected value for the interval .
The dynamics and the instantaneous payoff function of the game remain the same:
The CDF of the game takes the step form:
This formulation of a differential game with a discrete random time horizon was investigated in []. Consider the cooperative form of the game:
The optimization problem is solved using the parametric method, but with only one parameter :
To obtain the numeric value of the parameter , the following maximization problem must be solved:
The solution of this game is given in the form of the optimal control, the optimal trajectory and the numeric value of . Using the Pontryagin maximum principle, we obtain the following expressions (denoted by the superscript B):
Note that the optimal control and the optimal trajectory are the same over all three intervals :
For the interval , we obtain:
From the expression for optimal trajectory (17), we can obtain the following points:
On the other hand, we can use the solution (10)–(14) of the initial game described in Section 4 and calculate the limit under the condition . Below, we use L'Hôpital's rule to compute the limits.
For the connectivity point under the assumption of , we obtain:
Similarly, for and , we have:
So, we obtain:
The optimal controls on the first interval under the assumption of and the condition coincide for the limiting and the initial cases:
The optimal control on the second interval under the assumption of is rewritten using the expression for :
So again we have:
For the third interval, bearing in mind that depends on and , we obtain:
For the last interval, the expression is the same:
The optimal trajectory on the first interval under the assumption of and the condition has the form:
The optimal trajectory on the second interval under the assumption of and is as follows:
where
Then, we obtain:
We also obtain the optimal trajectory on the third interval under the assumption of and :
Then
On the last interval, we have
As can be seen, the results coincide, just as in the limiting case with no jumps. Thus, the solution of the main game formulation described in Section 4 can also be considered a generalization of the game described in this section.
6. Numeric Example
This section is devoted to a particular numeric example of the stock investment game considered in the paper. Assume the following values of the parameters:
The cumulative distribution function (CDF) is defined according to (9).
Assume also that the set of admissible controls corresponds to the interval
The optimal control of the first player is presented in Figure 2. It is a piecewise function defined on four separate intervals, and its values belong to the predefined set of admissible control values.
Figure 2. The optimal control of the first player on four time subintervals.
The graphs for the other two players differ only in minor details and show a similar picture, with the controls belonging to the admissible set.
The optimal trajectory is presented in Figure 3. The trajectory is a continuous function defined on all four time subintervals. The bold black dots represent the connectivity points calculated in the previous section.
Figure 3. The optimal trajectory on four time subintervals; the bold black dots mark the connectivity points.
The evolution of the trajectory in Figure 3 is consistent with the intuitive understanding that it should be a non-decreasing continuous function. In contrast, the optimal investment function need not be monotone or continuous, but its values must belong to a compact set, as shown in Figure 2.
7. Conclusions
In this work, a special class of differential games with random duration and a discontinuous CDF was studied. A method to construct an optimal solution based on the consideration of separate adjacent time intervals was proposed. Analytical formulas for the optimal control of every player and for the optimal trajectory of the game were obtained. A numeric example was given and illustrated with graphs.
In addition to the general formulation of the optimal solution, we considered a number of special cases and showed that all of them agree with the general formulation and can be derived from it. This supports the validity of our findings and extends the class of problems that can be addressed within this framework.
Future research will address, in particular, the non-cooperative form of the game and the time-consistency problem for the cooperative form of the game. In addition, we are going to investigate the Stackelberg equilibrium [] for the considered game formulation.
Author Contributions
Conceptualization, E.G.; Formal analysis, A.Z.; Investigation, A.Z. and A.T.; Methodology, E.G. and A.T.; Project administration, A.Z.; Supervision, E.G. and A.T.; Validation, A.Z. and A.T.; Writing—original draft, A.Z. and E.G.; Writing—review & editing, E.G. All authors have read and agreed to the published version of the manuscript.
Funding
The work by E. Gromova on the formulation and general solution of the problem was supported by the grant from Russian Science Foundation 17-11-01093, while the development of analytical methods used for obtaining the solution performed by A. Tur was supported by RFBR under the research project 18-00-00727 (18-00-00725).
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
Appendix A.1. Interval I1
Consider the cooperative form of the game. We need to solve the following maximization problem:
Using the Pontryagin maximum principle with both endpoints of the trajectory on the interval fixed: , we construct the Hamiltonian for (A1)
From the first-order condition for (A2), we obtain the expression for the optimal control. The second derivative confirms that this control corresponds to a maximum
The differential equation for the adjoint variable takes the form
Thus, we can derive
Finally, the optimal control on the interval takes the form
Using the expression for (A3), we can rewrite the dynamic equation as
Simple integration gives the optimal trajectory on the interval
Using the condition on the right end of the interval , we can obtain the value of . Thus,
Finally,
Importantly, the optimal control (A5) must be checked for membership in the set of admissible controls on the considered time interval. If it does not belong to this set, the solution must be sought on the boundary of the set.
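The steps of this derivation can be reproduced numerically on a small example. The sketch below is only an illustration under assumptions that are not confirmed by the text: it takes the dynamics as x' = u_1 + ... + u_n and the instantaneous payoffs as q_i x − r_i u_i² (one possible linear-quadratic form), starts at time 0, and uses hypothetical numbers for q_i, r_i and the endpoints. Under these assumptions the maximum principle gives u_i = ψ/(2 r_i) with ψ(t) = c − (q_1 + ... + q_n) t, and the constant c follows from the right-end condition.

```python
import math

# Hypothetical data (assumptions, not the paper's values).
Q = [1.0, 2.0, 3.0]         # q_i: marginal utility of the stock
R = [1.0, 2.0, 4.0]         # r_i: quadratic investment cost coefficients
X0, X1, T1 = 0.0, 5.0, 1.0  # fixed endpoints x(0) = X0, x(T1) = X1

Q_SUM = sum(Q)
R_INV = sum(1.0 / (2.0 * r) for r in R)
# Constant of integration obtained from the right-end condition x(T1) = X1.
C0 = (X1 - X0) / (R_INV * T1) + 0.5 * Q_SUM * T1


def psi(t):
    """Adjoint variable: psi' = -sum(q_i)."""
    return C0 - Q_SUM * t


def control(i, t):
    """Candidate optimal control of player i from dH/du_i = 0."""
    return psi(t) / (2.0 * R[i])


def trajectory(t):
    """State obtained by integrating x' = sum_i control(i, t) from x(0) = X0."""
    return X0 + R_INV * (C0 * t - 0.5 * Q_SUM * t * t)


def payoff(eps=0.0, n=20000):
    """Joint payoff when player 0 adds eps*sin(2*pi*t/T1) to its control.

    The perturbation integrates to zero over [0, T1], so both endpoint
    conditions remain satisfied for every eps.
    """
    w = 2.0 * math.pi / T1
    dt = T1 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x = trajectory(t) + eps * (1.0 - math.cos(w * t)) / w
        cost = 0.0
        for i in range(len(R)):
            u = control(i, t)
            if i == 0:
                u += eps * math.sin(w * t)
            cost += R[i] * u * u
        total += (Q_SUM * x - cost) * dt
    return total


if __name__ == "__main__":
    print("x(T1) =", trajectory(T1))
    print("payoff, candidate control:", payoff(0.0))
    print("payoff, perturbed control:", payoff(0.1))
```

Any endpoint-preserving perturbation of the candidate control lowers the joint payoff in this example, which is the numerical counterpart of the second-derivative check. Whether the candidate control stays inside the admissible set still has to be verified separately, as noted above.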
Appendix A.2. Interval I2
As for the previous interval, consider the cooperative form of the game. The maximization problem to be solved now involves the random component
Using the Pontryagin maximum principle with both endpoints of the trajectory on the interval fixed: , we write the Hamiltonian for (A6)
From the first-order condition for (A7), we obtain the expression for the optimal control. The second derivative confirms that this control corresponds to a maximum
The differential equation for the adjoint variable takes the form
Solving the above equation gives
Finally,
The dynamic equation takes the form
By solving
we obtain
where
We can find from the condition :
Overall,
Similarly, the optimal control (A8) must be checked for membership in the set of admissible controls on the considered time interval. If it does not belong to this set, the solution must be sought on the boundary of the set.
Appendix A.3. Interval I3
Consider again the cooperative form of the game. We solve the following maximization problem
Using the Pontryagin maximum principle with both endpoints of the trajectory on the interval fixed: , we write the Hamiltonian for (A9)
From the first-order condition for (A10), we obtain the expression for the optimal control. The second derivative confirms that this control corresponds to a maximum
The differential equation for the adjoint variable takes the form
Taking the integral
we obtain
The expression for the optimal control takes the form
From the dynamic equation
we obtain
From the condition , we can obtain
Overall,
Similarly, the optimal control (A11) must be checked for membership in the set of admissible controls on the considered time interval. If it does not belong to this set, the solution must be sought on the boundary of the set.
Appendix A.4. Interval I4
Consider again the cooperative form of the game on the last interval. We solve the following maximization problem
Using the Pontryagin maximum principle with only the left endpoint of the trajectory fixed: , we write the Hamiltonian for (A12)
From the first-order condition for (A13), we obtain the expression for the optimal control. The second derivative confirms that this control corresponds to a maximum
The differential equation for the adjoint variable takes the form
Solving this equation gives
The expression for the optimal control takes the form
Substituting the controls into the dynamic equation, we obtain
Thus,
Overall, we get
Similarly, the optimal control (A14) must be checked for membership in the set of admissible controls on the considered time interval. If it does not belong to this set, the solution must be sought on the boundary of the set.
References
- Basar, T.; Olsder, G. Dynamic Noncooperative Game Theory; SIAM: New York, NY, USA, 1999.
- Petrosyan, L.; Danilov, N. Cooperative Differential Games and Their Applications; Izd. Tomskogo University: Tomsk, Russia, 1982.
- Dockner, E.J.; Jørgensen, S.; Long, N.V.; Sorger, G. Differential Games in Economics and Management Science; Cambridge University Press: Cambridge, UK, 2000.
- Yeung, D.; Petrosyan, L. Cooperative Stochastic Differential Games; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2006.
- Yaari, M. Uncertain lifetime, life insurance, and the theory of the consumer. Rev. Econ. Stud. 1965, 32, 137–150.
- Petrosyan, L.; Murzov, N. Game-theoretic problems of mechanics. Litovsk. Math. Sb. 1966, 7, 423–433.
- Petrosyan, L.; Shevkoplyas, E. Cooperative differential games with stochastic time. Vestn. Petersburg Univ. Math. 2000, 33, 18–23.
- Marin-Solano, J.; Shevkoplyas, E. Non-constant discounting and differential games with random time horizon. Automatica 2011, 47, 2626–2638.
- Gromova, E.; Malakhova, A.; Palestini, A. Payoff Distribution in a Multi-Company Extraction Game with Uncertain Duration. Mathematics 2018, 6, 165.
- Gromova, E.; Tur, A. On the form of integral payoff in differential games with random duration. In Proceedings of the 2017 XXVI International Conference on Information, Communication and Automation Technologies (ICAT), Sarajevo, Bosnia-Herzegovina, 26–28 October 2017; pp. 1–6.
- Gromov, D.; Gromova, E. On a Class of Hybrid Differential Games. Dyn. Games Appl. 2017, 7, 266–288.
- Reddy, P.V.; Schumacher, J.M.; Engwerda, J.C. Analysis of optimal control problems for hybrid systems with one state variable. SIAM J. Control. Optim. 2020, 58, 3262–3292.
- Bonneuil, N.; Boucekkine, R. Optimal transition to renewable energy with threshold of irreversible pollution. Eur. J. Oper. Res. 2016, 248, 257–262.
- Elliott, R.J.; Siu, T.K. A stochastic differential game for optimal investment of an insurer with regime switching. Quant. Financ. 2011, 11, 365–380.
- Reddy, P.; Schumacher, J.; Engwerda, J. Optimal management with hybrid dynamics: The shallow lake problem. In Mathematical Control Theory I; Springer: Berlin/Heidelberg, Germany, 2015; pp. 111–136.
- Pontryagin, L.; Boltyanskii, V.; Gamkrelidze, R.; Mishchenko, E. The Mathematical Theory of Optimal Processes; Interscience: New York, NY, USA, 1962.
- Gromov, D.; Gromova, E. Differential games with random duration: A hybrid systems formulation. Contrib. Game Theory Manag. 2014, 7, 104–119.
- Tur, A.V.; Magnitskaya, N.G. Feedback and Open-Loop Nash Equilibria in a Class of Differential Games with Random Duration. Contrib. Game Theory Manag. 2020, 13, 415–426.
- Gromova, E.; Magnitskaya, N. Solution of the differential game with hybrid structure. Contrib. Game Theory Manag. 2019, 12, 159–176.
- Feichtinger, G.; Jørgensen, S. Differential game models in management science. Eur. J. Oper. Res. 1983, 14, 137–155.
- Jørgensen, S.; Zaccour, G. Developments in differential game theory and numerical methods: Economic and management applications. Comput. Manag. Sci. 2007, 4, 159–181.
- Malakhova, A.P.; Gromova, E.V. Strongly Time-Consistent Core in Differential Games with Discrete Distribution of Random Time Horizon. Math. Appl. 2018, 46, 197–209.
- Abdel-Wahab, O.; Bentahar, J.; Otrok, H.; Mourad, A. Resource-Aware Detection and Defense System Against Multi-Type Attacks in the Cloud: Repeated Bayesian Stackelberg Game. IEEE Trans. Dependable Secur. Comput. 2019.
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).