Resource Exploitation in a Stochastic Horizon under Two Parametric Interpretations

This work presents a two-player extraction game where the random terminal times follow (different) heavy-tailed distributions which are not necessarily compactly supported. We also examine the implications of working with logarithmic utility/terminal payoff functions. To this end, we use standard actuarial results and notation, and state a connection between the so-called actuarial equivalence principle and the feedback controllers found by means of the Dynamic Programming technique. Our conclusions include a conjecture on the form of the optimal premia for insuring the extraction tasks, and a comparison of the extraction intensities of each player under different phases of the lifetimes of their respective machineries.


Introduction
In this work we study an extension of the extraction game presented in Reference [1] to the case where the random terminal times follow (different) heavy-tailed distributions which are not necessarily compactly supported. We use the framework of the problem of common non-renewable resource exploitation as posed in Reference [2], from both the game-theoretical (cf. Reference [3]) and the actuarial points of view (see References [4,5]).
The first reported works on the dynamic development of exhaustible resources by the members of an oligopoly are those by Hotelling (see References [6,7]). There, we can find the well-known principle of marginal revenue, as well as the standard Hypothesis on the equality between the growth rate and the market interest rate over time. The survey [8] constitutes an excellent introduction to the topic from an Economic point of view, see also Reference [9] for empirical investigation of common-pool resource users' dynamic and strategic behavior at the micro level using real-world data.
The area owes its main developments to a discussion that took place during the late '70s and the early '80s on the possibility of replacing the exploitation schemes with some cutting-edge technology to be attained in the near future. The relevance of the debate was the search for a path to move from extracting a non-renewable resource to extracting a renewable one. In this line, we can quote the works showing that, as in References [4,27–29], the agents are willing to pay for the coverage of the insurance, both in the one-player context (see References [15,16]) and in the game-theoretic case with asymmetries (see References [1,51–53]), where the imbalance comes from the choice of different fat-tailed distributions for the terminal times of extraction of the agents (see References [50,56,59]).
The rest of the paper is organized as follows. In Section 2 we state the main hypotheses of our model and argue about the connection of the game theoretic framework and the actuarial perspective. Section 3.1 is a reduced version of our study for a degenerate game where only one player performs extraction tasks. This part of the work allows us to exemplify an actuarial interpretation of the verification result given by Theorem 1; and define what we call ease of the extraction of the player in terms of the intensity of the extraction rate. In Section 3.2 we state and explicitly solve the two-player extraction game where the random terminal times are distributed according to Weibull and Chen laws; compare the results in terms of the ease of extraction of each player for particular choices of the parameters; and interpret the verification Theorem 3 in actuarial terms. We give our conclusions in Section 4.

Problem Statement
In this section we present the problem we are interested in, introduce our hypotheses, and explain its connection with some ideas from the Actuarial Sciences.

Game Theoretical Framework
Let us consider the conflict-control process of the extraction of a non-renewable resource in which two participants are involved. We assume that this set of agents remains fixed for the whole duration of the process.
We describe the dynamics of the consumption of the resource by means of the model presented in Reference [12] (Chapter 10.3), according to which
$\dot{x}(t) = -u_1(t) - u_2(t), \quad x(0) = x_0,$ (1)
where $x(t)$ is the amount of the resource available at time $t \ge 0$, $u_i(t)$ is the extraction rate of the i-th agent at time $t$, $x_0$ is the initial amount of the stock, and $i = 1, 2$. Let $\Gamma(x_0)$ be a differential game whose system satisfies the following conditions.

Hypothesis 1.
(a) Both players act simultaneously and start the game at some initial time $t_0 = 0$ from the state $x_0$.
(b) The control variables of the players are their respective extraction rates at every moment in time, namely $u_1, u_2 : [0, \infty) \to U$, where $U$ is a compact subset of $[0, \infty)$.
(c) The dynamics of the system is given by (1).
The system (1) mirrors the fact that the resource is non-renewable because, by Hypothesis 1(b), x(·) is non-increasing.
We assume that the extraction performed by each agent stops at some random moment of time T i , for i = 1, 2. We impose the following hypothesis on the random terminal times T i .

Hypothesis 2.
(a) For $i = 1, 2$, the random terminal time $T_i$ has a cumulative distribution function $F_i$ that is absolutely continuous with respect to Lebesgue's measure.
(b) Given the different characteristics of the firms, the terminal times of extraction of the same resource are mutually independent.
(c) As soon as one of the firms reaches its terminal time, it quits the game and the remaining one keeps extracting the resource until its terminal time is reached (which might happen when the resource becomes extinct).
We define the failure rate function associated with the i-th firm (see References [57,60,61] (Chapter 3; Chapters 4 and 5; Chapter 8.5)) as
$\lambda_i(t) := \frac{f_i(t)}{1 - F_i(t)},$ (2)
where $f_i := F_i'$ is a density function for the random terminal time of the i-th player. The existence of such a density function is ensured by Hypothesis 2(a). By virtue of Hypothesis 2(c), $\Gamma(x_0)$ collapses to a one-player game at a random terminal time that may be formed by means of the rule of correspondence
$T := \min\{T_1, T_2\}.$ (3)
Now, by Hypothesis 2(b), the results in Reference [61] (Chapter 16.3) and the well-known relation
$1 - F_i(t) = \exp\left(-\int_0^t \lambda_i(s)\,ds\right)$ (4)
(see Reference [57] (Chapter 3)), we define the distribution function of the random variable $T$ as
$F(t) := 1 - \exp\left(-\int_0^t \lambda(s)\,ds\right),$ (5)
where $\lambda(t) := \lambda_1(t) + \lambda_2(t)$ stands for the hazard rate function of the random variable $T$ (see, for instance, Reference [57] (Chapter 9.3)).
We now introduce the payoff function and the performance indices for each player in the game $\Gamma(x_0)$. To do this, we assume the following conditions.

Hypothesis 3. For $i = 1, 2$, the utility functions $h_i$ and $\Phi_i$ are continuous in all of their arguments, and concave with respect to $u_i$. Additionally, the function $h_i$ satisfies either: (a) it is a nonnegative function, that is, $h_i(x, u_1, u_2) \ge 0$; or (b) [...]. At time $T$ given by (3), if the i-th player is the only one remaining in the extraction game, she receives a terminal payoff $\Phi_i(x(T))$.
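As a quick numerical illustration of relations (4) and (5), the following minimal Python sketch checks that, for independent terminal times, the survival function of T = min{T1, T2} is the product of the individual survival functions, which is equivalent to the hazard rates adding up. The two hazard functions below are hypothetical and were chosen only for illustration; they are not the ones used later in the paper.

```python
import math


def cumulative_hazard(hazard, t, steps=2000):
    """Midpoint-rule approximation of the integral of `hazard` over [0, t]."""
    width = t / steps
    return width * sum(hazard((k + 0.5) * width) for k in range(steps))


def survival(hazard, t):
    """Survival probability 1 - F(t) = exp(-integral_0^t hazard(s) ds)."""
    return math.exp(-cumulative_hazard(hazard, t))


# Hypothetical hazard rates, used only for this illustration.
def hazard_1(s):
    return 0.5          # constant failure rate


def hazard_2(s):
    return 2.0 * s      # linearly increasing failure rate


def hazard_min(s):
    """Hazard rate of T = min(T1, T2) when T1 and T2 are independent."""
    return hazard_1(s) + hazard_2(s)


t = 1.3
joint = survival(hazard_min, t)
product = survival(hazard_1, t) * survival(hazard_2, t)
print(joint, product)  # the two numbers should agree up to rounding
```

The midpoint rule is used here only to keep the sketch free of external dependencies; any quadrature routine would serve the same purpose.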
With Hypothesis 3, we can introduce the performance index we are interested in. Define $u : [0, \infty) \to U^2$ as the vector of functions $(u_1, u_2)$. For $i = 1, 2$, the performance index for the i-th player is the sum of the terms (6)-(8), where $E^{u_1,u_2}_{x_0}[\cdot]$ is the conditional expectation of $\cdot$ given that the initial stock is $x_0$ and the agents use the strategies $u_1$ and $u_2$; $\chi_{\{\cdot\}}$ is an indicator function; $T$ is as in (3); and $\Phi_i(\cdot)$ is the terminal payoff function referred to in the final part of Hypothesis 3.

Remark 1. Note that the payoff of the game has two components: the integral payoff (6) and (7), achieved while playing; and (8), a final reward, which is assigned to the player that stayed longer in the system.
Now we use the compactness mentioned in Hypothesis 1(b) to introduce the sort of optimality we are interested in.
In this game, each firm intends to maximize its profit. Then, we designate the corresponding strategies of the players as u * 1 , u * 2 , and call them "optimal". We also write the trajectory under such strategies as x * , and also call it "optimal trajectory". Let h * i (t) := h i (x * , u * 1 , u * 2 ), and rewrite the optimal expected payoff resulting from the maximization of (6)-(8) as The following result is an extension of Reference [1] (Corollary 3.1). We include its proof here for the sake of completeness.

Proposition 1. Let Hypotheses 1-3 hold. If
for all θ > 0, then the optimal expected payoff for the problem starting at t = 0 is given by Proof. It is easy to see that The use of (9) and Fubini's rule yield An integration by parts yields Working with Substituting (15) and (16) in (12)- (14) and collecting similar terms yield: Note that the term where the first equality is due to an integration by parts. The substitution of (17) and (18) in (10) and (11) gives the result.

Interconnection with the Actuarial Sciences
In the traditional Actuarial Sciences literature, there are two main principles under which it is possible to compute the amount (of money, time or effort) to be invested/reserved, known as "premium" (premia), in exchange for a benefit (e.g., the coverage of an insurance or the earnings of the extraction tasks). These principles are classified according to the following methodologies.

PI. Optimization of the probability that the trajectory x(·) attains a certain set. This approach and its refinements yield what is called "percentile premium", "value-at-risk" (see References [60,62] (Chapters 3.5.3 and 3.5.4)) or "risk premium" (see Reference [61] (Example 6.5 and Chapter 17)).
PII. Search for a premium such that the decision maker is indifferent between taking the risk or not. This is typically done by setting the expectation of the utility function under uncertainty equal to the utility function without risk. A remarkable simplification arises when the utility function is linear, for in that case the expectation of the prospective loss turns out to be null, and the computation of the corresponding surcharge is straightforward (see Reference [57] (Example 6.1.1)). The result of this method is called "equivalence premium", "indifference price", or "utility/benefit premium".
From the point of view of the actuarial scientist, we are using a third principle, which connects the traditional approaches. We state it as follows.
PIII. Optimization of the conditional expectation (6)-(8) (see References [40] and [63] (Chapter 6)) to find the premium which is to be exchanged for the benefits derived from the extraction tasks.
Such a mixture resembles the approach of the tail-value-at-risk (also dubbed conditional tail expectation and expected shortfall; cf. References [64–66] and [60] (Chapter 3.5.4)) from PI, in the sense that the objective function of the optimization program we work with is a conditional expectation. However, due to the logarithmic form of the reward functions we use, and the particular form of the HJBI equations that this yields, the premia we obtain are very much like the indifference prices one would get, should one have chosen to work with the equivalence principle from PII in the first place (see Remarks 2 and 3 below). In the present context, we follow References [31,33] to present the development of the extraction tasks as a process with uncertain duration, and interpret the maximizers $u_1^*$ and $u_2^*$ as measures of the cost of the value that each of the agents earns from such development, that is, as the premia that they should invest in order to ensure/insure their operation.
We consider games with a random terminal time $T$, to which the basic terminology of reliability theory can be applied directly. In fact, the failure rate function (2) can be thought of as a conditional density provided that the agent did not default (i.e., leave the game) until the moment $t$. In our terminology, we would talk about the density of the terminal time of the game, provided that the game was not terminated before the moment $t$. The failure rate function $\lambda_i(t)$ that describes the life cycle for the player $i$ has the form described by Figure 1. We follow Reference [67] (Chapter 1) to identify three phases of the hazard rate with respect to time.

• The first phase is called the pre-run phase. According to the theory of reliability, the failures in this phase arise due to undetectable latent defects. This phenomenon is not specific to technical systems: in actuarial risk theory, the terms "newborn period", "infant mortality" and "early failures" are used for such a period (see Reference [57] (Chapter 3)). From the point of view of game theory, early failures can be caused by inexperience, that is, by the inconsistency of the players who just entered the game. The failure rate function $\lambda_i(\cdot)$ in this phase is a decreasing function of time.
• The next period of the life cycle of the system is the so-called period of normal system operation. The failure rate function $\lambda_i(\cdot)$ in this period is constant (or approximately constant), and the sudden failures are caused either by imperfections of the system itself or by some external factor. This is called the "adult" period of the process (see Reference [67] (Chapters 1, 3 and 5)). The game in the period under consideration can stop under the influence of some unforeseen circumstances of the external environment.
• In the last period, the system goes into the aging phase. The system failures in this period are associated with how the system ages, which is why the failure rate function $\lambda_i(\cdot)$ is an increasing function.
Moreover, we are interested in presenting a comparative analysis of the results when the duration of the extraction is distributed according to the laws of Weibull and Chen. The former has been widely used for modelling losses, time-until-failure of many non-renewable electronic devices (electronic lamps, semiconductor devices, some microwave devices) and lifetimes in general (see References [57,59–61,67]), while the latter is a fat-tailed distribution (see Reference [56]). The cumulative distribution function of the Weibull distribution is
$F_1(t) = 1 - e^{-\lambda_1 t^{\delta_1}}, \quad t \ge 0,$ (19)
where $\lambda_1 > 0$ is a scale parameter, and $\delta_1 > 0$ is a shape parameter that corresponds to one of the three phases in which the lifetime of the player can be located. Namely, the value $\delta_1 < 1$ corresponds to the pre-run period; here the failure rate function $\lambda_1(t)$ is a decreasing function of $t$. At $\delta_1 = 1$, the system is in the normal operation mode, and $\lambda_1(\cdot)$ equals the constant value $\lambda_1 > 0$. We note that for $\delta_1 = 1$, the Weibull distribution corresponds to an exponential distribution. For $\delta_1 > 1$, the system is in an aging state, therefore $\lambda_1(\cdot)$ is an increasing function. A special case of the Weibull distribution for this instance is the so-called Rayleigh distribution (see Reference [60] (Appendix A.3)), which arises when $\delta_1 = 2$.
By (2), the corresponding failure rate function is
$\lambda_1(t) = \lambda_1 \delta_1 t^{\delta_1 - 1}.$ (20)
The cumulative distribution function of the Chen distribution is
$F_2(t) = 1 - e^{\lambda_2 (1 - e^{t^{\delta_2}})}, \quad t \ge 0,$ (21)
where $\lambda_2 > 0$ is a scale parameter, and $\delta_2 > 0$ is a shape parameter. Now, for this case, it follows from (2) that the failure rate function takes the form
$\lambda_2(t) = \lambda_2 \delta_2 t^{\delta_2 - 1} e^{t^{\delta_2}}.$ (22)
If $\delta_2 < 1$, we will be at the "newborn" phase. Here, the failure rate function $\lambda_2(\cdot)$ is bathtub-shaped. This corresponds to a realistic process of extracting natural resources. When $\delta_2 = 1$, the system is in the normal operation mode and the hazard rate function $\lambda_2(\cdot)$ is increasing. At $\delta_2 > 1$, the system is in the aging state, and $\lambda_2(\cdot)$ is also an increasing function, but from (22) it is straightforward that the growth rate of $\lambda_2(\cdot)$ is noticeably larger than in the case of normal operation. This implies that there is a greater probability of failure at this stage of the extraction.
Graphical representations of the failure rate function of Chen distribution for two values of the scale parameter λ 2 are shown in Figure 2. In Figure 2b, note that, as δ → 1, the slope of the graph grows larger. This fact might be interpreted by arguing that Chen's distribution plausibly describes how the system goes from the pre-run state into the normal operation mode.
Graphical representations of the failure rate functions for the Weibull and Chen distributions for fixed λ 1 = 2 = λ 2 , and the same values of the parameter δ 1 and δ 2 are shown in Figure 3. Note that we display these functions for each of the periods we have identified.
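The qualitative behavior of the two failure rate functions in each phase can be checked with a short Python sketch. It assumes the parametrizations λ1(t) = λ1 δ1 t^(δ1−1) for the Weibull law and λ2(t) = λ2 δ2 t^(δ2−1) exp(t^δ2) for the Chen law, which agree with the special cases quoted in the text for δ = 2. The helper `trend` and the evaluation grid are hypothetical choices made only for illustration.

```python
import math


def weibull_hazard(t, lam, delta):
    """Failure rate of the Weibull law (assumed parametrization)."""
    return lam * delta * t ** (delta - 1.0)


def chen_hazard(t, lam, delta):
    """Failure rate of the Chen law (assumed parametrization)."""
    return lam * delta * t ** (delta - 1.0) * math.exp(t ** delta)


def trend(hazard, lam, delta, grid):
    """Classify the hazard on the grid as constant, increasing, decreasing or non-monotone."""
    values = [hazard(t, lam, delta) for t in grid]
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(abs(d) < 1e-12 for d in diffs):
        return "constant"
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "non-monotone"


grid = [0.1 * k for k in range(1, 41)]  # t = 0.1, 0.2, ..., 4.0
for delta in (0.5, 1.0, 2.0):
    print(delta,
          trend(weibull_hazard, 2.0, delta, grid),
          trend(chen_hazard, 2.0, delta, grid))
```

With the shape parameter 1/2 the Chen hazard first decreases and then increases (its minimum on this grid is near t = 1), which is the bathtub shape mentioned above.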

Resource Exploitation under Two Parametric Interpretations
It is easy to see that the Weibull and Chen distributions (19) and (21) satisfy Hypothesis 2(a). We will consider these probability laws for the random terminal times of the players, and take $T = \min\{T_1, T_2\}$ for different parameters of these families. This will allow us to model failures of the equipment depending on its operation mode.

Dynamic Models for the Extraction of Natural Resources by One Agent
In this section, we follow Reference [32], and consider the situation where only one agent performs extraction tasks, that is, the degenerate game where $n = 1$ (in this case, we do not consider a terminal payoff function, because the game finishes as soon as the only player leaves the system). Let $x(t)$ be the stock of the non-renewable resource under consideration. Then, the dynamics (1) reduces to
$\dot{x}(t) = -u(t), \quad x(0) = x_0,$ (23)
where $u(t) \ge 0$. An application of Proposition 1 yields that the expected payoff of the agent, provided that the terminal random time follows Weibull's law, is
$\int_0^\infty h(x(t), u(t))\, e^{-\lambda_1 t^{\delta_1}}\,dt.$ (24)
Note that adding a terminal cost does not make sense with a single agent, so we set $\Phi(x) \equiv 0$.
If we use Chen's law for the random terminal time, the winnings of the extractor are given by
$\int_0^\infty h(x(t), u(t))\, e^{\lambda_2 (1 - e^{t^{\delta_2}})}\,dt.$ (25)
The problem of obtaining feedback maximizers for the expected gains in Formulae (24) and (25) under the condition (23) can be solved using the Bellman Equation (26) with the transversality condition (27). In (26), $\lambda(\cdot)$ corresponds either to the failure rate of the Weibull or of the Chen distribution. The details on the necessary calculations for achieving (26) and (27) can be seen in, for instance, Reference [68] (Chapter I.5).
The following result assumes a specific form of the agent's utility function. There are, of course, many other forms, which need to be concave, continuous and non-decreasing. However, we have chosen this particular form because of the interest that the result has from the point of view of the Actuarial Scientist (see Remark 2 below).

Theorem 1. If the utility function is of the form
$h(x, u) = \ln u,$ (28)
then the optimal controller is given by the Lebesgue-measurable function Proof. The substitution of the ansatz (see References [34,35]): into Bellman's Equation (26) yields Plugging (32) and (33) into (26) gives us that the maximized control is: and also the following system of differential equations: The transversality condition (27) takes the form Using (37) and (thus) the integrating factor (4) we can solve (35) and get The last equality follows from (4). Here, of course, F(·) is given by (19) or (21). From (30), it is straightforward that The substitution of (39) in (34) gives (29). The fact that the controller (29) is optimal for the degenerate game follows from Theorem I.7.1(a) in Reference [68]. The fact that that such controller is Lebesgue-measurable follows from Hypothesis 1(b), along with the so-named measurable selection theorems (see, for instance, References [69][70][71] (Theorem 12.1; Proposition D5(a); Theorem 3.4)). This proves the result.

Remark 2. An actuarial interpretation of Theorem 1 is that the function $A(t)$ agrees with the expectation of the so-called contingent life annuity with 0% interest rate for a life aged $(t)$ displayed in (30) (see Reference [57] (Chapter 5.2)). We may thus write
$u^*(t, x) \cdot \bar{a}_t = x,$ (40)
which is used to establish the equivalence premium $u^*(t, x)$ that is to be continuously paid to obtain a benefit of $x$.
In the Actuarial Mathematics literature, this expression states the existence of a balance between an expected income $u^*(t, x) \cdot \bar{a}_t$ (that will be paid over a contingent horizon), and an expected benefit $x$ (that will be received at a given moment of time). From this point of view, we might distinguish two parts within the Lebesgue-measurable optimal rate of extraction $u^*(t, x)$ from (29): (i) the benefit that the agent will eventually get, $x$; and (ii) the intensity of the extraction that the agent needs to apply to acquire $u^*(t, x)$, that is, $A(t) = \bar{a}_t$.
Keeping this in mind, we can state that the resulting instantaneous utility of obtaining $x$ by continuously extracting $u^*(t, x)$ with an intensity of $\bar{a}_t$ is given by $h(x, u)$ in (28).
Since the intensity of the extraction appears in the denominator of (29), we will dub $1/\bar{a}_t$ the ease of extraction. Our goal is to emphasize the inverse proportionality of the optimal controller $u^*(t, x)$ with respect to the intensity of extraction $\bar{a}_t$.
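The balance described in Remark 2 can be explored numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes the zero-interest annuity to be ā_t = ∫_t^∞ (1 − F(s)) ds / (1 − F(t)) and the optimal rate to be u*(t, x) = x / ā_t, and it uses the Weibull and Chen survival functions with scale parameter 1 and shape parameter 1. The function names, the truncation `horizon` and the number of `steps` are hypothetical.

```python
import math


def weibull_survival(t, lam=1.0, delta=1.0):
    """1 - F(t) for the Weibull law (assumed parametrization)."""
    return math.exp(-lam * t ** delta)


def chen_survival(t, lam=1.0, delta=1.0):
    """1 - F(t) for the Chen law (assumed parametrization)."""
    z = t ** delta
    if z > 700.0:  # exp(z) would overflow; the survival is numerically zero
        return 0.0
    return math.exp(lam * (1.0 - math.exp(z)))


def annuity(survival, t, horizon=60.0, steps=60000):
    """Midpoint approximation of the zero-interest annuity for a life aged t."""
    width = horizon / steps
    total = sum(survival(t + (k + 0.5) * width) for k in range(steps))
    return width * total / survival(t)


def optimal_rate(survival, t, x):
    """Assumed feedback rule u*(t, x) = x / annuity(t)."""
    return x / annuity(survival, t)


for t in (0.0, 0.5, 1.0):
    print(t, annuity(weibull_survival, t), annuity(chen_survival, t))
```

Under these assumptions the Weibull annuity with shape parameter 1 stays equal to 1 for every t, while the Chen annuity decreases with t, so the ease of extraction 1/ā_t grows over time for the Chen extractor.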
We obtain the following result as a by-product of Theorem 1.

Theorem 2.
If the utility function is of the form (28), then the value function for the optimal control problem of maximizing (6) within U subject to (1) is given by Proof. From (31), we readily know that To find the function B(t), we note that the transversality condition (38) and again, the integrating factor (4) give that The last equality follows from (4) and (39). Substituting (42) into (31) gives (41). The fact that this function is actually the value of the optimal control problem of maximizing (6) subject to (1) follows from the verification Theorem I.7.1(b) in Reference [68]. This proves the result.
As we already stated, for the case of the Weibull distribution, when $\delta_1 = 1$, the random terminal time is exponentially distributed with mean $\lambda_1^{-1}$. Then, by Theorem 1, the optimal strategy of the agent is $u^*(t, x) = \lambda_1 x$. We solve (23) and get that the optimal trajectory is $x^*(t) = x_0 e^{-\lambda_1 t}$. For the case of Chen's law, we let $\delta_2 = 1$ and substitute (22) into (29) to get the corresponding controller. We then substitute this controller into (23) and solve the differential equation to obtain the optimal trajectory when the random terminal time is distributed according to Chen's law and $\delta_2 = 1$.
Figures 4-6 summarize these results when $\lambda_1 = 1 = \lambda_2$. Since $\lambda_1$ and $\lambda_2$ are scale parameters of the distributions under study, this choice gives a sufficiently general idea of our developments. The reason is that, if we select other values for these parameters, we will end up having constant multiples of the random variables $X_1$ and $X_2$ analyzed in this study (see Reference [60] (Section 4.2.1)).
We might give a plausible interpretation of Figures 5 and 6 in the direction of Remark 2. The optimal extraction rates are linear in the state variable and, as we established in Remark 2(ii), the larger the value of $A(t) = \bar{a}_t$ is, the easier it is for the extractor to obtain $u^*(t, x)$. In this sense, it is straightforward that, for Chen's law, as time goes by, it becomes harder to extract at a rate of $u^*(t, x)$. On the other hand, it is very easy to see that, under Weibull's law, $A(t) \equiv 1$ (see References [12,57,61] (Example 5.2.1; p. 323; Chapter 8.10.1)). This implies that, in the normal mode, the agent whose random terminal time follows the Weibull distribution is indifferent to the moment of time when the extraction task takes place, and should only look at the remaining amount of the resource.
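The comparison of the optimal trajectories in the normal mode can be sketched in Python. The sketch rests on two assumptions that are not spelled out in the text: the feedback rule u*(t, x) = x / ā_t with ā_t = ∫_t^∞ S(s) ds / S(t), where S = 1 − F, and the resulting solution of the stock dynamics, x*(t) = x0 ∫_t^∞ S(s) ds / ∫_0^∞ S(s) ds. All function names, the initial stock and the numerical settings are hypothetical.

```python
import math


def exponential_survival(t, lam=1.0):
    """Weibull survival with shape parameter 1 (exponential law)."""
    return math.exp(-lam * t)


def chen_survival(t, lam=1.0):
    """Chen survival with shape parameter 1 (assumed parametrization)."""
    if t > 700.0:  # exp(t) would overflow; the survival is numerically zero
        return 0.0
    return math.exp(lam * (1.0 - math.exp(t)))


def tail_integral(survival, t, horizon=60.0, steps=60000):
    """Midpoint approximation of the integral of the survival function over [t, t + horizon]."""
    width = horizon / steps
    return width * sum(survival(t + (k + 0.5) * width) for k in range(steps))


def optimal_stock(survival, t, x0):
    """Assumed closed form of the stock under the rule u*(t, x) = x / annuity(t)."""
    return x0 * tail_integral(survival, t) / tail_integral(survival, 0.0)


x0 = 5.0  # hypothetical initial stock
for t in (0.0, 0.5, 1.0, 2.0):
    print(t,
          optimal_stock(exponential_survival, t, x0),
          optimal_stock(chen_survival, t, x0))
```

For the exponential law the sketch reproduces the trajectory x0·exp(−λ1 t), and the Chen trajectory lies below it for t > 0, in line with the faster exhaustion discussed for Figure 4.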
As for the optimal trajectories in Figure 4, observe that the system exploited by an agent affected by Chen's law becomes exhausted a lot faster than a system exploited by a Weibull extractor. This is consistent with the fact that, according to Figure 5, a Chen extractor would be more intense in his exploitation tasks.

Aging Mode
We stated before that for δ 1 = 2 = δ 2 , Weibull's law coincides with Rayleigh distribution. In this case, (20) gives that λ 1 (t) = 2λ 1 t; then from Theorem 1 we get For Chen's distribution, when we have an aging system, by (22), the failure rate function is λ 2 (t) = 2λ 2 te t 2 , and the Theorem 1 gives that the optimal control is We solve (23) and get that the optimal trajectories under each law are: Figures 7-9 summarize these results when λ 1 = 1 = λ 2 . There, we can see that, in the aging mode, as time passes by, the extraction tasks need to be more intense for both assumptions.  However, for the Chen extractor, the situation deteriorates very rapidly, and hence, it needs to speed up the pace of its tasks. It readily follows from (29) that both controllers are linear in the state. Again, Figure 7 shows how the stock becomes exhausted for each case.
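For the Rayleigh case the intensity can be written with the complementary error function, which gives a convenient cross-check. The sketch below assumes, as in Remark 2, that the intensity is the zero-interest annuity ā_t = exp(λ1 t²) ∫_t^∞ exp(−λ1 s²) ds and that u*(t, x) = x / ā_t; it covers only the Weibull (Rayleigh) extractor. The function names and numerical settings are hypothetical.

```python
import math


def rayleigh_annuity(t, lam=1.0):
    """Closed form: exp(lam t^2) * sqrt(pi) / (2 sqrt(lam)) * erfc(sqrt(lam) t)."""
    root = math.sqrt(lam)
    return math.exp(lam * t * t) * math.sqrt(math.pi) / (2.0 * root) * math.erfc(root * t)


def rayleigh_annuity_quadrature(t, lam=1.0, horizon=12.0, steps=24000):
    """Midpoint approximation of the same quantity, used as an independent check."""
    width = horizon / steps
    total = 0.0
    for k in range(steps):
        s = t + (k + 0.5) * width
        total += math.exp(-lam * (s * s - t * t))
    return width * total


def rayleigh_rate(t, x, lam=1.0):
    """Assumed feedback rule u*(t, x) = x / annuity(t)."""
    return x / rayleigh_annuity(t, lam)


for t in (0.0, 0.5, 1.0, 2.0):
    print(t, rayleigh_annuity(t), rayleigh_annuity_quadrature(t), rayleigh_rate(t, 1.0))
```

The annuity decreases in t, so the extraction rate per unit of stock increases, which matches the statement that the tasks need to become more intense in the aging mode.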
Early Period ($\delta_1 = 1/2 = \delta_2$)
For the Weibull case, (20) yields that the hazard rate function is $\lambda_1(t) = \lambda_1 / (2\sqrt{t})$. For Chen's law, the pre-run system has a hazard rate function of the form $\lambda_2(t) = \lambda_2 e^{\sqrt{t}} / (2\sqrt{t})$. The corresponding optimal controls follow from Theorem 1, and solving (23) gives the optimal trajectories under each law. Figures 10-12 summarize these results when $\lambda = 1$. It is of particular interest that, according to Figure 12, for a Chen extractor, we almost have the same situation displayed in Figure 6 with the Weibull extractor. However, in this case, the intensity function $A(t) = \bar{a}_t$ converges to a smaller value than the one displayed in that part of the illustration. This means that, during the early mode of the system, an agent whose random terminal time follows the Chen distribution should only mind the remaining stock of the resource.
For a Weibull extractor at the pre-run phase, as time goes by, the exploitation becomes less intense and, in spite of the fact that the reserve will be consumed at an exponential rate, it will last longer than under the Chen assumption (see Figure 10). An economic interpretation of this fact is that the optimal rules of behavior for the agents demand the Chen extractor to be more intense in its labors even from the early period, while they allow the Weibull extractor to be less intense (see Figure 11).

Game Theoretic Model for the Extraction of Natural Resources
Now we consider the non degenerate game model from Section 2. For that purpose, we will make an extensive use of the results presented in References [1,50] and [57] (Chapter 9.2). Note that the utility function of the agent will explicitly depend only on its own control, and there are no payoff transfers among the players, that is, on the extraction rate applied by the agent, and on the stock of the resource at time t ≥ 0. Theorem 3.1 in Reference [1] allows us to state the HJBI equations associated with the optimization problem for the i-th player. Namely, for i = 1, 2. We will find explicit solutions for (43) when the functions h i are analogous to (28) for i = 1, 2. That is, when We also suppose that for some positive constant values c i and i = 1, 2. In this case, the HJBI Equation (43) turns out to be In what follows, we find the optimal strategies and the value function for this problem by proceeding in the same way that led us to (29) and (41). The next result is similar to Reference [1] (Proposition 4.1).

Theorem 3.
If the utility functions are of the form (28), and the terminal payoff functions are given by (46), then the optimal strategies for the game Γ(x 0 ) are Lebesgue-measurable, and are given by: Here, F(t) is as in (5).

Proof. The substitution of the informed guesses
for i = 1, 2; into (43) and (44) yields: • that the maximizers of (43) are of the form The fact that these controllers are Lebesgue-measurable follows from Hypothesis 1(b), along with the so-named measurable selection theorems (see, for instance, References [69][70][71] (Theorem 12.1; Proposition D5(a); Theorem 3.4)). • the following Cauchy problem (which is analogous to (35)-(38)): We apply the technique of the integrating factor in (51); use the transversality condition (53), and get The last equality holds by virtue of (5).
We now combine (48) and (49); the substitution of the resulting expression in (50) yields (47). By Theorem I.7.1(a) in Reference [68] (see also Reference [52] (Theorem 2(i))), we know that (47) is optimal for the game with no explicit payoff transfer among the players. This proves the result.

Remark 3. Just as we interpreted (39) in Remark 2 as a continuous life annuity, we can do the same with (55).
• In the Actuarial Mathematics literature, the continuous annuity of the joint-life status of the times-until-failure $T_1$ and $T_2$ is denoted by the symbol $\bar{a}_{[t]_1:[t]_2}$, and represents a natural extension of (30) to the case where the payments stop when either player leaves the system (the use of the brackets around $t$ emphasizes the fact that each player has lived up to the moment $t > 0$, and states that the age-at-selection of each player is $t$; see Reference [57] (Chapter 3.8)). Thus we define such an annuity as in (48) (with null interest rate).
• We also denote the mathematical expectation of the present value of one monetary unit payable (to the i-th player) at the moment of failure of the (−i)-th player (with null interest rate) by the symbol $\bar{A}_{[t]_i:\overset{1}{[t]_{-i}}}$ in (49) (the superscript 1 means that the (−i)-th player is the first who fails; see Reference [57] (Chapter 9.7)). The reason is that the corresponding integrand can be thought of as the probability that both agents survive up to moment $t$, and the (−i)-th agent fails at moment $t + s$ (see References [46,57] (Chapter 9.9)).
Keeping this in mind we can propose an extension of (40) and write (47) as: From an actuarial perspective, this expression means that, for the i-th extractor, there is a balance between the eventual benefit x, and the continuous rate of extraction u * i (t, x), which includes a final payment (of size c i u * i (t, x)) that covers the possibility that the other player fails and leaves the game. This means that the optimal rate of extraction of the i-th player in (47) can be viewed as the composition of two parts: (i) the benefit that the agent will eventually get, x; and (ii) the intensity of the effort that the agent needs to apply to attain such benefit, that is, (55). Observe that this function explicitly takes into account the fact that the i-th agent will receive a payment if the other one leaves the system before he/she does.
The resulting utility of this exercise for the i-th player is given by (45).
As a by-product of Theorem 3, we can state and prove the following result.
Theorem 4. If the utility functions are of the form (28), and the terminal payoff functions are given by (46), then the value functions for game Γ(x 0 ) are given by Proof. From Theorem 3, we readily know that for i = 1, 2. To find the functions B i (t), we apply the technique of the integrating factor to (52), and use the transversality condition (54). This gives: where A i (·) is as in (55). The optimality of the functions W 1 and W 2 follows from Reference [52] (Theorem 2(ii)). This completes the proof.
The optimal trajectory can be found by plugging (47) into (1) and solving. That is, The relation (57) can also be compactly written as thus showing the interaction between the players and their effect on the system.

An Illustration
We devote this section to the analysis of the particular cases of our interest. If the random terminal time of player 1 has a Weibull distribution, and that of player 2 follows Chen's law, the optimal extraction rates are given by (47) with the corresponding failure rates (20) and (22). If we fix the initial stock at $x_0 = 5$, Figures 13-18 give a graphical depiction of the intensities involved in the process. From these images, we can notice that almost all the time the extraction rates of the player whose random terminal time follows the Chen distribution are higher than those of the other player. This fact is also consistent with the plots of Figures 4-6 and 10-12, which were exhibited in the illustration of Section 3. To have a proper interpretation of the results presented in Figure 13, recall Figures 10-12. Observe that the shapes of the plots in Figure 13a,b resemble those in Figure 12. It should be noted that the scales are much larger in the degenerate case. The reason is that, in this scenario, there are two actors making decisions and consuming the resource. Moreover, each of the agents needs to take into consideration the action of its counterpart (recall Theorem 3). As in the one-player case, it should be noticed that, in spite of the fact that the plots in Figure 13a,b have opposite trends, the optimal behavior of the Chen extractor is to obtain the resource at a much more rapid pace than that of the Weibull extractor.
In view of Figure 14, we must recall Figures 4-6 and 10-12. Again, for the intensity function of the Chen extractor, $A_2(\cdot)$, we see the same general trend as in Figure 6. However, for the Weibull case, the trend observed in Figure 12 is perturbed by the presence of the other agent in a more advanced mode of the extraction.
For the case $\delta_1 = 1$, we present the following result.
Proposition 2 shows that, regardless of the distribution of $T_2$ (or of its shape parameter in the case of Chen's law), when $T_1$ follows the exponential distribution and $c_1 \lambda_1 = 1$, the optimal rate of extraction is invariant.
To interpret Figure 16, we recall  This case corresponds to the situation where the Weibull extractor is already in the aging mode of the process, and the Chen extractor is only starting its tasks. Here, the intensity of the Weibull extractor has to be higher as time passes, and the other player can take advantage of this because the intensity of its own rate of extraction is decreasing. However, it is only at the beginning of the process that it is optimal for the Chen extractor to have a stronger rate than that of the Weibull extractor (this is the only case in which this situation is observed). This is consistent with the behaviors shown by the intensity functions for the Weibull extractor in Figure 9, and for the Chen extractor in Figure 12.
Figure 17. Ease of the efforts when the shape parameters of the Weibull and Chen distributions are $\delta_1 = 2$ and $\delta_2 = 1$, respectively. (a) Ease of the effort required by the Weibull extractor when $\delta_1 = 2$. (b) Ease of the effort required by the Chen extractor when $\delta_2 = 1$.
Figure 17a shows that the effort required by the Weibull extractor is a convex function that tends to stabilize in the long run. This feature is consistent with Figure 9. However, in the controlled case, we cannot observe an increasing trend. The behavior of the ease of the Chen extractor mirrors that of the corresponding ease function in Figure 6. Figure 18 shows a more pronounced version of the effect shown in Figure 17. The reason is that both agents are in the aging mode of their respective processes.
Figure 18. Ease of the efforts when the shape parameters of the Weibull and Chen distributions are $\delta_1 = 2$ and $\delta_2 = 2$, respectively. (a) Ease of the effort required by the Weibull extractor when $\delta_1 = 2$. (b) Ease of the effort required by the Chen extractor when $\delta_2 = 2$.
If we fix the initial stock at $x_0 = 5$, and use Theorem 4, we can calculate the following table, where we have assumed, as in Section 3, that the scale parameters $\lambda_1$ and $\lambda_2$ equal unity. Each entry represents the pair $(W_1(5, 0), W_2(5, 0))$. It should be noted that the player whose random terminal time is distributed according to Chen's law ends up earning less than the other player in all cases. This is consistent with the comparisons we made in Section 2 and, in particular, with Figure 3c.

Weibull/Chen
The optimal trajectory is given by the following exponential function:

Conclusions
In this paper we used standard Dynamic Programming techniques and classic Actuarial Mathematics tools to analyze a differential game with a linear system and a logarithmic reward function under the total payoff criterion with a random horizon. In Section 2.1 we presented the game model of interest, and in Section 2.2 we presented a third actuarial principle to calculate premia under the total payoff criterion, which we used to introduce the game with random terminal times. The distributions that we studied for these random variables are of great importance for risk analysts, and we devoted Section 3 to describing them and to presenting our main results and analyses. The first distribution that we considered is the classic two-parameter Weibull law, and the second is the heavy-tailed law of Chen. We compared the results of a resource extraction differential game model under each of these random variables, and we identified the phases of the extraction with different values of the shape parameter of the distributions. In Section 3.1 we solved the degenerate case of the game (i.e., with only one player) and, in Section 3.2, we used some actuarial tools to study a two-player version of the game with two independent random terminal times for the extraction tasks.
To illustrate our developments, we stated a degenerate game, and a two-person game with independent random terminal times for each player. We emphasize the role of the logarithmic utility of wealth function that we used, for it allowed us to find a connection between principles PI and PII from Section 2.2. This connection is made explicit by expressions (40) and (56) in Remarks 2 and 3, respectively. We believe that, regardless of the very particular form of the utility functions involved in our study, these expressions mean that the bond between the extraction game at hand (with random terminal times whose distributions are known) and the actuarial equivalence principle should be studied further, for it implies that, should there be an interest in insuring the operation of any of the agents, the optimal premium to be continuously charged to each player is given by $u^*(t, x)$. We regard such an extension as a plausible future line of work.
Additionally, we found short and explicit expressions, as well as numerical and graphical representations, in terms of actuarial nomenclature for the optimal rates of extraction of the agents in each case analyzed. We used this notation to characterize the optimal extraction rates in two parts, namely: (i) the benefit that the agent will eventually get, $x$; and (ii) the intensity of the effort that the agent needs to invest to acquire $u^*(t, x)$, that is, $A(t) = \bar{a}_t$ for the degenerate game in Section 3.1, and for $i = 1, 2$, for the game in Section 3.2. As this amount becomes larger, the extractor needs to become more intense in its extraction.
The resulting instantaneous utility of this exercise is given by (28) for the degenerate game, and (45) for the two-person differential game.
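The actuarial symbol $\bar{a}_t$ can be evaluated numerically once a survival function is fixed. The following Python sketch is only illustrative and rests on several assumptions that are not taken from the text: it uses the textbook continuous life annuity $\bar{a}_t = \int_0^\infty e^{-\rho s}\, S(t+s)/S(t)\, ds$, a hypothetical discount rate `rho`, the survival functions $S(t) = \exp(-(\lambda t)^{\delta})$ for the Weibull law and $S(t) = \exp(\lambda(1 - e^{t^{\delta}}))$ for Chen's law, and a truncated integration horizon. The function names are hypothetical, and the definition of $\bar{a}_t$ used in Section 3 may differ.

```python
import math

def weibull_log_survival(shape, scale=1.0):
    """Return t -> log S(t) for the assumed form S(t) = exp(-(scale*t)**shape)."""
    def log_s(t):
        return -((scale * t) ** shape)
    return log_s

def chen_log_survival(shape, scale=1.0):
    """Return t -> log S(t) for the assumed form S(t) = exp(scale*(1 - exp(t**shape)))."""
    def log_s(t):
        power = t ** shape
        if power > 700.0:
            # math.exp would overflow; the survival probability is numerically zero.
            return -math.inf
        return scale * (1.0 - math.exp(power))
    return log_s

def annuity(log_s, t, rho=0.05, horizon=200.0, steps=20000):
    """Composite Simpson approximation of
    int_0^horizon exp(-rho*s) * S(t+s)/S(t) ds  (steps must be even).
    rho and horizon are illustrative assumptions."""
    h = horizon / steps
    base = log_s(t)

    def integrand(s):
        return math.exp(-rho * s + log_s(t + s) - base)

    total = integrand(0.0) + integrand(horizon)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        print(t,
              annuity(weibull_log_survival(1.0), t),
              annuity(chen_log_survival(2.0), t))
```

Under these assumptions, the exponential case (Weibull with shape parameter 1) gives a value that does not depend on $t$, namely $1/(\rho + \lambda)$, whereas for a distribution with increasing failure rate the value decreases with $t$.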
Moreover, with such representations of the premia available, it is possible to consider other kinds of analyses; for instance, building confidence intervals and adding a loading factor to the premium to compute probabilities of ruin of the (hypothetical) insurance company, and thus extending our results to the realm of actuarial risk theory. One such extension, which we look forward to working on, would be to consider the possibility of having an agent whose entry into the system happens at a random moment after the start of the extraction tasks (as is the case in Reference [52]), where (any of) the agents require(s) insurance to continue to develop the resource.
Another plausible extension is suggested by the fact that, in References [1,52], the authors found expressions for the optimal rates of extraction that can also be put in terms of simple contingent functions. On the one hand, this is consistent with our findings in this paper; on the other, it can be used to model the presence of both legal and illegal extracting agents. This means that, in spite of the particular form of the utility functions we used for the agents, the optimal premia are given by expressions that follow the actuarial equivalence principle. With this in mind, we could try to analyze the optimality of equivalence premia for a broader class of reward functions, under the total payoff criterion, by using Dynamic Programming.
The speed of the extraction tasks differs for different moments of the end of the game. As was to be expected, Chen's distribution allows the most realistic description of the life cycle of the system. Our illustrations confirm that, in normal operation, it is necessary to "dig" at a rate which needs to be gradually increased with time, but when the end of the process is subject to Chen's law, the pace of resource extraction needs to intensify even faster. In the equipment aging mode, when the failure rate function increases, it is necessary to increase the rate of development of the fields regardless of which distribution law determines the completion of the development process.
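The operating modes mentioned above can be related to the shape of the failure rate function of each law. The following Python sketch is illustrative only. It assumes the standard parameterizations $h(t) = \lambda\delta(\lambda t)^{\delta-1}$ for the Weibull failure rate and $h(t) = \lambda\delta t^{\delta-1} e^{t^{\delta}}$ for Chen's failure rate, which may differ from those adopted in Section 3; the function names and the evaluation grid are hypothetical.

```python
import math

def weibull_hazard(t, shape, scale=1.0):
    """Failure rate for the assumed survival S(t) = exp(-(scale*t)**shape)."""
    return scale * shape * (scale * t) ** (shape - 1.0)

def chen_hazard(t, shape, scale=1.0):
    """Failure rate for the assumed survival S(t) = exp(scale*(1 - exp(t**shape)))."""
    return scale * shape * t ** (shape - 1.0) * math.exp(t ** shape)

def trend(hazard, shape, grid, tol=1e-12):
    """Classify the failure rate on the grid as increasing, decreasing,
    constant, or non-monotone."""
    values = [hazard(t, shape) for t in grid]
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(d > tol for d in diffs):
        return "increasing"
    if all(d < -tol for d in diffs):
        return "decreasing"
    if all(abs(d) <= tol for d in diffs):
        return "constant"
    return "non-monotone"

if __name__ == "__main__":
    grid = [0.1 * k for k in range(1, 31)]  # t in (0, 3], illustrative choice
    for shape in (0.5, 1.0, 2.0):
        print(shape,
              trend(weibull_hazard, shape, grid),
              trend(chen_hazard, shape, grid))
```

With unit scale parameters, the Weibull failure rate is decreasing, constant, or increasing according to whether the shape parameter is below, equal to, or above one. Under the assumed form, Chen's failure rate first decreases and then increases when the shape parameter is 1/2, and it is increasing for shape parameters 1 and 2.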
Among our findings, we also note that the optimal behaviors of the agents differ across game scenarios and moments of completion. For example, during the pre-run phase (i.e., when the equipment is not yet established and the overall picture of the process is not fully clear), the development speed should be the smallest, which corresponds to the agent's caution. This is reflected in the entries of the row $\delta_1 = 1/2$ and the column $\delta_2 = 1/2$ of the table of Section 3.2: for both players, the entries in the other rows and columns are greater than these. This feature is consistent with the conclusions drawn in Reference [4], in the sense that both works show that the price of insuring the extraction tasks is decreasing in the level of expertise of the extractor (which we might interpret as the stage at which each agent is located). This conclusion is also coherent with the developments in References [27][28][29] on the willingness of the agents to pay for the coverage of an insurance strategy that guarantees the continued supply of the revenues from the extraction tasks.
To conclude, we mention that the results of this paper can be applied to a wide range of domains related to critical technologies, including the Internet of Things, energy management, and security, to mention just a few. An extension of our results to these applications will be addressed in our subsequent work.