Company Value with Ruin Constraint in a Discrete Model

Optimal dividend payment under a ruin constraint is a two-objective control problem which, in simple models, can be solved numerically by three essentially different methods. The first is based on a modified Bellman equation and the policy improvement method (see Hipp (2003)). In this paper we use explicit formulas for running allowed ruin probabilities which avoid a complete search and thereby speed up and simplify the computation. The second is also a policy improvement method, but without the use of a dynamic equation (see Hipp (2016)). It is based on closed formulas for first entry probabilities and discount factors for the time until first entry. The third is a new, faster and more intuitive method which uses appropriately chosen barrier levels and a closed formula for the corresponding dividend value. Using the running allowed ruin probabilities, a simple test for admissibility with respect to the ruin constraint is given. All these methods work for the discrete De Finetti model and are applied in a numerical example. The non-stationary Lagrange multiplier method suggested in Hipp (2016), Section 2.2.2, also yields optimal dividend strategies, but these differ from those in all other methods, and Lagrange gaps are present here.


Introduction
Let $S(t)$, $t = 0, 1, \ldots$, be the time-$t$ surplus of a company and $D(t)$, $t = 0, 1, \ldots$, the adapted non-decreasing sequence of accumulated dividends. For a fixed discount factor $0 < r < 1$ the dividend value under $D(t) = d(1) + \cdots + d(t)$ is given by $V^D(s) = E\left[\sum_{t \ge 1} r^t d(t) \mid S(0) = s\right]$, where $s \ge 0$ is the initial surplus. The with-dividend ruin time of the company is $\tau^D = \inf\{t \ge 0 : S(t) - D(t) < 0\}$, and $\psi^D(s)$ is the corresponding with-dividend ruin probability $\psi^D(s) = P\{\tau^D < \infty \mid S(0) = s\}$ (1). We assume in the following that dividends are never paid at or after ruin. The object to be investigated is $V(s, \alpha) = \sup\{V^D(s) : \psi^D(s) \le \alpha\}$ for a given value $\alpha$. The quantity $V(s, 1)$ is sometimes called value of the company. A lot of research has been done on this quantity, starting with the seminal work of De Finetti (1957) and Gerber (1969), Choulli et al. (2003) and Albrecher and Thonhauser (2008) as well as Schmidli (2007), Section 2.4, and Loeffen (2008), Avanzi (2009) and Feng et al. (2015) for related work. The concept leading to $V(s, \alpha)$ is a possible answer to the problem posed in Borch (1963) who wrote: "If the general manager of our insurance company wants to run the company strictly as a business enterprise, he will probably always seek out the decisions which maximize $V(s, 1)$. If, however, he is concerned with the social responsibility of the company, and the security which it offers to policy holders, he may also consider $\psi_0(s)$ [the ruin probability without dividend payment] when making his decisions. He will probably try to balance the two elements, but it is not easy to specify how this should be done."
One approach for the computation of the value function V(s, α) is based on a modified Hamilton-Jacobi-Bellman equation for the corresponding stationary Markov process with a bivariate state variable (see Hipp (2003)). This approach needs a fine discretization of the values for the ruin probability, and a large number of iteration steps. In the ruin probability grid, a complete search was necessary in the old version of the policy improvement method. A second approach is the iteration method presented in Hipp (2016). Here, we have shorter but still long computation times. Again we have a complete search, but in the much smaller set of possible surplus values.
The purpose of this paper is to study the form of optimal dividend strategies and to use running allowed ruin probabilities to speed up the computation of the first method. This makes a large number of iterations feasible for the first method even for fine discretizations. Compared with the second approach, the iteration method, we obtain slightly higher company values, owing to the larger number of iterations. Finally, we show that optimal dividend strategies are of barrier type, and we present analytic formulas for the dividend value of these barrier type strategies. In a numerical example we show how appropriate barrier levels can be found.
The quantity company value under a ruin constraint should later serve as an objective function for finding optimal reinsurance or investment strategies. For this we need simple algorithms for the computation of V(s, α) with a possible chance to use them also in the corresponding control problem. We restrict ourselves to the following very simple space and time discrete model in which such algorithms can be more easily found.
We consider a simple random walk $S(t)$, $t = 0, 1, \ldots$, on the integers starting at $s$ and going up or down by 1 with probability $p$ or $q = 1 - p$, respectively. This is the classical De Finetti model which is skip-free (upwards and downwards). In the insurance framework, $t$ labels periods in which premia of size 1 come in and claims of size 2 go out. In this discrete model, each dividend payment can be assumed to be integral (see Schmidli (2007), Lemma 1.9). In Hipp (2016) it is shown that for and for fixed $s \ge 0$ the function $\alpha \mapsto V(s, \alpha)$ is continuous (notice that the continuity statement in Hipp (2003), Lemma 2e, is not correct, and its proof has a gap; a correct proof can be found in Hipp (2016), Lemma 2). This shows that a purely discrete model can lead to a situation with a continuous parameter $\alpha$. To avoid technical problems we will assume in the following that (3) holds. The function $\alpha \mapsto V(s, \alpha)$ is strictly increasing on $\psi_0(s) \le \alpha \le 1$, and $V(s, \alpha) = 0$ for $\alpha \le \psi_0(s)$. In the De Finetti model the survival probability $1 - \psi_0(s)$ satisfies the following difference equation for functions $f(s)$, $s \ge -1$, with $f(-1) = 0$: $f(s) = p f(s + 1) + q f(s - 1)$, $s \ge 0$ (4). This equation is homogeneous, and the set of solutions is one-dimensional. If $0 \le s < B$ then the probability $p(s, B)$ that $S(t)$ reaches $B$ from $s$ before ruin satisfies (4), and $p(B, B) = 1$ leads to $p(s, B) = (1 - \psi_0(s))/(1 - \psi_0(B))$.
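The quantities in this paragraph can be sketched numerically. The following is a minimal illustration, not code from the paper: it assumes $1/2 < p < 1$ and uses the standard gambler's-ruin formula $\psi_0(s) = (q/p)^{s+1}$ for ruin at $-1$; the function names `psi0` and `reach_prob` are hypothetical.

```python
def psi0(s: int, p: float) -> float:
    """Ruin probability without dividends for initial surplus s (ruin = reaching -1)."""
    if not 0.5 < p < 1.0:
        raise ValueError("this sketch assumes 1/2 < p < 1")
    if s < 0:
        return 1.0
    q = 1.0 - p
    return (q / p) ** (s + 1)


def reach_prob(s: int, B: int, p: float) -> float:
    """p(s, B): probability of reaching B from s before ruin, for 0 <= s <= B."""
    return (1.0 - psi0(s, p)) / (1.0 - psi0(B, p))


p = 0.7
# survival[i] holds f(i - 1) = 1 - psi0(i - 1), so survival[0] is f(-1) = 0.
survival = [1.0 - psi0(s, p) for s in range(-1, 12)]
# Largest residual of f(s) = p f(s+1) + q f(s-1) over s = 0, ..., 9.
residual = max(
    abs(survival[i] - p * survival[i + 1] - (1.0 - p) * survival[i - 1])
    for i in range(1, 11)
)
print(residual, reach_prob(4, 6, p))
```

The residual should vanish up to rounding, which is a quick check that the survival probability solves the difference equation with $f(-1) = 0$.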
The company value $V(s, 1)$ satisfies a similar difference equation for functions $f(s)$, $s \ge -1$, satisfying $f(-1) = 0$: $f(s) = r\,(p f(s + 1) + q f(s - 1))$, $s \ge 0$ (5), which holds in the range without dividend payment: let $W(s)$ be the unique solution of (5) with where $M$ is the barrier for dividend payment: $V(s, 1) = V(M, 1) + s - M$ for $s \ge M$. Equation (5) is also homogeneous, and the set of solutions has dimension 1. So, for the waiting time $\tau(s, B)$ to reach $B$ from $s$ before ruin, the expected discount factor $W(s, B) = E[r^{\tau(s,B)}]$ is a solution of (5), and $W(s, B)$ is proportional to the function $W(s)$: $W(s, B) = W(s)/W(B)$ for $0 \le s \le B$.
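As a hedged illustration of how $W(s)$ and the unconstrained value $V(s, 1)$ can be computed, the sketch below builds a solution of the discounted difference equation from the two roots of $r p z^2 - z + r q = 0$ and then applies the classical barrier construction $V(s, 1) = W(s)/(W(M + 1) - W(M))$ for $s \le M$, with $M$ minimizing $W(M + 1) - W(M)$. The normalization of $W$ and the names `make_W` and `unconstrained_value` are assumptions of this sketch, not taken from the paper.

```python
import math


def make_W(p: float, r: float):
    """Return a solution W of f(s) = r (p f(s+1) + q f(s-1)) with W(-1) = 0."""
    q = 1.0 - p
    root = math.sqrt(1.0 - 4.0 * r * r * p * q)  # real since 4pq <= 1 and r < 1
    z_plus = (1.0 + root) / (2.0 * r * p)
    z_minus = (1.0 - root) / (2.0 * r * p)
    return lambda s: z_plus ** (s + 1) - z_minus ** (s + 1)


def unconstrained_value(s: int, p: float, r: float, max_barrier: int = 200):
    """Return (V(s, 1), M) for the classical single-barrier strategy."""
    W = make_W(p, r)
    # The optimal barrier minimizes the increment W(M + 1) - W(M).
    M = min(range(max_barrier), key=lambda m: W(m + 1) - W(m))
    scale = 1.0 / (W(M + 1) - W(M))
    if s <= M:
        return scale * W(s), M
    return scale * W(M) + (s - M), M  # V(s, 1) = V(M, 1) + s - M above the barrier


value, barrier = unconstrained_value(4, 0.7, 1 / 1.03)
print(value, barrier)
```

With the parameters of the numerical example ($p = 0.7$, $r = 1/1.03$) this sketch gives barrier 4 and a value of about 13.1004, in line with the figure $V(4, 1) = 13.1003845$ quoted there.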

A Modified Bellman Equation
Our first numerical method for the company value with ruin constraint is based on a modified Bellman equation. We use the following dynamic equations for $V(s, \alpha)$ (see Hipp (2003), formula (4)): These equations hold in the range $s = 0, 1, 2, \ldots$ and $\psi_0(s) \le \alpha \le 1$, and we use the values $V(-1, \alpha) = 0$ and $\psi_0(-1) = 1$. The dynamic equations define the optimal dividend strategy in feedback form: Equation (6) tells us when a dividend of size 1 is paid. Equation (7) gives the value function when no dividend is paid, depending on the next period in which the surplus can go up with probability $p$ or down with probability $q$. The number $\alpha$ is the running allowed ruin probability, which changes to $\beta_1$ or $\beta_2$ in the next period depending on an up- or down-move of the surplus. Equation (8) implies that the process of running allowed ruin probabilities is a martingale with mean $\alpha$. Computation is based on an iteration which is the well-known policy improvement procedure (see Hipp (2003)): we start from $V_0(s, \alpha) = 0$, and when $V_n(s, \alpha)$ is given for all $s$ and $\alpha$, we compute $V_{n+1}(s, \alpha)$ from Equations (6)-(9), where we use the functions $V_n$ on the right-hand side of (7) and obtain $V_{n+1}$ on the left-hand side of (6): One can show that the sequence of functions $V_n(s, \alpha)$ is non-decreasing and bounded, and its limit is a solution of the dynamic equations above (see Hipp (2003), Lemma 2a). The classical verification argument yields that the limit is the value function of our control problem, and a solution to the dynamic Equations (6)-(9), see also Hipp (2003), Lemma 2b-d. By continuity of $\alpha \mapsto V(s, \alpha)$, the supremum in (7) is attained at some $(\beta_1, \beta_2) \in A(s, \alpha)$. Let $\alpha(t)$ be the process of allowed ruin probabilities defined by $\alpha(t + 1) = \alpha(t)$ when a dividend of size 1 is paid at $t$, and otherwise $\alpha(t + 1) = \beta_1$ or $\alpha(t + 1) = \beta_2$ when $S(t)$ goes up or down, respectively.
Notice that for all $t \ge 0$ we have Using the bivariate process $(S(t), \alpha(t))$ we can define the optimal dividend strategy in feedback form: in state $(s, \alpha)$ we pay a dividend of size 1 whenever the maximum in (6) is at $V(s - 1, \alpha) + 1$. The second component $\alpha(t)$ makes the optimal dividend strategy path dependent. Each payment of size 1 does not change the state, so during dividend payment we stay in the same state until the next claim (downward jump). This implies that there exists a function $M(\alpha)$ such that dividends are paid above $M(\alpha)$ when the allowed ruin probability equals $\alpha$. The function $M(\alpha)$ is a non-increasing step function. Below, we study the running allowed ruin probabilities $\alpha(t)$ in more detail. In the above computation based on the modified Bellman equation, the original version used a complete search for the maximizer $(\beta_1, \beta_2)$. Here we replace each complete search by an easy computation of running allowed ruin probabilities, which speeds up the computation considerably.

Iteration Method
The iteration method is based on the observation that, starting at initial surplus $s$, we either pay dividends immediately, or we wait until we arrive at some larger surplus $B$. If at $B$ the ruin probability $a(B)$ is allowed, then we continue with a dividend strategy producing a dividend value (close to) $V(B, a(B))$. If we start with an initial function $V_0(s, \alpha)$ (e.g., $V_0(s, \alpha) = 0$), and if $V_{n-1}(s, \alpha)$ is given, then our iteration reads Here, $p(s, B)$ is the probability that the without-dividend process $S(t)$ falls below zero before reaching $B$, and $W(s, B)$ is the discounting factor $E[r^{\tau(s,B)}]$ for $\tau(s, B)$ the waiting time to reach $B$ from $s$ before ruin. This device produces a monotone sequence of functions $V_n$ which might converge to the value function $V(s, \alpha)$. The first Equation (13) covers the case in which no dividends are paid before reaching $B$, while Equation (14) allows for immediate dividend payment at surplus $s$. The numerical results verify that the optimal dividend strategies are of barrier type.

Running Allowed Ruin Probabilities
The running allowed ruin probabilities are ruin probabilities for optimal dividend strategies: if $D$ is the optimal dividend strategy with initial surplus $s$ and allowed ruin probability $\alpha$, then the ruin probability of the with-dividend process $S(u) - D(u)$, $u \ge 0$, equals $\alpha$. At time $t$ the dividend strategy $D_t(u) = D(t + u)$ is the optimal strategy for $(S(t), a(t))$, and so $a(t)$ is the ruin probability for the with-dividend process $S(t + u) - D_t(u)$, $u \ge 0$. Let $B_0 \ge s_0$ be the surplus above which dividends are paid first, i.e., dividends of size 1 are paid at state $B_0 + 1$, which produces a constant value $B_0$ for the with-dividend process until the next claim (downward jump). Since no dividends are paid when $s_0 \le S(t) \le B_0$, we can write $a(t) = a_0(S(t))$, where $a_0(s)$ satisfies (4) with $a_0(-1) = 1$. This implies that for $0 \le s \le B_0$ we have $a_0(s) = 1 - \gamma_0 + \gamma_0 \psi_0(s)$ for some $0 < \gamma_0 \le 1$, and $\gamma_0$ can be computed from $a_0(s_0) = \alpha_0$. During dividend payment, $a(t)$ stays on the level $a_0(B_0)$; it leaves this level at the first claim. Let $B_1 \ge B_0$ be the level above which we first pay dividends after leaving $B_0$. Repeating the above reasoning with $B_1$ instead of $B_0$ and $B_0 - 1$ instead of $s_0$, we obtain a function $a_1(s)$, $s \le B_1$, which is the ruin probability of the with-dividend process for the initial pair $(B_0, a_0(B_0))$. Since the transition from $B_0$ to $B_0 - 1$ is certain, we get $a_1(B_0 - 1) = a_0(B_0)$. This value determines $\gamma_1$ in the representation $a_1(s) = 1 - \gamma_1 + \gamma_1 \psi_0(s)$. Proceeding in this way, for a non-decreasing sequence of barriers $B_i$, $i \ge 0$, we obtain a non-decreasing sequence of numbers $\gamma_i$, $i \ge 0$, satisfying the recursion $\gamma_{i+1} = \gamma_i\,(1 - \psi_0(B_i))/(1 - \psi_0(B_i - 1))$ (16).
The dividend strategy $D$ which pays dividends at the levels $B_i$ satisfies the ruin constraint If we stop the sequence $B_i$ at some finite number $n$, this means that after visiting $n$ barrier levels we stop paying dividends forever, i.e., $\gamma_i = 1$ for $i > n$.
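The construction of the $\gamma_i$ can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's code: it takes $\gamma_0 = (1 - \alpha_0)/(1 - \psi_0(s_0))$ from $a_0(s_0) = \alpha_0$, updates $\gamma$ by matching $a_{i+1}(B_i - 1) = a_i(B_i)$, and treats a barrier sequence as admissible when no $\gamma_i$ exceeds 1. It assumes $1/2 < p < 1$, $\psi_0(s) = (q/p)^{s+1}$ and barriers $B_i \ge 1$; the names `gammas` and `admissible` are hypothetical.

```python
def psi0(s: int, p: float) -> float:
    """Ruin probability without dividends (ruin = reaching -1), assuming 1/2 < p < 1."""
    return 1.0 if s < 0 else ((1.0 - p) / p) ** (s + 1)


def gammas(s0: int, alpha0: float, barriers, p: float):
    """Return [gamma_0, gamma_1, ...] for the given sequence of barrier levels."""
    gamma = (1.0 - alpha0) / (1.0 - psi0(s0, p))  # from a_0(s0) = alpha0
    out = [gamma]
    for B in barriers:
        if B < 1:
            raise ValueError("this sketch needs barriers B >= 1")
        # Matching a_{i+1}(B - 1) = a_i(B) gives the multiplicative update.
        gamma *= (1.0 - psi0(B, p)) / (1.0 - psi0(B - 1, p))
        out.append(gamma)
    return out


def admissible(s0: int, alpha0: float, barriers, p: float) -> bool:
    """Ruin-constraint test: every gamma_i must stay at or below 1."""
    return max(gammas(s0, alpha0, barriers, p)) <= 1.0


print(gammas(4, 0.2, [4, 4, 6], 0.7))
print(admissible(4, 0.2, [4] * 7, 0.7), admissible(4, 0.2, [4] * 20, 0.7))
```

Each visit to a barrier multiplies $\gamma$ by a factor larger than 1, so low barriers use up the allowed ruin probability faster than high ones; this is the trade-off exploited when barrier levels are selected.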

The Barrier Method
The barrier method does not use iterations or discretizations; it is more interactive and simpler. We start with a (finite) sequence of barrier levels $B(i)$, $i = 1, \ldots, n$, and compute the dividend value with an analytic formula in which all dividends which are paid on one of these levels are appropriately discounted and added. The value of dividend payments on the level $B_i$, discounted to the time when we reach $B_i + 1$ after leaving $B_{i-1} - 1$, does not depend on $i$ and equals $A = \sum_{k=0}^{\infty} p^k r^k = 1/(1 - rp)$.
So the dividend value consists of the sum of all these payments, discounted over the times elapsed between $s$ and $B_0 + 1$ (for the payments at level $B_0$), then over this time plus the time elapsed between $B_0 - 1$ and $B_1 + 1$ plus the time spent on level $B_0$ (for the payments at level $B_1$), and so on. The discount factor for the time spent on level $B_i$ is again independent of $i$; it equals $C = \sum_{k=1}^{\infty} q p^{k-1} r^k = qr/(1 - rp)$.

The present value for payments on level $B_0$ is
for level $B_1$ we obtain the present value and so on. A closed formula for the total dividend value of the dividend strategy $D$ is One method to find barrier levels uses the function $M(\alpha)$, which might come from the computation with one of the above numerical methods: Notice that for all $s \ge M(\alpha)$ we have $V(s + 1, \alpha) = V(s, \alpha) + 1$, since the running allowed ruin probability equals $\alpha$ for $s \ge M(\alpha)$ (use $\alpha = p\beta_1 + q\beta_2$, which holds for $\beta_2 = \alpha$ only if $\beta_1 = \alpha$). The function $M(\alpha)$ (see Figure 1 left below) is combined with the running ruin probabilities $a_i(s)$ defined sequentially as follows: $a_0(s)$ is computed from the initial data $(s_0, \alpha_0)$. The intersection of $a_0(s)$ with $M(\alpha)$, plotted in the same diagram, is barrier $B_0$. From the data $(B_0, a_0(B_0))$ we compute $a_1(s)$, and so on: the intersection points of $a_i(s)$ with $M(\alpha)$ are the barriers $B_i$. Another, more precise method is an (almost) complete search in the vectors of non-decreasing $n$-tuples of numbers $k, k + 1, \ldots, K$, where $k$ is the barrier in the unconstrained problem and $K$ a suitable limit of the state space for $s$. Search for a smallest vector, in pointwise order, for which the maximal $\gamma_i$ is smaller than 1. Finally we apply formula (18) to this smallest vector. The computation of the $\gamma_i$ is very simple, and the test checks for an appropriate with-dividend ruin probability. A numerical example is given below. Following our intuition we searched for a barrier sequence only in the set of all non-decreasing sequences. That this intuition does not fail here can be seen from the following argument. The functions $a_i(s)$ are defined by $a_i(-1) = 1$, Equation (4) for $0 \le s \le B_i - 1$, and some value for $a_i(s_0)$ with $0 \le s_0 \le B_i$. The functions are concatenated by the value in which the with-dividend surplus jumps after leaving the barrier level $B_i$.
For $B_{i+1} \ge B_i - 1$ this produces the recursion (16), but for $B_{i+1} < B_i - 1$ after a jump to $B_i - 1$ we pay out dividends immediately, which leads us to $B_{i+1}$. In this case the recursion reads With a next barrier $B_{i+2} \ge B_{i+1} - 1$ we obtain If we replace $B_i$ by $\tilde{B}_i = B_{i+1} + 1 < B_i$ we obtain for the barriers $\tilde{B}_i, B_{i+1}, B_{i+2}$ a parameter and the same value appears for the non-decreasing triple $B_{i+1}, \tilde{B}_i, B_{i+2}$. The dividend value for these barriers is larger than before, since we pay dividends earlier. Repeating this argument step by step, we can replace an arbitrary admissible sequence of barriers by an admissible non-decreasing one which leads to a higher dividend value.
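The verbal description of the barrier value earlier in this section can be turned into a short computation. The sketch below is a reconstruction from that description, not the paper's formula (18): payments at level $B_i$ are worth $A$ at the moment $B_i + 1$ is reached, the stay on each level contributes the factor $C$, and moving from a surplus $x$ up to $y$ before ruin contributes $W(x)/W(y)$. It assumes a non-decreasing barrier sequence with $s \le B_0 + 1$; the names `make_W` and `barrier_value` are hypothetical.

```python
import math


def make_W(p: float, r: float):
    """Solution W of f(s) = r (p f(s+1) + q f(s-1)) with W(-1) = 0."""
    q = 1.0 - p
    root = math.sqrt(1.0 - 4.0 * r * r * p * q)
    z_plus = (1.0 + root) / (2.0 * r * p)
    z_minus = (1.0 - root) / (2.0 * r * p)
    return lambda s: z_plus ** (s + 1) - z_minus ** (s + 1)


def barrier_value(s: int, barriers, p: float, r: float) -> float:
    """Dividend value when dividends are paid on the given levels, one visit each."""
    q = 1.0 - p
    W = make_W(p, r)
    A = 1.0 / (1.0 - r * p)      # value of payments on one level
    C = q * r / (1.0 - r * p)    # discount for the time spent on one level
    total, discount, start = 0.0, 1.0, s
    for B in barriers:
        if start > B + 1:
            raise ValueError("this sketch assumes non-decreasing barriers")
        discount *= W(start) / W(B + 1)  # reach B + 1 from start before ruin
        total += discount * A
        discount *= C                    # leave level B by a claim, arriving at B - 1
        start = B - 1
    return total


print(barrier_value(4, [4] * 200, 0.7, 1 / 1.03))
```

As a consistency check, a long constant sequence at the unconstrained barrier 4 should give approximately the unconstrained value $V(4, 1) \approx 13.1004$ for $p = 0.7$ and $r = 1/1.03$, since both describe the same strategy.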

The Lagrange Multiplier Approach
For the Lagrange multiplier method we choose a constant $L > 0$ and maximize the company value minus the weighted corresponding ruin probability: We used a non-stationary approach and computed the quantities for time $t$ with $V(-1, L, t) = -L$. The resulting optimal dividend strategy is a time-dependent barrier strategy $M(t)$ under which dividends are paid at $t$ when the with-dividend surplus is above $M(t)$. Using the barrier function $M(t)$ one can compute the ruin probability for the optimal dividend strategy via the recursion The value $V(s, L) = V(s, L, 0)$ can efficiently be approximated via a backward recursion starting at $V(s, L, T) = -L\psi(s)$ and $\psi(s, T) = \psi_0(s)$ for some large $T$, a computation which turned out to be easy. Numerical experiments indicate that the approach produces dividend strategies which differ from the ones computed with the other methods: The resulting optimal dividend strategies for $V(s, L)$ are state and time dependent, but not path dependent.
The proposed policy improvement method without dynamic equation also works for more general models which are skip-free upwards and have independent stationary increments, e.g., classical Lundberg models with arbitrary claim size distribution or Brownian motions with drift. For these models the first entrance probabilities and the discount factors for first entry waiting times are available. For Lundberg models the policy improvement method based on a modified Bellman equation can probably be applied, in particular with the explicit form of running allowed ruin probabilities. For the barrier method a continuous state space might cause problems: after discretization the resulting grid will be too large for an easy selection of optimal barriers.

Numerical Example
All computations in this section are done with MATLAB (modified Bellman, policy improvement, and Lagrange) or with Maple (barrier method). We consider the case with parameters $p = 0.7$, $r = 1/1.03$, $s_0 = 4$ and $\alpha_0 = 0.2$. We have This shows that a ruin constraint is rather cheap. The method using the modified Bellman equation described in Hipp (2003) is run, in slightly modified form, with the same step size $1/100{,}000$ for $\alpha$; with 800 iterations and interpolation it yields a somewhat larger value: $V(4, 0.2) = 12.817618$.
The modification, which speeds up the computation considerably and allows for a small step size and a large number of iterations, is the specification of the maximizers $\beta_1$ and $\beta_2$ when $s$ and $\alpha$ are given. We use again the running ruin probabilities for states without dividend payment $a(x) = 1 - \gamma + \gamma\psi_0(x)$ with $\gamma$ derived from $a(s) = \alpha$ and set The larger value obtained with the old method indicates that the iteration method was used with an insufficient number of repetitions. Furthermore, interpolation reduces the effect of a discretization. Since the iteration method uses a complete search over the possible surplus values (reducing the search to one over a small region leads to wrong results), larger numbers of iterations are not acceptable even for a patient user. Finally, for the iteration method we do not have a proof for convergence to the value function. Of course the best results can be obtained using the barrier method, which is based on exact formulas. We computed $V(4, 0.2)$ from given barrier levels $B_0, \ldots, B_{100}$. Stopping dividend payment after visiting 100 not necessarily different barriers produces a numerical result below the true value, but the small size of this error can be seen in the (worst) case $\alpha = 1$: $V(4, 1) = 13.1003845$, while with 100 steps we obtain 13.1003469. We used the barriers All corresponding $\gamma_i$ are smaller than 1. With these we obtained the value $V(4, 0.2) = 12.9099$.
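The specification of the maximizers can be illustrated as follows. The sketch assumes that the omitted definition sets $\beta_1 = a(s + 1)$ and $\beta_2 = a(s - 1)$, which is an inference from the surrounding text rather than a quoted formula; it also assumes $1/2 < p < 1$ and $\psi_0(s) = (q/p)^{s+1}$. The function name `next_allowed` is hypothetical.

```python
def psi0(s: int, p: float) -> float:
    """Ruin probability without dividends (ruin = reaching -1), assuming 1/2 < p < 1."""
    return 1.0 if s < 0 else ((1.0 - p) / p) ** (s + 1)


def next_allowed(s: int, alpha: float, p: float):
    """Return (beta_1, beta_2), the allowed ruin probabilities after an up- or down-move."""
    gamma = (1.0 - alpha) / (1.0 - psi0(s, p))  # from a(s) = alpha

    def a(x: int) -> float:
        return 1.0 - gamma + gamma * psi0(x, p)

    # Assumption of this sketch: beta_1 = a(s + 1), beta_2 = a(s - 1).
    return a(s + 1), a(s - 1)


p = 0.7
beta1, beta2 = next_allowed(4, 0.2, p)
# Martingale property of the running allowed ruin probabilities.
print(beta1, beta2, p * beta1 + (1.0 - p) * beta2)
```

Because $a(x)$ solves the same difference equation as the survival probability, the pair automatically satisfies $p\beta_1 + q\beta_2 = \alpha$, so no search over a grid of ruin probabilities is needed.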
The barriers are found in an interactive procedure: we started with three regions $[0, \ldots, 6]$, $[7, \ldots, 13]$, $[14, \ldots, 19]$ in which all barriers have the same value $a$, $b$, $c$, respectively. We took $a = 4$, which is the barrier in the unconstrained problem, $b = 6$ and $c = 7$. All other barriers are $K$. To avoid $\gamma_i > 1$ we increased step by step to $c = 8$. Then we reduced the size of barriers in the remaining groups. We are close to the optimal value when $\gamma_K < 1$ is very close to one. The difference between the dividend values 12.817618 and 12.9099 for $V(4, 0.2)$ is caused by the discretization of $\alpha$; even a step size of $1/100{,}000$ results in a rather big error due to the large number of calculations.