Time-Consistent Investment and Reinsurance Strategies for Mean-Variance Insurers under Stochastic Interest Rate and Stochastic Volatility

Abstract: This paper studies the time-consistent optimal investment and reinsurance problem for mean-variance insurers when both the interest rate and the volatility in the financial market are stochastic. The insurers are allowed to transfer insurance risk by proportional reinsurance or by acquiring new business, and the surplus process is modelled by a jump-diffusion process. The financial market consists of a risk-free asset, a bond, and a stock modelled by Heston's stochastic volatility model. The interest rate in the market is modelled by the Vasicek model. Using an extended dynamic programming approach, we explicitly derive the equilibrium reinsurance-investment strategies and value functions. In addition, we provide and prove a verification theorem and then prove that the solution we obtain satisfies it. Moreover, a sensitivity analysis is given to show the impact of several model parameters on the equilibrium strategy and the efficient frontier.


Introduction
Risk management is a fundamental challenge for insurance companies. Reinsurance is a powerful tool for insurers to transfer insurance risk, and the design of optimal reinsurance contracts remains a fascinating topic among insurers. On the other hand, an increasing number of insurance companies participate in financial markets nowadays. In this context, the reinsurance and investment problem has become a popular research topic in recent years. For example, Browne (1995) [1] considered the problems of maximizing the expected utility and minimizing the ruin probability for insurers, and derived the optimal investment strategies in a simple setting in which the surplus process is modelled by a Brownian motion with drift. Yang and Zhang (2005) [2] solved this problem for a jump-diffusion surplus process. Schmidli (2002) [3] considered the problem of minimizing the ruin probability by investment and proportional reinsurance. Bai and Guo (2008) [4] investigated the problem of optimal proportional reinsurance and investment with multiple risky assets under no-short-selling constraints. Li et al. (2018) [5] discussed the problem of maximizing expected utility in the case of excess-of-loss reinsurance under model uncertainty. We refer the readers to [6] and Zhang et al. (2016) [7] for the criterion of maximizing the utility of terminal wealth, and to Liang (2018) [8] for minimizing the ruin probability.
Besides these two criteria, the mean-variance criterion has also been popular in asset allocation problems since the breakthrough of Markowitz (1952) [9]. Zhou and Li (2000) [10] analytically solved the continuous-time mean-variance problem by stochastic linear-quadratic control theory. In recent years, there has been growing interest in the reinsurance-investment problem under a mean-variance framework; see, for example, Bäuerle (2005) [11], Bai and Zhang (2008) [12], Bi and Guo (2013) [13], and Shen et al. (2014) [14]. These papers searched for pre-commitment solutions, which are not time-consistent in the sense that an optimal strategy derived at one time is not guaranteed to be optimal at a future time point. Because time-consistency is a basic requirement for decision makers, much of the recent literature has studied time-consistent mean-variance portfolio selection problems in the Nash equilibrium decision framework that was introduced by Björk et al. (2014) [15] and Björk et al. (2017) [16]. Among them, Zeng and Li (2011) [17] conducted the first study of the time-consistent mean-variance reinsurance-investment problem. They obtained closed-form solutions under the Black-Scholes model by means of extended HJB equations. Subsequently, many related works have appeared: for example, Zeng and Li (2013) [18] incorporated jumps into this problem, and Zhao et al. (2016) [19] considered this problem in the presence of a defaultable asset. Xiao et al. (2019) [20] considered the problem of maximizing the joint interests of the insurer and the reinsurer. Zhu et al. (2019) [21] considered a non-zero-sum differential reinsurance and investment game between mean-variance insurers.
A large body of empirical evidence shows that the volatilities of risky asset prices are neither constant nor deterministic. Heston (1993) [22] introduced a stochastic volatility model, which is still popular for option pricing and asset pricing. Several papers have also used this model or its extensions in portfolio selection problems, for example, Li et al. (2012) [23], Liu (2007) [24], and Yan et al. (2020) [25]. It is also generally accepted that the interest rate is stochastic, and popular models describing the interest rate include the CIR model and the Vasicek model. It is natural to consider their combination, for example, the Heston-CIR model or the Heston-Vasicek model. In fact, several papers in the literature have considered hybrid stochastic interest rate and stochastic volatility models: for example, Ahlip et al. (2013) [26] used such a model in foreign exchange option pricing and [27] in asset allocation for a DC pension plan. However, to the best of our knowledge, few papers have discussed the reinsurance and investment problem in a stochastic interest rate environment. Although [6] and Li et al. (2015) [28] studied the management of interest rate risk for an insurer, the real interest rate in their work is deterministic. Accordingly, there is in fact no interest rate risk in real terms in their models. Additionally, as the wealth process does not satisfy the usual Lipschitz conditions in a stochastic interest rate model, we cannot use a standard verification theorem directly when incorporating a stochastic rate into the portfolio selection problem. Korn et al. (2002) [29] considered the portfolio selection problem with a stochastic interest rate for CRRA investors and provided a verification theorem. Pan et al. (2017) [30] studied the asset-liability management problem in a stochastic interest rate environment for a CARA investor. In this paper, we consider the reinsurance-investment problem under the dynamic mean-variance framework. We give a verification theorem for time-consistent mean-variance insurers based on the result of Björk et al. (2017) [16] and verify that the solution we obtain satisfies it.
In our paper, we consider the time-consistent reinsurance and investment problem for a mean-variance insurer, incorporating both stochastic interest rate risk and stochastic volatility risk. We use the Heston model to describe the dynamics of the stock and the Vasicek model to describe the interest rate. The surplus process is modelled by a jump-diffusion process. The insurer can transfer part of the risk by proportional reinsurance. A zero-coupon bond is included in the financial market in order to hedge interest rate risk. The objective of the insurer is to find an optimal reinsurance-investment strategy in a time-consistent sense, so as to maximize the expected value of terminal wealth and minimize the associated variance. The contribution of our paper is threefold: (1) We consider the time-consistent mean-variance optimization problem for insurers with both stochastic interest rate and stochastic volatility, which is new in the literature. (2) We derive the time-consistent equilibrium investment and reinsurance strategies and the value functions explicitly, overcoming the difficulties caused by the stochastic interest rate. (3) We verify our solution through a verification theorem. It can be seen as an extension of Björk et al. (2017) [16] to the stochastic interest rate case and, as far as we know, is also new in the literature.
The remainder of this paper is organized as follows. Section 2 describes the model and assumptions. Section 3 formulates the problem in a game-theoretic framework and provides the verification theorem. Section 4 derives the equilibrium reinsurance-investment strategy and the efficient frontier through an extended HJB equation. Section 5 gives several numerical examples to show the effects of different model parameters on the equilibrium strategies and the efficient frontier. We conclude the paper in Section 6.

Model Setup
Let $(\Omega, \mathcal{F}, P)$ be a complete probability space with a right-continuous filtration $\mathbb{F} := \{\mathcal{F}_t\}_{t \in [0,T]}$ satisfying the usual conditions, where $T \in (0, +\infty)$ represents the time horizon. We assume that all of the stochastic processes in this paper are well defined and adapted to $\{\mathcal{F}_t\}_{t \in [0,T]}$.

The Financial Market
In our model, we assume that the insurer can continuously invest in three assets: a risk-free asset, a bond, and a stock. The instantaneous interest rate $r(t)$ follows the Vasicek model: where $a$, $b$, and $\sigma_r$ are positive constants and $W_r(t)$ is a standard Brownian motion on the probability space. The price process of the risk-free asset $S_0(t)$ follows: The second asset is a zero-coupon bond, which delivers a payment of one dollar at maturity. Denote by $B(t, s, r)$ the price at time $t$ of the zero-coupon bond with maturity $s$ when the interest rate is $r(t) = r$. We define the risk-neutral probability measure $Q$ by $\frac{dQ}{dP}\big|_{\mathcal{F}_T} = G(T)$, where: Here, $\lambda_r(s)$ is the market price of risk of $W_r(t)$, which we assume to be a constant in this paper. Under $Q$, we have: By the Feynman-Kac theorem, for a fixed $s$, $B$ satisfies the following PDE: The bond price is known in an explicit form: where and From now on, we use the shorter notation $B(t, s) = B(t, s, r)$. Applying Itô's formula, we obtain the $P$-dynamics of the bond: It is unrealistic to assume that all zero-coupon bonds can be found in the market. Following [27], we introduce a rolling bond with constant maturity $K$ in our analysis. Denote the price of the rolling bond at time $t$ by $B_K(t)$; it follows the stochastic differential equation: Because the rolling bond is only correlated with the interest rate, and the interest rate model here is a one-factor model, the rolling bond can be used to replicate any zero-coupon bond. The relation between the rolling bond and the zero-coupon bond satisfies: The third asset is a stock, whose dynamics evolve according to the Heston stochastic volatility model: where $\sigma_S$, $\sigma_L$, $\kappa$, $\nu$, $\delta$ are positive constants, $W_S(t)$ and $W_L(t)$ are two correlated one-dimensional Brownian motions independent of $W_r(t)$ in the interest rate model, and $\mathrm{Cov}(W_S(t), W_L(t)) = \rho t$. The condition $2\kappa\delta > \sigma_L^2$ is required here in order to ensure $L(t) > 0$.
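As an illustration of the two state variables, the following minimal sketch simulates one path of the short rate $r(t)$ and the volatility factor $L(t)$ with an Euler scheme. It rests on assumptions that are not confirmed by the text above: the dynamics are taken to be $dr = (a - br)\,dt + \sigma_r\,dW_r$ and $dL = \kappa(\delta - L)\,dt + \sigma_L\sqrt{L}\,dW_L$ (the square-root form suggested by the condition $2\kappa\delta > \sigma_L^2$), the sign convention of the diffusion terms is a guess, and the "full truncation" fix for negative values is a choice made here. The function names and any Heston parameter values passed to them are hypothetical.

```python
import math
import random


def feller_condition_holds(kappa, delta, sigma_l):
    """Check the positivity condition 2*kappa*delta > sigma_L^2 stated in the text."""
    return 2.0 * kappa * delta > sigma_l ** 2


def simulate_factors(r0, l0, a, b, sigma_r, kappa, delta, sigma_l,
                     T=1.0, n_steps=250, seed=0):
    """Return one Euler path of (r, L) on an equally spaced grid of n_steps + 1 points.

    Assumed dynamics (not taken from the paper's lost equations):
        dr = (a - b r) dt + sigma_r dW_r
        dL = kappa (delta - L) dt + sigma_L sqrt(L) dW_L
    with W_r independent of W_L, as stated in the text.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    r, l = r0, l0
    r_path, l_path = [r], [l]
    for _ in range(n_steps):
        dw_r = rng.gauss(0.0, sqrt_dt)
        dw_l = rng.gauss(0.0, sqrt_dt)
        l_pos = max(l, 0.0)  # full truncation: use max(L, 0) inside drift and sqrt
        r += (a - b * r) * dt + sigma_r * dw_r
        l += kappa * (delta - l_pos) * dt + sigma_l * math.sqrt(l_pos) * dw_l
        r_path.append(r)
        l_path.append(l)
    return r_path, l_path
```

For example, `simulate_factors(0.03, 0.04, 0.0125, 0.266, 0.013, 2.0, 0.04, 0.3)` uses the interest rate parameters quoted in the sensitivity analysis together with illustrative values for $r_0$, $L(0)$, $\kappa$, $\delta$, and $\sigma_L$.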

Surplus Process
We assume that the insurer's surplus process is modelled by the following jump-diffusion process: where $c > 0$ is the premium rate and $\sigma_0$ is a positive constant. $W_0(t)$ is a standard Brownian motion, and $\sigma_0 W_0(t)$ can be seen as the uncertainty in the premium income of the insurer. $N(t)$ represents the number of claims in $[0, t]$ and is a Poisson process with intensity $\lambda > 0$. $Y_i$ is the size of the $i$th claim, and $Y_i$, $i = 1, 2, \ldots$, are independent and identically distributed positive random variables with distribution function $F_Y$. Here, $W_0(t)$ and $\sum_{i=1}^{N(t)} Y_i$ are independent of each other, and both are independent of $W_S(t)$, $W_r(t)$, and $W_L(t)$. We assume that the moment generating function of $Y_i$ exists, which implies that moments of every order exist. We denote the first-order moment of $Y_i$ by $\mu_y$. The insurer calculates the premium rate by the expected value principle, that is, $c = (1 + \theta)\lambda\mu_y$, where $\theta$ is the safety loading of the insurer.
Suppose that the insurer can control risk by proportional reinsurance. We denote by $\alpha(t)$ the reinsurance proportion of the insurer at time $t$, i.e., the insurer pays $100\alpha(t)\%$ of each claim while the reinsurer pays $100(1 - \alpha(t))\%$. Accordingly, $\alpha(t)$ can also be seen as the insurance risk exposure of the insurer. If $\alpha(t) \in [0, 1]$, the insurer purchases reinsurance from the reinsurer; if $\alpha(t) > 1$, the insurer acquires new business from the reinsurer, i.e., the insurer acts as a reinsurer of the reinsurer. The reinsurer also uses the expected value principle to calculate the premium rate, so the insurer has to pay a premium at the rate of $(1 + \eta)\lambda\mu_y(1 - \alpha(t))$ to the reinsurer. Here $\eta$ is the safety loading of the reinsurer, and $\eta > \theta$ in order to exclude arbitrage. Subsequently, the surplus process of the insurer becomes:

Wealth Process
The insurer, starting from initial capital $x_0$ at time 0, can invest in the risk-free asset, the stock, and the rolling bond, and can buy proportional reinsurance within the time horizon $[0, T]$. Accordingly, the evolution of the wealth process $X^\pi(t)$ can be described as: where $\beta^\pi(t)$ and $\gamma^\pi(t)$ represent the amounts of money invested in the stock and the bond, respectively, and $\alpha^\pi(t)$ is the value of risk exposure. Let $\pi = \{(\beta^\pi(t), \gamma^\pi(t), \alpha^\pi(t))\}_{t \in [0,T]}$ represent the reinsurance-investment strategy. We have the following definition of an admissible strategy (Definition 1): (1) $\forall (r, l, x) \in O$, SDE (7) has a pathwise unique solution with initial condition $r(t) = r$, $L(t) = l$, and $X^\pi(t) = x$; (2) $\forall (t, r, l, x) \in Q$, $\alpha^\pi(t) \ge 0$. The set of admissible strategies is denoted by $\Pi$, and we only consider admissible strategies here.

Problem Formulation and Verification Theorem
The traditional mean-variance portfolio selection problem assumes that the insurer's goal is to solve the following problem: where the positive constant $\gamma$ is the risk aversion parameter of the insurer. The corresponding strategy is called a pre-commitment strategy, and it is only optimal at time 0. Rational decision-makers hope to find a time-consistent strategy that remains optimal as time goes by. In this paper, we assume that the insurer can update the objective function at any future time, and we want to derive a time-consistent strategy that will still be optimal at that future time. The objective function of the insurer is: Following Björk et al. (2017) [16], we use the notion of a Nash equilibrium strategy in a game-theoretic framework in order to define the time-consistent strategy (also called the equilibrium strategy). The definition is as follows:
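To make the criterion concrete, the sketch below estimates a mean-variance objective from simulated terminal wealth. It assumes the common form $J = E[X(T)] - \frac{\gamma}{2}\mathrm{Var}[X(T)]$ used in the Björk et al. framework; the paper's own displayed objective is not reproduced above, so this form, the function name, and the sample values are assumptions for illustration only.

```python
def mean_variance_objective(terminal_wealth, gamma):
    """Sample estimate of E[X_T] - (gamma / 2) * Var[X_T].

    terminal_wealth: non-empty list of simulated terminal wealth values.
    gamma: positive risk aversion parameter.
    The variance is the population (divide-by-n) sample variance.
    """
    n = len(terminal_wealth)
    mean = sum(terminal_wealth) / n
    variance = sum((x - mean) ** 2 for x in terminal_wealth) / n
    return mean - 0.5 * gamma * variance
```

A larger $\gamma$ penalizes dispersion more heavily, so two strategies with the same mean terminal wealth are ranked by their variance alone.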

Definition 2.
A control law $\hat{\pi}$ is said to be an equilibrium strategy if, for any $\pi \in \Pi$, any fixed initial state $(t, r, l, x) \in Q$, and a fixed number $h > 0$, if we define a new strategy $\pi_h$ by In addition, the equilibrium value function is defined by $V(t, r, l, x) = J(t, r, l, x, \hat{\pi})$.
Proof. See Appendix A.

Solution to the Optimization Problem
In this section, we derive the time-consistent equilibrium reinsurance and investment strategies and the corresponding value function for Problem (9).
Suppose that there exist two functions $W(t, r, l, x)$ and $g(t, r, l, x)$ satisfying the conditions in Theorem 1, such that the left side of Equation (11) is concave w.r.t. $\alpha^\pi(t)$, $\beta^\pi(t)$, and $\gamma^\pi(t)$; then Equation (11) can be written as: Because our setting leads to an affine wealth process, we assume $W(t, r, l, x)$ and $g(t, r, l, x)$ to have the following structure: The corresponding partial derivatives are: $g_t = q_t(t, r)x + m'(t)l + n_t(t, r)$, $g_r = q_r(t, r)x + n_r(t, r)$, $g_l = m(t)$, $g_x = q(t, r)$, $g_{rr} = q_{rr}(t, r)$, $g_{xx} = g_{ll} = g_{xl} = 0$, $g_{xr} = q_r(t, r)$.
Substituting (17) and the corresponding derivatives above into Equation (16), we obtain: By the first-order conditions in Equation (18), we first obtain the equilibrium strategies: Inserting (19) into Equations (18) and (13), we get: After separating variables w.r.t. $x$ and $l$, we obtain the following equations: In order to solve Equation (20), we conjecture $q(t, r) = e^{q_1(t) + q_2(t) r}$. After plugging this structure into Equation (20) and separating variables w.r.t. $r$, we have: Solving the two ODEs, we obtain: Equations (21) and (23) are linear ODEs that can be solved directly; we have: The difficulty of solving the problem in closed form lies in Equations (22) and (24). Note that Equation (24) is a non-homogeneous PDE due to the term $\lambda\mu_y q(\theta - \eta)$; hence, it is difficult to conjecture the structure of the solution as before. Here, we use the homogenization technique from PDE theory to solve Equation (24); Equation (22) can then be solved.
Lemma 1. Suppose that $\bar{n}(t, r)$ and $\tilde{n}(t, r; \tau)$ are solutions of the following two equations: Then, the solution to Equation (24) can be expressed as: $n(t, r) = \bar{n}(t, r) + \int_t^T \tilde{n}(t, r; \tau)\,d\tau$.
We define the following operator for any function $\varphi(t, r)$ for convenience: Clearly, we have $\nabla n(t, r) = \nabla \bar{n}(t, r) + \int_t^T \nabla \tilde{n}(t, r; \tau)\,d\tau$.
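The homogenization idea behind Lemma 1 (often called Duhamel's principle) can be checked on a scalar toy problem. The sketch below is not the paper's PDE (24): it uses the hypothetical terminal-value ODE $n'(t) + c\,n(t) + f(t) = 0$, $n(T) = n_T$, splits the solution into a homogeneous part carrying the terminal value plus an integral over $\tau$ of homogeneous solutions started from the source term $f(\tau)$ at time $\tau$, and evaluates the integral with the trapezoidal rule. All names and parameter values are assumptions for illustration.

```python
import math


def duhamel_solution(t, T, c, n_T, f, n_quad=2000):
    """Solve n'(t) + c*n(t) + f(t) = 0 with n(T) = n_T by homogenization.

    n_bar(t)        solves the homogeneous ODE with n_bar(T) = n_T.
    n_tilde(t; tau) solves the homogeneous ODE with n_tilde(tau; tau) = f(tau).
    The solution is n(t) = n_bar(t) + integral over tau in [t, T] of n_tilde(t; tau).
    """
    n_bar = n_T * math.exp(c * (T - t))

    def n_tilde(tau):
        return f(tau) * math.exp(c * (tau - t))

    # Trapezoidal rule over tau in [t, T]
    h = (T - t) / n_quad
    integral = 0.5 * (n_tilde(t) + n_tilde(T))
    for k in range(1, n_quad):
        integral += n_tilde(t + k * h)
    integral *= h
    return n_bar + integral


def closed_form_constant_source(t, T, c, n_T, f0):
    """Exact solution of the same ODE when f(t) = f0 is constant."""
    growth = math.exp(c * (T - t))
    return n_T * growth + f0 * (growth - 1.0) / c
```

For a constant source, `duhamel_solution(0.0, 1.0, 0.5, 1.0, lambda s: 2.0)` should agree with `closed_form_constant_source(0.0, 1.0, 0.5, 1.0, 2.0)` up to quadrature error, which is the scalar analogue of the representation $n = \bar{n} + \int_t^T \tilde{n}\,d\tau$.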
Equation (22) is also a non-homogeneous PDE with variable terms, which cannot be solved directly. However, observing the structures of Equations (22) and (24), we have $h(t, r) = n(t, r) + \hat{h}(t, r)$, where $\hat{h}(t, r)$ satisfies: $\hat{h}_t + (a - br)\hat{h}_r + \cdots$ We conjecture $\hat{h}(t, r) = h_1(t) + h_2(t) r$ and thus have the following equations: Solving these two equations, we have: Before giving Theorem 2, we give the following lemma, which will be used in the proof of Theorem 2.

Lemma 2. If $r(t)$ satisfies Equation (1), then, for any $p > 0$, we have: If $L(t)$ satisfies Equation (6), then, for any $\zeta \ge 1$, we have:
Proof. The first inequality (45) is the same as Lemma 4.1 of Wei et al. (2017) [31]. For (46), notice that holds for some constant $K_1$, so the linear growth condition of Theorem 4.4 in Mao (2007) [32] is satisfied, and we can prove (46) in the case $\zeta \ge 2$. The case $1 \le \zeta < 2$ can be verified using the result for the second-order moment, since we have: by Hölder's inequality.
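The Hölder step for $1 \le \zeta < 2$ can be sketched as follows. Here $Z$ is a generic nonnegative random variable with finite second moment, used as a stand-in; it is an assumption of this sketch and not the exact quantity bounded in (46). Applying Hölder's inequality with the conjugate exponents $2/\zeta$ and $2/(2-\zeta)$ to the product $Z^\zeta \cdot 1$ gives:

```latex
\[
  E\left[Z^{\zeta}\right]
  = E\left[Z^{\zeta}\cdot 1\right]
  \le \left(E\left[Z^{2}\right]\right)^{\zeta/2}
      \left(E\left[1\right]\right)^{(2-\zeta)/2}
  = \left(E\left[Z^{2}\right]\right)^{\zeta/2},
  \qquad 1 \le \zeta < 2 .
\]
```

A finite bound on the second moment therefore yields a finite bound on every moment of order between 1 and 2.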
By Theorem 1, the derivation above can be summarized as the following theorem:
Theorem 2. For Problem (9), the time-consistent reinsurance and investment strategies for the insurer are given by: where $n_r = -\int_t^T n_4(t, \tau) e^{n_3(t, \tau) + n_4(t, \tau) r}\,d\tau$ and $m(t)$ is given by Equation (27). The equilibrium value function and the expectation of the terminal wealth are as follows: where $q_1(t)$, $q_2(t)$, $p(t)$, $n(t, r)$, and $h(t, r)$ are given by Equations (25), (26), (28), (41), and (42), respectively.

Proof.
We need to verify that the strategy we obtain belongs to the set of admissible strategies and that the functions $W(t, r, l, x)$ and $g(t, r, l, x)$ satisfy the conditions of Theorem 1. In this proof, for notational convenience, all of the conditional expectations that we need to take at time $t \in [0, T]$ are taken at time 0 without loss of generality. Inserting Equations (47)-(49) into Equation (7), we have: where (1) Definition 1 can be verified once we have: and In fact, we have a stronger conclusion, that is, for arbitrary $\zeta \ge 1$, we have: We take $E\int_0^T |I_2(t)|^\zeta\,dt < +\infty$ as an example; the rest can be proved similarly. Notice that $m(t)$, $q_1(t)$, $q_2(t)$, $n_3(t, \tau)$, and $n_4(t, \tau)$ are all bounded continuous functions. By an elementary inequality, there exist $C_0, C_1 > 0$ such that: where the last inequality is due to (45) and (46) in Lemma 2.
Because $\alpha^{\pi^*}(t) \ge 0$, (2) in Definition 1 also holds. Hence, we have verified that the equilibrium strategy we obtain belongs to the set of admissible strategies. Now, we turn to the conditions in Theorem 1. We have: Because $q_1(t)$ and $q_2(t)$ are bounded on $[0, T]$, by Lemma 2, we have: holds for arbitrary $p > 0$. By (61), (66), and the existence of higher-order moments of $Y_i$, we can use the Cauchy-Schwarz inequality and the Burkholder-Davis-Gundy inequality to prove the following inequality for arbitrary $\zeta \ge 1$: where $C_0$, $C_1$, $C_2$ are some positive constants. Now, we prove that the functions $W(t, r, l, x)$ and $g(t, r, l, x)$ we obtain satisfy the conditions in Theorem 1. Clearly, $W(t, r, l, x)$ and $g(t, r, l, x)$ belong to $C^{1,2,2,2}(Q)$, so we only need to verify the integrability conditions. We prove: In fact, and, thanks to the affine structure of the value function, we have: where $C_3$, $C_4$, $C_5$, $C_6$ are some positive constants. Subsequently, by an elementary inequality, (68) is proved. The other integrability conditions in Theorem 1 can be proved similarly by using (67) and Lemma 2; we omit them here. Thus, we have verified the conditions of Theorem 1.
This completes the proof of Theorem 2.

Remark 2.
In this model, the insurance risk exposure $\alpha^{\pi^*}(t)$ is not a deterministic function of time $t$. It is a stochastic process that depends on $r(t)$. $\alpha^{\pi^*}(t)$ also depends on the parameters of the insurer's surplus process, and $\alpha^{\pi^*}(t) > 0$ always holds.

Remark 3.
$\beta^{\pi^*}(t)$ also depends on $r(t)$. However, it does not depend on $L(t)$ or $X^{\pi^*}(t)$, which is in line with Björk et al. (2017) [16] and Li et al. (2012) [23], among others. Moreover, a simple computation shows that $\beta^{\pi^*}(t) > 0$ always holds here. $\gamma^{\pi^*}(t)$ depends on all of the parameters except $\delta$. Moreover, it depends on both $r(t)$ and the wealth process $X^{\pi^*}(t)$. In fact, the bond in the stochastic interest rate model plays a role similar to that of the risk-free asset in the constant interest rate model. Since, in the constant interest rate case, the amount invested in the risk-free asset always depends on the wealth process, it is natural that $\gamma^{\pi^*}(t)$ also depends on $X^{\pi^*}(t)$ here.

Remark 4.
We can see that, in our model, the efficient frontier depends on both the interest rate $r(t)$ and the current wealth $X^{\pi^*}(t)$. However, the efficient frontier in the mean-standard deviation plane is still a straight line for a given time. In Equation (72), is the market risk of the portfolios; it is related to interest rate risk and volatility risk.

Remark 5.
This model also includes several special cases, for example, the model with stochastic volatility and constant interest rate; see Li et al. (2012) [23]. If $\sigma_0 = 0$, then the insurance model reduces to the classical Cramér-Lundberg (C-L) model, and, when $\lambda = 0$, the insurance model can be seen as a Brownian motion approximation.

Remark 6.
When $\alpha^{\pi^*}(t) \equiv 1$, the problem becomes an investment-only problem. In this case, the equilibrium investment strategies for the stock and the bond are the same as in (48) and (49). However, the value function will decrease in this case.

Sensitivity Analysis
In this section, we present some numerical studies on the effects of model parameters on the time-consistent reinsurance-investment strategy. Throughout the numerical analysis, the parameters of the stochastic interest rate and stochastic volatility are taken from Escobar et al. (2017) [33], who calibrated the Heston-Vasicek model using real market data. The parameters of the insurance model are mainly taken from Zeng and Li (2013) [18]. Unless otherwise stated, the basic parameters are given by $a = 0.0125$, $b = 0.266$, $\sigma_r = 0.013$, Because the time-consistent strategy $\pi^*(t) = (\beta^{\pi^*}(t), \gamma^{\pi^*}(t), \alpha^{\pi^*}(t))$ depends on $r(t)$ and $X(t)$, it is not deterministic. We use 1000 Monte Carlo simulations to calculate the mean reinsurance and investment strategies. As can be seen from Figure 1, the equilibrium investments in both the bond and the stock first increase and then decrease. Compared with the stock investment, the bond investment decreases more sharply. The equilibrium insurance risk exposure increases steadily, which is in accordance with the case of a deterministic interest rate. First, we discuss the effects of two basic parameters, $\gamma$ and $r_0$, on the equilibrium reinsurance and investment strategy. By taking derivatives, we have $\frac{\partial \alpha^{\pi^*}(t)}{\partial \gamma} < 0$ and $\frac{\partial \beta^{\pi^*}(t)}{\partial \gamma} < 0$. This means that, as $\gamma$ increases, the equilibrium investment in the stock and the insurance risk exposure will decrease. This is natural, because the insurer will buy more reinsurance and less stock when becoming more risk averse. Figure 2 displays the effects of $\gamma$ and $r_0$ on the equilibrium strategy at time 0. We find that the bond investment increases as $\gamma$ increases, partly because the bond is a safer asset than the stock. Additionally, we have $\frac{\partial \alpha^{\pi^*}(0)}{\partial r_0} = -q_2(0)\alpha^{\pi^*}(0) < 0$ and $\frac{\partial \beta^{\pi^*}(0)}{\partial r_0} = -q_2(0)\beta^{\pi^*}(0) < 0$, which is shown in Figure 2. In Figure 2, we can also see that $\gamma^{\pi^*}(0)$ increases as $r_0$ increases. This means that, as $r_0$ increases, the insurer will prefer the risk-free asset and the bond to the stock.
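The Monte Carlo averaging step can be sketched as follows. The sketch only handles a strategy that depends on time and the short rate; the dependence on wealth $X(t)$ mentioned above is left out. It assumes Vasicek dynamics of the form $dr = (a - br)\,dt + \sigma_r\,dW_r$ and samples them with the exact Gaussian transition of that process. The function names, the horizon, the initial rate, and the strategy `toy_exposure` are hypothetical stand-ins and are not the paper's Equation (47).

```python
import math
import random


def vasicek_exact_step(r, a, b, sigma_r, dt, rng):
    """One exact-in-distribution step of dr = (a - b r) dt + sigma_r dW."""
    decay = math.exp(-b * dt)
    mean = r * decay + (a / b) * (1.0 - decay)
    std = sigma_r * math.sqrt((1.0 - decay * decay) / (2.0 * b))
    return mean + std * rng.gauss(0.0, 1.0)


def mean_strategy_path(strategy, r0, a, b, sigma_r,
                       T=1.0, n_steps=50, n_paths=1000, seed=0):
    """Average strategy(t, r(t)) over n_paths simulated interest rate paths."""
    rng = random.Random(seed)
    dt = T / n_steps
    totals = [0.0] * (n_steps + 1)
    for _ in range(n_paths):
        r = r0
        totals[0] += strategy(0.0, r)
        for k in range(1, n_steps + 1):
            r = vasicek_exact_step(r, a, b, sigma_r, dt, rng)
            totals[k] += strategy(k * dt, r)
    return [total / n_paths for total in totals]


def toy_exposure(t, r):
    """Hypothetical rate-dependent exposure, decreasing in r (illustration only)."""
    return math.exp(-0.5 * r)
```

Calling `mean_strategy_path(toy_exposure, 0.03, 0.0125, 0.266, 0.013)` returns the time grid of path-averaged values; replacing `toy_exposure` with the closed-form strategies would reproduce the kind of mean curves discussed in this section.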
Figure 2. (a) Effects of $\gamma$ on $\pi^*(0)$; (b) effects of $r_0$ on $\pi^*(0)$.
Second, we discuss how the insurance parameters affect the equilibrium reinsurance strategy. From the expression of $\alpha^{\pi^*}(t)$ in Equation (47), we have $\frac{\partial \alpha^{\pi^*}(t)}{\partial \mu_y} > 0$, $\frac{\partial \alpha^{\pi^*}(t)}{\partial \sigma_y} < 0$, $\frac{\partial \alpha^{\pi^*}(t)}{\partial \eta} > 0$, and $\frac{\partial \alpha^{\pi^*}(t)}{\partial \sigma_0} < 0$. Accordingly, the insurance risk exposure of the insurer becomes larger as $\mu_y$ becomes larger, because the insurer receives more premium income as $\mu_y$ increases. However, $\alpha^{\pi^*}(t)$ decreases when $\sigma_y$ or $\sigma_0$ increases. $\sigma_y$ and $\sigma_0$ are the volatilities of the insurance risk; when they become larger, the insurer relies more on reinsurance and transfers more risk to the reinsurer. $\eta$ is the safety loading of the reinsurer; when $\eta$ increases, reinsurance becomes more expensive for the insurer, so the insurer buys less reinsurance.
Figure 3. Effects of $\nu$ and $\sigma_L$ on $\beta^{\pi^*}(t)$: (a) effects of $\nu$; (b) effects of $\sigma_L$.
Figure 4 depicts the impact of the initial interest rate $r_0$ and of $\lambda_r$ on the efficient frontier. We can see that the efficient frontier moves upwards as $r_0$ increases. Insurers tend to invest more in the bond and less in the stock when $r_0$ increases, and the insurance risk exposure becomes smaller. In this case, the variance of the terminal wealth is smaller for the same expected terminal wealth. The efficient frontier also moves upwards as $\lambda_r$ becomes larger. This is in accordance with our intuition, because a larger $\lambda_r$ means more return per unit of risk.

Conclusions
In this paper, we investigate a mean-variance investment and reinsurance problem in a stochastic interest rate and stochastic volatility framework. The surplus process is modelled by a jump-diffusion process. The insurer can purchase reinsurance or acquire new business and can invest in a risk-free asset, a stock, and a bond. The dynamics of the interest rate are modelled by the Vasicek model, while the stock price follows the Heston model. We tackle the problem from a game-theoretic perspective in order to overcome time-inconsistency. First, we provide a verification theorem for the problem. Subsequently, we solve the extended HJB equations explicitly and obtain the equilibrium investment and reinsurance strategies. We find that the time-consistent investment in the bond depends on the wealth level, because the bond here plays a role similar to that of the risk-free asset in the constant interest rate case. We also find that the time-consistent investment in the stock and the insurance risk exposure depend on the interest rate but not on the level of stochastic volatility. Finally, we analyze the impact of model parameters on the time-consistent strategies and the efficient frontier.
Our model has several limitations. For example, the reinsurance proportion is allowed to be greater than one, whereas it is more realistic to limit the reinsurance proportion to $[0, 1]$. However, under this constraint we cannot obtain a closed-form solution, because the equilibrium reinsurance strategy depends on the stochastic interest rate. A numerical method might be used to solve this problem, and we leave it for future research. Our model can also be extended in the following ways: (1) We can consider time-consistent robust control based on our model, assuming that the insurer is ambiguity-averse. (2) We can consider an insurer with incomplete observations, given that financial market information usually cannot be fully obtained. (3) The stock model and the interest rate model can be extended to more sophisticated cases. However, these extensions may lead to new technical difficulties; we also leave them for future research.

Acknowledgments: The authors are thankful to the editor and the reviewers for their valuable comments and suggestions that helped to improve the quality of the paper.

Conflicts of Interest:
The authors declare that there is no conflict of interest regarding the publication of this paper.
So, by Definition 2, $\pi^*$ is the equilibrium strategy and $V(t, r, l, x)$ is the corresponding value function.