Dynamic optimal mean-variance investment with mispricing in the family of 4/2 stochastic volatility models

This paper considers an optimal investment problem with mispricing in the family of 4/2 stochastic volatility models under the mean-variance criterion. The financial market consists of a risk-free asset, a market index and a pair of mispriced stocks. By applying linear-quadratic stochastic control theory and solving the corresponding Hamilton-Jacobi-Bellman equation, explicit expressions for the statically optimal (pre-commitment) strategy and the corresponding optimal value function are derived. Moreover, a verification theorem is provided under an assumption on the model parameters and the investment horizon. Due to the time-inconsistency under the mean-variance criterion, we give a dynamic formulation of the problem and obtain the closed-form expression of the dynamically optimal (time-consistent) strategy. This strategy is shown to keep the wealth process strictly below the target (expected terminal wealth) before the terminal time. Results on the special case without mispricing are included. Finally, some numerical examples are given to illustrate the effects of model parameters on the efficient frontier and the difference between static and dynamic optimality.


Introduction
The development of continuous-time stochastic volatility models is deemed crucial in the field of modern finance. The attraction of stochastic volatility models mainly resides in their capacity to explain many stylized facts observed in the financial market, such as fat tails, the leverage effect and the volatility smile/skew on implied volatility surfaces. See, for example, Hull and White [1], Stein and Stein [2], Heston [3] and Lewis [4]. In 2017, Grasselli [5] proposed a new model called the 4/2 stochastic volatility model, which embraces the celebrated Heston model and the 3/2 model (Lewis [4]) as special cases. The superposition of these two parsimonious models makes it possible for the new 4/2 model to better predict the evolution of the implied volatility surface. This has led to emerging interest in applying Grasselli's work to derivative pricing problems, such as Cui et al. [6], Cui et al. [7] and Zhu and Wang [8]. In view of the success of the 4/2 model in option pricing, Cheng and Escobar-Anel [9] recently investigated a utility maximization problem under the 4/2 model. It seems, however, that little attention has been paid to portfolio optimization problems with the 4/2 model under the mean-variance criterion of Markowitz [10].
The single-period portfolio selection problem under the mean-variance criterion can be traced back to the seminal work of Markowitz [10]. Li and Ng [11] and Zhou and Li [12] generalized Markowitz's work to multi-period and continuous-time settings, respectively. In particular, Zhou and Li [12] applied standard results from linear-quadratic stochastic control theory, combined with an embedding technique, to solve the problem in a financial market where all the market coefficients are deterministic. Many researchers then realized the potential of diversification. For example, Shen et al. [13] solved the problem under the constant elasticity of variance model by imposing an exponential integrability condition on the market price of risk. Shen and Zeng [14] went a step further by considering the optimal investment-reinsurance problem for a mean-variance insurer in an incomplete market where the market price of risk depends on an affine-form square-root process, and they derived the modified locally square-integrable optimal strategy. Sun et al. [15] further extended the results of Shen and Zeng [14] to the case with multiple risky assets and random liabilities. For other previous works, one can refer to Chiu and Wong [16], Yu [17], Lv et al. [18], Tian et al. [19], Sun and Guo [20] and the references therein.
In the aforementioned literature, however, the optimal strategies depend on the initial position of the state variables, which is due to the non-separability of the variance operator under the mean-variance criterion in the sense of Bellman's optimality principle. In other words, once the investor arrives at any new position at a future time, the optimal strategy determined at the new position is inconsistent with the initial one unless the investor commits to the initial strategy over the whole investment period. This optimal strategy is therefore time-inconsistent, and is referred to as the pre-commitment strategy in the literature. The notion of time-inconsistency under the mean-variance paradigm stemmed from the work of Strotz [21]. In recent years, the time-inconsistency of the mean-variance portfolio selection problem has received considerable attention. For example, Basak and Chabakauri [22] determined a time-consistent strategy by using a backward recursion approach starting from the terminal time. Alternatively, Björk et al. [23] proposed the game theoretical approach and studied the subgame-perfect Nash equilibrium for the mean-variance problem. The equilibrium value function and the equilibrium strategy can be explicitly derived under Markovian settings by essentially solving an extended Hamilton-Jacobi-Bellman (HJB) equation. Rather than searching for the time-consistent equilibrium strategy, Pedersen and Peskir [24] pioneered the dynamically optimal approach to deal with the time-inconsistency of the statically optimal (pre-commitment) strategy. Works following this approach include Pedersen and Peskir [25], Zhang [26] and the references therein.
According to the law of one price, identical assets must have an identical price. There is, however, ample evidence of violations of the law of one price and of the prevalence of mispricing in the financial market. See, for example, Lamont and Thaler [27], Liu and Longstaff [28] and Liu and Timmermann [29]. This has led to growing interest in portfolio optimization problems with mispricing in recent years. Yi et al. [30] studied a utility maximization problem with model ambiguity and mispricing in a financial market consisting of a risk-free asset, a market index, and a pair of mispriced stocks with constant return rate and volatility. Ma et al. [31] considered a problem for a defined contribution plan with mispricing under the Heston model. Using the methodology developed by Björk et al. [23] to deal with the time-inconsistency under the mean-variance paradigm, Wang et al. [32] investigated a mean-variance investment-reinsurance problem with mispricing in the context of constant volatility. Other research on portfolio optimization problems with mispricing includes Gu, Viens and Yi [33], Gu, Viens and Yao [34] and Wang et al. [35], to name but a few.
Motivated by the above aspects, and working within the framework introduced by Pedersen and Peskir [24] to overcome the time-inconsistency under the mean-variance criterion, in this paper we study a mean-variance portfolio selection problem that takes into consideration the family of 4/2 stochastic volatility models and mispricing simultaneously. The financial market consists of a risk-free asset, a market index and a pair of mispriced stocks. To solve this problem, we first apply the Lagrange multiplier method to relate the original problem to an unconstrained optimization problem. To solve the latter by using the dynamic programming approach, we establish the corresponding HJB equation. By solving the HJB equation explicitly, closed-form expressions of the statically optimal strategy and the corresponding optimal value function are derived. Based on an assumption on the model parameters combined with the investment horizon, we prove the required verification theorem from scratch and verify the admissibility of the optimal strategy. By recomputing the statically optimal strategy at each time, the dynamically optimal strategy is explicitly derived. This time-consistent strategy keeps the wealth process strictly below the target (expected terminal wealth) before the terminal time. Moreover, we provide the results without mispricing and consider the special cases under the Heston and the 3/2 models. Finally, we present some numerical examples to illustrate the effects of some model parameters on the efficient frontier and the difference between static and dynamic optimality. In summary, compared with related research, the main contributions of this paper are as follows:

• The market model incorporates the 4/2 model and mispricing simultaneously;
• By making an assumption on the model parameters, a verification theorem is provided to guarantee that the candidate solution to the HJB equation is the optimal value function, and the admissibility of the optimal strategy is verified;
• We derive both the statically optimal (pre-commitment) and the dynamically optimal (time-consistent) strategies explicitly for the mean-variance problem.
The remainder of this paper is structured as follows. In Section 2, we formulate the market model and the mean-variance portfolio problem. Section 3 is devoted to solving the HJB equation and deriving the closed-form expression of the optimal investment strategy of the unconstrained problem. In Section 4, we present the statically optimal strategy and the dynamically optimal strategy for the mean-variance problem, and provide the results on some special cases. In Section 5, some numerical examples are given to illustrate our theoretical results. Section 6 concludes the paper.

Formulation of the Problem
Let $T > 0$ be a fixed terminal time of decision making and $(\Omega, \mathcal{F}, \mathbb{P})$ be a complete probability space carrying five one-dimensional, mutually independent standard Brownian motions $W_1, W_2, Z, Z_1, Z_2$. The probability space is further equipped with a right-continuous, $\mathbb{P}$-complete filtration $(\mathcal{F}_t)_{t\in[0,T]}$ generated by the Brownian motions.
We consider a financial market in which a risk-free asset, a market index, and a pair of stocks with mispricing can be continuously traded. The risk-free asset price $B = (B_t)_{t\in[t_0,T]}$ evolves over time as

$$dB_t = rB_t\,dt,$$

where the constant $r > 0$ is the risk-free interest rate. Let the price dynamics of the market index $S_m = (S_{m,t})_{t\in[t_0,T]}$ be governed by the 4/2 stochastic volatility model (Grasselli [5]):

$$\begin{cases} dS_{m,t} = S_{m,t}\Big[\big(r + \lambda(c_1V_t + c_2)\big)\,dt + \big(c_1\sqrt{V_t} + c_2/\sqrt{V_t}\big)\,dW_{1,t}\Big],\\[2pt] dV_t = \kappa(\theta_v - V_t)\,dt + \sigma_v\sqrt{V_t}\,\big(\rho\,dW_{1,t} + \sqrt{1-\rho^2}\,dW_{2,t}\big), \end{cases} \tag{1}$$

with $S_{m,t_0} = s_{m,0} \in \mathbb{R}_+$ and $V_{t_0} = v_0 \in \mathbb{R}_+$ at time $t_0 \in [0, T)$, where the constant $\lambda > 0$ controls the excess return, and the variance process $V_t$ follows a Cox-Ingersoll-Ross (CIR) process with mean-reversion speed $\kappa > 0$, long-term mean $\theta_v > 0$ and volatility of volatility $\sigma_v > 0$. The Feller condition $2\kappa\theta_v > \sigma_v^2$ is required so that $V_t$ is strictly positive. We assume that the two parameters $c_1$ and $c_2$ are non-negative constants and $\rho \in [-1, 1]$.
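To make the market model concrete, the following is a minimal simulation sketch, not the paper's numerical method. It assumes the standard 4/2 dynamics, i.e. index drift $r + \lambda(c_1V_t + c_2)$, index volatility $c_1\sqrt{V_t} + c_2/\sqrt{V_t}$, and a CIR variance driven by $\rho\,dW_1 + \sqrt{1-\rho^2}\,dW_2$. The function names, the truncation floor `v_floor`, and all parameter values are hypothetical choices for illustration only.

```python
import math
import random


def feller_ok(kappa, theta_v, sigma_v):
    """Feller condition 2*kappa*theta_v > sigma_v^2 (keeps the CIR variance positive)."""
    return 2.0 * kappa * theta_v > sigma_v ** 2


def simulate_four_two(s0, v0, r, lam, kappa, theta_v, sigma_v, rho,
                      c1, c2, horizon, n_steps, seed=0, v_floor=1e-6):
    """Euler scheme for V (floored at v_floor) and log-Euler scheme for S_m.

    The floor is a discretization device (an assumption of this sketch);
    the continuous-time process stays positive under the Feller condition.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    s_path, v_path = [s0], [v0]
    for _ in range(n_steps):
        s, v = s_path[-1], v_path[-1]
        z1 = rng.gauss(0.0, 1.0)  # increment driving W_1
        z2 = rng.gauss(0.0, 1.0)  # increment driving W_2
        vol = c1 * math.sqrt(v) + c2 / math.sqrt(v)   # 4/2 volatility
        drift = r + lam * (c1 * v + c2)               # excess return lam * (c1*V + c2)
        s_next = s * math.exp((drift - 0.5 * vol ** 2) * dt + vol * sqrt_dt * z1)
        dw_v = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        v_next = v + kappa * (theta_v - v) * dt + sigma_v * math.sqrt(v) * sqrt_dt * dw_v
        s_path.append(s_next)
        v_path.append(max(v_next, v_floor))
    return s_path, v_path


# Hypothetical parameter values, chosen only so that the Feller condition holds.
params = dict(r=0.02, lam=1.5, kappa=5.0, theta_v=0.04, sigma_v=0.2,
              rho=-0.5, c1=1.0, c2=0.005)
s_path, v_path = simulate_four_two(s0=1.0, v0=0.04, horizon=1.0, n_steps=252, **params)
```

Setting `c2 = 0` recovers a Heston-type index and `c1 = 0` a 3/2-type index, which mirrors the special cases mentioned in the text.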
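The two mispriced processes are modeled as a pair of stocks $S_1 = (S_{1,t})_{t\in[t_0,T]}$ and $S_2 = (S_{2,t})_{t\in[t_0,T]}$ which are coupled via the pricing error

$$M_t := \ln\frac{S_{1,t}}{S_{2,t}},$$

where $S_{1,t}$ and $S_{2,t}$ evolve according to the following system of stochastic differential equations (SDEs):

$$\begin{cases} \dfrac{dS_{1,t}}{S_{1,t}} = \big(r + \beta\lambda(c_1V_t + c_2)\big)\,dt + \beta\big(c_1\sqrt{V_t} + c_2/\sqrt{V_t}\big)\,dW_{1,t} + \sigma\,dZ_t + b\,dZ_{1,t} - l_1M_t\,dt,\\[6pt] \dfrac{dS_{2,t}}{S_{2,t}} = \big(r + \beta\lambda(c_1V_t + c_2)\big)\,dt + \beta\big(c_1\sqrt{V_t} + c_2/\sqrt{V_t}\big)\,dW_{1,t} + \sigma\,dZ_t + b\,dZ_{2,t} + l_2M_t\,dt, \end{cases} \tag{2}$$

with initial values $S_{1,t_0} = s_{1,0}$ and $S_{2,t_0} = s_{2,0}$ at time $t_0 \in [0, T)$, where $l_1$, $l_2$, $\beta$, $\sigma$ and $b$ are constant parameters. The term $\beta\big(c_1\sqrt{V_t} + c_2/\sqrt{V_t}\big)\,dW_{1,t}$ characterizes the systematic risk of the market, while $\sigma\,dZ_t + b\,dZ_{i,t}$ stands for the idiosyncratic risk of stock $i$, $i = 1, 2$. In particular, $\sigma\,dZ_t$ describes the common risk, whereas $b\,dZ_{i,t}$ represents the individual risk generated by stock $i$, $i = 1, 2$. The term $l_iM_t$ captures the effect of mispricing on the price of the $i$th stock via the pricing error $M_t$ defined above. Moreover, by Itô's formula, the pricing error $M_t$ follows an Ornstein-Uhlenbeck (OU) process:

$$dM_t = -(l_1 + l_2)M_t\,dt + b\,dZ_{1,t} - b\,dZ_{2,t}, \tag{3}$$

with $M_{t_0} = m_0 = \ln(s_{1,0}/s_{2,0}) \in \mathbb{R}$, where the two constant parameters $l_1$ and $l_2$ can be interpreted as liquidity terms which control the mean-reversion rate of the pricing error.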
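The Itô step behind the OU dynamics of the pricing error can be sketched as follows. The sketch assumes that both stocks carry identical loadings on $W_1$ and $Z$, differ only through the individual shocks $b\,dZ_{i,t}$, and have mispricing drifts $-l_1M_t$ and $+l_2M_t$ (a sign convention assumed here, in the spirit of Liu and Timmermann [29]).

```latex
\begin{align*}
d\ln S_{i,t} &= \frac{dS_{i,t}}{S_{i,t}}
  - \frac{1}{2}\Big[\beta^2\big(c_1\sqrt{V_t} + c_2/\sqrt{V_t}\big)^2 + \sigma^2 + b^2\Big]dt,
  \qquad i = 1, 2,\\
dM_t &= d\ln S_{1,t} - d\ln S_{2,t}
  = -(l_1 + l_2)\,M_t\,dt + b\,dZ_{1,t} - b\,dZ_{2,t}.
\end{align*}
```

Because the quadratic-variation corrections and all common terms coincide for the two stocks, they cancel in the difference, leaving only the mean-reverting drift and the two individual shocks.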
To be specific, lower liquidity slows the reversion of the pricing error towards its long-term mean of zero. Following some previous studies, such as Liu and Timmermann [29], Ma et al. [31] and Wang et al. [32], we assume that $l_1 + l_2 > 0$, which ensures the stability of the financial market.

Let $\pi_m(t, V_t, M_t, X_t^\pi)$, $\pi_1(t, V_t, M_t, X_t^\pi)$ and $\pi_2(t, V_t, M_t, X_t^\pi)$ be three Markov controls denoting the proportions of wealth invested in the market index $S_m$ and the pair of stocks $S_1$ and $S_2$ at time $t$, respectively. We write $\pi := (\pi_m, \pi_1, \pi_2)$; such deterministic functions $\pi_m$, $\pi_1$, $\pi_2$ are referred to as feedback control laws in the literature. Suppose that the market is frictionless and that no restrictions on leverage and short-selling are enforced. The investor constructs a self-financing portfolio of $B$, $S_m$, $S_1$ and $S_2$ over the investment period $[t_0, T]$, so the controlled wealth process $X^\pi = (X_t^\pi)_{t\in[t_0,T]}$ is described by the following SDE, coupled with (1) and (3):

$$\begin{aligned} dX_t^\pi ={}& X_t^\pi\big[r + u_m\lambda(c_1V_t + c_2) - \pi_1l_1M_t + \pi_2l_2M_t\big]\,dt + X_t^\pi u_m\big(c_1\sqrt{V_t} + c_2/\sqrt{V_t}\big)\,dW_{1,t}\\ &+ X_t^\pi(\pi_1 + \pi_2)\sigma\,dZ_t + X_t^\pi\pi_1b\,dZ_{1,t} + X_t^\pi\pi_2b\,dZ_{2,t}, \end{aligned} \tag{4}$$

with $X_{t_0}^\pi = x_0$, where we write $u_m := \pi_m + \beta(\pi_1 + \pi_2)$ to simplify our notation. Let $\mathbb{P}_{t_0,v_0,m_0,x_0}$ denote the probability measure with the initial value $(V_{t_0}, M_{t_0}, X_{t_0}^\pi) = (v_0, m_0, x_0)$ at time $t_0 \in [0, T)$. Accordingly, $\mathbb{E}_{t_0,v_0,m_0,x_0}[\cdot]$ and $\operatorname{Var}_{t_0,v_0,m_0,x_0}(\cdot)$ denote the associated expectation and variance under the probability measure $\mathbb{P}_{t_0,v_0,m_0,x_0}$, respectively.

Definition 1 (Admissible strategy). Given any fixed $t_0 \in [0, T)$, a strategy $\pi$ is said to be admissible if for any $(v_0, m_0,$

The set of all admissible strategies is denoted by $\mathcal{A}$.
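As a sanity check on the self-financing wealth dynamics, the sketch below computes the instantaneous drift and the Brownian loadings of $dX_t^\pi / X_t^\pi$ for given proportions $(\pi_m, \pi_1, \pi_2)$. It assumes the asset dynamics described above (excess return $\lambda(c_1V_t + c_2)$, 4/2 volatility, market loading $\beta$ on both stocks, mispricing drifts $-l_1M_t$ and $+l_2M_t$); the function name and the numerical values are hypothetical.

```python
import math


def wealth_coefficients(pi_m, pi_1, pi_2, v, m, r, lam, beta, sigma, b, l1, l2, c1, c2):
    """Return (drift, loadings) of dX/X under the assumed self-financing dynamics.

    loadings maps each driving Brownian motion to its coefficient in dX/X.
    """
    u_m = pi_m + beta * (pi_1 + pi_2)  # aggregate exposure to the market factor W_1
    drift = r + u_m * lam * (c1 * v + c2) - pi_1 * l1 * m + pi_2 * l2 * m
    loadings = {
        "W1": u_m * (c1 * math.sqrt(v) + c2 / math.sqrt(v)),  # systematic risk
        "Z": (pi_1 + pi_2) * sigma,                            # common stock risk
        "Z1": pi_1 * b,                                        # individual risk of stock 1
        "Z2": pi_2 * b,                                        # individual risk of stock 2
    }
    return drift, loadings


# Hypothetical market state and parameters.
market = dict(v=0.04, m=0.1, r=0.02, lam=1.5, beta=1.0, sigma=0.1, b=0.2,
              l1=0.2, l2=0.3, c1=1.0, c2=0.005)

# Risk-free strategy: wealth grows at rate r with no randomness.
rf_drift, rf_loadings = wealth_coefficients(0.0, 0.0, 0.0, **market)

# Market-neutral long-short pair (hedged so that u_m = 0 and the Z loading vanishes).
ls_drift, ls_loadings = wealth_coefficients(0.0, 1.0, -1.0, **market)
```

In the long-short case the drift reduces to $r - (l_1 + l_2)M_t$, so a position long stock 1 and short stock 2 earns a premium exactly when $M_t < 0$, i.e. when stock 1 is cheap relative to stock 2.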
The investor wishes to determine an admissible strategy π ∈ A solving the following mean-variance portfolio problem.

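Definition 2. The mean-variance portfolio problem is a stochastic optimization problem denoted by

$$\min_{\pi\in\mathcal{A}}\ \operatorname{Var}_{t_0,v_0,m_0,x_0}(X_T^\pi) \quad \text{subject to} \quad \mathbb{E}_{t_0,v_0,m_0,x_0}[X_T^\pi] = \xi, \tag{5}$$

where $\xi$ is a given constant serving as a target. We denote the corresponding optimal value function by $V_{MV}(t_0, v_0, m_0, x_0)$.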

Remark 2.
Here, we impose $\xi > x_0e^{r(T-t_0)}$, which precludes the trivial case in which the investor simply takes the risk-free strategy $\pi \equiv 0$ over the investment period $[t_0, T]$. This condition is consistent with some previous studies, such as Shen et al. [13], Sun and Guo [20] and Sun et al. [15].
As discussed in the Introduction, the mean-variance problem (5) is time-inconsistent due to the presence of the variance operator in the mean-variance objective. We take the dynamically optimal approach championed by Pedersen and Peskir [24] to address the problem of time-inconsistency. For the reader's convenience, we adapt the definition of dynamic optimality (Definition 2 in Pedersen and Peskir [24]) to the current context.

Definition 3 (Dynamic optimality).
A control $\pi^{d*}$ is said to be dynamically optimal in the mean-variance portfolio problem (5), for $(t_0, v_0, m_0, x_0)$ given and fixed, if for every given and fixed

Given the nature of the dynamically optimal approach, as discussed in the Introduction, we shall first pay attention to the static optimality (pre-commitment) for the mean-variance problem (5).

Due to the convexity of the objective function in problem (5), we can deal with the linear constraint $\mathbb{E}_{t_0,v_0,m_0,x_0}[X_T^\pi] = \xi$ by introducing a Lagrange multiplier $\theta \in \mathbb{R}$. The associated (dual) Lagrangian is formulated as follows:

$$L(t_0, v_0, m_0, x_0; \pi, \theta) := \mathbb{E}_{t_0,v_0,m_0,x_0}\big[(X_T^\pi - \xi)^2\big] + 2\theta\,\mathbb{E}_{t_0,v_0,m_0,x_0}\big[X_T^\pi - \xi\big]. \tag{6}$$

According to the Lagrangian duality theorem (Luenberger [36]), the mean-variance problem (5) is, in fact, equivalent to the following min-max stochastic optimization problem:

$$\max_{\theta\in\mathbb{R}}\ \min_{\pi\in\mathcal{A}}\ L(t_0, v_0, m_0, x_0; \pi, \theta). \tag{7}$$

This suggests that two steps are involved in obtaining the static optimality of the mean-variance problem (5). First, we solve the internal unconstrained stochastic optimization problem with respect to $\pi \in \mathcal{A}$, with $\theta \in \mathbb{R}$ fixed and given. Subsequently, we optimize the Lagrange multiplier $\theta \in \mathbb{R}$ in the external static problem. Hence, we first determine the optimal strategy of the following quadratic-loss minimization problem:

$$\min_{\pi\in\mathcal{A}}\ \mathbb{E}_{t_0,v_0,m_0,x_0}\big[(X_T^\pi - \gamma)^2\big], \tag{8}$$

with $\gamma = \xi - \theta$ fixed and given.
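The link between the Lagrangian and the quadratic-loss problem with $\gamma = \xi - \theta$ is a completion of the square. The worked identity below is a sketch that assumes the multiplier enters as $2\theta\,\mathbb{E}[X_T^\pi - \xi]$, a normalization consistent with the stated choice $\gamma = \xi - \theta$; expectations are taken under $\mathbb{P}_{t_0,v_0,m_0,x_0}$.

```latex
\begin{align*}
\mathbb{E}\big[(X_T^\pi - \xi)^2\big] + 2\theta\,\mathbb{E}\big[X_T^\pi - \xi\big]
  &= \mathbb{E}\big[(X_T^\pi - \xi)^2 + 2\theta (X_T^\pi - \xi) + \theta^2\big] - \theta^2\\
  &= \mathbb{E}\big[(X_T^\pi - \xi + \theta)^2\big] - \theta^2\\
  &= \mathbb{E}\big[(X_T^\pi - \gamma)^2\big] - \theta^2, \qquad \gamma = \xi - \theta.
\end{align*}
```

For fixed $\theta$, the term $-\theta^2$ does not depend on $\pi$, so minimizing the Lagrangian over $\pi \in \mathcal{A}$ is the same as minimizing the quadratic loss $\mathbb{E}[(X_T^\pi - \gamma)^2]$.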

Solution to the Unconstrained Problem
In this section, we solve the unconstrained quadratic-loss minimization problem (8) by using the dynamic programming approach. For this, we first define the optimal value function as

$$H(t, x, v, m) := \inf_{\pi\in\mathcal{A}}\ \mathbb{E}_{t,v,m,x}\big[(X_T^\pi - \gamma)^2\big], \tag{9}$$

which must satisfy the following HJB equation due to the dynamic programming principle:

$$\inf_{\pi\in\mathcal{A}}\ \mathcal{D}^{\pi}H(t, x, v, m) = 0, \qquad H(T, x, v, m) = (x - \gamma)^2, \tag{10}$$

where $\mathcal{D}^{\pi}H(t, x, v, m)$, $\pi \in \mathcal{A}$, denotes the infinitesimal generator of $(X^\pi, V, M)$, including the time derivative, applied to $H$. The first-order minimization condition then yields the optimal control (11). Inserting (11) into the HJB equation (10) and simplifying the expression, we obtain the second-order partial differential equation (PDE) (12) for the function $H$. In the next proposition, we construct an explicit solution, denoted by $G(t, x, v, m)$, to PDE (12).

Proposition 1. One solution to the second-order PDE (12) is given by (13), and the optimal feedback control is given by (14), where the functions $\alpha(t)$, $\beta(t)$ and $\gamma(t)$ are given by (15), (16) and (18), respectively.

Proof. We propose a candidate solution to the second-order PDE (12) of exponential-quadratic form, with $\alpha(T) = \beta(T) = \gamma(T) = 0$ and $a(T) = \gamma$. Computing its partial derivatives (19), substituting (19) into (12) and rearranging terms yields the two identities (20) and (21). Using the boundary condition $a(T) = \gamma$, we obtain the expression of $a(t)$ by solving (20). As for (21), we can separate it with respect to the variables $v$ and $m^2$, which gives (22). Thus, since $v \in \mathbb{R}_+$ and $m \in \mathbb{R}$ are arbitrary, (22) yields the system of ordinary differential equations (ODEs) (23)-(25). Both (23) and (24) are Riccati ODEs, and once these two equations are solved, the explicit expression of the solution $\alpha(t)$ to (25) can be immediately derived.
If $\Delta = 0$, then (27) can be simplified to an expression involving $n_0$, where $n_0$ is given in (17) above. Integrating both sides of (28) with respect to time $t$ and using the boundary condition $\beta(T) = 0$, we obtain the corresponding expression of $\beta(t)$ in (16).

If $\Delta < 0$, then (23) can be reformulated accordingly. After calculations using the boundary condition $\beta(T) = 0$, we find the corresponding expression of $\beta(t)$ in (16).

We then turn to the ODE (24) for $\gamma(t)$. Rearranging the terms in (24) and, after some calculations, using the boundary condition $\gamma(T) = 0$, we obtain (18). Finally, a direct integration of both sides of (25), using the boundary condition $\alpha(T) = 0$, yields (15).
The following proposition presents strict monotonicity results for $\beta(t)$ and $\gamma(t)$ with respect to time $t$, which in turn lead to the non-positiveness of $\beta(t)$ and $\gamma(t)$ over $[t_0, T]$.

Proposition 2. The functions $\beta(t)$ and $\gamma(t)$ given by (16) and (18), respectively, are strictly increasing with respect to time $t$, and thus non-positive over $[t_0, T]$.

Proof. Differentiating $\beta(t)$ given in (16) with respect to $t$, we find that $d\beta(t)/dt > 0$ clearly holds for the first three cases. As for the fourth and the fifth cases, note that when $\Delta \le 0$, we must have $\rho^2 > \frac{1}{2}$. Similarly, a direct differentiation of $\gamma(t)$ given in (18) shows that $\gamma(t)$ is strictly increasing. Finally, using the boundary condition $\beta(T) = \gamma(T) = 0$, we conclude that $\beta(t)$ and $\gamma(t)$ are non-positive over $[t_0, T]$.
To facilitate further discussions, we now present some auxiliary results on the OU process and the CIR process in the literature.The first lemma (Lemma 1) is adapted from Lemma 4.3 in Benth and Karlsen [37].
, then we have

The second lemma (Lemma 2) follows from Theorem 5.1 in Zeng and Taksar [38].

Lemma 2. Consider the CIR process $V_t$ in (1). We have

Inspired by the above results, throughout the rest of the paper, we impose the following assumption on the model parameters and the investment horizon $[t_0, T]$:

Assumption 1. The model parameters and the investment horizon $[t_0, T]$ satisfy:

Remark 3. It follows from Proposition 2 above that as $t_0 \to T$, we have $C_b \to (1128 + 96\sqrt{138})\lambda^2$, which indicates the feasibility of the assumption on $C_b$. As for the assumption on $C_\gamma$, it is straightforward to see that $(l_1 + l_2)/(4b^2(T - t_0))$ and $C_\gamma$ are decreasing and increasing with respect to $T$, respectively. This means that when the investment horizon $T - t_0$ is small enough, the assumption on $C_\gamma$ is satisfied as well.
We next define four Doléans-Dade exponential processes $\Pi_{0,t}$, $\Pi_{1,t}$, $\Pi_{2,t}$ and $\Pi_{3,t}$ as follows:

We shall study the integrability of $\Pi_{0,t}$, $\Pi_{1,t}$, $\Pi_{2,t}$ and $\Pi_{3,t}$, which will be used in the proof of Theorem 1 below.

Lemma 3. Suppose that Assumption 1 holds. Then, $\Pi_{0,t}$, $\Pi_{1,t}$, $\Pi_{2,t}$ and $\Pi_{3,t}$ satisfy $\mathbb{E}_{t_0,v_0,m_0,x_0}\sup$

Proof. Let $p > 1$ be any given constant. Then, the equation

$$p = \frac{k}{2\sqrt{k} - 1}$$

in $k$ admits two positive roots,

$$k_{1} = \big(p + \sqrt{p^2 - p}\big)^2, \qquad k_{2} = \big(p - \sqrt{p^2 - p}\big)^2,$$

with the first root satisfying $k_1 > 1$. In particular, when $p = 24$, we have $k_1 = 1128 + 96\sqrt{138}$. From Assumption 1, we have

According to Theorem 15.4.6 in Cohen and Elliott [39], we then find that $\Pi_0$ satisfies

By applying the same technique to $\Pi_1$, $\Pi_2$ and $\Pi_3$, it is straightforward to obtain (31) due to Assumption 1, so we omit the details here.
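The constant $1128 + 96\sqrt{138}$ quoted for $p = 24$ can be checked numerically. The sketch below reads the relation between $p$ and $k$ as $p = k/(2\sqrt{k} - 1)$, an interpretation of the printed equation that is supported by the stated value of $k_1$. Substituting $y = \sqrt{k}$ gives $y^2 - 2py + p = 0$, hence $y = p \pm \sqrt{p^2 - p}$. The function name is a hypothetical helper.

```python
import math


def integrability_roots(p):
    """Roots k1 >= k2 > 0 of p = k / (2*sqrt(k) - 1), assuming p > 1.

    With y = sqrt(k) the equation becomes y^2 - 2*p*y + p = 0,
    so y = p +/- sqrt(p^2 - p) and k = y^2.
    """
    if p <= 1.0:
        raise ValueError("p must exceed 1")
    disc = math.sqrt(p * p - p)
    return (p + disc) ** 2, (p - disc) ** 2


k1, k2 = integrability_roots(24.0)
closed_form = 1128.0 + 96.0 * math.sqrt(138.0)  # value quoted in the text for p = 24
```

Here `k1` agrees with `closed_form` up to floating-point error, and both roots satisfy the original relation when substituted back.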
To end this section, we shall prove a verification theorem from scratch which guarantees that the candidate solution G(t, x, v, m) derived in (13) coincides with the optimal value function H(t, x, v, m) defined in (9) to the quadratic-loss minimization problem (8).Furthermore, we will also prove the admissibility of the optimal strategy obtained in (14) in the sense of Definition 1.
Proof. We complete the proof in two steps. In Step 1, we show that the optimal strategy $\pi^* = (u_m^*, \pi_1^*, \pi_2^*)$ given in (14) is admissible. In Step 2, we verify that the candidate solution $G$ given in (13) is indeed the optimal value function $H$ defined in (9).
Step 1. Substituting the optimal strategy (14) into the controlled wealth process (4) leads to the closed-loop dynamics of $X_t^*$ with $X_{t_0}^* = x_0$. Applying Itô's lemma to $Y_t := e^{r(T-t)}X_t^* - \gamma$, we obtain a linear SDE with $Y_{t_0} = x_0e^{r(T-t_0)} - \gamma$. By explicitly solving the linear SDE of $Y_t$, we obtain a closed-form expression in terms of $\Pi_{0,t}$, $\Pi_{1,t}$, $\Pi_{2,t}$ and $\Pi_{3,t}$ defined in (30) above. This in turn gives the optimal controlled wealth process $X_t^*$ in (32).

We now proceed to show that the optimal strategy $\pi^* = (u_m^*, \pi_1^*, \pi_2^*)$ given in (14) is admissible. To this end, we first show that the moment bound (34) holds. Indeed, from the expression of $X_t^*$ given in (32), the relevant moment is bounded by a constant multiple of the expectation of

$$\Pi_{0,t}^{24} + \Pi_{1,t}^{24} + \Pi_{2,t}^{24} + \Pi_{3,t}^{24},$$

which is finite,
where the positive constant $K$ might differ between lines, the second inequality makes use of Jensen's inequality and the non-positiveness of the functions $\beta(t)$ and $\gamma(t)$ from Proposition 2, and the last strict inequality is due to Assumption 1 on $C_b$ and Lemma 1. By Jensen's inequality, this in turn establishes Condition 3 in Definition 1.

Next, we show that Condition 1 in Definition 1 is satisfied. Indeed, in view of the expression of $u_m^*$ given in (14), we obtain the required bound, where $K$ is a positive constant, and the last strict inequality follows from (34) as well as the fact that the CIR process $V_t$ has a finite second moment at time $t \in [t_0, T]$, which is continuous in time $t$ (see, for example, Cox et al. [40]).

Recalling that $M_t$ given in (3) is an OU process, we can write the solution explicitly:

$$M_t = m_0e^{-(l_1+l_2)(t-t_0)} + \sqrt{2}\,b\int_{t_0}^{t} e^{-(l_1+l_2)(t-s)}\,dZ_{3,s},$$

where $Z_3 := (Z_1 - Z_2)/\sqrt{2}$ is a $\mathbb{P}_{t_0,v_0,m_0,x_0}$ Brownian motion due to Lévy's characterization of Brownian motion. Then, upon noticing that $\int_{t_0}^{t} e^{-(l_1+l_2)(t-s)}\,dZ_{3,s}$ is normally distributed with mean zero and variance $\int_{t_0}^{t} e^{-2(l_1+l_2)(t-s)}\,ds$, we obtain the required moment bound on $M_t$, where $K > 0$ is a positive constant. Therefore, in view of the expressions of $\pi_1^*$ and $\pi_2^*$ given in (14), we find that Condition 2 in Definition 1 holds as well, where $K$ is a positive constant; the same technique gives the analogous bound for $\pi_2^*$.

The above results show that the optimal strategy $\pi^*$ given in (14) belongs to $\mathcal{A}$, which completes the first part of the proof.
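The variance of the stochastic integral used in Step 1 has the closed form $\int_{t_0}^{t} e^{-2(l_1+l_2)(t-s)}\,ds = \big(1 - e^{-2(l_1+l_2)(t-t_0)}\big)/\big(2(l_1+l_2)\big)$. The sketch below checks this against a midpoint-rule quadrature and, assuming the pricing error is driven by $\sqrt{2}\,b\,dZ_3$ with $Z_3 = (Z_1 - Z_2)/\sqrt{2}$, reports the implied variance of $M_t$. All function names and parameter values are hypothetical.

```python
import math


def integral_variance_closed_form(l_sum, elapsed):
    """Closed form of int_0^elapsed exp(-2 * l_sum * (elapsed - s)) ds, for l_sum > 0."""
    return (1.0 - math.exp(-2.0 * l_sum * elapsed)) / (2.0 * l_sum)


def integral_variance_midpoint(l_sum, elapsed, n=10_000):
    """Midpoint-rule approximation of the same integral, used as a cross-check."""
    h = elapsed / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += math.exp(-2.0 * l_sum * (elapsed - s))
    return total * h


def pricing_error_variance(b, l_sum, elapsed):
    """Var(M_t) under the assumed OU form: 2 * b^2 times the integral above."""
    return 2.0 * b ** 2 * integral_variance_closed_form(l_sum, elapsed)


l_sum, elapsed, b = 0.5, 1.0, 0.2  # hypothetical values with l1 + l2 = 0.5
exact = integral_variance_closed_form(l_sum, elapsed)
approx = integral_variance_midpoint(l_sum, elapsed)
var_m = pricing_error_variance(b, l_sum, elapsed)
```

As the elapsed time grows, `pricing_error_variance` approaches the stationary level $b^2/(l_1 + l_2)$, which is finite precisely because $l_1 + l_2 > 0$.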
Step 2. Applying Itô's lemma to the candidate solution $G$ given in (13) of the HJB equation (10), for any admissible strategy $\pi \in \mathcal{A}$, we obtain the decomposition (35). Due to the pathwise continuity of $X^\pi$, $\pi_1$, $\pi_2$, $u_m$, $V$, $G_x$ and $G_m$, all the stochastic integrals on the right-hand side of (35) are continuous local martingales under the measure $\mathbb{P}_{t_0,v_0,m_0,x_0}$. Then, there exists a sequence of stopping times localizing all the local martingales (see, for example, page 76 in Le Gall [41]). We denote the associated localizing sequence by $(\tau_n)_{n\ge1}$, with $\tau_n \to \infty$ $\mathbb{P}_{t_0,v_0,m_0,x_0}$-almost surely as $n \to \infty$. Similar to the preceding definition of the probability measure $\mathbb{P}_{t_0,v_0,m_0,x_0}$, we let $\mathbb{P}_{t,v,m,x}$ denote the probability measure with initial data $(V_t, M_t, X_t^\pi) = (v, m, x)$ given and fixed at time $t \in [t_0, T)$. Thus, integrating both sides of (35) from $t$ to $T \wedge \tau_n$ and taking expectations leads to (36).

From the expression of the candidate function $G$ given in (13), we find an upper bound on $G$, where $K$ is a positive constant independent of $V$ and $M^2$, and the inequality makes use of the non-positiveness of the functions $\beta(t)$ and $\gamma(t)$ over $[t_0, T]$ from Proposition 2. On the one hand, we notice that $\big(X_{T\wedge\tau_n}^\pi - \gamma e^{-r(T - T\wedge\tau_n)}\big)^2$ is $\mathbb{P}_{t,v,m,x}$-integrable for any admissible strategy $\pi \in \mathcal{A}$. On the other hand, since the candidate function $G$ given in (13) satisfies the HJB equation (10), we must have $\mathcal{D}^{\pi}G(t', X_{t'}^\pi, V_{t'}, M_{t'}) \ge 0$, $\mathbb{P}_{t,v,m,x}$-almost surely, for all $t' \in [t, T]$. Hence, passing to the limit in (36) and applying Lebesgue's dominated convergence theorem to the left-hand side and the monotone convergence theorem to the right-hand side of (36), respectively, we obtain an inequality which implies that, for any admissible strategy $\pi \in \mathcal{A}$, $G$ does not exceed the expected quadratic loss, with any $(t, x, v, m) \in [t_0, T] \times \mathbb{R} \times \mathbb{R}_+ \times \mathbb{R}$ fixed and given.

Meanwhile, from Proposition 1 above, we know that equality holds for the admissible strategy $\pi^* = (u_m^*, \pi_1^*, \pi_2^*) \in \mathcal{A}$ given by (14). Combining these two results, we conclude that the candidate solution $G$ coincides with the optimal value function $H$, i.e., $G(t, x, v, m) = H(t, x, v, m)$ for any $(t, x, v, m) \in [t_0, T] \times \mathbb{R} \times \mathbb{R}_+ \times \mathbb{R}$ fixed and given. In particular, the optimal value function of the quadratic-loss minimization problem (8) is given by (33).

Static and Dynamic Optimality of the Problem
In this section, we derive the statically optimal strategy and the dynamically optimal strategy of the mean-variance portfolio problem (5) by utilizing the preceding results. In view of (6) and (7) above, we now only need to solve the static optimization problem (39) with respect to the Lagrange multiplier $\theta \in \mathbb{R}$ to obtain the static optimality and the corresponding optimal value function for the mean-variance problem (5). Reformulating (39) as a quadratic functional over $\theta \in \mathbb{R}$, we find that the optimal value function of the mean-variance problem (5) can be obtained from (40) if the coefficient of the quadratic term is strictly negative. Indeed, since $\pi^*$ given in (14) is the unique optimal strategy for the quadratic-loss minimization problem (8), it achieves a strictly smaller quadratic loss than $\bar{\pi} := (\bar{\pi}_m, \bar{\pi}_1, \bar{\pi}_2) = (0, 0, 0)$, which stands for the risk-free strategy over the period $[t_0, T]$. This implies that the quadratic coefficient of $\theta$ in (40) is strictly negative, as desired. Therefore, the maximum of the right-hand side of (40) is uniquely attained at $\theta^*$ given by (41).

Theorem 2. Suppose that Assumption 1 holds. For any initial data $(t_0, v_0, m_0, x_0) \in [0, T) \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R}$ given and fixed such that $x_0 < e^{-r(T-t_0)}\xi$, the statically optimal strategy of the mean-variance portfolio problem (5) is given by (42) for $t \in [t_0, T]$, and the corresponding optimal value function is given by (43), where $\alpha(t)$, $\beta(t)$ and $\gamma(t)$ are given in (15), (16) and (18), respectively, and $\theta^*$ is given by (41). The controlled wealth process $X_t^*$ is given by (44), where the processes $\Pi_{0,t}$, $\Pi_{1,t}$, $\Pi_{2,t}$ and $\Pi_{3,t}$ are given in (30). Moreover, the statically optimal strategy given by (42) is admissible, i.e., $\pi^* = (\pi_m^*, \pi_1^*, \pi_2^*) \in \mathcal{A}$.

Replacing $\gamma$ in (14) and (32) with $\xi - \theta^*$ gives the statically optimal strategy (42) and the statically optimal controlled wealth process (44), respectively. Following the proof of Theorem 1 above, it is easy to see that the statically optimal strategy satisfies $\pi^* \in \mathcal{A}$.

Corollary 1 (No mispricing under the 4/2 model). Suppose that Assumption 1 holds. For any initial data $(t_0, v_0, x_0) \in [0, T) \times \mathbb{R}_+ \times \mathbb{R}$ given and fixed such that $x_0 < e^{-r(T-t_0)}\xi$, the statically optimal strategy of the mean-variance portfolio problem (5) without mispricing is given by (45) for $t \in [t_0, T]$. The corresponding optimal value function is expressed in terms of $\beta(t)$ given in (16), a function $\bar{\alpha}(t)$ and the corresponding multiplier $\theta^*$. The controlled wealth process $X_t^*$ is expressed in terms of $\Pi_{0,t}$ given in (30). Moreover, the optimal strategy given in (45) is admissible, i.e., $\pi_m^* \in \mathcal{A}$.
As discussed in Section 2, the statically optimal strategy $\pi^* = (\pi_m^*, \pi_1^*, \pi_2^*)$ in Theorem 2 relies on the initial position of the state variables $(t_0, v_0, m_0, x_0)$. We now proceed to derive the dynamically optimal strategy under the framework developed by Pedersen and Peskir [24].

Theorem 3. Suppose that Assumption 1 holds. For any initial data $(t_0, v_0, m_0, x_0) \in [0, T) \times \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R}$ given and fixed such that $x_0 < e^{-r(T-t_0)}\xi$, the dynamically optimal strategy of the mean-variance portfolio problem (5) is given by (54).

Then, set $w = \pi^*$ under the measure $\mathbb{P}_{t,v,m,x}$. Replacing $(t_0, v_0, m_0, x_0)$ with $(t, v, m, x)$ in (42), we see from (42) that $\pi^*(t, v, m, x) = \pi^{d*}(t, v, m, x)$, and thus $w(t, v, m, x) = \pi^*(t, v, m, x) = \pi^{d*}(t, v, m, x) \ne u(t, v, m, x)$ for any $t \in [0, T)$. Due to the continuity of the functions $u$ and $w$, there exists a ball $B_\varepsilon$ around $(t, v, m, x)$ such that $w(\tilde{t}, \tilde{v}, \tilde{m}, \tilde{x}) \ne u(\tilde{t}, \tilde{v}, \tilde{m}, \tilde{x})$ for $(\tilde{t}, \tilde{v}, \tilde{m}, \tilde{x}) \in B_\varepsilon$ when $\varepsilon > 0$ is small enough such that $t + \varepsilon \le T$. Therefore, since $w = \pi^*$ is the unique continuous function such that the infimum within the HJB equation (10) is attained for any $(t, v, m, x)$, we can set the exit time $\tau_\varepsilon = \inf\{t' \wedge T \mid (t', V_{t'}, M_{t'}, X_{t'}^u) \notin B_\varepsilon\}$ such that for $t' \le \tau_\varepsilon$ the generator applied along $u$ is bounded below by $\zeta$, where $\zeta$ is a fixed positive constant. Replacing $\gamma$ by $\xi - \theta^*$ in the boundary condition of the HJB equation (10), with $\theta^*$ given by (41), we obtain the estimate (55), where the strict inequality follows from the fact that $\tau_\varepsilon > t$, since the triple $(V, M, X^u)$ has continuous sample paths with probability one under the measure $\mathbb{P}_{t,v,m,x}$. From (55), we then obtain a strict inequality between the variances of the terminal wealth under $w$ and under $u$. This shows that the candidate (54) is the dynamically optimal strategy for the mean-variance portfolio problem (5).

Substitute $\pi^{d*}$ into (4) and denote the corresponding wealth process by $X_t^{d*}$. Applying Itô's lemma to $Y_t := e^{r(T-t)}X_t^{d*} - \xi$ yields the linear SDE (56), where the auxiliary function $f$ is defined in (53). Solving this linear SDE (56) of $Y_t$ explicitly, we obtain the closed-form expression of $X_t^{d*}$ given in (51). Moreover, it is easy to see that the initial value $Y_{t_0} = x_0e^{r(T-t_0)} - \xi < 0$ leads to $X_t^{d*}e^{r(T-t)} < \xi$ for $t \in [t_0, T)$.
Proof. The results follow from Corollary 1 and Theorem 3 directly.

Remark 5. If we specify $(c_1, c_2) = (1, 0)$ in Corollary 2, then we have the dynamically optimal strategy under the Heston model without mispricing; if we choose $(c_1, c_2) = (0, 1)$ instead, then the results in Corollary 2 correspond to the ones under the 3/2 model without mispricing.
Figures 1 and 2 below display the effects of $r$ and $\lambda$ on the efficient frontier, respectively. When the interest rate $r$ increases, the investor can obtain more expected return by investing in the risk-free asset, and thus needs to undertake less risk. Meanwhile, from the economic interpretation of $\lambda$, the investor obtains a higher risk premium for $W_1$ as $\lambda$ increases. This leads to a lower value of $\operatorname{Var}_{t_0,v_0,m_0,x_0}(X_T^*)$ when the same $\mathbb{E}_{t_0,v_0,m_0,x_0}[X_T^*]$ is targeted.

Figure 3 shows the evolution of the efficient frontier with respect to $l_1$. When we vary $l_1$ from 0.1 to 0.5, the efficient frontier moves downwards. One possible explanation is that, since $l_1$ partially characterizes the liquidity term, as $l_1$ increases the pricing error $M_t$ in (3) reverts faster towards its long-term mean of zero, so that the investor bears less risk arising from the pricing error between $S_1$ and $S_2$.

We finally give a simulation experiment to illustrate the difference between the dynamics of $X^*$ and $X^{d*}$. As shown in Figure 4 below, the two optimal wealth processes have significantly different trajectories even though they use the same random numbers. In particular, we observe that the dynamically optimal wealth process $X^{d*}$ is strictly below the expected terminal wealth $\xi = 3$ when $t < T = 1$ in this case, which is consistent with the conclusion derived in Theorem 3 above.

Conclusions
In this paper, we consider an optimal investment problem with mispricing under Markowitz's mean-variance criterion in the family of 4/2 stochastic volatility models (Grasselli [5]), which embraces the 3/2 and the Heston models as special cases.

By applying the dynamic programming approach and establishing the corresponding HJB equation, we derive the closed-form expressions of the statically optimal (pre-commitment) strategy and the optimal value function. A verification theorem is further provided from scratch to ensure that the candidate solution to the HJB equation coincides with the optimal value function and that the optimal strategy is admissible. By recomputing the statically optimal strategy in an infinitesimally small period of time, we explicitly obtain the dynamically optimal (time-consistent) strategy (Pedersen and Peskir [24]). Moreover, some results on special cases, such as the case without mispricing and the cases under the 3/2 and Heston models, are included. Finally, some numerical examples are presented to illustrate our results. To the best of our knowledge, there is no existing literature on the mean-variance problem that takes the 4/2 stochastic volatility model and mispricing into consideration simultaneously.

Based on our current work, several topics may be pursued in future research. For example, one may incorporate a stochastic interest rate into the model. One may also introduce random liabilities into the mean-variance problem.

Figure 1. Effects of r on the efficient frontier.

Figure 2. Effects of λ on the efficient frontier.

Figure 3. Effects of l_1 on the efficient frontier.

Figure 4. Trajectories of static and dynamic optimality.