A Robust Approximation of the Schur Complement Preconditioner for an Efﬁcient Numerical Solution of the Elliptic Optimal Control Problems

: In this paper, we consider the numerical solution of the optimal control problems of the elliptic partial differential equation. Numerically tackling these problems using the ﬁnite element method produces a large block coupled algebraic system of equations of saddle point form. These systems are of large dimension, block, sparse, indeﬁnite and ill conditioned. The solution of such systems is a major computational task and poses a greater challenge for iterative techniques. Thus they require specialised methods which involve some preconditioning strategies. The preconditioned solvers must have nice convergence properties independent of the changes in discretisation and problem parameters. Most well known preconditioned solvers converge independently of mesh size but not for the decreasing regularisation parameter. This work proposes and extends the work for the formulation of preconditioners which results in the optimal performances of the iterative solvers independent of both the decreasing mesh size and the regulation parameter. In this paper we solve the indeﬁnite system using the preconditioned minimum residual method. The main task in this work was to analyse the 3 × 3 block diagonal preconditioner that is based on the approximation of the Schur complement form obtained from the matrix system. The eigenvalue distribution of both the proposed Schur complement approximate and the preconditioned system will be investigated since the clustering of eigenvalues points to the effectiveness of the preconditioner in accelerating an iterative solver. This is done in order to create fast, efﬁcient solvers for such problems. Numerical experiments demonstrate the effectiveness and performance of the proposed approximation compared to the other approximations and demonstrate that it can be used in practice. The numerical experiments conﬁrm the effectiveness of the proposed preconditioner. The solver used is robust and optimal with respect to the changes in both mesh size and the regularisation parameter.


Introduction
The partial differential equation constrained optimisation problems have over the years become an active research area that has arisen in many application areas that include finance, medicine, modern science and engineering. We refer to [1,2] on their theoretical and also to [3][4][5] on their numerical developments. These problems are numerically and analytically challenging to solve.
The most clear challenge is that the resulting linear algebraic system is very large such that the only viable way is to get the solution iteratively using specialised methods. The formulation of the linear optimisation problem involves the objective function to minimise subject to the constraints defined by the underlying modelling partial differential equation in a bounded domain Ω ⊂ R 2 with boundary ∂Ω. Consider the following elliptic distributed PDE-optimal control problem min (u,y) J(y, u) : subject to the constraints −∆y = f + u in Ω y = g on ∂Ω (2) with y the state variable, y d the desired state known over the domainΩ and u the control variable on the right hand side. The parameter δ is called the regularisation parameter which measures the cost of the control and is supplied and positive. The size of the regularisation parameter is important in the performances of the iterative solvers. The optimal value of the regularisation parameter δ is 10 −2 ; see [6,7]. The performance of iterative solvers with the decreasing parameter is the central theme of this study. The solution of the state variable y must satisfy the PDE over Ω and must be as close as possible to the desired state y d so that the objective function can be minimised. The optimal control problem Equations (1) and (2) has a unique solution (y, u) characterised by the following optimality system called the Karush-Kuhn-Tucker (KKT) system [8,9]. The first order optimality system of the PDE-optimal control problems consists of a state equation, an adjoint equation and the control equation which is a saddle point problem as given below −∆p = y − y d , in Ω p = 0 on ∂Ω adjoint equation (3) −∆y = f + u, in Ω y = g on ∂Ω state equation (4) −δu = p in Ω control equation (5) The optimality system is achieved through the Lagrange multiplier method which partitions the model problem into three equations, namely, the state y, control u and the adjoint, p which form the saddle point problem. For the numerical solution of the elliptic optimal control problem we apply the finite element method to the system ((3)-(5)) to get the linear saddle point problem. The finite element method is the most popular technique for the numerical solution of the PDE-constrained optimisation problems; see [5][6][7] and many more. The finite element method results in the coupled linear algebraic system which has to be solved by the appropriate solvers. The resulting discrete KKT system is where K ∈ R n×n is a stiffness matrix, and that and the mass matrix M ∈ R n×n are both symmetric and positive definite. The vector b = My d ∈ R n is the finite element projection of the desired state y d , and d ∈ R n contains the terms arising from the boundaries of the finite element of the state y. The finite element discretisation produces a large scale linear algebraic system of Equation (6) which is indefinite and has poor spectral properties. The numerical solution of such problems is a computational uphill task and has a lot of challenges such that constructing robust and efficient solvers has preoccupied the computational scientific community for decades. The linear algebraic indefinite system is parameter dependent such that the condition number grows when the mesh size and the regularisation parameter approach zero. The well known Krylov and multigrid iterative solvers perform poorly for such systems. In recent years, much effort has been devoted in developing specialised iterative solvers using suitable preconditioners [3,6,7,[10][11][12] and appropriate smoothers for the multigrid solvers [8,9,[13][14][15][16][17]. For the development of the efficient multigrid solvers for the distributed optimal control problems, we refer to ([8,9,15,16,18,19]. Several preconditioned strategies have been developed and evolved that accelerate the iterative solvers. These are the block diagonal preconditioner [6,[20][21][22] for the minimum residual (MINRES) solver [23,24], and for the generalised minimal residual (GMRES) method, block triangular preconditioners with nonstandard inner products for the conjugate gradient method [7,10,21,22,25] and the constraint preconditioner [7,11,21]. The preconditioners were also developed for the two by two reduced block linear system and three by three block system to solve the optimal control problems of the Stokes [10,26,27], parabolic [28,29] and convection diffusion [30] equations. In most cases the application of preconditioners shows robustness and efficiency with the decreasing mesh size but not with the decreasing regularisation parameter. In this study, we focus on the improvement of the block diagonal preconditioner to enhance the performance of the MINRES solver. The MINRES was developed by [31] and widely used for PDE-constrained optimisation problems. The aim of this paper is to present preconditioners based on the Schur complement approach that seek to achieve an efficient solution of the linear system arising from the discretisation of the PDE-optimal control problems. We give particular attention to preconditioning techniques that achieve robust performances in the MINRES solver with respect to both decreasing mesh size and the regularisation parameter. As the mesh size approaches zero, the dimensions of the problem increase. A mesh-independent performance is always achieved for such systems, but the small values of the regularisation parameter pose a great challenge to the iterative solvers. As the regularisation parameter decreases, the performance of the iterative solver deteriorates, but it improves for large values.
The optimal performance of the preconditioner depends on applying an appropriate approximation of the (1,1) and (3,3)-block entries of the preconditioner. For the preconditioners based on the Schur complement, the (3,3) Schur complement remains one of the challenges to finding a suitable approximation to achieve robustness in terms of parameter changes. The Schur complement approximation developed in [6] and widely used in literature displays optimal performance for large values of the regularisation parameter, but the performance of it deteriorates as the parameter approaches zero. In [21,23] another Schur complement approximation was developed for PDE-constrained optimisation problems which gives convergence of the appropriate iterative method in a number of steps which are independent of the value of the regularisation parameter. The preconditioned iterative solver displays optimal performance for small values of the regularisation parameter. The main contribution of this work is to present a different form of the Schur compliment approximation to achieve an optimal performance of the MINRES solver in terms of iterative counts that are independent of the decreasing mesh size and the regularisation parameter. The clustering of the eigenvalues points to the good convergence properties of the preconditioned solver. We remove the dependency on the regularisation parameters within the preconditioned matrix as in [21]. We derive and investigate the clustering of the eigenvalues for the Schur compliment approximate and the preconditioned system. The numerical experiments' outcomes of the proposed approximation are compared with the existing approximations of the Schur complement preconditioner to show its efficiency. This paper is organised as follows. In Section 2 we preview the block diagonal preconditioner and investigate the eigenvalue distribution. In Section 3, we discuss our proposed approximation for the Schur complement and investigate the eigenvalue distribution of the preconditioned system. In Section 4 we present the numerical experiments to demonstrate how well the proposed approximation works and the conclusion is given.

Analysis of the Block Diagonal Preconditioner
We consider the preconditioning strategies for solving the saddle point problem, Equation (6). The optimal performance of the preconditioned Krylov subspace methods such as MINRES, GMRES and conjugate gradient rely more on the distribution of the eigenvalues of the coefficient matrix. It is well known in numerical analysis that the spectral radius must be bounded by one to guarantee convergence of the iterative solvers. The coefficient matrix of the system (6) is known to be large, sparse and indefinite with poor spectral properties. This section aims to outline the block diagonal preconditioner that involves the mass matrix and a Schur complement form. We consider a block diagonal preconditioner for the 3 × 3 block coefficient matrix of the Equation (6). For the 2 × 2 block preconditioners we refer to [7,22] and the references therein. The block diagonal preconditioner is given by where S := KM −1 K + 1 δ M is the Schur complement form. A good preconditioner must fulfil the assumptions that it is easy to invert, and the linear system Qx = b associated with it is easy to solve for any vector b. The approximation of the (3,3) block entry poses more challenges and difficulties. One possible widely used approach as in [6,21,24] is to get the approximation of S by discarding the term 1 δ M to obtain Q S 1 = KM −1 K by dropping a term with δ −1 M with an argument such that for all very small values of δ the term KM −1 K will be dominating. The application of the preconditioner with the approximation Q S 1 = KM −1 K has the shortfall that the iterative solver converges independently of the decreasing mesh size only but not in the decreasing regularisation parameter δ. To demonstrate this we need to give the eigenvalue bounds for Q −1 S 1 S as derived in [6,21]. We now use the following results, which are Theorems (3.4) and (3.5) in [6] and used by [21,24]. (6) in Ω ∈ R 2 with the degree of approximation Q m or P m with m ≥ 1 the following bounds hold:

Theorem 1 ([6]). For the problem Equation
where α 1 and α 2 are real constants independent of h but dependent on m.

Theorem 2 ([6]
). For the problem Equation (6) in Ω ∈ R 2 with the degree of approximation Q m or P m with m ≥ 1 the following bounds hold: where θ 1 and θ 2 are real constants independent of h but dependent on m.
whereα andθ are independent of h and but dependent on δ.
Proof of Theorem 3. To find the eigenvalue distribution of Q −1 S 1 S, let ω be the eigenvalue, and then Let υ be an eigenvalue of K −1 M then the eigenvalue of (K −1 M) 2 is υ 2 . This means that y T Ky = y T My y T y · y T y y T Ky . By using Theorems 1 and 2, there exist positive constantsα andθ independent of mesh size; we haveα This implies that we haveα Thus, it follows that there exist constantsα andθ independent of the mesh size such that It is clear that Q S 1 has no dependence on δ but the eigenvalue lower and upper bounds are dependent on δ. The eigenvalue bounds demonstrated by the Theorem 3 above show that the parameter δ determines the clustering and distribution of eigenvalues. This clearly shows that if δ is too small the eigenvalues are not clustered and Q S 1 will not be a better approximation. This entails that large values of δ, Q S 1 are a perfect approximation. The current research seeks to construct an approximation whose performance does not rely on both the mesh size and the regularisation parameter. The following theorem in [6] gives the eigenvalue distribution of the preconditioned coefficient matrix with preconditioner Q 1 with the approximation for the Schur complement form Q S 1 where Theorem 4 ( [7,24]). For Q 1 approximation, let the preconditioner Q 1 be defined by Equation (10). Assume that λ is an eigenvalue of the preconditioned matrix Q −1 1 K. Then λ = 1 or λ satisfies the following bound: For the proof and details we refer to [7,24] and references therein.

Proposed Schur Complement Preconditioner Approximation
The above Theorem 4 has demonstrated that applying the preconditioner with the Schur complement approximation Q S 1 results in performance and convergence of the preconditioned iterative solver that are independent of the discretisation mesh size and dependent on the regularisation parameter. It is expected that the convergences of the Krylov methods such as MINRES and GMRES are independent of the mesh size. The crucial task is to develop an efficient preconditioner that gives close eigenvalue bounds of the preconditioned system. The scientific literature on preconditioning optimal control systems includes classical Schur complement based approximations. In [21] we see the extension of the preconditioner whose effects are independent of the regularisation parameter. This was also used in [24]. The main task of this section is to briefly discuss the approximation in [21] and the one we propose in this work, which has the same clustering and distribution of eigenvalues and produces a preconditioned iterative solver that is not sensitive to the changes and decreasing regularisation parameter. The following approximation for S was developed in [21] shown Section 2 above. To get eigenvalue bounds for Q −1 S 2 S, we follow the same derivation and use the Theorems 1-3 above. The preconditioner involving the Schur complement approximation Q S 2 is given as follows There has been development of similarly structured Schur complement approximations to Q S 2 whose eigenvalue bounds are the same but differ in computational complexities. We have which is complex based developed by Choi and others in [32]. In this paper we propose and analyse a robust approximation for S which has the same eigenvalue bounds but a different eigenvalue distribution within the interval for Q S 2 as the approximation Q S 3 which is expected to accelerate the numerical solution. Our proposed approximation for the Schur complement is also derived from Q S 1 by writing which similarly reduces to by discarding a O(δ − 1 2 ) term this means that the error committed grows much slower. To motivate our choice of the Schur complement approximation, we illustrate the properties of the preconditioned Schur complement and the preconditioned system. We now discuss the derivation of the eigenvalue distribution of Q −1 S 3 S by following the same derivation in [21].
Theorem 5. The eigenvalues of Q −1 S 3 S satisfy the following bounds σ(Q −1 S 3 S) ∈ [ 1 2 , 1] independent of both mesh size h and regularisation parameter δ Proof of Theorem 5. Q S 3 is invertible since M is symmetric and positive definite. Let σ be an eigenvalue of Q −1 S 3 S corresponding to eigenvector y; then it is diagonalisable; this means that it describes all the eigenvalues of Q −1 S 3 S. It is known that a fraction of the form v 2 +w 2 (v+w) 2 is bounded between 1 2 and 1. Hence, (ν+ √ δI) −2 which takes the form v 2 +w 2 (v+w) 2 . Thus, 1 2 ≤ σ ≤ 1. By this we conclude that σ(Q −1 S 3 S) ∈ [ 1 2 , 1] is clustered and not dependent on the mesh size h and the regularisation parameter δ.
The actual application of the preconditioner Q S 3 in actual computations involving solving two subsystems to accelerate the convergence of Krylov subspace iteration process can be implemented as: Algorithm 1. Application of Preconditioner Q S 3 . Let L = √ δK + M and w be any given vector; we can compute the vector v and Q S 3 v = w, that is, LM −1 Lv = w, using the following procedures:
Lv = u solve for v.
Since M and K are sparse, L is sparse too, and an added bonus for Q S 3 and Q S 2 is that they maintain δ-dependency and their bounds are fairly good. The preconditioner involving the Schur complement approximation Q S 3 is given as follows.
We now derive the eigenvalues distribution of the preconditioned coefficient matrix Q −1 3 K with the modified approximation for the Schur complement form. We will use Proposition (2) in [7] which extends Theorem 4.

Theorem 6 ([7]
). The eigenvalues of the preconditioned system Q −1 Proof of Theorem 5. Let σ be the eigenvalue of Q −1 S 3 S and µ be an eigenvalue of Q −1 3 K corresponding to the eigenvector v = {v 1 , v 2 , v 3 }. We want to find the eigenvalues bounds for Q −1 3 K. We use a different approach from the one in [21]. We have The preconditioned coefficient matrix We consider the eigenvalue problem 1.
If µ = 1 from Equations (13) and (14) we have M −1 Kv 3 = 0 and 1 δ Iv 3 = 0, then v 3 = 0. Therefore, the corresponding eigenvector is If µ = 1 from Equations (13) and (14) we have Substituting Equations (16) and (17) into Equation (15), we get Solving Equation (18) we get The values of µ derived above agree with the Proposition (2) in [6] which gives the bounds Now we have This means that the eigenvalues of Q −1 1]. Let σ 1 = 1 2 and σ m = 1; substituting in Equations (19) and (20) we get Therefore the eigenvalues of the preconditioned coefficient matrix are The Theorem 6 above clearly shows that the eigenvalues of the preconditioned coefficient matrix are clustered and are independent of both the mesh size and the regularisation parameter. The iterative solver is expected to converge with changes in the problem parameters and discretisation parameter. In the next section we carry out numerical tests to verify the theoretical findings and compare the performance of the MINRES solver with the block diagonal preconditioner associated with the three approximations for the Schur complement form.

Numerical Results
In this section we present the results of the numerical experiments for solving the problem (Equation (6)) using the block diagonal preconditioned MINRES method. The main task here is to demonstrate the effectiveness of preconditioner in accelerating the MINRES solver. In this study the numerical experiments began with the Q 1 finite element discretisation of the state y, the adjoint p and the control u equations using uniform grids. All simulations and implementations were performed on a Windows 10 platform with Intel R Core TM i5-3230M CPU@2.6 GHz 6.00 GB speed Intel R using Matlab 7 programming language. We used the IFISS matlab package developed in [33] to generate a discrete linear algebraic system. The values of the mesh size h and the dimensions of M, K and K are shown in Table 1 below The exact solutions for the distributed optimality system and Ω = (−1, 1) 2 ⊂ R 2 are u exact = 2π 2 y exact (22) We give a thumbnail outline of the main task of the paper in the application of the approximation of the Schur complement preconditioner Q S 3 . The main dominant operations in the application of our proposed block diagonal preconditioner Q 3 are 10 fixed Chebyshev semi iterations [6,12,29] for the mass matrix M in the (1,1) and (2,2) blocks; the (3,3) block is approximated by two cycles of the algebraic multigrid methods for √ δK + M. This means in total we have two Chebyshev and two algebraic multigrid iterations with two pre and post-smoothing steps of the Jacobi method; similarly for Q S 1 and Q S 2 . We test for four regularisation parameters δ = 10 −3 , δ = 10 −5 , δ = 10 −7 and δ = 10 −9 .
We solve the linear algebraic system of saddle point form Equation (6) using the MINRES with the block diagonal preconditioners Q 1 (Equation (10)), Q 2 (Equation (11)) and Q 3 (Equation (12)) with the three different preconditioner approximations for the Schur complement form S. Then we compare the performance of the block diagonal preconditioned MINRES method with the three preconditioner approximations for the Schur complement in terms of the number of iterations, CPU time, L 2 error and the value of the objective function. The numerical solution produced by the MINRES with the preconditioner associated with the three Schur complement approximations is the same. Figure 1 gives the snapshots of the numerical solution,   We now present the eigenvalue distribution of the Q −1 Table 2 shows the largest and the smallest eigenvalues of the Schur complement S, Q −1 Table 2 clearly show the effects of the preconditioner approximation on the spectral properties of the Schur complement. The eigenvalues of Q −1  Figure 2 shows the distribution of eigenvalues for different values of regularisation parameter.  S is widely spread out as the regularisation parameter decreases. This agrees with the theoretical findings that the eigenvalues depends on the regularisation parameter. Hence the preconditioner approximation performance is not independent of δ and deteriorates for small values of δ. We also observe that for the approximations Q S 2 and Q S 3 the eigenvalue distribution falls with the interval that is given theoretically for the different values of the regularisation parameter. This entails that the clustering of the eigenvalues shown in the Figure 2 above points to the fast convergence of the iterative solver for all the values of the regularisation parameter.
We now give the numerical experiment results from our MINRES iterative solver preconditioned with the block diagonal preconditioner associated with the three approximations. Since the numerical solutions are the same, we concentrate on the performance in terms of the number of iterations and the CPU time in seconds for comparison purposes.
It is clear from Table 3 that all these preconditioners associated with the approximations Q S 1 , Q S 2 , Q S 3 are robust with respect to mesh size h, but the approximation Q S 1 is not robust with respect to the regularisation parameter. The preconditioners associated with approximations Q S 2 , Q S 3 performed efficiently for all the values of h and δ, with Q S 3 performing slightly better.
The results in Tables 3 and 4 demonstrate the practical applicability of the preconditioners explained theoretically in Sections 2 and 3 and the theoretical prowess of the three approximations of the Schur complement form using the eigenvalue distribution. Tables 3 and 4 give the number of iterations and the CPU time taken by the MINRES preconditioned by block diagonal preconditioners Equations (10)- (12) with different values of mesh size and the regularisation parameter, and we compare the effectiveness of the three preconditioners with approximations for the Schur compliment preconditioner. The results clearly show that the preconditioner with Q S 1 displays mesh-independent convergence but fails on the decreasing values of the regularisation parameter δ. The number of iterations and CPU time increase as δ → 0. This agrees with the poor eigenvalue distribution of Q −1 S 1 S and that of Q 1 K. The number of iterations and CPU times produced by the preconditioners associated with Q S 2 and Q S 3 are competitive and exhibited parameters h and δ independent of convergence, while both h and δ approached zero. The number of iterations of Q S 2 and Q S 3 decreased by the decrease of δ, as expected, and those of Q S 3 decreased further. The results show that these preconditioners are also robust to the mesh size h and the regularisation parameter δ. This agrees with the theoretical results that the eigenvalues are more clustered and distributed more independently of the parameters h and δ than those of Q −1 S 1 S; see Figure 2. The results of our numerical experiments clearly show that the preconditioners with the approximations Q S 3 and Q S 2 are very effective in improving the MINRES in solving elliptic control problems. When compared to other preconditioner approximations in terms of iteration count and computational time, Q S 3 slightly outperformed them. The preconditioners are more effective with our new approximation, producing more favourable results. We now give the results including the cost functional at 2 −6 for different values of the objective function for the approximation Q S 3 and there are the same outputs with the other approximations.    9  23  35  35  13  5  3  3  13  5  3  3  2 −5  11  29  71  99  13  9  3  3  13  9  3  3  2 −6  11  33  111  101  13  11  5  3  13  10  5  3  2 −7  13  33  133  107  15  11  7  5  15  10  5  3  2 −8  17  33  139  103  15  11  7  5  15  10  5  3  2 −9  17  39  147  191  17  11  11  7  17  11  5  5  Table 4. CPU time taken by MINRES solver with the block diagonal preconditioners Q 1 , Q 2 , Q 3 with Schur complement approximated by Q S 1 , Q S 2 , Q S 3 for different values of h and δ, tolerance = 10 −6 .  The results in Table 5 show the behaviour of the cost functional for different values of the regularisation parameter δ. It is well known that the δ determines how close the state approaches the desired state y d . The results provide an interesting observations that u 2 stops increasing at 10 −6 with J(y, u) and y − y d 2 decreases with the decrease by a constant factor of the decrease of δ. This means that the optimal value of δ for the problem is 10 −6 , becoming very close to the desired state, and also that the control variable u increases as δ decreases. This is clear indication that the cost functional will be insensitive to the control variable as δ decreases. Table 5. Numerical results using preconditioner with approximation Q S 3 with h = 2 −6 . 8.98 × 10 0 1.44 × 10 2 10 −4 6.08 × 10 2 7.43 × 10 −4 6.03 × 10 −5 1.20 × 10 0 1.92 × 10 1 10 −5 6.29 × 10 2 7.98 × 10 −4 6.30 × 10 −6 1.24 × 10 −1 1.98 × 10 −1 10 −6 6.32 × 10 2 8.03 × 10 −4 6.29 × 10 −7 1.25 × 10 −2 2.0 × 10 −2 10 −7 6.32 × 10 2 8.03 × 10 −4 6.26× 10 −8 1.24 × 10 −3 2.0 × 10 −3 10 −8 6.32 × 10 2 8.03 × 10 −4 6.32× 10 −9 1.24 × 10 −4 2.0 × 10 −4 10 −9 6.32 × 10 2 8.03 × 10 −4 6.32× 10 −10 1.24 × 10 −5 2.0 × 10 −5

Conclusions
The finite element discretisation of the elliptic PDE-constrained optimisation problem produces a large block 3 × 3 linear algebraic system of saddle point form. The main task of this paper was to present a different approach for the approximation of the Schur complement in the preconditioning strategies to get a robust and efficient numerical solver. This was done to achieve the MINRES solver whose performance is independent of the mesh size h and the regularisation parameter δ. We have demonstrated both theoretically and numerically that the eigenvalues of both the the preconditioned Schur complement and the preconditioned system involving the proposed approximation are independent of the changes in the mesh size and the regularisation parameter. This gives a robust numerical scheme that is independent of h and δ. We have compared numerically the results of the preconditioners involving the proposed approximation of the Schur complement with those that are found in the literature. Numerical results have confirmed the theoretical results and have demonstrated that the proposed preconditioner in this paper can be used practically and can be considered as a viable preconditioner for the problem under investigation.
Author Contributions: All authors contributed equally and significantly in writing this article. All authors have read and agreed to the published version of the manuscript.