VRP-GMRES ( m ) Iteration Algorithm for Fast Multipole Boundary Element Method

To solve large scale linear equations involved in the Fast Multipole Boundary Element Method (FM-BEM) efficiently, an iterative method named the generalized minimal residual method (GMRES)(m)algorithm with Variable Restart Parameter (VRP-GMRES(m) algorithm) is proposed. By properly changing a variable restart parameter for the GMRES(m) algorithm, the iteration stagnation problem resulting from improper selection of the parameter is resolved efficiently. Based on the framework of the VRP-GMRES(m) algorithm and the relevant properties of generalized inverse matrix, the projection of the error vector rm+1 on rm is deduced. The result proves that the proposed algorithm is not only rapidly convergent but also highly accurate. Numerical experiments further show that the new algorithm can significantly improve the computational efficiency and accuracy. Its superiorities will be much more remarkable when it is used to solve larger scale problems. Therefore, it has extensive prospects in the FM-BEM field and other scientific and engineering computing.


Introduction
Mathematical models of partial differential equations are usually established for many problems in scientific and engineering fields.After discretization, the equations can be concluded as the solution of a large scale linear system of equations, which can be expressed as where the coefficient matrix A is nonsingular.The generalized minimal residual (GMRES(m)) algorithm based on the Galerkin principle is the most successful method to solve Equation (1).At present, it has served as the fast solver for the Fast Multipole Boundary Element Method (FM-BEM).FM-BEM is a combination of the fast multipole expansion method (FMM) and the traditional boundary element method (BEM).FMM was first proposed by Greengard and Rokhlin [1,2] to accelerate the evaluation of interactions of large ensembles of particles governed by the Laplace'sequation.The main idea behind the FMM is a multipole expansion of the kernel in which the connection between the collocation point and the source point is separated.Many research works have been published since then to improve and extend the applicability of the FMM [3][4][5][6][7].Recently, Gu and Chen [8,9] applied the FMM to accelerate the solutions of the regularized meshless method (RMM) for large-scale three-dimensional elasticity problems and extended the singular boundary method (SBM) for the solution of 3D problems in linear elasticity.With the FMM and iterative solver GMRES(m), we are able to construct the FM-BEM.In the last several decades, the FM-BEM has been developed and used successfully for solving many large-scale applied mechanics problems.Shen and Liu put forward an adaptive fast multipole boundary element method for three-dimensional acoustic wave problems [10]; Wang and Yao came up with a new version of FMM for the numerical analysis of mechanical properties in 3D particle reinforced composites [11]; Gui and Huang presented an optimization FMM-BEM for a 3D Elastic Problem [12] and more work on these topics can be found in References [13][14][15].In addition, the GMRES(m) algorithm has become the research focus in many fields [16][17][18][19].Actualcomputations showed that it was very effective when the coefficient matrix A was well-conditioned or not severely ill-conditioned [20,21].Otherwise preconditioned techniques must be applied [22][23][24][25].
For the GMRES(m) algorithm, the selected parameter m is fixed during the whole iterative process; therefore, the selection of m is one of the key factors for the algorithm implementation.Research shows that a small value of m may result in slow convergence or no convergence, while a large value of m will cause too much of a memory requirement.Therefore, the appropriate selection of parameter m is a trouble for many scholars.Recently, some researchers have tried to overcome these difficulties by changing the restart parameter m in the GMRES(m) algorithm.Baker tried to improve the convergence by selecting an arbitrary parameter m.In addition, he tried it by determining the parameter m according to the continuous two times of residual norms ratio [26,27].However, the effect was not so good.Peairs determined the value of m by a reinforcement learning method, which indicated that proper changing of m could improve the computational efficiency.However, it was only a kind of machine learning method.Thus, it had certain limitations [28].
In this paper, the GMRES(m) algorithm combined with the FM-BEM is studied, and a new kind of GMRES(m) algorithm with Variable Restart Parameter (VRP-GMRES(m) algorithm) is presented.By appropriately changing a variable restart parameter, the new algorithm can effectively avoid many disadvantages caused by improper selection of parameter m in the GMRES(m) algorithm.At the same time, the computational efficiency and accuracy will be greatly improved.

Galerkin Principle
Given an arbitrary initial value x 0 ∈ R n .Let x = x 0 + z, and then Equation ( 1)is equivalent to where r 0 = b − Ax 0 indicates the initial residual vector.Suppose that K m and L m are two m-dimensional Krylov subspaces in R n , and they are formed by For the solution of Equation ( 2), the Galerkin principle can be described as follows: give a fixed m > 0, and find an approximate solution z m ∈ K m , so that (r 0 − Az m )⊥L m , namely, (r 0 − AV m y m )⊥L m or (r 0 − AV m y m , w i ) = 0 , y m ∈ R m .

Arnoldi Process
The Arnoldi process can be described as follows: 1.For the given m > 0 and The Arnoldi process has the following important property: , and then we have the following relationship [29]: where H m = (h ij ) m is an upper Hessenberg matrix, e T m = (0, 0, • • • , 1) ∈ R m , and I is an m + 1 order identity matrix .

GMRES(m) Algorithm
Theorem 2. Suppose that A ∈ R n×n and L m = AK m , x = x 0 + z = x 0 + V m y.Then, the approximate solution z m obtained from the Galerkin principle makes the residual r 0 − Az be at a minimum in the Krylov subspaceK m , and the residual satisfies [30] where Theorem 2 indicates that min The basic thought of the GMRES(m) algorithm is to give a fixed restart parameter m and compute z m by an iterative procedure so that r m = r 0 − Az m < ε, ∀ε > 0 .
The GMRES(m) algorithm includes the following steps: 1. Give a fixed integer m << n and an initial value , and let i=1 and H m through the Arnoldi process .k+1) and stop .Otherwise, let r 0 = r (k+1) , and go to 2.

Solve the least squares problem min
For the GMRES(m) algorithm, arbitrary selection of parameter m cannot guarantee convergence.In fact, appropriate change of m can effectively improve the convergence and avoid the phenomenon of slow convergence or even no convergence.

VRP-GMRES(m) Algorithm
The basic thought of the VRP-GMRES(m) algorithm is to appropriately change the restart parameter m and carry out iterative procedures so that the residual vector satisfies r m = r 0 − Az m < ε, ∀ε > 0. It includes the following main steps : 1. Give an integer m << n and an initial value , and let r 0 = r (0) .2. Carry out the Arnoldi process and obtain {v i } m i=1 and H m .

Solve the least squares problem min
and get y m .Then, compute z m = V m y m .For the VRP-GMRES(m) algorithm, slow convergence or no convergence rarely occurs.However, the memory requirement will grow with the increase of parameter m , which can be cleverly avoided.When m increases to some extent and the residual norm reduces to a proper value, the parameter m can be fixed and the GMRES(m) algorithm will be carried out .

Convergence Analysis
Definition 1. Suppose that A ∈ C m×n , X ∈ C m×n , at the same time if Theorem 3. Suppose that A ∈ C m×n , and A = BC is the maximum rank decomposition of A, then Proof of Theorem 3.
Hereto, the first two Moore Penrose equations are established.The following is to verify the other two Moore Penrose equations: From Definition 1, the matrix X is a pseudo inverse matrix of A.
Corollary 1. Suppose that A ∈ C m×n is a matrix with full column rank.Then, we have Theorem 4. Suppose that r m and r m+1 are the error vector of the m cycle and m + 1, respectively.Then, the following relationship is established: Proof of Theorem 4. For the VRP-GMRES(m) algorithm, when m = m + 1 , the subspace K m becomes From the Arnoldi process, {v i } m+1 i=1 and Hm+1 are obtained, where From Equation ( 5), the least squares solutiony m+1 = H+ m+1 (βe 1 ) , e 1 ∈ R m+2 is obtained, where H+ m+1 is a real matrix with full column rank.According to Equation ( 6),we have Then, On the other hand, we have Accrording to Equation (3), we haveAV m+1 = V m+2 Hm+1 .Then, we have Thus, r m+1 At the same time, because (h T indicates a vector formed by the first line elements in Hm+1 ), follows Theorem 5.For the VRP-GMRES(m) algorithm, let E = r m 2 , Ē = r m+1 2 , and therefore Ē < E.

Numerical Experiments
In this section, three different types of numerical experiments are given to illustrate the validity and feasibility of the VRP-GMRES(m) algorithm.In Example 1 and Example 2,take as the initial value and ε = 1 × 10 −8 as the convergence criterion.
Example 1.Consider the following one-dimensional Wave equation: As is shown in Figure 1a, the exact solution is u (x, t) = sin (πx) cos (2πt) + sin (2πx) cos (4πt), where u (x, t) indicates the amplitude.For the problem expressed by Equation (10), n equall divisions along the x-direction and t-direction can be obtained by the central difference method.
Equation ( 11) is solved by the VRP-GMRES(m) algorithm, and the numerical results are shown in Figure 1b.From Figure 1, the numerical solution is consistent with the exact solution.Equation ( 11)is solved by the GMRES(m) algorithm and VRP-GMRES(m) algorithm, respectively.When n = 10, m = 7, the condition number is 56.5079.For the GMRES(m) algorithm, iterative stagnation occurs after 9 iterations, and the residual norm is 1.4099.However, for the VRP-GMRES(m) algorithm, the residual norm has reduced to 1.4230 × 10 −14 after only 3 iterations, as is shown in Figure 3.The comparison of absolute errors in the iterative process is shown in Figures 2 and 3.  From Figures 2 and 3, for the same number of iterations, we can see that the absolute errors generated from the VRP-GMRES(m) algorithm are much smaller than those from the GMRES(m) algorithm.With the increase in the number of iterations, the errors from the VRP-GMRES(m) algorithm reduces much faster.It shows that the new algorithm not only has higher computational accuracy, but can also speedily converge.

Preprints
The parameters n and m are taken as different values and the calculation results from the two algorithms are compared.Tables 1-4 show the iteration times, computation time, computational accuracy and convergence, respectively.From Tables 1-3, we can see that the new algorithm is fast convergent, stable, highly accurate and effective.From Tables 1-4, the selection of parameter m is very important for the GMRES(m) algorithm, and the improper parameter mwill result in an algorithm failure.However, it has no effect on the VRP-GMRES(m) algorithm because the variable restart parameter m can partly reduce the sensitivity of m .In fact, the condition number of coefficient matrix A keeps growing with the increase of n.When n = 60, it reaches 2.1098 × 10 3 , which makes Equation ( 11) become severely ill-conditioned.From Table 1, under the same accuracy, the number of iterations for the GMRES(m) algorithm is 11.7 times larger than that for the VRP-GMRES(m) algorithm, and the computation time is 10.7 times larger.With the increase of computing scale, the VRP-GMRES(m) algorithm will be much more effective, and its engineering application prospect is much wider.

Example 2. Consider the following two-dimensional Poisson equation:
where The exact solution is u (x, y) = x 2 x + y 2 + 2, which indicates a temperature distribution function.The temperature distribution is shown in Figure 4a.For the problem expressed by Equation (12),n equal divisions along the x-direction and t-direction can be obtained by a five-point difference scheme.After discretization, a linear equation can be obtained, which is expressed as follows: Take n = 40, m = 10, Equation ( 13)is solved by the VRP-GMRES(m) algorithm , and the temperature distribution is shown in Figure 4b.From Figure 4, the numerical solution is consistent with the exact solution.When m = 10, Equation ( 13) is solved by the GMRES(m) algorithm and VRP-GMRES(m) algorithm, respectively.With the increase of the computational scale, there are few changes in the iteration times for the VRP-GMRES(m) algorithm, and it is much lower than that for the GMRES(m) algorithm, as shown in Figure 5a.Thus, the new algorithm has higher computational efficiency.At the same time, it has higher computational accuracy, as is shown in Figure 5b.In addition, the total computation time for the VRP-GMRES(m) algorithm is much less than that for the GMRES(m) algorithm, as is shown in Table 5.   6.The discrete data are shown in Table 6.Bodies A, B and C are of the same material with Young modulus E = 210 GpaPoisson ratio υ = 0.3, and the friction coefficient f = 0.2.For body C, a uniform load P = 100 Mpais applied tothe top surface.The total load is divided into six steps, and the contact tolerance is 0.001 mm.
In the example, the VRP-GMRES(m) algorithm is used as a fast solver for the FM-BEM, and some results are shown in Figure 7, which are consistent with those obtained by the GMRES(m) algorithm.The iteration times by the GMRES(m) algorithm and VRP-GMRES(m) algorithm are shown in Table 7. From Table7, we can see that there are few changes in the iteration times for the VRP-GMRES(m) algorithm.However, the relative error is quite small for the pressure of the same node, as is shown in Figure 8.     From Table 7 and Figure 8 , the VRP-GMRES(m) algorithm is more rapidly and stably convergent than the GMRES(m) algorithm .

Conclusions
In this paper, an iterative VRP-GMRES(m) algorithm is proposed for the solution of linear equations, which can be used as an improved solver of the FM-BEM and can be used to solve many mathematical, mechanical, physical and engineering problems.Through theoretical analysis, the new algorithm is proved to be rapidly convergent with higher computational accuracy.Numerical experiments show that the presented algorithm not only has higher computational efficiency and accuracy, but also is much more stable.The results of the last example also prove that it is highly efficient and rapidly convergent for the elastic problems.On the whole, the new algorithm has extensive prospects in the FM-BEM field and other large-scale scientific and engineering computing.

Figure 5 .
Figure 5.Comparison of the computational efficiency and accuracy under different meshes for the two algorithms (m = 10).(a) comparison of computational efficiency; and (b) comparison of computational accuracy.

Figure 7 .
Figure 7. Displacements on the contact surface for body A (z = 5 mm).

Table 1 .
Comparison of iteration times for the two algorithms.

Table 2 .
Comparison of computation times for the two algorithms.

Table 4 .
Comparison of convergence for the two algorithms.

Table 5 .
Comparison of computation times under different meshes for the two algorithms (m = 10).

Table 7 .
Comparison of iteration times for the two algorithms.