Multigrid for Q k Finite Element Matrices Using a (Block) Toeplitz Symbol Approach

: In the present paper, we consider multigrid strategies for the resolution of linear systems arising from the Q k Finite Elements approximation of one- and higher-dimensional elliptic partial differential equations with Dirichlet boundary conditions and where the operator is div ( − a ( x ) ∇· ) , with a continuous and positive over Ω , Ω being an open and bounded subset of R 2 . While the analysis is performed in one dimension, the numerics are carried out also in higher dimension d ≥ 2, showing an optimal behavior in terms of the dependency on the matrix size and a substantial robustness with respect to the dimensionality d and to the polynomial degree k .


Introduction
We consider the solution of large linear systems whose coefficient matrices arise from the Q k Lagrangian Finite Element approximation of the elliptic problem div (−a(x)∇u) = f , x ∈ Ω ⊆ R d , with Ω a bounded subset of R d having smooth boundaries and with a being continuous and positive on Ω.
Based on the spectral analysis of the related matrix-sequences and on the study of the associated spectral symbol [1,2], this paper deals with ad hoc multigrid techniques where the choice of the basic ingredients, i.e., that of the smoothing strategy and of the projectors, has a foundation in the analysis of the symbol provided in [3].
Indeed, in the systematic work in [3], tensor rectangular Finite Element approximations Q k of any degree k and of any dimensionality d are considered and the spectral analysis of the stiffness matrix-sequences {A n } is provided in the sense of: • spectral distribution in the Weyl sense and spectral clustering; and • spectral localization, extremal eigenvalues, and conditioning.
We observe that the information obtained in [3] is strongly based on the notion of spectral symbol (see [1,2]) and is studied from the perspective of (block) multilevel Toeplitz operators [4,5] and (block) Generalized Locally Toeplitz sequences [6,7].
We remind that a similar analysis is carried out in [8] for the finite approximations P k for k ≥ 2 and for d = 2: the analysis for d = 1 is contained in [3] trivially because Q k ≡ P k for every k ≥ 1, while, for d = 2, and even more for d ≥ 3, the situation is greatly complicated by the fact that we do not encounter a tensor structure. Nevertheless, the picture is quite similar and the obtained information in terms of spectral symbol is sufficient for deducing a quite accurate analysis concerning the distribution and the extremal behavior of the eigenvalues of the resulting matrix-sequences.
It is worth noticing that the information regarding the conditioning determines the intrinsic difficulty in the precision of solving a linear system, that is the impact of the inherent error, and it is also important in evaluating the convergence rate of classical stationary and non-stationary iterative solvers. On the other hand, the spectral distribution and the clustering results represent key ingredients in the design and in the convergence analysis of specialized multigrid methods and preconditioned Krylov solvers [9] such as preconditioned conjugate gradient (PCG) (see [7], Subsection 3.7, and [10][11][12][13][14][15][16]). As proven in [11], the knowledge of the spectral distribution allows explaining the superlinear convergence history of the (PCG), thanks to the powerful potential theory.
We emphasize that in [3,8] the final goal is the analysis and the design of fast iterative solvers for the associated linear systems. In the current note, we go exactly in this direction, by focusing our attention on multigrid techniques.

Structure of the Paper
The outline of the paper is as follows. In Section 2, we provide the notation and we present results regarding multigrid methods and we fix the notation for matrix-valued trigonometric polynomials, and the related block-Toeplitz matrices. Section 3 is devoted to the analysis of the structure and of the spectral features of considered matrices and matrix-sequences. The multigrid strategy definition and the symbol analysis of the projection operators are given in Section 4, together with selected numerical tests. The paper is concluded by Section 5, where open problems are discussed and conclusions are reported.

Two-Grid and Multigrid Methods
Here, we concisely report few relevant results concerning the convergence theory of algebraic multigrid methods and we present the definition of block-Toeplitz matrices generated by a matrix-valued trigonometric polynomial.
We start by taking into consideration the generic linear system A m x m = b m with large dimension m, where A m ∈ C m×m is a Hermitian positive definite matrix and x m , b m ∈ C m . Let m 0 = m > m 1 > . . . > m s > . . . > m s min and let P s+1 s ∈ C m s+1 ×m s be a full-rank matrix for any s. At last, let us denote by V s a class of stationary iterative methods for given linear systems of dimension m s .
In accordance with [17], the algebraic two-grid Method (TGM) can be easily seen a stationary iterative method whose generic steps are reported below.
where we refer to the dimension m s by means of its subscript s.
In the first and last steps, a pre-smoothing iteration and a post-smoothing iteration are applied ν pre times and ν post times, respectively, in accordance with the considered stationary iterative method in the class V s . Furthermore, the intermediate steps define the exact coarse grid correction operator, which is depending on the considered projector operator P s s+1 . The resulting iteration matrix of the TGM is then defined as where V s,pre and V s,post represent the pre-smoothing and post-smoothing iteration matrices, respectively, and I (s) is the identity matrix at the sth level. By employing a recursive procedure, the TGM leads to a Multi-Grid Method (MGM): indeed, the standard V-cycle can be expressed in the following way: Coarse Grid Correction From a computational viewpoint, it is more efficient that the matrices A s+1 = P s+1 s A s (P s+1 s ) H are computed in the so-called setup phase for reducing the related costs.
According to the previous setting, the global iteration matrix of the MGM is recursively defined as MGM s min = O ∈ C s min ×s min , Definition 1. Let M k be the linear space of the complex k × k matrices and let f : (−π, π) → M k be a measurable function with Fourier coefficients given bŷ Then, we define the block-Toeplitz matrix T n ( f ) associated with f as the kn × kn matrix given by where ⊗ denotes the (Kronecker) tensor product of matrices. The term J (j) n is the matrix of order n whose (i, k) entry equals 1 if i − k = j and zero otherwise. The set {T n ( f )} n is called the family of block-Toeplitz matrices generated by f , which is called the generating function or the symbol of {T n ( f )} n .

Remark 1.
In the relevant literature (see, for instance, [10]), the convergence analysis of the two-grid method splits into the validation of two separate conditions: the smoothing property and the approximation property. Regarding the latter, with reference to scalar structured matrices [10,15], the optimality of two-grid methods is given in terms of choosing the proper conditions that the symbol p of a family of projection operators has to fulfill. Indeed, consider T n ( f ) with n = (2 t − 1) and f being a nonnegative trigonometric polynomial. Let θ 0 be the unique zero of f . Then, the optimality of the two-grid method applied to T n ( f ) is guaranteed if we choose the symbol p of the family of projection operators such that where the sets Ω(θ) and M(θ) are the following corner and mirror points Informally, it means that the optimality of the two-grid method is obtained by choosing the family of projection operators associated to a symbol p such that |p| 2 (ϑ) + |p| 2 (ϑ + π) does not have zeros and |p| 2 (ϑ + π)/ f (ϑ) is bounded (if we require the optimality of the V-cycle, then the second condition is a bit stronger) (see [10]). In a differential context, the previous conditions mean that p has a zero of order at least α at ϑ = π, whenever f has a zero at θ 0 = 0 of order 2α.
In our specific block setting, by interpreting the analysis given in [18], all the involved symbols are matrix-valued and the conditions which are sufficient for the two-grid convergence and optimality are the following: (A) zero of order 2 at ϑ = π of the proper eigenvalue function of the symbol of the projector for Q k , k = 1, 2, 3 (mirror point theory [10,15]); (B) positive definiteness of pp H (ϑ) + pp H (ϑ + π); and (C) commutativity of p(ϑ) and p(ϑ + π).
Even if the theoretical extension to the V-Cycle and W-cycle convergence and optimality is not given, in the subsequent section, we propose specific choices of the projection operators, numerically showing how this leads to two-grid, V-cycle, and W-cycle procedures converging optimally or quasi-optimally with respect to all the relevant parameters (size, dimensionality, and polynomial degree k).
Our choices are in agreement with the mathematical conditions set in Items (A) and (B), while Condition (C) is not satisfied. The violation of Condition (C) is discussed in Section 5, while, in relation to Condition (A), we observe that a stronger condition is met, since the considered order of the zero at ϑ = π is k + 1, which is larger than 2 for k = 2, 3.

Structure of the Matrices and Spectral Analysis:
We report some results derived in [3] for the Lagrangian Finite Elements Q k ≡ P k , d = 1. Let us consider the Lagrange polynomials L 0 , . . . , L k associated with the reference knots t j = j/k, j = 0, . . . , k: and let the symbol , denote the scalar product in L 2 ([0, 1]), i.e., ϕ, ψ := 1 0 ϕψ. In the case a(x) ≡ 1 and Ω = (0, 1), the Q k stiffness matrix for Equation (1) equals the matrix K (k) n in Theorem 1.
where the subscripts "−" mean that the last row and column of the of the whole matrices in square brackets are deleted, while K 0 , K 1 are k × k blocks given by with L 0 , . . . , L k being the Lagrange polynomials in (5). In particular, following the notation in Definition 1, n is the (nk − 1) × (nk − 1) leading principal submatrix of the block-Toeplitz matrices T n ( f Q k ) and f Q k : [−π, π] → C k×k is an Hermitian matrix-valued trigonometric polynomial given by An interesting property of the Hermitian matrix-valued functions f Q k (ϑ) defined in Equation (8) is reported in the theorem below from [3]: in fact, from the point of view of the spectral distribution, the message is that, independently of the parameter k, the spectral symbol if of the same character as 2 − 2 cos(ϑ), which is the symbol of the basic linear Finite Elements and the most standard Finite Differences.
being the determinant of the empty matrix equal to 1 by convention) and L 0 , . . . , L k are the Lagrange polynomials in Equation (5).
Furthermore, a generalization of the previous result in higher dimension is given in [8] and is reported in the subsequent theorem.

Theorem 3. ([8])
Given the symbols f Q k in dimension d ≥ 1, the following statements hold true:

Multigrid Strategy Definition, Symbol Analysis, and Numerics
Let us consider a family of meshes Clearly, the same inclusion property is inherited by the corresponding Finite Element functional spaces and hence we find Therefore, to formulate a multigrid strategy, it is quite natural to follow a functional approach and to impose the prolongation operator p h 2h : V 2h → V h to be defined as the identity operator, that is Thus, the matrix representing the prolongation operator is formed, column by column, by representing each function of the basis of V 2h as linear combination of the basis of V h , the coefficients being the values of the functions ϕ 2h i on the fine mesh grid points, i.e., ϕ 2h In the following subsections, we consider in detail the case of Q k Finite Element approximation with k = 2 and k = 3, the case k = 1 being reported in short just for the sake of completeness.

Q 1 Case
Firstly, let us consider the case of Q 1 Finite Elements, where, as is well known, the stiffness matrix is the scalar Toeplitz matrix generated by f Q 1 (ϑ) = 2 − 2 cos(ϑ), and, for the sake of simplicity, let us consider the case of T 2h partitioning with five equispaced points (three internal points) and T h partitioning with nine equispaced points (seven internal points) obtained from T 2h by considering the midpoint of each subinterval. In the standard geometric multigrid, the prolongation operator matrix is defined as Indeed, the basis functions with respect to the reference interval [0, 1] areφ 1 (x) = 1 −x,φ 2 (x) =x, and, according to Equation (12), the ϕ 2h i coefficients arê giving the columns of the matrix in Equation (13). However, we can think the prolongation matrix above as the product of the Toeplitz matrix generated by the polynomial p Q 1 (ϑ) = 1 + cos(ϑ) and a suitable cutting matrix (see [15] for the terminology and the related notation) defined as i.e., P m s m s+1 = (P m s+1 Two-grid/Multigrid convergence with the above defined restriction/prolongation operators and a simple smoother (for instance, Gauss-Seidel iteration) is a classical result, both from the point of view of the literature of approximated differential operators [17] and from the point of view of the literature of structured matrices [10,15].
In the first panel of Table 1, we report the number of iterations needed for achieving the predefined tolerance 10 −6 , when increasing the matrix size in the setting of the current subsection. Indeed, we use A m s (p Q 1 )(K m s+1 ×m s ) T and its transpose as restriction and prolongation operators and Gauss-Seidel as a smoother. We highlight that only one iteration of pre-smoothing and only one iteration of post-smoothing are employed in the current numerics. Therefore, considering the results of Remark 1 and the subsequent explanation, there is no surprise in observing that the number of iterations needed for the two-grid, V-cycle, and W-cycle convergence remains almost constant when we increase the matrix size, numerically confirming the predicted optimality of the methods in this scalar setting. Table 1. Number of iterations needed for the convergence of the two-grid, V-cycle, and W-cycle methods for k = 1, 2, 3 in one dimension with a(x) ≡ 1 and tol = 1 × 10 −6 .

# Subintervals TGM V-Cycle W-Cycle TGM V-Cycle W-Cycle TGM V-Cycle W-Cycle
For the sake of simplicity, let us consider the case of T 2h partitioning with five equispaced points (three internal points) and T h partitioning with nine equispaced points (seven internal points) obtained from T 2h by considering the midpoint of each subinterval.
Thus, with respect to Equation (12) and so on again as for that first couple of basis functions. Notice also that, to evaluate the coefficients, for the sake of simplicity, we are referring to the basis functions on the reference interval, as depicted in Figure 1. To sum up, the obtained prolongation matrix is as follows Hereafter, we are interested in setting such a geometrical multigrid strategy, proposed in [17,19,20], in the framework of the more general algebraic multigrid theory and in particular in the one driven by the matrix symbol analysis. To this end, we represent the prolongation operator quoted above as the product of a Toeplitz matrix generated by a polynomial p Q 2 and a suitable cutting matrix. We recall that the Finite Element stiffness matrix could be thought as a principal submatrix of a Toeplitz matrix generated by the matrix-valued symbol that, from Equation (8), has the compact form Then, it is quite natural to look for a matrix-valued symbol for the polynomial p Q 2 as well. In addition, the cutting matrix is also formed through the Kronecker product of the scalar cutting matrix in Equation (14) and the identity matrix of order 2, so that Taking into account the action of the cutting matrix (K m s+1 ×m s ) T ⊗ I 2 , we can easily identify from Equation (15) the generating polynomial as where A very preliminary analysis, just by computing the determinant of p Q 2 (ϑ) shows there is a zero of third order in the mirror point ϑ = π, being det(p Moreover, the analysis can be more detailed, as highlighted in Section 2. We highlight that our choices are in agreement with the mathematical conditions set in Items (A) and (B). Condition (C) is violated and we discuss it in Section 5 and Remark 2. Nevertheless, it is possible to derive the following TGM convergence and optimality sufficient conditions that should be verified by f and p = p Q 2 , exploiting the idea in the proof of the main result of [18]: where q(ϑ) = p(ϑ) H p(ϑ) + p(ϑ + π) H p(ϑ + π) −1 , O k is the k × k null matrix, γ > 0 is a constant independent on n, and we denote by A > B (respectively, A ≤ B) the positive definiteness (respectively, non-positive definiteness) of the matrix A − B. The condition in Equation (19) requires the matrix-valued function R(ϑ) to be uniformly bounded in the spectral norm. These conditions are obtained from the proof of the main convergence result in [18], where, after several numerical derivations, it was concluded that the above conditions are the final requirements needed.
To this end, we have explicitly formed the matrices involved in the conditions in Equations (18) and (19) and computed their eigenvalues for ϑ ∈ [0, 2π]. The results are reported in Figure 2 and are in perfect agreement with the theoretical requirements.
In the second panel of Table 1, we report the number of iterations needed for achieving the predefined tolerance 10 −6 , when increasing the matrix size in the setting of the current subsection. Indeed, we use A m s (p Q 2 )(K m s+1 ×m s ) T and its transpose as restriction and prolongation operators and Gauss-Seidel as a smoother. Again, we remind that only one iteration of pre-smoothing and only one iteration of post-smoothing are employed in our numerical setting.
As expected, we observe that the number of iterations needed for the two-grid convergence remains constant when we increase the matrix size, numerically confirming the optimality of the method.
Moreover, we notice that also the V-cycle and W-cycle methods possess optimal convergence properties. Although this behavior is expected from the point of view of differential approximated operators, it is interesting in the setting of algebraic multigrid methods. Indeed, constructing an optimal V-cycle method for matrices in this block setting might require a specific analysis of the spectral properties of the restricted operators (see [18]).

Q 3 Case
Hereafter, we briefly summarize the case of Q 3 Finite Elements, following the very same path we already considered in the previous section for P 2 Finite Elements. The basis functions with respect to the reference interval [0, 1] areφ For the sake of simplicity, let us consider the case of T 2h partitioning with seven equispaced points (five internal points) and T h partitioning with 13 equispaced points (11 internal points) obtained from T 2h by considering the midpoint of each subinterval.
Thus, with respect to Equation (12) (see also  Thus, the obtained prolongation matrix is as follows: Thus, taking into consideration that the stiffness matrix is a principal submatrix of the Toeplitz matrix generated by the matrix-valued function we are looking for the matrix-valued symbol p it is easy to identify the generating polynomial as A trivial computation shows again shows there is a zero of fourth order in the mirror point ϑ = π, being det(p However, the main goal is to verify the conditions in Equations (18) and (19): we have explicitly formed the matrices involved and computed their eigenvalues for ϑ ∈ [0, 2π]. The results are in perfect agreement with the theoretical requirements (see Figure 4). This analysis links the geometric approach proposed in [17,19,20] to the novel algebraic multigrid methods for block-Toeplitz matrices.
In the third panel of Table 1, we report the number of iterations needed for achieving the predefined tolerance 10 −6 , when increasing the matrix size in the setting of the current subsection. Indeed, we use A m s (p Q 3 )(K m s+1 ×m s ) T and its transpose as restriction and prolongation operators and Gauss-Seidel as a smoother (one iteration of pre-smoothing and one iteration of post-smoothing).
As expected, we observe that the number of iterations needed for the two-grid convergence remains constant when we increase the matrix size, numerically confirming the optimality of the method. As in the Q 2 case, we also notice that the V-cycle and W-cycle methods possess the same optimal convergence properties.
Comparing the three panels in Table 1, we also notice a mild dependency of the number of iterations on the polynomial degree k. In addition, we can see in Tables 2 and 3 that the optimal behavior of the two-grid, V-cycle, and W-cycle methods for k = 2, 3 remains unchanged if we test different tolerance values. Table 2. Number of iterations needed for the convergence of the two-grid, V-cycle, and W-cycle methods for k = 2 in one dimension with a(x) ≡ 1 and tol = 1 × 10 −2 , 1 × 10 −4 , and 1 × 10 −8 .

Remark 3.
It is worth stressing that the results hold also in dimension d ≥ 2. In fact, interestingly, we observe that the dimensionality d does not affect the efficiency of the proposed method, as well shown in Table 4 for the case d = 2. We finally remind that the tensor structure of the resulting matrices highly facilitates the generalization and extension of the numerical code to the case of d ≥ 2. Indeed, the prolongation operators in the multilevel setting are constructed by a proper tensorization of those in 1D. Table 4. Number of iterations needed for the convergence of the two-grid, V-cycle, and W-cycle methods for k = 1, 2, 3 in dimension d = 2 with a(x) ≡ 1.

Concluding Remarks
In the present paper, we consider multigrid strategies for the resolution of linear systems arising from the Q k Finite Elements approximation of one-and higher-dimensional elliptic partial differential equations with Dirichlet boundary conditions and where the operator is div (−a(x)∇·), with a continuous and positive over Ω, Ω being an open and bounded subset of R d . While the analysis has been given in one dimension, the numerics are shown also in higher dimension d ≥ 2, showing an optimal behavior in terms of the dependency on the matrix size and a substantial robustness with respect to the dimensionality d and to the polynomial degree k (see Remark 3).
We mention the fact that our analysis might be of interest for several variations on the problem in Equation (1). Indeed, if we impose different boundary conditions, our procedure can be applied with slight changes. In fact, the resulting stiffness matrices differ from the ones analyzed in the present paper, of a small rank correction matrix. Therefore, they share the same asymptotic spectral properties, which means we only have to take care of possible outliers, affecting the choice of the proper smoother.
By interpreting the analysis given in [18] in our specific block setting, we provide a study of the relevant analytical features of all the involved spectral symbols, both of the stiffness matrices f Q k and of the projection operators p Q k , k = 1, 2, 3. While the two-grid, V-cycle, and W-cycle procedures show optimal or quasi-optimal convergence rate, with respect to all the relevant parameters (size, dimensionality, polynomial degree k, and diffusion coefficient), the theoretical prescriptions are only partly satisfied. In fact, our choices are in agreement with the mathematical conditions set in Items (A) and (B), while Condition (C) is violated. Here, by quasi-optimal convergence rate, we mean that the convergence speed does not depend on the size (optimality with respect to the this parameter) and it is mildly depending on the other relevant parameters such as dimensionality, polynomial degree k, and diffusion coefficient. By looking at the mathematical derivations in [18], we observe that the latter condition indeed is a technical one. In reality, we believe that Condition (C) is not essential and the commutation request can be substituted by a less restrictive one, possibly following the considerations in Remark 2. Such a point is in our opinion important for widening the generality of the theory and it will be the subject of future investigations.
In conclusion, regarding the computational cost of the proposed algorithm, we highlight that the choice of the optimal smoother from a computational viewpoint is beyond the scope of the present paper. Indeed, in the case where the matrices possess a tensor structure, a further analysis will be performed to devise a more competitive method.