Article

An Aggregation-Based Algebraic Multigrid Method with Deflation Techniques and Modified Generic Factored Approximate Sparse Inverses

by Anastasia A. Natsiou 1, George A. Gravvanis 2, Christos K. Filelis-Papadopoulos 3 and Konstantinos M. Giannoutakis 4,*

1 School of Computing, Technological University of Dublin, D07 H6K8 Dublin, Ireland
2 Department of Electrical and Computer Engineering, School of Engineering, Democritus University of Thrace, Kimmeria, 67100 Xanthi, Greece
3 Department of Computer Science, University College Cork, T12 XF62 Cork, Ireland
4 Department of Applied Informatics, School of Information Sciences, University of Macedonia, 54636 Thessaloniki, Greece
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(3), 640; https://doi.org/10.3390/math11030640
Submission received: 29 November 2022 / Revised: 16 January 2023 / Accepted: 24 January 2023 / Published: 27 January 2023
(This article belongs to the Special Issue Advanced Numerical Analysis and Scientific Computing)

Abstract: In this paper, we examine deflation-based algebraic multigrid methods for solving large systems of linear equations. Aggregation of the unknown terms is applied for coarsening, while deflation techniques are proposed for improving the rate of convergence. More specifically, the V-cycle strategy is adopted, in which, at each iteration, the solution is computed by initially decomposing it utilizing two complementary subspaces. The approximate solution is formed by combining the solutions obtained through the multigrid and deflation components. In order to improve performance and convergence behavior, the proposed scheme was coupled with the Modified Generic Factored Approximate Sparse Inverse preconditioner. Furthermore, a parallel version of the multigrid scheme is proposed for multicore parallel systems, improving the performance of the techniques. Finally, characteristic model problems are solved to demonstrate the applicability of the proposed schemes, and numerical results are given.

1. Introduction

Lately, research efforts for efficiently solving large sparse linear systems have focused on multigrid methods [1,2]. Their applicability and efficiency rely on the use of a stationary method as a smoother for the higher-frequency components of the error, while the lower-frequency components are transferred to a coarser level, where they appear more oscillatory and can be reduced effectively [3,4]. Multigrid algorithms can be viewed as the recursive application of a two-grid coarse-grid-correction scheme [5]. On the coarse level, the residual (defect) equation is solved approximately; the resulting correction is interpolated to the finer level and added to the solution [3,4].
An algebraic multigrid (AMG) method requires only the information contained in the coefficient matrix of the linear system [4]. The main difference between the geometric and algebraic approaches is the coarsening algorithm; both, however, consist of a smoother and a coarse-grid correction. The smoother is usually the application of a few iterations of a fixed-point iterative method (such as Jacobi or Gauss–Seidel), while the coarse-grid correction restricts the residual to a coarser grid, solves the resulting system and interpolates the correction back to the finest grid. It should be noted that the notion of aggregation has multiple applications in several scientific fields, including gene regulatory networks [6,7].
Moreover, approximate inverses have been extensively used for approximating the inverse of the coefficient matrix of a linear system, resulting in efficient preconditioners for Krylov subspace iterative methods. Their efficiency relies on the fact that they have inherent parallelism and can be computed quickly in parallel [8,9,10]. Additionally, approximate inverse techniques have been efficiently used in conjunction with multigrid methods for solving partial differential equations discretized with finite difference methods [11], and for a variety of other applications [12,13].
In this paper, a class of Modified Generic Factored Approximate Sparse Inverses (MGenFAspI) based on approximate inverse sparsity patterns is used to solve two-dimensional and three-dimensional partial differential equations, discretized with high-order finite difference schemes. For the computation of these approximate inverses, a fill-in parameter is required which affects the density, in terms of nonzero elements, of the preconditioner [10]. A deflation technique is also utilized in this work, splitting the problem at each multigrid level into two complementary subspaces, aiming to further improve the convergence rate.
The paper is organized as follows: Section 2 introduces the deflation technique and the corresponding algorithm used for the solution of linear systems. Section 3 presents the aggregation-based algebraic multigrid method in conjunction with the deflation techniques and the Modified Generic Factored Approximate Sparse Inverses. The algorithmic procedures for the pairwise aggregation and the preconditioned multigrid are also given. The parallel aspects and techniques for accelerating the computations required by the Modified Generic Factored Approximate Sparse Inverses in hybrid parallel environments are discussed in Section 4. Finally, numerical results regarding the efficiency and convergence behavior of the proposed techniques are presented and discussed in Section 5, by solving characteristic model problems. Concluding remarks and directions for future research are provided in Section 6.

2. Deflation Techniques

Deflation techniques were introduced by Nicolaides in 1987 for improving the rate of convergence of the conjugate gradient method [14]. The main idea is based on the projection of the linear system into two complementary subspaces, in order to separately solve the two produced linear systems. The efficiency of this technique is based on the fact that the projected linear systems could be solved faster than the initial system, and then their solution can be combined for formulating the solution of the initial problem [15].
Let us consider an n × n linear system of the form:

A x = b,

where A is a symmetric and positive definite matrix, b is the right-hand side vector and x is the solution vector. The deflation subspace V can be represented by a full-rank matrix V ∈ R^{n×k}, with k << n, such that:

E = V^T A V.

Then, the projection matrix Q can be defined as follows:

Q = I - V E^{-1} V^T A,

where I is the identity matrix. The matrix Q is the projection onto the complementary subspace, while I - Q is the projection onto the subspace spanned by V. For the selection of the subspace, we can use the deflation technique for the generation of the basis of V proposed in [15], where the domain Ω is decomposed into m nonoverlapping subdomains Ω_j, j = 1, ..., m, with:

v_{i,j} = 1, if x_i ∈ Ω_j; 0, if x_i ∉ Ω_j.

The solution of the linear system can then be decomposed as follows:

x = x_c + x_s,

where x_c is the solution component from the projection onto the deflation subspace, while x_s is the component from the projection onto the complementary subspace. These vectors can be computed as follows:

x_c = (I - Q) x and x_s = Q x_it,

where x is an initial approximation of the solution. By applying matrix operations, the solution vector x_c can be written as:

x_c = (I - Q) x = V E^{-1} V^T A x = V (V^T A V)^{-1} V^T A x = V (V^T A V)^{-1} V^T b.

Then, the vector x_it can be computed by solving iteratively the following linear system:

A (I - V (V^T A V)^{-1} V^T A) x_it = b - A x_c.
The procedure for the solution of the linear system with the deflation technique can be summarized by Algorithm 1.
Algorithm 1 Deflation technique for solving linear systems
1: Select the basis of the subspace V.
2: Project vector V based on the subspace V.
3: Compute E^{-1} = (V^T A V)^{-1}.
4: Compute the solution from the deflation subspace, using the transformation matrix I - Q, such that x_c = V (V^T A V)^{-1} V^T b.
5: Solve iteratively the linear system A (I - V (V^T A V)^{-1} V^T A) x_it = b - A x_c.
6: Multiply the solution x_it by Q to obtain the solution x_s.
7: Compute the approximate solution by adding the two solution vectors from the two complementary subspaces.
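As a concrete illustration, the steps of Algorithm 1 can be sketched with dense linear algebra. The helper name deflated_solution is hypothetical, and the direct least-squares solve of the (singular but consistent) deflated system stands in for the iterative method that the algorithm prescribes; this is a minimal sketch, not the paper's implementation.

```python
import numpy as np

def deflated_solution(A, b, V):
    """Sketch of Algorithm 1 for an SPD system A x = b.

    V (n x k) holds the basis of the deflation subspace; the component
    x_c is obtained directly, the remainder from the deflated system.
    """
    n = A.shape[0]
    E = V.T @ A @ V                              # E = V^T A V (small, k x k)
    E_inv = np.linalg.inv(E)
    Q = np.eye(n) - V @ E_inv @ V.T @ A          # Q = I - V E^{-1} V^T A
    x_c = V @ E_inv @ (V.T @ b)                  # x_c = V (V^T A V)^{-1} V^T b
    # deflated system A Q x_it = b - A x_c; it is singular but consistent,
    # so a least-squares solve stands in for the iterative method here
    x_it = np.linalg.lstsq(A @ Q, b - A @ x_c, rcond=None)[0]
    x_s = Q @ x_it                               # component from the complement
    return x_c + x_s                             # combine the two components

# two non-overlapping subdomains give piecewise-constant deflation vectors
rng = np.random.default_rng(0)
n = 8
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)                      # SPD test matrix
b = rng.standard_normal(n)
V = np.zeros((n, 2))
V[:4, 0] = 1.0
V[4:, 1] = 1.0
x = deflated_solution(A, b, V)                   # satisfies A x = b
```

Regardless of which solution of the singular deflated system is returned, the assembled vector x_c + Q x_it solves the original system exactly, which is what makes the decomposition useful.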
The use of the deflation technique in conjunction with an efficient preconditioning scheme can accelerate the convergence of the solution process [14]. Preconditioning techniques yield the solution of a linear system of the form:

M A x = M b,

where M is a symmetric positive definite preconditioner. An efficient preconditioner should be computable efficiently in parallel, M A should have a "clustered" spectrum, and the product of M with a vector should be fast to compute in parallel [9,10,16]. The preconditioner can then be used in conjunction with a Krylov-subspace iterative method, such as the conjugate gradient or the generalized minimum residual method, for the solution of the linear system [9,10].

3. Aggregation Multigrid Method

Multigrid methods have been widely used by the scientific community for solving linear systems, due to their convergence behavior and efficiency [8,17]. The design of these methods is based on the fact that high-frequency error components can be damped by a stationary iterative method, while low-frequency error components are handled by coarser grids with a larger discretization step, on which these modes appear more oscillatory and can, in turn, be damped efficiently by a stationary iterative method [4,18]. Multigrid methods are composed of four components: (a) the stationary iterative method (such as the Richardson, Jacobi and Gauss–Seidel methods), (b) the restriction operator (transferring quantities from finer to coarser grids), (c) the prolongation operator (transferring quantities from coarser to finer grids) and (d) the cycle strategy (the sequence in which the grids are visited until a solution with the predefined tolerance is achieved) [16].
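To illustrate component (a), the following sketch applies a few sweeps of weighted Jacobi to the 1D Laplacian and measures how much of a highly oscillatory and of a smooth error mode survives. The helper jacobi_sweeps and the damping weight ω = 2/3 are illustrative choices, not parameters taken from the paper.

```python
import numpy as np

def jacobi_sweeps(A, b, x, sweeps=4, omega=2.0 / 3.0):
    """Weighted Jacobi: x <- x + omega * D^{-1} (b - A x).

    Damps the oscillatory (high-frequency) error components quickly,
    while barely touching the smooth ones."""
    d = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / d
    return x

n = 32
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Laplacian
t = np.arange(1, n + 1)
high = np.sin(np.pi * n * t / (n + 1))                # oscillatory error mode
low = np.sin(np.pi * t / (n + 1))                     # smooth error mode

# with b = 0 the iterate IS the error, so its norm tracks the damping
e_high = jacobi_sweeps(A, np.zeros(n), high)
e_low = jacobi_sweeps(A, np.zeros(n), low)
```

After four sweeps the oscillatory mode has shrunk by roughly two orders of magnitude while the smooth mode is nearly unchanged, which is exactly why the smooth part must be handled on a coarser grid.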
The method utilized for grid coarsening in this paper is the aggregation multigrid, first introduced in [19] and developed by Notay [5]. The nodes of the grid are combined into pairs, creating aggregates, based on a weighting rule.
The coarsening algorithm constructs a prolongation matrix P from the coefficient matrix A, which is the transformation matrix from the coarse to the fine grid. The dimensions of the prolongation matrix are n × n c , where n c < n is the number of the nodes of the coarser grid. The transformation from the fine to the coarser grid can be managed by the restriction matrix R, which is computed as follows:
R = P^T,
and the coarse coefficient matrix A c is estimated by Galerkin’s formula:
A_c = R A P.
For the formulation of the prolongation matrix, the aggregates G i , which are disjoint subsets of the variable set, have to be defined. The number of these variables, n c , is, then, the number of the subsets and is used to define the prolongation matrix P:
P_{i,j} = 1, if i ∈ G_j; 0, otherwise (1 <= i <= n, 1 <= j <= n_c).
Therefore, P is a Boolean matrix with one nonzero entry in each row. In order to define the aggregates, the algorithm searches for the set of nodes S i to which i is strongly negatively coupled, using a strong/weak coupling threshold β :
S_i = { j ≠ i | a_{i,j} < -β max_{a_{i,k} < 0} |a_{i,k}| }.
Then, one unmarked node at a time is selected, giving priority to the node with the minimal m_i, where m_i is the number of unmarked nodes that are strongly negatively coupled to i [5,20]. Next, an unmarked node j is picked, corresponding to the column with the minimal element a_{i,j} in row i of the matrix A, and an aggregate is created between the nodes i and j. The algorithm of the pairwise aggregation is given in Algorithm 2 [5]. It should be noted that the CheckDD parameter is optional and checks whether matrix A is diagonally dominant [5].
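Given a set of aggregates, the prolongation matrix and the Galerkin coarse operator defined above can be formed in a few lines. The helper name galerkin_coarsen is illustrative, and the construction is dense for clarity; a practical code would of course use sparse storage.

```python
import numpy as np

def galerkin_coarsen(A, aggregates):
    """Build the Boolean prolongation P from aggregates G_j and the
    coarse operator A_c = R A P with R = P^T (dense sketch)."""
    n = A.shape[0]
    P = np.zeros((n, len(aggregates)))
    for j, G in enumerate(aggregates):
        for i in G:
            P[i, j] = 1.0          # one nonzero per row: node i lies in G_j
    A_c = P.T @ A @ P              # Galerkin formula A_c = R A P, R = P^T
    return P, A_c

# 1D Laplacian on 6 nodes, aggregated into neighbouring pairs
A = 2 * np.eye(6) - np.eye(6, k=1) - np.eye(6, k=-1)
P, A_c = galerkin_coarsen(A, [(0, 1), (2, 3), (4, 5)])
# A_c is again a 1D Laplacian, now on 3 coarse nodes
```

The coarse operator inherits the tridiagonal Laplacian structure, which is what allows the coarsening to be applied recursively level after level.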
For the solution of a linear system with an algebraic aggregation multigrid, the V-cycle strategy can be used. The linear system is smoothed and the residual is transferred to the coarser level until the resulting linear system is small enough to be solved directly. Then, the solution vector is subject to the procedure of coarse grid correction, by means of the appropriate prolongation [5,20].
During each iteration of the V-cycle, the deflation technique is used to split the problem into two complementary subspaces, and the approximate solution is computed, which is transferred to the coarser level of the grid. The combination of the multigrid and deflation techniques is depicted in Figure 1.
Algorithm 2 Pairwise aggregation
Inputs: matrix A = (a_{i,j}) with n rows,
        threshold β = 0.25,
        logical parameter CheckDD
Outputs: number of coarse variables n_c,
         aggregates G_i, i = 1, ..., n_c (such that G_i ∩ G_j = ∅)
1: if CheckDD then
2:     U = [1, n] \ { i | a_{i,i} > 5 Σ_{j≠i} |a_{i,j}| }
3: else U = [1, n]
4: end if
5: for i = 1, ..., n do
6:     S_i = { j ∈ U \ {i} | a_{i,j} < -β max_{a_{i,k} < 0} |a_{i,k}| }
7: end for
8: for i = 1, ..., n do
9:     m_i = |{ j | i ∈ S_j }|
10: end for
11: n_c = 0
12: while U ≠ ∅ do
13:     Select i ∈ U with minimal m_i; n_c = n_c + 1
14:     Select j ∈ U such that a_{i,j} = min_{k ∈ U} a_{i,k}
15:     if j ∈ U then
16:         G_{n_c} = {i, j}
17:     else G_{n_c} = {i}
18:     end if
19:     U = U \ G_{n_c}
20:     For all k ∈ G_{n_c}, update m_l = m_l - 1 for l ∈ S_k
21: end while
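A compact Python rendering of Algorithm 2 may clarify the bookkeeping. The helper pairwise_aggregation is hypothetical: it follows the listing above for dense matrices and breaks ties by smallest node index, a detail the algorithm leaves open.

```python
import numpy as np

def pairwise_aggregation(A, beta=0.25, check_dd=False):
    """Pairwise aggregation following Algorithm 2 (dense sketch)."""
    n = A.shape[0]
    U = set(range(n))
    if check_dd:  # optionally exclude strongly diagonally dominant rows
        U = {i for i in U
             if not A[i, i] > 5 * sum(abs(A[i, j]) for j in range(n) if j != i)}
    # S_i: nodes to which i is strongly negatively coupled
    S = []
    for i in range(n):
        neg = [abs(A[i, k]) for k in range(n) if k != i and A[i, k] < 0]
        thr = beta * max(neg) if neg else 0.0
        S.append({j for j in U - {i} if A[i, j] < -thr})
    # m_i: number of unmarked nodes strongly negatively coupled to i
    m = [sum(1 for j in range(n) if i in S[j]) for i in range(n)]
    aggregates = []
    while U:
        i = min(sorted(U), key=lambda t: m[t])     # unmarked node, minimal m_i
        rest = sorted(U - {i})
        j = min(rest, key=lambda t: A[i, t]) if rest else None
        if j is not None and A[i, j] < 0:          # strong pair found
            G = {i, j}
        else:                                      # otherwise a singleton
            G = {i}
        aggregates.append(sorted(G))
        U -= G
        for k in G:                                # update the m counters
            for l in S[k]:
                m[l] -= 1
    return aggregates

# 1D Laplacian: nodes aggregate into neighbouring pairs
A = 2 * np.eye(6) - np.eye(6, k=1) - np.eye(6, k=-1)
groups = pairwise_aggregation(A)
```

On the 1D Laplacian the procedure recovers the expected neighbouring pairs {0,1}, {2,3}, {4,5}, halving the number of unknowns at each coarsening step.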
Algorithm 3 depicts the procedure of using a multigrid as preconditioner for a V-cycle.
Algorithm 3 Preconditioned V-cycle
Inputs: residual vector r_k,
        grid matrix at level k: A_k,
        smoothing matrix M_k,
        prolongation matrix P_k,
        matrix of the next grid A_{k-1}
Outputs: z_k
1: function MGprec(r_k, k)
2:     Compute z_k^(1) = M_k^{-1} r_k
3:     Calculate the new residual vector r_k = r_k - A_k z_k^(1)
4:     Compute r_{k-1} = P_k^T r_k
5:     if k = 1 then
6:         x_{k-1} = A_{k-1}^{-1} r_{k-1}
7:     else
8:         x_{k-1} = MGprec(r_{k-1}, k - 1)
9:     end if
10:     z_k^(2) = P_k x_{k-1}
11:     r_k = r_k - A_k z_k^(2)
12:     z_k^(3) = M_k^{-1} r_k
13:     z_k = z_k^(1) + z_k^(2) + z_k^(3)
14: end function
The function MGprec is executed once at the finest level, for k = l, using the matrix A_l = A, and recursively calls itself until the parameter k reduces to 1. For smoothing, the symmetric Gauss–Seidel method is used, where

M_k = low(A_k) diag(A_k)^{-1} upp(A_k),

with low(A_k), diag(A_k) and upp(A_k) denoting the lower triangular, diagonal and upper triangular parts of A_k, respectively [5].
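The recursion of Algorithm 3, together with the symmetric Gauss–Seidel smoother, can be sketched as follows. The functions mg_prec, sgs_smoother and pair_P, as well as the three-level 1D Laplacian hierarchy, are illustrative constructions (dense, with direct solves), not the implementation evaluated in the paper; low and upp are taken to include the diagonal, the usual convention for symmetric Gauss–Seidel.

```python
import numpy as np

def sgs_smoother(Ak):
    """M_k = low(A_k) diag(A_k)^{-1} upp(A_k); low/upp include the diagonal."""
    D_inv = np.linalg.inv(np.diag(np.diag(Ak)))
    return np.tril(Ak) @ D_inv @ np.triu(Ak)

def mg_prec(r, k, A, P, M):
    """z_k = MGprec(r_k, k): recursive V-cycle of Algorithm 3 (dense sketch).

    A[k]: grid matrix at level k, P[k]: prolongation from level k-1 to k,
    M[k]: smoothing matrix at level k; level 0 is solved directly."""
    z1 = np.linalg.solve(M[k], r)          # pre-smoothing z^(1) = M_k^{-1} r_k
    rk = r - A[k] @ z1                     # updated residual
    rc = P[k].T @ rk                       # restriction r_{k-1} = P_k^T r_k
    if k == 1:
        xc = np.linalg.solve(A[0], rc)     # direct solve on the coarsest grid
    else:
        xc = mg_prec(rc, k - 1, A, P, M)   # recursive call
    z2 = P[k] @ xc                         # prolongated coarse correction
    rk = rk - A[k] @ z2
    z3 = np.linalg.solve(M[k], rk)         # post-smoothing
    return z1 + z2 + z3

def pair_P(n):
    """Boolean prolongation aggregating nodes into neighbouring pairs."""
    P = np.zeros((n, n // 2))
    for j in range(n // 2):
        P[2 * j, j] = P[2 * j + 1, j] = 1.0
    return P

# three-level hierarchy for the 1D Laplacian via Galerkin products
A2 = 2 * np.eye(8) - np.eye(8, k=1) - np.eye(8, k=-1)
P2 = pair_P(8); A1 = P2.T @ A2 @ P2
P1 = pair_P(4); A0 = P1.T @ A1 @ P1
A = [A0, A1, A2]
P = [None, P1, P2]
M = [None, sgs_smoother(A1), sgs_smoother(A2)]

# simplest use: the stationary iteration x <- x + MGprec(b - A x, l)
b = np.ones(8)
x = np.zeros(8)
for _ in range(40):
    x = x + mg_prec(b - A2 @ x, 2, A, P, M)
```

In practice the cycle serves as a preconditioner inside a Krylov method; the stationary loop above is only the simplest way to exercise it and to observe the residual contracting.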
In this paper, the coarse coefficient matrix A_c is inverted approximately by the Modified Generic Factored Approximate Sparse Inverse (MGenFAspI) technique [10]. For a coefficient matrix A, we consider its incomplete LU factorization, i.e.,

A = L U + E,

where L and U are, respectively, the lower and upper triangular factors and E is the error matrix. The GenFAspI matrix can be computed using the following procedure:

A ≈ L U  =>  A^{-1} ≈ U^{-1} L^{-1} = M = G H,

considering G = U^{-1} and H = L^{-1}. In order to compute G and H, an a priori sparsity pattern is required [21]. These patterns are obtained by applying a predetermined drop tolerance (droptol) to the upper and lower triangular factors and then raising them to a predefined power (level of fill, or lfill). The GenFAspI matrix is then constructed by solving the following systems:

M = G_{droptol}^{lfill} H_{droptol}^{lfill},  with  U g_{:,j} = e_{:,j},  L h_{:,j} = e_{:,j},  0 <= j < m,

where g_{:,j} denotes the j-th column of G, h_{:,j} the j-th column of H, e_{:,j} the j-th column of the identity matrix, and m the order of the coefficient matrix of the linear system [22]. A modified version of GenFAspI (MGenFAspI) is used in order to enhance the performance during the computation of the approximate inverse [10]. MGenFAspI computes each column of the factors of the approximate inverse separately, by a solution process restricted to the factors only. The elements that do not belong to the patterns of G and H are set to zero; thus, the above systems are rewritten as:

U g_{:,j} = e_{:,j},  L h_{:,j} = e_{:,j},  0 <= j < m,  with  g_{i,j} = 0 for (i,j) ∉ G_{droptol}^{lfill}  and  h_{i,j} = 0 for (i,j) ∉ H_{droptol}^{lfill}.

The advantage of the modified version of GenFAspI is that the construction of the approximate inverse can be easily parallelized, since the positions of the elements of the matrices G and H (or L and U) do not have to be identified beforehand. The algorithmic procedure of MGenFAspI is given in [10].
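The restricted column solves above reduce to back-substitutions carried out only over the prescribed pattern rows. The following dense sketch, with the hypothetical helper restricted_back_substitution, illustrates one column of G ≈ U^{-1}; with the full pattern it reproduces the exact column, and with a truncated pattern it produces the sparsified approximation MGenFAspI works with.

```python
import numpy as np

def restricted_back_substitution(U, pattern, j):
    """Solve U g_{:,j} = e_{:,j} keeping only entries whose row index is
    in `pattern`; entries outside the pattern stay zero (MGenFAspI idea)."""
    n = U.shape[0]
    g = np.zeros(n)
    for i in sorted(pattern, reverse=True):        # back-substitution
        s = (1.0 if i == j else 0.0) - U[i, i + 1:] @ g[i + 1:]
        g[i] = s / U[i, i]
    return g

# well-conditioned upper triangular test factor
rng = np.random.default_rng(1)
n = 5
U = np.triu(rng.standard_normal((n, n))) + 3.0 * np.eye(n)
j = 3
g_full = restricted_back_substitution(U, range(j + 1), j)   # exact column
g_sparse = restricted_back_substitution(U, [j - 1, j], j)   # pattern-restricted
```

Restricting the pattern trades accuracy for sparsity: g_sparse carries at most two nonzeros, while g_full reproduces column j of U^{-1} exactly.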

4. Parallel Techniques

In this section, the parallel algorithm for the Modified Generic Factored Approximate Sparse Inverse and the parallel V-cycle are discussed. The matrices G and H can be computed by solving the following equations:

U G = I,  L H = I,

where U and L are the upper and lower triangular matrices derived from the incomplete LU factorization of the coefficient matrix A. Thus, the columns of the matrices G and H are obtained by solving these triangular systems column by column.
The MGenFAspI algorithm allows the individual computation of the columns of matrices G and H, and, due to this inherent parallelism, it can be efficiently implemented on hybrid parallel systems that consist of distributed and multicore CPUs. Each node of the distributed system can be assigned into a block of columns of the matrices, while, within each node, the multiple cores can further distribute the workload.
The computation of the elements of each column of matrices G and H does not require all rows of matrices U or L, respectively, but only the rows that correspond to nonzero elements of each column. For this reason, there is no need to broadcast the whole matrices U and L, yielding a reduced communication cost, especially for large-scale matrices. A visual representation of the distribution of matrix G on the available nodes is given in Figure 2, while the workload of two cores of a distributed node is depicted in Figure 3.
The same process was implemented for the computation of the elements of matrix H, where the rows of matrix L were used. Since this technique does not require any communication between the distributed processes, the parallel speedup and parallel efficiency are expected to be high. The derived matrices (G and H) are distributed on the nodes; however, they can be used on any distributed system for executing a column-wise matrix×vector multiplication. The algorithm for the parallelization of the MGenFAspI has been presented in [10]. In the context of this work, the parallelization on multicore systems was considered.
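Because the columns are independent, the computation maps directly onto a pool of workers. The following minimal shared-memory sketch uses a Python thread pool with the hypothetical helpers column_of_inverse and parallel_columns; it only mirrors the embarrassingly parallel structure and does not reproduce the hybrid distributed/multicore implementation of the paper.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def column_of_inverse(U, j):
    """Exact column j of U^{-1}: stand-in for one MGenFAspI column solve."""
    n = U.shape[0]
    g = np.zeros(n)
    for i in range(j, -1, -1):                   # back-substitution
        g[i] = ((1.0 if i == j else 0.0) - U[i, i + 1:] @ g[i + 1:]) / U[i, i]
    return g

def parallel_columns(U, workers=4):
    """Each column computed independently; no inter-task communication."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cols = pool.map(lambda j: column_of_inverse(U, j), range(U.shape[0]))
    return np.column_stack(list(cols))

U = np.triu(np.ones((6, 6))) + np.eye(6)         # upper triangular test matrix
G = parallel_columns(U)                          # G = U^{-1}, built column-wise
```

Since no data are exchanged between tasks, the speedup is limited mainly by load balance, which is consistent with the high parallel efficiency reported for the column-wise scheme.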

5. Numerical Results

The applicability, performance and convergence behavior of the proposed schemes are presented in this section. Various model problems are considered and numerical results are obtained, experimentally demonstrating the behavior of the proposed methods. It should be noted that the experimental results were obtained on an Intel Xeon 2420 v2 with 6 cores and 12 threads and 48 GB of RAM, running Linux CentOS 7. All algorithmic techniques were implemented in C++11.
For the evaluation of the aggregation-based multigrid method with the V-cycle, the 2D Poisson problem was initially selected, with the number of elements per direction n_x = 50, 100, 200, 500. Four iterations of the Jacobi method were performed as a pre-smoother and four iterations of the Jacobi method were used for post-smoothing at each level of the grid hierarchy. At the last level of the multigrid, the linear system is solved by a direct method, such as LU factorization. The termination criterion is ||r_i||_2 < tol ||r_0||_2, where ||r_i||_2 is the norm of the residual vector at the i-th iteration, ||r_0||_2 is the norm of the residual of the initial problem and tol = 10^{-8}. In the parallel experiments, the solution at the coarsest level was computed by the PARDISO method (a parallel direct solver) contained in the Intel parallel libraries [23].
It should be noted that the choice of pre-smoothing and post-smoothing iterations affects the convergence behavior, e.g., increased values lead to better overall performance. However, since the stationary iterative methods in the context of multigrids are used to damp the high-frequency components of the error, an arbitrary increase does not guarantee substantial improvements. In the context of an algebraic multigrid, the choices for pre-smoothing and post-smoothing iterations vary between 1 and 4 for smoothers, such as Jacobi or Gauss–Seidel, in the literature. Thus, determination of the optimal parameters in the context of a multigrid depends on the problem to be solved and can be determined by experimentation [5,18].
The convergence behavior and the performance of the AGAMG method for different numbers of unknowns per dimension for the 2D Poisson problem are given in Table 1.
The number of iterations required for convergence increases slightly as the size of matrix A increases, for the different numbers of levels. Moreover, increasing the number of levels while increasing the number of points per dimension (n_x) yields reduced execution time. This is due to the size of the matrix at the last level, which is solved by a direct method. Thus, for large-scale systems, an increased number of levels is recommended.
In order to examine the applicability and efficiency of the deflation techniques, the same model problem was executed by considering deflation with the V-cycle. For the integration of the two methods, the four Jacobi pre-smoothing iterations were replaced with one deflation iteration. The deflation algorithm computes, using a direct method, the solution component corresponding to I - Q:

x_c = (I - Q) x = P E^{-1} R A x = P (R A P)^{-1} R A x,

where the matrix E is inverted by the Jacobi method (i.e., inverting only the main diagonal). The solution component corresponding to Q is computed with six iterations of the BiCGStab method applied to the following linear system:

A (I - P E^{-1} R A) x_it = b - A x_c.
The efficiency and convergence behavior of the aggregation-based algebraic multigrid method with deflation are given in Table 2, for various numbers of levels and n_x. It should be noted that the Jacobi method with n_2 = 4 post-smoothing iterations was used.
To further improve the convergence behavior, the Modified Generic Factored Approximate Sparse Inverse (MGenFAspI) was used for approximating the matrix E^{-1}. The lfill parameter was set to 1 and the droptol to zero. The corresponding results are given in Table 3, for various values of n_x and numbers of levels for the 2D Poisson problem. It can be seen that, in most cases, the number of iterations is reduced.
In Table 4, the convergence behavior and the performance of the proposed scheme with the MGenFAspI preconditioner are given for various values of n_x and numbers of levels for the 3D Poisson problem. Moreover, in Table 5 and Table 6, numerical results are given for the 2D and 3D Poisson problems with lfill = 2 and droptol = 0.0, for various numbers of levels.
The convergence behavior and the performance of the proposed schemes with the use of MGenFAspI, for various values of the lfill parameter and numbers of levels for the 2D Poisson problem with n_x = 300, are given in Table 7.
The performance and applicability were also tested against the Sherman1 problem (https://sparse.tamu.edu/HB/sherman1 (accessed on 28 November 2022)). In Table 8, the convergence behavior and performance of the proposed schemes with the use of the MGenFAspI preconditioner are given, for various values of the parameter lfill and numbers of levels. In Table 9, results for the parallel aggregation-based algebraic multigrid method in conjunction with deflation and Modified Generic Factored Approximate Sparse Inverses are given for the 3D Poisson problem, with n_x = 70 and lfill = 2.
The addition of deflation is expected to increase the computational time, since it involves (approximate) matrix inversion operations and matrix-by-matrix products. However, for difficult problems, e.g., Sherman1, where the iterations increase substantially when following the original AGMG approach, deflation leads to substantial improvements and even ensures convergence within the prescribed maximum number of iterations. Moreover, the additional computational work is composed of inherently parallel operations, which can be accelerated on modern hardware, leading to an efficient solution approach.
Direct parallelization of algebraic multigrid algorithms has been studied extensively in the literature and has been shown to be limited [24]. The advantages of a truncated approach, which is similar to the one adopted in this manuscript, for the parallelization of the V cycle have been discussed extensively in [25].
Finally, the speedups and efficiency of the parallel aggregation-based algebraic multigrid method in conjunction with deflation and Modified Generic Factored Approximate Sparse Inverses for the 3D Poisson problem, with n_x = 70 and lfill = 2, are given in Table 10 and Table 11, respectively. The corresponding figures for the execution times, speedups and efficiency are given in Figure 4, Figure 5 and Figure 6, respectively.

6. Discussion and Conclusions

The Modified Generic Factored Approximate Sparse Inverse technique based on incomplete factorization has been recently proposed and used as a preconditioner for Krylov subspace methods. In this work, algebraic multigrid methods were examined that are based on aggregation, while a V cycle was used in conjunction with deflation techniques to improve the convergence behavior. The proposed schemes were discussed and numerical results from solving two model problems were presented.
The evaluation of the proposed methods demonstrates the improvement in the rate of convergence of the multigrid techniques using deflation, for various values of the parameters n_x and lfill. The extensive experimentation carried out validated that the V-cycle of AGAMG in combination with deflation and MGenFAspI was the most efficient approach in terms of convergence behavior, albeit with greater execution time. To address this, the inherent parallelism of the approximate inverse scheme and of the aggregation-based multigrid method was exploited, resulting in reduced execution times on a shared-memory parallel system. The parallel numerical results obtained demonstrated good speedups, with potential for large-scale parallel systems.
Future work includes the use of the aggregation-based multigrid method in conjunction with deflation techniques and approximate inverses as preconditioners for Krylov-subspace iterative methods. Moreover, the use of domain decomposition methods will be examined, in order to further improve the parallel efficiency of the methods.

Author Contributions

Conceptualization, G.A.G. and C.K.F.-P.; Investigation, A.A.N., G.A.G., C.K.F.-P. and K.M.G.; Methodology, A.A.N., G.A.G., C.K.F.-P. and K.M.G.; Software, A.A.N. and C.K.F.-P.; Supervision, G.A.G.; Validation, A.A.N., C.K.F.-P. and K.M.G.; Visualization, C.K.F.-P. and K.M.G.; Writing—original draft, A.A.N., G.A.G., C.K.F.-P. and K.M.G.; Writing—review and editing, A.A.N., G.A.G., C.K.F.-P. and K.M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AMG: Algebraic multigrid
AGAMG: Aggregation-based algebraic multigrid
CPU: Central processing unit
GenFAspI: Generic Factored Approximate Sparse Inverse
MGenFAspI: Modified Generic Factored Approximate Sparse Inverse

References

  1. Paludetto Magri, V.A.; Franceschini, A.; Janna, C. A Novel Algebraic Multigrid Approach Based on Adaptive Smoothing and Prolongation for Ill-Conditioned Systems. SIAM J. Sci. Comput. 2019, 41, A190–A219. [Google Scholar] [CrossRef]
  2. Demidov, D. AMGCL: An efficient, flexible, and extensible algebraic multigrid implementation. Lobachevskii J. Math. 2019, 40, 535–546. [Google Scholar] [CrossRef] [Green Version]
  3. Briggs, W.L.; Henson, V.E.; McCormick, S.F. A Multigrid Tutorial; SIAM: Philadelphia, PA, USA, 2000. [Google Scholar]
  4. Wesseling, P. Theoretical and Practical Aspects of a Multigrid Method. SIAM J. Sci. Stat. Comput. 1982, 3, 387–407. [Google Scholar] [CrossRef] [Green Version]
  5. Notay, Y. An aggregation-based algebraic multigrid method. Electron. Trans. Numer. Anal. 2008, 37, 123–146. [Google Scholar]
  6. Li, H.; Liu, Y.; Wang, S.; Niu, B. State Feedback Stabilization of Large-Scale Logical Control Networks via Network Aggregation. IEEE Trans. Autom. Control. 2021, 66, 6033–6040. [Google Scholar] [CrossRef]
  7. Wang, S.; Li, H. Aggregation method to reachability and optimal control of large-size Boolean control networks. Sci. China Inf. Sci. 2022, 66, 1869–1919. [Google Scholar] [CrossRef]
  8. Bröker, O.; Grote, M.J.; Mayer, C.; Reusken, A. Robust Parallel Smoothing for Multigrid Via Sparse Approximate Inverses. SIAM J. Sci. Comput. 2001, 23, 1396–1417. [Google Scholar] [CrossRef] [Green Version]
  9. Giannoutakis, K.M.; Gravvanis, G.A. High performance finite element approximate inverse preconditioning. Appl. Math. Comput. 2008, 201, 293–304. [Google Scholar] [CrossRef]
  10. Filelis-Papadopoulos, C.; Gravvanis, G. A class of generic factored and multi-level recursive approximate inverse techniques for solving general sparse systems. Eng. Comput. 2016, 33, 74–99. [Google Scholar] [CrossRef]
  11. Filelis-Papadopoulos, C.K.; Gravvanis, G.A. On the Multigrid Method Based on Finite Difference Approximate Inverses. Comput. Model. Eng. Sci. 2013, 90, 233–253. [Google Scholar]
  12. Gravvanis, G.; Filelis-Papadopoulos, C.; Matskanidis, P. Algebraic Multigrid Methods Based on Generic Approximate Banded Inverse Matrix Techniques. Comput. Model. Eng. Sci. 2014, 100, 323–345. [Google Scholar]
  13. Kyziropoulos, P.; Filelis-Papadopoulos, C.; Gravvanis, G. Parallel N-body simulation based on the PM and P3M methods using multigrid schemes in conjunction with generic approximate sparse inverses. Math. Probl. Eng. 2015, 2015, 450980. [Google Scholar] [CrossRef] [Green Version]
  14. Nicolaides, R.A. Deflation of Conjugate Gradients with Applications to Boundary Value Problems. SIAM J. Numer. Anal. 1987, 24, 355–365. [Google Scholar] [CrossRef]
  15. Frank, J.; Vuik, C. On the Construction of Deflation-Based Preconditioners. SIAM J. Sci. Comput. 2001, 23, 442–462. [Google Scholar] [CrossRef] [Green Version]
  16. Gravvanis, G.; Filelis-Papadopoulos, C. On the multigrid cycle strategy with approximate inverse smoothing. Eng. Comput. Int. J. Comput.-Aided Eng. 2014, 31, 110–122. [Google Scholar] [CrossRef]
  17. Haelterman, R.; Vierendeels, J.; Van Heule, D. Non-stationary two-stage relaxation based on the principle of aggregation multi-grid. In Proceedings of the Computational Fluid Dynamics 2006; Deconinck, H., Dick, E., Eds.; Springer: Berlin/Heidelberg, Germany, 2009; pp. 243–248. [Google Scholar]
  18. Hackbusch, W. Multi-Grid Methods and Applications; Springer: Berlin/Heidelberg, Germany, 1985; Volume 4. [Google Scholar] [CrossRef]
  19. Bulgakov, V.E. Multi-level iterative technique and aggregation concept with semi-analytical preconditioning for solving boundary-value problems. Commun. Numer. Methods Eng. 1993, 9, 649–657. [Google Scholar] [CrossRef]
20. Emans, M. Aggregation schemes for k-cycle AMG. In Proceedings of the Conference Algoritmy 2015; pp. 141–150. Available online: http://www.iam.fmph.uniba.sk/amuc/ojs/index.php/algoritmy/article/view/324 (accessed on 28 November 2022).
  21. Chow, E. A Priori Sparsity Patterns for Parallel Sparse Approximate Inverse Preconditioners. SIAM J. Sci. Comput. 2000, 21, 1804–1822. [Google Scholar] [CrossRef]
  22. Grylonakis, E.N.G.; Filelis-Papadopoulos, C.K.; Gravvanis, G.A. On the Numerical Solution of the Generalized Dirichlet-Neumann Map for the 2D Laplace Equation Using Modified Generic Factored Approximate Sparse Inverse Preconditioning. In Proceedings of the 19th Panhellenic Conference on Informatics, Athens, Greece, 1–3 October 2015; Association for Computing Machinery: New York, NY, USA, 2015; pp. 13–18. [Google Scholar] [CrossRef]
  23. Schenk, O.; Gärtner, K. PARDISO. In Encyclopedia of Parallel Computing; Padua, D., Ed.; Springer: Boston, MA, USA, 2011; pp. 1458–1464. [Google Scholar] [CrossRef]
24. Trottenberg, U.; Oosterlee, C.W.; Schüller, A. Multigrid; Academic Press: Cambridge, MA, USA; Elsevier: Amsterdam, The Netherlands, 2000. [Google Scholar]
  25. Filelis-Papadopoulos, C.K.; Gravvanis, G.A. Parallel multigrid algorithms based on generic approximate sparse inverses: An SMP approach. J. Supercomput. 2014, 67, 384–407. [Google Scholar] [CrossRef]
Figure 1. A V-cycle of aggregation multigrid in conjunction with deflation.
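Figure 1 combines the aggregation-based V-cycle with a deflation correction acting on a complementary subspace. As a minimal illustration of the deflation component alone, a generic sketch in the spirit of refs. [14,15] and not the paper's implementation (the toy matrix, aggregates, and function name here are assumptions):

```python
import numpy as np

# Generic deflation correction: c = Z E^{-1} Z^T r with E = Z^T A Z
# (the Galerkin coarse matrix). The AMG smoothing/coarse-grid parts are omitted.
def deflation_correction(A, Z, r):
    E = Z.T @ A @ Z
    return Z @ np.linalg.solve(E, Z.T @ r)

# Toy example: 1D Poisson matrix with two piecewise-constant deflation vectors
n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Z = np.zeros((n, 2))
Z[:4, 0] = 1.0  # aggregate 1
Z[4:, 1] = 1.0  # aggregate 2
r = np.ones(n)
c = deflation_correction(A, Z, r)

# After the correction, the residual is orthogonal to range(Z):
print(np.allclose(Z.T @ (r - A @ c), 0))  # True
```

The defining property of the correction is that it annihilates the residual components in the deflation subspace, which is why it can accelerate the multigrid iteration on the remaining subspace.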
Figure 2. Distribution of columns of matrix G on four tasks.
Figure 3. Computations performed by the two cores of Task 2 for the formulation of the fourth and fifth columns of matrix G of the MGenFAspI.
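The column distribution depicted in Figures 2 and 3 amounts to partitioning the columns of G into contiguous blocks over the available tasks. A minimal sketch of such a block partitioning (the function name and exact layout are assumptions, not taken from the paper):

```python
# Contiguous block distribution of the columns of G over a number of tasks;
# the first (n_cols % n_tasks) tasks receive one extra column.
def distribute_columns(n_cols, n_tasks):
    base, extra = divmod(n_cols, n_tasks)
    blocks, start = [], 0
    for t in range(n_tasks):
        size = base + (1 if t < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks

# Example: 10 columns over 4 tasks
print(distribute_columns(10, 4))  # [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
```

Within each task, the columns of a block can then be divided further among the task's cores, as in Figure 3.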
Figure 4. Execution times (in seconds) of the parallel V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 2 for the 3D Poisson model problem with nx = 70.
Figure 5. Speedups of the parallel V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 2 for the 3D Poisson model problem with nx = 70.
Figure 6. Efficiency of the parallel V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 2 for the 3D Poisson model problem with nx = 70.
Table 1. Convergence behavior and performance of the V-cycle of AGAMG for the 2D Poisson model problem.

Level  Metric              nx=50    nx=100   nx=200   nx=300    nx=500
2      Execution time (s)  0.0967   0.8053   8.7529   40.4147   275.5410
2      Iterations          25       27       27       28        28
3      Execution time (s)  0.1118   0.5446   4.2991   15.6946   90.7970
3      Iterations          41       45       52       55        57
4      Execution time (s)  0.125    0.6148   3.7552   11.7529   59.7511
4      Iterations          49       59       63       64        65
5      Execution time (s)  0.1483   0.7674   3.9582   10.4572   39.3368
5      Iterations          61       85       98       105       115
Table 2. Convergence behavior and performance of the V-cycle of AGAMG in combination with the deflation technique for the 2D Poisson model problem.

Level  Metric              nx=50    nx=100   nx=200    nx=300    nx=500
2      Execution time (s)  1.6971   1.2749   11.1617   42.9594   296.577
2      Iterations          20       26       29        27        29
3      Execution time (s)  0.3221   1.5183   7.8454    22.2745   117.9680
3      Iterations          42       45       47        46        56
4      Execution time (s)  0.3351   2.0727   9.1327    24.3004   100.505
4      Iterations          40       60       59        61        66
5      Execution time (s)  0.3964   2.9885   16.976    41.8845   170.785
5      Iterations          46       88       124       130       181
Table 3. Convergence behavior and performance of the V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 1 for the 2D Poisson model problem.

Level  Metric              nx=50    nx=100    nx=200    nx=300     nx=500
2      Execution time (s)  0.6232   4.2415    24.0091   72.6027    374.277
2      Iterations          17       25        27        27         28
3      Execution time (s)  1.3349   16.3954   39.3299   93.8537    378.846
3      Iterations          32       96        54        54         67
4      Execution time (s)  2.9044   33.2589   52.0096   379.832    561.296
4      Iterations          65       185       70        218        105
Table 4. Convergence behavior and performance of the V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 1 for the 3D Poisson model problem.

Level  Metric              nx=50     nx=100    nx=200    nx=300    nx=500
2      Execution time (s)  9.1539    47.4512   573.946   2146.98   5649.41
2      Iterations          8         10        16        20        18
3      Execution time (s)  9.9429    48.6524   571.837   1261.92   3664.08
3      Iterations          8         10        26        21        23
4      Execution time (s)  10.4677   54.8827   327.732   535.449   2954.17
4      Iterations          8         11        28        23        57
Table 5. Convergence behavior and performance of the V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 2 for the 2D Poisson model problem.

Level  Metric              nx=50    nx=100    nx=200     nx=300     nx=500
2      Execution time (s)  1.3201   6.9443    36.0793    97.401     441.648
2      Iterations          21       26        28         27         28
3      Execution time (s)  1.9226   11.4484   50.453     135.453    454.7931
3      Iterations          27       40        42         48         52
4      Execution time (s)  2.093    13.0483   73.6803    191.0220   555.9120
4      Iterations          28       43        59         65         64
5      Execution time (s)  1.9354   17.4081   100.6990   282.1470   860.0930
5      Iterations          25       56        79         96         103
Table 6. Convergence behavior and performance of the V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 2 for the 3D Poisson model problem.

Level  Metric              nx=50     nx=100    nx=200     nx=300      nx=500
2      Execution time (s)  8.9012    48.5778   709.2940   1974.3400   5529.2900
2      Iterations          8         10        15         16          16
3      Execution time (s)  9.986     49.2396   519.0140   1704.4200   4300.770
3      Iterations          8         10        15         20          22
4      Execution time (s)  10.2313   54.0935   427.9530   1100.960    2115.110
4      Iterations          8         11        17         23          25
5      Execution time (s)  13.5096   51.6422   605.7950   1135.4700   1956.0400
5      Iterations          10        10        24         25          27
Table 7. Convergence behavior and performance of the V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and different values of lfill for the 2D Poisson model problem with nx = 300.

Level  Metric              lfill=1    lfill=2    lfill=3    lfill=4
2      Execution time (s)  72.6027    98.6010    132.5990   174.9720
2      Iterations          27         27         27         27
3      Execution time (s)  93.8537    138.0190   208.5200   310.3720
3      Iterations          54         48         48         50
4      Execution time (s)  379.8320   191.1260   318.5270   398.9170
4      Iterations          218        65         71         61
5      Execution time (s)  -          283.9880   407.6990   730.2840
5      Iterations          -          96         89         111
Table 8. Convergence behavior and performance of the V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and different values of lfill for the Sherman1 problem.

Level  Metric              lfill=1   lfill=2   lfill=3   lfill=4
2      Execution time (s)  0.8757    0.9876    1.0288    1.0522
2      Iterations          369       367       369       357
3      Execution time (s)  1.1042    1.2073    1.3506    1.1743
3      Iterations          365       368       366       341
4      Execution time (s)  1.3235    1.5063    1.6007    1.6300
4      Iterations          369       371       369       354
5      Execution time (s)  1.5504    1.6823    1.8967    1.8641
5      Iterations          375       360       367       339
Table 9. Convergence behavior and performance of the parallel V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 2 for the 3D Poisson model problem with nx = 70.

Level  Metric              1 thread   2 threads   4 threads   6 threads
2      Execution time (s)  2867.37    1804.98     1125.15     820.11
2      Iterations          54         37          43          40
3      Execution time (s)  2129.46    1410.57     1087.21     866.76
3      Iterations          34         41          35          33
4      Execution time (s)  3542.25    2145.36     1621.55     1299.05
4      Iterations          53         37          41          36
5      Execution time (s)  3083.26    2145.08     1506.31     1201.97
5      Iterations          45         46          30          34
Table 10. Speedup of the parallel V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 2 for the 3D Poisson model problem with nx = 70.

Level  2 threads   4 threads   6 threads
2      1.59        2.55        3.77
3      1.51        1.96        2.46
4      1.65        2.18        2.73
5      1.44        2.05        2.57
Table 11. Efficiency of the parallel V-cycle of AGAMG in combination with the deflation technique and MGenFAspI with droptol = 0 and lfill = 2 for the 3D Poisson model problem with nx = 70.

Level  2 threads   4 threads   6 threads
2      0.79        0.64        0.58
3      0.75        0.49        0.41
4      0.83        0.558       0.45
5      0.72        0.51        0.43
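The entries of Tables 10 and 11 follow the standard definitions of parallel speedup and efficiency, S_p = T_1 / T_p and E_p = S_p / p. A minimal sketch (function names are my own) that reproduces, for example, the level-3 entries from the timings in Table 9:

```python
# Standard parallel performance metrics: T_1 is the single-thread time,
# T_p the time on p threads.
def speedup(t_serial: float, t_parallel: float) -> float:
    return t_serial / t_parallel

def efficiency(t_serial: float, t_parallel: float, threads: int) -> float:
    return speedup(t_serial, t_parallel) / threads

# Level-3 execution times from Table 9 on 1 and 2 threads (seconds)
t1, t2 = 2129.46, 1410.57
print(round(speedup(t1, t2), 2))        # 1.51 (Table 10, level 3, 2 threads)
print(round(efficiency(t1, t2, 2), 2))  # 0.75 (Table 11, level 3, 2 threads)
```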

Share and Cite

MDPI and ACS Style

Natsiou, A.A.; Gravvanis, G.A.; Filelis-Papadopoulos, C.K.; Giannoutakis, K.M. An Aggregation-Based Algebraic Multigrid Method with Deflation Techniques and Modified Generic Factored Approximate Sparse Inverses. Mathematics 2023, 11, 640. https://doi.org/10.3390/math11030640

