A FEAST Algorithm for the Linear Response Eigenvalue Problem

: In the linear response eigenvalue problem arising from quantum chemistry and physics, one needs to compute several positive eigenvalues together with the corresponding eigenvectors. For such a task, in this paper, we present a FEAST algorithm based on complex contour integration for the linear response eigenvalue problem. By simply dividing the spectrum into a collection of disjoint regions, the algorithm is able to parallelize the process of solving the linear response eigenvalue problem. The associated convergence results are established to reveal the accuracy of the approximated eigenspace. Numerical examples are presented to demonstrate the effectiveness of our proposed algorithm.


Introduction
In computational quantum chemistry and physics, the random phase approximation (RPA) or the Bethe-Salpeter (BS) equation describe the excitation states and absorption spectra for molecules or the surfaces of solids [1,2].One important question in the RPA or BS equation is how to compute a few eigenpairs associated with several of the smallest positive eigenvalues of the following eigenvalue problem: where A, B ∈ R N×N are both symmetric matrices and A B B A is positive definite [3][4][5].Through a similarity transformation, the eigenvalue problem (1) can be equivalently transformed into: where K = A − B and M = A + B. The eigenvalue problem (2) was still referred to as the linear response eigenvalue problem (LREP) [3,6] and will be so in this paper, as well.The condition imposed upon A and B in (1) implies that both K and M are N × N real symmetric positive definite matrices.However, there are cases where one of them may be indefinite [7].Therefore, to be consistence with [8][9][10], throughout the rest of the paper, we relax the condition on K and M to that they are symmetric and one of them is positive definite, while the other may be indefinite.There has been immense recent interest in developing new theories, efficient numerical algorithms of LREP, and the associated excitation response calculations of molecules for materials' design in energy science [11][12][13][14][15].
This practice of enumerating the eigenvalues of H will be used later for the much smaller projection of H, as well.
Since the dimension N is usually very large for systems of practical interest, solving LREP (2) is a very challenging problem.The iterative methods are recent efforts for computing the partial spectrum of (2), such as the locally-optimal block preconditioned 4D conjugate gradient method (LOBP4DCG) [6] and its space-extended variation [16], the block Chebyshev-Davidson method [10], as well as the generalized Lanczos-type methods [8,9].These algorithms are all based on the so-called pair of deflating subspaces, which is a generalization of the concept of the invariant subspace in the standard eigenvalue problems.Thus, these methods can be regarded as extensions of the associated classical methods for the standard eigenvalue problem.Recently, in [17], a density-matrix-based algorithm, called FEAST, was proposed by Polizzi for the calculation of a segment of eigenvalues and their associated eigenvectors of a real symmetric or Hermitian matrix.The algorithm promises high potential for parallelism by dividing the spectrum of a matrix into an arbitrary number of nonintersecting intervals.The eigenvalues in each interval and the associated eigenvectors can be solved independently of those in the other intervals.As a result, one can compute a large number of eigenpairs at once on modern computing architectures, while typical Krylov-or Davidson-type solvers fail in this respect because they require orthogonalization of any new basis vector with respect to the previously-converged ones.
The basic idea of the FEAST algorithm is based on the contour integral function: to construct bases of particular subspaces and obtain the corresponding approximate eigenpairs by the Rayleigh-Ritz procedure where C is any contour enclosing the set of wanted eigenvalues.The corresponding theoretical analysis was established in [18].In particular, it was shown that the FEAST algorithm can be understood as an accelerated subspace iteration algorithm.Due to the advantage of the FEAST algorithm in parallelism, it has been well received in the electronic structure calculations community.Additionally, the improvements and generalizations of FEAST algorithm have been active research subjects in eigenvalue problems for nearly a decade.For example, in [19], Kestyn, Polizzi, and Tang presented the FEAST eigensolver for non-Hermitian problems by using the dual subspaces for computing the left and right eigenvectors.In [20], the FEAST algorithm was developed to solve nonlinear eigenvalue problems for eigenvalues that lie in a user-defined region.Some other work for the FEAST method can be found in [21][22][23][24].Motivated by these facts, in this paper, we will continue the effort by extending the FEAST algorithm to LREP.The rest of the paper is organized as follows.In Section 2, some notations and preliminaries including the canonical angles of two subspaces and basic results for LREP are collected for use later.Section 3 contains our main algorithm, the implementation issue, and the corresponding convergence analysis.In Section 4, we present some numerical examples to show the numerical behavior of our algorithm.Finally, conclusions are drawn in Section 5.

Preliminaries
R m×n is the set of all m × n real matrices.For X ⊆ R n , dim(X ) is the dimension of X .For X ∈ R m×n , X T is its transpose and R(X) the column space of X, and the submatrix X (:,i:j) of X consists of column i to column j.I n is the n × n identity matrix or simply I if its dimension is clear from the context.For matrices or scalars X i , diag(X 1 , X 2 , . . ., X k ) denotes the block diagonal matrix with diagonal block X 1 , X 2 , . . ., X k .
For any given symmetric and positive definite matrix W ∈ R N×N , the W-inner product and its induced W-norm are defined by: If x, y W = 0, then we say x ⊥ W y or y ⊥ W x. The projector Π W is said to the W-orthogonal projector onto X if for any vector x ∈ R N , , let X and Y be the W-orthonormal basis matrices of subspaces X and Y, respectively, i.e., Denote the singular values of X T WY by σ j for 1 ≤ j ≤ k in ascending order, i.e., σ 1 ≤ • • • ≤ σ k .The diagonal matrix of W-canonical angles from (If = k, we may say that these angles are between X and Y [25].)X to Y in descending order is defined by: In what follows, we sometimes place a vector or matrix in one or both arguments of Θ W (., .)with the understanding that it is about the subspace spanned by the vector or the columns of the matrix argument.Several theoretical properties of the canonical angle and LREP have been established in [25] and [3], respectively.In particular, the following lemmas are necessary for our later developments.Lemma 1 ([9] Lemma 3.2).Let X and Y be two subspaces in R N with equal dimension dim(X ) = dim(Y ) = k.Suppose θ (1)  W (X , Y ) < π/2.Then, for any set {y 1 , y 2 , . . ., y k 1 } of the basis vectors in Y where 1 ≤ k 1 ≤ k, there is a set {x 1 , x 2 , . . ., x k 1 } of linear independent vectors in X such that Π W x j = y j for 1 ≤ j ≤ k 1 , where Π W is the W-orthogonal projector onto Y. (a) There exists a nonsingular Ψ ∈ R N×N such that: also definite, then all λ i > 0, and H is diagonalizable: (c) The eigen-decomposition of KM and MK is: respectively.
Lemma 3 ([10] Lemma 2.2).Let Z be an invariant subspace of H and Z = [V T , U T ] T be the basis matrix of Z with both V and U having N rows, then R(MV) = R(U).

The Main Algorithm
For LREP, suppose λ i for i = 1, . . ., s are all the wanted eigenvalues whose square are inside the circle C with center at c and radius r on the complex plane, as illustrated in the following Figure 1.Given a random matrix Y ∈ R N×s , for any filter function f (•) that serves the purpose of filtering out the unwanted eigen-directions, we have, by Lemma 2, As we known in [10], the ideal filter for computing λ i for i = 1, . . ., s should map all s wanted eigenvalues to one and all unwanted ones to zero.For such a purpose, we also consider the filter function applied in [17], i.e., By Cauchy's residue theorem [26] in complex analysis, we have: As stated in [24] [Lemma 1], the matrix Φ T (:, 1 : s ) Y, with probability one, is nonsingular if the entries of Y are random numbers from a continuous distribution and they are independent and identically distributed.Therefore, naturally, the columns of f (KM)Y span the invariant subspace of KM corresponding to λ 2 i for i = 1, . . ., s.However, it needs to calculate the contour integral.In practice, this is done by using numerical quadratures to compute this filter f (KM) approximately.It is noted that λ 2 i for i = 1, . . ., s are all real.Let µ = c + re ιt for 0 ≤ t ≤ 2π.Then, we have: where Re() stands for the real part.A q-point interpolatory quadrature rule, such as the trapezoidal rule or Gauss-Legendre quadrature, leads to: where t i and ω i are the quadrature nodes and weights, respectively, and Then, by ( 6), V is computed by solving shifted linear systems of the form: Furthermore, for any interpolatory quadrature rule, we have by [18] that: It follows by Lemma 3 that if the columns of V span the y-component of an approximate invariant subspace of H, then the columns of U = MV should span the x-component of the same approximate invariant subspace.Therefore, {R(U), R(V)} is a pair of approximated deflating subspaces of H.By [3], the best approximations of λ i for i = 1, . . ., s with the pair of approximate deflating subspaces {R(U), R(V)} are eigenvalues of: where W 1 and W 2 are two nonsingular factors of In particular, we choose Cholesky decomposition.This leads to: Let the eigenvalues of G be ρ 2 j in ascending order and the associated eigenvectors be q j , i.e., Gq j = ρ 2 j q j for 1 ≤ j ≤ s.Then, Approximating eigenpairs of H is then taken to be (ρ j , zj ) where: zj = ρ j ψj φj , φj = UR −1 q j and ψj = VR −1 q j .( We summarize the pseudo-code of the FEAST algorithm for LREP in Algorithm 1.A few remarks regarding Algorithm 1 are in order: Remark 1.(a) In practice, we do not need to save V (i) , U (i) and G (i) for every i in Algorithm 1.In fact, the subscript " (i) " is just convenience for the convergence analysis in the next subsection.(b) Since the trapezoidal rule yields much faster decay outside C than the Gauss-Legendre quadrature [23], our numerical examples use the trapezoidal rule to evaluate f (KM)Y, i.e., in (6), , , (c) In Algorithm 1, it usually is required to know the eigenvalue counts s inside the contour C in advance.In practice, s is unknown a priori, and an estimator has to be used instead.Some available methods have been proposed in [27,28] to estimate eigenvalue counts, based on stochastic evaluations of: (d) In our numerical implementation of Algorithm 1, we monitor the convergence of a computed eigenpair (ρ j , zj ) by its normalized relative residual norm: The approximate eigenpair (ρ j , zj ) is considered as converged when: where ω = c − r, ω = c + r, and ε is a preset tolerance.

Algorithm 1
The FEAST algorithm for LREP.

5:
If convergence is not reached then go to Step 2,with 6: end for

Convergence Analysis
Without loss of generality, we suppose in this subsection: and use the following simplifying notation: In Algorithm 1, for a given scalar k, naturally, we would use Ψ (k) as approximations to Ψ 1 and Φ (k) as approximations to Φ 1 .In this subsection, we will investigate how good such approximations may be.Notice that f (Λ 2 1 ) is invertible since f (λ 2 i ) > 1/2 for i = 1, . . ., s by (8).
Lemma 5. Suppose Ψ T 1 MY (0) is nonsingular.Then, Y (i) is full column rank for all i in Algorithm 1.
Proof.We prove this statement by induction.If Ψ T 1 MY (j) is invertible for some j, then: where j+1) has full column rank.This leads to R (j+1) and Q (j+1) both being invertible.By Step 5 of Algorithm 1, we have , which is also full column rank.Furthermore, is nonsingular.We conclude by induction that Y (i) is full column rank for i = 1, 2, . . . .
As we know in [29], an important quantity for the convergence properties of projection methods is the distance of the exact eigenspace from the search subspace.Theorem 1 establishes bounds on M-canonical angles from Ψ (:, k 1 : k 2 ) to the search subspace V (k) and M −1 -canonical angles from Φ (:, k 1 : k 2 ) to the search subspace U (k) , respectively, to illustrate the effectiveness of the FEAST algorithm for LREP.As stated in [18], if the spectrum of KM is distributed somewhat uniformly, then we would expect | f (λ )| to be as small as 10 −3 .That means tan Θ M (Ψ (:, k 1 : k 2 ) , V (k) ) F → 0 and tan Θ M −1 (Φ (:, k 1 : k 2 ) , U (k) ) F → 0 at a rate of 10 −3k .Based on Theorem 1, the following theorem is obtained to bound the M-canonical angles between Ψ (:, k 1 : k 2 ) and Ψ (k) (:,k 1 :k 2 ) and the M −1 -canonical angles between Φ (:, k 1 : k 2 ) and Φ (k) (:,k 1 :k 2 ) , respectively.Theorem 2. Suppose the condition of Lemma 5 holds.Using the notations of Theorem 1, we have: where: and Then, we can write . Furthermore, we have: sin Θ M (Ψ (:, Now, we turn to bound the term Q T Multiply ( 19) by V T (k) M from the left to get: Furthermore, we have: By Lemma 4, we conclude that: The inequality ( 15) is now a consequence of ( 17), ( 18), (20), and Theorem 1. Similarly we can prove (16).

Numerical Examples
In this section, we present some numerical examples to illustrate the effectiveness of Algorithm 1 and the upper bounds in Theorem 1.
In such a way, Y (0) satisfies the condition that Y T (0) MΨ (:,1:3) is nonsingular.We run Algorithm 1 and check the bound for tan Θ M (Ψ (:,1:3) , V (k) ) F given by (11) with k = 1 and k = 2, respectively.In Algorithm 1, the trapezoidal rule is applied to calculate f (KM)Y (0) with the parameters c = 1 and r = 0.2.We compute the following factors: in Table 1.From Table 1, we can see that, as η goes to zero, the bounds ε 2 for k = 1, 2 are sharp and insensitive to η.
Table 1.ε 1 together with their corresponding upper bounds ε 2 of Example 1. Example 2. To test the effectiveness of Algorithm 1, we chose three test problems, i.e., TEST 1 to TEST 3 in Table 2, which come from the linear response analysis for Na 2 , Na 4 , and silane (SiH4) compounds, respectively [9].The matrices K and M of TEST 1, TEST 2, and TEST 3 are both symmetric positive definite with order N = 1862, 2834, and 5660, respectively.We compute the eigenvalues whose square lies in the interval (ω t , ω t ) for t = 1, 2, 3 and the associated eigenvectors by using the MATLAB parallel pool.The interval (ω t , ω t ) and their corresponding eigenvalue counts s t for t = 1, 2, 3 are detailed in Table 2.All our experiments were performed on a Windows 10 (64 bit) PC-Intel(R) Core(TM) i7-6700 CPU 3.40 GHz, 16 GB of RAM using MATLAB Version 8.5 (R2015a) with machine epsilon 2.22 × 10 −16 in double precision floating point arithmetic.In demonstrating the quality of computed approximations, we calculated the normalized residual norms r(ρ j ) defined in (10) and relative eigenvalues errors: for the jth approximation eigenvalue (ρ j , zj ) where λ j were computed by MATLAB's function eig on H, and considered to be the "exact" eigenvalues for testing purposes.
In this example, we first investigated how the maximal relative eigenvalue errors and normalized residual norms, i.e., max e(ρ j ) and max r(ρ j ) in Table 3, respectively, changed for an increase in the number of quadrature points q in (6).We used the direct methods, such as Gaussian elimination, to solve the shifted linear systems (7) in this example.The numerical results of Algorithm 1 by running four FEAST iterations with q varying from four to nine are presented in Table 3.It is shown by Table 3 that the trapezoidal rule with 6, 7, or 8 quadrature nodes already achieved satisfactory results.
Notice that the wanted eigenvalues are inside the intervals presented in Table 2.The shift and invert strategy is usually not available for LREP since the traditional algorithms to calculate the partial spectrum of LREP, such as the methods proposed in [6,8], combined with the shift and invert strategy will lose the structure of (2).
In addition, as stated in [17], the FEAST algorithm can be cast as a direct technique in comparison to iterative Krylov subspace-type methods, because it is based on an exact mathematical derivation (5).Therefore, by dividing the whole spectrum of LREP into a number of disjoint intervals, the FEAST algorithm is able to compute all eigenvalues simultaneously on the parallel architectures.From this aspect, as [24], we compared Algorithm 1 with the MATLAB built-in function eig, which was used to compute all eigenvalues of H, and then, the target eigenvalues were selected in this example.Let the number of quadrature nodes q = 7 and the convergence tolerance ε = 10 −8 .Table 4 reports the total required CPU time in seconds of Algorithm 1 and eig for TEST 1, TEST 2, and TEST3, respectively.It is clear from Table 4 that Algorithm 1 was much faster than the MATLAB built-in function eig in this example.A similar comparison between the FEAST-type algorithms and the MATLAB built-in function eig in terms of timing can be found in [30,31].(µ i I − KM)X i = Y, for i = 1, 2, . . ., q.
Direct solvers referring to O(N 3 ) operations are prohibitively expensive in solving large-scale sparse linear systems.Iterative methods, such as Krylov subspace-type methods [32], are usually preferred for large-scale sparse linear systems.However, Krylov subspace solvers may need a large number of iterations to converge.A way to speed up the FEAST method for LREP is to terminate the iterative solvers for linear systems before full convergence is reached.Therefore, in this example, we investigated the effect of the accuracy in the solution of linear systems on the relative eigenvalues errors and normalized residual norms.The linear systems were solved column-by-column by running GMRES [33] until: Figure 2 plots the maximal relative eigenvalues errors and normalized residual norms of Algorithm 1 with q = 7 in solving TEST 1, TEST 2, and TEST 3 with ε lin = 10 −4 , 10 −6 , and 10 −8 .It is revealed by Figure 2 that the accuracy from the solution of the linear systems translated to the accuracy of the eigenpairs obtained by the FEAST algorithm for LREP; that is to say, higher accuracy in solving linear systems leads to better numerical performance in Algorithm 1.In particular, for a rather large ε lin , such as ε lin = 10 −4 , it is hard to obtain satisfactory normalized residual norms in this example.

Conclusions
In this paper, we proposed a FEAST algorithm, i.e., Algorithm 1, for the linear response eigenvalue problem (LREP).The algorithm can effectively calculate the eigenvectors associated with eigenvalues that are located inside some preset intervals.Compared with other methods for LREP, the attractive computational advantage of Algorithm 1 is that it is easily parallelizable.We implemented numerical examples by using the MATLAB parallel pool to compute the eigenpairs in each interval independently of the eigenpairs in the other interval.The corresponding numerical results demonstrated that Algorithm 1 was fast and could achieve high accuracy.In addition, theoretical convergence results for eigenspace approximations of Algorithm 1 were established in Theorems 1 and 2. These theoretical bounds revealed the accuracy of the approximations of eigenspace.Numerical examples showed that the bounds provided by Theorem 1 were sharp.
Notice that the main computational tasks in Algorithm 1 consisted of solving q independent shifted linear systems with s right-hand sides.Meanwhile, as described above, the relative residual bounds in the solution of the inner shifted linear systems were translated almost one-to-one into the residuals of the approximate eigenpairs.Therefore, the subjects of further study include the acceleration of solving shifted linear systems on parallel architectures and theoretical analysis on how the accuracy of the inexact solvers in solving shifted linear systems interacts with the convergence behavior of the FEAST algorithm.In addition, while the algorithm in this paper is for real symmetric K and M, it is also available for the case of Hermitian K and M, simply by replacing all R by C (the set of complex and each matrix/vector transpose by the complex conjugate and transpose.

Lemma 2 ([ 3 ]
Theorem 2.3).The following statements hold for any symmetric matrices K, M ∈ R N×N with M being positive definite.

Lemma 4 ([ 25 ]
Lemma 2.1).Let A and B be two symmetric matrices and λ(A) and λ(B) be the set of the eigenvalues of A and B, respectively.For the Sylvester equation AY − YB = S, if λ(A) ∩ λ(B) = ∅, then the equation has a unique solution Y, and moreover:

Figure 1 .
Figure 1.Schematic representation of the wanted eigenvalues.

Table 4 .
Comparison of Algorithm 1 and eig for CPU time in seconds.