Anderson Acceleration of the Arnoldi-Inout Method for Computing PageRank

Anderson(m0) extrapolation, an accelerator for fixed-point iterations, stores m0 + 1 prior evaluations of the fixed-point map and computes a linear combination of those evaluations as the new iterate. Since the computational cost of Anderson(m0) acceleration grows as the parameter m0 increases, a small m0 is the common choice in practice. In this paper, with the aim of speeding up the computation of PageRank problems, a new method is developed by applying Anderson(1) extrapolation at periodic intervals within the Arnoldi-Inout method. The new method is called the AIOA method. Convergence analysis of the AIOA method is discussed in detail. Numerical results on several PageRank problems are presented to illustrate the effectiveness of the proposed method.


Introduction
As the core technology of network information retrieval, Google's PageRank model (called the PageRank problem) uses the original hyperlink structure of the World Wide Web to determine the importance of each page and has received a lot of attention in the last two decades. The core of the PageRank problem is to compute a dominant eigenvector (or PageRank vector) of the Google matrix A by using the classical power method [1]:

x^(k+1) = Ax^(k), with A = αP + (1 − α)ve^T,

where x is the PageRank vector, e is a column vector with all elements equal to 1, v is a personalized vector whose elements sum to 1, P is a column-stochastic matrix (i.e., the dangling nodes have been replaced by columns with entries 1/n), and α ∈ (0, 1) is a damping factor. As the damping factor α approaches 1, the Google matrix approaches the original hyperlink structure. However, for large α such as α ≥ 0.99, the second eigenvalue (of modulus ≤ α) of the matrix A will be close to the dominant eigenvalue (equal to 1) [2], so that the classical power method suffers from slow convergence. In order to accelerate the power method, many new algorithms have been proposed for computing PageRank problems. The quadratic extrapolation method proposed by Kamvar et al. [3] accelerates the convergence by periodically subtracting estimates of non-dominant eigenvectors from the current iterate of the power method. It is worth mentioning that the authors of [4] provide a theoretical justification for acceleration methods, generalizing the quadratic extrapolation and interpreting it as a Krylov subspace method. Gleich et al. [5] proposed an inner-outer iteration, wherein an inner PageRank linear system with a smaller damping factor is solved in each iteration. The inner-outer iteration shows good potential as a framework for accelerating PageRank computations, and a series of methods have been proposed based on it. For example, Gu et al. [6] constructed the power-inner-outer (PIO) method by combining the inner-outer iteration with the power method.
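As an illustration, the power iteration above can be sketched as follows (a minimal Python sketch, assuming a small dense column-stochastic P and the uniform personalization vector v = e/n; the function name pagerank_power is our own, not the paper's code):

```python
import numpy as np

def pagerank_power(P, alpha=0.85, tol=1e-8, max_iter=1000):
    """Power iteration for the Google matrix A = alpha*P + (1-alpha)*v*e^T.

    P is assumed column-stochastic (dangling columns already replaced);
    v is taken as the uniform teleportation vector e/n.
    """
    n = P.shape[0]
    v = np.full(n, 1.0 / n)
    x = v.copy()
    for _ in range(max_iter):
        # A @ x = alpha*P@x + (1-alpha)*v, since e^T x = 1 is maintained
        x_new = alpha * (P @ x) + (1 - alpha) * v
        x_new /= x_new.sum()          # keep x a probability vector
        if np.linalg.norm(x_new - x, 1) < tol:
            return x_new
        x = x_new
    return x

# tiny 3-page example: column-stochastic link matrix
P = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
x = pagerank_power(P, alpha=0.85)
```

For large α the contraction factor per step approaches 1, which is exactly the slow-convergence regime the accelerated methods below are designed for.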
It is worth mentioning that different versions of the Arnoldi algorithm applied to PageRank computations were first introduced in [7]. Gu and Wang [8] proposed the Arnoldi-Inout (AIO) algorithm by knitting the inner-outer iteration with the thick restarted Arnoldi algorithm [9]. Hu et al. [10] proposed a variant of the Power-Arnoldi (PA) algorithm [11] by using an extrapolation process based on a trace of the Google matrix A [12].
Anderson(m0) acceleration [13,14] has been widely used to accelerate the convergence of fixed-point iterations. Its principle is to store m0 + 1 prior evaluations of the fixed-point map and compute a linear combination of those evaluations to obtain the new iterate; Anderson(0) is the given fixed-point iteration itself. Note that when the parameter m0 becomes large, the computational cost of Anderson(m0) acceleration becomes expensive. Hence, in most applications, m0 is chosen to be small, and we set m0 = 1 throughout this paper. In [15], Toth et al. proved that Anderson(1) extrapolation is locally q-linearly convergent. Pratapa et al. [16] developed the Alternating Anderson-Jacobi (AAJ) method by periodically employing Anderson extrapolation to accelerate the classical Jacobi iterative method for sparse linear systems.
In this paper, with the aim of accelerating the Arnoldi-Inout method for computing PageRank problems, Anderson(1) extrapolation is used as an accelerator, and a new method is presented by periodically combining Anderson(1) extrapolation with the Arnoldi-Inout method. Our proposed method is called the AIOA method. We describe its construction, analyze its convergence behavior in detail, and present numerical experiments demonstrating the effectiveness of the new algorithm.
The other parts of this article are structured as follows: In Section 2, we briefly review the Anderson acceleration and the Arnoldi-Inout method for PageRank problems. In Section 3, the AIOA method is constructed, and its convergence behavior is discussed. In Section 4, numerical comparisons are reported. Finally, in Section 5, we give some conclusions.

Anderson Acceleration
Anderson acceleration (also known as Anderson mixing) has been widely used in electronic structure computations [17]. Walker et al. [14] developed it for solving fixed-point problems: x = g(x), where x ∈ R^n and g : R^n → R^n. They showed that Anderson acceleration without truncation is essentially equivalent, in a certain sense, to the generalized minimal residual method (GMRES) [18] for linear problems. It has been proved that the Anderson iteration is convergent if the fixed-point map g is a contraction and the coefficients in the linear combination remain bounded [15].
In this paper, we consider the Anderson(1) acceleration, which stores two prior evaluations g(x0) and g(x1) and then computes x2, a linear combination of g(x0) and g(x1), as the new iterate. The main algorithmic steps of Anderson(1) are given as Algorithm 1. According to [15], the constrained linear least-squares problem (2) in step 4 of Algorithm 1,

min ||α0 f0 + α1 f1||_2 subject to α0 + α1 = 1, with f_i = g(x_i) − x_i, (2)

can be formulated as an equivalent, unconstrained least-squares problem:

min over γ0 of ||f1 − γ0(f1 − f0)||_2. (3)

The unconstrained least-squares problem (3) is easy to solve; for example, Pratapa et al. [16] chose the generalized inverse to compute γ0, and Walker et al. [19] chose QR decomposition [18] to compute γ0.
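To make the update concrete, here is a minimal Python sketch of one Anderson(1) step using the closed-form solution of the one-dimensional problem (3); the function name and the affine test map are our own illustrations, not the paper's code:

```python
import numpy as np

def anderson1_step(g, x0):
    """One Anderson(1) extrapolation step for the fixed-point map g.

    Evaluates g(x0) and g(x1) with x1 = g(x0), solves the scalar
    least-squares problem min ||f1 - gamma0*(f1 - f0)|| in closed form,
    and returns x2 = (1-gamma0)*g(x1) + gamma0*g(x0).
    """
    gx0 = g(x0)
    x1 = gx0
    gx1 = g(x1)
    f0 = gx0 - x0                           # residuals f_i = g(x_i) - x_i
    f1 = gx1 - x1
    d = f1 - f0
    gamma0 = float(f1 @ d) / float(d @ d)   # closed-form minimizer
    return (1.0 - gamma0) * gx1 + gamma0 * gx0

# contractive affine map g(x) = M x + b, fixed point solves (I - M) x = b
M = np.array([[0.5, 0.1], [0.0, 0.4]])
b = np.array([1.0, 1.0])
g = lambda x: M @ x + b
x2 = anderson1_step(g, np.zeros(2))
```

On this example the extrapolated iterate x2 is closer to the fixed point than two plain fixed-point steps g(g(x0)), which is the behavior Anderson(1) is designed to produce.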

The Arnoldi-Inout Method for Computing PageRank
Gu and Wang [8] proposed the Arnoldi-Inout method by preconditioning the inner-outer iteration with the thick restarted Arnoldi method. Its algorithmic version can be found in Algorithm 2.
Algorithm 2 Arnoldi-Inout method [8]
Input: an initial vector x0, the size of the subspace m, the number p of approximate eigenvectors that are retained from one cycle to the next, an inner tolerance η, an outer tolerance τ, and three parameters α1, α2, and maxit to control the inner-outer iteration.
(1). Apply the thick restarted Arnoldi algorithm [8,9] a few times (2-3 times). If the residual norm satisfies the prescribed tolerance, then stop; otherwise, continue.
(2). Run the inner-outer iteration with x as the initial guess, where x is the approximate vector obtained from the thick restarted Arnoldi algorithm, until the switching condition controlled by α1, α2 and maxit is met; then return to step (1).
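The inner-outer iteration invoked in step (2) can be sketched as follows (a Python sketch of the method of [5] in our notation: each outer step solves (I − βP)x = (α − β)Px_k + (1 − α)v inexactly by inner Richardson iterations up to the inner tolerance η; the small dense example matrix is our own illustration):

```python
import numpy as np

def inner_outer(P, v, alpha=0.99, beta=0.5, eta=1e-2, tol=1e-8, max_outer=1000):
    """Inner-outer iteration for PageRank (sketch of [5])."""
    x = v.copy()
    y = P @ x
    for _ in range(max_outer):
        # outer residual of the PageRank system
        if np.linalg.norm(alpha * y + (1 - alpha) * v - x) < tol:
            break
        f = (alpha - beta) * y + (1 - alpha) * v
        while True:                        # inner Richardson: at least one step
            x = f + beta * y
            y = P @ x
            if np.linalg.norm(f + beta * y - x) < eta:
                break
    return x

P = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
v = np.full(3, 1.0 / 3)
x = inner_outer(P, v)
```

The inner system uses the smaller damping factor β < α, so each Richardson sweep contracts quickly even when α is close to 1.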
For Algorithm 2, it is necessary to indicate that: (1) The detailed description of the thick restarted Arnoldi algorithm in step 1 can be found in [8,9]; we leave out its implementation for conciseness. (2) The parameters α1, α2, restart and maxit are used to control the switching between the inner-outer iteration and the thick restarted Arnoldi algorithm. The specific switching mechanism and more details can be found in [8].

The AIOA Method for Computing PageRank
In this section, we combine the Arnoldi-Inout method with the Anderson(1) acceleration. The new method is called the AIOA method, which can be understood as the Arnoldi-Inout method accelerated with the Anderson(1) extrapolation. We first describe the construction of the AIOA method and then analyze its convergence behavior.

The Construction of the AIOA Method
The mechanism of the AIOA method can be described as follows: We first run the Arnoldi-Inout method with a given initial guess x0 to obtain an approximation vector x1. If this approximation is unsatisfactory, we treat the inner-outer iteration as a fixed-point problem and run Algorithm 1 with x1 as the starting vector to obtain another approximation vector x_new. If x_new does not work better than the approximation vector x3 of the fixed-point problem, we set x_new = x3. If the new approximation vector x_new still does not reach the specified accuracy, we return to the Arnoldi-Inout method with x_new as the starting vector. This process is repeated until the required accuracy is reached. The specific algorithmic version is shown as follows.
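The alternation just described can be summarized in the following Python sketch (our own simplified illustration: a few power-method steps stand in for the Arnoldi-Inout phase, and the fixed-point map g is the Google matrix action rather than the inner-outer operator):

```python
import numpy as np

# 3-page example: column-stochastic P and uniform teleportation vector v
P = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
alpha = 0.99
v = np.full(3, 1.0 / 3)
A = lambda x: alpha * (P @ x) + (1 - alpha) * v * x.sum()  # Google matrix action
residual = lambda x: np.linalg.norm(A(x) - x)

def aioa_sketch(x, g, phase_steps=5, tol=1e-8, max_cycles=50):
    for _ in range(max_cycles):
        for _ in range(phase_steps):           # stand-in for the Arnoldi-Inout phase
            x = g(x)
        if residual(x) < tol:
            break
        x2, x3 = g(x), g(g(x))                 # two fixed-point evaluations
        f1, f2 = x2 - x, x3 - x2
        d = f2 - f1
        gamma0 = float(f2 @ d) / float(d @ d)  # closed-form Anderson(1) coefficient
        x_new = (1.0 - gamma0) * x3 + gamma0 * x2
        x_new /= x_new.sum()                   # keep a probability vector
        if residual(x_new) > residual(x3):     # safeguard: fall back to x3
            x_new = x3
        x = x_new
    return x

x = aioa_sketch(v.copy(), A)
```

The safeguard line mirrors the rule above: whenever the extrapolated vector is worse than the plain fixed-point iterate x3, the Anderson step is discarded.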

Convergence Analysis
The convergence of the Arnoldi-Inout method and that of the Anderson acceleration can be found in [8,14,15]. In this subsection, we analyze the convergence of the AIOA method. Specifically, the convergence analysis of Algorithm 3 focuses on the process when turning from the Anderson(1) acceleration to the Arnoldi-Inout method.

Algorithm 3 AIOA method
(1). Given a unit initial guess x0, an inner tolerance η, an outer tolerance τ, the size of the subspace m, the number p of approximate eigenvectors that are retained from one cycle to the next, and three parameters α1, α2 and maxit to control the inner-outer iteration.
(2). Run Algorithm 2 with the initial vector x0. If the residual norm satisfies τ, then stop; otherwise, continue.
(3). Run Algorithm 1 with x1 as the starting guess, where x1 is the approximation vector obtained from step 2.
(4). If the resulting vector x_new does not work better than x3, set x_new = x3. If the residual norm of x_new satisfies τ, then stop; otherwise, return to step 2 with x_new as the starting vector.

Let L_{m−1} denote the set of polynomials whose degree does not exceed m − 1 and σ(A) represent the set of eigenvalues of the matrix A. Assume the eigenvalues of A are sorted in decreasing order of modulus, 1 = |λ1| > |λ2| ≥ · · · ≥ |λn|. The following theorem, proposed by Saad [20], describes the relationship between an approximate eigenvector µ1 and the Krylov subspace K_m.

Theorem 1 ([20]). Assume that A is diagonalizable and that the initial vector v0 in Arnoldi's method has the expansion v0 = Σ_{i=1}^n ζi µi with respect to the eigenbasis {µi}_{i=1,2,··· ,n}, in which ||µi||_1 = 1, i = 1, 2, · · · , n, and ζ1 ≠ 0. Then the following inequality holds:

||(I − P_m)µ1||_2 ≤ ξ1 ε^(m), where ξ1 = Σ_{i=2}^n |ζi| / |ζ1| and ε^(m) = min over p ∈ L_{m−1}, p(λ1) = 1 of max over λ ∈ σ(A)\{λ1} of |p(λ)|,

and P_m is the orthogonal projector onto the subspace K_m(A, v0).

For the purpose of analyzing the convergence speed of our algorithm, two useful theorems about the spectral properties of the Google matrix A are given as follows.

Theorem 2 ([21]). Assume that the spectrum of the column-stochastic matrix P is {1, π2, · · · , πn}; then the spectrum of the matrix A = αP + (1 − α)ve^T is {1, απ2, · · · , απn}, where α ∈ (0, 1) and v is a vector with nonnegative elements such that e^T v = 1.

Theorem 3. [2]
Let P be an n × n column-stochastic matrix. Let α be a real number such that 0 < α < 1. Let E be an n × n rank-one column-stochastic matrix E = ve^T, where e is the n-vector whose elements are all ones and v is an n-vector whose elements are all nonnegative and sum to 1. Then A = αP + (1 − α)E is an n × n column-stochastic matrix whose dominant eigenvalue is λ1 = 1, and |λ2| ≤ α.
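Theorems 2 and 3 are easy to verify numerically. The following Python sketch checks them on a small cyclic example (our own choice of matrix) whose subdominant eigenvalues of P have modulus 1, so that |λ2(A)| equals α exactly:

```python
import numpy as np

# a 3-cycle: column-stochastic permutation P with eigenvalues {1, w, w^2}, |w| = 1
P = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
alpha = 0.99
n = P.shape[0]
v = np.full(n, 1.0 / n)
A = alpha * P + (1 - alpha) * np.outer(v, np.ones(n))  # Google matrix

# eigenvalue moduli in ascending order: expect [alpha, alpha, 1]
modA = np.sort(np.abs(np.linalg.eigvals(A)))
```

Per Theorem 2, the subdominant eigenvalues of P are damped by the factor α while the dominant eigenvalue stays at 1, which is the bound |λ2| ≤ α of Theorem 3.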
In the Arnoldi-Inout method, let v0 from the previous thick restarted Arnoldi process be the starting vector for the inner-outer iteration. The inner-outer method then produces the vector v1 = G^k v0, where k ≥ maxit and G = (I − βP)^{−1}((α − β)P + (1 − α)ve^T). The derivation of the iteration matrix G can be found in [5]. In our proposed method, we run Algorithm 1 with v1 as the initial vector. Note that in the Anderson(1) acceleration we treat the inner-outer iteration as a fixed-point iteration, so the new vector v_new = ω((1 − γ0)G²v1 + γ0 Gv1) is produced, where ω is a normalizing factor. If v_new does not work better than the vector G²v1, then, as given in Algorithm 3, we set v_new = G²v1, which means the Anderson(1) acceleration reduces to the inner-outer iteration, and the convergence of Algorithm 3 is certainly established in this case. Hence, we only need to discuss the convergence of the other case, when v_new = ω((1 − γ0)G²v1 + γ0 Gv1) works better than G²v1.
In the next cycle of the AIOA algorithm, an m-step Arnoldi process is run with v_new as the starting vector, and the new Krylov subspace K_m(A, v_new) = span{v_new, Av_new, · · · , A^{m−1}v_new} is constructed. Next, we introduce the theorem that illustrates the convergence of the AIOA method.

Remark 1.
Comparing (4) with (5), it is easy to see that our method can improve the convergence speed by a factor of at least Λ when turning from the Anderson(1) acceleration to the Arnoldi-Inout method.

Numerical Experiments
In this section, we first give an appropriate choice for the parameter maxit and then test the effectiveness of the AIOA method. For the thick restarted Arnoldi method, two parameters, m and p, need to be considered, but they play the same role in the Arnoldi-Inout [8] method and the AIOA method. In addition, as m and p increase, the cost becomes expensive, so they usually take small values. As a result, we do not discuss the choice of m and p in detail and set m = 4 and p = 3 for all test examples.
All the numerical experiments were performed using the MATLAB R2018a programming package on a 2.10 GHz CPU with 16 GB RAM. Table 1 lists the characteristics of the test matrices, where n represents the matrix size, nnz denotes the number of nonzero elements, and den is the density, defined by den = nnz/(n × n) × 100. All the test matrices are available from https://sparse.tamu.edu/ (accessed on 14 July 2020). For fairness, the same initial guess x0 = v = e/n with e = [1, 1, · · · , 1]^T was used. The damping factors were chosen as α = 0.99, 0.993, 0.995 and 0.998 in all numerical experiments. The stopping criterion was the 2-norm of the residual, with a prescribed outer tolerance of τ = 10^{−8}. For the inner-outer iterations, the inner residual tolerance was η = 10^{−2}, and the smaller damping factor was β = 0.5. The parameters chosen to control the flip-flop were α1 = α − 0.1 and α2 = α − 0.1. We ran the thick restarted Arnoldi procedure twice in each loop of the Arnoldi-Inout [8] method and the AIOA method. In the AIOA algorithm, we chose QR decomposition to compute γ0.

The Selection of Parameter Maxit
In this subsection, we discuss the selection of the parameter maxit by analyzing the numerical results of the Arnoldi-Inout [8] (denoted as "AIO") method and the AIOA method for the web-Stanford matrix, which contains 281,903 pages and 2,312,497 links. Table 2 lists the matrix-vector products (MV) of the AIO method and the AIOA method for the web-Stanford matrix when α = 0.99, 0.993, 0.995, 0.998 and maxit = 2, 4, 6, 8, 10. Figure 1 depicts the curves of computing time (CPU) of the two methods versus maxit. From Table 2, it can be observed that the optimal maxit differs for different α and different methods. From Figure 1, the optimal maxit is 6 and the worst-performing maxit is 8 for the AIO method, but for the AIOA method the best value of maxit is not 6. For fairness, we chose maxit = 4 in the following numerical experiments. In addition, in Table 2, when α = 0.995 and maxit = 6, the MV count of the AIOA method is slightly higher than that of the AIO method, but its CPU time is still better than that of the AIO method. This suggests that our method has some potential.

Comparisons of Numerical Results
In this subsection, we test the effectiveness of the AIOA method through numerical comparisons with the inner-outer (denoted as "Inout") [5] method, the power-inner-outer (denoted as "PIO") [6] method and the Arnoldi-Inout (denoted as "AIO") [8] method in terms of iteration counts (IT), the number of matrix-vector products (MV) and the computing time (CPU) in seconds. In all experiments in this subsection, we set the parameters m = 4, p = 3 and maxit = 4. Tables 3-6 give the numerical results of the Inout method, the PIO method, the AIO method and the AIOA method for four matrices when α = 0.99, 0.993, 0.995, 0.998, and Figures 2-5 depict the residual convergence curves of the above methods with different α for all test matrices.
In order to better demonstrate the efficiency of our proposed method, we define

speedup = (CPU_AIO − CPU_AIOA) / CPU_AIO × 100%

to show the speedup of the AIOA method with respect to the AIO method in terms of CPU time. From the numerical results in Tables 3-6, it is easy to see that the AIOA method performs better than the other three methods in terms of IT, MV and CPU time for the four matrices with different damping factors. As expected, the advantage of the AIOA method is obvious for large α. For instance, when α = 0.995, the speedup is 52.65% in Table 3 and 36.66% in Table 5. When α = 0.998, the speedup is 49.48% in Table 4 and 60.32% in Table 6. In addition, from Figures 2-5, it is easy to observe that the AIOA method reaches the accuracy requirement faster than the Inout method, the PIO method and the AIO method for all test examples. Therefore, the above results verify the effectiveness of the AIOA method.

Conclusions
In this paper, by employing Anderson(1) extrapolation at periodic intervals within the Arnoldi-Inout method, we have presented a new method, called the AIOA method, to accelerate the computation of PageRank problems. Its implementation and convergence theorem can be found in Section 3. The numerical results in Section 4 show that the AIOA method is efficient and converges faster than the inner-outer method, the power-inner-outer method and the Arnoldi-Inout method. However, there is still work to be done; for example, determining the best choices for the parameters m, β and maxit remains difficult.