Estimating the Quadratic Form x^T A^{-m} x for Symmetric Matrices: Further Progress and Numerical Computations

In this paper, we study estimates for quadratic forms of the type x^T A^{-m} x, m ∈ N, for symmetric matrices. We derive a general approach for estimating this type of quadratic form and we present upper bounds for the corresponding absolute error. Specifically, we consider three different approaches for estimating the quadratic form x^T A^{-m} x. The first approach is based on a projection method, the second is a minimization procedure, and the last approach is heuristic. Numerical examples showing the effectiveness of the estimates are presented. Furthermore, we compare the behavior of the proposed estimates with other methods derived in the literature.


Introduction
Let A ∈ R^{n×n} be a given symmetric positive definite matrix and x ∈ R^n. We are interested in estimating quadratic forms of the type x^T A^{-m} x, m ∈ N. Our main goal is an efficient and cheap approximate evaluation of the desired quadratic form without the direct computation of the matrix A^{-m}. To this end, we revisit the approach for estimating the quadratic form x^T A^{-1} x developed in [1] and extend it to the case of an arbitrary negative power of A.
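As a point of reference for the estimates discussed in this paper, the exact value of x^T A^{-m} x can be computed without forming A^{-m} explicitly by solving m linear systems; the following NumPy sketch (the helper name is ours, not the paper's) illustrates this baseline.

```python
import numpy as np

def quad_form_exact(A, x, m):
    """Reference value of x^T A^{-m} x, computed by m linear solves
    instead of forming A^{-m} explicitly."""
    y = x.copy()
    for _ in range(m):
        y = np.linalg.solve(A, y)   # after m solves, y = A^{-m} x
    return float(x @ y)

# small SPD example (the diagonal shift guarantees positive definiteness)
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
A = B @ B.T + 50 * np.eye(50)
x = rng.standard_normal(50)
val = quad_form_exact(A, x, 2)
```

This direct evaluation is what the cheap estimates below are meant to avoid for large n.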
The computation of quadratic forms is a mathematical problem with many applications. We indicatively mention some typical ones.
• Statistics: The inverse of the covariance matrix, referred to as the precision matrix, frequently appears in statistics. The covariance matrix reveals marginal correlations between variables, whereas the precision matrix represents the conditional correlations between pairs of variables given the remaining variables [2]. The diagonal of the inverse of a covariance matrix provides information about the quality of data in uncertainty quantification [3].
• Network analysis: Determining the importance of the nodes of a graph is a major issue in network analysis. This information can be extracted from the diagonal elements of the matrix (I_n − aA)^{-1}, where A is the adjacency matrix of the network, 0 < a < 1/ρ(A), and ρ(A) is the spectral radius of A. This matrix is referred to as a resolvent matrix; see, for example, [4] and the references therein.
• Numerical analysis: Quadratic forms arise naturally in the computation of the regularization parameter in Tikhonov regularization for solving ill-posed problems. In this case, the matrix has the form AA^T + λI_n, λ > 0. Many methods have been proposed in the literature for selecting the regularization parameter λ, such as the discrepancy principle, cross-validation, generalized cross-validation (GCV), and the L-curve; see, for example, [5] (Chapter 15) and the references therein. These methods involve quadratic forms of the type x^T (AA^T + λI_n)^{-m} x, with m = 1, 2, 3.
In practice, the exact computation of a quadratic form is often replaced by an estimate that is faster to evaluate. Given its numerous applications, the estimation of quadratic forms is an important practical problem that has been frequently studied in the literature. Let us indicatively mention some well-known methods. A widely used method is based on Gaussian quadrature [5] (Chapter 7) and [6]. Moreover, extrapolation procedures have been proposed: families of estimates were developed in [7] for the bilinear form x^T A^{-1} y for any invertible matrix, and in [8] for the bilinear form y* f(A) x for a Hermitian matrix.
In the present work, we consider alternative approaches to this problem. To begin, notice that the value of the quadratic form (x, A^{-m}x) is proportional to the square of the norm of x. Therefore, the task of estimating (x, A^{-m}x) consists of two steps:

1. Finding a number α such that
   α‖x‖² ≈ (x, A^{-m}x); (1)
2. Assessing the absolute error of the above estimate, i.e., determining a bound for the quantity
   |α‖x‖² − (x, A^{-m}x)|. (2)

In Section 2, we present upper bounds for the absolute error (2) for any given α. Section 3 is devoted to estimates of the value α in (1) using a projection method. In Section 4, we use the bounds from Section 2 as a stepping stone for estimating x^T A^{-m} x by a minimization method. A heuristic approach is outlined in Section 5. In Section 6, we briefly describe two methods used in previous studies, namely, an extrapolation approach and one based on Gaussian quadrature. Section 7 focuses on adapting the proposed estimates to matrices of the form AA^T + λI_n. Numerical examples illustrating the performance of the derived estimates are found in Section 8. We end this work with several concluding remarks in Section 9.

Bounds on the Error
In Proposition 1 below, we derive upper bounds on the error (2) for a given estimate α‖x‖² of the quadratic form x^T A^{-m} x. The first three bounds (UB1-UB3) are a direct generalization of a result from [1].

Proposition 1. Let A ∈ R^{n×n} be a symmetric positive definite matrix, x ∈ R^n, and let est = α‖x‖² be an estimate of the quadratic form x^T A^{-m} x. If we denote b = αA^m x − x, then the absolute error |α‖x‖² − (x, A^{-m}x)| is bounded from above by the expressions UB1-UB4 derived below. For estimates satisfying α‖x‖² ≤ (x, A^{-m}x), we also have the family of error bounds UB5, where p ≥ 0 can be chosen as any integer such that (x, A^p x)/(A^m x, A^p x) < α.

UB1.
The matrix A^{-m} is symmetric because A is symmetric, and by the Cauchy-Schwarz inequality, |(x, A^{-m}b)| ≤ ‖x‖ ‖A^{-m}b‖. Applying the Kantorovich inequality to the matrix A^m, where κ = λ_max/λ_min denotes the condition number of A, the norm ‖A^{-m}x‖ given by (3) can be bounded as in (4). Hence, the bound UB1 follows.

UB2.
By the Cauchy-Schwarz inequality, and following a similar approach as above based on the Kantorovich inequality, we obtain the bound UB2.

UB3.
Applying the Kantorovich inequality to the matrix A^m in a similar way as above, we immediately obtain the bound UB3.

UB4.
Applying the Cauchy-Schwarz inequality directly, we obtain the bound UB4.

UB5.
Since A is positive definite, so is A^q for any integer q; hence, the angle between the vectors v and A^q v is acute. Taking v = A^{-m}x and q = p + m, and using the assumption (x, A^p x)/(A^m x, A^p x) < α together with α‖x‖² ≤ (x, A^{-m}x), we arrive at Relation (5). The norm ‖A^{-m}x‖ can then be bounded using the Kantorovich inequality, as shown in Relation (4). After estimating the factor |sin(·)|, Relation (5) can be reformulated as the bound UB5.

Estimate of x T A −m x by the Projection Method
Our goal is to find a number α such that α‖x‖² ≈ (x, A^{-m}x) (see (1)). To that end, let us take a fixed k ∈ N_0 = N ∪ {0} and consider the decomposition x = αA^m x − b. Using the assumption b ⊥ A^k x, we obtain α = (x, A^k x)/(A^m x, A^k x). Hence, we obtain a family of estimates for x^T A^{-m} x, which we denote by est_proj(k), k ∈ N_0. The computational implementation requires ⌈(m + k)/2⌉ matrix-vector products (mvps).
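The projection estimate can be sketched in NumPy as follows: the orthogonality condition b ⊥ A^k x yields α = (x, A^k x)/(A^m x, A^k x), and the estimate is α‖x‖². The function name est_proj is ours, and the sketch favors clarity over the minimal matrix-vector-product count discussed above.

```python
import numpy as np

def est_proj(A, x, m, k):
    """Projection-type estimate of x^T A^{-m} x: choose alpha so that the
    residual b = alpha*A^m x - x is orthogonal to A^k x, i.e.,
    alpha = (x, A^k x) / (A^m x, A^k x); return alpha * ||x||^2."""
    Akx = x.copy()
    for _ in range(k):
        Akx = A @ Akx
    Amx = x.copy()
    for _ in range(m):
        Amx = A @ Amx
    alpha = (x @ Akx) / (Amx @ Akx)
    return alpha * (x @ x)

# demo: for an eigenvector the estimate is exact
A = np.diag([1.0, 2.0, 4.0])
x = np.array([0.0, 1.0, 0.0])     # eigenvector with eigenvalue 2
print(est_proj(A, x, m=2, k=2))   # exact value x^T A^{-2} x = 0.25
```

For a general vector x, the accuracy depends on k and on the spectrum of A, as quantified by the error bounds of Proposition 1.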
Let us now explore the error corresponding to the above choice of α. Since α‖x‖² is the estimate (see (1)), the error term is (x, A^{-m}b). Bounds on its absolute value can be found using Proposition 1.

Remark 1. Let us comment on the choice of the parameter k.
• Observe that the upper bounds UB1 and UB4 from Proposition 1 are minimal for k = m. In this case, we have b ⊥ A^m x; thus, b has the smallest possible norm. Therefore, from the point of view of minimizing the upper bound on the error (more precisely, upper bounds UB1 and UB4), a convenient choice is k = m. • However, if the goal is fast estimation, we can take k = 0 for even m and k = 1 for odd m, as these choices provide est_proj(0) = ‖x‖⁴/(x, A^m x) and est_proj(1) = ‖x‖²(x, Ax)/(x, A^{m+1}x), respectively, which are both easy to evaluate.
In general, for any choice of k, the error of the estimate can be assessed using Proposition 1.

Estimate of x T A −m x Using the Minimization Method
The estimates that we present in this section stem from the upper bounds UB2 and UB3 for the absolute error |(x, A −m b)|, which are derived in Proposition 1. Our goal is to reduce the absolute error by finding the value α that minimizes these bounds.
Plugging b = αA^m x − x into the explicit formulas for UB2 and UB3, we can easily check that the two upper bounds in question attain their minimal values if and only if α minimizes a certain function of α, where k = m corresponds to UB2 and k = 0 corresponds to UB3. By differentiating this expression with respect to α, we find that the upper bounds UB2 and UB3 are minimized at α̂, the root of the corresponding equation, where, as before, the values k = m and k = 0 correspond to UB2 and UB3, respectively. With this value α̂, we obtain the estimate of x^T A^{-m} x as α̂‖x‖². For the sake of brevity, we adopt the notation est_min1 for k = 0 and est_min2 for k = m.

The Heuristic Approach
Let us consider the quantity R_m(x) = ‖x‖² ‖A^m x‖² / (x, A^m x)². We refer to R_m(x) as the generalized index of proximity.
Lemma 1. Assume that A ∈ R^{n×n} is a symmetric matrix. For any nonzero vector x ∈ R^n, the value R_m(x) satisfies R_m(x) ≥ 1. The equality R_m(x) = 1 holds if and only if x is an eigenvector of A.
Proof. By the Cauchy-Schwarz inequality, we have (x, A^m x)² ≤ ‖x‖² ‖A^m x‖²; hence, R_m(x) ≥ 1. The equality R_m(x) = 1 is equivalent to equality in the Cauchy-Schwarz inequality, which occurs if and only if the vector A^m x is a scalar multiple of the vector x, in other words, when A^m x = αx for a certain α ∈ R. Given that A is symmetric, this is further equivalent to Ax = λx, with λ satisfying λ^m = α.
As a result of Lemma 1, the corresponding equality, where n_1, n_2 ∈ Z, holds identically for any eigenvector of A (i.e., for any vector satisfying R_m(x) = 1), and it becomes approximately true for vectors x with the property R_m(x) ≈ 1.
We refer to this estimate as est_h. If, in particular, n_1 = 1 and n_2 = 0, we denote the estimate by est_h1, and if n_1 = n_2 = 1, the corresponding estimate is denoted by est_h2. The computational implementation requires ⌈3m/2⌉ mvps.
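A sketch under our own simplifying assumption: since the exact parameterization of est_h by n_1 and n_2 is given in the text above, the code below implements only the generalized index of proximity together with one illustrative eigenvector-exact estimate, ‖x‖⁴/(x, A^m x) (a hypothetical simplest member of the family, not necessarily est_h1 itself); by Lemma 1, it is exact whenever R_m(x) = 1.

```python
import numpy as np

def proximity_index(A, x, m):
    """Generalized index of proximity R_m(x) = ||x||^2 ||A^m x||^2 / (x, A^m x)^2;
    R_m(x) >= 1, with equality iff x is an eigenvector of A (Lemma 1)."""
    Amx = np.linalg.matrix_power(A, m) @ x
    return (x @ x) * (Amx @ Amx) / (x @ Amx) ** 2

def est_heuristic(A, x, m):
    """Illustrative eigenvector-exact estimate ||x||^4 / (x, A^m x)
    (our hypothetical simplest member of the est_h family);
    it is accurate when R_m(x) is close to 1."""
    Amx = np.linalg.matrix_power(A, m) @ x
    return (x @ x) ** 2 / (x @ Amx)
```

Checking R_m(x) before using such an estimate gives a cheap indicator of how close x is to an eigenvector and, hence, of how reliable the heuristic is.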

A Comparison with Other Methods
In this section, we briefly describe two methods proposed in the literature for estimating quadratic forms of the type x^T f(A) x, where A ∈ R^{n×n}, x ∈ R^n, and f is a smooth function defined on the spectrum of A. The first method is an extrapolation procedure developed in [8], and the second is based on Gaussian quadrature [5] (Chapter 7) and [6].

The Extrapolation Method
We adjust the family of estimates for x^T f(A) x given in [8] (Proposition 2) by setting f(t) = t^{-m}, m ∈ N. Hence, we directly obtain the estimation formula given in the following lemma.

Lemma 2.
Let A ∈ R^{n×n} be a symmetric matrix. An extrapolation estimate for the quadratic form x^T A^{-m} x, m ∈ N, is given in terms of a family parameter ν. We refer to this estimate as est_extrap(ν). The computational implementation requires just one mvp.

Remark 2.
In the special case of m = 1, some of the proposed estimates coincide with the corresponding extrapolation estimates for specific choices of the family parameter ν. Notably, the extrapolation procedure provides estimates for the quadratic form x^T A^{-m} x but not bounds: the choice of the family parameter ν is arbitrary, and no bounds for the absolute error of the estimates are available.

Gaussian Techniques
We consider the spectral factorization of A, which allows us to express the matrix as A = ∑_{k=1}^{n} λ_k v_k v_k^T, where λ_k ∈ R are the eigenvalues of A with corresponding eigenvectors v_k. Therefore, the quadratic form x^T A^{-m} x can be written as the summation (9), which can be considered a Riemann-Stieltjes integral whose measure µ(λ) is a piecewise constant function. This Riemann-Stieltjes integral can be approximated using Gauss quadrature rules [5,6]. Hence, it is necessary to produce a sequence of orthogonal polynomials, which can be achieved by the Lanczos algorithm. The operation count for this procedure is dominated by the Lanczos algorithm, which requires k matrix-vector products (a cost of order kn² operations for a dense matrix), where k is the number of Lanczos iterations. As the number of iterations increases, the estimates become more accurate, but the complexity and the execution time increase as well.
We refer to this estimate as est_Gauss.
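This procedure can be sketched with the standard Golub-Meurant recipe: run k Lanczos iterations started at x/‖x‖, collect the tridiagonal matrix T_k, and use ‖x‖² (T_k^{-m})_{1,1} as the Gauss-rule estimate. The following is a textbook version with full reorthogonalization, not the authors' MATLAB implementation.

```python
import numpy as np

def gauss_quad_estimate(A, x, m, k):
    """Gauss-quadrature estimate of x^T A^{-m} x (Golub-Meurant style):
    k Lanczos steps started at x/||x|| give a tridiagonal T_k, and the
    estimate is ||x||^2 * (T_k^{-m})_{1,1}."""
    n = x.size
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(max(k - 1, 0))
    q = x / np.linalg.norm(x)
    q_prev = np.zeros(n)
    b_prev = 0.0
    for j in range(k):
        Q[:, j] = q
        w = A @ q - b_prev * q_prev
        alpha[j] = q @ w
        w = w - alpha[j] * q
        # full reorthogonalization for numerical stability
        w = w - Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            q_prev, b_prev, q = q, beta[j], w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return (x @ x) * np.linalg.matrix_power(np.linalg.inv(T), m)[0, 0]
```

For k equal to the dimension n, the Lanczos process is (in exact arithmetic) a full tridiagonalization and the estimate coincides with the exact quadratic form; in practice, a small k already gives good accuracy at the price of k mvps.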

Application in Estimating x T (AA T + λI n ) −m x
In several applications, the matrix that appears has the form B = AA^T + λI_n, λ > 0, which is symmetric positive definite. For instance, this type of matrix appears when specifying the regularization parameter in Tikhonov regularization. In this case, the estimation of quadratic forms of the type x^T B^{-m} x is required. The estimates derived in the previous sections involve positive powers of B, i.e., B^k, k ∈ N. However, since the direct computation of the matrix powers B^k is not stable for every λ, our next goal is to develop an alternative approach to their evaluation. As we show below, the explicit computation of B^k can be obviated.
Since the matrices AA^T and I_n commute, the binomial theorem applies; hence, B^m x = ∑_{j=0}^{m} C(m, j) λ^j (AA^T)^{m−j} x, where C(m, j) denotes the binomial coefficient. This representation of the vector B^m x allows us to avoid the computation of the powers of the matrix B = AA^T + λI_n that appear in the estimates of the quadratic form x^T B^{-m} x. The expressions of the type (AA^T)^{m−j} x can be evaluated successively as A^T x, AA^T x, A^T AA^T x, AA^T AA^T x, . . .
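The successive-products scheme above can be sketched as follows; math.comb supplies the binomial coefficients, and the helper name Bm_times_x is ours.

```python
import numpy as np
from math import comb

def Bm_times_x(A, lam, m, x):
    """Compute B^m x for B = A A^T + lam I_n via the binomial theorem,
    B^m x = sum_{j=0}^m C(m, j) lam^j (A A^T)^{m-j} x,
    without ever forming B or its powers explicitly."""
    p = x.copy()
    powers = [p]                      # (A A^T)^i x for i = 0, ..., m
    for _ in range(m):
        p = A @ (A.T @ p)             # successive products A^T x, A A^T x, ...
        powers.append(p)
    out = np.zeros_like(x, dtype=float)
    for j in range(m + 1):
        out += comb(m, j) * lam ** j * powers[m - j]
    return out
```

Because only products with A and A^T are needed, this also works when A is rectangular, as in the Tikhonov setting A ∈ R^{n×d}.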

Numerical Examples
Here, we present several numerical examples that illustrate the performance of the derived estimates. All computations were performed using MATLAB (R2018a). Throughout the numerical examples, we denote by e_i the ith column of the identity matrix of appropriate order and by 1_n the vector of order n with all elements equal to one.

Example 1. Upper bounds for the absolute error.
In this example, we consider the symmetric positive definite matrix A = B^T B ∈ R^{1000×1000}, where B is the Parter matrix selected from the MATLAB gallery. The condition number of A is κ = 17.8983. We choose the vector x ∈ R^{1000} as the 100th column of the identity matrix, i.e., x = e_100. We estimate the quadratic form x^T A^{-2} x, whose exact value is 0.0127. In Table 1, we present the estimates generated by the proposed approaches and the upper bounds for the corresponding absolute error, which are given in Proposition 1.

Example 2. Estimates for a Kac-Murdock-Szegö matrix.

We consider the Kac-Murdock-Szegö (KMS) matrix A ∈ R^{1000×1000}, which is symmetric positive definite and Toeplitz. The elements of this matrix are A_ij = r^{|i−j|}, i, j = 1, 2, . . . , 1000, 0 < r < 1. We tested this matrix for r = 0.2; the condition number of A is κ = 2.25. We estimated both the quadratic forms x^T A^{-2} x = 1.2072 and x^T A^{-3} x = 296.8727. The chosen vectors were x = e_1000 + (1/4)e_120 ∈ R^{1000} and x = 1_n. The results are provided in Tables 2 and 3. As shown, the derived estimates are satisfactory in both cases.

Example 3. Estimation of the whole diagonal of the covariance matrices.
In this example, we consider the covariance matrices of order n whose elements A_ij are given in terms of parameters α, β ∈ R with β ≥ 1 [9]. We estimated the whole diagonal of the inverse of the covariance matrices through the derived estimates presented in this work. Moreover, we used the two approaches presented in Section 6, which were used in previous studies. We applied the Gauss quadrature using k = 3 Lanczos iterations. We chose the pair of values (α, β) = (3, 1). We validated the quality of the generated estimates by computing the mean relative error (MRE), where est(i) is the corresponding estimate of the ith diagonal element of A^{-1}. The results are recorded in Table 4. Specifically, we analyzed the performance of the proposed estimates in terms of the MRE and the execution time (in seconds). Table 4. Mean relative errors and execution times for estimating the diagonal of the covariance matrices of order n with (α, β) = (3, 1).

Example 4. Estimation of the diagonal of the resolvent matrix.

In this example, we tested the behavior of the proposed estimates in network analysis. Specifically, we estimated the whole diagonal of the resolvent matrix (I_n − aA)^{-1}, where A is the adjacency matrix of the network. We chose the parameter a = 0.85/λ_max. We considered three adjacency matrices of order n = 4000, which were selected from the CONTEST toolbox [10]. In Table 5, we provide the mean relative error for estimating the whole diagonal of the resolvent matrix; the execution time in seconds is given in brackets.

Example 5. Solution of ill-posed problems via the GCV method.
Let us consider the least-squares problem min_{x∈R^d} ‖Ax − b‖², where A ∈ R^{n×d} and b ∈ R^n. In ill-posed problems, the solution of the above minimization problem is not satisfactory, and it is necessary to replace this problem with the penalized least-squares problem
min_{x∈R^d} ‖Ax − b‖² + λ‖x‖², (10)
where λ > 0 is the regularization parameter. This is the popular Tikhonov regularization. The solution of (10) is x_λ = (A^T A + λI_d)^{-1} A^T b. A major issue is the specification of the regularization parameter λ, which can be achieved by minimizing the GCV function. Following the expression of the GCV function V(λ) in terms of quadratic forms presented in [11], we write V(λ) using quadratic forms involving the matrix B = AA^T + λI_n ∈ R^{n×n}.
In this example, we considered three test problems of order n selected from the Regularization Tools package [12]: the Shaw, Tomo, and Baart problems. Each of these test problems generates a matrix A and a solution x. We computed the error-free vector b such that b = Ax. The perturbed data vector b_per ∈ R^n was computed by the formula b_per = b + (σ‖b‖/√n) e, where σ is a given noise level and e ∈ R^n is Gaussian noise with mean zero and variance one. We estimated the GCV function using the estimate est_h1 without computing the matrix B; instead, we used the relations for B^m x given in Section 7. We found the minimum of the corresponding estimate over a grid of values for λ and computed the solution x_λ. Concerning the grid of λ, we considered 100 equally spaced values in log scale in the interval [10^{-12}, 10].
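For comparison purposes, the exact GCV function can be evaluated cheaply through one SVD of A. The sketch below assumes the common normalization V(λ) = n‖b − Ax_λ‖² / trace(I_n − AA_λ^+)²; it is a standard reference computation, not the paper's est_h1-based estimator.

```python
import numpy as np

def tikhonov_solution(A, b, lam):
    """x_lam = (A^T A + lam I_d)^{-1} A^T b."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)

def gcv_curve(A, b, lambdas):
    """Exact GCV values V(lam) = n ||b - A x_lam||^2 / trace(I_n - A A_lam^+)^2,
    evaluated via one SVD of A (filter-factor form)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    n = A.shape[0]
    bt = U.T @ b
    vals = []
    for lam in lambdas:
        f = lam / (s ** 2 + lam)                      # residual filter factors
        res2 = np.sum((f * bt[:s.size]) ** 2) + np.sum(bt[s.size:] ** 2)
        tr = np.sum(f) + (n - s.size)                 # trace of I_n - A A_lam^+
        vals.append(n * res2 / tr ** 2)
    return np.array(vals)
```

Minimizing gcv_curve over a logarithmic grid of λ values and then calling tikhonov_solution at the minimizer mirrors the grid-search procedure described in this example.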
In Figures 1-3, we plot the exact solution x of the problem and the estimated solution x_λ generated by Tikhonov regularization via the GCV function. Specifically, for each test problem, we depict two graphs. The left-hand-side graph corresponds to the determination of the regularization parameter via the GCV function estimated using est_h1, and the right-hand-side graph concerns the exact computation of the GCV function. In Table 6, we list the characteristics of Figures 1-3. In particular, we provide the order n, the noise level σ, and the error norm of the derived solution x_λ for each test problem.

Conclusions
In this work, we proposed three different approaches for estimating quadratic forms of the type x^T A^{-m} x, m ∈ N. Specifically, we considered a projection method, a minimization approach, and a heuristic procedure. We also derived upper bounds on the absolute error of the estimates, which allow us to assess the precision of the results obtained by the aforementioned methods.
The proposed approaches provide efficient and fast estimates, as illustrated by the numerical examples. Comparing the proposed estimates with the corresponding ones presented in the literature, we draw the following conclusions.

• The projection method improves on the extrapolation procedure by additionally providing bounds on the absolute error.
• Although the estimates based on Gauss quadrature are accurate, they require more time and more mvps than the proposed approaches as the number of Lanczos iterations increases. The methods presented in this paper are thus convenient, especially in situations where a fast estimate of moderate accuracy is sought.