Abstract
In this paper, we study estimates for quadratic forms of the type $x^T A^{-m} x$, $m \in \mathbb{N}$, for symmetric matrices. We derive a general approach for estimating this type of quadratic form and present upper bounds for the corresponding absolute error. Specifically, we consider three different approaches for estimating the quadratic form $x^T A^{-m} x$. The first approach is based on a projection method, the second is a minimization procedure, and the last is heuristic. Numerical examples showing the effectiveness of the estimates are presented. Furthermore, we compare the behavior of the proposed estimates with other methods derived in the literature.
1. Introduction
Let $A \in \mathbb{R}^{n \times n}$ be a given symmetric positive definite matrix and $x \in \mathbb{R}^n$. We are interested in estimating quadratic forms of the type $x^T A^{-m} x$, $m \in \mathbb{N}$. Our main goal was to find an efficient and cheap approximate evaluation of the desired quadratic form without the direct computation of the matrix $A^{-m}$. To this end, we revisited the approach for estimating the quadratic form $x^T A^{-1} x$ developed in [1] and extended it to the case of an arbitrary negative integer power of A.
The computation of quadratic forms is a mathematical problem with many applications. Indicatively, we refer to some usual applications.
- Statistics: The inverse of the covariance matrix, which is referred to as a precision matrix, frequently appears in statistics. The covariance matrix reveals marginal correlations between the variables, whereas the precision matrix represents the conditional correlations between two variables given the remaining ones [2]. The diagonal of the inverse of covariance matrices provides information about the quality of data in uncertainty quantification [3].
- Network analysis: Determining the importance of the nodes of a graph is a major issue in network analysis. This information can be extracted by evaluating the diagonal elements of the matrix $(I - \alpha A)^{-1}$, where A is the adjacency matrix of the network, $0 < \alpha < 1/\rho(A)$, and $\rho(A)$ is the spectral radius of A. This matrix is referred to as a resolvent matrix; see, for example, [4] and the references therein.
- Numerical analysis: Quadratic forms arise naturally in the computation of the regularization parameter in Tikhonov regularization for solving ill-posed problems. In this case, the matrix has the form $A A^T + \mu I$, $\mu > 0$. In the literature, many methods have been proposed for the selection of the regularization parameter $\mu$, such as the discrepancy principle, cross-validation, generalized cross-validation (GCV), the L-curve, and so forth; see, for example, [5] (Chapter 15) and the references therein. These methods involve quadratic forms of the type considered here, with small values of m.
In practice, the exact computation of a quadratic form is often replaced by an estimate that is faster to evaluate. Given its numerous applications, the estimation of quadratic forms is an important practical problem that has been frequently studied in the literature. Let us indicatively refer to some well-known methods. A widely used method is based on Gaussian quadrature [5] (Chapter 7) and [6]. Moreover, extrapolation procedures have been proposed. Specifically, in [7], families of estimates for the bilinear form $x^T A^{-1} y$ for any invertible matrix were derived, and in [8], families of estimates for the bilinear form $y^* f(A) x$ for a Hermitian matrix were developed.
In the present work, we consider alternative approaches to this problem. To begin, notice that the value of the quadratic form $x^T A^{-m} x$ is proportional to the square of the norm of x. Therefore, the task of estimating $x^T A^{-m} x$ consists of two steps:
- Finding an $\alpha \in \mathbb{R}$ such that $x^T A^{-m} x \approx \alpha$ (1)
- Assessing the absolute error of the above estimate, i.e., determining a bound for the quantity $|x^T A^{-m} x - \alpha|$ (2)
In Section 2, we present upper bounds for the absolute error (2) for any given estimate $\alpha$. Section 3 is devoted to estimating the value $\alpha$ in (1) using a projection method. In Section 4, we use the bounds from Section 2 as a stepping stone for estimating $\alpha$ via a minimization method. A heuristic approach is outlined in Section 5. In Section 6, we briefly describe two methods used in previous studies, namely, an extrapolation approach and one based on Gaussian quadrature. Section 7 focuses on adapting the proposed estimates to matrices of the form $A A^T + \mu I$. Numerical examples illustrating the performance of the derived estimates are found in Section 8. We end this work with several concluding remarks in Section 9.
2. Bounds on the Error
In Proposition 1 below, we derive an upper bound on the error (2) for a given estimate $\alpha$ of the quadratic form $x^T A^{-m} x$. The first three expressions for the bounds (UB1–UB3) are a direct generalization of a result from [1].
Proposition 1.
Let $A \in \mathbb{R}^{n \times n}$ be a symmetric positive definite matrix, $x \in \mathbb{R}^n$, and let $\alpha \in \mathbb{R}$ be an estimate of the quadratic form $x^T A^{-m} x$. The absolute error of the estimate is bounded from above by the following expressions:
- UB1.
- UB2.
- UB3.
- UB4.
- UB5.
- For estimates satisfying , we also have the family of error bounds , where  can be chosen as any integer such that .
Proof.
- UB1.
The matrix is symmetric because A is symmetric, and it holds that
by the Cauchy–Schwarz inequality.
Moreover, we have
Using the Kantorovich inequality for the matrix and considering that , , we have
where $\kappa = \lambda_{\max}(A)/\lambda_{\min}(A)$ is the condition number of A. Therefore, the norm given by (3) can be bounded by
Hence, we have
- UB2.
Due to the Cauchy–Schwarz inequality, it holds that
Following a similar approach as above based on the Kantorovich inequality, we obtain
So,
- UB3.
It holds that
Applying the Kantorovich inequality to the matrix in a similar way as above, we can immediately obtain the inequality
So, we have
- UB4.
Applying the Cauchy–Schwarz inequality, we obtain
- UB5.
Since A is positive definite, as is $A^q$ for any integer q, the angle between the vectors v and $A^q v$ does not exceed $\pi/2$ for any v, i.e., $v^T A^q v \geq 0$.
Taking and , we obtain
The assumption implies that
Hence, we obtain
At the same time, the assumption implies
so, . To summarize,
Consequently,
So, we have
The norm can be bounded using the Kantorovich inequality, as shown in Relation (4). Regarding the factor , we have
Therefore, the relation (5) can be reformulated as
□
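The Kantorovich inequality, used repeatedly in the proof above, can be checked numerically. The following sketch (in Python rather than the paper's MATLAB, on a randomly generated SPD matrix used purely as an illustration) verifies the classical form $(x^T A x)(x^T A^{-1} x) \le \frac{(\kappa+1)^2}{4\kappa}\,\|x\|^4$, where $\kappa$ is the condition number of A.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric positive definite matrix with known eigenvalues.
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = rng.uniform(1.0, 100.0, n)
A = Q @ np.diag(eigs) @ Q.T

kappa = eigs.max() / eigs.min()           # condition number of A
bound_factor = (kappa + 1) ** 2 / (4 * kappa)

# Check (x^T A x)(x^T A^{-1} x) <= ((kappa+1)^2 / (4 kappa)) ||x||^4
ok = True
for _ in range(100):
    x = rng.standard_normal(n)
    lhs = (x @ A @ x) * (x @ np.linalg.solve(A, x))
    rhs = bound_factor * np.linalg.norm(x) ** 4
    ok = ok and lhs <= rhs * (1 + 1e-12)  # small slack for rounding
print(ok)
```

The factor $(\kappa+1)^2/(4\kappa)$ equals $(\lambda_{\max}+\lambda_{\min})^2/(4\lambda_{\max}\lambda_{\min})$, the constant appearing in the classical statement of the inequality.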
3. Estimate of $x^T A^{-m} x$ by the Projection Method
Our goal is to find a number $\alpha$ such that $\alpha \approx x^T A^{-m} x$ (cf. (1)). To that end, let us take a fixed $k \in \mathbb{N}$ and consider the following decomposition of x,
where $b \perp A^k x$. (That is, the first term is the projection of x onto $\operatorname{span}\{A^k x\}$ along the orthogonal complement of $A^k x$.) Then, we have
Using the assumption , we obtain
and so
Hence, we obtain a family of estimates for $x^T A^{-m} x$ as follows:
We denote these estimates by , . The computational implementation requires matrix-vector products (mvps).
Let us now explore the error corresponding to the above choice of . We have
therefore,
Since is the estimate (see (1)), the error term is provided as . Bounds on its absolute value can be found using Proposition 1 with
Remark 1.
Let us comment on the choice of the parameter k.
- Observe that upper bounds UB1 and UB4 from Proposition 1 are minimal for . In this case, we have ; thus, b has the smallest possible norm. Therefore, from the point of view of minimizing the upper bound on the error (more precisely, minimizing upper bounds UB1 and UB4), a convenient choice is .
- However, if the goal is fast estimation, we can take for even m and for odd m, as these two choices provide and , respectively, which are both easy to evaluate.
In general, for any choice of k, the error of the estimate can be assessed using Proposition 1.
4. Estimate of $x^T A^{-m} x$ Using the Minimization Method
The estimates presented in this section stem from the upper bounds UB2 and UB3 for the absolute error derived in Proposition 1. Our goal is to reduce the absolute error by finding the value of $\alpha$ that minimizes these bounds.
Plugging in the explicit formulas for UB2 and UB3, we can easily check that the two upper bounds in question attain their minimal values if and only if $\alpha$ minimizes the function
where the two choices of the parameter correspond to UB2 and UB3, respectively. By differentiating this expression with respect to $\alpha$, we find that the upper bounds UB2 and UB3 are minimized when $\alpha$ is the root of the equation
where, as before, the two choices of the parameter correspond to UB2 and UB3, respectively. With this minimizing value of $\alpha$, we obtain the estimate of $x^T A^{-m} x$ as
For the sake of brevity, we adopt the notation for and for . The computational implementation requires mvps.
5. The Heuristic Approach
Let us consider the quantity
We refer to as the generalized index of proximity.
Lemma 1.
Assume that is a symmetric matrix. For any nonzero vector , the value satisfies . The equality holds true if and only if x is an eigenvector of A.
Proof.
By the Cauchy–Schwarz inequality, we have ; hence, . The equality is equivalent to the equality in the Cauchy–Schwarz inequality, which occurs if and only if the vector is a scalar multiple of the vector x, in other words, when for a certain . This is further equivalent to (with satisfying ) given the assumption that A is symmetric. □
As a result of Lemma 1, the equality
where , , is identically true for any eigenvector of A (i.e., for any vector satisfying ), and becomes approximately true for vectors x with the property .
Therefore, if , we have
We refer to this estimate as . If, in particular, and , we denote the estimate by , and if , the corresponding estimate is denoted by . The computational implementation requires mvps.
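The reasoning behind the heuristic can be illustrated with the eigenvector identity: if x were an exact eigenvector, $Ax = \lambda x$ with $\lambda = x^T A x / x^T x$, then $x^T A^{-m} x = \lambda^{-m} \|x\|^2 = (x^T x)^{m+1}/(x^T A x)^m$ exactly. The Python sketch below checks this identity numerically; it is an illustration of the eigenvector case underlying Lemma 1, not necessarily the authors' exact estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 40, 2

# SPD matrix with known orthonormal eigenvectors (columns of Q).
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = rng.uniform(1.0, 10.0, n)
A = Q @ np.diag(eigs) @ Q.T

def heuristic(A, x, m):
    # Treat x as an approximate eigenvector with Rayleigh quotient lambda:
    # x^T A^{-m} x ~ lambda^{-m} ||x||^2 = (x^T x)^(m+1) / (x^T A x)^m
    return (x @ x) ** (m + 1) / (x @ A @ x) ** m

def exact(A, x, m):
    # Exact x^T A^{-m} x via m linear solves (no explicit inverse).
    v = x.copy()
    for _ in range(m):
        v = np.linalg.solve(A, v)
    return x @ v

x = Q[:, 0]   # an exact eigenvector of A: the two values agree
print(abs(heuristic(A, x, m) - exact(A, x, m)) < 1e-8)
```

For a vector x that is only close to an eigenvector, the same formula serves as a cheap estimate whose quality degrades as x moves away from the eigenspace.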
6. A Comparison with Other Methods
In this section, we briefly describe two methods proposed in the literature for estimating quadratic forms of the type $x^T f(A) x$, where $A \in \mathbb{R}^{n \times n}$, $x \in \mathbb{R}^n$, and f is a smooth function defined on the spectrum of A. The first method is an extrapolation procedure developed in [8], and the second is based on Gaussian quadrature [5] (Chapter 7) and [6].
6.1. The Extrapolation Method
We adapt the family of estimates for $y^* f(A) x$ given in [8] (Proposition 2) by setting $y = x$ and $f(t) = t^{-m}$. Hence, we directly obtain the estimating formula given in the following lemma.
Lemma 2.
Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix. An extrapolation estimate for the quadratic form $x^T A^{-m} x$, $m \in \mathbb{N}$, is given by
We refer to this estimation as . The computational implementation requires just one mvp.
Remark 2.
In the special case of , some of the proposed estimates coincide with the corresponding extrapolation estimates for specific choices of the family parameter ν. We have
- For , .
- For , .
- For , .
Notably, the extrapolation procedure provides estimates for the quadratic form, not bounds. The choice of the family parameter ν is arbitrary, and no bounds for the absolute error of the estimates are provided.
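A well-known extrapolation estimate of this kind for $m = 1$ is the moment formula $x^T A^{-1} x \approx (x^T x)^2/(x^T A x)$, which requires only the single product $Ax$, consistent with the one-mvp cost noted above. The sketch below illustrates this formula; treating it as representative of the family is our assumption, not a statement from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60

# SPD test matrix with a mild condition number (shifted Wishart form).
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
x = rng.standard_normal(n)

# Moments c0 = x^T x and c1 = x^T A x: a single matrix-vector product.
c0 = x @ x
c1 = x @ (A @ x)
estimate = c0 ** 2 / c1   # one-mvp extrapolation-type estimate of x^T A^{-1} x

exact = x @ np.linalg.solve(A, x)
rel_err = abs(estimate - exact) / exact
print(rel_err < 0.5)      # rough estimate: correct order of magnitude
```

By the Cauchy–Schwarz inequality, $(x^T x)^2 \le (x^T A x)(x^T A^{-1} x)$, so this estimate always bounds the quadratic form from below; the Kantorovich inequality limits how far below it can fall.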
6.2. Gaussian Techniques
We consider the spectral factorization of A, which allows us to express the matrix as $A = \sum_{i=1}^{n} \lambda_i q_i q_i^T$, where $\lambda_i$ are the eigenvalues of A with corresponding orthonormal eigenvectors $q_i$. Therefore, the quadratic form can be written as
$x^T A^{-m} x = \sum_{i=1}^{n} \lambda_i^{-m} (q_i^T x)^2. \quad (9)$
The summation (9) can be considered a Riemann–Stieltjes integral of the form
where the measure is a piecewise constant function defined by
This Riemann–Stieltjes integral can be approximated using Gauss quadrature rules [5,6]. Hence, it is necessary to produce a sequence of orthogonal polynomials, which can be achieved by the Lanczos algorithm. The operation count for this procedure is dominated by the Lanczos algorithm, which requires k matrix-vector products, where k is the number of Lanczos iterations. As the number of iterations increases, the estimates become more accurate, but the complexity and execution time increase as well.
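The standard Lanczos-based Gauss rule can be sketched as follows: k Lanczos steps with starting vector $x/\|x\|$ produce a tridiagonal Jacobi matrix $T_k$, and the Gauss estimate is $x^T f(A) x \approx \|x\|^2\, e_1^T f(T_k) e_1$, here with $f(t) = t^{-m}$. This Python sketch is a textbook version of the technique, not the authors' exact implementation.

```python
import numpy as np

def lanczos(A, x, k):
    """k steps of the Lanczos process; returns the k x k Jacobi matrix T_k."""
    n = len(x)
    alphas, betas = [], []
    q_prev = np.zeros(n)
    q = x / np.linalg.norm(x)
    beta = 0.0
    for _ in range(k):
        w = A @ q - beta * q_prev
        alpha = q @ w
        w = w - alpha * q
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        betas.append(beta)
        q_prev, q = q, w / beta
    return np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)

def gauss_estimate(A, x, m, k):
    # Gauss rule: x^T A^{-m} x ~ ||x||^2 * e_1^T T_k^{-m} e_1
    T = lanczos(A, x, k)
    e1 = np.zeros(k); e1[0] = 1.0
    v = e1
    for _ in range(m):
        v = np.linalg.solve(T, v)
    return (x @ x) * (e1 @ v)

rng = np.random.default_rng(3)
n = 80
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)     # SPD test matrix
x = rng.standard_normal(n)

exact = x @ np.linalg.solve(A, np.linalg.solve(A, x))   # x^T A^{-2} x
approx = gauss_estimate(A, x, m=2, k=10)
print(abs(approx - exact) / exact < 1e-2)
```

Each additional Lanczos step costs one mvp, which is why the total cost grows with the requested accuracy.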
We refer to this estimate as .
7. Application in Estimating Quadratic Forms with the Matrix $A A^T + \mu I$
In several applications, the matrix under consideration has the form $B = A A^T + \mu I$, $\mu > 0$, which is symmetric positive definite. For instance, this type of matrix appears in specifying the regularization parameter in Tikhonov regularization. In this case, the estimation of quadratic forms of the type $x^T B^{-m} x$ is required. The estimates derived in the previous sections involve positive powers of B, i.e., $B^j x$, $j \in \mathbb{N}$. However, since the direct computation of the matrix powers $B^j$ is not stable in general, our next goal was to develop an alternative approach to their evaluation. As we show below, the explicit computation of $B^j$ can be avoided.
Since the matrices $A A^T$ and $\mu I$ commute, the binomial theorem applies,
$B^j = (A A^T + \mu I)^j = \sum_{i=0}^{j} \binom{j}{i} \mu^{j-i} (A A^T)^i,$
and hence
$B^j x = \sum_{i=0}^{j} \binom{j}{i} \mu^{j-i} (A A^T)^i x.$
The above representation of the vector $B^j x$ effectively allows us to avoid the computation of the powers of the matrix B that appear in the estimates of the quadratic form $x^T B^{-m} x$. The expressions of the type $(A A^T)^i x$ can be evaluated successively as $v_0 = x$, $v_i = A (A^T v_{i-1})$, $i = 1, \dots, j$.
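The successive evaluation can be sketched in code: accumulate $v_i = (A A^T)^i x$ by repeated products with A and $A^T$, and combine the terms with binomial coefficients, so that neither B nor its powers are ever formed. This Python sketch assumes, as in the Tikhonov context above, $B = A A^T + \mu I$ with illustrative values of $\mu$ and j.

```python
import numpy as np
from math import comb

def Bj_times_x(A, mu, j, x):
    """Compute (A A^T + mu I)^j x via the binomial expansion,
    using only matrix-vector products with A and A^T."""
    v = x.copy()                          # v = (A A^T)^0 x
    result = comb(j, 0) * mu ** j * v
    for i in range(1, j + 1):
        v = A @ (A.T @ v)                 # v = (A A^T)^i x
        result = result + comb(j, i) * mu ** (j - i) * v
    return result

rng = np.random.default_rng(4)
A = rng.standard_normal((30, 20))
x = rng.standard_normal(30)
mu, j = 0.1, 3                            # illustrative values

# Reference: form B explicitly and take its j-th power.
B = A @ A.T + mu * np.eye(30)
direct = np.linalg.matrix_power(B, j) @ x
print(np.allclose(Bj_times_x(A, mu, j, x), direct))
```

The expansion costs 2j mvps in total, since each $v_i$ reuses $v_{i-1}$.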
8. Numerical Examples
Here, we present several numerical examples that illustrate the performance of the derived estimates. All computations were performed in MATLAB (R2018a). Throughout the numerical examples, we denote by $e_i$ the ith column of the identity matrix of appropriate order and by $\mathbf{1}_n$ the vector of length n with all elements equal to one.
Example 1.
Upper bounds for the absolute error.
In this example, we consider the symmetric positive definite matrix , where B is the Parter matrix selected from the MATLAB gallery. The condition number of the matrix A is . We choose the vector x as the 100th column of the identity matrix, i.e., $x = e_{100}$. We estimate the quadratic form , whose exact value is . In Table 1, we present the estimates generated by the proposed approaches and the upper bounds for the corresponding absolute error, which are given in Proposition 1.
Table 1.
Estimating , where , , .
Example 2.
Estimation of quadratic forms.
We consider the Kac–Murdock–Szegö (KMS) matrix $A \in \mathbb{R}^{n \times n}$, which is symmetric positive definite and Toeplitz, with elements $a_{ij} = \rho^{|i-j|}$, $0 < \rho < 1$. We tested this matrix for ; the condition number of A is . We estimated both the quadratic forms and . The chosen vectors were and . The results are provided in Table 2 and Table 3. As shown, the derived estimates are satisfactory in both cases.
Table 2.
Estimating , where , .
Table 3.
Estimating , where , .
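For examples of this kind, a reference value of the quadratic form can be computed exactly by repeated linear solves rather than by forming $A^{-m}$. The sketch below builds a KMS matrix (elements $\rho^{|i-j|}$; the values $\rho = 0.5$ and $n = 100$ here are illustrative, not the paper's test parameters) and evaluates $x^T A^{-2} x$.

```python
import numpy as np

def kms(n, rho):
    """Kac-Murdock-Szego matrix: a_ij = rho^|i-j| (SPD for 0 < rho < 1)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def quadratic_form(A, x, m):
    """Exact x^T A^{-m} x via m linear solves (no explicit inverse)."""
    v = x.copy()
    for _ in range(m):
        v = np.linalg.solve(A, v)
    return x @ v

n, rho = 100, 0.5            # illustrative values
A = kms(n, rho)
x = np.ones(n)               # the all-ones vector
val = quadratic_form(A, x, m=2)
print(val > 0)               # SPD matrix: the quadratic form is positive
```

For large n, the solves themselves become the dominant cost, which is precisely the situation where the cheap estimates of Sections 3–5 are attractive.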
Example 3.
Estimation of the whole diagonal of the covariance matrices.
In this example, we consider the covariance matrices of order n, whose elements are given by
where and [9]. We estimated the whole diagonal of the inverse of the covariance matrices using the estimates derived in this work. Moreover, we used the two approaches presented in Section 6, which were used in previous studies. We applied the Gauss quadrature using Lanczos iterations. We chose the pair of values for the parameters . We validated the quality of the generated estimates by computing the mean relative error (MRE) given by
where is the corresponding estimate for the diagonal element . The results are recorded in Table 4. Specifically, we analyzed the performance of the proposed estimates in terms of the MRE and the execution time (in seconds).
Table 4.
Mean relative errors and execution times for estimating the diagonal of the covariance matrices of order n with .
Example 4.
Network analysis.
In this example, we tested the behavior of the proposed estimates in network analysis. Specifically, we estimated the whole diagonal of the resolvent matrix $(I - \alpha A)^{-1}$, where A is the adjacency matrix of the network. We chose the parameter . We considered three adjacency matrices of order , selected from the CONTEST toolbox [10]. In Table 5, we provide the mean relative error for estimating the whole diagonal of the resolvent matrix, with the execution time in seconds given in brackets.
Table 5.
Mean relative errors and execution times (seconds) for estimating the diagonal of the resolvent matrix.
Example 5.
Solution of ill-posed problems via the GCV method.
Let us consider the least-squares problem $\min_x \|Ax - b\|$, where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. In ill-posed problems, the solution of the above minimization problem is not satisfactory, and it is necessary to replace this problem with a penalized least-squares problem of the form
$\min_x \left\{ \|Ax - b\|^2 + \mu \|x\|^2 \right\}, \quad (10)$
where $\mu > 0$ is the regularization parameter. This is the popular Tikhonov regularization. The solution of (10) is $x_\mu = (A^T A + \mu I)^{-1} A^T b$. A major issue is the specification of the regularization parameter $\mu$. This can be achieved by minimizing the GCV function. Following the expression of the GCV function in terms of quadratic forms presented in [11], we write
where $B = A A^T + \mu I$.
In this example, we considered three test problems of order n, selected from the Regularization Tools package [12]: the Shaw, Tomo, and Baart problems. Each test problem generates a matrix A and a solution x. We computed the error-free vector b such that $b = Ax$. The perturbed data vector was computed by the formula , where is a given noise level and is a Gaussian noise vector with zero mean and unit variance. We estimated the GCV function using the estimate , without computing the matrix B; instead, we used the relations for $B^j x$ given in Section 7. We found the minimum of the corresponding estimate over a grid of values of $\mu$ and computed the solution $x_\mu$. For the grid of $\mu$, we considered 100 equally spaced values in log-scale in the interval .
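A direct (dense) version of this workflow can be sketched as follows: for each $\mu$ on a log-spaced grid, form the Tikhonov solution $x_\mu = (A^T A + \mu I)^{-1} A^T b$ and evaluate the standard GCV function $G(\mu) = \|A x_\mu - b\|^2 / \operatorname{trace}\!\big(I - A (A^T A + \mu I)^{-1} A^T\big)^2$ (constant factors do not change the minimizer); the paper's quadratic-form estimates replace these exact quantities with cheap approximations. The test matrix below is a generic synthetic ill-conditioned system, not the Shaw, Tomo, or Baart problems.

```python
import numpy as np

def gcv(A, b, mu):
    """Standard GCV function for Tikhonov regularization (dense evaluation)."""
    n = A.shape[1]
    C = A.T @ A + mu * np.eye(n)
    x_mu = np.linalg.solve(C, A.T @ b)
    residual = np.linalg.norm(A @ x_mu - b) ** 2
    # trace(I - A C^{-1} A^T), the denominator's influence-matrix term
    trace_term = A.shape[0] - np.trace(A @ np.linalg.solve(C, A.T))
    return residual / trace_term ** 2

rng = np.random.default_rng(5)
m_rows, n = 60, 40
# Synthetic ill-conditioned matrix with geometrically decaying singular values.
U, _ = np.linalg.qr(rng.standard_normal((m_rows, m_rows)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -8, n)
A = U[:, :n] @ np.diag(s) @ V.T
x_true = np.ones(n)
b = A @ x_true + 1e-4 * rng.standard_normal(m_rows)   # noisy data

# 100 log-spaced values of mu, as in the example above.
mus = np.logspace(-10, 0, 100)
best_mu = mus[np.argmin([gcv(A, b, mu) for mu in mus])]
print(best_mu)
```

In practice the trace term is exactly the expensive part: estimating it (and the residual quadratic form) via the techniques of the previous sections avoids the dense solves in this sketch.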
In Figure 1, Figure 2 and Figure 3, we plot the exact solution x of the problem and the estimated solution generated by Tikhonov regularization via the GCV function. Specifically, for each test problem, we depict two graphs: the left-hand-side graph corresponds to the determination of the regularization parameter via the estimated GCV function, and the right-hand-side graph concerns the exact computation of the GCV function. In Table 6, we list the characteristics of Figure 1, Figure 2 and Figure 3; in particular, we provide the order n, the noise level, and the error norm of the derived solution for each test problem.
Figure 1.
Solution of the Shaw test problem via an estimation of GCV (left) and the exact GCV (right).
Figure 2.
Solution of the Tomo test problem via an estimation of GCV (left) and the exact GCV (right).
Figure 3.
Solution of the Baart test problem via an estimation of GCV (left) and the exact GCV (right).
9. Conclusions
In this work, we proposed three different approaches for estimating quadratic forms of the type $x^T A^{-m} x$, $m \in \mathbb{N}$. Specifically, we considered a projection method, a minimization approach, and a heuristic procedure. We also derived upper bounds on the absolute error of the estimates, which allow us to assess the precision of the results obtained by the aforementioned methods.
The proposed approaches provide efficient and fast estimates. Their efficiency was illustrated by numerical examples. Comparing the proposed estimates with the corresponding ones presented in the literature, we formed the following conclusions.
- The projection method improves the results of the extrapolation procedure by providing bounds on the absolute error.
- Although the estimates based on Gauss quadrature are accurate, they require more time and more mvps than the proposed approaches as the number of Lanczos iterations increases. The methods presented in this paper are thus convenient, especially when a fast estimate of moderate accuracy is sought.
Author Contributions
Conceptualization, M.M., P.R., and O.T.; methodology, M.M., P.R., and O.T.; software, A.P. and P.R.; validation, M.M., P.R., and O.T.; formal analysis, A.P. and P.R.; investigation, M.M., A.P., P.R., and O.T.; data curation, A.P. and P.R.; writing—original draft preparation, M.M., P.R., and O.T.; writing—review and editing, M.M., P.R., and O.T.; visualization, A.P. and P.R.; supervision, M.M.; project administration, M.M. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Acknowledgments
We thank the reviewers for their valuable remarks. This paper is dedicated to Constantin M. Petridi.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Fika, P.; Mitrouli, M.; Turek, O. On the estimation of $x^T A^{-1} x$ for symmetric matrices. 2020; submitted.
- Fan, J.; Liao, Y.; Liu, H. An overview on the estimation of large covariance and precision matrices. Econom. J. 2016, 19, C1–C32.
- Tang, J.; Saad, Y. A probing method for computing the diagonal of a matrix inverse. Numer. Linear Algebra Appl. 2012, 19, 485–501.
- Benzi, M.; Klymko, C. Total communicability as a centrality measure. J. Complex Netw. 2013, 1, 124–149.
- Golub, G.H.; Meurant, G. Matrices, Moments and Quadrature with Applications; Princeton University Press: Princeton, NJ, USA, 2010.
- Bai, Z.; Fahey, M.; Golub, G. Some large-scale matrix computation problems. J. Comput. Appl. Math. 1996, 74, 71–89.
- Fika, P.; Mitrouli, M.; Roupa, P. Estimates for the bilinear form $x^T A^{-1} y$ with applications to linear algebra problems. Electron. Trans. Numer. Anal. 2014, 43, 70–89.
- Fika, P.; Mitrouli, M. Estimation of the bilinear form $y^* f(A) x$ for Hermitian matrices. Linear Algebra Appl. 2016, 502, 140–158.
- Bekas, C.; Curioni, A.; Fedulova, I. Low-cost data uncertainty quantification. Concurr. Comput. Pract. Exp. 2012, 24, 908–920.
- Taylor, A.; Higham, D.J. CONTEST: Toolbox Files and Documentation. Available online: http://www.mathstat.strath.ac.uk/research/groups/numerical_analysis/contest/toolbox (accessed on 15 April 2021).
- Reichel, L.; Rodriguez, G.; Seatzu, S. Error estimates for large-scale ill-posed problems. Numer. Algorithms 2009, 51, 341–361.
- Hansen, P.C. Regularization Tools Version 4.0 for MATLAB 7.3. Numer. Algorithms 2007, 46, 189–194.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).