Abstract
A randomized block Kaczmarz method and a randomized extended block Kaczmarz method are proposed for solving the matrix equation , where the matrices A and B may be full-rank or rank-deficient. These methods are iterative methods without matrix multiplication, and are especially suitable for solving large-scale matrix equations. It is theoretically proved that these methods converge to the solution or least-square solution of the matrix equation. The numerical results show that these methods are more efficient than the existing algorithms for high-dimensional matrix equations.
Keywords:
matrix equation; randomized block Kaczmarz; randomized extended block Kaczmarz; convergence MSC:
65F10; 65F45; 65H10
1. Introduction
Consider the linear matrix equation
where , and . Such problems arise in many practical applications such as surface fitting in computer-aided geometric design (CAGD), signal and image processing, photogrammetry, etc.; see, for example, [1,2,3,4] and the large body of literature therein. If is consistent, is the minimum Frobenius norm solution. If is inconsistent, is the minimum Frobenius norm least-squares solution. When the matrices A and B are small and dense, direct methods based on QR fractions are attractive [5,6]. However, for large A and B matrices, iterative methods have attracted a lot of attention [7,8,9,10,11]. Recently, Du et al. proposed the randomized block coordinate descent (RBCD) method for solving the matrix least-squares problem without strong convexity assumption in [12]. This method requires that matrix B is a full row-rank matrix. Wu et al. [13] introduced two kinds of Kaczmarz-type methods to solve the consistent matrix equation : relaxed greedy randomized Kaczmarz (ME-RGRK) and maximal weighted residual Kaczmarz (ME-MWRK). Although the row and column index selection strategy is time-consuming, the ideas of these two methods are suitable for solving large-scale consistent matrix equations.
In this paper, the randomized Kaczmarz method [14] and the randomized extended Kaczmarz method [15] are used to solve consistent and inconsistent matrix equation (1) with the product of the matrix and vector.
All the results in this paper hold in the complex field. But for the sake of simplicity, we only discuss them in terms of the real number field.
In this paper, we denote , , , , and as the transpose, the Moore–Penrose generalized inverse, the rank of A, the column space of A, the Frobenius norm of A and the inner product of two matrices A and B, respectively. For an integer , let . We use I to denote the identity matrix whose order is clear from the context. In addition, for a given matrix , , , and are used to denote the ith row, the jth column, the maximum singular value and the smallest nonzero singular value of G, respectively. Let denote the expected value conditional on the first k iterations, that is, where and are the row and the column chosen at the sth iteration. Let the conditional expectations with respect to the random row index be and with respect to the random column index be By the law of total expectation, it holds that .
The organization of this paper is as follows. In Section 2, we will discuss the block Kaczmarz method (ME-RBK) for finding the minimal F-norm solution () of consistent matrix equation (1). In Section 3, we discuss the extended block Kaczmarz method (IME-REBK) for finding the minimal F-norm least-squares solution of matrix equation (1). In Section 4, some numerical examples are provided to illustrate the effectiveness of our new methods. Finally, some brief concluding remarks are described in Section 5.
2. The Randomized Block Kaczmarz Method for Consistent Equation
At the kth iteration, the Kaczmarz method selects randomized a row of A and performs an orthogonal projection of the current estimate matrix onto the corresponding hyperplane , that is,
The Lagrangian function of the conditional optimization problem (2) is
where is a Lagrangian multiplier. Via the matrix differentiation, we obtain the gradient of and set to find the stationary matrix:
Using the first equation of (4), we have . Substituting this into the second equation of (4), we can obtain . So, the projected randomized block Kacmarz (ME-PRBK) for solving iterates as
However, in practice, it is very expensive to calculate the pseudoinverse of large-scale matrices. Next, we generalize the average block Kaczmarz method [16] for solving linear equation to matrix equation.
At the kth step, we obtain the approximate solution by projecting the current estimate onto the hyperplane . Using the Lagrangian multiplier method, we can obtain the following Kaczmarz method for :
Inspired by the idea of the average block Kaczmaz algorithm for , we consider the average block Kaczmaz method for with respect to B.
where is stepsize and are the weights that satisfy and . If , then
Setting , we obtain the following randomized block Kaczmarz iteration:
where i is selected with probability . We describe this method as Algorithm 1, which is called the ME-RBK algorithm.
| Algorithm 1 Randomized Block Kaczmarz Method for (ME-RBK) |
|
We arrange the computational process of calculating in Table 1, which only costs flopping operations (flops) if the square of the row norm of A has been calculated in advance.
Table 1.
The complexities of computing in ME-RBK.
Remark 1.
Note that the problem of finding a solution of can be posed as the following linear least-squares problem:
Define the component function
then differentiate with X to obtain its gradient
First, we give the following lemma, whose proof can be found in [12].
Lemma 1
([12]). Let and be any nonzero matrix. Let
For any matrix , it holds that
Remark 2.
means that and . In fact, is well defined because and .
In the following theorem, with the idea of the RK method [14], we will prove that generated by Algorithm 1 converges to the least F-norm solution of .
Theorem 1.
Assume . If matrix equation (1) is consistent, the sequence generated by the ME-RBK method starting from the initial matrix , in which and , converges linearly to in mean square form. Moreover, the solution error in expectation for the iteration sequence obeys
where , and picked with probability .
Proof.
It follows from
and
that
By taking the conditional expectation, we have
From and , we have . Noting , it is easy to show that through induction. Then, from Lemma 1 and , we can obtain
Remark 3.
Using a similar approach to that used in the proof of Theorem 1, we can prove that the iterate generated by ME-PRBK (5) satisfies the following estimate:
where . The convergence factor of GRK in [18] is . It is obvious that
and when . This means that the convergence factor of ME-PRBK is the smallest and the factor of ME-RBK can be smaller than that of GRK when α is properly selected.
3. The Randomized Extended Block Kaczmarz Method for Inconsistent Equation
In [15,19,20], the authors proved that the Kaczmarz method does not converge to the least-squares solution of when is inconsistent. Analogously, if the matrix equation (1) is inconsistent, the above ME-PRBK method dose not converge to . The following theorem gives the error bound of the inconsistent matrix equation.
Theorem 2.
Assume that the consistent equation has a solution . Let denote the kth iterate of the ME-PRBK method applied to the inconsistent equation for any starting from the initial matrix , in which and . In exact arithmetic, it follows that
Proof.
Set , . Let Y denote the iterate of the PRBK method applied to the consistent equation at the kth step, that is,
It follows from
and
that
By taking the conditional expectation on both sides of (11), we can obtain
The inequality is obtained using Remark 3. Applying this recursive relation iteratively, we have
This completes the proof. □
Next, we use the idea of the randomized extended Kaczmarz method (see [20,21,22] for details) to solve the least-squares solution of the inconsistent Equation (1). At each iteration, is the kth iterate of ME-RBK applied to with the initial guess , and is the one-step ME-RBK update for . We can obtain the following randomized extended block Kaczmarz iteration:
where is the step size, and i and j are selected with probability and , respectively. The cost of each iteration of this method is for updating and for updating if the square of the row norm and the column norm of A have been calculated in advance. We describe this method as Algorithm 2, which is called the ME-REBK algorithm.
| Algorithm 2 Randomized Extended Block Kaczmarz Method for (ME-REBK) |
|
Theorem 3.
Assume . Let denote the kth iteration of ME-RBK applied to starting from the initial matrix , in which and . Then, converges linearly to in mean square form, and the solution error in expectation for the iteration sequence obeys
where the jth column of A is selected with probability .
Proof.
In Theorem 1, replacing A with , B with and C with 0, we can prove Theorem 3 based on the result of Theorem 1. For the sake of conciseness, we omit the proof process. □
Theorem 4.
Assume . The sequence is generated using the ME-REBK method for , starting from the initial matrix and , where , and . For any , it holds that
where , are picked with probability and , respectively.
Proof.
Let denote the kth iteration of the ME-REBK method for , and be the one-step Kaczmarz update for the matrix equation from , i.e.,
We have
and
For any , via triangle inequality and Young’s inequality, we can obtain
By taking the conditional expectation on the both sides of (15), we have
It follows from
that
By Theorem 3, it yields
From , we have . Then, by using Theorem 1, we can obtain
then
This completes the proof. □
Remark 4.
Replacing in (12) with , we obtain the following projection-based randomized extended block Kaczmarz mathod (ME-PREBK) iteration:
4. Numerical Experiments
In this section, we will present some experimental results of the proposed algorithms for solving various matrix equations, and compare them with ME-RGRK and ME-MWRK in [13] for consistent matrix equations and RBCD in [12] for inconsistent matrix equations. All experiments were carried out using MATLAB (version R2020a) on a DESKTOP-8CBRR86 with Intel(R) Core(TM) i7-4712MQ CPU @2.30GHz 2.29GHz, RAM 8GB and Windows 10.
All computations start from the initial guess , and are terminated once the relative error (RE) of the solution, defined by
at the current iteration , satisfies or exceeds the maximum iteration K = 50,000, where . We report the average number of iterations (denoted as “IT”) and the average computing time in seconds (denoted as“CPU”) for 20 repeated trial runs of the corresponding method. Three examples are tested, and A and B are generated as follows.
- Type I: For given , the entries of A and B are generated from a standard normal distribution, i.e.,
- Type II: Like [18], for given , and , we construct a matrix A by , where and are orthogonal column matrices, is a diagonal matrix whose first diagonal entries are uniformly distributed numbers in , and the last two diagonal entries are . The entries of B are generated using a similar method with parameters .
- Type III: The real-world sparse data come from the Florida sparse matrix collection [23]. Table 2 lists the features of these sparse matrices.
Table 2.
The detailed features of sparse matrices from [23].
Table 2.
The detailed features of sparse matrices from [23].
| Name | Size | Rank | Sparsity |
|---|---|---|---|
| ash219 | 85 | ||
| ash958 | 292 | ||
| divorce | 9 |
4.1. Consistent Matrix Equation
Given , we set with to construct a consistent matrix equation. First, we test the impact of in the ME-RBK method on the experimental results. Figure 1 plots the IT and CPU versus different “para” with different matrices in Table 3, where 0.1:0.1:1.9 so that satisfies in Theorem 1. From Figure 1, it can be seen that the number of iteration steps and the running time decrease with the increase in parameters. However, when , both IT and CPU begin to increase. The same situations occur when solving consistent or inconsistent equations with different matrices in Table 4 and Table 5. Therefore, we set in all experiments.
Figure 1.
IT (left) and CPU (right) of different para of ME-RBK for consistent matrix equations with differnt matrices in Table 3.
Table 3.
IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type I.
Table 4.
IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type II.
Table 5.
IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type III.
In Table 3, Table 4 and Table 5, we report the average IT and CPU of the ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK methods for solving consistent eqautions. In the following tables, the item “>” represents that the number of iteration steps exceeds the maximum iteration (50,000), and the item “-” represents that the method does not converge.
From these tables, we can see that the ME-RBK and ME-PRBK methods vastly outperform the ME-RGRK and ME-MWRK methods in terms of both IT and CPU times regardless of whether the matrices A and B are full column/row rank or not. As the matrix dimension increases, the CPU time of the ME-RBK and ME-PRBK methods increases slowly, while the running time of ME-RGRK and ME-MWRK increases dramatically.
In addition, when the matrix size is small, the ME-PRBK method is competitive, because the pseudoinverse is less expensive and the number of iteration steps is small. When the matrix size is large, the matrix is large, and the ME-RBK method is more challenging because it does not need to calculate the pseudoinverse (see the last line in Table 3).
4.2. Inconsistent Matrix Equation
To construct an inconsistent matrix equation, we set , where and R are random matrices which are generated by and . Numerical results of the RBCD, IME-REBK and IME-PREBK methods are listed in Table 6, Table 7 and Table 8. From these tables, we can see that the IME-PREBK method is better than the RBCD method in terms of IT and CPU time, especially when the is large (see the last line in Table 7). The IME-REBK method is not competitive for B with full row rank because it needs to solve two equations. However, when B does not have full row rank, the RBCD method does not converge, while the IME-REBK and IME-PREBK methods do.
Table 6.
IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type I.
Table 7.
IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type II.
Table 8.
IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type III.
5. Conclusions
In this paper, we have proposed a randomized block Kaczmarz algorithm for solving the consistent matrix equation and its extended version for the inconsistent case. Theoretically, we have proved that the proposed algorithms converge linearly to the unique minimal F-norm solution or least-squares solution (i.e., ) without requirements on A and B having full column/row rank. The numerical results show the effectiveness of the algorithms. Since the proposed algorithms only require one row or one column of A at each iteration without a matrix–matrix product, they are suitable for the scenarios where the matrix A is too large to fit in the memory or matrix multiplication is considerably expensive.
Author Contributions
Conceptualization, L.X.; Methodology, L.X. and W.B.; Validation, L.X. and W.L.; Writing—Original Draft Preparation, L.X.; Writing—Review and Editing, L.X., W.B. and W.L.; Software, L.X.; Visualization, L.X. and W.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data Availability Statement
The datasets that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgments
The authors are thankful to the referees for their constructive comments and valuable suggestions, which have greatly improved the original manuscript of this paper.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Rauhala, U.A. Introduction to array algebra, Photogramm. Eng. Remote Sens. 1980, 46, 177–192. [Google Scholar]
- Regalia, P.A.; Mitra, S.K. Kronecker products, unitary matrices and signal processing applications. SIAM Rev. 1989, 31, 586–613. [Google Scholar] [CrossRef]
- Liu, M.Z.; Li, B.J.; Guo, Q.J.; Zhu, C.; Hu, P.; Shao, Y. Progressive iterative approximation for regularized least square bivariate B-spline surface fitting. J. Comput. Appl. Math. 2018, 327, 175–187. [Google Scholar] [CrossRef]
- Liu, Z.Y.; Li, Z.; Ferreira, C.; Zhang, Y.L. Stationary splitting iterative methods for the matrix equation AXB=C. Appl. Math. Comput. 2020, 378, 125195. [Google Scholar] [CrossRef]
- Fausett, D.W.; Fulton, C.T. Large least squares problems involving Kronecker products. SIAM J. Matrix Anal. Appl. 1994, 15, 219–227. [Google Scholar] [CrossRef]
- Zha, H.Y. Comments on large least squares problems involving Kronecker products. SIAM J. Matrix Anal. Appl. 1995, 16, 1172. [Google Scholar] [CrossRef]
- Peng, Z.Y. An iterative method for the least squares symmetric solution of the linear matrix equation AXB = C. Appl. Math. Comput. 2005, 170, 711–723. [Google Scholar] [CrossRef]
- Ding, F.; Liu, P.X.; Ding, J. Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle. Appl. Math. Comput. 2008, 197, 41–50. [Google Scholar] [CrossRef]
- Huang, G.X.; Yin, F.; Guo, K. An iterative method for the skew-symmetric solution and the optimal approximate solution of the matrix equation AXB = C. J. Comput. Appl. Math. 2008, 212, 231–244. [Google Scholar] [CrossRef]
- Wang, X.; Li, Y.; Dai, L. On hermitian and skew-hermitian splitting iteration methods for the linear matrix equation AXB = C. Comput. Math. Appl. 2013, 65, 657–664. [Google Scholar] [CrossRef]
- Shafiei, S.G.; Hajarian, M. Developing Kaczmarz method for solving Sylvester matrix equations. J. Franklin Inst. 2022, 359, 8991–9005. [Google Scholar] [CrossRef]
- Du, K.; Ruan, C.C.; Sun, X.H. On the convergence of a randomized block coordinate descent algorithm for a matrix least squaress problem. Appl. Math. Lett. 2022, 124, 107689. [Google Scholar] [CrossRef]
- Wu, N.C.; Liu, C.Z.; Zuo, Q. On the Kaczmarz methods based on relaxed greedy selection for solving matrix equation AXB = C. J. Comput. Appl. Math. 2022, 413, 114374. [Google Scholar] [CrossRef]
- Strohmer, T.; Vershynin, R. A randomized Kaczmarz algorithm with exponential convergence. J. Fourier. Anal. Appl. 2009, 15, 262–278. [Google Scholar] [CrossRef]
- Zouzias, A.; Freris, N.M. Randomized extended Kaczmarz for solving least squares. SIAM J. Matrix. Anal. Appl. 2013, 34, 773–793. [Google Scholar] [CrossRef]
- Ion, N. Faster randomized block kaczmarz algorithms. SIAM J. Matrix Anal. Appl. 2019, 40, 1425–1452. [Google Scholar]
- Nemirovski, A.; Juditsky, A.; Lan, G.; Shapiro, A. Robust stochastic approximation approach to stochastic programming. SIAM J. Optimiz. 2009, 19, 1574–1609. [Google Scholar] [CrossRef]
- Niu, Y.Q.; Zheng, B. On global randomized block Kaczmarz algorithm for solving large-scale matrix equations. arXiv 2022, arXiv:2204.13920. [Google Scholar]
- Needell, D. Randomized Kaczmarz solver for noisy linear systems. BIT Numer. Math. 2010, 50, 395–403. [Google Scholar] [CrossRef]
- Ma, A.; Needell, D.A. Ramdas, Convergence properties of the randomized extended Gauss-Seidel and Kaczmarz methods. SIAM J. Matrix Anal. Appl. 2015, 36, 1590–1604. [Google Scholar] [CrossRef]
- Du, K. Tight upper bounds for the convergence of the randomized extended Kaczmarz and Gauss-Seidel algorithms. Numer. Linear Algebra Appl. 2019, 26, e2233. [Google Scholar] [CrossRef]
- Du, K.; Si, W.T.; Sun, X.H. Randomized extended average block kaczmarz for solving least squares. SIAM J. Sci. Comput. 2020, 42, A3541–A3559. [Google Scholar] [CrossRef]
- Davis, T.A.; Hu, Y. The university of Florida sparse matrix collection. Math. Softw. 2011, 38, 1–25. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).