Abstract
The conditioning theory of the ℳℒ-weighted least squares and ℳℒ-weighted pseudoinverse problems is explored in this article. We begin by introducing three types of condition numbers for the ℳℒ-weighted pseudoinverse problem: normwise, mixed, and componentwise, along with their explicit expressions. Utilizing the derivative of the ℳℒ-weighted pseudoinverse, we then provide explicit condition number expressions for the solution of the ℳℒ-weighted least squares problem. To ensure reliable estimation of these condition numbers, we employ the small-sample statistical condition estimation method in three algorithms. The article concludes with numerical examples that highlight the results obtained.
Keywords:
ℳℒ-weighted pseudoinverse; ℳℒ-weighted least squares problem; normwise condition number; mixed and componentwise condition numbers; small-sample statistical condition estimation
MSC:
65F20; 65F35; 65F30; 15A12; 15A60
1. Introduction
The study of generalized inverses of matrices has been a very important research field since the middle of the last century and remains one of the most active research branches in the world [1,2,3]. Generalized inverses, including the weighted pseudoinverse, have numerous applications in various fields, such as control, networks, statistics, and econometrics [4,5,6,7]. The ℳℒ-weighted pseudoinverse of a matrix with respect to two weight matrices (of suitable orders) is defined as
where denotes the Moore–Penrose inverse of and . The ℳℒ-weighted pseudoinverse [3] originated from the ℳℒ-weighted least squares problem (ℳℒ-WLS), which is stated as follows:
where and are the ellipsoidal seminorms
with . The ℳℒ-WLS problem has a unique solution:
if and only if , with . In this case it can be shown [3] that
The ℳℒ-weighted pseudoinverse [8] is helpful in solving ℳℒ-WLS problems [2,9], which generalize the equality-constrained least squares (ELS) problem that has been widely explored in the literature (see, e.g., [9,10,11,12]). Eldén [9] studied perturbation theory for this problem, whereas Cox et al. [12] derived an upper perturbation bound and provided the normwise condition number. Li and Wang [13] presented structured and unstructured partial normwise condition numbers, whereas Diao [14] provided partial mixed and componentwise condition numbers for this problem. To date, however, the condition numbers for the ℳℒ-WLS problem have not been explored. Motivated by this, and considering their significance in ELS research, we present explicit representations of the normwise, mixed, and componentwise condition numbers for the ℳℒ-WLS problem, as well as their statistical estimation.
A large number of articles and monographs dealing with the ℳℒ-weighted pseudoinverse have appeared in the literature during the last two decades [1,3,8]. In special cases, the ℳℒ-weighted pseudoinverse reduces to the K-weighted pseudoinverse [3], to the generalized inverse [15] when the matrix has full row rank, and to the Moore–Penrose inverse [16] when both weight matrices are identities. Wei and Zhang [8] discussed the structure and uniqueness of the ℳℒ-weighted pseudoinverse. Eldén [3] devised an algorithm for its computation. Wei [17] considered its expression using the GSVD. Gulliksson et al. [18] proposed a perturbation identity for it. Galba et al. [4] proposed iterative methods for its calculation, but these may not be appropriate for time-varying applications. Recurrent neural networks (RNNs) [2,6,7] are commonly used to compute time-varying solutions. Recently, Samar et al. [19,20] presented condition numbers and statistical estimates for the K-weighted pseudoinverse and the generalized inverse.
A fundamental idea in numerical analysis is the condition number, which expresses how sensitive a function's output is to small variations in its input; it predicts the worst-case effect of input errors on the results of a computation. Various condition numbers are available, reflecting different aspects of the input and output data. The normwise condition number [21] disregards the scaling structure of both the input and output data. The mixed and componentwise condition numbers [22], by contrast, take the scaling structure of the data into account: the mixed condition number measures errors in the input data componentwise and errors in the output data normwise, while the componentwise condition number measures errors in both the input and output data componentwise. The condition numbers of the ℳℒ-weighted pseudoinverse, associated with its two weight matrices, and their estimation have not been investigated until now. It is therefore worthwhile to establish generalized results that encapsulate pre-existing results in the scientific literature.
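The distinction matters in practice. A hedged toy illustration: a badly scaled diagonal system has an enormous normwise condition number, yet every solution component is determined to full relative accuracy, which the componentwise viewpoint captures.

```python
import numpy as np

# Badly scaled diagonal system: normwise conditioning looks terrible,
# yet each component of the solution is perfectly well determined.
D = np.diag([1.0, 1e-10])
b = np.array([2.0, 3e-10])
x = np.linalg.solve(D, b)

print(np.linalg.cond(D))  # normwise condition number ~ 1e10
print(x)                  # ~ [2. 3.]: full relative accuracy per component
```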
The article is organized as follows: Preliminaries and basic properties, which help in understanding the results presented in the paper, are summarized in Section 2. The normwise, mixed, and componentwise condition numbers for the ℳℒ-weighted pseudoinverse are discussed in Section 3, and the condition number expressions for the ℳℒ-WLS solution are obtained in Section 4. A highly reliable statistical estimate of the condition numbers is obtained using the small-sample statistical condition estimation (SSCE) method [23] in Section 5, together with numerical examples illustrating the results. These examples demonstrate the efficiency of the estimators and highlight the distinction between the normwise condition numbers and the mixed and componentwise condition numbers.
Throughout this article, ℝ^{m×n} denotes the set of m × n real matrices. For a matrix X, X^T is its transpose, rank(X) its rank, ‖X‖₂ the spectral norm of X, and ‖X‖_F the Frobenius norm of X. For a vector u, ‖u‖_∞ is its ∞-norm and ‖u‖₂ its 2-norm. The notation |X| stands for the matrix whose components are the absolute values of the corresponding components of X.
2. Preliminaries
In this part, we will present several definitions and key findings that will be utilized in the following sections. The entry-wise division [24] between the vectors is defined as
where is diagonal with diagonal elements Here, for a number , is defined by
It is obvious that has components Similarly, for is defined as follows:
We describe the relative distance between u and v using the entry-wise division as
In other words, we take into account the absolute distance at zero components and the relative distance at nonzero components.
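This mixed absolute/relative distance can be sketched as a small helper, following the convention above: relative error at nonzero components, absolute error at zero components.

```python
import numpy as np

def rel_dist(u, v):
    """Entry-wise distance between u and v: relative where v is nonzero,
    absolute where v is zero (the convention of Section 2)."""
    denom = np.where(v != 0, np.abs(v), 1.0)
    return np.max(np.abs(u - v) / denom)
```

For example, `rel_dist([2.2, 0.1], [2.0, 0.0])` measures the first entry relatively (error 0.1) and the second absolutely (error 0.1).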
To define the normwise, mixed, and componentwise condition numbers, we first determine the relevant sets and then consider the following definitions:
Definition 1
([24]). Assume that is a continuous mapping defined on an open set and , such that .
(i) The normwise condition number of χ at u is stated as
(ii) The mixed condition number of χ at u is stated as
(iii) The componentwise condition number of χ at u is stated as
Using the Fréchet derivative, the next lemma provides explicit expressions for these three condition numbers.
Lemma 1
([24]). Assuming the same specifications as in Definition 1, if χ is differentiable at u, then we obtain
Here, represents the Fréchet derivative of χ at point u.
In order to derive explicit formulas for the previously mentioned condition numbers, we require certain properties of the Kronecker product [25] between matrices A and B. Here, the operator ‘vec’, which stacks the columns of a matrix into a single vector, is defined as
for with and the Kronecker product between and defined as .
Here, the matrix X should have an appropriate dimension. Moreover, is the vec-permutation matrix, and its definition is based on the dimensions m and n.
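The key identity used repeatedly below, vec(AXB) = (Bᵀ ⊗ A) vec(X), can be checked numerically; note that NumPy needs column-major (`order="F"`) flattening to match the column-stacking vec operator.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
X = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 2))

vec = lambda M: M.flatten(order="F")  # column-stacking vec operator

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
print(np.allclose(lhs, rhs))  # True
```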
Moving forward, we provide two useful lemmas. These will help in the calculation of condition numbers as well as in determining their upper bounds.
Lemma 2
([26], p. 174, Theorem 5). Let S be an open subset of , and let be a matrix function defined and times (continuously) differentiable on S. If the rank of is constant on S, then its Moore–Penrose inverse is k times (continuously) differentiable on S, and
Lemma 3
([16]). For any matrices Z and S that have dimensions such that
are well-defined, we have
and
3. Condition Numbers for ℳℒ-Weighted Pseudoinverse
To derive the explicit expression of the condition numbers of the ℳℒ-weighted pseudoinverse, we define the mapping by
Here, , , and for a matrix ,
The definitions of the normwise, mixed, and componentwise condition numbers for the ℳℒ-weighted pseudoinverse are given below, following [27] and using Definition 1.
By applying the vec operator and the spectral, Frobenius, and max norms, we specialize the aforementioned definitions accordingly.
The expression for the Fréchet derivative of ϕ at w is given below.
Lemma 4.
Suppose that ϕ is a continuous mapping. Then it is Fréchet differentiable at w, and its Fréchet derivative is:
where
with
Proof.
By differentiating both sides of (1), we acquire
Considering the facts
and
which follow from ([9], Theorem 2.1) and Lemma 2, and
Using (20) and the result , the above equation can be rewritten as
Furthermore, given that , (20), and
we can simplify the above equation. Then, considering (21) and (22), we have
That is,
The definition of Fréchet derivative yields the expected results. □
Remark 1.
Assuming and , the Fréchet derivative of ϕ at w can be described as follows:
where
The latter is simply the result of ([19], Lemma 4), which allows us to recover the condition numbers of the K-weighted pseudoinverse [19]. Note that [19] uses notation different from this paper (the two weight matrices are interchanged).
Remark 2.
Considering and as identity matrices, we obtain
which yields the outcome in ([16], Lemma 10), from which the condition numbers for the Moore–Penrose inverse [16] can be obtained.
Next, we provide the normwise, mixed, and componentwise condition numbers for the ℳℒ-weighted pseudoinverse, which are direct consequences of Lemmas 1 and 4.
In the following corollary, we propose easily computable upper bounds to reduce the cost of determining these condition numbers. Numerical investigations in Section 5 illustrate the reliability of these bounds.
Corollary 1.
The upper bounds for the normwise, mixed, and componentwise condition numbers for the ℳℒ-weighted pseudoinverse are
Proof.
Using the known property for a pair of matrices U and V, together with Theorem 1 and (7), we obtain
Again using Theorem 1 and Lemma 3, we have
and
□
4. Condition Numbers for ℳℒ-Weighted Least Squares Problem
First, we define the ℳℒ-WLS problem mapping ψ by
Then, using Definition 1, we denote the normwise, mixed, and componentwise condition numbers for the ℳℒ-WLS problem as follows:
Lemma 5.
The mapping ψ is continuous, Fréchet differentiable at , and
where
with
Proof.
Differentiating both sides of (28), we obtain
Considering in (30), we obtain
That is,
Hence, the required results can be obtained by using the definition of Fréchet derivative. □
Remark 3.
Assuming and as identity matrices and using (32), we obtain
This recovers the result stated in ([16], Lemma 11), from which the condition numbers of the linear least squares solution [16] can be acquired.
Now, we give the normwise, mixed, and componentwise condition numbers for the ℳℒ-WLS solution, which are immediate consequences of Lemmas 1 and 5.
Theorem 2.
The next corollary yields easily computable upper bounds for the ℳℒ-WLS solution. Numerical investigations in Section 5 confirm the reliability of these bounds.
Corollary 2.
The upper bounds for the normwise, mixed, and componentwise condition numbers for the ℳℒ-WLS solution are
Next, we give an alternative expression for the normwise condition number that avoids Kronecker products.
Theorem 3.
The normwise condition number of the ℳℒ-WLS solution is given by
where
5. Numerical Experiments
In this section, we first present reliable estimation algorithms for the normwise, mixed, and componentwise condition numbers using the small-sample statistical condition estimation (SSCE) method; we then demonstrate the accuracy of the proposed algorithms through numerical experiments. Kenney and Laub [23] introduced SSCE as a reliable method for estimating condition numbers, and it has been applied to linear least squares problems [13,28,29], indefinite least squares problems [20,30], and total least squares problems [31,32,33]. We propose Algorithms A, B, and C, based on the SSCE method [23], to estimate the normwise, mixed, and componentwise condition numbers of the ℳℒ-weighted pseudoinverse and of the ℳℒ-WLS solution.
Algorithm A (Small-sample statistical condition estimation method for the normwise condition number of the ℳℒ-weighted pseudoinverse)
- Generate matrices with each entry in and orthonormalize the following matrix to obtain by the modified Gram–Schmidt orthogonalization process. Each can be converted into the corresponding matrices by applying the unvec operation.
- Let . Approximate the Wallis factors and by
- Here, for any vector , the power operation is applied at each entry; likewise, the square and square-root operations are applied componentwise.
- Estimate the normwise condition number (33) by where .
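The structure of Algorithm A can be sketched generically as follows, assuming the Kenney–Laub estimator in its basic form: directional derivatives of a map f along k orthonormal random directions, scaled by Wallis factors. The function names and the finite-difference approximation of the derivative are illustrative assumptions; the paper itself uses the explicit Fréchet derivative.

```python
import numpy as np

def wallis(p):
    """Approximate Wallis factor omega_p = E|z_1| for z uniform on the unit
    sphere S^(p-1); the Kenney-Laub approximation sqrt(2/(pi(p - 1/2)))."""
    return np.sqrt(2.0 / (np.pi * (p - 0.5)))

def sce_normwise(f, x, k=3, h=1e-6, rng=None):
    """Small-sample statistical estimate of ||J_f(x)||_F via finite
    differences along k orthonormal random directions (sketch)."""
    if rng is None:
        rng = np.random.default_rng()
    p = x.size
    Z, _ = np.linalg.qr(rng.standard_normal((p, k)))  # orthonormal directions
    fx = f(x)
    s = sum(np.linalg.norm((f(x + h * Z[:, i]) - fx) / h) ** 2 for i in range(k))
    return wallis(k) / wallis(p) * np.sqrt(s)
```

For k = p the directions span all of ℝᵖ, and for a linear map the estimate of the Frobenius norm of the Jacobian becomes exact.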
The corresponding SSCE method from [23], which has been used in numerous problems (see, for example, [27,32,33,34]), is required in order to estimate the mixed and componentwise condition numbers for the ℳℒ-weighted pseudoinverse.
Algorithm B (Small-sample statistical condition estimation method for the mixed and componentwise condition numbers of the ℳℒ-weighted pseudoinverse)
- Generate matrices with each entry in and orthonormalize the following matrix to obtain by the modified Gram–Schmidt orthogonalization process. Each can be converted into the corresponding matrices by applying the unvec operation. Let be the matrix multiplied by componentwise.
- Let . Approximate and by (38).
- For calculate by (39). Compute the absolute condition vector
- Estimate the mixed and componentwise condition estimates and as follows:
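Analogously, a generic sketch of the idea behind Algorithm B: scale the random directions componentwise by the magnitudes of the data, so that perturbations respect the data's scaling and sparsity, then normalize the resulting absolute condition vector by the output. The names and the finite-difference step are again illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def sce_mixed_componentwise(f, x, k=3, h=1e-6, rng=None):
    """Sketch of mixed/componentwise SCE: componentwise-scaled random
    directions and a per-component absolute condition vector."""
    if rng is None:
        rng = np.random.default_rng()
    p = x.size
    om = lambda q: np.sqrt(2.0 / (np.pi * (q - 0.5)))   # Wallis factor
    Z, _ = np.linalg.qr(rng.standard_normal((p, k)))
    fx = f(x)
    # derivative samples along directions scaled componentwise by |x|
    G = np.column_stack([(f(x + h * (Z[:, i] * np.abs(x))) - fx) / h
                         for i in range(k)])
    kappa_abs = om(k) / om(p) * np.sqrt((G ** 2).sum(axis=1))
    mixed = np.max(kappa_abs) / np.max(np.abs(fx))
    compwise = np.max(kappa_abs / np.abs(fx))           # assumes fx nonzero
    return mixed, compwise
```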
In order to estimate the normwise, mixed, and componentwise condition numbers of the ℳℒ-WLS problem, we provide Algorithm C based on the SSCE approach [23].
Algorithm C (Small-sample statistical condition estimation method for the condition numbers of the ℳℒ-weighted least squares problem)
- Generate matrices with entries in , where . Orthonormalize the matrix below to obtain an orthonormal matrix using the modified Gram–Schmidt orthogonalization technique. Each can be converted into the corresponding matrices by applying the unvec operation.
- Let . Approximate and by using (38).
- For compute from (31)Using the approximations for and , estimate the absolute condition vector
- Estimate the normwise condition estimation as follows:
- Compute the mixed condition estimation and componentwise condition estimation as follows:
Next, we provide three examples. In the first, we compare our SSCE-based estimates with the condition numbers of the ℳℒ-weighted pseudoinverse; it also shows how well Algorithms A and B perform in producing reliable estimates. The second demonstrates the accuracy of the statistical estimators of the normwise, mixed, and componentwise condition numbers for the ℳℒ-WLS solution. The third verifies the effectiveness of Algorithm C through the overestimation ratios related to the condition numbers of the ℳℒ-WLS solution.
Example 1.
We constructed 200 matrices by repeatedly applying the data matrices below and varying the value of θ.
The results in Table 1 show that Algorithms A and B can reliably estimate the condition numbers in the majority of instances. As stated in ([35], Chapter 15), an estimate of the condition number that is correct to within a factor of 10 is usually suitable because it is the magnitude of an error bound that is of interest rather than its precise value.
Table 1.
The efficiency of statistical condition estimates by Algorithms A and B.
Figure 1 demonstrates that Algorithms A and B are very efficient in estimating the condition numbers of the ℳℒ-weighted pseudoinverse. To evaluate their efficiency, we created 500 matrix pairings and set , , , and with fixed . In order to determine the effectiveness of Algorithms A and B, we specify the following ratios:
Figure 1.
Efficiency of condition estimators of Algorithms A and B.
Example 2.
The nonsymmetric Gaussian random Toeplitz matrix is constructed using the Matlab function with and , where . Assume that and are random orthogonal matrices, and is a diagonal matrix with a prescribed condition number and positive diagonal elements. We then form the matrices and as
where . The residual vector and the solution , with denoting a random vector of a given norm and . Here, we construct 200 random ℳℒ-WLS problems for each specified to check the performance of Algorithm C.
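The construction used in this example, a matrix with prescribed condition number built from random orthogonal factors and a chosen singular spectrum, can be sketched as follows; the log-spaced spectrum is an illustrative choice.

```python
import numpy as np

def random_matrix_with_cond(m, n, cond, rng):
    """m x n matrix with 2-norm condition number `cond` (sketch):
    U diag(sigma) V^T with random orthogonal U, V and log-spaced sigma."""
    U, _ = np.linalg.qr(rng.standard_normal((m, m)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    k = min(m, n)
    S = np.zeros((m, n))
    S[:k, :k] = np.diag(np.logspace(0, -np.log10(cond), k))
    return U @ S @ V.T

A = random_matrix_with_cond(6, 4, 1e3, np.random.default_rng(0))
print(np.linalg.cond(A))  # ~1e3
```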
Given the results in Table 2, the mixed and componentwise condition numbers are more appropriate than the normwise condition number for describing the underlying conditioning of this ℳℒ-WLS problem. Furthermore, we observe that SSCE-based condition estimates yield accurate results when computed by Algorithm C.
Table 2.
The efficiency of statistical condition estimates by Algorithm C.
The ratios between the exact condition numbers and their estimated values are listed here.
In order to determine the effectiveness of Algorithm C, we generate 500 random ℳℒ-WLS problems with , , , and , and we use the parameter . As seen in Figure 2, the normwise condition estimation, , is not as effective as the mixed condition estimation, , and the componentwise condition estimation, .
Figure 2.
Efficiency of condition estimators of Algorithm C.
Example 3.
Consider the random orthogonal matrices , , ; then the matrices , , and are given by
with appropriate sizes, where , and are diagonal matrices with diagonal elements distributed exponentially from to 1. Furthermore, we define the solution x as and , where r is a random vector with a given 2-norm.
We generate the perturbations as
In this example, the componentwise product of two matrices is indicated by ⊙, and the random matrices E, F, G, and g have uniformly distributed components in the open interval and .
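Generating componentwise perturbations of the form ε(E ⊙ A), with the entries of E uniform in (−1, 1), can be sketched as follows; note that zero entries of the data remain unperturbed, which is exactly why componentwise analysis respects sparsity.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
A[0, 0] = 0.0                        # keep one structural zero
eps = 1e-8

E = 2.0 * rng.random(A.shape) - 1.0  # entries uniform in (-1, 1)
dA = eps * E * A                     # componentwise product eps * (E ⊙ A)

print(dA[0, 0])                               # 0.0: zeros stay zero
print(np.all(np.abs(dA) <= eps * np.abs(A)))  # True
```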
To evaluate the accuracy of the estimators, we define the overestimation ratios.
To check the performance of Algorithm C, we constructed 500 random ℳℒ-WLS problems with , , , and , using the parameter and the outputs , , and of Algorithm C. Figure 3 illustrates that the mixed condition estimation, , and the componentwise condition estimation, , are more efficient than the normwise condition estimation, ; the latter tends to significantly overestimate the true relative normwise error.
Figure 3.
Efficiency of overestimation ratios of Algorithm C.
6. Conclusions
This article presents explicit expressions and upper bounds for the normwise, mixed, and componentwise condition numbers of the ℳℒ-weighted pseudoinverse. As special cases, the results for the K-weighted pseudoinverse and the Moore–Penrose inverse are also recovered. Additionally, we show how the condition numbers of the ℳℒ-weighted least squares solution are derived from those of the ℳℒ-weighted pseudoinverse. We proposed three algorithms, based on the small-sample statistical condition estimation method, to efficiently estimate the normwise, mixed, and componentwise condition numbers of the ℳℒ-weighted pseudoinverse and the ℳℒ-weighted least squares solution. Finally, numerical results confirmed the efficacy and accuracy of the algorithms. In the future, we will continue our study of the ℳℒ-weighted pseudoinverse.
Author Contributions
Conceptualization, M.S. and X.Z.; Methodology, M.S.; Software, M.S.; Validation, H.X.; Formal analysis, X.Z. and H.X.; Investigation, H.X.; Resources, X.Z.; Data curation, X.Z.; Writing—original draft, M.S.; Supervision, X.Z.; Project administration, X.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the Zhejiang Normal University Postdoctoral Research Fund (Grant No. ZC304022938), the Natural Science Foundation of China (Project No. 61976196), and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LZ22F030003).
Data Availability Statement
Data are contained within the article.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Nashed, M.Z. Generalized Inverses and Applications: Proceedings of an Advanced Seminar Sponsored by the Mathematics Research Center, The University of Wisconsin-Madison, New York, NY, USA, 8–10 October 1973; Academic Press, Inc.: Cambridge, MA, USA, 1976; pp. 8–10. [Google Scholar]
- Qiao, S.; Wei, Y.; Zhang, X. Computing Time-Varying ML-Weighted Pseudoinverse by the Zhang Neural Networks. Numer. Funct. Anal. Optim. 2020, 41, 1672–1693. [Google Scholar] [CrossRef]
- Eldén, L. A weighted pseudoinverse, generalized singular values, and constrained least squares problems. BIT 1982, 22, 487–502. [Google Scholar] [CrossRef]
- Galba, E.F.; Neka, V.S.; Sergienko, I.V. Weighted pseudoinverses and weighted normal pseudosolutions with singular weights. Comput. Math. Math. Phys. 2009, 49, 1281–1363. [Google Scholar] [CrossRef]
- Rubini, L.; Cancelliere, R.; Gallinari, P.; Grosso, A.; Raiti, A. Computational experience with pseudoinversion-based training of neural networks using random projection matrices. In Artificial Intelligence: Methodology, Systems, and Applications; Gennady, A., Hitzler, P., Krisnadhi, A., Kuznetsov, S.O., Eds.; Springer International Publishing: New York, NY, USA, 2014; pp. 236–245. [Google Scholar]
- Wang, X.; Wei, Y.; Stanimirovic, P.S. Complex neural network models for time-varying Drazin inverse. Neural Comput. 2016, 28, 2790–2824. [Google Scholar] [CrossRef]
- Zivkovic, I.S.; Stanimirovic, P.S.; Wei, Y. Recurrent neural network for computing outer inverse. Neural Comput. 2016, 28, 970–998. [Google Scholar] [CrossRef]
- Wei, M.; Zhang, B. Structures and uniqueness conditions of MK-weighted pseudoinverses. BIT 1994, 34, 437–450. [Google Scholar] [CrossRef]
- Eldén, L. Perturbation theory for the least squares problem with equality constraints. SIAM J. Numer. Anal. 1980, 17, 338–350. [Google Scholar] [CrossRef]
- Björck, Å. Numerical Methods for Least Squares Problems; SIAM: Philadelphia, PA, USA, 1996. [Google Scholar]
- Wei, M. Perturbation theory for rank-deficient equality constrained least squares problem. SIAM J. Numer. Anal. 1992, 29, 1462–1481. [Google Scholar] [CrossRef]
- Cox, A.J.; Higham, N.J. Accuracy and stability of the null space method for solving the equality constrained least squares problem. BIT 1999, 39, 34–50. [Google Scholar] [CrossRef]
- Li, H.; Wang, S. Partial condition number for the equality constrained linear least squares problem. Calcolo 2017, 54, 1121–1146. [Google Scholar] [CrossRef]
- Diao, H. Condition numbers for a linear function of the solution of the linear least squares problem with equality constraints. J. Comput. Appl. Math. 2018, 344, 640–656. [Google Scholar] [CrossRef]
- Björck, Å.; Higham, N.J.; Harikrishna, P. The equality constrained indefinite least squares problem: Theory and algorithms. BIT Numer. Math. 2003, 43, 505–517. [Google Scholar]
- Cucker, F.; Diao, H.; Wei, Y. On mixed and componentwise condition numbers for Moore Penrose inverse and linear least squares problems. Math Comp. 2007, 76, 947–963. [Google Scholar] [CrossRef]
- Wei, M. Algebraic properties of the rank-deficient equality-constrained and weighted least squares problem. Linear Algebra Appl. 1992, 161, 27–43. [Google Scholar] [CrossRef]
- Gulliksson, M.E.; Wedin, P.A.; Wei, Y. Perturbation identities for regularized Tikhonov inverses and weighted pseudoinverses. BIT 2000, 40, 513–523. [Google Scholar] [CrossRef]
- Samar, M.; Li, H.; Wei, Y. Condition numbers for the K-weighted pseudoinverse and their statistical estimation. Linear Multilinear Algebra 2021, 69, 752–770. [Google Scholar] [CrossRef]
- Samar, M.; Zhu, X.; Shakoor, A. Conditioning theory for Generalized inverse and their estimations. Mathematics 2023, 11, 2111. [Google Scholar] [CrossRef]
- Rice, J. A theory of condition. SIAM J. Numer. Anal. 1966, 3, 287–310. [Google Scholar] [CrossRef]
- Gohberg, I.; Koltracht, I. Mixed, componentwise, and structured condition numbers. SIAM J. Matrix Anal. Appl. 1993, 14, 688–704. [Google Scholar] [CrossRef]
- Kenney, C.S.; Laub, A.J. Small-sample statistical condition estimates for general matrix functions. SIAM J. Sci. Comput. 1994, 15, 36–61. [Google Scholar] [CrossRef]
- Xie, Z.; Li, W.; Jin, X. On condition numbers for the canonical generalized polar decomposition of real matrices. Electron. J. Linear Algebra. 2013, 26, 842–857. [Google Scholar] [CrossRef]
- Horn, R.A.; Johnson, C.R. Topics in Matrix Analysis; Cambridge University Press: New York, NY, USA, 1991. [Google Scholar]
- Magnus, J.R.; Neudecker, H. Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd ed.; John Wiley and Sons: Chichester, NH, USA, 2007. [Google Scholar]
- Diao, H.; Xiang, H.; Wei, Y. Mixed, componentwise condition numbers and small sample statistical condition estimation of Sylvester equations. Numer. Linear Algebra Appl. 2012, 19, 639–654. [Google Scholar] [CrossRef]
- Baboulin, M.; Gratton, S.; Lacroix, R.; Laub, A.J. Statistical estimates for the conditioning of linear least squares problems. Lect. Notes Comput. Sci. 2014, 8384, 124–133. [Google Scholar]
- Samar, M. A condition analysis of the constrained and weighted least squares problem using dual norms and their statistical estimation. Taiwan. J. Math. 2021, 25, 717–741. [Google Scholar] [CrossRef]
- Li, H.; Wang, S. On the partial condition numbers for the indefinite least squares problem. Appl. Numer. Math. 2018, 123, 200–220. [Google Scholar] [CrossRef]
- Samar, M.; Zhu, X. Structured conditioning theory for the total least squares problem with linear equality constraint and their estimation. AIMS Math. 2023, 8, 11350–11372. [Google Scholar] [CrossRef]
- Diao, H.; Wei, Y.; Xie, P. Small sample statistical condition estimation for the total least squares problem. Numer. Algor. 2017, 75, 435–455. [Google Scholar] [CrossRef]
- Samar, M.; Lin, F. Perturbation and condition numbers for the Tikhonov regularization of total least squares problem and their statistical estimation. J. Comput. Appl. Math. 2022, 411, 114230. [Google Scholar] [CrossRef]
- Diao, H.; Wei, Y.; Qiao, S. Structured condition numbers of structured Tikhonov regularization problem and their estimations. J. Comput. Appl. Math. 2016, 308, 276–300. [Google Scholar] [CrossRef]
- Higham, N.J. Accuracy and Stability of Numerical Algorithms, 2nd ed.; SIAM: Philadelphia, PA, USA, 2002. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).