Geometrical inverse preconditioning for symmetric positive definite matrices

We focus on inverse preconditioners based on minimizing $F(X) = 1-\cos(XA,I)$, where $XA$ is the preconditioned matrix and $A$ is symmetric and positive definite. We present and analyze gradient-type methods to minimize $F(X)$ on a suitable compact set. For that we use the geometrical properties of the non-polyhedral cone of symmetric and positive definite matrices, and also the special properties of $F(X)$ on the feasible set. Preliminary and encouraging numerical results are also presented in which dense and sparse approximations are included.


Introduction
Algebraic inverse preconditioning plays a key role in a wide variety of applications that involve the solution of large and sparse linear systems of equations; see, e.g., [4,6,7,9,14,22,23]. For a given square matrix $A$, there exist several proposals for constructing sparse inverse approximations which are based on optimization techniques, mainly on minimizing the Frobenius norm of the residual $(I - XA)$ over a set $P$ of matrices with a certain sparsity pattern; see, e.g., [2,7,8,9,10,12,18,19,13]. However, we must remark that when $A$ is symmetric and positive definite, minimizing the Frobenius norm of the residual will in general produce an inverse preconditioner which is neither symmetric nor positive definite; see, e.g., [2].
There is currently a growing interest in, and understanding of, the rich geometrical structure of the non-polyhedral cone of symmetric and positive semidefinite matrices (PSD); see, e.g., [1,5,12,19,16,17,24]. In this work, we focus on inverse preconditioners based on minimizing the positive-scaling-invariant function $F(X) = 1 - \cos(XA, I)$, instead of minimizing the Frobenius norm of the residual. Our approach takes advantage of the geometrical properties of the PSD cone, and also of the special properties of $F(X)$ on a suitable compact set, to introduce specialized gradient-type methods whose convergence properties we analyze.
The rest of the document is organized as follows. In Section 2, we develop and analyze two different gradient-type iterative schemes for finding inverse approximations based on minimizing $F(X)$, including sparse versions. In Section 3, we present numerical results on some well-known test matrices to illustrate the behavior and properties of the introduced gradient-type methods. Finally, in Section 4 we present some concluding remarks.

Gradient-type iterative methods
Let us recall that the cosine between two $n \times n$ real matrices $A$ and $B$ is defined as
$$\cos(A, B) = \frac{\langle A, B\rangle}{\|A\|_F\,\|B\|_F}, \qquad (1)$$
where $\langle A, B\rangle = \mathrm{trace}(B^T A)$ is the Frobenius inner product in the space of matrices and $\|\cdot\|_F$ is the associated Frobenius norm. By the Cauchy-Schwarz inequality it follows that $|\cos(A, B)| \le 1$, and the equality is attained if and only if $A = \gamma B$ for some nonzero real number $\gamma$.
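The definition of the matrix cosine can be illustrated with a few lines of NumPy. This is only a minimal sketch: the function name `frobenius_cosine` and the small test matrix are illustrative choices, not part of the original work.

```python
import numpy as np

def frobenius_cosine(A, B):
    """Cosine of the angle between A and B in the Frobenius inner product."""
    inner = np.trace(B.T @ A)  # <A, B> = trace(B^T A)
    return inner / (np.linalg.norm(A, "fro") * np.linalg.norm(B, "fro"))

# Small illustrative matrix (hypothetical data).
A = np.array([[2.0, 1.0], [1.0, 3.0]])

print(frobenius_cosine(A, 2.5 * A))    # positive multiple: cosine is 1 up to rounding
print(frobenius_cosine(A, -A))         # negative multiple: cosine is -1 up to rounding
print(frobenius_cosine(A, np.eye(2)))  # angle between A and the identity
```

The first two calls exercise the equality case of the Cauchy-Schwarz inequality, $A = \gamma B$ with $\gamma > 0$ and $\gamma < 0$ respectively.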
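To compute the inverse of a given symmetric and positive definite matrix $A$ we consider the function
$$F(X) = 1 - \cos(XA, I) \ge 0, \qquad (2)$$
for which the minimum value zero is reached at $X = \xi A^{-1}$, for any positive real number $\xi$. Let us recall that any positive semidefinite matrix $B$ has nonnegative diagonal entries and so $\mathrm{trace}(B) \ge 0$. Hence, if $XA$ is symmetric, we need to impose that $\langle XA, I\rangle = \mathrm{trace}(XA) \ge 0$ as a necessary condition for $XA$ to be in the PSD cone; see [5,17,24]. Therefore, in order to impose uniqueness in the PSD cone, we consider the constrained minimization problem
$$\min_{X \in S \cap T} F(X), \qquad (3)$$
where $S = \{X \in \mathbb{R}^{n\times n} \mid \|XA\|_F = \sqrt{n}\}$ and $T = \{X \in \mathbb{R}^{n\times n} \mid \mathrm{trace}(XA) \ge 0\}$. Notice that $S \cap T$ is a closed and bounded set, and so problem (3) is well-posed. The derivative of $F(X)$, denoted by $\nabla F(X)$, plays an important role in our work.

Lemma 2.1
$$\nabla F(X) = \frac{1}{\|I\|_F\,\|XA\|_F}\left(\frac{\langle XA, I\rangle}{\|XA\|_F^2}\,XA - I\right)A.$$

Proof. For fixed matrices $X$ and $Y$, we consider the function $\varphi(t) = F(X + tY)$. It is well-known that $\varphi'(0) = \langle \nabla F(X), Y\rangle$. We have
$$\varphi(t) = 1 - \frac{\langle XA, I\rangle + t\,\langle YA, I\rangle}{\|I\|_F\,\|XA + tYA\|_F},$$
and we obtain after differentiating $\varphi(t)$ and some algebraic manipulations
$$\varphi'(0) = \left\langle \frac{1}{\|I\|_F\,\|XA\|_F}\left(\frac{\langle XA, I\rangle}{\|XA\|_F^2}\,XA - I\right)A,\; Y\right\rangle,$$
and the result is established.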
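The merit function and its gradient can be checked numerically. The sketch below assumes the gradient expression $\nabla F(X) = \frac{1}{\sqrt{n}\,\|XA\|_F}\big(\frac{\langle XA, I\rangle}{\|XA\|_F^2} XA - I\big)A$, obtained by differentiating (2) for symmetric $A$, and compares $\langle \nabla F(X), Y\rangle$ with a central finite difference of $\varphi(t) = F(X + tY)$. The helper names and the random test data are hypothetical.

```python
import numpy as np

def F(X, A):
    """Merit function F(X) = 1 - cos(XA, I)."""
    XA = X @ A
    n = A.shape[0]
    return 1.0 - np.trace(XA) / (np.sqrt(n) * np.linalg.norm(XA, "fro"))

def grad_F(X, A):
    """Gradient of F with respect to X, assuming A is symmetric."""
    XA = X @ A
    n = A.shape[0]
    nrm = np.linalg.norm(XA, "fro")
    return ((np.trace(XA) / nrm**2) * XA - np.eye(n)) @ A / (np.sqrt(n) * nrm)

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)      # hypothetical SPD test matrix
X = rng.standard_normal((n, n))  # arbitrary point
Y = rng.standard_normal((n, n))  # arbitrary direction

t = 1e-6
fd = (F(X + t * Y, A) - F(X - t * Y, A)) / (2 * t)  # central difference of phi at t = 0
exact = np.sum(grad_F(X, A) * Y)                    # <grad F(X), Y>
print(abs(fd - exact))                              # should be tiny

print(F(3.0 * np.linalg.inv(A), A))                 # F vanishes at positive multiples of A^{-1}
print(np.sum(grad_F(X, A) * X))                     # <grad F(X), X> is zero up to rounding
```

The last line reflects the invariance of $F$ under positive scaling: moving along $X$ itself does not change the cosine.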
Before discussing different numerical schemes for solving problem (3), we need a couple of technical lemmas.
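Lemma 2.2 If $X \in S$, then $\langle \nabla F(X), X\rangle = 0$.

Proof. Since $X \in S$ then $\|XA\|_F^2 = n$, and we have
$$\nabla F(X) = \frac{1}{n}\left(\frac{\langle XA, I\rangle}{n}\,XA - I\right)A.$$
But $\langle XAA, X\rangle = \|XA\|_F^2 = n$, so
$$\langle \nabla F(X), X\rangle = \frac{\langle XA, I\rangle}{n} - \frac{1}{n}\langle A, X\rangle = 0,$$
since $\langle A, X\rangle = \langle AX, I\rangle = \langle XA, I\rangle$.
Proof. For every $X$ we have $X = A^{-1}AX$, and so and the result is established.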

The negative gradient direction
For the numerical solution of (3), we start by considering the classical gradient iterations that, from an initial guess $X^{(0)}$, are given by
$$X^{(k+1)} = X^{(k)} - \alpha_k \nabla F(X^{(k)}),$$
where $\alpha_k > 0$ is a suitable step length. A standard approach is to use the optimal choice, i.e., the positive step length that (exactly) minimizes the function $F(X)$ along the negative gradient direction. We present a closed formula for the optimal choice of step length in a more general setting, assuming that the iterative method is given by
$$X^{(k+1)} = X^{(k)} + \alpha_k D_k,$$
where $D_k$ is a search direction in the space of matrices.

Lemma 2.4
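The optimal step length $\alpha_k$, that optimizes $F(X^{(k)} + \alpha D_k)$, is given by
$$\alpha_k = \frac{n\,\langle D_kA, I\rangle - \langle X^{(k)}A, I\rangle\,\langle X^{(k)}A, D_kA\rangle}{\langle X^{(k)}A, I\rangle\,\langle D_kA, D_kA\rangle - \langle D_kA, I\rangle\,\langle X^{(k)}A, D_kA\rangle}.$$
Proof. Consider the auxiliary function in one variable
$$\psi(\alpha) = F(X^{(k)} + \alpha D_k) = 1 - \frac{\langle X^{(k)}A, I\rangle + \alpha\,\langle D_kA, I\rangle}{\sqrt{n}\,\|X^{(k)}A + \alpha D_kA\|_F}.$$
Differentiating $\psi(\alpha)$, using that $\langle X^{(k)}A, X^{(k)}A\rangle = n$, and also that
$$\frac{\partial}{\partial\alpha}\|X^{(k)}A + \alpha D_kA\|_F = \frac{\langle X^{(k)}A, D_kA\rangle + \alpha\,\langle D_kA, D_kA\rangle}{\|X^{(k)}A + \alpha D_kA\|_F},$$
and then forcing $\psi'(\alpha) = 0$ the result is obtained, after some algebraic manipulations.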
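The algebraic manipulations behind the step length can be spelled out as follows. This is a worked sketch under the assumption $\|X^{(k)}A\|_F^2 = n$; the shorthand symbols $w$, $a$, $b$, $c$ are introduced here only for readability and do not appear in the original text.

```latex
% Shorthand (illustrative): w = <XA, I>, a = <DA, I>, b = <XA, DA>, c = <DA, DA>,
% with X = X^{(k)}, D = D_k and <XA, XA> = n.
\[
\psi(\alpha) = 1 - \frac{w + \alpha a}{\sqrt{n}\,\sqrt{n + 2\alpha b + \alpha^2 c}},
\qquad
\psi'(\alpha) = -\,\frac{a\,(n + 2\alpha b + \alpha^2 c) - (w + \alpha a)(b + \alpha c)}
                       {\sqrt{n}\,(n + 2\alpha b + \alpha^2 c)^{3/2}} .
\]
% Expanding the numerator, the quadratic terms in alpha cancel:
\[
a\,(n + 2\alpha b + \alpha^2 c) - (w + \alpha a)(b + \alpha c)
  = (a n - w b) + \alpha\,(a b - w c),
\]
% so psi'(alpha) = 0 has the single root
\[
\alpha = \frac{n a - w b}{w c - a b} .
\]
```

Since the stationarity condition is linear in $\alpha$, the function $\psi$ has exactly one stationary point along the line $X^{(k)} + \alpha D_k$.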
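Remark 2.2 For our first approach, $D_k = -\nabla F(X^{(k)})$, and so for the optimal gradient method (also known as Cauchy method or steepest descent method) the step length is given by
$$\alpha_k = \frac{n\,\langle \nabla F(X^{(k)})A, I\rangle - \langle X^{(k)}A, I\rangle\,\langle X^{(k)}A, \nabla F(X^{(k)})A\rangle}{\langle \nabla F(X^{(k)})A, I\rangle\,\langle X^{(k)}A, \nabla F(X^{(k)})A\rangle - \langle X^{(k)}A, I\rangle\,\langle \nabla F(X^{(k)})A, \nabla F(X^{(k)})A\rangle}. \qquad (4)$$
Notice that if we use instead $D_k = \nabla F(X^{(k)})$, the obtained $\alpha_k$ which also forces $\psi'(\alpha_k) = 0$ is given by (4) but with a negative sign. Therefore, to guarantee that $\alpha_k > 0$ minimizes $F$ along the negative gradient direction to approximate $A^{-1}$, instead of maximizing $F$ along the gradient direction to approximate $-A^{-1}$, we will choose the step length $\alpha_k$ as the absolute value of the expression in (4).
Since $\|I\|_F = \sqrt{n}$, the gradient iterations can be written as
$$X^{(k+1)} = X^{(k)} - \frac{\alpha_k}{\sqrt{n}\,\|X^{(k)}A\|_F}\left(\frac{\langle X^{(k)}A, I\rangle}{\|X^{(k)}A\|_F^2}\,X^{(k)}A - I\right)A,$$
which can be further simplified by imposing the condition for uniqueness $\|X^{(k)}A\|_F = \sqrt{n}$.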
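The closed formula for the step length can be checked numerically. The following sketch assumes the expression $\alpha = (n a - w b)/(w c - a b)$ with $w = \langle XA, I\rangle$, $a = \langle DA, I\rangle$, $b = \langle XA, DA\rangle$, $c = \langle DA, DA\rangle$ (shorthand introduced here), valid when $\|XA\|_F^2 = n$, and verifies by a finite difference that the slope of $F$ along $D$ vanishes at that step. The names `optimal_step` and `F` and the diagonal test matrix are illustrative.

```python
import numpy as np

def F(X, A):
    """Merit function F(X) = 1 - cos(XA, I)."""
    XA = X @ A
    n = A.shape[0]
    return 1.0 - np.trace(XA) / (np.sqrt(n) * np.linalg.norm(XA, "fro"))

def optimal_step(X, D, A):
    """Stationary point of alpha -> F(X + alpha * D), assuming ||XA||_F^2 = n."""
    n = A.shape[0]
    XA, DA = X @ A, D @ A
    w = np.trace(XA)     # <XA, I>
    a = np.trace(DA)     # <DA, I>
    b = np.sum(XA * DA)  # <XA, DA>
    c = np.sum(DA * DA)  # <DA, DA>
    return (n * a - w * b) / (w * c - a * b)

A = np.diag([1.0, 2.0])                                  # hypothetical SPD matrix
n = A.shape[0]
X = (np.sqrt(n) / np.linalg.norm(A, "fro")) * np.eye(n)  # satisfies ||XA||_F = sqrt(n)
XA = X @ A
D = -(((np.trace(XA) / n) * XA - np.eye(n)) @ A) / n     # minus the gradient on S

alpha = optimal_step(X, D, A)
h = 1e-5
slope = (F(X + (alpha + h) * D, A) - F(X + (alpha - h) * D, A)) / (2 * h)
print(alpha, slope)                    # slope should be close to zero
print(F(X, A), F(X + alpha * D, A))    # the merit function decreases
```

Here the computed step is already positive; the absolute value discussed in Remark 2.2 only matters when the formula returns a negative number.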
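In that case we set
$$Z^{(k+1)} = X^{(k)} - \frac{\alpha_k}{n}\left(\frac{\langle X^{(k)}A, I\rangle}{n}\,X^{(k)}A - I\right)A, \qquad (5)$$
and then we multiply the matrix $Z^{(k+1)}$ by the factor $\sqrt{n}/\|Z^{(k+1)}A\|_F$ to guarantee that $X^{(k+1)} = (\sqrt{n}/\|Z^{(k+1)}A\|_F)\,Z^{(k+1)} \in S$. Concerning the condition that the sequence $\{X^{(k)}\}$ remains in $T$, in our next result we establish that if the step length $\alpha_k$ remains uniformly bounded from above, then $\mathrm{trace}(X^{(k)}A) > 0$ for all $k$.

Lemma 2.5 Assume that $w_0 = \mathrm{trace}(X^{(0)}A) > 0$ and that $0 < \alpha_k \le n^{3/2}/\|A\|_F^2$ for all $k$. Then $w_k = \mathrm{trace}(X^{(k)}A) > 0$ for all $k$.

Proof. We proceed by induction. Let us assume that $w_k = \mathrm{trace}(X^{(k)}A) > 0$. It follows that
$$\mathrm{trace}(Z^{(k+1)}A) = w_k - \frac{\alpha_k}{n}\left(\frac{w_k}{n}\langle X^{(k)}A, A^2\rangle - \|A\|_F^2\right) > w_k\left(1 - \frac{\alpha_k}{n^2}\langle X^{(k)}A, A^2\rangle\right) \ge w_k\left(1 - \frac{\alpha_k}{n^2}\sqrt{n}\,\|A\|_F^2\right),$$
where the last step uses the Cauchy-Schwarz inequality and $\|X^{(k)}A\|_F = \sqrt{n}$. Since $\alpha_k \le n^{3/2}/\|A\|_F^2$, then $\left(1 - \frac{\alpha_k}{n^2}\sqrt{n}\,\|A\|_F^2\right) \ge 0$, and we conclude that $\mathrm{trace}(Z^{(k+1)}A) > 0$. Since $X^{(k+1)}$ is obtained as a positive scaling factor of $Z^{(k+1)}$, then $w_{k+1} > 0$ and the result is established.

Now, for some given matrices $A$, we cannot guarantee that the step length computed as the absolute value of (4) will satisfy $\alpha_k \le n^{3/2}/\|A\|_F^2$ for all $k$. Therefore, if $\mathrm{trace}(X^{(k+1)}A) = \langle X^{(k+1)}A, I\rangle < 0$ then we will set in our algorithm $X^{(k+1)} = -X^{(k+1)}$ to guarantee that $\mathrm{trace}(X^{(k+1)}A) \ge 0$, and hence that the cosine between $X^{(k+1)}A$ and $I$ is nonnegative, which is a necessary condition to guarantee that $X^{(k+1)}$ remains in the PSD cone; see, e.g., [5,17,24].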
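We now present our steepest descent gradient algorithm, which will be referred to as the CauchyCos Algorithm.

Algorithm 1 : CauchyCos
1: Given $X^{(0)} \in$ PSD
2: for $k = 0, 1, \cdots$ until a stopping criterion is satisfied, do
3: Set $w_k = \langle X^{(k)}A, I\rangle$
4: Set $\nabla F(X^{(k)}) = \frac{1}{n}\left(\frac{w_k}{n}X^{(k)}A - I\right)A$
5: Set $\alpha_k$ as the absolute value of the expression in (4)
6: Set $Z^{(k+1)} = X^{(k)} - \alpha_k \nabla F(X^{(k)})$
7: Set $X^{(k+1)} = s\sqrt{n}\,Z^{(k+1)}/\|Z^{(k+1)}A\|_F$, where $s = 1$ if $\mathrm{trace}(Z^{(k+1)}A) > 0$, and $s = -1$ otherwise
8: end for

We note that if we start from $X^{(0)} = (\sqrt{n}/\|A\|_F)\,I$, for which $\|X^{(0)}A\|_F = \sqrt{n}$, then by construction $\|X^{(k)}A\|_F = \sqrt{n}$ for all $k \ge 0$. For that initial guess, $\mathrm{trace}(X^{(0)}A) = \langle X^{(0)}A, I\rangle > 0$ and again by construction all the iterates will remain in the PSD cone. Notice also that, at each iteration, we need to compute the three matrix-matrix products $X^{(k)}A$, $\left(\frac{w_k}{n}X^{(k)}A - I\right)A$, and $\nabla F(X^{(k)})A$, which for dense matrices require $n^3$ floating point operations (flops) each. Each of the remaining calculations (inner products and Frobenius norms) is obtained with $n$ column-oriented inner products that require $n$ flops each. Summing up, in the dense case, the computational cost of each iteration of the CauchyCos Algorithm is $3n^3 + O(n^2)$ flops. In Section 2.5, we will discuss a sparse version of the CauchyCos Algorithm and its computational cost.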
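A dense version of the steepest descent iteration described above can be sketched in NumPy as follows. This is an illustrative reading of the text, not a reference implementation: the function name `cauchy_cos`, the stopping test on $F(X) = 1 - \langle XA, I\rangle/n$ (valid on $S$), and the small tridiagonal test matrix are assumptions made for the example.

```python
import numpy as np

def cauchy_cos(A, tol=1e-2, max_iter=1000):
    """Sketch of steepest descent on F(X) = 1 - cos(XA, I) for symmetric positive definite A."""
    n = A.shape[0]
    I = np.eye(n)
    X = (np.sqrt(n) / np.linalg.norm(A, "fro")) * I  # initial guess with ||XA||_F = sqrt(n)
    for k in range(max_iter):
        XA = X @ A                                   # first matrix-matrix product
        w = np.trace(XA)
        if 1.0 - w / n <= tol:                       # F(X) = 1 - w/n on the set S
            return X, k
        G = ((w / n) * XA - I) @ A / n               # gradient on S (second product)
        GA = G @ A                                   # third product
        a, b, c = np.trace(GA), np.sum(XA * GA), np.sum(GA * GA)
        alpha = abs((n * a - w * b) / (a * b - w * c))  # absolute value of the optimal step
        Z = X - alpha * G
        ZA = XA - alpha * GA                         # no extra product needed
        s = 1.0 if np.trace(ZA) > 0 else -1.0        # keep trace(XA) nonnegative
        X = s * np.sqrt(n) * Z / np.linalg.norm(ZA, "fro")
    return X, max_iter

# Hypothetical small SPD test matrix.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
X, iters = cauchy_cos(A)
n = A.shape[0]
print(iters, 1.0 - np.trace(X @ A) / n)
```

Because the rescaling keeps $\|X^{(k)}A\|_F = \sqrt{n}$, the merit function reduces to $1 - w_k/n$, which is why the stopping test only needs the trace already computed in the loop.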

Convergence properties of the CauchyCos Algorithm
We start by establishing the commutativity of all iterates with the matrix A.
Lemma 2.6 If $X^{(0)}A = AX^{(0)}$, then $X^{(k)}A = AX^{(k)}$ for all $k \ge 0$ in the CauchyCos Algorithm.

Proof. We proceed by induction. Assume that $X^{(k)}A = AX^{(k)}$. It follows that
$$AZ^{(k+1)} = AX^{(k)} - \frac{\alpha_k}{n}\left(\frac{w_k}{n}AX^{(k)}A - A\right)A = X^{(k)}A - \frac{\alpha_k}{n}\left(\frac{w_k}{n}X^{(k)}A - I\right)AA = Z^{(k+1)}A,$$
and since $Z^{(k+1)}$ and $X^{(k+1)}$ differ only by a scaling factor, then $AX^{(k+1)} = X^{(k+1)}A$.
It is worth noticing that using Lemma 2.6 and (5), it follows by simple calculations that $Z^{(k)}$ as well as $X^{(k)}$ are symmetric matrices for all $k$. In turn, if $X^{(0)}A = AX^{(0)}$, this clearly implies using Lemma 2.6 that $X^{(k)}A$ is also a symmetric matrix for all $k$. Our next result establishes that the sequences generated by the CauchyCos Algorithm are uniformly bounded away from zero, and hence the algorithm is well-defined.

Proof. Using Lemmas 2.2 and 2.6 we have that which combined with the Cauchy-Schwarz inequality and Lemma 2.3 implies that for all $k$. Moreover, since $A$ is nonsingular then is bounded away from zero for all $k$.

Theorem 2.2 The sequence $\{X^{(k)}\}$ generated by the CauchyCos Algorithm converges to $A^{-1}$.
Proof. The sequence $\{X^{(k)}\} \subset S \cap T$, which is a closed and bounded set, therefore there exist limit points in $S \cap T$. Let $\hat{X}$ be a limit point of $\{X^{(k)}\}$, and let $\{X^{(k_j)}\}$ be a subsequence that converges to $\hat{X}$. Let us suppose, by way of contradiction, that $\nabla F(\hat{X}) \ne 0$.
In that case, the negative gradient, $-\nabla F(\hat{X}) \ne 0$, is a descent direction for the function $F$ at $\hat{X}$. Hence, there exists $\hat{\alpha} > 0$ such that
$$\delta = F(\hat{X} - \hat{\alpha}\nabla F(\hat{X})) - F(\hat{X}) < 0.$$
Consider now an auxiliary function $\theta : \mathbb{R}^{n\times n} \to \mathbb{R}$ given by
$$\theta(X) = F(X - \hat{\alpha}\nabla F(X)) - F(X).$$
Clearly, $\theta$ is a continuous function, and then $\theta(X^{(k_j)})$ converges to $\theta(\hat{X}) = \delta$. Therefore, for all $k_j$ sufficiently large, Now, since $\alpha_{k_j}$ was obtained using Lemma 2.4 as the exact optimal step length along the negative gradient direction, then using Remark 2.1 it follows that and thus, for all $k_j$ sufficiently large.
On the other hand, since $F$ is continuous, $F(X^{(k_j)})$ converges to $F(\hat{X})$. However, the whole sequence $\{F(X^{(k)})\}$ generated by the CauchyCos Algorithm is decreasing, and so $F(X^{(k)})$ converges to $F(\hat{X})$, and since $F$ is bounded below then for $k_j$ large enough Nevertheless, as we argued before, the whole sequence $F(X^{(k)})$ converges to $F(A^{-1}) = 0$, and by continuity the whole sequence $\{X^{(k)}\}$ converges to $A^{-1}$.

Remark 2.3
The optimal choice of step length $\alpha_k$, as it usually happens when combined with the negative gradient direction (see, e.g., [3,21]), produces an orthogonality between consecutive gradient directions, that in our setting becomes
$$\langle \nabla F(Z^{(k+1)}), \nabla F(X^{(k)})\rangle = 0.$$
This orthogonality is responsible for the well-known zig-zagging behavior of the optimal gradient method, which in some cases induces a very slow convergence.
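The orthogonality produced by an exact line search is easy to observe numerically. The sketch below performs one exact step along the negative gradient from the scaled identity and evaluates the inner product between the gradient at the new point (before rescaling) and the previous gradient. The helper `grad_F`, the shorthand `w, a, b, c`, and the test matrix are illustrative assumptions.

```python
import numpy as np

def grad_F(X, A):
    """Gradient of F(X) = 1 - cos(XA, I) with respect to X, for symmetric A."""
    XA = X @ A
    n = A.shape[0]
    nrm = np.linalg.norm(XA, "fro")
    return ((np.trace(XA) / nrm**2) * XA - np.eye(n)) @ A / (np.sqrt(n) * nrm)

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])  # hypothetical SPD matrix
n = A.shape[0]
X = (np.sqrt(n) / np.linalg.norm(A, "fro")) * np.eye(n)            # ||XA||_F = sqrt(n)

G = grad_F(X, A)
XA, GA = X @ A, G @ A
w, a, b, c = np.trace(XA), np.trace(GA), np.sum(XA * GA), np.sum(GA * GA)
alpha = (n * a - w * b) / (a * b - w * c)  # exact step along -G when ||XA||_F^2 = n

Z = X - alpha * G                          # point reached by the exact line search
ortho = np.sum(grad_F(Z, A) * G)           # <grad F(Z), grad F(X)>
print(alpha, ortho)                        # ortho should vanish up to rounding
```

Rescaling $Z$ by a positive factor only rescales its gradient, so the same orthogonality holds for the normalized iterate whenever the scaling factor is positive.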

A simplified search direction
To avoid the zig-zagging trajectory of the optimal gradient iterates, we now consider a different search direction:
$$D_k \equiv D(X^{(k)}) = -\frac{1}{n}\left(\frac{\langle X^{(k)}A, I\rangle}{n}\,X^{(k)}A - I\right), \qquad (7)$$
to move from $X^{(k)} \in S \cap T$ to the next iterate. Notice that $D_kA = -\nabla F(X^{(k)})$ and so $D_k$ can be viewed as a simplified version of the search direction used in the classical steepest descent method. Notice also that $D_k$ resembles the residual direction $(X^{(k)}A - I)$ used in the minimal residual iterative method (MinRes) for minimizing $\|I - XA\|_F$ in the least-squares sense; see, e.g., [9,22]. Nevertheless, the scaling factors in (7) differ from the scaling factors in the classical residual direction at $X^{(k)}$. For solving (3), we now present a variation of the CauchyCos Algorithm, which will be referred to as the MinCos Algorithm, which from a given initial guess $X^{(0)}$ produces a sequence of iterates using the search direction $D_k$, while remaining in the compact set $S \cap T$. This new algorithm consists of simply replacing $-\nabla F(X^{(k)})$ in the CauchyCos Algorithm by $D_k$.

Algorithm 2 : MinCos (simplified gradient approach on $F(X) = 1 - \cos(XA, I)$)
1: Given $X^{(0)} \in$ PSD
2: for $k = 0, 1, \cdots$ until a stopping criterion is satisfied, do
3: Set $w_k = \langle X^{(k)}A, I\rangle$
4: Set $D_k = -\frac{1}{n}\left(\frac{w_k}{n}X^{(k)}A - I\right)$
5: Set $\alpha_k = \left|\dfrac{n\,\langle D_kA, I\rangle - w_k\,\langle X^{(k)}A, D_kA\rangle}{w_k\,\langle D_kA, D_kA\rangle - \langle D_kA, I\rangle\,\langle X^{(k)}A, D_kA\rangle}\right|$
6: Set $Z^{(k+1)} = X^{(k)} + \alpha_k D_k$
7: Set $X^{(k+1)} = s\sqrt{n}\,Z^{(k+1)}/\|Z^{(k+1)}A\|_F$, where $s = 1$ if $\mathrm{trace}(Z^{(k+1)}A) > 0$, and $s = -1$ otherwise
8: end for

As before, we note that if we start from $X^{(0)} = (\sqrt{n}/\|A\|_F)\,I$ then by construction $\|X^{(k)}A\|_F = \sqrt{n}$ for all $k \ge 0$, and again by construction all the iterates remain in the PSD cone. Notice also that, at each iteration, we now need to compute the two matrix-matrix products $X^{(k)}A$ and $D_kA$, which for dense matrices require $n^3$ flops each. Each of the remaining calculations (inner products and Frobenius norms) is obtained with $n$ column-oriented inner products that require $n$ flops each. Summing up, in the dense case, the computational cost of each iteration of the MinCos Algorithm is $2n^3 + O(n^2)$ flops. In Section 2.5, we will discuss a sparse version of the MinCos Algorithm and its computational cost.
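The simplified iteration can be sketched in the dense case as follows. As with any such sketch, the details are assumptions drawn from the description above: the search direction is taken as $D_k = -\frac{1}{n}\big(\frac{w_k}{n}X^{(k)}A - I\big)$ so that $D_kA = -\nabla F(X^{(k)})$, the step is the absolute value of the closed-form optimal step, and the function name `min_cos` and the test matrix are hypothetical.

```python
import numpy as np

def min_cos(A, tol=1e-2, max_iter=200):
    """Sketch of the simplified-direction iteration on F(X) = 1 - cos(XA, I)."""
    n = A.shape[0]
    I = np.eye(n)
    X = (np.sqrt(n) / np.linalg.norm(A, "fro")) * I  # initial guess with ||XA||_F = sqrt(n)
    for k in range(max_iter):
        XA = X @ A                                   # first matrix-matrix product
        w = np.trace(XA)
        if 1.0 - w / n <= tol:                       # F(X) = 1 - w/n on the set S
            return X, k
        D = -((w / n) * XA - I) / n                  # simplified direction, D A = -grad F
        DA = D @ A                                   # second (and last) product
        a, b, c = np.trace(DA), np.sum(XA * DA), np.sum(DA * DA)
        alpha = abs((n * a - w * b) / (w * c - a * b))
        Z = X + alpha * D
        ZA = XA + alpha * DA
        s = 1.0 if np.trace(ZA) > 0 else -1.0
        X = s * np.sqrt(n) * Z / np.linalg.norm(ZA, "fro")
    return X, max_iter

# Hypothetical small SPD test matrix.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
X, iters = min_cos(A)
n = A.shape[0]
print(iters, 1.0 - np.trace(X @ A) / n)
print(np.linalg.eigvalsh((X + X.T) / 2))  # eigenvalues of the (symmetric) approximate inverse
```

Compared with the steepest descent variant, only two matrix-matrix products appear in the loop, which matches the $2n^3 + O(n^2)$ cost stated above.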

Convergence properties of the MinCos Algorithm
We start by noticing that, unless we are at the solution, the search direction D k is a descent direction.
Lemma 2.8 If $X \in S \cap T$ and $\nabla F(X) \ne 0$, the search direction $D(X)$ is a descent direction for the function $F$ at $X$.
Proof. We need to establish that, for a given $X \in S \cap T$, $\langle D(X), \nabla F(X)\rangle < 0$. Since $A^{-1}$ is symmetric and positive definite, then it has a unique square root which is also symmetric and positive definite. This particular square root will be denoted as $A^{-1/2}$. Therefore, since $D(X)A = -\nabla F(X)$, and using that $\mathrm{trace}(E_1E_2) = \mathrm{trace}(E_2E_1)$, for given square matrices $E_1$ and $E_2$, it follows that
$$\langle D(X), \nabla F(X)\rangle = -\langle \nabla F(X)A^{-1}, \nabla F(X)\rangle = -\mathrm{trace}\big(\nabla F(X)^T\,\nabla F(X)\,A^{-1/2}A^{-1/2}\big) = -\|\nabla F(X)A^{-1/2}\|_F^2 < 0.$$

Remark 2.4 The step length in the MinCos Algorithm is obtained using the search direction $D_k$ in Lemma 2.4. Notice that if we use $-D_k$ instead of $D_k$, the obtained $\alpha_k$ which also forces $\psi'(\alpha_k) = 0$ is the one given by Lemma 2.4 but with a negative sign. Therefore, as in the CauchyCos Algorithm, to guarantee that $\alpha_k > 0$ minimizes $F$ along the descent direction $D_k$ to approximate $A^{-1}$, instead of maximizing $F$ along the ascent direction $-D_k$ to approximate $-A^{-1}$, we choose the step length $\alpha_k$ as the absolute value of the expression in Lemma 2.4.
We now establish the commutativity of all iterates with the matrix A.
It is worth noticing that using Lemma 2.9 and (5), it follows by simple calculations that $Z^{(k)}$, $X^{(k)}$, and $X^{(k)}A$ in the MinCos Algorithm are symmetric matrices for all $k$. These three sequences generated by the MinCos Algorithm are also uniformly bounded away from zero, and so the algorithm is well-defined.
where $A^{1/2}$ is the unique square root of $A$ which is also symmetric and positive definite. Combining the previous equality with the Cauchy-Schwarz inequality, and using the consistency of the Frobenius norm, we obtain Since $X^{(k)} \in S$, then $\sqrt{n} = \|X^{(k)}A\|_F \le \|X^{(k)}A^{1/2}\|_F\,\|A^{1/2}\|_F$, which combined with (8) implies that is bounded away from zero for all $k$. Moreover, since $A$ is nonsingular then is also bounded away from zero for all $k$.

Proof. From Lemma 2.8 the search direction $D(X)$ is a descent direction for $F$ at $X$, unless $\nabla F(X) = 0$. Therefore, since $\alpha_k$ in the MinCos Algorithm is obtained as the exact minimizer of $F$ along the direction $D(X^{(k)})$ for all $k$, the proof is obtained by repeating the same arguments shown in the proof of Theorem 2.2, simply replacing $-\nabla F(Y)$ by $D(Y)$ for all possible instances $Y$.

Sparse versions
We now discuss how to dynamically impose sparsity in the sequence of iterates $\{X^{(k)}\}$ generated by either the CauchyCos Algorithm or the MinCos Algorithm, to reduce their required storage and computational cost. A possible way of accomplishing this task is to prescribe a sparsity pattern beforehand, which is usually related to the sparsity pattern of the original matrix $A$, and then impose it at every iteration; see, e.g., [6,13,18,19]. At this point, we would like to mention that although there exist some special applications for which the involved matrices are large and dense [11,15], frequently in real applications the involved matrices are large and sparse. However, in general the inverse of a sparse matrix is dense anyway. Moreover, with very few exceptions, it is not possible to know a priori the location of the large or the small entries of the inverse. Consequently, it is very difficult in general to prescribe a priori a sparsity pattern for the approximate inverse.
As a consequence, to force sparsity in our gradient-related algorithms, we instead apply a numerical dropping strategy to each column (or row) independently, using a threshold tolerance, combined with a fixed bound on the maximum number of nonzero elements to be kept in each column (or row) to limit the fill-in. This combined strategy will be fully described in our numerical results section.
In the CauchyCos and MinCos Algorithms, the dropping strategy must be applied to the matrix $Z^{(k+1)}$ right after it is obtained at Step 6, and before computing $X^{(k+1)}$ at Step 7. That way, $X^{(k+1)}$ will remain sparse at all iterations, and we guarantee that $X^{(k+1)} \in S \cap T$. The new Steps 7 and 8, in the sparse versions of both algorithms, are given by
7: Apply numerical dropping to $Z^{(k+1)}$ with a maximum number of nonzero entries
8: Set $X^{(k+1)} = s\sqrt{n}\,Z^{(k+1)}/\|Z^{(k+1)}A\|_F$, where $s = 1$ if $\mathrm{trace}(Z^{(k+1)}A) > 0$, and $s = -1$ otherwise
Notice that, since all the involved matrices are symmetric, the matrix-matrix products required in both algorithms can be performed using sparse-sparse mode column-oriented inner products; see, e.g., [9]. The remaining calculations (inner products and Frobenius norms), required to obtain the step length, must also be computed using sparse-sparse mode. Using this approach, which takes advantage of the imposed sparsity, the computational cost and the required storage of both algorithms are drastically reduced. Moreover, using the column-oriented approach both algorithms have a potential for parallelization.

Numerical Results
We present some numerical results to illustrate the properties of our gradient-type algorithms for obtaining inverse approximations. All computations are performed in MATLAB using double precision.
For a given matrix $A$, the merit function $\Phi(X) = \frac{1}{2}\|I - XA\|_F^2$ has been widely used for computing approximate inverse preconditioners; see, e.g., [2,7,8,9,10,12,13,19]. In that case, the properties of the Frobenius norm permit in a natural way the use of parallel computing. Moreover, the minimization of $\Phi(X)$ can also be accomplished imposing a column-wise numerical dropping strategy leading to a sparse approximation of $A^{-1}$. Therefore, when possible, it is natural to compare the CauchyCos and the MinCos Algorithms applied to the angle-related merit function $F(X)$ with the optimal Cauchy method applied to $\Phi(X)$ (referred to from now on as the CauchyFro method), and also to the Minimal Residual (MinRes) method applied to $\Phi(X)$; see, e.g., [2,9].
The gradient of $\Phi(X)$ is given by $\nabla\Phi(X) = -A^T(I - XA)$, and so the iterations of the CauchyFro method, from the same initial guess $X^{(0)} = (\sqrt{n}/\|A\|_F)\,I$ used by MinCos and CauchyCos, can be written as
$$X^{(k+1)} = X^{(k)} + \alpha_k G_k, \qquad (9)$$
where $G_k = -\nabla\Phi(X^{(k)})$ and the step length $\alpha_k > 0$ is obtained as the global minimizer of $\Phi(X^{(k)} + \alpha G_k)$ along the direction $G_k$, as follows
$$\alpha_k = \frac{\langle R_k, AG_k\rangle}{\langle AG_k, AG_k\rangle}, \qquad (10)$$
where $R_k = I - AX^{(k)}$ is the residual matrix at $X^{(k)}$. The iterations of the MinRes method can be obtained by replacing $G_k$ by the residual matrix $R_k$ in (9) and (10); see [9] for details. We need to remark that in the dense case, the CauchyFro method needs to compute two matrix-matrix products per iteration, whereas the MinRes method, by using the recursion $R_{k+1} = R_k - \alpha_k AR_k$, needs one matrix-matrix product per iteration.
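The two Frobenius-norm baselines can be sketched with a single routine. The version below is an illustration under stated assumptions: it works with the residual $R_k = I - AX^{(k)}$, takes $G_k = A^T R_k$ for the Cauchy-type variant and $G_k = R_k$ for the minimal residual variant, uses the step $\langle R_k, AG_k\rangle/\langle AG_k, AG_k\rangle$, and stops on $\Phi \le$ `tol` only (rather than on the combined criterion used in the experiments). The name `frobenius_descent` and the test matrix are hypothetical.

```python
import numpy as np

def frobenius_descent(A, use_residual_direction, tol=1e-2, max_iter=5000):
    """Exact line search on Phi(X) = 0.5 * ||I - AX||_F^2 along G = R or G = A^T R."""
    n = A.shape[0]
    X = (np.sqrt(n) / np.linalg.norm(A, "fro")) * np.eye(n)  # same initial guess as above
    R = np.eye(n) - A @ X                                    # residual matrix
    for k in range(max_iter):
        if 0.5 * np.sum(R * R) <= tol:
            return X, k
        G = R if use_residual_direction else A.T @ R         # MinRes-like or Cauchy-like
        AG = A @ G
        alpha = np.sum(R * AG) / np.sum(AG * AG)             # minimizes ||R - alpha * AG||_F
        X = X + alpha * G
        R = R - alpha * AG                                   # residual recursion
    return X, max_iter

# Hypothetical small SPD test matrix.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
X_fro, it_fro = frobenius_descent(A, use_residual_direction=False)
X_res, it_res = frobenius_descent(A, use_residual_direction=True)
phi = lambda X: 0.5 * np.linalg.norm(np.eye(3) - A @ X, "fro") ** 2
print(it_fro, phi(X_fro))
print(it_res, phi(X_res))
```

With the residual direction, `AG` is the only matrix-matrix product in the loop, in line with the cost remark above; the gradient direction needs the additional product `A.T @ R`.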
For our experiments we consider the following test matrices in the PSD cone:
• from the Matlab gallery: Poisson, Lehmer, Wathen, Moler, and minij. Notice that the Poisson matrix, referred to in Matlab as (Poisson, $N$), is the $N^2 \times N^2$ finite differences 2D discretization matrix of the negative Laplacian on $]0, 1[^2$ with homogeneous Dirichlet boundary conditions.
• Poisson 3D (that depends on the parameter $N$), is the $N^3 \times N^3$ finite differences 3D discretization matrix of the negative Laplacian on the unit cube with homogeneous Dirichlet boundary conditions.
In Table 1 we report the considered test matrices with their size, sparsity properties, and 2-norm condition number $\kappa(A)$. Notice that the Wathen matrices have random entries so we cannot report their spectral properties. Moreover, Wathen($N$) is a sparse $n \times n$ matrix with $n = 3N^2 + 4N + 1$. In general the inverses of the considered matrices are dense, except the inverse of the Lehmer matrix which is tridiagonal.

Approximation to the inverse with no dropping strategy
To add understanding to the properties of the new CauchyCos and MinCos Algorithms, we start by testing their behavior, as well as the behavior of CauchyFro and MinRes, without imposing sparsity. Since the goal is to compute an approximation to $A^{-1}$, it is not necessary to carry on the iterations up to a very small tolerance parameter $\epsilon$, and we choose $\epsilon = 0.01$ for our experiments. For all methods, we stop the iterations when $\min\{F(X^{(k)}), \Phi(X^{(k)})\} \le \epsilon$.

Table 2 shows the number of iterations required by the four considered algorithms when applied to some of the test matrices, and for different values of $n$. A blank entry in the table indicates that the corresponding method requires an excessive number of iterations as compared with the MinRes and MinCos Algorithms. We can observe that CauchyFro and CauchyCos are not competitive with MinRes and MinCos, except for very few cases and for very small dimensions. Among the Cauchy-type methods, CauchyCos requires fewer iterations than CauchyFro, and in several cases the difference is significant. The MinCos and MinRes Algorithms were able to reach the required tolerance using a reasonable number of iterations, except for the Lehmer($n$) and minij($n$) matrices for larger values of $n$, which are the most difficult ones in our list of test matrices. The MinCos Algorithm clearly outperforms the MinRes Algorithm, except for the Poisson 2D($n$) and Poisson 3D($n$) matrices, for which both methods require the same number of iterations. For the more difficult matrices and especially for larger values of $n$, MinCos reduces on average the number of iterations with respect to MinRes by a factor of 4.

In Figure 1 we show the (semilog) convergence history for the four considered methods and for both merit functions, $F(X)$ and $\Phi(X)$, when applied to the Wathen matrix for $n = 20$ and $\epsilon = 0.01$.
Once again, we can observe that CauchyFro and CauchyCos are not competitive with MinRes and MinCos, and that MinCos outperforms MinRes. Moreover, we observe in this case that the function $F(X)$ is a better merit function than $\Phi(X)$ in the sense that it indicates with fewer iterations that a given iterate is sufficiently close to the inverse matrix. The same good behavior of the merit function $F(X)$ has been observed in all our experiments.

Based on these preliminary results, we will only report the behavior of MinRes and MinCos for the forthcoming numerical experiments.

Sparse approximation to the inverse
We now build sparse approximations by applying the dropping strategy, described in Section 2.5, which is based on a threshold tolerance with a limited fill-in (lfil) on the matrix $Z^{(k+1)}$, at each iteration, right before the scaling step to guarantee that the iterate $X^{(k+1)} \in S \cap T$. We define thr as a relative threshold, expressed as a fraction of the maximum modulus of the coefficients in a column. To be precise, for each $i$-th column we select at most lfil off-diagonal coefficients among the ones that are larger in magnitude than thr $\times\, \|(Z^{(k+1)})_i\|_\infty$, where $(Z^{(k+1)})_i$ represents the $i$-th column of $Z^{(k+1)}$.
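The column-wise dropping rule can be sketched as follows. The text does not say which candidates are kept when more than lfil of them pass the threshold, nor how the diagonal is treated; the sketch assumes that the largest ones in magnitude are kept and that the diagonal entry is always retained. It works on a dense array for clarity, whereas an actual implementation would operate on a sparse column format. The function name `drop_columns` and the sample matrix are hypothetical.

```python
import numpy as np

def drop_columns(Z, thr, lfil):
    """Keep, per column, the diagonal plus at most lfil off-diagonal entries above thr * max|column|."""
    n = Z.shape[0]
    out = np.zeros_like(Z)
    for i in range(n):
        col = Z[:, i]
        out[i, i] = col[i]                              # assumption: diagonal always kept
        cutoff = thr * np.max(np.abs(col))              # thr times the infinity norm of the column
        candidates = [j for j in range(n) if j != i and abs(col[j]) > cutoff]
        candidates.sort(key=lambda j: -abs(col[j]))     # assumption: prefer the largest entries
        for j in candidates[:lfil]:
            out[j, i] = col[j]
    return out

# Hypothetical sample matrix.
Z = np.array([[4.0, 0.1, 2.0],
              [0.5, 3.0, 0.01],
              [1.0, 2.0, 5.0]])
print(drop_columns(Z, thr=0.05, lfil=1))
```

Note that dropping each column independently does not by itself preserve symmetry of the pattern, which is one reason the dropped matrix is rescaled afterwards rather than assumed to stay in $S$.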
We begin by comparing MinRes and MinCos when we apply the numerical dropping strategy. For both algorithms we use the column-oriented sparse calculations described in Section 2.5; see also [9].

Table 3 shows the performance of MinRes and MinCos when applied to the matrices nos1, nos2, nos5, and nos6, for $\epsilon = 0.01$, thr $= 0.01$, and several values of lfil. We report the iteration $k$ (Iter) at which the method was stopped, the interval $[\lambda_{\min}, \lambda_{\max}]$ of $(X^{(k)}A)$, the quotient $\kappa(X^{(k)}A)/\kappa(A)$, and the percentage of fill-in (% fill-in) at the final matrix $X^{(k)}$. We observe that, when imposing the dropping strategy to obtain sparsity, MinRes fails to produce an acceptable preconditioner. Indeed, as has already been observed (see [2,9]), quite frequently MinRes produces an indefinite approximation to the inverse of a sparse matrix in the PSD cone. We also observe that, in all cases, the MinCos method produces a sparse symmetric and positive definite preconditioner with relatively few iterations and a low level of fill-in. Moreover, with the exception of the matrix nos6, the MinCos method produces a preconditioned matrix $(X^{(k)}A)$ whose condition number is reduced by a factor of approximately 10 with respect to the condition number of $A$. In some cases, MinRes was capable of producing a sparse symmetric and positive definite preconditioner, but in those cases MinCos produced a better preconditioner in the sense that it exhibits a better reduction of the condition number, and also a better distribution of the eigenvalues. Based on these results, for the remaining experiments we only report the behavior of the MinCos Algorithm.

Table 4 shows the performance of the MinCos Algorithm when applied to the Wathen matrix for different values of $n$ and a maximum of 20 iterations. For this numerical experiment we fix $\epsilon = 0.01$, thr $= 0.04$, and lfil $= 20$.
For the particular case of the Wathen matrix when $n = 50$, we show in Figure 2 the (semilog) convergence history of the norm of the residual when solving a linear system with a random right hand side vector, using the Conjugate Gradient (CG) method without preconditioning, and also using the preconditioner generated by the MinCos Algorithm after 20 iterations, fixing $\epsilon = 0.01$, thr $= 0.04$, and lfil $= 20$. We also report in Figure 3 the distribution of the eigenvalues of $A$ and of $X^{(k)}A$, at $k = 20$, for the same experiment with the Wathen matrix and $n = 50$. Notice that the eigenvalues of $A$ are distributed in the interval $[0, 350]$, whereas the eigenvalues of $X^{(k)}A$ are located in the interval $[0.03, 1.4]$ (see Table 4). Even better, we can observe that most of the eigenvalues are in the interval $[0.3, 1.4]$, and very few of them are in the interval $[0.03, 0.3]$, which clearly accounts for the good behavior of the preconditioned CG method (see Figure 2).

Tables 5, 6, and 7 show the performance of the MinCos Algorithm when applied to the Poisson 2D, the Poisson 3D, and the Lehmer matrices, respectively, for different values of $n$, and different values of the maximum number of iterations, $\epsilon$, thr, and lfil. We can observe that, for the Poisson 2D and 3D matrices, the MinCos Algorithm produces a sparse symmetric and positive definite preconditioner with very few iterations, a low level of fill-in, and a significant reduction of the condition number.

For the Lehmer matrix, which is one of the most difficult considered matrices, we observe in Table 7 that the MinCos Algorithm produces a symmetric and positive definite preconditioner with a significant reduction of the condition number, but after 40 iterations and fixing lfil $= 100$, for which the preconditioner accepts a high level of fill-in. If we impose a low level of fill-in, by reducing the value of lfil, MinCos still produces a symmetric and positive definite matrix, but the reduction of the condition number is not significant.

Table 7: Performance of MinCos applied to the Lehmer matrix, for different values of $n$ and a maximum of 40 iterations, when $\epsilon = 0.01$, thr $= 0.06$, and lfil $= 100$.
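The way an approximate inverse enters the Conjugate Gradient method can be sketched as follows: the preconditioning step is simply a multiplication $z = Xr$. This is a generic textbook preconditioned CG, not the authors' code. The function name `pcg` is hypothetical, the 1D Poisson matrix is an illustrative test problem, and the exact inverse is used only as a stand-in for a computed approximate inverse.

```python
import numpy as np

def pcg(A, b, X=None, tol=1e-8, max_iter=1000):
    """Conjugate gradients for Ax = b; if X is given it is applied as an approximate inverse."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = r if X is None else X @ r            # preconditioning step: z = X r
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        z = r if X is None else X @ r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Illustrative SPD problem: 1D Poisson matrix with a random right-hand side.
n = 30
A = 2.0 * np.eye(n) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
b = np.random.default_rng(1).standard_normal(n)

x_plain, it_plain = pcg(A, b)                 # no preconditioning
X = np.linalg.inv(A)                          # stand-in for an approximate inverse
x_prec, it_prec = pcg(A, b, X=X)
print(it_plain, it_prec)
```

For CG to remain well-defined, the matrix applied in the preconditioning step has to be symmetric and positive definite, which is precisely the property emphasized for the preconditioners discussed above.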

Final remarks
We have introduced and analyzed two gradient-type optimization schemes to build sparse inverse preconditioners for symmetric positive definite matrices. For that we have proposed the novel objective function $F(X) = 1 - \cos(XA, I)$, which is invariant under positive scaling and has some special properties that are clearly related to the geometry of the PSD cone. One of the new schemes, the CauchyCos Algorithm, is closely related to the classical steepest descent method, and as a consequence it shows in most cases a very slow convergence. The second new scheme, denoted as the MinCos Algorithm, shows a much faster performance and competes favorably with well-known methods. Based on our numerical results, by properly choosing the numerical dropping parameters, the MinCos Algorithm produces a sparse inverse preconditioner in the PSD cone for which a significant reduction of the condition number is observed, while keeping a low level of fill-in.