In this section, we show how NA techniques can be used to improve the CPU time of the LLPF computation introduced in Section 4. For this purpose, all solution methods are re-implemented in Matlab. We consider both the LLPF problem with complex numbers, Equation (9), and the one without imaginary parts, Equation (13). For the numerical experiments, all computations are performed on an Intel Core i5-6500 3.2 GHz CPU with four cores and 64 GB of memory.
5.1. LLPF Problem With Real Components
Let us consider the LLPF problem with real components, Equation (13). Because of the large dimension of the sparse coefficient matrix, it is very costly to compute its inverse explicitly. Therefore, we study the properties of the matrix and seek the fastest way to solve Equation (13).
By analyzing the matrix, we observe that it is sparse, Symmetric, and Positive Definite (SPD). Due to its SPD property, we can apply NA techniques developed for this type of matrix, such as the Cholesky decomposition, the Incomplete Cholesky (IC) decomposition, and the Conjugate Gradient (CG) iterative method. In addition, reordering techniques such as the Reverse Cuthill-McKee (RCM) and Approximate Minimum Degree (AMD) permutations can further improve its structure.
Figure 6 shows the sparsity structure of the matrix before and after RCM reordering. From the figure, it is clear that RCM reordering improves the sparsity structure of the matrix, i.e., it reduces the bandwidth.
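In Matlab, this reordering step can be sketched as follows; since the actual LLPF matrix of Equation (13) is not reproduced here, a sparse SPD test matrix from gallery is used as a stand-in, and the variable names are illustrative.

    Areal = gallery('poisson', 100);      % stand-in sparse SPD matrix (10,000 x 10,000)
    p = symrcm(Areal);                    % Reverse Cuthill-McKee permutation
    Arcm = Areal(p, p);                   % symmetrically reordered matrix
    q = amd(Areal);                       % Approximate Minimum Degree permutation
    Aamd = Areal(q, q);
    % Plots of the type shown in Figure 6: original versus RCM-reordered pattern.
    subplot(1, 2, 1); spy(Areal); title('Original ordering');
    subplot(1, 2, 2); spy(Arcm);  title('RCM reordering');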
As a direct solver, the Cholesky decomposition combined with RCM reordering can solve the linear system in Equation (13) quickly. Since the matrix is SPD, the most suitable iterative method for Equation (13) is CG. Furthermore, the convergence rate of CG depends on the eigenvalues of the matrix.
Table 2 shows the largest and smallest eigenvalues in magnitude and the condition number of the original matrix and of the preconditioned matrix.
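The quantities reported in Table 2 can be estimated with standard Matlab routines, for example as in the following sketch (again with a stand-in matrix; L denotes an Incomplete Cholesky factor used as a preconditioner):

    Areal = gallery('poisson', 100);              % stand-in SPD matrix
    lmax  = eigs(Areal, 1, 'lm');                 % largest-magnitude eigenvalue
    lmin  = eigs(Areal, 1, 'sm');                 % smallest-magnitude eigenvalue
    kappa = condest(Areal);                       % condition number estimate (1-norm)
    % The same quantities for the preconditioned matrix inv(L)*Areal*inv(L'),
    % applied as an operator so that the preconditioned matrix is never formed.
    L     = ichol(Areal);
    Afun  = @(x) L \ (Areal * (L' \ x));
    lmaxP = eigs(Afun, size(Areal, 1), 1, 'lm');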
From the first row of Table 2, we see that the condition number of the matrix is very large, which means that the matrix is ill-conditioned. Therefore, using CG without any preconditioner on the linear system in Equation (13) cannot improve the computational time, since CG then requires many iterations. Thus, the Preconditioned Conjugate Gradient (PCG) method is a more appropriate choice than CG. In PCG, we solve a transformed system in which both sides of Equation (13) are premultiplied by the inverse of a matrix M, where M is called a preconditioner and is an SPD matrix. The eigenvalues of the preconditioned matrix should be clustered around one, resulting in faster convergence of PCG. Generally, M is obtained as M = LL^T, where L is a lower triangular matrix. We can compute L by applying the Cholesky or the Incomplete Cholesky decomposition to the original or to the reordered matrix. The eigenvalues of the preconditioned matrix can thus be improved by choosing a suitable preconditioner M.
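In Matlab, such a preconditioned solve can be sketched as follows. The matrix, right-hand side, and tolerance below are placeholders (the actual LLPF matrix of Equation (13) and the paper's tolerance setting are not reproduced here); pcg applies the preconditioner M = L*L' through its last two arguments, which corresponds to solving the transformed system implicitly.

    Areal = gallery('poisson', 100);      % stand-in SPD matrix for Equation (13)
    b     = ones(size(Areal, 1), 1);      % stand-in right-hand side
    p     = symrcm(Areal);                % RCM reordering
    Ar    = Areal(p, p);
    br    = b(p);
    L     = ichol(Ar);                    % zero-fill Incomplete Cholesky factor of the reordered matrix
    tol   = 1e-8;                         % illustrative relative tolerance
    maxit = 100;                          % maximum number of PCG iterations, as in the experiments
    [xr, flag, relres, iter] = pcg(Ar, br, tol, maxit, L, L');
    x = zeros(size(b));
    x(p) = xr;                            % undo the RCM permutation to recover the solution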
In the second row of Table 2, the Cholesky decomposition is used for L, which makes all eigenvalues of the preconditioned matrix equal to one. Therefore, PCG with the Cholesky preconditioner is expected to converge after one iteration for Equation (13). However, computing the full Cholesky decomposition for L is expensive, and the resulting solution time can be larger than that of a direct method. In order to decrease the cost of constructing the lower triangular matrix L, we can use the Incomplete Cholesky decomposition instead of the full Cholesky decomposition.
In rows 3–6 of Table 2, we see how the eigenvalues and the condition number of the preconditioned matrix improve as the drop tolerance of IC is decreased. Moreover, we can conclude that a preconditioner M built from IC(0), or from IC with a sufficiently small drop tolerance for L, is a good preconditioner for the matrix in terms of the computational time and the number of PCG iterations.
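The IC variants in rows 3–6 of Table 2 correspond to different ichol options in Matlab. A sketch of how the fill-in and the PCG iteration count change with the drop tolerance is given below (stand-in matrix, illustrative tolerance values):

    Areal = gallery('poisson', 100);                              % stand-in SPD matrix
    b     = ones(size(Areal, 1), 1);
    L0 = ichol(Areal);                                            % IC(0): keeps the sparsity pattern of Areal
    Lt = ichol(Areal, struct('type', 'ict', 'droptol', 1e-3));    % IC with a drop tolerance
    [~, ~, ~, it0] = pcg(Areal, b, 1e-8, 100, L0, L0');
    [~, ~, ~, itt] = pcg(Areal, b, 1e-8, 100, Lt, Lt');
    fprintf('IC(0):    nnz(L) = %d, PCG iterations = %d\n', nnz(L0), it0);
    fprintf('IC(1e-3): nnz(L) = %d, PCG iterations = %d\n', nnz(Lt), itt);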
Table 3 shows a comparison between various linear solvers applied to Equation (13) in terms of the CPU time, the number of iterations, and the number of non-zeros (NNZ). All results are averaged over 10 runs. For PCG, the maximum number of iterations is set to 100 and a fixed relative tolerance is used. The first and second rows of Table 3 show the results of the direct solver using Matlab's backslash (\) operator (R2015a, MathWorks, Natick, MA, USA) without any additional techniques. It is worth mentioning that the CPU time in the first row is twice that in the second row due to the position of the minus sign in Equation (7). If the minus sign is written on the left-hand side of Equation (7), the coefficient matrix is not positive definite, which results in a large computational time. Therefore, it is better to move the minus sign to the right-hand side of Equation (7) and keep it inside the vector b.
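This effect can be reproduced in Matlab: with the minus sign on the matrix side, the backslash operator sees a negative definite matrix and cannot use its sparse Cholesky path, whereas keeping the sign in the right-hand side preserves positive definiteness. A sketch with a stand-in matrix:

    Areal = gallery('poisson', 100);      % stand-in SPD matrix
    b     = ones(size(Areal, 1), 1);
    % Minus sign on the matrix side: -Areal is negative definite, so backslash
    % falls back to a slower symmetric indefinite / LU factorization.
    tic; x1 = (-Areal) \ b;  t1 = toc;
    % Minus sign moved into the right-hand side: Areal stays SPD, so backslash
    % can use a sparse Cholesky factorization.
    tic; x2 = Areal \ (-b);  t2 = toc;
    fprintf('minus on matrix: %.3f s, minus in rhs: %.3f s\n', t1, t2);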
Among the direct solvers, the Cholesky decomposition with RCM reordering results in the fastest computational time for Equation (13), as can be seen from Table 3. Furthermore, as expected, IC(0) with RCM reordering is the best preconditioner for the matrix: PCG then converges in only one iteration in 4.96 s. However, when IC(0) is used as the preconditioner, the relative difference between the direct and iterative solutions is high compared to the other options. Therefore, we also solve Equation (13) with various tolerances for PCG and drop tolerances for IC. The numerical results are given in Table 4.
From Table 4, we see that the relative difference between the direct and iterative solutions can be reduced by decreasing the drop tolerance of IC for the preconditioner M, while PCG still converges after one iteration. Additionally, applying IC gives a smaller NNZ compared to the full Cholesky decomposition and the direct solvers.
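The accuracy check behind Table 4 can be sketched as follows: the PCG solution obtained with an IC preconditioner is compared against the direct (backslash) solution for several drop tolerances (stand-in matrix, illustrative values):

    Areal = gallery('poisson', 100);      % stand-in SPD matrix
    b     = ones(size(Areal, 1), 1);
    xd    = Areal \ b;                    % reference direct solution
    for droptol = [1e-2, 1e-3, 1e-4]      % illustrative drop tolerances
        L  = ichol(Areal, struct('type', 'ict', 'droptol', droptol));
        xi = pcg(Areal, b, 1e-8, 100, L, L');
        fprintf('droptol = %.0e: relative difference = %.2e\n', ...
                droptol, norm(xi - xd) / norm(xd));
    end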
Finally, using NA techniques, the original computation time (14.32 s) of the LLPF problem with real components, Equation (13), is improved by a factor of 2.8 (to 4.96 s).
5.2. LLPF Problem With Complex Components
In this subsection, we consider the LLPF problem with complex components, Equations (9) and (15). For simplicity, let us denote the matrix in Equation (15) by A. Unlike the matrix of the real-component problem, neither the matrix in Equation (9) nor A is positive definite. Moreover, the matrix in Equation (9) is symmetric, whereas A is non-symmetric. Therefore, the Cholesky decomposition and CG are not suitable for these types of matrices. Instead, the LU decomposition, the Generalized Minimal RESidual (GMRES) method, and the Bi-Conjugate Gradient Stabilized (BiCGSTAB) method are more convenient for the matrix in Equation (9) and for A. For the iterative solvers GMRES and BiCGSTAB, the maximum number of iterations is set to 20 and a fixed relative tolerance is used.
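In Matlab, the corresponding calls look roughly as follows. A random sparse non-symmetric matrix is used as a stand-in for the actual LLPF matrices, and the drop tolerance and relative tolerance values are illustrative:

    n = 5000;
    A = sprand(n, n, 1e-3) + 10 * speye(n);              % stand-in sparse non-symmetric matrix
    b = ones(n, 1);
    setup  = struct('type', 'crout', 'droptol', 1e-3);   % ILU with a drop tolerance
    [L, U] = ilu(A, setup);                               % ilu(A) alone gives ILU(0)
    tol = 1e-8; maxit = 20;                   % maximum of 20 iterations, as in the experiments
    [xg, fg, rg, itg] = gmres(A, b, [], tol, maxit, L, U);   % GMRES without restart
    [xb, fb, rb, itb] = bicgstab(A, b, tol, maxit, L, U);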
Table 5 shows a comparison between various NA techniques applied to the LLPF problem in Equation (9) in terms of the CPU time, the number of iterations, and the relative difference between the direct and iterative solutions. At Alliander DNO, Equation (15) is used to solve the LLPF problem because the R implementation used there does not support complex numbers. Furthermore, from the first and second rows of Table 5, we can see that solving the LLPF problem with complex components via Equation (9) is almost 2.5 times faster than via Equation (15) when Matlab's backslash (\) operator is used without any additional techniques. Therefore, we use Equation (9) for further experiments.
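The comparison in the first two rows of Table 5 can be sketched as follows, assuming that the real-valued formulation in Equation (15) is the standard 2x2 block real-equivalent of the complex system (this block form and the stand-in matrix are assumptions, since Equations (9) and (15) are not reproduced here):

    n  = 5000;
    Ac = sprand(n, n, 1e-3) + 1i * sprand(n, n, 1e-3) + 10 * speye(n);   % stand-in complex matrix
    bc = ones(n, 1) + 1i * ones(n, 1);
    % Complex formulation, cf. Equation (9): solve Ac * x = bc directly.
    tic; xc = Ac \ bc;  t_complex = toc;
    % Real-equivalent formulation, cf. Equation (15), assumed block form:
    %   [ Re(Ac) -Im(Ac) ] [ Re(x) ]   [ Re(bc) ]
    %   [ Im(Ac)  Re(Ac) ] [ Im(x) ] = [ Im(bc) ]
    Ar = [real(Ac), -imag(Ac); imag(Ac), real(Ac)];
    br = [real(bc); imag(bc)];
    tic; xr = Ar \ br;  t_real = toc;
    fprintf('complex form: %.3f s, real-equivalent form: %.3f s\n', t_complex, t_real);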
The same RCM reordering is applied to the matrix in Equation (9) in order to improve its structure. The best computational time (7.41 s) is achieved by the direct solver, i.e., the LU decomposition applied to the RCM-reordered matrix, as can be seen from Table 5. Among the iterative methods, the best computational time with the smallest relative difference is obtained by BiCGSTAB with an ILU preconditioner and RCM reordering. However, the best CPU time of the iterative methods is still larger than that of the direct solver, because ILU, GMRES, and BiCGSTAB are not implemented optimally in Matlab. Furthermore, the LU and ILU decompositions yield a relatively similar NNZ for the LLPF problem with complex components.
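In Matlab terms, the best-performing direct approach corresponds to something like the following sketch (stand-in complex matrix; the reordering is applied to the sparsity pattern):

    n  = 5000;
    Ac = sprand(n, n, 1e-3) + 1i * sprand(n, n, 1e-3) + 10 * speye(n);   % stand-in complex matrix
    bc = ones(n, 1);
    p  = symrcm(abs(Ac));              % RCM reordering of the sparsity pattern
    Ar = Ac(p, p);
    br = bc(p);
    [L, U, P, Q] = lu(Ar);             % sparse LU with row and column permutations
    y  = Q * (U \ (L \ (P * br)));     % solve the reordered system
    x  = zeros(n, 1);
    x(p) = y;                          % undo the RCM permutation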
As a result of applying NA techniques, the original computation time (42.6 s) of the LLPF problem with complex components, Equation (9), is improved by a factor of 5.7 (to 7.41 s).