Deflated Restarting of Exponential Integrator Method with an Implicit Regularization for Efficient Transient Circuit Simulation

Abstract: Exponential integrator (EI) method based on Krylov subspace approximation is a promising method for large-scale transient circuit simulation. However, it suffers from the singularity problem and consumes large subspace dimensions for stiff circuits when using the ordinary Krylov subspace. Restarting schemes are commonly applied to reduce the subspace dimension, but they also slow down the convergence and degrade the overall computational efficiency. In this paper, we first devise an implicit and sparsity-preserving regularization technique to tackle the singularity problem facing EI in the ordinary Krylov subspace framework. Next, we analyze the root cause of the slow convergence of the ordinary Krylov subspace methods when applied to stiff circuits. Based on the analysis, we propose a deflated restarting scheme, compatible with the above regularization technique, to accelerate the convergence of restarted Krylov subspace approximation for EI methods. Numerical experiments demonstrate the effectiveness of the proposed regularization technique, and up to 50% convergence improvements for Krylov subspace approximation compared to the non-deflated version.


Introduction
High-performance and full-accuracy transient circuit simulation has always been one of the major demands in the modern IC design industry, and its importance keeps growing with the fast-increasing design complexity of advanced technology nodes. Besides IC design, SPICE-type simulators have also been widely used in other scientific domains, such as electric/thermal analysis [1], electric/magnetic analysis [2], and advection-diffusion analysis [3], where the systems of interest can be described by differential algebraic equations (DAEs).
The essence of transient circuit simulation is to numerically solve a system of DAEs, normally derived from modified nodal analysis (MNA), by an explicit or implicit integration method. As a category of implicit methods, the backward differentiation formula (BDF) is widely adopted by existing SPICE simulators, as it offers larger time steps and better stability than explicit methods. It is also better suited to handling stiff DAEs whose time constants differ by several orders of magnitude.
However, BDF faces growing challenges in scalability, parallelizability, and adaptivity as problem sizes reach million or even billion scale. The accuracy order of BDF is typically no higher than 2, which limits the step sizes and demands a large number of time steps. Moreover, each time step generally requires a sparse LU decomposition, which is known to be poorly scalable and parallelizable. Although considerable efforts have been dedicated to accelerating BDF-based transient simulation, integration methods different from BDF may be needed to meet the simulation capacity demands of the scientific and industrial communities.
Exponential integrator (EI) method is one of the emerging simulation methodologies to solve ordinary differential equations (ODEs) in a semi-analytical way, where the time span is discretized but the equation is solved analytically in each time step by expressing the solution in a matrix exponential form. For linear circuits, EI is in principle immune to the local truncation error (LTE) of the polynomial expansion approximation in BDF [1,2,4-7]. The error only exists in the Krylov subspace approximation of the matrix exponential vector product; therefore, the accuracy order can be much higher than 2 without compromising stability [7]. Besides, only sparse matrix-vector multiplications are needed in the EI method if the ordinary Krylov subspace is applied, which are highly scalable and parallelizable [8-11]. These advantages make EI a promising alternative to BDF for large-scale transient circuit simulation.
However, Krylov-subspace-based EI also has its own difficulties. One important issue is the singularity problem caused by the singular dynamic matrix of the DAE, which prohibits a straightforward conversion of the circuit DAE to the ODE required by EI. The singularity is mainly due to the algebraic constraints that do not involve time derivatives in the circuit, which leave empty rows in the dynamic matrix. To solve the singularity problem, Chen has proposed a two-step DAE-ODE transformation for circuit simulation [5]. First, a topology-based approach is applied to reduce the DAE index of the circuit from 2 to 1, which can be skipped if the circuit is already index-1. Second, row echelon reduction is used to swap particular rows between the dynamic matrix and static matrix to eliminate the singularity. The problem is that row echelon reduction is computationally expensive and tends to degrade the sparsity of the resulting matrices. Reference [4] proposed an implicit and sparsity-preserving regularization technique for EI, but it applies to the rational Krylov subspace only.
Another bottleneck lies in the large Krylov subspace needed for EI to handle stiff circuits, if the ordinary Krylov subspace is adopted. To maintain sufficient accuracy, the Krylov subspace dimension easily reaches several hundred even for intermediate stiffness.
Storing a number of dense basis vectors in memory and performing full orthogonalization w.r.t. these vectors constitute substantial computational and memory challenges. One solution to this challenge is to use other types of Krylov subspace, such as the rational [8] or the extended Krylov subspace [12], which typically demand a much smaller subspace dimension. The cost, however, is extra LU factorizations that to some extent reduce the benefits of using EI. Another route, as motivated by the Krylov-subspace iterative solvers for linear systems, is to resort to restarting. Weng proposed a restarted scheme for EI [6], in which the Krylov subspace basis generation is restarted every m steps; thus, the memory footprint and orthogonalization cost are limited to an m-dimensional subspace. The drawback of such simple restarting is that the convergence is slowed down. The total subspace dimension of restarted EI is higher than that of the non-restarted version, resulting in degraded performance.
In this paper, we aim to address the above two issues limiting the performance of EI for large stiff circuit simulation. We restrict our focus to linear circuit simulation in this work, though extension to nonlinear circuits is possible. Specifically, our main contributions are two-fold:

1.
We devise an implicit and algebraic regularization for EI based on the ordinary Krylov subspace, which works directly on the original sparse matrices in an implicit manner without row echelon reduction. This way, the sparsity of the system matrices is preserved and the computational efficiency is improved.

2.
We propose a deflated restarting scheme for EI with the ordinary Krylov subspace.
The scheme extracts some useful information from the previous round of Arnoldi process and incorporates such information into the new round of process, so as to accelerate the convergence of the Krylov subspace approximation. In other words, the proposed method "deflates" a subspace from the search space, so that the new search space is narrowed down and the convergence becomes faster compared to the non-deflated version. Some preliminary results have been reported in Reference [13], but an in-depth analysis of the root cause of the slow convergence and the optimal selection of the eigenvalues to be deflated are still missing.
With the two techniques, we aim to make EI comparable in robustness and superior in performance compared to BDF, rendering EI a practical alternative for future large-scale transient circuit simulation.
The paper is organized as follows. Section 2 introduces the EI formulation for transient circuit simulation. Section 3 presents the proposed regularization to tackle the singularity problem. Section 4 reveals the causes of the convergence problem of EI methods and proposes a deflated restarting scheme which is compatible with the above regularization to accelerate the convergence. Section 5 shows the numerical results, and Section 6 concludes the paper.

Background
In transient circuit simulation, a linear circuit is described by the following DAE:

C x'(t) + G x(t) = B u(t), (1)

where C and G represent the capacitance/inductance matrix and the resistance/conductance matrix, respectively, and the matrix B indicates the locations of independent sources. u(t) denotes the input source term, and x(t) comprises the unknown node voltages and branch currents at time t. Figure 1 shows an example of (1). If we assume C is non-singular, the DAE (1) can be directly transformed into an ODE:

x'(t) = A x(t) + b(t), (2)

where A = -C^{-1} G and b(t) = C^{-1} B u(t). Then, the analytical solution for x_{n+1} of the ODE (2) can be expressed with the matrix exponential:

x_{n+1} = e^{Ah} x_n + \int_0^h e^{A(h-\tau)} b(t_n + \tau) d\tau, (3)

where h is the time step size. The integral term in (3) can be computed analytically by applying the second-order piece-wise-linear (PWL) approximation of b(t) [14], which turns the solution into the sum of three matrix exponential functions:

x_{n+1} = e^{Ah} x_n + h \phi_1(Ah) \omega_1 + h^2 \phi_2(Ah) \omega_2, (4)

where \omega_1 = b(t_n), and \omega_2 = (b(t_{n+1}) - b(t_n))/h approximates the slope of the input waveform.
The analytical solution (4) involves three matrix exponential functions, which are generally referred to as the \phi functions of order zero, one, and two.
Al-Mohy [15] has shown that a series of \phi functions can be evaluated by computing the exponential of an (n + p) x (n + p) matrix, where n is the dimension of A and p is the highest \phi-function order, which avoids explicitly calculating each \phi function. Therefore, we can merge the sum of the three \phi-function terms in (4) into the exponential of an augmented matrix:

x_{n+1} = [I_n 0] e^{\tilde{A} h} \tilde{v}, with \tilde{A} = [A, \omega_2, \omega_1; 0, 0, 1; 0, 0, 0] and \tilde{v} = [x_n; 0; 1], (5)

which reduces (4) to a single matrix exponential vector product (MEVP).
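The augmentation trick can be sketched in a few lines of NumPy/SciPy. The sketch below is illustrative only (function name is ours, h = 1 for brevity; for a step h one passes Ah and h-scaled inputs), showing that one dense exponential of the augmented matrix reproduces e^A v + phi_1(A) w1 + phi_2(A) w2:

```python
import numpy as np
from scipy.linalg import expm

def phi_sum_via_augmentation(A, w1, w2, v):
    """Evaluate e^A v + phi_1(A) w1 + phi_2(A) w2 with a single exponential
    of an (n+2)x(n+2) augmented matrix, in the style of Al-Mohy [15]."""
    n = A.shape[0]
    Atil = np.zeros((n + 2, n + 2))
    Atil[:n, :n] = A        # top-left block: A
    Atil[:n, n] = w2        # columns of W = [w2, w1]
    Atil[:n, n + 1] = w1
    Atil[n, n + 1] = 1.0    # 2x2 nilpotent shift block J
    vtil = np.concatenate([v, [0.0, 1.0]])   # augmented vector [v; e_2]
    return (expm(Atil) @ vtil)[:n]
```

One can verify the result against the explicit formulas phi_1(A) = A^{-1}(e^A - I) and phi_2(A) = A^{-2}(e^A - I - A) for a small non-singular A.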
Due to the large matrix size (on the order of millions), direct computation of the matrix exponential is prohibitive. However, the Krylov subspace approximation can be applied to reduce the problem to the evaluation of a much smaller matrix exponential, by projecting the matrix exponential vector product (MEVP) in (5) onto a smaller Krylov subspace K_m(A, v) = span(v, Av, A^2 v, ..., A^{m-1} v) with m << n [16]. An orthogonal basis V_m of K_m(A, v) is computed by the Arnoldi process (Algorithm 1), which produces the Krylov decomposition

A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T, (6)

where H_m is the m x m upper Hessenberg matrix and e_m is the m-th column of the identity matrix I_m. Then, projecting A onto the Krylov subspace via (6) yields an approximation of e^{Ah} v:

e^{Ah} v ~= beta V_m e^{H_m h} e_1, beta = ||v||_2, (7)

where e^{H_m h} can be conveniently evaluated by a dense matrix method. Saad [16] proposed an a posteriori residual estimate to evaluate the quality of the approximation (7).

Algorithm 1: Arnoldi Process
Input: vector v in C^{n x 1}, matrix A in C^{n x n}, and dimension m
Output: basis V_{m+1} = [v_1, ..., v_{m+1}] and upper Hessenberg matrix H_m
v_1 = v / ||v||_2;
for j = 1, ..., m do
  w = A v_j;
  for i = 1, ..., j do
    h_{i,j} = v_i^H w; w = w - h_{i,j} v_i;
  end
  h_{j+1,j} = ||w||_2; v_{j+1} = w / h_{j+1,j};
end
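As a minimal dense sketch (function names are ours; a production version would use sparse matrix-vector products and the residual estimate of [16]), the Arnoldi process and the projection (7) can be written as:

```python
import numpy as np
from scipy.linalg import expm

def arnoldi(A, v, m):
    """m-step Arnoldi process: A @ V[:, :m] = V[:, :m] @ H[:m, :] + H[m, m-1] * V[:, m] e_m^T."""
    n = v.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:           # happy breakdown
            break
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

def expmv_krylov(A, v, h, m):
    """Approximate e^{Ah} v by beta * V_m @ expm(h H_m) e_1, as in (7)."""
    beta = np.linalg.norm(v)
    V, H = arnoldi(A, v, m)
    return beta * V[:, :m] @ expm(h * H[:m, :])[:, 0]
```

For m equal to the full dimension n, the approximation reproduces e^{Ah} v exactly, which makes a convenient sanity check.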

Implicit Regularization
Note that, in (2), we assume C is non-singular and transform the DAE (1) into the ODE (2) directly. However, C is generally singular, since the algebraic constraints without time derivatives result in empty rows or columns in C, which prevents the straightforward multiplication of C −1 on both sides of the DAE (1). Higher-order singularity is also possible due to irregular circuit topologies [17].
To eliminate the singularity, Reference [5] has proposed a two-step topology-based regularization. The first step is to reduce the circuit to index-1 by breaking all C-V loops and L-I cutsets [18] in the circuit. The second step is to apply row echelon reduction to eliminate the variables corresponding to the singular part of the system. However, such row echelon reduction is computationally expensive and does not preserve the sparsity of C and G in (1). To enhance the efficiency of regularization for large-scale systems, in this section we devise a new regularization technique that is implicit, algebraic, and sparsity-preserving. The regularization applies to EI based on the ordinary Krylov subspace.

Partition
We first define semi-explicit DAEs as follows.

Definition 1.
A semi-explicit DAE of index-1 admits the following partition and conditions [19]:

[C_s 0; 0 0] [x_1'; x_2'] + [G_11 G_12; G_21 G_22] [x_1; x_2] = [W_1; W_2] u(t), (9)

with
1. a singular sparse matrix C;
2. a non-singular sub-matrix C_s;
3. a non-singular sub-matrix G_22.
The above partition can be obtained by re-arranging the nonzero part of C in (1) to the upper left corner. If the above conditions do not hold, the DAE (9) is typically index-2, or index-1 with floating capacitors. In such cases, the topology-based method in Reference [5] can be applied to break the C-V loops and/or L-I cutsets and eliminate the floating capacitors, rendering a system in the form of (9).

Regularization
From (9), we have the following two equations:

C_s x_1' + G_11 x_1 + G_12 x_2 = W_1 u(t), (10)
G_21 x_1 + G_22 x_2 = W_2 u(t). (11)

Next, we can eliminate x_2 from (10) using (11), which yields

C_s x_1' + (G_11 - G_12 G_22^{-1} G_21) x_1 = (W_1 - G_12 G_22^{-1} W_2) u(t), (12)
x_2 = G_22^{-1} (W_2 u(t) - G_21 x_1). (13)

With (12), x_1 can be solved normally with the EI method, while x_2 is obtained by solving the algebraic Equation (13) after x_1 is solved. For simplicity, we denote

G_s = G_11 - G_12 G_22^{-1} G_21, (14)
W_s = W_1 - G_12 G_22^{-1} W_2, (15)

and we obtain the ODE we need to solve from (12) as:

C_s x_1' + G_s x_1 = W_s u(t). (16)

Then, we multiply C_s^{-1} on both sides of (16) to transform the DAE into an ODE, which can be solved analytically by EI assuming PWL inputs:

x_{1,n+1} = e^{A_s h} x_{1,n} + h \phi_1(A_s h) \omega_{s,1} + h^2 \phi_2(A_s h) \omega_{s,2}, with A_s = -C_s^{-1} G_s, (17)

where \omega_{s,1} and \omega_{s,2} are the PWL input terms defined analogously to (4). Merging the three \phi functions in (17) yields an augmented MEVP with matrix \tilde{A}_s, as in (5), so each Arnoldi step requires a product with \tilde{A}_s. The parameter \alpha in (18) is a scaling factor introduced to balance the magnitudes of C_s and G_s; with it, the augmented matrix \tilde{A}_s is transformed into a non-singular matrix \hat{A}_s as derived in (18).

Then, the computation in each Arnoldi step requires solving a linear system of the form (21), where \kappa = (h/\alpha) e^{h/\alpha} is a scalar quantity. Specifically, the core computation in (21) is

y_s = (C_s/\alpha)^{-1} (-G_s v_1 + W_s v_2), (22)

where v_1 and v_2 are the corresponding partitions of the current Arnoldi vector. The main challenge in computing (22) lies in that the regularized matrices G_s and W_s are rather expensive to compute and tend to have much worse sparsity than the original matrices G and W, especially when G_22 is large. Hence, it is not recommended to explicitly form G_s and W_s for computing (22). Instead, we propose to compute y_s in the following implicit and sparsity-preserving manner.

Firstly, we partition the original W matrix into W_1 and W_2, following the partition in (9). Substituting the expressions of G_s (14) and W_s (15) into (22) then yields

y_s = (C_s/\alpha)^{-1} (P_1 + P_2), (23)

where P_2 = -G_11 v_1 + W_1 v_2 can be directly computed by matrix-vector multiplications with the corresponding blocks, and P_1 = G_12 G_22^{-1} (G_21 v_1 - W_2 v_2) can be solved by three steps:
1. Compute rhs_{1,2} = G_21 v_1 - W_2 v_2;
2. Solve x_{1,2} from G_22 x_{1,2} = rhs_{1,2};
3. Compute P_1 = G_12 x_{1,2}.

The starting vector v_s is obtained by partitioning the solution vector consistently with C_s in (9), and, to merge the three \phi functions, we augment v_s in (25) with e_2, i.e., \tilde{v}_s = [v_s; e_2]. Given a solution v, we start the Arnoldi process with regularization; see Algorithm 2. Once we obtain one part of the solution from (12), the other part is solved algebraically by (13). The whole flow of our regularization is illustrated in Figure 2.

Figure 2. Flow of the proposed implicit regularization: start with the DAE system C x' + G x = W u; if C is singular, arrange and partition the system as in (9), partition the input sources to obtain W_1 and W_2 (with the same indices as x_1 and x_2), and transform the core computation accordingly.

Two remarks are in order:
1. The proposed regularization technique is "algebraic" in the sense that it works directly on the matrix level instead of the circuit topology level. It is "implicit" and "sparsity-preserving" since we do not explicitly form G_s and W_s, but instead use the original sparse matrices C, G, W and their blocks in the computation.

2.
In Algorithm 2, each Arnoldi step involves two linear system solutions, one with C_s and another with G_22. The former is needed for any variation of EI using the ordinary Krylov subspace approximation, while the latter is specifically associated with the proposed regularization technique. Since both sub-matrices are constant throughout the simulation, we only need to perform one sparse LU factorization for each matrix at the beginning, then re-use the LU factors in all subsequent computations. Thus, the overhead due to the proposed regularization is mild. In contrast, the rational Krylov subspace approach requires factorizing C + G, which is generally much more costly due to the enlarged size and a higher number of nonzeros.
Algorithm 2 (fragment): in each regularized Arnoldi step, set P = P_1 + P_2; solve (C_s/\alpha) w_1 = P for w_1; stack w_1 and w_2 to form the next Krylov vector w; break when converged.
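The implicit matrix-vector product at the heart of Algorithm 2 can be sketched as follows. This is an illustrative sketch, not the paper's implementation: function names are ours, and the input-source term (W_s v_2) is omitted for brevity, so only the -(C_s/alpha)^{-1} G_s v_1 part is shown. Both LU factorizations are performed once and re-used:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def make_implicit_matvec(Cs, G11, G12, G21, G22, alpha):
    """Return v1 -> -(Cs/alpha)^{-1} Gs v1 with Gs = G11 - G12 G22^{-1} G21,
    without ever forming the (fill-in-prone) Schur complement Gs."""
    lu_Cs = splu(sp.csc_matrix(Cs) / alpha)   # factorized once, re-used
    lu_G22 = splu(sp.csc_matrix(G22))         # factorized once, re-used
    def matvec(v1):
        rhs12 = G21 @ v1                      # step 1: rhs_{1,2}
        x12 = lu_G22.solve(rhs12)             # step 2: G22 x_{1,2} = rhs_{1,2}
        P1 = G12 @ x12                        # step 3: P1 = G12 x_{1,2}
        P2 = -(G11 @ v1)                      # direct block SpMV part
        return lu_Cs.solve(P1 + P2)
    return matvec
```

The result matches the explicit Schur-complement computation while touching only the original sparse blocks, which is the point of the implicit scheme.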

Deflated Restarting of Exponential Integrator
Another challenge facing EI with the ordinary Krylov subspace is the large subspace dimension required for stiff circuits. Stiffness appears when the time constants of the circuit differ by several orders of magnitude. To guarantee the simulation accuracy of stiff circuits, one needs to use either small step sizes or a large Krylov subspace dimension m, commonly m > 100. For large problems, storing and performing orthogonalization with all the basis vectors can be highly memory and computation demanding. Moving big chunks of data back and forth between memory and CPU also induces large overheads and hampers parallelizability.
Therefore, there has been a strong desire to limit the subspace dimension for EI when handling stiff circuits. Motivated by the restarting scheme for Krylov subspace iterative solvers (such as GMRES), Reference [6] proposed a restarted Krylov subspace EI method in which the generation of the Krylov subspace is restarted after a moderate number of Arnoldi steps, which reduces the memory and orthogonalization cost. However, simple restarting typically results in slower convergence than the non-restarted version, consuming more Arnoldi iterations and offsetting the benefit of restarting. In the following subsections, we first analyze the cause of the slow convergence of the ordinary Krylov subspace for stiff circuits, and then propose a deflated restarting scheme to accelerate the convergence by "deflating" a certain subspace from the new search space. The deflated restarting scheme is also compatible with the implicit regularization technique described above.

Slow Convergence in Krylov Subspace Approximation for Stiff Circuits
The dimension m required to accurately approximate e^A v depends mainly on the eigenvalue distribution of the original projected matrix A = -C^{-1} G. The underlying idea is that the eigenvalues of H_m in (6), the Ritz values, gradually approximate the eigenvalues of A, and the basis V_m gradually forms an orthogonal basis of the subspace spanned by the approximate eigenvectors of A. Being a variant of power iteration, the Arnoldi process is known to approximate large-magnitude (negative) eigenvalues with a higher priority [20]. More specifically, the Arnoldi process spends most of its "dimension resources" building a Krylov subspace spanned by the eigenvectors corresponding to the large-magnitude eigenvalues of A, while leaving fewer "resources" for the subspace corresponding to the small-magnitude eigenvalues. However, for transient circuit simulation, the approximation quality of the small-magnitude eigenvalues is more important than that of the larger ones in deciding the final accuracy. This can be explained by the fact that e^{-\lambda} ~= 0 when \lambda >> 1, which means an accurate capture of the large-magnitude eigenvalues is not necessary since their contributions to the solution are almost zero. On the other hand, small eigenvalues have more tangible impacts on the approximation accuracy.
Consequently, the key problem with the ordinary Krylov subspace EI (6) is that it oversamples the less important region but undersamples the more important region, in the matrix spectrum, as illustrated in Figure 3. For stiff circuits, the gap between small and large eigenvalues becomes more significant, thus demanding a larger subspace to ensure adequate sampling in the critical part of the spectrum.
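A toy numerical illustration of this point (ours, not from the paper): for a diagonal stiff system with eigenvalues -1 and -10^6, the stiff mode's contribution to e^{Ah} x underflows to essentially zero after a single step, while the slow mode carries virtually all of the solution:

```python
import numpy as np

# Toy stiff system: A = diag(-1, -1e6), step h = 1e-3.
h = 1e-3
lams = np.array([-1.0, -1e6])
x0 = np.array([1.0, 1.0])
# Per-mode contributions to e^{Ah} x0: e^{lambda * h} * x0.
contrib = np.exp(lams * h) * x0
# contrib[0] is ~0.999 (slow mode dominates); contrib[1] is e^{-1000},
# which underflows to ~0, so resolving the stiff eigenvalue precisely
# buys no accuracy in the solution.
```

Yet the Arnoldi process, left to itself, spends its first iterations resolving exactly that negligible stiff mode.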

Figure 3. Eigenvalue sampling of the ordinary Krylov subspace in the complex plane: the less important, large-magnitude part of the spectrum is oversampled, while the more important, small-magnitude part is undersampled.

Ordinary Restarted Krylov Subspace Method
To mitigate the large subspace dimension m required for stiff circuits, Reference [6] proposed a restarted Krylov subspace method specific for EI [21]. The Arnoldi process is restarted every m iterations, with the last basis vector v m+1 in (6) in the current run being the initial vector v 1 of the next m-step Arnoldi process.
A V_m^{(j)} = V_m^{(j)} H_m^{(j)} + h_{m+1,m}^{(j)} v_{m+1}^{(j)} e_m^T, j = 1, 2, ..., k, (28)

where V_m^{(j)} and H_m^{(j)} denote the basis and Hessenberg matrix of the j-th run, and k denotes the total number of restarts. The restarting scheme effectively generates a sequence of k Krylov subspaces, each of dimension m. Concatenating the k sets of basis vectors, one has the relation

A V_km = V_km H_km + h_{m+1,m}^{(k)} v_{m+1}^{(k)} e_km^T, (29)

where V_km is defined as

V_km = [V_m^{(1)}, V_m^{(2)}, ..., V_m^{(k)}], (30)

and H_km is the km x km block lower bidiagonal matrix with diagonal blocks H_m^{(1)}, ..., H_m^{(k)} and subdiagonal blocks h_{m+1,m}^{(j)} e_1 e_m^T (j = 1, ..., k-1). (31)

Note that the columns of V_km form a basis of the km-dimensional Krylov subspace K_km(A, v), albeit not an orthogonal one. Similarly to (7), projecting e^A v onto K_km(A, v) yields the approximation

e^A v ~= beta V_km e^{H_km} e_1. (32)

It is shown in Reference [6] that the restarted Krylov subspace approximation (32) can be obtained efficiently by k iterations, i.e.,

f^{(k)} = f^{(k-1)} + beta V_m^{(k)} [e^{H_km} e_1]_{(k-1)m+1:km}, k = 2, 3, ..., (33)

so that only the m-dimensional basis V_m^{(k)} local to the current Krylov subspace is involved in each iteration, instead of the global basis V_km, leading to substantial savings in memory and orthogonalization.
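The restarted scheme can be sketched as follows (our illustrative dense code, not the paper's implementation). Because the coupling blocks sit below the diagonal, H_km is block lower triangular, so the leading entries of e^{H_km} e_1 are unchanged as new blocks are appended; this is what makes the incremental update (33) valid, with only the current cycle's m basis vectors in memory:

```python
import numpy as np
from scipy.linalg import expm

def expmv_restarted(A, v, h, m, k):
    """Restarted Krylov approximation of e^{Ah} v: k cycles of m Arnoldi
    steps each; only the current cycle's basis vectors are kept."""
    n = v.size
    beta = np.linalg.norm(v)
    Hkm = np.zeros((k * m, k * m))
    f = np.zeros(n)
    vj = v / beta
    for j in range(k):
        V = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        V[:, 0] = vj
        for p in range(m):                      # m-step Arnoldi cycle
            w = A @ V[:, p]
            for i in range(p + 1):
                H[i, p] = V[:, i] @ w
                w -= H[i, p] * V[:, i]
            H[p + 1, p] = np.linalg.norm(w)
            V[:, p + 1] = w / H[p + 1, p]
        s = j * m
        Hkm[s:s + m, s:s + m] = H[:m, :]        # diagonal block
        if j > 0:
            Hkm[s, s - 1] = hprev               # coupling h_{m+1,m} e_1 e_m^T
        hprev = H[m, m - 1]
        vj = V[:, m]                            # restart vector v_{m+1}
        # incremental update (33): add only the current cycle's slice
        E = expm(h * Hkm[:s + m, :s + m])
        f = f + beta * V[:, :m] @ E[s:s + m, 0]
    return f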
However, this restarting scheme results in slower convergence than the version without restarting. In other words, km is larger than m_0, the subspace dimension of the non-restarted version. This can be attributed partially to the fact that the global subspace basis V_km is not orthogonal; thus, with the same number of basis vectors, the subspace coverage is smaller than that with an orthogonal basis. A larger Krylov subspace is thereby needed to ensure the important region of the spectrum is properly sampled.

Deflated Restarting Krylov Subspace Method
The analysis above reveals an important point that the ordinary Krylov subspace falls short for stiff circuits because it does not capture, with priority, the subspace spanned by the approximate eigenvectors associated to the small-magnitude eigenvalues (called the rightmost eigenvectors as the eigenvalues are all negative) of A. The simple restarting proposed in Reference [6], while mitigating the memory bottleneck, inherits and magnifies this weakness by producing a non-orthogonal basis.
To address this challenge, we propose a new deflated restarting EI scheme, dubbed EI-DR, for transient simulation of linear stiff circuits. Our rationale is that, if some "useful" information from the current run of Arnoldi process can be extracted and added to the new round of Arnoldi process, the convergence will be improved. In other words, certain important-yet-hard-to-reach subspace from the previous computation can be maintained and "deflated" from the search space of the new Arnoldi process. Consequently, the new search space is shrunk and fewer basis vectors are needed to approximately span the reduced search space.
Based on the above analysis, it would be most effective to deflate the subspace spanned by the rightmost eigenvectors of H m [22]. To avoid confusion, we would like to stress again that we actually want to maintain the information of this subspace in the next round of Arnoldi process. And the term "deflation" means that, since this subspace has been used as the initial subspace, the new Arnoldi process would not search this part again because any newly generated basis vector is orthogonal to the previous subspace. In effect, the subspace becomes "invisible" to the Arnoldi process and deflated from the entire vector space.
To facilitate the discussion of the deflated restarting, we first introduce the Krylov-like decomposition, which generalizes (6) to

A W_m = W_{m+l} H_{m+l,m},

where the columns of W_{m+l} are allowed to be linearly dependent if and only if l > 0. In particular, the Krylov-like decomposition turns into a Krylov decomposition as in (6) if l = 0.
There are five steps in our proposed deflated restarting scheme; see Algorithm 3. In outline, each cycle computes m further Arnoldi steps to obtain (45), updates the block Hessenberg matrix, computes a compensatory solution f^{(k)} by (48), and breaks when converged. The individual steps are as follows.

Step 1: perform the first round of the Arnoldi process (6) and obtain the relation

A V_m^{(1)} = V_m^{(1)} H_m^{(1)} + h_{m+1,m}^{(1)} v_{m+1}^{(1)} e_m^T. (34)

Step 2: extract l eigenvectors from H_m^{(1)} (more precisely, a partial Schur decomposition):

H_m^{(1)} U_l^{(1)} = U_l^{(1)} T_l^{(1)}, (35)

where U_l^{(1)} in C^{m x l} and T_l^{(1)} in C^{l x l} is upper triangular with the selected Ritz values on its diagonal.

Step 3: form

Y^{(1)} = V_m^{(1)} U_l^{(1)}, (36)

where Y^{(1)} spans the deflated subspace of the l approximate eigenvectors (Ritz vectors).

Step 4: apply another m-step Arnoldi process with respect to the new Krylov subspace, started from v_{m+1}^{(1)} and kept orthogonal to the deflated subspace; note that the columns of [Y^{(1)}, v_{m+1}^{(1)}, ...] remain orthonormal.

Step 5: glue the basis Y^{(1)} and the new Arnoldi basis together, which yields a Krylov-like decomposition with a block Hessenberg matrix of size (l+m) x m per cycle. After k - 1 (k >= 2) cycles, we obtain a Krylov-like decomposition (46) of A with regard to the Krylov subspace K_km(A, v). Then, the approximation of e^A v based on the Krylov-like decomposition (46) is

e^A v ~= beta V_{km+(k-1)l} e^{H_{km+(k-1)l}} e_1. (47)

Taking into account the special structure of H_{km+(k-1)l}, the approximation (47) can be retrieved by compensation [23].
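Steps 2 and 3 can be sketched with SciPy's sorted Schur decomposition (our illustrative code; the selection of the l smallest-magnitude Ritz values follows the analysis of Section 4.1, and the function name is ours):

```python
import numpy as np
from scipy.linalg import schur

def deflation_basis(Hm, l):
    """Sorted partial Schur decomposition of Hm: return (Ul, Tl) holding the
    l smallest-magnitude Ritz values, so that Hm @ Ul = Ul @ Tl.
    The deflated basis of Ritz vectors is then Y = Vm @ Ul."""
    mags = np.abs(np.linalg.eigvals(Hm))
    thresh = np.sort(mags)[l - 1]          # magnitude of the l-th smallest
    # Sort the selected eigenvalues to the top-left of the Schur form.
    T, U, sdim = schur(Hm.astype(complex), output='complex',
                       sort=lambda z: abs(z) <= thresh * (1 + 1e-10))
    assert sdim == l, "tie in Ritz-value magnitudes"
    return U[:, :l], T[:l, :l]
```

Because the leading l columns of a Schur basis are invariant, Hm @ Ul = Ul @ Tl holds exactly, and combining it with (34) gives A Y^{(1)} = Y^{(1)} T_l^{(1)} + h_{m+1,m}^{(1)} v_{m+1}^{(1)} (e_m^T U_l^{(1)}), i.e., the deflated basis is coupled to the rest of the space only through the single vector v_{m+1}^{(1)}.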
Note that, with the above iterative updating scheme, only m + l basis vectors need to be stored in each round of restart. The extra computational cost induced by the deflation is mainly the eigenvalue decomposition of an (m + l) x (m + l) matrix. The computation of S^{(j-1)} in (41), while appearing to be expensive, can be performed with nearly zero additional cost in a way outlined in Appendix A. Hence, the overhead arising from the deflation is insignificant for reasonable m and l.

Numerical Results
All the numerical experiments are conducted on a computer with an Intel Xeon(R) Gold 6140 processor (2.30 GHz, 72 cores) and 128 GB memory under a Linux system. We implement both the regularization technique and the deflated restarting scheme (EI-DR) in the open-source Python circuit simulator Ahkab [24]. The LU factorizations for C_s and G_22 are performed by the SuperLU package of SciPy. For all EI-related simulations, we set the scaling parameter \alpha = 10^{-6}, and the tolerances ABStol = 10^{-6} and RELtol = 10^{-3}.
We test four cases of IBM P/G networks with the specifications shown in Table 1. Len(x1) and Len(x2) denote the length of x 1 and x 2 in (9) due to the matrix partitioning, respectively.

Performance of Regularization
We first confirm the effectiveness of our regularization technique proposed in Section 3. In Table 2, we test a simple RC circuit with 10 nodes at the first time point. The leftmost column shows the generalized eigenvalues of the matrix pencil (-C, G). The system matrix A = -C^{-1} G has three infinite eigenvalues due to the fact that there are three empty rows in C. The remaining columns show the eigenvalues of the Hessenberg matrix H_m in (6) obtained using the proposed regularization algorithm. The data from the seven consecutive Arnoldi steps demonstrate that the Ritz values gradually approximate the finite eigenvalues of the original system in a stable manner. When m = 7, the Ritz values reproduce exactly the original eigenvalues as expected, with no infinite eigenvalues included. This proves that our regularization technique does not alter or miss any useful information contained in the original system, apart from removing the infinite modes to ensure numerical stability. In Figure 4, we compare the transient voltage waveforms simulated by the built-in BE (Backward Euler) method and the EI-DR method combined with the regularization technique. The D1 and D2 cases are used with a fixed time step size of 1 x 10^{-11} s. The overlapping agreement confirms that no accuracy loss is caused by the proposed regularization.

Deflated Restarting Scheme
In this section, we investigate the performance of the EI-DR method. As mentioned in Section 4.1, the ordinary Arnoldi process (6) tends to oversample the large-magnitude eigenvalues but undersample the small-magnitude ones that are more pertinent to the final accuracy. Hence, the first question for EI-DR is how to choose the proper eigenvalues (or eigenvectors) of the block Hessenberg matrix H in (41) to be deflated. Below, we compare two different choices of the l eigenvalues for deflation. The D1 case in Table 1 is used with a uniform time step size of 0.2 ns for convenience. Figure 5 shows the convergence history for different settings, where m and l denote the restarting length of the ordinary Krylov subspace and the number of eigenvectors we extract from H for deflation, respectively. The symbols (L) and (S) indicate whether we deflate the eigenvectors corresponding to the l largest or the l smallest eigenvalues. The figure plots how the error decreases with the number of restarts k. Note that, when k differs by 1, the corresponding subspace dimension differs by m. When l = 0, our deflated restarting scheme coincides with the non-deflated version (EI-R) proposed in Reference [6], and the two curves for (L) and (S) overlap in Figure 5 as expected. For nonzero l, however, deflation of the small-magnitude eigenvalues consistently yields faster convergence than that of the large-magnitude eigenvalues, which confirms the analysis in Section 4.1 that the slow convergence is due to inadequate sampling of the small eigenvalues. Comparing the curves of l = 2(S) and l = 5(S), one can see that the convergence rates are similar, demonstrating that the improvement obtainable by increasing l saturates. This again suggests that the accuracy is controlled by a fraction of small eigenvalues, and further improvements are limited once this set of eigenvalues has already been deflated from the new search space.
Only for a high accuracy requirement (below 10^{-9}) is deflating more small eigenvalues beneficial, as shown by the discrepancy at the bottom of the two leftmost lines. Tables 3 and 4 tabulate the performance data of EI-DR for all four test cases with different combinations of m and l. spsolve denotes the number of triangular solves involved in the Arnoldi process (= km), a main indicator of convergence speed and total runtime. dim(V_{km+(k-1)l}) is the effective total subspace dimension for EI-DR, taking into account the deflated vectors. The l smallest eigenvalues are chosen for deflation. Several observations can be made:

1.
A higher restarting length m in general leads to faster convergence and reduced runtime, at the cost of a higher memory consumption. Compared to EI without restarting, both EI-R and EI-DR consume more spsolve, indicating slower convergence.

2.
For the same m, a higher l typically improves the convergence and reduces spsolve. The improvements, however, tend to saturate after a certain value of l. For instance, for the case D2 with m = 10, using l = 5 or l = 10 results in the same number of restarts k, suggesting a higher l is not always optimal for a particular m.

3.
Compared with EI-R (l = 0), EI-DR reduces the number of spsolve by 30% to 50% across the different cases and various m, confirming that the proposed EI-DR is an effective acceleration scheme. The improvement is more significant for small m than for large m.
Finally, we plot the 3D profiles of the number of spsolve with respect to different (m, l) pairs for D1 and D4 in Figure 6a,b, respectively. m varies from 6 to 14, and l varies from 0 to m. In the two figures, the peaks are both at (6, 0), and the valleys are near (14, 14), which is consistent with our analysis above. The optimal selection of m and l in practice is generally a trade-off between runtime and memory.

Conclusions
In this paper, we proposed two techniques to improve EI based on the ordinary Krylov subspace for linear circuits. We first proposed an implicit regularization technique for the ordinary Krylov subspace EI to solve the singular-C problem; this regularization is computationally efficient and sparsity preserving. Next, we analyzed the convergence problems of the ordinary Krylov subspace and the simple restarting scheme for stiff circuits. Based on the analysis, we then developed a deflated restarting scheme that deflates a carefully chosen region of the matrix spectrum to accelerate the convergence. Numerical results demonstrate the effectiveness of our regularization technique, and the substantial convergence improvement arising from the proposed deflated restarting scheme for stiff circuits. The optimal, even adaptive, selection of m and l in our deflated restarting scheme will be a topic for future investigation.

Conflicts of Interest:
The authors declare no conflict of interest.