Abstract
We develop a shared denominator Carathéodory–Fejér (CF) method for efficiently evaluating linear combinations of φ-functions for matrices whose spectrum lies on the negative real axis, as required in exponential integrators for large stiff ODE systems. This entire family is approximated with a single set of poles (a common denominator). The shared pole set is obtained by assembling a stacked Hankel matrix from Chebyshev boundary data for all target functions and computing a single SVD; the zeros of the associated singular-vector polynomial, mapped via the standard CF slit transform, yield the poles. With the poles fixed, per-function residues and constants are recovered by a robust least squares fit on a suitable grid on the negative real axis. For any linear combination of resolvent operators applied to right-hand sides, the evaluation reduces to one shifted linear solve per pole with a single combined right-hand side, so the dominant cost matches that of computing a single φ-function action. Numerical experiments indicate geometric convergence at a rate consistent with Halphen’s constant, and for highly stiff problems our algorithm outperforms existing Taylor and Krylov polynomial-based algorithms.
Keywords: matrix functions; Carathéodory–Fejér approximation; shared poles; exponential integrators; φ-functions
MSC: 15A16; 65F60; 65L05
1. Introduction
Over the last two decades, the φ-functions have become central objects in the design of exponential integrators for stiff systems; see, e.g., the survey by Hochbruck and Ostermann [1], the review by Minchev and Wright [2], and the references therein. Typical target problems include semilinear diffusion–reaction equations, Schrödinger-type equations, advection–diffusion–reaction systems, and more general evolution equations obtained by spatial discretization of parabolic or highly oscillatory PDEs in physics and engineering, leading to a semilinear system of ODEs
$$u'(t) = A\,u(t) + g(t, u(t)), \qquad u(t_0) = u_0, \tag{1}$$
where the matrix A is obtained from spatial discretization and g is a nonlinear vector function. In such applications, the matrix A is usually large and sparse, and in many cases it is negative semidefinite.
Exponential integrators have emerged as a successful class of numerical methods for systems of ODEs. A broad class of exponential integrators reduces each stage to a linear combination of φ-function actions of the form (see, e.g., [1,3,4,5,6,7,8,9,10]):
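For a scalar argument $z \in \mathbb{C}$, the functions $\varphi_j$ admit a convenient integral representation
$$\varphi_0(z) = e^{z}, \qquad \varphi_j(z) = \frac{1}{(j-1)!}\int_0^1 e^{(1-\theta)z}\,\theta^{j-1}\,d\theta, \quad j \ge 1,$$
which, by the holomorphic functional calculus, extends directly to matrices:
$$\varphi_j(A) = \frac{1}{(j-1)!}\int_0^1 e^{(1-\theta)A}\,\theta^{j-1}\,d\theta, \quad j \ge 1.$$
Equivalently, one can use the power series representation $\varphi_j(z) = \sum_{k=0}^{\infty} z^k/(k+j)!$.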
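To make the definitions concrete, the following minimal sketch evaluates the scalar φ-functions in two standard ways: the truncated power series, whose k-th term is z^k/(k+j)!, and the well-known recurrence φ_{i+1}(z) = (φ_i(z) − 1/i!)/z started from φ_0(z) = e^z. The function names, the truncation length, and the switch at |z| = 1 are illustrative assumptions, not part of the method developed in this paper.

```python
import math

def phi_series(j, z, terms=60):
    """Truncated power series: sum over k of z^k / (k+j)!."""
    term = 1.0 / math.factorial(j)      # k = 0 term
    total = term
    for k in range(1, terms):
        term *= z / (k + j)             # turns z^(k-1)/(k-1+j)! into z^k/(k+j)!
        total += term
    return total

def phi_recurrence(j, z):
    """phi_{i+1}(z) = (phi_i(z) - 1/i!) / z, starting from phi_0 = exp."""
    value = math.exp(z)
    for i in range(j):
        value = (value - 1.0 / math.factorial(i)) / z
    return value

def phi(j, z):
    # The recurrence suffers subtractive cancellation for small |z|,
    # while the series cancels badly for large negative z, so switch.
    return phi_series(j, z) if abs(z) < 1.0 else phi_recurrence(j, z)

if __name__ == "__main__":
    for z in (-1e-8, -0.5, -4.0, -50.0):
        print(z, [phi(j, z) for j in range(4)])
```

The switch illustrates, in the scalar case, the cancellation issue that motivates expressing integrator coefficients through φ-functions rather than through differences of exponentials.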
As an illustration of the pivotal role of the φ-functions in exponential integrators, consider the problem (1) with step size . The classical fourth-order exponential time differencing Runge–Kutta method (ETDRK4) of Cox and Matthews [9], modified later by Kassam and Trefethen [11] for stability, reads as follows. Given , define the stage values
Then, the step from to is given by
Using the fact that ([12], Section 10.7.4)
the ETDRK4 scheme can be written in the form
The step from to is then given by
This formulation is algebraically equivalent to the original ETDRK4 scheme of Cox and Matthews, but all matrix coefficients are expressed as linear combinations of the φ-functions , , making the scheme stable and less prone to subtractive cancellation [11].
The need to compute combinations of the form (2) has led to a substantial body of work devoted specifically to the numerical evaluation of matrix φ-functions. Niesen and Wright [8] proposed a Krylov subspace algorithm for computing that is now widely used in exponential integrator codes. Their algorithm has been improved and extended to several Krylov-based algorithms that simultaneously evaluate several linear combinations of the form (2); see, e.g., Luan et al. [13], Gaudreault et al. [5], and Caliari et al. [14]. Recently, Al-Mohy [3] proposed an algorithm based on the Taylor series that simultaneously calculates several linear combinations of the form (2). For the implementation of rational Krylov subspaces, see, e.g., Moret [15], Bergermann and Stoll [16], and the references therein.
For φ-functions of medium-sized matrices, Berland, Skaflestad, and Wright [17] developed the expint MATLAB package, emphasizing that the stability and efficiency of exponential integrators hinge on the accurate evaluation of the underlying φ-functions. A more recent contribution is the scaling and recovering algorithm of Al-Mohy and Liu [18], which extends the work of Al-Mohy and Higham [19] for the matrix exponential.
In this context, our goal in the present work is to develop a CF-based rational approximation framework that exploits a shared set of poles for the family . By constructing near-best scalar rational approximants with a common denominator on the negative real axis, and then lifting them to the matrix level, we obtain an efficient mechanism for evaluating general linear combinations of the form (2) using only one set of shifted factorizations across all indices j and all vectors .
In this manuscript, we focus on approximating the linear combination (2) simultaneously by constructing a shared-pole CF rational approximation for each of the form [20]
where the pole set is common to all j, while the residues and constants depend on j. Equivalently,
so that all share the same denominator and differ only in their numerators .
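The practical benefit of a common denominator can be seen from a short derivation. The notation here is generic and assumed for illustration (approximants $r_j$, constants $c_j$, residues $\beta_{k,j}$, shared poles $z_k$, vectors $v_j$); it need not coincide symbol for symbol with (3).

```latex
\begin{align*}
r_j(z) &= c_j + \sum_{k=1}^{n} \frac{\beta_{k,j}}{z - z_k},
  \qquad j = 0, \dots, p, \\
\sum_{j=0}^{p} r_j(A)\, v_j
  &= \sum_{j=0}^{p} c_j v_j
   + \sum_{j=0}^{p} \sum_{k=1}^{n} \beta_{k,j}\, (A - z_k I)^{-1} v_j \\
  &= \sum_{j=0}^{p} c_j v_j
   + \sum_{k=1}^{n} (A - z_k I)^{-1}
     \Bigl( \sum_{j=0}^{p} \beta_{k,j}\, v_j \Bigr).
\end{align*}
```

Because the poles do not depend on j, the sums can be exchanged, and each pole then requires a single shifted solve with one combined right-hand side, however many functions enter the combination.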
The CF approach proposed by Trefethen [21,22] and Trefethen and Gutknecht [23] constructs near-best real rational approximants to scalar functions from boundary data using singular structure of a Hankel matrix. Its roots trace back to the early 20th-century work of Carathéodory and Fejér on the relation between the extrema of harmonic functions and their coefficients [24]; an extensive historical review is given in [23]. The use of CF approximants for matrix functions goes back to Trefethen, Weideman, and Schmelzer ([20], Section 4) for the matrix exponential. Schmelzer and Trefethen [25] subsequently used CF approximants to evaluate actions , typically with distinct pole sets for each j. They also advocate a common-pole strategy exploiting block-matrix identities among the -functions ([25], Section 4), yielding rational approximants that share a single denominator and thereby allowing reuse of the same shifted factorizations across multiple right-hand sides. They did not develop a general framework for arbitrary linear combinations of the form (2) with heterogeneous right-hand sides, which we provide below. Moreover, they explicitly remark that these approximants are far from optimal.
A key advantage of our approach is that the degree required by the CF rational approximants (and hence the attainable accuracy, up to the conditioning of the problem) tends to be independent of the spectral radius of A: the same shared pole set yields geometric decay uniformly on , so a large spectral radius does not force a higher degree n. By contrast, algorithms based on Taylor series, such as that of Al-Mohy [3], and on standard Krylov subspaces, such as those of [5,8,14], typically require degrees that grow with [16].
This paper is organized as follows: In Section 2, we show how the shared poles, residues, and constants can be computed and propose an algorithm for their computation. In Section 3, we present several results showing that the shared-pole rational approximants retain the exponential accuracy. Section 4 presents the main algorithm for computing the linear combination in (2). Next we present our numerical experiments in Section 5. Finally, we draw some concluding remarks in Section 6.
In the next section, we describe how to construct the shared pole set together with the per–function residues and constants that define the CF approximants in (3).
5. Numerical Experiments
This section presents three numerical experiments. The first investigates the algorithmic parameters and the geometric convergence rate of the CF approximation while the second experiment implements Algorithm 2 for a highly nonnormal matrix. The third experiment involves the 2D Poisson matrix.
All runs were performed in MATLAB R2022b on a single desktop (Intel® Core™ i7–7700T @ 2.90 GHz, 16 GB RAM, Intel Corporation, Santa Clara, CA, USA). To contextualize performance and accuracy, we compare the following five routines:
- cfphimv: our MATLAB routine for Algorithm 2 (https://github.com/aalmohy/cfphimv (accessed on 9 December 2025)).
- phimv: Al-Mohy’s algorithm ([3], Algorithm 2) (https://github.com/aalmohy/phimv (accessed on 20 October 2025)).
- phi_funm: Al-Mohy and Liu’s algorithm ([18], Algorithm 5.1) (https://github.com/xiaobo-liu/phi_funm (accessed on 20 October 2025)). This algorithm evaluates several φ-functions of a moderate-size matrix jointly via a scaling and recovering strategy. We use this routine to compute a reference solution for medium-sized problems.
- bamphi: Caliari, Cassini, and Zivcovich’s routine [14], combining Newton form polynomial interpolation at special nodes with Krylov techniques (https://github.com/francozivcovich/bamphi (accessed on 20 October 2025)).
- kiops: the adaptive Krylov solver of Gaudreault, Rainwater, and Tokman [5] based on the incomplete orthogonalization procedure (https://gitlab.com/stephane.gaudreault/kiops (accessed on 20 October 2025)).
The routines phimv, bamphi, and kiops all natively accept block right-hand sides and, in one call, evaluate several linear combinations of the form (2). We use each routine with its default settings; for kiops, we set the tolerance to the double-precision machine epsilon for consistency with the other routines.
5.1. Numerical Sweep for Shared-Denominator CF Tables
The purpose of this experiment is to (i) empirically verify geometric convergence of the shared denominator CF approximants for the family on , and (ii) identify practical defaults for the CF scale and degree n that deliver double precision accuracy uniformly across j.
For we build type- shared-pole tables at and . Pole extraction via Algorithm 1 uses Chebyshev coefficients and Chebyshev points. With the shared poles fixed, per-function residues and constants are fitted by least squares (9) on a log-dense training grid of size over with . Accuracy is measured on an independent testing grid of size over . For each configuration, we compute the familywise worst error
We report the geometric rate as , defined by a least squares fit of over the interior degrees to reduce endpoint bias, yielding and . Across all p, the fitted rates lie near Halphen’s constant, and the worst errors at are between and . Similar behavior holds for , with consistently optimal by rate and end error.
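One common way to extract a geometric rate from a table of errors is a straight-line least squares fit of log10(error) against the degree n, discarding the first and last degrees. The sketch below follows that convention; it is an assumption about the fitting procedure, not a transcript of the one used here, and the data are synthetic (generated from an exact geometric sequence), not measured errors.

```python
import math

def fit_geometric_rate(degrees, errors):
    """Fit log10(error) ~ a - n*log10(rho) by least squares; return (rho, a)."""
    ys = [math.log10(e) for e in errors]
    m = len(degrees)
    mean_n = sum(degrees) / m
    mean_y = sum(ys) / m
    sxx = sum((n - mean_n) ** 2 for n in degrees)
    sxy = sum((n - mean_n) * (y - mean_y) for n, y in zip(degrees, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_n
    return 10.0 ** (-slope), intercept

if __name__ == "__main__":
    # 9.28903 is the reciprocal of the classical Halphen rate for exp(x)
    # on the negative real axis; used here only to generate test data.
    HALPHEN = 9.28903
    degrees = list(range(4, 15))
    synthetic = [3.0 * HALPHEN ** (-n) for n in degrees]   # synthetic, not measured
    rho, _ = fit_geometric_rate(degrees[1:-1], synthetic[1:-1])  # interior degrees only
    print(f"fitted rho = {rho:.5f}")
```

A fitted rho close to 9.289 then corresponds to errors that shrink by roughly one decimal digit per unit increase in degree.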
We conclude that choosing is sufficient to achieve a nearly optimal geometric convergence rate across . This default aligns with the scales reported in [20,25]. Accordingly, we recommend as the CF scale parameter in Algorithm 1.
5.2. Chebyshev Spectral Laplacian (Dirichlet)
We use a Chebyshev collocation discretization on the interval with homogeneous Dirichlet boundary conditions ([3], Section 5). Starting from the Chebyshev first derivative matrix D on , the change of variables implies , so the discrete second derivative on is . Enforcing Dirichlet conditions by deleting the first and last rows/columns yields a dense, strongly nonnormal matrix whose spectrum lies on the negative real axis and whose spectral radius grows with N, making it a stiff, representative test for exponential integrators. The code in Table 2 (adapted from [29]) constructs A for general N and L.
Table 2.
MATLAB code for the discrete second derivative on with Dirichlet conditions.
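For readers without MATLAB, the construction can be sketched in plain Python. This is an independent illustration and not the listing of Table 2: it follows the standard Chebyshev differentiation matrix formula, and it assumes the physical interval is [0, L], mapped from [−1, 1] by x = L(ξ + 1)/2 so that the second derivative is scaled by (2/L)². The interval, the scaling, and the function names are assumptions.

```python
import math

def cheb(N):
    """Chebyshev points x_j = cos(j*pi/N) and first-derivative matrix on [-1, 1]."""
    x = [math.cos(math.pi * j / N) for j in range(N + 1)]
    c = [(2.0 if j in (0, N) else 1.0) * (-1.0) ** j for j in range(N + 1)]
    D = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                D[i][j] = (c[i] / c[j]) / (x[i] - x[j])
        D[i][i] = -sum(D[i])            # diagonal = negative sum of the off-diagonals
    return D, x

def dirichlet_second_derivative(N, L):
    """Interior block of (2/L)^2 * D^2 on the ASSUMED interval [0, L]."""
    D, xi = cheb(N)
    s = (2.0 / L) ** 2
    D2 = [[s * sum(D[i][k] * D[k][j] for k in range(N + 1))
           for j in range(N + 1)] for i in range(N + 1)]
    A = [row[1:N] for row in D2[1:N]]   # delete first and last rows/columns
    x = [L * (t + 1.0) / 2.0 for t in xi[1:N]]  # interior nodes in (0, L)
    return A, x

if __name__ == "__main__":
    A, x = dirichlet_second_derivative(16, 2.0)
    u = [math.sin(math.pi * t / 2.0) for t in x]   # vanishes at both endpoints
    Au = [sum(a * b for a, b in zip(row, u)) for row in A]
    print(max(abs(Au[i] + (math.pi / 2.0) ** 2 * u[i]) for i in range(len(u))))
```

The check applies the matrix to sin(πx/L), which satisfies the Dirichlet conditions, and compares against the exact second derivative −(π/L)² sin(πx/L).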
We consider and . The aim is to demonstrate the speed of our shared denominator CF rational approximant (with ) relative to Taylor- and Krylov polynomial-based routines. As and nonnormality grow, polynomial approaches typically require higher degrees (or smaller steps) to control the error, so their cost escalates. By contrast, the CF rational approximants employ a fixed set of poles that capture the branch cut of the φ-family, making accuracy essentially insensitive to the spectral radius of A; the dominant work reduces to a small number of shifted linear solves. This is therefore a highly stiff test where rational approximants should retain both accuracy and speed as N increases. For and each N, we generate a random set of vectors and evaluate using cfphimv, phimv, bamphi, and kiops. The reference solution is computed using phi_funm.
The data in Table 3 highlight three key points. (i) The cost of cfphimv is essentially flat in N: even as increases from to and the matrix becomes more nonnormal, the runtime stays below s and the relative error remains in the – range. (ii) The Taylor- and Krylov-based methods (phimv, bamphi, kiops) degrade rapidly with N: phimv and bamphi become slower by one to two orders of magnitude and eventually time out, and kiops either times out or returns unusably large errors (up to ). (iii) Beyond , only cfphimv continues to deliver both accuracy and subsecond turnaround. This supports the main claim of the paper: once the shared pole set is precomputed, evaluating the full linear combination of φ-functions reduces to solving a small number of shifted systems, and this remains stable and fast even for highly stiff, strongly nonnormal matrices.
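The reduction to one shifted solve per pole can be illustrated on a small example. The sketch below is not cfphimv: it uses a real symmetric tridiagonal matrix, a plain Thomas solver, and arbitrary placeholder poles and residues (not CF data). It assumes the approximants have the partial fraction form c_j + Σ_k β_{k,j}/(z − z_k) with poles in complex conjugate pairs, so that for real data one solve per pair suffices and the pair contributes twice the real part.

```python
def solve_shifted_tridiagonal(diag, off, shift, rhs):
    """Solve (A - shift*I) x = rhs for symmetric tridiagonal A (Thomas algorithm)."""
    n = len(rhs)
    d = [complex(diag[i]) - shift for i in range(n)]
    x = [complex(r) for r in rhs]
    for i in range(1, n):                       # forward elimination
        w = off[i - 1] / d[i - 1]
        d[i] -= w * off[i - 1]
        x[i] -= w * x[i - 1]
    x[-1] /= d[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = (x[i] - off[i] * x[i + 1]) / d[i]
    return x

def combine_shared_poles(diag, off, poles, residues, constants, vectors):
    """Evaluate sum_j r_j(A) v_j, r_j(z) = c_j + sum_k beta[k][j] / (z - z_k).

    `poles` lists one pole per conjugate pair (nonzero imaginary part);
    A and the v_j are real, so each pair contributes 2*Re(...).
    """
    n = len(diag)
    result = [sum(c * v[i] for c, v in zip(constants, vectors)) for i in range(n)]
    for k, z in enumerate(poles):
        # Single combined right-hand side for this pole, over all functions j.
        rhs = [sum(residues[k][j] * v[i] for j, v in enumerate(vectors))
               for i in range(n)]
        y = solve_shifted_tridiagonal(diag, off, z, rhs)   # one solve per pole
        for i in range(n):
            result[i] += 2.0 * y[i].real
    return result

if __name__ == "__main__":
    diag, off = [-2.0] * 6, [1.0] * 5           # tridiag(1, -2, 1), negative definite
    poles = [complex(-1.0, 2.0), complex(-3.0, 5.0)]        # placeholders
    residues = [[0.5 - 0.1j, 0.2 + 0.3j], [-0.4 + 0.2j, 0.1 - 0.6j]]
    constants = [0.3, -0.2]
    vectors = [[1.0] * 6, [float(i) for i in range(6)]]
    print(combine_shared_poles(diag, off, poles, residues, constants, vectors))
```

The number of solves equals the number of pole pairs and does not grow with the number of vectors; only the cheap formation of the combined right-hand side does.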
Table 3.
Runtimes and relative errors for the linear combination with a hard per-call timeout of 300 s. “TO” = timed out; “ER” = routine error.
5.3. Two-Dimensional Poisson Matrix
In this experiment we use the two-dimensional Poisson matrix obtained from the standard five-point finite difference discretization of on with homogeneous Dirichlet boundary conditions. With lexicographic ordering of the interior grid points, P can be written as a sum of Kronecker products
where is the tridiagonal Toeplitz matrix corresponding to the one-dimensional second-difference operator. The matrix P can be generated by the MATLAB command P = gallery('poisson', N). Poisson matrices of this type arise ubiquitously in the finite difference and finite element discretization of diffusion and heat equations, electrostatics and potential problems, pressure Poisson equations in incompressible flow, and in image processing and graph-based models where discrete Laplacians are used for smoothing and regularization.
The spectral structure of P can be characterized explicitly. The one-dimensional matrix T is diagonalized by the discrete sine transform matrix S with entries
yielding with eigenvalues
By standard properties of Kronecker products,
so the eigenvectors of P are tensor products of the one-dimensional sine modes. Detailed expositions of this construction and its use in fast Poisson solvers and spectral analysis can be found, for example, in Strang ([27], Section 5.5) and Golub and Van Loan ([30], Section 4.8).
To evaluate efficiently for a scalar matrix function f analytic on the spectrum of P and a vector , we exploit the explicit diagonalization
We first reshape b into an matrix B such that , where the vec operator stacks the columns of a matrix on top of each other. Using the identity
we compute the spectral coefficients . Defining
we apply by pointwise multiplication,
and transform back via , so that
In this procedure, we never form or explicitly; the dominant cost is a small number of matrix–matrix products (or fast sine transforms), i.e., flops with an explicit S or flops with FFT-based discrete sine transforms ([30], Section 4.8). We use this spectral machinery, combined with multiprecision arithmetic using the Multiprecision Computing Toolbox (ver. 5.1.0) [31], to generate highly accurate reference solutions.
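The procedure can be sketched in a few lines of plain Python for small N. The sketch assumes the standard facts for T = tridiag(−1, 2, −1): eigenvalues 2 − 2cos(kπ/(N+1)) and the symmetric orthogonal sine matrix with entries sqrt(2/(N+1)) sin(ijπ/(N+1)), together with P = I ⊗ T + T ⊗ I acting as vec(B) ↦ vec(TB + BT). The function names are illustrative, double precision is used instead of multiprecision arithmetic, and dense products replace fast sine transforms.

```python
import math

def sine_matrix(N):
    """Symmetric orthogonal sine matrix S (S times S is the identity)."""
    s = math.sqrt(2.0 / (N + 1))
    return [[s * math.sin(math.pi * (i + 1) * (j + 1) / (N + 1))
             for j in range(N)] for i in range(N)]

def matmul(X, Y):
    """Dense matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def apply_function_of_poisson(f, B):
    """Return the N-by-N array X with vec(X) = f(P) vec(B), P = I (x) T + T (x) I."""
    N = len(B)
    S = sine_matrix(N)
    lam = [2.0 - 2.0 * math.cos(math.pi * (k + 1) / (N + 1)) for k in range(N)]
    Bhat = matmul(matmul(S, B), S)                    # spectral coefficients S B S
    Y = [[f(lam[i] + lam[j]) * Bhat[i][j] for j in range(N)]
         for i in range(N)]                           # pointwise multiplication
    return matmul(matmul(S, Y), S)                    # transform back

if __name__ == "__main__":
    B = [[1.0, 2.0, 0.0], [-1.0, 0.5, 3.0], [2.0, 1.0, -2.0]]
    X = apply_function_of_poisson(lambda t: math.exp(-0.1 * t), B)
    for row in X:
        print(row)
```

Neither P nor f(P) is ever formed; choosing f as the identity reproduces the five-point stencil applied to B, which gives a simple correctness check.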
We now use this eigenvalue decomposition as a high-accuracy reference to evaluate the linear combination (2). Let be the mesh size and for . We define the scaled operator
Thus, A is negative definite and increasingly stiff as N grows (indeed, ). For each N and four () randomly generated vectors , we evaluate the linear combination
using the routines cfphimv, phimv, kiops, and bamphi. The reference solution is computed via the spectral diagonalization of A described above, with applied to the eigenvalues in multiprecision arithmetic, and the relative error is measured in the 1-norm against this reference.
Table 4 presents the results. For the smallest sizes (N = 64, 128), all four algorithms reach very small relative errors, typically between and , confirming that they all resolve the highly stiff operator A on this model problem. However, the runtimes differ markedly: cfphimv is already more than an order of magnitude faster than phimv and substantially faster than kiops and bamphi. As N increases, the stiffness and dimension grow rapidly ( increases by roughly two orders of magnitude over the tested range). For , phimv already hits the time limit of 600 s, while kiops and bamphi remain accurate but require tens of seconds. For , all three polynomial- or Krylov-based routines time out, whereas cfphimv continues to deliver accurate results with runtimes ranging from a fraction of a second (for ) to about 80 s (for ). The modest growth of the cfphimv error with N (remaining around at ) indicates that the rational approximation error and the effect of the reference solver are both well under control. Overall, this experiment shows that the shared denominator CF approach can handle very stiff, large-scale diffusion operators for linear combinations of φ-functions at a cost comparable to a small number of shifted linear solves, while the competing Taylor and polynomial Krylov methods become prohibitively slow or fail to complete within the time limit.
Table 4.
Runtimes and relative errors for the linear combination with A the scaled 2D Poisson matrix and a hard per-call timeout of 600 s. “TO” = timed out.
6. Conclusions
We presented a shared denominator CF framework for evaluating linear combinations of φ-functions with stable matrices, a core task in exponential integrators. The key idea is to approximate on with rational functions that share a single pole set , while allowing per-function residues and constants . We obtain the poles by a single SVD of a stacked weighted Hankel matrix built from Chebyshev boundary data of all functions, and then recover the residues and constants via a robust least squares fit on a log-dense grid of . With these ingredients fixed, any linear combination is reduced to solving only (for real data and even n) shifted linear systems independent of p (one per shared pole), each against a single combined right-hand side, as described in Algorithm 2. Thus, the dominant cost matches that of evaluating a single φ-function action.
On the theoretical side, we proved a “no assumptions” shared denominator approximation theorem on that yields a geometric rate for a finite family of analytic functions and their linear combinations, and we lifted these bounds to matrix arguments for normal matrices (via the spectral theorem) and for nonnormal matrices whose field of values avoids the cut (via Crouzeix’s theorem). This places our construction on the same exponential convergence footing as CF/contour-trapezoid approaches, with the observed rates closely tied to the classical constants for slit domains.
Our numerical sweeps corroborate the theory: for and moderate degrees , the worst case scalar errors across decay geometrically to , with a near-uniformly effective CF scale around . The experiments also clarify the distinct roles of the parameters used in practice. Importantly, because the rational tables are constructed on the continuous negative real axis and then applied to A via shifted solves, their effectiveness is largely insensitive to the spectral radius of A, in contrast to Taylor and polynomial Krylov approaches whose difficulty grows directly with .
Because our evaluation reduces to solves with the shifted operators against a single combined right-hand side per pole, the method is immediately amenable to matrix-free implementations.
Funding
This research was funded by the Deanship of Scientific Research at King Khalid University through the Research Groups Program under Grant No. RGP.1/318/45.
Data Availability Statement
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.
Acknowledgments
We thank the reviewers for their insightful comments and suggestions that helped to improve the presentation of this paper.
Conflicts of Interest
The author declares no conflicts of interest.
References
1. Hochbruck, M.; Ostermann, A. Exponential integrators. Acta Numer. 2010, 19, 209–286.
2. Minchev, B.V.; Wright, W.M. A Review of Exponential Integrators for First Order Semi-Linear Problems; Technical Report 2/05; Norwegian University of Science and Technology: Trondheim, Norway, 2005.
3. Al-Mohy, A.H. Computing Linear Combinations of φ-Function Actions for Exponential Integrators. arXiv 2025, arXiv:2509.26475.
4. Al-Mohy, A.H.; Higham, N.J. Computing the Action of the Matrix Exponential, with an Application to Exponential Integrators. SIAM J. Sci. Comput. 2011, 33, 488–511.
5. Gaudreault, S.; Rainwater, G.; Tokman, M. KIOPS: A Fast Adaptive Krylov Subspace Solver for Exponential Integrators. J. Comput. Phys. 2018, 372, 236–255.
6. Hochbruck, M.; Lubich, C.; Selhofer, H. Exponential Integrators for Large Systems of Differential Equations. SIAM J. Sci. Comput. 1998, 19, 1552–1574.
7. Koskela, A.; Ostermann, A. Exponential Taylor Methods: Analysis and Implementation. Comput. Math. Appl. 2013, 65, 487–499.
8. Niesen, J.; Wright, W.M. Algorithm 919: A Krylov Subspace Algorithm for Evaluating the φ-Functions Appearing in Exponential Integrators. ACM Trans. Math. Softw. 2012, 38, 22.
9. Cox, S.; Matthews, P. Exponential Time Differencing for Stiff Systems. J. Comput. Phys. 2002, 176, 430–455.
10. Luan, V.T. Efficient Exponential Runge–Kutta Methods of High Order: Construction and Implementation. BIT Numer. Math. 2021, 61, 535–560.
11. Kassam, A.K.; Trefethen, L.N. Fourth-Order Time-Stepping for Stiff PDEs. SIAM J. Sci. Comput. 2005, 26, 1214–1233.
12. Higham, N.J. Functions of Matrices: Theory and Computation; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2008; p. xx+425.
13. Luan, V.T.; Pudykiewicz, J.A.; Reynolds, D.R. Further Development of Efficient and Accurate Time Integration Schemes for Meteorological Models. J. Comput. Phys. 2019, 376, 817–837.
14. Caliari, M.; Cassini, F.; Zivcovich, F. BAMPHI: Matrix-Free and Transpose-Free Action of Linear Combinations of φ-Functions from Exponential Integrators. J. Comput. Appl. Math. 2023, 423, 114973.
15. Moret, I. On RD-rational Krylov approximations to the core-functions of exponential integrators. Numer. Linear Algebra Appl. 2007, 14, 445–457.
16. Bergermann, K.; Stoll, M. Adaptive Rational Krylov Methods for Exponential Runge–Kutta Integrators. SIAM J. Matrix Anal. Appl. 2024, 45, 744–770.
17. Berland, H.; Skaflestad, B.; Wright, W.M. EXPINT—A MATLAB Package for Exponential Integrators. ACM Trans. Math. Softw. 2007, 33, 4-es.
18. Al-Mohy, A.H.; Liu, X. A Scaling and Recovering Algorithm for the Matrix φ-Functions. arXiv 2025, arXiv:2506.01193.
19. Al-Mohy, A.H.; Higham, N.J. A New Scaling and Squaring Algorithm for the Matrix Exponential. SIAM J. Matrix Anal. Appl. 2009, 31, 970–989.
20. Trefethen, L.N.; Weideman, J.A.C.; Schmelzer, T. Talbot quadratures and rational approximations. BIT Numer. Math. 2006, 46, 653–670.
21. Trefethen, L.N. Near-circularity of the error curve in complex Chebyshev approximation. J. Approx. Theory 1981, 31, 344–367.
22. Trefethen, L.N. Rational Chebyshev approximation on the unit disk. Numer. Math. 1981, 37, 297–320.
23. Trefethen, L.N.; Gutknecht, M.H. The Carathéodory–Fejér Method for Real Rational Approximation. SIAM J. Numer. Anal. 1983, 20, 420–436.
24. Carathéodory, C.; Fejér, L. Über den Zusammenhang der Extremen von harmonischen Funktionen mit ihren Koeffizienten und über den Picard–Landau’schen Satz. Rend. Circ. Mat. Palermo 1911, 32, 218–239.
25. Schmelzer, T.; Trefethen, L.N. Evaluating Matrix Functions for Exponential Integrators via Carathéodory–Fejér Approximation and Contour Integrals. Electron. Trans. Numer. Anal. 2007, 29, 1–18.
26. Trefethen, L.N.; Weideman, J.A.C. The Exponentially Convergent Trapezoidal Rule. SIAM Rev. 2014, 56, 385–458.
27. Strang, G. Introduction to Applied Mathematics; Wellesley–Cambridge Press: Wellesley, MA, USA, 1986.
28. Crouzeix, M.; Palencia, C. The Numerical Range is a (1 + √2)-Spectral Set. SIAM J. Matrix Anal. Appl. 2017, 38, 649–655.
29. Trefethen, L.N. Spectral Methods in MATLAB; SIAM: Philadelphia, PA, USA, 2000.
30. Golub, G.H.; Van Loan, C.F. Matrix Computations; Johns Hopkins University Press: Baltimore, MD, USA, 2013.
31. Advanpix. Multiprecision Computing Toolbox; Advanpix: Tokyo, Japan, 2025; Available online: http://www.advanpix.com (accessed on 15 August 2023).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).