On the Matrix Mittag–Lefﬂer Function: Theoretical Properties and Numerical Computation

: Many situations, as for example within the context of Fractional Calculus theory, require computing the Mittag–Lefﬂer (ML) function with matrix arguments. In this paper, we collect theoretical properties of the matrix ML function. Moreover, we describe the available numerical methods aimed at this purpose by stressing advantages and weaknesses.


Introduction
The Mittag-Leffler (ML) function has earned the title of "Queen function of fractional calculus" [1][2][3] for the fundamental role it plays within this subject. Indeed, the solution of many integral or differential equations of noninteger order can be expressed in terms of this function.
For this reason, the accurate evaluation of the ML function has deserved great attention, not least because of the serious difficulties this computation raises. We cite, among the most fruitful contributions, the papers [4][5][6][7][8].
We have recently observed an increasing interest in computing the ML function for matrix arguments (e.g., see [9][10][11][12][13][14][15]): this need occurs, for example, when dealing with multiterm Fractional Differential Equations (FDEs), or with systems of FDEs, or to decide the observability and controllability of fractional linear systems.
In this paper, we want to collect the main results concerning the matrix ML function: we will start from the theoretical properties to move to the practical aspects related to its numerical approximation. Our inspiring work is the milestone paper by Moler and van Loan [16], dating back to 1978, dealing with the several ways to compute the matrix exponential. The authors offered indeed a review of the available methods which were declaimed, already in the paper's title, as "dubious" in the sense that none of them can be considered the top-ranked. Due to the great interest of the topic, twenty-five years later, the same authors published a revised version of this paper [17] to discuss important contributions given in the meantime. In this paper, we would like to use the same simple approach to highlight the difficulties related to the numerical approximation of the matrix ML function.
It is worth stressing that the exponential function is a special ML function; however, it has very nice properties that are not valid for any other instance of ML functions. The semi-group property is one of these and the impossibility to apply it enables, for example, the use of local approximations (which, in the case of the exponential, can be generalized to any argument by exploiting the cited property). Moreover, several methods for the matrix exponential computation were deduced from the fact that this function can be regarded as the solution of simple ordinary differential equations. An analog of this strategy for the ML function presents more difficulties since it can be regarded as a solution of the more involved FDEs.
It becomes clear then that the difficult goal of settling the best numerical method for the exponential function becomes even more tough when treating the matrix ML function. However, in this case, we can affirm that a top-ranked method exists; it was recently proposed [18] and is based on the combination of the Schur-Parlett method [19] and the Optimal Parabolic Contour (OPC) method [4] for the scalar ML function and its derivative. Roughly speaking, this method starts from a Schur decomposition, with reordering and blocking, of the matrix argument and then applies the Parlett's recurrence to compute the function in the triangular factor. Since this step involves the computation of the ML scalar function and its derivatives, the OPC method [4] is, with some suitable modification, fruitfully applied, as we will accurately describe in the following.
The paper is organized as follows: in Section 2, we recall the definition of the ML function and some basic facts about it. In Section 3, we collect the main theoretical properties of the ML function when evaluated in matrix arguments. We then move to the description of the numerical methods for the matrix ML function in Section 4 and to the computation of its action on given vectors in Section 5. Finally, some concluding remarks are collected in Section 6.

The Matrix ML Function
The ML function is defined for complex parameters α and β, with (α) > 0, by means of the series with the Euler's gamma function Γ(z) = ∞ 0 t z−1 e −t dt. It was introduced for β = 1 by the Swedish mathematician Magnus Gustaf Mittag-Leffler at the beginning of the twentieth century [20,21] and then generalized to any complex β by Wiman [22]. Throughout the paper, we will consider real parameters α and β since they are the most common.
The exponential is trivially a ML function for α = β = 1. Even the numerical computation of the scalar ML function is not a trivial task, and several studies have been devoted to it [4][5][6]23]. They all agree that the best approach to numerically evaluate E α,β (z) is based on a series representation for small values of |z|, asymptotic expansions for large arguments, and special integral representations for intermediate values of z. Finally, Garrappa [4] proposed an effective code, based on some ideas previously developed in [5], which allows for reaching any desired accuracy on the whole complex plane. It is implemented in Matlab (2019, MatheWorks Inc., Natick, MA, USA) and we will use this routine for the numerical tests in the following .
There are many equivalent ways to extend the definition of the ML function to matrix arguments, as for more general functions [24]. Here, we recall some of them: Definition 1. Let A ∈ C n×n , α and β complex values with (α) > 0. Then, the following equivalent definitions hold for the matrix ML function:

•
Taylor series . (2) • Jordan canonical form Let λ 1 , . . . , λ p be the distinct eigenvalues of A; then, A can be expressed in the Jordan canonical form and m 1 + . . . + m p = n. Then, where Γ is a simple closed rectifiable curve that strictly encloses the spectrum of A.

•
Hermite interpolation If A has the eigenvalues λ 1 , λ 2 , . . . , λ p with multiplicities m 1 , m 2 , . . . , m p , then where r is the unique Hermite interpolating polynomial of degree less than ∑ p i=1 m i that satisfies the interpolation conditions

Theoretical Properties of the Matrix ML Function
We collect here the main theoretical properties of the matrix ML function [24][25][26]: the first 11 hold for general matrix functions while the remaining are specific for the ML function. Proposition 1. Let A, B ∈ C n×n , α, β ∈ R with α > 0. Let I and 0 denote the identity and the zero matrix, respectively, of dimension n. Then, If A has no eigenvalues on the negative real axis, then 16.

Numerical Evaluation of the Matrix ML Function
We give now an overview of different methods for the numerical evaluation of the matrix ML function, with emphasis on the strengths and weaknesses of each of them.

Series Expansion
As for the exponential, the Taylor series expansion (2) may be regarded as the most direct way to compute the matrix ML function. Indeed, in this definition, only matrix products appear thus to make the approach ideally very simple to implement. In practice, once a fixed number K of terms is chosen, one can use the approximation However, by following exactly the example presented in [16] for the exponential, we show the weakness of this approach. Indeed, we consider the matrix argument whose eigenvectors and eigenvalues are explicitly known, Then, the exact solution can be directly computed as VE α,β (D)V −1 and E α,β (D) is the diagonal matrix of diagonal entries E α,β (−1) and E α,β (−17). In Figure 1, we relate the relative error, in norm, between the exact solution and the approximation (5) as K varies, for three different values of α and β = 1. Figure 1 clearly shows that the numerical approximation (5) can give unreliable results. In this specific example, the impressive growth of the error is due to numerical cancellation; indeed, the summation terms in Equation (5) get larger as j enlarges and they change sign by passing from the jth power to the next one. This means that we sum up terms with very large modulus and opposite sign, and this is an undisputed source of catastrophic errors.  (5) for three values of α, β = 1 and the matrix A as in Equation (6).

Polynomial Methods
Methods based on the minimal polynomial or the eigenpolynomial of a matrix have been proposed to numerically evaluate the matrix ML function. This kind of approach is in general poor and we show the weak points (which are exactly the same we encounter in applying it for the matrix exponential [16]).
The first thing to stress is that they require the eigenvalues' knowledge. This is usually not a priori available and numerical methods for their computation are usually very expensive. Thus, their application is limited to the case in which eigenvalues are at one's disposal.
Although in general the minimal polynomial and the eigenpolynomial are very difficult to compute, the latter is simpler to calculate and we focus on this approach.
Let c(z) denote the characteristic polynomial of A with Then, by means of the Cayley-Hamilton theorem, it is easy to prove that any power of A can be expressed as a linear combination of I, A, . . . , A n−1 . Thus, also E α,β (tA) is a polynomial in A with analytic coefficients in t; indeed, formula (2) for the matrix ML function reads The expression of coefficients p jk is simply obtained once the coefficients c j are known. However, the weak point is related to their numerical computation since it is very prone to round off error (as shown in [16] already for the 1-by-1 case). For this reason, methods of this kind are strongly discouraged.

The Schur-Parlett Method Combined with the OPC Method
The third property of Proposition 1 suggests looking for a suitable similarity transformation which moves the attention to the matrix function evaluated in a different argument, hopefully simpler to deal with. In particular, among the best conditioned similarity transformations, one can resort to the Schur one. Indeed, it factors a matrix A as A = QTQ * with T upper triangular and Q unitary. Then, The Schur decomposition is among the best factorization one can consider since its computation is perfectly stable, unlike other decompositions, as the Jordan one that we will describe in the following. For this reason, it is commonly employed for computing matrix functions [27,28]. The actual evaluation of Equation (7) requires the computation of the ML function for a triangular matrix factor. This topic has been properly addressed for general functions by Parlett in 1976 [29], resulting in a cheap method. Unfortunately, Parlett's recurrence can give inaccurate results when T has close eigenvalues. In 2003, Higham and Davies [19] proposed an improved version of this method: once the Schur decomposition is computed, the matrix T is reordered and blocked according to its eigenvalues resulting in a matrix, sayT. Specifically, each diagonal block ofT has "close" eigenvalues and distinct diagonal blocks have "sufficiently distinct" eigenvalues. In this way, Parlett's recurrence works well even in the presence of closed eigenvalues of T. Just a final reordering is required at the end of the process to recover E α,β (T) from E α,β (T).
The evaluation of E α,β (T) starts from the evaluation of the ML function of its diagonal blocks, which are still triangular matrices whose eigenvalues are "close". Let T ii be one of these diagonal blocks and σ denotes the mean of these eigenvalues. Then, T ii = σI + M and with E (k) α,β denoting the k-th order derivative of E α,β . The powers of M are expected to decay quickly since the eigenvalues of T ii are close. This means that only a few terms of (8), usually less than the dimension of the block T ii , suffice to get a good accuracy.
Evidently, the computation of (8) involves the computation of the derivatives of the scalar ML function, up to an order depending on the eigenvalues' properties. This issue has been completely addressed in [18], and we refer to it for the details.
In particular, the analysis of the derivatives of the ML function has been facilitated by resorting to the three parameters' ML function (also known as the Prabhakar function) The Prabhakar function is an important function occurring in the description of many physical models [30][31][32][33][34][35].
In practice, as for the scalar ML function, one could compute the Prabhakar function, or equivalently the ML derivatives, by the Taylor series for small arguments, the asymptotic expansion for large arguments and an integral representation in the remaining cases. In [18], however, to obtain the same accuracy for all arguments, the inverse Laplace transform has been used in all cases to obtain the simple expression with C any suitable contour in the complex plane encompassing at the left the singularities of the integrand. This last issue is quite delicate: indeed, from a theoretical point of view, the contours chosen to define the inverse Laplace transform are all equivalent while they can lead to extremely different results when the numerical evaluation of these integrals comes into play. Then, an accurate analysis is needed to choose the "optimal" contour which guarantees the desired accuracy, minimizes the computational complexity, and results in a simple implementation. The method proposed in [18], grounded on well established analysis [4,5,15], actually fulfills these requirements since the obtained accuracy is in any case close to the machine precision and the computational complexity is very reasonable. The Matlab code ml_matrix.m implements this method and will be used in the following for numerical tests.

Jordan Canonical Form
The expression of the matrix ML function in terms of its Jordan canonical form, as stated in Equation (3), could be a direct way to numerically evaluate it. However, the true obstacle in using it is the high sensitivity of the Jordan canonical form to numerical errors (in general, "there exists no numerically stable way to compute Jordan canonical forms" [36]).
To give an example, we consider the Matlab code by Matychyn [37], which implements this approach. We restrict the attention to the exponential case (that is, α = β = 1) to have as reference solution the result of the well-established expm code by Matlab.
We consider the Chebyshev spectral differentiation matrix of dimension 10. Oddly, even for this "simple" function, the relative error is quite high, namely proportional to 10 −3 .
The error source is almost certainly the well-known ill-conditioning of the eigenvector matrix. Indeed, the code ml_matrix gives a relative error proportional to 10 −7 since it does not involve the Jordan canonical form. Now, we consider as example the test matrix in [36] A = ε 0 1 0 as matrix argument; as before, we just consider the simplest case α = β = 1 as a significant example.
For small values of ε, say ε < 10 −16 , the code [37] stops running, since, when computing the Jordan canonical form, Matlab recognizes that the matrix is singular. On the other hand, the code ml_matrix works very well even for tiny values, say ε equal to the Matlab machine precision.
From these examples, we can appreciate the high accuracy reached by the code ml_matrix described in Section 4.3. Moreover, its computational cost is lower than the code based on the Jordan canonical form and, far more important, it does not suffer from the eigenvalues' conditioning.

Rational Approximations
Among the nineteen methods to approximate the matrix exponential, Moler and van Loan [16] consider the "exact" evaluation of a rational approximation of the exponential function evaluated in the desired matrix argument. This is indeed a very common approach when dealing with more general functions having good rational approximations (see, e.g., [38][39][40]).
Indeed, let p µ and q ν be polynomials of degree µ and ν, respectively, such that, for scalar arguments z, To evaluate the approximation above in the matrix case, we use a partial fraction expansion of the right-hand side above, leading to in this way, the computation ofp (A) is trivial while the sum requires the computation of ν matrix inversions, namely (A − σ i I) −1 , for i = 1, . . . , ν.
Once the rational approximation is fixed, the sum (10) can be computed by actually inverting the matrices (A − σ i I) −1 if A is a small well-conditioned matrix or, if it is large, incomplete factorizations of A can be cheaply applied [40].
For the ML function, the problem is the detection of a suitable rational approximation to use. The Padé and the Chebyshev rational approximation are commonly preferred for the exponential; this choice is mainly due to their good approximation properties, to the fact that they are explicitly known, and the error analysis is well established.
A key feature of the Padé approximation is that it can be used if A is not too large. This does not represent a restriction for the exponential function since it is endowed with the fundamental property exp(A) = (exp(A/m)) m , it allows for computing the exponential of an arbitrarily small argument A/m to then extend it to the original argument A. In general, m is chosen as a power of two in order to require only the repeated squaring of the matrix argument.
The property above is only valid for the exponential function, meaning that, for the ML function, there is no direct way to extend the local approximation to the global case.
Some years ago, a global Padé approximation to the ML function was proposed [8] working on the whole real semiaxis (−∞, 0]. In the matrix case, this restricts the applicability to matrix arguments with real positive eigenvalues. Moreover, the computation of the coefficients is arduous; for this reason, small degrees are considered in [8], which lead to quite important errors.
We now describe the Carathéodory-Fejér approximation of the ML function as an effective tool when a rational approximation is needed.

Carathéodory-Fejér Approximation of the ML Function
As concerns the ML function, Trefethen was the first to address the problem of finding rational approximations when α = 1 and β ∈ N [41]. Later on, the most general case has been deeply analyzed [11] grounding on the Carathéodory-Fejér (CF) theory; this theory is important since it allows for constructing a near best rational approximation of the ML function. In practice, once we fix a given degree ν, the residues ω 0 , . . . , ω ν and the poles σ 1 , . . . , σ ν are found that define the CF approximation of degree ν of the ML function. Thus, When dealing with real arguments, the sum can be arranged as to almost halve the number of terms to compute. Moreover, since a small degree ν usually suffices to give a good approximation, only a few matrix inverses are actually required. Obviously, this approach is meaningful only for matrix arguments whose inversion can be computed in a stable and reliable way.

Numerical Computation of E α,β (A)b for a Given Vector b
In many situations, the interest is in the computation of E α,β (A)b for a given vector b. Any method described so far can be applied to compute E α,β (A) and then multiply it by the vector b. However, when the dimension of the matrix argument A is very large, ad hoc strategies have to be preferred.
The rational approximation (11) is, for example, a good solution; indeed, it reads and, rather than matrix inversions, the right-hand side requires only solving linear systems. This approach is effective even for small matrix arguments A, in which case direct methods can be applied for the linear systems involved. When the matrix argument is very large, several alternatives are at one's disposal: iterative methods can be, for example, applied (see [11]) and, when preconditioning is needed, the same preconditioner can be computed just once and then applied to all shifted systems. Incomplete factorizations of A can be used for example as preconditioners for the systems involved (we refer to [40] for a deep description of the approach).
Krylov subspace methods are an effective tool for the numerical approximation of vectors like E α,β (A)b; their first application was related to the exponential and then they have been successfully employed for general functions [38,42,43]. In particular, for the ML function, we refer to [11,12]. The idea is to approximate E α,β (A)b in Krylov subspaces defined as The matrix A is projected in these spaces as H m = V T m AV m , where V m ∈ C n×m is an orthonormal basis of K m (A, b) built by applying the Gram-Schmidt procedure with b/ b as starting vector and H m ∈ C m×m is an unreduced Hessenberg matrix whose entries are the coefficients of the orthonormalization process. Then, where e 1 denotes the first column of the identity matrix of dimension m × m. The potency of these techniques is that usually a small dimension m is enough to get a sufficiently accurate approximation; thus, some classical method usually works to compute E α,β (H m ).
The convergence of Krylov subspace methods can be quite slow when the spectrum of A is large; this phenomenon was primarily studied for the exponential [44] and successively for the ML function [12]. In these cases, the Rational Arnoldi method can be successfully used, with superb results already for the "one-pole case" [12,[45][46][47]. The idea of this method, known as Shift and Invert in the context of eigenvalue problems, is to fix a parameter γ and to approximate E α,β (A)b in the Krylov subspaces generated by Z = (I + γA) −1 , rather than A and b.
The computational complexity is larger than for standard Krylov subspace methods since the construction of the Krylov subspaces requires computing vectors of the form (I + γA) −1 y, that is, solving several linear systems with the same shifted coefficient matrix. However, for suitable shift parameters, the convergence becomes much faster, to thus compensate the additional cost.
We refer to [12] for a comprehensive description of this method applied to the computation of the matrix ML function, together with the numerical tests to show the effectiveness of the approach. Moreover, for completeness, we want to stress that the actual computation of the matrix ML function in [12] was accomplished by combining the Schur-Parlett recurrence and the Matlab code mlf.m by Podlubny and Kacenak [48]. However, this approach cannot handle the derivatives of the ML scalar function; therefore, to treat more general situations, as, for example, matrix arguments with repeated eigenvalues, the approach described in Section 4.3 has to be considered within the implementation.

Conclusions
This paper offers an overview of the matrix ML function: the most important theoretical properties are collected to serve as a review and to help in the treatment of this function. Moreover, the existing methods for its numerical computation are presented, by following the plot used by Moler and Van Loan [16] to describe the methods for the numerical computation of the matrix exponential.
From this analysis, we may conclude that the approach based on the combination of the Schur-Parlett method and the OPC method is the most efficient: it is indeed cheap, accurate, and easy to implement.

Conflicts of Interest:
The author declares no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: