# Efficient Evaluation of Matrix Polynomials beyond the Paterson–Stockmeyer Method


## Abstract


## 1. Introduction

- Evaluating polynomial approximations of matrix functions of degrees 15 and 21 with cost $4M$ and $5M$, respectively.
- Evaluating matrix polynomials of degree $6s$, $s=3,4,\dots$, with cost $(s+2)M$.
- Evaluating matrix polynomials of degree greater than 30 with two matrix products fewer than the Paterson–Stockmeyer method.
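As background for the costs quoted above: the Paterson–Stockmeyer method computes the powers $A^{2},\dots ,A^{s}$ once and then applies a Horner scheme on blocks of coefficients, so a degree-$m$ matrix polynomial costs roughly $2\sqrt{m}$ matrix products instead of $m-1$. A minimal NumPy sketch (ours, not code from the paper; names are illustrative):

```python
import numpy as np

def paterson_stockmeyer(A, b):
    """Evaluate P(A) = sum_i b[i] A^i (b[0] is the constant term)
    with about 2*sqrt(m) matrix products instead of m - 1."""
    m = len(b) - 1
    n = A.shape[0]
    s = max(1, int(np.ceil(np.sqrt(m))))
    # powers[j] = A^j for j = 0..s (costs s - 1 matrix products)
    powers = [np.eye(n), A]
    for _ in range(s - 1):
        powers.append(powers[-1] @ A)
    # Horner scheme on blocks: one product by A^s per block below the top one;
    # the top block may have degree up to s since A^s is already available
    r = max(0, (m - 1) // s)
    P = sum(b[r * s + j] * powers[j] for j in range(m - r * s + 1))
    for k in range(r - 1, -1, -1):
        P = P @ powers[s] + sum(b[k * s + j] * powers[j] for j in range(s))
    return P

# quick check against naive evaluation on a small random matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) * 0.1
b = rng.standard_normal(16)  # a degree-15 polynomial
ref = sum(b[i] * np.linalg.matrix_power(A, i) for i in range(16))
print(np.allclose(paterson_stockmeyer(A, b), ref))
```

For degree $m=16$ with $s=4$ this uses $3+3=6$ products, matching the ${d}_{PS}$ row of Table 3; the methods of this paper reach higher degrees for the same number of products.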

## 2. Efficient Evaluation of Matrix Polynomials

#### 2.1. Paterson–Stockmeyer Method

#### 2.2. General Polynomial Evaluation Methods beyond the Paterson–Stockmeyer Method

## 3. Three General Expressions for ${y}_{2s}\left(A\right)$

#### 3.1. Evaluation of Matrix Polynomial Approximations of Order 15 with ${y}_{2s}\left(A\right)$, $s=2$

**Proposition 1.**

**Proof of Proposition 1.**

```matlab
% MATLAB code fragment 4.1: solves coefficient c8 of
% the system of equations (30) for general coefficients bi
1 syms A c2 c3 c4 c5 c6 c7 c8 d1 d2 e0 e1 f0 g0 h2 h1 h0I
2 syms b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 b13 b14 b15 b16
3 c=[c2;c3;c4;c5;c6;c7;c8];
4 b=[b16;b15;b14;b13;b12;b11;b10;b9;b8;b7;b6;b5;b4;b3;b2;b1;b0];
5 y0=A^2*(sqrt(c8)*A^2+c7/(2*sqrt(c8))*A); % y0 from (29)
6 y1=sum(c.*A.^([2:8]'));
7 y2=(y1+d2*A^2+d1*A)*(y1+e0*y0+e1*A)+f0*y1+g0*y0+h2*A^2+h1*A+h0I;
8 [cy2,a1]=coeffs(y2,A);
9 cy2=cy2.';
10 v=[cy2 b a1.'] % v shows the coefficients of each power of A
11 cy2=cy2(2:end)-b(2:end); % system of equations
12 c7s=solve(cy2(1),c7,'ReturnConditions',true); % c7s=f(c8,bi)
13 c7s.conditions % c8 ~= 0 condition for the existence of solutions
14 c7s=c7s.c7;
15 cy2=subs(cy2,c7,c7s);
16 c6s=solve(cy2(2),c6); % c6s depends on c8, bi
17 cy2=subs(cy2,c6,c6s);
18 c5s=solve(cy2(3),c5); % c5s depends on c8, bi
19 cy2=simplify(subs(cy2,c5,c5s));
20 symvar(cy2(4)) % cy2(4) depends on c8, c4, e0, bi
21 e0s=solve(cy2(4),e0);
22 cy2=simplify(subs(cy2,e0,e0s));
23 symvar(cy2(5)) % cy2(5) depends on c8, c3, c4, bi
24 c3s=solve(cy2(5),c3);
25 cy2=simplify(subs(cy2,c3,c3s));
26 symvar(cy2(6)) % cy2(6) depends only on c8, c2, d2, bi
27 d2s=solve(cy2(6),d2);
28 cy2=simplify(subs(cy2,d2,d2s));
29 symvar(cy2(7)) % cy2(7) depends only on c8, d1, e1, bi
30 d1s=solve(cy2(7),d1);
31 cy2=simplify(subs(cy2,d1,d1s));
32 symvar(cy2(8)) % cy2(8) depends only on c8, c4, f0, bi
33 f0s=solve(cy2(8),f0);
34 cy2=simplify(subs(cy2,f0,f0s));
35 symvar(cy2(9)) % cy2(9) depends only on c8, b7, b8,...,b15
36 c8s=solve(cy2(9),c8)
```

Since `cy2(9)` from code fragment line 35 depends only on coefficients ${b}_{i}$, $i=7,8,\dots ,15$, and ${c}_{8}$, the solutions for ${c}_{8}$ are given by the zeros of equation `cy2(9)`, subject to the condition given by code fragment line 13, i.e., condition (26). The `solve` function gives 16 solutions for ${c}_{8}$: the roots of a polynomial whose coefficients depend on the variables ${b}_{i}$, $i=7,8,\dots ,15$.

```matlab
% MATLAB code fragment 4.2: solves coefficient c2 of the
% system of equations (30) for general coefficients bi by
% using the solutions for coefficient c8 obtained using the
% MATLAB piece of code 4.1
1 symvar(cy2(10)) % cy2(10) depends on c8, c2, c4, bi
2 c4s=solve(cy2(10),c4) % two solutions depending on c8, c2, bi
3 cy2=simplify(subs(cy2,c4,c4s(1))) % change c4s(1) for c4s(2) for more solutions
4 symvar(cy2(11)) % cy2(11) depends on c8, c2, e1, bi
5 e1s=solve(cy2(11),e1)
6 cy2=simplify(subs(cy2,e1,e1s))
7 symvar(cy2(12)) % cy2(12) depends on c8, c2, g0, bi
8 g0s=solve(cy2(12),g0,'ReturnConditions',true)
9 g0s.conditions % conditions for the existence of solutions:
% 3*b15^2 ~= 8*b14*c8^2 &
% 27*b15^6*c8^(45/2) + 576*b14^2*b15^2*c8^(53/2) ~=
% 512*b14^3*c8^(57/2) + 216*b14*b15^4*c8^(49/2) &
% c8 ~= 0
10 g0s=g0s.g0
11 cy2=simplify(subs(cy2,g0,g0s))
12 symvar(cy2(13)) % cy2(13) depends on c8, c2, bi
```

`cy2(13)` depends only on coefficients ${b}_{i}$, $i=3,4,\dots ,15$, ${c}_{8}$, and ${c}_{2}$. Substituting the values obtained previously for ${c}_{8}$ into `cy2(13)`, the solutions for ${c}_{2}$ are given by the zeros of equation `cy2(13)`, subject to the conditions from line 9 of MATLAB code fragment 4.2 and to those obtained when solving ${c}_{7}$ and ${g}_{0}$, given by (26)–(28). Both code fragments are available at http://personales.upv.es/jorsasma/Software/coeffspolm15plus.m (accessed on 24 June 2021).

**Example 1.**

We solved the equation `cy2(13)` from code fragment 4.2, line 12, which depends on ${c}_{2}$ and ${c}_{8}$, for every real solution of ${c}_{8}$, using the MATLAB Symbolic Math Toolbox with variable precision arithmetic. Finally, we obtained the nested solutions for computing (20) with (8) and (9) with ${q}_{4}>0$ from (10).

We compared `cosm` from [7] with a function using the coefficients from Table 1 in (39)–(41) and (36), with no scaling for simplicity. Since [7] used a relative backward error analysis, we used the values of $\Theta$ from [15] (Table 1) corresponding to the relative backward error analysis of the Taylor approximation of the matrix cosine, whose error is denoted by ${E}_{b}$. Then, if $\|B\|=\|{A}^{2}\|\le \Theta$, then $\|{E}_{b}\|\le u$ for the corresponding Taylor approximations. In [15] (Table 1), $\Theta$ for the Taylor approximation of order 16 was 9.97 and ${\Theta }_{20}=10.18$, showing two decimal digits. Then, for our test with order $2m=34+$, we used a set of 48 $8\times 8$ matrices from the Matrix Computation Toolbox [18], divided by random numbers to give $\|B\|$ between 9 and 10. We compared the forward error ${E}_{f}$ of both functions, `cosm` and the function using ${z}_{222}\left(B\right)$. The "exact value" of $\cos \left(A\right)$ was computed using the method in [19]. The total cost of the new matrix cosine computation function ${z}_{222}$, summing up the number of matrix products over all the test matrices, is denoted by ${\mathrm{Cost}}_{{z}_{222}}$. Taking (4) into account, the cost of the `cosm` Padé function, summing up the number of matrix products and inversions over all the test matrices, is denoted by ${\mathrm{Cost}}_{\mathrm{cosm}}$. The resulting cost comparison for that set of test matrices gave a lower total cost for ${z}_{222}$ than for `cosm`. Moreover, the results were more accurate in $76.60\%$ of the matrices. Therefore, the new formulae are efficient and accurate.
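Since the approximation under test acts on $B={A}^{2}$, its scalar counterpart is the truncated cosine series $\cos (a)\approx \sum_{k=0}^{m}{(-1)}^{k}{({a}^{2})}^{k}/(2k)!$. A scalar Python sketch (ours, for illustration only) of the order $2m=34$ truncation near the top of the interval $\|B\|\le 10$ used in the test:

```python
import math

def cos_taylor(a, m):
    """Truncated cosine series in b = a^2: sum_{k=0}^{m} (-1)^k b^k / (2k)!,
    with each term obtained stably from the previous one."""
    b = a * a
    term, total = 1.0, 1.0  # k = 0 term
    for k in range(1, m + 1):
        term *= -b / ((2 * k - 1) * (2 * k))
        total += term
    return total

a = math.sqrt(9.9)  # so that b = a^2 is just below Theta ~ 10
print(abs(cos_taylor(a, 17) - math.cos(a)))
```

The truncation error at $b\approx 10$ is far below the unit roundoff, which is why order $2m=34+$ needs no scaling on this matrix set.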

#### 3.2. Evaluation of Matrix Polynomial Approximations of Order 21

**Proposition 2.**

**Proof of Proposition 2.**

A MATLAB script `coeffspolm21plus.m` (http://personales.upv.es/jorsasma/Software/coeffspolm21plus.m (accessed on 24 June 2021)), similar to MATLAB code fragments 4.1 and 4.2, reduces the whole nonlinear system of 22 equations to a nonlinear system of three equations in the three variables ${c}_{10}$, ${c}_{11}$, and ${c}_{12}$. The MATLAB code `coeffspolm21plus.m` returns conditions (49) (see the actual code for details). □

**Example 2.**

We used the `vpasolve` function (https://es.mathworks.com/help/symbolic/vpasolve.html (accessed on 24 June 2021)) from the MATLAB Symbolic Math Toolbox to solve those equations with variable precision arithmetic. We used the `Random` option of `vpasolve`, which allows one to obtain different solutions for the coefficients, running it 100 times. The majority of the solutions were complex, but there were two real stable solutions. Then, we obtained the nested solutions for the coefficients of (16) and (17) with $s=3$ for computing polynomial (44) with four matrix products (see [1], Section 3), giving also real and complex solutions.
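Outside MATLAB, the effect of running `vpasolve` with the `Random` option many times can be imitated by a Newton iteration restarted from many starting points, keeping the distinct real solutions found. A self-contained Python sketch (ours; the toy two-variable system below only illustrates the strategy, it is not system (49)):

```python
def newton_2x2(f, x0, y0, tol=1e-12, max_iter=60):
    """Newton iteration for a 2-variable system f(x, y) -> (f1, f2),
    with a forward-difference Jacobian. Returns (x, y) or None."""
    x, y = x0, y0
    h = 1e-7
    for _ in range(max_iter):
        f1, f2 = f(x, y)
        if abs(f1) < tol and abs(f2) < tol:
            return x, y
        # numerical Jacobian J = [[a, b], [c, d]]
        a = (f(x + h, y)[0] - f1) / h; b = (f(x, y + h)[0] - f1) / h
        c = (f(x + h, y)[1] - f2) / h; d = (f(x, y + h)[1] - f2) / h
        det = a * d - b * c
        if abs(det) < 1e-14:
            return None  # singular Jacobian: give up on this start
        # solve J * delta = -f by Cramer's rule
        x += (-f1 * d + f2 * b) / det
        y += (f1 * c - f2 * a) / det
    return None

# toy system with two real solutions: x + y = 3, x*y = 2
system = lambda x, y: (x + y - 3.0, x * y - 2.0)

solutions = set()
for x0 in (-2.0, 0.25, 0.75, 2.5, 4.0):
    for y0 in (-2.0, 0.25, 0.75, 2.5, 4.0):
        sol = newton_2x2(system, x0, y0)
        if sol is not None:
            solutions.add((round(sol[0], 6), round(sol[1], 6)))
print(sorted(solutions))
```

As with the `Random` restarts in the example, different starting points land on different solutions, and the distinct real ones are collected.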

We compared the results with `logm_iss_full` from [20]. For that comparison, we used a matrix test set of 43 $8\times 8$ matrices of the Matrix Computation Toolbox [18]. We reduced their norms so that they are random with a uniform distribution in $[0.2,{\theta }_{21+}]$, in order to compare the Padé approximations of `logm_iss_full` with the Taylor-based evaluation Formulas (56)–(58), using no inverse scaling in either approximation (see [11]). The error of the Taylor-based evaluation was lower than that of `logm_iss_full` in $100\%$ of the matrices, with a $19.61\%$ lower relative cost in flops. Therefore, evaluation Formulas (56)–(58) are efficient and accurate for a future Taylor-based implementation for computing the matrix logarithm.
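For intuition on what Formulas (56)–(58) evaluate: they are a rearrangement of the truncated Taylor series $\log (I-A)=-\sum_{k=1}^{m}{A}^{k}/k$, which acts on each eigenvalue of a diagonalizable matrix as the scalar series. A scalar Python check (ours, for illustration) of the order-21 truncation:

```python
import math

def log_taylor(a, m):
    """Order-m Taylor approximation of log(1 - a) = -sum_{k=1}^{m} a^k / k."""
    return -sum(a ** k / k for k in range(1, m + 1))

a = 0.2  # within the radius of convergence |a| < 1
approx = log_taylor(a, 21)
exact = math.log(1.0 - a)
print(abs(approx - exact))
```

The truncation error grows quickly as $|a|$ approaches 1, which is why implementations bound the norm (here to $[0.2,{\theta }_{21+}]$) before applying the approximation.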

#### 3.3. Evaluation of Matrix Polynomials of Degree $m=6s$

**Proposition 3.**

- a.
- $$\begin{array}{rcl}{c}_{4s-k}&=&{c}_{4s-k}({b}_{6s},{b}_{6s-1},\dots ,{b}_{6s-k}),\quad \text{for } k=0,1,\dots ,s-1,\\ {e}_{2s-k}&=&{e}_{2s-k}({b}_{6s},{b}_{6s-1},\dots ,{b}_{6s-k}),\quad \text{for } k=0,1,\dots ,s-1.\end{array}$$
- b.
- $${c}_{3s-k}={c}_{3s-k}({b}_{6s},{b}_{6s-1},\dots ,{b}_{5s-k},{e}_{s},\dots ,{e}_{s-k}),\quad k=0,\dots ,s-1.$$
- c.
- $${c}_{2s-k}={c}_{2s-k}({b}_{6s},\dots ,{b}_{4s-k},{e}_{s},\dots ,{e}_{1}),\quad k=0,\dots ,s-1.$$
- d.
- $${c}_{s-k}={c}_{s-k}({b}_{6s},\dots ,{b}_{3s-k},{e}_{s},\dots ,{e}_{1}),\quad k=0,\dots ,s-1.$$

**Proof of Proposition 3.**

- In the following, we show that (71) holds. Taking (16) and (17) into account, one gets $${y}_{1s}\left(A\right)={y}_{0s}^{2}\left(A\right)+q\left(A\right),$$ and equating the terms of degree $6s$ gives $${c}_{4s}{e}_{2s}={b}_{6s},\qquad {c}_{4s}\left(\pm \sqrt{{c}_{4s}}\right)={b}_{6s},$$ and therefore $${c}_{4s}=\sqrt[3]{{b}_{6s}^{2}}\ne 0,\qquad {e}_{2s}=\pm \sqrt[3]{{b}_{6s}}\ne 0.$$ Since condition (69) is fulfilled, then by (76), one gets that condition (18) is also fulfilled. Then, polynomial ${y}_{1s}\left(A\right)$ from (65) can be effectively computed using (16) and (17), and by (76) and (77), one gets that ${c}_{4s}$ and ${e}_{2s}$ depend on ${b}_{6s}$, i.e., $${c}_{4s}={c}_{4s}\left({b}_{6s}\right),\quad {e}_{2s}={e}_{2s}\left({b}_{6s}\right).$$ Equating the terms of degree $4s-1$ in (75), we obtain ${c}_{4s-1}=2{e}_{2s}{e}_{2s-1}$, and therefore $${e}_{2s-1}=\frac{{c}_{4s-1}}{2{e}_{2s}}.$$ Equating the terms of degree $4s-2$ in (75), we obtain $${c}_{4s-2}={e}_{2s}{e}_{2s-2}+{e}_{2s-1}^{2}+{e}_{2s-2}{e}_{2s},$$ $${e}_{2s-2}=\frac{{c}_{4s-2}-{e}_{2s-1}^{2}}{2{e}_{2s}}.$$ Equating the terms of degree $4s-3$ in (75), we obtain $${c}_{4s-3}={e}_{2s}{e}_{2s-3}+{e}_{2s-1}{e}_{2s-2}+{e}_{2s-2}{e}_{2s-1}+{e}_{2s-3}{e}_{2s},$$ $${e}_{2s-3}=\frac{{c}_{4s-3}-\left({e}_{2s-1}{e}_{2s-2}+{e}_{2s-2}{e}_{2s-1}\right)}{2{e}_{2s}}.$$ Equating the terms of degree $4s-4$ in (75), we obtain $${c}_{4s-4}={e}_{2s}{e}_{2s-4}+{e}_{2s-1}{e}_{2s-3}+{e}_{2s-2}{e}_{2s-2}+{e}_{2s-3}{e}_{2s-1}+{e}_{2s-4}{e}_{2s},$$ $${e}_{2s-4}=\frac{{c}_{4s-4}-{\displaystyle \sum _{i=1}^{3}}{e}_{2s-i}{e}_{2s+i-4}}{2{e}_{2s}}.$$ Proceeding in an analogous way with ${e}_{2s-k}$ for $k=5,6,\dots ,s-1$, we obtain $${e}_{2s-k}=\frac{{c}_{4s-k}-{\displaystyle \sum _{i=1}^{k-1}}{e}_{2s-i}{e}_{2s+i-k}}{2{e}_{2s}},\quad k=1,2,\dots ,s-1.$$ On the other hand, equating the terms of degree $6s-1$ in (66) and taking (79) into account, we obtain $${c}_{4s}{e}_{2s-1}+{c}_{4s-1}{e}_{2s}={c}_{4s-1}\left(\frac{{c}_{4s}}{2{e}_{2s}}+{e}_{2s}\right)={b}_{6s-1}.$$ Since $$\frac{{c}_{4s}}{2{e}_{2s}}+{e}_{2s}=\frac{{c}_{4s}+2{e}_{2s}^{2}}{2{e}_{2s}}=\frac{\sqrt[3]{{b}_{6s}^{2}}+2\sqrt[3]{{b}_{6s}^{2}}}{2\sqrt[3]{{b}_{6s}}}=\frac{3\sqrt[3]{{b}_{6s}}}{2}\ne 0,$$ it follows that $${c}_{4s-1}=\frac{2{b}_{6s-1}}{3\sqrt[3]{{b}_{6s}}}.$$ Taking (77), (79) and (83) into account, we obtain that ${c}_{4s-1}$ and ${e}_{2s-1}$ depend on ${b}_{6s}$ and ${b}_{6s-1}$, i.e., $${c}_{4s-1}={c}_{4s-1}({b}_{6s},{b}_{6s-1}),\quad {e}_{2s-1}={e}_{2s-1}({b}_{6s},{b}_{6s-1}).$$ Equating the terms of degree $6s-2$ in (66) and taking (82) into account, we obtain $${c}_{4s}{e}_{2s-2}+{c}_{4s-1}{e}_{2s-1}+{c}_{4s-2}{e}_{2s}={b}_{6s-2},$$ $${c}_{4s-2}=\frac{{b}_{6s-2}-{c}_{4s-1}{e}_{2s-1}+{\displaystyle \frac{{c}_{4s}{e}_{2s-1}^{2}}{2{e}_{2s}}}}{\frac{3\sqrt[3]{{b}_{6s}}}{2}}.$$ Proceeding analogously with the terms of degree $6s-k$ for $k=3,4,\dots ,s-1$, (71) follows.
- In the following, we show that (72) holds. Equating the terms of degree $5s$ in (66) and taking condition ${e}_{2s}\ne 0$ from (77) into account, we obtain $${c}_{4s}{e}_{s}+{c}_{4s-1}{e}_{s+1}+\dots +{c}_{3s+1}{e}_{2s-1}+{c}_{3s}{e}_{2s}={b}_{5s},$$ $${c}_{3s}=\frac{{b}_{5s}-\left({c}_{4s}{e}_{s}+{c}_{4s-1}{e}_{s+1}+\dots +{c}_{3s+1}{e}_{2s-1}\right)}{{e}_{2s}}.$$ Hence, taking (71) into account, it follows that $${c}_{3s}={c}_{3s}\left({b}_{6s},{b}_{6s-1},\dots ,{b}_{5s+1},{b}_{5s},{e}_{s}\right).$$ Equating the terms of degree $5s-1$ in (66) and taking condition ${e}_{2s}\ne 0$ from (77) into account, we obtain $${c}_{4s}{e}_{s-1}+{c}_{4s-1}{e}_{s}+\dots +{c}_{3s}{e}_{2s-1}+{c}_{3s-1}{e}_{2s}={b}_{5s-1},$$ $${c}_{3s-1}=\frac{{b}_{5s-1}-\left({c}_{4s}{e}_{s-1}+{c}_{4s-1}{e}_{s}+\dots +{c}_{3s}{e}_{2s-1}\right)}{{e}_{2s}}.$$ Hence, using (87), one gets $${c}_{3s-1}={c}_{3s-1}\left({b}_{6s},{b}_{6s-1},\dots ,{b}_{5s},{b}_{5s-1},{e}_{s},{e}_{s-1}\right).$$ Proceeding analogously for $k=2,3,\dots ,s-1$, (72) follows.
- In the following, we show that (73) holds. Equating the terms of degree $4s$ in (66) and taking condition ${e}_{2s}\ne 0$ from (77) into account, it follows that $${c}_{4s-1}{e}_{1}+{c}_{4s-2}{e}_{2}+\dots +{c}_{2s+1}{e}_{2s-1}+{c}_{2s}{e}_{2s}={b}_{4s},$$ $${c}_{2s}=\frac{{b}_{4s}-\left({c}_{4s-1}{e}_{1}+{c}_{4s-2}{e}_{2}+\dots +{c}_{2s+1}{e}_{2s-1}\right)}{{e}_{2s}}.$$ Taking (71) and (72) into account, we obtain $${c}_{2s}={c}_{2s}\left({b}_{6s},\dots ,{b}_{4s},{e}_{s},\dots ,{e}_{1}\right).$$ Equating the terms of degree $4s-1$ in (66) and taking condition ${e}_{2s}\ne 0$ into account, one gets $${c}_{4s-2}{e}_{1}+{c}_{4s-3}{e}_{2}+\dots +{c}_{2s}{e}_{2s-1}+{c}_{2s-1}{e}_{2s}={b}_{4s-1},$$ $${c}_{2s-1}=\frac{{b}_{4s-1}-\left({c}_{4s-2}{e}_{1}+{c}_{4s-3}{e}_{2}+\dots +{c}_{2s}{e}_{2s-1}\right)}{{e}_{2s}}.$$ Proceeding analogously for $k=2,3,\dots ,s-1$, (73) follows.
- In the following, we show that (74) holds. Equating the terms of degree $3s$ in (66) and taking condition ${e}_{2s}\ne 0$ into account, it follows that $${c}_{3s-1}{e}_{1}+{c}_{3s-2}{e}_{2}+\dots +{c}_{s+1}{e}_{2s-1}+{c}_{s}{e}_{2s}={b}_{3s},$$ $${c}_{s}=\frac{{b}_{3s}-\left({c}_{3s-1}{e}_{1}+{c}_{3s-2}{e}_{2}+\dots +{c}_{s+1}{e}_{2s-1}\right)}{{e}_{2s}}.$$ Hence, from (71)–(73), we obtain $${c}_{s}={c}_{s}\left({b}_{6s},\dots ,{b}_{3s},{e}_{s},\dots ,{e}_{1}\right).$$ Equating the terms of degree $3s-1$ in (66) and condition ${e}_{2s}\ne 0$, we obtain $${c}_{3s-2}{e}_{1}+{c}_{3s-3}{e}_{2}+\dots +{c}_{s}{e}_{2s-1}+{c}_{s-1}{e}_{2s}={b}_{3s-1},$$ $${c}_{s-1}=\frac{{b}_{3s-1}-\left({c}_{3s-2}{e}_{1}+{c}_{3s-3}{e}_{2}+\dots +{c}_{s}{e}_{2s-1}\right)}{{e}_{2s}}.$$ Proceeding analogously for $k=2,3,\dots ,s-1$, (74) follows. □
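The "equating terms of each degree" steps in the proof can be checked numerically: if ${c}_{4s},{c}_{4s-1},\dots$ are the coefficients of the square of a degree-$2s$ polynomial with coefficients ${e}_{j}$, then recurrence (81) recovers ${e}_{2s-1},\dots ,{e}_{s+1}$ from the top coefficients. A small Python sketch (ours; plain coefficient lists stand in for the symbolic computations):

```python
import random

random.seed(1)
s = 4
# polynomial q(x) = sum_j e[j] x^j of degree 2s, with e[2s] > 0
e = [random.uniform(0.5, 1.5) for _ in range(2 * s + 1)]

# coefficients of q(x)^2: c[d] = sum_{i+j=d} e[i]*e[j], degree 4s
c = [0.0] * (4 * s + 1)
for i in range(2 * s + 1):
    for j in range(2 * s + 1):
        c[i + j] += e[i] * e[j]

# recover the top coefficients of q from c via recurrence (81)
e2s = c[4 * s] ** 0.5  # e_{2s} = sqrt(c_{4s}), positive branch
rec = {2 * s: e2s}
for k in range(1, s):  # k = 1, ..., s-1
    acc = sum(rec[2 * s - i] * rec[2 * s + i - k] for i in range(1, k))
    rec[2 * s - k] = (c[4 * s - k] - acc) / (2 * e2s)

err = max(abs(rec[2 * s - k] - e[2 * s - k]) for k in range(s))
print(err)
```

Each ${e}_{2s-k}$ uses only coefficients recovered at earlier steps, mirroring how the proof peels off one degree at a time.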

**Corollary 1.**

**Proof of Corollary 1.**

- The Paterson–Stockmeyer evaluation formula.

**Example 3.**

We used `vpasolve`, giving a range $[-10,10]$ for the solutions of the three variables and using 32 decimal digits. The resulting coefficients from (91)–(93), rounded to IEEE double precision arithmetic, are given in Table 4.

We compared the results with `logm_iss_full` from [20] for the previous matrix set, computing the "exact" values of the matrix logarithm in the same way. The error of the evaluation Formulas (91)–(93) was lower than that of `logm_iss_full` in $97.62\%$ of the matrices, with a $42.40\%$ lower relative cost in flops, being competitive in efficiency and accuracy for future implementations for computing the matrix logarithm.

Using `vpasolve` in a similar way as in Example 2, we could find solutions for the coefficients of (94)–(96) and (19) so that ${y}_{2s}\left(A\right)$ and ${z}_{2ps}$ allow evaluating matrix logarithm Taylor-based approximations of orders from $15+$ up to $75+$. Similarly, we could also find the coefficients for Formulas (94)–(96) to evaluate matrix hyperbolic tangent Taylor approximations of orders higher than 21. Then, our next research step is to show that evaluation Formulas (94)–(96) and their combination with the Paterson–Stockmeyer method from (19) can be used for general polynomial approximations of matrix functions.

## 4. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Abbreviations


## References

1. Sastre, J. Efficient evaluation of matrix polynomials. Linear Algebra Appl. **2018**, 539, 229–250.
2. Paterson, M.S.; Stockmeyer, L.J. On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM J. Comput. **1973**, 2, 60–66.
3. Higham, N.J. Functions of Matrices: Theory and Computation; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2008.
4. Sastre, J.; Ibáñez, J.; Alonso, P.; Peinado, J.; Defez, E. Fast Taylor polynomial evaluation for the computation of the matrix cosine. J. Comput. Appl. Math. **2019**, 354, 641–650.
5. Sastre, J.; Ibáñez, J.; Defez, E. Boosting the computation of the matrix exponential. Appl. Math. Comput. **2019**, 340, 206–220.
6. Al-Mohy, A.H.; Higham, N.J. A new scaling and squaring algorithm for the matrix exponential. SIAM J. Matrix Anal. Appl. **2009**, 31, 970–989.
7. Al-Mohy, A.H.; Higham, N.J.; Relton, S. New Algorithms for Computing the Matrix Sine and Cosine Separately or Simultaneously. SIAM J. Sci. Comput. **2015**, 37, A456–A487.
8. Bader, P.; Blanes, S.; Casas, F. Computing the Matrix Exponential with an Optimized Taylor Polynomial Approximation. Mathematics **2019**, 7, 1174.
9. Bader, P.; Blanes, S.; Casas, F. An improved algorithm to compute the exponential of a matrix. arXiv **2017**, arXiv:1710.10989.
10. Sastre, J. On the Polynomial Approximation of Matrix Functions. Available online: http://personales.upv.es/~jorsasma/AMC-S-16-00951.pdf (accessed on 20 April 2020).
11. Al-Mohy, A.H.; Higham, N.J. Improved inverse scaling and squaring algorithms for the matrix logarithm. SIAM J. Sci. Comput. **2012**, 34, C153–C169.
12. Moler, C.B.; Van Loan, C.F. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev. **2003**, 45, 3–49.
13. Blackford, S.; Dongarra, J. Installation Guide for LAPACK, LAPACK Working Note 41. Available online: http://www.netlib.org/lapack/lawnspdf/lawn41.pdf (accessed on 20 April 2020).
14. Sastre, J. Efficient mixed rational and polynomial approximation of matrix functions. Appl. Math. Comput. **2012**, 218, 11938–11946.
15. Sastre, J.; Ibáñez, J.; Alonso, P.; Peinado, J.; Defez, E. Two algorithms for computing the matrix cosine function. Appl. Math. Comput. **2017**, 312, 66–77.
16. Fasi, M. Optimality of the Paterson–Stockmeyer method for evaluating matrix polynomials and rational matrix functions. Linear Algebra Appl. **2019**, 574, 182–200.
17. Ibáñez, J.; Alonso, J.M.; Sastre, J.; Defez, E.; Alonso-Jordá, P. Advances in the Approximation of the Matrix Hyperbolic Tangent. Mathematics **2021**, 9, 1219.
18. Higham, N.J. The Matrix Computation Toolbox. Available online: http://www.ma.man.ac.uk/~higham/mctoolbox (accessed on 18 April 2020).
19. Davies, E.B. Approximate diagonalization. SIAM J. Matrix Anal. Appl. **2007**, 29, 1051–1064.
20. Higham, N. Matrix Logarithm. 2020. Available online: https://www.mathworks.com/matlabcentral/fileexchange/33393-matrix-logarithm (accessed on 18 April 2020).

**Table 1.** Coefficients for computing the Taylor-based approximation of the matrix cosine of order $2m=34+$ using (39)–(41) and (36).

| Coefficient | Value | Coefficient | Value |
|---|---|---|---|
| ${q}_{4}$ | $3.571998478323090\times 10^{-11}$ | ${d}_{1}$ | $-2.645687940516643\times 10^{-3}$ |
| ${q}_{3}$ | $-1.857982456862233\times 10^{-8}$ | ${e}_{1}$ | $1.049722718717408\times 10^{1}$ |
| ${r}_{2}$ | $3.278753597700932\times 10^{-5}$ | ${e}_{0}$ | $8.965376033761624\times 10^{-4}$ |
| ${r}_{1}$ | $-1.148774768780758\times 10^{-2}$ | ${f}_{0}$ | $-1.859420533601965\times 10^{0}$ |
| ${s}_{2}$ | $-2.008741312156575\times 10^{-5}$ | ${g}_{0}$ | $1.493008139094410\times 10^{1}$ |
| ${s}_{0}$ | $1.737292932136998\times 10^{1}$ | ${h}_{2}$ | $1.570135323717639\times 10^{-4}$ |
| ${t}_{2}$ | $6.982819862335600\times 10^{-5}$ | ${h}_{1}$ | $-1/6!$ |
| ${d}_{2}$ | $-5.259287265295055\times 10^{-5}$ | ${h}_{0}$ | $1/4!$ |

**Table 2.** Coefficients of ${y}_{03}$, ${y}_{13}$, and ${y}_{23}$ from (56)–(58) for computing a Taylor-based approximation of the function $\log \left(B\right)=\log (I-A)$ of order $m=21+$.

| Coefficient | Value | Coefficient | Value |
|---|---|---|---|
| ${c}_{1}$ | $2.475376717210241\times 10^{-1}$ | ${c}_{11}$ | $-1.035631527011582\times 10^{-1}$ |
| ${c}_{2}$ | $2.440262449961976\times 10^{-1}$ | ${c}_{12}$ | $-3.416046999733390\times 10^{-1}$ |
| ${c}_{3}$ | $1.674278428631194\times 10^{-1}$ | ${c}_{13}$ | $4.544910328432021\times 10^{-2}$ |
| ${c}_{4}$ | $-9.742340743664729\times 10^{-2}$ | ${c}_{14}$ | $2.741820014945195\times 10^{-1}$ |
| ${c}_{5}$ | $-4.744919764579607\times 10^{-2}$ | ${c}_{15}$ | $-1.601466804001392\times 10^{0}$ |
| ${c}_{6}$ | $5.071515307996127\times 10^{-1}$ | ${c}_{16}$ | $1.681067607322385\times 10^{-1}$ |
| ${c}_{7}$ | $2.025389951302878\times 10^{-1}$ | ${c}_{17}$ | $7.526271076306975\times 10^{-1}$ |
| ${c}_{8}$ | $-4.809463272682823\times 10^{-2}$ | ${c}_{18}$ | $4.282509402345739\times 10^{-2}$ |
| ${c}_{9}$ | $6.574533191427105\times 10^{-1}$ | ${c}_{19}$ | $1.462562712251202\times 10^{-1}$ |
| ${c}_{10}$ | $3.236650728737168\times 10^{-1}$ | ${c}_{20}$ | $5.318525879522635\times 10^{-1}$ |

**Table 3.** Maximum available approximation order for a cost $C$ using the Paterson–Stockmeyer method, denoted by ${d}_{PS}$; maximum order using ${z}_{1ps}$ from (19), combining (16) and (17) with the Paterson–Stockmeyer method, denoted by ${d}_{{z}_{1s}}$; and maximum order using ${z}_{2ps}$ from (19), combining (66) with the Paterson–Stockmeyer method, denoted by ${d}_{{z}_{2s}}$, whenever a solution for the coefficients of ${z}_{2ps}$ exists. The parameters $s$ and $p$ for ${z}_{2ps}\left(x\right)$ are chosen such that $s$ is minimum for the required order, giving a system (89) of $s$ equations of minimum size.

| $\mathbf{C}\left(\mathbf{M}\right)$ | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ${d}_{PS}$ | 6 | 9 | 12 | 16 | 20 | 25 | 30 | 36 | 42 | 49 | 56 |
| ${d}_{{z}_{1s}}$ | 8 | 12 | 16 | 20 | 25 | 30 | 36 | 42 | 49 | 56 | 64 |
| ${d}_{{z}_{2s}}$ | – | 12 | 18 | 24 | 30 | 36 | 42 | 49 | 56 | 64 | 72 |
| ${s}_{{z}_{2s}}$ | – | 2 | 3 | 4 | 5 | 6 | 6 | 7 | 7 | 8 | 8 |
| ${p}_{{z}_{2s}}$ | – | 0 | 0 | 0 | 0 | 0 | 6 | 7 | 14 | 16 | 24 |
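The ${d}_{PS}$ row of Table 3 can be reproduced by brute force: with $s-1$ products for ${A}^{2},\dots ,{A}^{s}$ and $r$ further products in the block Horner scheme, the Paterson–Stockmeyer method reaches degree $s(r+1)$ at cost $C=s-1+r$, so the maximum degree for a given $C$ is $\max_{s}\,s(C-s+2)=\lfloor {(C+2)}^{2}/4\rfloor$. A quick Python check (ours, for illustration):

```python
def d_ps(C):
    """Maximum polynomial degree reachable with C matrix products via the
    Paterson-Stockmeyer method: degree s*(r+1) with C = s - 1 + r products."""
    return max(s * (C - s + 2) for s in range(1, C + 2))

degrees = [d_ps(C) for C in range(3, 14)]
print(degrees)  # -> [6, 9, 12, 16, 20, 25, 30, 36, 42, 49, 56]
```

This matches the ${d}_{PS}$ row exactly, while the ${d}_{{z}_{1s}}$ and ${d}_{{z}_{2s}}$ rows show the higher degrees reached by the formulas of this paper at the same cost.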

**Table 4.** Coefficients of ${y}_{05}$, ${y}_{15}$, ${y}_{25}$ from (91)–(93) for computing the Taylor approximation of $\log \left(B\right)=\log (I-A)=-{y}_{25}\left(A\right)$ of order $m=30$.

| Coefficient | Value | Coefficient | Value |
|---|---|---|---|
| ${c}_{1}$ | $3.218297948685432\times 10^{-1}$ | ${c}_{16}$ | $2.231079274704953\times 10^{-1}$ |
| ${c}_{2}$ | $1.109757913339804\times 10^{-1}$ | ${c}_{17}$ | $3.891001336083639\times 10^{-1}$ |
| ${c}_{3}$ | $7.667169819995447\times 10^{-2}$ | ${c}_{18}$ | $6.539646241763075\times 10^{-1}$ |
| ${c}_{4}$ | $6.192062222365700\times 10^{-2}$ | ${c}_{19}$ | $8.543283349051067\times 10^{-1}$ |
| ${c}_{5}$ | $5.369406358130299\times 10^{-2}$ | ${c}_{20}$ | $-1.642222074981266\times 10^{-2}$ |
| ${c}_{6}$ | $2.156719633283115\times 10^{-1}$ | ${c}_{21}$ | $6.179507508449100\times 10^{-2}$ |
| ${c}_{7}$ | $-2.827270631646985\times 10^{-2}$ | ${c}_{22}$ | $3.176715034213954\times 10^{-2}$ |
| ${c}_{8}$ | $-1.299375958233227\times 10^{-1}$ | ${c}_{23}$ | $8.655952402393143\times 10^{-2}$ |
| ${c}_{9}$ | $-3.345609833413695\times 10^{-1}$ | ${c}_{24}$ | $3.035900161106295\times 10^{-1}$ |
| ${c}_{10}$ | $-8.193390302418316\times 10^{-1}$ | ${c}_{25}$ | $9.404049154527467\times 10^{-1}$ |
| ${c}_{11}$ | $-1.318571680058333\times 10^{-1}$ | ${c}_{26}$ | $-2.182842624594848\times 10^{-1}$ |
| ${c}_{12}$ | $1.318536866523954\times 10^{-1}$ | ${c}_{27}$ | $-5.036471128390267\times 10^{-1}$ |
| ${c}_{13}$ | $1.718006767617093\times 10^{-1}$ | ${c}_{28}$ | $-4.650956099599815\times 10^{-1}$ |
| ${c}_{14}$ | $1.548174815648151\times 10^{-1}$ | ${c}_{29}$ | $5.154435371157740\times 10^{-1}$ |
| ${c}_{15}$ | $2.139947460365092\times 10^{-1}$ | ${c}_{30}$ | $1$ |

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Sastre, J.; Ibáñez, J.
Efficient Evaluation of Matrix Polynomials beyond the Paterson–Stockmeyer Method. *Mathematics* **2021**, *9*, 1600.
https://doi.org/10.3390/math9141600
