Accurate Goertzel Algorithm: Error Analysis, Validations and Applications

Li, Chuanying; Du, Peibing; Li, Kuan; Liu, Yu; Jiang, Hao; Quan, Zhe

doi:10.3390/math10111788

by Chuanying Li ¹, Peibing Du ²,*, Kuan Li ³, Yu Liu ², Hao Jiang ⁴ and Zhe Quan ¹

¹ College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China

² Northwest Institute of Nuclear Technology, Xi’an 710024, China

³ School of Cyberspace Security, Dongguan University of Technology, Dongguan 523106, China

⁴ College of Computer, National University of Defense Technology, Changsha 410073, China

* Author to whom correspondence should be addressed.

Mathematics 2022, 10(11), 1788; https://doi.org/10.3390/math10111788

Submission received: 13 April 2022 / Revised: 4 May 2022 / Accepted: 18 May 2022 / Published: 24 May 2022

(This article belongs to the Special Issue Numerical Analysis and Scientific Computing II)

Abstract

The Horner and Goertzel algorithms are frequently used in polynomial evaluation. Each of them can be less expensive than the other in special cases. In this paper, we present a new compensated algorithm to improve the accuracy of the Goertzel algorithm by using error-free transformations. We derive the forward round-off error bound for our algorithm, which implies that our algorithm yields a full precision accuracy for polynomials that are not too ill-conditioned. A dynamic error estimate in our algorithm is also presented by running round-off error analysis. Moreover, we show the cases in which our algorithms are less expensive than the compensated Horner algorithm for evaluating polynomials. Numerical experiments indicate that our algorithms run faster than the compensated Horner algorithm in those cases while producing the same accurate results, and our algorithm is absolutely stable when the condition number is smaller than $10^{16}$. An application is given to illustrate that our algorithm is more accurate than MATLAB’s $fft$ function. The results show that the relative error of our algorithm is from $10^{15}$ to $10^{17}$, and that of the $fft$ was from $10^{12}$ to $10^{15}$.

Keywords:

polynomial evaluation; Goertzel algorithm; round-off error; error-free transformation; compensated algorithm; numerical stability

MSC:

68U01

1. Introduction

Polynomial evaluation is ubiquitous in computational science and its applications, such as interpolation, approximation and signal processing. This article investigates polynomial evaluation in the general complex setting:

ω (z) = \sum_{n = 0}^{N} a_{n} z^{n},

(1)

where $z, a_{0}, a_{1}, \dots, a_{N} \in C$. Nested-type algorithms are usually used to evaluate polynomials, and the Horner algorithm is the most widely used of them [1]. In special cases, such as $z \in C$ and $a_{0}, a_{1}, \dots, a_{N} \in R$, the Goertzel algorithm, which can be applied to compute the discrete Fourier transform (DFT) at specific indices of a vector [2,3], is less expensive than the Horner algorithm. The numerical stability of the Horner and Goertzel algorithms was analyzed by Wilkinson [4] and Smoktunowicz [5]. Because of round-off errors in floating-point arithmetic, the results computed by these algorithms can be arbitrarily less accurate than the working precision $u$ when the polynomial is ill-conditioned. The relative accuracy of these algorithms satisfies the following a priori bound:

\frac{| ω (z) - \hat{ω} (z) |}{| ω (z) |} \leq cond (ω, z) \times O (u),

(2)

where $\hat{ω}(z)$ is the computed result and $cond(ω, z) = \sum_{n = 0}^{N} |a_{n}| |z|^{n} / | \sum_{n = 0}^{N} a_{n} z^{n} |$ is the condition number.
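As a small illustration of the condition number, the sketch below evaluates $cond(ω, z)$ exactly with rational arithmetic. It assumes real coefficients and a real evaluation point for simplicity (the paper allows complex data); the function name `cond` and the test polynomial $(x - 1)^5$ are illustrative choices, not taken from the paper.

```python
from fractions import Fraction

def cond(coeffs, z):
    """cond(w, z) = sum |a_n| |z|^n / |sum a_n z^n| for real a_n and real z.

    coeffs[n] is a_n. Exact rational arithmetic is used so that the value
    itself is not polluted by round-off (an assumption of this sketch).
    """
    z = Fraction(z)
    num = sum(abs(Fraction(a)) * abs(z) ** n for n, a in enumerate(coeffs))
    den = abs(sum(Fraction(a) * z ** n for n, a in enumerate(coeffs)))
    return num / den

# (x - 1)^5 in ascending powers, evaluated close to its multiple root x = 1.
coeffs = [-1, 5, -10, 10, -5, 1]
c = cond(coeffs, Fraction(101, 100))
print(float(c))
```

Evaluating close to the multiple root drives the denominator toward zero while the numerator stays of moderate size, which is why the condition number becomes large there.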

In order to improve on the accuracy of double precision, Bailey [6] proposed a well-known library for double-double and quad-double arithmetic. However, this library needs to normalize floating-point numbers in every operation, which hampers instruction-level parallelism [7,8]. Compensated algorithms, built on error-free transformations [9], were developed to address this problem. The relative accuracy of compensated algorithms satisfies the following a priori bound:

\frac{| ω (z) - \bar{ω} (z) |}{| ω (z) |} \leq u + cond (ω, z) \times O (u^{2}),

(3)

where $\bar{ω}(z)$ is the computed result of a compensated algorithm.

Recently, compensated algorithms have been widely studied in evaluating polynomials. Graillat [10] proposed a compensated Horner algorithm that achieves full precision accuracy for polynomials that are not excessively ill-conditioned. In addition, he extended the error-free transformations and the compensated Horner algorithm to complex floating-point arithmetic [11,12] and applied a compensated Horner algorithm to evaluate rational functions [13] and to find all polynomial roots [14]. Polynomial series represented in other bases were also considered, such as the Chebyshev form evaluated by a compensated Clenshaw algorithm [15], the Bernstein form evaluated by a compensated de Casteljau algorithm [16], and a compensated Volk and Schumaker (VS) algorithm [17]. Furthermore, the compensation idea has also been applied to matrix multiplication to obtain more accurate results [18,19,20].

With the wide application of floating-point numbers and floating-point operations in numerical computing, the analysis of rounding errors has become the focus [21,22]. Running round-off errors are analyzed and applied to many algorithms of polynomial evaluation [23]. Delgado [17] proposed an adaptive evaluation algorithm by using the de Casteljau algorithm and compensated VS algorithms with a dynamic error estimate. Jiang [24] presented running round-off error analysis for evaluating elementary symmetric functions in real and complex floating-point arithmetic. Barrio [25] developed a more complete compensated algorithm library to evaluate orthogonal polynomial series with dynamic error estimates. In addition, error analysis can also be used for machine learning and the numerical solution of differential equations [26].

In this paper, our contributions are as follows:

We design a new compensated Goertzel algorithm and prove that our algorithm can almost yield full working precision to evaluate polynomials (1);
We propose dynamic error estimates, which can offer a sharper bound for our approach without considerably increasing its computing complexity;
Numerical experiments show that our algorithm runs faster than the compensated Horner algorithm in some cases while keeping a similar precision accuracy;
An application is given to illustrate that our algorithm outperforms MATLAB’s $fft$ when dealing with the DFT.

The rest of this paper is organized as follows. Section 2 introduces our compensated Goertzel algorithm. A dynamic error estimate is proposed in Section 3. Section 4 analyzes the numerical experiment results and gives an application to illustrate that our algorithm outperforms MATLAB’s $fft$. Finally, the paper is summarized in Section 5.

2. Goertzel Compensated Algorithm

Throughout this paper, we assume the IEEE-754 floating-point standard [27] with rounding to nearest. Let $F$ be the set of floating-point numbers, $C$ the set of complex numbers, and $\{\oplus, ⊖, \otimes, ⊘\}$ the floating-point operations. This section presents the design of the compensated Goertzel algorithm. First, the Goertzel algorithm and its relationship with the Clenshaw algorithm are described. Then, the error-free transformations and the sum of squares algorithm are recalled. Finally, we present the compensated Goertzel algorithm.

2.1. Goertzel Algorithm

Let $λ, z \in C$ with $z = x + i y$. Then we have the quadratic polynomial

(λ - z) (λ - \bar{z}) = λ^{2} - p λ + q,

(4)

where $p = 2 x$ and $q = |z|^{2}$. By dividing the polynomial in Equation (1) by that in Equation (4), we obtain

ω (λ) = b_{0} + b_{1} λ + (λ - z) (λ - \bar{z}) \sum_{n = 2}^{N} b_{n} λ^{n - 2},

(5)

where

\begin{cases} a_{0} = b_{0} + q b_{2}, \\ a_{1} = b_{1} - p b_{2} + q b_{3}, \\ \quad \vdots \\ a_{n} = b_{n} - p b_{n + 1} + q b_{n + 2} . \end{cases}

(6)

Thus, the evaluated result of the polynomial in Equation (1) is

ω (z) = b_{0} + b_{1} z .

(7)

From Equations (5)–(7), we obtain the Goertzel algorithm [2], which is given as Algorithm 1.

Algorithm 1 Polynomial evaluation by Goertzel algorithm

Function: $ω(z) = Goertzel((a_{n})_{n = 0}^{N}, z)$

Require: $z = x + i y \in C$, $(a_{n})_{n = 0}^{N} \in C$

Ensure: $ω(z) = \sum_{n = 0}^{N} a_{n} z^{n}$

p = 2 x, q = x^{2} + y^{2}
b_{N + 1} = b_{N + 2} = 0
for n = N, N - 1, \dots, 1
    b_{n} = a_{n} + p b_{n + 1} - q b_{n + 2}
end
b_{0} = a_{0} + x b_{1} - q b_{2}
ω (z) = b_{0} + i y b_{1}
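The recurrence of Algorithm 1 can be sketched in a few lines of Python. This is a minimal sketch in ordinary double precision, assuming the coefficients are given in ascending order as a list; the function name `goertzel` is an illustrative choice.

```python
def goertzel(a, z):
    """Evaluate sum(a[n] * z**n) with the Goertzel recurrence of Algorithm 1.

    a[n] is the coefficient a_n (real or complex), z is a complex number.
    """
    x, y = z.real, z.imag
    p = 2.0 * x
    q = x * x + y * y
    b1 = b2 = 0.0                       # hold b_{n+1} and b_{n+2}
    for n in range(len(a) - 1, 0, -1):  # n = N, N-1, ..., 1
        b1, b2 = a[n] + p * b1 - q * b2, b1
    b0 = a[0] + x * b1 - q * b2         # last step uses x instead of p
    return b0 + 1j * y * b1

# 1 + 2z + 3z^2 at z = 1 + i
w = goertzel([1.0, 2.0, 3.0], 1 + 1j)
print(w)
```

When the coefficients are real, every $b_n$ stays real, so the loop needs only real arithmetic even though $z$ is complex; this is the source of the cost advantage over Horner mentioned in the introduction.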

In floating-point arithmetic, a backward error bound for the computed result of the polynomial evaluation by Algorithm 1 is presented by Smoktunowicz [5] as Theorem 1:

Theorem 1.

Assume $z, a_{n} \in F + i F$ for $n = 0, \dots, N$, and let $10 N^{2} u \leq 0.1$. Then the Goertzel algorithm for evaluating the polynomial in Equation (1) is componentwise backward stable such that

\hat{ω} (z) = \sum_{n = 0}^{N} a_{n} (1 + Δ_{n}) z^{n},

(8)

where

| Δ_{n} | \leq 10 N^{2} u + O (u^{2}) .

(9)

In fact, Algorithm 1 is a special case of the Clenshaw algorithm [2,5]. When we let $t = x / |z|$ and $B_{n} = b_{n} |z|^{n}$, Algorithm 1 can be represented in Clenshaw form [28]:

B_{n} = a_{n} |z|^{n} + 2 t B_{n + 1} - B_{n + 2} .

(10)

According to the properties of the Chebyshev polynomial series [29] evaluated by the Clenshaw algorithm, we have

\begin{cases} b_{n} = \sum_{k = n}^{N} a_{k} |z|^{k - n} U_{k - n} (t), & n = 1, 2, \dots, N, \\ b_{0} = \sum_{k = 0}^{N} a_{k} |z|^{k} T_{k} (t), \end{cases}

(11)

where $T_{k} (t)$ and $U_{k} (t)$ are Chebyshev polynomials of the first and second kinds, respectively. They satisfy

| T_{k} (t) | \leq 1, \quad | U_{k} (t) | \leq k + 1,

(12)

and

\frac{z^{k}}{|z|^{k}} = T_{k} (t) + i \frac{y}{|z|} U_{k - 1} (t) \quad \text{for } k = 0, 1, \dots, N .

(13)

2.2. Error-Free Transformations and Sum of Squares Algorithm

The basic algorithms of error-free transformations are TwoSum and TwoProd, which were presented by Knuth [30] and Dekker [31], respectively. Graillat [11] extended the addition and multiplication to the complex case, giving TwoSumCplx and TwoProdCplx, respectively. In this paper, we shall use a new error-free transformation for the product of one real and one complex floating-point number, which is called TwoProdRC and given in Algorithm 2.

Algorithm 2 Error-free transformation of the product of real and complex floating-point numbers

Function: $[x, y] = TwoProdRC(a, b)$

Require: $a \in F$, $b = c + i d \in F + i F$

Ensure: $x + y = a \times b$

[p, e] = TwoProd (a, c)
[f, g] = TwoProd (a, d)
x = p + i f
y = e + i g

The details of the error-free transformations above are presented in Table 1.

Furthermore, the sum of squares algorithm [32] is given in Algorithm 3. It requires 42 flops.

Algorithm 3 Sum of squares by two floating-point numbers

Function: $[x, y] = SumOfSquares(a, b)$

Require: $a, b \in F$

Ensure: $x + y \approx a^{2} + b^{2}$

[p, f] = TwoProd (a, a)
[e, g] = TwoProd (b, b)
[x, h] = TwoSum (p, e)
y = f \oplus g \oplus h
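The building blocks of this subsection can be sketched in Python as follows. The sketch assumes IEEE-754 binary64 floats with rounding to nearest and no overflow or underflow; TwoProd is written with Dekker's splitting (an FMA-based variant would also work), and the snake_case function names are illustrative. Exact rational arithmetic from `fractions` is used only to check the results.

```python
from fractions import Fraction

def two_sum(a, b):
    """Knuth's TwoSum: x = fl(a + b) and x + y = a + b exactly."""
    x = a + b
    t = x - a
    y = (a - (x - t)) + (b - t)
    return x, y

def split(a):
    """Veltkamp splitting of a double into high and low halves."""
    c = 134217729.0 * a  # factor 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    """Dekker's TwoProd: x = fl(a * b) and x + y = a * b exactly."""
    x = a * b
    ah, al = split(a)
    bh, bl = split(b)
    y = al * bl - (((x - ah * bh) - al * bh) - ah * bl)
    return x, y

def two_prod_rc(a, b):
    """Algorithm 2: real a times complex b, as a (result, error) pair."""
    p, e = two_prod(a, b.real)
    f, g = two_prod(a, b.imag)
    return complex(p, f), complex(e, g)

def sum_of_squares(a, b):
    """Algorithm 3: x + y approximates a**2 + b**2 to about twice the precision."""
    p, f = two_prod(a, a)
    e, g = two_prod(b, b)
    x, h = two_sum(p, e)
    return x, f + g + h

# Check against exact rational arithmetic.
q, eq = sum_of_squares(0.1, 0.3)
exact_q = Fraction(0.1) ** 2 + Fraction(0.3) ** 2
print(float(abs(Fraction(q) + Fraction(eq) - exact_q) / exact_q))
```

Note that TwoSum and TwoProd return an exact error term, whereas the second output of SumOfSquares is itself rounded, which is why Algorithm 3 states $x + y \approx a^2 + b^2$ rather than equality.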

2.3. Compensated Goertzel Algorithm

Although Algorithm 1 is a special case of the Clenshaw algorithm, computation via Equation (10) is more expensive. Thus, deriving the compensated Goertzel algorithm from the compensated Clenshaw algorithm [28] is not a good idea. Instead, we design a compensated Goertzel algorithm that uses the SumOfSquares, TwoSumCplx and TwoProdRC algorithms to record the round-off errors.

Assume $\hat{a} \in F$ is a computed result in floating-point arithmetic, and its perturbation is $ϵa$ such that

\hat{a} = a + ϵ a .

(14)

SumOfSquares yields the approximate round-off error $\hat{ϵq}$ of $q$. The other round-off errors in Algorithm 1 can be computed exactly by the error-free transformations TwoSumCplx and TwoProdRC. Then, in each loop iteration, we combine all round-off errors into $\hat{ℓ}_{n}$ and obtain the approximate perturbation $\hat{ϵb}_{n}$ of $b_{n}$ for $n = N - 1, N - 2, \dots, 1$. After the loop, TwoProdRC and TwoSumCplx are used to calculate $\hat{b}_{0}$, and all round-off errors are again combined to obtain its approximate perturbation. The round-off error of $y \otimes \hat{b}_{1}$ is also captured by TwoProdRC. The compensated Goertzel algorithm is presented in Algorithm 4, and Figure 1 shows the flow chart of this algorithm.

Algorithm 4 Polynomial evaluation by compensated Goertzel algorithm

Function: $\bar{ω}(z) = CompGoertzel((a_{n})_{n = 0}^{N}, z)$

Require: $z = x + i y \in F + i F$, $(a_{n})_{n = 0}^{N} \in F + i F$

Ensure: $\bar{ω}(z) \approx \sum_{n = 0}^{N} a_{n} z^{n}$

p = 2 x
[\hat{q}, \hat{ϵq}] = SumOfSquares (x, y)
\hat{b}_{N} = a_{N}, \hat{b}_{N + 1} = \hat{ϵb}_{N} = \hat{ϵb}_{N + 1} = 0
for n = N - 1, N - 2, \dots, 1
    [r_{n}, π_{n}] = TwoProdRC (p, \hat{b}_{n + 1})
    [s_{n}, σ_{n}] = TwoProdRC (- \hat{q}, \hat{b}_{n + 2})
    [t_{n}, η_{n}] = TwoSumCplx (r_{n}, s_{n})
    [\hat{b}_{n}, ξ_{n}] = TwoSumCplx (t_{n}, a_{n})
    \hat{ℓ}_{n} = π_{n} \oplus σ_{n} \oplus η_{n} \oplus ξ_{n} ⊖ \hat{ϵq} \otimes \hat{b}_{n + 2}
    \hat{ϵb}_{n} = \hat{ℓ}_{n} \oplus p \otimes \hat{ϵb}_{n + 1} ⊖ \hat{q} \otimes \hat{ϵb}_{n + 2}
end
[r_{0}, π_{0}] = TwoProdRC (x, \hat{b}_{1})
[s_{0}, σ_{0}] = TwoProdRC (- \hat{q}, \hat{b}_{2})
[t_{0}, η_{0}] = TwoSumCplx (r_{0}, s_{0})
[\hat{b}_{0}, ξ_{0}] = TwoSumCplx (t_{0}, a_{0})
\hat{ℓ}_{0} = π_{0} \oplus σ_{0} \oplus η_{0} \oplus ξ_{0} ⊖ \hat{ϵq} \otimes \hat{b}_{2}
\hat{ϵb}_{0} = \hat{ℓ}_{0} \oplus x \otimes \hat{ϵb}_{1} ⊖ \hat{q} \otimes \hat{ϵb}_{2}
[ϕ, ψ] = TwoProdRC (y, \hat{b}_{1})
\hat{e} = \hat{ϵb}_{0} \oplus i (\hat{ϵb}_{1} \otimes y \oplus ψ)
\hat{ω} (z) = \hat{b}_{0} \oplus i ϕ
\bar{ω} (z) = \hat{ω} (z) \oplus \hat{e}

We remark that if $(a_{n})_{n = 0}^{N} \in F$, then we shall replace TwoSumCplx and TwoProdRC with TwoSum and TwoProd in Algorithm 4, respectively.
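To make the data flow of Algorithm 4 concrete, here is a minimal Python sketch that follows its steps line by line. It assumes binary64 arithmetic with rounding to nearest and no overflow or underflow, uses Dekker's splitting for TwoProd, and always works with complex values (it does not apply the real-coefficient simplification of the remark above). The function names and the test polynomial $(λ - 1)^6$ are illustrative; the helper routines are repeated in compact form so that the block runs on its own.

```python
from fractions import Fraction

def two_sum(a, b):
    x = a + b
    t = x - a
    return x, (a - (x - t)) + (b - t)

def two_prod(a, b):
    def split(v):  # Veltkamp splitting with factor 2**27 + 1
        c = 134217729.0 * v
        h = c - (c - v)
        return h, v - h
    x = a * b
    ah, al = split(a)
    bh, bl = split(b)
    return x, al * bl - (((x - ah * bh) - al * bh) - ah * bl)

def two_sum_cplx(a, b):  # componentwise TwoSum
    re, e_re = two_sum(a.real, b.real)
    im, e_im = two_sum(a.imag, b.imag)
    return complex(re, im), complex(e_re, e_im)

def two_prod_rc(a, b):  # Algorithm 2
    re, e_re = two_prod(a, b.real)
    im, e_im = two_prod(a, b.imag)
    return complex(re, im), complex(e_re, e_im)

def sum_of_squares(a, b):  # Algorithm 3
    p, f = two_prod(a, a)
    e, g = two_prod(b, b)
    x, h = two_sum(p, e)
    return x, f + g + h

def step(m, b1, b2, a_n, q, eq, eb1, eb2):
    """One recurrence step with multiplier m (p in the loop, x at the end)."""
    r, pi = two_prod_rc(m, b1)
    s, sigma = two_prod_rc(-q, b2)
    t, eta = two_sum_cplx(r, s)
    b0, xi = two_sum_cplx(t, a_n)
    ell = pi + sigma + eta + xi - eq * b2   # collected round-off errors
    eb0 = ell + m * eb1 - q * eb2           # propagated perturbation
    return b0, eb0

def comp_goertzel(a, z):
    """Compensated Goertzel evaluation of sum(a[n] * z**n), after Algorithm 4."""
    a = [complex(c) for c in a]
    if len(a) == 1:                # degree 0: nothing to compensate
        return a[0]
    x, y = z.real, z.imag
    p = 2.0 * x
    q, eq = sum_of_squares(x, y)
    b1, b2 = a[-1], 0j             # b_N and b_{N+1}
    eb1 = eb2 = 0j
    for n in range(len(a) - 2, 0, -1):      # n = N-1, ..., 1
        b0, eb0 = step(p, b1, b2, a[n], q, eq, eb1, eb2)
        b1, b2, eb1, eb2 = b0, b1, eb0, eb1
    b0, eb0 = step(x, b1, b2, a[0], q, eq, eb1, eb2)
    phi, psi = two_prod_rc(y, b1)
    e_hat = eb0 + 1j * (eb1 * y + psi)
    w_hat = b0 + 1j * phi
    return w_hat + e_hat

# (lambda - 1)^6 in ascending powers; at z = 1 + iy the exact value is -y**6.
coeffs = [1, -6, 15, -20, 15, -6, 1]
w = comp_goertzel(coeffs, complex(1.0, 0.1))
exact = -Fraction(0.1) ** 6
print(w, float(exact))
```

The test point lies close to the sixfold root at 1, so the polynomial is ill-conditioned there, while the reference value $-y^6$ is still known exactly for the floating-point input $y$.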

3. Round-Off Error and Complexity Analysis

In this section, we consider the error bound and complexity of Algorithm 4, following Higham’s framework [33]. In our analysis, we assume that no overflow or underflow occurs. First, we present the a priori bound obtained by forward round-off error analysis. Then, we derive a dynamic error estimate by running round-off error analysis. Finally, we compare the complexities of the Horner and Goertzel algorithms, their compensated versions, and the compensated Goertzel algorithm with a dynamic error estimate for evaluating the polynomial in Equation (1) with real and complex coefficients.

3.1. Forward Round-Off Error Analysis

Let $⧫ \in \{\oplus, ⊖, \otimes, ⊘\}$, $\circ \in \{+, -, \times, \div\}$ and $a, b \in F$. Then, a floating-point computation obeys the model

a ⧫ b = (a \circ b) (1 + ε_{1}) = \frac{a \circ b}{1 + ε_{2}},

(15)

where $| ε_{1} |, | ε_{2} | \leq u$. We define

\langle N \rangle := 1 + θ_{N} = \prod_{n = 1}^{N} (1 + ε_{n})^{ρ_{n}},

(16)

where $| ε_{n} | \leq u$, $ρ_{n} = \pm 1$ and

| θ_{n} | \leq γ_{n} := \frac{n u}{1 - n u},

(17)

for $n = 1, 2, \dots, N$ and $N u < 1$. For computed quantities $\hat{a}$ and $\hat{b}$, we denote by $\tilde{c} = \hat{a} \circ \hat{b}$ the result of the corresponding operation carried out in exact (real) arithmetic; this notation will be used in our later analysis.
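The model in Equations (15)–(17) can be checked numerically on a small example. The sketch below assumes binary64 arithmetic with rounding to nearest, so that $u = 2^{-53}$; the sample values and the helper names `gamma` and `rel_err` are illustrative, and exact rational arithmetic is used to measure the errors.

```python
from fractions import Fraction

U = Fraction(1, 2**53)  # unit roundoff u of binary64 with round to nearest

def gamma(n):
    """gamma_n = n*u / (1 - n*u) from Equation (17), as an exact rational."""
    assert n * U < 1
    return n * U / (1 - n * U)

def rel_err(computed, exact):
    """Exact relative error of a floating-point result."""
    return abs(Fraction(computed) - exact) / abs(exact)

a, b, c, d = 0.1, 0.3, 0.7, 0.9

# One rounded operation: Equation (15) bounds the relative error by u.
err_one_op = rel_err(a + b, Fraction(a) + Fraction(b))

# Three rounded multiplications: the accumulated factor is <3> = 1 + theta_3.
exact_prod = Fraction(a) * Fraction(b) * Fraction(c) * Fraction(d)
err_three_ops = rel_err(((a * b) * c) * d, exact_prod)

print(float(err_one_op), float(U))
print(float(err_three_ops), float(gamma(3)))
```

The counter notation $\langle k \rangle$ used in the proofs simply records how many such rounding factors have accumulated on a term, so that each term can be bounded by $γ_k$.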

Lemma 1 summarizes the properties of the error-free transformations in Table 1:

Lemma 1.

For $a, b, x, y \in F$, $[x, y] = TwoSum(a, b)$ satisfies

| y | \leq u | x |, \quad | y | \leq u | a + b | .

(18)

In addition, $[x, y] = TwoProd(a, b)$ satisfies

| y | \leq u | x |, \quad | y | \leq u | a \times b | .

(19)

For $a, b, x, y \in F + i F$, $[x, y] = TwoSumCplx(a, b)$ satisfies

| y | \leq u | x |, \quad | y | \leq u | a + b | .

(20)

Additionally, $[x, y] = TwoProdRC(a, b)$ satisfies

| y | \leq u | x |, \quad | y | \leq u | a \times b | .

(21)

For $a, b, p, e, f, g \in F + i F$, $[p, e, f, g] = TwoProdCplx(a, b)$ satisfies

| e + f + g | \leq \sqrt{2} γ_{2} | a \times b | .

(22)

Proof.

Equations (18)–(22) are presented in [9,11]. In Algorithm 2, TwoProdRC applies TwoProd to the real and imaginary parts separately, so Equation (21) follows from Equation (19). □

Lemma 2 shows the property of Algorithm 3:

Lemma 2.

For $a, b, x, y \in F$, $[x, y] = SumOfSquares(a, b)$ satisfies

\begin{aligned} | x | &\leq (1 + γ_{2}) | a^{2} + b^{2} |, \\ | y | &\leq γ_{2} | x |, \\ | x + y - (a^{2} + b^{2}) | &\leq u γ_{3} | a^{2} + b^{2} | . \end{aligned}

(23)

Proof.

According to Algorithm 3 and Table 1, $x = a \otimes a \oplus b \otimes b$. With Lemma 1, we obtain $| f | \leq u | p |$, $| g | \leq u | e |$ and $| h | \leq u | x |$. From Equations (15)–(17), we have

\begin{aligned} | x | &= | a^{2} \langle 2 \rangle + b^{2} \langle 2 \rangle | \leq (1 + γ_{2}) | a^{2} + b^{2} |, \\ | y | &= | f \langle 1 \rangle + g \langle 2 \rangle + h \langle 1 \rangle | \leq (1 + γ_{2}) (| f | + | g | + | h |) \\ &\leq u (1 + γ_{2}) (| p | + | e | + | x |) \leq 2 u (1 + γ_{2}) | x | = γ_{2} | x | . \end{aligned}

(24)

From Theorem 4.2 in [32], $| x + y - (a^{2} + b^{2}) | \leq \frac{3 u^{2}}{1 - 3 u^{2}} | a^{2} + b^{2} | \leq u γ_{3} | a^{2} + b^{2} |$. □

Theorem 2 presents an a priori error bound of Algorithm 4 for evaluating the polynomial in Equation (1) in complex floating-point arithmetic.

Theorem 2.

Assume $z, a_{n} \in F + i F$ for $n = 0, \dots, N$. Then, the relative forward round-off error of the compensated Goertzel algorithm for evaluating $ω(z) = \sum_{n = 0}^{N} a_{n} z^{n}$ in floating-point arithmetic satisfies

\frac{| CompGoertzel ((a_{n})_{n = 0}^{N}, z) - ω (z) |}{| ω (z) |} \leq u + 3 N^{2} γ_{15} γ_{3 N + 1} cond (ω, z) .

(25)
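To get a feeling for the size of the bound in Equation (25), the sketch below evaluates its right-hand side for a given degree and condition number. It assumes binary64 arithmetic ($u = 2^{-53}$) and computes the bound itself in ordinary floating point, which is accurate enough for an order-of-magnitude picture; the function names and the sample values of $N$ and the condition number are illustrative.

```python
U = 2.0 ** -53  # unit roundoff of binary64 with round to nearest

def gamma(n, u=U):
    """gamma_n = n*u / (1 - n*u), Equation (17)."""
    return n * u / (1.0 - n * u)

def comp_goertzel_bound(N, cond, u=U):
    """Right-hand side of Equation (25): u + 3 N^2 gamma_15 gamma_{3N+1} cond."""
    return u + 3 * N**2 * gamma(15, u) * gamma(3 * N + 1, u) * cond

# Degree 6 with a moderately large condition number.
bound = comp_goertzel_bound(6, 1e10)
print(bound, bound / U)
```

The second term is of order $u^2$ times the condition number, so the bound stays close to $u$ until the condition number approaches roughly $1/u$ divided by the constant factor $3 N^2 \cdot 15 \cdot (3N + 1)$.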

Proof.

Assume the error of $\hat{ω}(z)$ is $e$ (i.e., $\hat{ω}(z) + e = ω(z)$). Then, we have

\begin{aligned} | \bar{ω} (z) - ω (z) | &= | (\hat{ω} (z) \oplus \hat{e}) - ω (z) | = | (1 + ε) (\hat{ω} (z) + \hat{e}) - ω (z) | \\ &= | (1 + ε) (ω (z) - e + \hat{e}) - ω (z) | \leq u | ω (z) | + (1 + u) | e - \hat{e} |, \end{aligned}

(26)

and

e = ϵb_{0} + i (ϵb_{1} y + ψ),

(27)

where $ϵb_{0}$ and $ϵb_{1}$ are the errors of $b_{0}$ and $b_{1}$, respectively, such that

\begin{aligned} ϵb_{n} &= ℓ_{n} + p \, ϵb_{n + 1} - q \, ϵb_{n + 2}, \\ ϵb_{0} &= ℓ_{0} + x \, ϵb_{1} - q \, ϵb_{2}, \end{aligned}

(28)

where $ℓ_{n} = π_{n} + σ_{n} + η_{n} + ξ_{n} - ϵq \, b_{n + 2}$ for $n = N - 1, N - 2, \dots, 0$. Similarly, let

\begin{aligned} \tilde{e} &= \tilde{ϵb}_{0} + i (\tilde{ϵb}_{1} y + ψ), \\ \hat{e} &= \hat{ϵb}_{0} \oplus i (\hat{ϵb}_{1} \otimes y \oplus ψ), \end{aligned}

(29)

where

\begin{aligned} \tilde{ϵb}_{n} &= \hat{ℓ}_{n} + p \, \tilde{ϵb}_{n + 1} - \hat{q} \, \tilde{ϵb}_{n + 2}, \\ \tilde{ϵb}_{0} &= \hat{ℓ}_{0} + x \, \tilde{ϵb}_{1} - \hat{q} \, \tilde{ϵb}_{2}, \end{aligned}

(30)

and

\begin{aligned} \hat{ϵb}_{n} &= \hat{ℓ}_{n} \oplus p \otimes \hat{ϵb}_{n + 1} ⊖ \hat{q} \otimes \hat{ϵb}_{n + 2}, \\ \hat{ϵb}_{0} &= \hat{ℓ}_{0} \oplus x \otimes \hat{ϵb}_{1} ⊖ \hat{q} \otimes \hat{ϵb}_{2}, \end{aligned}

(31)

where $\hat{ℓ}_{n} = π_{n} \oplus σ_{n} \oplus η_{n} \oplus ξ_{n} ⊖ \hat{ϵq} \otimes \hat{b}_{n + 2}$ for $n = N - 1, N - 2, \dots, 0$. Then, by the triangle inequality, Equation (26) becomes

| \bar{ω} (z) - ω (z) | \leq u | ω (z) | + (1 + u) (| e - \tilde{e} | + | \tilde{e} - \hat{e} |) .

(32)

First, we consider the bound of $| e - \tilde{e} |$. From Equations (11), (28) and (30), we have

\begin{aligned} ϵb_{1} &= \sum_{n = 1}^{N} ℓ_{n} |z|^{n - 1} U_{n - 1} (t), \\ ϵb_{0} &= \sum_{n = 0}^{N} ℓ_{n} |z|^{n} T_{n} (t), \\ \tilde{ϵb}_{1} &= \sum_{n = 1}^{N} \hat{ℓ}_{n} |z|^{n - 1} U_{n - 1} (t), \\ \tilde{ϵb}_{0} &= \sum_{n = 0}^{N} \hat{ℓ}_{n} |z|^{n} T_{n} (t) . \end{aligned}

(33)

Then, by Equations (13), (27) and (29), we deduce

| e - \tilde{e} | = | (ϵb_{0} - \tilde{ϵb}_{0}) + i (ϵb_{1} - \tilde{ϵb}_{1}) y | = | \sum_{n = 0}^{N - 1} (ℓ_{n} - \hat{ℓ}_{n}) z^{n} | \leq \sum_{n = 0}^{N - 1} | ℓ_{n} - \hat{ℓ}_{n} | |z|^{n} .

(34)

In Algorithm 4, according to Lemmas 1 and 2, we obtain

\begin{aligned} | π_{n} | &\leq u | p \hat{b}_{n + 1} | \leq 2 u |z| | \hat{b}_{n + 1} |, \\ | σ_{n} | &\leq u | \hat{q} \hat{b}_{n + 2} | \leq u (1 + γ_{2}) |z|^{2} | \hat{b}_{n + 2} |, \\ | η_{n} | &\leq u | p \hat{b}_{n + 1} - \hat{q} \hat{b}_{n + 2} | \leq u (1 + γ_{2}) (2 |z| | \hat{b}_{n + 1} | + |z|^{2} | \hat{b}_{n + 2} |), \\ | ξ_{n} | &\leq u | a_{n} + p \hat{b}_{n + 1} - \hat{q} \hat{b}_{n + 2} | \leq u (1 + γ_{2}) (| a_{n} | + 2 |z| | \hat{b}_{n + 1} | + |z|^{2} | \hat{b}_{n + 2} |), \\ | \hat{ϵq} | &\leq γ_{2} | \hat{q} | \leq γ_{2} (1 + γ_{2}) |z|^{2}, \end{aligned}

(35)

for $n = 0, 1, \dots, N$, and then

\begin{aligned} & | π_{n} | + | σ_{n} | + | η_{n} | + | ξ_{n} | + | \hat{ϵq} | | \hat{b}_{n + 2} | \\ &\leq (3 u + γ_{2}) (1 + γ_{2}) (| a_{n} | + 2 |z| | \hat{b}_{n + 1} | + |z|^{2} | \hat{b}_{n + 2} |) . \end{aligned}

(36)

In Algorithm 4, from Equations (15)–(17), we have

\begin{cases} \hat{b}_{N - 1} = p a_{N} \langle 2 \rangle + a_{N - 1} \langle 1 \rangle, \\ \hat{b}_{N - 2} = p \hat{b}_{N - 1} \langle 3 \rangle - q a_{N} \langle 5 \rangle + a_{N - 2} \langle 1 \rangle = a_{N} (p^{2} - q) \langle 5 \rangle + p a_{N - 1} \langle 3 \rangle + a_{N - 2} \langle 1 \rangle, \\ \quad \vdots \\ \hat{b}_{1} = p \hat{b}_{2} \langle 3 \rangle - q \hat{b}_{3} \langle 5 \rangle + a_{1} \langle 1 \rangle = a_{N} (p^{N - 1} + \dots) \langle 3 N - 4 \rangle + a_{N - 1} (p^{N - 2} + \dots) \langle 3 N - 5 \rangle + \dots, \\ \hat{b}_{0} = x \hat{b}_{1} \langle 3 \rangle - \hat{q} \hat{b}_{2} \langle 5 \rangle + a_{0} \langle 1 \rangle = a_{N} (p^{N} + \dots) \langle 3 N - 1 \rangle + a_{N - 1} (p^{N - 1} + \dots) \langle 3 N - 2 \rangle + \dots . \end{cases}

(37)

Then, by induction, we get

| b_{n} - {\hat{b}}_{n} | \leq γ_{3 (N - n) - 1} | b_{n} |,

(38)

and

| {\hat{b}}_{n} | \leq (1 + γ_{3 (N - n) - 1}) | b_{n} | .

(39)

Let

g_{k} = \sum_{n = k}^{N} | a_{n} | |z|^{n - k}, \quad k = 0, 1, \dots, N .

(40)

Then

\sum_{n = 0}^{N} g_{n} |z|^{n} \leq (N + 1) g_{0} .

(41)

By combining Equations (11), (12) and (40), we have

\begin{aligned} | b_{n} | &\leq (N - n + 1) g_{n}, \quad n = 1, \dots, N, \\ | b_{0} | &\leq g_{0} . \end{aligned}

(42)

It is easy to obtain that

\hat{ℓ}_{n} = π_{n} \langle 4 \rangle + σ_{n} \langle 4 \rangle + η_{n} \langle 3 \rangle + ξ_{n} \langle 2 \rangle - \hat{ϵq} \, \hat{b}_{n + 2} \langle 2 \rangle .

Through Lemma 2 and Equations (36)–(42), we deduce

\begin{aligned} | ℓ_{n} - \hat{ℓ}_{n} | &\leq γ_{4} (| π_{n} | + | σ_{n} | + | η_{n} | + | ξ_{n} | + | \hat{ϵq} | | \hat{b}_{n + 2} |) + | ϵq \, b_{n + 2} - \hat{ϵq} \, \hat{b}_{n + 2} | \\ &\leq γ_{4} (3 u + γ_{2}) (1 + γ_{2}) (| a_{n} | + 2 |z| | \hat{b}_{n + 1} | + |z|^{2} | \hat{b}_{n + 2} |) + | ϵq | | b_{n + 2} - \hat{b}_{n + 2} | + | ϵq - \hat{ϵq} | | \hat{b}_{n + 2} | \\ &\leq γ_{4} γ_{5} (| a_{n} | + 2 |z| | \hat{b}_{n + 1} | + |z|^{2} | \hat{b}_{n + 2} |) + γ_{2} γ_{3 (N - n) - 7} |z|^{2} | b_{n + 2} | + u γ_{3} |z|^{2} | \hat{b}_{n + 2} | \\ &\leq [γ_{5}^{2} (1 + γ_{3 (N - n) - 4}) + γ_{2} γ_{3 (N - n) - 7}] (| a_{n} | + 2 |z| | b_{n + 1} | + |z|^{2} | b_{n + 2} |) \\ &\leq γ_{5} γ_{3 (N - n) + 1} (| a_{n} | + 2 |z| | b_{n + 1} | + |z|^{2} | b_{n + 2} |) \\ &\leq γ_{5} γ_{3 (N - n) + 1} (| a_{n} | + 2 (N - n) |z| g_{n + 1} + (N - n - 1) |z|^{2} g_{n + 2}) \\ &\leq 3 γ_{5} γ_{3 (N - n) + 1} (N - n) g_{n} . \end{aligned}

(43)

Thus, from Equations (34), (41) and (43), we obtain

| e - \tilde{e} | \leq \sum_{n = 0}^{N - 1} | ℓ_{n} - \hat{ℓ}_{n} | |z|^{n} \leq 3 N^{2} γ_{5} γ_{3 N + 1} g_{0} .

(44)

Second, we consider the bound of $| \tilde{e} - \hat{e} |$. In Algorithm 4, according to Equations (15)–(17), we have

\begin{cases} \hat{ϵb}_{N - 1} = \hat{ℓ}_{N - 1}, \\ \hat{ϵb}_{N - 2} = p \hat{ℓ}_{N - 1} \langle 2 \rangle + \hat{ℓ}_{N - 2} \langle 1 \rangle, \\ \hat{ϵb}_{N - 3} = p \hat{ϵb}_{N - 2} \langle 3 \rangle - \hat{q} \hat{ℓ}_{N - 1} \langle 3 \rangle + \hat{ℓ}_{N - 3} \langle 1 \rangle = \hat{ℓ}_{N - 1} (p^{2} - \hat{q}) \langle 5 \rangle + p \hat{ℓ}_{N - 2} \langle 3 \rangle + \hat{ℓ}_{N - 3} \langle 1 \rangle, \\ \quad \vdots \\ \hat{ϵb}_{1} = p \hat{ϵb}_{2} \langle 3 \rangle - \hat{q} \hat{ϵb}_{3} \langle 3 \rangle + \hat{ℓ}_{1} \langle 1 \rangle = \hat{ℓ}_{N - 1} (p^{N - 2} + \dots) \langle 3 N - 7 \rangle + \hat{ℓ}_{N - 2} (p^{N - 3} + \dots) \langle 3 N - 8 \rangle + \dots, \\ \hat{ϵb}_{0} = x \hat{ϵb}_{1} \langle 3 \rangle - \hat{q} \hat{ϵb}_{2} \langle 3 \rangle + \hat{ℓ}_{0} \langle 1 \rangle = \hat{ℓ}_{N - 1} (p^{N - 1} + \dots) \langle 3 N - 4 \rangle + \hat{ℓ}_{N - 2} (p^{N - 2} + \dots) \langle 3 N - 5 \rangle + \dots . \end{cases}

(45)

Then, by induction, from Equations (13) and (33) as well as Lemma 1, we deduce

| (\tilde{ϵb}_{0} - \hat{ϵb}_{0}) + i (\tilde{ϵb}_{1} - \hat{ϵb}_{1}) y | \leq γ_{3 N - 4} | \sum_{n = 0}^{N - 1} \hat{ℓ}_{n} z^{n} | \leq γ_{3 N - 4} \sum_{n = 0}^{N - 1} | \hat{ℓ}_{n} | |z|^{n},

(46)

and

\begin{aligned} | \hat{ϵb}_{0} + i (\hat{ϵb}_{1} y + ψ) | &\leq | \hat{ϵb}_{0} + i \hat{ϵb}_{1} y | + | ψ | \\ &\leq (1 + γ_{3 N - 4}) | \sum_{n = 0}^{N - 1} \hat{ℓ}_{n} z^{n} | + u | y \hat{b}_{1} | \\ &\leq (1 + γ_{3 N - 4}) \sum_{n = 0}^{N - 1} | \hat{ℓ}_{n} | |z|^{n} + u |z| | \hat{b}_{1} | . \end{aligned}

(47)

Thus, through Equations (29), (30), (46) and (47), we obtain

\begin{aligned} | \tilde{e} - \hat{e} | &= | \tilde{ϵb}_{0} + i (\tilde{ϵb}_{1} y + ψ) - \hat{ϵb}_{0} \langle 2 \rangle - i (\hat{ϵb}_{1} y + ψ) \langle 3 \rangle | \\ &\leq | (\tilde{ϵb}_{0} - \hat{ϵb}_{0}) + i (\tilde{ϵb}_{1} - \hat{ϵb}_{1}) y | + γ_{3} | \hat{ϵb}_{0} + i (\hat{ϵb}_{1} y + ψ) | \\ &\leq γ_{3 N - 1} \sum_{n = 0}^{N - 1} | \hat{ℓ}_{n} | |z|^{n} + u γ_{3} |z| | \hat{b}_{1} | . \end{aligned}

(48)

According to Equations (36), (39) and (42), we get

\begin{aligned} | \hat{ℓ}_{n} | &\leq (1 + γ_{5}) (| π_{n} | + | σ_{n} | + | η_{n} | + | ξ_{n} | + | \hat{ϵq} | | \hat{b}_{n + 2} |) \\ &\leq γ_{9} (| a_{n} | + 2 |z| | \hat{b}_{n + 1} | + |z|^{2} | \hat{b}_{n + 2} |) \\ &\leq γ_{9} (1 + γ_{3 (N - n) - 4}) (| a_{n} | + 2 |z| | b_{n + 1} | + |z|^{2} | b_{n + 2} |) \\ &\leq γ_{9} (1 + γ_{3 (N - n) - 4}) (| a_{n} | + 2 (N - n) |z| g_{n + 1} + (N - n - 1) |z|^{2} g_{n + 2}) \\ &\leq 3 γ_{9} (1 + γ_{3 (N - n) - 4}) (N - n) g_{n} . \end{aligned}

(49)

Then, by Equations (39), (41), (48) and (49), we obtain

\begin{aligned} | \tilde{e} - \hat{e} | &\leq 3 γ_{9} γ_{3 N - 1} \sum_{n = 0}^{N - 1} (1 + γ_{3 (N - n) - 4}) (N - n) g_{n} + u γ_{3} (1 + γ_{3 N - 4}) N |z| g_{1} \\ &\leq (1 + γ_{3 N - 4}) N (3 N γ_{9} γ_{3 N - 1} + u γ_{3}) g_{0} \\ &\leq 3 N^{2} γ_{10} γ_{3 N - 1} g_{0} . \end{aligned}

(50)

Hence, when combining Equations (32), (44) and (50), we have

\begin{aligned} | \bar{ω} (z) - ω (z) | &\leq u | ω (z) | + (1 + u) 3 N^{2} (γ_{5} γ_{3 N + 1} + γ_{10} γ_{3 N - 1}) g_{0} \\ &\leq u | ω (z) | + 3 N^{2} γ_{15} γ_{3 N + 1} g_{0} . \end{aligned}

(51)

Thus, with $cond (ω, z) = g_{0} / | ω (z) |$, we deduce Equation (25). □

3.2. Running Round-Off Error Analysis

Theorem 3 gives a running error bound of Algorithm 4 to evaluate the polynomial in Equation (1) in complex floating-point arithmetic:

Theorem 3.

Assume $z, a_{n} \in F + i F$ for $n = 0, \dots, N$. Then, the running round-off error bound of the compensated Goertzel algorithm for evaluating $ω(z) = \sum_{n = 0}^{N} a_{n} z^{n}$ in floating-point arithmetic satisfies

| \bar{ω} (z) - ω (z) | \leq (| c | \oplus \hat{α}) ⊘ (1 ⊖ 2 u),

(52)

where

\begin{cases} c = \hat{ω} (z) \oplus \hat{e} ⊖ \bar{ω} (z), \\ \hat{α} = \hat{γ}_{3 N + 1} \otimes \hat{E} ⊘ (1 ⊖ 6 (N - 1) \otimes u), \\ \hat{E} = \hat{Eb}_{0} \oplus i \hat{Eb}_{1} \otimes | y |, \\ \hat{Eb}_{n} = | \hat{ℓ}_{n} | \oplus | p | \otimes \hat{Eb}_{n + 1} \oplus | \hat{q} | \otimes \hat{Eb}_{n + 2}, \quad n = N - 1, N - 2, \dots, 1, \\ \hat{Eb}_{0} = | \hat{ℓ}_{0} | \oplus | x | \otimes \hat{Eb}_{1} \oplus | \hat{q} | \otimes \hat{Eb}_{2} . \end{cases}

(53)

Proof.

Assume $e$ is the error of $\hat{ω}(z)$, i.e., $\hat{ω}(z) + e = ω(z)$. Let $\bar{ω}(z) + c = \hat{ω}(z) + \hat{e}$. Then, we have

\begin{aligned} | \bar{ω} (z) - ω (z) | &= | \bar{ω} (z) - (\hat{ω} (z) + e) | \\ &= | \bar{ω} (z) - (\hat{ω} (z) + \hat{e} - \hat{e} + e) | \\ &\leq | c | + | e - \hat{e} | . \end{aligned}

(54)

Considering the round-off error $ϵq$ in Algorithm 4, from Lemma 2, we have $ϵq = \hat{ϵq} \langle 2 \rangle$, and thus

\begin{cases} \hat{ℓ}_{N - 1} = π_{N - 1} \langle 3 \rangle + σ_{N - 1} \langle 3 \rangle + η_{N - 1} \langle 2 \rangle + ξ_{N - 1} \langle 1 \rangle, \\ \hat{ℓ}_{N - 2} = π_{N - 2} \langle 4 \rangle + σ_{N - 2} \langle 4 \rangle + η_{N - 2} \langle 3 \rangle + ξ_{N - 2} \langle 2 \rangle - ϵq \, \hat{b}_{N} \langle 4 \rangle, \\ \quad \vdots \\ \hat{ℓ}_{0} = π_{0} \langle 4 \rangle + σ_{0} \langle 4 \rangle + η_{0} \langle 3 \rangle + ξ_{0} \langle 2 \rangle - ϵq \, \hat{b}_{2} \langle 4 \rangle . \end{cases}

(55)

Combined with Equation (45), we can deduce that

\begin{aligned} | ϵb_{0} - \hat{ϵb}_{0} | &= | ϵb_{0} - \langle 3 N - 1 \rangle ϵb_{0} | \leq γ_{3 N - 1} Eb_{0}, \\ | ϵb_{1} - \hat{ϵb}_{1} | &= | ϵb_{1} - \langle 3 N - 4 \rangle ϵb_{1} | \leq γ_{3 N - 4} Eb_{1}, \end{aligned}

(56)

and

\begin{aligned} | \hat{ϵb}_{0} | &= | \langle 3 N - 1 \rangle ϵb_{0} | \leq (1 + γ_{3 N - 1}) Eb_{0}, \\ | \hat{ϵb}_{1} | &= | \langle 3 N - 4 \rangle ϵb_{1} | \leq (1 + γ_{3 N - 4}) Eb_{1}, \end{aligned}

(57)

where $Eb_{n}$ can be derived from

\begin{cases} Eb_{n} = | ℓ_{n} | + | p | Eb_{n + 1} + | \hat{q} | Eb_{n + 2}, \quad n = N - 1, N - 2, \dots, 1, \\ Eb_{0} = | ℓ_{0} | + | x | Eb_{1} + | \hat{q} | Eb_{2}, \\ ℓ_{n} = π_{n} + σ_{n} + η_{n} + ξ_{n} - ϵq \, \hat{b}_{n + 2}, \\ ℓ_{0} = π_{0} + σ_{0} + η_{0} + ξ_{0} - ϵq \, \hat{b}_{2} . \end{cases}

(58)

Furthermore, through Equation (15), we have

\begin{aligned} | \hat{ℓ}_{n} | &= \left| \left\{ \left\{ \left[ (π_{n} + σ_{n}) \frac{1}{1 + ε_{1}} + η_{n} \right] \frac{1}{1 + ε_{2}} + ξ_{n} \right\} \frac{1}{1 + ε_{3}} - ϵq \frac{1}{1 + ε_{4}} \frac{1}{1 + ε_{5}} \hat{b}_{n + 2} \frac{1}{1 + ε_{6}} \right\} \frac{1}{1 + ε_{7}} \right|, \\ \hat{Eb}_{n} &= \left[ \left( | \hat{ℓ}_{n} | + | p | \hat{Eb}_{n + 1} \frac{1}{1 + ε_{8}} \right) \frac{1}{1 + ε_{9}} + | \hat{q} | \hat{Eb}_{n + 2} \frac{1}{1 + ε_{10}} \right] \frac{1}{1 + ε_{11}}, \end{aligned}

(59)

where $| ε_{k} | \leq u$ for $k = 1, 2, \dots, 11$. Then, we obtain

| ℓ_{n} | + | p | \hat{Eb}_{n + 1} + | \hat{q} | \hat{Eb}_{n + 2} \leq (1 + u)^{6} \hat{Eb}_{n} .

(60)

By induction, we get

Eb_{0} \leq (1 + u)^{6 (N - 1)} \hat{Eb}_{0} .

(61)

Assuming that $E = Eb_{0} + i Eb_{1} | y |$, we have

E \leq (1 + u)^{6 (N - 1)} (\hat{Eb}_{0} + i \hat{Eb}_{1} | y |) \leq (1 + u)^{6 N - 8} \hat{E} .

(62)

From Equations (15)–(17) and (29), we get $\hat{e} = \hat{ϵb}_{0} \langle 2 \rangle + i ψ \langle 2 \rangle + i \hat{ϵb}_{1} y \langle 2 \rangle = \langle 2 \rangle \tilde{e}$. Then, by Equations (56), (57) and (62), due to $γ_{k} \leq (1 + u) \hat{γ}_{k}$ and $(1 + u)^{n} \hat{E} \leq \hat{E} ⊘ (1 ⊖ (n + 1) \otimes u)$, we deduce

\begin{aligned} | e - \hat{e} | &\leq | e - \tilde{e} | + γ_{2} | \tilde{e} | = | | ϵb_{0} - \hat{ϵb}_{0} | + i | ϵb_{1} - \hat{ϵb}_{1} | | y | | + γ_{2} | \hat{ϵb}_{0} + i \hat{ϵb}_{1} y | \\ &\leq γ_{3 N - 1} (Eb_{0} + i Eb_{1} | y |) + γ_{2} (1 + γ_{3 N - 1}) (Eb_{0} + i Eb_{1} | y |) \\ &\leq γ_{3 N + 1} E \leq γ_{3 N + 1} (1 + u)^{6 N - 8} \hat{E} \leq \hat{γ}_{3 N + 1} \otimes \hat{E} ⊘ (1 ⊖ 6 (N - 1) \otimes u) := \hat{α} . \end{aligned}

(63)

Thus, by combining Equations (54) and (63), we obtain

| \bar{ω} (z) - ω (z) | \leq | c | + | e - \hat{e} | \leq (| c | \oplus \hat{α}) ⊘ (1 ⊖ 2 u) .

(64)

□

Using the dynamic error estimate of Theorem 3, we can extend Algorithm 4 into Algorithm 5, whose flowchart is shown in Figure 2. In Algorithm 5, the error terms produced at each step are combined into an approximate perturbation term, which is carried through the subsequent computation and makes the final result more accurate; the algorithm also returns the running error bound μ.

3.3. Computational Complexity

The

Horner

algorithm and its compensated version

CompHorner

[12] are recalled in Algorithms 6 and 7. We shall use

TwoSum

and

TwoProd

instead of

TwoSumCplx

and

TwoProdCplx

, when the inputs of Algorithm 7 are real floating-point numbers. A comparison of the computational costs of Algorithms 1 and 4–7 is shown in Table 2. As we can see, each of the compensated algorithms (i.e.,

CompHorner

or

CompGoertzel

) can be less expensive than the other, depending on the case. For example,

CompGoertzel

is half as expensive as

CompHorner

when

z \in C

,

z \neq \pm 1

and

a_{n} \in R

. For

z \in C

,

| z | = 1

and

z \neq \pm 1

, the cost of

CompGoertzel

is also less than that of

CompHorner

. Even in these cases,

CompGoertzelwErr

costs less than

CompHorner

. However, for

z \in R

,

CompGoertzel

is twice as expensive as

CompHorner

regardless of

a_{n}

.
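The cost comparison above can be checked numerically. The following is a minimal Python sketch, not part of the original paper, that encodes the real-coefficient rows of Table 2 as linear flop counts and prints the resulting cost ratios. The names `TABLE2_REAL_COEFFS`, `flops` and `cost_ratio` are hypothetical helpers introduced only for this illustration.

```python
# Flop counts from Table 2 for real coefficients, stored as (slope, intercept)
# so that cost(N) = slope * N + intercept. The dictionary layout is an
# illustrative assumption, not something defined in the paper.
TABLE2_REAL_COEFFS = {
    "z real": {
        "CompHorner": (26, 3),
        "CompGoertzel": (55, 45),
        "CompGoertzelwErr": (59, 60),
    },
    "z complex, |z| != 1": {
        "CompHorner": (90, 6),
        "CompGoertzel": (55, 91),
        "CompGoertzelwErr": (59, 106),
    },
    "z complex, |z| = 1": {
        "CompHorner": (90, 6),
        "CompGoertzel": (34, 26),
        "CompGoertzelwErr": (39, 41),
    },
}


def flops(case, algorithm, degree):
    """Flop count of `algorithm` for a polynomial of the given degree."""
    slope, intercept = TABLE2_REAL_COEFFS[case][algorithm]
    return slope * degree + intercept


def cost_ratio(case, numerator, denominator, degree):
    """Ratio of the flop counts of two algorithms in the same case."""
    return flops(case, numerator, degree) / flops(case, denominator, degree)


if __name__ == "__main__":
    for case in TABLE2_REAL_COEFFS:
        r = cost_ratio(case, "CompGoertzel", "CompHorner", 10_000)
        print(f"{case}: CompGoertzel / CompHorner = {r:.3f}")
```

For large degrees the ratios approach the quotients of the leading coefficients, for instance 55/26 for real z and 34/90 for complex z on the unit circle.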

Algorithm 5 Polynomial evaluation by compensated Goertzel algorithm with a dynamic error estimate

Function: [\bar{ω} (z), μ] = CompGoertzelwErr ({(a_{n})}_{n = 0}^{N}, z)

Require: z = x + i y \in F + i F, {(a_{n})}_{n = 0}^{N} \in F + i F

Ensure: \bar{ω} (z) \approx \sum_{n = 0}^{N} a_{n} z^{n}, | ω (z) - \bar{ω} (z) | \leq μ

p = 2 x

[\hat{q}, \hat{ϵ q}] = SumOfSquares (x, y)

{\hat{b}}_{N} = a_{N}, {\hat{b}}_{N + 1} = {\hat{ϵ b}}_{N} = {\hat{ϵ b}}_{N + 1} = {\hat{E b}}_{N} = {\hat{E b}}_{N + 1} = 0

for n = N - 1, N - 2, \dots, 1

[r_{n}, π_{n}] = TwoProdRC (p, {\hat{b}}_{n + 1})

[s_{n}, σ_{n}] = TwoProdRC (- \hat{q}, {\hat{b}}_{n + 2})

[t_{n}, η_{n}] = TwoSumCplx (r_{n}, s_{n})

[{\hat{b}}_{n}, ξ_{n}] = TwoSumCplx (t_{n}, a_{n})

{\hat{ℓ}}_{n} = π_{n} \oplus σ_{n} \oplus η_{n} \oplus ξ_{n} ⊖ \hat{ϵ q} \otimes {\hat{b}}_{n + 2}

{\hat{ϵ b}}_{n} = {\hat{ℓ}}_{n} \oplus p \otimes {\hat{ϵ b}}_{n + 1} ⊖ \hat{q} \otimes {\hat{ϵ b}}_{n + 2}

{\hat{E b}}_{n} = | {\hat{ℓ}}_{n} | \oplus | p | \otimes {\hat{E b}}_{n + 1} \oplus | \hat{q} | \otimes {\hat{E b}}_{n + 2}

end

[r_{0}, π_{0}] = TwoProdRC (x, {\hat{b}}_{1})

[s_{0}, σ_{0}] = TwoProdRC (- \hat{q}, {\hat{b}}_{2})

[t_{0}, η_{0}] = TwoSumCplx (r_{0}, s_{0})

[{\hat{b}}_{0}, ξ_{0}] = TwoSumCplx (t_{0}, a_{0})

{\hat{ℓ}}_{0} = π_{0} \oplus σ_{0} \oplus η_{0} \oplus ξ_{0} ⊖ \hat{ϵ q} \otimes {\hat{b}}_{2}

{\hat{ϵ b}}_{0} = {\hat{ℓ}}_{0} \oplus x \otimes {\hat{ϵ b}}_{1} ⊖ \hat{q} \otimes {\hat{ϵ b}}_{2}

{\hat{E b}}_{0} = | {\hat{ℓ}}_{0} | \oplus | x | \otimes {\hat{E b}}_{1} \oplus | \hat{q} | \otimes {\hat{E b}}_{2}

[ϕ, ψ] = TwoProdRC (y, {\hat{b}}_{1})

\hat{e} = {\hat{ϵ b}}_{0} \oplus i ({\hat{ϵ b}}_{1} \otimes y \oplus ψ)

\hat{E} = {\hat{E b}}_{0} \oplus i {\hat{E b}}_{1} \otimes | y |

\hat{ω} (z) = {\hat{b}}_{0} \oplus i ϕ

\bar{ω} (z) = \hat{ω} (z) \oplus \hat{e}

c = \hat{ω} (z) \oplus \hat{e} ⊖ \bar{ω} (z)

\hat{α} = ({\hat{γ}}_{3 N + 1} \otimes \hat{E}) ⊘ (1 ⊖ 6 (N - 1) \otimes u)

μ = (| c | \oplus \hat{α}) ⊘ (1 ⊖ 2 u)
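To make the compensation steps of Algorithm 5 concrete, here is a simplified Python sketch restricted to real coefficients and degree at least 1. It is an illustration under stated assumptions rather than the authors' implementation: it omits the running error bound μ, it replaces SumOfSquares by a plain combination of TwoProd and TwoSum, and since the b values stay real for real coefficients it uses the real TwoSum and TwoProd in place of TwoSumCplx and TwoProdRC. The function names `two_sum`, `two_prod`, `comp_goertzel_real` and `exact_eval` are hypothetical.

```python
from fractions import Fraction


def two_sum(a, b):
    """Knuth's TwoSum: s = fl(a + b) and s + e = a + b exactly."""
    s = a + b
    t = s - a
    return s, (a - (s - t)) + (b - t)


def _split(a):
    """Veltkamp splitting of a double into high and low parts."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi


def two_prod(a, b):
    """Dekker's TwoProd: p = fl(a * b) and p + e = a * b (no over/underflow)."""
    p = a * b
    ah, al = _split(a)
    bh, bl = _split(b)
    return p, al * bl - (((p - ah * bh) - al * bh) - ah * bl)


def comp_goertzel_real(a, z):
    """Compensated Goertzel value of sum(a[n] * z**n) for real floats a[n]."""
    if len(a) < 2:
        raise ValueError("degree must be at least 1")
    x, y = z.real, z.imag
    p = 2.0 * x
    # q + eq approximates x*x + y*y (simple stand-in for SumOfSquares).
    xx, exx = two_prod(x, x)
    yy, eyy = two_prod(y, y)
    q, eq = two_sum(xx, yy)
    eq += exx + eyy
    b1, b2 = a[-1], 0.0   # b_{n+1}, b_{n+2}
    e1, e2 = 0.0, 0.0     # accumulated error terms for b_{n+1}, b_{n+2}
    for n in range(len(a) - 2, 0, -1):
        r, pi = two_prod(p, b1)
        s, sigma = two_prod(-q, b2)
        t, eta = two_sum(r, s)
        b0, xi = two_sum(t, a[n])
        ell = pi + sigma + eta + xi - eq * b2   # local error of this step
        e0 = ell + p * e1 - q * e2
        b2, b1, e2, e1 = b1, b0, e1, e0
    # Final step uses x instead of p = 2x.
    r, pi = two_prod(x, b1)
    s, sigma = two_prod(-q, b2)
    t, eta = two_sum(r, s)
    b0, xi = two_sum(t, a[0])
    e0 = (pi + sigma + eta + xi - eq * b2) + x * e1 - q * e2
    phi, psi = two_prod(y, b1)
    return complex(b0, phi) + complex(e0, e1 * y + psi)


def exact_eval(a, z):
    """Reference value (re, im) in exact rational arithmetic."""
    x, y = Fraction(z.real), Fraction(z.imag)
    re, im = Fraction(0), Fraction(0)
    for c in reversed(a):
        re, im = re * x - im * y + Fraction(c), re * y + im * x
    return re, im
```

The helper `exact_eval` is only there to compare the compensated result with an exact reference, for example on the expanded form of (z - 1)^8 near z = 1.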

Algorithm 6 Polynomial evaluation by the Horner algorithm

Function: \hat{ω} (z) = Horner ({(a_{n})}_{n = 0}^{N}, z)

Require: z = x + i y \in F + i F, {(a_{n})}_{n = 0}^{N} \in F + i F

Ensure: \hat{ω} (z) \approx \sum_{n = 0}^{N} a_{n} z^{n}

{\hat{b}}_{N + 1} = 0

for n = N, N - 1, \dots, 0

{\hat{b}}_{n} = {\hat{b}}_{n + 1} \otimes z \oplus a_{n}

end

\hat{ω} (z) = {\hat{b}}_{0}

Algorithm 7 Polynomial evaluation by the compensated Horner algorithm

Function: \bar{ω} (z) = CompHorner ({(a_{n})}_{n = 0}^{N}, z)

Require: z = x + i y \in F + i F, {(a_{n})}_{n = 0}^{N} \in F + i F

Ensure: \bar{ω} (z) \approx \sum_{n = 0}^{N} a_{n} z^{n}

{\hat{b}}_{N} = a_{N}, {\hat{ϵ b}}_{N} = 0

for n = N - 1, N - 2, \dots, 0

[r_{n}, π_{n}] = TwoProdCplx ({\hat{b}}_{n + 1}, z)

[{\hat{b}}_{n}, σ_{n}] = TwoSumCplx (r_{n}, a_{n})

{\hat{ϵ b}}_{n} = {\hat{ϵ b}}_{n + 1} \otimes z \oplus π_{n} \oplus σ_{n}

end

\bar{ω} (z) = {\hat{b}}_{0} \oplus {\hat{ϵ b}}_{0}
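For real inputs, where TwoSum and TwoProd replace their complex counterparts, Algorithms 6 and 7 can be sketched in Python as follows. This is a minimal illustration and not the authors' code: TwoProd is implemented with Dekker's splitting rather than a fused multiply-add, and the names `two_sum`, `two_prod`, `horner`, `comp_horner` and `exact_value` are assumptions made for this example.

```python
from fractions import Fraction


def two_sum(a, b):
    """Knuth's TwoSum: s = fl(a + b) and s + e = a + b exactly."""
    s = a + b
    t = s - a
    return s, (a - (s - t)) + (b - t)


def _split(a):
    """Veltkamp splitting of a double into high and low parts."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi


def two_prod(a, b):
    """Dekker's TwoProd: p = fl(a * b) and p + e = a * b (no over/underflow)."""
    p = a * b
    ah, al = _split(a)
    bh, bl = _split(b)
    return p, al * bl - (((p - ah * bh) - al * bh) - ah * bl)


def horner(a, x):
    """Plain Horner evaluation of sum(a[n] * x**n) for real floats."""
    b = 0.0
    for c in reversed(a):
        b = b * x + c
    return b


def comp_horner(a, x):
    """Compensated Horner evaluation for real floats (real-input Algorithm 7)."""
    b = a[-1]
    eb = 0.0
    for c in reversed(a[:-1]):
        r, pi = two_prod(b, x)       # exact product: r + pi = b * x
        b, sigma = two_sum(r, c)     # exact sum: b + sigma = r + c
        eb = eb * x + (pi + sigma)   # Horner recurrence on the rounding errors
    return b + eb


def exact_value(a, x):
    """Reference value in exact rational arithmetic."""
    return sum(Fraction(c) * Fraction(x) ** k for k, c in enumerate(a))
```

On the expanded form of (x - 1)^5 at x = 1.01, for instance, `comp_horner` should agree with `exact_value` to roughly working precision, whereas `horner` loses several digits.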

4. Numerical Experiments

In this section, we test the accuracy, performance and application of our algorithm. All numerical experiments are performed in IEEE-754 double precision as working precision.

4.1. Accuracy

The accuracy measurements in this part were carried out in MATLAB R2019b, and the exact results used to compute the relative errors were obtained with the Symbolic Toolbox. As in [12], we considered the expanded form of the polynomial

ω (z) = {(z - 1 - i)}^{n}

(65)

at

z = 1.333 + 1.333 i

for

n = 3:42

, while the condition number varied from

10^{3}

to

10^{33}

. The relative accuracy of the

Horner

,

Goertzel

,

CompHorner

and

CompGoertzel

algorithms as well as the theoretical bounds of

CompGoertzel

in Theorems 2 and 3 are exhibited in Figure 3. We can observe that the

CompGoertzel

algorithm, which had almost the same accuracy as the

CompHorner

algorithm, was absolutely stable when the condition number was smaller than

10^{16}

. Moreover, the numerical results agreed well with the error bounds; in particular, the running error bound nearly coincided with the actual relative errors while the condition number was smaller than

10^{13}

.
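The experiment above was run in MATLAB with the Symbolic Toolbox. As a rough Python analogue, the sketch below builds the expanded coefficients of the test polynomial, evaluates it with plain complex Horner, and measures the relative error against exact rational arithmetic. All function names are hypothetical, and the condition number formula used here (the sum of |a_k| |z|^k divided by |ω(z)|) is the usual definition, assumed rather than taken from this section. For large n the integer coefficients are rounded to doubles, so the exact reference is the polynomial with the rounded coefficients.

```python
from fractions import Fraction
from math import comb, hypot


def expanded_coeffs(n):
    """Coefficients a_0..a_n of (z - 1 - i)**n, built with integer arithmetic."""
    coeffs = []
    for k in range(n + 1):
        re, im = 1, 0
        for _ in range(n - k):          # multiply by (-1 - i)
            re, im = -re + im, -re - im
        c = comb(n, k)
        coeffs.append(complex(c * re, c * im))
    return coeffs


def horner(a, z):
    """Plain Horner evaluation in complex floating-point arithmetic."""
    b = 0j
    for c in reversed(a):
        b = b * z + c
    return b


def exact_eval(a, z):
    """Value (re, im) of the polynomial in exact rational arithmetic."""
    x, y = Fraction(z.real), Fraction(z.imag)
    re, im = Fraction(0), Fraction(0)
    for c in reversed(a):
        re, im = (re * x - im * y + Fraction(c.real),
                  re * y + im * x + Fraction(c.imag))
    return re, im


def relative_error(a, z, evaluate=horner):
    """Relative error of `evaluate` with respect to the exact value."""
    w = evaluate(a, z)
    re, im = exact_eval(a, z)
    err = hypot(float(Fraction(w.real) - re), float(Fraction(w.imag) - im))
    return err / hypot(float(re), float(im))


def condition_number(a, z):
    """Assumed definition: sum |a_k| |z|**k divided by |omega(z)|."""
    re, im = exact_eval(a, z)
    weighted = sum(abs(c) * abs(z) ** k for k, c in enumerate(a))
    return weighted / hypot(float(re), float(im))


if __name__ == "__main__":
    z = complex(1.333, 1.333)
    for n in (3, 10, 20):
        a = expanded_coeffs(n)
        print(n, condition_number(a, z), relative_error(a, z))
```

Any other evaluator with the same signature can be passed through the `evaluate` argument to compare algorithms on the same data.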

4.2. Running Time

In this part, we show the practical performance of the

Goertzel

,

CompGoertzel

,

CompHorner

and

CompGoertzelwErr

algorithms in terms of measured computing time. The tests were performed in the following environments:

Env1: Laptop with an Intel Core i7-7700 CPU (4 cores at 3.6 GHz), using Microsoft Visual C++ 2012 with the default compiler option /Od on Windows 7;
Env2: Workstation node with an Intel Xeon E5-2697A CPU (16 cores at 2.6 GHz), using gcc 7.4.0 with the default compiler option -O0 on x86_64 Ubuntu Linux 18.04.

We generated the test polynomials with random coefficients in the interval

[- 1, 1]

, with degrees varying from 50 to 10,000 in steps of 50. The average time ratios for

CompGoertzel / Goertzel

,

CompGoertzel / CompHorner

,

CompGoertzelwErr / CompGoertzel

and

CompGoertzelwErr / CompHorner

are reported in Tables 3 and 4 for test polynomials with coefficients

a_{n} \in R

and

a_{n} \in C

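, respectively. As we can see, the measured overhead of

CompGoertzel

relative to

Goertzel

was well below the theoretical flop ratio in practice (for real z and real coefficients, 9.15 in Env1 and 2.76 in Env2 against a theoretical 13.75), especially when testing in Linux. We note the good agreement of the numerical and theoretical results except for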

CompGoertzelwErr / CompHorner

when

z \in C

,

| z | = 1

,

z \neq \pm 1

and

a_{n} \in C

in Env2. This is because

CompHorner

benefits more than

CompGoertzelwErr

from the fused multiply-add instruction [34,35] and from instruction-level parallelism [7,8]. However,

CompGoertzelwErr

was still faster than

CompHorner

in this case in Env1.

4.3. Application

The test environment in this part was the same as for the accuracy measurements. We considered the polynomial in Equation (1) with

a_{0}, a_{1}, \dots, a_{N} \in R

and

z_{k} = e^{- 2 π k i / (N + 1)}

. Then, the DFT of the coefficient vector, which can be computed by the MATLAB function “fft”, returns

ω (z_{k})

for

k = 0, 1, \dots, N

. The

Goertzel

algorithm can also compute the DFT of the coefficient vector at specific points

z_{k}

given in a vector. Figure 4 shows the relative errors of

Goertzel

,

CompGoertzel

, and

fft

applied to polynomials with degrees from 50 to 1000 in steps of 10 and random coefficients in the interval

[- 1, 1]

, where the relative error was defined as

∥ {res}_{comput} - {res}_{exact} ∥_{2} / {∥ {res}_{exact} ∥}_{2}

. In Figure 4, although

fft

was more accurate than

Goertzel

, the relative errors of

Goertzel

and

fft

both increased as the degree of the polynomial grew. However,

CompGoertzel

always obtained full-precision accurate results in this test: the relative error of our algorithm lay between

10^{-15}

and

10^{-17}

, while that of

fft

ranged from

10^{-12}

to

10^{-15}

.
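The link between the Goertzel recurrence and individual DFT bins can be illustrated with a short Python sketch. It uses the plain, uncompensated recurrence in the same form as the algorithms in this paper (p = 2x, q = x^2 + y^2, with the last step using x instead of p) and evaluates the polynomial at the points z_k. The function names `goertzel`, `dft_bins` and `naive_dft_bin` are hypothetical, and the coefficient list is assumed to have at least two entries.

```python
import cmath


def goertzel(a, z):
    """Evaluate sum(a[n] * z**n) with the Goertzel recurrence (degree >= 1)."""
    x, y = z.real, z.imag
    p, q = 2.0 * x, x * x + y * y
    b1, b2 = a[-1], 0.0                 # b_{n+1}, b_{n+2}
    for c in reversed(a[1:-1]):         # n = N - 1, ..., 1
        b1, b2 = p * b1 - q * b2 + c, b1
    b0 = x * b1 - q * b2 + a[0]         # last step uses x instead of p = 2x
    return b0 + 1j * y * b1


def dft_bins(a, ks):
    """DFT of the coefficient vector `a` at the selected indices `ks` only."""
    m = len(a)                          # m = N + 1 points on the unit circle
    return [goertzel(a, cmath.exp(-2j * cmath.pi * k / m)) for k in ks]


def naive_dft_bin(a, k):
    """Direct evaluation of one DFT bin, used here as a cross-check."""
    m = len(a)
    return sum(c * cmath.exp(-2j * cmath.pi * k * n / m) for n, c in enumerate(a))


if __name__ == "__main__":
    coeffs = [1.0, 2.0, 3.0, 4.0]
    for k, value in zip(range(4), dft_bins(coeffs, range(4))):
        print(k, value, naive_dft_bin(coeffs, k))
```

Because each bin is evaluated independently, only the requested indices are computed, which is the situation in which Goertzel-type algorithms are typically preferred over a full FFT.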

5. Conclusions

In this paper, we presented a compensated Goertzel algorithm with dynamic error estimation to evaluate polynomials in complex floating-point arithmetic. The forward error analysis and numerical experiments show that the algorithm can yield full working precision accuracy. Furthermore, although the algorithm is as precise as the compensated Horner algorithm, it is quicker in certain situations. The algorithm also performed well in the application of computing the DFT of specific indices.

Author Contributions

Conceptualization, C.L., P.D. and K.L.; methodology, C.L., P.D., Y.L. and H.J.; validation, C.L., K.L. and Z.Q.; formal analysis, K.L., Y.L. and H.J.; investigation, C.L., P.D. and H.J.; resources, Y.L. and Z.Q.; data curation, C.L. and P.D.; writing—original draft preparation, C.L., P.D., K.L. and Y.L.; writing—review and editing, C.L., H.J. and Z.Q.; supervision, H.J. and Z.Q.; project administration, P.D. and H.J.; funding acquisition, P.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 61907034), the 173 program of China (2020-JCJQ-ZD-029), the National Key Research and Development Program of China (2020YFA0709803) and the Science Challenge Project of China (TZ2016002).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank the reviewers for providing valuable comments about our article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DFT	discrete Fourier transform
VS	Volk and Schumaker

References

1. Peña, J.M.; Sauer, T. On the multivariate Horner scheme. SIAM J. Numer. Anal. 2000, 37, 1186–1197.
2. Gentleman, W.M. An error analysis of Goertzel’s (Watt’s) method for computing Fourier coefficients. Comput. J. 1969, 12, 160–165.
3. Newbery, A.C.R. Error analysis for Fourier series evaluation. Math. Comput. 1973, 27, 639–644.
4. Wilkinson, J.H. Rounding Errors in Algebraic Processes; Courier Corporation: Englewood Cliffs, NJ, USA, 1994.
5. Smoktunowicz, A.; Wróbel, I. On improving the accuracy of Horner’s and Goertzel’s algorithms. Numer. Algorithms 2005, 38, 243–258.
6. Bailey, D.H. Library for Double-Double and Quad-Double Arithmetic. Available online: http://www.nersc.gov/dhbailey/mpdist/mpdist.html (accessed on 18 February 2021).
7. Louvet, N. Compensated Algorithms in Floating-Point Arithmetic: Accuracy, Validation, Performances; Université de Perpignan Via Domitia: Perpignan, France, 2007.
8. Langlois, P.; Louvet, N. More Instruction Level Parallelism Explains the Actual Efficiency of Compensated Algorithm; Technical Report hal-00165020; DALI Research Team, University of Perpignan: Perpignan, France, 2007.
9. Ogita, T.; Rump, S.M.; Oishi, S. Accurate sum and dot product. SIAM J. Sci. Comput. 2005, 26, 1955–1988.
10. Graillat, S.; Langlois, P.; Louvet, N. Algorithms for accurate, validated and fast polynomial evaluation. Jpn. J. Ind. Appl. Math. 2009, 26, 191–214.
11. Graillat, S.; Morain, V. Error-free transformations in real and complex floating-point arithmetic. In Proceedings of the International Symposium on Nonlinear Theory and Its Applications, Vancouver, BC, Canada, 16–19 September 2007; pp. 341–344.
12. Graillat, S.; Morain, V. Accurate summation, dot product and polynomial evaluation in complex floating-point arithmetic. Inf. Comput. 2012, 216, 57–71.
13. Graillat, S. An accurate algorithm for evaluating rational functions. Appl. Math. Comput. 2018, 337, 494–503.
14. Cameron, T.; Graillat, S. On a Compensated Ehrlich-Aberth Method for the Accurate Computation of All Polynomial Roots. Available online: https://hal.archives-ouvertes.fr/hal-03335604 (accessed on 16 March 2021).
15. Jiang, H.; Barrio, R.; Li, H.; Liao, X.; Cheng, L.; Su, F. Accurate evaluation of a polynomial in Chebyshev form. Appl. Math. Comput. 2011, 217, 9702–9716.
16. Jiang, H.; Li, S.; Cheng, L.; Su, F. Accurate evaluation of a polynomial and its derivative in Bernstein form. Comput. Math. Appl. 2010, 60, 744–755.
17. Delgado, J.; Peña, J.M. Algorithm 960: POLYNOMIAL: An Object-Oriented Matlab Library of Fast and Efficient Algorithms for Polynomials. ACM Trans. Math. Softw. 2016, 42, 1–19.
18. Kazal, N.Y.; Mukhlash, I.; Sanjoyo, B.A.; Hidayat, N.; Ozaki, K. Extended use of error-free transformation for real matrix multiplication to complex matrix multiplication. J. Phys. Conf. Ser. 2021, 1821, 012022.
19. Ozaki, K. Error-free transformation of matrix multiplication for multi-precision computations. In Proceedings of the 19th International Symposium on Scientific Computing, Computer Arithmetic, and Verified Numerical Computations, Szeged, Hungary, 13–15 September 2021; Volume 33.
20. Ozaki, K. An Error-Free Transformation for Matrix Multiplication with Reproducible Algorithms and Divide and Conquer Methods. J. Phys. Conf. Ser. 2020, 1490, 012062.
21. Ozaki, K.; Ogita, T. The Essentials of verified numerical computations, rounding error analyses, interval arithmetic, and error-free transformations. Nonlinear Theory Its Appl. 2020, 11, 279–302.
22. Blanchard, P.; Higham, D.J.; Higham, N.J. Accurately computing the log-sum-exp and softmax functions. IMA J. Numer. Anal. 2021, 41, 2311–2330.
23. Delgado, J.; Peña, J.M. Running relative error for the evaluation of polynomials. SIAM J. Sci. Comput. 2009, 31, 3905–3921.
24. Jiang, H.; Graillat, S.; Barrio, R.; Yang, C. Accurate, validated and fast evaluation of elementary symmetric functions and its application. Appl. Math. Comput. 2016, 273, 1160–1178.
25. Barrio, R.; Du, P.; Jiang, H.; Serrano, S. ORTHOPOLY: A library for accurate evaluation of series of classical orthogonal polynomials and their derivatives. Comput. Phys. Commun. 2018, 231, 146–162.
26. Croci, M.; Fasi, M.; Higham, N.J.; Mary, T.; Mikaitis, M. Stochastic rounding: Implementation, error analysis and applications. R. Soc. Open Sci. 2022, 9, 211631.
27. IEEE Standard 754-2008; Standard for Binary Floating Point Arithmetic. ANSI: New York, NY, USA, 2008.
28. Clenshaw, C.W. A note on the summation of Chebyshev series. Math. Comput. 1955, 9, 118–120.
29. Szegö, G. Orthogonal Polynomials; American Mathematical Society: Providence, RI, USA, 1939.
30. Knuth, D.E. The Art of Computer Programming: Seminumerical Algorithms, 3rd ed.; Addison-Wesley: Boston, MA, USA, 1998.
31. Dekker, T.J. A floating-point technique for extending the available precision. Numer. Math. 1971, 18, 224–242.
32. Graillat, S.; Lauter, C.; Tang, P.T.; Yamanaka, N.; Oishi, S. Efficient Calculations of Faithfully Rounded ℓ₂-Norms of n-Vectors. ACM Trans. Math. Softw. 2015, 41, 1–20.
33. Higham, N.J. Accuracy and Stability of Numerical Algorithms, 2nd ed.; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 2002.
34. Markstein, P. IA-64 and Elementary Functions: Speed and Precision; Prentice-Hall: Englewood Cliffs, NJ, USA, 2000.
35. Nievergelt, Y. Scalar fused multiply-add instructions produce floating-point matrix arithmetic provably accurate to the penultimate digit. ACM Trans. Math. Softw. 2003, 29, 27–48.

Figure 1. The flowchart of the compensated Goertzel algorithm.

Figure 2. The flowchart of the compensated Goertzel algorithm with dynamic error estimates.

Figure 3. Accuracy of evaluation of

ω (z) = {(z - 1 - i)}^{n}

at

z = 1.333 + 1.333 i

for

n = 3:42

.

Figure 4. The relative errors of DFT for polynomials with random coefficients.

Table 1. Error-free transformations, their properties and operation costs.

Algorithm	Properties	Flops
$[x, y] = TwoSum (a, b)$	$x = a \oplus b$ , $x + y = a + b$	6
$[x, y] = TwoProd (a, b)$	$x = a \otimes b$ , $x + y = a \times b$	17
$[x, y] = TwoProdRC (a, b)$	$x = a \otimes b$ , $x + y = a \times b$	34
$[x, y] = TwoSumCplx (a, b)$	$x = a \oplus b$ , $x + y = a + b$	12
$[p, e, f, g] = TwoProdCplx (a, b)$	$p = a \otimes b$ , $p + e + f + g = a \times b$	80

{⊕, ⊗} represents {+, ×} in floating-point operations.

Table 2. Comparison of computational costs of

Horner, Goertzel, CompHorner, CompGoertzel

and

CompGoertzelwErr

algorithms.

Variates	Coefficients	$Horner$	$Goertzel$	$CompHorner$	$CompGoertzel$	$CompGoertzelwErr$
$z \in R$	$a_{n} \in R$	2N	4N + 4	26N + 3	55N + 45	59N + 60
$z \in R$	$a_{n} \in C$	4N	8N + 6	52N + 6	110N + 66	114N + 81
$z \in C$ and $| z | \neq 1$	$a_{n} \in R$	7N − 4	4N + 7	90N + 6	55N + 91	59N + 106
$z \in C$ and $| z | \neq 1$	$a_{n} \in C$	8N	8N + 12	97N + 6	110N + 150	114N + 165
$z \in C$ , $| z | = 1$ and $z \neq \pm 1$	$a_{n} \in R$	7N − 4	3N + 4	90N + 6	34N + 26	39N + 41
$z \in C$ , $| z | = 1$ and $z \neq \pm 1$	$a_{n} \in C$	8N	6N + 9	97N + 6	68N + 90	72N + 105

Table 3. Theoretical computational complexity and measured running time ratios in

a_{n} \in R

.

Variates	Source	$\frac{CompGoertzel}{Goertzel}$	$\frac{CompGoertzel}{CompHorner}$	$\frac{CompGoertzelwErr}{CompGoertzel}$	$\frac{CompGoertzelwErr}{CompHorner}$
$z \in R$	Theoretical	13.75	2.12	1.07	2.27
$z \in R$	Env1	9.15	1.42	1.13	1.59
$z \in R$	Env2	2.76	1.35	1.11	1.49
$z \in C$ and $| z | \neq 1$	Theoretical	13.75	61.14%	1.07	65.59%
$z \in C$ and $| z | \neq 1$	Env1	9.29	64.19%	1.13	72.22%
$z \in C$ and $| z | \neq 1$	Env2	2.81	67.71%	1.1	74.14%
$z \in C$ , $| z | = 1$ and $z \neq \pm 1$	Theoretical	11.33	37.79%	1.15	43.35%
$z \in C$ , $| z | = 1$ and $z \neq \pm 1$	Env1	6.53	44.09%	1.1	48.43%
$z \in C$ , $| z | = 1$ and $z \neq \pm 1$	Env2	2.22	53.66%	1.09	58.38%

Table 4. Theoretical computational complexity and measured running time ratios in

a_{n} \in C

.

Variates	Source	$\frac{CompGoertzel}{Goertzel}$	$\frac{CompGoertzel}{CompHorner}$	$\frac{CompGoertzelwErr}{CompGoertzel}$	$\frac{CompGoertzelwErr}{CompHorner}$
$z \in R$	Theoretical	13.75	2.12	1.04	2.19
$z \in R$	Env1	10.32	1.24	1.17	1.46
$z \in R$	Env2	4.8	1.16	1.35	1.57
$z \in C$ and $| z | \neq 1$	Theoretical	13.75	1.13	1.04	1.18
$z \in C$ and $| z | \neq 1$	Env1	10.37	1.25	1.19	1.48
$z \in C$ and $| z | \neq 1$	Env2	4.8	1.17	1.35	1.58
$z \in C$ , $| z | = 1$ and $z \neq \pm 1$	Theoretical	11.33	70.13%	1.06	74.26%
$z \in C$ , $| z | = 1$ and $z \neq \pm 1$	Env1	7.21	84.61%	1.17	98.54%
$z \in C$ , $| z | = 1$ and $z \neq \pm 1$	Env2	3.67	89.22%	1.33	1.18

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, C.; Du, P.; Li, K.; Liu, Y.; Jiang, H.; Quan, Z. Accurate Goertzel Algorithm: Error Analysis, Validations and Applications. Mathematics 2022, 10, 1788. https://doi.org/10.3390/math10111788
