1. Introduction
Let
be the solution to the equation
An effective iterative method used for solving this equation that makes direct use of
(but no derivatives of
) is the
secant method that is discussed in many books on numerical analysis. See, for example, Atkinson [
1], Dahlquist and Björck [
2], Henrici [
3], Ralston and Rabinowitz [
4], and Stoer and Bulirsch [
5]. See also the recent note [
6] by the author, in which the treatment of the secant method and those of the Newton–Raphson, regula falsi, and Steffensen methods are presented in a unified manner.
Recently, this method was generalized by the author in [
7] as follows: Starting with
,
initial approximations to
, we generate a sequence of approximations
, via the recursion
being the derivative of the polynomial
that interpolates
at the points
. (Thus,
is of degree
k.) Clearly, the case
is simply the secant method. In [
7], we also showed that, provided
are sufficiently close to
, the method converges with order
, that is,
for some constant
C, and that
. (We call
the
order of convergence of the method or the
order of the method for short.) Here
is the only positive root of the polynomial
. We also have that
Actually, rounded to four significant figures,
Note that to compute we need knowledge of only , and because have already been computed, is the only new quantity to be computed. Thus, each step of the method requires to be computed only once. From this, it follows that the efficiency index of this method is simply and that this index approaches 2 by increasing k even moderately, as can be concluded from the values of given above.
In this work, we consider the application of this method to simple complex roots of a function
, where
z is the complex variable. Let us denote a real or complex root of
by
again; that is,
and
. Thus, starting with
,
initial approximations to
, we generate a sequence of approximations
via the recursion
being the derivative of the polynomial
that interpolates
at the points
. As in [
7], we can use Newton’s interpolation formula to generate
and
. Thus
and
Here,
is the divided difference of order
m of the function
over the set of points
and is a symmetric function of these points. For details, we refer the reader to [
7].
As proposed in [
7], we generate the
initial approximations as follows: We choose the approximations
first. We then generate
by applying our method with
(that is, with the secant method). Next, we apply our method to
with
and obtain
, and so on, until we have generated all
initial approximations, via
Remark 1.
- 1.
Instead of choosing arbitrarily, we can generate it as as suggested in Brin [8], which is quite sensible since is small near the root α. We can also use the method of Steffensen—which uses only and no derivatives of —to generate from ; thus, - 2.
It is clear that, in case takes on only real values along the axis and we are looking for nonreal roots of , at least one of the initial approximations must be chosen to be nonreal.
- 3.
We would like to mention that Kogan, Sapir, and Sapir [9] have proposed another generalization of the secant method for simple real roots of nonlinear equations that resembles our method described in (1). In the notation of (1), this method produces a sequence of approximations viastarting with arbitrary and , and it is of order 2. Note that, in (6), interpolates at the points hence is of degree n, which is tending to infinity. In (1), is of degree k, which is fixed. - 4.
Yet another generalization of the secant method for finding simple real roots of was recently given by Nijmeijer [10]. This method too requires no derivative information, requires one evaluation of per iteration, and has the same order of convergence as our method. It follows an idea of applying a convergence acceleration method, such as Aitken’s -process, to approximations obtained from the secant method, as proposed by Han and Potra [11]. Because Nijmeijer’s method is not based on polynomial interpolation, it is completely different from our method, however. For Aitken’s -process, see [1,2,3,4,5]. See also [12] (Chapter 15) by the author.
In the next section, we analyze the local convergence properties of the method as it is applied to complex roots. We show that the analysis of [
7] can be extended to the complex case following some clever manipulation. We prove that the order
of the method is the same as that we discovered in the real case. In
Section 3, we provide two numerical examples to confirm the results of our convergence analysis.
2. Local Convergence Analysis
We now turn to the analysis of the sequence that is generated via (2). Our treatment covers all .
In our analysis, we will make use of the Hermite-Genocchi formula that provides an integral representation for divided differences (For a proof of this formula, see Atkinson [
1], for example). Even though this formula is usually stated for functions defined on real intervals, it is easy to verify (see Filipsson [
13], for example) that it also applies to functions defined in the complex plane under proper assumptions. Thus, provided
is analytic on
E, a bounded closed convex set in the complex plane, and provided
are in
E, there holds
Here
is the
m-dimensional simplex defined as
We note that (7) holds whether the
are distinct or not. We also note that
is a symmetric and continuous function of its arguments.
By the conditions we have imposed on
, it is easy to see that the integrand
in (7) is always defined because
is in the set
E and
is analytic on
E. This is so because, by (7) and (8),
which implies that
is a convex combination of
hence is in the set
, the convex hull of the points
, and
. Consequently, taking moduli on both sides of (7), we obtain, for all
in
E,
In addition, since
in (7), as
for all
, there hold
and
, and hence
In (9) and (10), we have also invoked the fact that (see [
14] (p. 346), for example)
We will make use of these in the proof of our main theorem that follows. This theorem and its proof are almost identical to that given in [
7] once we take into account, where and when needed, the fact that we are now working in the complex plane. For convenience, we provide all the details of the proof.
Theorem 1. Let α be a simple root of , that is, , but . Let be the closed disk of radius r containing α as its center, that is,Let be analytic on . Choose a positive integer k and let be distinct initial approximations to α. Generate viawhere is the polynomial of interpolation to at the points . Then, provided are in and sufficiently close to α, we have the following cases: - 1.
If , the sequence converges to α, andThe order of convergence is , , where is the only positive root of the polynomial and satisfiese being the base of natural logarithms, andwhich also implies that - 2.
If is a polynomial of degree at most k, the sequence converges to α, andThus, converges with order 2
if , and with order greater than 2
if .
Proof. We start by deriving a closed-form expression for the error in
. Subtracting
from both sides of (12), and noting that
we have
We now note that
and that
and
Note that (20) can be obtained by starting with the divided difference representation of
, namely,
and by computing
via L’Hôpital’s rule.
For simplicity of notation, let
and rewrite (19) and (20) as
Substituting these into (18), we finally obtain
We now prove that convergence takes place. First, let us assume without loss of generality that
for all
, and set
. (This is possible since
and
and we can choose
r as small as we wish to also guarantee
.) Next, let
Thus, assuming that
and noting that
is a convex set, we have by (9) that
Next, choose the ball
sufficiently small (with
) to ensure that
. It can now be verified that, provided
are all in
, there holds
where
Consequently, by (25),
, which implies that
, just like
. Therefore, if
are chosen in
, then
for all
, hence
and
.
As for (13) when
, we proceed as follows: By the fact that
, we first note that, by (20) and (21),
and thus
. This means that
and, equivalently, that
converges with order greater than 1. As a result,
and
Consequently, expanding in (25) the product
, we have
Substituting (27) into (25), and defining
we obtain
Dividing both sides of (29) by
, and defining
we have
Now, by (10), (22), and (26),
Because
and
are finite, and because
and
, it follows that there exist a positive integer
N and positive constants
and
D, with
when
, for which (31) gives
Using (33), it is easy to show that
which, by the fact that
, implies that
is a bounded sequence. Making use of this fact, we have
. Substituting this into (31), and invoking (32), we next obtain
, which is precisely (13).
That
, the order of the method, as defined in the statement of the theorem, satisfies (14) and (15) follows from Traub [
15] (Chapter 3). We provide a simplified treatment of this topic in
Appendix A.
This completes the proof of part 1 of the theorem.
When
is a polynomial of degree at most
k, we first observe that
for all
z, which implies that
for all
z, hence also
for all
z. Therefore, we have that
in the recursion of (12). Consequently, (12) becomes
which is the recursion for the Newton–Raphson method. Thus, (17) follows. This completes the proof of part 2 of the theorem. □
3. Numerical Examples
In this section, we present two numerical examples that we treated with our method. Our computations were done in quadruple-precision arithmetic (approximately 35-decimal-digit accuracy). Note that in order to verify the theoretical results concerning iterative methods with order greater than unity, we need to use computer arithmetic of high precision (preferably, of variable precision, if available) because the number of correct significant decimal digits in the increases dramatically from one iteration to the next as we are approaching the solution.
In both examples below, we take
. We choose
and
and compute
using one step of the secant method, namely,
Following that, we compute
via
In our examples, we have carried out our computations for several sets of
, and we have observed essentially the same behavior that we observe in
Table 1 and
Table 2.
Example 1. Consider , where , whose solutions are , . We would like to obtain the root . We chose and . The results of our computations are given in Table 1. From (13) and (16) in Theorem 1, we should haveandand these seem to be confirmed in Table 1. Furthermore, in infinite-precision arithmetic, should have close to 60 correct significant figures; we do not see this in Table 1 due to the fact that the arithmetic we have used to generate Table 1 can provide an accuracy of at most 35 digits. Example 2. Consider , where . has infinitely many roots , . We would like to obtain the root . We chose and . The results of our computations are given in Table 2. From (13) and (16) in Theorem 1, we should haveandand these seem to be confirmed in Table 2. Furthermore, in infinite-precision arithmetic, should have close to 50 correct significant figures; we do not see this in Table 2 due to the fact that the arithmetic we have used to generate Table 2 can provide an accuracy of at most 35 digits. Remark 2. In relation to the examples we have just presented, we would like to discuss the issue of estimating the relative errors in the . This should help the reader when studying the numerical results included in Table 1 and Table 2. Starting with (13) and (15), we first note that, for all large n,Therefore, assuming also that , we haveNow, if has correct significant figures, we have . If, in addition, for some r, then we will haveFor simplicity, let us consider the case , which is practically what we have in the two examples we have treated. Then has approximately correct significant decimal digits. That is, if has q correct significant decimal digits, then, due to the fact that , will have times as many correct significant decimal digits as .