An Accelerated Steffensen Iteration via Interpolation-Based Memory and Optimal Convergence

Shuai Wang; Chenshuo Lu; Zhanmeng Yang; Tao Liu

doi:10.3390/math14030498

,

and

¹

Department of Mathematics and Physics, Changchun Guanghua University, Changchun 130033, China

²

School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China

³

Sydney Smart Technology College, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China

^*

Author to whom correspondence should be addressed.

Mathematics2026, 14(3), 498;https://doi.org/10.3390/math14030498

This article belongs to the Special Issue Computational Methods in Analysis and Applications, 3rd Edition

Version Notes

Order Reprints

Abstract

We develop a novel Steffensen-type iterative solver to solve nonlinear scalar equations without requiring derivatives. A two-parameter one-step scheme without memory is first introduced and analyzed. Its optimal quadratic convergence is then established. To enhance the convergence rate without additional functional evaluations, we extend the scheme by incorporating memory through adaptively updated accelerator parameters. These parameters are approximated by Newton interpolation polynomials constructed from previously computed values, yielding a derivative-free method with R-rate of convergence of approximately 3.56155. A dynamical system analysis based on attraction basins demonstrates enlarged convergence regions compared to Steffensen-type methods without memory. Numerical experiments further confirm the accuracy of the proposed scheme for solving nonlinear equations.

Keywords:

Steffensen method; derivative-free iteration; methods with memory; convergence analysis; interpolation-based acceleration; attraction basins

MSC:

65H05; 65H99

1. Introduction

The numerical approximation of simple zeros of nonlinear scalar equations

g (x) = 0,

is among the most fundamental problems in computational mathematics; see, for instance, [1,2]. Classical derivative-based procedures, such as Newton’s method, possess attractive local convergence properties, yet their success relies heavily on the availability and stability of derivative information [3]. In many practical models—especially those arising in applied physics, computational chemistry, engineering design, and matrix-oriented formulations—analytic derivatives may be unavailable or computationally expensive. This necessitates the development of derivative-free schemes with competitive convergence properties.

A useful parameterized modification of Steffensen’s scheme is introduced in [4], expressed as

x_{j + 1} = x_{j} - \frac{g (x_{j})}{g [x_{j}, w_{j}]}, w_{j} = x_{j} + β g (x_{j}), β \in R ∖ {0}, j = 0, 1, 2, \dots,

(1)

wherein

g [x_{j}, w_{j}] = \frac{g (w_{j}) - g (x_{j})}{w_{j} - x_{j}},

denotes the first-order divided difference. While the original formula in [5] corresponds to

β = 1

, the introduction of a parameter

β \neq 0

still keeps the quadratic order of convergence and provides a family of derivative-free accelerators well-suited to nonlinear and matrix-oriented problems; cf. [6,7,8].

Following the classification of Traub [9], iterative processes may be grouped into two broad categories: methods without memory and methods with memory. Kung and Traub formulated in [10] a well-known optimality conjecture, stating that any derivative-free method without memory employing

η

functional evaluations for every iterate can reach at most the convergence order

2^{η - 1}

. For

η = 2

, this implies the upper bound

\bar{p} = 2

, which is attained by Steffensen-type procedures. Motivated by this limitation, the literature contains extensive developments of optimal and non-optimal methods without memory.

Iteration (1) realizes the optimal order

\bar{p} = 2

under this conjecture. By contrast, methods with memory preserve the quantity of functional evaluations per iterate but introduce adaptive parameters that are updated dynamically from previously computed approximations. These parameters are typically constructed through interpolation formulas, divided differences of various orders, or rational/Padé-type local models. Because no additional evaluations are performed, the computational cost remains unchanged while the convergence order is increased, improving both asymptotic speed and the computational efficiency index [11].

The efficiency index, proposed in [12], is defined as

E I = {\bar{p}}^{\frac{1}{η}},

(2)

where

\bar{p}

denotes the (local) order of convergence, and

η

denotes the quantity of functional evaluations for every iterate. For derivative-free methods, the enhancement of

\bar{p}

without increasing

η

is particularly significant. In addition to efficiency, attraction basins provide a perspective on convergence, revealing the robustness of an iteration to initial guesses and identifying regions of stability or chaotic behavior; see [13].

In the root-finding literature, the order of convergence and the asymptotic error constant quantify the local speed of an iteration in a precise sense. If

{x_{j}}

converges to a simple zero

α

and

e_{j} = x_{j} - α

, then the method has (Q-)order

\bar{p} \geq 1

if there exists a finite nonzero constant C such that

\lim_{j \to \infty} \frac{| e_{j + 1} |}{| e_{j} |^{\bar{p}}} = C, 0 < C < \infty .

In this relation, the exponent

\bar{p}

determines the rate at which the error contracts (larger

\bar{p}

means a higher-order, faster asymptotic contraction), while the constant C governs the asymptotic scale of the error reduction among methods of the same order. Accordingly, when comparing two methods with the same order, the one with a smaller asymptotic error constant typically achieves smaller errors with fewer iterations in the asymptotic regime. For methods with memory, the order is commonly expressed in the R-sense: there exists

r \geq 1

and a constant

K \neq 0

such that an error relation of the form (11) holds, that is,

e_{j + 1} = K e_{j}^{r} + O (e_{j}^{r + 1})

. In what follows, we use the term “order” (Q-order for memoryless methods and R-order for methods with memory) when referring to the exponent

\bar{p}

or r, and we reserve the phrase "asymptotic error constant" for the leading coefficient C or K.

To place our development in context, we recall several recent Steffensen-type contributions. A two-parameter derivative-free scheme without memory introduced in [14] reads

\begin{matrix} \{\begin{matrix} w_{j} = x_{j} - β g (x_{j}), \\ x_{j + 1} = x_{j} - \frac{g (x_{j})}{g [x_{j}, w_{j}]} (1 + ξ \frac{g (w_{j})}{g [x_{j}, w_{j}]}), \end{matrix} β \in R ∖ {0}, ξ \in R, j \geq 0 . \end{matrix}

(3)

This method retains quadratic convergence while incorporating an additional free parameter

ξ

.

A two-step method with memory presented in [15] illustrates the effect of incorporating adaptive parameters through Newton interpolant (

N (\cdot)

):

\begin{matrix} \{\begin{matrix} v_{j} = x_{j} - β_{j} g (x_{j}), \\ y_{j} = x_{j} - \frac{g (x_{j})}{g [x_{j}, v_{j}]}, \\ x_{j + 1} = y_{j} - \frac{g (y_{j})}{g [x_{j}, v_{j}]}, \\ β_{j + 1} = \frac{1}{N_{2}^{'} (x_{j})} \end{matrix} \end{matrix}

(4)

requiring three function evaluations per iteration and achieving an R-order of convergence approximately

3.37

.

A one-step method with memory is due to Džunić [16], who introduced adaptive parameters computed from Newton interpolation polynomials:

\begin{matrix} \{\begin{matrix} β_{j} = - \frac{1}{N_{2}^{'} (x_{j})}, p_{j} = - \frac{N_{3}^{″} (w_{j})}{2 N_{3}^{'} (w_{j})}, j \geq 1, \\ w_{j} = x_{j} + β_{j} g (x_{j}), j \geq 0, \\ x_{j + 1} = x_{j} - \frac{g (x_{j})}{g [x_{j}, w_{j}] + p_{j} g (w_{j})}, \end{matrix} \end{matrix}

(5)

where

N_{k} (\cdot)

denotes the Newton interpolation polynomial having degree k. The resulting R-order equals

\frac{1}{2} (3 + \sqrt{17}) \approx 3.56

.

Derivative-free solvers thus remain indispensable in contexts where evaluating functional derivatives is impractical, as emphasized in [17]. In view of recent improvements in the literature [18,19], our aim is to construct a Steffensen-type procedure that simultaneously increases the local convergence rate as well as the index of efficiency, without any rise in the quantity of functional evaluations. Here, and throughout, "classical Steffensen-type" denotes the standard derivative-free schemes without memory based on first-order divided differences (including (1)) which are also old root-finding methods in this field. This terminology does not imply that the proposed variants with memory are “non-classical,” but only that they extend the classical framework by incorporating adaptive parameters without increasing the number of function evaluations.

Whenever g is differentiable at u, the divided difference satisfies the well-known mean-value relation

g [u, v] = g^{'} (ξ_{u, v}), ξ_{u, v} \in (u, v),

(6)

establishing that

g [u, v]

acts as a discrete derivative. To see its approximation accuracy near a simple root

α

of g, consider the Taylor expansions

\begin{matrix} g (u) & = g^{'} (α) (u - α) + \frac{g^{″} (α)}{2} {(u - α)}^{2} + \frac{g^{‴} (α)}{6} {(u - α)}^{3} + O ({(u - α)}^{4}), \\ g (v) & = g^{'} (α) (v - α) + \frac{g^{″} (α)}{2} {(v - α)}^{2} + \frac{g^{‴} (α)}{6} {(v - α)}^{3} + O ({(v - α)}^{4}) . \end{matrix}

Subtracting these expressions and dividing by

(v - u)

yields the expansion

g [u, v] = g^{'} (α) + \frac{g^{″} (α)}{2} (u + v - 2 α)

+ \frac{g^{‴} (α)}{6} ({(u - α)}^{2} + (u - α) (v - α) + {(v - α)}^{2}) + O (\max {| u - α |, | v - {α |}}^{3}),

(7)

illustrating that the approximation error decays quadratically with respect to the distances of u and v from

α

. In Steffensen-type schemes such as (1) and (3), the pair

(u, v)

is taken as

(x_{j}, w_{j})

, where

w_{j} = x_{j} + β g (x_{j})

; hence, the divided difference

g [x_{j}, w_{j}]

serves as an approximation of

g^{'} (x_{j})

without requiring explicit differentiation.

In order to improve the performance of (1) or (3), it is natural to introduce acceleration parameters whose purpose is to reduce the truncation error in the associated error equation. Consider a general one-step Steffensen-type update of the form

x_{j + 1} = x_{j} - A_{j} \frac{g (x_{j})}{g [x_{j}, w_{j}]},

(8)

where

A_{j}

is a parameter (possibly depending on previously computed quantities). Let

e_{j} = x_{j} - α

denote the error. Using the Taylor expansions and the divided difference expansion (7), one may write

\frac{g (x_{j})}{g [x_{j}, w_{j}]} = e_{j} - \frac{g^{″} (α)}{2 g^{'} (α)} e_{j}^{2} + Q_{j} e_{j}^{3} + O (e_{j}^{4}),

where

Q_{j}

depends smoothly on

β

,

g^{″} (α)

,

g^{‴} (α)

, and the structure of

w_{j}

. Substituting this expression into (8) produces the error recursion

e_{j + 1} = (1 - A_{j}) e_{j} + A_{j} \frac{g^{″} (α)}{2 g^{'} (α)} e_{j}^{2} - A_{j} Q_{j} e_{j}^{3} + O (e_{j}^{4}) .

(9)

If

A_{j} \equiv 1

, then the linear term vanishes and the quadratic term dominates, yielding quadratic convergence. However, if the parameter

A_{j}

is allowed to satisfy an approximation of the form

A_{j} = 1 + λ_{j} e_{j} + O (e_{j}^{2}),

(10)

the linear coefficient in (9) still vanishes, but the quadratic coefficient becomes

A_{j} \frac{g^{″} (α)}{2 g^{'} (α)} = \frac{g^{″} (α)}{2 g^{'} (α)} + λ_{j} \frac{g^{″} (α)}{2 g^{'} (α)} e_{j} + O (e_{j}^{2}) .

By choosing

λ_{j}

suitably—or by approximating it via interpolation of earlier steps, as done in methods with memory—the quadratic term can be suppressed or reduced, thereby elevating the order of convergence from 2 to a value exceeding 3. This mechanism is the theoretical motivation behind the acceleration parameters used in (3), (4), and (5).

In methods with memory,

A_{j}

is not a fixed number but an adaptively updated parameter, typically obtained by approximating higher-order information such as

g^{″} (α) / g^{'} (α)

or similar ratios using interpolation polynomials built from stored points

{x_{j}, w_{j}, x_{j - 1}, w_{j - 1}, \dots}

. Let

{\hat{λ}}_{j}

denote an approximation of

λ_{j}

in (10) computed without additional evaluations. Then, the resulting parameter satisfies

A_{j} = 1 + {\hat{λ}}_{j} e_{j} + O (e_{j}^{2}),

and substituting it into (9) yields an error equation of the type

e_{j + 1} = K e_{j}^{r} + O (e_{j}^{r + 1}),

(11)

where

r > 2

is the R-order of convergence induced by the adaptive construction. Schemes such as (5) achieve

r = \frac{1}{2} (3 + \sqrt{17}) \approx 3.56

through this mechanism. The central idea exploited later in this work is that such accelerator parameters may be approximated efficiently using Newton interpolation (or Padé-like formulas) without increasing the evaluation count of the basic Steffensen-type method. In addition, the authors in the works [20,21] solve nonlinear equation systems using arbitrary operators and functionals with certain conditions.

The present work introduces a one-step Steffensen-type method with memory that attains an R-order of convergence while maintaining exactly the same functional evaluation count as (1). Our construction begins with a two-parameter Steffensen-type method without memory and proceeds by embedding dynamically updated accelerator parameters obtained via Newton interpolation (or analogously via Padé-type local models [22]). The adaptive parameters elevate the R-order from 2 to approximately

3.56

, matching the best-performing one-step derivative-free schemes with memory. Numerical and dynamical analysis of attraction basins demonstrates that the proposed method not only accelerates convergence but also enlarges convergence regions in the complex plane.

The convergence analysis provided in Section 2 is local; under the smoothness assumptions stated there and for initial guesses sufficiently close to the simple zero

α

, the proposed methods converge with the proven orders. In addition, Section 3 offers a local dynamical assessment by studying basins of attraction in the complex plane, which provides information on convergence regions; such basin portraits are not a formal convergence theorem, but they do demonstrate enlarged convergence domains and fewer divergent initial conditions for the with-memory scheme in the tested families.

The structure of the article is as follows. In Section 2, we introduce a two-parameter Steffensen-type iterative method without memory and establish its quadratic convergence. Using interpolants constructed from previously computed iterates and function values, we then derive an enhanced family of methods with memory. In Section 3, the dynamical behavior of the new method is investigated through detailed attraction basin diagrams. Computational experiments in Section 4 confirm the good performance of the contributed approach relative to existing derivative-free solvers. Concluding remarks are provided in Section 5.

2. Developing a New Scheme

In this section, we construct a Steffensen-type method containing two free parameters, prove its quadratic convergence, and subsequently introduce a memory-based improvement by adaptively updating the parameters through Newton interpolation of appropriate degrees. Throughout the analysis, we assume that g is adequately smooth in a vicinity of its simple zero

α

, though the resulting algorithm is derivative-free in implementation.

We begin by presenting a two-parameter modification of (1), written as

\begin{matrix} \{\begin{matrix} w_{j} = x_{j} + β g (x_{j}), β \in R ∖ {0}, j \geq 0, \\ x_{j + 1} = x_{j} - \frac{g (x_{j})}{g [x_{j}, w_{j}] + p t_{j}}, t_{j} = \frac{g (w_{j})}{g [x_{j}, w_{j}]}, p \in R . \end{matrix} \end{matrix}

(12)

The structure of (12) reduces to (1) for

p = 0

, while the nonzero parameter p introduces an auxiliary correction term

p t_{j}

in the denominator. The goal is to retain second-order convergence while obtaining a formulation that can later be adapted to methods with memory. At iteration j, scheme (12) requires exactly two evaluations of g, namely

g (x_{j})

and

g (w_{j})

. Indeed, the divided difference

g [x_{j}, w_{j}]

is formed from the already computed pair

{g (x_{j}), g (w_{j})}

, and the quantity

t_{j} = g (w_{j}) / g [x_{j}, w_{j}]

introduces no additional evaluation. Hence, (12) is a derivative-free method with an evaluation count

η = 2

per iterate, and its quadratic order (as seen below) is optimal in the Kung–Traub sense for memoryless derivative-free iterations.

To establish the local behavior of (12), let

e_{j} = x_{j} - α

and

e_{w, j} = w_{j} - α

. Following standard notation, we write

b_{k} = \frac{1}{k!} \frac{g^{(k)} (α)}{g^{'} (α)}

,

k \geq 2

. Taylor expansion of the function above

α

gives the classical representation

g (x_{j}) = g^{'} (α) (e_{j} + b_{2} e_{j}^{2} + b_{3} e_{j}^{3} + b_{4} e_{j}^{4} + O (e_{j}^{5})) .

(13)

Since

w_{j} = x_{j} + β g (x_{j})

, inserting (13) into this expression yields

e_{w, j} = (β g^{'} (α) + 1) e_{j} + β b_{2} g^{'} (α) e_{j}^{2} + β b_{3} g^{'} (α) e_{j}^{3} + O (e_{j}^{4}) .

(14)

The divided difference

g [x_{j}, w_{j}]

is expanded by substituting the Taylor expansions of

g (x_{j})

and

g (w_{j})

, producing

\begin{matrix} g [x_{j}, w_{j}] & = g^{'} (α) + b_{2} g^{'} (α) (β g^{'} (α) + 2) e_{j} + (β b_{2}^{2} g^{'} (α) + b_{3} (β g^{'} (α) (β g^{'} (α) + 3) + 3)) e_{j}^{2} \\ + (β g^{'} (α) + 2) (2 β b_{2} b_{3} g^{'} (α) + b_{4} (β g^{'} (α) (β g^{'} (α) + 2) + 2)) e_{j}^{3} + O (e_{j}^{4}) . \end{matrix}

(15)

Using (15), we compute

t_{j} = g (w_{j}) / g [x_{j}, w_{j}]

. Substitution of the expansions leads to

t_{j} = (β g^{'} (α) + 1) e_{j} - b_{2} e_{j}^{2} + (b_{2}^{2} - b_{3}) (β g^{'} (α) (β g^{'} (α) + 2) + 2) e_{j}^{3} + O (e_{j}^{4}) .

(16)

Finally, inserting (16) into the denominator of (12) and expanding the resulting fraction, we obtain the error equation

e_{j + 1} = \frac{(β g^{'} (α) + 1) (b_{2} g^{'} (α) + p)}{g^{'} (α)} e_{j}^{2} + \frac{g^{'} (α) ζ - {(β g^{'} (α) p + p)}^{2}}{g^{'} {(α)}^{2}} e_{j}^{3} + O (e_{j}^{4}),

(17)

where

ζ = b_{2}^{2} (- g^{'} (α)) (β g^{'} (α) (β g^{'} (α) + 2) + 2)

+

b_{3} g^{'} (α) (β g^{'} (α) + 1) (β g^{'} (α) + 2)

−

b_{2} p (β g^{'} (α)

(2 β g^{'} (α) + 5) + 4) .

This establishes the following convergence result.

Theorem 1.

Let α be a simple zero of g, i.e.,

g (α) = 0

and

g^{'} (α) \neq 0

. Assume that

g \in C^{4} (I)

on an open interval

I \subset R

with

α \in I

. Then, there exists

δ > 0

such that, for a suitable initial guess

x_{0} \in I

, the sequence

{x_{j}}_{j \geq 0}

generated by (12) is well defined and converges to α. Moreover, the convergence is quadratic.

Proof.

From (17), we observe that the leading term in the error satisfies

e_{j + 1} = C_{2} e_{j}^{2} + O (e_{j}^{3}), C_{2} = \frac{(β g^{'} (α) + 1) (b_{2} g^{'} (α) + p)}{g^{'} (α)},

with

C_{2} \neq 0

for generic parameter values. Thus, the iteration is of quadratic order. No derivatives of the function appear in the computational steps, and the smoothness assumption is solely required to derive the theoretical error. The method is therefore derivative-free and optimal in the Kung–Traub sense for two functional evaluations. □

The method (12) therefore achieves second-order convergence using only two functional evaluations per iteration, matching the optimal bound

\bar{p} = 2^{η - 1} = 2

for derivative-free, memoryless methods with

η = 2

evaluations. To exceed quadratic convergence, one must either increase the functional evaluations or incorporate memory; see [23,24] for further discussions. We pursue the latter approach now.

The key observation from the error Equation (17) is that superquadratic convergence occurs if the parameters satisfy

β = - \frac{1}{g^{'} (α)}, p = - b_{2} g^{'} (α),

since these values eliminate the quadratic term in (17). In practice, however,

α

,

g^{'} (α)

, and

b_{2}

are unknown. Thus, the goal becomes to construct asymptotically correct approximations

β_{j}

and

p_{j}

computed recursively from stored values

x_{j}, w_{j}, x_{j - 1}, w_{j - 1}, \dots

without requiring additional evaluations.

We impose the adaptive forms

β = β_{j}, p = p_{j},

(18)

and choose

β_{j} \approx - \frac{1}{g^{'} (α)}

,

p_{j} \approx - b_{2} g^{'} (α)

, with the approximations generated by Newton interpolation polynomials based on already computed points. This leads naturally to the following memory-based method:

\begin{matrix} \{\begin{matrix} β_{j} = - \frac{1}{N_{2}^{'} (x_{j})}, p_{j} = - \frac{N_{3}^{″} (w_{j})}{2}, j \geq 1, \\ w_{j} = x_{j} + β_{j} g (x_{j}), \\ x_{j + 1} = x_{j} - \frac{g (x_{j})}{g [x_{j}, w_{j}] + p_{j} t_{j}}, t_{j} = \frac{g (w_{j})}{g [x_{j}, w_{j}]} . \end{matrix} \end{matrix}

(19)

Use one startup step of the fixed-parameter scheme (12) (or (1) as a special case) with prescribed (

β_{0}

,

p_{0}

) to compute

x_{1}

(and

w_{0}

), then start the memory updates for

j \geq 1

using (19).

The merit of (19) lies in its ability to enhance the asymptotic convergence order without increasing the number of function evaluations per iteration. The improved performance stems from employing interpolation polynomials of degrees two and three, whose derivatives approximate

g^{'} (α)

and curvature-related quantities. In practice, one may use built-in interpolation procedures to compute

N_{2}^{'} (x_{j})

and

N_{3}^{″} (w_{j})

without forming the polynomials analytically.

The interpolation mechanism generalizes naturally: increasing the degree of the interpolation by two at each memory update yields higher R-orders approaching, but never exceeding, the theoretical limit of four. This leads to the general family

\begin{matrix} \{\begin{matrix} β_{j} = - \frac{1}{N_{2 l}^{'} (x_{j})}, p_{j} = - \frac{N_{2 l + 1}^{″} (w_{j})}{2}, j \geq l, \\ w_{j} = x_{j} + β_{j} g (x_{j}), \\ x_{j + 1} = x_{j} - \frac{g (x_{j})}{g [x_{j}, w_{j}] + p_{j} t_{j}}, t_{j} = \frac{g (w_{j})}{g [x_{j}, w_{j}]} . \end{matrix} \end{matrix}

(20)

We now provide the R-order of the scheme (19).

Theorem 2.

Take into consideration similar assumptions as in Theorem 1. Then, the Steffensen-type scheme with memory (19) attains an R-order of convergence equal to 3.56155.

Proof.

Let

α

be the simple zero of g and define the error sequences

e_{j} = x_{j} - α, e_{w, j} = w_{j} - α .

We emphasize that the error sequences in this theorem are local quantities, defined for iterates sufficiently close to the simple root

α

. Moreover, in the proof we use

\bar{p}

solely to denote the R-order exponent describing the auxiliary sequence

{w_{j}}

relative to

{x_{j}}

(see (22)); this exponent should not be confused with the parameter p appearing in the memoryless scheme (12) (which becomes

p_{j}

in (19)). Since the subsequent relations imply

(r - 1) (\bar{p} - 1) = 2

(see (33)–(35)), it follows that

\bar{p} \geq 1

whenever

r \geq 1

; in particular, values such as

\bar{p} = 0.5

are not compatible with the asymptotics induced by (19). We assume that the sequence

{x_{j}}

converges to

α

with R-order

r \geq 1

and that the auxiliary sequence

{w_{j}}

satisfies an R-type relation with exponent

\bar{p} \geq 1

(with respect to

{x_{j}}

) in the sense of (22):

e_{j + 1} \sim C e_{j}^{r}, C \neq 0,

(21)

and

e_{w, j} \sim D e_{j}^{\bar{p}}, D \neq 0,

(22)

as

j \to \infty

. Here, “∼” denotes asymptotic equivalence up to a nonzero constant factor. From (21), shifting the index by one gives

e_{j} \sim C e_{j - 1}^{r}

as

j \to \infty

, hence

e_{j} \sim e_{j - 1}^{r} .

Applying the same relation to

e_{j - 1}

yields

e_{j - 1} \sim C e_{j - 2}^{r}

, and therefore

e_{j} \sim {(e_{j - 2}^{r})}^{r} = e_{j - 2}^{r^{2}} .

Iterating this argument produces

e_{j} \sim e_{j - k}^{r^{k}}

for any fixed

k \geq 1

as

j \to \infty

. In particular,

e_{j} \sim e_{j - 1}^{r}, e_{j - 1} \sim e_{j - 2}^{r} .

(23)

Similarly, (22) implies

e_{w, j} \sim e_{j}^{\bar{p}} \sim e_{j - 1}^{r p} \sim e_{j - 2}^{r^{2} \bar{p}} \sim \dots,

in particular,

e_{w, j - 1} \sim e_{j - 1}^{\bar{p}} .

(24)

On the other hand, the structure of the method with memory (19) yields the following asymptotic relations (see the detailed expansions derived after (19)):

e_{w, j} \sim (1 + β_{j} g^{'} (α)) e_{j},

(25)

e_{j + 1} \sim (1 + β_{j} g^{'} (α)) (g^{'} (α) b_{2} + p_{j}) e_{j}^{2},

(26)

and, for the self-accelerating parameters,

1 + β_{j} g^{'} (α) \sim κ_{1} e_{j - 1} e_{w, j - 1},

(27)

b_{2} + p_{j} g^{'} (α) \sim κ_{2} e_{j - 1} e_{w, j - 1}, g^{'} (α) b_{2} + p_{j} \sim κ_{2} e_{j - 1} e_{w, j - 1},

(28)

where

κ_{1}

,

κ_{2}

are nonzero constants depending on derivatives of g at

α

. We first use (25) and (27) to obtain an exponent relation for

\bar{p}

and r. Combining (25) and (27) gives

e_{w, j} \sim (1 + β_{j} g^{'} (α)) e_{j} \sim κ_{1} e_{j - 1} e_{w, j - 1} e_{j} .

Using (24) and (23), we have

e_{w, j - 1} \sim e_{j - 1}^{\bar{p}}, e_{j} \sim e_{j - 1}^{r} .

Hence,

e_{w, j} \sim {\tilde{C}}_{1} e_{j - 1}^{1 + \bar{p} + r},

(29)

for some nonzero constant

{\tilde{C}}_{1}

. On the other hand, from (22) and (23), we also have

e_{w, j} \sim D e_{j}^{\bar{p}} \sim {\tilde{C}}_{2} e_{j - 1}^{r \bar{p}}

, for some

{\tilde{C}}_{2} \neq 0

. Equating the exponents of

e_{j - 1}

in these two asymptotic representations of

e_{w, j}

, we obtain

r \bar{p} = r + \bar{p} + 1 .

(30)

Next, we use (26), (27), and (28) to obtain a second relation between r and

\bar{p}

. From (26), we have

e_{j + 1} \sim (1 + β_{j} g^{'} (α)) (g^{'} (α) b_{2} + p_{j}) e_{j}^{2} .

Combining (27) and (28), we deduce

(1 + β_{j} g^{'} (α)) (g^{'} (α) b_{2} + p_{j}) \sim κ_{1} κ_{2} {(e_{j - 1} e_{w, j - 1})}^{2} .

Using (24) again, this gives

(1 + β_{j} g^{'} (α)) (g^{'} (α) b_{2} + p_{j}) \sim {\tilde{C}}_{3} e_{j - 1}^{2 + 2 \bar{p}},

for some nonzero constant

{\tilde{C}}_{3}

. Therefore, from (26),

e_{j + 1} \sim {\tilde{C}}_{4} e_{j - 1}^{2 + 2 \bar{p}} e_{j}^{2},

(31)

for some constant

{\tilde{C}}_{4} \neq 0

. Using (23), we have

e_{j} \sim e_{j - 1}^{r}

, so that

e_{j}^{2} \sim e_{j - 1}^{2 r} .

Substituting this into (31) yields

e_{j + 1} \sim {\tilde{C}}_{5} e_{j - 1}^{2 + 2 \bar{p} + 2 r},

for some constant

{\tilde{C}}_{5} \neq 0

. On the other hand, from the definition of the R-order r in (21) and the backward relation (23), we also have

e_{j + 1} \sim C e_{j}^{r} \sim {\tilde{C}}_{6} e_{j - 1}^{r^{2}},

for some nonzero constant

{\tilde{C}}_{6}

. Equating the exponents of

e_{j - 1}

in the two asymptotic representations of

e_{j + 1}

, we obtain

r^{2} = 2 r + 2 + 2 \bar{p} .

(32)

We are thus led to the system of algebraic equations

r p = r + \bar{p} + 1,

(33)

r^{2} = 2 r + 2 + 2 \bar{p},

(34)

with unknowns

r \geq 1

and

\bar{p} > 0

. From (33), we obtain

r \bar{p} - r - \bar{p} = 1 ⟺ (r - 1) (\bar{p} - 1) = 2,

so that

\bar{p} - 1 = \frac{2}{r - 1}, \bar{p} = 1 + \frac{2}{r - 1} .

(35)

Substituting (35) into (34) gives

r^{2} = 2 r + 2 + 2 (1 + \frac{2}{r - 1})

=

2 r + 4 + \frac{4}{r - 1}

. Rewriting,

r^{2} - 2 r - 4 = \frac{4}{r - 1}

. Multiplying both sides by

(r - 1)

and simplifying, we obtain the quadratic equation

r^{2} - 3 r - 2 = 0 .

(36)

The positive solution of (36) is

r = \frac{3 + \sqrt{17}}{2} \approx 3.56155,

which is precisely the claimed R-order of the sequence

{x_{j}}

generated by (19). This completes the proof. □

Theorem 2 establishes local convergence of (19) with R-order

r = \frac{1}{2} (3 + \sqrt{17})

under the standard smoothness and proximity assumptions near a simple root. In the root-finding sense, (19) is also stable with respect to small perturbations in the self-accelerating parameters

β_{j}

and

p_{j}

generated by interpolation, because such perturbations are of higher asymptotic order than the leading error term and therefore do not alter the R-order. Global behavior is not claimed as a theorem; instead, it is assessed numerically in Section 3 through basins of attraction, which provide empirical evidence of enlarged convergence regions for the memory-based scheme.

The value

r = \frac{1}{2} (3 + \sqrt{17}) \approx 3.56155

in Theorem 2 is the R-order attained by a one-step derivative-free scheme with memory constructed from divided differences and interpolation-based parameter updates without increasing the per-iterate evaluation count beyond that of (1). This R-order is consistent with other well-known one-step with-memory Steffensen-type approaches based on similar interpolation mechanisms (e.g., the order reported for (5)). We do not claim that

r \approx 3.56155

is the highest possible among all Steffensen-type variants in the literature; rather, the contribution is that (19) attains this high R-order within a simple one-step framework while preserving the evaluation count of (1).

To better understand the role of the Newton interpolation polynomials in the construction of the parameters

β_{j}

and

p_{j}

in (19), we now present explicit formulas and asymptotic expansions for

N_{2}

and

N_{3}

, followed by a stability analysis with respect to perturbations in

β_{j}

and

p_{j}

, and finally a remark on local well-posedness of the scheme.

We first consider the quadratic Newton interpolant

N_{2}

used in the definition of

β_{j}

, see e.g., [25]. Let the nodes be chosen as

x_{j - 1}, w_{j - 1}, x_{j},

and denote the corresponding function values by

G_{j - 1} = g (x_{j - 1}), W_{j - 1} = g (w_{j - 1}), G_{j} = g (x_{j}) .

The quadratic Newton interpolation polynomial passing through these three points can be written as

N_{2} (o) = G_{j} + g [x_{j}, x_{j - 1}] (o - x_{j}) + g [x_{j}, x_{j - 1}, w_{j - 1}] (o - x_{j}) (o - x_{j - 1}),

(37)

where

g [x_{j}, x_{j - 1}] = \frac{G_{j} - G_{j - 1}}{x_{j} - x_{j - 1}}, g [x_{j}, x_{j - 1}, w_{j - 1}] = \frac{g [x_{j}, x_{j - 1}] - g [x_{j - 1}, w_{j - 1}]}{x_{j} - w_{j - 1}} .

Differentiating (37) with respect to o gives

N_{2}^{'} (o) = g [x_{j}, x_{j - 1}] + g [x_{j}, x_{j - 1}, w_{j - 1}] (2 o - x_{j} - x_{j - 1}),

(38)

so that, in particular,

N_{2}^{'} (x_{j}) = g [x_{j}, x_{j - 1}] + g [x_{j}, x_{j - 1}, w_{j - 1}] (x_{j} - x_{j - 1}) .

(39)

By Taylor expanding g about

α

and using the standard divided-difference expansions, one obtains

N_{2}^{'} (x_{j}) = g^{'} (α) + C_{1} e_{j - 1} e_{w, j - 1} + O (\max {| e_{j} |, | e_{j - 1} |, | e_{w, j - 1} {|}}^{3}),

(40)

for a constant

C_{1}

depending on derivatives of g at

α

. The key point is that the first-order error term of

N_{2}^{'} (x_{j})

relative to

g^{'} (α)

is of order

e_{j - 1} e_{w, j - 1}

, which is a product of previous errors; hence, it tends to zero faster than each individual error, a cornerstone for methods with memory. Recalling that in (19), we set

β_{j} = - \frac{1}{N_{2}^{'} (x_{j})}

, we may write

β_{j} = - \frac{1}{g^{'} (α)} {(1 + \frac{C_{1}}{g^{'} (α)} e_{j - 1} e_{w, j - 1} + O (\max {| e_{j} |, | e_{j - 1} |, | e_{w, j - 1} {|}}^{3}))}^{- 1} .

(41)

Using the standard series expansion

{(1 + z)}^{- 1} = 1 - z + O (z^{2})

, we obtain

β_{j} = - \frac{1}{g^{'} (α)} + \frac{C_{1}}{g^{'} {(α)}^{2}} e_{j - 1} e_{w, j - 1} + O (\max {| e_{j} |, | e_{j - 1} |, | e_{w, j - 1} {|}}^{3}) .

(42)

Hence

1 + β_{j} g^{'} (α) = \frac{C_{1}}{g^{'} (α)} e_{j - 1} e_{w, j - 1} + O (\max {| e_{j} |, | e_{j - 1} |, | e_{w, j - 1} {|}}^{3}),

(43)

which justifies the structure of (27) and shows how the memory through

N_{2}^{'} (x_{j})

leads to a small factor

1 + β_{j} g^{'} (α)

of size

O (e_{j - 1} e_{w, j - 1})

.

We now turn to the cubic interpolant

N_{3}

that appears in the definition of

p_{j}

. Let the nodes be

w_{j - 1}

,

x_{j - 1}

,

x_{j}

,

w_{j}

, with corresponding values

W_{j - 1} = g (w_{j - 1})

,

G_{j - 1} = g (x_{j - 1})

,

G_{j} = g (x_{j})

,

W_{j} = g (w_{j})

. The cubic Newton interpolation polynomial through these four points is given by

\begin{matrix} N_{3} (o) & = W_{j} + g [w_{j}, x_{j}] (o - w_{j}) + g [w_{j}, x_{j}, x_{j - 1}] (o - w_{j}) (o - x_{j}) \\ + g [w_{j}, x_{j}, x_{j - 1}, w_{j - 1}] (o - w_{j}) (o - x_{j}) (o - x_{j - 1}) . \end{matrix}

(44)

Differentiating once yields

\begin{matrix} N_{3}^{'} (o) & = g [w_{j}, x_{j}] + g [w_{j}, x_{j}, x_{j - 1}] (2 o - w_{j} - x_{j}) \\ + g [w_{j}, x_{j}, x_{j - 1}, w_{j - 1}] (3 o^{2} - 2 o (w_{j} + x_{j} + x_{j - 1}) + (w_{j} x_{j} + w_{j} x_{j - 1} + x_{j} x_{j - 1})), \end{matrix}

(45)

while the second derivative is

\begin{matrix} N_{3}^{″} (o) & = 2 g [w_{j}, x_{j}, x_{j - 1}] \\ + g [w_{j}, x_{j}, x_{j - 1}, w_{j - 1}] (6 o - 2 (w_{j} + x_{j} + x_{j - 1})) . \end{matrix}

(46)

In particular, evaluating at

o = w_{j}

gives

N_{3}^{″} (w_{j}) = 2 g [w_{j}, x_{j}, x_{j - 1}] + 4 g [w_{j}, x_{j}, x_{j - 1}, w_{j - 1}] (w_{j} - x_{j - 1}) .

(47)

Using higher-order Taylor expansions of g around

α

, it follows that

\frac{N_{3}^{″} (w_{j})}{2} = \frac{g^{″} (α)}{2} + C_{2} e_{j - 1} e_{w, j - 1} + O (\max {| e_{j} |, | e_{j - 1} |, | e_{w, j - 1} {|}}^{3}),

(48)

where

C_{2}

depends on

g^{(3)} (α)

and

g^{(4)} (α)

. In scheme (19), the parameter

p_{j}

is chosen as

p_{j} = - \frac{N_{3}^{″} (w_{j})}{2}

, so that

p_{j} = - \frac{g^{″} (α)}{2} - C_{2} e_{j - 1} e_{w, j - 1} + O (\max {| e_{j} |, | e_{j - 1} |, | e_{w, j - 1} {|}}^{3}) .

(49)

Recalling that

b_{2} = \frac{g^{″} (α)}{2 g^{'} (α)}

, we have

b_{2} g^{'} (α) = \frac{g^{″} (α)}{2}

. Hence,

b_{2} g^{'} (α) + p_{j} = - \frac{g^{″} (α)}{2} - C_{2} e_{j - 1} e_{w, j - 1} + O (\max {| e_{j} |, | e_{j - 1} |, | e_{w, j - 1} {|}}^{3}),

(50)

so that, up to a nonzero multiplicative constant,

b_{2} + p_{j} g^{'} (α) \sim e_{j - 1} e_{w, j - 1},

(51)

which corresponds to the structure of (28). Combining (43) and (51) with the error Equation (26) explains why the error factor

e_{j + 1}

is asymptotically proportional to

e_{j}^{2}

multiplied by

e_{j - 1}^{2} e_{w, j - 1}^{2}

, ultimately leading to the improved R-order in Theorem 2.

We now examine the stability of the scheme (19) with respect to perturbations in

β_{j}

and

p_{j}

. Assume that the ideal (unattainable) choices satisfy

β^{*} = - \frac{1}{g^{'} (α)}

,

p^{*} = - b_{2} g^{'} (α)

, and suppose the actual parameters used by the algorithm can be written as

β_{j} = β^{*} + ε_{j}, p_{j} = p^{*} + δ_{j},

(52)

where

ε_{j}

and

δ_{j}

are perturbations stemming from the interpolation process and therefore depend on previous errors. We assume that, asymptotically, these perturbations satisfy

ε_{j} = O (e_{j - 1} e_{w, j - 1}), δ_{j} = O (e_{j - 1} e_{w, j - 1}) .

(53)

Inserting (52) into the general error Equation (17) and exploiting the ideal cancellation that would occur for

β^{*}

,

p^{*}

, we obtain

e_{j + 1} = K e_{j}^{3 + \frac{2}{\bar{p}}} + L_{1} ε_{j} e_{j}^{2} + L_{2} δ_{j} e_{j}^{2} + O (e_{j}^{3 + \frac{2}{\bar{p}} + 1}),

(54)

for some constants

K, L_{1}, L_{2}

depending on derivatives of g at

α

. Using (53) and the fact that

e_{w, j - 1} \sim e_{j - 1}^{\bar{p}}

, we have

ε_{j} e_{j}^{2} = O (e_{j - 1} e_{w, j - 1} e_{j}^{2}) = O (e_{j}^{2} e_{j}^{1 + \bar{p}}) = O (e_{j}^{3 + \bar{p}}),

and similarly for

δ_{j} e_{j}^{2}

. Thus, the perturbation terms in (54) are of order at least

e_{j}^{3 + \bar{p}}

, whereas the leading term behaves like

e_{j}^{3 + 2 / \bar{p}}

. Since for the positive root

\bar{p} \approx 3.56155

we have

3 + \bar{p} > 3 + 2 / \bar{p}

, the perturbation terms are of strictly higher order than the principal term. Therefore, the R-order of convergence is not affected by these perturbations, and the method is stable with respect to the interpolation-induced errors in

β_{j}

and

p_{j}

.

Finally, we address the local well-posedness of the method, i.e., the fact that all denominators involved remain nonzero for iterates sufficiently close to the simple root

α

. We state this as a lemma.

Lemma 1.

Assume that

g \in C^{4}

in a neighborhood of a simple zero α and that

g^{'} (α) \neq 0

. Then, there exists a neighborhood U of α such that, if

x_{0} \in U

and the iterates

{x_{j}}

generated by (19) remain in U, the following hold for all sufficiently large j:

g [x_{j}, w_{j}] + p_{j} t_{j} \neq 0, N_{2}^{'} (x_{j}) \neq 0, N_{2 l}^{'} (x_{j}) \neq 0,

for any fixed

l \geq 1

in the family (20). In particular, the iterations (19) and (20) are well defined in U.

Proof.

Here

m > 0

denotes a lower bound for

| g^{'} (x) |

on a neighborhood of

α

(guaranteed by continuity and

g^{'} (α) \neq 0

); it is unrelated to the number of functional evaluations per iteration. Since

g^{'} (α) \neq 0

, continuity of

g^{'}

implies that there exists a neighborhood U of

α

and a constant

m > 0

such that

| g^{'} (x) | \geq m > 0,

for all

x \in U

. From the expansion (40), we have

N_{2}^{'} (x_{j}) \to g^{'} (α) as x_{j} \to α,

so for j sufficiently large and

x_{j} \in U

, it follows that

| N_{2}^{'} (x_{j}) | \geq \frac{m}{2} > 0

. An analogous argument, using continuity of the higher-order divided differences and the fact that

N_{2 l}^{'} (x_{j}) \to g^{'} (α)

as

x_{j} \to α

, shows the existence of

j_{0}

such that for all

j \geq j_{0}

,

N_{2 l}^{'} (x_{j}) \neq 0

. Regarding the denominator

g [x_{j}, w_{j}] + p_{j} t_{j}

, note that the ideal parameter choices yield cancellation of the quadratic term but not of the denominator itself. From the expansions (15) and (16), along with the asymptotic behavior (42) and (49), we obtain

g [x_{j}, w_{j}] + p_{j} t_{j} = g^{'} (α) + O (e_{j}) + O (e_{j - 1} e_{w, j - 1}),

so that

g [x_{j}, w_{j}] + p_{j} t_{j} \to g^{'} (α),

as

j \to \infty

. Thus, for all sufficiently large j with

x_{j}, w_{j} \in U

, one has

|g [x_{j}, w_{j}] + p_{j} t_{j}| \geq \frac{m}{2} > 0 .

This ensures that none of the denominators vanish, and hence the schemes (19) and (20) are locally well-defined. □

The explicit constructions (37)–(47), asymptotic expansions (40)–(51), stability analysis (52)–(54), and the well-posedness Lemma 1 together provide a theoretical foundation for the Steffensen-type method with memory proposed in this work.

3. Basin of Attractions

The local convergence analysis in Section 2 provides information about the behavior of the proposed schemes when the initial approximation lies adequately near to a simple root. This leads naturally to the study of attraction basins and the associated dynamical systems generated by the iterative schemes; see more concrete discussions in [26].

Let

g : C \to C

be a nonlinear function, and let

Φ : C \to C

denote the iteration function corresponding to a one-step method without memory, so that the iteration can be written as

z_{j + 1} = Φ (z_{j}) .

(55)

In the case of the basic Steffensen-type method (12) with fixed parameters

β

and p, the iteration function takes the form

Φ (z) = z - \frac{g (z)}{g [z, z + β g (z)] + p \frac{g (z + β g (z))}{g [z, z + β g (z)]}},

(56)

which is a rational function of z whenever g is a polynomial. A point

α \in C

satisfying

Φ (α) = α

is a fixed point of

Φ

. Simple zeros of g are fixed points of

Φ

, since

g (α) = 0

implies

Φ (α) = α

. More generally, other fixed points may appear, including spurious or extraneous ones not satisfying

g (α) = 0

, [27].

For a given root

α

of g, the basin of attraction of

α

with respect to the iteration (55) is defined as

A (α) = \{z_{0} \in C : \lim_{j \to \infty} Φ^{(j)} (z_{0}) = α\},

(57)

where

Φ^{(j)}

denotes the j-fold composition of

Φ

. The union of all such basins over all attracting fixed points partitions the complex plane (modulo the Julia set) into regions of convergence. The boundary of

A (α)

is contained in the Julia set associated with

Φ

, which is typically a complicated fractal set even for simple rational maps.

The local dynamical nature of a fixed point

α

of

Φ

is characterized by the multiplier

Φ^{'} (α)

. The fixed point

α

is attracting if

| Φ^{'} (α) | < 1

, repelling if

| Φ^{'} (α) | > 1

, and indifferent if

| Φ^{'} (α) | = 1

. For methods designed to approximate simple roots of g, one typically has

Φ^{'} (α) = 0

, which implies quadratic or higher-order convergence and a nonempty open basin of attraction

A (α)

.

We remark that the iteration map

Φ

in (56) is a rational function of z only when g is a polynomial. For general nonlinear functions g, the map

Φ

is still well defined but may no longer be rational; nevertheless, the dynamical interpretation through fixed points and attraction basins remains valid [28].

Since the computation is performed over a discrete grid, these values should be interpreted as mesh-dependent estimates whose accuracy increases with grid refinement. For methods with memory, such as (19), the iteration cannot be described by a scalar recurrence of the form (55). Instead, one must consider a higher-dimensional dynamical system. For instance, if the method uses

x_{j}

and

x_{j - 1}

simultaneously, one can rewrite it as a map on

C^{2}

,

Ψ : C^{2} \to C^{2}, Ψ (z_{j}, z_{j - 1}) = (z_{j + 1}, z_{j}),

(58)

where

z_{j + 1}

is computed using both

z_{j}

and

z_{j - 1}

. In such a setting, the root

α

corresponds to a fixed point

(α, α)

in

C^{2}

. The basin of attraction in this case is a subset of

C^{2}

defined by

A_{2} (α) = \{(z_{0}, z_{- 1}) \in C^{2} : \lim_{j \to \infty} Ψ^{(j)} (z_{0}, z_{- 1}) = (α, α)\} .

(59)

For graphical representation in the complex plane, one typically restricts to a one-parameter family of initial pairs, for example, by fixing

z_{- 1}

or by choosing a prescribed initialization rule, and then plotting the convergence behavior as a function of

z_{0} \in C

. Here,

z_{0} \in C

is the initial approximation, which is taken from the mesh points located for the discretization of the complex domain.

In the present context, the term "dynamical system" refers to the discrete-time iteration induced by a root-finding method: once an iteration function

Φ

is defined, the sequence

{z_{j}}

is generated by repeated composition

z_{j + 1} = Φ (z_{j})

, and the convergence behavior can be observed through fixed points, their stability, and the corresponding basins of attraction.

In order to compare the dynamical behavior of Steffensen-type methods with and without memory, we consider the polynomial family

g (z) = z^{n} - 1, n = 3, 4, 5, 6,

whose simple roots are given by

α_{k} = \exp (\frac{2 π i k}{n})

,

k = 0, 1, \dots, n - 1

. These roots lie on the unit circle and enjoy the rotational symmetry

g (ω z) = ω^{n} g (z)

,

ω = \exp (\frac{2 π i}{n})

, which implies that the basins of attraction of any reasonable root-finding method inherit rotational symmetry of order n. For a given iteration

Φ

, each root

α_{k}

is an attracting fixed point, and the dynamics are governed by how the complex plane is partitioned into the basins

A (α_{k})

and the set of nonconvergent initial conditions.

In our numerical experiments, we fix a rectangular region

D = [- 4, 4] \times [- 4, 4] \subset C

and define a uniform grid of initial points

z_{0} = x + i y, x, y \in [- 4, 4],

with a prescribed mesh size (

h = 0.005

). For each grid point

z_{0}

, the iteration scheme under consideration is applied up to a maximum number of steps

N_{\max} = 30

. We declare convergence to a root

α_{k}

if there exists an index

j \leq N_{\max}

such that

| g (z_{j}) | < 10^{- 2},

(60)

in which case, the point

z_{0}

is assigned to the basin of

α_{k}

. If no such j is found, or if the magnitude

| z_{j} |

exceeds a certain large threshold (indicating divergence to infinity), then

z_{0}

is classified as nonconvergent for that method [29].

The numerical goal of Section 3 is not to compute accurate approximations of the roots but to study the convergence behavior of the competing iterations as dynamical systems, namely (i) which root attracts a given initial point in the domain D, (ii) how large the corresponding attraction regions are (convergence radii and robustness), and (iii) how the memory mechanism modifies the set of nonconvergent (or slowly convergent) initial conditions. For this reason, the stopping rule (60) is employed as a classification threshold under a fixed iteration budget

N_{\max}

, rather than as a stringent accuracy certificate. In the test family

g (z) = z^{n} - 1

, all roots are simple and are available in closed form, so the basin plots can be interpreted unambiguously as root-attraction classifications over a large set of initial conditions. The tolerance

10^{- 2}

in (60) is chosen to balance two practical aspects that are intrinsic to basin computations on dense grids: first, it reliably separates convergent from divergent orbits within the fixed cap

N_{\max}

over the entire domain D; second, it avoids inflating the “nonconvergent” region by classifying as failures those initial points that are already attracted to a root but approach it more slowly and would require more than

N_{\max}

steps to satisfy a very stringent residual requirement. In other words, the basin study is intended to compare robustness and stability regions of the methods (especially the effect of memory), not to compare the final digits of accuracy. Finally, we stress that the basin computations use the same tolerance (60) and the same

N_{\max}

for every method, so the comparison is fair, and the observed enlargement of attraction basins for the with-memory scheme reflects improvements in stability rather than an artifact of inconsistent stopping rules.

The visual representation of attraction basins is obtained by assigning a distinct color to each root

α_{k}

and a special color to the set of nonconvergent initial points. Furthermore, to encode information about the convergence speed within each basin, we employ a shading scheme: darker colors correspond to smaller iteration counts j required to satisfy (60), whereas lighter colors indicate larger iteration counts closer to the upper bound

N_{\max}

. Thus, the color intensity at a given starting point

z_{0}

reflects the local efficiency of the iteration: darker regions correspond to fast convergence and lighter regions to slower convergence within the same basin.

The parameter

β_{0}

is chosen as

β_{0} = 0.1

and

β_{0} = 0.01

when required for initialization, unless stated otherwise. This small positive value prevents premature numerical instabilities in the early stages of the memory-based schemes. All other free parameters are set to fixed values determined by the particular method under test (for instance, the choice of p in (12) or of

p_{j}

and

β_{j}

in (19)), unless adaptive updates are explicitly used by the with-memory algorithm.

The computation and plotting of attraction basins were implemented in Mathematica 13.3 [30], where each method is coded as a function that, given an initial point

z_{0}

, iteratively produces the sequence

{z_{j}}

and records the convergence behavior.

When comparing the Steffensen-type method without memory (12) to its memory-based counterpart (19), the dynamical portraits exhibit a clear qualitative difference. For methods without memory, the boundaries between basins are typically more intricate and interwoven, and a larger fraction of initial points in D either do not converge within

N_{\max}

iterations or diverge to infinity. In contrast, the with-memory method (19) exhibits larger homogeneous colored regions associated with each root, indicating enlarged basins of attraction. Moreover, the shading within these regions is generally darker, reflecting a reduced average number of iterations and thus faster convergence in a large portion of the domain.

From a dynamical systems perspective, in Figure 1, Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6, the enlargement of attraction basins for methods with memory can be attributed to the improved local contraction properties induced by the adaptive parameters

β_{j}

and

p_{j}

. Near a simple root

α_{k}

, the map

Ψ

in (58) has a Jacobian whose eigenvalues are closely related to the R-order of convergence: smaller eigenvalues in modulus correspond to a stronger local contraction, which in turn typically enlarges the region from which iterates are drawn toward the root.

Figure 1. Basin of attractions for (1) with

β = 0.1

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 2. Basin of attractions for (1) with

β = 0.1

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

Figure 3. Basin of attractions for (12) with

β = p = 0.01

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 4. Basin of attractions for (12) with

β = p = 0.01

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

Figure 5. Basin of attractions for (19) with

β_{0} = p_{0} = 0.1

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 6. Basin of attractions for (19) with

β_{0} = p_{0} = 0.1

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

4. Computational Results

Here, we present computational experiments that demonstrate the practical performance of the newly developed Steffensen-type schemes, both with and without memory. All computations were carried out using Mathematica 13.3 with 1500-digit floating-point precision to observe the asymptotic convergence behavior and to avoid premature rounding errors that may obscure the higher-order convergence phenomenon. The results presented here are entirely consistent with the theoretical analysis performed in Section 2 and Section 3, confirming the predicted R-orders, the enhanced convergence radii, and the dynamical robustness of the proposed methods.

To illustrate the advantages of the self-accelerating parameters and the memory mechanism, we consider the oscillatory test function

g (x) = (x - 1) (x^{10} + x^{3} + 1) \sin (x),

(61)

which is known to be challenging for many iterative solvers due to its rapidly oscillating nature and its combination of polynomial and transcendental components. The function has a simple root at

α = 1

, but the surrounding landscape contains multiple oscillatory extrema and steep curvature variations, making it a good benchmark for convergence speed and sensitivity to initial guesses.

To visualize the numerical behavior, the iterative trajectories are plotted in Figure 7, where the left frame shows convergence from a moderate initial approximation and the right frame illustrates robustness from a distant starting point. The parameters

β_{0}

,

p_{0}

, and the initial guess

x_{0}

used for each scheme are explicitly stated in the figure captions. The simulations reveal that the method with memory not only converges more quickly but also maintains stable behavior in regions where the derivative-free method without memory struggles to converge or becomes sensitive to perturbations.

Figure 7. Numerical convergence results for different values of the initial approximations.

The figure shows the residual histories for the test problem (61) using the three derivative-free iterations (1) (Equation

(1)

, green diamonds), (12) (Equation

(13)

, orange squares), and the proposed with-memory scheme (19) (Equation

(20)

, blue circles). In each panel, the vertical axis reports the residual magnitude

| g (x_{j}) |

on a logarithmic scale versus the iteration index j. Left panel: Starting from the moderate initial approximation

x_{0} = 0.9

with

β_{0} = 0.1

and

p_{0} = - 0.1

, all three schemes converge; however, the with-memory method

(20)

produces a markedly steeper decay of

| g (x_{j}) |

, reaching a near-machine residual level within only a few iterations, whereas

(1)

and

(13)

require substantially more steps to attain comparable residual reductions. Right panel: For the more challenging initial approximation

x_{0} = 0.8

(with the same initialization

β_{0} = 0.1

,

p_{0} = - 0.1

), the with-memory scheme

(20)

still converges rapidly, the two-parameter method without memory

(13)

converges but much more slowly, and the classical Steffensen-type iteration

(1)

exhibits stagnation (no visible residual decrease) over the displayed iteration range. Overall, the two panels illustrate that incorporating memory not only accelerates the asymptotic residual decay but also improves robustness with respect to the choice of the initial approximation.

This is consistent with the enlarged attraction basins documented in Section 3, which demonstrate that the interpolation-based parameter updates reinforce the stability properties of the scheme. The numerical experiments support the theoretical findings of Section 2 and Section 3 in two complementary ways. First, for the benchmark (61), the proposed memory-based scheme exhibits a faster residual decay and improved robustness with respect to the initial approximation when compared to the memoryless competitors. Second, the basin-of-attraction study in Section 3 indicates enlarged convergence regions and fewer nonconvergent initial conditions for the with-memory iteration on the tested polynomial families. These results provide evidence of improved efficiency and dynamical robustness.

Before ending this section, we evaluate the performance of the discussed schemes with some other recently proposed schemes [31] as follows (

q_{j} \in [0, 1]

):

Moser–Secant method:

$\begin{matrix} \{\begin{matrix} x_{j + 1} = x_{j} - {\bar{A}}_{j} g (x_{j}), j \geq 0, \\ y_{j + 1} = x_{j} + q_{j} (x_{j + 1} - x_{j}), \\ {\bar{K}}_{j + 1} = \frac{g (y_{j + 1}) - g (x_{j + 1})}{y_{j + 1} - x_{j + 1}}, \\ {\bar{A}}_{j + 1} = 2 {\bar{A}}_{j} - {\bar{A}}_{j} {\bar{K}}_{j + 1} {\bar{A}}_{j}, \end{matrix} \end{matrix}$

(62)
Moser–Kurchatov method:

$\begin{matrix} \{\begin{matrix} x_{j + 1} = x_{j} - {\bar{A}}_{j} g (x_{j}), j \geq 0, \\ y_{j + 1} = x_{j} + q_{j} (x_{j + 1} - x_{j}), \\ {\bar{K}}_{j + 1} = \frac{g (2 y_{j + 1} - x_{j + 1}) - g (x_{j + 1})}{(2 y_{j + 1} - x_{j + 1}) - x_{j + 1}}, \\ {\bar{A}}_{j + 1} = 2 {\bar{A}}_{j} - {\bar{A}}_{j} {\bar{K}}_{j + 1} {\bar{A}}_{j} . \end{matrix} \end{matrix}$

(63)

We have chosen for the initial approximation

x_{0} = 1.2

. The results are provided in Table 1. The schemes (62) and (63) are useful for systems of nonlinear equations, but they are sensitive to the choice of

{\bar{A}}_{j}

as an approximation of the first derivative, and thus their convergence radii are lower than that of the Steffensen-type method with memory. Here, we have chosen

{\bar{A}}_{0} = f^{'} {(x_{0})}^{- 1}

and

q_{j} = 1 / 2

. Numerical results confirm the efficiency of (19).

Table 1. Convergence comparisons for

x_{0} = 1.2

to solve (61).

5. Concluding Summary

In this work, we improved and analyzed a new Steffensen-type iterative scheme equipped with two free parameters and a memory-based acceleration mechanism. The derivative-free method without memory was shown to achieve optimal quadratic convergence, while its extension with memory attained an R-order of

3.56155

without requiring any additional functional evaluations. By employing Newton interpolation polynomials to approximate the self-accelerating parameters, the proposed scheme demonstrated improved local convergence behavior, enhanced efficiency index, and enlarged attraction basins. Numerical experiments and dynamical analyses confirmed that the new method performs efficiently.

Future research will focus on broadening the applicability of the proposed framework to systems of nonlinear equations and matrix equations. Some other scholars may study an alternative approximation tools, such as Padé-type rational interpolants or machine-learning-based parameter estimators for enhanced convergence.

Author Contributions

Conceptualization, S.W. and T.L.; Methodology, S.W. and C.L.; Software, C.L.; Validation, Z.Y.; Formal analysis, C.L.; Investigation, S.W. and C.L.; Resources, T.L.; Data curation, Z.Y.; Writing—original draft, S.W., C.L., and Z.Y.; Writing—review and editing, T.L.; Visualization, Z.Y.; Supervision, T.L.; Project administration, T.L.; Funding acquisition, T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Research Project on Graduate Education and Teaching Reform of Hebei Province of China (YJG2024133), the Open Fund Project of Marine Ecological Restoration and Smart Ocean Engineering Research Center of Hebei Province (HBMESO2321), and the Technical Service Project of the Eighth Geological Brigade of the Hebei Bureau of Geology and Mineral Resources Exploration (KJ2025-029, KJ2025-037).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available because the research data are confidential.

Acknowledgments

We sincerely thank the two referees for their careful reading of the manuscript and for their constructive comments and suggestions, which have substantially improved the quality and clarity of the paper.

Conflicts of Interest

No conflicts of interest.

References

Shil, S.; Nashine, H.K.; Soleymani, F. On an inversion-free algorithm for the nonlinear matrix problem $X^{α} + A^{*} X^{- β} A + B^{*} X^{- γ} B = I$ . Int. J. Comput. Math. 2022, 99, 2555–2567. [Google Scholar] [CrossRef]
Bini, D.A.; Iannazzo, B. Computational aspects of the geometric mean of two matrices a survey. Acta Sci. Math. 2024, 90, 349–389. [Google Scholar] [CrossRef]
Dehdezi, E.K.; Karimi, S. A fast and efficient Newton-Shultz-type iterative method for computing inverse and Moore-Penrose inverse of tensors. J. Math. Model. 2021, 9, 645–664. [Google Scholar]
Noda, T. The Steffensen iteration method for systems of nonlinear equations. Proc. Jpn. Acad. 1987, 63, 186–189. [Google Scholar] [CrossRef]
Steffensen, J.F. Remarks on iteration. Skand. Aktuarietidskr. 1933, 16, 64–72. [Google Scholar] [CrossRef]
Mehta, B.; Parida, P.K.; Nayak, S.K. Convergence analysis of higher order iterative methods in Riemannian manifold. Rend. Circ. Mat. Palermo II. Ser 2026, 75, 8. [Google Scholar] [CrossRef]
Song, Y.; Soleymani, F.; Kumar, A. Finding the geometric mean for two Hermitian matrices by an efficient iteration method. Math. Methods Appl. Sci. 2025, 48, 7188–7196. [Google Scholar] [CrossRef]
Zhang, K.; Soleymani, F.; Shateyi, S. On the construction of a two-step sixth-order scheme to find the Drazin generalized inverse. Axioms 2025, 14, 22. [Google Scholar] [CrossRef]
Traub, J.F. Iterative Methods for the Solution of Equations; Prentice-Hall: New York, NY, USA, 1964. [Google Scholar]
Kung, H.T.; Traub, J.F. Optimal order of one–point and multi–point iteration. J. ACM 1974, 21, 643–651. [Google Scholar] [CrossRef]
Arora, H.; Cordero, A.; Torregrosa, J.R. On generalized one-step derivative-free iterative family for evaluating multiple roots. Iran. J. Numer. Anal. Optim. 2024, 14, 291–314. [Google Scholar]
Ostrowski, A.M. Solution of Equations and Systems of Equations; Academic Press: New York, NY, USA, 1966. [Google Scholar]
Torkashvand, V. A two-step method adaptive with memory with eighth-order for solving nonlinear equations and its dynamic. Comput. Methods Differ. Equ. 2022, 10, 1007–1026. [Google Scholar]
Khaksar Haghani, F. A modified Steffensen’s method with memory for nonlinear equations. Int. J. Math. Model. Comput. 2015, 5, 41–48. [Google Scholar]
Chicharro, F.I.; Cordero, A.; Garrido, N.; Torregrosa, J.R. Stability and applicability of iterative methods with memory. J. Math. Chem. 2019, 57, 1282–1300. [Google Scholar] [CrossRef]
Džunić, J.; Petković, M.S. On generalized biparametric multipoint root finding methods with memory. J. Comput. Appl. Math. 2014, 255, 362–375. [Google Scholar] [CrossRef]
Soleymani, F.; Vanani, S.K. A modified eighth-order derivative-free root solver. Thai. J. Math. 2012, 10, 541–549. [Google Scholar]
Yilmaz, N. Introducing three new smoothing functions: Analysis on smoothing-Newton algorithms. J. Math. Model. 2024, 12, 463–479. [Google Scholar]
Dehghani-Madiseh, M. A family of eight-order interval methods for computing rigorous bounds to the solution to nonlinear equations. Iran. J. Numer. Anal. Optim. 2023, 13, 102–120. [Google Scholar]
Sreedeep, C.D.; Argyros, I.K.; Gopika Dinesh, K.C.; Shrestha, N.; Argyros, M. Convergence analysis of a power series based iterative method having seventh order of convergence. Mat. Stud. 2025, 64, 179–193. [Google Scholar] [CrossRef]
George, S.; Grammont, M.M.L. Derivative-free convergence analysis for Steffensen-type schemes for nonlinear equations. Appl. Numer. Math. 2026, 223, 101–120. [Google Scholar] [CrossRef]
Wang, X.; Guo, S. Dynamic analysis of a family of iterative methods with fifth-order convergence. Fractal Fract. 2025, 9, 783. [Google Scholar] [CrossRef]
Ogbereyivwe, O.; Izevbizua, O. A three-free-parameter class of power series based iterative method for approximation of nonlinear equations solution. Iran. J. Numer. Anal. Optim. 2023, 13, 157–169. [Google Scholar]
Soleymani, F. Some optimal iterative methods and their with memory variants. J. Egypt. Math. Soc. 2013, 21, 59–67. [Google Scholar] [CrossRef]
Kiyoumarsi, F. On the construction of fast Steffensen–type iterative methods for nonlinear equations. Int. J. Comput. Meth. 2018, 15, 1850002. [Google Scholar] [CrossRef]
Getz, C.; Helmstedt, J. Graphics with Mathematica Fractals, Julia Sets, Patterns and Natural Forms; Elsevier: Amsterdam, The Netherlands, 2004. [Google Scholar]
Zhu, J.; Li, Y.; Li, Y.; Liu, T.; Ma, Q. Fourth-order iterative algorithms for the simultaneous calculation of matrix square roots and their inverses. Mathematics 2025, 13, 3370. [Google Scholar] [CrossRef]
Howk, C.L.; Hueso, J.L.; Martínez, E.; Teruel, C. A class of efficient high–order iterative methods with memory for nonlinear equations and their dynamics. Math. Meth. Appl. Sci. 2018, 41, 7263–7282. [Google Scholar] [CrossRef]
Wang, S.; Wang, Z.; Xie, W.; Qi, Y.; Liu, T. An accelerated sixth-order procedure to determine the matrix sign function computationally. Mathematics 2025, 13, 1080. [Google Scholar] [CrossRef]
Mureşan, M. Introduction to Mathematica with Applications; Springer: Cham, Switzerland, 2017. [Google Scholar]
Argyros, I.K.; Shakhno, S.M.; Shunkin, Y.V. On an iterative Moser–Kurchatov method for solving systems of nonlinear equations. Mat. Stud. 2025, 63, 88–97. [Google Scholar] [CrossRef]

Figure 1. Basin of attractions for (1) with

β = 0.1

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 1. Basin of attractions for (1) with

β = 0.1

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 2. Basin of attractions for (1) with

β = 0.1

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

Figure 2. Basin of attractions for (1) with

β = 0.1

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

Figure 3. Basin of attractions for (12) with

β = p = 0.01

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 3. Basin of attractions for (12) with

β = p = 0.01

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 4. Basin of attractions for (12) with

β = p = 0.01

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

Figure 4. Basin of attractions for (12) with

β = p = 0.01

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

Figure 5. Basin of attractions for (19) with

β_{0} = p_{0} = 0.1

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 5. Basin of attractions for (19) with

β_{0} = p_{0} = 0.1

on

g (z) = z^{3} - 1

(left),

g (z) = z^{4} - 1

(right).

Figure 6. Basin of attractions for (19) with

β_{0} = p_{0} = 0.1

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

Figure 6. Basin of attractions for (19) with

β_{0} = p_{0} = 0.1

on

g (z) = z^{5} - 1

(left),

g (z) = z^{6} - 1

(right).

Figure 7. Numerical convergence results for different values of the initial approximations.

Table 1. Convergence comparisons for

x_{0} = 1.2

to solve (61).

Table 1. Convergence comparisons for

x_{0} = 1.2

to solve (61).

	(1)	(12)	(19)	(62)	(63)
$\| g (x_{2}) \|$	0.5915	0.5911	0.1375	0.2715	0.2934
$\| g (x_{3}) \|$	0.2549	0.2545	0.0013	0.1271	0.1471
$\| g (x_{4}) \|$	0.07539	0.0750	$5.213 \times 10^{- 11}$	0.0500	0.0639
$\| g (x_{5}) \|$	0.0104	0.0102	$2.889 \times 10^{- 37}$	0.013889	0.0211
$\| g (x_{6}) \|$	0.0002	0.0002	$5.167 \times 10^{- 130}$	0.0019	0.0042
$\| g (x_{7}) \|$	$1.597 \times 10^{- 7}$	$1.479 \times 10^{- 7}$	-	0.00007	0.0003
$\| g (x_{8}) \|$	$6.298 \times 10^{- 14}$	$5.362 \times 10^{- 14}$	-	$2.450 \times 10^{- 7}$	$4.606 \times 10^{- 6}$
$\| g (x_{9}) \|$	$9.791 \times 10^{- 27}$	$7.041 \times 10^{- 27}$	-	$2.024 \times 10^{- 11}$	$3.875 \times 10^{- 9}$
Rates	2.00	2.00	3.53	1.61	1.61

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

An Accelerated Steffensen Iteration via Interpolation-Based Memory and Optimal Convergence

Abstract

1. Introduction

2. Developing a New Scheme

3. Basin of Attractions

4. Computational Results

5. Concluding Summary

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Article Metrics

Citations

Article Access Statistics