1. Introduction
We consider the problem of finding a simple zero $\alpha$ of a function $f:I\subseteq\mathbb{R}\rightarrow\mathbb{R}$, defined in an open interval $I$. This zero can be determined as a fixed point of some function $g$ by means of the one-point iteration method:
$$x_{n+1}=g(x_n),\qquad n=0,1,\ldots,$$
where $x_0$ is the starting point. The most widely-used example of these kinds of methods is the classical Newton's method, given by:
$$x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)},\qquad n=0,1,\ldots\tag{2}$$
It is well known that it converges quadratically to simple zeros and linearly to multiple zeros. In the literature, many modifications of Newton's scheme have been published in order to improve its order of convergence and stability. Interesting overviews of this area of research can be found in [1,2,3]. The works of Weerakoon and Fernando [4] and, later, Özban [5] have inspired a whole set of variants of Newton's method, whose main characteristic is the use of different means in the iterative expression.
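The quadratic/linear behavior just recalled is easy to observe numerically. The following sketch (in Python; the test functions, starting points, and tolerances are our own illustrative choices) applies the classical Newton iteration to a simple zero and to a double zero:

```python
def newton(f, df, x0, tol=1e-14, max_iter=100):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until the step is tiny."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Simple zero: f(x) = x^2 - 2, alpha = sqrt(2) -> quadratic convergence.
root_simple = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)

# Double zero: f(x) = (x - 1)^2, alpha = 1 -> linear convergence.
root_double = newton(lambda x: (x - 1.0) ** 2, lambda x: 2.0 * (x - 1.0), 2.0)
```

On $(x-1)^2$, one can check that $e_{n+1}=e_n/2$ exactly, i.e., linear convergence with asymptotic error constant $C=\frac12$.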
It is known that if a sequence $\{x_n\}_{n\geq0}$ tends to a limit $\alpha$ in such a way that there exist a constant $C>0$ and a positive integer $n_0$ such that:
$$|x_{n+1}-\alpha|\leq C|x_n-\alpha|^p\qquad\text{for all }n\geq n_0,$$
then $p$ is called the order of convergence of the sequence and $C$ is known as the asymptotic error constant. For $p=1$, constant $C$ satisfies $0<C<1$.
If we denote by $e_n=x_n-\alpha$ the exact error of the $n$th iterate, then the relation:
$$e_{n+1}=Ce_n^p+O(e_n^{p+1})$$
is called the error equation for the method, and $p$ is the order of convergence.
Let us suppose that $f:I\subseteq\mathbb{R}\rightarrow\mathbb{R}$ is a sufficiently-differentiable function and $\alpha\in I$ is a simple zero of $f$. It is plain that:
$$f(x)=f(x_n)+\int_{x_n}^{x}f'(t)\,dt.\tag{5}$$
Weerakoon and Fernando in [4] approximated the definite integral (5) by using the trapezoidal rule and taking $y_n=x_n-\frac{f(x_n)}{f'(x_n)}$, getting:
$$f(x)\approx f(x_n)+\frac{x-x_n}{2}\left(f'(x_n)+f'(y_n)\right),$$
and therefore, a new approximation $x_{n+1}$ to $\alpha$ is given by:
$$x_{n+1}=x_n-\frac{2f(x_n)}{f'(x_n)+f'(y_n)}.$$
Thus, this variant of Newton's scheme can be considered to be obtained by replacing the denominator $f'(x_n)$ of Newton's method (2) by the arithmetic mean of $f'(x_n)$ and $f'(y_n)$. Therefore, it is known as the arithmetic mean Newton method (AN).
In a similar way, the arithmetic mean can be replaced by other means. In particular, let us regard the harmonic mean $H(x,y)=\frac{2xy}{x+y}$, where $x$ and $y$ are two nonnegative real numbers, from a different point of view:
$$H(x,y)=\frac{y}{x+y}\,x+\frac{x}{x+y}\,y,$$
where, since $\frac{y}{x+y}+\frac{x}{x+y}=1$, then $\frac{y}{x+y},\frac{x}{x+y}\in[0,1]$, i.e., the harmonic mean can be seen as a convex combination between $x$ and $y$, where every element is given the relevance of the other one in the sum. Now, let us switch the roles of $x$ and $y$; we get:
$$\frac{x}{x+y}\,x+\frac{y}{x+y}\,y=\frac{x^2+y^2}{x+y},$$
that is, the contraharmonic mean between $x$ and $y$.
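These two convex combination readings can be checked directly. A small sketch (exact rational arithmetic via Python's standard fractions module; the sample values are arbitrary):

```python
from fractions import Fraction

def harmonic(x, y):
    # H(x, y) = 2xy / (x + y): each value is weighted by the *other* one
    return 2 * x * y / (x + y)

def contraharmonic(x, y):
    # C(x, y) = (x^2 + y^2) / (x + y): each value is weighted by *itself*
    return (x * x + y * y) / (x + y)

x, y = Fraction(3), Fraction(7)
w = y / (x + y)                        # weight of x in the harmonic mean
opposite_weighted = w * x + (1 - w) * y
self_weighted = (1 - w) * x + w * y    # roles of the weights switched
```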
Özban in [5] used the harmonic mean instead of the arithmetic one, which led to a new method:
$$x_{n+1}=x_n-\frac{f(x_n)\left(f'(x_n)+f'(y_n)\right)}{2f'(x_n)f'(y_n)},$$
$y_n=x_n-\frac{f(x_n)}{f'(x_n)}$ being a Newton step, which he called the harmonic mean Newton method (HN).
Ababneh in [6] designed an iterative method associated with this mean, called the contraharmonic mean Newton method (CHN), whose iterative expression is:
$$x_{n+1}=x_n-\frac{f(x_n)\left(f'(x_n)+f'(y_n)\right)}{f'(x_n)^2+f'(y_n)^2},$$
with third order of convergence for simple roots of $f(x)=0$, as is also the case for the methods proposed by Weerakoon and Fernando [4] and Özban [5].
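The three variants recalled so far (AN, HN, CHN) differ only in the mean that replaces $f'(x_n)$, so they can share one driver. A minimal sketch follows (the test equation $x^3-2=0$, the starting point, and the tolerances are our own illustrative choices):

```python
def mbn_solve(f, df, x0, mean, tol=1e-14, max_iter=50):
    """Mean-based Newton: f'(x_n) is replaced by M(f'(x_n), f'(y_n))."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        y = x - fx / df(x)                  # Newton step y_n
        x_new = x - fx / mean(df(x), df(y))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

arithmetic     = lambda a, b: (a + b) / 2                  # AN
harmonic       = lambda a, b: 2 * a * b / (a + b)          # HN
contraharmonic = lambda a, b: (a * a + b * b) / (a + b)    # CHN

f, df = lambda x: x ** 3 - 2.0, lambda x: 3.0 * x ** 2
roots = [mbn_solve(f, df, 1.5, m)
         for m in (arithmetic, harmonic, contraharmonic)]
```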
This idea has been used by different authors for designing iterative methods applying other means, generating symmetric fixed point operators. For example, Xiaojian in [7] employed the generalized mean of order $m$ between two values $x$ and $y$, defined as:
$$\left(\frac{x^m+y^m}{2}\right)^{1/m},$$
to construct a third-order iterative method for solving nonlinear equations. Furthermore, Singh et al. in [8] presented a third-order iterative scheme by using the Heronian mean between two values $x$ and $y$, defined as:
$$\frac{x+\sqrt{xy}+y}{3}.$$
Finally, Verma in [9], following the same procedure, designed a third-order iterative method by using the centroidal mean between two values $x$ and $y$, defined as:
$$\frac{2\left(x^2+xy+y^2\right)}{3(x+y)}.$$
In this paper, we check that all these means are functional convex combinations and develop a simple test to easily prove the third order of the corresponding iterative methods mentioned before. Moreover, we introduce a new method based on the Lehmer mean of order $p$, defined as:
$$L_p(x,y)=\frac{x^p+y^p}{x^{p-1}+y^{p-1}},$$
and propose a generalization that also satisfies the previous test. Finally, all these schemes are numerically tested, and their dependence on initial estimations is studied by means of their basins of attraction. These basins are shown to be clearly symmetric.
The rest of the paper is organized as follows: Section 2 is devoted to designing a test that allows us to characterize the third-order convergence of the iterative method defined by a mean. This characterization is used in Section 3 for giving an alternative proof of the convergence of mean-based variants of Newton's (MBN) methods, including some new ones, and for generalizing the previous methods by using the concept of σ-means. Section 4 is devoted to numerical results and the use of basins of attraction in order to analyze the dependence of the iterative methods on the initial estimations used. The manuscript finishes with some conclusions.
2. Convex Combination
In a similar way as has been stated in the Introduction for the arithmetic, harmonic, and contraharmonic means, the rest of the mentioned means can also be regarded as convex combinations. This is not coincidental: one of the most interesting properties that a mean satisfies is the averaging property:
$$\min(x,y)\leq M(x,y)\leq\max(x,y),$$
where $M$ is any mean function of nonnegative $x$ and $y$. This implies that every mean that satisfies this property is a certain convex combination of its terms.
Indeed, there exists a unique $\theta\in[0,1]$ such that:
$$M(x,y)=\theta x+(1-\theta)y.$$
This approach suggests that it is possible to generalize every mean-based variant of Newton's method (MBN) by studying their convex combination counterparts. As a matter of fact, every mean-based variant of Newton's method can be rewritten as:
$$x_{n+1}=x_n-\frac{f(x_n)}{\theta f'(x_n)+(1-\theta)f'(y_n)},\tag{18}$$
where $\theta\in[0,1]$. This is a particular case of a family of iterative schemes constructed in [10].
We are interested in studying its order of convergence as a function of $\theta$. Thus, we need to compute the approximated Taylor expansion of the convex combination in the denominator and then its inverse:
$$\theta f'(x_n)+(1-\theta)f'(y_n)=f'(\alpha)\left[1+2\theta c_2e_n+\left(3\theta c_3+2(1-\theta)c_2^2\right)e_n^2\right]+O(e_n^3),$$
where $c_k=\frac{f^{(k)}(\alpha)}{k!\,f'(\alpha)}$, $k\geq2$. Then, its inverse can be expressed as:
$$\frac{1}{\theta f'(x_n)+(1-\theta)f'(y_n)}=\frac{1}{f'(\alpha)}\left[1-2\theta c_2e_n+\left(4\theta^2c_2^2-3\theta c_3-2(1-\theta)c_2^2\right)e_n^2\right]+O(e_n^3).$$
Now, $f(x_n)=f'(\alpha)\left[e_n+c_2e_n^2+c_3e_n^3\right]+O(e_n^4)$, and by replacing it in (18), it leads to the MBN error equation as a function of $\theta$:
$$e_{n+1}=(2\theta-1)c_2e_n^2+\left[(3\theta-1)c_3+\left(2-4\theta^2\right)c_2^2\right]e_n^3+O(e_n^4).\tag{22}$$
Equation (22) can be used to re-discover the results of convergence: for example, for the contraharmonic mean, we have:
$$\frac{f'(x_n)^2+f'(y_n)^2}{f'(x_n)+f'(y_n)}=\theta f'(x_n)+(1-\theta)f'(y_n),$$
where:
$$\theta=\frac{f'(x_n)}{f'(x_n)+f'(y_n)},$$
so that:
$$f'(x_n)=f'(\alpha)\left[1+2c_2e_n+3c_3e_n^2\right]+O(e_n^3),\qquad f'(y_n)=f'(\alpha)\left[1+2c_2^2e_n^2\right]+O(e_n^3).$$
Thus, we can obtain the $\theta$ associated with the contraharmonic mean:
$$\theta=\frac12+\frac{c_2}{2}e_n+O(e_n^2).$$
Finally, by replacing the previous expression in (22):
$$e_{n+1}=\left(2c_2^2+\frac{c_3}{2}\right)e_n^3+O(e_n^4),$$
and we obtain again that the convergence for the contraharmonic mean Newton method is cubic.
Regarding the harmonic mean, it is straightforward that it is a functional convex combination, with:
$$\theta=\frac{f'(y_n)}{f'(x_n)+f'(y_n)}=\frac12-\frac{c_2}{2}e_n+O(e_n^2).$$
Replacing this expression in (22), we find the cubic convergence of the harmonic mean Newton method,
$$e_{n+1}=\frac{c_3}{2}e_n^3+O(e_n^4).$$
In both cases, the independent term of $\theta$ was $\frac12$; it was not a coincidence, but an instance of the following more general result.
Theorem 1. Let $\theta=\theta(x_n,y_n)$ be associated with the mean-based variant of Newton's method (MBN):
$$x_{n+1}=x_n-\frac{f(x_n)}{M\left(f'(x_n),f'(y_n)\right)},$$
where $M$ is a mean function of the variables $f'(x_n)$ and $f'(y_n)$. Then, MBN converges, at least, cubically if and only if the estimate:
$$\theta=\frac12+O(e_n)$$
holds.
Proof. We replace $\theta=\frac12+\theta_1e_n+O(e_n^2)$ in the MBN error Equation (22), obtaining:
$$e_{n+1}=\left(2\theta_1c_2+c_2^2+\frac{c_3}{2}\right)e_n^3+O(e_n^4).\;☐$$
Now, some considerations follow.
Remark 1. Generally speaking,
$$\theta=\theta_0+\theta_1e_n+\theta_2e_n^2+\cdots,\tag{33}$$
where the $\theta_i$ are real numbers. If we put (33) in (22), we have:
$$e_{n+1}=(2\theta_0-1)c_2e_n^2+\left[2\theta_1c_2+(3\theta_0-1)c_3+\left(2-4\theta_0^2\right)c_2^2\right]e_n^3+O(e_n^4);$$
it follows that, in order to attain cubic convergence, the coefficient of $e_n^2$ must be zero. Therefore, $\theta_0=\frac12$. On the other hand, to achieve a higher order (i.e., at least four), we need to solve the following system:
$$2\theta_0-1=0,\qquad 2\theta_1c_2+(3\theta_0-1)c_3+\left(2-4\theta_0^2\right)c_2^2=0.$$
This gives us that $\theta_0=\frac12$ and $\theta_1=-\frac{2c_2^2+c_3}{4c_2}$ assure at least a fourth-order convergence of the method. However, none of the MBN methods under analysis satisfy these conditions simultaneously.
Remark 2. The only convex combination involving a constant θ that converges cubically is $\theta=\frac12$, i.e., the arithmetic mean.
The most useful aspect of Theorem 1 is synthesized in the following corollary, which we call the “θ-test”.
Corollary 1 (θ-test). With the same hypothesis of Theorem 1, an MBN converges at least cubically if and only if the Taylor expansion of the mean satisfies:
$$M\left(f'(x_n),f'(y_n)\right)=f'(\alpha)\left(1+c_2e_n\right)+O(e_n^2).$$
Let us notice that Corollary 1 provides a test to analyze the convergence of an MBN without having to find out the inherent θ, therefore sensibly reducing the overall complexity of the analysis.
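The weight θ hidden in a given mean can also be recovered numerically, by solving $M=\theta f'(x_n)+(1-\theta)f'(y_n)$ for θ. The sketch below (the test function $f(x)=e^x-1$, with $\alpha=0$, is our own illustrative choice) shows the recovered weight of the harmonic mean approaching $\frac12$ as the iterate approaches the root, as Theorem 1 requires:

```python
import math

f, df = lambda x: math.exp(x) - 1.0, lambda x: math.exp(x)

def recovered_theta(x, mean):
    """Solve M(f'(x), f'(y)) = theta*f'(x) + (1-theta)*f'(y) for theta."""
    y = x - f(x) / df(x)                   # Newton step
    a, b = df(x), df(y)
    return (mean(a, b) - b) / (a - b)

harmonic = lambda a, b: 2 * a * b / (a + b)
# Weights recovered as the iterate x_n gets closer to the root alpha = 0:
thetas = [recovered_theta(e, harmonic) for e in (0.1, 0.01, 0.001)]
```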
3. New MBN by Using the Lehmer Mean and Its Generalization
The iterative expression of the scheme based on the Lehmer mean of order $p$ is:
$$x_{n+1}=x_n-\frac{f(x_n)\left(f'(x_n)^{p-1}+f'(y_n)^{p-1}\right)}{f'(x_n)^p+f'(y_n)^p},$$
where $y_n=x_n-\frac{f(x_n)}{f'(x_n)}$ and:
$$L_p(x,y)=\frac{x^p+y^p}{x^{p-1}+y^{p-1}}.$$
Indeed, there are suitable values of the parameter $p$ such that the associated Lehmer mean equals the arithmetic one ($p=1$) and the geometric one ($p=\frac12$), but also the harmonic ($p=0$) and the contraharmonic ($p=2$) ones. In what follows, we will find it again, this time in a more general context.
By analyzing the associated θ-test, we conclude that the iterative scheme designed with this mean has order of convergence three.
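The suitable parameter values mentioned above are $p=1$ (arithmetic), $p=\frac12$ (geometric), $p=0$ (harmonic), and $p=2$ (contraharmonic); they can be checked numerically (the sample values below are arbitrary):

```python
from math import isclose, sqrt

def lehmer(x, y, p):
    """Lehmer mean L_p(x, y) = (x^p + y^p) / (x^(p-1) + y^(p-1))."""
    return (x ** p + y ** p) / (x ** (p - 1) + y ** (p - 1))

x, y = 4.0, 9.0                       # arbitrary positive sample values
arithmetic     = lehmer(x, y, 1.0)    # (x + y)/2
geometric      = lehmer(x, y, 0.5)    # sqrt(x*y)
harmonic       = lehmer(x, y, 0.0)    # 2xy/(x + y)
contraharmonic = lehmer(x, y, 2.0)    # (x^2 + y^2)/(x + y)
```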
σ-Means
Now, we propose a new family of means of $n$ variables, starting again from convex combinations. The core idea in this work is that, in the end, two distinct means only differ in their corresponding weights $\theta$ and $1-\theta$. In particular, we can regard the harmonic mean as an “opposite-weighted” mean, while the contraharmonic one is a “self-weighted” mean.
This behavior can be generalized to $n$ variables:
$$\frac{\sum_{i=1}^{n}x_i^2}{\sum_{i=1}^{n}x_i}\tag{43}$$
is the contraharmonic mean among $n$ numbers. Equation (43) is just a particular case of what we call a σ-mean.
Definition 1 (σ-mean). Given a vector of $n$ real numbers $\bar{x}=(x_1,\ldots,x_n)$ and a bijective map $\sigma:\{1,\ldots,n\}\rightarrow\{1,\ldots,n\}$ (i.e., $\sigma$ is a permutation of $\{1,\ldots,n\}$), we call the σ-mean of order $m$ the real number given by:
$$S_\sigma^m(\bar{x})=\frac{\sum_{i=1}^{n}x_{\sigma(i)}\,x_i^{m-1}}{\sum_{i=1}^{n}x_i^{m-1}}.\tag{44}$$
Indeed, it is easy to see that, in a σ-mean, the weight assigned to each node $x_i$ is:
$$w_i=\frac{x_{\sigma^{-1}(i)}^{m-1}}{\sum_{j=1}^{n}x_j^{m-1}},\qquad\text{with}\qquad\sum_{i=1}^{n}w_i=\frac{\sum_{i=1}^{n}x_{\sigma^{-1}(i)}^{m-1}}{\sum_{j=1}^{n}x_j^{m-1}}=1,\tag{45}$$
where the last equality holds because $\sigma$ is a permutation of the indices. We are, therefore, still dealing with a convex combination, which implies that Definition 1 is well posed.
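Definition 1 translates directly into code. A small Python sketch (0-based indices, with the permutation given as a list; the sample values are arbitrary) that also checks the two $n=2$ cases discussed in this paper:

```python
def sigma_mean(xs, sigma, m):
    """sigma-mean of order m: the i-th weight x_i^(m-1) is assigned
    to the permuted node x_{sigma(i)} (0-based indices)."""
    den = sum(x ** (m - 1) for x in xs)
    num = sum(xs[sigma[i]] * xs[i] ** (m - 1) for i in range(len(xs)))
    return num / den

# n = 2, order m = 2: identity -> contraharmonic, swap -> harmonic
x, y = 4.0, 9.0
self_weighted = sigma_mean([x, y], [0, 1], 2)      # (x^2 + y^2)/(x + y)
opposite_weighted = sigma_mean([x, y], [1, 0], 2)  # 2xy/(x + y)
```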
We remark that if we take $\sigma=\operatorname{id}$, i.e., the identical permutation, in (44), we find the Lehmer mean of order $m$. Actually, the Lehmer mean is a very special case of the σ-mean, as the following result proves.
Proposition 1. Given $n$ nonnegative real numbers $x_1,\ldots,x_n$, the Lehmer mean of order $m$ is the maximum σ-mean of order $m$.
Proof. We recall the rearrangement inequality:
$$x_ny_1+\cdots+x_1y_n\leq x_{\sigma(1)}y_1+\cdots+x_{\sigma(n)}y_n\leq x_1y_1+\cdots+x_ny_n,$$
which holds for every choice of the $x_i$ and the $y_i$ regardless of the signs, assuming that both sequences are sorted in increasing order. In particular, if the $x_i$ and the $y_i$ are pairwise distinct, the upper bound is attained only for the identical permutation.
Then, to prove the result, it is enough to replace every $y_i$ with the corresponding weight defined in (45). ☐
The Lehmer mean and the σ-mean are deeply related: if $n=2$, as is the case of MBN, there are only two possible permutations, the identical one and the one that swaps one and two. We have already observed that the identical permutation leads to the Lehmer mean; however, if we express $\sigma$ in standard cycle notation as $\sigma=(1\;2)$, we have that:
$$S_{(1\,2)}^m(x,y)=\frac{x^{m-1}y+y^{m-1}x}{x^{m-1}+y^{m-1}},$$
which for $m=2$ reduces to the harmonic mean.
We conclude this section by proving another property of σ-means: the arithmetic mean of all possible σ-means of $n$ numbers equals the arithmetic mean of the numbers themselves.
Proposition 2. Given $n$ real numbers $x_1,\ldots,x_n$ and denoting by $S_n$ the set of all possible permutations of $\{1,\ldots,n\}$, we have:
$$\frac{1}{n!}\sum_{\sigma\in S_n}S_\sigma^m(\bar{x})=\frac{x_1+\cdots+x_n}{n}\tag{48}$$
for all $m$.
Proof. Let us rewrite Equation (48); by definition, we have:
$$\frac{1}{n!}\sum_{\sigma\in S_n}\frac{\sum_{i=1}^{n}x_{\sigma(i)}\,x_i^{m-1}}{\sum_{i=1}^{n}x_i^{m-1}}=\frac{x_1+\cdots+x_n}{n},$$
and we claim that the last equality holds. Indeed, we notice that every term in the sum of the σ-means on the left side of the last equality involves a constant denominator, so we can multiply both sides by it and also by $n!$ to get:
$$\sum_{\sigma\in S_n}\sum_{i=1}^{n}x_{\sigma(i)}\,x_i^{m-1}=(n-1)!\left(\sum_{i=1}^{n}x_i\right)\left(\sum_{j=1}^{n}x_j^{m-1}\right).\tag{50}$$
Now, it is just a matter of distributing the product on the right in a careful way: if we fix $i\in\{1,\ldots,n\}$, in $S_n$, there are exactly $(n-1)!$ permutations $\sigma$ such that $\sigma(j)=i$ for each fixed index $j$. Therefore, the equality in (50) follows straightforwardly. ☐
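Proposition 2 can be sanity-checked by brute force over all $n!$ permutations. The sketch below (arbitrary sample data, order $m=3$) is self-contained:

```python
from itertools import permutations
from math import factorial

def sigma_mean(xs, sigma, m):
    """sigma-mean of order m (0-based indices; sigma is a permutation)."""
    den = sum(x ** (m - 1) for x in xs)
    num = sum(xs[sigma[i]] * xs[i] ** (m - 1) for i in range(len(xs)))
    return num / den

xs, m = [1.0, 4.0, 6.0, 9.0], 3          # arbitrary sample data and order
avg_of_sigma_means = sum(
    sigma_mean(xs, sigma, m) for sigma in permutations(range(len(xs)))
) / factorial(len(xs))
arithmetic_mean = sum(xs) / len(xs)
```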
4. Numerical Results and Dependence on Initial Estimations
Now, we present the results of some numerical computations, in which the following test functions have been used.
- (a)
- (b)
- (c)
- (d)
- (e)
The numerical tests were carried out by using MATLAB with double precision arithmetic on a computer with an i7-8750H processor @2.20 GHz and 16 GB of RAM; iterations were stopped when a prescribed tolerance was reached.
We used the harmonic mean Newton method (HN), the contraharmonic mean Newton method (CHN), the Lehmer mean Newton method (LN(m)), a variant of Newton's method where the mean is a convex combination with a constant weight θ, and the classical Newton method (CN). The main goals of these calculations are to confirm the theoretical results stated in the preceding sections and to compare the different methods, with CN as a control benchmark. In Table 1, we show the number of iterations that each method needs to satisfy the stopping criterion and also the approximated computational order of convergence (ACOC), defined in [11] with the expression:
$$\mathrm{ACOC}=\frac{\ln\left(|x_{n+1}-x_n|/|x_n-x_{n-1}|\right)}{\ln\left(|x_n-x_{n-1}|/|x_{n-1}-x_{n-2}|\right)},$$
which is considered as a numerical approximation of the theoretical order of convergence $p$.
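As an illustration of how the ACOC is computed in practice, the following sketch applies the formula to the iterates of the classical Newton method on $x^2-2=0$ (our own choice of test problem); the estimate should be close to the theoretical order $p=2$:

```python
from math import log

def acoc(xs):
    """ACOC estimate from the last four iterates of the sequence xs."""
    d1 = abs(xs[-1] - xs[-2])
    d2 = abs(xs[-2] - xs[-3])
    d3 = abs(xs[-3] - xs[-4])
    return log(d1 / d2) / log(d2 / d3)

# Newton iterates on f(x) = x^2 - 2 starting at x0 = 2
xs = [2.0]
for _ in range(5):
    x = xs[-1]
    xs.append(x - (x * x - 2.0) / (2.0 * x))
order = acoc(xs[:5])   # use early iterates, before hitting machine precision
```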
Regarding the efficiency of the MBN methods, we used the efficiency index defined by Ostrowski in [12] as $I=p^{1/d}$, where $p$ is the order of convergence of the method and $d$ is the number of functional evaluations per iteration. In this sense, all the MBN methods had the same index $I=3^{1/3}\approx1.442$, since they use three functional evaluations per iteration, namely $f(x_n)$, $f'(x_n)$, and $f'(y_n)$; meanwhile, Newton's scheme had the index $I=2^{1/2}\approx1.414$. Therefore, all the MBN methods were more efficient than the classical Newton method.
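The comparison of the two indices amounts to a one-line computation:

```python
# Ostrowski efficiency index I = p**(1/d):
# MBN: order p = 3 with d = 3 evaluations (f(x_n), f'(x_n), f'(y_n));
# classical Newton: order p = 2 with d = 2 evaluations (f(x_n), f'(x_n)).
mbn_index = 3 ** (1 / 3)
newton_index = 2 ** (1 / 2)
```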
The presented numerical tests showed the performance of the different iterative methods when solving specific problems with fixed initial estimations and a stringent stopping criterion. However, it is also useful to know their dependence on the initial estimation used. Although the convergence of the methods has been proven for real functions, it is usual to analyze the sets of convergent initial guesses in the complex plane (the proof would be analogous by changing the condition on the function from differentiable to holomorphic). To this end, we plotted the dynamical planes of each one of the iterative methods on the nonlinear functions used in the numerical tests. In them, a mesh of initial estimations was employed in a rectangular region of the complex plane.
We used the routines appearing in [13] to plot the dynamical planes corresponding to each method. In them, each point of the mesh is an initial estimation for the analyzed method on the specific problem. If the method reaches one of the roots (up to the fixed tolerance) in less than 40 iterations, then this point is painted in orange (green for the second root, etc.); if the process converges to an attractor different from the roots, then the point is painted in black. The zeros of the nonlinear functions are represented in the different pictures by white stars.
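The construction just described can be reproduced outside MATLAB. Below is a minimal standalone sketch of a dynamical plane computation (not the routines of [13]; the mesh size, region, test polynomial $z^3-1$, and tolerances are our own choices), which labels each starting point with the index of the root reached by the harmonic mean Newton method, or −1 ("black") when no root is reached within 40 iterations:

```python
def basin_indices(f, df, roots, mean, span=1.5, n=60, max_iter=40, tol=1e-3):
    """Label each mesh point of [-span, span]^2 with the index of the root
    that the mean-based Newton method reaches from it, or -1 if none."""
    grid = []
    for i in range(n):
        row = []
        for j in range(n):
            z = complex(-span + 2 * span * j / (n - 1),
                        -span + 2 * span * i / (n - 1))
            label = -1
            try:
                for _ in range(max_iter):
                    fz = f(z)
                    y = z - fz / df(z)                  # Newton step y_n
                    z = z - fz / mean(df(z), df(y))
                    near = [k for k, r in enumerate(roots)
                            if abs(z - r) < tol]
                    if near:
                        label = near[0]
                        break
            except ZeroDivisionError:
                pass                    # derivative or mean vanished
            row.append(label)
        grid.append(row)
    return grid

harmonic = lambda a, b: 2 * a * b / (a + b)
cube_roots = [1, complex(-0.5, 3 ** 0.5 / 2), complex(-0.5, -3 ** 0.5 / 2)]
grid = basin_indices(lambda z: z ** 3 - 1, lambda z: 3 * z ** 2,
                     cube_roots, harmonic)
```

To render a picture, one would simply map each label to a color (e.g., with a plotting library); the computation above already contains all the dynamical information.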
In Figure 1, we observe that the harmonic and Lehmer means showed the most stable performance, since their unique basins of attraction were those of the roots (plotted in orange, red, and green). In the rest of the cases, there existed black areas of no convergence to the zeros of the nonlinear function. Especially unstable were the cases of the Heronian, convex combination, and generalized means, with wide black areas and very small basins of the complex roots.
Regarding Figure 2, again the Heronian, convex combination, and generalized means showed convergence to only one of the roots or very narrow basins of attraction. There existed black areas of no convergence to the roots in all cases, but the widest green and orange basins (corresponding to the zeros of the function) were those of the harmonic, contraharmonic, centroidal, and Lehmer means.
The third test function had only one zero, whose basin of attraction is painted in orange in Figure 3. In general, most of the methods presented good performance; however, three methods did not converge to the root within the maximum number of iterations allowed: the Heronian mean and the generalized means for certain values of the parameter. Moreover, the basin of attraction was reduced when the constant-weight convex combination mean was used.
A similar performance is observed in Figure 4, where the Heronian and generalized means showed no convergence to the only root of the function; meanwhile, the rest of the methods presented good behavior. Let us remark that, in some cases, blue areas appear; these correspond to initial estimations whose absolute value, after 40 consecutive iterations, was higher than 1000. In these cases, they and the surrounding black areas were identified as regions of divergence of the method. The best methods in this case were those associated with the arithmetic and harmonic means.
In Figure 5, the best results in terms of the wideness of the basins of attraction of the roots were those of the harmonic and Lehmer means. The biggest black areas corresponded to the convex combination mean, for which the three basins of attraction of the roots were very narrow, and to the Heronian and generalized means, for which there was only convergence to the real root.
5. Conclusions
The proposed θ-test (Corollary 1) has proven to be very useful for reducing the calculations in the analysis of convergence of any MBN. Moreover, though the employment of σ-means in the context of mean-based variants of Newton's method is probably not the best setting in which to appreciate their flexibility, their use could still lead to interesting results due to their much greater capability of interpolating between numbers than already powerful means, such as the Lehmer one.
With regard to the numerical performance, Table 1 confirms that a convex combination with a constant coefficient converges cubically if and only if it is the arithmetic mean; otherwise, as in this case, it converges quadratically, even if it may do so, generally speaking, in fewer iterations than CN. Regarding the number of iterations, there were nonlinear functions for which LN(m) converged with fewer iterations than HN. In our calculations, we set a fixed value of the parameter m, but similar results were achieved for different choices. Regarding the dependence on initial estimations, the harmonic and Lehmer methods proved to be very stable, with the widest areas of convergence in most of the nonlinear problems used in the tests.