A Convex Combination Approach for Mean-Based Variants of Newton’s Method

Abstract: Several authors have designed variants of Newton’s method for solving nonlinear equations by using different means. This technique involves a symmetry in the corresponding fixed-point operator. In this paper, some known results about mean-based variants of Newton’s method (MBN) are re-analyzed from the point of view of convex combinations. A new test is developed to study the order of convergence of general MBN. Furthermore, a generalization of the Lehmer mean is proposed and discussed. Numerical tests are provided to support the theoretical results obtained and to compare the different methods employed. Some dynamical planes of the analyzed methods on several equations are presented, revealing the great difference between the MBN when it comes to determining the set of starting points that ensure convergence and observing their symmetry in the complex plane.


Introduction
We consider the problem of finding a simple zero $\alpha$ of a function $f : I \subset \mathbb{R} \to \mathbb{R}$, defined in an open interval $I$. This zero can be determined as a fixed point of some function $g$ by means of the one-point iteration method:
$$x_{n+1} = g(x_n), \quad n = 0, 1, \ldots, \tag{1}$$
where $x_0$ is the starting point. The most widely used example of these kinds of methods is the classical Newton's method, given by:
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \quad n = 0, 1, \ldots \tag{2}$$
It is well known that it converges quadratically to simple zeros and linearly to multiple zeros. In the literature, many modifications of Newton's scheme have been published in order to improve its order of convergence and stability. Interesting overviews of this area of research can be found in [1][2][3].
The works of Weerakoon and Fernando [4] and, later, Özban [5] have inspired a whole set of variants of Newton's method, whose main characteristic is the use of different means in the iterative expression.
It is known that if a sequence $\{x_n\}_{n \ge 0}$ tends to a limit $\alpha$ in such a way that there exist a constant $C > 0$ and a positive integer $n_0$ such that:
$$|x_{n+1} - \alpha| \le C\,|x_n - \alpha|^p, \quad \forall n \ge n_0,$$
for $p \ge 1$, then $p$ is called the order of convergence of the sequence and $C$ is known as the asymptotic error constant. For $p = 1$, the constant $C$ satisfies $0 < C \le 1$.
If we denote by $e_n = x_n - \alpha$ the exact error of the $n$th iterate, then the relation:
$$e_{n+1} = C e_n^p + O(e_n^{p+1})$$
is called the error equation for the method and $p$ is the order of convergence.
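Let us suppose that $f : I \subseteq \mathbb{R} \to \mathbb{R}$ is a sufficiently differentiable function and $\alpha$ is a simple zero of $f$. It is plain that:
$$f(x) = f(x_n) + \int_{x_n}^{x} f'(t)\,dt. \tag{5}$$
Weerakoon and Fernando in [4] approximated the definite integral in (5) by using the trapezoidal rule and taking $x = \alpha$, getting:
$$0 \approx f(x_n) + \frac{1}{2}(\alpha - x_n)\left(f'(x_n) + f'(\alpha)\right),$$
and therefore, a new approximation $x_{n+1}$ to $\alpha$ is given by:
$$x_{n+1} = x_n - \frac{2 f(x_n)}{f'(x_n) + f'(z_n)}, \quad z_n = x_n - \frac{f(x_n)}{f'(x_n)}.$$
Thus, this variant of Newton's scheme can be considered to be obtained by replacing the denominator $f'(x_n)$ of Newton's method (2) by the arithmetic mean of $f'(x_n)$ and $f'(z_n)$. Therefore, it is known as the arithmetic mean Newton method (AN).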
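The two schemes discussed so far can be sketched in a few lines of Python. This is only an illustrative sketch: the test problem $f(x) = x^3 - 2$, the starting point, the tolerance, and the function names are assumptions made here and are not taken from the paper.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Classical Newton iteration; returns (approximate root, iterations used)."""
    x = x0
    for n in range(1, max_iter + 1):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


def arithmetic_mean_newton(f, df, x0, tol=1e-12, max_iter=50):
    """AN: Newton's denominator f'(x_n) is replaced by the arithmetic mean
    of f'(x_n) and f'(z_n), where z_n is an auxiliary Newton step."""
    x = x0
    for n in range(1, max_iter + 1):
        z = x - f(x) / df(x)                      # auxiliary Newton step z_n
        x_new = x - 2.0 * f(x) / (df(x) + df(z))  # arithmetic-mean denominator
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


# Hypothetical test problem (not one of the paper's test functions).
f = lambda x: x**3 - 2.0
df = lambda x: 3.0 * x**2

root_cn, it_cn = newton(f, df, 1.5)
root_an, it_an = arithmetic_mean_newton(f, df, 1.5)
print("Newton:", root_cn, it_cn)
print("AN:    ", root_an, it_an)
```

Note that AN needs one evaluation of $f$ and two of $f'$ per iteration, against one of each for Newton's method.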
In a similar way, the arithmetic mean can be replaced by other means. In particular, the harmonic mean $M_{Ha}(x, y) = 2xy/(x + y)$, where $x$ and $y$ are two nonnegative real numbers, can be seen from a different point of view:
$$M_{Ha}(x, y) = \frac{2xy}{x + y} = x\,\frac{y}{x + y} + y\,\frac{x}{x + y} = \theta x + (1 - \theta) y, \quad \theta = \frac{y}{x + y},$$
where, since $0 \le y \le x + y$, we have $0 \le \theta \le 1$; i.e., the harmonic mean can be seen as a convex combination of $x$ and $y$, where every element is given the relevance of the other one in the sum. Now, let us switch the roles of $x$ and $y$; we get:
$$x\,\frac{x}{x + y} + y\,\frac{y}{x + y} = \frac{x^2 + y^2}{x + y} = M_{CH}(x, y),$$
that is, the contraharmonic mean of $x$ and $y$.
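The following small check illustrates the observation above: the harmonic mean weights $x$ by $y/(x+y)$, and swapping the weights gives the contraharmonic mean. The sample values and helper names are arbitrary choices made for this sketch.

```python
def harmonic(x, y):
    return 2.0 * x * y / (x + y)


def contraharmonic(x, y):
    return (x * x + y * y) / (x + y)


def convex(theta, x, y):
    """Convex combination theta*x + (1 - theta)*y."""
    return theta * x + (1.0 - theta) * y


x, y = 3.0, 5.0            # arbitrary positive sample values
theta_h = y / (x + y)      # weight of x in the harmonic mean ("opposite" weight)
theta_ch = x / (x + y)     # roles swapped: weight of x in the contraharmonic mean

print(harmonic(x, y), convex(theta_h, x, y))
print(contraharmonic(x, y), convex(theta_ch, x, y))
```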
Özban in [5] used the harmonic mean instead of the arithmetic one, which led to a new method:
$$x_{n+1} = x_n - \frac{f(x_n)\left(f'(x_n) + f'(z_n)\right)}{2 f'(x_n) f'(z_n)},$$
with $z_n$ being a Newton step, which he called the harmonic mean Newton method (HN).
Ababneh in [6] designed an iterative method associated with the contraharmonic mean, called the contraharmonic mean Newton method (CHN), whose iterative expression is:
$$x_{n+1} = x_n - \frac{f(x_n)\left(f'(x_n) + f'(z_n)\right)}{f'(x_n)^2 + f'(z_n)^2},$$
with third order of convergence for simple roots of $f(x) = 0$, as is also the case for the methods proposed by Weerakoon and Fernando [4] and Özban [5]. This idea has been used by different authors for designing iterative methods applying other means, generating symmetric fixed-point operators. For example, Xiaojian in [7] employed the generalized mean of order $m \in \mathbb{R}$ between two values $x$ and $y$, defined as:
$$M_G(x, y) = \left(\frac{x^m + y^m}{2}\right)^{1/m},$$
to construct a third-order iterative method for solving nonlinear equations. Furthermore, Singh et al. in [8] presented a third-order iterative scheme by using the Heronian mean between two values $x$ and $y$, defined as:
$$M_{He}(x, y) = \frac{1}{3}\left(x + \sqrt{xy} + y\right).$$
Finally, Verma in [9], following the same procedure, designed a third-order iterative method by using the centroidal mean between two values $x$ and $y$, defined as:
$$M_{Ce}(x, y) = \frac{2\left(x^2 + xy + y^2\right)}{3(x + y)}.$$

In this paper, we check that all these means are functional convex combination means and develop a simple test to prove easily the third order of the corresponding iterative methods mentioned before. Moreover, we introduce a new method based on the Lehmer mean of order $m \in \mathbb{R}$, defined as:
$$M_{L_m}(x, y) = \frac{x^m + y^m}{x^{m-1} + y^{m-1}},$$
and propose a generalization that also satisfies the previous test. Finally, all these schemes are numerically tested, and their dependence on initial estimations is studied by means of their basins of attraction. These basins are shown to be clearly symmetric.

The rest of the paper is organized as follows: Section 2 is devoted to designing a test that allows us to characterize the third-order convergence of the iterative method defined by a mean. This characterization is used in Section 3 for giving an alternative proof of the convergence of mean-based variants of Newton's (MBN) methods, including some new ones. In Section 4, we generalize the previous methods by using the concept of σ-means. Section 5 is devoted to numerical results and the use of basins of attraction in order to analyze the dependence of the iterative methods on the initial estimations used. The manuscript finishes with some conclusions.
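All the mean-based variants mentioned above share the same structure: a Newton step $z_n$, followed by a step whose denominator is a mean of $f'(x_n)$ and $f'(z_n)$. A minimal sketch of that common structure follows. The test problem $f(x) = x^3 - 2$, the orders chosen for the generalized and Lehmer means, and all function names are assumptions made for illustration, and the means are written in their usual textbook form.

```python
import math

# Means of two positive numbers, in their standard form.
MEANS = {
    "arithmetic":     lambda x, y: (x + y) / 2.0,
    "harmonic":       lambda x, y: 2.0 * x * y / (x + y),
    "contraharmonic": lambda x, y: (x * x + y * y) / (x + y),
    "heronian":       lambda x, y: (x + math.sqrt(x * y) + y) / 3.0,
    "centroidal":     lambda x, y: 2.0 * (x * x + x * y + y * y) / (3.0 * (x + y)),
    "generalized(2)": lambda x, y: ((x**2 + y**2) / 2.0) ** 0.5,
    "lehmer(3)":      lambda x, y: (x**3 + y**3) / (x**2 + y**2),
}


def mbn_solve(mean, f, df, x0, tol=1e-12, max_iter=50):
    """Generic mean-based Newton variant: x_{n+1} = x_n - f(x_n) / M(f'(x_n), f'(z_n))."""
    x = x0
    for n in range(1, max_iter + 1):
        z = x - f(x) / df(x)                    # auxiliary Newton step
        x_new = x - f(x) / mean(df(x), df(z))   # mean-based denominator
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


# Hypothetical test problem with f' > 0 near the root, so all means are well defined.
f = lambda x: x**3 - 2.0
df = lambda x: 3.0 * x**2

results = {name: mbn_solve(mean, f, df, 1.5) for name, mean in MEANS.items()}
for name, (root, iters) in results.items():
    print(f"{name:15s} root={root:.12f} iterations={iters}")
```

Passing the mean as a function keeps the iteration itself unchanged when a new mean is tried.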

Convex Combination
In the same way as stated in the Introduction for the arithmetic, harmonic, and contraharmonic means, the rest of the mentioned means can also be regarded as convex combinations. This is not coincidental: one of the most interesting properties that a mean satisfies is the averaging property: $\min(x, y) \le M(x, y) \le \max(x, y)$, where $M(x, y)$ is any mean function of nonnegative $x$ and $y$. This implies that every mean that satisfies this property is a certain convex combination of its arguments.
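Indeed, for $x \ne y$, there exists a unique $\theta(x, y) \in [0, 1]$ such that:
$$M(x, y) = \theta(x, y)\,x + \left(1 - \theta(x, y)\right) y.$$
This approach suggests that it is possible to generalize every mean-based variant of Newton's method (MBN) by studying their convex combination counterparts. As a matter of fact, every mean-based variant of Newton's method can be rewritten as:
$$x_{n+1} = x_n - \frac{f(x_n)}{\theta f'(x_n) + (1 - \theta) f'(z_n)}, \tag{18}$$
where $z_n = x_n - f(x_n)/f'(x_n)$ and $\theta = \theta(f'(x_n), f'(z_n))$. This is a particular case of a family of iterative schemes constructed in [10].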
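For $x \ne y$, the weight attached to a given mean can be recovered explicitly as $\theta(x, y) = (M(x, y) - y)/(x - y)$. The sketch below evaluates this weight for several means at a few arbitrary sample points; the helper name `theta` and the sample values are illustrative assumptions.

```python
import math


def theta(mean, x, y):
    """Weight of x in M(x, y) = theta*x + (1 - theta)*y, for x != y."""
    return (mean(x, y) - y) / (x - y)


means = {
    "arithmetic":     lambda x, y: (x + y) / 2.0,
    "harmonic":       lambda x, y: 2.0 * x * y / (x + y),
    "contraharmonic": lambda x, y: (x * x + y * y) / (x + y),
    "heronian":       lambda x, y: (x + math.sqrt(x * y) + y) / 3.0,
    "centroidal":     lambda x, y: 2.0 * (x * x + x * y + y * y) / (3.0 * (x + y)),
}

samples = [(3.0, 5.0), (0.5, 0.1), (2.0, 7.5)]   # arbitrary positive pairs
thetas = {}
for name, mean in means.items():
    for x, y in samples:
        t = theta(mean, x, y)
        thetas[(name, x, y)] = t
        # Recombining x and y with this weight must give back the mean.
        assert abs(t * x + (1.0 - t) * y - mean(x, y)) < 1e-12
    print(name, [round(thetas[(name, x, y)], 4) for x, y in samples])
```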
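We are interested in studying its order of convergence as a function of $\theta$. Thus, we need to compute the Taylor expansion of the convex combination in the denominator and then its inverse:
$$\theta f'(x_n) + (1 - \theta) f'(z_n) = f'(\alpha)\left[1 + 2\theta c_2 e_n + \left(2c_2^2 + \theta\left(3c_3 - 2c_2^2\right)\right) e_n^2 + O(e_n^3)\right],$$
where $c_j = \frac{1}{j!}\frac{f^{(j)}(\alpha)}{f'(\alpha)}$, $j = 1, 2, \ldots$ Then, its inverse can be expressed as:
$$\left[\theta f'(x_n) + (1 - \theta) f'(z_n)\right]^{-1} = \frac{1}{f'(\alpha)}\left[1 - 2\theta c_2 e_n + \left(\left(4\theta^2 + 2\theta - 2\right) c_2^2 - 3\theta c_3\right) e_n^2 + O(e_n^3)\right],$$
and by replacing it in (18), it leads to the MBN error equation as a function of $\theta$:
$$e_{n+1} = (2\theta - 1)\, c_2 e_n^2 + \left[\left(2 - 4\theta^2\right) c_2^2 + (3\theta - 1)\, c_3\right] e_n^3 + O(e_n^4). \tag{22}$$
Equation (22) can be used to re-discover the results of convergence. For example, for the contraharmonic mean, we have:
$$M_{CH}(f'(x_n), f'(z_n)) = \frac{f'(x_n)^2 + f'(z_n)^2}{f'(x_n) + f'(z_n)},$$
where:
$$f'(x_n)^2 = f'(\alpha)^2\left[1 + 4c_2 e_n + \left(4c_2^2 + 6c_3\right) e_n^2 + O(e_n^3)\right], \quad f'(z_n)^2 = f'(\alpha)^2\left[1 + 4c_2^2 e_n^2 + O(e_n^3)\right],$$
so that:
$$M_{CH}(f'(x_n), f'(z_n)) = f'(\alpha)\left[1 + c_2 e_n + \left(2c_2^2 + \tfrac{3}{2} c_3\right) e_n^2 + O(e_n^3)\right].$$
Thus, we can obtain the $\theta$ associated with the contraharmonic mean:
$$\theta(f'(x_n), f'(z_n)) = \frac{f'(x_n)}{f'(x_n) + f'(z_n)} = \frac{1}{2} + \frac{1}{2} c_2 e_n + O(e_n^2).$$
Finally, by replacing the previous expression in (22):
$$e_{n+1} = \left(2c_2^2 + \tfrac{1}{2} c_3\right) e_n^3 + O(e_n^4),$$
and we obtain again that the convergence of the contraharmonic mean Newton method is cubic. Regarding the harmonic mean, it is straightforward that it is a functional convex combination, with:
$$\theta(f'(x_n), f'(z_n)) = \frac{f'(z_n)}{f'(x_n) + f'(z_n)} = \frac{1}{2} - \frac{1}{2} c_2 e_n + O(e_n^2).$$
Replacing this expression in (22), we find the cubic convergence of the harmonic mean Newton method:
$$e_{n+1} = \tfrac{1}{2} c_3 e_n^3 + O(e_n^4).$$
In both cases, the independent term of $\theta(f'(x_n), f'(z_n))$ was $1/2$; this was not a coincidence, but an instance of the following more general result.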
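Theorem 1. Let $\theta = \theta(f'(x_n), f'(z_n))$ be associated with the mean-based variant of Newton's method (MBN):
$$x_{n+1} = x_n - \frac{f(x_n)}{M(f'(x_n), f'(z_n))}, \quad z_n = x_n - \frac{f(x_n)}{f'(x_n)},$$
where $M$ is a mean function of the variables $f'(x_n)$ and $f'(z_n)$. Then, MBN converges, at least, cubically if and only if the estimate:
$$\theta = \frac{1}{2} + O(e_n)$$
holds.
Proof. We replace $\theta = 1/2 + O(e_n)$ in the MBN error Equation (22). The coefficient of $e_n^2$, namely $(2\theta - 1)\,c_2$, is then $O(e_n)$, obtaining:
$$e_{n+1} = O(e_n^3).$$
Now, some considerations follow.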
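Remark 1. Generally speaking,
$$\theta = a_0 + a_1 e_n + a_2 e_n^2 + a_3 e_n^3 + \cdots, \tag{33}$$
where $a_i$ are real numbers. If we put (33) in (22), we have:
$$e_{n+1} = (2a_0 - 1)\, c_2 e_n^2 + \left[2a_1 c_2 + \left(2 - 4a_0^2\right) c_2^2 + (3a_0 - 1)\, c_3\right] e_n^3 + O(e_n^4);$$
it follows that, in order to attain cubic convergence, the coefficient of $e_n^2$ must be zero. Therefore, $a_0 = 1/2$. On the other hand, to achieve a higher order (i.e., at least four), we need to solve the following system:
$$\begin{cases} (2a_0 - 1)\, c_2 = 0, \\ 2a_1 c_2 + \left(2 - 4a_0^2\right) c_2^2 + (3a_0 - 1)\, c_3 = 0. \end{cases}$$
This gives us that $a_0 = 1/2$ and $a_1 = -\frac{1}{4}\,\frac{2c_2^2 + c_3}{c_2}$ ensure at least fourth-order convergence of the method. However, none of the MBN methods under analysis satisfy these conditions simultaneously.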
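The role of the condition $a_0 = 1/2$ can be observed numerically with a constant weight $\theta$. The sketch below runs the convex-combination scheme $x_{n+1} = x_n - f(x_n)/(\theta f'(x_n) + (1-\theta) f'(z_n))$ in high-precision decimal arithmetic and estimates the order from three consecutive step lengths. The test function $f(x) = x^3 - 2$, the starting point, the working precision, and the number of steps are assumptions chosen for this illustration only.

```python
from decimal import Decimal, getcontext

getcontext().prec = 200  # enough digits to stay in the asymptotic regime


def f(x):
    return x**3 - 2      # hypothetical test function, not one of the paper's


def df(x):
    return 3 * x**2


def iterate(theta, x0, steps):
    """Iterates of the convex-combination scheme with a constant weight theta."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        z = x - f(x) / df(x)   # auxiliary Newton step
        xs.append(x - f(x) / (theta * df(x) + (1 - theta) * df(z)))
    return xs


def estimated_order(xs):
    """Order estimate from the last three step lengths |x_{k+1} - x_k|."""
    d1 = abs(xs[-1] - xs[-2])
    d2 = abs(xs[-2] - xs[-3])
    d3 = abs(xs[-3] - xs[-4])
    return float((d1 / d2).ln() / (d2 / d3).ln())


x0 = Decimal("1.5")
order_half = estimated_order(iterate(Decimal(1) / 2, x0, 5))    # arithmetic mean
order_third = estimated_order(iterate(Decimal(1) / 3, x0, 5))   # constant theta = 1/3
print("theta = 1/2:", order_half)
print("theta = 1/3:", order_third)
```

With $\theta = 1/2$ the estimate should be close to three, while $\theta = 1/3$ leaves a nonzero $e_n^2$ term and the estimate should be close to two.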
The most useful aspect of Theorem 1 is synthesized in the following corollary, which we call the "θ-test".

Corollary 1 (θ-test).
Under the same hypotheses as Theorem 1, an MBN converges at least cubically if and only if the Taylor expansion of the mean satisfies:
$$M(f'(x_n), f'(z_n)) = f'(\alpha)\left[1 + c_2 e_n + O(e_n^2)\right].$$
Let us notice that Corollary 1 provides a test to analyze the convergence of an MBN without having to find out the inherent $\theta$, therefore appreciably reducing the overall complexity of the analysis.
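A rough numerical counterpart of the θ-test is to estimate the first-order coefficient of $M(f'(x_n), f'(z_n))/f'(\alpha)$ for a small error $e_n$ and compare it with $c_2$. The sketch below does so on the hypothetical problem $f(x) = x^3 - 2$; the step $e = 10^{-4}$, the names, and the inclusion of a constant-weight combination with $\theta = 1/3$ as a counterexample are choices made here for illustration.

```python
import math

alpha = 2.0 ** (1.0 / 3.0)                 # simple zero of f(x) = x^3 - 2
f = lambda x: x**3 - 2.0
df = lambda x: 3.0 * x**2
c2 = (6.0 * alpha) / (2.0 * df(alpha))     # c_2 = f''(alpha) / (2 f'(alpha))

means = {
    "arithmetic":     lambda x, y: (x + y) / 2.0,
    "harmonic":       lambda x, y: 2.0 * x * y / (x + y),
    "contraharmonic": lambda x, y: (x * x + y * y) / (x + y),
    "heronian":       lambda x, y: (x + math.sqrt(x * y) + y) / 3.0,
    "centroidal":     lambda x, y: 2.0 * (x * x + x * y + y * y) / (3.0 * (x + y)),
    "theta=1/3":      lambda x, y: x / 3.0 + 2.0 * y / 3.0,   # not symmetric in x, y
}


def first_order_coefficient(mean, e=1e-4):
    """Approximates the coefficient of e_n in M(f'(x_n), f'(z_n)) / f'(alpha)."""
    x = alpha + e
    z = x - f(x) / df(x)
    return (mean(df(x), df(z)) / df(alpha) - 1.0) / e


coeffs = {name: first_order_coefficient(mean) for name, mean in means.items()}
for name, value in coeffs.items():
    verdict = "passes" if abs(value - c2) < 1e-2 else "fails"
    print(f"{name:15s} coefficient={value:.5f}  c2={c2:.5f}  {verdict}")
```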

Re-Proving Known Results for MBN
In this section, we apply Corollary 1 to prove the cubic convergence of known MBN via a convex combination approach.

New MBN by Using the Lehmer Mean and Its Generalization
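The iterative expression of the scheme based on the Lehmer mean of order $m \in \mathbb{R}$ is:
$$x_{n+1} = x_n - \frac{f(x_n)}{M_{L_m}(f'(x_n), f'(z_n))},$$
where $z_n = x_n - f(x_n)/f'(x_n)$ and:
$$M_{L_m}(f'(x_n), f'(z_n)) = \frac{f'(x_n)^m + f'(z_n)^m}{f'(x_n)^{m-1} + f'(z_n)^{m-1}}. \tag{41}$$
Indeed, there are suitable values of the parameter $m$ such that the associated Lehmer mean equals the arithmetic one and the geometric one, but also the harmonic and the contraharmonic ones. In what follows, we will find it again, this time in a more general context.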
By analyzing the associated θ-test, we conclude that the iterative scheme designed with this mean has order of convergence three, since:
$$M_{L_m}(f'(x_n), f'(z_n)) = f'(\alpha)\left[1 + c_2 e_n + O(e_n^2)\right]. \tag{42}$$
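The special cases of the Lehmer mean and the corresponding Newton variant can be sketched as follows, assuming the usual definition $M_{L_m}(x, y) = (x^m + y^m)/(x^{m-1} + y^{m-1})$ for positive $x$ and $y$. Under that assumption, $m = 1$, $m = 0$, $m = 2$, and $m = 1/2$ give the arithmetic, harmonic, contraharmonic, and geometric means, respectively. The test problem and function names are hypothetical.

```python
import math


def lehmer(m, x, y):
    """Lehmer mean of order m of two positive numbers (standard definition assumed)."""
    return (x**m + y**m) / (x ** (m - 1) + y ** (m - 1))


x, y = 3.0, 5.0   # arbitrary positive sample values
special_cases = {
    "arithmetic (m=1)":     (lehmer(1, x, y), (x + y) / 2.0),
    "harmonic (m=0)":       (lehmer(0, x, y), 2.0 * x * y / (x + y)),
    "contraharmonic (m=2)": (lehmer(2, x, y), (x * x + y * y) / (x + y)),
    "geometric (m=1/2)":    (lehmer(0.5, x, y), math.sqrt(x * y)),
}
for name, (value, expected) in special_cases.items():
    print(f"{name:22s} {value:.12f} {expected:.12f}")


def lehmer_newton(m, f, df, x0, tol=1e-12, max_iter=50):
    """LN(m): Newton variant whose denominator is the Lehmer mean of f'(x_n), f'(z_n)."""
    x = x0
    for n in range(1, max_iter + 1):
        z = x - f(x) / df(x)
        x_new = x - f(x) / lehmer(m, df(x), df(z))
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


# Hypothetical test problem; m = -7 is the order used in the paper's experiments.
root, iters = lehmer_newton(-7, lambda t: t**3 - 2.0, lambda t: 3.0 * t**2, 1.5)
print(root, iters)
```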

σ-Means
Now, we propose a new family of means of $n$ variables, starting again from convex combinations. The core idea in this work is that, in the end, two distinct means only differ in their corresponding weights $\theta$ and $1 - \theta$. In particular, we can regard the harmonic mean as an "opposite-weighted" mean, while the contraharmonic one is a "self-weighted" mean.
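This behavior can be generalized to $n$ variables:
$$M_{CH}(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i\, \frac{x_i}{\sum_{j=1}^{n} x_j} = \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} x_i} \tag{43}$$
is the contraharmonic mean among $n$ numbers. Equation (43) is just a particular case of what we call a σ-mean.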
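Definition 1 (σ-mean). Given $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ a vector of $n$ real numbers and a bijective map $\sigma : \{1, \ldots, n\} \to \{1, \ldots, n\}$ (i.e., $\sigma(x)$ is a permutation of $x_1, \ldots, x_n$), we call the σ-mean of order $m \in \mathbb{R}$ the real number given by:
$$M_\sigma(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i\, \frac{x_{\sigma(i)}^{m-1}}{\sum_{j=1}^{n} x_j^{m-1}}. \tag{44}$$
Indeed, it is easy to see that, in a σ-mean, the weight assigned to each node $x_i$ is:
$$\frac{x_{\sigma(i)}^{m-1}}{\sum_{j=1}^{n} x_j^{m-1}} = \frac{x_{\sigma(i)}^{m-1}}{\sum_{j=1}^{n} x_{\sigma(j)}^{m-1}} \in [0, 1], \tag{45}$$
where the equality holds because σ is a permutation of the indices. We are, therefore, still dealing with a convex combination, which implies that Definition 1 is well posed.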
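A minimal sketch of the σ-mean follows. It assumes the form $M_\sigma(x) = \sum_i x_i\, x_{\sigma(i)}^{m-1} / \sum_j x_j^{m-1}$ for positive numbers, which reduces to the Lehmer mean $\sum_i x_i^m / \sum_j x_j^{m-1}$ for the identity permutation. Permutations are encoded as 0-based tuples, and the sample data are arbitrary.

```python
def sigma_mean(xs, sigma, m):
    """Returns (sigma-mean of order m, list of weights).

    xs    : positive numbers x_1, ..., x_n
    sigma : permutation as a 0-based tuple, sigma[i] = index paired with position i
    """
    total = sum(x ** (m - 1) for x in xs)
    weights = [xs[sigma[i]] ** (m - 1) / total for i in range(len(xs))]
    return sum(w * x for w, x in zip(weights, xs)), weights


xs = [1.0, 2.0, 4.0]     # arbitrary positive sample values
m = 3

identity = (0, 1, 2)
cycle = (2, 0, 1)        # an arbitrary non-identity permutation

lehmer_value = sum(x**m for x in xs) / sum(x ** (m - 1) for x in xs)
mean_id, weights_id = sigma_mean(xs, identity, m)
mean_cycle, weights_cycle = sigma_mean(xs, cycle, m)

print(mean_id, lehmer_value)        # the identity permutation gives the Lehmer mean
print(mean_cycle, weights_cycle)    # the weights are a permutation of the Lehmer weights
```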
We remark that if we take $\sigma = 1$, i.e., the identity permutation, in (44), we find the Lehmer mean of order $m$. Actually, the Lehmer mean is a very special case of the σ-mean, as the following result proves.
Proposition 1. Given $m \in \mathbb{R}$, the Lehmer mean of order $m$ is the maximum σ-mean of order $m$.
Proof. We recall the rearrangement inequality:
$$x_n y_1 + \cdots + x_1 y_n \le x_{\sigma(1)} y_1 + \cdots + x_{\sigma(n)} y_n \le x_1 y_1 + \cdots + x_n y_n,$$
which holds for every choice of $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ regardless of the signs, assuming that both $x_i$ and $y_j$ are sorted in increasing order. In particular, $x_1 < x_2 < \cdots < x_n$ and $y_1 < y_2 < \cdots < y_n$ imply that the upper bound is attained only for the identity permutation. Then, to prove the result, it is enough to replace every $y_i$ with the corresponding weight defined in (45).
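Proposition 1 can be checked by brute force on a small example. The sketch below enumerates all permutations in exact rational arithmetic, assuming the σ-mean has the form $\sum_i x_i\, x_{\sigma(i)}^{m-1} / \sum_j x_j^{m-1}$. The data are arbitrary, and the order $m = 3$ is chosen so that the weights $x_i^{m-1}$ are sorted in the same increasing order as the $x_i$, as the rearrangement argument requires.

```python
from fractions import Fraction
from itertools import permutations


def sigma_mean(xs, sigma, m):
    """Sigma-mean of order m; sigma is a 0-based permutation tuple."""
    total = sum(x ** (m - 1) for x in xs)
    return sum(x * xs[s] ** (m - 1) for x, s in zip(xs, sigma)) / total


xs = [Fraction(1), Fraction(2), Fraction(4), Fraction(7)]   # strictly increasing
m = 3                                                      # weights x^(m-1) increase too

values = {sigma: sigma_mean(xs, sigma, m) for sigma in permutations(range(len(xs)))}
best = max(values, key=values.get)
lehmer_value = sum(x**m for x in xs) / sum(x ** (m - 1) for x in xs)

print("maximizing permutation:", best)
print("maximum sigma-mean:", values[best], " Lehmer mean:", lehmer_value)
```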
The Lehmer mean and σ-mean are deeply related: if $n = 2$, as is the case of MBN, there are only two possible permutations, the identity and the one that swaps 1 and 2. We have already observed that the identity permutation leads to the Lehmer mean; however, if we express σ in standard cycle notation as $\sigma = (1, 2)$, we have that:
$$M_\sigma(x_1, x_2) = \frac{x_1 x_2^{m-1} + x_2 x_1^{m-1}}{x_1^{m-1} + x_2^{m-1}} = \frac{x_1^{2-m} + x_2^{2-m}}{x_1^{1-m} + x_2^{1-m}},$$
which is again a Lehmer mean, now of order $2 - m$. We conclude this section by proving another property of σ-means, which is that the arithmetic mean of all possible σ-means of $n$ numbers equals the arithmetic mean of the numbers themselves.
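Proposition 2. Given $n$ real numbers $x_1, \ldots, x_n$ and $\Sigma_n$ denoting the set of all possible permutations of $\{1, \ldots, n\}$, we have:
$$\frac{1}{n!} \sum_{\sigma \in \Sigma_n} M_\sigma(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} x_i, \tag{48}$$
for all $m \in \mathbb{R}$.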
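Proof. Let us rewrite Equation (48); by definition, we have:
$$\frac{1}{n!} \sum_{\sigma \in \Sigma_n} \sum_{i=1}^{n} x_i\, \frac{x_{\sigma(i)}^{m-1}}{\sum_{j=1}^{n} x_j^{m-1}} = \frac{1}{n} \sum_{i=1}^{n} x_i,$$
and we claim that the last equality holds. Indeed, we notice that every term in the sum of the σ-means on the left side of the last equality involves a constant denominator, so we can multiply both sides by it and also by $n!$ to get:
$$\sum_{\sigma \in \Sigma_n} \sum_{i=1}^{n} x_i\, x_{\sigma(i)}^{m-1} = (n-1)! \left(\sum_{i=1}^{n} x_i\right) \left(\sum_{j=1}^{n} x_j^{m-1}\right). \tag{50}$$
Now, it is just a matter of distributing the product on the right in a careful way: if we fix $i, k \in \{1, \ldots, n\}$, in $\Sigma_n$ there are exactly $(n-1)!$ permutations σ such that $\sigma(i) = k$, so each product $x_i\, x_k^{m-1}$ appears exactly $(n-1)!$ times on the left. Therefore, the equality in (50) follows straightforwardly.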
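Proposition 2 can also be verified exactly on a small example by averaging over all $n!$ permutations with rational arithmetic. As before, the sketch assumes the σ-mean has the form $\sum_i x_i\, x_{\sigma(i)}^{m-1} / \sum_j x_j^{m-1}$; the data and the integer orders tested are arbitrary choices.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sigma_mean(xs, sigma, m):
    """Sigma-mean of order m; sigma is a 0-based permutation tuple."""
    total = sum(x ** (m - 1) for x in xs)
    return sum(x * xs[s] ** (m - 1) for x, s in zip(xs, sigma)) / total


def average_sigma_mean(xs, m):
    """Arithmetic mean of the sigma-means over all n! permutations."""
    n = len(xs)
    total = sum(sigma_mean(xs, s, m) for s in permutations(range(n)))
    return total / factorial(n)


xs = [Fraction(1), Fraction(2), Fraction(4), Fraction(7)]
arithmetic = sum(xs) / len(xs)
averages = {m: average_sigma_mean(xs, m) for m in (-7, 0, 2, 3)}

for m, value in averages.items():
    print(f"m = {m:2d}: average of sigma-means = {value}, arithmetic mean = {arithmetic}")
```

Because fractions are exact, the equality holds without any tolerance, for every order tried.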

Numerical Results and Dependence on Initial Estimations
Now, we present the results of some numerical computations, in which the following test functions have been used.
The numerical tests were carried out by using MATLAB with double precision arithmetic on a computer with an i7-8750H @2.20 GHz processor and 16 GB of RAM, and the stopping criterion used was [...]. We used the harmonic mean Newton method (HN), the contraharmonic mean Newton method (CHN), the Lehmer mean Newton method (LN(m)), a variant of Newton's method where the mean is a convex combination with $\theta = 1/3$ (1/3N), and the classic Newton method (CN). The main goals of these calculations are to confirm the theoretical results stated in the preceding sections and to compare the different methods, with CN as a control benchmark. In Table 1, we show the number of iterations that each method needs for satisfying the stopping criterion and also the approximated computational order of convergence (ACOC), defined in [11], with the expression:
$$p \approx ACOC = \frac{\ln\left(|x_{n+1} - x_n| / |x_n - x_{n-1}|\right)}{\ln\left(|x_n - x_{n-1}| / |x_{n-1} - x_{n-2}|\right)}, \quad n = 2, 3, \ldots,$$
which is considered as a numerical approximation of the theoretical order of convergence $p$.
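The approximated computational order of convergence can be computed from the iterates alone, using the usual quotient of logarithms of consecutive step lengths. The sketch below applies it to five Newton steps on the hypothetical problem $f(x) = x^3 - 2$ in double precision; the function name `acoc` and the test problem are assumptions made for illustration.

```python
import math


def acoc(xs):
    """ACOC estimates for n = 2, 3, ... from iterates x_0, ..., x_N (N >= 3).

    Each estimate is ln(|x_{n+1}-x_n| / |x_n-x_{n-1}|) / ln(|x_n-x_{n-1}| / |x_{n-1}-x_{n-2}|).
    """
    estimates = []
    for n in range(2, len(xs) - 1):
        d_next = abs(xs[n + 1] - xs[n])
        d_curr = abs(xs[n] - xs[n - 1])
        d_prev = abs(xs[n - 1] - xs[n - 2])
        estimates.append(math.log(d_next / d_curr) / math.log(d_curr / d_prev))
    return estimates


# Five steps of the classical Newton method on a hypothetical test problem.
xs = [1.5]
for _ in range(5):
    x = xs[-1]
    xs.append(x - (x**3 - 2.0) / (3.0 * x**2))

estimates = acoc(xs)
print(estimates)   # the values should approach the theoretical order 2
```

In double precision the estimate degrades once the step lengths reach rounding level, so only iterates computed before that point should be used.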
Table 1. Numerical results. HN, the harmonic mean Newton method; CHN, the contraharmonic mean Newton method; LN, the Lehmer-Newton method; CN, the classic Newton method.

Regarding the efficiency of the MBN, we used the efficiency index defined by Ostrowski in [12] as $EI = p^{1/d}$, where $p$ is the order of convergence of the method and $d$ is the number of functional evaluations per iteration. In this sense, all the MBN had the same $EI_{MBN} = 3^{1/3}$; meanwhile, Newton's scheme had the index $EI_{CN} = 2^{1/2}$. Therefore, all MBN were more efficient than the classical Newton method.

The presented numerical tests showed the performance of the different iterative methods to solve specific problems with fixed initial estimations and a stringent stopping criterion. However, it is useful to know their dependence on the initial estimation used. Although the convergence of the methods has been proven for real functions, it is usual to analyze the sets of convergent initial guesses in the complex plane (the proof would be analogous, replacing the differentiability condition on the function with holomorphy). To this end, we plotted the dynamical planes of each one of the iterative methods on the nonlinear functions $f_i(x)$, $i = 1, 2, \ldots, 5$, used in the numerical tests. In them, a mesh of 400 × 400 initial estimations was employed in the region of the complex plane [...].

We used the routines appearing in [13] to plot the dynamical planes corresponding to each method. In them, each point of the mesh was an initial estimation for the analyzed method on the specific problem. If the method reached a root in fewer than 40 iterations (closer than $10^{-3}$), then this point is painted in orange for the first root (green for the second, etc.); if the process converges to an attractor different from the roots, then the point is painted in black. The zeros of the nonlinear functions are marked in the different pictures by white stars.
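The construction of a dynamical plane described above can be sketched as follows. This is not the code of [13]: it is an independent, text-only sketch on a coarse 41 × 41 mesh, using the harmonic mean variant on the hypothetical polynomial $f(z) = z^3 - 1$ in the assumed region $[-2, 2] \times [-2, 2]$. The 40-iteration limit and the $10^{-3}$ tolerance follow the description in the text; the region, the symbols, and all names are illustrative.

```python
import cmath

ROOTS = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]   # zeros of z^3 - 1


def f(z):
    return z**3 - 1


def df(z):
    return 3 * z**2


def harmonic(x, y):
    return 2 * x * y / (x + y)


def root_index(z, tol):
    """Index of the root closer than tol to z, or None."""
    for k, r in enumerate(ROOTS):
        if abs(z - r) < tol:
            return k
    return None


def classify(z0, mean=harmonic, max_iter=40, tol=1e-3, escape=1000.0):
    """Returns the index of the root reached, -2 if |z| > escape at the end, else -1."""
    z = z0
    try:
        for _ in range(max_iter):
            k = root_index(z, tol)
            if k is not None:
                return k
            w = z - f(z) / df(z)                    # auxiliary Newton step
            z = z - f(z) / mean(df(z), df(w))       # mean-based step
    except (ZeroDivisionError, OverflowError):
        return -1                                   # iteration breaks down
    k = root_index(z, tol)
    if k is not None:
        return k
    return -2 if abs(z) > escape else -1


def dynamical_plane(n=41, half_width=2.0):
    """Classifies an n x n mesh of starting points in [-half_width, half_width]^2."""
    step = 2.0 * half_width / (n - 1)
    return [[classify(complex(-half_width + i * step, half_width - j * step))
             for i in range(n)] for j in range(n)]


SYMBOLS = {0: "o", 1: "+", 2: "x", -1: "#", -2: "~"}   # '#' plays the role of black
plane = dynamical_plane()
for row in plane:
    print("".join(SYMBOLS[c] for c in row))
```

Replacing `harmonic` by another mean function gives the corresponding plane for that variant; a plotting library would be needed to reproduce colored figures.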
In Figure 1, we observe that the harmonic and Lehmer (for $m = -7$) means showed the most stable performance, whose unique basins of attraction were those of the roots (plotted in orange, red, and green). In the rest of the cases, there existed black areas of no convergence to the zeros of the nonlinear function $f_1(x)$. Especially unstable were the cases of the Heronian, convex combination ($\theta = \pm 2$), and generalized means, with wide black areas and very small basins of the complex roots.
Regarding Figure 2, again the Heronian, convex combination ($\theta = -2$), and generalized means showed convergence only to one of the roots or very narrow basins of attraction. There existed black areas of no convergence to the roots in all cases, but the widest green and orange basins (corresponding to the zeros of $f_2(x)$) corresponded to the harmonic, contraharmonic, centroidal, and Lehmer means.
Function $f_3(x)$ had only one zero at $x \approx 0.25753$, whose basin of attraction is painted in orange in Figure 3. In general, most of the methods presented good performance; however, three methods did not converge to the root within the maximum number of iterations allowed: the Heronian mean and the generalized means with $m = \pm 2$. Moreover, the basin of attraction was reduced when the parameter $\theta$ of the convex combination mean was used.
A similar performance is observed in Figure 4, where the Heronian mean and the generalized means with $m = \pm 2$ showed no convergence to the only root of $f_4(x)$; meanwhile, the rest of the methods presented good behavior. Let us remark that in some cases, blue areas appear; these corresponded to initial estimations that, after 40 consecutive iterations, had an absolute value higher than 1000. In these cases, they and the surrounding black areas were identified as regions of divergence of the method. The best methods in this case were associated with the arithmetic and harmonic means.

In Figure 5, the best results in terms of the width of the basins of attraction of the roots were for the harmonic and Lehmer means, for $m = -7$. The biggest black areas corresponded to the convex combination with $\theta = -2$, where the three basins of attraction of the roots were very narrow, and for the Heronian and generalized means, there was only convergence to the real root.

Conclusions
The proposed θ-test (Corollary 1) has proven to be very useful to reduce the calculations in the convergence analysis of any MBN. Moreover, although mean-based variants of Newton's method are probably not the best context in which to appreciate the flexibility of σ-means, their use could still lead to interesting results due to their much greater capability of interpolating between numbers than already powerful means, such as the Lehmer one.
With regard to the numerical performance, Table 1 confirms that a convex combination with a constant coefficient could converge cubically if and only if it was the arithmetic mean; otherwise, as in the case tested here ($\theta = 1/3$), it converged quadratically, although generally in fewer iterations than CN. Regarding the number of iterations, there were nonlinear functions for which LN(m) converged with fewer iterations than HN. In our calculations, we set $m = -7$, but similar results were also achieved for different parameters. Regarding the dependence on initial estimations, the harmonic and Lehmer methods proved to be very stable, with the widest areas of convergence in most of the nonlinear problems used in the tests.
Author Contributions: The individual contributions of the authors are as follows: conceptualization, J.R.T.; writing, original draft preparation, J.F. and A.C.Z.; validation, A.C. and J.R.T.; formal analysis, A.C.; numerical experiments, J.F. and A.C.Z.