Abstract
A process for solving an algebraic equation was presented by Newton in 1669 and later by Raphson in 1690. This technique is called Newton’s method or the Newton–Raphson method and remains a popular technique for solving nonlinear equations in abstract spaces. The objective of this article is to update developments in the convergence of this method. In particular, it is shown that the Kantorovich theory for solving nonlinear equations using Newton’s method can be replaced by a finer one under no additional, and even weaker, conditions. Moreover, convergence order two is proven under these conditions. Furthermore, the new ratio of convergence is at least as small. The same methodology can be used to extend the applicability of other numerical methods. Numerical experiments complement this study.
MSC:
49M15; 47H17; 65G99; 65H10; 65N12; 58C15
1. Introduction
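Let $B_1$ and $B_2$ be Banach spaces, and let $\mathcal{L}(B_1, B_2)$ stand for the space of all continuous linear operators mapping $B_1$ into $B_2$. Consider a Fréchet-differentiable operator $F: D \subseteq B_1 \to B_2$ and its corresponding nonlinear equation
$$F(x) = 0, \qquad (1)$$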
with D denoting a nonempty open set. The task of determining a solution is very challenging but important, since applications from numerous computational disciplines can be reduced to the form (1) [1,2]. The analytic form of the solution is rarely attainable. That is why numerical methods generating approximations to the solution are mainly used. Most of them are based on Newton’s method [3,4,5,6,7]. Moreover, authors have developed efficient high-order and multi-step algorithms with derivatives [8,9,10,11,12,13] and divided differences [14,15,16,17,18].
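Among these processes, the most widely used are Newton’s method and its variants. In particular, Newton’s Method (NM) is defined, for an initial point $x_0 \in D$, as
$$x_{n+1} = x_n - F'(x_n)^{-1} F(x_n), \qquad n = 0, 1, 2, \ldots$$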
There exists a plethora of results related to the study of NM [3,5,6,7,19,20,21]. These papers are based on the theory inaugurated by Kantorovich and its variants [21]. Basically, the conditions (K) are used in non-affine or affine invariant form. Suppose (K1) ∃ point and parameter and
(K2) ∃ parameter Lipschitz condition
holds and
(K3)
and
(K4) where parameter is given later.
Denote for Set
There are many variants of Kantorovich’s convergence result for NM. One of these results follows [4,7,20].
Theorem 1.
Under conditions (K) for NM is contained in convergent to a solution of Equation (1), and
Moreover, the convergence is linear if and quadratic if Furthermore, the solution is unique in the first case and in in the second case where and scalar sequence is given as
A plethora of studies have used conditions (K) [3,4,5,19,21,22,23].
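As an illustration of the kind of scalar sequence that appears in Theorem 1, the following Python sketch generates the classical Kantorovich majorizing sequence, which is Newton's method applied to the quadratic p(t) = (lip/2) t^2 - t + eta starting from t = 0. The names `lip` (a Lipschitz constant as in (K2)) and `eta` (a bound on the first Newton step as in (K1)) are placeholders chosen here, and the criterion is assumed in the common form 2 * lip * eta <= 1; the notation of the theorem itself may differ.

```python
import math


def kantorovich_sequence(lip, eta, n_terms=10):
    """Newton's method applied to p(t) = (lip / 2) * t**2 - t + eta, from t = 0."""
    seq = [0.0]
    for _ in range(n_terms - 1):
        t = seq[-1]
        p = 0.5 * lip * t * t - t + eta   # p(t)
        dp = lip * t - 1.0                # p'(t), negative while t < 1 / lip
        if dp == 0.0:                     # degenerate case: stop instead of dividing by zero
            break
        seq.append(t - p / dp)
    return seq


def kantorovich_limit(lip, eta):
    """Smallest zero of p, which exists when 2 * lip * eta <= 1."""
    h = 2.0 * lip * eta
    if h > 1.0:
        raise ValueError("criterion 2 * lip * eta <= 1 is violated")
    return (1.0 - math.sqrt(1.0 - h)) / lip


lip, eta = 1.0, 0.4                       # illustrative values, 2 * lip * eta = 0.8
seq = kantorovich_sequence(lip, eta)
limit = kantorovich_limit(lip, eta)
print(seq[:4], limit)
```

With these illustrative values the sequence starts at 0, takes the first step eta, and increases towards the smallest zero of p. When the criterion fails, p has no real zero and `kantorovich_limit` raises an error instead of returning a value.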
Example 1.
Consider the cubic polynomial
for and parameter Select initial point Conditions (K) give and It follows that estimate
holds. That is, condition (K3) is not satisfied. Therefore, convergence is not assured by this theorem. However, NM may converge. Hence, there is a clear need to improve the results based on conditions (K).
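The situation described in Example 1 can be checked numerically. The sketch below assumes the cubic f(t) = t^3 - a on the closed ball of radius 1 - a around the initial point 1, with a in (0, 1/2), a test function frequently used in this line of work. The polynomial, the constants `eta = (1 - a) / 3` and `lip = 2 * (2 - a)` derived for it, and the criterion form 2 * lip * eta <= 1 are all assumptions made for this illustration, not values taken from the text.

```python
def newton(f, df, x0, tol=1e-14, max_iter=50):
    """Scalar Newton iteration; returns the approximate root and the step count."""
    x = x0
    for k in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x, k + 1
    raise RuntimeError("Newton's method did not converge")


def kantorovich_quantity(a):
    """2 * lip * eta for the assumed cubic t**3 - a with initial point 1."""
    eta = (1.0 - a) / 3.0      # |f(1) / f'(1)|
    lip = 2.0 * (2.0 - a)      # bound on |x + y| over the interval [a, 2 - a]
    return 2.0 * lip * eta


a = 0.4                        # illustrative parameter value in (0, 1/2)
h = kantorovich_quantity(a)
root, iterations = newton(lambda t: t**3 - a, lambda t: 3.0 * t**2, 1.0)
print(h > 1.0, root, iterations)
```

Under these assumptions the quantity 2 * lip * eta exceeds 1 for every a in (0, 1/2), so the classical criterion gives no guarantee, yet the iteration started at 1 still reaches the cube root of a.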
Looking at the crucial sufficient convergence condition (K3), at (K4), and at the majorizing sequence given by Kantorovich in the preceding Theorem 1, one sees that if the Lipschitz constant is replaced by a smaller one, say , then the convergence domain will be extended, the bounds on the error distances , will be tighter, and the location of the solution will be more accurate. This replacement will also lead to fewer Newton iterates to reach a prescribed accuracy (see the numerical Section). That is why, with the new methodology, a new domain is obtained inside D that also contains the Newton iterates. Then, L can replace in Theorem 1 to obtain the aforementioned extensions and benefits.
In this paper, several avenues are presented for achieving this goal. The idea is to replace the Lipschitz parameter by smaller ones.
(K5) Consider the center Lipschitz condition
the set and the Lipschitz-2 condition
(K6)
These Lipschitz parameters are related as
since
Notice also since parameters and M are specializations of parameter , but . Therefore, no additional work is required to find and M (see also [22,23]). Moreover, the ratio can be arbitrarily small. Indeed,
Example 2.
Define scalar function
for where are real parameters. It follows by this definition that for sufficiently large and sufficiently small, can be arbitrarily small, i.e.,
Then, clearly there can be a significant extension if parameters and or M and can be replaced in condition (K3). In this direction, the following replacements are presented in a series of papers [19,22,23], respectively
and
where and These items are related as follows:
and as relation
and
The preceding items indicate by how many times (at most) one improves on the other. These are the extensions given in the aforementioned references. However, it turns out that parameter L can replace in these papers (see Section 3). Denote by the corresponding items. It follows
for and Hence, the new results also extend the ones in the aforementioned references. Other extensions involve tighter majorizing sequences for NM (see Section 2) and improved uniqueness results for the solution (Section 3). The applications appear in Section 4, followed by conclusions in Section 5.
2. Majorizations
Let be given positive parameters and s be a positive variable. The real sequence defined for and by
plays an important role in the study of NM, where we adopted the notation That is why some convergence results for it are listed next.
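To make the role of such a real sequence concrete, the following Python sketch generates a majorizing sequence of a form widely used in the related literature: t_0 = 0, t_1 = s, and t_{n+2} = t_{n+1} + l (t_{n+1} - t_n)^2 / (2 (1 - l0 t_{n+1})). This recurrence is an assumption made for illustration; the sequence studied in this section may involve additional parameters in its first steps. The names `l0` and `l` are hypothetical stand-ins for a center-Lipschitz and a Lipschitz constant with l0 <= l. Setting `l0 = l` gives the classical Kantorovich sequence.

```python
def majorizing_sequence(l0, l, s, n_terms=12):
    """Assumed recurrence: t_{n+2} = t_{n+1} + l*(t_{n+1} - t_n)**2 / (2*(1 - l0*t_{n+1}))."""
    if not 0.0 < l0 <= l:
        raise ValueError("expected 0 < l0 <= l")
    seq = [0.0, s]
    while len(seq) < n_terms:
        t_prev, t = seq[-2], seq[-1]
        denom = 2.0 * (1.0 - l0 * t)
        if denom <= 0.0:
            raise ValueError("sequence left the region where 1 - l0 * t > 0")
        seq.append(t + l * (t - t_prev) ** 2 / denom)
    return seq


s = 0.4                                        # illustrative bound on the first step
tight = majorizing_sequence(0.5, 1.0, s)       # smaller center constant l0
classical = majorizing_sequence(1.0, 1.0, s)   # l0 = l, the Kantorovich case
for n, (t, u) in enumerate(zip(tight, classical)):
    print(n, t, u)
```

For these illustrative values, each term of the sequence built with the smaller center constant stays at or below the corresponding classical term, which is the sense in which such sequences yield tighter error bounds.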
Lemma 1.
Suppose conditions
hold Then, the following assertions hold
and such that
Proof.
Next, stronger convergence criteria are presented. However, these criteria are easier to verify than conditions of Lemma 1. Define parameter by
This parameter plays a role in the following results.
Case: and
Part (i) of the next auxiliary result relates to the Lemma in [19].
Lemma 2.
Suppose condition
holds, where
Then, the following assertions hold
- (i)
- Estimates
hold. Moreover, conclusions of Lemma 1 are true for sequence The sequence converges linearly to Furthermore, if for some
Then, the following assertions hold
- (ii)
- and where and the conclusions of Lemma 1 for sequence are true. The sequence converges quadratically to
Proof.
- (i)
- It is given in [19].
- (ii)
- is true. This estimate is true for since it is equivalent to But this is true by condition (11) and inequality Then, in view of estimate (13), estimate (17) certainly holds provided that
This estimate motivates the introduction of recurrent polynomials which are defined by
In view of polynomial assertion (18) holds if
The polynomials are connected:
so
Define function by
Hence, assertion (20) holds if
or equivalently
which can be rewritten as condition (14). Therefore, the induction for assertion (17) is completed. That is assertion (15) holds by the definition of sequence and estimate (15). It follows that
so
Notice also that then so □
Remark 1.
so
- (1)
- The technique of recurrent polynomials in part (i) is used to produce convergence condition (11) and a closed form upper bound on sequence (see estimate (13)) other than and (which is not given in closed form). This way we also established the linear convergence of sequence By considering condition (14) while still being able to use estimate (13), we establish the quadratic convergence of sequence in part (ii) of Lemma 2.
- (2)
- (3)
- Sequence is tighter than the Kantorovich sequence since and Concerning the ratio of convergence this is also smaller than given in the Kantorovich Theorem [19]. Indeed, by these definitions provided that where Notice that
Part (i) of the next auxiliary result relates to a Lemma in [19]. The case has been studied in the introduction. So, in the next Lemma we assume in part (ii).
Lemma 3.
Suppose condition
holds, where
Then, the following assertions hold
- (i)
- and Moreover, conclusions of Lemma 1 are true for sequence The sequence converges linearly to Define parameters by and
- (ii)
- Suppose and (25) hold, where is the smallest solution of scalar equation Then, the conclusions of Lemma 2 also hold for sequence The sequence converges quadratically to
- (iii)
- Suppose hold. Then, the conclusions of Lemma 2 are true for sequence The sequence converges quadratically to
- (iv)
- and (25) hold. Then, and the conclusions of Lemma 2 are true for sequence The sequence converges quadratically to
Proof.
- (i)
- It is given in Lemma 2.1 in [23].
- (ii)
It suffices
or
where
Notice that
Define function by
It follows that
So, (30) holds provided that
Claim. The right hand side of assertion (31) equals Indeed, this is true if
or
or by squaring both sides
or
or
or
or
or
which is true. Notice also that
and since and (by condition (25)). Thus, . It remains to show
or by the choice of and
or
Claim. By the definition of parameters and it must be shown that
or if for
However, the last inequality holds by (28). The claim is justified. So, estimate (33) holds by (25) and this claim.
- (iii)
- (iv)
□
Comments similar to Remark 1 can follow for Lemma 3.
Case. Parameters and K are not equal to
It is convenient to define parameter by
and the quadratic polynomial by
The discriminant ∆ of polynomial q can be written as
It follows that the root given by the quadratic formula can be written as
Denote by the unique positive zero of equation
This root can be written as
Define parameter by
Part (i) of the next auxiliary result relates to Lemma 2.1 in [22].
Lemma 4.
- (i)
- Estimates and Moreover, conclusions of Lemma 2 are true for sequence The sequence converges linearly to
- (ii)
- Suppose and (36) hold for some Then, the conclusions of Lemma 3 are true for sequence The sequence converges quadratically to
Proof.
- (i)
- It is given in Lemma 2.1 in [22].
- (ii)
- Define polynomial by
By this definition it follows
As in the proof of Lemma 3 (ii), estimate
holds provided that
Define function by
It follows by the definition of function and polynomial that
Hence, estimate (39) holds provided that
However, this assertion holds, since . Moreover, the definition of and condition (38) of the Lemma 4 imply
Hence, the sequence converges quadratically to □
Remark 2.
Conditions (36)–(38) can be condensed and a specific choice for μ can be given as follows: Define function by
It follows by this definition
Denote by the smallest solution of equation in Then, by choosing condition (37) holds as an equality. Then, it follows that if we solve the first condition in (37) for s, then conditions (36)–(38) can be condensed as
If then condition (40) should hold as a strict inequality to show quadratic convergence.
3. Semi-Local Convergence
Sequence given by (6) was shown to be majorizing for and tighter than under conditions of Lemmas in [19,22,23], respectively. These Lemmas correspond to part (i) of Lemma 1, Lemma 3 and Lemma 4, respectively. However, by asking the initial approximation s to be bounded above by a slightly larger bound the quadratic order of convergence is recovered. Hence, the preceding Lemmas can replace the older ones, respectively, in the semi-local proofs for NM in these references. The parameters and K are connected to and as follows
(K7) ∃ parameter such that for
(K8) ∃ parameter K such that
Note that and The convergence criteria in Lemmas 1, 3 and 4 do not necessarily imply each other in each case. That is why we do not only rely on Lemma 4 to show the semi-local convergence of NM. Consider the following four sets of conditions:
- (A1): (K1), (K4), (K5), (K6) and conditions of Lemma 1 hold for or
- (A2): (K1), (K4), (K5), (K6), conditions of Lemma 2 hold with or
- (A3): (K1), (K4), (K5), (K6), conditions of Lemma 3 hold with or
- (A4): (K1), (K4), (K5), (K6), conditions of Lemma 4 hold with
The upper bounds of the limit point given in the Lemmas and in closed form can replace in condition (K4). The proofs are omitted in the presentation of the semi-local convergence of NM, since they are given in the aforementioned references [19,20,22,23], with the exception of the quadratic convergence given in part (ii) of the presented Lemmas.
Theorem 2.
Suppose any of conditions hold. Then, sequence generated by NM is well defined in remains in and converges to a solution of equation Moreover, the following assertions hold
and
The convergence ball is given next. Notice, however, that we do not use all conditions
Proposition 1.
Suppose: there exists a solution of equation for some condition (K5) holds and such that
Set Then, the only solution of equation in the set is
Proof.
Let be a solution of equation Define linear operator Then, using (K5) and (41)
Therefore, is implied by the invertibility of J and
If conditions of Theorem 2 hold, set □
4. Numerical Experiments
Four examples are presented in this Section.
Example 3.
Recall Example 1 (with ). Then, the parameters are , It also follows so Denote by the set of values a for which conditions are satisfied. Then, by solving these inequalities for and respectively.
The domain can be further extended. Choose then, Table 1 shows that the conditions of Lemma 1 are fulfilled, since and .
Table 1.
Sequence (6) for Example 1.
Example 4.
Let , and
The equation has the solution and .
Let . Then ,
It also follows that , and
where , .
Notice that and The Kantorovich convergence condition (K3) is not fulfilled, since Hence, convergence of NM is not assured by the Kantorovich criterion. However, the new conditions (N2)–(N4) are fulfilled, since , , .
Table 2 shows that the conditions of Lemma 1 are fulfilled, since and .
Table 2.
Sequence (6) for Example 4.
Example 5.
Let be the domain of continuous real functions defined on the interval Set and define operator on D as
where y is given in and N is a kernel given by Green’s function as
By applying this definition the derivative of is
Pick The norm-max is used. It then follows from (43)–(45) that
and so Notice that and Choose The Kantorovich convergence condition (K3) is not fulfilled, since Hence, convergence of NM is not assured by the Kantorovich criterion. However, new condition (36) is fulfilled, since
Example 6.
Let , and
The equation has the solution . The parameters are , , and
Let us choose . Then, . Conditions (K3) and (N2) are fulfilled. The majorizing sequences (6) and from Theorem 1 are:
Table 3 lists the error bounds. Notice that the new error bounds are tighter than the ones in Theorem 1.
Table 3.
Results for for Example 6.
Let us choose . Then, . In this case condition (K3) does not hold, but (N2) holds. The majorizing sequence (6) is:
Table 4 shows the error bounds from Theorem 2.
Table 4.
Results for for Example 6.
5. Conclusions
We developed a comparison between results on the semi-local convergence of NM. There exists an extensive literature on the convergence analysis of NM. Most convergence results are based on recurrent relations, where the Lipschitz conditions are given in affine or non-affine invariant forms. The new methodology uses recurrent functions. The idea is to construct a domain, included in the one used before, which also contains the Newton iterates. That is important, since the new results do not require additional conditions. This way, the new sufficient convergence conditions are weaker in the Lipschitz case, since they rely on smaller constants. Other benefits include tighter error bounds and more precise uniqueness results for the solution. The new constants are special cases of earlier ones. The methodology is very general, making it suitable for extending the usage of other numerical methods under Hölder or more generalized majorant conditions. This will be the topic of our future work.
Author Contributions
Conceptualization I.K.A.; Methodology I.K.A.; Investigation S.R., I.K.A., S.S. and H.Y. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Appell, J.; DePascale, E.; Lysenko, J.V.; Zabrejko, P.P. New results on Newton-Kantorovich approximations with applications to nonlinear integral equations. Numer. Funct. Anal. Optim. 1997, 18, 1–17.
- Traub, J.F. Iterative Methods for the Solution of Equations; Prentice Hall: Hoboken, NJ, USA, 1964.
- Ezquerro, J.A.; Hernández-Verón, M.A. Newton’s Method: An Updated Approach of Kantorovich’s Theory. Frontiers in Mathematics; Birkhäuser/Springer: Cham, Switzerland, 2017.
- Kantorovich, L.V.; Akilov, G.P. Functional Analysis; Pergamon Press: Oxford, UK, 1982.
- Potra, F.A.; Pták, V. Nondiscrete induction and iterative processes. In Research Notes in Mathematics; Pitman (Advanced Publishing Program): Boston, MA, USA, 1984; Volume 103.
- Verma, R. New Trends in Fractional Programming; Nova Science Publisher: New York, NY, USA, 2019.
- Yamamoto, T. Historical developments in convergence analysis for Newton’s and Newton-like methods. J. Comput. Appl. Math. 2000, 124, 1–23.
- Zhanlav, T.; Chun, C.; Otgondorj, K.H.; Ulziibayar, V. High order iterations for systems of nonlinear equations. Int. J. Comput. Math. 2020, 97, 1704–1724.
- Sharma, J.R.; Guha, R.K. Simple yet efficient Newton-like method for systems of nonlinear equations. Calcolo 2016, 53, 451–473.
- Grau-Sanchez, M.; Grau, A.; Noguera, M. Ostrowski type methods for solving system of nonlinear equations. Appl. Math. Comput. 2011, 218, 2377–2385.
- Homeier, H.H.H. A modified Newton method with cubic convergence: The multivariate case. J. Comput. Appl. Math. 2004, 169, 161–169.
- Kou, J.; Wang, X.; Li, Y. Some eight order root finding three-step methods. Commun. Nonlinear Sci. Numer. Simul. 2010, 15, 536–544.
- Nashed, M.Z.; Chen, X. Convergence of Newton-like methods for singular operator equations using outer inverses. Numer. Math. 1993, 66, 235–257.
- Wang, X. An Ostrowski-type method with memory using a novel self-accelerating parameters. J. Comput. Appl. Math. 2018, 330, 710–720.
- Moccari, M.; Lofti, T. On a two-step optimal Steffensen-type method: Relaxed local and semi-local convergence analysis and dynamical stability. J. Math. Anal. Appl. 2018, 468, 240–269.
- Sharma, J.R.; Arora, H. Efficient derivative-free numerical methods for solving systems of nonlinear equations. Comput. Appl. Math. 2016, 35, 269–284.
- Noor, M.A.; Waseem, M. Some iterative methods for solving a system of nonlinear equations. Comput. Math. Appl. 2009, 57, 101–106.
- Shakhno, S.M. On a two-step iterative process under generalized Lipschitz conditions for first-order divided differences. J. Math. Sci. 2010, 168, 576–584.
- Argyros, I.K. On the Newton-Kantorovich hypothesis for solving equations. J. Comput. Math. 2004, 169, 315–332.
- Argyros, I.K. Unified Convergence Criteria for Iterative Banach Space Valued Methods with Applications. Mathematics 2021, 9, 1942.
- Proinov, P.D. New general convergence theory for iterative processes and its applications to Newton-Kantorovich type theorems. J. Complex. 2010, 26, 3–42.
- Argyros, I.K.; Hilout, S. On an improved convergence analysis of Newton’s scheme. Appl. Math. Comput. 2013, 225, 372–386.
- Argyros, I.K.; Hilout, S. Weaker conditions for the convergence of Newton’s scheme. J. Complex. 2012, 28, 364–387.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).