Abstract
We study an iterative differential-difference method for solving nonlinear least squares problems that uses, instead of the Jacobian, the sum of the derivative of the differentiable part of the operator and a divided difference of the nondifferentiable part. Moreover, we introduce a method that uses only the derivative of the differentiable part of the operator instead of the Jacobian. Results that establish the convergence conditions, the convergence radius, and the convergence order of the proposed methods are presented, improving upon earlier work. Numerical examples illustrate the theoretical results.
Keywords:
nonlinear least squares problem; differential-difference method; divided differences; order of convergence; residual
MSC:
65F20; 65G99; 65H10; 49M15
1. Introduction
Nonlinear least squares problems often arise in solving overdetermined systems of nonlinear equations, in estimating parameters of physical processes from measurement results, in constructing nonlinear regression models for engineering problems, etc.
The nonlinear least squares problem has the form
$\min_{x \in \mathbb{R}^n} f(x) := \frac{1}{2} F(x)^T F(x),$ (1)
where the residual function $F: \mathbb{R}^n \to \mathbb{R}^m$ ($m \ge n$) is nonlinear in x and continuously differentiable. An effective method for solving nonlinear least squares problems is the Gauss-Newton method [1,2,3]
$x_{k+1} = x_k - [F'(x_k)^T F'(x_k)]^{-1} F'(x_k)^T F(x_k), \quad k = 0, 1, 2, \ldots$ (2)
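As an illustration, here is a minimal Python sketch of iteration (2); the function names F, J and the tolerances are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def gauss_newton(F, J, x0, tol=1e-10, max_iter=50):
    """Gauss-Newton iteration for min (1/2)||F(x)||^2.

    F : callable, returns the residual vector F(x) of length m
    J : callable, returns the m x n Jacobian F'(x), m >= n
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = F(x)                     # residual F(x_k)
        A = J(x)                     # Jacobian F'(x_k)
        # Step from the normal equations A^T A dx = -A^T r;
        # np.linalg.lstsq(A, -r, rcond=None) is a better-conditioned variant.
        dx = np.linalg.solve(A.T @ A, -A.T @ r)
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x
```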
However, in practice there are often problems with the calculation of derivatives. Hence, one can use iterative-difference methods. These methods do not require the calculation of derivatives and, moreover, do not perform worse than the Gauss-Newton method in terms of convergence rate and number of iterations. In some cases the nonlinear function consists of differentiable and nondifferentiable parts. In such cases it is still possible to use iterative-difference methods of the form [4,5,6,7]
$x_{k+1} = x_k - (A_k^T A_k)^{-1} A_k^T F(x_k), \quad k = 0, 1, 2, \ldots,$ (3)
where
$A_k = F(x_k, x_{k-1})$ is a divided difference of order one of F at the points $x_k$, $x_{k-1}$,
or
$A_k = F(2x_k - x_{k-1}, x_{k-1})$.
It is desirable to build iterative methods that take into account the properties of the problem. In particular, one can use only the derivative of the differentiable part of the operator instead of the full Jacobian, which, in fact, does not exist. The methods obtained with this approach converge slowly. More efficient methods use, instead of the Jacobian, the sum of the derivative of the differentiable part and a divided difference of the nondifferentiable part of the operator, as sketched below. Such an approach shows good results in the case of solving nonlinear equations.
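To make the idea concrete, the linear model behind this approach can be written as follows (a standard sketch, with H denoting the full operator; this notation is ours, not the paper's):

```latex
% Full operator H(x) = F(x) + G(x); F is smooth, G only continuous.
% F is linearized by its Frechet derivative, G by a divided difference:
\[
  H(x) \approx F(x_k) + G(x_k)
       + \bigl( F'(x_k) + G(x_k, x_{k-1}) \bigr)\,(x - x_k).
\]
% Minimizing the squared norm of this model over x yields an iteration in
% which A_k = F'(x_k) + G(x_k, x_{k-1}) plays the role of the Jacobian.
```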
In this work, we study a combined method for solving nonlinear least squares problems, based on the Gauss-Newton and Secant methods. We also study a method requiring only the derivative of the differentiable part of the operator. We prove local convergence and show the efficiency of the methods on test cases, comparing them with Secant-type methods [5,6]. The convergence region of iterative methods is small in general, and this fact limits the choice of initial approximations. It is therefore important to extend this region without requiring additional hypotheses. The new approach [8] leads to a larger convergence radius than before [9]. We achieve this goal by locating a region, at least as small as the one used before, that contains the iterates. Then the new Lipschitz constants are at least as tight as the old ones. Moreover, using more precise estimates on the distances involved, under weaker hypotheses and the same computational cost, we provide an analysis of the Gauss-Newton-Secant method with the following advantages over the corresponding results in [9]: a larger convergence region; finer error estimates on the distances involved; and at least as precise information on the location of the solution.
2. Description of the Problem
Consider the nonlinear least squares problem
$\min_{x \in \mathbb{R}^n} \frac{1}{2} (F(x) + G(x))^T (F(x) + G(x)),$ (4)
where the residual function $F + G: \mathbb{R}^n \to \mathbb{R}^m$ ($m \ge n$) is nonlinear in x; F is a continuously differentiable function; G is a continuous function whose differentiability, in general, is not required.
We propose a modification of the Gauss-Newton method to find a solution of problem (4):
$x_{k+1} = x_k - (A_k^T A_k)^{-1} A_k^T (F(x_k) + G(x_k)), \quad A_k = F'(x_k) + G(x_k, x_{k-1}), \quad k = 0, 1, 2, \ldots$ (5)
Here, $F'(x_k)$ is the Fréchet derivative of F at $x_k$; $G(x_k, x_{k-1})$ is a divided difference of order one for the function G [10]; the vectors $x_0$ and $x_{-1}$ are given initial approximations, satisfying $G(x, y)(x - y) = G(x) - G(y)$ for $x \ne y$ and $G(x, x) = G'(x)$, if G is differentiable. Setting $A_k = F'(x_k)$, from method (5) we get a Gauss-Newton type iterative method for solving problem (4)
$x_{k+1} = x_k - (F'(x_k)^T F'(x_k))^{-1} F'(x_k)^T (F(x_k) + G(x_k)), \quad k = 0, 1, 2, \ldots$ (6)
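The following sketch shows one way method (5) can be implemented; the componentwise divided difference below is one standard construction satisfying G(x, y)(x - y) = G(x) - G(y), and all names and tolerances are illustrative assumptions rather than the implementation used for the experiments.

```python
import numpy as np

def divided_difference(G, x, y, eps=1e-12):
    """First-order divided difference G(x, y), built column by column.

    Telescoping over the columns gives G(x, y)(x - y) = G(x) - G(y)
    whenever all coordinates of x and y differ.
    """
    x = np.asarray(x, dtype=float)
    m, n = len(G(x)), len(x)
    A = np.zeros((m, n))
    z = np.array(y, dtype=float)
    for j in range(n):
        z_prev = z.copy()
        z[j] = x[j]
        h = z[j] - z_prev[j]
        if abs(h) < eps:             # coincident coordinate: skip column
            continue
        A[:, j] = (G(z) - G(z_prev)) / h
    return A

def gauss_newton_secant(F, Jf, G, x0, x_m1, tol=1e-8, max_iter=100):
    """Method (5): A_k = F'(x_k) + G(x_k, x_{k-1}).

    Replacing A with Jf(x) alone gives the Gauss-Newton type method (6).
    """
    x_prev = np.asarray(x_m1, dtype=float)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        A = Jf(x) + divided_difference(G, x, x_prev)
        r = F(x) + G(x)
        dx = np.linalg.solve(A.T @ A, -A.T @ r)
        x_prev, x = x, x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x
```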
In the case of $m = n$, problem (4) turns into a system of nonlinear equations
$F(x) + G(x) = 0.$ (7)
Then it is well known ([3], p. 267) that techniques for minimizing problem (4) are techniques for finding a solution of Equation (7). In this case, (5) transforms into the Newton-Secant combined method [11,12]
$x_{k+1} = x_k - (F'(x_k) + G(x_k, x_{k-1}))^{-1} (F(x_k) + G(x_k)), \quad k = 0, 1, 2, \ldots$ (8)
and method (6) into a Newton-type method for solving the nonlinear Equation (7) [13]
$x_{k+1} = x_k - F'(x_k)^{-1} (F(x_k) + G(x_k)), \quad k = 0, 1, 2, \ldots$
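For the square case, a minimal sketch of the Newton-Secant step (reusing the divided_difference helper from the sketch above; again, the names are illustrative):

```python
import numpy as np

def newton_secant(F, Jf, G, x0, x_m1, tol=1e-10, max_iter=100):
    """Newton-Secant iteration (8) for F(x) + G(x) = 0 with m = n."""
    x_prev = np.asarray(x_m1, dtype=float)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        A = Jf(x) + divided_difference(G, x, x_prev)
        dx = np.linalg.solve(A, -(F(x) + G(x)))  # square system, no normal equations
        x_prev, x = x, x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x
```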
We assume from now on that the function G is differentiable at the solution $x_*$.
3. Local Convergence Analysis of Method (5)
Sufficient conditions and the convergence order of the iterative process (5) are presented. But first we need some crucial definitions, which make precise the relationship between the Lipschitz constants appearing in the local convergence analysis.
Definition 1.
The Fréchet derivative $F'$ satisfies the center-Lipschitz condition on D if there exists $L_0 > 0$ such that for each $x \in D$
$\|F'(x) - F'(x_*)\| \le L_0 \|x - x_*\|.$ (9)
Definition 2.
The divided difference $G(x, y)$ satisfies the center-Lipschitz condition on $D \times D$ if there exists $M_0 > 0$ such that for each $x, y \in D$
$\|G(x, y) - G(x_*, x_*)\| \le M_0 (\|x - x_*\| + \|y - x_*\|).$ (10)
Suppose that the scalar equation (11) associated with the constants $L_0$ and $M_0$ has at least one positive solution. Denote by γ the smallest such solution. Define
$D_0 = D \cap \bar{B}(x_*, \gamma).$ (12)
Definition 3.
The Fréchet derivative $F'$ satisfies the restricted Lipschitz condition on $D_0$ if there exists $L > 0$ such that for each $x, y \in D_0$
$\|F'(x) - F'(y)\| \le L \|x - y\|.$ (13)
Definition 4.
The first order divided difference $G(x, y)$ satisfies the restricted Lipschitz condition on $D_0 \times D_0$ if there exists $M > 0$ such that for each $x, y, u, v \in D_0$
$\|G(x, y) - G(u, v)\| \le M (\|x - u\| + \|y - v\|).$ (14)
Next, we also state the definitions given in [9], so that we can compare them to the preceding ones.
Definition 5.
The Fréchet derivative $F'$ satisfies the Lipschitz condition on D if there exists $\bar{L} > 0$ such that for each $x, y \in D$
$\|F'(x) - F'(y)\| \le \bar{L} \|x - y\|.$ (15)
Definition 6.
The first order divided difference $G(x, y)$ satisfies the Lipschitz condition on $D \times D$ if there exists $\bar{M} > 0$ such that for each $x, y, u, v \in D$
$\|G(x, y) - G(u, v)\| \le \bar{M} (\|x - u\| + \|y - v\|).$ (16)
Remark 1.
It follows from the preceding definitions that
$L_0 \le \bar{L},$ (17)
$M_0 \le \bar{M},$ (18)
$L \le \bar{L}$ (19)
and
$M \le \bar{M},$ (20)
since $D_0 \subseteq D$. If any of (17)–(20) holds as a strict inequality, then the following advantages are obtained over the work in [9], which uses $\bar{L}$ and $\bar{M}$ instead of the new constants:
At least as large a convergence domain, leading to at least as many initial choices.
At least as tight upper bounds on the distances $\|x_k - x_*\|$, so at most as many iterations are needed to obtain a desired error tolerance.
It is always true that $D_0$ is at least as small as, and included in, D by (12). Here lies the new idea and the reason for the advantages. Notice that these advantages are obtained under the same computational cost as in [9], since the new constants $L_0$, $M_0$, L and M are special cases of the constants $\bar{L}$ and $\bar{M}$. This technique of using the center-Lipschitz condition in combination with the restricted convergence region has been used on Newton's, Secant and Newton-like methods [14], and it can be used on other methods in order to extend their applicability.
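A small illustration of why the restricted constants can be strictly smaller (our toy example, not from the paper): take n = m = 1, F(x) = e^x - 1, G ≡ 0, x* = 0, D = B̄(0, 1).

```latex
% F(x) = e^x - 1, x_* = 0, D = \bar{B}(0,1), so F'(x) = e^x.
% Center-Lipschitz constant on D:
%   |F'(x) - F'(0)| = |e^x - 1| \le (e - 1)|x|, so L_0 = e - 1.
% Lipschitz constant on all of D (mean value theorem):
%   |F'(x) - F'(y)| \le e\,|x - y|, so \bar{L} = e.
% Restricted Lipschitz constant on D_0 = D \cap \bar{B}(0, \gamma), \gamma < 1:
%   |F'(x) - F'(y)| \le e^{\gamma}\,|x - y|, so L = e^{\gamma}.
\[
  L_0 = e - 1 < \bar{L} = e, \qquad L = e^{\gamma} < \bar{L} \quad (\gamma < 1).
\]
```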
The Euclidean norm and the corresponding induced matrix norm are used in this study; this choice has the advantage that $\|A^T\| = \|A\|$ and hence $\|A^T A\| = \|A\|^2$.
The proof of the next result follows the corresponding one in [9], but there are crucial differences, where we use L instead of $\bar{L}$ and M instead of $\bar{M}$.
Theorem 1.
Let $F + G$ be continuous on the set $D \subset \mathbb{R}^n$, let F be continuously differentiable on this set, and let $G(x, y)$ be a divided difference of order one. Suppose that the problem (4) has a solution $x_*$ on the set D, that the inverse operator $[A_*^T A_*]^{-1}$ with $A_* = F'(x_*) + G(x_*, x_*)$ exists, that conditions (9), (10), (13) and (14) hold, and that γ defined in (11) exists. Moreover,
and $x_0, x_{-1} \in B(x_*, r_*)$, where $r_*$ is the unique positive zero of the function q, defined by
Then, for $x_0, x_{-1} \in B(x_*, r_*)$, method (5) is well defined, generates a sequence $\{x_k\}$ that belongs to the set $B(x_*, r_*)$, and converges to the solution $x_*$. Moreover, the following error bounds hold
where
Proof.
According to the intermediate value theorem, applied on $[0, r]$ for sufficiently large r, and in view of (22), the function q has at least one positive zero. Denote by $r_*$ the least such positive zero; indeed, this zero is unique there.
We shall show estimate (24) by first showing that the sequence $\{x_k\}$ is well defined.
Let $x_0, x_{-1} \in B(x_*, r_*)$, and set $A_0 = F'(x_0) + G(x_0, x_{-1})$. We need to show that the linear operator $A_0^T A_0$ is invertible. Using the conditions of the theorem, we obtain the following estimate:
By the Banach Lemma on invertible operators [3] and (28), $A_0^T A_0$ is invertible. Then, from (26), (27) and (28), we get in turn that
Hence, the iterate $x_1$ is well defined by method (5) for $k = 0$. Next, we will show that $x_1 \in B(x_*, r_*)$. First of all, we get the estimate
Then, by method (5) for $k = 0$ and the preceding estimate, we have in turn that
That is, $x_1 \in B(x_*, r_*)$ and estimate (24) holds for $k = 0$.
Suppose that $x_k \in B(x_*, r_*)$ for $k = 0, 1, \ldots, m$ and estimate (24) holds for $k = 0, 1, \ldots, m - 1$, where $m \ge 1$ is an integer. We shall show that $x_{m+1} \in B(x_*, r_*)$ and that estimate (24) holds for $k = m$.
Hence, the operator $(A_m^T A_m)^{-1}$ exists and
Therefore, the iterate $x_{m+1}$ is well defined, and the following estimate holds
This proves that $x_{m+1} \in B(x_*, r_*)$ and that estimate (24) holds for $k = m$.
Thus, method (5) is well defined, $x_k \in B(x_*, r_*)$ for all $k \ge 0$, and estimate (24) holds for all $k \ge 0$. It remains to prove that $x_k \to x_*$ as $k \to \infty$.
Define the functions a and b on $[0, r_*)$ by
and
According to these definitions, we get the estimate that guarantees the convergence of $\{x_k\}$ to $x_*$ as $k \to \infty$. □
Corollary 1.
In the case of $F(x_*) + G(x_*) = 0$, we have a nonlinear least squares problem with zero residual. Then the residual-dependent constants vanish, and estimate (24) reduces to
That is, method (5) converges with order $\frac{1 + \sqrt{5}}{2} \approx 1.618$.
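For completeness, this is the standard way such an order is read off from an error recurrence of secant type (a sketch, assuming the zero-residual bound has the form shown in the first comment line):

```latex
% Assume e_{k+1} \le C\, e_k ( e_k + e_{k-1} ), where e_k := \|x_k - x_*\|.
% Seek e_{k+1} \sim e_k^{p}; the dominant cross term gives
%   e_k^{p} \sim e_k \cdot e_{k-1} \sim e_k \cdot e_k^{1/p},
% so p = 1 + 1/p, i.e. p^2 - p - 1 = 0, whose positive root is
\[
  p = \frac{1 + \sqrt{5}}{2} \approx 1.618\ldots
\]
```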
Let $G(x) \equiv 0$ in (4), corresponding to the residual function being differentiable. Then, from Theorem 1, we obtain the following corollary.
Corollary 2.
If $G(x) \equiv 0$, then in the conditions of the theorem we set $M_0 = M = 0$, since the divided difference of G vanishes, and estimate (24) reduces to:
Hence, method (5) has convergence order two.
Remark 2.
If $L_0 = L = \bar{L}$ and $M_0 = M = \bar{M}$, our results specialize to the corresponding ones in [9]. Otherwise, they constitute an improvement, as already noted in Remark 1. As an example, let $\bar{q}$ and $\bar{r}_*$ denote the function and parameter obtained when $L_0$, $M_0$, L and M are replaced by $\bar{L}$ and $\bar{M}$, respectively. Then, in view of (17)–(20), we have $q(r) \le \bar{q}(r)$ for each $r \ge 0$,
so
$\bar{r}_* \le r_*$.
Consequently, the new sufficient convergence criteria are weaker than the ones in [9], unless $L = \bar{L}$ and $M = \bar{M}$, in which case they coincide. Moreover, the new error bounds are tighter than the corresponding ones in [9], and the rest of the advantages already mentioned in Remark 1 hold true.
4. Local Convergence Analysis of Method (6)
Sufficient conditions for the local convergence of method (6) and its convergence rate are given in the following theorem.
Theorem 2.
Let $F + G$ be continuous on the set $D \subset \mathbb{R}^n$, let F be continuously differentiable on this set, and let G be a continuous function on D. Suppose that the problem (4) has a solution $x_*$ on the set D, and that the inverse operator $[F'(x_*)^T F'(x_*)]^{-1}$ exists. The Fréchet derivative $F'$ and the function G satisfy Lipschitz conditions on the set
Moreover,
and $x_0 \in B(x_*, r_*)$, where $r_*$ is the unique positive zero of the function q, defined by
Then, for $x_0 \in B(x_*, r_*)$, method (6) is well defined, generates a sequence $\{x_k\}$ that belongs to the set $B(x_*, r_*)$, and converges to the solution $x_*$. Moreover, the following error bounds hold
where
Proof.
According to the intermediate value theorem, applied on $[0, r]$ for sufficiently large r, and in view of (39), the function q has a least positive zero, denoted by $r_*$; indeed, this zero is unique there. The proof of this is analogous to the one given in Theorem 1.
Let $x_0 \in B(x_*, r_*)$. By analogy to (26) in Theorem 1, we get
From the Banach Lemma on invertible operators [3] and (45), the operator $F'(x_0)^T F'(x_0)$ is invertible. Then, from (43)–(45), we get
Hence, the iterate $x_1$ is well defined.
Next, we will show that $x_1 \in B(x_*, r_*)$. We have the estimate
In view of the estimates
we obtain in turn that
Hence, $x_1 \in B(x_*, r_*)$ and inequality (41) holds for $k = 0$.
Suppose that $x_k \in B(x_*, r_*)$ for $k = 0, 1, \ldots, m$ and estimate (41) holds for $k = 0, 1, \ldots, m - 1$, where $m \ge 1$ is an integer. Next, we show that $x_{m+1} \in B(x_*, r_*)$ and that estimate (41) holds for $k = m$.
Then, we obtain
Hence, the operator $(F'(x_m)^T F'(x_m))^{-1}$ exists and
Therefore, the iterate $x_{m+1}$ is well defined, and we get in turn that
This proves that $x_{m+1} \in B(x_*, r_*)$ and that estimate (41) holds for $k = m$.
Define the function a on $[0, r_*)$ by
For any initial point $x_0 \in B(x_*, r_*)$ there exists $t \in [0, 1)$ such that $\|x_0 - x_*\| \le t\, r_*$. Similarly to the proof that all iterates stay in $B(x_*, r_*)$, we show that all iterates stay in $B(x_*, t\, r_*)$. So, estimate (47) holds if $r_*$ is replaced by $t\, r_*$. In particular, from (47) we get
where . Obviously , . Therefore, we obtain
Hence, the sequence $\{x_k\}$ converges to $x_*$ as $k \to \infty$, with the rate of a geometric progression. □
The same type of improvements as in Theorem 1 are obtained for Theorem 2 (see Remark 2).
Remark 3.
As we can see from estimates (41) and (42), the convergence of method (6) depends on α, L and M. For problems with weak nonlinearity (α, L and M “small”), the convergence rate of the iterative process is linear. In the case of strongly nonlinear problems (α, L and/or M “large”), method (6) may not converge at all.
5. Numerical Experiments
Let us compare the convergence rates of the combined method (5), the Gauss-Newton type method (6), and the Secant-type method for solving nonlinear least squares problems [5,6] on some test cases:
$x_{k+1} = x_k - (A_k^T A_k)^{-1} A_k^T (F(x_k) + G(x_k)), \quad A_k = (F + G)(x_k, x_{k-1}), \quad k = 0, 1, 2, \ldots,$ (48)
where $(F + G)(x, y)$ denotes a divided difference of order one of the operator $F + G$.
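For reference, a minimal sketch of a secant-type step consistent with (48), reusing the divided_difference helper from the sketch in Section 2; the exact divided-difference operator used in [5,6] may differ.

```python
import numpy as np

def secant_type(F, G, x0, x_m1, tol=1e-8, max_iter=100):
    """Secant-type method: the whole operator F + G is approximated by a
    divided difference, so no derivatives at all are evaluated."""
    H = lambda v: F(v) + G(v)
    x_prev = np.asarray(x_m1, dtype=float)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        A = divided_difference(H, x, x_prev)   # replaces the Jacobian
        dx = np.linalg.solve(A.T @ A, -A.T @ H(x))
        x_prev, x = x, x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x
```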
Testing is carried out on nonlinear systems with a nondifferentiable operator, with both zero and nonzero residuals. The classic Gauss-Newton and Newton methods cannot be used for solving such problems. Results are sought with a prescribed accuracy ε; calculations are performed until the corresponding stopping conditions are satisfied.
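A sketch of how such a comparison can be scripted; the step-size stopping rule below is a common choice standing in for the paper's exact conditions, and all names are illustrative.

```python
import numpy as np

def iterate_until(step, x0, x_m1, eps=1e-8, max_iter=200):
    """Run a method given as a one-step map until the step is below eps.

    step(x, x_prev) -> x_next performs one iteration of the chosen method
    (e.g., of (5), (6) or (48)). Returns the approximation and the number
    of iterations, which is what Table 1 compares.
    """
    x_prev = np.asarray(x_m1, dtype=float)
    x = np.asarray(x0, dtype=float)
    for k in range(1, max_iter + 1):
        x_next = step(x, x_prev)
        if np.linalg.norm(x_next - x) <= eps:
            return x_next, k
        x_prev, x = x, x_next
    return x, max_iter
```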
Example 1.
[11,12].
Example 2.
Remark 4.
The results of the numerical experiments are shown in Table 1. In particular, we compare the studied methods with respect to the number of iterations needed to find the solution with the given accuracy. In Example 1, all methods converge to one solution. In Example 2, the Gauss-Newton type method (6) converges to a point with a larger residual, with the same number of iterations; such iterations are marked with the * symbol in the table. The other methods find a point with a smaller residual. The additional initial approximation is chosen as:
Table 1.
Number of iterations needed to solve the test problems.
6. Conclusions
Based on the theoretical studies, the numerical experiments, and the comparison of the obtained results, we can argue that the combined differential-difference method (5) converges faster than the Gauss-Newton type method (6) and the Secant-type method (48). Moreover, method (5) has a high convergence order in the case of zero residual and does not require the calculation of derivatives of the nondifferentiable part of the operator. Therefore, the proposed method (5) solves the problem efficiently and quickly.
Author Contributions
All authors contributed equally and significantly to the writing of this article. All authors read and approved the final manuscript.
Funding
This research received no external funding.
Acknowledgments
The authors would like to express their sincere gratitude to the referees for their valuable comments which have significantly improved the presentation of this paper.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Argyros, I.K. Convergence and Applications of Newton-Type Iterations; Springer: New York, NY, USA, 2008; 506p.
- Dennis, J.E.; Schnabel, R.B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; SIAM: Philadelphia, PA, USA, 1996.
- Ortega, J.M.; Rheinboldt, W.C. Iterative Solution of Nonlinear Equations in Several Variables; Academic Press: New York, NY, USA, 1970.
- Argyros, I.K.; Ren, H. A derivative free iterative method for solving least squares problems. Numer. Algorithms 2011, 58, 555–571.
- Ren, H.; Argyros, I.K. Local convergence of a secant type method for solving least squares problems. Appl. Math. Comput. 2010, 217, 3816–3824.
- Shakhno, S.M.; Gnatyshyn, O.P. On an iterative algorithm of order 1.839... for solving the nonlinear least squares problems. Appl. Math. Comput. 2005, 161, 253–264.
- Shakhno, S.M.; Gnatyshyn, O.P. Iterative-difference methods for solving nonlinear least-squares problem. In Progress in Industrial Mathematics at ECMI 98; Vieweg+Teubner: Stuttgart, Germany, 1999; pp. 287–294.
- Argyros, I.K.; Hilout, S. On an improved convergence analysis of Newton's method. Appl. Math. Comput. 2013, 225, 372–386.
- Shakhno, S.M.; Shunkin, Y.V. One combined method for solving nonlinear least squares problems. Visnyk Lviv Univ. Ser. Appl. Math. Inform. 2017, 25, 38–48. (In Ukrainian)
- Ulm, S. On generalized divided differences. Proc. Acad. Sci. Estonian SSR Phys. Math. 1967, 16, 13–26. (In Russian)
- Cătinaş, E. On some iterative methods for solving nonlinear equations. Revue d'Analyse Numérique et de Théorie de l'Approximation 1994, 23, 47–53.
- Shakhno, S.M.; Mel'nyk, I.V.; Yarmola, H.P. Analysis of the convergence of a combined method for the solution of nonlinear equations. J. Math. Sci. 2014, 201, 32–43.
- Zabrejko, P.P.; Nguen, D.F. The majorant method in the theory of Newton-Kantorovich approximations and the Pták error estimates. Numer. Funct. Anal. Optim. 1987, 9, 671–686.
- Argyros, I.K.; Magreñán, Á.A. A Contemporary Study of Iterative Methods: Convergence, Dynamics and Applications; Academic Press: London, UK, 2018.