Abstract
We develop a local convergence analysis of an iterative method for solving nonlinear least squares problems with operator decomposition under classical and generalized Lipschitz conditions. We consider both the zero- and nonzero-residual cases and determine the corresponding convergence orders. We use two types of Lipschitz conditions (center and restricted-region conditions) to study the convergence of the method. Moreover, we obtain a larger radius of convergence and tighter error estimates than in previous works. Hence, we extend the applicability of this method under the same computational effort.
Keywords:
nonlinear least squares problem; differential-difference method; divided differences; radius of convergence; residual; error estimates
MSC:
65J15
1. Introduction
Nonlinear least squares problems often arise in solving overdetermined systems of nonlinear equations, estimating parameters of physical processes from measurement results, constructing nonlinear regression models for engineering problems, etc. The most widely used method for solving nonlinear least squares problems is the Gauss–Newton method [1]. When the derivative cannot be calculated, difference methods are used [2,3].
Some nonlinear functions have a differentiable and a nondifferentiable part. In this case, a good idea is to use, instead of the Jacobian, the sum of the derivative of the differentiable part of the operator and a divided difference of the nondifferentiable part [4,5,6]. Numerical studies show that these methods converge faster than Gauss–Newton-type or difference methods.
In this paper, we study the local convergence of the Gauss–Newton–Secant method under the classical and generalized Lipschitz conditions for first-order Fréchet derivative and divided differences.
Let us consider the nonlinear least squares problem:
min_{x ∈ ℝⁿ} (1/2) ‖F(x) + G(x)‖²,   (1)
where the residual function F(x) + G(x) is nonlinear in x, F is a continuously differentiable function, and G is a continuous function whose differentiability, in general, is not required.
We propose the following modification of the Gauss–Newton method combined with a Secant-type method [4,6] for finding a solution to problem (1):
x_{k+1} = x_k − (A_kᵀ A_k)⁻¹ A_kᵀ (F(x_k) + G(x_k)),  k = 0, 1, …,   (2)
where A_k = F′(x_k) + G(x_k, x_{k−1}); F′(x_k) is the Fréchet derivative of F at x_k; G(x_k, x_{k−1}) is the divided difference of the first order of the function G [7] at the points x_k, x_{k−1}; and x_0, x_{−1} are given.
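The step above can be sketched in code. Everything below is an illustrative assumption: the divided difference is built by one standard componentwise construction (only the defining property G(x, y)(x − y) = G(x) − G(y) matters), and the test problem at the end, with a smooth F and a G nondifferentiable at the solution x* = (1, 2), is hypothetical rather than taken from the paper.

```python
import numpy as np

def divided_difference(G, x, y):
    """First-order divided difference G(x, y): an m-by-n matrix with
    G(x, y) @ (x - y) = G(x) - G(y), built column by column by a
    telescoping construction (one common choice, an assumption here)."""
    m, n = G(x).size, x.size
    D = np.zeros((m, n))
    for j in range(n):
        u = np.concatenate([x[:j + 1], y[j + 1:]])  # x in first j+1 slots
        v = np.concatenate([x[:j], y[j:]])          # x in first j slots
        if x[j] != y[j]:
            D[:, j] = (G(u) - G(v)) / (x[j] - y[j])
    return D

def gauss_newton_secant(F, dF, G, x0, x_m1, tol=1e-10, max_iter=50):
    """Method (2): A_k = F'(x_k) + G(x_k, x_{k-1}), then a Gauss-Newton
    step on the residual F + G via the normal equations."""
    x_prev, x = x_m1, x0
    for _ in range(max_iter):
        A = dF(x) + divided_difference(G, x, x_prev)
        step = np.linalg.solve(A.T @ A, A.T @ (F(x) + G(x)))
        x_prev, x = x, x - step
        if np.linalg.norm(step) <= tol:
            break
    return x

# hypothetical zero-residual problem, nondifferentiable at x* = (1, 2)
F  = lambda x: np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0])
dF = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]]])
G  = lambda x: 0.1 * np.abs(x - np.array([1.0, 2.0]))
x_star = gauss_newton_secant(F, dF, G, np.array([1.2, 2.2]), np.array([1.1, 2.1]))
```

For m = n the normal equations reduce to solving A_k s = F(x_k) + G(x_k), i.e., the Newton–Secant step discussed next.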
For m = n, problem (1) turns into a system of nonlinear equations:
F(x) + G(x) = 0.
In this case, method (2) is transformed into the combined Newton–Secant method [8,9,10]:
x_{k+1} = x_k − [F′(x_k) + G(x_k, x_{k−1})]⁻¹ (F(x_k) + G(x_k)),  k = 0, 1, …,
and method (3) into the Newton-type method for solving nonlinear equations [11]:
The convergence domain is, in general, small, and the error estimates are pessimistic. These problems restrict the applicability of such methods. The novelty of our work is the claim that these problems can be addressed without adding hypotheses. In particular, our idea is to use center and restricted-radius Lipschitz conditions. This approach to studying the convergence allows us to extend the convergence ball of the method and to improve the error estimates.
2. Local Convergence Analysis
Let us first consider some auxiliary lemmas needed to obtain the main results. Let D be an open subset of ℝⁿ.
Lemma 1
([4]). Let , where E is an integrable and positive nondecreasing function on . Then, is monotonically increasing with respect to t on .
Lemma 2
([1,12]). Let where H is an integrable and positive nondecreasing function on . Then, is nondecreasing with respect to t on .
Additionally, at is defined as .
Lemma 3
([13]). Let where S is an integrable and positive nondecreasing function on . Then, is nondecreasing with respect to t on .
Definition 1.
The Fréchet derivative satisfies the center Lipschitz condition on D with average if
where , is a solution of problem (1), and is an integrable, positive, and nondecreasing function on .
The functions introduced next are, like the preceding one, integrable, positive, and nondecreasing on their domains of definition.
Definition 2.
The first order divided difference satisfies the center Lipschitz condition on with average if
Let and . We define function on by
Suppose that equation
has at least one positive solution. Denote by the minimal such solution. Then, we can define , where .
Definition 3.
The Fréchet derivative satisfies the restricted radius Lipschitz condition on with L average if
Definition 4.
The first order divided difference satisfies the restricted radius Lipschitz condition on with M average if
Definition 5.
The Fréchet derivative satisfies the radius Lipschitz condition on D with average if
Definition 6.
The first order divided difference satisfies the radius Lipschitz condition on D with average if
Remark 1.
It follows from the preceding definitions that , , and for each
since . By , we mean that L (or M) depends on , by the definition of . In case any of (15)–(17) is a strict inequality, the following benefits are obtained over the work in [4] by using the new functions:
- (a1)
- An at least as large convergence region leading to at least as many initial choices;
- (a2)
- At least as tight upper bounds on the distances , so at least as few iterations are needed to obtain a desired error tolerance.
These benefits are obtained under the same computational effort as in [4], since the new functions and M are special cases of the functions and . This technique of combining the center Lipschitz condition with a restricted convergence region has been used by us for Newton, Secant, and Newton-like methods [14,15], and can be applied to other methods, too, with the same benefits.
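Benefit (a1) can be made concrete in the simplest setting of Newton's method, where the classical convergence radius 2/(3L), based on a single Lipschitz constant L, enlarges to 2/(2L₀ + L) once a center constant L₀ ≤ L is available, in the spirit of [14,15]. The numerical constants below are hypothetical:

```python
# Newton's method radii (hypothetical constants): L is the full Lipschitz
# constant, L0 <= L the center one; L0 < L strictly enlarges the radius.
L, L0 = 2.0, 0.8
r_classical = 2.0 / (3.0 * L)     # radius from the L-only analysis
r_center = 2.0 / (2.0 * L0 + L)   # radius from the center-Lipschitz analysis
enlargement = r_center / r_classical
```

The same mechanism, with the functions L and M of Definitions 3 and 4 in place of constants, produces the larger radius claimed above.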
The proof of the next result follows along the lines of the corresponding one in [4], with crucial differences where we use instead of and instead of as used in [4].
We use the Euclidean norm. Note that the following equality is satisfied for the Euclidean norm where .
Theorem 1.
Let be continuous on an open convex subset , F be a continuously differentiable function, and G be a continuous function. Suppose that problem (1) has a solution ; the inverse operator
exists, such that ; conditions (7), (8), (11), and (12) hold; and γ given in (10) exists.
Furthermore,
and where is the unique positive zero of the function q given by
Then, for , the iterative sequence , generated by (2), is well defined, remains in Ω, and converges to . Moreover, the following error estimates hold for each :
where
Proof.
We obtain
since and are positive and nondecreasing functions on , and , respectively. Taking into account Lemma 1, for a sufficiently small , . For a sufficiently large R, the inequality holds. By the intermediate value theorem, the function q has a positive zero on , denoted by . Moreover, this zero is the only one on . Indeed, according to Lemma 2, the function is nondecreasing with respect to r on . By Lemma 1, the functions , , and are monotonically increasing on . Furthermore, by Lemma 3, the function is monotonically increasing with respect to r on . Therefore, is monotonically increasing on . Thus, the graph of the function crosses the positive r-axis only once on . Finally, from the monotonicity of q and since , we obtain , so .
We denote . Let . By the assumptions , , we obtain the following estimate:
Using conditions (11) and (12), we obtain
where . Then, from inequality (29) and the equation , we obtain by (10)
Hence, is well defined. Next, we will show that .
Using the fact
and the choice of , we obtain the estimate
So, considering the inequalities
we obtain
where
Therefore, , and estimate (22) holds for .
Let us assume that for and that estimate (22) holds for , where the index is a positive integer. We shall show that and that estimate (22) holds for .
We can write
Consequently, exists, and
Therefore, is well defined, and the following estimate holds:
This proves that and estimate (22) for .
It remains to be proven that for .
Let us define functions a and b on as
According to the choice of , we obtain
According to the proof in [17], under the conditions (42)–(45), the sequence converges to for . □
Corollary 1.
If , we have the nonlinear least squares problem with zero residual. Then, the constants and , and estimate (22) takes the form
This inequality can be written as
Then, we can write an equation for determining the convergence order as follows:
Therefore, the positive root of the latter equation is the order of convergence of method (2).
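This order can be checked numerically under the modeling assumption, consistent with Secant-type estimates, that the dominant error term is the product of two consecutive errors, e_{k+1} ≈ C·e_k·e_{k−1}; the order is then the positive root of t² − t − 1 = 0, the golden ratio. The constants below are illustrative only:

```python
import math

# Model recurrence e_{k+1} = C * e_k * e_{k-1} (illustrative constants);
# successive order estimates ln(e_{k+1}/e_k) / ln(e_k/e_{k-1}) should
# approach the positive root of t**2 - t - 1 = 0.
C = 1.0
e = [1e-1, 5e-2]
for _ in range(8):
    e.append(C * e[-1] * e[-2])
orders = [math.log(e[k + 1] / e[k]) / math.log(e[k] / e[k - 1])
          for k in range(5, 8)]
golden = (1.0 + math.sqrt(5.0)) / 2.0  # approximately 1.618
```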
In case in (1), we obtain the following consequences.
Corollary 2.
Indeed, if , then , and estimate (22) takes the form
which indicates the quadratic convergence rate of method (2).
Remark 2.
If and , our results specialize to the corresponding ones in [4]. Otherwise, they constitute an improvement, as already noted in Remark 1. As an example, let denote the functions and parameters in which are replaced by , respectively. Then, in view of (15)–(17), we have
and
Hence, we have
the new error bounds (22) being tighter than the corresponding bounds (6) in [4], with the rest of the advantages (already mentioned in Remark 1) holding true.
Next, as a consequence of Theorem 1, we study the convergence of method (2) when the Lipschitz functions are constants.
Corollary 3.
Let be continuous on an open convex subset , F be continuously differentiable, and G be continuous on D. Suppose that problem (1) has a solution , and the inverse operator
exists, such that .
Suppose that the Fréchet derivative satisfies the classical Lipschitz conditions
and the function G has a first order divided difference that satisfies
where .
Furthermore,
and where
Then, for each , the iterative sequence , generated by (2), is well defined, remains in Ω, and converges to , and the following error estimate holds for each :
where
The proof of Corollary 3 is analogous to the proof of Theorem 1.
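Since q(0) < 0 and q is monotonically increasing (as in the proof of Theorem 1), the convergence radius, its unique positive zero, can be computed by bisection. The concrete q below is a hypothetical stand-in with the same monotone structure; in practice, one substitutes the function q built from the Lipschitz constants of Corollary 3:

```python
def radius_by_bisection(q, a=1e-12, b=0.5, tol=1e-12):
    """Unique positive zero of a monotonically increasing q with q(0) < 0."""
    while q(b) <= 0.0:      # expand the bracket until q changes sign
        b *= 2.0
    while b - a > tol:
        mid = 0.5 * (a + b)
        if q(mid) <= 0.0:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)

# hypothetical model: increasing and negative at 0 (an assumption,
# matching only the monotone structure established in Theorem 1)
q = lambda r: (1.5 * r + 0.4) * r / (1.0 - 0.9 * r) - 0.2
r_star = radius_by_bisection(q)
```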
3. Numerical Examples
In this section, we give examples to show the applicability of method (2) and to confirm Remark 2. We use the norm for
Example 1.
Let function be defined by
where . The solution of this problem and .
We report the number of iterations needed to obtain an approximate solution of this problem. We test method (2) for different initial points , where , and use the stopping criterion . The additional point is . The numerical results are shown in Table 1.
Table 1.
Results for Example 1, .
In Table 2, we give the values of , and the norm of the residual at each iteration.
Table 2.
Iterative sequence, norm of growth, and residual for Example 1, , .
Example 2.
Let function be defined by [5]:
where are two parameters. Here and . Thus, if , then we have a problem with zero residual.
Let us consider Example 2 and show that and that the new error estimates (64) are tighter than the corresponding ones in [4]. We consider the case of the classical Lipschitz conditions (Corollary 3). The error estimates from [4] are as follows:
where
They can be obtained from (64) by replacing in , , , , by , respectively. Similarly,
Let us choose . Then, we have , , , , , , . The radii are given in Table 3.
Table 3.
Radii of convergence domains.
Table 4 and Table 5 report the left- and right-hand sides of the error estimates (64) and (73). We obtained these results for and the starting approximations , . We see that the new error bounds (64) are tighter than the corresponding bounds (73) from [4].
Table 4.
Results for , .
Table 5.
Results for , .
4. Conclusions
We developed an improved local convergence analysis of the Gauss–Newton–Secant method for solving nonlinear least squares problems with a nondifferentiable operator. We used center and restricted-radius Lipschitz conditions to study the method. As a consequence, we obtained a larger radius of convergence and tighter error estimates under the same computational effort as in earlier papers. This idea can be used to extend the applicability of other methods involving inverses, such as Newton-type, Secant-type, single-step, or multi-step methods, to mention a few. This will be the subject of our future work. Finally, it is worth mentioning that, besides the methods used in this paper, some of the most representative computational intelligence algorithms can be applied to such problems, for example monarch butterfly optimization (MBO) [18], the earthworm optimization algorithm (EWA) [19], elephant herding optimization (EHO) [20], the moth-flame optimization (MFO) algorithm [21], the slime mould algorithm (SMA), and Harris hawks optimization (HHO) [22].
Author Contributions
Editing, I.K.A.; Conceptualization, S.S.; Investigation, I.K.A., S.S., R.I., H.Y. and M.I.A. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Li, C.; Zhang, W.; Jin, X. Convergence and uniqueness properties of Gauss-Newton’s method. Comput. Math. Appl. 2004, 47, 1057–1067. [Google Scholar] [CrossRef]
- Argyros, I.K.; Ren, H. A derivative free iterative method for solving least squares problems. Numer. Algorithms 2011, 58, 555–571. [Google Scholar]
- Shakhno, S.M.; Gnatyshyn, O.P. On an iterative algorithm of order 1.839... for solving the nonlinear least squares problems. Appl. Math. Comput. 2005, 161, 253–264. [Google Scholar] [CrossRef]
- Shakhno, S.M.; Iakymchuk, R.P.; Yarmola, H.P. An iterative method for solving nonlinear least squares problems with nondifferentiable operator. Mat. Stud. 2017, 48, 97–107. [Google Scholar] [CrossRef]
- Shakhno, S.M.; Iakymchuk, R.P.; Yarmola, H.P. Convergence analysis of a two-step method for the nonlinear least squares problem with decomposition of operator. J. Numer. Appl. Math. 2018, 128, 82–95. [Google Scholar]
- Shakhno, S.; Shunkin, Y. One combined method for solving nonlinear least squares problems. Visnyk Lviv Univ. Ser. Appl. Math. Comp. Sci. 2017, 25, 38–48. (In Ukrainian) [Google Scholar]
- Ulm, S. On generalized divided differences. Izv. ESSR Ser. Phys. Math. 1967, 16, 13–26. (In Russian) [Google Scholar]
- Cătinaş, E. On some iterative methods for solving nonlinear equations. Rev. Anal. Numér. Théor. Approx. 1994, 23, 47–53. [Google Scholar]
- Shakhno, S.M.; Mel’nyk, I.V.; Yarmola, H.P. Convergence analysis of combined method for solving nonlinear equations. J. Math. Sci. 2016, 212, 16–26. [Google Scholar] [CrossRef]
- Shakhno, S.M. Convergence of combined Newton-Secant method and uniqueness of the solution of nonlinear equations. Sci. J. Tntu 2013, 1, 243–252. (In Ukrainian) [Google Scholar]
- Zabrejko, P.P.; Nguen, D.F. The majorant method in the theory of Newton-Kantorovich approximations and the Pták error estimates. Numer. Funct. Anal. Optim. 1987, 9, 671–686. [Google Scholar] [CrossRef]
- Wang, X.; Li, C. Convergence of Newton’s method and uniqueness of the solution of equations in Banach space II. Acta Math. Sin. 2003, 19, 405–412. [Google Scholar] [CrossRef]
- Wang, X. Convergence of Newton’s method and uniqueness of the solution of equations in Banach space. IMA J. Numer. Anal. 2000, 20, 123–134. [Google Scholar] [CrossRef]
- Argyros, I.K.; Hilout, S. On an improved convergence analysis of Newton’s method. Appl. Math. Comput. 2013, 225, 372–386. [Google Scholar] [CrossRef]
- Argyros, I.K.; Magreñán, A.A. Iterative Methods and Their Dynamics with Applications: A Contemporary Study; CRC Press: Boca Raton, FL, USA, 2017. [Google Scholar]
- Dennis, J.E.; Schnabel, R.B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; SIAM: Philadelphia, PA, USA, 1996. [Google Scholar]
- Ren, H.; Argyros, I.K. Local convergence of a secant type method for solving least squares problems. Appl. Math. Comput. 2010, 217, 3816–3824. [Google Scholar] [CrossRef]
- Wang, G.G.; Deb, S.; Cui, Z. Monarch butterfly optimization. Neural Comput. Appl. 2019, 31, 1995–2014. [Google Scholar] [CrossRef]
- Wang, G.G.; Deb, S.; Dos, L.; Coelho, L.D.S. Earthworm optimization algorithm: A bio-inspired metaheuristic algorithm for global optimization problems. Int. J. Bio-Inspired Comput. 2018, 12, 1–22. [Google Scholar] [CrossRef]
- Wang, G.G.; Deb, S.; Coelho, L.D.S. Elephant Herding Optimization. In Proceedings of the 3rd International Symposium on Computational and Business Intelligence (ISCBI 2015), Bali, Indonesia, 7–9 December 2015; pp. 1–5. [Google Scholar]
- Mirjalili, S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl.-Based Syst. 2015, 89, 228–249. [Google Scholar] [CrossRef]
- Zhao, J.; Gao, Z.-M. The hybridized Harris hawk optimization and slime mould algorithm. J. Phys. Conf. Ser. 2020, 1682, 012029. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).