Article

Improved Convergence Analysis of Gauss-Newton-Secant Method for Solving Nonlinear Least Squares Problems

Ioannis Argyros 1, Stepan Shakhno 2 and Yurii Shunkin 2
1 Department of Mathematical Sciences, Cameron University, Lawton, OK 73505, USA
2 Department of Theory of Optimal Processes, Ivan Franko National University of Lviv, 79000 Lviv, Ukraine
* Author to whom correspondence should be addressed.
Mathematics 2019, 7(1), 99; https://doi.org/10.3390/math7010099
Submission received: 20 October 2018 / Revised: 12 January 2019 / Accepted: 15 January 2019 / Published: 18 January 2019
(This article belongs to the Special Issue Computational Methods in Analysis and Applications)

Abstract

We study an iterative differential-difference method for solving nonlinear least squares problems, which uses, instead of the Jacobian, the sum of the derivative of the differentiable part of the operator and a divided difference of the nondifferentiable part. Moreover, we introduce a method that uses only the derivative of the differentiable part of the operator instead of the Jacobian. We present results that establish the convergence conditions, the radius of convergence, and the convergence order of the proposed methods, improving upon those obtained in earlier work. Numerical examples illustrate the theoretical results.

1. Introduction

Nonlinear least squares problems often arise when solving overdetermined systems of nonlinear equations, estimating parameters of physical processes from measurement results, constructing nonlinear regression models for engineering problems, etc.
The nonlinear least squares problem has the form
$$\min_{x \in \mathbb{R}^p} \frac{1}{2}\,F(x)^T F(x),$$
where the residual function $F : \mathbb{R}^p \to \mathbb{R}^m$ ($m \ge p$) is nonlinear in $x$ and continuously differentiable. An effective method for solving nonlinear least squares problems is the Gauss-Newton method [1,2,3]
$$x_{n+1} = x_n - [F'(x_n)^T F'(x_n)]^{-1} F'(x_n)^T F(x_n), \quad n = 0, 1, \ldots$$
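To fix ideas, here is a minimal NumPy sketch of the Gauss-Newton iteration above; the residual `F`, its Jacobian `dF`, and the starting point are hypothetical placeholders chosen only for illustration.

```python
import numpy as np

def gauss_newton_step(F, dF, x):
    """One Gauss-Newton step: x_new = x - (J^T J)^(-1) J^T F(x)."""
    J = dF(x)                                # m x p Jacobian of the residual
    r = F(x)                                 # residual vector of length m
    dx = np.linalg.solve(J.T @ J, J.T @ r)   # normal equations (J^T J) dx = J^T r
    return x - dx

# Hypothetical overdetermined residual with m = 3, p = 2 (zero residual at (1, 1)).
F  = lambda x: np.array([x[0] + x[1] - 2.0, x[0] * x[1] - 1.0, x[0] - x[1]])
dF = lambda x: np.array([[1.0, 1.0], [x[1], x[0]], [1.0, -1.0]])

x = np.array([2.0, 0.5])
for _ in range(10):
    x = gauss_newton_step(F, dF, x)
print(x)   # approaches (1, 1), the minimizer of (1/2) F(x)^T F(x)
```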
However, in practice there are often difficulties with the calculation of derivatives. In such cases one can use iterative-difference methods, which do not require the calculation of derivatives and, moreover, do not perform worse than the Gauss-Newton method in terms of convergence rate and number of iterations. In some cases, the nonlinear function consists of differentiable and nondifferentiable parts; then it is also possible to use the iterative-difference methods [4,5,6,7]
$$x_{n+1} = x_n - (A_n^T A_n)^{-1} A_n^T F(x_n), \quad n = 0, 1, \ldots,$$
where
$$A_n = F(x_n, x_{n-1}),$$
$$A_n = F(2x_n - x_{n-1}, x_{n-1}),$$
or
$$A_n = F(x_n, x_{n-1}) + F(x_{n-2}, x_n) - F(x_{n-2}, x_{n-1}).$$
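Here $F(x, y)$ denotes a divided difference of order one of $F$ at the points $x$, $y$. As an illustration (and only as an assumption about one possible realization), the componentwise construction below, in the spirit of the generalized divided differences of [10], satisfies the secant equation $A\,(x - y) = F(x) - F(y)$ and works for nondifferentiable operators as well.

```python
import numpy as np

def divided_difference(F, x, y):
    """First-order divided difference A = F(x, y) with A @ (x - y) = F(x) - F(y).

    Column j is built from points agreeing with x in the first coordinates and
    with y in the remaining ones (one common componentwise choice, cf. [10]).
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    p, m = x.size, np.asarray(F(x)).size
    A = np.zeros((m, p))
    for j in range(p):
        u = np.concatenate([x[:j + 1], y[j + 1:]])   # x_1,...,x_{j+1}, y_{j+2},...,y_p
        v = np.concatenate([x[:j],     y[j:]])       # x_1,...,x_j,     y_{j+1},...,y_p
        if x[j] != y[j]:
            # If x_j == y_j the column is left at 0; this does not affect A @ (x - y).
            A[:, j] = (np.asarray(F(u)) - np.asarray(F(v))) / (x[j] - y[j])
    return A

# Quick check on a nondifferentiable map.
G = lambda z: np.array([abs(z[0] - 1.0), abs(z[1])])
x, y = np.array([0.9, -0.3]), np.array([1.1, 0.2])
print(np.allclose(divided_difference(G, x, y) @ (x - y), G(x) - G(y)))   # True
```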
It is desirable to build iterative methods that take into account the properties of the problem. In particular, one can use only the derivative of the differentiable part of the operator instead of the full Jacobian, which, in fact, does not exist. The methods obtained by this approach converge slowly. More efficient methods use, instead of the Jacobian, the sum of the derivative of the differentiable part and a divided difference of the nondifferentiable part of the operator. Such an approach has given good results in the case of solving nonlinear equations.
In this work we study a combined method for solving nonlinear least squares problems, based on the Gauss-Newton and secant methods. We also consider a method that requires only the derivative of the differentiable part of the operator. We prove local convergence and show the efficiency of the combined method on test cases in comparison with secant-type methods [5,6]. The convergence region of iterative methods is small in general, which limits the choice of initial approximations. It is therefore important to extend this region without requiring additional hypotheses. The new approach [8] leads to a larger convergence radius than before [9]. We achieve this goal by locating a region, at least as small as before, containing the iterates. Then the new Lipschitz constants are at least as tight as the old ones. Moreover, using more precise estimates on the distances involved, under weaker hypotheses and the same computational cost, we provide an analysis of the Gauss-Newton-Secant method with the following advantages over the corresponding results in [9]: a larger convergence region, finer error estimates on the distances involved, and at least as precise information on the location of the solution.
The rest of the paper is organized as follows. Section 2 contains the statement of the problem; in Section 3 and Section 4 we present the local convergence analysis of the first and second method, respectively; in Section 5 we provide numerical examples. The article ends with some conclusions.

2. Description of the Problem

Consider the nonlinear least squares problem
$$\min_{x \in \mathbb{R}^p} \frac{1}{2}\,(F(x) + G(x))^T (F(x) + G(x)), \qquad (4)$$
where the residual function $F + G : \mathbb{R}^p \to \mathbb{R}^m$ ($m \ge p$) is nonlinear in $x$; $F$ is a continuously differentiable function; $G$ is a continuous function whose differentiability is, in general, not required.
We propose a modification of the Gauss-Newton method to find a solution of problem (4):
$$x_{n+1} = x_n - (A_n^T A_n)^{-1} A_n^T (F(x_n) + G(x_n)), \quad A_n = F'(x_n) + G(x_n, x_{n-1}), \quad n = 0, 1, \ldots \qquad (5)$$
Here, $F'(x_n)$ is the Fréchet derivative of $F(x)$; $G(x_n, x_{n-1})$ is a divided difference of order one for the function $G$ at the points $x_n$, $x_{n-1}$ [10], satisfying $G(x, y)(x - y) = G(x) - G(y)$ for $x \ne y$ and $G(x, x) = G'(x)$ if $G$ is differentiable; $x_0$, $x_{-1}$ are given initial approximations. Setting $A_n = F'(x_n)$, from method (5) we get a Gauss-Newton type iterative method for solving problem (4)
$$x_{n+1} = x_n - (F'(x_n)^T F'(x_n))^{-1} F'(x_n)^T (F(x_n) + G(x_n)), \quad n = 0, 1, \ldots \qquad (6)$$
In the case $m = p$, problem (4) turns into a system of nonlinear equations
$$F(x) + G(x) = 0. \qquad (7)$$
Then, it is well known ([3], p. 267) that techniques for minimizing problem (4) are techniques for finding a solution $x^*$ of Equation (7). In this case, (5) transforms into the combined Newton-Secant method [11,12]
$$x_{n+1} = x_n - (F'(x_n) + G(x_n, x_{n-1}))^{-1}(F(x_n) + G(x_n)), \quad n = 0, 1, \ldots,$$
and method (6) into a Newton-type method for solving the nonlinear Equation (7) [13]
$$x_{n+1} = x_n - (F'(x_n))^{-1}(F(x_n) + G(x_n)), \quad n = 0, 1, \ldots$$
We assume from now on that the function $G$ is differentiable at $x = x^*$.
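To make the formulas above concrete, here is a minimal sketch of one step of method (5) and of the Gauss-Newton type method (6). It assumes the `divided_difference` helper sketched in Section 1 is in scope and that the user supplies the derivative `dF` of the differentiable part; the names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gn_secant_step(F, dF, G, x, x_prev):
    """One step of the combined method (5): A_n = F'(x_n) + G(x_n, x_{n-1})."""
    A = dF(x) + divided_difference(G, x, x_prev)
    r = F(x) + G(x)
    return x - np.linalg.solve(A.T @ A, A.T @ r)

def gn_type_step(F, dF, G, x):
    """One step of the Gauss-Newton type method (6): A_n = F'(x_n) only."""
    J = dF(x)
    r = F(x) + G(x)
    return x - np.linalg.solve(J.T @ J, J.T @ r)
```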

3. Local Convergence Analysis of Method (5)

Sufficient conditions and the convergence order of the iterative process (5) are presented. First, however, we need some crucial definitions, which make clear the relationship between the Lipschitz constants appearing in the local convergence analysis.
Definition 1.
The Fréchet derivative $F'$ satisfies the center-Lipschitz condition on $D$ if there exists $L_0 > 0$ such that for each $x \in D$
$$\|F'(x) - F'(x^*)\| \le L_0\,\|x - x^*\|. \qquad (9)$$
Definition 2.
The divided difference $G(x, y)$ satisfies the center-Lipschitz condition on $D \times D$ if there exists $M_0 > 0$ such that for each $x, y \in D$
$$\|G(x, y) - G(x^*, x^*)\| \le M_0\,(\|x - x^*\| + \|y - x^*\|). \qquad (10)$$
Let $B > 0$ and $\alpha > 0$. Define the function $\varphi : [0, +\infty) \to [0, +\infty)$ by
$$\varphi(r) = B\,[2\alpha + (L_0 + 2M_0)r](L_0 + 2M_0)r. \qquad (11)$$
Let $U(x^*, r^*) = \{x : \|x - x^*\| \le r^*\}$, $r^* > 0$. Suppose that the equation $\varphi(r) = 1$ has at least one positive solution, and denote by $\gamma$ the smallest such solution. Define
$$D_0 = D \cap U(x^*, \gamma). \qquad (12)$$
Definition 3.
The Fréchet derivative $F'$ satisfies the restricted Lipschitz condition on $D_0$ if there exists $L > 0$ such that for each $x, y \in D_0$
$$\|F'(x) - F'(y)\| \le L\,\|x - y\|. \qquad (13)$$
Definition 4.
The first order divided difference $G(x, y)$ satisfies the restricted Lipschitz condition on $D_0 \times D_0$ if there exists $M > 0$ such that for each $x, y, u \in D_0$
$$\|G(x, y) - G(u, x^*)\| \le M\,(\|x - u\| + \|y - x^*\|). \qquad (14)$$
Next, we also state the definitions given in [9], so we can compare them to preceding ones.
Definition 5.
The Fréchet derivative $F'$ satisfies the Lipschitz condition on $D$ if there exists $L_1 > 0$ such that for each $x, y \in D$
$$\|F'(x) - F'(y)\| \le L_1\,\|x - y\|. \qquad (15)$$
Definition 6.
The first order divided difference $G(x, y)$ satisfies the Lipschitz condition on $D \times D$ if there exists $M_1 > 0$ such that for each $x, y, u, v \in D$
$$\|G(x, y) - G(u, v)\| \le M_1\,(\|x - u\| + \|y - v\|). \qquad (16)$$
Remark 1.
It follows from the preceding definitions that $L = L(L_0, M_0)$, $M = M(L_0, M_0)$,
$$L_0 \le L_1, \qquad (17)$$
$$L \le L_1, \qquad (18)$$
$$M_0 \le M_1, \qquad (19)$$
and
$$M \le M_1, \qquad (20)$$
since $D_0 \subseteq D$. If any of (17)–(20) is a strict inequality, then the following advantages are obtained over the work in [9], which uses $L_1$ and $M_1$ instead of the new constants:
$(a_1)$ an at least as large convergence domain, leading to at least as many initial choices;
$(a_2)$ at least as tight upper bounds on the distances $\|x_n - x^*\|$, so at most as many iterations are needed to obtain a desired error tolerance.
It is always true that $D_0$ is at least as small as, and included in, $D$ by (12). Here lies the new idea and the reason for the advantages. Notice that these advantages are obtained under the same computational cost as in [9], since the new constants $L_0$, $M_0$, $L$ and $M$ are special cases of the constants $L_1$ and $M_1$. This technique of using the center-Lipschitz condition in combination with the restricted convergence region has been used for Newton's, Secant and Newton-like methods [14] and can be used on other methods in order to extend their applicability.
The Euclidean vector norm and the corresponding matrix norm are used in this study; they have the advantage that $\|A^T\| = \|A\|$.
The proof of the next result follows the corresponding one in [9], but there are crucial differences where we use $(L_0, L)$ instead of $L_1$ and $(M_0, M)$ instead of $M_1$.
Theorem 1.
Let $F + G : \mathbb{R}^p \to \mathbb{R}^m$ be continuous on a set $D \subset \mathbb{R}^p$, let $F$ be continuously differentiable on this set, and let $G(\cdot, \cdot) : D \times D \to L(\mathbb{R}^p, \mathbb{R}^m)$ be a divided difference of order one. Suppose that problem (4) has a solution $x^*$ on the set $D$, the inverse operator $(A_*^T A_*)^{-1} = [(F'(x^*) + G(x^*, x^*))^T (F'(x^*) + G(x^*, x^*))]^{-1}$ exists with $\|(A_*^T A_*)^{-1}\| \le B$, conditions (9), (10), (13), (14) hold, and $\gamma$ defined via (11) exists. Moreover,
$$\|F(x^*) + G(x^*)\| \le \eta, \qquad \|F'(x^*) + G(x^*, x^*)\| \le \alpha; \qquad (21)$$
$$B\,(L_0 + 2M_0)\,\eta < 1, \qquad (22)$$
and $U(x^*, r^*) \subseteq D$, where $r^*$ is the unique positive zero of the function $q$, defined by
$$q(r) = B\,\big[(\alpha + (L_0 + 2M_0)r)(L + 2M)r/2 + (L_0 + 2M_0)\eta\big] + B\,[2\alpha + (L_0 + 2M_0)r](L_0 + 2M_0)r - 1. \qquad (23)$$
Then, for $x_0, x_{-1} \in U(x^*, r^*)$, method (5) is well defined, generates the sequence $\{x_n\}$, $n = 0, 1, \ldots$, which remains in $U(x^*, r^*)$, and converges to the solution $x^*$. Moreover, the following error bounds hold
$$\|x_{n+1} - x^*\| \le C_1\|x_{n-1} - x^*\| + C_2\|x_n - x^*\| + C_3\|x_{n-1} - x^*\|\,\|x_n - x^*\| + C_4\|x_{n-1} - x^*\|^2\|x_n - x^*\| + C_5\|x_n - x^*\|^2 + C_6\|x_{n-1} - x^*\|\,\|x_n - x^*\|^2 + C_7\|x_n - x^*\|^3, \qquad (24)$$
where
$$\begin{aligned} g(r) &= B\,[1 - B(2\alpha + (L_0 + 2M_0)r)(L_0 + 2M_0)r]^{-1}; \\ C_1 &= g(r^*)M_0\eta; \quad C_2 = g(r^*)(L_0 + M_0)\eta; \quad C_3 = g(r^*)\alpha M; \quad C_4 = g(r^*)M_0 M; \\ C_5 &= g(r^*)\frac{\alpha L}{2}; \quad C_6 = g(r^*)\Big(L_0 M + M_0 M + \frac{M_0 L}{2}\Big); \quad C_7 = g(r^*)\frac{L}{2}(L_0 + M_0). \end{aligned} \qquad (25)$$
Proof. 
By the intermediate value theorem on $[0, r]$ for sufficiently large $r$, and in view of (22), the function $q$ has at least one positive zero. Denote by $r^*$ the smallest such zero. Moreover, $q$ is increasing on $[0, +\infty)$, so this zero is unique there.
We shall show estimate (24), first showing that the sequence $\{x_n\}$ is well defined.
Let $A_n = F'(x_n) + G(x_n, x_{n-1})$, and set $n = 0$. We need to show that the linear operator $A_0^T A_0$ is invertible. Assuming $x_0, x_{-1} \in U(x^*, r^*)$, we obtain the following estimate:
$$\begin{aligned} \|I - (A_*^T A_*)^{-1} A_0^T A_0\| &= \|(A_*^T A_*)^{-1}(A_*^T A_* - A_0^T A_0)\| \\ &= \|(A_*^T A_*)^{-1}\big(A_*^T(A_* - A_0) + (A_*^T - A_0^T)(A_0 - A_*) + (A_*^T - A_0^T)A_*\big)\| \\ &\le \|(A_*^T A_*)^{-1}\|\big(\|A_*^T\|\,\|A_* - A_0\| + \|A_*^T - A_0^T\|\,\|A_0 - A_*\| + \|A_*^T - A_0^T\|\,\|A_*\|\big) \\ &\le B\big(\alpha\|A_* - A_0\| + \|A_*^T - A_0^T\|\,\|A_0 - A_*\| + \alpha\|A_*^T - A_0^T\|\big). \end{aligned} \qquad (26)$$
By (9) and (10), we have in turn the estimate
$$\begin{aligned} \|A_0 - A_*\| &= \|(F'(x_0) + G(x_0, x_{-1})) - (F'(x^*) + G(x^*, x^*))\| \\ &\le \|F'(x_0) - F'(x^*)\| + \|G(x_0, x_{-1}) - G(x^*, x^*)\| \\ &\le L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|). \end{aligned} \qquad (27)$$
Then, from inequalities (26), (27) and the definition of $r^*$ in (23), we get
$$\begin{aligned} \|I - (A_*^T A_*)^{-1} A_0^T A_0\| &\le B\,[2\alpha + L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)] \\ &\quad\times [L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)] \\ &\le B\,[2\alpha + (L_0 + 2M_0)r^*](L_0 + 2M_0)r^* = \varphi(r^*) < 1. \end{aligned} \qquad (28)$$
By the Banach lemma on invertible operators [3] and (28), $A_0^T A_0$ is invertible. Then, from (26), (27) and (28), we get in turn that
$$\begin{aligned} \|(A_0^T A_0)^{-1}\| \le g_0 &= B\,\{1 - B\,[2\alpha + L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)] \\ &\quad\times (L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|))\}^{-1} \\ &\le g(r^*) = B\,\{1 - B\,[2\alpha + (L_0 + 2M_0)r^*](L_0 + 2M_0)r^*\}^{-1}. \end{aligned}$$
Hence, the iterate $x_1$ is well defined by method (5) for $n = 0$. Next, we show that $x_1 \in U(x^*, r^*)$. First, we get the estimate
$$\begin{aligned} \|x_1 - x^*\| &= \Big\|x_0 - x^* - (A_0^T A_0)^{-1}\big(A_0^T(F(x_0) + G(x_0)) - A_*^T(F(x^*) + G(x^*))\big)\Big\| \\ &\le \|(A_0^T A_0)^{-1}\|\,\Big\|A_0^T\Big(A_0 - \int_0^1 F'(x^* + t(x_0 - x^*))\,dt - G(x_0, x^*)\Big)(x_0 - x^*) + (A_0^T - A_*^T)(F(x^*) + G(x^*))\Big\|. \end{aligned}$$
Moreover, using (9), (10), (13), (14) and (21), we obtain in turn
$$\begin{aligned} \Big\|A_0 - \int_0^1 F'(x^* + t(x_0 - x^*))\,dt - G(x_0, x^*)\Big\| &= \Big\|F'(x_0) - \int_0^1 F'(x^* + t(x_0 - x^*))\,dt + G(x_0, x_{-1}) - G(x_0, x^*)\Big\| \\ &= \Big\|\int_0^1 \big(F'(x_0) - F'(x^* + t(x_0 - x^*))\big)\,dt + G(x_0, x_{-1}) - G(x_0, x^*)\Big\| \\ &\le \frac{1}{2}L\|x_0 - x^*\| + M\|x_{-1} - x^*\| = \frac{1}{2}\big(L\|x_0 - x^*\| + 2M\|x_{-1} - x^*\|\big), \\ \|A_0\| &\le \|A_*\| + \|A_0 - A_*\| \le \alpha + L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|). \end{aligned}$$
Then, by method (5) for n = 0 and the preceding estimate, we have in turn that
$$\begin{aligned} \|x_1 - x^*\| &\le B\,\Big\{\big(\alpha + L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)\big)\,\tfrac{1}{2}\big(L\|x_0 - x^*\| + 2M\|x_{-1} - x^*\|\big)\|x_0 - x^*\| \\ &\qquad + \eta\big(L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)\big)\Big\} \\ &\quad\Big/\Big\{1 - B\,[2\alpha + L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)]\big(L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)\big)\Big\} \\ &= g_0\,\Big\{\big(\alpha + L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)\big)\,\tfrac{1}{2}\big(L\|x_0 - x^*\| + 2M\|x_{-1} - x^*\|\big)\|x_0 - x^*\| \\ &\qquad + \eta\big(L_0\|x_0 - x^*\| + M_0(\|x_0 - x^*\| + \|x_{-1} - x^*\|)\big)\Big\} \\ &< g(r^*)\,\big[(\alpha + (L_0 + 2M_0)r^*)(L + 2M)r^*/2 + (L_0 + 2M_0)\eta\big]\,r^* = p(r^*)\,r^* = r^*, \end{aligned}$$
where $p(r) = g(r)\,[(\alpha + (L_0 + 2M_0)r)(L + 2M)r/2 + (L_0 + 2M_0)\eta]$. That is, $x_1 \in U(x^*, r^*)$ and estimate (24) holds for $n = 0$.
Suppose that $x_n \in U(x^*, r^*)$ for $n = 0, 1, \ldots, k$ and that estimate (24) holds for $n = 0, 1, \ldots, k - 1$, where $k \ge 1$ is an integer. We shall show that $x_{k+1} \in U(x^*, r^*)$ and that estimate (24) holds for $n = k$.
As in the derivation of (28), using (9), (21) and the definition of function φ , we get in turn that
$$\begin{aligned} \|I - (A_*^T A_*)^{-1} A_k^T A_k\| &= \|(A_*^T A_*)^{-1}(A_*^T A_* - A_k^T A_k)\| \\ &= \|(A_*^T A_*)^{-1}\big(A_*^T(A_* - A_k) + (A_*^T - A_k^T)(A_k - A_*) + (A_*^T - A_k^T)A_*\big)\| \\ &\le \|(A_*^T A_*)^{-1}\|\big(\|A_*^T\|\,\|A_* - A_k\| + \|A_*^T - A_k^T\|\,\|A_k - A_*\| + \|A_*^T - A_k^T\|\,\|A_*\|\big) \\ &\le B\big(\alpha\|A_* - A_k\| + \|A_*^T - A_k^T\|\,\|A_k - A_*\| + \alpha\|A_*^T - A_k^T\|\big) \\ &\le B\,[2\alpha + L_0\|x_k - x^*\| + M_0(\|x_k - x^*\| + \|x_{k-1} - x^*\|)]\,[L_0\|x_k - x^*\| + M_0(\|x_k - x^*\| + \|x_{k-1} - x^*\|)] \\ &\le B\,[2\alpha + (L_0 + 2M_0)r^*](L_0 + 2M_0)r^* < 1. \end{aligned}$$
Hence, $(A_k^T A_k)^{-1}$ exists and
$$\begin{aligned} \|(A_k^T A_k)^{-1}\| \le g_k &= B\,\{1 - B\,[2\alpha + L_0\|x_k - x^*\| + M_0(\|x_k - x^*\| + \|x_{k-1} - x^*\|)] \\ &\quad\times (L_0\|x_k - x^*\| + M_0(\|x_k - x^*\| + \|x_{k-1} - x^*\|))\}^{-1} \le g(r^*). \end{aligned}$$
Therefore, the iterate $x_{k+1}$ is well defined, and the following estimate holds
$$\begin{aligned} \|x_{k+1} - x^*\| &= \Big\|x_k - x^* - (A_k^T A_k)^{-1}\big(A_k^T(F(x_k) + G(x_k)) - A_*^T(F(x^*) + G(x^*))\big)\Big\| \\ &\le \|(A_k^T A_k)^{-1}\|\,\Big\|A_k^T\Big(A_k - \int_0^1 F'(x^* + t(x_k - x^*))\,dt - G(x_k, x^*)\Big)(x_k - x^*) + (A_k^T - A_*^T)(F(x^*) + G(x^*))\Big\| \\ &\le g_k\,\Big\{\big(\alpha + L_0\|x_k - x^*\| + M_0(\|x_k - x^*\| + \|x_{k-1} - x^*\|)\big)\,\tfrac{1}{2}\big(L\|x_k - x^*\| + 2M\|x_{k-1} - x^*\|\big)\|x_k - x^*\| \\ &\qquad + \eta\big(L_0\|x_k - x^*\| + M_0(\|x_k - x^*\| + \|x_{k-1} - x^*\|)\big)\Big\} \\ &\le g(r^*)\,\Big\{\big(\alpha + L_0\|x_k - x^*\| + M_0(\|x_k - x^*\| + \|x_{k-1} - x^*\|)\big)\,\tfrac{1}{2}\big(L\|x_k - x^*\| + 2M\|x_{k-1} - x^*\|\big)\|x_k - x^*\| \\ &\qquad + \eta\big(L_0\|x_k - x^*\| + M_0(\|x_k - x^*\| + \|x_{k-1} - x^*\|)\big)\Big\} < p(r^*)\,r^* = r^*. \end{aligned}$$
This proves that $x_{k+1} \in U(x^*, r^*)$ and that estimate (24) holds for $n = k$.
Thus, method (5) is well defined, $x_n \in U(x^*, r^*)$ for all $n \ge 0$, and estimate (24) holds for all $n \ge 0$. It remains to prove that $x_n \to x^*$ as $n \to \infty$.
Define the functions $a$ and $b$ on $[0, r^*]$ by
$$a(r) = g(r)\big((L_0 + M_0)\eta + \alpha L r/2 + L(L_0 + M_0)r^2/2\big) \qquad (29)$$
and
$$b(r) = g(r)\Big(M_0\eta + \alpha M r + \Big(2M_0 M + L_0 M + \frac{M_0 L}{2}\Big)r^2\Big). \qquad (30)$$
By the definition of $r^*$, we get
$$a(r^*) \ge 0, \qquad b(r^*) \ge 0, \qquad a(r^*) + b(r^*) = 1. \qquad (31)$$
Using estimate (24), the definitions of the constants $C_i$, $i = 1, 2, \ldots, 7$, and of the functions $a$ and $b$, for $n \ge 0$ we get the following
$$\begin{aligned} \|x_{n+1} - x^*\| &\le C_1\|x_{n-1} - x^*\| + C_2\|x_n - x^*\| + C_3 r^*\|x_{n-1} - x^*\| + C_4 r^{*2}\|x_{n-1} - x^*\| \\ &\quad + C_5 r^*\|x_n - x^*\| + C_6 r^{*2}\|x_{n-1} - x^*\| + C_7 r^{*2}\|x_n - x^*\| \\ &= a(r^*)\|x_n - x^*\| + b(r^*)\|x_{n-1} - x^*\|. \end{aligned} \qquad (32)$$
As was shown in [1], under conditions (29)–(32) the sequence $\{x_n\}$ converges to $x^*$ as $n \to \infty$. □
Corollary 1.
In the case $\eta = 0$, we have a nonlinear least squares problem with zero residual. Then $C_1 = 0$ and $C_2 = 0$, and estimate (24) reduces to
$$\|x_{n+1} - x^*\| \le (C_3 + C_4 r^*)\|x_{n-1} - x^*\|\,\|x_n - x^*\| + (C_5 + C_6 r^* + C_7 r^*)\|x_n - x^*\|^2.$$
That is, method (5) converges with order $\dfrac{1 + \sqrt{5}}{2}$.
Let $G(x) \equiv 0$ in (4), corresponding to the residual function being differentiable. Then, from Theorem 1, we obtain the following corollary.
Corollary 2.
If $G(x) \equiv 0$, then in the conditions of the theorem we set $M = 0$, $C_3 = 0$, $C_4 = 0$, and estimate (24) reduces to
$$\|x_{n+1} - x^*\| \le (C_5 + C_6 r^* + C_7 r^*)\|x_n - x^*\|^2.$$
Hence, method (5) has convergence order two.
Remark 2.
If $L_0 = L = L_1$ and $M_0 = M = M_1$, our results specialize to the corresponding ones in [9]. Otherwise, they constitute an improvement, as already noted in Remark 1. As an example, let $q_1$, $g_1$, $C_1^1$, $C_2^1$, $C_3^1$, $C_4^1$, $r_1^*$ denote the functions and the parameter in which $L_0$, $L$, $M_0$, $M$ are replaced by $L_1$, $L_1$, $M_1$, $M_1$, respectively. Then, in view of (17)–(20), we have
$$q(r) \le q_1(r),$$
$$g(r) \le g_1(r),$$
$$C_1 \le C_1^1,$$
$$C_2 \le C_2^1,$$
$$C_3 \le C_3^1,$$
$$C_4 \le C_4^1,$$
so
$$r_1^* \le r^*,$$
$$B(L_1 + 2M_1)\eta < 1 \;\Longrightarrow\; B(L_0 + 2M_0)\eta < 1.$$
Consequently, the new sufficient convergence criteria are weaker than the ones in [9], unless $L_0 = L_1$ and $M_0 = M_1$. Moreover, the new error bounds are tighter than the corresponding ones in [9], and the rest of the advantages already mentioned in Remark 1 hold true.
The results can be improved even further if (10) and (14) are replaced by
$$\|G(x, y) - G(x^*, x^*)\| \le K_0\|x - x^*\| + \bar{K}_0\|y - x^*\|$$
and
$$\|G(x, y) - G(u, x^*)\| \le N_0\|x - u\| + \bar{N}_0\|y - x^*\|,$$
respectively, since $K_0 \le M_0$, $\bar{K}_0 \le M_0$, $N_0 \le M$ and $\bar{N}_0 \le M$. We leave the details to the motivated reader.
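For a concrete problem, the radii appearing in Theorem 1 can be evaluated numerically. The sketch below computes $\gamma$ (the smallest positive solution of $\varphi(r) = 1$ from (11)) in closed form and $r^*$ (the smallest positive zero of $q$ from (23)) by bracketing and root finding; the numerical values of $B$, $\alpha$, $\eta$ and the Lipschitz constants are hypothetical placeholders that would have to be estimated for the problem at hand.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical constants (to be estimated for a concrete F, G and solution x*).
B, alpha, eta = 1.0, 0.8, 0.05
L0, M0, L, M  = 1.0, 0.2, 2.0, 0.4

phi = lambda r: B * (2 * alpha + (L0 + 2 * M0) * r) * (L0 + 2 * M0) * r          # (11)
q   = lambda r: (B * ((alpha + (L0 + 2 * M0) * r) * (L + 2 * M) * r / 2
                      + (L0 + 2 * M0) * eta) + phi(r) - 1)                       # (23)

# gamma: smallest positive solution of phi(r) = 1 (a quadratic in r).
gamma = (-alpha + np.sqrt(alpha ** 2 + 1.0 / B)) / (L0 + 2 * M0)

# r*: smallest positive zero of q.  By (22), q(0) = B*(L0 + 2*M0)*eta - 1 < 0,
# and q(r) -> +inf as r grows, so a bracket can be found by doubling.
assert B * (L0 + 2 * M0) * eta < 1
R = 1.0
while q(R) < 0:
    R *= 2.0
r_star = brentq(q, 0.0, R)
print(gamma, r_star)
```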

4. Local Convergence Analysis of Method (6)

Sufficient conditions and the rate of local convergence of method (6) are given in the following theorem.
Theorem 2.
Let $F + G : \mathbb{R}^p \to \mathbb{R}^m$ be continuous on a set $D \subset \mathbb{R}^p$, let $F$ be continuously differentiable on this set, and let $G$ be a function on $D$. Suppose that problem (4) has a solution $x^*$ on the set $D$, $F(x^*) + G(x^*) = 0$, and the inverse operator $(A_*^T A_*)^{-1} = [F'(x^*)^T F'(x^*)]^{-1}$ exists with $\|(A_*^T A_*)^{-1}\| \le B$. The Fréchet derivative $F'$ and the function $G$ satisfy the following Lipschitz conditions on the set $D_0$:
$$\|F'(x) - F'(x^*)\| \le L_0\|x - x^*\|,$$
$$\|F'(x) - F'(y)\| \le L\|x - y\|,$$
$$\|G(x) - G(x^*)\| \le M_0\|x - x^*\|.$$
Moreover,
$$\|F'(x^*)\| \le \alpha;$$
$$B\,M_0\,\alpha < 1, \qquad (39)$$
and $U(x^*, r^*) \subseteq D$, where $r^*$ is the unique positive zero of the function $q$, defined by
$$q(r) = B\,\big[(\alpha + L_0 r)(L r + 2M_0)/2\big] + B\,[2\alpha + L_0 r]L_0 r - 1. \qquad (40)$$
Then, for $x_0 \in U(x^*, r^*)$, method (6) is well defined, generates the sequence $\{x_n\}$, $n = 0, 1, \ldots$, which remains in $U(x^*, r^*)$, and converges to the solution $x^*$. Moreover, the following error bounds hold
$$\|x_{n+1} - x^*\| \le C_1\|x_n - x^*\| + C_2\|x_n - x^*\|^2 + C_3\|x_n - x^*\|^3, \qquad (41)$$
where
$$g(r) = B\,[1 - B(2\alpha + L_0 r)L_0 r]^{-1}; \quad C_1 = g(r^*)M_0\alpha; \quad C_2 = g(r^*)\Big(L_0 M_0 + \frac{\alpha L}{2}\Big); \quad C_3 = g(r^*)\frac{L_0 L}{2}. \qquad (42)$$
Proof. 
By the intermediate value theorem on $[0, r]$ for sufficiently large $r$, and in view of (39), the function $q$ has a smallest positive zero, denoted by $r^*$; since $q$ is increasing on $[0, +\infty)$, this zero is unique there. The proof is analogous to the one given in Theorem 1.
Let $A_n = F'(x_n)$, and set $n = 0$. Assume $x_0 \in U(x^*, r^*)$. By analogy with (26) in Theorem 1, we get
$$\|I - (A_*^T A_*)^{-1}A_0^T A_0\| \le B\big(\alpha\|A_* - A_0\| + \|A_*^T - A_0^T\|\,\|A_0 - A_*\| + \alpha\|A_*^T - A_0^T\|\big). \qquad (43)$$
Taking into account that
$$\|A_0 - A_*\| = \|F'(x_0) - F'(x^*)\| \le L_0\|x_0 - x^*\|, \qquad (44)$$
from inequality (43) and the definition of $r^*$ given in (40), we get
$$\|I - (A_*^T A_*)^{-1}A_0^T A_0\| \le B\,[2\alpha + L_0\|x_0 - x^*\|]L_0\|x_0 - x^*\| \le B\,[2\alpha + L_0 r^*]L_0 r^* = \varphi(r^*) < 1. \qquad (45)$$
From the Banach lemma on invertible operators [3] and (45), $A_0^T A_0$ is invertible. Then, from (43)–(45), we get
$$\|(A_0^T A_0)^{-1}\| \le g_0 = B\,\{1 - B\,[2\alpha + L_0\|x_0 - x^*\|]L_0\|x_0 - x^*\|\}^{-1} \le g(r^*) = B\,\{1 - B\,[2\alpha + L_0 r^*]L_0 r^*\}^{-1}.$$
Hence, the iterate $x_1$ is well defined.
Next, we show that $x_1 \in U(x^*, r^*)$. We have the estimate
$$\begin{aligned} \|x_1 - x^*\| &= \Big\|x_0 - x^* - (A_0^T A_0)^{-1}\big(A_0^T(F(x_0) + G(x_0)) - A_*^T(F(x^*) + G(x^*))\big)\Big\| \\ &\le \|(A_0^T A_0)^{-1}\|\,\Big\|A_0^T\Big[\Big(A_0 - \int_0^1 F'(x^* + t(x_0 - x^*))\,dt\Big)(x_0 - x^*) - (G(x_0) - G(x^*))\Big] + (A_0^T - A_*^T)(F(x^*) + G(x^*))\Big\|. \end{aligned}$$
In view of the estimates
$$\begin{aligned} \Big\|\Big(A_0 - \int_0^1 F'(x^* + t(x_0 - x^*))\,dt\Big)(x_0 - x^*) - (G(x_0) - G(x^*))\Big\| &= \Big\|\int_0^1\big(F'(x_0) - F'(x^* + t(x_0 - x^*))\big)\,dt\,(x_0 - x^*) - (G(x_0) - G(x^*))\Big\| \\ &\le \frac{1}{2}L\|x_0 - x^*\|^2 + M_0\|x_0 - x^*\| = \frac{1}{2}\big(L\|x_0 - x^*\| + 2M_0\big)\|x_0 - x^*\|, \\ \|A_0\| &\le \|A_*\| + \|A_0 - A_*\| \le \alpha + L_0\|x_0 - x^*\|, \end{aligned}$$
we obtain in turn that
$$\begin{aligned} \|x_1 - x^*\| &\le B\,\Big\{(\alpha + L_0\|x_0 - x^*\|)\,\tfrac{1}{2}\big(L\|x_0 - x^*\| + 2M_0\big)\|x_0 - x^*\|\Big\}\Big/\Big\{1 - B\,[2\alpha + L_0\|x_0 - x^*\|]L_0\|x_0 - x^*\|\Big\} \\ &= g_0\,(\alpha + L_0\|x_0 - x^*\|)\,\tfrac{1}{2}\big(L\|x_0 - x^*\| + 2M_0\big)\|x_0 - x^*\| \\ &\le g(r^*)\,(\alpha + L_0\|x_0 - x^*\|)\,\tfrac{1}{2}\big(L\|x_0 - x^*\| + 2M_0\big)\|x_0 - x^*\| \\ &< g(r^*)\,\big[(\alpha + L_0 r^*)(L r^* + 2M_0)/2\big]\,r^* = r^*. \end{aligned}$$
Hence, $x_1 \in U(x^*, r^*)$ and inequality (41) holds for $n = 0$.
Suppose that $x_n \in U(x^*, r^*)$ for $n = 0, 1, \ldots, k$ and that estimate (41) holds for $n = 0, 1, \ldots, k - 1$, where $k \ge 1$ is an integer. Next, we show that $x_{k+1} \in U(x^*, r^*)$ and that estimate (41) holds for $n = k$.
Then, we obtain
$$\begin{aligned} \|I - (A_*^T A_*)^{-1}A_k^T A_k\| &= \|(A_*^T A_*)^{-1}(A_*^T A_* - A_k^T A_k)\| \\ &= \|(A_*^T A_*)^{-1}\big(A_*^T(A_* - A_k) + (A_*^T - A_k^T)(A_k - A_*) + (A_*^T - A_k^T)A_*\big)\| \\ &\le \|(A_*^T A_*)^{-1}\|\big(\|A_*^T\|\,\|A_* - A_k\| + \|A_*^T - A_k^T\|\,\|A_k - A_*\| + \|A_*^T - A_k^T\|\,\|A_*\|\big) \\ &\le B\big(\alpha\|A_* - A_k\| + \|A_*^T - A_k^T\|\,\|A_k - A_*\| + \alpha\|A_*^T - A_k^T\|\big) \\ &\le B\,[2\alpha + L_0\|x_k - x^*\|]L_0\|x_k - x^*\| \le B\,[2\alpha + L_0 r^*]L_0 r^* < 1. \end{aligned}$$
Hence, $(A_k^T A_k)^{-1}$ exists and
$$\|(A_k^T A_k)^{-1}\| \le g_k = B\,\{1 - B\,[2\alpha + L_0\|x_k - x^*\|]L_0\|x_k - x^*\|\}^{-1} \le g(r^*).$$
Therefore, the iterate $x_{k+1}$ is well defined, and we get in turn that
$$\begin{aligned} \|x_{k+1} - x^*\| &= \Big\|x_k - x^* - (A_k^T A_k)^{-1}\big(A_k^T(F(x_k) + G(x_k)) - A_*^T(F(x^*) + G(x^*))\big)\Big\| \\ &\le \|(A_k^T A_k)^{-1}\|\,\Big\|A_k^T\Big[\Big(A_k - \int_0^1 F'(x^* + t(x_k - x^*))\,dt\Big)(x_k - x^*) - (G(x_k) - G(x^*))\Big] + (A_k^T - A_*^T)(F(x^*) + G(x^*))\Big\| \\ &\le g_k\,(\alpha + L_0\|x_k - x^*\|)\,\tfrac{1}{2}\big(L\|x_k - x^*\| + 2M_0\big)\|x_k - x^*\| \\ &\le g(r^*)\,(\alpha + L_0\|x_k - x^*\|)\,\tfrac{1}{2}\big(L\|x_k - x^*\| + 2M_0\big)\|x_k - x^*\| < r^*. \end{aligned}$$
This proves that $x_{k+1} \in U(x^*, r^*)$ and that estimate (41) holds for $n = k$.
Thus, the iterative process (6) is well defined, $x_n \in U(x^*, r^*)$ for all $n \ge 0$, and estimate (41) holds for all $n \ge 0$.
Define the function $a$ on $[0, r^*]$ by
$$a(r) = g(r)\big(M_0\alpha + (\alpha L/2 + L_0 M_0)r + L_0 L r^2/2\big).$$
Using estimate (41), the definitions of the constants $C_i$, $i = 1, 2, 3$, and of the function $a$, for $n \ge 0$ we get the following
$$\|x_{n+1} - x^*\| \le C_1\|x_n - x^*\| + C_2 r^*\|x_n - x^*\| + C_3 r^{*2}\|x_n - x^*\| = a(r^*)\|x_n - x^*\|. \qquad (47)$$
For any $r^* > 0$ and initial point $x_0 \in U(x^*, r^*)$ there exists $r'$, $0 < r' < r^*$, such that $x_0 \in U(x^*, r')$. Similarly to the proof that all iterates stay in $U(x^*, r^*)$, we can show that all iterates stay in $U(x^*, r')$. So, estimate (47) holds if $r^*$ is replaced by $r'$. In particular, from (47), for $n \ge 0$ we get
$$\|x_{n+1} - x^*\| \le a'\|x_n - x^*\|,$$
where $a' = a(r')$. Obviously, $a' \ge 0$ and $a' < a(r^*) = 1$. Therefore, we obtain
$$\|x_{n+1} - x^*\| \le a'\|x_n - x^*\| \le \cdots \le (a')^{n+1}\|x_0 - x^*\|.$$
However, $(a')^{n+1} \to 0$ as $n \to \infty$. Hence, the sequence $\{x_n\}$ converges to $x^*$ as $n \to \infty$ at the rate of a geometric progression. □
The same type of improvements as in Theorem 1 are obtained for Theorem 2 (see Remark 2).
Remark 3.
As we can see from estimates (41) and (42), the convergence of method (6) depends on $\alpha$, $L_0$, $L$ and $M_0$. For problems with weak nonlinearity ($\alpha$, $L_0$, $L$ and $M_0$ "small"), the convergence rate of the iterative process is linear. In the case of strongly nonlinear problems ($\alpha$, $L_0$, $L$ and/or $M_0$ "large"), method (6) may not converge at all.

5. Numerical Experiments

Let us compare the convergence rates of the combined method (5), the Gauss-Newton type method (6), and the secant-type method for solving nonlinear least squares problems [5,6] on some test cases, where the latter is given by
$$x_{n+1} = x_n - (A_n^T A_n)^{-1}A_n^T(F(x_n) + G(x_n)), \quad A_n = F(x_n, x_{n-1}) + G(x_n, x_{n-1}), \quad n = 0, 1, \ldots \qquad (48)$$
Testing is carried out on nonlinear systems with a nondifferentiable operator, with zero and non-zero residual. The classic Gauss-Newton and Newton methods cannot be used for solving such problems. Results are sought with accuracy $\varepsilon = 10^{-8}$. Calculations are performed until the following conditions are satisfied:
$$\|x_{n+1} - x_n\| \le \varepsilon \quad \text{and} \quad \|A_n^T(F(x_n) + G(x_n))\| \le \varepsilon.$$
Here $f(x) = \frac{1}{2}(F(x) + G(x))^T(F(x) + G(x))$ denotes the minimized function.
Example 1 ([11,12]).
$$3x^2 y + y^2 - 1 + |x - 1| = 0, \qquad x^4 + x y^3 - 1 + |y| = 0,$$
$$(x^*, y^*) \approx (0.89465537,\ 0.32782652), \qquad f(x^*) = 0.$$
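As an illustration, the short script below applies the combined method (5) to Example 1, using the componentwise divided difference sketched in Section 1 for the nondifferentiable part and the stopping rule above; the starting points follow Table 1 and Remark 4. Since the divided-difference realization is only an assumed one, the iteration count need not match Table 1 exactly.

```python
import numpy as np

eps = 1e-8

# Differentiable part F, its derivative, and nondifferentiable part G of Example 1.
F  = lambda z: np.array([3*z[0]**2*z[1] + z[1]**2 - 1, z[0]**4 + z[0]*z[1]**3 - 1])
dF = lambda z: np.array([[6*z[0]*z[1],          3*z[0]**2 + 2*z[1]],
                         [4*z[0]**3 + z[1]**3,  3*z[0]*z[1]**2    ]])
G  = lambda z: np.array([abs(z[0] - 1.0), abs(z[1])])

x_prev = np.array([1.0 - 1e-4, 0.0 - 1e-4])   # (x_{-1}, y_{-1}) = (x_0 - 1e-4, y_0 - 1e-4)
x      = np.array([1.0, 0.0])                 # (x_0, y_0) = (1, 0)

for n in range(100):
    A = dF(x) + divided_difference(G, x, x_prev)    # helper from the Section 1 sketch
    r = F(x) + G(x)
    step = np.linalg.solve(A.T @ A, A.T @ r)
    x_prev, x = x, x - step
    if np.linalg.norm(step) <= eps and np.linalg.norm(A.T @ r) <= eps:
        break

print(n + 1, x)   # x approaches (0.89465537, 0.32782652); Table 1 reports 7 iterations
```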
Example 2 ($n = 2$, $m = 3$).
$$3x^2 y + y^2 - 1 + |x - 1| = 0, \qquad x^4 + x y^3 - 1 + |y| = 0, \qquad |x^2 - y| = 0,$$
$$(x^*, y^*) \approx (0.74862800,\ 0.43039151), \qquad f(x^*) \approx 4.0469349 \cdot 10^{-2}.$$
Remark 4.
The results of the numerical experiments are shown in Table 1. In particular, we compare the studied methods with respect to the number of iterations needed to find the solution with the given accuracy. In Example 1, all methods converge to the same solution. In Example 2, the Gauss-Newton type method (6) converges to the point $(x^*, y^*) \approx (0.89465537, 0.32782652)$ with residual $f(x^*) \approx 1.11666739 \cdot 10^{-1}$, with the same number of iterations as in Example 1; such runs are marked with the * symbol in the table. The other methods find the point $(x^*, y^*) \approx (0.74862800, 0.43039151)$ with the smaller residual $f(x^*) \approx 4.0469349 \cdot 10^{-2}$. The additional initial approximation $(x_{-1}, y_{-1})$ is chosen as
$$(x_{-1}, y_{-1}) = (x_0 - 10^{-4},\ y_0 - 10^{-4}).$$

6. Conclusions

Based on the theoretical studies, the numerical experiments, and the comparison of the obtained results, we can argue that the combined differential-difference method (5) converges faster than the Gauss-Newton type method (6) and the secant type method (48). Moreover, the method has the high convergence order $(1 + \sqrt{5})/2$ in the case of zero residual and does not require the calculation of derivatives of the nondifferentiable part of the operator. Therefore, the proposed method (5) solves the problem efficiently and fast.

Author Contributions

All authors contributed equally and significantly to the writing of this article. All authors read and approved the final manuscript.

Funding

This research received no external funding.

Acknowledgments

The authors would like to express their sincere gratitude to the referees for their valuable comments which have significantly improved the presentation of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Argyros, I.K. Convergence and Applications of Newton-Type Iterations; Springer: New York, NY, USA, 2008; 506p. [Google Scholar]
  2. Dennis, J.E.; Schnabel, R.B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; SIAM: Philadelphia, PA, USA, 1996. [Google Scholar]
  3. Ortega, J.M.; Rheinboldt, W.C. Iterative Solution of Nonlinear Equations in Several Variables; Academic Press: New York, NY, USA, 1970. [Google Scholar]
  4. Argyros, I.K.; Ren, H. A derivative free iterative method for solving least squares problems. Numer. Algorithms 2011, 58, 555–571. [Google Scholar]
  5. Ren, H.; Argyros, I.K. Local convergence of a secant type method for solving least squares problems. Appl. Math. Comput. 2010, 217, 3816–3824. [Google Scholar] [CrossRef]
  6. Shakhno, S.M.; Gnatyshyn, O.P. On an iterative algorithm of order 1.839... for solving the nonlinear least squares problems. Appl. Math. Comput. 2005, 161, 253–264. [Google Scholar] [CrossRef]
  7. Shakhno, S.M.; Gnatyshyn, O.P. Iterative-difference methods for solving nonlinear least-squares problem. In Progress in Industrial Mathematics at ECMI 98; Vieweg + Teubner Verlag: Stuttgart, Germany, 1999; pp. 287–294. [Google Scholar]
  8. Argyros, I.K.; Hilout, S. On an improved convergence analysis of Newton’s method. Appl. Math. Comput. 2013, 225, 372–386. [Google Scholar] [CrossRef]
  9. Shakhno, S.M.; Shunkin, Y.V. One combined method for solving nonlinear least squares problems. Visnyk Lviv Univ. Ser. Appl. Math. Inform. 2017, 25, 38–48. (In Ukrainian) [Google Scholar]
  10. Ulm, S. On generalized divided differences. Proc. Acad. Sci. Estonian SSR. Phys. Mathe. 1967, 16, 13–26. (In Russian) [Google Scholar]
  11. Cătinas, E. On some iterative methods for solving nonlinear equations. Revue d’Analyse Numérique et de Theorie de l’Approximation 1994, 23, 47–53. [Google Scholar]
  12. Shakhno, S.M.; Mel’nyk, I.V.; Yarmola, H.P. Analysis of the Convergence of a Combined Method for the Solution of Nonlinear Equations. J. Math. Sci. 2014, 201, 32–43. [Google Scholar] [CrossRef]
  13. Zabrejko, P.P.; Nguen, D.F. The majorant method in the theory of Newton-Kantorovich approximations and the Pták error estimates. Numer. Funct. Anal. Optim. 1987, 9, 671–686. [Google Scholar] [CrossRef]
  14. Argyros, I.K.; Magreñán, Á.A. A Contemporary Study of Iterative Methods: Convergence, Dynamics and Applications; Academic Press: London, UK, 2018. [Google Scholar]
Table 1. Number of iterations needed to solve the test problems.

Example | (x0, y0)   | Gauss-Newton Type (6) | Secant Type (48) | Combined Method (5)
1       | (1, 0)     | 19                    | 7                | 7
        | (3, 1)     | 22                    | 11               | 10
        | (0.5, 0.5) | 21                    | 18               | 10
2       | (1, 0)     | 19 *                  | 22               | 12
        | (3, 1)     | 22 *                  | 25               | 15
        | (0.5, 0.5) | 21 *                  | 19               | 13
