Article

A Class of Sparse Direct Broyden Method for Solving Sparse Nonlinear Equations

Huiping Cao and Jing Han

1 School of Science, Xi’an Polytechnic University, Xi’an 710048, China
2 College of Science, Central South University of Forestry and Technology, Changsha 410004, China
* Author to whom correspondence should be addressed.
Symmetry 2022, 14(8), 1552; https://doi.org/10.3390/sym14081552
Submission received: 11 June 2022 / Revised: 19 July 2022 / Accepted: 25 July 2022 / Published: 28 July 2022
(This article belongs to the Special Issue Symmetry in Nonlinear Functional Analysis and Optimization Theory II)

Abstract

In this paper, we present a sparse quasi-Newton method, called the sparse direct Broyden method, for solving sparse nonlinear equations. The method can be seen as a Broyden-like method and is a least change update satisfying the sparsity condition and the direct tangent condition simultaneously. Local and q-superlinear convergence is established based on the bounded deterioration property and the Dennis–Moré condition. By adopting a nonmonotone line search, we establish global and superlinear convergence. Moreover, the unit step length is essentially accepted. Numerical results demonstrate that the sparse direct Broyden method is effective and competitive for large-scale nonlinear equations.

1. Introduction

We consider the nonlinear equation

$F(x) = 0, \quad x \in \mathbb{R}^n, \qquad (1)$

where $F: \mathbb{R}^n \to \mathbb{R}^n$ is a continuously differentiable mapping. We denote by $F'(x)$ the Jacobian matrix of $F$ at $x$ and pay attention to the case where $F'(x)$ has a sparse or special structure. Specifically, one has

$F(x) = (F_1(x), F_2(x), \ldots, F_n(x))^T$

and

$F'(x) = (\nabla F_1(x), \nabla F_2(x), \ldots, \nabla F_n(x))^T.$

Nonlinear equations arise from many scientific and engineering problems and have applications in fields such as physics and biology [1].
The linearization of the nonlinear Equation (1) at an iterate $x_k$ is

$F(x) \approx F(x_k) + F'(x_k)(x - x_k) = 0;$

when $F'(x_k)$ is nonsingular, we obtain the Newton–Raphson method

$x_{k+1} = x_k - F'(x_k)^{-1} F(x_k).$
Newton’s method is theoretically attractive because it is locally quadratically convergent when the Jacobian matrix is nonsingular and Lipschitz continuous at the solution of $F(x) = 0$ [2]. However, at each iteration, Newton’s method must compute the exact Jacobian matrix to keep the quadratic convergence rate. To avoid computing the derivatives directly, quasi-Newton methods have been proposed, in which $F'(x_k)$ is approximated by a quasi-Newton matrix $B_k \in \mathbb{R}^{n \times n}$ at an acceptable reduction of the convergence rate. Thus, quasi-Newton methods generate an iteration as follows:
$x_{k+1} = x_k + \alpha_k d_k,$

where the step length $\alpha_k > 0$ is determined by some line search strategy, and $d_k$ is the quasi-Newton direction obtained by solving the subproblem

$F(x_k) + B_k d_k = 0.$

As an approximation to the Jacobian matrix $F'(x_k)$, the matrix $B_k$ usually satisfies the so-called quasi-Newton condition

$B_{k+1} s_k = y_k,$

where

$s_k = x_{k+1} - x_k = \alpha_k d_k, \quad y_k = F(x_{k+1}) - F(x_k).$

The quasi-Newton matrix $B_k$ can be updated by various quasi-Newton formulae, such as Broyden’s method, Powell’s symmetric Broyden method, the BFGS method, and the DFP method [3,4].
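To make the iteration concrete, the following minimal sketch (our illustration, not code from the paper) implements the classical Broyden update with full steps; the two-dimensional test function is a hypothetical example.

```python
import numpy as np

def broyden(F, x0, B0, tol=1e-8, max_iter=100):
    """Classical Broyden method with full steps (alpha_k = 1)."""
    x, B = x0.astype(float), B0.astype(float)
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            return x, k
        d = np.linalg.solve(B, -Fx)                # solve B_k d_k = -F(x_k)
        x_new = x + d                              # full step: s_k = d_k
        y = F(x_new) - Fx                          # y_k = F(x_{k+1}) - F(x_k)
        B = B + np.outer(y - B @ d, d) / (d @ d)   # Broyden rank-one update
        x = x_new
    return x, max_iter

# hypothetical example: F(x) = (x1^2 + x2^2 - 1, x1 - x2)
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
x_star, iters = broyden(F, np.array([1.0, 0.5]), np.eye(2))
```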
Quasi-Newton methods are popular for small- and medium-scale problems, since they possess local and superlinear convergence without computing the Jacobian [5,6,7]. However, when the dimension of the nonlinear equations is large, the matrix $B_k$ will generally be dense, and the computational cost and storage become high. Two considerations motivate us to study sparse quasi-Newton methods for solving sparse nonlinear equations in this paper. One is the fact that many nonlinear equations have sparse or specially structured Jacobians. The other is that suitably designed quasi-Newton updates can maintain the sparse structure of the Jacobian matrices. Thus, in this paper, we are interested in constructing a sparse quasi-Newton method for solving sparse nonlinear equations, where the Jacobian matrix $F'(x_k)$ has a sparse or special structure. Earlier work on sparse quasi-Newton methods was carried out by Schubert [8] and Toint [9]: Schubert modified Broyden’s method by updating $B_k$ row by row so that the sparsity is maintained, and Toint studied sparse and symmetric quasi-Newton updates. There have also been many other methods for solving large-scale nonlinear systems, such as limited-memory quasi-Newton methods [10,11], partitioned quasi-Newton methods [12,13,14], diagonal quasi-Newton methods [15,16], and the column updating method [17].
However, the global convergence of quasi-Newton methods for nonlinear equations is a relatively difficult topic, let alone the sparse case. This mainly results from the fact that the quasi-Newton direction may not be a descent direction of the merit function

$\theta(x) = \frac{1}{2}\|F(x)\|^2.$

Griewank [18] and Li and Fukushima [19] proposed line search techniques to establish the global convergence of quasi-Newton methods.
The purpose of our paper is to develop a sparse quasi-Newton method and study its local and global convergence. We consider Broyden’s method
$B_{k+1} = B_k + \frac{(y_k - B_k s_k) s_k^T}{s_k^T s_k}.$

If we replace $y_k$ with $F'(x_{k+1}) s_k$, we obtain the following update:

$B_{k+1} = B_k + \frac{(F'(x_{k+1}) - B_k) s_k s_k^T}{s_k^T s_k},$

which fulfills the direct tangent condition [20,21]

$B_{k+1} s_k = F'(x_{k+1}) s_k.$
We call the corresponding method the direct Broyden method. We then develop a sparse direct Broyden method, which enjoys the following nice properties: (a) the new sparse quasi-Newton update is a least change update satisfying the direct tangent condition; (b) the proposed method preserves the sparsity pattern of the original Jacobian matrix $F'(x)$ exactly; and (c) the sparse direct Broyden method is globally and superlinearly convergent. The limited numerical results presented here demonstrate that our algorithm outperforms Schubert’s method and the direct Broyden method in iteration counts, function evaluation counts, and Broyden’s mean convergence rate.
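As an illustration of the direct tangent condition, the sketch below (ours, with a finite-difference Jacobian–vector product standing in for the exact product; Section 4 obtains this product by automatic differentiation instead) performs one direct Broyden update.

```python
import numpy as np

def jvp_fd(F, x, v, eps=1e-7):
    """Forward-difference approximation of F'(x) v; the paper computes this
    product exactly via forward-mode automatic differentiation."""
    nv = np.linalg.norm(v)
    if nv == 0.0:
        return np.zeros_like(x)
    h = eps / nv
    return (F(x + h * v) - F(x)) / h

def direct_broyden_update(B, F, x_new, s):
    """One direct Broyden update: B+ = B + (F'(x_new) - B) s s^T / (s^T s),
    so that B+ s = F'(x_new) s (the direct tangent condition)."""
    Js = jvp_fd(F, x_new, s)                       # F'(x_{k+1}) s_k
    return B + np.outer(Js - B @ s, s) / (s @ s)
```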
The paper is organized as follows: in Section 2, we propose a sparse direct Broyden method, list its nice properties, and give the local and superlinear convergence of the full step sparse direct Broyden method. In Section 3, by adopting a nonmonotone line search, we prove the global and superlinear convergence of the proposed method; moreover, after finitely many iterations, the unit step length is always accepted. In Section 4, we report some preliminary numerical experiments testing the efficiency of the proposed method. In the last section, we give the conclusion.

2. A New Sparse Quasi-Newton Update and Local Convergence

We pay attention to the nonlinear Equation (1), whose Jacobian matrix is sparse or has a special structure. Firstly, we introduce some notation to describe the sparsity structure of the Jacobian, as in [22]. Define the sparsity features of the $i$th row of $F'(x)$ by

$V_i = \{ v \in \mathbb{R}^n : e_j^T v = 0 \ \text{for all} \ j \ \text{such that} \ (F'(x))_{ij} = e_i^T F'(x) e_j = 0 \ \text{for all} \ x \in \mathbb{R}^n \},$

where $e_j$ is the $j$th column of the identity matrix. Then, we can define the set of matrices that preserve the sparsity pattern of $F'(x)$:

$V = \{ A \in \mathbb{R}^{n \times n} : A^T e_i \in V_i, \ i = 1, 2, \ldots, n \}.$

Define the projection operator $S_i$, $i = 1, 2, \ldots, n$, which maps $\mathbb{R}^n$ onto $V_i$:

$(S_i(s_k))_j = (s_k^{(i)})_j = \begin{cases} (s_k)_j, & \text{if } v_j \neq 0, \\ 0, & \text{if } v_j = 0; \end{cases}$

that is, $S_i$ zeroes out the components of $s_k$ corresponding to the structural zeros in the $i$th row of the Jacobian.
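In code, with the sparsity pattern stored as a boolean matrix P (P[i, j] true when entry (i, j) of the Jacobian may be structurally nonzero), the projections $s_k^{(i)} = S_i(s_k)$ for all rows can be formed at once; this is our sketch, with P a hypothetical input.

```python
import numpy as np

def project_step(P, s):
    """Return an n-by-n array whose i-th row is s^{(i)} = S_i(s): the components
    of s at structural zeros of row i of the Jacobian pattern P are zeroed."""
    return np.where(P, s, 0.0)   # broadcasts s across the rows of P
```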
Similar to the derivation of Schubert’s method [8], we consider the sparse extension of the direct Broyden update [2]

$B_{k+1} = B_k + \frac{(F'(x_{k+1}) - B_k) s_k s_k^T}{s_k^T s_k},$

which fulfills the direct tangent condition

$B_{k+1} s_k = F'(x_{k+1}) s_k.$

Then, we can obtain a compact representation of the new sparse quasi-Newton update as

$B_{k+1} = B_k + \sum_{i=1}^{n} (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, e_i s_k^{(i)T}, \qquad (3)$

where the pseudo-inverse of $\alpha \in \mathbb{R}$ is defined by

$\alpha^{+} = \begin{cases} \alpha^{-1}, & \text{if } \alpha \neq 0, \\ 0, & \text{if } \alpha = 0. \end{cases}$
The new sparse quasi-Newton update (3) modifies the quasi-Newton matrix row by row so as to preserve the zero–nonzero structure of the Jacobian.
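A direct transcription of the update (3) (our sketch; B is kept dense for clarity, although only pattern entries ever change) reads as follows.

```python
import numpy as np

def sparse_direct_broyden_update(B, P, Js, s):
    """Update (3): row i of B gets (s_i' s_i)^+ * [e_i^T (F'(x_{k+1}) - B) s] * s_i^T,
    with s_i = S_i(s).  Js must contain the product F'(x_{k+1}) s."""
    B_new = B.copy()
    r = Js - B @ s                       # r[i] = e_i^T (F'(x_{k+1}) - B_k) s_k
    for i in range(B.shape[0]):
        si = np.where(P[i], s, 0.0)      # s^{(i)}_k = S_i(s_k)
        denom = si @ si
        if denom != 0.0:                 # pseudo-inverse: row unchanged when s^{(i)}_k = 0
            B_new[i] += (r[i] / denom) * si
    return B_new
```

Since the correction to row i is a multiple of $s_k^{(i)T}$, entries outside the sparsity pattern are never touched, which is exactly how the update preserves the zero–nonzero structure.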
Then, we can obtain a quasi-Newton method as

$x_{k+1} = x_k + \alpha_k d_k,$

where $d_k$ is obtained by solving the subproblem

$F(x_k) + B_k d_k = 0,$

and $B_k$ is updated by the sparse direct Broyden update

$B_{k+1} = B_k + \sum_{i=1}^{n} (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, e_i s_k^{(i)T}.$

We call the corresponding method the sparse direct Broyden method. When $\alpha_k \equiv 1$, we refer to it as the full step sparse direct Broyden method.
Lemma 1.
The matrix $B_{k+1}$ defined by (3) is the unique solution of the minimization problem

$\min \{ \|B - B_k\|_F : B \in V \cap Q(F'(x_{k+1}), s_k) \}, \qquad (4)$

where $Q(F'(x_{k+1}), s_k) = \{ B \in \mathbb{R}^{n \times n} : B s_k = F'(x_{k+1}) s_k \}$.
Proof. 
Firstly, we prove that $B_{k+1} \in V \cap Q(F'(x_{k+1}), s_k)$. For $i = 1, 2, \ldots, n$, multiply both sides of (3) by $e_i^T$ to obtain

$e_i^T B_{k+1} = e_i^T B_k + (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, s_k^{(i)T}.$

Since $B_k^T e_i \in V_i$ and $s_k^{(i)} \in V_i$, we have $B_{k+1}^T e_i \in V_i$, which implies $B_{k+1} \in V$.
If $s_k^{(i)} \neq 0$, one has

$e_i^T B_{k+1} s_k = e_i^T B_k s_k + (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, s_k^{(i)T} s_k. \qquad (5)$

According to the definition of the operator $S_i$, we have

$s_k^{(i)T} s_k^{(i)} = s_k^{(i)T} s_k \quad \text{and} \quad (s_k^{(i)T} s_k^{(i)})^{+} = (s_k^{(i)T} s_k^{(i)})^{-1}.$

Then, (5) can be written as

$e_i^T B_{k+1} s_k = e_i^T B_k s_k + e_i^T (F'(x_{k+1}) - B_k) s_k = e_i^T F'(x_{k+1}) s_k.$

If $s_k^{(i)} = 0$, we have

$e_i^T F'(x_{k+1}) s_k = e_i^T F'(x_{k+1}) s_k^{(i)} = 0,$

and, since $B_{k+1}^T e_i \in V_i$, also $e_i^T B_{k+1} s_k = e_i^T B_{k+1} s_k^{(i)} = 0$; thus $e_i^T B_{k+1} s_k = e_i^T F'(x_{k+1}) s_k$, which implies $B_{k+1} s_k = F'(x_{k+1}) s_k$. Therefore, $B_{k+1} \in Q(F'(x_{k+1}), s_k)$.
Next, we prove the uniqueness. Suppose that $\bar{B}_{k+1} \in V \cap Q(F'(x_{k+1}), s_k)$. Since $\bar{B}_{k+1} s_k = F'(x_{k+1}) s_k$ and $e_i^T (\bar{B}_{k+1} - B_k) s_k = e_i^T (\bar{B}_{k+1} - B_k) s_k^{(i)}$, one has

$B_{k+1} = B_k + \sum_{i=1}^{n} (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (\bar{B}_{k+1} - B_k) s_k \, e_i s_k^{(i)T}.$

Taking the Frobenius norm,

$\|B_{k+1} - B_k\|_F = \Big( \sum_{i=1}^{n} \|e_i^T (B_{k+1} - B_k)\|^2 \Big)^{1/2} = \Big( \sum_{i=1}^{n} \big\| (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (\bar{B}_{k+1} - B_k) s_k \, s_k^{(i)T} \big\|^2 \Big)^{1/2} = \Big( \sum_{i=1}^{n} \big| (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (\bar{B}_{k+1} - B_k) s_k^{(i)} \big|^2 \, \|s_k^{(i)}\|^2 \Big)^{1/2} \le \Big( \sum_{i=1, \, s_k^{(i)} \neq 0}^{n} \|e_i^T (\bar{B}_{k+1} - B_k)\|^2 \Big)^{1/2} \le \Big( \sum_{i=1}^{n} \|e_i^T (\bar{B}_{k+1} - B_k)\|^2 \Big)^{1/2} = \|\bar{B}_{k+1} - B_k\|_F,$

where the first inequality follows from the Cauchy–Schwarz inequality. Since the function $f(B) = \|B - B_k\|_F$ is strictly convex and the constraint set in (4) is convex, we obtain the uniqueness.    □
To analyze the local convergence of the full step sparse direct Broyden method, we first show that the bounded deterioration property

$\|B_{k+1} - F'(x_*)\|_F \le (1 + \alpha_1 \sigma_k) \|B_k - F'(x_*)\|_F + \alpha_2 \sigma_k \qquad (6)$

is satisfied with some constants $\alpha_1, \alpha_2 \ge 0$, where $\sigma_k = \max\{ \|x_k - x_*\|_2, \|x_{k+1} - x_*\|_2 \}$ and $\gamma_k = \sigma_k^2$.
Lemma 2.
Suppose that $F: \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable in an open and convex set $D_0$. Let $x_* \in D_0$ be a solution of (1) at which $F'(x_*)$ is nonsingular. Suppose that there exists $K = (k_1, k_2, \ldots, k_n) \in \mathbb{R}^n$ with $k_i \ge 0$, $i = 1, 2, \ldots, n$, such that

$\|e_i^T (F'(x) - F'(y))\| \le k_i \|x - y\|, \quad \forall x, y \in D_0.$

Then, one has the estimation

$\|B_{k+1} - F'(x_*)\|_F^2 \le \|B_k - F'(x_*)\|_F^2 - \frac{\|(B_k - F'(x_*)) s_k\|^2}{\|s_k\|^2} + L^2 \gamma_k,$

where $L = \|K\|_2$.
Proof. 
For the case $s_k = 0$, it is obvious that $F(x_k) = 0$ and $x_k = x_*$. For the case $s_k \neq 0$, subtracting $F'(x_*)$ from both sides of the update formula

$B_{k+1} = B_k + \sum_{i=1}^{n} (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, e_i s_k^{(i)T}$

and multiplying by $e_i^T$, $i = 1, 2, \ldots, n$, one has

$e_i^T (B_{k+1} - F'(x_*)) = e_i^T (B_k - F'(x_*)) + (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, s_k^{(i)T} = e_i^T (B_k - F'(x_*)) \big( I - (s_k^{(i)T} s_k^{(i)})^{+} s_k^{(i)} s_k^{(i)T} \big) + (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - F'(x_*)) s_k \, s_k^{(i)T}.$

Taking norms yields

$\|e_i^T (B_{k+1} - F'(x_*))\|^2 = \big\| e_i^T (B_k - F'(x_*)) \big( I - (s_k^{(i)T} s_k^{(i)})^{+} s_k^{(i)} s_k^{(i)T} \big) \big\|^2 + (s_k^{(i)T} s_k^{(i)})^{+} \big| e_i^T (F'(x_{k+1}) - F'(x_*)) s_k \big|^2 = \|e_i^T E_k\|^2 - (s_k^{(i)T} s_k^{(i)})^{+} |e_i^T E_k s_k^{(i)}|^2 + (s_k^{(i)T} s_k^{(i)})^{+} \big| e_i^T (F'(x_{k+1}) - F'(x_*)) s_k \big|^2 \le \|e_i^T E_k\|^2 - \frac{|e_i^T E_k s_k|^2}{\|s_k\|^2} + (s_k^{(i)T} s_k^{(i)})^{+} \big| e_i^T (F'(x_{k+1}) - F'(x_*)) s_k \big|^2, \qquad (7)$

where $E_k = B_k - F'(x_*)$.
If $s_k^{(i)} = 0$, then $(s_k^{(i)T} s_k^{(i)})^{+} = 0$. It is obvious that

$0 = (s_k^{(i)T} s_k^{(i)})^{+} \big| e_i^T (F'(x_{k+1}) - F'(x_*)) s_k \big|^2 \le k_i^2 \gamma_k.$

If $s_k^{(i)} \neq 0$, it follows that

$(s_k^{(i)T} s_k^{(i)})^{+} \big| e_i^T (F'(x_{k+1}) - F'(x_*)) s_k \big|^2 = (s_k^{(i)T} s_k^{(i)})^{+} \big| e_i^T (F'(x_{k+1}) - F'(x_*)) s_k^{(i)} \big|^2 \le \|e_i^T (F'(x_{k+1}) - F'(x_*))\|^2 \le k_i^2 \|x_{k+1} - x_*\|^2 \le k_i^2 \sigma_k^2 = k_i^2 \gamma_k.$

Thus, (7) reduces to

$\|e_i^T (B_{k+1} - F'(x_*))\|^2 \le \|e_i^T (B_k - F'(x_*))\|^2 - \frac{|e_i^T (B_k - F'(x_*)) s_k|^2}{\|s_k\|^2} + k_i^2 \gamma_k.$
Summing over $i$ yields

$\|B_{k+1} - F'(x_*)\|_F^2 \le \|B_k - F'(x_*)\|_F^2 - \frac{\|(B_k - F'(x_*)) s_k\|^2}{\|s_k\|^2} + L^2 \gamma_k. \qquad (8)$
   □
Based on the classical framework of Dennis and Moré, we give the following local convergence result, which can be proved similarly to the case of Broyden’s method [6,7].
Theorem 1.
Let the conditions in Lemma 2 hold. Then, there exist constants $\epsilon, \delta > 0$ such that, if $\|x_0 - x_*\|_2 < \epsilon$ and $\|B_0 - F'(x_*)\|_F < \delta$, the sequence $\{x_k\}$ is well defined and converges to $x_*$. Furthermore, the convergence rate is superlinear.
Proof. 
According to Lemma 2, one has

$\|B_{k+1} - F'(x_*)\|_F \le \|B_k - F'(x_*)\|_F + L \sigma_k,$

which means that the estimation (6) is satisfied with $\alpha_1 = 0$ and $\alpha_2 = L$. Then, we obtain the local and linear convergence of $\{x_k\}$.

Next, we show that the Dennis–Moré condition [7]

$\lim_{k \to \infty} \frac{\|(B_k - F'(x_*)) s_k\|}{\|s_k\|} = 0$

is satisfied. According to (8), one has

$\|B_{k+1} - F'(x_*)\|_F \le \Big( \|B_k - F'(x_*)\|_F^2 - \frac{\|(B_k - F'(x_*)) s_k\|^2}{\|s_k\|^2} \Big)^{1/2} + L \sigma_k;$

then, the result can be proved similarly to that in [7].    □

3. Algorithm and Global Convergence

In this section, by the use of the LF condition [19], we propose a globally convergent sparse direct Broyden method, whose specific steps are listed in Algorithm 1.
Algorithm 1 (Sparse direct Broyden method for solving sparse nonlinear equations)
  • Step 0. Given constants $\sigma_1, \sigma_2 > 0$ and $\rho, r \in (0, 1)$. Given a positive sequence $\{\eta_k\}$ satisfying
    $\sum_{k=0}^{\infty} \eta_k \le \eta < \infty. \qquad (10)$
    Given $x_0 \in \mathbb{R}^n$, a stopping tolerance $\epsilon > 0$, and a nonsingular matrix $B_0 \in \mathbb{R}^{n \times n}$. Set $k := 0$.
  • Step 1. Stop if $\|F(x_k)\| \le \epsilon$.
  • Step 2. Solve the subproblem
    $F(x_k) + B_k d_k = 0 \qquad (11)$
    to obtain the quasi-Newton direction $d_k$.
  • Step 3. If
    $\|F(x_k + d_k)\| \le \rho \|F(x_k)\| - \sigma_1 \|d_k\|^2, \qquad (12)$
    then let $\alpha_k := 1$ and go to Step 5. Else, go to Step 4.
  • Step 4. Set $\alpha_k = r^{i_k}$, where $i_k$ is the smallest nonnegative integer $i$ satisfying
    $\|F(x_k + r^i d_k)\| \le \|F(x_k)\| - \sigma_2 \|r^i d_k\|^2 + \eta_k \|F(x_k)\|, \qquad (13)$
    where $\eta_k$ is defined as in (10).
  • Step 5. Set $x_{k+1} := x_k + \alpha_k d_k$.
  • Step 6. Update $B_k$ to obtain $B_{k+1}$ by the sparse direct Broyden update
    $B_{k+1} = B_k + \sum_{i=1}^{n} (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, e_i s_k^{(i)T}. \qquad (14)$
    Set $k := k + 1$. Go to Step 1.
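For readers who want to experiment, the following is a minimal, self-contained sketch of Algorithm 1 in Python (our illustration, not the authors’ MATLAB implementation). Here `jvp(x, v)` is a user-supplied callback returning the product $F'(x)v$ (Remark 2 below discusses how to obtain it), `P` is the boolean Jacobian sparsity pattern, and the floor on the backtracking loop is our safeguard.

```python
import numpy as np

def sdbroyden(F, jvp, x0, P, eps=1e-5, rho=0.9, sigma1=1e-3, sigma2=1e-3,
              r=0.5, max_iter=200):
    """A sketch of Algorithm 1 (SDBroyden) with eta_k = 1/(k+1)^2 and B_0 = I."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(len(x))                         # B_0 = I (B_0 = F'(x_0) also works)
    Fx = F(x)
    for k in range(max_iter):
        if np.linalg.norm(Fx) <= eps:          # Step 1: convergence test
            return x, k
        d = np.linalg.solve(B, -Fx)            # Step 2: F(x_k) + B_k d_k = 0
        nF = np.linalg.norm(Fx)
        if np.linalg.norm(F(x + d)) <= rho * nF - sigma1 * (d @ d):
            alpha = 1.0                        # Step 3: unit step accepted
        else:                                  # Step 4: nonmonotone (LF) line search
            eta_k = 1.0 / (k + 1) ** 2
            alpha = 1.0
            while (np.linalg.norm(F(x + alpha * d))
                   > nF - sigma2 * alpha ** 2 * (d @ d) + eta_k * nF):
                alpha *= r
                if alpha < 1e-16:              # safeguard, not part of Algorithm 1
                    break
        s = alpha * d                          # Step 5: s_k = alpha_k * d_k
        x = x + s
        Fx = F(x)
        Js = jvp(x, s)                         # F'(x_{k+1}) s_k
        rvec = Js - B @ s                      # Step 6: sparse direct Broyden update (14)
        for i in range(len(x)):
            si = np.where(P[i], s, 0.0)        # s^{(i)}_k = S_i(s_k)
            denom = si @ si
            if denom != 0.0:
                B[i] += (rvec[i] / denom) * si
    return x, max_iter
```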
Remark 1.
It is noticed that the matrix given by the update formula (14) may be singular even when $B_k$ is nonsingular. In this case, we use a technique similar to that in [22,23] and give the following discussion about a nonsingular sparse direct Broyden update.
Set $H_0 = B_k$, and for $i = 1, 2, \ldots, n$, let

$H_i = H_0 + \sum_{j=1}^{i} \theta_k^j (s_k^{(j)T} s_k^{(j)})^{+} e_j^T (F'(x_{k+1}) - B_k) s_k \, e_j s_k^{(j)T} = H_{i-1} + \theta_k^i (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, e_i s_k^{(i)T}.$

Since $e_i^T H_0 = e_i^T H_1 = \cdots = e_i^T H_{i-1}$, we have

$H_i = H_{i-1} + \theta_k^i (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - H_{i-1}) s_k \, e_i s_k^{(i)T}.$

For a scalar $\alpha \in (0, 1)$, set $\alpha_n = \alpha^{1/n}$; then $\theta_k^i$ can be chosen such that

$|\det H_i| \ge \alpha_n |\det H_{i-1}|, \quad \theta_k^i \in \Big[ \frac{1 - \alpha_n}{1 + \alpha_n}, 1 \Big].$

Thus, $|\det B_{k+1}| \ge \alpha |\det B_k|$, and $\theta_k^i$ can be chosen so that

$B_{k+1} \ \text{is nonsingular and} \ |\theta_k^i - 1| \le \hat{\theta} < 1.$

Thus, we can define the sparse direct Broyden-like update formula as

$B_{k+1} = B_k + \sum_{i=1}^{n} \theta_k^i (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, e_i s_k^{(i)T}.$
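The flavor of this safeguard can be seen on a single rank-one correction: by the matrix determinant lemma, $\det(B + \theta u v^T) = \det(B)(1 + \theta v^T B^{-1} u)$, so $\theta$ can be pulled back from 1 just enough to keep the determinant away from zero. The sketch below is ours and only illustrates the idea for one correction term, not the full row-by-row recursion of the remark.

```python
import numpy as np

def damped_rank_one(B, u, v, alpha=0.1):
    """Return B + theta * u v^T with theta chosen so that
    |det(B + theta u v^T)| >= alpha * |det B|  (0 < alpha < 1)."""
    g = v @ np.linalg.solve(B, u)      # det(B + theta u v^T) = det(B) * (1 + theta g)
    h = 1.0 + g                        # determinant ratio at theta = 1
    theta = 1.0
    if abs(h) < alpha:                 # theta = 1 would (nearly) destroy nonsingularity
        sgn = 1.0 if h >= 0 else -1.0
        theta = (1.0 - sgn * alpha) / (1.0 - h)   # then |1 + theta g| = alpha
    return B + theta * np.outer(u, v)
```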
Remark 2.
It can be seen that the update formula (14) involves $F'(x_{k+1})$, but the full Jacobian does not need to be computed in practice; only the product $F'(x_{k+1}) s_k$ is required. Automatic differentiation is a chain-rule-based technique for evaluating the derivatives of functions defined by a high-level computer program with respect to their input variables. It has two basic modes of operation, the forward mode and the reverse mode. In the forward mode, the derivatives are propagated through the computation using the chain rule, while in the reverse mode the adjoint derivatives are propagated backwards. The forward and reverse modes of automatic differentiation make it possible to compute $F'(x) s$ and $\sigma^T F'(x)$ exactly to within machine accuracy for given vectors $x$, $s$, and $\sigma$.
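Forward-mode AD tools are not always at hand; as a stand-in that shares the machine-accuracy property for real-analytic F, the complex-step trick below (our sketch, and a different technique from the automatic differentiation used in the paper) also delivers $F'(x)s$ without forming the Jacobian.

```python
import numpy as np

def jvp_complex_step(F, x, s, h=1e-20):
    """Directional derivative F'(x) s via the complex-step trick: for real-analytic F,
    F(x + i h s) = F(x) + i h F'(x) s + O(h^2), so the imaginary part divided by h
    is accurate to machine precision (no subtractive cancellation).  F must accept
    complex input."""
    return np.imag(F(x + 1j * h * s)) / h
```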
  To establish the global convergence, we need the following conditions.
Assumption 1.
(1) F is continuously differentiable on Ω, which is a bounded level set defined by

$\Omega = \{ x \in \mathbb{R}^n : \|F(x)\| \le e^{\eta} \|F(x_0)\| \}.$

(2) $F'(x)$ is Lipschitz continuous on Ω with Lipschitz constant $L > 0$:

$\|F'(x) - F'(y)\| \le L \|x - y\|, \quad \forall x, y \in \Omega.$

(3) $F'(x)$ is nonsingular for any $x \in \Omega$.
   First, we give the following important lemmas.
Lemma 3.
The sequence $\{x_k\}$ generated by Algorithm 1 is contained in Ω. Moreover, it holds that

$\sum_{k=0}^{\infty} \|s_k\|^2 < \infty, \qquad (15)$

and the sequence $\{\|F(x_k)\|\}$ converges.
Proof. 
According to the line searches (12) and (13), one has, for any k,

$\|F(x_{k+1})\| \le (1 + \eta_k) \|F(x_k)\| \le \Big[ \prod_{j=0}^{k} (1 + \eta_j) \Big] \|F(x_0)\| \le \Big[ \frac{1}{k+1} \sum_{j=0}^{k} (1 + \eta_j) \Big]^{k+1} \|F(x_0)\| = \Big[ 1 + \frac{1}{k+1} \sum_{j=0}^{k} \eta_j \Big]^{k+1} \|F(x_0)\| \le \Big( 1 + \frac{\eta}{k+1} \Big)^{k+1} \|F(x_0)\| \le e^{\eta} \|F(x_0)\|,$

where the third inequality follows from the arithmetic–geometric mean inequality. Thus, $\{x_k\}$ is contained in the level set Ω, and the sequence $\{\|F(x_k)\|\}$ is bounded.

On the basis of (12) and (13), we have, for each k,

$\sigma_0 \|s_k\|^2 = \sigma_0 \|x_{k+1} - x_k\|^2 \le \|F(x_k)\| - \|F(x_{k+1})\| + \eta_k \|F(x_k)\|,$

where $\sigma_0 = \min\{\sigma_1, \sigma_2\}$. We obtain (15) by summing both sides over k from 0 to ∞.
Finally, since $\{\|F(x_k)\|\}$ satisfies

$\|F(x_k + \alpha_k d_k)\| \le (1 + \eta_k) \|F(x_k)\|,$

and $\{\eta_k\}$ satisfies

$\sum_{k=0}^{\infty} \eta_k \le \eta < \infty,$

we obtain the convergence of $\{\|F(x_k)\|\}$. □
Denote

$\delta_k = \frac{\|(F'(x_{k+1}) - B_k) s_k\|}{\|s_k\|} = \frac{\|F'(x_{k+1}) s_k + \alpha_k F(x_k)\|}{\|s_k\|}.$
Lemma 4.
Suppose that the sequence $\{x_k\}$ is generated by Algorithm 1 and that $F'(x)$ is Lipschitz continuous with a common Lipschitz constant $L > 0$. If

$\sum_{k=0}^{\infty} \|s_k\|^2 < \infty,$

then we have

$\lim_{t \to \infty} \frac{1}{t} \sum_{k=0}^{t-1} \delta_k^2 = 0. \qquad (16)$

In addition, there exists a subsequence of $\{\delta_k\}$ tending to zero. If

$\sum_{k=0}^{\infty} \|s_k\| < \infty,$

then we have

$\sum_{k=0}^{\infty} \delta_k^2 < \infty. \qquad (17)$

In addition, the whole sequence $\{\delta_k\}$ converges to zero.
Proof. 
According to the update (14), we have

$e_i^T B_{k+1} = e_i^T B_k + (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, s_k^{(i)T}.$

Subtracting $e_i^T F'(x_{k+1})$, we obtain

$e_i^T (B_{k+1} - F'(x_{k+1})) = e_i^T (B_k - F'(x_{k+1})) + (s_k^{(i)T} s_k^{(i)})^{+} e_i^T (F'(x_{k+1}) - B_k) s_k \, s_k^{(i)T} = e_i^T (B_k - F'(x_{k+1})) \big( I - (s_k^{(i)T} s_k^{(i)})^{+} s_k s_k^{(i)T} \big) = e_i^T (B_k - F'(x_{k+1})) \big( I - (s_k^{(i)T} s_k^{(i)})^{+} s_k^{(i)} s_k^{(i)T} \big).$

Taking norms yields

$\|e_i^T (B_{k+1} - F'(x_{k+1}))\|^2 = \big\| e_i^T (B_k - F'(x_{k+1})) \big( I - (s_k^{(i)T} s_k^{(i)})^{+} s_k^{(i)} s_k^{(i)T} \big) \big\|^2 = \|e_i^T (B_k - F'(x_{k+1}))\|^2 - (s_k^{(i)T} s_k^{(i)})^{+} \big( e_i^T (B_k - F'(x_{k+1})) s_k \big)^2 \le \|e_i^T (B_k - F'(x_{k+1}))\|^2 - \frac{|e_i^T (B_k - F'(x_{k+1})) s_k|^2}{\|s_k\|^2}.$

Since $\|B_{k+1} - F'(x_{k+1})\|_F^2 = \sum_{i=1}^{n} \|e_i^T (B_{k+1} - F'(x_{k+1}))\|^2$, summing from $i = 1$ to $n$ yields

$\|B_{k+1} - F'(x_{k+1})\|_F^2 \le \sum_{i=1}^{n} \Big( \|e_i^T (B_k - F'(x_{k+1}))\|^2 - \frac{|e_i^T (B_k - F'(x_{k+1})) s_k|^2}{\|s_k\|^2} \Big) = \|B_k - F'(x_{k+1})\|_F^2 - \frac{\|(B_k - F'(x_{k+1})) s_k\|^2}{\|s_k\|^2} = \|B_k - F'(x_{k+1})\|_F^2 - \delta_k^2.$
Denote

$D_k = B_k - F'(x_k) \quad \text{and} \quad E_k = F'(x_{k+1}) - F'(x_k).$

Then, one has, for $k \ge 1$,

$\|D_k\|_F \le \|B_{k-1} - F'(x_k)\|_F \le \|D_{k-1}\|_F + \|E_{k-1}\|_F \le \|D_0\|_F + \sum_{j=0}^{k-1} \|E_j\|_F,$

where the first inequality follows from the inequality above, and

$\delta_k^2 \le \|B_k - F'(x_{k+1})\|_F^2 - \|B_{k+1} - F'(x_{k+1})\|_F^2 = \|D_k - E_k\|_F^2 - \|D_{k+1}\|_F^2 \le \|D_k\|_F^2 - \|D_{k+1}\|_F^2 + \|E_k\|_F^2 + 2 \|E_k\|_F \|D_k\|_F \le \|D_k\|_F^2 - \|D_{k+1}\|_F^2 + \|E_k\|_F^2 + 2 \|E_k\|_F \Big( \|D_0\|_F + \sum_{j=0}^{k-1} \|E_j\|_F \Big).$

Summing both sides from $k = 0$ to $t - 1$, we have, for $1 \le p < t$,

$\sum_{k=0}^{t-1} \delta_k^2 \le \|D_0\|_F^2 + \sum_{k=0}^{t-1} \|E_k\|_F^2 + 2 \sum_{k=1}^{t-1} \|E_k\|_F \Big( \|D_0\|_F + \sum_{j=0}^{k-1} \|E_j\|_F \Big) \le \Big( \|D_0\|_F + \sum_{k=0}^{t-1} \|E_k\|_F \Big)^2 = \Big( \|D_0\|_F + \sum_{k=0}^{p-1} \|E_k\|_F + \sum_{k=p}^{t-1} \|E_k\|_F \Big)^2 \le 2 \Big( \|D_0\|_F + \sum_{k=0}^{p-1} \|E_k\|_F \Big)^2 + 2 \Big( \sum_{k=p}^{t-1} \|E_k\|_F \Big)^2 \le 2 \Big( \|D_0\|_F + \sum_{k=0}^{p-1} \|E_k\|_F \Big)^2 + 2 (t - p) \sum_{k=p}^{t-1} \|E_k\|_F^2. \qquad (18)$
Dividing both sides by t and letting $t \to \infty$, we have

$\limsup_{t \to \infty} \frac{1}{t} \sum_{k=0}^{t-1} \delta_k^2 \le 2 \lim_{t \to \infty} \frac{t - p}{t} \sum_{k=p}^{t-1} \|E_k\|_F^2 \le 2 \sum_{k=p}^{\infty} \|E_k\|_F^2.$

If $\sum_{k=0}^{\infty} \|s_k\|^2 < \infty$, then the Lipschitz continuity of $F'(x)$ (which gives $\sum_{k=0}^{\infty} \|E_k\|_F^2 < \infty$, so that the right-hand side tends to zero as $p \to \infty$) together with the last inequality implies

$\lim_{t \to \infty} \frac{1}{t} \sum_{k=0}^{t-1} \delta_k^2 = 0.$

Then, there is a subsequence of $\{\delta_k\}$ tending to zero. If $\sum_{k=0}^{\infty} \|s_k\| < \infty$, then (17) follows from (18). Moreover, the whole sequence $\{\delta_k\}$ converges to zero. This completes the proof. □
Theorem 2.
Let the conditions in Assumption 1 hold. Then, the sequence $\{x_k\}$ generated by Algorithm 1 converges to the unique solution $x_*$ of (1).
Proof. 
We first verify that

$\liminf_{k \to \infty} \|F(x_k)\| = 0.$

According to Lemma 3, the sequence $\{\|F(x_k)\|\}$ converges. Thus, we only need to prove that there is an accumulation point of $\{x_k\}$ that is the unique solution of (1). If there are infinitely many $\alpha_k$ determined by the line search condition (12), then

$\|F(x_{k+1})\| \le \rho \|F(x_k)\|$

holds for infinitely many k. This indicates $\liminf_{k \to \infty} \|F(x_k)\| = 0$.
Now consider the case where only finitely many $\alpha_k$ are determined by the line search condition (12). By (15) and Lemma 4, there is a subsequence $\{\delta_k\}_{k \in K}$ converging to zero. Since $\{x_k\}_{k \in K}$ is bounded, we may assume without loss of generality that $\{x_k\}_{k \in K} \to x_*$. Hence, $\{F'(x_{k+1})\}_{k \in K}$ tends to $F'(x_*)$, and there exists a constant $C_1$ such that $\|F'(x_{k+1})^{-1}\| \le C_1$ for all sufficiently large $k \in K$. According to the subproblem (11) and the definition of $\delta_k$, one has

$\|d_k\| = \big\| F'(x_{k+1})^{-1} \big( (F'(x_{k+1}) - B_k) d_k - F(x_k) \big) \big\| \le \|F'(x_{k+1})^{-1}\| \big( \|(F'(x_{k+1}) - B_k) d_k\| + \|F(x_k)\| \big) \le C_1 (\delta_k \|d_k\| + \|F(x_k)\|),$

which indicates that there exists a constant $M_1$ such that

$\|d_k\| \le M_1 \|F(x_k)\|$

holds for all sufficiently large $k \in K$. Thus, the subsequence $\{d_k\}_{k \in K}$ is bounded, and we can assume that $\{d_k\}_{k \in K} \to d_*$. Since $\|(F'(x_{k+1}) - B_k) d_k\| = \delta_k \|d_k\|$, we have

$B_k d_k \to F'(x_*) d_*, \quad k \to \infty, \ k \in K.$

Taking the limit in the subproblem (11) as $k \to \infty$, $k \in K$, one has

$F'(x_*) d_* + F(x_*) = 0. \qquad (20)$
Denote $\alpha_* = \limsup_{k \to \infty, \, k \in K} \alpha_k$. It is clear that $\alpha_* \ge 0$ and, by (15), $\alpha_* d_* = 0$. If $\alpha_* > 0$, then $d_* = 0$; hence, it follows from (20) that $F(x_*) = 0$. If $\alpha_* = 0$, or equivalently $\lim_{k \to \infty, \, k \in K} \alpha_k = 0$, then, according to the line search rule, when $k \in K$ is sufficiently large, $\alpha_k < 1$ and hence

$\|F(x_k + r^{-1} \alpha_k d_k)\| - \|F(x_k)\| > -\sigma_2 \|r^{-1} \alpha_k d_k\|^2. \qquad (21)$

Multiplying both sides of (21) by $\big( \|F(x_k + r^{-1} \alpha_k d_k)\| + \|F(x_k)\| \big) / (r^{-1} \alpha_k)$ and taking the limit as $k \to \infty$, $k \in K$, we obtain

$F(x_*)^T F'(x_*) d_* \ge 0.$

Combined with (20), which gives $F(x_*)^T F'(x_*) d_* = -\|F(x_*)\|^2$, we have $F(x_*) = 0$. This completes the proof. □
In what follows, we show that, when k is sufficiently large, the unit step length $\alpha_k = 1$ is accepted.
Theorem 3.
Suppose Assumption 1 holds and $\{x_k\}$ is generated by Algorithm 1. Then, there exist a constant $\delta > 0$ and an index $\bar{k}$ such that $\alpha_k = 1$ whenever $\delta_k \le \delta$ and $k \ge \bar{k}$. Furthermore, the inequality (12) holds for all $k \ge \bar{k}$ satisfying $\delta_k \le \delta$.
Proof. 
According to Theorem 2, $\{x_k\}$ converges to the solution $x_*$ of (1). Then, there exists a constant $M_2 > 0$ such that $\|F'(x_{k+1})^{-1}\| \le M_2$ for all k sufficiently large. Moreover, it can be deduced as in the proof of Theorem 2 that there exist constants $\bar{\delta} > 0$ and $M_3 > 0$ such that, when $\delta_k \le \bar{\delta}$ and k is large enough,

$\|d_k\| \le M_3 \|F(x_k)\|.$

By the subproblem (11), one has

$F'(x_{k+1})(x_k + d_k - x_*) = F'(x_{k+1})(x_k - x_*) + (F'(x_{k+1}) - B_k) d_k - F(x_k) = (F'(x_{k+1}) - F'(x_*))(x_k - x_*) + (F'(x_{k+1}) - B_k) d_k - \big( F(x_k) - F(x_*) - F'(x_*)(x_k - x_*) \big).$

This implies

$\|x_k + d_k - x_*\| \le \|F'(x_{k+1})^{-1}\| \big( \|F'(x_{k+1}) - F'(x_*)\| \, \|x_k - x_*\| + \|(F'(x_{k+1}) - B_k) d_k\| + \|F(x_k) - F(x_*) - F'(x_*)(x_k - x_*)\| \big) \le M_2 \big( o(\|x_k - x_*\|) + \delta_k \|d_k\| \big) \le M_2 \big( o(\|x_k - x_*\|) + \delta_k M_3 \|F(x_k) - F(x_*)\| \big) \le M_2 \big( o(\|x_k - x_*\|) + \delta_k M_3 M_4 \|x_k - x_*\| \big),$

where $M_4$ is an upper bound of $\|F'(x)\|$ on the level set Ω. Then, by the last inequality, we have

$\|F(x_k + d_k)\| = \|F(x_k + d_k) - F(x_*)\| \le M_4 \|x_k + d_k - x_*\| \le M_2 M_4 \big( o(\|x_k - x_*\|) + \delta_k M_3 M_4 \|x_k - x_*\| \big). \qquad (22)$

On the other hand, by the nonsingularity of $F'(x_*)$ and the convergence of $\{x_k\}$, there is a constant $m > 0$ such that the inequality

$\|F(x_k)\| = \|F(x_k) - F(x_*)\| \ge m \|x_k - x_*\| \qquad (23)$

holds for all k sufficiently large. Thus, we deduce from (22) and (23) that, when $\delta_k \le \bar{\delta}$,

$\|F(x_k + d_k)\| - \rho \|F(x_k)\| + \sigma_1 \|d_k\|^2 \le M_2 M_4 \big( o(\|x_k - x_*\|) + \delta_k M_3 M_4 \|x_k - x_*\| \big) - \rho m \|x_k - x_*\| + \sigma_1 M_3^2 M_4^2 \|x_k - x_*\|^2 \le -(\rho m - M_2 M_3 M_4^2 \delta_k) \|x_k - x_*\| + o(\|x_k - x_*\|).$

Letting $\delta = \min\{\bar{\delta}, \frac{1}{2} \rho m (M_2 M_3 M_4^2)^{-1}\}$, the right-hand side is negative for all sufficiently large k with $\delta_k \le \delta$, so (12) holds and $\alpha_k = 1$. This completes the proof. □
The following theorem shows that Algorithm 1 is superlinearly convergent.
Theorem 4.
Let Assumption 1 hold. Then, the sequence $\{x_k\}$ generated by Algorithm 1 converges to the unique solution $x_*$ of (1) superlinearly.
Proof. 
Let δ and $\bar{k}$ be as defined in Theorem 3. According to Lemma 4, there exists an index $\tilde{k}$ such that

$\frac{1}{k} \sum_{j=0}^{k-1} \delta_j^2 \le \frac{1}{2} \delta^2$

holds for all $k \ge \tilde{k}$, which implies that, in this case, there are at least $k/2$ indices $j \le k$ satisfying $\delta_j \le \delta$. Let $k' = \max\{\bar{k}, \tilde{k}\}$. Moreover, on the basis of Theorem 3, for any $k \ge 2k'$, there are at least $k/2 - k'$ indices $j \le k$ for which $\alpha_j = 1$ and

$\|F(x_{j+1})\| = \|F(x_j + d_j)\| \le \rho \|F(x_j)\|. \qquad (24)$

Define $J_k = \{ j \le k : \text{(24) holds} \}$ and let $|J_k|$ denote the number of elements in $J_k$. Then, $|J_k| \ge k/2 - k' - 1$. On the other hand, for each $j \notin J_k$, we have

$\|F(x_{j+1})\| \le (1 + \eta_j) \|F(x_j)\|. \qquad (25)$

Multiplying the inequalities (24) for $j \in J_k$ and (25) for $j \notin J_k$ from $j = k'$ to $k$ yields

$\|F(x_{k+1})\| \le \rho^{|J_k|} \Big[ \prod_{j=k'}^{k} (1 + \eta_j) \Big] \|F(x_{k'})\| \le \rho^{k/2 - k' - 1} e^{\eta} \|F(x_{k'})\|.$

Thus, we obtain $\sum_{k=0}^{\infty} \|F(x_k)\| < \infty$. This together with (23) implies $\sum_{k=0}^{\infty} \|x_k - x_*\| < \infty$, and hence

$\sum_{k=0}^{\infty} \|s_k\| < \infty.$

Then, it follows from Lemma 4 that

$\lim_{k \to \infty} \delta_k = 0.$

Consequently, according to (22) and (23), the sequence $\{x_k\}$ converges to $x_*$ superlinearly. □

4. Numerical Experiments

In this section, we compare the SDBroyden method with Schubert’s method [8]. We also compare the SDBroyden method with the direct Broyden method and Newton’s method. All the methods are implemented in MATLAB R2018a and run on an iMac with 16 GB of memory. The product $F'(x)s$ is computed by the automatic differentiation tool TOMLAB/MAD [24].
The test problems are listed in Appendix A. The Jacobian matrices of the tested problems have different structures: diagonal (Problems 1, 2), tridiagonal (Problems 3, 4, 5, 6, 7, 8), block-diagonal (Problems 9, 10, 11), and special structure (Problem 12). The parameters in Algorithm 1 are specified as in [19]:

$\epsilon = 10^{-5}, \quad \rho = 0.9, \quad \sigma_1 = \sigma_2 = 0.001, \quad r = 0.45, \quad \eta_k = \frac{1}{(k+1)^2}.$
For all the methods, we also stop the iteration if the number of iterations exceeds 200. We report the numerical performance of the four methods in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6 and Table 7 and Figure 1 and Figure 2, where the meanings of the entries are as follows:
Schubert: Schubert’s method;
SDBroyden: sparse direct Broyden method with the LF condition;
Pro: the number of the test problem;
Dim: the dimension of the problem;
Ite: the total number of iterations;
Nfun: the total number of function evaluations;
R: Broyden’s mean convergence rate;
Time(s): CPU time in seconds;
Fail: the stopping criterion was not satisfied.
(1) In the first set of our numerical experiments, we test the performance of the SDBroyden method and Schubert’s method. When $B_0$ is chosen as the unit matrix I, the results are listed in Table 1 and Table 2, respectively. For the SDBroyden method and Schubert’s method, we compute the problems with dimensions n = 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10,000, 20,000, 50,000, but we report only a subset of the dimensions (n = 10, 100, 1000, 2000, 10,000, 20,000, 50,000) to improve the readability of the corresponding tables. Both methods fail on two problems (3, 8). Considering the iteration counts, the SDBroyden method is more efficient than Schubert’s method on seven problems (1, 2, 4, 5, 10, 11, 12) and equivalent to Schubert’s method on three problems (6, 7, 9). For the total number of function evaluations, the SDBroyden method has better performance on seven problems (1, 2, 4, 9, 10, 11, 12), while Schubert’s method needs fewer function evaluations on one problem (5), and the two methods are equivalent on two problems (6, 7). As for Broyden’s mean convergence rate, SDBroyden works better on seven problems (1, 2, 4, 6, 10, 11, 12) and equals Schubert’s method on three problems (5, 7, 9). It can be seen that the SDBroyden method outperforms Schubert’s method in iteration counts, function evaluation counts, and Broyden’s mean convergence rate.
When $B_0$ is chosen as the exact Jacobian matrix $F'(x_0)$, the results are given in Table 3 and Table 4, respectively. Both methods solve the 12 problems successfully. The SDBroyden method needs fewer iterations than Schubert’s method on seven problems (1, 2, 4, 5, 8, 10, 11) and the same number of iterations on five problems (3, 6, 7, 9, 12). For the total number of function evaluations, the SDBroyden method is more efficient than Schubert’s method on six problems (1, 2, 4, 5, 8, 11) and equivalent on six problems (3, 6, 7, 9, 10, 12). As for Broyden’s mean convergence rate, SDBroyden has better performance on nine problems (1, 2, 3, 4, 5, 8, 10, 11, 12) and equals Schubert’s method on two problems (7, 9); the two methods are competitive on one problem (6). It also can be seen that the SDBroyden method outperforms Schubert’s method in terms of the number of iterations, the number of function evaluations, and Broyden’s mean convergence rate. Meanwhile, the CPU time of the SDBroyden method is mostly higher than that of Schubert’s method.
Performance profiles [25] are used to compare the numerical performance. For a given set of solvers S and set of problems P, let $t_{p,s}$ be the number of iterations, the number of function evaluations, or another performance measure required to solve problem p by solver s. Then, define the performance ratio as

$r_{p,s} = \frac{t_{p,s}}{\min\{ t_{p,q} : q \in S \}},$

whose distribution function is defined as

$\rho_s(t) = \frac{1}{N_p} \, \text{size}\{ p \in P : r_{p,s} \le t \},$

where $N_p$ is the number of problems in the set P. Thus, $\rho_s: \mathbb{R} \to [0, 1]$ is the probability for solver $s \in S$ that its performance ratio $r_{p,s}$ is within a factor $t \in \mathbb{R}$ of the best possible ratio. According to the definition of performance profiles, the top curve corresponds to the best solver.
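For reference, performance profiles of this kind can be computed and plotted as follows (our sketch; the cost matrix T is a hypothetical input with one row per problem and one column per solver, and np.inf marking failures).

```python
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(T, labels, t_max=10.0):
    """Dolan-More performance profiles [25].  T[p, s] = cost (iterations,
    function evaluations, ...) of solver s on problem p."""
    n_prob, n_solv = T.shape
    best = T.min(axis=1, keepdims=True)            # best cost per problem
    R = T / best                                   # performance ratios r_{p,s}
    ts = np.linspace(1.0, t_max, 400)
    for s in range(n_solv):
        rho = [(R[:, s] <= t).mean() for t in ts]  # rho_s(t): fraction within factor t
        plt.step(ts, rho, where='post', label=labels[s])
    plt.xlabel('t'); plt.ylabel('rho_s(t)'); plt.legend(); plt.show()
```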
In Figure 1, the performance of the two methods, the SDBroyden method and Schubert’s method, is evaluated relative to the number of iterations and the number of function evaluations. Figure 1 indicates that SDBroyden performs better than Schubert’s method in both measures.
(2) In the second set of numerical experiments, we compare the SDBroyden method with the direct Broyden quasi-Newton method (DBQN). The results of the DBQN method with $B_0 = I$ are given in Table 5. The DBQN method fails on four problems (3, 5, 8, 9). For the number of iterations and the number of function evaluations, the SDBroyden method needs fewer on five problems (2, 4, 6, 7, 11) and equals DBQN on three problems (1, 10, 12). For Broyden’s mean convergence rate, the SDBroyden method performs better on five problems (2, 4, 6, 7, 11), equals DBQN on two problems (1, 10), and performs worse on one problem (12).
The results of the DBQN method with $B_0 = F'(x_0)$ are listed in Table 6. The DBQN method fails on one problem (5). For the number of iterations, SDBroyden is better than the DBQN method on seven problems (2, 4, 6, 8, 10, 11, 12) and equivalent on three problems (1, 3, 9), while DBQN performs better on one problem (7). For the number of function evaluations and Broyden’s mean convergence rate, SDBroyden is better on six problems (2, 4, 6, 8, 11, 12), while the DBQN method performs better on one problem (10), and the two methods coincide on three problems (3, 9, 10).
In Figure 2, we also give the comparison of the SDBroyden method and DBQN method relative to the number of iterations and number of function evaluations. It can be seen that the top curve corresponds to the SDBroyden method. This means that the SDBroyden method has satisfactory performance in terms of number of iterations and number of function evaluations when compared with its dense version.
(3) In the third set of our numerical experiments, we compare the SDBroyden method with Newton’s method; the results are listed in Table 7. Newton’s method fails on three problems (5, 8, 10). One can see that the SDBroyden method requires slightly more iterations than Newton’s method in most tests and has no significant advantage in the number of iterations, the number of function evaluations, or Broyden’s mean convergence rate. However, the CPU time for Newton’s method is much higher than that of the SDBroyden method; moreover, the CPU time of Newton’s method increases significantly faster than that of the quasi-Newton methods. Thus, the SDBroyden method is well suited to solving large-scale nonlinear equations.

5. Conclusions

We have developed a sparse direct Broyden quasi-Newton method for solving large-scale nonlinear equations, which is the sparse case of the direct Broyden method and is an extension of Broyden’s method. The method approximates the Jacobian matrix by least change updating and satisfies the sparsity condition and direct tangent condition simultaneously. We show that the method is locally and superlinearly convergent. Combined with a nonmonotone line search, we also establish the global and superlinear convergence. In particular, the unit step length is essentially accepted. Our numerical results show that the proposed method is effective and competitive for sparse nonlinear equations.

Author Contributions

Conceptualization, H.C.; methodology, H.C. and J.H.; software, H.C.; formal analysis, H.C.; writing—original draft preparation, H.C. and J.H.; writing—review and editing, H.C. and J.H. All authors have read and agreed to the published version of the manuscript.

Funding

The work is supported by the National Natural Science Foundation of China, Grant No. 11701577; the Natural Science Foundation of Hunan Province, China, Grant No. 2019JJ51002, 2020JJ5960; the Scientific Research Foundation of Hunan Provincial Education Department, China, Grant No. 18C0253; and the Natural Science Foundation of Shaanxi Province, China, Grant No. 2022JQ006.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used to support the research plan and all the computer codes used in this study are available from the corresponding author upon request.

Acknowledgments

The authors would like to thank the four anonymous referees for their careful reading of this paper and their comments to improve the quality of this paper. The authors also would like to thank the corresponding editor for providing insightful suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In this appendix, we list the test problems with their initial guesses $x_0$, where

$F(x) = (f_1(x), f_2(x), \ldots, f_n(x))^T,$

and references [26,27,28,29,30,31,32,33,34,35] are cited in Appendix A.
Problem 1. Logarithmic function [26]

$F_i(x) = \ln(x_i + 1) - \frac{x_i}{n}, \quad i = 1, 2, \ldots, n; \quad x_0 = (1, 1, \ldots, 1)^T.$
Problem 2. Strictly convex function [27]. $F(x)$ is the gradient of $h(x) = \sum_{i=1}^{n} (e^{x_i} - x_i)$:

$F_i(x) = e^{x_i} - 1, \quad i = 1, 2, \ldots, n; \quad x_0 = \Big( \frac{1}{n}, \frac{2}{n}, \ldots, 1 \Big)^T.$
Problem 3. Broyden tridiagonal function [28]

$F_1(x) = (3 - 0.5 x_1) x_1 - 2 x_2 + 1,$
$F_i(x) = (3 - 0.5 x_i) x_i - x_{i-1} - 2 x_{i+1} + 1, \quad i = 2, \ldots, n - 1,$
$F_n(x) = (3 - 0.5 x_n) x_n - x_{n-1} + 1;$
$x_0 = (3, 3, \ldots, 3)^T.$
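As an example of how these problems can be coded, a vectorized residual for Problem 3 might look as follows (our sketch; the tridiagonal pattern needed by the SDBroyden update is built alongside it).

```python
import numpy as np

def broyden_tridiagonal(x):
    """Problem 3: Broyden tridiagonal function; the Jacobian is tridiagonal."""
    F = (3.0 - 0.5 * x) * x + 1.0
    F[:-1] -= 2.0 * x[1:]      # -2 x_{i+1} terms (rows 1..n-1)
    F[1:] -= x[:-1]            # -x_{i-1} terms (rows 2..n)
    return F

def tridiagonal_pattern(n):
    """Boolean sparsity pattern with the three central diagonals."""
    P = np.eye(n, dtype=bool)
    P |= np.eye(n, k=1, dtype=bool) | np.eye(n, k=-1, dtype=bool)
    return P

x0 = np.full(100, 3.0)         # x_0 = (3, ..., 3)^T as given in the problem list
```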
Problem 4. Trigexp function [28]

$F_1(x) = 3 x_1^3 + 2 x_2 - 5 + \sin(x_1 - x_2) \sin(x_1 + x_2),$
$F_i(x) = -x_{i-1} e^{(x_{i-1} - x_i)} + x_i (4 + 3 x_i^2) + 2 x_{i+1} + \sin(x_i - x_{i+1}) \sin(x_i + x_{i+1}) - 8, \quad i = 2, \ldots, n - 1,$
$F_n(x) = -x_{n-1} e^{(x_{n-1} - x_n)} + 4 x_n - 3;$
$x_0 = (0, 0, \ldots, 0)^T.$
Problem 5. Tridiagonal system [29]

$F_1(x) = 4 (x_1 - x_2^2),$
$F_i(x) = 8 x_i (x_i^2 - x_{i-1}) - 2 (1 - x_i) + 4 (x_i - x_{i+1}^2), \quad i = 2, \ldots, n - 1,$
$F_n(x) = 8 x_n (x_n^2 - x_{n-1}) - 2 (1 - x_n);$
$x_0 = (12, 12, \ldots, 12)^T.$
Problem 6. Tridiagonal exponential problem [30]

$F_1(x) = x_1 - e^{\cos(h (x_1 + x_2))},$
$F_i(x) = x_i - e^{\cos(h (x_{i-1} + x_i + x_{i+1}))}, \quad i = 2, \ldots, n - 1,$
$F_n(x) = x_n - e^{\cos(h (x_{n-1} + x_n))}, \quad h = \frac{1}{n+1};$
$x_0 = (1.5, 1.5, \ldots, 1.5)^T.$
Problem 7. Discrete boundary value problem [31]

$F_1(x) = 2 x_1 + 0.5 h^2 (x_1 + h)^3 - x_2,$
$F_i(x) = 2 x_i + 0.5 h^2 (x_i + ih)^3 - x_{i-1} - x_{i+1}, \quad i = 2, \ldots, n - 1,$
$F_n(x) = 2 x_n + 0.5 h^2 (x_n + nh)^3 - x_{n-1}, \quad h = \frac{1}{n+1};$
$x_0 = (h(h - 1), h(2h - 1), \ldots, h(nh - 1))^T.$
Problem 8. Troesch problem [32]

$F_1(x) = 2 x_1 + \rho h^2 \sinh(\rho x_1) - x_2,$
$F_i(x) = 2 x_i + \rho h^2 \sinh(\rho x_i) - x_{i-1} - x_{i+1}, \quad i = 2, \ldots, n - 1,$
$F_n(x) = 2 x_n + \rho h^2 \sinh(\rho x_n) - x_{n-1}, \quad \rho = 10, \ h = \frac{1}{n+1};$
$x_0 = (0, 0, \ldots, 0)^T.$
Problem 9. Extended Rosenbrock function (n is even) [33]

$F_{2i-1}(x) = 10 (x_{2i} - x_{2i-1}^2),$
$F_{2i}(x) = 1 - x_{2i-1}, \quad i = 1, 2, \ldots, n/2;$
$x_0 = (5, 1, \ldots, 5, 1)^T.$
Problem 10. Problem 21 in [26] (n is a multiple of 3)

$F_{3i-2}(x) = x_{3i-2} x_{3i-1} - x_{3i}^2 - 1,$
$F_{3i-1}(x) = x_{3i-2} x_{3i-1} x_{3i} - x_{3i-2}^2 + x_{3i-1}^2 - 2,$
$F_{3i}(x) = e^{x_{3i-2}} - e^{x_{3i-1}}, \quad i = 1, 2, \ldots, n/3;$
$x_0 = (1, 1, \ldots, 1)^T.$
Problem 11. Tridimensional valley function (n is a multiple of 3) [34]

$F_{3i-2}(x) = (c_2 x_{3i-2}^3 + c_1 x_{3i-2}) \exp\Big( -\frac{x_{3i-2}^2}{100} \Big) - 1,$
$F_{3i-1}(x) = 10 (\sin(x_{3i-2}) - x_{3i-1}),$
$F_{3i}(x) = 10 (\cos(x_{3i-2}) - x_{3i}), \quad i = 1, 2, \ldots, n/3,$
$c_1 = 1.003344481605351, \quad c_2 = -3.344481605351171 \times 10^{-3};$
$x_0 = (2, 1, 2, \ldots, 2, 1, 2)^T.$
Problem 12 [35].

$F_1(x) = x_1,$
$F_i(x) = \cos(x_{i-1}) + x_i - 1, \quad i = 2, \ldots, n;$
$x_0 = (0.5, 0.5, \ldots, 0.5)^T.$

References

1. Yuan, Y.X. Recent advances in numerical methods for nonlinear equations and nonlinear least squares. Numer. Algebra Control Optim. 2011, 1, 15–34.
2. Sun, W.; Yuan, Y.X. Optimization Theory and Methods: Nonlinear Programming; Springer Science & Business Media: New York, NY, USA, 2006.
3. Wright, S.; Nocedal, J. Numerical Optimization; Springer Science: New York, NY, USA, 1999; Volume 35.
4. Fletcher, R. Practical Methods of Optimization; John Wiley & Sons: Hoboken, NJ, USA, 2013.
5. Broyden, C.G. A class of methods for solving nonlinear simultaneous equations. Math. Comput. 1965, 19, 577–593.
6. Broyden, C.G.; Dennis, J.E., Jr.; Moré, J.J. On the local and superlinear convergence of quasi-Newton methods. IMA J. Appl. Math. 1973, 12, 223–245.
7. Dennis, J.E.; Moré, J.J. Quasi-Newton methods, motivation and theory. SIAM Rev. 1977, 19, 46–89.
8. Schubert, L.K. Modification of a quasi-Newton method for nonlinear equations with a sparse Jacobian. Math. Comput. 1970, 24, 27–30.
9. Toint, P.L. On sparse and symmetric matrix updating subject to a linear equation. Math. Comput. 1977, 31, 954–961.
10. Nocedal, J. Updating quasi-Newton matrices with limited storage. Math. Comput. 1980, 35, 773–782.
11. Van de Rotten, B.; Lunel, S.V. A limited memory Broyden method to solve high-dimensional systems of nonlinear equations. In Proceedings of the International Conference on Differential Equations, Hasselt, Belgium, 22–26 July 2003; pp. 196–201.
12. Griewank, A.; Toint, P.L. Partitioned variable metric updates for large structured optimization problems. Numer. Math. 1982, 39, 119–137.
13. Griewank, A.; Toint, P.L. Local convergence analysis for partitioned quasi-Newton updates. Numer. Math. 1982, 39, 429–448.
14. Cao, H.P.; Li, D.H. Partitioned quasi-Newton methods for sparse nonlinear equations. Comput. Optim. Appl. 2017, 66, 481–505.
15. Leong, W.J.; Hassan, M.A.; Yusuf, M.W. A matrix-free quasi-Newton method for solving large-scale nonlinear systems. Comput. Math. Appl. 2011, 62, 2354–2363.
16. Li, D.H.; Wang, X.; Huang, J. Diagonal BFGS updates and applications to the limited memory BFGS method. Comput. Optim. Appl. 2022, 81, 829–856.
17. Martínez, J.M. A quasi-Newton method with modification of one column per iteration. Computing 1984, 33, 353–362.
18. Griewank, A. The “global” convergence of Broyden-like methods with suitable line search. ANZIAM J. 1986, 28, 75–92.
19. Li, D.H.; Fukushima, M. A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Optim. Methods Softw. 2000, 13, 181–201.
20. Griewank, A.; Walther, A. On constrained optimization by adjoint based quasi-Newton methods. Optim. Methods Softw. 2002, 17, 869–889.
21. Schlenkrich, S.; Griewank, A.; Walther, A. On the local convergence of adjoint Broyden methods. Math. Program. 2010, 121, 221–247.
22. Marwil, E. Convergence results for Schubert’s method for solving sparse nonlinear equations. SIAM J. Numer. Anal. 1979, 16, 588–604.
23. Powell, M.J.D. A hybrid method for nonlinear equations. In Numerical Methods for Nonlinear Algebraic Equations; Gordon and Breach: London, UK, 1970.
24. Forth, S.A.; Edvall, M.M. User Guide for MAD—MATLAB Automatic Differentiation Toolbox TOMLAB/MAD, Version 1.1, The Forward Mode; TOMLAB Optimisation Inc.: San Diego, CA, USA, 2007.
25. Dolan, E.D.; Moré, J.J. Benchmarking optimization software with performance profiles. Math. Program. 2002, 91, 201–213.
26. La Cruz, W.; Martínez, J.; Raydan, M. Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Math. Comput. 2006, 75, 1429–1448.
27. Raydan, M. The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim. 1997, 7, 26–33.
28. Gomes-Ruggiero, M.A.; Martínez, J.M.; Moretti, A.C. Comparing algorithms for solving sparse nonlinear systems of equations. SIAM J. Sci. Comput. 1992, 13, 459–483.
29. Li, G. Successive column correction algorithms for solving sparse nonlinear systems of equations. Math. Program. 1989, 43, 187–207.
30. Bing, Y.; Lin, G. An efficient implementation of Merrill’s method for sparse or partially separable systems of nonlinear equations. SIAM J. Optim. 1991, 1, 206–221.
31. Moré, J.J.; Garbow, B.S.; Hillstrom, K.E. Testing unconstrained optimization software. ACM Trans. Math. Softw. 1981, 7, 17–41.
32. Roberts, S.M.; Shipman, J.S. On the closed form solution of Troesch’s problem. J. Comput. Phys. 1976, 21, 291–304.
33. Gasparo, M.G. A nonmonotone hybrid method for nonlinear systems. Optim. Methods Softw. 2000, 13, 79–94.
34. Friedlander, A.; Gomes-Ruggiero, M.A.; Kozakevich, D.N.; Martínez, J.M.; Santos, S.A. Solving nonlinear systems of equations by means of quasi-Newton methods with a nonmonotone strategy. Optim. Methods Softw. 1997, 8, 25–51.
35. Luksan, L.; Matonoha, C.; Vlcek, J. Problems for Nonlinear Least Squares and Nonlinear Equations; Technical Report 1259; Institute of Computer Science, Academy of Sciences of the Czech Republic: Prague, Czech Republic, 2018.
Figure 1. Performance profiles for SDBroyden and Schubert: (a) results comparison on the number of iterations with $B_0 = I$; (b) results comparison on the number of function evaluations with $B_0 = I$; (c) results comparison on the number of iterations with $B_0 = F'(x_0)$; (d) results comparison on the number of function evaluations with $B_0 = F'(x_0)$.
Figure 2. Performance profiles of SDBroyden and DBQN: (a) results comparison on the number of iterations with $B_0 = I$; (b) results comparison on the number of function evaluations with $B_0 = I$; (c) results comparison on the number of iterations with $B_0 = F'(x_0)$; (d) results comparison on the number of function evaluations with $B_0 = F'(x_0)$.
Table 1. Results of Schubert’s method with $B_0 = I$.

Pro (Dim) | 10 | 100 | 1000 | 2000 | 10,000 | 20,000 | 50,000
(1) Ite | 6 | 6 | 6 | 6 | 6 | 6 | 6
(1) Nfun | 7 | 7 | 7 | 7 | 7 | 7 | 7
(1) R | 0.8915 | 1.1098 | 1.1326 | 1.1338 | 1.1349 | 1.1350 | 1.1351
(1) Time(s) | 0.0600 | 0.0000 | 0.0200 | 0.0000 | 0.0400 | 0.0400 | 0.0800
(2) Ite | 7 | 7 | 7 | 7 | 7 | 7 | 7
(2) Nfun | 8 | 8 | 8 | 8 | 8 | 8 | 8
(2) R | 1.0407 | 1.0831 | 1.0919 | 1.0924 | 1.0929 | 1.0929 | 1.0929
(2) Time(s) | 0.0100 | 0.0000 | 0.0100 | 0.0000 | 0.0200 | 0.0400 | 0.0900
(4) Ite | 12 | 12 | 12 | 13 | 16 | 14 | 14
(4) Nfun | 20 | 20 | 21 | 23 | 26 | 21 | 22
(4) R | 0.3429 | 0.3440 | 0.3496 | 0.3441 | 0.3215 | 0.4183 | 0.4178
(4) Time(s) | 0.0100 | 0.0000 | 0.0000 | 0.0100 | 0.1300 | 0.2400 | 0.4100
(5) Ite | 20 | 17 | 25 | 22 | 22 | 16 | 21
(5) Nfun | 63 | 32 | 64 | 73 | 40 | 36 | 48
(5) R | 0.1051 | 0.2288 | 0.1106 | 0.1052 | 0.1902 | 0.2461 | 0.1664
(5) Time(s) | 0.0400 | 0.0000 | 0.0000 | 0.0000 | 0.1400 | 0.1600 | 0.3400
(6) Ite | 4 | 3 | 2 | 2 | 2 | 2 | 1
(6) Nfun | 5 | 4 | 3 | 3 | 3 | 3 | 2
(6) R | 1.5553 | 2.5624 | 3.0388 | 3.4398 | 4.3713 | 4.7641 | 3.8427
(6) Time(s) | 0.0600 | 0.0000 | 0.0100 | 0.0000 | 0.0200 | 0.0300 | 0.0300
(7) Ite | 10 | 8 | 6 | 6 | 4 | 4 | 3
(7) Nfun | 11 | 11 | 8 | 8 | 6 | 6 | 5
(7) R | 0.4980 | 0.3935 | 0.5299 | 0.5489 | 0.5362 | 0.5613 | 0.5820
(7) Time(s) | 0.0300 | 0.0000 | 0.0000 | 0.0000 | 0.0300 | 0.0700 | 0.1100
(9) Ite | 4 | 4 | 4 | 4 | 4 | 4 | 4
(9) Nfun | 7 | 7 | 7 | 7 | 7 | 7 | 7
(9) R | Inf | Inf | Inf | Inf | Inf | Inf | Inf
(9) Time(s) | 0.0400 | 0.0000 | 0.0000 | 0.0000 | 0.5100 | 0.9400 | 1.5100
Dim | 12 | 102 | 1002 | 2001 | 10,002 | 20,001 | 50,001
(10) Ite | 4 | 4 | 5 | 5 | 5 | 5 | 5
(10) Nfun | 6 | 6 | 7 | 7 | 7 | 7 | 7
(10) R | 1.0563 | 1.0563 | 1.5586 | 1.5586 | 1.5586 | 1.5586 | 1.5586
(10) Time(s) | 0.0100 | 0.0000 | 0.1100 | 0.1300 | 0.2600 | 0.4500 | 0.8000
Dim | 12 | 102 | 1002 | 2001 | 10,002 | 20,001 | 50,001
(11) Ite | 6 | 6 | 7 | 7 | 7 | 7 | 7
(11) Nfun | 8 | 8 | 9 | 9 | 9 | 9 | 9
(11) R | 0.9175 | 0.9175 | 1.4972 | 1.4972 | 1.4972 | 1.4972 | 1.4972
(11) Time(s) | 0.0300 | 0.0100 | 0.3000 | 0.1000 | 0.5200 | 0.7900 | 1.7100
(12) Ite | 5 | 5 | 5 | 5 | 5 | 6 | 6
(12) Nfun | 6 | 6 | 6 | 6 | 6 | 7 | 7
(12) R | 1.1312 | 1.1156 | 1.1142 | 1.1142 | 1.1141 | 1.5918 | 1.5985
(12) Time(s) | 0.0400 | 0.0000 | 0.0000 | 0.0000 | 0.0200 | 0.0500 | 0.1200
Table 2. Results of the SDBroyden method with $B_0 = I$.

Pro (Dim) | 10 | 100 | 1000 | 2000 | 10,000 | 20,000 | 50,000
(1) Ite | 5 | 4 | 5 | 5 | 5 | 5 | 5
(1) Nfun | 6 | 5 | 6 | 6 | 6 | 6 | 6
(1) R | 1.6865 | 1.2113 | 2.1390 | 2.1415 | 2.1435 | 2.1437 | 2.1439
(1) Time(s) | 0.0700 | 0.3600 | 1.4700 | 2.5200 | 13.2600 | 27.9500 | 88.0700
(2) Ite | 5 | 5 | 5 | 5 | 5 | 6 | 6
(2) Nfun | 6 | 6 | 6 | 6 | 6 | 7 | 7
(2) R | 1.1023 | 1.1587 | 1.1705 | 1.1712 | 1.1718 | 1.9449 | 1.9450
(2) Time(s) | 0.0300 | 0.2200 | 0.7800 | 1.7400 | 7.8200 | 20.2500 | 59.1100
(4) Ite | 12 | 12 | 12 | 12 | 13 | 13 | 13
(4) Nfun | 17 | 18 | 18 | 20 | 19 | 19 | 20
(4) R | 0.4108 | 0.4253 | 0.4335 | 0.4085 | 0.4443 | 0.4747 | 0.4364
(4) Time(s) | 0.2850 | 1.2600 | 11.2800 | 22.5000 | 139.3500 | 299.0400 | 834.3150
(5) Ite | 16 | 16 | 20 | 18 | 19 | 20
(5) Nfun | 29 | 35 | 57 | 48 | 52 | 48 | 56
(5) R | 0.2179 | 0.2396 | 0.1505 | 0.1674 | 0.1738 | 0.1664 | 0.1675
(5) Time(s) | 0.1400 | 0.5100 | 4.6200 | 17.5600 | 80.7900 | 182.6200 | 375.0000
(6) Ite | 3 | 2 | 2 | 2 | 2 | 2 | 1
(6) Nfun | 4 | 3 | 3 | 3 | 3 | 3 | 2
(6) R | 1.4566 | 2.4834 | 4.4685 | 5.0549 | 5.6778 | 5.1461 | 3.8427
(6) Time(s) | 0.0500 | 0.0800 | 0.6200 | 1.1100 | 6.5800 | 13.2100 | 20.2600
(7) Ite | 10 | 8 | 6 | 6 | 4 | 4 | 3
(7) Nfun | 11 | 11 | 8 | 8 | 6 | 6 | 5
(7) R | 0.4980 | 0.3935 | 0.5299 | 0.5489 | 0.5362 | 0.5613 | 0.5820
(7) Time(s) | 0.1700 | 0.3800 | 2.0700 | 4.1000 | 14.1400 | 29.3900 | 69.4400
(9) Ite | 4 | 4 | 4 | 4 | 4 | 4 | 4
(9) Nfun | 6 | 6 | 6 | 6 | 6 | 6 | 6
(9) R | Inf | Inf | Inf | Inf | Inf | Inf | Inf
(9) Time(s) | 0.0500 | 0.1450 | 0.5600 | 1.3000 | 13.6700 | 28.0700 | 84.4600
Dim | 12 | 102 | 1002 | 2001 | 10,002 | 20,001 | 50,001
(10) Ite | 3 | 3 | 3 | 3 | 4 | 4 | 4
(10) Nfun | 5 | 5 | 5 | 5 | 6 | 6 | 6
(10) R | 1.3420 | 1.3420 | 1.3420 | 1.3420 | 2.3852 | 2.3852 | 2.3852
(10) Time(s) | 0.0600 | 0.1700 | 0.9000 | 1.8100 | 10.8000 | 21.5800 | 64.8000
Dim | 12 | 102 | 1002 | 2001 | 10,002 | 20,001 | 50,001
(11) Ite | 5 | 6 | 6 | 6 | 6 | 6 | 6
(11) Nfun | 7 | 8 | 8 | 8 | 8 | 8 | 8
(11) R | 1.0013 | 1.8209 | 1.8209 | 1.8209 | 1.8209 | 1.8209 | 1.8209
(11) Time(s) | 0.0900 | 0.3000 | 2.1500 | 3.5300 | 18.8500 | 40.5600 | 109.5900
(12) Ite | 4 | 4 | 4 | 4 | 4 | 4 | 4
(12) Nfun | 5 | 5 | 5 | 5 | 5 | 5 | 5
(12) R | 1.7679 | 1.7551 | 1.7540 | 1.7539 | 1.7538 | 1.7538 | 1.7538
(12) Time(s) | 0.0500 | 0.1500 | 0.6700 | 1.2700 | 6.4500 | 13.7700 | 43.1500
Table 3. Results of Schubert’s method with $B_0 = F'(x_0)$.

Pro (Dim) | 10 | 100 | 1000 | 2000 | 10,000 | 20,000 | 50,000
(1) Ite | 6 | 6 | 6 | 6 | 6 | 6 | 6
(1) Nfun | 8 | 7 | 7 | 7 | 8 | 8 | 8
(1) R | 0.8661 | 0.9523 | 0.9693 | 0.9703 | 0.9077 | 0.9077 | 0.9077
(1) Time(s) | 0.0000 | 0.0000 | 0.0100 | 0.0000 | 0.0300 | 0.0400 | 0.0900
(2) Ite | 6 | 6 | 6 | 6 | 6 | 6 | 6
(2) Nfun | 7 | 7 | 7 | 7 | 7 | 7 | 7
(2) R | 1.1690 | 1.1982 | 1.2024 | 1.2026 | 1.2028 | 1.2028 | 1.2028
(2) Time(s) | 0.0500 | 0.0000 | 0.0000 | 0.0000 | 0.0400 | 0.0400 | 0.0800
(3) Ite | 10 | 11 | 11 | 11 | 11 | 11 | 11
(3) Nfun | 11 | 12 | 12 | 12 | 12 | 12 | 12
(3) R | 0.6216 | 0.5988 | 0.6391 | 0.6515 | 0.6806 | 0.6931 | 0.7097
(3) Time(s) | 0.0400 | 0.0000 | 0.0200 | 0.0000 | 0.0900 | 0.1400 | 0.2000
(4) Ite | 13 | 23 | 23 | 29 | 21 | 30 | 27
(4) Nfun | 25 | 44 | 45 | 58 | 44 | 81 | 72
(4) R | 0.2705 | 0.1743 | 0.1820 | 0.1455 | 0.2054 | 0.1256 | 0.2367
(4) Time(s) | 0.0500 | 0.0000 | 0.0100 | 0.0000 | 0.0600 | 0.1100 | 0.1800
(5) Ite | 18 | 24 | 26 | 24 | 23 | 23 | 23
(5) Nfun | 26 | 44 | 68 | 54 | 42 | 42 | 42
(5) R | 0.2543 | 0.1425 | 0.1190 | 0.1420 | 0.1864 | 0.1900 | 0.1947
(5) Time(s) | 0.0600 | 0.0000 | 0.0000 | 0.0000 | 0.2100 | 0.2400 | 0.5100
(6) Ite | 5 | 3 | 2 | 2 | 2 | 2 | 2
(6) Nfun | 6 | 4 | 3 | 3 | 3 | 3 | 3
(6) R | 1.1562 | 1.8540 | 2.4963 | 2.8469 | 3.6620 | 4.0131 | 4.4773
(6) Time(s) | 0.0400 | 0.0000 | 0.0100 | 0.0000 | 0.0100 | 0.0400 | 0.0700
(7) Ite | 12 | 12 | 7 | 4 | 1 | 1 | 1
(7) Nfun | 18 | 21 | 11 | 6 | 2 | 2 | 2
(7) R | 0.2585 | 0.2158 | 0.3382 | 0.6401 | 1.7302 | 1.8807 | 2.0797
(7) Time(s) | 0.0100 | 0.0000 | 0.0100 | 0.0100 | 0.0100 | 0.0400 | 0.0500
(8) Ite | 8 | 8 | 8 | 7 | 7 | 7 | 7
(8) Nfun | 10 | 11 | 10 | 9 | 10 | 10 | 19
(8) R | 0.5898 | 0.5317 | 0.5779 | 0.5637 | 0.5789 | 0.5881 | 0.6023
(8) Time(s) | 0.0300 | 0.0000 | 0.0000 | 0.0000 | 0.4200 | 1.3500 | 3.8000
(9) Ite | 3 | 3 | 3 | 3 | 3 | 3 | 3
(9) Nfun | 4 | 4 | 4 | 4 | 4 | 4 | 4
(9) R | 4.0327 | 4.0327 | 4.0327 | 4.0327 | 4.0327 | 4.0327 | 4.0327
(9) Time(s) | 0.0000 | 0.0100 | 0.0000 | 0.0100 | 0.0400 | 0.0700 | 0.2300
Dim | 12 | 102 | 1002 | 2001 | 10,002 | 20,001 | 50,001
(10) Ite | 9 | 10 | 10 | 10 | 11 | 11 | 11
(10) Nfun | 10 | 11 | 11 | 11 | 12 | 12 | 12
(10) R | 0.5520 | 0.5886 | 0.5886 | 0.5886 | 0.6701 | 0.6701 | 0.6701
(10) Time(s) | 0.0200 | 0.0200 | 0.2800 | 0.2600 | 0.7400 | 1.1800 | 2.4700
Dim | 12 | 102 | 1002 | 2001 | 10,002 | 20,001 | 50,001
(11) Ite | 5 | 6 | 6 | 6 | 6 | 6 | 6
(11) Nfun | 6 | 7 | 7 | 7 | 7 | 7 | 7
(11) R | 1.1416 | 1.7664 | 1.7664 | 1.7664 | 1.7664 | 1.7664 | 1.7664
(11) Time(s) | 0.0300 | 0.0300 | 0.2000 | 0.2200 | 0.4300 | 0.6500 | 1.3500
(12) Ite | 8 | 8 | 8 | 8 | 8 | 8 | 8
(12) Nfun | 9 | 9 | 9 | 9 | 9 | 9 | 10
(12) R | 0.5909 | 0.6449 | 0.7003 | 0.7170 | 0.7558 | 0.7725 | 0.7210
(12) Time(s) | 0.0200 | 0.0000 | 0.0000 | 0.0100 | 0.0600 | 0.1100 | 0.1800
Table 4. Results of the SDBroyden method with $B_0 = F'(x_0)$.

Pro (Dim) | 10 | 100 | 1000 | 2000 | 10,000 | 20,000 | 50,000
(1) Ite | 4 | 5 | 5 | 5 | 5 | 5 | 5
(1) Nfun | 6 | 6 | 6 | 6 | 7 | 7 | 7
(1) R | 0.9211 | 1.6975 | 1.7287 | 1.7304 | 1.6431 | 1.6431 | 1.6431
(1) Time(s) | 0.0300 | 0.2200 | 1.6600 | 2.6400 | 14.2100 | 29.4400 | 88.3100
(2) Ite | 4 | 4 | 4 | 5 | 5 | 5 | 5
(2) Nfun | 5 | 5 | 5 | 6 | 6 | 6 | 6
(2) R | 1.2544 | 1.2872 | 1.2916 | 2.1183 | 2.1187 | 2.1187 | 2.1187
(2) Time(s) | 0.0400 | 0.1300 | 0.6600 | 1.6900 | 7.7100 | 16.1100 | 48.0500
(3) Ite | 11 | 11 | 11 | 11 | 11 | 11 | 11
(3) Nfun | 12 | 12 | 12 | 12 | 12 | 12 | 12
(3) R | 0.5609 | 0.5642 | 0.6045 | 0.6170 | 0.6460 | 0.6586 | 0.6751
(3) Time(s) | 0.1900 | 0.6100 | 5.5200 | 10.8000 | 55.6500 | 112.3100 | 307.9200
(4) Ite | 13 | 17 | 17 | 18 | 20 | 23 | 18
(4) Nfun | 24 | 28 | 28 | 35 | 38 | 59 | 56
(4) R | 0.2867 | 0.2688 | 0.2707 | 0.2218 | 0.6251 | 0.5620 | 0.6294
(4) Time(s) | 0.1500 | 0.8900 | 5.8100 | 12.4900 | 66.7200 | 142.7600 | 402.8600
(5) Ite | 23 | 21 | 22 | 20 | 20 | 20 | 20
(5) Nfun | 46 | 40 | 35 | 34 | 34 | 34 | 34
(5) R | 0.1996 | 0.1651 | 0.1957 | 0.2207 | 0.2309 | 0.2354 | 0.2412
(5) Time(s) | 0.3500 | 1.7100 | 13.8200 | 26.4100 | 143.4000 | 293.5900 | 341.0000
(6) Ite | 4 | 3 | 2 | 2 | 2 | 2 | 2
(6) Nfun | 5 | 4 | 3 | 3 | 3 | 3 | 3
(6) R | 1.1204 | 2.0735 | 2.4895 | 2.8401 | 3.6550 | 4.0062 | 4.4704
(6) Time(s) | 0.1000 | 0.1800 | 1.1100 | 1.6500 | 8.7100 | 20.2800 | 53.0700
(7) Ite | 12 | 12 | 7 | 4 | 1 | 1 | 1
(7) Nfun | 18 | 21 | 11 | 6 | 2 | 2 | 2
(7) R | 0.2585 | 0.2158 | 0.3382 | 0.6401 | 1.7302 | 1.8807 | 2.0797
(7) Time(s) | 0.2100 | 0.8900 | 3.9500 | 4.4100 | 5.9700 | 12.5000 | 32.9300
(8) Ite | 11 | 7 | 6 | 6 | 6 | 6 | 6
(8) Nfun | 14 | 9 | 8 | 8 | 9 | 9 | 9
(8) R | 0.4237 | 0.5767 | 0.6337 | 0.8068 | 0.7790 | 0.7811 | 0.7995
(8) Time(s) | 0.1900 | 0.4100 | 2.8100 | 6.1500 | 13.3900 | 24.0600 | 48.4500
(9) Ite | 3 | 3 | 3 | 3 | 3 | 3 | 3
(9) Nfun | 4 | 4 | 4 | 4 | 4 | 4 | 4
(9) R | Inf | Inf | Inf | Inf | Inf | Inf | Inf
(9) Time(s) | 0.0300 | 0.1400 | 0.5400 | 1.3100 | 5.7400 | 13.2200 | 37.1600
Dim | 12 | 102 | 1002 | 2001 | 10,002 | 20,001 | 50,001
(10) Ite | 8 | 9 | 9 | 9 | 9 | 9 | 9
(10) Nfun | 10 | 11 | 11 | 11 | 11 | 11 | 11
(10) R | 0.5616 | 0.6068 | 0.6068 | 0.6068 | 0.6309 | 0.6374 | 0.6438
(10) Time(s) | 0.1300 | 0.4600 | 3.5500 | 6.6100 | 39.4200 | 82.5100 | 242.4400
Dim | 12 | 102 | 1002 | 2001 | 10,002 | 20,001 | 50,001
(11) Ite | 4 | 5 | 5 | 5 | 5 | 5 | 5
(11) Nfun | 5 | 6 | 6 | 6 | 6 | 6 | 6
(11) R | 1.3736 | 2.3204 | 2.3204 | 2.3204 | 2.3204 | 2.3204 | 2.3204
(11) Time(s) | 0.0600 | 0.2300 | 1.8600 | 2.9500 | 15.8600 | 33.6900 | 92.8300
(12) Ite | 8 | 8 | 8 | 8 | 8 | 8 | 7
(12) Nfun | 9 | 9 | 9 | 9 | 9 | 9 | 9
(12) R | 0.6704 | 0.7239 | 0.7793 | 0.7960 | 0.8348 | 0.8515 | 0.7721
(12) Time(s) | 0.1100 | 0.2900 | 2.1600 | 3.7500 | 19.3700 | 41.5900 | 106.5800
Table 5. Results of the DBQN method with $B_0 = I$.

Pro (Dim) | 10 | 20 | 50 | 100 | 200 | 500 | 1000
(1) Ite | 5 | 4 | 4 | 4 | 4 | 5 | 5
(1) Nfun | 6 | 5 | 5 | 5 | 5 | 6 | 6
(1) R | 1.6865 | 1.0998 | 1.1828 | 1.2113 | 1.2258 | 2.1341 | 2.1390
(1) Time(s) | 0.2900 | 0.1900 | 0.1500 | 0.5400 | 0.5600 | 3.1000 | 19.3500
(2) Ite | 5 | 5 | 5 | 5 | 5 | 6 | 6
(2) Nfun | 6 | 6 | 6 | 6 | 6 | 7 | 7
(2) R | 0.9943 | 1.0107 | 1.0229 | 1.0274 | 1.0297 | 1.2624 | 1.2631
(2) Time(s) | 0.1000 | 0.1300 | 0.1600 | 0.2100 | 0.4500 | 3.1700 | 21.8400
(4) Ite | 24 | 33 | 53 | 59 | 58 | 73 | 95
(4) Nfun | 42 | 68 | 117 | 118 | 120 | 187 | 318
(4) R | 0.1630 | 0.0963 | 0.0606 | 0.0585 | 0.0587 | 0.0383 | 0.0233
(4) Time(s) | 1.0800 | 1.9000 | 6.3600 | 12.6900 | 24.1200 | 97.9500 | 510.2200
(6) Ite | 6 | 5 | 3 | 3 | 2 | 2 | 2
(6) Nfun | 7 | 6 | 4 | 4 | 3 | 3 | 3
(6) R | 0.8516 | 1.1498 | 1.6230 | 2.0975 | 2.4436 | 3.0384 | 3.4893
(6) Time(s) | 0.2000 | 0.2700 | 0.1500 | 0.7800 | 0.4200 | 1.5500 | 8.1900
(7) Ite | 15 | 23 | 20 | 24 | 14 | 12 | 11
(7) Nfun | 22 | 32 | 30 | 27 | 29 | 22 | 20
(7) R | 0.2285 | 0.1356 | 0.1421 | 0.1510 | 0.1401 | 0.1741 | 0.1863
(7) Time(s) | 0.5700 | 0.6900 | 1.3200 | 3.0500 | 3.9500 | 10.9700 | 47.1200
Dim | 12 | 21 | 51 | 102 | 201 | 501 | 1002
(10) Ite | 3 | 3 | 3 | 3 | 3 | 3 | 3
(10) Nfun | 5 | 5 | 5 | 5 | 5 | 5 | 5
(10) R | 1.3420 | 1.3420 | 1.3420 | 1.3420 | 1.3420 | 1.3420 | 1.3420
(10) Time(s) | 0.1500 | 0.1800 | 0.1300 | 0.2600 | 0.6400 | 2.0000 | 10.9100
Dim | 12 | 21 | 51 | 102 | 201 | 501 | 1002
(11) Ite | 7 | 7 | 7 | 7 | 8 | 8 | 8
(11) Nfun | 13 | 13 | 13 | 13 | 14 | 14 | 14
(11) R | 0.5532 | 0.5532 | 0.5532 | 0.5532 | 0.6535 | 0.6224 | 0.5851
(11) Time(s) | 0.2200 | 0.3400 | 0.2800 | 0.5700 | 1.3100 | 4.8700 | 33.8700
(12) Ite | 4 | 4 | 4 | 4 | 4 | 4 | 4
(12) Nfun | 5 | 5 | 5 | 5 | 5 | 5 | 5
(12) R | 1.9415 | 1.8389 | 1.7860 | 1.7696 | 1.7617 | 1.7569 | 1.7554
(12) Time(s) | 0.1400 | 0.1700 | 0.1500 | 0.2500 | 0.5000 | 2.1700 | 13.6100
Table 6. Results of the DBQN method with $B_0 = F'(x_0)$.

Pro (Dim) | 10 | 20 | 50 | 100 | 200 | 500 | 1000
(1) Ite | 4 | 5 | 5 | 5 | 5 | 5 | 5
(1) Nfun | 6 | 6 | 6 | 6 | 6 | 6 | 6
(1) R | 0.9211 | 1.5530 | 1.6623 | 1.6975 | 1.7149 | 1.7252 | 1.7287
(1) Time(s) | 0.0900 | 0.2200 | 0.4800 | 0.5100 | 0.6700 | 3.1600 | 20.2400
(2) Ite | 11 | 11 | 12 | 12 | 12 | 13 | 13
(2) Nfun | 12 | 12 | 13 | 13 | 13 | 14 | 14
(2) R | 0.4726 | 0.4677 | 0.4819 | 0.4822 | 0.4827 | 0.4807 | 0.4808
(2) Time(s) | 0.2000 | 0.3300 | 0.3200 | 0.5100 | 1.1300 | 6.7900 | 47.0600
(3) Ite | 10 | 11 | 11 | 11 | 11 | 11 | 11
(3) Nfun | 11 | 12 | 12 | 12 | 12 | 12 | 12
(3) R | 0.5629 | 0.5496 | 0.5542 | 0.5641 | 0.5754 | 0.5912 | 0.6035
(3) Time(s) | 0.4800 | 0.6900 | 0.7200 | 1.1800 | 2.2700 | 9.2200 | 45.4800
(4) Ite | 16 | 25 | 32 | 29 | 35 | 34 | 33
(4) Nfun | 23 | 40 | 56 | 58 | 64 | 83 | 67
(4) R | 0.3164 | 0.1641 | 0.1241 | 0.1219 | 0.1142 | 0.0897 | 0.1128
(4) Time(s) | 0.7800 | 1.2400 | 4.0200 | 6.7600 | 15.7000 | 49.1800 | 180.5200
(6) Ite | 5 | 5 | 4 | 4 | 3 | 3 | 3
(6) Nfun | 6 | 6 | 5 | 5 | 4 | 4 | 4
(6) R | 1.0196 | 1.1784 | 1.5102 | 1.9224 | 1.8193 | 2.2563 | 2.5914
(6) Time(s) | 0.1000 | 0.1700 | 0.3200 | 0.4400 | 0.6200 | 2.2700 | 12.2300
(7) Ite | 3 | 3 | 3 | 3 | 3 | 1 | 1
(7) Nfun | 4 | 4 | 4 | 4 | 4 | 2 | 2
(7) R | 1.5674 | 1.8740 | 2.2760 | 2.5987 | 2.9294 | 2.0320 | 2.2588
(7) Time(s) | 0.1100 | 0.1000 | 0.3100 | 0.3900 | 0.7100 | 0.8600 | 4.2900
(8) Ite | 14 | 14 | 18 | 23 | 20 | 23 | 24
(8) Nfun | 20 | 23 | 38 | 59 | 44 | 76 | 80
(8) R | 0.2827 | 0.2437 | 0.1454 | 0.0953 | 0.1199 | 0.0731 | 0.0670
(8) Time(s) | 0.6300 | 1.0300 | 1.6100 | 2.7100 | 5.0500 | 20.8500 | 102.1800
Dim | 12 | 21 | 51 | 102 | 201 | 501 | 1002
(9) Ite | 3 | 3 | 3 | 3 | 3 | 3 | 3
(9) Nfun | 4 | 4 | 4 | 4 | 4 | 4 | 4
(9) R | 3.8134 | 3.7605 | 3.7383 | 3.5184 | 3.4381 | 3.6076 | 3.5910
(9) Time(s) | 0.0300 | 0.0600 | 0.2200 | 0.2200 | 0.3700 | 1.4300 | 9.5300
Dim | 12 | 21 | 51 | 102 | 201 | 501 | 1002
(10) Ite | 9 | 9 | 9 | 9 | 9 | 9 | 9
(10) Nfun | 10 | 10 | 10 | 10 | 10 | 10 | 10
(10) R | 0.7420 | 0.7420 | 0.7420 | 0.7420 | 0.7420 | 0.7420 | 0.7420
(10) Time(s) | 0.3200 | 0.2500 | 0.4300 | 1.4000 | 1.6200 | 6.0500 | 32.8000
Dim | 12 | 21 | 51 | 102 | 201 | 501 | 1002
(11) Ite | 5 | 5 | 5 | 5 | 5 | 5 | 5
(11) Nfun | 6 | 6 | 6 | 6 | 6 | 6 | 6
(11) R | 1.3488 | 1.3488 | 1.3488 | 1.3488 | 1.3488 | 1.3488 | 1.3488
(11) Time(s) | 0.1300 | 0.1300 | 0.1900 | 0.4200 | 0.7500 | 3.0700 | 20.7800
(12) Ite | 11 | 13 | 15 | 15 | 15 | 15 | 15
(12) Nfun | 12 | 14 | 16 | 16 | 16 | 16 | 16
(12) R | 0.4607 | 0.3912 | 0.3475 | 0.3534 | 0.3610 | 0.3720 | 0.3808
(12) Time(s) | 0.1500 | 0.3900 | 0.5600 | 0.9900 | 1.9300 | 8.0200 | 55.5800
Table 7. Results of Newton’s method.

Pro (Dim) | 10 | 20 | 50 | 100 | 200 | 500 | 1000
(1) Ite | 4 | 5 | 5 | 5 | 5 | 5 | 5
(1) Nfun | 6 | 6 | 6 | 6 | 6 | 6 | 6
(1) R | 0.9211 | 1.5530 | 1.6623 | 1.6975 | 1.7149 | 1.7252 | 1.7287
(1) Time(s) | 0.0000 | 0.0000 | 0.0100 | 0.0700 | 0.1900 | 1.9100 | 16.8700
(2) Ite | 4 | 4 | 4 | 4 | 4 | 4 | 4
(2) Nfun | 5 | 5 | 5 | 5 | 5 | 5 | 5
(2) R | 1.2544 | 1.2704 | 1.2826 | 1.2872 | 1.2896 | 1.2911 | 1.2916
(2) Time(s) | 0.0100 | 0.0000 | 0.0100 | 0.0100 | 0.1500 | 2.2600 | 13.3200
(3) Ite | 4 | 4 | 4 | 4 | 4 | 4 | 5
(3) Nfun | 5 | 5 | 5 | 5 | 5 | 5 | 6
(3) R | 1.2884 | 1.3061 | 1.3333 | 1.3542 | 1.3729 | 1.3913 | 2.2609
(3) Time(s) | 0.0000 | 0.0000 | 0.0100 | 0.0100 | 0.1900 | 2.1300 | 17.8300
(4) Ite | 16 | 17 | 18 | 18 | 19 | 19 | 20
(4) Nfun | 17 | 18 | 19 | 19 | 20 | 20 | 21
(4) R | 0.3721 | 0.3649 | 0.3625 | 0.3617 | 0.3620 | 0.3618 | 0.3623
(4) Time(s) | 0.0200 | 0.0100 | 0.0100 | 0.1100 | 0.8500 | 7.8600 | 67.7600
(6) Ite | 15 | 12 | 8 | 6 | 5 | 4 | 3
(6) Nfun | 16 | 13 | 9 | 7 | 6 | 5 | 4
(6) R | 0.3600 | 0.4488 | 0.6793 | 0.9078 | 1.1567 | 1.4961 | 1.7423
(6) Time(s) | 0.0300 | 0.0000 | 0.0100 | 0.0600 | 0.2000 | 1.4900 | 9.9400
(7) Ite | 22 | 18 | 12 | 8 | 4 | 1 | 1
(7) Nfun | 23 | 19 | 13 | 9 | 5 | 2 | 2
(7) R | 0.1919 | 0.2323 | 0.3217 | 0.4503 | 0.7825 | 2.0320 | 2.2588
(7) Time(s) | 0.0300 | 0.0000 | 0.0200 | 0.0600 | 0.1200 | 0.3800 | 3.3100
(9) Ite | 2 | 2 | 2 | 2 | 2 | 2 | 2
(9) Nfun | 3 | 3 | 3 | 3 | 3 | 3 | 3
(9) R | Inf | Inf | Inf | Inf | Inf | Inf | Inf
(9) Time(s) | 0.0000 | 0.0000 | 0.0100 | 0.0100 | 0.1000 | 0.7000 | 5.8800
Dim | 12 | 21 | 51 | 102 | 201 | 501 | 1002
(11) Ite | 3 | 3 | 3 | 3 | 3 | 3 | 3
(11) Nfun | 4 | 4 | 4 | 4 | 4 | 4 | 4
(11) R | 2.0137 | 2.0137 | 2.0137 | 2.0137 | 2.0137 | 2.0137 | 2.0137
(11) Time(s) | 0.0100 | 0.0000 | 0.0100 | 0.0200 | 0.1100 | 1.0600 | 11.1200
(12) Ite | 4 | 4 | 4 | 4 | 4 | 4 | 4
(12) Nfun | 5 | 5 | 5 | 5 | 5 | 5 | 5
(12) R | 1.4972 | 1.4718 | 1.4581 | 1.4541 | 1.4523 | 1.4511 | 1.4508
(12) Time(s) | 0.0100 | 0.0000 | 0.0100 | 0.0300 | 0.1600 | 1.2900 | 12.0100
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
