On Nonsmooth Gauss–Newton Algorithms for Solving Nonlinear Complementarity Problems

In this paper, we propose a new version of the generalized damped Gauss–Newton method for solving nonlinear complementarity problems, based on a transformation to a nonsmooth equation that is equivalent to some unconstrained optimization problem. The B-differential plays the role of the derivative. We present two types of algorithms (exact and inexact) that are globally and superlinearly convergent in the semismooth case. These results can be applied to find all solutions of nonlinear complementarity problems efficiently under some mild assumptions. The results of numerical tests are attached as a complement to the theoretical considerations.


Introduction
Let F : R^n → R^n and let F_i, i = 1, ..., n, denote the components of F; the ith component of a vector x is denoted by x_i. The nonlinear complementarity problem (NCP) is to find x ∈ R^n such that

x ≥ 0, F(x) ≥ 0 and x^T F(x) = 0. (1)

Solving (1) is equivalent to solving a nonlinear equation G(x) = 0, where the operator G : R^n → R^n is defined by

G(x) = (ϕ(x_1, F_1(x)), ..., ϕ(x_n, F_n(x)))^T

with some special function ϕ. For example, ϕ may take the Mangasarian form ϕ(a, b) = θ(|a − b|) − θ(a) − θ(b), where θ : R → R is any strictly increasing function with θ(0) = 0, see [1]. The (NCP) problem is one of the fundamental problems of mathematical programming, operations research, economic equilibrium models, and the engineering sciences. Many interesting and important applications can be found in the papers of Harker and Pang [2] and Ferris and Pang [3]. The most essential applications arise in:
• engineering: optimal control problems, contact or structural mechanics problems, structural design problems, and traffic equilibrium problems;
• equilibrium modeling: general equilibrium (in production or consumption), invariant capital stock, and game-theoretic models.
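The componentwise reformulation can be sketched in a few lines of Python. The choice θ(t) = t below is only one admissible θ (strictly increasing with θ(0) = 0), used here as an illustrative assumption; with it, ϕ(a, b) = |a − b| − a − b = −2 min(a, b), so G vanishes exactly at the NCP solutions.

```python
# Sketch of the reformulation G_i(x) = phi(x_i, F_i(x)) with the Mangasarian
# function phi(a, b) = theta(|a - b|) - theta(a) - theta(b).
# theta(t) = t is an illustrative choice (strictly increasing, theta(0) = 0).

def phi(a, b, theta=lambda t: t):
    return theta(abs(a - b)) - theta(a) - theta(b)

def G(x, F):
    """Nonsmooth operator whose zeros are exactly the solutions of (NCP)."""
    Fx = F(x)
    return [phi(xi, fi) for xi, fi in zip(x, Fx)]

# Toy problem: F(x) = x - 1 in one dimension; the unique NCP solution is x = 1.
print(G([1.0], lambda x: [x[0] - 1.0]))  # -> [0.0]
```

With this θ, checking ϕ(a, b) = 0 amounts to checking min(a, b) = 0, which is precisely the complementarity condition for the pair (x_i, F_i(x)).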
We borrow a technique used in solving smooth problems. If g is the merit function of G, i.e., g(x) = ½ G(x)^T G(x), then any stationary point of g is a least-squares solution of the equation G(x) = 0, so algorithms for minimizing g can serve as algorithms for solving the equation. The usual Gauss–Newton method (known also as the differential corrections method), presented by Ortega and Rheinboldt [4] in the smooth case, has the form

x^(k+1) = x^(k) − (G'(x^(k))^T G'(x^(k)))^{-1} G'(x^(k))^T G(x^(k)). (2)

Local convergence properties of the Gauss–Newton method were discussed by Chen and Li [5], but only in the smooth case. The Levenberg–Marquardt method, which is a modified Gauss–Newton method, is also considered in some papers, e.g., [6] or [7]. Moreover, a comparison of semismooth algorithms for solving (NCP) problems has been made in [8].
In practice, we may also consider the damped Gauss–Newton method with parameters ω_k and λ_k:

x^(k+1) = x^(k) − ω_k (G'(x^(k))^T G'(x^(k)) + λ_k I)^{-1} G'(x^(k))^T G(x^(k)). (3)

The parameter ω_k may be chosen to ensure a suitable decrease of g. If λ_k is positive for all k, then the inverse matrix in (3) always exists, because G'(x^(k))^T G'(x^(k)) is a symmetric and positive semidefinite matrix. The method (3) thus has an important advantage: the search direction always exists, even if G'(x) is singular. Naturally, in the case of nonsmooth equations, some additional assumptions are needed to allow the use of line search strategies and to ensure global convergence. Because in some cases the function G is nondifferentiable, the equation G(x) = 0 is nonsmooth and the method (3) may be useless. A version of the Gauss–Newton method for solving complementarity problems was also introduced by Xiu and Zhang [9] for generalized problems, but only for linear ones. Thus, for solving nonsmooth and nonlinear problems, we propose two new versions of the damped Gauss–Newton algorithm based on the B-differential. The usual generalized method is a relevant extension of the work of Subramanian and Xiu [10] to the nonsmooth case. In turn, the inexact version is related to the traditional approach, which was widely studied, e.g., in [11]. In recent years, various versions of the Gauss–Newton method have been discussed, most frequently for solving nonlinear least-squares problems, e.g., in [12,13]. The paper is organized as follows: in the next section, we review the notions needed, such as the B-differential, BD-regularity, and semismoothness (Section 2.1). Next, we propose new optimization-based methods for the NCP, transforming the NCP into an unconstrained minimization problem by employing the function ϕ_3 (Section 2.2). We state their global convergence and superlinear convergence rate under appropriate conditions. In Section 3, we present the results of numerical tests.
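One iteration of the damped method (3) can be sketched as follows for a smooth residual; the function names and the toy usage below are illustrative assumptions, not part of the paper.

```python
import numpy as np

def damped_gauss_newton_step(G, J, x, lam, omega=1.0):
    """One iteration of the damped Gauss-Newton method (3):
    x+ = x - omega * (J(x)^T J(x) + lam*I)^{-1} J(x)^T G(x).
    Since J^T J is symmetric positive semidefinite, adding lam > 0 makes the
    system matrix positive definite, so the step exists even if J(x) is
    singular."""
    Gx, Jx = G(x), J(x)
    A = Jx.T @ Jx + lam * np.eye(len(x))
    d = np.linalg.solve(A, Jx.T @ Gx)
    return x - omega * d
```

For a linear residual G(x) = x − b with J = I and a tiny λ, a single full step (ω = 1) lands almost exactly on b, illustrating that the damping only slightly perturbs the pure Gauss–Newton direction.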

Preliminaries
If F is Lipschitz continuous, Rademacher's theorem [14] implies that F is differentiable almost everywhere. Let D_F denote the set of points where F is differentiable. Then, the B-differential (the Bouligand differential) of F at x (introduced in [15]) is

∂_B F(x) = { V : V = lim F'(x^(k)), x^(k) → x, x^(k) ∈ D_F },

where F'(x) denotes the usual Jacobian of F at x. The generalized Jacobian of F at x in the sense of Clarke [14] is

∂F(x) = conv ∂_B F(x).

We say that F is BD-regular at x if F is locally Lipschitz at x and all V ∈ ∂_B F(x) are nonsingular (regularity on account of the B-differential). Qi proved (Lemma 2.6, [15]) that, if F is BD-regular at x, then there exist a neighborhood N of x and a constant C > 0 such that, for any y ∈ N and V ∈ ∂_B F(y), V is nonsingular and ‖V^{-1}‖ ≤ C. Throughout this paper, ‖·‖ denotes the 2-norm. The notion of semismoothness was originally introduced for functionals by Mifflin [16]. The following definition is taken from Qi and Sun [17]. A function F is semismooth at a point x if F is locally Lipschitzian at x and

lim_{V ∈ ∂F(x + th'), h' → h, t ↓ 0} V h'

exists for any h ∈ R^n. Equivalently, F is semismooth at x if it is directionally differentiable at x and, for any V ∈ ∂F(x + h),

Vh − F'(x; h) = o(‖h‖) as h → 0.

Scalar products and sums of semismooth functions are semismooth. Piecewise smooth functions and maxima of finitely many smooth functions are also semismooth. Semismoothness is the most commonly seen assumption on F in papers dealing with nonsmooth equations, because it implies several properties important for the convergence analysis of methods in nonsmooth optimization.
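As a concrete illustration, take the NCP operator G(x) = −2 min(x, F(x)) componentwise (the Mangasarian reformulation with θ(t) = t, an illustrative assumption). An element of its B-differential can be selected row by row from the locally active smooth branch:

```python
import numpy as np

def b_differential_element(x, Fx, JFx):
    """One element V of the B-differential of G(x) = -2*min(x, F(x)):
    row i is -2*e_i if x_i < F_i(x), and -2 * (row i of F'(x)) if x_i > F_i(x);
    at a tie, either limiting Jacobian belongs to the B-differential
    (here we pick -2*e_i)."""
    n = len(x)
    V = np.empty((n, n))
    for i in range(n):
        if x[i] <= Fx[i]:
            row = np.zeros(n)
            row[i] = 1.0
        else:
            row = np.asarray(JFx)[i]
        V[i] = -2.0 * row
    return V
```

For instance, with F(x) = (x_2, x_1) at x = (0.5, 2), the first component has the branch x_1 active and the second the branch F_2(x) = x_1, so the selected V mixes identity rows with Jacobian rows of F.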
If, for any V ∈ ∂F(x + h),

Vh − F'(x; h) = O(‖h‖^{1+p}) as h → 0,

where 0 < p ≤ 1, then we say F is p-order semismooth at x. Clearly, p-order semismoothness implies semismoothness. If p = 1, then the function F is called strongly semismooth. Piecewise C^2 functions are examples of strongly semismooth functions. Qi and Sun [17] remarked that, if F is semismooth at x, then

F(x + h) − F(x) − F'(x; h) = o(‖h‖) as h → 0,

and, if F is p-order semismooth at x, then

F(x + h) − F(x) − F'(x; h) = O(‖h‖^{1+p}) as h → 0.

Remark 1.
Strong semismoothness of the appropriate function usually yields quadratic convergence of the method instead of the superlinear convergence obtained for merely semismooth functions.
In turn, Pang and Qi [18] proved that semismoothness of F at x implies that

sup_{V ∈ ∂F(x+h)} ‖F(x + h) − F(x) − Vh‖ = o(‖h‖).

Moreover, if F is p-order semismooth at x, then

sup_{V ∈ ∂F(x+h)} ‖F(x + h) − F(x) − Vh‖ = O(‖h‖^{1+p}).

The Algorithm and Its Convergence
Consider the nonlinear equation G(x) = 0 defined by ϕ_3. The equivalence of solving this equation and problem (NCP) is described by the following theorem:

Theorem 1 (Mangasarian [1]). Let θ be any strictly increasing function from R into R, that is, a < b implies θ(a) < θ(b), with θ(0) = 0. Then x solves (NCP) if and only if G(x) = 0, where, for convenience, we denote

G_i(x) = θ(|x_i − F_i(x)|) − θ(x_i) − θ(F_i(x)), for i = 1, 2, ..., n.
We now fix a particular function θ in Theorem 1 (the one defining ϕ_3) and let G(x) be the associated function. We define the function g in the following way:

g(x) = ½ ‖G(x)‖² = ½ G(x)^T G(x),

which allows us to solve the system G(x) = 0 by solving the nonlinear least-squares problem

min_{x ∈ R^n} g(x). (6)

Let us note that x* solves G(x) = 0 if and only if it is a stationary point of g. Thus, from Theorem 1, x* solves (1).

Remark 2.
On the other hand, provided G is differentiable, the first-order optimality conditions for problem (6) are equivalent to the nonlinear system

∇g(x) = G'(x)^T G(x) = 0,

where ∇g is the gradient of g and G'(x) is the Jacobian matrix of G.
The continuous differentiability of the merit function g for this kind of nonsmooth function was established by Ulbrich in the following lemma:

Lemma 1 (Ulbrich, [19]). Assume that the function G : R^n ⊃ D → R^n is semismooth or, stronger, p-order semismooth, 0 < p ≤ 1. Then, the merit function g(x) = ½‖G(x)‖² is continuously differentiable on D with gradient ∇g(x) = V^T G(x), where V ∈ ∂G(x) is arbitrary.
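Lemma 1 can be checked numerically. The sketch below uses the toy function F(x) = 2x − 1 and the min-based reformulation with θ(t) = t (both illustrative assumptions) and compares the gradient V^T G(x) against central finite differences of g at a point where each component has a strictly active branch.

```python
import numpy as np

def F(x):
    return 2.0 * x - 1.0

def G(x):
    # min-based reformulation: G(x) = -2 * min(x, F(x)) componentwise
    return -2.0 * np.minimum(x, F(x))

def g(x):
    return 0.5 * (G(x) @ G(x))

def grad_lemma1(x):
    """Gradient of g via Lemma 1: grad g(x) = V^T G(x), V in the B-differential."""
    n = len(x)
    V = np.zeros((n, n))
    for i in range(n):
        # derivative of the active branch of -2*min(x_i, 2*x_i - 1)
        V[i, i] = -2.0 if x[i] < F(x)[i] else -4.0
    return V.T @ G(x)

def grad_fd(x, h=1e-6):
    return np.array([(g(x + h * e) - g(x - h * e)) / (2.0 * h)
                     for e in np.eye(len(x))])
```

At, e.g., x = (0.5, 2), the two gradients agree to finite-difference accuracy even though G itself is nonsmooth.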
Lemma 2. Suppose that ∇g(x) ≠ 0. Then, given λ > 0 and V_x ∈ ∂_B G(x), the direction d given by

d = (V_x^T V_x + λI)^{-1} ∇g(x)

is an ascent direction for g. In particular, there is a positive ω such that g(x − ωd) < g(x).
Proof. The matrix V_x^T V_x is symmetric and positive semidefinite, so there exist constants β ≥ 0 and γ > 0 such that

γ‖y‖² ≤ y^T (V_x^T V_x + λI) y ≤ (β + λ)‖y‖² for all y ∈ R^n.

It follows that V_x^T V_x + λI and its inverse are symmetric positive definite; hence

∇g(x)^T d = ∇g(x)^T (V_x^T V_x + λI)^{-1} ∇g(x) > 0,

so d is an ascent direction for g (Section 8.2.1 in [4]).

Now, we present the generalized version of the damped Gauss–Newton method for solving the nonlinear complementarity problem.
Step 3: Find d^(k) as the solution of the linear system (A_k + λ_k I) d^(k) = ∇g(x^(k)).
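A complete runnable sketch of Algorithm 1 is given below. The steps not reproduced above are filled in following Remark 3 (λ_k = g(x^(k)) and an Armijo line search); the min-based reformulation with θ(t) = t, the Armijo parameters β and σ, and the stopping tolerance are illustrative assumptions, not prescriptions of the paper.

```python
import numpy as np

def generalized_damped_gauss_newton(F, JF, x0, tol=1e-10, beta=0.5,
                                    sigma=1e-4, max_iter=200):
    """Sketch of Algorithm 1 for the NCP reformulation G(x) = -2*min(x, F(x))."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    G = lambda y: -2.0 * np.minimum(y, F(y))
    g = lambda y: 0.5 * (G(y) @ G(y))

    def element_of_bdiff(y):
        # one element of the B-differential of G, picked from the active branch
        Fy, J = F(y), JF(y)
        V = np.empty((n, n))
        for i in range(n):
            V[i] = -2.0 * (np.eye(n)[i] if y[i] <= Fy[i] else J[i])
        return V

    for _ in range(max_iter):
        Vk = element_of_bdiff(x)
        grad = Vk.T @ G(x)                       # gradient via Lemma 1
        if np.linalg.norm(grad) < tol:           # Step 1: stationarity test
            break
        lam = g(x)                               # Step 2, as in Remark 3(i)
        d = np.linalg.solve(Vk.T @ Vk + lam * np.eye(n), grad)   # Step 3
        t = 1.0                                  # Steps 4-5: Armijo rule
        while g(x - t * d) > g(x) - sigma * t * (grad @ d) and t > 1e-12:
            t *= beta
        x = x - t * d
    return x
```

For the toy problem F(x) = (x_1 − 1, x_2 + 1) with JF = I (an assumed test case), the unique NCP solution is (1, 0), and the iteration reaches it from, e.g., the starting point (2, 2).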

Remark 3. (i) In Step 2, letting λ_k = g(x^(k)) is one of the simplest strategies, because then {λ_k} converges to 0. (ii) The line search step (Step 4) in the algorithm follows the Armijo rule.

Theorem 2. Let x^(0) be a starting point and {x^(k)} be the sequence generated by Algorithm 1. Assume that: (a) the level set L = {x : g(x) ≤ g(x^(0))} is bounded; (b) ∇g is Lipschitz continuous on L. Then, the generalized damped Gauss–Newton method described by Algorithm 1 is well defined and either {x^(k)} terminates at a stationary point of g, or else every accumulation point of {x^(k)}, if it exists, is a stationary point of g.

Proof.
The proof is almost the same as that of Theorem 2.1 in [10], with appropriately modified assumptions.
For the nonsmooth case, an alternative condition may be considered instead of the Lipschitz continuity of ∇g(x) (similarly to [10]). Thus, we have the following convergence theorem:

Theorem 3. Let x^(0) be a starting point and {x^(k)} be the sequence generated by Algorithm 1. Assume that: (a) the level set L = {x : g(x) ≤ g(x^(0))} is bounded; (b) G is semismooth on L.
Then, the generalized damped Gauss-Newton method described by Algorithm 1 is well defined and either {x (k) } terminates at a stationary point of g, or else every accumulation point of {x (k) }, if it exists, is a stationary point of g. Now, we take up the rate of convergence of the considered algorithm. The following theorem shows suitable conditions in various cases.

Theorem 4.
Suppose that x* is a solution of problem (1), G is semismooth, and G is BD-regular at x*. Then, there exists a neighborhood N* of x* such that, if x^(0) ∈ N* and the sequence {x^(k)} is generated by Algorithm 1, we have: (i) x^(k) ∈ N* for all k and the sequence {x^(k)} converges linearly to x*; (ii) if δ < 0.5, then the convergence is at least superlinear; (iii) if G is strongly semismooth, then the convergence is quadratic.
Proof. The proof of the similar theorem given by Subramanian and Xiu [10] is based on three lemmas with the same assumptions as the theorem. Now, we present these lemmas in versions adapted to our nonsmooth case.

Lemma 3. Assume that d_x is a solution of the equation

(V_x^T V_x + λI) d_x = ∇g(x)

for some matrix V_x taken from ∂_B G(x). Then, there is a neighborhood D_1 of x* such that, for all x ∈ D_1,

Lemma 4.
There is a neighborhood D_2 of x* such that, for all x ∈ D_2,

Lemma 5. Suppose that the conditions of Lemma 1 hold. Then, there is a neighborhood D_3 of x* such that, for all x ∈ D_3,

The proofs of Lemmas 3 and 5 are almost the same as in [10]; however, in the proof of Lemma 4, we have to take the semismoothness into account and use Lemma 1 to obtain the desired result.
At the same time, in a similar way, we may show the corresponding rate of convergence. Now, we consider the inexact version of the method, which computes an approximate step, using a nonnegative sequence of forcing terms to control the level of accuracy.
For the above inexact version of the algorithm, we can state analogous theorems that are the equivalents of Theorems 2–4. Based on our previous results, their proofs can be carried out almost in the same way as those for the 'exact' version of the method; however, the condition (7), implied by the inexactness in Step 3 of Algorithm 2, has to be taken into account. Thus, we omit both the theorems and their proofs here.
Step 3: Find d^(k) that approximately solves the linear system (A_k + λ_k I) d^(k) = ∇g(x^(k)), i.e., whose residual r^(k) = (A_k + λ_k I) d^(k) − ∇g(x^(k)) satisfies ‖r^(k)‖ ≤ η_k ‖∇g(x^(k))‖. (7)
Step 4: If g(x^(k) − d^(k)) ≤ g(x^(k)) − σ ∇g(x^(k))^T d^(k), then let x^(k+1) = x^(k) − d^(k) and go to Step 1.
Step 5: Otherwise, compute the smallest nonnegative integer m_k such that g(x^(k) − β^{m_k} d^(k)) ≤ g(x^(k)) − σ β^{m_k} ∇g(x^(k))^T d^(k), set x^(k+1) = x^(k) − β^{m_k} d^(k), and go to Step 1.
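The approximate solve in Step 3 of Algorithm 2 can be realized, for instance, by a conjugate gradient iteration truncated by the forcing term, since A_k + λ_k I is symmetric positive definite. The sketch below is one possible realization of condition (7), not the paper's prescribed solver.

```python
import numpy as np

def inexact_step(A, lam, grad, eta):
    """Return d whose residual satisfies ||(A + lam*I) d - grad|| <= eta*||grad||,
    using conjugate gradients on the symmetric positive definite matrix."""
    n = len(grad)
    M = A + lam * np.eye(n)
    d = np.zeros(n)
    r = grad.copy()            # residual grad - M @ d (d = 0 initially)
    p = r.copy()
    while np.linalg.norm(r) > eta * np.linalg.norm(grad):
        Mp = M @ p
        alpha = (r @ r) / (p @ Mp)
        d = d + alpha * p
        r_new = r - alpha * Mp
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d
```

Choosing a forcing sequence η_k → 0 tightens the solves as the iterates approach a solution, which is what preserves the fast local convergence of the exact method.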

Numerical Results
In this section, we present the results of our numerical experiments, obtained by coding both algorithms in Code::Blocks. We used double precision on an Intel Core i7 3.2 GHz machine running under the Windows Server 2016 operating system. We applied the generalized damped Gauss–Newton method to solve three nonlinear complementarity problems. In the following examples, N_1 and N_2 denote the numbers of iterations needed to satisfy the stopping criterion ‖x^(k+1) − x^(k)‖ < 10^{-7} using Algorithms 1 and 2, respectively. The forcing terms in Algorithm 2 were chosen as follows:

Example 1 (from Kojima and Shindo [20]). Let the function F : R^4 → R^4 have the form given in [20]. Problem (NCP) with this function F has two solutions: x* = (1, 0, 3, 0)^T and x** = (√6/2, 0, 0, 0.5)^T, for which F(x*) = (0, 31, 0, 4)^T and F(x**) = (0, 2 + √6/2, 0, 0)^T. Thus, x* is a non-degenerate solution of (NCP), because x*_i + F_i(x*) > 0 for all i. Depending on the starting point, the iteration process converged to either of the two solutions (see Table 1 or Figure 1).

Example 2. Let the function F : R^2 → R^2 be defined as follows: Then, problem (NCP) has two solutions: Similar to Example 1, we obtained convergence of the iteration process to both solutions, depending on the starting point (see Table 2 or Figure 2).

In the third example, F is strictly monotone, so the corresponding problem (NCP) has exactly one solution.
Calculations were made for various n with one starting point x^(0) = (0, ..., 0)^T. For all tests, we obtained the same numbers of iterations: N_1 = 3 and N_2 = 4.
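The explicit formula for F in Example 1 did not survive into this text; the classical Kojima–Shindo function below reproduces the reported values F(x*) = (0, 31, 0, 4)^T at x* = (1, 0, 3, 0)^T, so it is presumably the intended F, though this identification is an assumption.

```python
import math

# Classical Kojima-Shindo test function (presumed to be the F of Example 1;
# it matches the solution values reported in the text).
def F(x):
    x1, x2, x3, x4 = x
    return [
        3*x1**2 + 2*x1*x2 + 2*x2**2 + x3 + 3*x4 - 6,
        2*x1**2 + x1 + x2**2 + 10*x3 + 2*x4 - 2,
        3*x1**2 + x1*x2 + 2*x2**2 + 2*x3 + 9*x4 - 9,
        x1**2 + 3*x2**2 + 2*x3 + 3*x4 - 3,
    ]

x_star = [1.0, 0.0, 3.0, 0.0]
print(F(x_star))   # -> [0.0, 31.0, 0.0, 4.0]
```

The second stated solution x** = (√6/2, 0, 0, 0.5)^T also satisfies the complementarity conditions for this F, with F(x**) vanishing in all components except the second.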

Conclusions
We have given a nonsmooth version of the damped generalized Gauss–Newton method presented by Subramanian and Xiu [10]. Generalized Newton algorithms related to the Gauss–Newton method are well-known and important tools for solving nonsmooth equations, which arise from various nonlinear problems such as nonlinear complementarity or variational inequality problems. These algorithms are especially useful when the problem has many variables. We have proved that the sequences generated by the methods are superlinearly convergent under mild assumptions. Clearly, semismoothness and BD-regularity are sufficient to obtain only superlinear convergence of the methods, while strong semismoothness gives even quadratic convergence. However, if the function G is not semismooth or not BD-regular, or the gradient of g is not Lipschitzian, the Gauss–Newton methods may be useless.
The performance of both methods was evaluated in terms of the number of iterations required. The analysis of the numerical results indicates that the methods are usually reliable for solving semismooth problems. The results show that the inexact approach can produce a noticeable slowdown measured by the number of iterations (compare N_1 and N_2 in Figures 1 and 2). In turn, an important advantage is that the algorithms allow us to find various solutions of a problem (this can be observed in Examples 1 and 2). However, if the problem has many solutions, the relationship between the starting point and the obtained solution may be unpredictable.
Clearly, traditional numerical algorithms are not the only methods for solving nonlinear complementarity problems, regardless of the degree of nonsmoothness. Besides the methods presented in this paper and mentioned in the Introduction, some computational intelligence algorithms can be used to solve (NCP) problems, among others, monarch butterfly optimization (see [22,23]), the earthworm optimization algorithm (see [24]), elephant herding optimization (see [25,26]), and the moth search algorithm (see [27,28]). All of these approaches are bio-inspired metaheuristic algorithms.