An Inexact Optimal Hybrid Conjugate Gradient Method for Solving Symmetric Nonlinear Equations

Abstract: This article presents an inexact optimal hybrid conjugate gradient (CG) method for solving symmetric nonlinear systems. The method is a convex combination of the optimal Dai–Liao (DL) and the extended three-term Polak–Ribière–Polyak (PRP) CG methods. Two different formulas for selecting the convex parameter are derived, one from the conjugacy condition and the other by combining the proposed direction with the default Newton direction. The proposed method is derivative-free, so Jacobian information is not required throughout the iteration process. Furthermore, the global convergence of the proposed method is established under some appropriate assumptions. Finally, the numerical performance of the method is demonstrated by solving some examples of symmetric nonlinear problems and comparing it with some existing CG solvers for symmetric nonlinear equations.


Introduction
Consider the symmetric nonlinear system

F(x) = 0,  (1)

where F : R^n → R^n is a continuously differentiable mapping. The symmetry of the function F(x) means that the Jacobian of F(x) is symmetric. Such a class of problems arises, for example, from the gradient mapping of an unconstrained optimization problem, the Karush-Kuhn-Tucker (KKT) conditions of an equality-constrained optimization problem, the discretized two-point boundary value problem, the saddle point problem, and the discretized elliptic boundary value problem; see [1-3]. The Newton method and its modifications are among the common methods for solving (1) despite their drawbacks [4-6]. One of the main drawbacks is that if the Jacobian of F(x) is singular, the Newton method may fail to find a solution of (1). Moreover, the need to solve a linear system or to store the Jacobian inverse at each iteration is also a well-known drawback of both Newton and quasi-Newton methods.
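As a concrete instance of (1), when F is the gradient mapping of a twice continuously differentiable function, the Jacobian of F is the Hessian of that function and is therefore symmetric. The short Python sketch below, with an arbitrarily chosen test function that is not one of the benchmark problems used later, verifies this property numerically by finite differences.

import numpy as np

def F(x):
    # Gradient mapping of f(x) = 0.25*sum(x**4) + 0.5*x^T A x with A symmetric,
    # so the Jacobian of F (the Hessian of f) is symmetric.
    A = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])
    return x**3 + A @ x

def jacobian_fd(F, x, h=1e-6):
    # Forward-difference approximation of the Jacobian of F at x.
    n = x.size
    Fx = F(x)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (F(x + e) - Fx) / h
    return J

x = np.array([0.3, -0.2, 0.5])
J = jacobian_fd(F, x)
print(np.allclose(J, J.T, atol=1e-4))   # True, up to finite-difference error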
The conjugate gradient method is known to be a remedy for the matrix storage and linear algebra requirements of such iterative methods for finding the solution of (1); see [7]. The method performs the iteration

x_{k+1} = x_k + \gamma_k d_k,  (2)

where γ_k is the step size to be determined via any suitable line search and d_k is the CG direction defined by

d_k = -F_k + \beta_k d_{k-1},  d_0 = -F_0.  (3)

The term β_k in (3) is a scalar known as the CG parameter, and it is this scalar that distinguishes the various CG methods [8-12]. The CG parameter proposed by Dai and Liao (DL) is considered to be one of the most efficient whenever its non-negative constant is appropriately selected [13]. The DL CG parameter is given by

\beta_k^{DL} = \frac{F_k^T (y_{k-1} - t\, s_{k-1})}{d_{k-1}^T y_{k-1}},  (4)

with t ≥ 0, s_{k-1} = x_k - x_{k-1}, y_{k-1} = F_k - F_{k-1}, and F_k = F(x_k). Moreover, in an attempt to ensure an appropriate and optimal selection of t, Babaie-Kafaki and Ghanbari [14] proposed the following choices:

t_k^1 = \frac{\|y_{k-1}\|}{\|s_{k-1}\|}  (5)

and

t_k^2 = \frac{y_{k-1}^T s_{k-1}}{\|s_{k-1}\|^2} + \frac{\|y_{k-1}\|}{\|s_{k-1}\|}.  (6)
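For illustration, a minimal Python sketch of the update (2)-(4) is given below; the vectors are placeholders, and the safeguard on the denominator is an implementation detail added here rather than part of the method's statement.

import numpy as np

def beta_DL(F_k, d_prev, s_prev, y_prev, t):
    # Dai-Liao parameter (4): beta = F_k^T (y_{k-1} - t*s_{k-1}) / (d_{k-1}^T y_{k-1}).
    denom = d_prev @ y_prev
    if abs(denom) < 1e-12:              # illustrative safeguard against a near-zero denominator
        return 0.0
    return F_k @ (y_prev - t * s_prev) / denom

def cg_direction(F_k, d_prev, beta):
    # CG direction (3): d_k = -F_k + beta_k * d_{k-1}.
    return -F_k + beta * d_prev

# One iteration of (2) with a step size gamma obtained from a line search:
# x_next = x + gamma * cg_direction(F_k, d_prev, beta_DL(F_k, d_prev, s_prev, y_prev, t))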
Notwithstanding, the CG parameter proposed by Polak, Ribière, and Polyak (PRP) is also numerically efficient and possesses a restart feature that avoids jamming [8,9]. The PRP parameter is given by

\beta_k^{PRP} = \frac{F_k^T y_{k-1}}{\|F_{k-1}\|^2}.  (7)

Babaie-Kafaki and Ghanbari [10] proposed a descent modification of the PRP parameter, involving a real constant ζ, based on the Dai-Liao approach [13]. Based on a comprehensive singular value analysis, the authors suggested an optimal choice ζ_k^* of the real constant ζ; see [15]. Moreover, for symmetric nonlinear equations, Li and Wang [7] proposed an effective algorithm by combining the modified Fletcher-Reeves CG method [16] with the inexact gradient relation introduced in [1]. The reported numerical experiments illustrated that their algorithm is promising for handling large-scale symmetric nonlinear equations. This result inspired further studies on conjugate gradient methods for solving symmetric nonlinear equations. Zhou and Chen used the modified three-term CG method [17] with the approximate gradient and suggested an HS-type CG method for solving symmetric nonlinear equations [18]. Thereafter, several efficient CG methods for unconstrained optimization were combined with the approximate gradient relation to solve large-scale symmetric nonlinear equations, for example, the inexact PRP method [5], the norm descent CG method [19], and the derived CG algorithm for symmetric nonlinear systems [20]; see [12,21-23] for more details. In this article, motivated by the efficiency of both the optimal DL CG method [14] and the optimal PRP CG method [15], we propose an inexact optimal hybrid method for solving symmetric nonlinear equations. The proposed method is derivative-free and matrix-free and can therefore handle large-scale symmetric nonlinear systems efficiently. The remaining parts of this paper are organized as follows. The next section gives the derivation and details of the proposed method, followed by the convergence analysis, the numerical experiments, and the conclusion.

A Class of Optimal Hybrid CG Methods
This section describes the proposed optimal hybrid CG method for solving symmetric nonlinear equations, with two different choices for the selection of the convex parameter at every iteration. Recall that Li and Fukushima [1] suggested the approximation

h_k = \frac{F(x_k + \gamma_k F_k) - F_k}{\gamma_k} \approx \nabla f(x_k),  (10)

for the gradient of the function f(x), where γ_k is an arbitrary scalar and the function f(x) is specified by

f(x) = \frac{1}{2}\|F(x)\|^2.  (11)

Now, we propose the optimal hybrid CG parameter β_k^H in (12) as a convex combination of the extended three-term PRP parameter and the optimal DL parameter, with weights ω_k and 1 - ω_k, respectively, where the step size γ_k is determined by a derivative-free line search with τ_1, τ_2 ∈ (0, 1), and ξ_k is a non-negative sequence satisfying condition (16). Moreover, we define our optimal hybrid direction d_k as in (17). An optimal selection of ω_k ∈ [0, 1] in (12) would yield an optimal β_k^H. Therefore, in order to have an optimal choice of ω_k ∈ [0, 1] at every iteration, we proceed as follows.
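We remark that the approximation (10) requires only one extra evaluation of F per iteration and no Jacobian information; a minimal Python sketch, with illustrative naming, is given below.

def approx_gradient(F, x_k, gamma_k):
    # Li-Fukushima estimate (10) of grad f(x_k) = J(x_k) F(x_k) (J symmetric):
    # h_k = (F(x_k + gamma_k * F(x_k)) - F(x_k)) / gamma_k.
    # In practice F(x_k) is already available from the current iteration.
    F_k = F(x_k)
    return (F(x_k + gamma_k * F_k) - F_k) / gamma_k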

The First Choice
The Newton method is known to contain the full information of the Jacobian matrix; therefore, the first hybridization parameter is derived by combining the proposed direction with the Newton direction. Recall that the default Newton direction is defined by

d_k = -J_k^{-1} F_k,  (18)

where J_k^{-1} is the Jacobian inverse. Using the approximate gradient (10), the Newton direction can be rewritten as in (19). Combining (19) with (17) gives (20), and multiplying (20) by -s_{k-1}^T J_k yields (21). It can be observed that (21) still involves the Jacobian matrix; therefore, to eliminate it we employ the secant equation

J_k s_{k-1} = y_{k-1}.  (22)

Now, using (22) in (21) together with the symmetry of the Jacobian matrix, we obtain (23), which yields (24). Finally, solving for ω_k gives the first formula ω_k^1 in (25).
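The step that removes the Jacobian can be spelled out as follows. The identity below is a generic observation relying only on the secant relation (22) and the symmetry J_k^T = J_k, not on the specific form of (21): for any vector d,

s_{k-1}^T J_k d = (J_k^T s_{k-1})^T d = (J_k s_{k-1})^T d = y_{k-1}^T d,

so every occurrence of s_{k-1}^T J_k can be replaced by y_{k-1}^T. This is how (21) becomes Jacobian-free and can then be solved for ω_k to give (25).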

The Second Choice
The second choice of the hybridization parameter is obtained by utilizing the conjugacy condition. Recall that the conjugacy condition is given by

d_k^T y_{k-1} = 0.  (26)

Now, using the direction (17) in Equation (26), we obtain (27), and using the definition of β_k^{H*} in (27), we get (28). Therefore, solving for ω_k in (28), we obtain the second formula ω_k^2 in (29) for the computation of ω_k at every iteration.

Furthermore, note that the proposed choices for ω_k in (25) and (29) may lie outside the interval [0, 1]. To maintain the convex combination in (12), we set ω_k = 1 whenever ω_k > 1 and ω_k = 0 whenever ω_k < 0. An interesting feature of the proposed method is that for ω_k = 1 it reduces to the PRP-type method [24], while for ω_k = 0 it becomes an inexact Dai-Liao CG method for symmetric nonlinear equations, which has not yet been presented in the literature. The proposed Algorithm 1 is stated below.

Algorithm 1: Optimal hybrid CG method (OHCG).
Step 0: Select x_0 ∈ R^n, initialize the constants s ∈ (0, 1), ε > 0, and τ_1, τ_2 ∈ (0, 1), set k = 0, and choose the positive sequence {ξ_k}.
Step 1: If ||F_k|| ≤ ε, stop; otherwise go to Step 2.
Step 2: Calculate the search direction via (17) using either of the optimal choices (25) or (29).
Step 3: Determine γ_k = max{1, s, s^2, ...} satisfying the line search condition (30).
Step 4: Compute x_{k+1} via (2).
Step 5: Set k = k + 1 and go to Step 1.
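To summarize the overall procedure, a schematic Python sketch of Algorithm 1 is given below. The hybrid direction (17) with ω_k from (25) or (29), and the acceptance test of the derivative-free line search condition (30), are left as user-supplied callbacks; only the loop structure of Steps 0-5 is reproduced here, and the names, the small lower bound on γ, and the interface are illustrative assumptions rather than part of the algorithm.

import numpy as np

def ohcg(F, x0, direction, accept, s=0.3, eps=1e-4, max_iter=1000):
    # Schematic skeleton of Algorithm 1 (OHCG).
    #   direction(x_k, F_k, state) -> d_k : hybrid direction (17) with omega_k from (25) or (29)
    #   accept(x_k, d_k, gamma)    -> bool: derivative-free line search test, cf. (30)
    x = np.asarray(x0, dtype=float)
    state = {}                                   # storage for s_{k-1}, y_{k-1}, d_{k-1}, etc.
    for k in range(max_iter):
        F_k = F(x)
        if np.linalg.norm(F_k) <= eps:           # Step 1: stop when ||F_k|| <= eps
            break
        d = direction(x, F_k, state)             # Step 2: search direction via (17)
        gamma = 1.0                              # Step 3: backtracking over {1, s, s^2, ...}
        while not accept(x, d, gamma):
            gamma *= s
            if gamma < 1e-12:                    # illustrative safeguard, not in the paper
                break
        x = x + gamma * d                        # Step 4: update the iterate via (2)
    return x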

Global Convergence
This section proves the global convergence of Algorithm 1 under the following assumptions. Start by defining the level set such that ξ fulfills condition (16). Assumption 1 implies that there exist constants R_1, R_2, q_1 > 0 such that the corresponding bounds hold on this level set.

Proof. From the step length condition (30) and the definition (11), it is not difficult to see that the required inequality holds. As the sequence {ξ_k} is bounded, we can apply Lemma 3.3 in [25]; hence {||F_k||} converges, and the result follows.

Lemma 2. Suppose our assumptions hold. Then the stated relation holds.

Proof. The proof follows directly from (16) and (30).
CASE II: lim sup_{k→∞} γ_k = 0. Since γ_k ≥ 0, this shows that lim_{k→∞} γ_k = 0. Considering the definition of h_k in (10) together with (34), then using (33) and, furthermore, the mean-value theorem, we obtain the required estimate, where we used the Cauchy-Schwarz inequality in the first equality and inequalities (43), (44), and (45) in the last inequality. Now, from (38) there exists ĉ ∈ (0, 1) such that ||s_{k-1}|| ≤ ĉ; thus the resulting bound shows that our direction is bounded. As lim_{k→∞} γ_k = 0, the trial step γ_k/s does not satisfy (30); that is to say, the corresponding acceptance inequality is violated. By the mean value theorem, there exists δ ∈ (0, 1) such that the corresponding expansion holds. As χ is bounded and x_k ∈ χ, we may assume that x_k → x^* and obtain the corresponding limit using (3) and (10). On the other hand, the opposite limit relation holds. This yields a contradiction, and thus the proof is complete.

Numerical Experiment
This section provides a numerical comparison between the proposed algorithm and some efficient CG solvers for symmetric nonlinear equations. For the numerical experiments with the proposed algorithm, we chose the optimal choice t_k^1 defined by (5) for the optimal DL method and ζ_k^* defined by (9) for the optimal PRP method. The algorithms were written in Matlab R2014a and run on a 1.6 GHz CPU with 8 GB of RAM. To make the comparison fair, the parameters for the NDAS method [22] and the ICGM method [21] were set as in the reference papers, while we set s = 0.3, τ_1 = τ_2 = 0.0001, γ_{k-1} = 0.01, and ε = 10^{-4} for our proposed methods. In our experiment, we considered nine benchmark problems taken from the literature: Problem 1 [26], Problem 2 [27], Problems 3 and 4, Problem 5 [27], and Problems 6-9 [29].

The numerical efficiency of our algorithm in terms of the number of iterations and the CPU time in seconds is shown in Tables 1-5, where "ITR" denotes the number of iterations, "TIME" the CPU time, and "NORM" the norm of the function value at the stopping point. The iteration ends whenever ||F_k|| ≤ 10^{-4} or the number of iterations reaches 1000. Throughout the experiments, the following initial points were used: x_1 = (0.1, 0.1, ..., 0.1), x_2 = (0.2, 0.2, ..., 0.2), x_3 = (0.3, 0.3, ..., 0.3), x_4 = (1, 1/2, 2/3, ..., n/n), x_5 = (1-1, 1-1/2, 1-2/3, ..., 1-n/n), x_6 = (-0.1, -0.1, ..., -0.1), x_7 = 0.1(1, 1/2, 2/3, ..., n/n), and x_8 = -(1-1, 1-1/2, 1-2/3, ..., 1-n/n).

Tables 1-5. Numerical efficiency of Algorithm 1 with the choices β_k^{H*}(ω_k^1) and β_k^{H*}(ω_k^2) versus the NDAS method [22] and the ICGM method [21]; for each solver, the columns report DIM, GUESS, ITR, TIME, and NORM.

From Table 1, which includes Problems 1 and 2, it can be observed that our algorithm wins in both the number of iterations and the CPU time, followed by the ICGM and NDAS algorithms. Note, however, that the NDAS method failed to solve Problem 1 for six of the initial points, while the NDAS method has an advantage over the ICGM method on Problem 2 in both the number of iterations and the computing time. Similarly, for all remaining problems, our algorithm requires the fewest iterations and the least CPU time, except for Problems 5 and 8.
For these two problems, the NDAS and ICGM methods require substantially less computing time and compete with the proposed algorithm in the number of iterations. We provide a picture of the overall performance for both the number of iterations and the CPU time using the Dolan and Moré [30] performance profile. In this profile, the best method is the one whose curve lies toward the top left corner. It can be observed that our algorithm produces the top left curves in Figures 1 and 2 for the number of iterations and the CPU time, respectively. This clearly shows that our algorithm is the best in terms of the number of iterations and the CPU time compared with the NDAS and ICGM algorithms.

Figure 1. Performance profile of Algorithm 1 with the choices β_k^{H*}(ω_k^1) and β_k^{H*}(ω_k^2) versus the NDAS method [22] and the ICGM method [21] for the number of iterations.

Figure 2. Performance profile of Algorithm 1 with the choices β_k^{H*}(ω_k^1) and β_k^{H*}(ω_k^2) versus the NDAS method [22] and the ICGM method [21] for the CPU time in seconds.
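The profiles in Figures 1 and 2 follow the standard Dolan and Moré construction: for each problem p and solver s, the cost t_{p,s} (iterations or CPU time) is divided by the best cost on that problem, and ρ_s(τ) is the fraction of problems solved within a factor τ of the best. The Python sketch below is a generic implementation of this construction with failures assigned a large penalty ratio; it is not the authors' plotting script.

import numpy as np
import matplotlib.pyplot as plt

def performance_profile(T, labels, penalty=1e6):
    # T: (n_problems, n_solvers) array of costs; np.nan marks a failure.
    T = np.array(T, dtype=float)
    T[np.isnan(T)] = penalty                      # failures receive a huge cost
    ratios = T / T.min(axis=1, keepdims=True)     # r_{p,s} = t_{p,s} / min_s t_{p,s}
    taus = np.sort(np.unique(ratios))
    for j, name in enumerate(labels):
        rho = [(ratios[:, j] <= tau).mean() for tau in taus]   # fraction within factor tau
        plt.step(taus, rho, where="post", label=name)
    plt.xlabel("performance ratio tau")
    plt.ylabel("rho(tau)")
    plt.legend()
    plt.show()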

Conclusions
This paper presented an inexact optimal hybrid CG algorithm for solving systems of symmetric nonlinear equations. The hybrid method presented here is a convex combination of the optimal DL method and the optimal three-term PRP CG method, using the Li and Fukushima approximate gradient relation. The method is matrix-free and derivative-free, enabling it to handle large-scale systems of symmetric nonlinear equations effectively.
Moreover, some mild assumptions were used to prove the global convergence of the method. The numerical efficiency of the method was also demonstrated on a set of test problems by comparing the number of iterations and the CPU time with those of the NDAS method [22] and the ICGM method [21]. Its overall performance shows that the proposed approach is a better alternative for solving symmetric nonlinear equations in terms of the number of iterations and the CPU time. In general, CG methods for symmetric nonlinear systems can be improved by devising new efficient CG parameters or by modifying existing CG directions using appropriate techniques. As a possibility for future research, a time complexity analysis and a comparison with evolutionary optimization algorithms for global optimization problems could be investigated.