1. Introduction
Conjugate Gradient (CG) algorithms are highly efficient for solving nonlinear problems, particularly those of large scale. Their efficiency stems from the fact that they avoid storing matrices, relying instead on a few column vectors. These algorithms are particularly effective for solving unconstrained minimization problems of the form
\[
\min_{x \in \mathbb{R}^{n}} f(x),
\]
where $f:\mathbb{R}^{n} \to \mathbb{R}$ is continuously differentiable and bounded below. To solve this problem using the CG method, an initial point $x_0 \in \mathbb{R}^{n}$ is chosen, and the iterative procedure
\[
x_{k+1} = x_k + \alpha_k d_k, \qquad k = 0, 1, 2, \ldots,
\]
is applied, where the sequence $\{x_k\}$ is expected to converge to the solution. The search direction $d_k$ is determined by
\[
d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_k d_k,
\]
where $g_k = \nabla f(x_k)$ is the gradient of f at $x_k$, and $\beta_k$ is a CG parameter that varies across different CG algorithms. The step size $\alpha_k$ is obtained via a line search along the direction $d_k$. A first-order necessary condition for optimality states that if $x^{*}$ is a minimizer of f, then $\nabla f(x^{*}) = 0$.
Problems of this form arise in numerous high-impact applications, including signal recovery [
1], image restoration [
2], logistic regression [
3], power flow equations [
4], and variational inequalities [
5].
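To make the generic scheme recalled above concrete, the following minimal MATLAB sketch implements the iteration $x_{k+1} = x_k + \alpha_k d_k$ with the classical Fletcher–Reeves choice $\beta_k = \|g_{k+1}\|^{2}/\|g_k\|^{2}$ and a simple Armijo backtracking line search. It is offered only as an illustration of the general CG framework, not of the methods proposed in this paper; the function handles f and gradf, the Armijo constant, and the tolerances are placeholder choices.

```matlab
function x = cg_fletcher_reeves(f, gradf, x, maxit, tol)
% Minimal illustrative CG loop: x_{k+1} = x_k + alpha_k * d_k,
% d_{k+1} = -g_{k+1} + beta_k * d_k with the Fletcher-Reeves beta.
g = gradf(x);  d = -g;
for k = 1:maxit
    if norm(g) <= tol, break; end            % first-order stationarity test
    alpha = 1;                               % Armijo backtracking line search
    while f(x + alpha*d) > f(x) + 1e-4*alpha*(g'*d)
        alpha = 0.5*alpha;
    end
    x    = x + alpha*d;                      % iterate update
    gnew = gradf(x);
    beta = (gnew'*gnew)/(g'*g);              % Fletcher-Reeves CG parameter
    d    = -gnew + beta*d;                   % new search direction
    g    = gnew;
end
end
```

For instance, cg_fletcher_reeves(@(x) 0.5*(x'*x), @(x) x, ones(5,1), 100, 1e-8) recovers the minimizer x = 0 of a simple quadratic.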
For a CG algorithm to be both convergent and efficient, the search direction $d_k$ must satisfy certain criteria. These include the boundedness of the search direction and the sufficient descent property
\[
g_k^{\top} d_k \le -c\,\|g_k\|^{2} \quad \text{for some } c > 0.
\]
However, some CG algorithms fail to achieve global convergence because they do not satisfy this sufficient descent condition.
A restart strategy is often used to ensure that the search direction $d_k$ satisfies the sufficient descent condition at each iteration. Various restart algorithms have been proposed in the literature. For example, Powell [6] introduced a restart technique to improve the convergence rate and ensure the descent property of the search direction. Other researchers, such as Boland and Kowalik [7], Shanno [8], Knockaert and Zutter [9], and Dai [10], have also developed restart procedures to guarantee global convergence of their algorithms.
Despite these advancements, many restart strategies based on the Powell criterion can suffer from slow convergence because they discard information from previous search directions. To address this, Jiang et al. [11] proposed efficient restart algorithms that exploit previous search directions and ensure global convergence under Wolfe line search conditions. Their work demonstrated improved performance over traditional restart methods.
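As background on how such a restart test is typically coded (this is the classical Powell criterion, not the restart rule developed in this paper), the following fragment restarts with the steepest-descent direction whenever consecutive gradients are far from orthogonal; g, gnew, beta, and d are the quantities from the sketch above.

```matlab
% Classical Powell restart test (illustrative only, not the rule proposed here):
% restart with the steepest-descent direction when consecutive gradients
% lose approximate orthogonality.
if abs(gnew'*g) >= 0.2*(gnew'*gnew)
    d = -gnew;               % discard previous information: restart
else
    d = -gnew + beta*d;      % standard CG update
end
```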
In recent years, derivative-free methods have gained attention for solving problems where the gradient
is not available. These methods rely solely on function evaluations, making them suitable for problems where
f is not differentiable. Cheng [
12] proposed a PRP restart algorithm for solving such problems, demonstrating its efficiency over non-restart algorithms. However, the effectiveness of the restart strategy was not fully validated due to limited numerical experiments. In [
13], Li et al. proposed two classes of derivative-free methods with self-adaptive and restart features. The proposed methods, which extend MCG1 and MCG2 [
14], generate descent directions and achieve global convergence. Additionally, the linear convergence rate of these methods was established under a local error bound condition, and their effectiveness in solving constrained nonlinear equations, together with an application, was demonstrated. However, the methods in [12,13] require F to be monotone and Lipschitz continuous. For more on derivative-free methods, interested readers are referred to [
1,
2,
15,
16,
17,
18].
Pseudomonotonicity, a weaker condition than monotonicity, has also been explored in the context of CG algorithms. Recent work by Liu et al. [
19], Awwal et al. [
20], and Liu et al. [
21] has focused on solving problems involving pseudomonotone and continuous operators, though their algorithms do not incorporate restart strategies. More details about pseudomonotone operators can also be found in [
22].
In this work, we propose two efficient restart algorithms for solving problems involving pseudomonotone and continuous operators. The algorithms are designed to ensure global convergence while maintaining computational efficiency. The manuscript is organized as follows:
Section 2 introduces the proposed algorithm and its convergence properties,
Section 3 presents numerical experiments demonstrating the algorithm’s performance,
Section 4 applies the method to logistic regression models, and
Section 5 concludes the work.
2. Derivative-Free Projection Method with Restart Technique
In this section, we propose two classes of restart derivative-free algorithms to solve (1) with F being
- 1. pseudomonotone, i.e., for all $x, y \in \mathbb{R}^{n}$, $F(y)^{\top}(x - y) \ge 0$ implies $F(x)^{\top}(x - y) \ge 0$; and
- 2. continuous.
To begin with, we recall the SPCG2+ method for solving the unconstrained optimization problem proposed by Liu et al. [
23]. The search direction is defined as follows:
where
is defined as
To obtain global convergence, the search direction generated by the algorithm is required to satisfy the sufficient descent property. We therefore propose two classes of restart methods, with search directions defined in (5) and (6), respectively.
Note that the restart conditions in (5) and (6) are necessary for the search direction to satisfy inequality (2), which in turn is needed for the algorithm to converge. However, these restart conditions may not always hold. When they fail, the search direction is defined so that (2) is always satisfied; this fallback has also been shown to be an efficient search direction in practice.
Next, we present two restart algorithms that generate approximate solutions of the pseudomonotone Equation (1) with the search directions in (5) and (6), together with their convergence analysis.
Algorithm 1 Restart Derivative-Free Projection Method (RDFPM)
- 1: Initialization: Select an initial vector and the algorithm parameters, and set $k = 0$.
- 2: Step 1. Evaluate $F(x_k)$. If the stopping criterion is satisfied, then stop; else go to Step 2.
- 3: Step 2. Compute the search direction $d_k$ by (5) or (6).
- 4: Step 3. Set the trial point via the line search along $d_k$.
- 5: Step 4. If the stopping criterion holds at the trial point, stop; else go to Step 5.
- 6: Step 5. Compute the next iterate $x_{k+1}$ via the projection step.
- 7: Step 6. Let $k = k + 1$ and repeat the process from Step 1.
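To give a concrete picture of the projection framework that Algorithm 1 follows, the sketch below implements a generic derivative-free hyperplane projection iteration of the Solodov–Svaiter type. It uses the simple placeholder direction d = -F(x) rather than the proposed directions (5) and (6), and the parameters sigma and rho are assumed values, so it illustrates the general framework only, not Algorithm 1 itself.

```matlab
function x = dfpm_sketch(F, x, maxit, tol)
% Generic derivative-free hyperplane projection iteration (illustrative only).
% The line search and projection step follow the classical Solodov-Svaiter scheme.
sigma = 1e-4; rho = 0.5;                        % assumed line-search parameters
for k = 1:maxit
    Fx = F(x);
    if norm(Fx) <= tol, break; end              % residual-based stopping test
    d = -Fx;                                    % placeholder direction, not (5)/(6)
    t = 1;                                      % backtrack until
    while -F(x + t*d)'*d < sigma*t*norm(d)^2    %   -F(z)'*d >= sigma*t*||d||^2
        t = rho*t;
    end
    z  = x + t*d;                               % trial point
    Fz = F(z);
    % project x onto the hyperplane {u : F(z)'*(u - z) = 0} separating x from the solution set
    x  = x - (Fz'*(x - z)) / norm(Fz)^2 * Fz;
end
end
```

Calling, e.g., dfpm_sketch(@(x) x + sin(x), 5*ones(10,1), 1000, 1e-8) drives the residual of this monotone (hence pseudomonotone) operator below the tolerance.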
Lemma 1. Let the residual be nonzero at the first iterates $x_0, x_1, \ldots, x_k$ obtained via Algorithm 1. If $d_k$ is defined by (5) or (6), then inequality (7) holds.
Proof. Consider
defined by (
5), then
Case 1: When , .
Case 2: When
and
, we get
Case 3: If
and
, we get
Therefore, for all three cases above, we end up with (
7).
The proof for the case when $d_k$ is defined by (6) is similar because the argument is independent of the restart condition, so we omit it. □
Remark 1. As Algorithm 1 generates iterates $x_k$ with $F(x_k) \neq 0$, we can deduce from Lemma 1 that $F(x_k) \neq 0$ implies $d_k \neq 0$. This means that Algorithm 1 runs as long as $x_k$ is not a solution, i.e., $F(x_k) \neq 0$. Before stating the next lemma, we assume that the solution set of problem (1) is nonempty.
Lemma 2. Let $x^{*}$ be a solution of problem (1) for a pseudomonotone and continuous operator F. If a sequence $\{x_k\}$ is generated by Algorithm 1, then
\[
\|x_{k+1} - x^{*}\| \le \|x_k - x^{*}\| \quad \text{for all } k;
\]
precisely, $\lim_{k \to \infty} \|x_k - x^{*}\|$ exists.
Proof. Since $x^{*}$ is a solution of (1) and F is pseudomonotone, then
In addition, by the definition of
in Algorithm 1, we have
which implies
Next, we estimate
using (
8), (
10), (
11) and
:
From the last inequality, $\{\|x_k - x^{*}\|\}$ is a decreasing sequence; hence, the limit exists. □
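Because the displayed estimates are not reproduced above, it may help to recall the standard argument behind this Fejér-type monotonicity for hyperplane projection steps. The chain below is a generic sketch, with $z_k$ the trial point and $\lambda_k \ge 0$ the projection step length guaranteed by the line search; it is not a verbatim copy of the paper's estimates.

```latex
\begin{align*}
\|x_{k+1}-x^{*}\|^{2}
  &= \bigl\|x_{k}-x^{*}-\lambda_{k}F(z_{k})\bigr\|^{2},
     \qquad \lambda_{k}:=\frac{F(z_{k})^{\top}(x_{k}-z_{k})}{\|F(z_{k})\|^{2}}\\
  &= \|x_{k}-x^{*}\|^{2}-2\lambda_{k}F(z_{k})^{\top}(x_{k}-x^{*})
     +\lambda_{k}^{2}\|F(z_{k})\|^{2}\\
  &\le \|x_{k}-x^{*}\|^{2}-2\lambda_{k}F(z_{k})^{\top}(x_{k}-z_{k})
     +\lambda_{k}^{2}\|F(z_{k})\|^{2}
     \qquad\text{(since } F(z_{k})^{\top}(z_{k}-x^{*})\ge 0\text{)}\\
  &= \|x_{k}-x^{*}\|^{2}-\frac{\bigl(F(z_{k})^{\top}(x_{k}-z_{k})\bigr)^{2}}{\|F(z_{k})\|^{2}}
     \;\le\; \|x_{k}-x^{*}\|^{2}.
\end{align*}
```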
Remark 2. It is interesting to note that Lemma 2 confirms that, for any zero $x^{*}$ of F, $\{\|x_k - x^{*}\|\}$ is a decreasing sequence.
Lemma 3. Suppose $\{x_k\}$ is generated by Algorithm 1 and F is continuous. Then there exists $M > 0$ such that $\|F(x_k)\| \le M$ for all k; in other words, the sequence $\{F(x_k)\}$ is bounded.
Proof. As the sequence $\{\|x_k - x^{*}\|\}$ converges by Lemma 2, it is bounded by some constant for all k. By the triangle inequality, $\|x_k\| \le \|x_k - x^{*}\| + \|x^{*}\|$, so $\{x_k\}$ is bounded. Now, since $\{x_k\}$ lies in a compact set and F is continuous, $\{F(x_k)\}$ is also bounded. □
Lemma 4. Let $\{x_k\}$ and $\{z_k\}$ be generated via Algorithm 1. Then
Proof. By Lemma 2, we have that
; we get
Hence,
□
Lemma 5. For all k, the search direction $d_k$ defined by Equation (5) or (6) is bounded.
Proof. Let $d_k$
be defined by (
5), then
On the other hand, if $d_k$ is defined by (6), then
That is, with
, we have
□
Proposition 1. If $\{x_k\}$ is generated by Algorithm 1 with a pseudomonotone and continuous operator F, then $\liminf_{k \to \infty} \|F(x_k)\| = 0$.
Proof. Suppose by contradiction that $\liminf_{k \to \infty} \|F(x_k)\| > 0$. This implies that there exists $\varepsilon > 0$ such that $\|F(x_k)\| \ge \varepsilon$ for all k.
Using (
15) together with (
9) and (
13), we have
So
obtained in Lemma 4 implies
In addition, as
and
are bounded, there exist convergent subsequences
and
, such that
Also, from (
7),
By the continuity of
F, (
15), and allowing
, we have
On the other hand, from Step 4 of Algorithm 1, we have
Combining the above argument with (
16), we have
which contradicts (
17). Hence, (
14) is true. □
The next theorem confirms the convergence of a sequence generated by Algorithm 1 to a zero of the pseudomonotone and continuous operator F.
Theorem 1. Suppose F is pseudomonotone and continuous. Then a sequence $\{x_k\}$ generated via Algorithm 1 converges to a zero of F; that is, $\lim_{k \to \infty} x_k = \bar{x}$ with $F(\bar{x}) = 0$.
Proof. From Proposition 1, one can choose a subsequence $\{x_{k_j}\}$ such that $\|F(x_{k_j})\|$ converges to 0. Because $\{x_{k_j}\}$ is bounded, we may further extract a convergent subsequence, which we again denote by $\{x_{k_j}\}$ for simplicity. Let $\bar{x} = \lim_{j \to \infty} x_{k_j}$. Since F is continuous, we have
\[
\|F(\bar{x})\| = \lim_{j \to \infty} \|F(x_{k_j})\| = 0.
\]
Furthermore, since $\lim_{k \to \infty} \|x_k - \bar{x}\|$ exists by Lemma 2 and $\lim_{j \to \infty} \|x_{k_j} - \bar{x}\| = 0$, the whole sequence $\{x_k\}$ must converge to $\bar{x}$. □
3. Numerical Experiments
In this section, we evaluate the performance of the proposed methods, that is, Algorithm 1 implemented with search direction (5) and with search direction (6). We refer to these methods as RDFPM I and RDFPM II, respectively. Their performance is compared against two recently proposed methods from the literature: RDFIA [24], a relaxed-inertial derivative-free algorithm for pseudomonotone systems, and HDFPM [25], a memoryless BFGS hyperplane projection method for monotone equations. All experiments were executed in MATLAB R2025a Prerelease Update 2 (25.1.0.2833191) on an Apple macOS system with an M3 chip and 8 GB of memory, using the same parameter values for both proposed methods.
Seven benchmark nonlinear equations from the literature were selected as test problems. These problems include Problem 1 (Problem 2 in [
26]), Problem 2 (Logarithmic function in [
27]), Problem 3 (Example 4.1 in [
28]), Problem 4 (Modified problem 10 in [
29]), Problem 5 (Strictly convex function II in [
27]), Problem 6 (Nonsmooth function Problem 2 in [
30]), and Problem 7 (Problem 4.6 in [
31]). To assess the performance of each method, we evaluated four problem dimensions:
,
,
, and
. For each dimension, six initial points were tested: five deterministic vectors
with
, and one randomized initial point
. Each algorithm was implemented under identical experimental conditions (that is, all methods were tested on the same problem instances, dimensions, initial points and stopping criteria) to ensure fairness. The algorithms terminated when the residual satisfied
or
, or after a maximum of 2000 iterations.
Detailed results of the numerical experiments are presented in
Table A1,
Table A2,
Table A3,
Table A4,
Table A5,
Table A6 and
Table A7, available at the following link: https://github.com/ibrahimkarym/Two-classes-of-restart.git (accessed on 1 November 2025) or in Appendix A. To assess the performance of the methods, we used the Dolan and Moré [
32] performance profile, which summarizes the results graphically. This profile provides an insightful comparison of the efficiency of each algorithm in terms of the number of iterations, the number of function evaluations, and the computation time. The best-performing solver corresponds to the curve that stays on top of the plot. Across all problems and initializations,
RDFPM I and
II consistently outperformed
RDFIA and
HDFPM. These results in
Figure 1,
Figure 2 and
Figure 3 confirm that
RDFPM I and
II are robust and efficient for solving nonlinear equations.
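For completeness, the following MATLAB sketch shows how a Dolan–Moré performance profile of the kind plotted in Figure 1, Figure 2 and Figure 3 can be computed from a cost matrix T whose rows are test instances and whose columns are solvers. The variable names and the convention T(p,s) = Inf for failed runs are our own assumptions, not part of the paper's scripts.

```matlab
function perf_profile(T, names)
% Dolan-More performance profile: T(p,s) = cost (iterations, function
% evaluations, or CPU time) of solver s on problem p; Inf marks a failure.
[np, ns] = size(T);
r   = T ./ min(T, [], 2);                  % performance ratios r(p,s)
tau = unique(r(isfinite(r)));              % profile breakpoints (sorted)
rho = zeros(numel(tau), ns);
for s = 1:ns
    for i = 1:numel(tau)
        rho(i, s) = sum(r(:, s) <= tau(i)) / np;   % fraction solved within factor tau
    end
end
semilogx(tau, rho, 'LineWidth', 1.5);      % the topmost curve is the best solver
xlabel('\tau'); ylabel('\rho_s(\tau)');
legend(names, 'Location', 'southeast');
end
```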
4. Application to Logistic Regression
In this section, we apply the two proposed methods (RDFPM I and RDFPM II) to the regularized centralized logistic regression problem, which is formulated as follows:
where
is a regularization parameter that promotes stability and prevents overfitting, and the logistic loss function
models the classification error based on the dataset consisting of pairs
. Given the strong convexity of
f, the optimal solution
is uniquely characterized as the root of the corresponding system of nonlinear monotone equations [
33]:
To solve (
19), we employ our newly developed algorithms, leveraging their efficiency and robustness in handling large-scale nonlinear systems. We compare their performance with the HDFPM method [
25]. For fairness, the parameter settings for RDFPM I, RDFPM II, and HDFPM follow those used in previous experiments. For benchmarking, we utilized real-world datasets from the LIBSVM repository [
34], available at
https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ (accessed on 1 November 2025). Specifically, the benchmark instances
a1a–a9a and
colon-cancer were loaded using the
libsvmread function. No additional preprocessing was performed: there was no imputation, categorical encoding, or feature scaling, and the logistic objective was evaluated directly on the full dataset without a train/validation/test split, so that the optimization setting is consistent across all solvers. Details on the selected datasets are provided in
Table 1.
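To illustrate how the monotone operator arising from this model can be assembled from a LIBSVM file, the sketch below builds the gradient of the l2-regularized logistic loss as a function handle. The loss form (1/N) * sum_i log(1 + exp(-y_i * a_i' * x)) + (lambda/2) * ||x||^2, the placeholder value of lambda, and all variable names are our assumptions about the standard formulation, not a transcription of the paper's equations or code.

```matlab
% Illustrative sketch (assumed standard formulation, not the paper's code):
% assemble the l2-regularized logistic-regression gradient as an operator F.
[y, A]  = libsvmread('a1a');      % labels y in {-1,+1}, sparse feature matrix A
[N, n]  = size(A);
lambda  = 1/N;                    % placeholder regularization parameter
F  = @(x) -(A' * (y ./ (1 + exp(y .* full(A*x))))) / N + lambda * x;
x0 = zeros(n, 1);                 % any solver for F(x) = 0 can now be applied to F
```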
Experiments were conducted in MATLAB (details mentioned earlier). The global random number generator was fixed to mt19937ar with seed 97006855 using RandStream.setGlobalStream. To ensure deterministic behavior and comparable wall-clock timing, we restricted BLAS operations to a single thread by setting maxNumCompThreads(1) and the environment variables OMP_NUM_THREADS=1 and MKL_NUM_THREADS=1; no parallel pools were used.
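For reference, the reproducibility settings described in this paragraph correspond to the following MATLAB calls; the only assumption is that the environment variables are set from within MATLAB via setenv.

```matlab
% Fix the global RNG and restrict BLAS/OpenMP threading, as described above.
RandStream.setGlobalStream(RandStream('mt19937ar', 'Seed', 97006855));
maxNumCompThreads(1);
setenv('OMP_NUM_THREADS', '1');
setenv('MKL_NUM_THREADS', '1');
```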
The numerical experiments are conducted by initializing
x using the MATLAB script “
” and setting the regularization parameter to
. Each test instance is evaluated across five independent runs, and we report the average performance metrics. As demonstrated in
Table 1, our proposed algorithms achieve superior efficiency, outperforming HDFPM in terms of computational time, iteration count, and function evaluations. This highlights the effectiveness of our algorithms in addressing regularized centralized logistic regression problems.