Article

Delayed Star Subgradient Methods for Constrained Nondifferentiable Quasi-Convex Optimization

Department of Mathematics, Faculty of Science, Khon Kaen University, Khon Kaen 40002, Thailand
*
Author to whom correspondence should be addressed.
Algorithms 2025, 18(8), 469; https://doi.org/10.3390/a18080469
Submission received: 1 July 2025 / Revised: 21 July 2025 / Accepted: 24 July 2025 / Published: 26 July 2025
(This article belongs to the Section Analysis of Algorithms and Complexity Theory)

Abstract

In this work, we consider the problem of minimizing a quasi-convex function over a nonempty closed convex constrained set. In order to approximate a solution of the considered problem, we propose delayed star subgradient methods. The main feature of the proposed methods is that they allow us to use stale star subgradients when updating the next iterate rather than computing a new star subgradient in every iteration. We subsequently investigate the convergence of the sequences generated by the proposed methods. Finally, we present some numerical experiments on the Cobb–Douglas production efficiency problem to illustrate the effectiveness of the proposed method.

1. Introduction

In this work, we consider the quasi-convex optimization problem:
minimize f(x) subject to x ∈ X,   (1)
where the objective function f : ℝ^n → ℝ is quasi-convex, continuous, and (possibly) nondifferentiable, and the constrained set X is nonempty, closed, and convex. Throughout this work, we denote the set of all optimal solutions and the optimal value of the problem (1) by X* := argmin_{x ∈ X} f(x) and f* := min_{x ∈ X} f(x), respectively. We also assume that X* ≠ ∅.
The problem (1) not only forms a fundamental setting for optimization, as every convex function is quasi-convex, but also covers several practical situations in many fields such as economics and engineering. An important special setting is the problem of minimizing the ratio of two functions [1] (Lemma 4), [2] (Theorem 2.3.8), for example, optimizing the ratio of outputs to inputs for productivity assessment in economics, such as debt-to-equity ratio analysis in corporate financial planning, inventory-to-sales analysis in production planning, and nurse–patient ratio analysis in hospital planning. This kind of problem is known as fractional programming [3,4,5]. Owing to this broad applicability, various kinds of iterative methods have been developed to solve the quasi-convex optimization problem (1); see [1,6,7,8,9,10,11,12,13,14] and references therein.
If the function f is convex, it is well known that the classical method for dealing with the constrained convex optimization problem is the projected subgradient method: for an initial point x_0 ∈ X, compute
x_{k+1} := P_X(x_k - α_k ∇̃f(x_k)), k ∈ ℕ_0,
where ∇̃f(x_k) ∈ ∂f(x_k) := {g ∈ ℝ^n : ⟨g, y - x_k⟩ ≤ f(y) - f(x_k) for all y ∈ ℝ^n} is a Fenchel subgradient of f at x_k. Because of its simplicity, this method has been continuously developed in the literature; see, for instance, [8,15]. However, in the quasi-convex case, it should be noted that the Fenchel subgradient may not exist in general. In this situation, we need a more general subgradient concept, the so-called star subgradient proposed in [16,17]. By utilizing the notion of a star subgradient, Kiwiel [1] proposed the star subgradient method for solving the constrained quasi-convex optimization problem (1): for an initial point x_0 ∈ X,
x_{k+1} := P_X(x_k - α_k ∇̃f(x_k)/‖∇̃f(x_k)‖), k ∈ ℕ_0,   (2)
where ∇̃f(x_k) ∈ ∂*f(x_k) \ {0} is a nonzero star subgradient of f at x_k. After that, Hu et al. [6] proposed an inexact version of the star subgradient method by considering approximate star subgradients in the presence of noise. By imposing the Hölder condition, they presented convergence results in various respects for both constant and diminishing step sizes. Subsequently, several methods have been investigated for solving more general problem settings, for instance, a conditional star subgradient method [8], a stochastic star subgradient method [7], and an incremental star subgradient method [9].
As we can notice, the above-mentioned methods require computing a star subgradient in every single iteration; however, in some practical situations, such as large-scale problems, the exact star subgradients may be computationally expensive. This shortcoming can be overcome by using the idea of delayed subgradients, in which we retrieve stale subgradients instead of spending time computing a new subgradient in every iteration. This approach is useful not only for reducing excessive computation but also for handling communication delays in networked systems. Note that various convex methods with delayed gradient or subgradient updates have been developed and studied, for instance, stochastic gradient descent [18,19], incremental-type methods [20,21,22,23], and distributed-type methods [24,25,26], and references therein.
To the best of our knowledge, there is no report of a star subgradient method with delayed updates for solving the problem (1). To fill this gap, in this work, we propose two delayed star subgradient methods (in short, DSSMs) based on the ideas of the star subgradient method (2) and delayed subgradient updates, called DSSM-I and DSSM-II, respectively. We investigate the convergence properties of the proposed methods in terms of both objective function values and iterates, including finite convergence.
This paper is organized as follows: Section 2 provides the essential notations, definitions, and facts utilized in this work. Section 3 contains the proposed methods, including their convergence results. This section is divided into two subsections. In Section 4, we then apply DSSM-I to the Cobb–Douglas production efficiency problem. Finally, we provide the summary in Section 5.

2. Preliminaries

This section contains important symbols, definitions, and tools utilized in this work. Throughout this paper, let ℝ be the set of all real numbers, ℕ_0 the set of all non-negative integers, and ℕ the set of all positive integers. We denote by ℝ^n a Euclidean space with an inner product ⟨·,·⟩ and an induced norm ‖·‖. We denote the unit sphere by the set S := {z ∈ ℝ^n : ‖z‖ = 1} and the open ball centered at the point x ∈ ℝ^n with radius δ > 0 by the set B(x; δ) := {z ∈ ℝ^n : ‖z - x‖ < δ}.
Let X ⊆ ℝ^n be a nonempty set. The symbol cl(X) denotes the closure of X. The distance (function) between y ∈ ℝ^n and X is the function dist(·, X) : ℝ^n → ℝ defined by
dist(y, X) := inf_{x ∈ X} ‖y - x‖.
The metric projection of y ∈ ℝ^n onto X is the point P_X(y) ∈ X such that ‖P_X(y) - y‖ ≤ ‖x - y‖ for all x ∈ X. Note that the metric projection P_X(y) exists and is unique for all y ∈ ℝ^n whenever X is nonempty, closed, and convex [27] (Theorem 1.2.3). The metric projection onto a nonempty closed convex set X is nonexpansive, i.e., ‖P_X(x) - P_X(y)‖ ≤ ‖x - y‖ for all x, y ∈ ℝ^n [27] (Theorem 2.2.21).
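When X has simple structure, P_X admits a closed form. The following Python snippet (our own illustration; the function names are not from the paper) implements the projections onto a box and onto a half-space, the two building blocks later combined in the numerical experiments of Section 4, and checks the nonexpansiveness property numerically.

```python
import numpy as np

def proj_box(y, lo, hi):
    # Metric projection onto the box [lo, hi]^n: clip each coordinate.
    return np.clip(y, lo, hi)

def proj_halfspace(y, a, beta):
    # Metric projection onto the half-space {x : <a, x> <= beta}
    # (the case <a, x> >= beta is analogous, with the correction sign reversed).
    viol = np.dot(a, y) - beta
    return y if viol <= 0 else y - (viol / np.dot(a, a)) * a

# Numerical check of nonexpansiveness: ||P(x) - P(y)|| <= ||x - y||.
rng = np.random.default_rng(0)
a, beta = rng.standard_normal(5), 1.0
x, y = rng.standard_normal(5), rng.standard_normal(5)
assert (np.linalg.norm(proj_halfspace(x, a, beta) - proj_halfspace(y, a, beta))
        <= np.linalg.norm(x - y) + 1e-12)
```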
Let f : ℝ^n → ℝ be a function, and let α ∈ ℝ. The strict sublevel set and the sublevel set of f corresponding to α are defined by
S^<_{f,α} := {x ∈ ℝ^n : f(x) < α} and S_{f,α} := {x ∈ ℝ^n : f(x) ≤ α},
respectively. The function f is called upper semicontinuous on ℝ^n if S^<_{f,α} is an open set for all α ∈ ℝ. The function f is called lower semicontinuous on ℝ^n if S_{f,α} is a closed set for all α ∈ ℝ. The function f is continuous on ℝ^n if it is both upper and lower semicontinuous on ℝ^n. The function f is called quasi-convex on ℝ^n if S_{f,α} is convex for all α ∈ ℝ.
For any x ∈ ℝ^n, the star subdifferential [16,17] of f at x is the set
∂*f(x) := {g ∈ ℝ^n : ⟨g, y - x⟩ ≤ 0 for all y ∈ S^<_{f,f(x)}}.
The element g ∈ ∂*f(x) is called the star subgradient of f at x, and it is denoted by ∇̃f(x). It is obvious that 0 ∈ ∂*f(x) for all x ∈ ℝ^n. The following fact provides the basic properties of the star subgradient, which can be found in [16] (Proposition 30) and [6] (Lemma 2.1).
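For a quick one-dimensional illustration (our own example, not taken from [16,17]), consider the quasi-convex function f(x) = x³ on ℝ:

```latex
% Example (ours): star subdifferential of the quasi-convex function f(x) = x^3 on R.
\[
  S^{<}_{f,\,f(x)} = \{\, y \in \mathbb{R} : y^{3} < x^{3} \,\} = (-\infty, x),
  \qquad
  \partial^{*}\! f(x) = \{\, g \in \mathbb{R} : g\,(y - x) \le 0 \ \text{for all } y < x \,\} = [0, \infty).
\]
% At x = 0 the Fenchel subdifferential is empty (no g satisfies g\,y \le y^{3} for all y),
% whereas \partial^{*} f(0) \setminus \{0\} = (0, \infty) is nonempty, in line with Fact 1(i) below.
```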
Fact 1. 
Let f : ℝ^n → ℝ be a quasi-convex function, and let x ∈ ℝ^n. Then the following statements hold:
(i)
∂*f(x) \ {0} is a nonempty set;
(ii)
∂*f(x) is a closed convex cone.
Next, we will recall the definition and properties of quasi-Fejér monotone sequences. Let X be a nonempty subset of ℝ^n. The sequence {x_k}_{k=0}^∞ ⊆ ℝ^n is said to be quasi-Fejér monotone with respect to X if for any x ∈ X, there exists a sequence {ζ_k}_{k=0}^∞ ⊆ (0, ∞) such that
Σ_{k=0}^∞ ζ_k < ∞ and ‖x_{k+1} - x‖² ≤ ‖x_k - x‖² + ζ_k for all k ∈ ℕ_0.
Fact 2 
([28], Theorem 5.33). Let X ⊆ ℝ^n be a nonempty set and {x_k}_{k=0}^∞ be a sequence in ℝ^n. If the sequence {x_k}_{k=0}^∞ is quasi-Fejér monotone with respect to X, then {x_k}_{k=0}^∞ is bounded and lim_{k→∞} ‖x_k - x‖ exists for all x ∈ X.
Finally, we provide a lemma that will be used in the next section.
Lemma 1 
([29], Lemma 2.1). Let {a_k}_{k=0}^∞ be a scalar sequence, and let {b_k}_{k=0}^∞ be a sequence of positive real numbers such that lim_{k→∞} Σ_{i=0}^{k} b_i = ∞. If lim_{k→∞} a_k = 0, then lim_{k→∞} (Σ_{i=0}^{k} a_i b_i)/(Σ_{i=0}^{k} b_i) = 0.

3. Algorithms and Convergence Results

In this section, we introduce two delayed star subgradient methods (DSSM-I and DSSM-II) to solve the problem (1) and subsequently investigate the convergence properties of the proposed methods. To deal with delayed-type methods, we need the following boundedness assumption on the time-varying delays.
Assumption 1. 
The sequence of time-varying delays {τ_k}_{k=0}^∞ ⊆ ℕ_0 is bounded; that is, there exists a non-negative integer τ such that 0 ≤ τ_k ≤ τ for all k ∈ ℕ_0.

3.1. Delayed Star Subgradient Method I

Now, we are ready to propose the delayed star subgradient method of the first kind which is defined as the following Algorithm 1.
Algorithm 1 Delayed Star Subgradient Method I (DSSM-I)
Initialization: Given a stepsize sequence {α_k}_{k=0}^∞ ⊆ (0, ∞), the delays {τ_k}_{k=0}^∞ ⊆ ℕ_0, and initial points x_0, x_{-1}, x_{-2}, …, x_{-τ} ∈ X.
Iterative Step: For a current point x_k ∈ X, we compute
x_{k+1} := P_X(x_k - α_k g_{k-τ_k}),
where g_{k-τ_k} ∈ ∂*f(x_{k-τ_k}) ∩ S is a unit star subgradient of f at x_{k-τ_k}.
Update k := k + 1.
Throughout this work, to simplify the analysis, we set x_0 = x_{-1} = x_{-2} = ⋯ = x_{-τ}.
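To make the update rule concrete, the following Python sketch implements DSSM-I under the paper's conventions. It is a minimal illustration, not the authors' implementation: the callables proj_X, star_subgrad, stepsize, and delays are placeholders that the user must supply, and stale star subgradients are cached so that a new one is computed only when the delayed index has not been visited before.

```python
import numpy as np

def dssm_i(x0, proj_X, star_subgrad, stepsize, delays, num_iters):
    """Delayed Star Subgradient Method I (DSSM-I): a minimal sketch.

    proj_X(y)       -- metric projection of y onto the closed convex set X
    star_subgrad(x) -- some nonzero star subgradient of f at x
    stepsize(k)     -- alpha_k > 0 (e.g. 1 / (k + 1))
    delays(k)       -- tau_k, a non-negative integer bounded by some tau

    Following the paper's convention, the pre-initial iterates x_{-1}, ..., x_{-tau}
    all equal x_0, so indices k - tau_k < 0 are clamped to 0.
    """
    iterates = [np.asarray(x0, dtype=float)]
    g_cache = {}                                   # unit star subgradients already computed
    for k in range(num_iters):
        j = max(k - delays(k), 0)                  # index of the (possibly stale) iterate used
        if j not in g_cache:                       # call the oracle only on a cache miss
            g = star_subgrad(iterates[j])
            g_cache[j] = g / np.linalg.norm(g)     # g_{k - tau_k} in  star-subdiff(x_{k - tau_k}) ∩ S
        x_next = proj_X(iterates[k] - stepsize(k) * g_cache[j])  # x_{k+1} := P_X(x_k - alpha_k g_{k - tau_k})
        iterates.append(x_next)
    return iterates
```

With the cyclic schedule τ_k = k mod (τ + 1) used in Section 4, the oracle star_subgrad is called only once every τ + 1 iterations, which is exactly the computational saving that delayed updates target.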
Remark 1. 
(i) 
Since the function f is quasi-convex, we note that DSSM-I is well defined. In fact, using Fact 1, we have ∂*f(x_k) \ {0} ≠ ∅ for any k ∈ ℕ_0. Hence, we are able to select any point ∇̃f(x_k) ∈ ∂*f(x_k) \ {0} and subsequently set g_k := ∇̃f(x_k)/‖∇̃f(x_k)‖ ∈ ∂*f(x_k) ∩ S. Moreover, since {x_k}_{k=0}^∞ ⊆ X, it is clear that f(x_k) ≥ f* for all k ∈ ℕ_0.
(ii) 
If τ_k = 0 for all k ∈ ℕ_0, then DSSM-I coincides with the star subgradient method (SSM) (2), which was proposed by Kiwiel [1].
Remark 2. 
Note that methods with time-varying delays allow us to use stale information, which is very helpful when the star subgradients are not easily computed. Assumption 1 is typically assumed when analyzing the convergence of delayed-type methods; see [18,19,20,24,25,26] and references therein. The delay bound τ ensures that the unit star subgradient of f at x_{k-τ_k} used in iteration k is at most τ iterations stale. Some examples of delay sequences are as follows (a short code sketch of these schedules follows the list):
(i) 
Constant delay [18], that is, τ_k := τ for all k ∈ ℕ_0.
(ii) 
Cyclic delay [19,20,24,26] and references therein. The typical form of this type is τ_k := k mod (τ + 1) for all k ∈ ℕ_0. In this case, the delays τ_k (k ∈ ℕ_0) are chosen in a deterministic order from the set {0, 1, 2, …, τ}. This means that stale information is used over a consistent period of length τ + 1 and a new star subgradient is computed every τ + 1 iterations.
(iii) 
Random delay [25], that is, the delays τ_k (k ∈ ℕ_0) are randomly chosen from the set {0, 1, 2, …, τ}.
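The three delay patterns above are straightforward to generate; the short Python helpers below (ours, for illustration) produce a delay function k ↦ τ_k for each pattern given a bound τ.

```python
import numpy as np

def constant_delay(tau):
    return lambda k: tau                              # tau_k = tau for every k

def cyclic_delay(tau):
    return lambda k: k % (tau + 1)                    # tau_k cycles through 0, 1, ..., tau

def random_delay(tau, seed=0):
    rng = np.random.default_rng(seed)
    return lambda k: int(rng.integers(0, tau + 1))    # tau_k drawn uniformly from {0, ..., tau}
```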
Assumption 2. 
The function f satisfies the Hölder condition of order p > 0 with modulus L > 0 on X, i.e., f(x) - f* ≤ L dist(x, X*)^p for all x ∈ X.
The Hölder condition is typically assumed when investigating the convergence of methods for solving quasi-convex optimization problems [6,7,9,10,11,12]. Note that, in the special case p = 1, the Hölder condition is nothing else than the Lipschitz condition.
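As a concrete instance (our own illustration), take X = ℝ and f(x) = √|x|, a quasi-convex function with X* = {0} and f* = 0; Assumption 2 then holds with order p = 1/2 and modulus L = 1:

```latex
% Illustration (ours): f(x) = \sqrt{|x|} on X = \mathbb{R}, so that X^{*} = \{0\} and f^{*} = 0.
\[
  f(x) - f^{*} \;=\; |x|^{1/2} \;=\; \operatorname{dist}(x, X^{*})^{1/2}
  \qquad \text{for all } x \in \mathbb{R},
\]
% i.e. the Hölder condition holds with L = 1 and p = 1/2, although f is not Lipschitz
% (the case p = 1) on any neighborhood of the origin.
```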
The following lemma provides a useful property of the star subgradient and will play an important role in the convergence analysis. The formal proof is due to Konnov [10] (Proposition 2.1).
Lemma 2. 
Suppose that Assumption 2 holds. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-I. For any x* ∈ X* and k ∈ ℕ_0, if x_k ∉ X*, then
f(x_k) - f* ≤ L ⟨g_k, x_k - x*⟩^p for all g_k ∈ ∂*f(x_k) ∩ S.
Next, we provide some lemmas, which are important tools for proving the convergence results of DSSM-I. We first provide a basic inequality relating to the sufficient decrease property of the generated sequence.
Lemma 3. 
Suppose that Assumption 2 holds. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-I. For any x* ∈ X* and k ∈ ℕ_0, if x_{k-τ_k} ∉ X*, then
‖x_{k+1} - x*‖² ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k L^{-1/p}(f(x_{k-τ_k}) - f*)^{1/p}.
Proof. 
Let x* ∈ X* and k ∈ ℕ_0 be fixed. Suppose that x_{k-τ_k} ∉ X*. We note from the definition of x_{k+1} in DSSM-I that
‖x_{k+1} - x*‖² = ‖P_X(x_k - α_k g_{k-τ_k}) - P_X(x*)‖² ≤ ‖x_k - α_k g_{k-τ_k} - x*‖² = ‖x_k - x*‖² + α_k² - 2α_k⟨g_{k-τ_k}, x_k - x*⟩ ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k⟨g_{k-τ_k}, x_{k-τ_k} - x*⟩ ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k L^{-1/p}(f(x_{k-τ_k}) - f*)^{1/p},
where the first inequality follows from the nonexpansiveness of the metric projection, the second one follows from the Cauchy–Schwarz inequality and the fact that ‖g_{k-τ_k}‖ = 1 for all k ∈ ℕ_0, and the last one follows from Lemma 2. This completes the proof. □
Lemma 4. 
Suppose that Assumption 1 holds. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-I, and let {α_k}_{k=0}^∞ ⊆ (0, ∞) be a nonincreasing sequence with lim_{k→∞} α_k = 0 and Σ_{k=0}^∞ α_k = ∞. Then ‖x_{k-τ_k} - x_k‖ ≤ (τ + 1)α_{k-τ} for all k ≥ τ and lim_{k→∞} ‖x_{k-τ_k} - x_k‖ = 0.
Proof. 
We first note that for each k ∈ ℕ_0,
‖x_k - x_{k-τ_k}‖ = ‖Σ_{i=1}^{τ_k} (x_{k-i+1} - x_{k-i})‖ ≤ Σ_{i=1}^{τ_k} ‖x_{k-i+1} - x_{k-i}‖ ≤ Σ_{i=0}^{τ} ‖x_{k-i+1} - x_{k-i}‖   (3)
and
‖x_{k+1} - x_k‖ = ‖P_X(x_k - α_k g_{k-τ_k}) - P_X(x_k)‖ ≤ ‖α_k g_{k-τ_k}‖ = α_k.   (4)
Let k ∈ ℕ_0 be such that k ≥ τ. Combining (3) and (4) together with the nonincreasing property of {α_k}_{k=0}^∞ yields
‖x_k - x_{k-τ_k}‖ ≤ Σ_{i=0}^{τ} ‖x_{k-i+1} - x_{k-i}‖ ≤ Σ_{i=0}^{τ} α_{k-i} ≤ Σ_{i=0}^{τ} α_{k-τ} = (τ + 1)α_{k-τ}.   (5)
Furthermore, since lim_{k→∞} α_k = 0, we conclude that lim_{k→∞} ‖x_k - x_{k-τ_k}‖ = 0. The proof is complete. □
Now, we are ready to prove the first convergence result, namely that the inferior limit of the function values of the generated sequence equals the optimal value f*.
Theorem 1. 
Suppose that Assumptions 1 and 2 hold. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-I and {α_k}_{k=0}^∞ be a nonincreasing sequence with lim_{k→∞} α_k = 0 and Σ_{k=0}^∞ α_k = ∞. Then, we have lim inf_{k→∞} f(x_k) = f*.
Proof. 
It is obvious that lim inf_{k→∞} f(x_k) ≥ f* since f(x_k) ≥ f* for all k ∈ ℕ_0, so it is sufficient to prove that lim inf_{k→∞} f(x_k) ≤ f*. Now, assume to the contrary that lim inf_{k→∞} f(x_k) > f*. Then there exists ε > 0 and k_1 ∈ ℕ_0 such that f(x_k) > f* + ε for all k ≥ k_1. This yields that
f(x_{k-τ_k}) > f* + ε for all k ≥ k_1 + τ.   (6)
On the other hand, since lim_{k→∞} α_k = 0 and lim_{k→∞} ‖x_k - x_{k-τ_k}‖ = 0 (from Lemma 4), there exists k_2 ∈ ℕ_0 such that
α_k < (1/3)(ε/L)^{1/p} and ‖x_k - x_{k-τ_k}‖ < (1/3)(ε/L)^{1/p} for all k ≥ k_2.   (7)
Let x* ∈ X* and let k ∈ ℕ_0 be such that k ≥ N_1 := max{k_1 + τ, k_2}. Then, we obtain from Lemma 3 together with the relations (6) and (7) that for any k ≥ N_1,
‖x_{k+1} - x*‖² < ‖x_k - x*‖² + α_k (1/3)(ε/L)^{1/p} + 2α_k (1/3)(ε/L)^{1/p} - 2α_k(ε/L)^{1/p} = ‖x_k - x*‖² - α_k(ε/L)^{1/p},
which yields
‖x_{k+1} - x*‖² < ‖x_{N_1} - x*‖² - (ε/L)^{1/p} Σ_{i=N_1}^{k} α_i.   (8)
Since Σ_{k=0}^∞ α_k = ∞, we have (ε/L)^{1/p} Σ_{k=N_1}^∞ α_k = ∞, and it follows that there exists N_2 ∈ ℕ_0 such that
(ε/L)^{1/p} Σ_{i=N_1}^{k} α_i > ‖x_{N_1} - x*‖² for all k ≥ N_2.   (9)
Thus, for any k ≥ N := max{N_1, N_2}, we obtain from (8) and (9) that
‖x_{k+1} - x*‖² < ‖x_N - x*‖² - (ε/L)^{1/p} Σ_{i=N}^{k} α_i < 0,
which is a contradiction. Therefore, we can conclude that lim inf_{k→∞} f(x_k) ≤ f*, and hence lim inf_{k→∞} f(x_k) = f*, as desired. □
Corollary 1. 
Suppose that Assumptions 1 and 2 hold. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-I, and let {α_k}_{k=0}^∞ be a nonincreasing sequence with lim_{k→∞} α_k = 0 and Σ_{k=0}^∞ α_k = ∞. If the sequence {x_k}_{k=0}^∞ is bounded, then there exists a subsequence of {x_k}_{k=0}^∞ that converges to an optimal solution in X*.
Proof. 
We first note from the fact that lim inf_{k→∞} f(x_k) = f* in Theorem 1, together with the boundedness of the sequence {x_k}_{k=0}^∞ and the continuity of f, that there exists a subsequence {x_{k_i}}_{i=0}^∞ of {x_k}_{k=0}^∞ such that
lim_{i→∞} f(x_{k_i}) = lim inf_{k→∞} f(x_k) = f*.   (10)
Since the sequence {x_{k_i}}_{i=0}^∞ is also bounded, there exists a subsequence {x_{k_{i_j}}}_{j=0}^∞ of {x_{k_i}}_{i=0}^∞ such that lim_{j→∞} x_{k_{i_j}} = x̄ ∈ ℝ^n. Again, the continuity of f together with (10) yields
f(x̄) = lim_{j→∞} f(x_{k_{i_j}}) = lim_{i→∞} f(x_{k_i}) = f*.
Note that since {x_{k_{i_j}}}_{j=0}^∞ ⊆ X, the closedness of X yields x̄ ∈ X, which therefore implies that x̄ is an element of X*. □
By imposing either the coercivity of the objective function f or the compactness of the constrained set X, we obtain an approximate convergence bound as well as stronger convergence results than the one obtained in Theorem 1.
Theorem 2. 
Suppose that Assumptions 1 and 2 hold. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-I, and let {α_k}_{k=0}^∞ be a nonincreasing sequence with lim_{k→∞} α_k = 0 and Σ_{k=0}^∞ α_k = ∞. If the function f is coercive or the constrained set X is compact, then the following statements hold:
(i)
For any σ > 0, there exists N_σ ∈ ℕ_0 such that
dist(x_k, X*) ≤ ρ(σ) + (τ + 2)σ^{1/p}/(2τ + 3) for all k ≥ N_σ,
where ρ(σ) := max{dist(x, X*) : x ∈ X ∩ S_{f, f*+Lσ}};
(ii)
lim_{k→∞} dist(x_k, X*) = 0 and lim_{k→∞} f(x_k) = f*.
Proof. 
Let σ > 0 be arbitrary and denote
ρ(σ) := max{dist(x, X*) : x ∈ X ∩ S_{f, f*+Lσ}}.
Since the function f is coercive or the constrained set X is compact, we ensure that X* ≠ ∅, which together with X* ⊆ X ∩ S_{f, f*+Lσ} yields that X ∩ S_{f, f*+Lσ} ≠ ∅. Moreover, the continuity and quasi-convexity of f, respectively, imply that S_{f, f*+Lσ} is closed and convex, and so the set X ∩ S_{f, f*+Lσ} is also closed and convex. Again, with either the compactness of X or the coercivity of f, the intersection X ∩ S_{f, f*+Lσ} is bounded, and hence X ∩ S_{f, f*+Lσ} is a compact set, which ensures that ρ(σ) < ∞.
(i) Let x* ∈ X* be given. Since lim_{k→∞} α_k = 0, we also have lim_{k→∞} α_{k-τ} = 0. Consequently, there exists k_σ ∈ ℕ_0 such that k_σ ≥ τ and
α_{k-τ} ≤ σ^{1/p}/(2τ + 3) for all k ≥ k_σ.   (11)
This together with the relation ‖x_k - x_{k-τ_k}‖ ≤ (τ + 1)α_{k-τ} obtained in Lemma 4 yields that
‖x_k - x_{k-τ_k}‖ ≤ (τ + 1)σ^{1/p}/(2τ + 3) for all k ≥ k_σ.   (12)
Now, we will show that
dist(x_{k+1}, X*) ≤ max{dist(x_k, X*), ρ(σ) + (τ + 2)σ^{1/p}/(2τ + 3)} for all k ≥ k_σ.   (13)
Let k ≥ k_σ be fixed. We divide the proof into two cases according to the behavior of f(x_{k-τ_k}).
Case 1: Suppose that f(x_{k-τ_k}) > f* + Lσ. Then we have x_{k-τ_k} ∉ X*. Applying Lemma 3 and the nonincreasing property of the stepsize {α_k}_{k=0}^∞, we get
‖x_{k+1} - x*‖² ≤ ‖x_k - x*‖² + α_k α_{k-τ} + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k σ^{1/p}.   (14)
Substituting the inequalities (11) and (12) into (14) gives
‖x_{k+1} - x*‖² ≤ ‖x_k - x*‖² + σ^{1/p}α_k/(2τ + 3) + 2(τ + 1)σ^{1/p}α_k/(2τ + 3) - 2α_k σ^{1/p} ≤ ‖x_k - x*‖².
By putting x* = P_{X*}(x_k), we obtain dist(x_{k+1}, X*) ≤ dist(x_k, X*).
Case 2: Suppose that f(x_{k-τ_k}) ≤ f* + Lσ. Then we have x_{k-τ_k} ∈ X ∩ S_{f, f*+Lσ} and dist(x_{k-τ_k}, X*) ≤ ρ(σ). We note from the definition x_{k+1} = P_X(x_k - α_k g_{k-τ_k}), the nonexpansiveness of P_X, and the nonincreasing property of {α_k}_{k=0}^∞ that
‖x_{k+1} - x*‖ ≤ ‖x_k - x*‖ + α_k ≤ ‖x_k - x_{k-τ_k}‖ + ‖x_{k-τ_k} - x*‖ + α_{k-τ},
which together with the inequalities (11) and (12) yields
‖x_{k+1} - x*‖ ≤ ‖x_{k-τ_k} - x*‖ + (τ + 2)σ^{1/p}/(2τ + 3).
By putting x* := P_{X*}(x_{k-τ_k}) and invoking dist(x_{k-τ_k}, X*) ≤ ρ(σ), we obtain
dist(x_{k+1}, X*) ≤ ρ(σ) + (τ + 2)σ^{1/p}/(2τ + 3).
Thus, we have proved the relation (13). By using the same argument as in Theorem 1, we obtain that lim inf_{k→∞} f(x_{k-τ_k}) = f*. Consequently, there exists k′_σ such that k′_σ ≥ k_σ and f(x_{k′_σ-τ_{k′_σ}}) < f* + Lσ. Therefore, we obtain from Case 2 that
dist(x_{k′_σ+1}, X*) ≤ ρ(σ) + (τ + 2)σ^{1/p}/(2τ + 3).
Put N_σ := k′_σ + 1. Suppose that dist(x_k, X*) ≤ ρ(σ) + (τ + 2)σ^{1/p}/(2τ + 3) holds for some k ≥ N_σ. Invoking (13), we obtain dist(x_{k+1}, X*) ≤ ρ(σ) + (τ + 2)σ^{1/p}/(2τ + 3). Hence, we conclude by induction that
dist(x_k, X*) ≤ ρ(σ) + (τ + 2)σ^{1/p}/(2τ + 3) for all k ≥ N_σ.
(ii) For each r ∈ ℕ, we denote X_{1/r} := X ∩ S^<_{f, f*+L(1/r)}. Observe that {X_{1/r}}_{r=1}^∞ is nonincreasing and ∩_{r∈ℕ} X_{1/r} = X ∩ S_{f, f*} = X*. Subsequently, {ρ(1/r)}_{r∈ℕ} is also nonincreasing and inf_{r∈ℕ} ρ(1/r) = inf_{r∈ℕ} max_{x∈X_{1/r}} dist(x, X*) = max_{x∈X*} dist(x, X*) = 0. Hence, we obtain that lim_{r→∞} ρ(1/r) = 0.
Now, let ε > 0 be arbitrary. Since lim_{r→∞} [ρ(1/r) + (τ + 2)(1/r)^{1/p}/(2τ + 3)] = 0, there exists M_ε ∈ ℕ such that
ρ(1/r) + (τ + 2)(1/r)^{1/p}/(2τ + 3) < ε for all r ≥ M_ε.
Since 1/M_ε > 0, there exists N ∈ ℕ_0 such that
dist(x_k, X*) ≤ ρ(1/M_ε) + (τ + 2)(1/M_ε)^{1/p}/(2τ + 3) < ε for all k ≥ N.
Hence, we conclude that lim_{k→∞} dist(x_k, X*) = 0. Furthermore, by Assumption 2, we therefore obtain that lim_{k→∞} f(x_k) = f*. □
Remark 3. 
Under the setting of Theorem 2, we observe that if the optimal solution is unique, that is, X* = {x*}, then lim_{k→∞} ‖x_k - x*‖ = lim_{k→∞} dist(x_k, X*) = 0, which means that the generated sequence {x_k}_{k=0}^∞ converges to the unique solution x*.

3.2. Delayed Star Subgradient Method II

In order to obtain the convergence of the generated sequence, we need a modified version of DSSM-I, which is defined as the following Algorithm 2.
Algorithm 2 Delayed Star Subgradient Method II (DSSM-II)
Initialization: Given a stepsize sequence {α_k}_{k=0}^∞ ⊆ (0, ∞), the delays {τ_k}_{k=0}^∞ ⊆ ℕ_0, and initial points x_0, x_{-1}, x_{-2}, …, x_{-τ} ∈ X.
Iterative Step: For a current point x_k ∈ X, if f(x_k) = f*, we set
x_{k+1} := x_k
and then stop. If f(x_k) > f*, we compute
x_{k+1} := P_X(x_k - α_k g_{k-τ_k}),
where g_{k-τ_k} ∈ ∂*f(x_{k-τ_k}) ∩ S is a unit star subgradient of f at x_{k-τ_k}.
Update: k := k + 1.
Remark 4. 
(i)
If the delays τ_k = 0 for all k ∈ ℕ_0, then DSSM-II is related to the method proposed by Hu et al. [9] (Algorithm 1) with the special setting m = 1.
(ii)
Note that DSSM-II involves evaluating the function value at the current iterate x_k ∈ X before deciding how to update the next iterate x_{k+1}. If the function value f(x_k) equals the optimal value f*, then no additional calculations are performed and DSSM-II terminates, since an optimal solution has been obtained at the point x_k (a minimal code sketch is given below).
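As with DSSM-I, the modified method is easy to sketch in code. The following Python fragment (ours, illustrative only) adds the function-value test to the DSSM-I sketch given earlier; it additionally assumes that the objective f and the optimal value f* are available, and the exact equality f(x_k) = f* is tested up to a user-chosen tolerance, which is our numerical concession rather than part of the method.

```python
import numpy as np

def dssm_ii(x0, f, f_star, proj_X, star_subgrad, stepsize, delays, num_iters, tol=0.0):
    """Delayed Star Subgradient Method II (DSSM-II): a minimal sketch, not the
    authors' code. Requires the objective f and its optimal value f_star."""
    iterates = [np.asarray(x0, dtype=float)]
    g_cache = {}
    for k in range(num_iters):
        x = iterates[k]
        if f(x) <= f_star + tol:          # f(x_k) = f*: set x_{k+1} := x_k and stop
            iterates.append(x.copy())
            break
        j = max(k - delays(k), 0)         # delayed index, clamped as in DSSM-I
        if j not in g_cache:
            g = star_subgrad(iterates[j])
            g_cache[j] = g / np.linalg.norm(g)
        iterates.append(proj_X(x - stepsize(k) * g_cache[j]))  # x_{k+1} := P_X(x_k - alpha_k g_{k-tau_k})
    return iterates
```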
Next, we will derive some fundamental properties for the convergence of DSSM-II.
Lemma 5. 
Suppose that Assumption 2 holds. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-II. For any k ∈ ℕ_0 and x* ∈ X*, if f(x_k) > f*, then
‖x_{k+1} - x*‖² ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k L^{-1/p}(f(x_{k-τ_k}) - f*)^{1/p}.
Proof. 
Let x* ∈ X* and k ∈ ℕ_0. Suppose that f(x_k) > f*. We have from the definition of DSSM-II that x_{k+1} = P_X(x_k - α_k g_{k-τ_k}). Moreover, since f(x_k) > f*, we also have f(x_{k-τ_k}) > f*. Thus, by following the lines of the proof of Lemma 3, we obtain
‖x_{k+1} - x*‖² ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k L^{-1/p}(f(x_{k-τ_k}) - f*)^{1/p},
as desired. □
Theorem 3. 
Suppose that Assumptions 1 and 2 hold. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-II, and let {α_k}_{k=0}^∞ ⊆ (0, ∞) be a nonincreasing sequence with lim_{k→∞} α_k = 0 and Σ_{k=0}^∞ α_k = ∞. Then ‖x_{k-τ_k} - x_k‖ ≤ (τ + 1)α_{k-τ} for all k ≥ τ, lim_{k→∞} ‖x_{k-τ_k} - x_k‖ = 0, and lim inf_{k→∞} f(x_k) = f*.
Proof. 
By invoking Lemma 5 and arguing exactly as in the proofs of Lemma 4 and Theorem 1, we obtain the stated conclusions, in particular lim inf_{k→∞} f(x_k) = f*. □
Theorem 4. 
Suppose that Assumptions 1 and 2 hold. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-II, and let {α_k}_{k=0}^∞ ⊆ (0, ∞) be a nonincreasing sequence with Σ_{k=0}^∞ α_k = ∞ and Σ_{k=0}^∞ α_k² < ∞. Then the sequence {x_k}_{k=0}^∞ converges to an optimal solution in X* and, moreover, lim_{k→∞} f(x_k) = f*.
Proof. 
If there exists K ∈ ℕ_0 such that f(x_K) = f*, then x_K is an optimal solution and x_k = x_K for all k ≥ K. Consequently, we obtain lim_{k→∞} x_k = x_K ∈ X* and lim_{k→∞} f(x_k) = f*, as required.
On the other hand, suppose that f(x_k) > f* for all k ∈ ℕ_0. Let k ∈ ℕ_0 and let x* ∈ X*. We note from Lemma 5 that
‖x_{k+1} - x*‖² ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k L^{-1/p}(f(x_{k-τ_k}) - f*)^{1/p} ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖.
Recall from the inequalities (3) and (4) that ‖x_k - x_{k-τ_k}‖ ≤ Σ_{i=0}^{τ} ‖x_{k-i+1} - x_{k-i}‖ and ‖x_{k+1} - x_k‖ ≤ α_k for all k ∈ ℕ_0. Thus, for a fixed N ∈ ℕ_0, we have
Σ_{k=0}^{N} α_k‖x_k - x_{k-τ_k}‖ ≤ Σ_{k=0}^{N} α_k Σ_{i=0}^{τ} ‖x_{k-i+1} - x_{k-i}‖ = Σ_{i=0}^{τ} Σ_{k=-i}^{N-i} α_{k+i}‖x_{k+1} - x_k‖ ≤ Σ_{i=0}^{τ} Σ_{k=0}^{N} α_{k+i}‖x_{k+1} - x_k‖ ≤ Σ_{i=0}^{τ} Σ_{k=0}^{N} α_{k+i}α_k,
which together with the nonincreasing property of {α_k}_{k=0}^∞ implies
Σ_{k=0}^{N} α_k‖x_k - x_{k-τ_k}‖ ≤ Σ_{i=0}^{τ} Σ_{k=0}^{N} α_k² = (τ + 1)Σ_{k=0}^{N} α_k².   (15)
By letting N → ∞ in (15) and using the assumption that Σ_{k=0}^∞ α_k² < ∞, we obtain Σ_{k=0}^∞ α_k‖x_k - x_{k-τ_k}‖ < ∞, and so
Σ_{k=0}^∞ (α_k² + 2α_k‖x_k - x_{k-τ_k}‖) < ∞.
Therefore, the sequence {x_k}_{k=0}^∞ is quasi-Fejér monotone with respect to X*. By applying Fact 2, we obtain that the sequence {x_k}_{k=0}^∞ is bounded and lim_{k→∞} ‖x_k - x*‖ exists.
Since the sequence {x_k}_{k=0}^∞ is bounded and lim inf_{k→∞} f(x_k) = f* (Theorem 3), there exists a subsequence {x_{k_i}}_{i=0}^∞ of {x_k}_{k=0}^∞ such that
lim_{i→∞} f(x_{k_i}) = lim inf_{k→∞} f(x_k) = f*.
Since {x_{k_i}}_{i=0}^∞ is also bounded, there exists a subsequence {x_{k_{i_j}}}_{j=0}^∞ of {x_{k_i}}_{i=0}^∞ such that lim_{j→∞} x_{k_{i_j}} = x̄ ∈ ℝ^n. Thus, the continuity of f yields
f(x̄) = lim_{j→∞} f(x_{k_{i_j}}) = lim_{i→∞} f(x_{k_i}) = f*.
Moreover, the closedness of X implies x̄ ∈ X, and hence x̄ ∈ X*. Since lim_{k→∞} ‖x_k - x̄‖ exists and lim_{j→∞} ‖x_{k_{i_j}} - x̄‖ = 0, we conclude that lim_{k→∞} ‖x_k - x̄‖ = 0. Moreover, invoking the continuity of f, we also have lim_{k→∞} f(x_k) = f*. The proof is complete. □
We end the convergence analysis of DSSM-II by investigating the finite convergence of the generated sequence, as stated in the following theorem.
Theorem 5. 
Suppose that Assumptions 1 and 2 hold and that B(x*; δ) ⊆ X* for some x* ∈ X* and δ > 0. Let {x_k}_{k=0}^∞ be a sequence generated by DSSM-II. If one of the following statements holds:
(i) 
α_k = α ∈ (0, 2δ/(2τ + 3)) for all k ∈ ℕ_0;
(ii) 
{α_k}_{k=0}^∞ is nonincreasing with lim_{k→∞} α_k = 0 and Σ_{k=0}^∞ α_k = ∞,
then there exists K ∈ ℕ_0 such that x_K ∈ X*.
Proof. 
We suppose by contradiction that x_k ∉ X* for all k ∈ ℕ_0. This implies x_{k-τ_k} ∉ X* for all k ∈ ℕ_0. Note that ‖(x* + δ g_{k-τ_k}) - x*‖ = δ‖g_{k-τ_k}‖ = δ, so that x* + δ g_{k-τ_k} lies in the closure of B(x*; δ), which is contained in X* since X* is closed and B(x*; δ) ⊆ X*. Since x_{k-τ_k} ∉ X*, we have f(x* + δ g_{k-τ_k}) < f(x_{k-τ_k}). Thus, by the definition of the star subgradient of f at x_{k-τ_k}, we obtain that
⟨g_{k-τ_k}, x* + δ g_{k-τ_k} - x_{k-τ_k}⟩ ≤ 0,
which yields
⟨g_{k-τ_k}, x_{k-τ_k} - x*⟩ ≥ δ.   (16)
On the other hand, since x_k ∉ X*, we have from the definition of DSSM-II that x_{k+1} = P_X(x_k - α_k g_{k-τ_k}), and the nonexpansiveness of P_X implies that
‖x_{k+1} - x*‖² ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k⟨g_{k-τ_k}, x_{k-τ_k} - x*⟩,
which, combined with (16), leads to
‖x_{k+1} - x*‖² ≤ ‖x_k - x*‖² + α_k² + 2α_k‖x_k - x_{k-τ_k}‖ - 2α_k δ.   (17)
Let N be a fixed non-negative integer. By summing up (17) from k = 0 to k = N, we have
0 ≤ ‖x_0 - x*‖² + Σ_{k=0}^{N} α_k² + 2Σ_{k=0}^{N} α_k‖x_k - x_{k-τ_k}‖ - 2δΣ_{k=0}^{N} α_k,
which implies
δ ≤ ‖x_0 - x*‖²/(2Σ_{k=0}^{N} α_k) + Σ_{k=0}^{N} α_k²/(2Σ_{k=0}^{N} α_k) + (Σ_{k=0}^{N} α_k‖x_k - x_{k-τ_k}‖)/(Σ_{k=0}^{N} α_k).   (18)
Based on the conditions (i) and (ii), we will divide our consideration into two cases.
Case 1: Assume that α_k = α ∈ (0, 2δ/(2τ + 3)) for all k ∈ ℕ_0.
Let us note from (15) that
Σ_{k=0}^{N} α_k‖x_k - x_{k-τ_k}‖ ≤ Σ_{i=0}^{τ} Σ_{k=0}^{N} α_k² = (τ + 1)(N + 1)α².
Thus, we have
lim_{N→∞} [‖x_0 - x*‖²/(2Σ_{k=0}^{N} α_k) + Σ_{k=0}^{N} α_k²/(2Σ_{k=0}^{N} α_k) + (Σ_{k=0}^{N} α_k‖x_k - x_{k-τ_k}‖)/(Σ_{k=0}^{N} α_k)] ≤ lim_{N→∞} ‖x_0 - x*‖²/(2(N + 1)α) + α/2 + (τ + 1)α = (2τ + 3)α/2 < δ,
which is a contradiction to the inequality (18).
Case 2: Assume that {α_k}_{k=0}^∞ is a nonincreasing sequence with lim_{k→∞} α_k = 0 and Σ_{k=0}^∞ α_k = ∞. We obtain from Theorem 3 that lim_{k→∞} ‖x_k - x_{k-τ_k}‖ = 0. Thus, using the facts that lim_{k→∞} α_k = 0, Σ_{k=0}^∞ α_k = ∞, and lim_{k→∞} ‖x_k - x_{k-τ_k}‖ = 0 together with Lemma 1, we obtain
lim_{N→∞} [‖x_0 - x*‖²/(2Σ_{k=0}^{N} α_k) + Σ_{k=0}^{N} α_k²/(2Σ_{k=0}^{N} α_k) + (Σ_{k=0}^{N} α_k‖x_k - x_{k-τ_k}‖)/(Σ_{k=0}^{N} α_k)] = 0,
which also contradicts the inequality (18). □

4. Numerical Experiments

In this section, we present experimental results obtained by applying DSSM-I to the Cobb–Douglas production efficiency problem. We performed all experiments in MATLAB (R2023b) on a MacBook Air (13.3-inch) with an Apple M1 processor and 8 GB of memory.
The Cobb–Douglas production efficiency problem [3,6,12] aims to maximize the ratio of the total profit, represented by a Cobb–Douglas production function, to the total cost, represented by a linear function of the product factors, subject to the funding level constraints. More precisely, let a_0, c_0 > 0 and let a_j, b_ij, c_j, p_i ≥ 0 for all j = 1, 2, …, n and i = 1, 2, …, m; the Cobb–Douglas production efficiency problem is defined by
maximize f(x) := a_0 ∏_{j=1}^{n} x_j^{a_j} / (Σ_{j=1}^{n} c_j x_j + c_0)
subject to Σ_{j=1}^{n} b_ij x_j ≥ p_i, i = 1, 2, …, m,
          x_j ≥ 0, j = 1, 2, …, n,   (19)
where Σ_{j=1}^{n} a_j = 1. Here, x := (x_1, x_2, …, x_n) ∈ ℝ^n, where x_j is the j-th product factor (j = 1, 2, …, n), and, for each i = 1, 2, …, m, p_i is the profit to be obtained from project i and b_ij is the support of product factor j to project i needed to achieve p_i. According to the literature [6,12], the Cobb–Douglas production efficiency problem (19) is a quasi-convex optimization problem. Note that the function f is continuous and quasi-concave [4] (Theorem 2.5.1), satisfies the Hölder condition on ℝ^n [12] (Appendix B), and the constraint set C := ∩_{i=1}^{m} {x ∈ ℝ^n : ⟨b_i, x⟩ ≥ p_i} ∩ [0, ∞)^n is a nonempty closed convex set, where b_i := (b_i1, …, b_in)^T is a vector in ℝ^n. Therefore, we can apply DSSM-I to solve the problem (19). However, since the exact optimal solution of problem (19) is not known, we are unable to apply DSSM-II to it. To ensure the boundedness of the constrained set, we set the constrained set to be D := ∩_{i=1}^{m} {x ∈ ℝ^n : ⟨b_i, x⟩ ≥ p_i} ∩ [0.001, 100]^n, which is a subset of C. Thus, the established convergence result in Theorem 2 guarantees that lim_{k→∞} f(x_k) = f*. It is worth noting that computing the metric projection onto the constrained set D is not an easy task since there is no explicit form of P_D in general. To deal with this situation, we utilized the classical Halpern iteration in the following inner loop: for an arbitrary initial point u_0 ∈ ℝ^n and a sequence {λ_l}_{l=0}^∞ ⊆ (0, 1) such that lim_{l→∞} λ_l = 0, Σ_{l=0}^∞ λ_l = ∞, and Σ_{l=1}^∞ |λ_{l+1} - λ_l| < ∞, we compute
u_{l+1} := λ_l (x_k - α_k g_{k-τ_k}) + (1 - λ_l) P_{D_{m+1}} P_{D_m} P_{D_{m-1}} ⋯ P_{D_1}(u_l) for all l ∈ ℕ_0,
where D_{m+1} := [0.001, 100]^n and D_i := {x ∈ ℝ^n : ⟨b_i, x⟩ ≥ p_i} for all i = 1, 2, …, m. Then the sequence {u_l}_{l=0}^∞ converges to the unique point P_D(x_k - α_k g_{k-τ_k}); see [28] (Theorem 30.1) for further details. In our experiments, we set the inner initial point u_0 to be the vector whose coordinates are all 1 and set λ_l := 1/(l + 2) for all l ∈ ℕ_0. We used the stopping criterion ‖u_{l+1} - u_l‖/‖u_{l+1}‖ ≤ 10^{-6} for the inner loop.
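The inner loop above is straightforward to implement because each P_{D_i} has a closed form. The following Python sketch (our own illustration, not the authors' MATLAB code) approximates P_D by this Halpern-type iteration, assuming the half-space orientation ⟨b_i, x⟩ ≥ p_i as reconstructed in (19); the function names and the anchor argument z = x_k - α_k g_{k-τ_k} are ours.

```python
import numpy as np

def proj_box(y, lo=0.001, hi=100.0):
    # Projection onto D_{m+1} = [lo, hi]^n.
    return np.clip(y, lo, hi)

def proj_halfspace_ge(y, b, p):
    # Projection onto D_i = {x : <b, x> >= p}.
    viol = p - np.dot(b, y)
    return y if viol <= 0 else y + (viol / np.dot(b, b)) * b

def halpern_projection(z, B, p, lam=lambda l: 1.0 / (l + 2), tol=1e-6, max_iter=10_000):
    """Approximate P_D(z) for D = (intersection of half-spaces) ∩ box via the
    Halpern inner loop described above (a sketch).  B is the m-by-n matrix with
    rows b_i, and p is the vector of p_i."""
    u = np.ones_like(z, dtype=float)               # inner initial point: all coordinates equal to 1
    for l in range(max_iter):
        Tu = u
        for b_i, p_i in zip(B, p):                 # apply P_{D_1}, ..., P_{D_m} in turn
            Tu = proj_halfspace_ge(Tu, b_i, p_i)
        Tu = proj_box(Tu)                          # then P_{D_{m+1}}
        u_next = lam(l) * z + (1.0 - lam(l)) * Tu  # Halpern step anchored at z = x_k - alpha_k g_{k-tau_k}
        if np.linalg.norm(u_next - u) <= tol * np.linalg.norm(u_next):
            return u_next
        u = u_next
    return u
```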
To perform the numerical experiments, we randomly generated the parameters a_j, b_ij ∈ (0, 1), a_0, c_0, c_j ∈ (0, 10), and p_i ∈ (0, n²). We chose the stepsize and the time-varying delays as α_k := 1/(k + 1) and τ_k := k mod (τ + 1) for all k ∈ ℕ_0, respectively. We set the initial point x_0 to be the vector whose coordinates are all 1. To examine the influence of the delay bound τ, we ran DSSM-I with delay bounds τ = 0, 1, 3, 5, and 10 for various problem sizes n and m. Note that DSSM-I with τ = 0 is nothing else than the star subgradient method (SSM) proposed by Kiwiel [1]. Figure 1 illustrates the number of star subgradient calculations performed by SSM and DSSM-I with the delay strategy τ_k := k mod (τ + 1) for all k ∈ ℕ_0.
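The remaining ingredient of DSSM-I is a star subgradient oracle for the minimized objective -f. One convenient choice (our own illustration; the paper does not state which star subgradient was computed) uses the fact that the gradient of a differentiable quasi-convex function is always a star subgradient, so the normalized gradient of -f serves as a unit star subgradient whenever it is nonzero.

```python
import numpy as np

def neg_cobb_douglas(x, a0, a, c, c0):
    # h(x) = -f(x), the quasi-convex function actually minimized by DSSM-I.
    return -a0 * np.prod(x ** a) / (np.dot(c, x) + c0)

def unit_star_subgrad(x, a0, a, c, c0):
    # grad h(x) = -f(x) * (a / x - c / (c^T x + c0)) componentwise, valid for x > 0;
    # a gradient of a differentiable quasi-convex function is a star subgradient.
    fx = a0 * np.prod(x ** a) / (np.dot(c, x) + c0)
    g = -fx * (a / x - c / (np.dot(c, x) + c0))
    return g / np.linalg.norm(g)
```

Here a and c denote the coefficient vectors (a_1, …, a_n) and (c_1, …, c_n), and x is assumed to have strictly positive coordinates, which is guaranteed by the box [0.001, 100]^n.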
The average results of the 10 random data sets are presented in Figure 2 and Figure 3 below.
It can be observed from Figure 2 that, overall, all the results show a similar pattern as they approach the (approximated) optimal value, as supported by Theorem 2. In each subfigure, we notice that the result for τ = 10 appears stable and decreases faster than those for the other values of τ.
Next, we plot the relative errors of the objective function values in the same setting as above in Figure 3.
It can be seen from Figure 3 that all results decrease and eventually stabilize. More precisely, the graphs exhibit fluctuations, alternating between decreases and increases over short periods; these fluctuations gradually diminish and vanish as the iterates stabilize. Again, this behavior is consistent with the convergence result for DSSM-I, and we notice that the results with τ = 10 seem to decrease to 0 more rapidly than those for the other values of τ.

5. Conclusions

In this work, we proposed projected star subgradient methods with delayed subgradient updates, namely DSSM-I and DSSM-II, for solving the constrained quasi-convex optimization problem. For DSSM-I, we proved that there exists a subsequence of the generated sequence that converges to an optimal solution provided that the generated sequence is bounded. Furthermore, under a compactness or coercivity assumption with diminishing step sizes, we proved that the distance between the sequence and the optimal solution set X* converges to zero, which yields convergence of the objective function values to the optimal value. For the modified version, DSSM-II, and without requiring the compactness or coercivity assumptions, we proved that the generated sequence converges to an optimal solution. We also established finite convergence provided that the interior of the optimal solution set X* is nonempty. Finally, we applied DSSM-I to solve the Cobb–Douglas production efficiency problem. It is worth noting that the proposed methods can reduce the computational cost of star subgradients by allowing the methods to reuse stale data instead. Since we could not give a precise upper bound on the number of times a star subgradient can be reused, this question remains open. Moreover, the proposed methods require a closed-form expression for the metric projection onto the constraint set X; we overcame this obstacle by solving a sub-problem to approximate P_X at each iteration. This also motivates a future direction involving a method that does not require solving such a sub-problem.

Author Contributions

Conceptualization, O.P. and N.N.; methodology, O.P. and N.N.; software, O.P.; validation, O.P. and N.N.; formal analysis, O.P. and N.N.; investigation, O.P. and N.N.; writing—original draft preparation, O.P.; writing—review and editing, O.P. and N.N.; visualization, O.P.; supervision, N.N.; project administration, N.N.; funding acquisition, N.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Fundamental Fund of Khon Kaen University. This research received funding support from the National Science, Research, and Innovation Fund (NSRF). O. Pankoon was supported by the Development and Promotion of Science and Technology Talents Project (DPST).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Dataset available on request from the authors.

Acknowledgments

The authors are thankful to the editor and two anonymous referees for comments and remarks which improved the quality and presentation of this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Kiwiel, K.C. Convergence and efficiency of subgradient methods for quasiconvex minimization. Math. Program. 2001, 90, 1–25.
2. Cambini, A.; Martein, L. Generalized Convexity and Optimization; Springer: Berlin/Heidelberg, Germany, 2008.
3. Bradley, S.P.; Frey, S.C. Fractional programming with homogeneous functions. Oper. Res. 1974, 22, 350–357.
4. Stancu-Minasian, I.M. Fractional Programming; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1997.
5. Schaible, S.; Shi, J. Fractional programming: The sum-of-ratios case. Optim. Methods Softw. 2003, 18, 219–229.
6. Hu, Y.; Yang, X.; Sim, C.-K. Inexact subgradient methods for quasi-convex optimization problems. Eur. J. Oper. Res. 2015, 240, 315–327.
7. Hu, Y.; Yu, C.K.W.; Li, C. Stochastic subgradient method for quasi-convex optimization problems. J. Nonlinear Convex Anal. 2016, 17, 711–724.
8. Hu, Y.; Yu, C.K.W.; Li, C.; Yang, X. Conditional subgradient methods for constrained quasi-convex optimization problems. J. Nonlinear Convex Anal. 2016, 17, 2143–2158.
9. Hu, Y.; Yu, C.K.W.; Yang, X. Incremental quasi-subgradient methods for minimizing the sum of quasi-convex functions. J. Glob. Optim. 2019, 75, 1003–1028.
10. Konnov, I.V. On convergence properties of a subgradient method. Optim. Methods Softw. 2003, 18, 53–62.
11. Konnov, I.V. On properties of supporting and quasi-supporting vectors. J. Math. Sci. 1994, 71, 2760–2763.
12. Hishinuma, K.; Iiduka, H. Fixed point quasiconvex subgradient method. Eur. J. Oper. Res. 2020, 282, 428–437.
13. Choque, J.; Lara, F.; Marcavillaca, R.T. A subgradient projection method for quasiconvex minimization. Positivity 2024, 28, 64.
14. Zhao, X.; Köbis, M.A.; Yao, Y. A projected subgradient method for nondifferentiable quasiconvex multiobjective optimization problems. J. Optim. Theory Appl. 2021, 190, 82–107.
15. Ermol'ev, Y.M. Methods of solution of nonlinear extremal problems. Cybernetics 1966, 2, 1–14.
16. Penot, J.-P. Are generalized derivatives useful for generalized convex functions? In Generalized Convexity, Generalized Monotonicity: Recent Results; Crouzeix, J.-P., Martinez-Legaz, J.-E., Volle, M., Eds.; Springer: Boston, MA, USA, 1998; pp. 3–59.
17. Penot, J.-P.; Zălinescu, C. Elements of quasiconvex subdifferential calculus. J. Convex Anal. 2000, 7, 243–269.
18. Arjevani, Y.; Shamir, O.; Srebro, N. A tight convergence analysis for stochastic gradient descent with delayed updates. In Proceedings of the 31st International Conference on Algorithmic Learning Theory, San Diego, CA, USA, 8–11 February 2020; Kontorovich, A., Neu, G., Eds.; Proceedings of Machine Learning Research; PMLR: Cambridge, MA, USA, 2020; Volume 117, pp. 111–132.
19. Stich, S.U.; Karimireddy, S.P. The error-feedback framework: Better rates for SGD with delayed gradients and compressed updates. J. Mach. Learn. Res. 2020, 21, 9613–9648.
20. Gürbüzbalaban, M.; Ozdaglar, A.; Parrilo, P.A. On the convergence rate of incremental aggregated gradient algorithms. SIAM J. Optim. 2017, 27, 1035–1048.
21. Tseng, P.; Yun, S. Incrementally updated gradient methods for constrained and regularized optimization. J. Optim. Theory Appl. 2014, 160, 832–853.
22. Vanli, N.D.; Gürbüzbalaban, M.; Ozdaglar, A. Global convergence rate of proximal incremental aggregated gradient methods. SIAM J. Optim. 2018, 28, 1282–1300.
23. Butnariu, D.; Censor, Y.; Reich, S. Distributed asynchronous incremental subgradient methods. In Inherently Parallel Algorithms in Feasibility and Optimization and Their Applications; Brezinski, C., Wuytack, L., Reich, S., Eds.; Elsevier Science B.V.: Amsterdam, The Netherlands, 2001; Volume 8, p. 381.
24. Namsak, S.; Petrot, N.; Nimana, N. A distributed proximal gradient method with time-varying delays for solving additive convex optimizations. Results Appl. Math. 2023, 18, 100370.
25. Deng, X.; Shen, L.; Li, S.; Sun, T.; Li, D.; Tao, D. Towards understanding the generalizability of delayed stochastic gradient descent. arXiv 2023, arXiv:2308.09430.
26. Arunrat, T.; Namsak, S.; Nimana, N. An asynchronous subgradient-proximal method for solving additive convex optimization problems. J. Appl. Math. Comput. 2023, 69, 3911–3936.
27. Cegielski, A. Iterative Methods for Fixed Point Problems in Hilbert Spaces; Springer: Berlin/Heidelberg, Germany, 2012.
28. Bauschke, H.H.; Combettes, P.L. Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd ed.; Springer: Cham, Switzerland, 2017.
29. Kiwiel, K.C. Convergence of approximate and incremental subgradient methods for convex optimization. SIAM J. Optim. 2004, 14, 807–840.
Figure 1. Number of subgradient calculations for various delay bounds τ versus the number of iterations.
Figure 2. Behavior of objective function values for various delay bounds τ and problem sizes n and m.
Figure 3. Behavior of relative errors for various delay bounds τ and problem sizes n and m.