Abstract
The generalized convex nearly isotonic regression problem addresses a least squares regression model that incorporates both sparsity and monotonicity constraints on the regression coefficients. In this paper, we introduce an efficient semismooth Newton-based augmented Lagrangian (Ssnal) algorithm to solve this problem. We demonstrate that, under reasonable assumptions, the Ssnal algorithm achieves global convergence and exhibits a linear convergence rate. Computationally, we derive the generalized Jacobian matrix associated with the proximal mapping of the generalized convex nearly isotonic regression regularizer and leverage the second-order sparsity when applying the semismooth Newton method to the subproblems in the Ssnal algorithm. Numerical experiments conducted on both synthetic and real datasets clearly demonstrate that our algorithm significantly outperforms first-order methods in terms of efficiency and robustness.
Keywords:
generalized convex nearly isotonic regression; augmented Lagrangian algorithm; semismooth Newton method
MSC:
90C06; 90C25; 90C90
1. Introduction
Data generated in various fields often exhibit clear monotonicity, as seen in meteorological and climate data [1,2], economic demand/supply curves [3], and biological growth curves [4]. Thus, this paper focuses on statistical models under order constraints. Specifically, suppose that we have m observations, each consisting of a feature vector with n components and a scalar response value. We concentrate on addressing the following optimization problem:
where the data matrix has size m × n, the response vector collects the m observed responses, and the regularization parameters are given nonnegative scalars. In high-dimensional statistical regression, it is common for the number of features to exceed the number of samples; therefore, throughout this paper we assume that n > m. The penalty term is composed of two components: the first enforces sparsity in the coefficient estimates by incorporating prior knowledge, and the second penalizes violations of monotonicity between adjacent coefficients.
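To fix ideas, one common way to write such an objective, using illustrative symbols that may differ from the authors' notation in (1) (coefficient vector z ∈ R^n, data matrix A ∈ R^{m×n}, response vector b ∈ R^m, regularization parameters λ, τ ≥ 0, and (x)_+ := max(x, 0)), is
\[
\min_{z \in \mathbb{R}^n}\ \tfrac{1}{2}\|Az-b\|^2 \;+\; \lambda\|z\|_1 \;+\; \tau \sum_{i=1}^{n-1} (z_i - z_{i+1})_+ ,
\]
where the \ell_1 term induces sparsity and the positive-part term charges a cost whenever an adjacent pair of coefficients breaks the desired ordering.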
Problem (1) generalizes a wide range of ordered convex problems, including the isotonic regression model [5], the nearly isotonic regression model [6], and the ordered lasso problem [7]. The isotonic regression problem seeks a vector z that approximates a given data vector while requiring the entries of z to form a non-decreasing (or non-increasing) sequence, i.e.,
Since this hard monotonicity constraint may lead to a model that is too rigid to adapt to complex data structures, Tibshirani et al. [6] relaxed the constraint and considered the following nearly isotonic regression model:
where the penalty parameter is a given nonnegative scalar. It is evident that problem (1) is a generalization of problem (3), as it addresses general regression settings and additionally encourages sparsity in the coefficients. We refer to problem (1) as the generalized convex nearly isotonic regression (GCNIR) problem.
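For reference, with a data vector y ∈ R^n (an illustrative symbol), the isotonic regression problem (2) and the nearly isotonic relaxation (3) of [6] can be written, up to notation, as
\[
\min_{z \in \mathbb{R}^n}\ \tfrac{1}{2}\|y - z\|^2 \quad \text{s.t.} \quad z_1 \le z_2 \le \cdots \le z_n ,
\qquad \text{and} \qquad
\min_{z \in \mathbb{R}^n}\ \tfrac{1}{2}\|y - z\|^2 + \lambda \sum_{i=1}^{n-1} (z_i - z_{i+1})_+ ,
\]
respectively; the penalty in (3) charges a cost proportional to each violation of monotonicity instead of forbidding it outright.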
In addition, problem (1) can also be regarded as a generalization of the following ordered lasso problem [7]:
Clearly, problem (4) extends the lasso problem by incorporating a monotonicity constraint on the absolute values of the coefficients. Like problem (2), however, this hard ordering requirement can make model (4) overly rigid. The GCNIR problem (1) relaxes the stringent monotonicity requirement of the ordered lasso into a penalty, yielding a convex problem that is more flexible and tractable.
The GCNIR problem (1) can be reformulated into a convex quadratic programming (QP) problem by introducing new variables:
where
and the all-ones column vector and the identity matrix are used to assemble the constraint data. This implies that one can utilize the QP function “quadprog” provided by MATLAB or well-developed QP solvers, such as Gurobi and CPLEX [8], to solve reformulation (5) and thus problem (1). However, since the QP coefficient matrices are substantially larger than the original problem data, the cost of storing and factorizing them becomes prohibitive, making it challenging to apply the aforementioned methods to large-scale problems.
Due to the challenges in solving the QP reformulation (5), it is logical to adapt the methods used for the previously discussed problems to address problem (1). The pool adjacent violators algorithm (PAVA) [9] is a cornerstone method for tackling shape-constrained statistical regression problems, as discussed in [10]. Initially developed for the isotonic regression model (2), PAVA has been extended to accommodate the nearly isotonic regression model (3), with adaptations such as the modified PAVA (MPAVA) [6] and the generalized PAVA (GPAVA) [2]. Despite its broad application, there is no theoretical guarantee that PAVA can be modified to tackle convex nonseparable minimization problems. Additionally, other approaches, such as the Generalized Proximal Gradient algorithm [7] and the alternating direction method of multipliers (ADMM) [11], have been proposed for solving the ordered lasso problem (4). To our knowledge, most current techniques for dealing with ordered models rely primarily on first-order information from the associated nonsmooth optimization framework. Consequently, we aim to develop a customized algorithm that utilizes second-order information to address the GCNIR problem more effectively.
This paper aims to develop a semismooth Newton-based augmented Lagrangian (Ssnal) algorithm to address the GCNIR problem (1) from a dual viewpoint. The Ssnal algorithm’s primary benefits include its superior convergence characteristics and reduced computational demands, which are achieved by exploiting second-order sparsity and employing efficient strategies within the semismooth Newton (Ssn) algorithm. Furthermore, the Ssnal algorithm has demonstrated its effectiveness in handling large-scale sparse convex models, as evidenced by its performance in applications such as Lasso [12], group Lasso [13], fused Lasso [14], clustered Lasso [15,16], multi-task Lasso [17], trend filtering [18], density matrix least squares problems [19], the Dantzig selector [20], and others [21,22,23,24]. Building on these successes, we propose to apply the Ssnal algorithm to solve problem (1).
The primary contributions of this paper are as follows. First, we calculate the proximal mapping related to the GCNIR regularizer and its generalized Jacobian. Second, we utilize the Ssnal algorithm to address the GCNIR problem from a dual perspective. Furthermore, by capitalizing on the low-rank properties and second-order sparsity inherent in the GCNIR problem, we significantly reduce the computational cost associated with the Ssn algorithm when solving the subproblems. Lastly, we perform a numerical analysis comparing our algorithm with first-order methods, including ADMM and the Accelerated Proximal Gradient (APG) method, demonstrating the efficiency and robustness of our approach.
The remaining sections of this paper are organized as follows. Section 2 delves into the analysis of the proximal mapping associated with the GCNIR regularizer and its generalized Jacobian. Section 3 outlines the framework of the Ssnal algorithm and discusses its convergence properties when applied to the dual formulation of the GCNIR problem (1). In Section 4, we evaluate the performance of the Ssnal algorithm through numerical experiments. Finally, we conclude the paper in Section 5.
Notation. For a vector z, “Diag(z)” represents the diagonal matrix whose i-th diagonal entry is the i-th component of z. “|z|” refers to the componentwise absolute value of z, and “sign(z)” denotes the sign vector, whose i-th entry is 1, −1, or 0 according to whether the i-th component of z is positive, negative, or zero. Additionally, “supp(z)” refers to the support of z, specifically the collection of indices at which z is nonzero. For any positive integer n, the all-ones column vector and the standard unit column vectors are used with their usual meanings, and the identity matrix is denoted accordingly. The Moore–Penrose pseudoinverse of a matrix is denoted by a superscript dagger. Typically, h* denotes the Fenchel conjugate of a given function h.
2. The Proximal Mapping of the GCNIR Regularizer and Its Generalized Jacobian
In this section, we shall present some results concerning the proximal mapping linked to the GCNIR regularizer along with its generalized Jacobian, which are necessary for later analysis.
Given any positive scalar and any proper closed convex function p, the proximal mapping and the Moreau envelope [25] of p are defined in the standard way, and the Moreau identity [26] holds between the proximal mappings of p and of its conjugate. According to [27], the Moreau envelope is convex and continuously differentiable, and its gradient can be expressed through the proximal mapping.
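For completeness, with a generic parameter t > 0 (the paper may use a different symbol and scaling convention), these objects and identities can be written as
\[
\operatorname{prox}_{tp}(w) := \arg\min_{x}\Bigl\{ p(x) + \tfrac{1}{2t}\|x-w\|^2 \Bigr\},\qquad
E_{tp}(w) := \min_{x}\Bigl\{ p(x) + \tfrac{1}{2t}\|x-w\|^2 \Bigr\},
\]
\[
\operatorname{prox}_{tp}(w) + t\,\operatorname{prox}_{p^{*}/t}(w/t) = w \quad (\text{Moreau identity}),\qquad
\nabla E_{tp}(w) = \tfrac{1}{t}\bigl(w - \operatorname{prox}_{tp}(w)\bigr).
\]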
In what follows, the GCNIR regularizer refers to the penalty term in (1), namely the sum of the sparsity penalty and the monotonicity penalty, with nonnegative regularization parameters λ and τ.
Before deriving the proximal mapping associated with the GCNIR regularizer, we briefly introduce an auxiliary function and the relevant results discussed in [14].
Define the auxiliary function associated with the matrix B, where B is given by
The proximal mapping of this auxiliary function is given as
Lemma 1.
(See [14], Lemma 1). For any given w, the proximal mapping above admits the following explicit form, where
On the basis of the above lemma, we can now explicitly calculate the proximal mapping of the GCNIR regularizer. For later convenience, we also introduce a shorthand notation for this composite mapping, which is used in the sequel.
Proposition 1.
For any given λ, τ ≥ 0, it holds that for any w,
Proof.
According to the definition of the proximal mapping, it holds that for any w,
It then follows from ([28], Corollary 4) that for any w,
This completes the proof. □
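In code, Proposition 1 states that the GCNIR proximal mapping can be evaluated by first applying the proximal mapping of the nearly isotonic penalty and then performing componentwise soft-thresholding. A minimal MATLAB sketch is given below; prox_near_iso is a hypothetical helper (for instance, a modified PAVA routine in the spirit of [6]), and the argument order is our own choice.

function x = prox_gcnir(w, lambda, tau, prox_near_iso)
% Proximal mapping of p(z) = lambda*||z||_1 + tau*sum(max(z(i) - z(i+1), 0)),
% evaluated via the decomposition in Proposition 1.
    u = prox_near_iso(w, tau);                  % prox of the nearly isotonic penalty
    x = sign(u) .* max(abs(u) - lambda, 0);     % componentwise soft-thresholding
end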
Next, we analyze the generalized Jacobian of this proximal mapping, which is crucial for achieving computational efficiency. We begin by presenting some results concerning the generalized HS-Jacobian, following [14,29].
As noted in [14], the generalized HS-Jacobian at a given point is given by
where with
and where an optimal Lagrange multiplier for the corresponding constraint appears, and
is an active index set.
The multifunction is given by
The subsequent proposition demonstrates that these multifunctions may be regarded as the generalized HS-Jacobians of the corresponding proximal mappings at the respective points.
Proposition 2.
For all w, there exists a neighborhood of w such that for any point in this neighborhood,
where .
Proof.
The results are derived from [14] (Proposition 2) and [29] (Lemma 2.1) with minor revisions. □
The multifunction is defined as
where
This mapping essentially acts as the generalized Jacobian of the proximal mapping of the GCNIR regularizer at w, which can be derived using the change-of-variables technique from [14] (Theorem 2).
Theorem 1.
Let λ and τ be non-negative real numbers, and let w be a given point. The set-valued mapping defined in (8) is nonempty, compact-valued, and upper semicontinuous. Each matrix V in this set is symmetric and positive semidefinite. Furthermore, there exists a neighborhood of w such that for any point in this neighborhood,
3. A Semismooth Newton-Based Augmented Lagrangian Algorithm
This section introduces the semismooth Newton-based augmented Lagrangian (Ssnal) algorithm, an efficient approach for solving the GCNIR problem in the high-dimension, low-sample setting, i.e., the case where the number of features n far exceeds the number of samples m. Directly applying the Ssnal algorithm to the primal problem would require solving linear systems whose dimension equals the number of features, leading to significant computational costs, particularly for large-scale problems. To overcome this, we solve the GCNIR problem from the dual perspective.
We can reformulate the GCNIR problem (1) as
The dual problem of (P) takes the following minimization form:
The Lagrangian function of (D) is
Additionally, given a positive penalty parameter, the augmented Lagrangian function of (D) is obtained by augmenting the Lagrangian with a quadratic penalty on the linear constraint.
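For illustration, following the Ssnal literature [12], if (P) is written as minimizing (1/2)||Az − b||^2 + p(z) with p the GCNIR regularizer (the symbols A, b, u, v, z are chosen here for exposition, and the sign conventions may differ from those in the paper's (D)), then one valid dual formulation and the corresponding augmented Lagrangian with penalty parameter σ > 0 read
\[
\min_{u,\,v}\ \tfrac{1}{2}\|u\|^2 + \langle b, u\rangle + p^{*}(v) \quad \text{s.t.} \quad A^{\top}u + v = 0,
\]
\[
\mathcal{L}_{\sigma}(u,v;z) = \tfrac{1}{2}\|u\|^2 + \langle b, u\rangle + p^{*}(v) + \langle z,\, A^{\top}u + v\rangle + \tfrac{\sigma}{2}\|A^{\top}u + v\|^2 ,
\]
where z plays the role of the multiplier and recovers the primal variable at optimality.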
3.1. The Framework of the Ssnal Algorithm
Below is the outline of the framework for an Ssnal algorithm designed to solve problem (D) (Algorithm 1).
Algorithm 1 (Ssnal): A semismooth Newton-based augmented Lagrangian algorithm for (D).
Initialization: , , , ,
The stopping criteria, which have been studied in [30,31] for approximately solving subproblem (9), are denoted by (C1)–(C3).
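Although the precise constants are not reproduced here, such criteria typically take the following form in the Ssnal literature [12,30,31], stated for the subproblem objective Ψ_j in (9), an approximate subproblem solution (u^{j+1}, v^{j+1}), the multiplier iterate z^j, the penalty parameter σ_j, and summable (respectively, vanishing) tolerance sequences; the exact version used in this paper may differ slightly:
\[
\begin{aligned}
&(\mathrm{C1})\quad \Psi_j(u^{j+1},v^{j+1}) - \inf \Psi_j \le \frac{\varepsilon_j^2}{2\sigma_j}, \qquad \sum_{j\ge 0}\varepsilon_j < \infty,\\
&(\mathrm{C2})\quad \Psi_j(u^{j+1},v^{j+1}) - \inf \Psi_j \le \frac{\delta_j^2}{2\sigma_j}\,\|z^{j+1}-z^{j}\|^2, \qquad \sum_{j\ge 0}\delta_j < \infty,\\
&(\mathrm{C3})\quad \operatorname{dist}\bigl(0,\partial \Psi_j(u^{j+1},v^{j+1})\bigr) \le \frac{\delta_j'}{\sigma_j}\,\|z^{j+1}-z^{j}\|, \qquad \delta_j' \to 0.
\end{aligned}
\]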
Next, we present the convergence results of the Ssnal algorithm, addressing both global and local convergence. Since problem (P) is feasible, ref. [30] (Theorem 4) establishes that satisfying the stopping criterion (C1) guarantees the global convergence of the Ssnal algorithm for problem (D).
Theorem 2.
Let the objective function of problem (P) be denoted by f. It is clear that the subdifferential mapping of f and the maximal monotone operator associated with the Lagrangian of (D) are piecewise polyhedral multifunctions as described in [32]. This implies that both multifunctions satisfy the error bound condition at the origin with positive moduli, as characterized in [33].
Following the analysis presented above, we establish the local convergence of Algorithm 1, supported by the results in [12,30,31,34]. The proof is analogous to that of [12] (Theorem 3.3); therefore, we omit the details here.
Theorem 3.
(Local convergence.) Under the stopping criteria (C1) and (C2), the sequence generated by Algorithm 1 converges to an optimal solution, and for all sufficiently large j, the following holds:
where the convergence rate constants are as characterized in [30,31,34]. Furthermore, if criterion (C3) is also enforced, then for all sufficiently large j,
where the corresponding rate constant tends to zero as the penalty parameter tends to infinity.
3.2. Ssn Algorithm for Subproblem (9)
In this subsection, we present a highly efficient semismooth Newton (Ssn) algorithm designed to solve problem (9).
For any given point and a fixed penalty parameter, our goal is to solve the following problem:
where the objective function of problem (11) is given by
It is well known that this objective function is continuously differentiable with gradient
Since the objective function is strongly convex, we can obtain the unique solution of problem (11) by solving the following nonsmooth equations:
For any point, the following multifunction is well defined and can be expressed as
where the operator is defined in (8).
Remark 1.
Based on Theorem 1, we can deduce that this multifunction is not only nonempty but also compact-valued and upper semicontinuous. In addition, each of its elements is a symmetric and positive definite matrix.
Before delving into the Ssn algorithm, we introduce the following lemma, the proof of which is analogous to the one presented in [15] (Remark 2.12).
Lemma 2.
For any positive constant r, the proximal mapping of the GCNIR regularizer is r-order semismooth everywhere with respect to its generalized Jacobian multifunction. Similarly, the gradient appearing in (12) is r-order semismooth everywhere with respect to the multifunction defined above.
Proof.
Given that the building-block proximal mappings are piecewise affine, as shown in [14], their composition is Lipschitz continuous. Consequently, the proximal mapping of the GCNIR regularizer is also a piecewise affine and Lipschitz continuous function. As established in [35], such a function is directionally differentiable at every point. Combining this with Theorem 1 and the definition of semismoothness [36,37,38,39], we conclude that the proximal mapping is r-order semismooth everywhere. Similarly, it can be inferred that the gradient appearing in (12) is also r-order semismooth everywhere. This completes the proof. □
Given that the gradient is nonsmooth, it is appropriate to employ the semismooth Newton (Ssn) algorithm instead of the standard Newton method to solve Equations (12). Building on the analysis provided, we are now ready to proceed with the development of an Ssn algorithm to solve (12) (Algorithm 2).
Algorithm 2 (Ssn): A semismooth Newton algorithm for (12).
Initialization: , , . Set .
For problem (13), we can employ the conjugate gradient algorithm to obtain an approximate solution such that
where the tolerance parameters are those specified in Algorithm 2. The convergence of Algorithm 2 is established in the work of Li et al. [14] (Theorem 3), and we present their result directly as follows.
Theorem 4.
Let the sequence be generated by Algorithm 2 with the parameters chosen as specified. Then the sequence converges to the unique optimal solution of problem (11), and the rate of convergence is at least superlinear.
When applying Algorithm 2 to solve problem (12), the most computationally intensive step involves determining the search direction, which is derived from solving the linear system (13), i.e.,
Therefore, we aim to exploit the second-order information in the matrix V to reduce the computation time. Firstly, let
and denote with
where the corresponding quantity is given as in (7). It follows from [14] (Propositions 2 and 3) that the following identity holds, i.e.,
Additionally, we can restructure H as a block diagonal matrix and exploit the sparse low-rank structure of the generalized Jacobian to substantially reduce the computation time. Moreover, we have several options for solving the linear system (15), such as a Cholesky factorization, the Sherman–Morrison–Woodbury (SMW) formula, or the preconditioned conjugate gradient (PCG) method. These techniques further improve computational efficiency. Specific details can be found in [14,40], and we do not repeat them here.
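To make the role of second-order sparsity concrete, the following MATLAB sketch solves a Newton system of the generic form (I_m + σ A V A^T) d = rhs under the simplifying assumption that V is diagonal with 0/1 entries (the actual GCNIR generalized Jacobian in (7) and (8) carries an additional block structure); all helper names and thresholds are illustrative only.

% Solve (I_m + sigma*A*V*A') * d = rhs when V = Diag(dV) with dV in {0,1}^n.
J  = find(dV > 0);             % support of the diagonal Jacobian ("second-order sparsity")
AJ = A(:, J);                  % only |J| columns of A are needed
r  = numel(J);
m  = size(A, 1);
if r <= m
    % SMW: (I + sigma*AJ*AJ')^{-1}*rhs = rhs - sigma*AJ*((I_r + sigma*AJ'*AJ) \ (AJ'*rhs))
    d = rhs - sigma * (AJ * ((eye(r) + sigma * (AJ' * AJ)) \ (AJ' * rhs)));
elseif m <= 2000
    d = (eye(m) + sigma * (AJ * AJ')) \ rhs;      % direct (Cholesky-based) solve
else
    % Matrix-free PCG when m is large: only matrix-vector products are required.
    Afun   = @(x) x + sigma * (AJ * (AJ' * x));
    [d, ~] = pcg(Afun, rhs, 1e-8, 300);
end

When the support J is small, forming AJ'*AJ costs O(m r^2) and the r-by-r solve dominates, which is the essence of the savings discussed above.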
4. Numerical Experiments
In this section, we evaluate the efficiency of the Ssnal algorithm for solving the GCNIR problem by comparing it with the ADMM and APG algorithms using both synthetic and real datasets. The computational results were achieved by using MATLAB R2022b on a Dell desktop equipped with an Intel(R) Core(TM) i7-11700 CPU running at 2.50 GHz, along with 8 GB of RAM.
4.1. Some First-Order Methods for the GCNIR Problem
We provide a brief overview of the framework of ADMM and APG.
ADMM is recognized as a representative algorithm for addressing convex optimization problems [41], including the problem presented in (D). Here is a summary of the framework for the ADMM algorithm (see Algorithm 3):
Algorithm 3: ADMM for the dual problem (D).
Initialization: Set , , , , and initialize .
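As a reference implementation (not necessarily identical to the authors' Algorithm 3), one standard ADMM splitting for the dual form sketched in Section 3, namely minimizing (1/2)||u||^2 + <b,u> + p*(v) subject to A^T u + v = 0 with multiplier z, can be written in MATLAB as follows; prox_gcnir and prox_near_iso are the hypothetical helpers introduced earlier, and sigma, gamma, lambda, tau, maxit are user-chosen parameters.

% One standard dual-ADMM loop; in practice the matrix I_m + sigma*A*A' is factorized once.
m = size(A, 1);  n = size(A, 2);
v = zeros(n, 1);  z = zeros(n, 1);
for k = 1:maxit
    % u-update: an m-by-m linear system
    u = (eye(m) + sigma * (A * A')) \ (-(b + A * (z + sigma * v)));
    % v-update: prox of p^*/sigma, via the Moreau identity and the prox of sigma*p
    w = -A' * u - z / sigma;
    v = w - prox_gcnir(sigma * w, sigma * lambda, sigma * tau, @prox_near_iso) / sigma;
    % multiplier update with step length gamma in (0, (1 + sqrt(5))/2)
    z = z + gamma * sigma * (A' * u + v);
end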
Let L denote the Lipschitz constant of the gradient of the smooth loss function in the primal problem (P). Here is a summary of the framework of the APG algorithm (see Algorithm 4):
Algorithm 4: APG for the primal problem (P).
Initialization: Choose , set , , and initialize
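For comparison, a minimal MATLAB sketch of an APG (FISTA-type) iteration for the primal problem (P) is given below; it assumes L = norm(A)^2 as the Lipschitz constant of the gradient of (1/2)||A*z − b||^2, takes the problem data A, b, lambda, tau, maxit from the workspace, and reuses the hypothetical prox_gcnir helper, so it is not necessarily identical to the authors' Algorithm 4.

L = norm(A)^2;  t = 1;
z = zeros(size(A, 2), 1);  y = z;
for k = 1:maxit
    g     = A' * (A * y - b);                          % gradient of the smooth part at y
    z_new = prox_gcnir(y - g / L, lambda / L, tau / L, @prox_near_iso);
    t_new = (1 + sqrt(1 + 4 * t^2)) / 2;               % Nesterov momentum parameter
    y     = z_new + ((t - 1) / t_new) * (z_new - z);   % extrapolation step
    z = z_new;  t = t_new;
end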
4.2. Stopping Criteria
Utilizing the KKT conditions of problems (P) and (D), we measure the accuracy of an approximate solution by the following relative KKT residual:
In addition, let the objective values of (P) and (D) be denoted, respectively, as follows:
Then, the relative dual gap is defined by
In the later experiments, we start the Ssnal algorithm with the given initial parameters and terminate it when the relative KKT residual falls below a given error tolerance “to”. To enhance the convergence speed, it is essential to dynamically adjust the penalty parameter in the Ssnal algorithm. Specifically, we set its initial value and then tune it every three iterations, i.e.,
where the subscript j indicates the value of the corresponding quantity at the j-th iteration.
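As a reference, typical choices for the two accuracy measures in the Ssnal literature [12,14] take the following form, written with generic symbols (primal iterate z, data (A, b), GCNIR regularizer p, and the objective values of (P) and (D)); the exact scaling used in this paper may differ:
\[
\eta_{\mathrm{KKT}} = \frac{\bigl\| z - \operatorname{prox}_{p}\bigl( z - A^{\top}(Az-b) \bigr) \bigr\|}{1 + \|z\| + \|A^{\top}(Az-b)\|},
\qquad
\eta_{\mathrm{gap}} = \frac{|\,\mathrm{obj}_{P} - \mathrm{obj}_{D}\,|}{1 + |\mathrm{obj}_{P}| + |\mathrm{obj}_{D}|}.
\]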
In the case of the ADMM algorithm, we set the initial point and terminate the algorithm once the prescribed accuracy is reached. For the APG algorithm, we choose the initial point and terminate the algorithm in the same manner.
Additionally, we set a common tolerance level for all algorithms. The tested algorithms also terminate under two further conditions: upon reaching their maximum number of iterations (100 for Ssnal; 30,000 for ADMM and APG) or once their running time exceeds 3 h.
In the following tables, “nnz” and “mon” denote, respectively, the estimated number of non-zero elements in z and the estimated number of adjacent coefficient pairs that violate monotonicity, as derived from the Ssnal solution through the following estimations:
where the components of the computed solution are sorted so that the k-th entry is the k-th largest in magnitude. Time is shown in seconds.
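For illustration, the following MATLAB lines compute such estimates, assuming the same 0.999-threshold rule for “nnz” as in [12] and taking “mon” to count adjacent pairs that violate monotonicity beyond a small tolerance; both the threshold and the tolerance are assumptions.

zs      = sort(abs(z), 'descend');                     % magnitudes in decreasing order
nnz_est = find(cumsum(zs) >= 0.999 * norm(z, 1), 1);   % smallest k capturing 99.9% of ||z||_1
mon_est = nnz(z(1:end-1) - z(2:end) > 1e-10);          % adjacent pairs with z_i > z_{i+1}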
4.3. Results on Synthetic Data
In this subsection, we evaluate the performance of three algorithms: Ssnal, ADMM, and APG on synthetic data.
In the later experiments, we test five problem instances of increasing size. To create the synthetic data, we employ the model
where the design matrix is drawn from a Gaussian distribution and the noise term is generated from a prescribed random distribution. Following the approach provided in [15], we set the ground-truth coefficient vector in the same manner for each case. The regularization parameters for the GCNIR problem (1) are specified as follows:
where the scaling parameter takes several prescribed values. In total, we test 20 instances.
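A minimal MATLAB sketch of this data-generation step is given below; the Gaussian design follows the description above, while the particular ground-truth vector z0 and the noise scale are assumptions made purely for illustration.

rng(1);                                    % fix the seed for reproducibility
A  = randn(m, n);                          % design matrix with i.i.d. Gaussian entries
z0 = zeros(n, 1);
z0(1:10) = linspace(5, 1, 10);             % assumed sparse, ordered ground truth
b  = A * z0 + 0.01 * randn(m, 1);          % responses with additive random noise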
Table 1 presents a comparative analysis of the performance of the three algorithms on synthetic datasets ranging from small to large scales. It is evident that the Ssnal algorithm exhibits both efficiency and robustness across a variety of parameter settings. Although all algorithms are capable of solving the problem to the required level of accuracy, Ssnal consistently outperforms the other two methods in terms of computational time. For example, for instance 5 under one of the parameter settings, Ssnal completes the task in a mere 3 s, whereas ADMM and APG take over 60 and 80 s, respectively.
Table 1.
The performances of Ssnal, ADMM, and APG when applied to synthetic data.
4.4. Results on Real Datasets
We evaluate the performances of the three methods on datasets drawn from three sources, the 10-K Corpus, StatLib, and UCI, as provided in the LIBSVM collection [42].
In our experiments with real datasets, we follow the methodology in [12] and utilize polynomial basis functions [43] to expand the original features of twelve datasets. For example, the number “10” in “mgscale10” indicates that we use a tenth-order polynomial to generate the basis functions. Furthermore, Table 2 provides statistical details for the twelve datasets under consideration, where “m” refers to the sample size and “n” denotes the number of features. For the GCNIR problem, the regularization parameters are set according to the following three strategies [13,14]: (P1) as given in the previous section, and
In our experiments, we use two values of the scaling parameter for all datasets.
Table 2.
Summary of tested data sets.
Table 3, Table 4 and Table 5 display the comparative results of the three algorithms across parameter sets (P1), (P2), and (P3), respectively. An analysis of these tables reveals that the Ssnal algorithm was able to solve all 72 test cases within a 70 s timeframe, with the majority of these cases being resolved in under 20 s. In comparison, ADMM encountered failures in 23 cases, while APG failed in 50 instances. These results underscore the superior performance of Ssnal, which not only consistently outperformed ADMM and APG in terms of speed but also demonstrated a higher success rate.
Table 3.
The performances of Ssnal, ADMM, and APG when applied to real datasets with the regularization parameters defined as in (P1).
Table 4.
The performances of Ssnal, ADMM, and APG when applied to real datasets with the regularization parameters defined as in (P2).
Table 5.
The performances of Ssnal, ADMM, and APG when applied to real datasets with the regularization parameters defined as in (P3).
From Table 3, Ssnal not only achieves the required accuracy but also takes less time than ADMM and APG. For instance, in the case of N7 under one parameter setting, the Ssnal algorithm reaches the required high accuracy in only a fraction of the time, while the ADMM algorithm needs 4666 s to attain a lower accuracy, and the APG algorithm requires 4022 s yet still terminates with a relatively large error. Therefore, the Ssnal algorithm outperforms the other two algorithms in addressing the GCNIR problem.
Table 4 shows that the number of reversed-order coefficient pairs in z is nearly the smallest among all tables. This is because in (P2) the regularization parameter that enforces monotonicity is larger than the one that enforces sparsity. From Table 4, it is also evident that Ssnal demands considerably less time than the other two methods across the twelve cases. Moreover, for more challenging tests, such as N3 under one parameter setting, only Ssnal solved the problem successfully, while the other two algorithms did not meet the accuracy requirement. The results strongly indicate that our Ssnal algorithm can efficiently and reliably solve the GCNIR problem.
Table 5 further illustrates that Ssnal continues to outperform ADMM and APG by a significant margin. This advantage is particularly pronounced for large-scale problems. In particular, for the case N4 under one parameter setting, Ssnal reaches the desired accuracy within 53 s, while ADMM and APG fail to solve the problem within 3 h.
Consequently, we can confidently state that our Ssnal algorithm can efficiently and robustly solve the GCNIR problem (1) on real datasets with high accuracy.
5. Conclusions
In this paper, we proposed a highly efficient semismooth Newton-based augmented Lagrangian method for solving the GCNIR problem from the dual perspective. The proximal mapping associated with the GCNIR regularizer and its generalized Jacobian have been derived, and we have exploited the second-order sparsity structure to achieve superior performance in solving the subproblems of the Ssnal algorithm. Numerical results have demonstrated the efficiency and robustness of the proposed algorithm compared with the widely used ADMM and APG methods on both synthetic and real datasets. Looking ahead, we anticipate that our algorithm will play a significant role in solving convex problems with the GCNIR regularizer, thereby facilitating data analysis in statistical learning.
Author Contributions
Conceptualization, Y.-J.L.; methodology, Y.X.; software, Y.X.; validation, L.L. and Y.-J.L.; formal analysis, Y.X., L.L. and Y.-J.L.; investigation, Y.X.; resources, Y.-J.L.; data curation, Y.X.; writing—original draft preparation, Y.X.; writing—review and editing, L.L. and Y.-J.L.; visualization, Y.X.; supervision, L.L. and Y.-J.L.; project administration, L.L. and Y.-J.L.; funding acquisition, L.L. and Y.-J.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China (Grant No. 12271097), the Key Program of National Science Foundation of Fujian Province of China (Grant No. 2023J02007), the Central Guidance on Local Science and Technology Development Fund of Fujian Province (Grant No. 2023L3003), and the Fujian Alliance of Mathematics (Grant No. 2023SXLMMS01, 2025SXLMQN01).
Data Availability Statement
All data generated or analyzed during this study are included in this article.
Acknowledgments
The authors would very much like to thank the reviewers for their helpful suggestions.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Matyasovszky, I. Estimating red noise spectra of climatological time series. Időjárás Q. J. Hung. Meteorol. Serv. 2013, 117, 187–200. [Google Scholar]
- Yu, Y.L.; Xing, E. Exact algorithms for isotonic regression and related. J. Phys. Conf. Ser. 2016, 699, 012016. [Google Scholar] [CrossRef]
- Matsuda, T.; Miyatake, Y. Generalized nearly isotonic regression. arXiv 2021, arXiv:2108.13010. [Google Scholar]
- Obozinski, G.; Lanckriet, G.; Grant, C.; Jordan, M.I.; Noble, W.S. Consistent probabilistic outputs for protein function prediction. Genome Biol. 2008, 9, 247–254. [Google Scholar] [CrossRef] [PubMed]
- Barlow, R.E.; Brunk, H.D. The isotonic regression problem and its dual. J. Am. Stat. Assoc. 1972, 67, 140–147. [Google Scholar] [CrossRef]
- Tibshirani, R.J.; Hoefling, H.; Tibshirani, R. Nearly-isotonic regression. Technometrics 2011, 53, 54–61. [Google Scholar] [CrossRef]
- Tibshirani, R.; Suo, X. An Ordered Lasso and Sparse Time-Lagged Regression. Technometrics 2016, 58, 415–423. [Google Scholar] [CrossRef] [PubMed]
- Lin, L.; Liu, Y.J. An Efficient Hessian Based Algorithm for Singly Linearly and Box Constrained Least Squares Regression. J. Sci. Comput. 2021, 88, 26. [Google Scholar] [CrossRef]
- Ayer, M.; Brunk, H.D.; Ewing, G.M.; Reid, W.T.; Silverman, E. An empirical distribution function for sampling with incomplete information. Ann. Math. Statist. 1955, 26, 641–647. [Google Scholar] [CrossRef]
- Yu, Z.S.; Chen, X.Y.; Li, X.D. A dynamic programming approach for generalized nearly isotonic optimization. Math. Prog. Comp. 2023, 15, 195–225. [Google Scholar] [CrossRef]
- Gaines, B.R.; Kim, J.; Zhou, H. Algorithms for fitting the constrained Lasso. J. Comput. Graph. Stat. 2018, 27, 861–871. [Google Scholar]
- Li, X.D.; Sun, D.F.; Toh, K.C. A highly efficient semismooth Newton augmented Lagrangian method for solving Lasso problems. SIAM J. Optim. 2018, 28, 433–458. [Google Scholar] [CrossRef]
- Zhang, Y.; Zhang, N.; Sun, D.; Toh, K.C. An efficient Hessian based algorithm for solving large-scale sparse group Lasso problems. Math. Program. 2018, 179, 223–263. [Google Scholar] [CrossRef]
- Li, X.; Sun, D.; Toh, K.C. On efficiently solving the subproblems of a level-set method for fused Lasso problems. SIAM J. Optim. 2018, 28, 1842–1866. [Google Scholar] [CrossRef]
- Lin, M.X.; Liu, Y.J.; Sun, D.F.; Toh, K.C. Efficient sparse semismooth Newton methods for the clustered Lasso problem. SIAM J. Optim. 2019, 29, 2026–2052. [Google Scholar] [CrossRef]
- Sun, D.; Toh, K.C.; Yuan, Y. Convex clustering: Model, theoretical guarantee and efficient algorithm. J. Mach. Learn. Res. 2021, 22, 1–32. [Google Scholar]
- Lin, L.; Liu, Y.J. An inexact semismooth Newton-based augmented Lagrangian algorithm for multi-task Lasso problems. Asia Pac. J. Oper. Res. 2024, 41, 2350027. [Google Scholar] [CrossRef]
- Liu, Y.J.; Zhang, T. Sparse Hessian based semismooth Newton augmented Lagrangian algorithm for general ℓ1 trend filtering. Pac. J. Optim. 2023, 19, 187–204. [Google Scholar]
- Liu, Y.J.; Yu, J. A semismooth Newton-based augmented Lagrangian algorithm for density matrix least squares problems. J. Optim. Theory Appl. 2022, 195, 749–779. [Google Scholar] [CrossRef]
- Fang, S.; Liu, Y.J.; Xiong, X. Efficient Sparse Hessian-Based Semismooth Newton Algorithms for Dantzig Selector. SIAM J. Sci. Comput. 2021, 43, 4147–4171. [Google Scholar] [CrossRef]
- Liu, Y.J.; Yu, J. A semismooth Newton based dual proximal point algorithm for maximum eigenvalue problem. Comput. Optim. Appl. 2023, 85, 547–582. [Google Scholar] [CrossRef]
- Liu, Y.J.; Zhu, Q. A semismooth Newton based augmented Lagrangian algorithm for Weber problem. Pac. J. Optim. 2022, 18, 299–315. [Google Scholar]
- Liu, Y.J.; Xu, J.J.; Lin, L.Y. An easily implementable algorithm for efficient projection onto the ordered weighted ℓ1 norm ball. J. Oper. Res. Soc. China 2023, 11, 925–940. [Google Scholar] [CrossRef]
- Liu, Y.J.; Wan, Y.; Lin, L. An efficient algorithm for Fantope-constrained sparse principal subspace estimation problem. Appl. Math. Comput. 2024, 475, 128708. [Google Scholar] [CrossRef]
- Moreau, J. Proximité et dualité dans un espace hilbertien. Bull. Société Mathématique Fr. 1965, 93, 273–299. [Google Scholar] [CrossRef]
- Rockafellar, R. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1970; pp. 338–339. [Google Scholar]
- Lemaréchal, C.; Sagastizábal, C. Practical aspects of the Moreau–Yosida regularization: Theoretical preliminaries. SIAM J. Optim. 1997, 7, 367–385. [Google Scholar] [CrossRef]
- Yu, Y. On decomposing the proximal map. In Proceedings of the 27th International Conference on Neural Information Processing Systems, New York, NY, USA, 5–10 December 2013; pp. 91–99. [Google Scholar]
- Han, J.; Sun, D. Newton and quasi-Newton methods for normal maps with polyhedral sets. J. Optim. Theory Appl. 1997, 94, 659–676. [Google Scholar] [CrossRef]
- Rockafellar, R.T. Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1976, 1, 97–116. [Google Scholar] [CrossRef]
- Rockafellar, R.T. Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 1976, 14, 877–898. [Google Scholar] [CrossRef]
- Rockafellar, R.T.; Wets, R.J.B. Variational Analysis; Springer: Berlin, Germany, 1998; p. 550. [Google Scholar]
- Robinson, S.M. Some continuity properties of polyhedral multifunctions. In Mathematical Programming at Oberwolfach; König, H., Korte, B., Ritter, K., Eds.; Springer: Berlin, Germany, 1981; pp. 206–214. [Google Scholar]
- Luque, F.J. Asymptotic convergence analysis of the proximal point algorithm. SIAM J. Control Optim. 1984, 22, 277–293. [Google Scholar] [CrossRef]
- Facchinei, F.; Pang, J.S. Finite-Dimensional Variational Inequalities and Complementarity Problems; Springer: New York, NY, USA, 2003; p. 345. [Google Scholar]
- Mifflin, R. Semismooth and semiconvex functions in constrained optimization. SIAM J. Control Optim. 1977, 15, 959–972. [Google Scholar] [CrossRef]
- Kummer, B. Newton’s method for non-differentiable functions. In Advances in Mathematical Optimization; Guddat, J., Ed.; De Gruyter: Berlin, Germany, 1988; pp. 114–125. [Google Scholar]
- Qi, L.; Sun, J. A nonsmooth version of Newton’s method. Math. Program. 1993, 58, 353–367. [Google Scholar] [CrossRef]
- Sun, D.; Sun, J. Semismooth matrix-valued functions. Math. Oper. Res. 2002, 27, 150–169. [Google Scholar] [CrossRef]
- Luo, Z.; Sun, D.; Toh, K.C.; Xiu, N. Solving the OSCAR and SLOPE models using a semismooth Newton-based augmented Lagrangian method. J. Mach. Learn. Res. 2019, 20, 1–25. [Google Scholar]
- Gabay, D.; Mercier, B. A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 1976, 2, 17–40. [Google Scholar] [CrossRef]
- LIBSVM—A Library for Support Vector Machines. Available online: https://www.csie.ntu.edu.tw/~cjlin/libsvm/ (accessed on 15 January 2024).
- Huang, L.; Jia, J.; Yu, B.; Chun, B.G.; Maniatis, P.; Naik, M. Predicting execution time of computer programs using sparse polynomial regression. In Proceedings of the 24th International Conference on Neural Information Processing Systems, New York, NY, USA, 6–9 December 2010; pp. 883–891. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).