A Chebyshev–Halley Method with Gradient Regularization and an Improved Convergence Rate
Abstract
1. Introduction
2. Gradient-Regularized Chebyshev–Halley Method
Algorithm 1 Chebyshev–Halley method with gradient regularization (CHGR).
Require: Initial point x₀ and the algorithm parameters.
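The excerpt above does not reproduce the exact CHGR update or its parameter schedule, so the following is only a minimal illustrative sketch. It assumes a gradient-based regularization of the form λ = c·√‖∇f(x)‖ and a Chebyshev-type correction built from a finite-difference approximation of the third-derivative action; the callables grad and hess, the constant c, and the differencing step t are placeholders rather than quantities from the paper.

```python
import numpy as np

def chgr_step(x, grad, hess, c=1.0, t=1e-4):
    """One illustrative two-step iteration: a gradient-regularized Newton step
    followed by a Chebyshev-type third-order correction (hypothetical sketch).

    grad(x) returns the gradient vector, hess(x) the Hessian matrix.
    c scales the regularization lambda = c * sqrt(||grad||); t is the
    finite-difference step used to approximate T[d, d] = D^3 f(x)[d, d].
    """
    g = grad(x)
    H = hess(x)
    lam = c * np.sqrt(np.linalg.norm(g))      # gradient-based regularization
    A = H + lam * np.eye(x.size)              # regularized Hessian

    d = np.linalg.solve(A, -g)                # step 1: regularized Newton direction
    # Approximate the third-derivative action T[d, d] by differencing
    # Hessian-vector products, so only second-order oracles are called.
    Td = (hess(x + t * d) @ d - H @ d) / t
    corr = -0.5 * np.linalg.solve(A, Td)      # step 2: Chebyshev-type correction
    return x + d + corr
```

Both linear systems share the same matrix A, so a single factorization can serve both solves; this is one way the per-iteration cost of such a two-step scheme stays close to that of a regularized Newton method.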
3. Convergence Analysis
The analysis distinguishes five cases, (i)–(v); each stated condition leads to a corresponding convergence result.
4. Numerical Experiments
- Efficient Convergence
  - In most cases, CHGR performs best with the parameter setting suggested by the theory, which is consistent with the theoretical results. In terms of both iteration count and computational efficiency, CHGR significantly outperforms second-order methods such as AICN, RN, RegN, and CNM. This is due to its two-step iterative framework (see the sketch after Algorithm 1), which accelerates convergence by incorporating third-order derivative information while maintaining a computational complexity comparable to that of second-order methods.
  - For some parameter settings, the computational time of AICN is slightly lower than or comparable to that of CHGR, owing to fewer matrix and vector multiplications. However, for other settings, AICN fails after only two normal iteration steps because of its high sensitivity to parameters. A similar issue also occurs with RN.
  - For some parameter settings, CHGR outperforms SUN in computational time on a9a but performs slightly worse on a4a (a sketch of the benchmark problem appears after this list). SUN is a single-step Newton iteration with an adaptive update strategy for its parameter, which can accelerate convergence in some cases. However, the need to verify the descent condition for the adaptive strategy introduces additional computational cost.
- Numerical Stability
  - CHGR maintains stable convergence across different regularization strengths, while CNM and RCH fail to achieve satisfactory solutions within 100 steps for some regularization strengths. This may be due to the more flexible regularization strategy of CHGR, which dynamically adjusts the regularization parameter based on the gradient norm and thereby improves convergence.
  - Under stronger regularization, AICN and CHGR exhibit comparable performance. However, under weaker regularization the problem becomes ill-conditioned, and AICN and RN fail to converge. These two methods are also more sensitive to the initial point, often failing due to numerical instability under certain fixed or random initializations.
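The experiments above refer to the LIBSVM datasets a4a and a9a. A common benchmark on these datasets is ℓ2-regularized logistic regression, and the oracle below is a hedged sketch of such a test problem (an assumption; the paper's exact objective and regularization values are not given in this excerpt). It can be paired with the step sketch after Algorithm 1; the names logreg_oracle, X, y, and mu are ours.

```python
import numpy as np

def logreg_oracle(w, X, y, mu):
    """l2-regularized logistic regression oracle (illustrative benchmark only).

    X: (m, n) feature matrix, y: labels in {-1, +1}, mu: regularization strength.
    Returns the function value, gradient, and Hessian at w.
    """
    m = len(y)
    z = y * (X @ w)
    s = 1.0 / (1.0 + np.exp(z))               # sigma(-z), per-sample residual
    f = np.mean(np.log1p(np.exp(-z))) + 0.5 * mu * (w @ w)
    g = -(X.T @ (y * s)) / m + mu * w
    D = s * (1.0 - s)                         # per-sample curvature weights
    H = (X.T * D) @ X / m + mu * np.eye(len(w))
    return f, g, H
```

Here mu plays the role of the regularization strength: smaller values make the problem more ill-conditioned, which corresponds to the weaker-regularization regime in which some of the baselines above were reported to fail.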
5. Results
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest