Gradient Methods for Optimization

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Combinatorial Optimization, Graph, and Network Algorithms".

Deadline for manuscript submissions: closed (28 February 2023) | Viewed by 12964

Special Issue Editor


Dr. Zebang Shen
Guest Editor
Institute for Machine Learning, ETH Zurich, 8092 Zürich, Switzerland
Interests: optimal transport; optimization; machine learning

Special Issue Information

Dear Colleagues,

Because machine learning tasks are often cast as high-dimensional minimization problems that must be solved accurately and efficiently, optimization has become a cornerstone of modern AI research.

Among numerous competitors, gradient-based algorithms are the most successful, both theoretically and empirically, owing to their low per-iteration cost and fast convergence rates. Notable examples include gradient descent and its momentum-accelerated variants for convex optimization, the Frank–Wolfe algorithm and the Alternating Direction Method of Multipliers (ADMM) for constrained optimization, and stochastic gradient descent for non-convex optimization, to name just a few.
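To make the first of these concrete, here is a minimal, self-contained sketch of gradient descent with heavy-ball (Polyak) momentum on a toy quadratic; the objective, step size, and momentum coefficient are illustrative choices, not recommendations.

```python
import numpy as np

def momentum_gd(grad, x0, lr=0.1, beta=0.9, iters=500):
    """Gradient descent with heavy-ball (Polyak) momentum; illustrative settings only."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(iters):
        v = beta * v - lr * grad(x)   # accumulate a velocity term
        x = x + v                     # move along the velocity
    return x

# Toy strongly convex quadratic: f(x) = 0.5 x^T A x - b^T x, so grad f(x) = A x - b.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
x_hat = momentum_gd(lambda x: A @ x - b, x0=np.zeros(2))
print(x_hat, np.linalg.solve(A, b))   # the two should agree to several digits
```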

Despite the huge success reported in previous studies, many fundamental problems remain open, such as the algorithmic bias of stochastic gradient methods, which requires further detailed investigation.
Moreover, emerging AI tasks have opened up new fields of optimization research: federated learning imposes novel privacy constraints on the optimization procedure, while the training of generative adversarial networks requires lifting the optimization domain to the more abstract probability manifold and benefits significantly from a better understanding of the more intricate min–max optimization.

We invite you to submit high-quality papers to the Special Issue on “Gradient Methods for Optimization”, with subjects covering the whole range from theory to algorithms. The following is a (non-exhaustive) list of topics of interest:

  1. Optimization methods and theories for convex, submodular, and non-convex problems.
  2. Optimization in more abstract domains such as the probability manifold.
  3. Optimization for min–max problems.
  4. Optimization methods for federated learning.

Dr. Zebang Shen
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (6 papers)


Research

12 pages, 747 KiB  
Article
Extrinsic Bayesian Optimization on Manifolds
by Yihao Fang, Mu Niu, Pokman Cheung and Lizhen Lin
Algorithms 2023, 16(2), 117; https://doi.org/10.3390/a16020117 - 15 Feb 2023
Viewed by 1454
Abstract
We propose an extrinsic Bayesian optimization (eBO) framework for general optimization problems on manifolds. Bayesian optimization algorithms build a surrogate of the objective function by employing Gaussian processes and utilizing the uncertainty in that surrogate by deriving an acquisition function. This acquisition function represents the probability of improvement based on the kernel of the Gaussian process, which guides the search in the optimization process. The critical challenge for designing Bayesian optimization algorithms on manifolds lies in the difficulty of constructing valid covariance kernels for Gaussian processes on general manifolds. Our approach is to employ extrinsic Gaussian processes by first embedding the manifold onto some higher dimensional Euclidean space via equivariant embeddings and then constructing a valid covariance kernel on the image manifold after the embedding. This leads to efficient and scalable algorithms for optimization over complex manifolds. A simulation study and real data analyses are carried out to demonstrate the utility of our eBO framework by applying the eBO to various optimization problems over manifolds such as the sphere, the Grassmannian, and the manifold of positive definite matrices.
(This article belongs to the Special Issue Gradient Methods for Optimization)
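The key construction described in the abstract above, building the Gaussian-process kernel in an ambient Euclidean space after embedding the manifold, can be sketched as follows. The sphere is used because its embedding into R^3 is simply the inclusion map; the kernel choice, length-scale, jitter, and function names (extrinsic_rbf_kernel, gp_posterior) are illustrative assumptions, and the acquisition-function step of Bayesian optimization is omitted.

```python
import numpy as np

def extrinsic_rbf_kernel(X, Y, lengthscale=0.5):
    """Squared-exponential kernel evaluated on the embedded (ambient-space) points."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X_obs, y_obs, X_new, jitter=1e-6):
    """Standard GP regression formulas built on the extrinsic kernel above."""
    K = extrinsic_rbf_kernel(X_obs, X_obs) + jitter * np.eye(len(X_obs))
    K_s = extrinsic_rbf_kernel(X_obs, X_new)
    alpha = np.linalg.solve(K, y_obs)
    mean = K_s.T @ alpha
    var = 1.0 - np.einsum("ij,ij->j", K_s, np.linalg.solve(K, K_s))
    return mean, np.maximum(var, 0.0)

# Points on the unit sphere: the equivariant embedding into R^3 is just the inclusion.
rng = np.random.default_rng(0)
X_obs = rng.normal(size=(20, 3)); X_obs /= np.linalg.norm(X_obs, axis=1, keepdims=True)
y_obs = X_obs[:, 2]                        # toy objective: the height function on the sphere
X_new = rng.normal(size=(5, 3)); X_new /= np.linalg.norm(X_new, axis=1, keepdims=True)
mean, var = gp_posterior(X_obs, y_obs, X_new)
print(mean, var)   # the surrogate's predictions would feed an acquisition function
```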

25 pages, 620 KiB  
Article
Personalized Federated Multi-Task Learning over Wireless Fading Channels
by Matin Mortaheb, Cemil Vahapoglu and Sennur Ulukus
Algorithms 2022, 15(11), 421; https://doi.org/10.3390/a15110421 - 09 Nov 2022
Cited by 4 | Viewed by 2212
Abstract
Multi-task learning (MTL) is a paradigm to learn multiple tasks simultaneously by utilizing a shared network, in which a distinct header network is further tailored for fine-tuning for each distinct task. Personalized federated learning (PFL) can be achieved through MTL in the context of federated learning (FL) where tasks are distributed across clients, referred to as personalized federated MTL (PF-MTL). Statistical heterogeneity caused by differences in the task complexities across clients and the non-identically independently distributed (non-i.i.d.) characteristics of local datasets degrades the system performance. To overcome this degradation, we propose FedGradNorm, a distributed dynamic weighting algorithm that balances learning speeds across tasks by normalizing the corresponding gradient norms in PF-MTL. We prove an exponential convergence rate for FedGradNorm. Further, we propose HOTA-FedGradNorm by utilizing over-the-air aggregation (OTA) with FedGradNorm in a hierarchical FL (HFL) setting. HOTA-FedGradNorm is designed to have efficient communication between the parameter server (PS) and clients in the power- and bandwidth-limited regime. We conduct experiments with both FedGradNorm and HOTA-FedGradNorm using MT facial landmark (MTFL) and wireless communication system (RadComDynamic) datasets. The results indicate that both frameworks are capable of achieving a faster training performance compared to equal-weighting strategies. In addition, FedGradNorm and HOTA-FedGradNorm compensate for imbalanced datasets across clients and adverse channel effects.
(This article belongs to the Special Issue Gradient Methods for Optimization)
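As a rough illustration of the gradient-norm balancing idea mentioned in the abstract, the sketch below nudges per-task loss weights toward a common shared-layer gradient norm. It is a deliberately simplified, centralized caricature, not the FedGradNorm update itself; the exponent alpha, the normalization, the function name, and the example norms are assumptions.

```python
import numpy as np

def rebalance_task_weights(weights, grad_norms, alpha=0.5):
    """Nudge per-task loss weights so that tasks whose shared-layer gradients are
    unusually large get down-weighted and slow tasks get up-weighted."""
    grad_norms = np.asarray(grad_norms, dtype=float)
    target = grad_norms.mean()                        # common target gradient norm
    weights = weights * (target / (grad_norms + 1e-12)) ** alpha
    return weights * len(weights) / weights.sum()     # keep the weights summing to n_tasks

# Example: task 0 currently dominates the shared-layer gradient.
w = np.ones(3)
for _ in range(3):
    per_task_norms = [5.0, 1.0, 0.5]    # measured gradient norms (made-up numbers)
    w = rebalance_task_weights(w, per_task_norms)
print(w)   # task 0 ends up with the smallest weight, task 2 with the largest
```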

15 pages, 600 KiB  
Article
Fed-DeepONet: Stochastic Gradient-Based Federated Training of Deep Operator Networks
by Christian Moya and Guang Lin
Algorithms 2022, 15(9), 325; https://doi.org/10.3390/a15090325 - 12 Sep 2022
Cited by 3 | Viewed by 2267
Abstract
The Deep Operator Network (DeepONet) framework is a different class of neural network architecture that one trains to learn nonlinear operators, i.e., mappings between infinite-dimensional spaces. Traditionally, DeepONets are trained using a centralized strategy that requires transferring the training data to a centralized location. Such a strategy, however, limits our ability to secure data privacy or use high-performance distributed/parallel computing platforms. To alleviate such limitations, in this paper, we study the federated training of DeepONets for the first time. That is, we develop a framework, which we refer to as Fed-DeepONet, that allows multiple clients to train DeepONets collaboratively under the coordination of a centralized server. To achieve Fed-DeepONets, we propose an efficient stochastic gradient-based algorithm that enables the distributed optimization of the DeepONet parameters by averaging first-order estimates of the DeepONet loss gradient. Then, to accelerate the training convergence of Fed-DeepONets, we propose a moment-enhanced (i.e., adaptive) stochastic gradient-based strategy. Finally, we verify the performance of Fed-DeepONet by learning, for different configurations of the number of clients and fractions of available clients, (i) the solution operator of a gravity pendulum and (ii) the dynamic response of a parametric library of pendulums.
(This article belongs to the Special Issue Gradient Methods for Optimization)
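The federated training loop described above reduces, at its core, to averaging clients' first-order gradient estimates at a central server. The sketch below shows that pattern on a toy least-squares problem rather than a DeepONet; the client data, learning rate, and round count are illustrative assumptions.

```python
import numpy as np

def federated_gradient_step(theta, client_grads, lr):
    """One server round: average the clients' gradient estimates and take a step."""
    return theta - lr * np.mean(client_grads, axis=0)

# Toy setup: each client holds its own local least-squares data (illustrative only).
rng = np.random.default_rng(1)
theta_true = np.array([2.0, -1.0])
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ theta_true + 0.01 * rng.normal(size=50)))

theta = np.zeros(2)
for _ in range(300):                                     # communication rounds
    grads = [X.T @ (X @ theta - y) / len(y) for X, y in clients]
    theta = federated_gradient_step(theta, grads, lr=0.1)
print(theta)   # approaches theta_true as rounds accumulate
```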

11 pages, 650 KiB  
Article
Accounting for Round-Off Errors When Using Gradient Minimization Methods
by Dmitry Lukyanenko, Valentin Shinkarev and Anatoly Yagola
Algorithms 2022, 15(9), 324; https://doi.org/10.3390/a15090324 - 09 Sep 2022
Cited by 2 | Viewed by 1817
Abstract
This paper discusses a method for taking into account rounding errors when constructing a stopping criterion for the iterative process in gradient minimization methods. The main aim of this work was to develop methods for improving the quality of the solutions for real applied minimization problems, which require significant amounts of calculations and, as a result, can be sensitive to the accumulation of rounding errors. However, this paper demonstrates that the developed approach can also be useful in solving computationally small problems. The main ideas of this work are demonstrated using one of the possible implementations of the conjugate gradient method for solving an overdetermined system of linear algebraic equations with a dense matrix.
(This article belongs to the Special Issue Gradient Methods for Optimization)
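For orientation, the sketch below applies the conjugate gradient method to the normal equations of a dense overdetermined system and stops once the residual reaches a machine-epsilon-based floor. The floor used here is a generic heuristic assumption, not the stopping criterion developed in the paper.

```python
import numpy as np

def cg_normal_equations(A, b, max_iter=1000):
    """Conjugate gradient on the normal equations A^T A x = A^T b of an
    overdetermined system, stopped once the residual reaches a round-off floor
    (a generic eps-based floor, not the criterion developed in the paper)."""
    x = np.zeros(A.shape[1])
    r = A.T @ b - A.T @ (A @ x)        # residual of the normal equations
    p = r.copy()
    # Below this level the residual is dominated by accumulated rounding errors.
    floor = np.finfo(float).eps * np.linalg.norm(A, "fro") ** 2 * np.linalg.norm(b)
    for _ in range(max_iter):
        Ap = A.T @ (A @ p)
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) <= floor:
            break
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x

rng = np.random.default_rng(2)
A = rng.normal(size=(200, 20))          # dense, overdetermined system
x_true = rng.normal(size=20)
b = A @ x_true
print(np.linalg.norm(cg_normal_equations(A, b) - x_true))   # should be tiny
```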

22 pages, 939 KiB  
Article
Federated Optimization of ℓ0-norm Regularized Sparse Learning
by Qianqian Tong, Guannan Liang, Jiahao Ding, Tan Zhu, Miao Pan and Jinbo Bi
Algorithms 2022, 15(9), 319; https://doi.org/10.3390/a15090319 - 06 Sep 2022
Viewed by 2006
Abstract
Regularized sparse learning with the ℓ0-norm is important in many areas, including statistical learning and signal processing. Iterative hard thresholding (IHT) methods are the state-of-the-art for nonconvex-constrained sparse learning due to their capability of recovering true support and scalability with large datasets. The current theoretical analysis of IHT assumes the use of centralized IID data. In realistic large-scale scenarios, however, data are distributed, seldom IID, and private to edge computing devices at the local level. Consequently, it is required to study the property of IHT in a federated environment, where local devices update the sparse model individually and communicate with a central server for aggregation infrequently without sharing local data. In this paper, we propose the first group of federated IHT methods: Federated Hard Thresholding (Fed-HT) and Federated Iterative Hard Thresholding (FedIter-HT) with theoretical guarantees. We prove that both algorithms have a linear convergence rate and guarantee for recovering the optimal sparse estimator, which is comparable to classic IHT methods, but with decentralized, non-IID, and unbalanced data. Empirical results demonstrate that the Fed-HT and FedIter-HT outperform their competitor, a distributed IHT, in terms of reducing objective values with fewer communication rounds and bandwidth requirements.
(This article belongs to the Special Issue Gradient Methods for Optimization)
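A schematic version of federated iterative hard thresholding is sketched below: each client takes thresholded gradient steps on its local data, and the server averages the local models and re-thresholds. This is a simplified sketch and not the exact Fed-HT or FedIter-HT updates from the paper; the sparsity level, step size, and synthetic data are assumptions.

```python
import numpy as np

def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def federated_iht(clients, k, rounds=50, local_steps=5, lr=0.2):
    """Schematic federated IHT: each client runs thresholded gradient steps locally,
    then the server averages the local models and thresholds again."""
    dim = clients[0][0].shape[1]
    theta = np.zeros(dim)
    for _ in range(rounds):
        local_models = []
        for X, y in clients:
            w = theta.copy()
            for _ in range(local_steps):
                grad = X.T @ (X @ w - y) / len(y)
                w = hard_threshold(w - lr * grad, k)
            local_models.append(w)
        theta = hard_threshold(np.mean(local_models, axis=0), k)
    return theta

# Synthetic sparse regression split across 5 clients (illustrative data only).
rng = np.random.default_rng(3)
theta_true = np.zeros(50)
theta_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
clients = []
for _ in range(5):
    X = rng.normal(size=(100, 50))
    clients.append((X, X @ theta_true + 0.01 * rng.normal(size=100)))
print(np.nonzero(federated_iht(clients, k=3))[0])   # expected support: [3, 17, 42]
```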

31 pages, 793 KiB  
Article
ZenoPS: A Distributed Learning System Integrating Communication Efficiency and Security
by Cong Xie, Oluwasanmi Koyejo and Indranil Gupta
Algorithms 2022, 15(7), 233; https://doi.org/10.3390/a15070233 - 01 Jul 2022
Cited by 2 | Viewed by 2073
Abstract
Distributed machine learning is primarily motivated by the promise of increased computation power for accelerating training and mitigating privacy concerns. Unlike machine learning on a single device, distributed machine learning requires collaboration and communication among the devices. This creates several new challenges: (1) the heavy communication overhead can be a bottleneck that slows down the training, and (2) the unreliable communication and weaker control over the remote entities make the distributed system vulnerable to systematic failures and malicious attacks. This paper presents a variant of stochastic gradient descent (SGD) with improved communication efficiency and security in distributed environments. Our contributions include (1) a new technique called error reset to adapt both infrequent synchronization and message compression for communication reduction in both synchronous and asynchronous training, (2) new score-based approaches for validating the updates, and (3) integration with both error reset and score-based validation. The proposed system provides communication reduction, both synchronous and asynchronous training, Byzantine tolerance, and local privacy preservation. We evaluate our techniques both theoretically and empirically.
(This article belongs to the Special Issue Gradient Methods for Optimization)
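The communication-reduction side of the system described above can be illustrated with generic top-k gradient compression plus an error buffer on each worker. Classic error feedback is shown here; the paper's error reset is a related but distinct mechanism, and the class name, parameters, and toy objective below are illustrative assumptions.

```python
import numpy as np

def topk_compress(v, k):
    """Transmit only the k largest-magnitude coordinates; the rest are dropped."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

class Worker:
    """A worker with an error buffer: whatever compression drops is folded back
    into the next message (classic error feedback)."""
    def __init__(self, dim):
        self.error = np.zeros(dim)

    def compressed_update(self, grad, lr, k):
        corrected = lr * grad + self.error       # add back previously dropped mass
        msg = topk_compress(corrected, k)
        self.error = corrected - msg             # remember what was dropped this time
        return msg

# Toy run: two workers share the quadratic objective with grad f(x) = x - target.
target = np.array([1.0, -2.0, 3.0, 0.5])
workers = [Worker(4), Worker(4)]
x = np.zeros(4)
for _ in range(200):
    msgs = [w.compressed_update(x - target, lr=0.1, k=2) for w in workers]
    x = x - np.mean(msgs, axis=0)                # server applies the averaged update
print(x)   # close to target despite sending only 2 of 4 coordinates per step
```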
