
Variational Characterizations of Local Entropy and Heat Regularization in Deep Learning

1 Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
2 Division of Applied Mathematics, Brown University, Providence, RI 02906, USA
3 Department of Statistics, University of Chicago, Chicago, IL 60637, USA
* Author to whom correspondence should be addressed.
Entropy 2019, 21(5), 511; https://doi.org/10.3390/e21050511
Received: 7 March 2019 / Revised: 6 May 2019 / Accepted: 11 May 2019 / Published: 20 May 2019
(This article belongs to the Special Issue Information-Theoretic Approaches in Deep Learning)
The aim of this paper is to provide new theoretical and computational understanding of two loss regularizations employed in deep learning, known as local entropy and heat regularization. For both regularized losses, we introduce variational characterizations that naturally suggest a two-step scheme for their optimization, based on the iterative shift of a probability density and the calculation of a best Gaussian approximation in Kullback–Leibler divergence. Disregarding approximation error in these two steps, the variational characterizations allow us to show a simple monotonicity result for the training error along optimization iterates. The two-step optimization schemes for the local entropy and heat regularized losses differ only in which argument of the Kullback–Leibler divergence is used to find the best Gaussian approximation. Local entropy corresponds to minimizing over the second argument, and the solution is given by moment matching. This allows replacing the traditional backpropagation calculation of gradients with sampling algorithms, opening an avenue for gradient-free, parallelizable training of neural networks. However, our presentation also acknowledges the potential increase in computational cost of naively optimizing the regularized costs, thus giving a less optimistic view of the gains facilitated by loss regularization than existing works.
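To make the two-step scheme concrete, the following is a minimal sketch (not taken from the paper) of the local entropy update on a toy objective: step one shifts a Gaussian to the current iterate, and step two performs the moment-matching step, i.e., it estimates the mean of the tilted density p(y) ∝ exp(-f(y)) N(y; θ, τI) and uses it as the next iterate. The mean is estimated by self-normalized importance sampling, so no gradients of f are needed. The toy loss f and the parameter values (tau, n_samples, the number of iterations) are illustrative assumptions, not the paper's experimental setup.

import numpy as np

def f(theta):
    # Stand-in for a neural-network training loss; any nonconvex function works here.
    return 0.5 * np.sum(theta ** 2, axis=-1) + np.sum(np.sin(3.0 * theta), axis=-1)

def local_entropy_step(theta, tau=0.1, n_samples=1000, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # Step 1 (shift): center the Gaussian N(., tau I) at the current iterate and
    # draw candidate parameter vectors from it.
    ys = theta + np.sqrt(tau) * rng.standard_normal((n_samples, theta.size))
    # Step 2 (best Gaussian approximation in KL over the second argument, whose
    # solution is moment matching): the new iterate is E_p[y] for the tilted
    # density p(y) ~ exp(-f(y)) N(y; theta, tau I), estimated here with
    # self-normalized importance weights proportional to exp(-f(y)).
    log_w = -f(ys)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    return w @ ys                    # weighted sample mean; no gradients of f used

rng = np.random.default_rng(0)
theta = np.array([2.0, -1.5])
for _ in range(50):
    theta = local_entropy_step(theta, rng=rng)
print("final iterate:", theta)

Because each sample's weight exp(-f(y)) can be evaluated independently, the inner loop parallelizes across samples, which is the sense in which this family of schemes can trade backpropagation for embarrassingly parallel loss evaluations.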
Keywords: deep learning; local entropy; heat regularization; variational characterizations; Kullback–Leibler approximations; monotonic training
MDPI and ACS Style

García Trillos, N.; Kaplan, Z.; Sanz-Alonso, D. Variational Characterizations of Local Entropy and Heat Regularization in Deep Learning. Entropy 2019, 21, 511.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
