Regularization of neural networks can alleviate overfitting in the training phase. Current regularization methods, such as Dropout and DropConnect, randomly drop neural nodes or connections based on a uniform prior. Such a data-independent strategy does not take into consideration of the quality of individual unit or connection. In this paper, we aim to develop a data-dependent approach to regularizing neural network in the framework of Information Geometry. A measurement for the quality of connections is proposed, namely confidence
. Specifically, the confidence of a connection is derived from its contribution to the Fisher information distance. The network is adjusted by retaining the confident connections and discarding the less confident ones. The adjusted network, named as ConfNet, would carry the majority of variations in the sample data. The relationships among confidence estimation, Maximum Likelihood Estimation and classical model selection criteria (like Akaike information criterion) is investigated and discussed theoretically. Furthermore, a Stochastic ConfNet is designed by adding a self-adaptive probabilistic sampling strategy. The proposed data-dependent regularization methods achieve promising experimental results on three data collections including MNIST, CIFAR-10 and CIFAR-100.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited