Open Access Article

Examining the Causal Structures of Deep Neural Networks Using Information Theory

by Scythia Marrow 1,†, Eric J. Michaud 2,† and Erik Hoel 1,*
1 Allen Discovery Center, Tufts University, Medford, MA 02155, USA
2 Department of Mathematics, University of California Berkeley, Berkeley, CA 94720, USA
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Entropy 2020, 22(12), 1429; https://doi.org/10.3390/e22121429
Received: 12 November 2020 / Revised: 8 December 2020 / Accepted: 12 December 2020 / Published: 18 December 2020
Deep Neural Networks (DNNs) are often examined at the level of their response to input, for example by analyzing the mutual information between nodes and data sets. Yet DNNs can also be examined at the level of causation, exploring “what does what” within the layers of the network itself. Historically, the causal structure of DNNs has received less attention than their responses to input. Yet, definitionally, generalizability must be a function of a DNN’s causal structure, since that structure determines how the DNN responds to unseen or even not-yet-defined future inputs. Here, we introduce a suite of metrics based on information theory to quantify and track changes in the causal structure of DNNs during training. Specifically, we introduce the effective information (EI) of a feedforward DNN, which is the mutual information between layer input and output following a maximum-entropy perturbation. The EI can be used to assess the degree of causal influence nodes and edges have over their downstream targets in each layer. We show that the EI can be further decomposed to examine the sensitivity of a layer (measured by how well edges transmit perturbations) and the degeneracy of a layer (measured by how edge overlap interferes with transmission), along with estimates of the amount of integrated information of a layer. Together, these properties define where each layer lies in the “causal plane”, which can be used to visualize how layer connectivity becomes more sensitive or degenerate over time and how integration changes during training, revealing how the layer-by-layer causal structure differentiates. These results may help in understanding the generalization capabilities of DNNs and provide foundational tools for making DNNs both more generalizable and more explainable.
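To make the EI construction concrete, the sketch below estimates the EI of a single dense layer by injecting maximum-entropy (uniform) noise at the layer’s input and computing a plug-in (histogram) estimate of the mutual information between the discretized input and output activations. This is an illustrative toy, not the authors’ published implementation: the layer shape, sigmoid activation, uniform-noise range, bin count, and the `layer_ei` helper are all assumptions made for this example, and the joint-histogram estimator is only feasible for very small layers.

```python
# Minimal sketch (assumptions noted above): EI of a dense layer y = sigmoid(W x + b),
# estimated as I(X; Y) under a maximum-entropy (uniform) perturbation of the input.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_ei(W, b, n_samples=100_000, n_bins=8):
    """Histogram-based EI estimate for a small dense layer.

    Inputs are drawn uniformly on [0, 1]^n_in (max-entropy over the activation
    range). Only feasible for tiny layers: the joint histogram has
    n_bins**(n_in + n_out) cells.
    """
    n_in = W.shape[1]
    x = rng.uniform(0.0, 1.0, size=(n_samples, n_in))  # max-entropy input
    y = sigmoid(x @ W.T + b)                           # layer output in (0, 1)

    # Discretize activations in [0, 1] into equal-width bins.
    xd = np.clip((x * n_bins).astype(int), 0, n_bins - 1)
    yd = np.clip((y * n_bins).astype(int), 0, n_bins - 1)

    # Collapse each multivariate sample to a single discrete symbol.
    x_sym = np.ravel_multi_index(xd.T, (n_bins,) * n_in)
    y_sym = np.ravel_multi_index(yd.T, (n_bins,) * W.shape[0])

    # Plug-in mutual information I(X; Y) from the empirical joint distribution.
    joint, px, py = {}, {}, {}
    for a, c in zip(x_sym, y_sym):
        joint[(a, c)] = joint.get((a, c), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[c] = py.get(c, 0) + 1
    n = len(x_sym)
    mi = 0.0
    for (a, c), k in joint.items():
        # p(a,c) * log2( p(a,c) / (p(a) p(c)) ), with counts k, px[a], py[c].
        mi += (k / n) * np.log2(k * n / (px[a] * py[c]))
    return mi

# Example: EI of a toy 2-in / 2-out layer with random weights.
W = rng.normal(size=(2, 2))
b = rng.normal(size=2)
print(f"EI ≈ {layer_ei(W, b):.3f} bits")
```

Under the same perturbation scheme, running such an estimator edge by edge (perturbing one input node at a time) would give a handle on the sensitivity the abstract describes, with degeneracy appearing as the loss introduced by overlapping edges; the paper’s exact decomposition may differ in its details.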
Keywords: artificial neural networks; causation; information theory
MDPI and ACS Style

Marrow, S.; Michaud, E.J.; Hoel, E. Examining the Causal Structures of Deep Neural Networks Using Information Theory. Entropy 2020, 22, 1429. https://doi.org/10.3390/e22121429

AMA Style

Marrow S, Michaud EJ, Hoel E. Examining the Causal Structures of Deep Neural Networks Using Information Theory. Entropy. 2020; 22(12):1429. https://doi.org/10.3390/e22121429

Chicago/Turabian Style

Marrow, Scythia, Eric J. Michaud, and Erik Hoel. 2020. "Examining the Causal Structures of Deep Neural Networks Using Information Theory" Entropy 22, no. 12: 1429. https://doi.org/10.3390/e22121429
