1. Introduction
In recent years, deep neural networks (DNNs) have achieved remarkable success in various domains such as image recognition [1], recommendation systems [2], and natural language processing [3]. However, their performance relies heavily on the distribution of the training data [4], making them vulnerable to distributional shifts between training and test domains [5]. This issue is particularly critical in autonomous driving, where complex and dynamic environments introduce natural corruptions such as illumination changes, adverse weather, and image blur or defocus, all of which can severely impair visual perception models [6]. Such degradations not only reduce model accuracy but may also lead to serious safety failures, making corruption robustness a key prerequisite for deploying DNNs in autonomous driving systems. Empirical evidence shows that mainstream DNNs often suffer substantial accuracy degradation when exposed to such corruptions [7]. Although augmenting training data with corrupted samples can improve robustness, it often leads to overfitting to the corruption types seen during training, limiting generalization to unseen degradation patterns [8]. These challenges underscore the need for reliable corruption robustness evaluation and a deeper understanding of its structural foundations.
Mainstream corruption robustness evaluation methods typically assess model performance degradation on benchmark datasets of naturally corrupted inputs [9], or quantify sensitivity through output fluctuations and confidence shifts [10]. However, these approaches often require large volumes of corrupted data and are computationally expensive. More importantly, being based on posterior observations, they offer limited insight into the intrinsic relationship between a model's internal information structure and its robustness.
Information-theoretic approaches have increasingly informed the theoretical understanding of DNNs [11]. Researchers have sought to characterize how information is transmitted and transformed across neural layers using concepts such as mutual information and entropy, aiming to uncover their intrinsic links to generalization, representational efficiency, and robustness. The Information Bottleneck (IB) theory [12] provided an early perspective by framing deep learning as a trade-off between input compression and retention of task-relevant information, with subsequent studies linking compression dynamics to generalization [13,14].
Building on this, partial information decomposition (PID) [15] has emerged as a refined multivariate framework that decomposes mutual information into redundant, synergistic, and unique components, enabling deeper insight into how neurons share, integrate, and specialize information. PID has been applied to both biological and artificial neural systems. In neuroscience, Luppi et al. [16] used PID to map the brain's informational architecture, while Varley et al. [17] employed partial entropy decomposition to uncover higher-order structures in human brain activity. In artificial systems, Wollstadt et al. [18] defined redundancy and relevancy measures for feature selection, Wibral et al. [19] proposed PID as a unified framework for neural goal functions, and Dewan et al. [20] applied Diffusion PID to interpret generative models. Together, these studies highlight PID's role in elucidating information dynamics, refining feature selection, and advancing model interpretability.
Recent studies have begun to investigate the functional roles of the components defined by PID. Proca et al. [21] applied PID to small neural networks across supervised and reinforcement learning tasks, finding that synergistic information supports multimodal integration and multitask learning while redundancy correlates with robustness, highlighting the role of internal information dynamics in general learning ability. Moreover, Tax et al. [22] used PID to analyze hidden-neuron representations in Boltzmann machines, revealing a staged learning pattern that progresses from redundancy to uniqueness. While these findings provide empirical support for the functional significance of PID components, they are mostly derived from small-scale networks and have not yet been extended to more complex architectures or realistic tasks.
Inspired by the aforementioned studies, this work investigates how the structure of neuronal information interactions relates to model performance, with particular emphasis on robustness under natural corruptions. We introduce a robustness assessment approach based on PID measures, aiming to estimate a model's stability across corruption scenarios solely from the information structure revealed by its neural activations on clean images. Just as network depth is widely regarded as a structural indicator of model expressiveness [23], we explore whether information structure can likewise serve as a prior for robustness, enabling efficient model assessment and comparison. Our main contributions are: (1) we reveal how redundant, synergistic, and unique information components differentially account for performance variation under corruption; and (2) through empirical analysis on two benchmark corruption datasets and six mainstream architectures, we validate the potential of inferring robustness from PID metrics computed on clean samples, providing theoretical insights for model assessment and design.
2. Partial Information Decomposition
In information theory, Mutual Information (MI) [24] is a fundamental measure used to quantify the dependence between two random variables. For two discrete random variables X and Y, mutual information is defined as follows:

I(X; Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}, (1)

where p(x, y) denotes the joint probability distribution of X and Y, and p(x), p(y) are the marginal distributions. MI can also be equivalently expressed in terms of entropy:

I(X; Y) = H(Y) - H(Y \mid X). (2)

Here, H(Y) represents the Shannon entropy of Y, quantifying the information content of the variable:

H(Y) = -\sum_{y \in \mathcal{Y}} p(y) \log p(y). (3)

The term H(Y | X), known as conditional entropy, measures the remaining uncertainty in Y given knowledge of X:

H(Y \mid X) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log p(y \mid x). (4)

MI thus measures the reduction in uncertainty about Y given knowledge of X, i.e., how much information X provides about Y. However, in multivariate settings, MI cannot distinguish between different types of contribution from multiple sources, for example, whether they offer redundant, unique, or synergistic information about the target variable. To address this limitation, PID provides a framework for decomposing mutual information into interpretable atomic components.
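The quantities above can be computed directly from a discrete joint distribution. The following sketch (NumPy-based; function and variable names are ours, not from the paper) estimates I(X;Y) and checks the entropy identity I(X;Y) = H(Y) - H(Y|X) on a toy distribution:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array (zero entries ignored)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(pxy):
    """I(X;Y) in bits from a 2-D joint distribution pxy[x, y]."""
    px = pxy.sum(axis=1)                   # marginal p(x)
    py = pxy.sum(axis=0)                   # marginal p(y)
    nz = pxy > 0                           # skip zero-probability cells
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])))

# Toy example: X is a uniform bit and Y = X, so I(X;Y) = H(Y) = 1 bit.
pxy = np.array([[0.5, 0.0],
                [0.0, 0.5]])
mi = mutual_information(pxy)                          # 1.0
# Identity check: H(Y|X) = H(X,Y) - H(X), then I(X;Y) = H(Y) - H(Y|X)
h_cond = entropy(pxy) - entropy(pxy.sum(axis=1))      # 0.0 here
print(mi, entropy(pxy.sum(axis=0)) - h_cond)          # both print 1.0
```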
PID, introduced by Williams and Beer [15], extends Shannon's framework to analyze how multiple source variables jointly contribute to a target variable. Given a set of sources X_1, ..., X_n and a target Y, PID decomposes the total mutual information I(X_1, ..., X_n; Y) into the following fundamental components:
Redundant Information: Information that is shared by multiple source variables, i.e., the same information about Y is provided by more than one source.
Unique Information: Information that is exclusively provided by a single source variable and not available from any other source.
Synergistic Information: Information that can only be obtained through the joint consideration of multiple source variables, and cannot be accessed from any individual source alone.
To illustrate the decomposition more intuitively, consider a simplified case with two source variables X_1 and X_2 and a single target variable Y. The mutual information between the joint sources and the target can be decomposed into a sum of partial information atoms as shown in Equation (5),

I(X_1, X_2; Y) = Red(X_1, X_2; Y) + Unq(X_1; Y) + Unq(X_2; Y) + Syn(X_1, X_2; Y), (5)

where I(X_i; Y) denotes the mutual information between a single source X_i and the target Y, and I(X_1, X_2; Y) denotes the mutual information between the joint sources (X_1, X_2) and the target Y. Red(X_1, X_2; Y) represents the redundant information about Y shared by X_1 and X_2, Unq(X_i; Y) represents the information uniquely provided by X_i, and Syn(X_1, X_2; Y) represents the synergistic information about Y that is only provided jointly by X_1 and X_2. In addition, each single-source information satisfies I(X_i; Y) = Red(X_1, X_2; Y) + Unq(X_i; Y). The relationships among these components can be intuitively illustrated using a Venn diagram, as depicted in Figure 1.
This results in an underdetermined system of three equations with four unknowns. Although PID provides a conceptual framework to differentiate redundant, unique, and synergistic information, it does not prescribe a specific method for computing these quantities. Currently, there is no universally accepted redundancy function, and alternative formulations characterize distinct facets of multivariate information. Two commonly used redundancy functions are I_min, originally proposed by [15], and the minimal-mutual-information redundancy I_MMI [25]. As demonstrated in the study by [21], these two measures exhibit consistent behavior in various experimental settings. Therefore, for computational tractability, we adopt the I_MMI measure to compute the redundancy function, that is,

Red(X_1, X_2; Y) = I_{MMI}(X_1, X_2; Y) = \min_i I(X_i; Y). (6)
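Under the I_MMI redundancy, all four atoms of the two-source decomposition follow from ordinary mutual information terms. Below is a minimal sketch (function names are our own) that verifies the textbook XOR case, in which all of the information about the target is synergistic:

```python
import numpy as np

def mi(pxy):
    """I(X;Y) in bits from a 2-D joint distribution pxy[x, y]."""
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])))

def pid_mmi(p):
    """Two-source PID of I(X1,X2;Y) using the I_MMI redundancy.
    p[x1, x2, y] is the joint distribution of the two sources and target."""
    i1 = mi(p.sum(axis=1))                  # I(X1;Y): marginalize out X2
    i2 = mi(p.sum(axis=0))                  # I(X2;Y): marginalize out X1
    ij = mi(p.reshape(-1, p.shape[-1]))     # I(X1,X2;Y): joint sources
    red = min(i1, i2)                       # Equation (6): min over sources
    return {"red": red, "unq1": i1 - red, "unq2": i2 - red,
            "syn": ij - i1 - i2 + red}      # remainder of Equation (5)

# XOR gate: Y = X1 ^ X2 with uniform inputs. Neither source alone carries
# any information about Y, so the full bit is synergistic.
p = np.zeros((2, 2, 2))
for x1 in (0, 1):
    for x2 in (0, 1):
        p[x1, x2, x1 ^ x2] = 0.25
print(pid_mmi(p))   # {'red': 0.0, 'unq1': 0.0, 'unq2': 0.0, 'syn': 1.0}
```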
4. Results
4.1. Preliminary Observations on the Connection Between Neuronal PID Structure and Model Robustness
Each model is evaluated on the clean test set and eight corrupted test sets, with both prediction outputs and neuronal activations recorded. Partial information decomposition is subsequently applied to neuron pairs using clean image samples, in order to investigate potential relationships between internal information processing mechanisms and model behavior under corruption.
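The pairwise analysis described above can be sketched as follows. Since the exact estimator belongs to the methods section not reproduced here, this sketch assumes that activations are binarized at their per-neuron medians, that PID atoms are computed for randomly sampled neuron pairs against the class label, and that RR, SR, and UR are the average shares of the redundancy, synergy, and unique atoms in the pairwise joint information; the function names and the aggregation scheme are our assumptions, not the paper's exact procedure.

```python
import numpy as np

def mi(pxy):
    """I(X;Y) in bits from a 2-D joint distribution pxy[x, y]."""
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])))

def pid_mmi(p):
    """Two-source PID atoms with the I_MMI redundancy; p[x1, x2, y]."""
    i1 = mi(p.sum(axis=1))                  # I(X1;Y)
    i2 = mi(p.sum(axis=0))                  # I(X2;Y)
    ij = mi(p.reshape(-1, p.shape[-1]))     # I(X1,X2;Y)
    red = min(i1, i2)
    return red, i1 - red, i2 - red, ij - i1 - i2 + red

def pid_rates(acts, labels, n_pairs=200, seed=0):
    """Average redundancy/synergy/unique shares over random neuron pairs.
    acts: (n_samples, n_neurons) clean-image activations; labels: class ids.
    This aggregation into (RR, SR, UR) is an assumption for illustration."""
    rng = np.random.default_rng(seed)
    bin_acts = (acts > np.median(acts, axis=0)).astype(int)  # binarize per neuron
    n_classes = int(labels.max()) + 1
    totals, used = np.zeros(3), 0
    for _ in range(n_pairs):
        i, j = rng.choice(acts.shape[1], size=2, replace=False)
        p = np.zeros((2, 2, n_classes))
        np.add.at(p, (bin_acts[:, i], bin_acts[:, j], labels), 1.0)
        p /= p.sum()                         # empirical joint distribution
        red, u1, u2, syn = pid_mmi(p)
        joint = red + u1 + u2 + syn          # = I(X1,X2;Y)
        if joint > 0:
            totals += np.array([red, syn, u1 + u2]) / joint
            used += 1
    return totals / used                     # (RR, SR, UR), each in [0, 1]
```

With this definition the three rates sum to one for each neuron pair, so RR, SR, and UR describe how a pair's joint information about the label is split among the atoms.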
Figure 2a presents the classification accuracy of each model on the ImageNet-C and CIFAR10-C datasets across eight corruption types and five severity levels, together with the corresponding accuracy standard deviation \sigma, where a smaller \sigma indicates less variation in accuracy across corruption types and severity levels. Figure 2b illustrates the PID results on clean images, including the redundancy rate (RR), synergy rate (SR), and unique information rate (UR). A comparison of the two subfigures indicates that models with smaller \sigma tend to exhibit higher RR and lower SR values, whereas UR shows no consistent trend. These observations provide preliminary evidence that the internal information structure of neurons may be related to model robustness under corruption.
4.2. Rank Correlation Analysis Between PID Measures and Robustness
To further assess the statistical validity of the trends observed in the previous section, a non-parametric rank correlation analysis is conducted to quantify the relationship between PID measures and model robustness. Unlike correlation metrics that rely on linearity assumptions, Spearman's \rho and Kendall's \tau are better suited to capturing monotonic relationships, particularly in the presence of nonlinearity and when only ordinal consistency matters. Accordingly, this study applies both \rho and \tau to systematically assess the rank correlations between the three PID measures (RR, SR, UR) and the three robustness measures. This analysis aims to explore the potential of these metrics as predictors of model robustness.
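As a concrete illustration of this analysis, the following sketch computes both rank statistics with SciPy; the six per-model values are fabricated for illustration only and are constructed to be perfectly inversely monotonic, which is the pattern the paper reports for RR versus accuracy variability:

```python
import numpy as np
from scipy.stats import spearmanr, kendalltau

# Illustrative (made-up) per-model values: clean-image RR versus the
# accuracy standard deviation under corruption (lower = more stable).
rr      = np.array([0.42, 0.35, 0.51, 0.28, 0.47, 0.31])
acc_std = np.array([0.08, 0.12, 0.05, 0.15, 0.06, 0.13])

rho, p_rho = spearmanr(rr, acc_std)
tau, p_tau = kendalltau(rr, acc_std)
print(f"Spearman rho = {rho:.2f} (p = {p_rho:.4f})")   # rho = -1.00
print(f"Kendall  tau = {tau:.2f} (p = {p_tau:.4f})")   # tau = -1.00
```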
The rank correlation results are presented in Figure 3. They reveal a consistent trend between the information decomposition metrics measured on clean images and the robustness of models under corrupted conditions. Notably, RR exhibits a strong negative correlation with \sigma, with both Spearman's \rho and Kendall's \tau reaching -1 and p-values below 0.01 on both datasets. This result indicates that models with higher redundancy tend to show smaller performance fluctuations across corruption types and severities. This phenomenon can be attributed to the fault-tolerant properties of redundancy in neural networks: when multiple neurons transmit overlapping information, the network remains stable even if some of the neurons are disrupted by noise or distortion, as the unaffected neurons compensate for the loss and thereby maintain the overall stability of the system [34]. Similar conclusions were drawn by Barrett et al. [25], who demonstrated that redundancy enhances a model's resistance to external perturbations and ensures that critical information is robustly transmitted under varying input conditions.
SR exhibits a negative rank correlation with model robustness on both datasets. Specifically, models with higher SR tend to show greater fluctuations in accuracy under different types and severities of corruption, although the strength and statistical significance of this correlation vary between datasets, suggesting a less consistent association than that of RR. This observation can be explained by the findings of [21], which suggest that neurons relying heavily on synergistic representations are more sensitive to input perturbations. Because synergistic information emerges through the combined activity of multiple sources, disruption of any one of these sources can compromise the integrity of the shared information. Consequently, the failure of such neurons to maintain stable information integration undermines the decision-making performance of the model.
The differences in RR and SR across models can be traced to their architectural designs, reflecting how structural mechanisms shape internal information dynamics. ResNet50 and DenseNet121 consistently exhibit higher RR and lower SR, which can be attributed to their skip connections and dense feature reuse. These design choices promote overlapping information pathways, thereby enhancing redundancy and reducing reliance on fragile synergistic interactions. In contrast, AlexNet and MobileNetV2 show lower RR and higher SR, indicating that their relatively shallow or lightweight convolutional structures depend more on joint feature integration. Such reliance on synergy makes them more sensitive to corruption-induced perturbations, as disruption of any source neuron can compromise the shared representation. InceptionV3 demonstrates moderate RR but the lowest SR, suggesting that its multi-branch architecture fosters representational diversity while minimizing dependence on synergistic encoding. Finally, VGG16 lies in the middle range, with balanced RR and SR values, consistent with its deep yet sequential convolutional design, which emphasizes neither redundancy nor synergy. These observations highlight that redundancy and synergy rates reflect architectural design choices: models with mechanisms that encourage overlapping information transmission tend to achieve greater robustness, whereas architectures that rely heavily on synergistic integration are more vulnerable to corruption.
UR shows a strong positive correlation with one of the robustness measures, although the statistical significance is weaker on the CIFAR10-C dataset. Its rank correlation with a second measure is statistically significant on ImageNet-C but not on CIFAR10-C, and its correlations with the third are not statistically significant on either dataset; the association between UR and the robustness indicators is therefore not consistent across datasets. This suggests that UR may be more involved in task-specific feature representation while playing a relatively limited role in general robustness mechanisms.
In summary, these findings demonstrate that the PID measures of neurons, particularly redundancy and synergy, capture structural differences in how models respond to corrupted inputs. These results provide both theoretical support and empirical evidence for developing model evaluation approaches grounded in internal information structure.
4.3. Connection Between Unique Information and Classification Performance
Building upon the established association between neuronal information structure and model robustness, this section investigates how information interaction patterns relate to classification performance under varying corruption conditions. To this end, we evaluate model performance and compute PID measures on clean images and mildly corrupted images (the lowest severity level) across eight corruption types in both ImageNet-C and CIFAR10-C. The results are illustrated in Figure 4a,b, where the x-axis represents corruption types and different markers denote distinct models. The results reveal that models achieving higher accuracy typically exhibit higher UR values, suggesting that unique information may reflect a model's ability to extract discriminative features.
However, this trend weakens as the severity of corruption increases.
Figure 4c,d presents the results for higher severity levels on ImageNet-C, where the classification performance of all models declines significantly and the differences in UR become increasingly disordered. At these higher corruption levels, accuracy approaches random behavior as severe distortions overwhelm the useful signal, which obscures the correlation with UR and makes the trend less clear. This observation implies that UR may serve as a meaningful structural indicator of model capacity under high-quality input conditions, whereas under severe corruption, models may rely more on redundant mechanisms or other robustness strategies to maintain performance.
To further examine the discriminative capacity of unique information in classification tasks, UR was compared between correctly and incorrectly classified samples. As depicted in Figure 5, across all corruption severity levels, correctly classified inputs consistently exhibit higher UR values than misclassified ones. This effect is particularly pronounced under mild corruption, supporting the inference that increased unique information is linked to improved discriminative capability.
Theoretical support for this observation can be found in prior studies on the dynamics of information learning. Tax et al. [22] analyzed neural information dynamics with the PID framework during the training of Boltzmann machines and observed that neurons first acquire redundant information before gradually specializing to encode unique information about the target variable. This transition reflects an increasing ability to capture discriminative features, thereby enhancing classification performance. From the perspective of statistical decision theory, Venkatesh et al. [35] further demonstrated that the amount of unique information provides an upper bound on the minimum risk achievable by a given information source in a decision task. In other words, neurons with higher UR contribute to lower uncertainty in classification decisions. The positive relationship between UR and classification performance is therefore supported by both empirical evidence and theoretical foundations.
The UR also reflects architectural design. InceptionV3, ResNet50, and DenseNet121 consistently show higher UR, indicating that multi-branch, residual, and dense connectivity promote feature specialization and discriminative capacity. MobileNetV2 exhibits intermediate UR, balancing efficiency with moderate uniqueness. In contrast, VGG16 and AlexNet display low UR, consistent with their sequential or shallow structures that rely more on shared representations. Overall, architectures that encourage diverse and specialized feature encoding achieve higher UR, while simpler sequential designs yield lower UR and weaker discriminative ability.
5. Conclusions
Within the framework of PID, the structure of neuronal information interactions is examined as a potential indicator of model robustness. PID measures derived from clean-image neuron activations are analyzed to explore the relationship between internal information interaction mechanisms and model robustness under corruption. This approach facilitates a novel robustness evaluation paradigm that does not rely on extensive corrupted test samples, thereby contributing both theoretical insights and practical value to model assessment.
Experiments conducted on the ImageNet-C and CIFAR10-C datasets reveal consistent trends in the connection between information decomposition and robustness. Models with a higher redundancy rate tend to achieve more stable performance across diverse corruption types and severity levels, whereas a higher synergy rate is generally associated with increased performance variability. These results suggest that redundancy contributes critically to robustness by supporting tolerance to input degradation, while a greater dependence on synergistic information may heighten sensitivity to noise and perturbations. In addition, the unique information rate correlates with classification performance on high-quality inputs, indicating that unique information reflects the degree of specialization in encoding discriminative features. However, as corruption severity increases, the influence of unique information diminishes, and the network increasingly relies on redundancy to preserve performance.
Overall, our findings highlight the distinct functional roles of the components in PID. Redundancy is essential for robustness, synergy facilitates complex feature integration but introduces sensitivity, and uniqueness contributes to accuracy under high-quality input conditions. Future research should further investigate strategies for modulating the information structure of neural models to achieve a better balance between robustness and accuracy. For instance, specific architectural designs or training procedures could be developed to enhance redundancy and thereby improve stability under challenging conditions. Additionally, identifying methods to promote the learning of unique information without compromising robustness could be a promising direction toward building more interpretable neural networks.