Symmetry-Aware Feature Representations and Model Optimization for Interpretable Machine Learning
Abstract
1. Introduction
- RQ1: How do different types of data symmetry (e.g., reflectional, rotational, or permutation) influence the design of learning models?
- RQ2: Can explicit modeling of symmetry and asymmetry improve generalization performance in classification and clustering tasks?
- RQ3: How does symmetry-aware modeling contribute to the interpretability of machine learning systems in high-stakes domains?
- RQ4: What trade-offs exist between symmetry exploitation and computational efficiency during learning?
Novelty and Contributions
- Unified Multi-Domain Symmetry Handling—SALF encodes rotational, reflectional, permutation, and geometric invariances within a single modular framework, enabling deployment across images, sequences, graphs, and time-series signals without architecture redesign.
- Asymmetry-Driven Regularization—Unlike existing symmetry-only methods, SALF incorporates a controlled asymmetry term to retain semantically meaningful variations, addressing scenarios where perfect invariance is detrimental.
- Integrated Interpretability—Symmetry-aware embeddings are directly connected to interpretability modules (Grad-CAM, SHAP, GNNExplainer), ensuring explanations are aligned with learned invariances.
- Seamless Hybridization—The framework is designed for plug-and-play integration into existing CNN, Transformer, and GNN pipelines without substantial computational overhead.
2. Related Work
2.1. Symmetry in Feature Representations
2.2. Symmetry in Neural Network Architectures
2.3. Asymmetry for Discrimination and Anomaly Detection
2.4. Symmetry in Explainability and Human-Aligned AI
2.5. Recent Advances in Symmetry-Aware and Equivariant Learning
3. Theoretical Framework
3.1. Defining Symmetry in the Context of Machine Learning
- Data-Level Symmetry: Invariance of input data under transformations (e.g., rotation of images, reordering of set elements).
- Model-Level Symmetry: Equivariance or invariance of model outputs when inputs are transformed.
- Optimization-Level Symmetry: Multiple equivalent parameter configurations due to symmetric loss surfaces.
3.2. Types of Symmetry Relevant to Learning Tasks
3.3. Equivariance vs. Invariance in Learning
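To make the distinction concrete, the following minimal sketch (ours, not from the paper) checks the two properties numerically for a 90° image rotation: global mean pooling is invariant, while an element-wise map is equivariant.

```python
import numpy as np

def rot90(x):
    """The group action g: rotate an image by 90 degrees."""
    return np.rot90(x)

def invariant_f(x):
    """Invariant map: global mean pooling satisfies f(g.x) == f(x)."""
    return x.mean()

def equivariant_f(x):
    """Equivariant map: element-wise scaling satisfies f(g.x) == g.f(x)."""
    return 2.0 * x

x = np.random.rand(8, 8)
assert np.isclose(invariant_f(rot90(x)), invariant_f(x))              # invariance
assert np.allclose(equivariant_f(rot90(x)), rot90(equivariant_f(x)))  # equivariance
```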
3.4. Symmetry Breaking and Asymmetry in Learning
- Weight initialization: Random initialization breaks the symmetry among otherwise identical neurons, letting gradient-based learning drive them toward distinct features.
- Asymmetric loss functions: In anomaly detection, asymmetric error penalties weight mistakes on rare patterns more heavily, biasing models toward detecting unusual events [8].
- Directional attention: Transformer models use directionally asymmetric (causal) attention to encode context [19]; a minimal sketch follows this list.
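The sketch below is an illustration we add here, not the authors' code: a causal mask breaks the left-right symmetry of self-attention scores [19], so position i may attend to position j only when j ≤ i.

```python
import torch

def causal_attention_scores(q, k):
    """Scaled dot-product scores with a causal (directionally asymmetric) mask."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5            # (T, T) similarities
    T = scores.size(-1)
    mask = torch.triu(torch.ones(T, T), diagonal=1).bool() # True above the diagonal
    scores = scores.masked_fill(mask, float("-inf"))       # forbid attending to the future
    return scores.softmax(dim=-1)

q = k = torch.randn(5, 16)
attn = causal_attention_scores(q, k)   # upper triangle is exactly zero: asymmetric by design
```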
3.5. Role of Group Theory in Model Design
3.6. Symmetry in Optimization Landscapes
4. Proposed Methodology
- Symmetry-Aware Feature Extraction;
- Asymmetry-Driven Model Optimization;
- Hybrid Integration into Standard Learning Pipelines.
4.1. Overview of SALF Architecture
- Input data preprocessing and augmentation using known symmetry transformations.
- A symmetry-aware encoder that extracts invariant or equivariant representations.
- A model optimization module with asymmetry-enhanced regularization.
- Output layer with interpretable structure-preserving predictions (a hypothetical skeleton of these four stages follows this list).
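The skeleton below shows one way the stages could compose; the module boundaries and names are our assumptions for illustration, not the authors' implementation. Preprocessing (stage 1) and the asymmetry-regularized loss (stage 3) sit outside the module, as described in Sections 4.2 and 4.3.

```python
import torch.nn as nn

class SALFModel(nn.Module):
    """Illustrative composition: symmetry-aware encoder -> interpretable head."""
    def __init__(self, encoder: nn.Module, head: nn.Module):
        super().__init__()
        self.encoder = encoder   # invariant/equivariant feature extractor
        self.head = head         # structure-preserving prediction layer

    def forward(self, x):
        z = self.encoder(x)      # symmetry-aware embedding
        return self.head(z), z   # embedding is reused by the asymmetry penalty
```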
4.2. Symmetry-Aware Feature Extraction
- For images and signals: We apply group-equivariant convolutional layers using SO(2) and D4 (dihedral) groups to capture rotation and reflection symmetries [2].
- For sets and point clouds: DeepSets-style aggregation is used, ensuring permutation invariance by computing f(X) = ρ(∑_{x∈X} φ(x)), where φ embeds each element and ρ maps the pooled sum to the output [4]; a minimal sketch follows this list.
- For graphs: Graph Neural Networks (GNNs) are implemented with permutation-equivariant message passing. For more complex structures, E(n)-equivariant GNNs are applied to embed geometric symmetries [11].
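As a concrete instance of the permutation-invariant aggregation above, here is a minimal DeepSets-style sketch in the spirit of [4]; the layer sizes are illustrative choices of ours.

```python
import torch
import torch.nn as nn

class DeepSets(nn.Module):
    """f(X) = rho(sum_x phi(x)): sum pooling makes the map permutation-invariant."""
    def __init__(self, in_dim=3, hidden=64, out_dim=10):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x):                        # x: (batch, set_size, in_dim)
        return self.rho(self.phi(x).sum(dim=1))  # pool over the set axis

model = DeepSets()
x = torch.randn(2, 100, 3)                       # e.g., small point clouds
perm = torch.randperm(100)
assert torch.allclose(model(x), model(x[:, perm]), atol=1e-5)  # element order is irrelevant
```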
4.3. Asymmetry-Driven Model Optimization
The optimization objective combines the task loss with an asymmetry penalty (an illustrative sketch follows these definitions):

L_total = L_task + λ · L_asym, where

- L_task: standard classification or regression loss (e.g., cross-entropy);
- L_asym: asymmetric penalty function that encourages representational deviation across similar classes;
- λ: balancing hyperparameter.
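The exact form of L_asym is not fixed by the notation above, so the sketch below is a hedged illustration: a margin-style penalty that pushes apart the embedding centroids of a pair of easily confused classes. The pair selection, margin, and λ = 0.3 (the best value in the Section 5.3 sweep) are our assumptions.

```python
import torch
import torch.nn.functional as F

def asymmetry_penalty(z, y, pair=(0, 1), margin=1.0):
    """Illustrative L_asym: keep centroids of two similar classes at least `margin` apart."""
    za, zb = z[y == pair[0]], z[y == pair[1]]
    if len(za) == 0 or len(zb) == 0:
        return z.new_zeros(())                    # pair absent from this batch
    dist = (za.mean(dim=0) - zb.mean(dim=0)).norm()
    return F.relu(margin - dist)                  # penalize centroids that collapse together

def salf_loss(logits, z, y, lam=0.3):
    task = F.cross_entropy(logits, y)             # L_task
    asym = asymmetry_penalty(z, y)                # L_asym (illustrative form)
    return task + lam * asym                      # L_total = L_task + lambda * L_asym
```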
Rationale for Asymmetry-Driven Regularization
- Information-Theoretic Perspective—Perfect symmetry can reduce the mutual information between learned features and task-relevant labels in cases where asymmetric cues carry discriminative value. The regularization term acts as a soft constraint, balancing invariance with expressiveness.
- Group-Invariance Theory—Classical group-equivariant learning enforces invariance under a predefined transformation group. SALF relaxes this by allowing controlled deviation from strict invariance, enabling the network to capture residual structures that lie outside the symmetry group but are important for classification.
4.4. Hybrid Integration with Existing Models
5. Experimental Setup
5.1. Datasets
- RotMNIST: A variant of the MNIST digit classification dataset, where each image is randomly rotated in the range [0°, 360°]. Used to evaluate rotational symmetry handling [32].
- PhysioNet MIT-BIH ECG Dataset: Contains electrocardiogram signals with natural left-right waveform symmetry. Used for binary classification of arrhythmic vs. normal heartbeats [33].
- PROTEINS Graph Dataset: A dataset of protein structures modeled as graphs, where node permutations (amino acid sequence reordering) should not affect classification [1].
In addition to the datasets, two interpretability-focused baselines were included for comparison:
- ProtoPNet: A prototype-based convolutional architecture that explains predictions through visual prototypes resembling training examples; however, its interpretability degrades under geometric transformations due to the absence of built-in invariance mechanisms [34].
- LIME-Regularized CNN: A convolutional model augmented with local explanation regularization using the LIME framework, offering localized interpretability but limited consistency under input rotations or reflections [35].
5.2. Experimental Settings
5.2.1. Preprocessing & Augmentation:
- RotMNIST: Pixel values were normalized to [0, 1] and randomly rotated within 0–360° to reinforce rotational invariance.
- MIT-BIH ECG: Baseline correction and z-score normalization were applied; reflection and time-reversal augmentations preserved physiological symmetry. Principal Component Analysis (95% variance retention) was used on learned embeddings to remove redundant noise.
- PROTEINS: Node features were normalized by node degree and edges were reordered using permutation-invariant indexing to ensure graph consistency.
- Rotated CIFAR-10: Images were standardized per channel and augmented through rotations (±90°), horizontal flips, and random cropping to simulate real-world geometric perturbations.
- MIMIC-III ECG Subset: Signals were amplitude-normalized, temporally windowed, and augmented through reflection and minor temporal scaling to preserve waveform symmetry while expanding variability (a minimal sketch of these augmentations follows this list).
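A minimal sketch of two of the symmetry-based augmentations listed above; the parameter ranges follow the text, while the function names and library choices are ours.

```python
import numpy as np
from scipy.ndimage import rotate

def augment_image(img):
    """RotMNIST-style: normalize pixels to [0, 1], rotate uniformly in [0, 360)."""
    img = img.astype(np.float32) / 255.0
    angle = np.random.uniform(0.0, 360.0)
    return rotate(img, angle, reshape=False, order=1, mode="nearest")

def augment_ecg(sig):
    """MIT-BIH-style: z-score normalize, then apply time reversal half the time."""
    sig = (sig - sig.mean()) / (sig.std() + 1e-8)
    if np.random.rand() < 0.5:
        sig = sig[::-1].copy()   # reflection preserves the waveform's symmetry class
    return sig
```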
5.2.2. Model Variants Compared:
- Baseline 1: Standard CNN/GNN without symmetry adaptation.
- Baseline 2: Symmetry-aware model (G-CNN, E(n)-GNN) without asymmetry loss.
- SALF (Ours): Symmetry-aware encoder + asymmetry-driven optimization.
5.2.3. Training Protocols:
- Optimizer: Adam, learning rate = 0.001;
- Batch size: 64 (image, signal), 32 (graph);
- Epochs: 100;
- Hardware: NVIDIA A100 GPU.
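The protocol above maps directly onto a standard training loop. The runnable sketch below uses the stated optimizer, learning rate, batch size, and epoch count; the tiny model and random tensors stand in for the real pipelines.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
loader = DataLoader(TensorDataset(torch.randn(256, 28, 28),
                                  torch.randint(0, 10, (256,))),
                    batch_size=64, shuffle=True)   # batch size 32 for the graph datasets
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(100):
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```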
5.2.4. Additional Real-World Datasets
5.2.5. Network Architecture and Training Configuration
5.3. Sensitivity Analysis of λ
5.4. Evaluation Metrics
- Accuracy: Overall classification rate.
- F1 Score: Harmonic mean of precision and recall for imbalanced datasets.
- Precision: The fraction of items predicted positive that are actually positive.
- Recall: The fraction of actually positive items that the model correctly identifies.
- Transformation Robustness (TR%): The accuracy drop, in percentage points, after applying symmetric transformations not seen during training; lower is better (a minimal sketch follows this list).
- Interpretability Score: Qualitative ranking from SHAP and CAM visualizations (expert-rated, see Section 7).
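Of these metrics, TR% is the least standard, so a short sketch of our reading of its definition is given below; `model.predict` and `transform` are placeholders for a fitted classifier and a held-out symmetric transformation.

```python
import numpy as np

def accuracy(model, X, y):
    return float(np.mean(model.predict(X) == y))

def transformation_robustness(model, X, y, transform):
    """TR% = accuracy(original) - accuracy(transformed), in percentage points; lower is better."""
    clean = accuracy(model, X, y)
    shifted = accuracy(model, np.stack([transform(x) for x in X]), y)
    return 100.0 * (clean - shifted)
```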
6. Results and Discussion
6.1. Classification Accuracy and F1 Score
6.2. Robustness to Unseen Transformations
6.3. Ablation Study
6.4. Interpretability Assessment
6.5. Rigorous Evaluation of Interpretability
- Pointing Game Accuracy (PGA)—Measures the proportion of cases where the highest-attribution pixel/node/point overlaps with the ground-truth annotated region (a minimal sketch of this metric follows the list). Applied to the RotMNIST and MIT-BIH ECG datasets, it yielded PGA scores of 0.87 for SALF versus 0.71 for the baseline CNN.
- Deletion and Insertion Curves—Evaluate the sensitivity of model predictions to removing or inserting features ranked by importance scores. SALF exhibited a steeper insertion gain and a slower deletion drop, indicating more faithful attribution maps.
- Rank Correlation with Expert Annotations—For the ECG dataset, SHAP-derived feature rankings were compared against cardiologist-marked QRS complex relevance, achieving a Spearman correlation of 0.82 versus 0.64 for the baseline model.
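Pointing Game Accuracy is simple enough to state in a few lines; the sketch below is our reading of the metric, assuming attribution maps and boolean ground-truth masks of matching shape.

```python
import numpy as np

def pointing_game_accuracy(attributions, masks):
    """Fraction of samples whose highest-attribution location lies inside the mask."""
    hits = 0
    for attr, mask in zip(attributions, masks):
        peak = np.unravel_index(np.argmax(attr), attr.shape)
        hits += bool(mask[peak])
    return hits / len(attributions)

# Toy example: one peak inside its mask, one outside -> PGA = 0.5.
attr = [np.eye(4), np.eye(4)]                  # argmax of np.eye(4) is at (0, 0)
mask = [np.zeros((4, 4), bool), np.zeros((4, 4), bool)]
mask[0][0, 0] = True
print(pointing_game_accuracy(attr, mask))      # 0.5
```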
6.6. Comparative Benchmarking with Existing Symmetric Learning Models
- Group Equivariant CNN (G-CNN) for RotMNIST
- DeepSets for unordered sequence input
- E(n)-Equivariant Graph Neural Networks (E(n)-GNN) for protein graph classification
6.7. Computational Cost and Scalability
6.8. Discussion: Revisiting the Research Questions
6.9. Why Symmetry-Aware Layers over Other Interpretable Frameworks
7. Case Study on Interpretability
7.1. Rationale for Symmetry in Interpretability
7.2. Visual Interpretability in RotMNIST
- Baseline CNNs often attended to peripheral features that were not rotationally consistent.
- G-CNNs improved by focusing on digit centers but still misattributed edge pixels under rotation.
- SALF localized attention to stable components (e.g., loops of 6, 8, 9), invariant under SO(2) transformations, and also modulated attention when subtle asymmetries (slant, skew) were introduced.
7.3. Signal-Level Interpretability in ECG Classification
- Baseline 1D-CNNs showed unstable SHAP patterns, often highlighting irrelevant flatline regions.
- Symmetric CNNs performed better but lacked precision during irregular beats.
- SALF consistently highlighted physiologically relevant P-QRS-T segments and responded to asymmetry-inducing anomalies like premature ventricular contractions (PVCs).
7.4. Node-Level Interpretability in PROTEINS Graphs
- Baseline GCNs often highlighted peripheral, sparsely connected residues.
- E(n)-GNNs captured more stable regions but missed symmetry-disrupting motifs.
- SALF identified biologically relevant motifs (e.g., α-helix loops and β-sheets), even when node ordering or degree was perturbed.
7.5. Summary and Implications
- Attention and feature maps are more consistent and semantically meaningful.
- Symmetry-aware components guide models to stable, generalizable cues.
- Asymmetry loss enhances separation of subtle class boundaries, supporting differential reasoning.
8. Limitations and Future Work
8.1. Limitations
- Computational Overhead: Equivariant layers and transformation-augmented training require additional compute and memory.
- Domain-Specific Tuning: Choice of symmetry groups (e.g., SO(2), S_n) and asymmetry parameters needs domain knowledge.
- Limited to Structured Data: Unstructured data without known symmetries may not benefit from this framework.
8.2. Future Work
- Learning Symmetries from Data: While this work assumes known symmetries, future models can learn symmetry groups implicitly through data-driven discovery.
- Combining with Causal Inference: Merging symmetry-aware representations with causal reasoning could yield models that are not only robust but also causally interpretable.
- Extension to Multimodal Systems: Integrating symmetry priors across modalities (e.g., image-text, video-audio) opens avenues for generalizable multi-domain learning.
- Deployment in Fairness and Security: Investigating how symmetry and asymmetry affect algorithmic bias and adversarial robustness remains largely unexplored.
9. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Bronstein, M.M.; Bruna, J.; LeCun, Y.; Szlam, A.; Vandergheynst, P. Geometric Deep Learning: Going beyond Euclidean data. IEEE Signal Process. Mag. 2017, 34, 18–42.
- Cohen, T.S.; Welling, M. Group Equivariant Convolutional Networks. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016.
- Kondor, R.; Trivedi, S. On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018.
- Zaheer, M.; Kottur, S.; Ravanbakhsh, S.; Poczos, B.; Salakhutdinov, R.R.; Smola, A.J. Deep Sets. In Advances in Neural Information Processing Systems 30; Neural Information Processing Systems Foundation, Inc.: La Jolla, CA, USA, 2017.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2017, arXiv:1609.02907.
- Chen, X.; Kingma, D.P.; Salimans, T.; Duan, Y.; Dhariwal, P.; Schulman, J.; Sutskever, I.; Abbeel, P. Variational Lossy Autoencoder. arXiv 2017, arXiv:1611.02731.
- Robberechts, P.; Van Haaren, J.; Davis, J. A Bayesian Approach to In-Game Win Probability in Soccer. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, New York, NY, USA, 2021; pp. 3512–3521.
- Alam, M.; Khan, I.R. Application of AI in Smart Cities. In Industrial Transformation; CRC Press: Boca Raton, FL, USA, 2022; pp. 61–86.
- Tjoa, E.; Guan, C. A Survey on Explainable Artificial Intelligence (XAI): Toward Medical XAI. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4793–4813.
- Satorras, V.G.; Hoogeboom, E.; Welling, M. E(n) Equivariant Graph Neural Networks. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021.
- Ryabogin, D. A negative answer to Ulam’s Problem 19 from the Scottish Book. Ann. Math. 2022, 195, 1111–1150.
- Santos-Escriche, E.; Jegelka, S. Learning Equivariant Models by Discovering Symmetries with Learnable Augmentations. arXiv 2025, arXiv:2506.03914.
- Ziyin, L.; Xu, Y.; Poggio, T.; Chuang, I. Parameter Symmetry Potentially Unifies Deep Learning Theory. arXiv 2025, arXiv:2502.05300.
- Ruhe, D.; Brandstetter, J.; Forré, P. Clifford Group Equivariant Neural Networks. Adv. Neural Inf. Process. Syst. 2023, 36, 62922–62990.
- Keller, T.A. Flow Equivariant Recurrent Neural Networks. arXiv 2025.
- Nguyen, Q.T.; Schatzki, L.; Braccia, P.; Ragone, M.; Coles, P.J.; Sauvage, F.; Larocca, M.; Cerezo, M. Theory for Equivariant Quantum Neural Networks. PRX Quantum 2024, 5, 020328.
- Pearce-Crump, E.; Knottenbelt, W.J. Graph Automorphism Group Equivariant Neural Networks. arXiv 2024, arXiv:2307.07810.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Advances in Neural Information Processing Systems 30; Neural Information Processing Systems Foundation, Inc.: La Jolla, CA, USA, 2017.
- Kaba, S.-O.; Ravanbakhsh, S. Symmetry Breaking and Equivariant Neural Networks. arXiv 2024, arXiv:2312.09016.
- Hofgard, E.; Wang, R.; Walters, R.; Smidt, T. Relaxed Equivariant Graph Neural Networks. arXiv 2024, arXiv:2407.20471.
- Dhurandhar, A.; Chen, P.Y.; Luss, R.; Tu, C.C.; Ting, P.; Shanmugam, K.; Das, P. Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives. In Advances in Neural Information Processing Systems 31; Neural Information Processing Systems Foundation, Inc.: La Jolla, CA, USA, 2018.
- Haarnoja, T.; Zhou, A.; Abbeel, P.; Levine, S. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018.
- Crabbé, J.; van der Schaar, M. Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance. Adv. Neural Inf. Process. Syst. 2023, 36, 71393–71429.
- Cohen, A.; Koren, T.; Mansour, Y. Learning Linear-Quadratic Regulators Efficiently with only √T Regret. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019.
- Bronstein, M.M.; Bruna, J.; Cohen, T.; Veličković, P. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges. arXiv 2021, arXiv:2104.13478.
- Finzi, M.; Stanton, S.; Izmailov, P.; Wilson, A.G. Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data. In Proceedings of the 37th International Conference on Machine Learning, Virtual, 13–18 July 2020. Code available online: https://github.com/mfinzi/LieConv (accessed on 28 April 2025).
- Esteves, C. Theoretical Aspects of Group Equivariant Neural Networks. arXiv 2020, arXiv:2004.05154.
- Yildirim, Ö. A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification. Comput. Biol. Med. 2018, 96, 189–202.
- Velarde, O.; Parra, L.; Boldi, P.; Makse, H. The Role of Fibration Symmetries in Geometric Deep Learning. arXiv 2024, arXiv:2408.15894.
- Hu, M.-K. Visual pattern recognition by moment invariants. IEEE Trans. Inf. Theory 1962, 8, 179–187.
- Larochelle, H.; Erhan, D.; Courville, A.; Bergstra, J.; Bengio, Y. An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, USA, 20–24 June 2007; pp. 473–480.
- Goldberger, A.L.; Amaral, L.A.N.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.-K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet. Circulation 2000, 101, E215–E220.
- Chen, C.; Li, O.; Tao, D.; Barnett, A.; Rudin, C.; Su, J.K. This looks like that: Deep learning for interpretable image recognition. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
- Acharya, U.R.; Fujita, H.; Oh, S.L.; Hagiwara, Y.; Tan, J.H.; Adam, M. Application of deep convolutional neural network for automated detection of myocardial infarction using ECG signals. Inf. Sci. 2017, 415–416, 190–198.
| Feature/Aspect | G-CNNs [2] | DeepSets [4] | E(n)-GNNs [11] | SALF (Proposed) | 
|---|---|---|---|---|
| Primary Data Domain | Images/grids | Sets/sequences | Graphs with geometric structure | Images, sequences, graphs, signals | 
| Symmetry Type Handled | Rotation, reflection | Permutation | Permutation, translation, rotation | Multiple (rotation, reflection, permutation, geometric) in unified framework | 
| Equivariance Enforcement | Hard-coded group convolutions | Order-invariant pooling | E(n)-equivariant message passing | Tunable symmetry preservation with asymmetry relaxation | 
| Asymmetry Handling | None | None | None | Asymmetry-driven regularization for semantically relevant variations | 
| Cross-Domain Applicability | No | Limited | Limited | Yes (single embedding space across modalities) | 
| Interpretability Integration | External post hoc tools | External post hoc tools | Limited | Built-in interpretability layer aligned to symmetry-aware embeddings | 
| Computational Flexibility | Domain-specific optimization only | Domain-specific optimization only | Domain-specific optimization only | Modular hybridization with existing ML pipelines | 
| Authors (Year) | Contribution | Type of Symmetry Addressed | Domain/Application | 
|---|---|---|---|
| Hu, 1962 [31] | Introduced moment invariants for shape recognition | Reflectional, Rotational | Pattern recognition | 
| Zaheer et al., 2017 [4] | Proposed DeepSets model for permutation-invariant functions | Permutation Invariance | Set learning, Point clouds | 
| Chen et al., 2017 [7] | Proposed the variational lossy autoencoder for structured latent representations | Latent Space Symmetry | Representation learning | 
| Finzi et al., 2020 [27] | Generalized equivariance beyond standard CNNs | Group Symmetry (Lie groups) | Neural networks | 
| LeCun et al., 1998 [5] | Used CNNs to exploit translation symmetry in images | Translational | Document/image recognition | 
| Cohen & Welling, 2016 [2] | Introduced Group Equivariant CNNs | Reflectional, Rotational | Image classification | 
| Kipf & Welling, 2017 [6] | Developed GCNs respecting graph node permutation symmetry | Permutation | Semi-supervised node classification | 
| Satorras et al., 2021 [11] | Introduced E(n)-equivariant GNNs | Geometric (Euclidean) | Molecular graphs | 
| Robberechts et al., 2021 [8] | Used asymmetry constraints for anomaly detection | Asymmetry in embeddings | Anomaly detection | 
| Vaswani et al., 2017 [19] | Developed attention mechanism with directional asymmetry | Directional Asymmetry | NLP, Transformers | 
| Dhurandhar et al., 2018 [22] | Proposed contrastive explanations using symmetry in feature space | Symmetric perturbation | Explainable AI | 
| Haarnoja et al., 2018 [23] | Used symmetric stochastic policies for safe reinforcement learning | Policy Symmetry | Deep RL | 
| Cohen et al., 2019 [25] | Gauge Equivariant CNNs for data on manifolds | Local gauge equivariance | Climate, biomedical imaging | 
| Bronstein et al., 2021 [26] | Unified theory of geometric deep learning | Grid, graph, manifold equivariance | Vision, language, 3D modeling | 
| Finzi et al., 2020 [27] | Equivariant layers via Lie group convolutions | General Lie group equivariance | Graphs, scientific modeling | 
| Esteves, 2020 [28] | Equivariance–expressivity trade-off analysis | Rotational, reflectional, permutation | Representation learning, robustness | 
| Yildirim, 2018 [29] | Used time-reversal augmentation in deep bidirectional LSTM networks for ECG signal classification | Temporal (Time-reversal) | Biomedical signal processing (ECG) | 
| Santos-Escriche & Jegelka, 2025 [13] | SEMoLA framework for automatically discovering unknown symmetries via learnable Lie algebra augmentations | Unknown continuous Lie groups | General ML, molecular property prediction | 
| Nguyen et al., 2024 [17] | Theoretical framework for equivariant quantum neural networks with efficient construction algorithms | Arbitrary quantum symmetry groups | Quantum machine learning | 
| Kaba & Ravanbakhsh, 2023 [20] | Relaxed equivariance theory enabling controlled symmetry breaking at individual sample levels | Approximate/relaxed symmetries | General neural networks | 
| Ruhe et al., 2023 [15] | Clifford algebra-based neural networks for O(n) and E(n) equivariance | Orthogonal and Euclidean groups | 3D physics, high-energy physics | 
| Keller, 2025 [16] | Flow equivariant RNNs for temporal transformation symmetries via one-parameter Lie subgroups | Temporal flow symmetries | Time-series, dynamical systems | 
| Crabbé & van der Schaar, 2023 [24] | Explanation invariance and equivariance framework for robust interpretability methods | Explanation-level symmetries | Explainable AI | 
| Pearce-Crump & Knottenbelt, 2024 [18] | Neural networks equivariant to complete graph automorphism groups with theoretical characterization | Graph automorphism symmetries | Graph neural networks | 
| Hofgard et al., 2024 [21] | Relaxed E(3) equivariant GNNs with adaptive symmetry breaking for approximate physical systems | Relaxed E(3) symmetries | 3D molecular systems | 
| Ziyin et al., 2025 [14] | Parameter symmetry breaking/restoration as unifying principle for hierarchical learning dynamics | Parameter space symmetries | Deep learning optimization | 
| Velarde et al., 2024 [30] | Fibration symmetries for local graph regularities while relaxing global geometric constraints | Local/hierarchical graph symmetries | Geometric deep learning | 
| Symmetry Type | Definition | Example in ML | 
|---|---|---|
| Translational | Invariance to shifts in space or time | CNNs in image tasks | 
| Rotational | Invariance to angular rotation | Object recognition, point clouds | 
| Reflectional | Invariance to mirroring (flipping) | Bilateral symmetry in faces | 
| Permutation | Invariance to ordering of inputs | DeepSets, GNNs | 
| Scaling | Invariance to magnitude rescaling | Feature normalization | 
| Data Type | Symmetry Type | Method Used | Implementation Notes | 
|---|---|---|---|
| Images | Rotation, Reflection | Group Equivariant CNN [2] | SO(2), D4 symmetry groups | 
| Sets/Sequences | Permutation | DeepSets [4] | Order-invariant pooling | 
| Graphs | Permutation, Geometry | GCN, E(n)-GNN [6,11] | Node ordering independent | 
| Signals (e.g., ECG) | Reflection | Time reversal, signal flipping | Symmetry-based augmentation | 
| Dataset | Type | Symmetry Present | Classes | Size | Input Dim. | 
|---|---|---|---|---|---|
| RotMNIST | Image | Rotational (SO(2)) | 10 | 60,000 | 28 × 28 | 
| MIT-BIH ECG | Signal | Reflectional (time) | 2 | 100,000+ | 1D, 187 samples | 
| PROTEINS | Graph | Permutation (Sn) | 2 | 1113 | Varies | 
| CIFAR-10 (Rotated) | Image | Rotational (SO(2)) | 10 | 60,000 | 32 × 32 × 3 | 
| MIMIC-III (ECG Subset) | Biomedical Signal | Temporal Reflection & Scaling | 5 | 125,000 | 1D, 250–500 samples | 
| λ | RotMNIST Accuracy (%) | ECG Accuracy (%) | PROTEINS Accuracy (%) | CIFAR-10 (Rotated) Accuracy (%) | MIMIC-III ECG F1-Score (%) | Avg TR% (↓ Better) | 
|---|---|---|---|---|---|---|
| 0.1 | 94.9 | 93.2 | 82.1 | 89.8 | 90.9 | 5.1 | 
| 0.2 | 95.4 | 94.1 | 83.0 | 90.6 | 91.8 | 4.5 | 
| 0.3 | 95.6 | 94.7 | 83.6 | 91.2 | 92.8 | 4.1 | 
| 0.5 | 94.8 | 94.0 | 82.9 | 90.3 | 92.1 | 4.8 | 
| 0.7 | 93.9 | 93.4 | 82.0 | 89.5 | 91.0 | 5.3 | 
| Model | Dataset | Accuracy (%) | Precision (%) | Recall (%) | F1 Score | 
|---|---|---|---|---|---|
| CNN (Baseline) | RotMNIST | 84.3 | 83.1 | 84.7 | 0.832 | 
| G-CNN (Symmetric only) | RotMNIST | 92.1 | 91.8 | 92.4 | 0.912 | 
| SALF (Ours) | RotMNIST | 95.6 | 95.3 | 95.9 | 0.948 | 
| 1D-CNN (Baseline) | ECG | 89.5 | 88.7 | 89.9 | 0.882 | 
| Symmetric 1D-CNN | ECG | 91.2 | 90.9 | 91.3 | 0.894 | 
| SALF (Ours) | ECG | 94.7 | 94.5 | 94.8 | 0.938 | 
| GCN (Baseline) | PROTEINS | 75.3 | 74.8 | 75.5 | 0.728 | 
| E(n)-GNN | PROTEINS | 79.4 | 78.9 | 79.7 | 0.766 | 
| SALF (Ours) | PROTEINS | 83.6 | 83.1 | 83.8 | 0.812 | 
| CNN (Baseline) | Rotated CIFAR-10 | 84.9 | 84.2 | 85.5 | 0.849 | 
| Symmetric CNN (G-CNN variant) | Rotated CIFAR-10 | 89.0 | 88.7 | 89.3 | 0.890 | 
| SALF (Ours) | Rotated CIFAR-10 | 91.2 | 91.0 | 91.4 | 0.912 | 
| 1D-CNN (Baseline) | MIMIC-III ECG Subset | 88.1 | 87.6 | 88.7 | 0.881 | 
| Symmetric 1D-CNN | MIMIC-III ECG Subset | 90.1 | 89.8 | 90.4 | 0.902 | 
| SALF (Ours) | MIMIC-III ECG Subset | 92.5 | 92.7 | 92.9 | 0.928 | 
| Model | RotMNIST (Δ Acc, pp) | ECG (Δ Acc, pp) | PROTEINS (Δ Acc, pp) | CIFAR-10 (Rotated) (Δ Acc, pp) | MIMIC-III ECG Subset (Δ Acc, pp) | 
|---|---|---|---|---|---|
| Baseline | −17.2 | −10.1 | −13.6 | −15.4 | −9.7 | 
| Symmetric-only | −9.3 | −6.5 | −7.8 | −8.7 | −5.8 | 
| SALF (Ours) | −4.1 | −3.0 | −4.6 | −4.3 | −2.9 | 
| Configuration | RotMNIST Accuracy (%) | ECG Accuracy (%) | PROTEINS Accuracy (%) | Rotated CIFAR-10 Accuracy (%) | MIMIC-III ECG F1-Score (%) | Interpretability Score (1–5) | 
|---|---|---|---|---|---|---|
| Baseline Model | 84.3 | 89.5 | 75.3 | 84.9 | 88.1 | 3.0 | 
| + Symmetry-Aware Encoder | 92.1 | 91.2 | 79.4 | 89.0 | 90.1 | 3.8 | 
| + Asymmetry-Driven Regularization (Full SALF) | 95.6 | 94.7 | 83.6 | 91.2 | 92.5 | 4.5 | 
| Model | RotMNIST (Visual Interpretability) | ECG (Clinical Relevance) | PROTEINS (Structural Clarity) | 
|---|---|---|---|
| Baseline | 2.5 | 3.0 | 2.8 | 
| SALF | 4.6 | 4.8 | 4.5 | 
| Metric | Dataset | Baseline CNN | SALF | Improvement | 
|---|---|---|---|---|
| Pointing Game Accuracy (PGA) | RotMNIST | 0.71 | 0.87 | +22.5% | 
| Pointing Game Accuracy (PGA) | MIT-BIH ECG | 0.73 | 0.86 | +17.8% | 
| Deletion Area Under Curve (lower is better) | RotMNIST | 0.41 | 0.29 | −29.3% | 
| Insertion Area Under Curve (higher is better) | MIT-BIH ECG | 0.56 | 0.71 | +26.8% | 
| Spearman Correlation with Expert Ranking | MIT-BIH ECG | 0.64 | 0.82 | +28.1% | 
| Model | RotMNIST (Acc%) | ECG (F1 Score) | PROTEINS (Acc%) | Interpretability Integration | 
|---|---|---|---|---|
| G-CNN | 93.2 | — | — | External Grad-CAM (post hoc) | 
| DeepSets | — | 84.1 | — | External SHAP (post hoc) | 
| E(n)-GNN | — | — | 76.5 | Limited (GNNExplainer) | 
| ProtoPNet [34] | 91.2 | — | — | Built-in prototypes | 
| LIME-Regularized CNN [35] | 90.8 | — | — | Local surrogate explanations | 
| SALF (Ours) | 95.8 | 88.9 | 80.4 | Built-in, symmetry-consistent SHAP/Grad-CAM/GNNExplainer | 
| Model | Params (M) | Training Time/Epoch (s) | Inference Time/Batch (ms) | Memory Usage (GB) | 
|---|---|---|---|---|
| Baseline CNN | 4.2 | 22.5 | 4.8 | 2.1 | 
| SALF-CNN | 4.9 | 25.8 (+14.7%) | 5.3 (+10.4%) | 2.3 (+9.5%) | 
| Baseline GCN | 1.8 | 14.2 | 3.1 | 1.5 | 
| SALF-GNN | 2.1 | 15.8 (+11.3%) | 3.4 (+9.7%) | 1.6 (+6.6%) | 