SymmetryLens: Unsupervised Symmetry Learning via Locality and Density Preservation
Abstract
1. Introduction
- The formulation of a new approach to symmetry learning via maps that respect both symmetry and (a proposed notion of) locality, coupled in a natural way. The outputs of the method are both minimal generators of the symmetry and a symmetry-based data representation that, in a sense, makes the hidden symmetry manifest. This representation can be used as input for a regular CNN, allowing the model to work as an adapter between raw data and the whole machinery of CNNs (a sketch of this adapter usage follows this list).
- The formulation of an information-theoretic loss function encapsulating the formulated symmetry and locality properties.
- The development of optimization techniques (“time-dependent rank”, see Section 4.3.1) that result in highly robust and reproducible results.
- The demonstration of the symmetry recovery and symmetry-based representation capabilities on qualitatively different examples, including 1D pixel translation symmetries, a shuffled version of pixel translations, and frequency shifts, using dataset dimensionalities as high as 33 (i.e., symmetry generators of shape 33 × 33).
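To make the adapter idea above concrete, here is a minimal, hypothetical sketch (not the authors' implementation): a learned group convolution matrix `L` maps raw samples to the symmetry-based representation, which a standard 1D CNN then consumes. The matrix `L`, the toy CNN, and all sizes are placeholder assumptions.

```python
import torch
import torch.nn as nn

d = 33                      # data dimensionality (the paper's largest examples)
L = torch.eye(d)            # placeholder for the learned group convolution matrix

# The symmetry-based representation: each raw sample x is mapped to L @ x,
# in which the hidden symmetry acts as an ordinary translation.
raw_batch = torch.randn(16, d)            # toy batch of raw samples
rep_batch = raw_batch @ L.T               # apply L row-wise: (batch, d)

# A standard 1D CNN can now consume the representation directly, since its
# built-in translation equivariance matches the now-manifest symmetry.
cnn = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
logits = cnn(rep_batch.unsqueeze(1))      # shape: (batch, 10)
```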
2. Related Work
3. Theoretical Setting and Data Model
3.1. The Continuous Setting
3.1.1. Introduction
3.1.2. Processes with Symmetry and Locality Under 1-Dimensional Translations
3.1.3. Processes with Symmetry and Locality Under a General 1-Dimensional Group Action
3.2. The Discrete Setting: Synthetic Data Generation
3.2.1. Discrete Translation Symmetry
3.2.2. General Representations: Behavior Under Orthogonal/Unitary Transformations
4. Materials and Methods
4.1. The Setup: Objects to Learn
4.1.1. Symmetry Generator
4.1.2. The Resolving Filter
4.1.3. The Group Convolution Matrix
4.2. The Loss Function
4.2.1. Stationarity/Uniformity
4.2.2. Locality
4.2.3. Information Preservation
4.2.4. Total Loss
4.2.5. On the Coupled Effect of the Alignment and Resolution Terms
4.3. Training the Model
4.3.1. Training the Group Convolution Layer
4.3.2. Training Probability Density Estimators
5. Results
5.1. Results on Synthetic and Real Data
- The group convolution operator L formed by combining the learned symmetry generator with the learned resolving filter as in (11) is approximately equal to the underlying transformation used in generating the dataset (see Figure 7b–Figure 10b). As a result, the matrix L is highly successful in reconstructing the underlying hidden local signals, resulting in a symmetry-based representation (see Figure 11).
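Since Equation (11) itself is not reproduced in this outline, the following sketch shows one plausible reading of the construction: the k-th row of L is the resolving filter transported by the k-th power of the learned generator. The names `U` (generator) and `w` (resolving filter) are assumptions for illustration, not the authors' code.

```python
import numpy as np

def group_convolution_matrix(U: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Stack the orbit of the resolving filter w under the generator U.

    Row k of the returned matrix is U^k w, so applying the matrix to a
    sample x yields the inner products <U^k w, x>, i.e., a group
    convolution of x with w (one plausible reading of Eq. (11))."""
    d = w.shape[0]
    rows = []
    v = w.copy()
    for _ in range(d):
        rows.append(v)
        v = U @ v                  # transport the filter by one group step
    return np.stack(rows)          # shape (d, d)

# Toy check with a cyclic-shift generator: L becomes a circulant matrix,
# i.e., an ordinary circular convolution with w.
d = 7
U = np.roll(np.eye(d), 1, axis=0)  # cyclic shift generator
w = np.random.randn(d)
L = group_convolution_matrix(U, w)
```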
5.2. Comparison with Other Unsupervised Symmetry Learning Approaches
6. Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Estimation Procedures
Appendix A.1. Probability Estimation
Appendix A.1.1. Marginal Probability Estimation
Appendix A.1.2. Conditional Probability Estimation
Estimated Quantity | Output Activation Function |
---|---|
Kernel weights | Softmax over the kernel axis |
Mean value of each kernel | None (linear) |
Variances of each kernel | Scaled hyperbolic tangent followed by exponentiation |
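A minimal sketch of a Gaussian-mixture density head consistent with the activation table above is given below; the layer shapes, names, and the log-variance scale are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MixtureHead(nn.Module):
    """Gaussian-kernel density head matching the activation table:
    softmax over the kernel axis for weights, linear means, and
    scaled-tanh-then-exp variances. A sketch, not the authors' code."""

    def __init__(self, in_dim: int, n_kernels: int, log_var_scale: float = 5.0):
        super().__init__()
        self.weights = nn.Linear(in_dim, n_kernels)
        self.means = nn.Linear(in_dim, n_kernels)
        self.log_vars = nn.Linear(in_dim, n_kernels)
        self.log_var_scale = log_var_scale  # assumed bound on the log-variance

    def forward(self, h: torch.Tensor):
        pi = torch.softmax(self.weights(h), dim=-1)  # weights: softmax over kernels
        mu = self.means(h)                           # means: linear activation
        # scaled tanh keeps the log-variance bounded; exp makes it positive
        var = torch.exp(self.log_var_scale * torch.tanh(self.log_vars(h)))
        return pi, mu, var

head = MixtureHead(in_dim=16, n_kernels=8)
pi, mu, var = head(torch.randn(4, 16))   # each of shape (4, 8)
```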
Appendix A.2. Multidimensional Entropy Estimation
Appendix A.3. KL Divergence Estimation
Appendix A.3.1. KL Divergence of Marginal Probabilities
Appendix A.3.2. KL Divergence of Conditional Probabilities
Appendix B. Equivariance of the Group Convolution Layer
Appendix C. Datasets
ID | Signal Type | Invariance | Dimensionality | Parameter Ranges | Size (Samples) |
---|---|---|---|---|---|
1 | Gaussian | Circulant translation | 7 | Scale: Amplitude: | 63 K |
2 | Gaussian | Translation | 7 | Scale: Amplitude: | 252 K |
3 | MNIST slices | Translation | 27 | — | 1.08 M |
4 | Gaussian | Translation | 33 | Scale: Amplitude: | 1.65 M |
5 | Legendre (, ) | Translation | 33 | Scale: Amplitude: | 1.65 M |
6 | Gaussian | Permuted translation | 33 | Scale: Amplitude: | 1.65 M |
7 | Legendre (, ) | Permuted translation | 33 | Scale: Amplitude: | 1.65 M |
8 | Gaussian | Frequency shift | 33 | Scale: Amplitude: | 1.65 M |
9 | Legendre (, ) | Frequency shift | 33 | Scale: Amplitude: | 1.65 M |
Appendix C.1. Details of Synthetic Data
- We select a parametrized family of local basis signals, such as the family of Gaussians parametrized by amplitude, center, and width.
- We use a binomial distribution (with a fixed success probability and number of trials) to determine the number of local signals to include in each sample.
- We sample the parameters of each basis signal (e.g., center, width, and amplitude) from uniform distributions over finite ranges to obtain each local signal.
- We add up the local pieces to obtain a single sample that has information locality under component translations (we call this the raw sample).
- We apply an appropriate unitary transformation to the raw sample to obtain locality/symmetry under the desired group action.
- Finally, we add Gaussian noise (with a fixed standard deviation) to the sample; a minimal sketch of this pipeline is given below.
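The following is a minimal NumPy sketch of the pipeline above under placeholder settings; the binomial parameters, uniform ranges, and noise level used in the paper are not reproduced in this outline, so the values below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 33                                   # sample dimensionality

def gaussian_basis(center, width, amplitude, d):
    """A local Gaussian bump evaluated on the d components."""
    x = np.arange(d)
    return amplitude * np.exp(-0.5 * ((x - center) / width) ** 2)

def make_sample(Q=None, p=0.5, n_trials=4, noise_std=0.05):
    """One synthetic sample; p, n_trials, noise_std are placeholders."""
    k = rng.binomial(n_trials, p)        # number of local signals
    raw = np.zeros(d)
    for _ in range(k):                   # sum the local pieces: the raw sample
        center = rng.uniform(0, d)
        width = rng.uniform(0.5, 2.0)
        amplitude = rng.uniform(0.5, 1.5)
        raw += gaussian_basis(center, width, amplitude, d)
    if Q is not None:                    # unitary map hides the symmetry
        raw = Q @ raw
    return raw + rng.normal(0.0, noise_std, size=d)  # additive Gaussian noise
```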
Basis Signal Types
Appendix C.2. Details of Real Data
Appendix D. Details of the Comparison with GAN-Based Methods
Appendix E. Hyperparameters and Loss Terms
Appendix E.1. Initial Choice of Hyperparameters
Estimator Learning Rate | Model Learning Rate | Total Learning Rate Decay | Alignment Coefficient | Uniformity Coefficient | Resolution Coefficient | Information Preservation Coefficient |
---|---|---|---|---|---|---|
2.5 | 0.1 | 0.1 | 1.0 | 2.0 | 1.0 | 2.0 |
Appendix E.2. Elementary Effect Sensitivity Analysis
- μ: Mean effect;
- μ*: Mean absolute effect;
- σ: Standard deviation of the elementary effects.
Parameter | μ | μ* [95% CI] | σ |
---|---|---|---|
Estimator learning rate | 0.08981 | 0.09123 [0.08902] | 0.10771 |
Model learning rate | −0.03309 | 0.04060 [0.04458] | 0.06588 |
Total learning rate decay | −0.03529 | 0.03682 [0.05176] | 0.06890 |
Alignment coefficient | 0.02523 | 0.05384 [0.05455] | 0.08922 |
Uniformity coefficient | 0.05345 | 0.05619 [0.09583] | 0.10371 |
Resolution coefficient | −0.04455 | 0.04938 [0.06613] | 0.08628 |
Information preservation coefficient | 0.07904 | 0.07909 [0.09634] | 0.11925 |
Noise | −0.00298 | 0.01124 [0.00578] | 0.01424 |
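The statistics above are the standard Morris elementary-effect measures; the sketch below shows how such an analysis could be set up with SALib (cited in the references). Here `train_and_score` is a hypothetical stand-in for an actual SymmetryLens training run, and the bounds are rough placeholders spanning the trajectory table that follows.

```python
import numpy as np
from SALib.sample.morris import sample
from SALib.analyze import morris

problem = {
    "num_vars": 8,
    "names": ["estimator_lr", "model_lr", "lr_decay", "alignment",
              "uniformity", "resolution", "info_preservation", "noise"],
    # placeholder bounds roughly spanning the trajectory table below
    "bounds": [[1.875, 3.125], [0.075, 0.125], [0.1, 0.2], [0.75, 1.25],
               [1.5, 2.5], [0.75, 1.25], [0.75, 1.25], [0.0, 0.2]],
}

def train_and_score(x: np.ndarray) -> float:
    """Hypothetical: train the model with hyperparameters x and return the
    cosine similarity between the learned and true generators."""
    raise NotImplementedError

X = sample(problem, N=4, num_levels=4)            # Morris trajectories
# Y = np.array([train_and_score(x) for x in X])   # real usage
Y = np.random.default_rng(0).random(X.shape[0])   # stand-in so the sketch runs
Si = morris.analyze(problem, X, Y, num_levels=4, conf_level=0.95)
# Si["mu"], Si["mu_star"], Si["sigma"] correspond to the table columns above
```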
Run | Estimator Learning Rate | Model Learning Rate | Total Learning Rate Decay | Alignment Coefficient | Uniformity Coefficient | Resolution Coefficient | Information Preservation Coefficient | Noise | Cosine Similarity |
---|---|---|---|---|---|---|---|---|---|
1 | 2.292 | 0.075 | 0.167 | 1.083 | 2.167 | 1.250 | 0.917 | 0.133 | 0.816 |
2 | 3.125 | 0.075 | 0.167 | 1.083 | 2.167 | 1.250 | 0.917 | 0.133 | 0.955 |
3 | 3.125 | 0.075 | 0.167 | 1.083 | 1.500 | 1.250 | 0.917 | 0.133 | 0.815 |
4 | 3.125 | 0.075 | 0.167 | 1.083 | 1.500 | 1.250 | 0.917 | 0.000 | 0.827 |
5 | 3.125 | 0.075 | 0.167 | 1.083 | 1.500 | 0.917 | 0.917 | 0.000 | 0.943 |
6 | 3.125 | 0.108 | 0.167 | 1.083 | 1.500 | 0.917 | 0.917 | 0.000 | 0.856 |
7 | 3.125 | 0.108 | 0.100 | 1.083 | 1.500 | 0.917 | 0.917 | 0.000 | 0.948 |
8 | 3.125 | 0.108 | 0.100 | 0.750 | 1.500 | 0.917 | 0.917 | 0.000 | 0.843 |
9 | 3.125 | 0.108 | 0.100 | 0.750 | 1.500 | 0.917 | 1.250 | 0.000 | 0.999 |
10 | 1.875 | 0.092 | 0.167 | 1.083 | 2.500 | 1.083 | 1.083 | 0.067 | 0.885 |
11 | 2.708 | 0.092 | 0.167 | 1.083 | 2.500 | 1.083 | 1.083 | 0.067 | 0.988 |
12 | 2.708 | 0.092 | 0.167 | 1.083 | 2.500 | 0.750 | 1.083 | 0.067 | 0.982 |
13 | 2.708 | 0.092 | 0.167 | 1.083 | 1.833 | 0.750 | 1.083 | 0.067 | 0.976 |
14 | 2.708 | 0.092 | 0.167 | 0.750 | 1.833 | 0.750 | 1.083 | 0.067 | 0.997 |
15 | 2.708 | 0.092 | 0.167 | 0.750 | 1.833 | 0.750 | 0.750 | 0.067 | 0.997 |
16 | 2.708 | 0.125 | 0.167 | 0.750 | 1.833 | 0.750 | 0.750 | 0.067 | 0.986 |
17 | 2.708 | 0.125 | 0.167 | 0.750 | 1.833 | 0.750 | 0.750 | 0.200 | 0.997 |
18 | 2.708 | 0.125 | 0.100 | 0.750 | 1.833 | 0.750 | 0.750 | 0.200 | 0.997 |
19 | 3.125 | 0.125 | 0.100 | 0.750 | 2.500 | 1.083 | 1.083 | 0.000 | 0.990 |
20 | 3.125 | 0.125 | 0.100 | 0.750 | 2.500 | 0.750 | 1.083 | 0.000 | 0.999 |
21 | 3.125 | 0.092 | 0.100 | 0.750 | 2.500 | 0.750 | 1.083 | 0.000 | 0.992 |
22 | 3.125 | 0.092 | 0.100 | 1.083 | 2.500 | 0.750 | 1.083 | 0.000 | 0.988 |
23 | 3.125 | 0.092 | 0.100 | 1.083 | 1.833 | 0.750 | 1.083 | 0.000 | 0.987 |
24 | 2.292 | 0.092 | 0.100 | 1.083 | 1.833 | 0.750 | 1.083 | 0.000 | 0.987 |
25 | 2.292 | 0.092 | 0.167 | 1.083 | 1.833 | 0.750 | 1.083 | 0.000 | 0.997 |
26 | 2.292 | 0.092 | 0.167 | 1.083 | 1.833 | 0.750 | 1.083 | 0.133 | 0.994 |
27 | 2.292 | 0.092 | 0.167 | 1.083 | 1.833 | 0.750 | 0.750 | 0.133 | 0.993 |
28 | 3.125 | 0.108 | 0.133 | 0.917 | 2.167 | 1.083 | 1.083 | 0.133 | 0.997 |
29 | 3.125 | 0.108 | 0.200 | 0.917 | 2.167 | 1.083 | 1.083 | 0.133 | 0.994 |
30 | 3.125 | 0.108 | 0.200 | 0.917 | 2.167 | 0.750 | 1.083 | 0.133 | 0.993 |
31 | 3.125 | 0.075 | 0.200 | 0.917 | 2.167 | 0.750 | 1.083 | 0.133 | 0.991 |
32 | 3.125 | 0.075 | 0.200 | 0.917 | 1.500 | 0.750 | 1.083 | 0.133 | 0.994 |
33 | 2.292 | 0.075 | 0.200 | 0.917 | 1.500 | 0.750 | 1.083 | 0.133 | 0.995 |
34 | 2.292 | 0.075 | 0.200 | 1.250 | 1.500 | 0.750 | 1.083 | 0.133 | 0.982 |
35 | 2.292 | 0.075 | 0.200 | 1.250 | 1.500 | 0.750 | 1.083 | 0.000 | 0.987 |
36 | 2.292 | 0.075 | 0.200 | 1.250 | 1.500 | 0.750 | 0.750 | 0.000 | 0.818 |
Appendix E.3. Ablation Study
Appendix F. Complexity Analysis
Appendix F.1. Time Complexity
Operation | Description | Time Complexity |
---|---|---|
Probability estimation | Estimation of per-component probabilities via Gaussian kernels. | O(·) |
Conditional probability estimation | Estimation of conditional probabilities via Gaussian kernels parametrized by three neural networks. | O(·) |
Forming the group convolution matrix | Formed by applying the generator to the resolving filter. We use an eigendecomposition-based algorithm for efficiency. | O(·) |
Joint entropy computation (covariance step) | Computing the covariance matrix. | O(·) |
Joint entropy computation (eigendecomposition step) | Eigendecomposition of the d × d covariance matrix. | O(d³) |
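The two joint-entropy rows describe what amounts to a Gaussian-approximation entropy computed through an eigendecomposition of the covariance matrix. The sketch below illustrates that computation under the Gaussian assumption; it is not the paper's estimator verbatim.

```python
import numpy as np

def gaussian_joint_entropy(samples: np.ndarray) -> float:
    """Joint entropy under a Gaussian approximation:
    H = 0.5 * sum_i log(2*pi*e * lambda_i), where lambda_i are the
    eigenvalues of the covariance matrix. The covariance step costs
    O(b d^2) for b samples of dimension d; the eigendecomposition O(d^3)."""
    cov = np.cov(samples, rowvar=False)       # covariance step
    eigvals = np.linalg.eigvalsh(cov)         # eigendecomposition step
    eigvals = np.clip(eigvals, 1e-12, None)   # guard against round-off
    return 0.5 * float(np.sum(np.log(2.0 * np.pi * np.e * eigvals)))

H = gaussian_joint_entropy(np.random.default_rng(0).normal(size=(1000, 7)))
```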
Appendix F.2. Space Complexity
Object | Description | Space Complexity |
---|---|---|
Probability estimator | Estimation of probabilities via Gaussian kernels. | O(·) |
Conditional probability estimator | Estimation of conditional probabilities via Gaussian kernels parametrized by three neural networks. | O(·) |
Generator | The learned symmetry generator, stored as a d × d matrix. | O(d²) |
Resolving filter | The learned resolving filter, a d-dimensional vector. | O(d) |
References
- Cohen, T.; Welling, M. Group equivariant convolutional networks. In Proceedings of the International Conference on Machine Learning, PMLR, New York, NY, USA, 19–24 June 2016; pp. 2990–2999.
- Babelon, O.; Bernard, D.; Talon, M. Introduction to Classical Integrable Systems; Cambridge Monographs on Mathematical Physics; Cambridge University Press: Cambridge, UK, 2003.
- Hairer, E.; Lubich, C.; Wanner, G. Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations, 2nd ed.; Springer Series in Computational Mathematics; Springer: Berlin/Heidelberg, Germany, 2010; Volume 31.
- Higgins, I.; Racanière, S.; Rezende, D. Symmetry-based representations for artificial and biological general intelligence. Front. Comput. Neurosci. 2022, 16, 836498.
- Anselmi, F.; Patel, A.B. Symmetry as a guiding principle in artificial and brain neural networks. Front. Comput. Neurosci. 2022, 16, 1039572.
- Bronstein, M.M.; Bruna, J.; Cohen, T.; Veličković, P. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges. arXiv 2021, arXiv:2104.13478.
- Anselmi, F.; Evangelopoulos, G.; Rosasco, L.; Poggio, T. Symmetry-adapted representation learning. Pattern Recognit. 2019, 86, 201–208.
- Helgason, S. Differential Geometry, Lie Groups, and Symmetric Spaces; Academic Press: Cambridge, MA, USA, 1979.
- Desai, K.; Nachman, B.; Thaler, J. Symmetry discovery with deep learning. Phys. Rev. D 2022, 105, 096031.
- Ozakin, A.; Vasiloglou, N.; Gray, A.G. Density-Preserving Maps. In Manifold Learning: Theory and Applications; Ma, Y., Fu, Y., Eds.; CRC Press: Boca Raton, FL, USA, 2011.
- Weinberg, S. What is Quantum Field Theory, and What Did We Think It Is? arXiv 1997, arXiv:hep-th/9702027.
- Benton, G.; Finzi, M.; Izmailov, P.; Wilson, A.G. Learning invariances in neural networks from training data. Adv. Neural Inf. Process. Syst. 2020, 33, 17605–17616.
- Romero, D.W.; Lohit, S. Learning partial equivariances from data. arXiv 2021, arXiv:2110.10211.
- Zhou, A.; Knowles, T.; Finn, C. Meta-learning symmetries by reparameterization. arXiv 2020, arXiv:2007.02933.
- Craven, S.; Croon, D.; Cutting, D.; Houtz, R. Machine learning a manifold. Phys. Rev. D 2022, 105, 096030.
- Forestano, R.T.; Matchev, K.T.; Matcheva, K.; Roman, A.; Unlu, E.B.; Verner, S. Accelerated discovery of machine-learned symmetries: Deriving the exceptional Lie groups G2, F4 and E6. Phys. Lett. B 2023, 847, 138266.
- Forestano, R.T.; Matchev, K.T.; Matcheva, K.; Roman, A.; Unlu, E.B.; Verner, S. Deep learning symmetries and their Lie groups, algebras, and subalgebras from first principles. Mach. Learn. Sci. Technol. 2023, 4, 025027.
- Krippendorf, S.; Syvaeri, M. Detecting symmetries with neural networks. Mach. Learn. Sci. Technol. 2020, 2, 015010.
- Sohl-Dickstein, J.; Wang, C.M.; Olshausen, B.A. An unsupervised algorithm for learning Lie group transformations. arXiv 2010, arXiv:1001.1027.
- Dehmamy, N.; Walters, R.; Liu, Y.; Wang, D.; Yu, R. Automatic symmetry discovery with Lie algebra convolutional network. Adv. Neural Inf. Process. Syst. 2021, 34, 2503–2515.
- Greydanus, S.; Dzamba, M.; Yosinski, J. Hamiltonian neural networks. Adv. Neural Inf. Process. Syst. 2019, 32, 15353.
- Alet, F.; Doblar, D.; Zhou, A.; Tenenbaum, J.; Kawaguchi, K.; Finn, C. Noether networks: Meta-learning useful conserved quantities. Adv. Neural Inf. Process. Syst. 2021, 34, 16384–16397.
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv 2014, arXiv:1406.2661.
- Yang, J.; Walters, R.; Dehmamy, N.; Yu, R. Generative adversarial symmetry discovery. In Proceedings of the International Conference on Machine Learning, PMLR, Honolulu, HI, USA, 23–29 July 2023; pp. 39488–39508.
- Yang, J.; Dehmamy, N.; Walters, R.; Yu, R. Latent Space Symmetry Discovery. arXiv 2023, arXiv:2310.00105.
- Tombs, R.; Lester, C.G. A method to challenge symmetries in data with self-supervised learning. J. Instrum. 2022, 17, P08024.
- Doob, J. Stochastic Processes; Wiley: Hoboken, NJ, USA, 1991.
- Folland, G.B. A Course in Abstract Harmonic Analysis; CRC Press: Boca Raton, FL, USA, 2016.
- Bell, A.J.; Sejnowski, T.J. An information-maximization approach to blind separation and blind deconvolution. Neural Comput. 1995, 7, 1129–1159.
- Cover, T. Elements of Information Theory; Wiley Series in Telecommunications and Signal Processing; Wiley-India: New Delhi, India, 1999.
- Pichler, G.; Colombo, P.J.A.; Boudiaf, M.; Koliander, G.; Piantanida, P. A differential entropy estimator for training neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 17691–17715.
- Herman, J.; Usher, W. SALib: An open-source Python library for sensitivity analysis. J. Open Source Softw. 2017, 2, 97.
- Iwanaga, T.; Usher, W.; Herman, J. Toward SALib 2.0: Advancing the accessibility and interpretability of global sensitivity analyses. Socio-Environ. Syst. Model. 2022, 4, 18155.
- Morris, M.D. Factorial sampling plans for preliminary computational experiments. Technometrics 1991, 33, 161–174.
[Residual table: per-symmetry results for the Gaussian, Legendre, and MNIST-slice datasets (translation for all three; permuted translation and frequency-space translation for the Gaussian and Legendre datasets only); cell contents not recoverable.]