Abstract
This paper discusses the use of Equivariant Neural Networks (ENN) for solving Partial Differential Equations by exploiting their underlying symmetry groups. We first show that Group-Convolutional Neural Networks can be used to generalize Physics-Informed Neural Networks, and we then consider the use of ENN to approximate differential invariants of a given symmetry group, thereby making it possible to build symmetry-preserving Finite Difference methods without having to formally derive the corresponding numerical invariantizations. The benefit of our approach is illustrated on the 2D heat equation through the instantiation of an SE(2) symmetry-preserving discretization.
1. Introduction
Numerically solving Partial Differential Equations (PDEs) is of paramount importance for a wide range of applications such as physics, crowd theory, epidemiology and quantitative finance. Conventional approaches such as Finite Element and Finite Difference methods have the main advantage of being easy to implement, but they are computationally expensive. With the rise of Deep Learning in the past decade, new approximate methods based on Physics-Informed Neural Networks (PINN) have been developed [,,], significantly improving simulation capabilities [,].
Nonetheless, usual PDEs typically exhibit symmetries [,], and it is therefore natural to expect numerical solving schemes to comply with them. For Hamiltonian systems, symplectic integrators [,,,] have been introduced and have recently been combined with machine learning techniques for the sake of efficiency []. For more general PDEs, symmetry-preserving Finite Difference schemes have been proposed [,], with the underlying theory consolidated in []. Practical applications showing improvements over the conventional approach have been presented in [,]. However, the formal derivation of the required numerical invariantization of the differential operators becomes increasingly challenging as the number of variables grows, which limits the applicability of these methods and motivates the search for alternative approaches.
There are essentially two ways to endow Deep Learning algorithms with symmetries. The first one, recently explored in [] for PDE solving, generalizes the data augmentation techniques widely used for image processing tasks and aims at learning symmetries directly from the data. The second one aims at encoding the symmetries directly within the learning algorithms by leveraging the emerging field of Geometric Deep Learning [,]. In this context, Equivariant Neural Networks (ENN), initially introduced in [], have been shown to be very efficient: they rely on generalized convolution operators such as steerable convolutions or group convolutions [,,,] and therefore provide equivariance to a wide range of symmetry groups. These equivariance mechanisms are very appealing, as they provide theoretical guarantees on the response of the algorithms to input variations, and they have been shown to be more efficient than data augmentation techniques from both theoretical [] and empirical [] standpoints in several contexts. Yet, these architectures cannot be applied to PDE solving as directly as a conventional PINN and, at the time of writing, only [] proposes using steerable convolutions to solve PDEs, and only for special cases of symmetries.
Contributions
In this paper, we present two innovative ways of using ENN to solve PDEs while exploiting the associated symmetries. Building on [], we first show that Group-Convolutional Neural Networks can be used to generalize the PINN architecture so as to encode generic symmetries. Leveraging differential invariant theory [], we then propose using ENN to approximate differential invariants of a given symmetry group, making it possible to build symmetry-preserving Finite Difference methods without having to formally derive the corresponding numerical invariantizations. A key advantage of this approach is that any other PDE with the same symmetry group can then be solved without any retraining. Finally, we illustrate the interest of our approach on the 2D Heat Equation and show in particular that a set of fundamental differential invariants of the roto-translation group SE(2) can be efficiently approximated by ENN for arbitrary functions by training only on evaluations of simple bivariate polynomials, which makes it easy to build SE(2) symmetry-preserving discretization schemes.
2. PDEs and Symmetries
2.1. Systems of PDEs
In what follows, we are interested in solving systems of PDEs involving one time variable t, p independent space variables and q dependent variables, for which a solution is a function of the independent variables with q components. We denote by , with coordinates , the space of the independent variables, and by , with coordinates u, that of the dependent variables.
We call the n-th order jet space the Cartesian product of the space of the independent variables with enough copies of the space of the dependent variables to include coordinates for each partial derivative of order less than or equal to n.
In the above definition, the binomial coefficient corresponds to the number of partial derivatives (assumed to be smooth enough) of order less than or equal to n. A function represented as can naturally be prolonged to a function from to the jet space by evaluating f and the corresponding partial derivatives, where denotes the spatial cross-derivative associated with the multi-index . With this formalism, a PDE system can then be written as
where is an operator from to .
2.2. Symmetry Group and Differential Invariants
We consider a Lie group G of dimension m acting as on a sub-manifold , with its Lie algebra generated by the vector fields . We define the transform of a function under the action of G by identifying f with its graph and by setting , where the function is the one associated with the transformed graph .
A symmetry group G of a PDE system is a group G such that if f is a solution, then its transform by the group action is also a solution. We then denote by the prolongation of the group action of G to , for which a prolonged transform , for , sends the graph onto , and by the corresponding prolonged vector fields. The algebraic invariants of the prolonged group action are called the differential invariants of order n of the group G and can be obtained by leveraging the infinitesimal invariance criterion . A complete set of independent differential invariants of order n, in the sense of Theorem 2.17 of [], is generically denoted by in the sequel and is related to the symmetry group of PDE systems as illustrated in Section 4.2.
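For reference, the infinitesimal invariance criterion takes the following standard form (our notation, following Olver's classical result): a function $I$ defined on the jet space of order $n$ is a differential invariant of $G$ if and only if
$\operatorname{pr}^{(n)}\mathbf{v}_\ell\,(I) = 0, \qquad \ell = 1, \dots, m,$
where the $\operatorname{pr}^{(n)}\mathbf{v}_\ell$ are the prolonged infinitesimal generators of the group action.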
3. Equivariant Neural Networks
To incorporate the symmetry information of PDEs into a neural network solver, it is essential to introduce equivariance into the network itself. Several approaches have been studied in recent years and can be divided into two categories: G-CNNs and Steerable CNNs. We chose to work with, and generalize, the first one; the second can be explored in [,,,,].
3.1. G-CNN
The idea behind a G-CNN is to perform the convolution over the group G with respect to which one wants equivariance. This kind of convolution layer was first introduced by Cohen and Welling in [] for discrete groups, and important work has since generalized the approach to other groups [,,,]. Let us first recall the group-based convolution operator and its properties.
Definition 1
(Group Convolution). Let G be a compact group and two vector spaces. Let be a kernel, be a feature function, and μ be the Haar measure on G. We define the group convolution for any by
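The corresponding display is missing from our copy; the standard form of the group convolution used in G-CNNs, which we assume here, reads
$[K \star f](g) = \int_G K\!\left(g^{-1}h\right)\, f(h)\, \mathrm{d}\mu(h), \qquad g \in G.$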
Proposition 1.
If the actions of G on and have regular representations, then the group convolution introduced in Definition 1 is G-equivariant.
Figure 1.
Action of G with various representations. (a) Regular representation on scalars. (b) Regular representation on vectors. (c) Non-regular representation on vectors.
As illustrated in Figure 1b, regular representations can only describe a limited class of group actions. Indeed, this group convolution does not lift the constraint on the kernel (see []) if one wants equivariance to all kinds of actions, and not only to those with regular representations.
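To make the construction concrete, here is a minimal numerical sketch of Definition 1 for the discrete rotation group C4 (our own illustration, not the authors' code; the group, the feature sizes and the equivariance check are all chosen for exposition):

```python
# Discrete group convolution over C4 (rotations by multiples of 90 degrees),
# acting on features already "lifted" to the group: f : C4 -> R^C_in is stored
# as an array of shape (4, C_in).
import numpy as np

def c4_compose(a, b):
    # Composition in C4, with rotations indexed by 0..3.
    return (a + b) % 4

def c4_inverse(a):
    return (-a) % 4

def group_conv_c4(f, kernel):
    """Discrete analogue of [K * f](g) = sum_h K(g^{-1} h) f(h).

    f:      (4, C_in)          feature function on C4
    kernel: (4, C_out, C_in)   kernel K on C4
    returns (4, C_out)
    """
    out = np.zeros((4, kernel.shape[1]))
    for g in range(4):
        for h in range(4):
            k_idx = c4_compose(c4_inverse(g), h)   # index of g^{-1} h
            out[g] += kernel[k_idx] @ f[h]
    return out

# Equivariance check: shifting the input by g0 (regular representation of C4)
# shifts the output by the same g0.
rng = np.random.default_rng(0)
f = rng.normal(size=(4, 3))
K = rng.normal(size=(4, 2, 3))
g0 = 1
f_shifted = np.stack([f[c4_compose(c4_inverse(g0), g)] for g in range(4)])
out_f = group_conv_c4(f, K)
lhs = group_conv_c4(f_shifted, K)
rhs = np.stack([out_f[c4_compose(c4_inverse(g0), g)] for g in range(4)])
assert np.allclose(lhs, rhs)
```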
3.2. A New Convolution
Definition 2
(Representative Group Convolution). Let G be a compact group and two vector spaces. Let be a kernel, be a feature function, and μ be the Haar measure on G. If and are the linear representations of the action of G on and , respectively, we define the representative group convolution for any by
Remark 1.
In what follows, we keep the same definitions for G, K, V, ρ, μ and f.
Theorem 1.
Under the same hypotheses as in Definition 2, let V denote either or and ρ either or . If G acts on by
then the representative group convolution is G-equivariant.
This new convolution layer is thus well suited to building an estimator of an equivariant function, because it is itself equivariant by construction. However, it cannot be composed with non-equivariant operations without breaking the equivariance of the whole network. A single convolution layer is therefore of limited use on its own, but a chain of several convolution layers is much more interesting.
Lemma 1.
Any composition of G-equivariant functions is still G-equivariant.
Multiple representative-group convolution layers can be composed to obtain a G-equivariant network. Note that the action of G on the output of the ith layer must match the action of G on the input of the (i+1)-th layer.
Table 1 shows a chain of representations, but only the first and the last ones really matter. Indeed, by Lemma 1, the whole network is equivariant to the action of G with the first representation on the input and the last one on the output. We therefore have full freedom over the intermediate representations.
Table 1.
Example of an Equivariant Neural Network.
Remark 2.
One can still use non-equivariant functions between two hidden layers of the network, as long as these functions are point-wise and the representations chosen for these hidden layers are regular. This covers the most usual convolutional neural network architectures.
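As a minimal illustration of Lemma 1 and Remark 2 (our own sketch, reusing group_conv_c4 and numpy from the previous listing), a chain of group-convolution layers interleaved with point-wise ReLUs remains equivariant:

```python
def relu(x):
    # Point-wise nonlinearity: commutes with the regular (index-permuting) action of C4.
    return np.maximum(x, 0.0)

def equivariant_chain(f, kernels):
    """Composition of C4 group-convolution layers and point-wise ReLUs.
    By Lemma 1 (composition of equivariant maps) and Remark 2 (point-wise
    nonlinearities under regular representations), the chain is C4-equivariant.
    kernels: list of arrays, the l-th of shape (4, C_{l+1}, C_l)."""
    for K in kernels:
        f = relu(group_conv_c4(f, K))
    return f
```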
Lifting the Coordinate Space
The representative group convolution cannot be used right away, since it is constructed to operate on G and not on the input data space (denoted in the sequel). The problem is circumvented by lifting the coordinates from the input space to G. More details on this method can be found in [].
Definition 3
(Lifting). Let be the set of orbits of G. If u is a mapping from to V, we set as the lifted version of u, defined by:
An element is then lifted to a tuple .
Definition 4
(Lifted Action). If G acts on , then it has an extended action on the lifted space . If is a lifted element, then acts by:
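The displayed formulas of Definitions 3 and 4 are missing from our copy; a standard reconstruction, stated as an assumption about the notation, is
$\hat{u}(g, o) = u(g \cdot o)$ for $g \in G$ and $o$ an orbit representative (lifting of a feature map), and
$g' \cdot (g, o) = (g'\,g,\ o)$ for the lifted action of $g' \in G$ on a lifted element,
which is consistent with $\widehat{g' \cdot u}(g, o) = \hat{u}(g'^{-1} g,\, o)$ for the induced action on lifted feature maps.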
4. Solving PDEs with ENN
In this section, we discuss two ways of using ENN to solve PDEs: first, using G-CNN to generalize the PINN concept; second, building symmetry-preserving Finite Difference schemes by using ENN as differential invariant approximators.
4.1. Equivariant PINN
The idea behind PINNs is somewhat straightforward. Let us consider the PDE defined in Section 2.1 but with added boundary conditions on a set B, which gives:
We then directly estimate a solution u of the PDE with an Equivariant Neural Network (ENN) parameterized by , taking as input the initial profile of the solution, i.e., u at time . This ENN is equivariant to the symmetry group of the PDE.
To train the ENN to approximate the solution of the PDE, we introduce the following optimization problem:
based on the following loss function:
with
with boundary conditions on B. Additionally, and are two training sets of randomly distributed points, with and .
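For concreteness, here is a hedged sketch of such an objective, written in the familiar coordinate-based PINN form for the 2D heat equation; in the equivariant setting of this section the network instead acts on the discretized initial profile, but the structure of the objective (a PDE-residual term plus a boundary term, each averaged over a randomly sampled training set) is the same. All names (model, alpha, the point sets) are ours, not the authors':

```python
import torch

def pinn_loss(model, alpha, interior_pts, boundary_pts, boundary_vals):
    """PINN-style loss for u_t = alpha * (u_xx + u_yy).

    interior_pts:  (N, 3) tensor of (t, x, y) collocation points
    boundary_pts:  (M, 3) tensor of points on the boundary set B
    boundary_vals: (M, 1) prescribed values of u on B
    """
    interior_pts = interior_pts.clone().requires_grad_(True)
    u = model(interior_pts)                                            # (N, 1)
    grads = torch.autograd.grad(u.sum(), interior_pts, create_graph=True)[0]
    u_t, u_x, u_y = grads[:, 0], grads[:, 1], grads[:, 2]
    u_xx = torch.autograd.grad(u_x.sum(), interior_pts, create_graph=True)[0][:, 1]
    u_yy = torch.autograd.grad(u_y.sum(), interior_pts, create_graph=True)[0][:, 2]
    residual = u_t - alpha * (u_xx + u_yy)        # PDE residual at collocation points
    loss_pde = (residual ** 2).mean()
    loss_bc = ((model(boundary_pts) - boundary_vals) ** 2).mean()
    return loss_pde + loss_bc
```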
Remark 3.
A similar approach was used by Wang et al. in [], but with steerable neural networks. They used U-Net and ResNet architectures, common for these types of tasks [], which they made equivariant in order to predict the PDE solutions. To handle the constraints on the kernels, they manually designed transformations making their networks equivariant to four different actions. Our result is meant to be more general than this case-by-case design: we bypass the kernel constraints and obtain a network that is fully equivariant to any given group.
4.2. Symmetry-Preserving Finite Difference
In the following, we restrict ourselves to PDE systems of order n that are linear in the time derivatives. With the formalism introduced above, we only consider systems of the form
where and is an operator from to . This form covers most of the PDEs encountered in physics, ranging from the heat and wave equations to the Navier–Stokes, Schrödinger and Maxwell equations. Assuming that the above PDE system is regular enough, it admits G as a symmetry group if and only if the operator can be expressed as a function of a complete set of differential invariants, i.e., if and only if (2) can be rewritten as
with .
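As a one-line example of this rewriting: the 2D heat equation $u_t = \nu\,(u_{xx} + u_{yy})$ is already of this form for $G = SE(2)$, since the Laplacian $u_{xx} + u_{yy}$ is itself a differential invariant of the roto-translation group (it is unchanged by rotations and translations of the $(x, y)$ plane).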
In the following, we propose to approximate the differential invariants by neural networks equivariant to the corresponding group action. More precisely, let us consider a discretization of the input space (resp. of the time interval ) and denote for any . For a given differential invariant , we are then interested in approximating the vector by the output of an equivariant neural network , taking as input the tensor , so that for the jth component we have
In the following, we propose to train on multivariate polynomial functions in the space variables x of degree d, but other choices could be envisioned depending on the considered problem.
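A minimal sketch of this training-data generation (our own illustration; the grid size, the domain [−1, 1]^2 and the choice of the Laplacian as target invariant are assumptions made for exposition):

```python
# Random bivariate polynomials of degree <= d evaluated on a regular grid,
# with the target differential invariant -- here the SE(2)-invariant Laplacian
# u_xx + u_yy -- computed in closed form from the polynomial coefficients.
import numpy as np

def random_poly_and_laplacian(d, grid, rng):
    # coeffs[i, j] multiplies x**i * y**j, with monomials of total degree > d zeroed out.
    coeffs = rng.uniform(-1.0, 1.0, size=(d + 1, d + 1))
    coeffs[np.add.outer(np.arange(d + 1), np.arange(d + 1)) > d] = 0.0
    x, y = grid                                   # each of shape (H, W)
    u = np.zeros_like(x)
    lap = np.zeros_like(x)
    for i in range(d + 1):
        for j in range(d + 1 - i):
            c = coeffs[i, j]
            u += c * x**i * y**j
            if i >= 2:
                lap += c * i * (i - 1) * x**(i - 2) * y**j      # d^2/dx^2 term
            if j >= 2:
                lap += c * j * (j - 1) * x**i * y**(j - 2)      # d^2/dy^2 term
    return u, lap                                  # network input, regression target

# Example: a 64 x 64 grid and degree-10 polynomials, as in Section 4.3.1.
rng = np.random.default_rng(0)
grid = np.meshgrid(np.linspace(-1, 1, 64), np.linspace(-1, 1, 64), indexing="ij")
u, lap = random_poly_and_laplacian(10, grid, rng)
```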
The use of equivariant neural networks is motivated here by the fact that the operator that we are approximating is equivariant. Indeed, for a function , is a function from to , and we can therefore consider its transform according to the action of G by considering its transformed graph, as defined in Section 2.2. As the differential invariant is an algebraic invariant of the prolonged group action , it is possible to write
meaning that differential invariant operators are equivariant with respect to the associated group action, as illustrated in Figure 2 for the case of .
Figure 2.
From left to right and top to bottom: the initial function u, its rotated version , the rotated version of the differential invariant (see Section 4.3) and the differential invariant of the rotated function. As expected, computing the differential invariant from u and applying the rotation (bottom left) gives the same results as computing the differential invariant from the rotated function (bottom right).
We now come back to the PDE system (2) and detail the numerical scheme that we propose for its integration. The idea is to first train an ENN to approximate each of the k differential invariants involved, and then to integrate (2) with an explicit scheme in which the differential invariants are replaced by their ENN approximations, leading to
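The displayed scheme is missing from our copy; a forward-Euler version consistent with the text, written with our own notation $\mathcal{N}_{\theta_\ell}$ for the ENN approximating the $\ell$-th invariant and $\tilde{F}$ for the operator of the invariant rewriting above, would read
$u^{t+1}_{j} = u^{t}_{j} + \Delta t\; \tilde{F}\big([\mathcal{N}_{\theta_1}(u^{t})]_{j}, \dots, [\mathcal{N}_{\theta_k}(u^{t})]_{j}\big).$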
4.3. Numerical Experiments
4.3.1. Approximating Differential Invariants
Here, we consider the case of , for which a generating set of second-order differential invariants is given by
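The generating set is not reproduced in our copy; for reference, classical second-order differential invariants of the action of $SE(2)$ on $(x, y, u)$ (rotations and translations of the plane, with $u$ unchanged) include
$u,\quad u_x^2 + u_y^2,\quad u_{xx} + u_{yy},\quad u_{xx}u_{yy} - u_{xy}^2,\quad u_x^2 u_{xx} + 2 u_x u_y u_{xy} + u_y^2 u_{yy},$
although we do not claim this is the exact set used in the experiments.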
In the following, we trained two neural networks, namely one conventional Convolutional Neural Network (equivariant to translations only) and one SE(2)-equivariant ENN, both built to have roughly the same number of parameters (≈). The training set consists of evaluations of 2D polynomials of degree up to 10, generated from random coefficients drawn uniformly in [−1, 1]. Polynomial evaluations were performed on the discrete grid . An example of prediction with the trained ENN, together with the corresponding theoretical value, is given in Figure 3.
Figure 3.
The differential invariant computed for the function u depicted in Figure 2 with an SE(2)-CNN (left) and its theoretical value (right).
4.3.2. Solving the 2D Heat Equation
Here, we consider the 2D heat equation defined on the domain [−a,a]×[−b,b], with boundary conditions and , and the initial condition for an arbitrary function. Below, we give the results obtained with an FD scheme relying on the approximation of the 2D Laplacian, as described in Section 4.2, i.e., by computing the solution according to the following update rule:
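The update rule itself is missing from our copy; with $\mathcal{N}_\theta$ the trained network approximating the Laplacian and $\alpha$ the diffusion coefficient (our notation), a forward-Euler version consistent with the text would read
$u^{t+1}_{i,j} = u^{t}_{i,j} + \Delta t\, \alpha\, [\mathcal{N}_\theta(u^{t})]_{i,j}.$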
where . We ran steps of the simulation for , using the two trained architectures considered in Section 4.3.1, namely and , and compared the obtained heat profiles with the ground truth. The boundary condition was taken into account by overriding the predicted outputs with the conventional second-order derivative approximation for the corner cases. The obtained results are depicted in Figure 4, where we can in particular observe the clear benefit of preserving the SE(2) symmetry during the numerical integration.
Figure 4.
Comparison of the theoretical heat profile of the 2D heat equation with a top boundary condition (see Section 4.3.2) with those obtained through simulation with two symmetry-preserving FD schemes (see Section 4.2) by leveraging (middle) and SE(2) (right) equivariant neural networks.
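For reference, a minimal sketch of the integration loop described above (our own illustration; laplacian_net, apply_boundary, alpha, dt and n_steps are placeholder names rather than the authors' code):

```python
import numpy as np

def integrate_heat(u0, laplacian_net, alpha, dt, n_steps, apply_boundary):
    """Explicit (forward-Euler) integration of u_t = alpha * Laplacian(u),
    with the Laplacian replaced by its ENN approximation."""
    u = u0.copy()
    for _ in range(n_steps):
        lap = laplacian_net(u)          # ENN approximation of u_xx + u_yy on the grid
        u = u + alpha * dt * lap        # forward-Euler update
        u = apply_boundary(u)           # re-impose the boundary condition
    return u
```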
5. Conclusions and Further Work
We presented two innovative ways of using ENN to solve PDEs while exploiting the associated symmetries. We first showed that G-CNN can be used to generalize the PINN architecture so as to encode generic symmetries, and we then proposed using ENN to approximate differential invariants of a given symmetry group, making it possible to build symmetry-preserving Finite Difference methods. Our approach is illustrated on the 2D Heat Equation, for which we showed in particular that a set of fundamental differential invariants of SE(2) can be efficiently approximated by ENN for arbitrary functions by training only on evaluations of simple bivariate polynomials, which makes it easy to build SE(2) symmetry-preserving discretization schemes.
Further work will include a proper benchmark of the two approaches against more conventional numerical schemes for PDE integration. More complex PDEs with richer symmetry groups, such as the Maxwell equations, could be considered in this context.
Author Contributions
Both authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.
Funding
Eliot Tron contributed to this work during an internship at Thales Research and Technology in 2021.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
- Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. arXiv 2020, arXiv:cs.LG/1907.04502. [Google Scholar] [CrossRef]
- Sirignano, J.; Spiliopoulos, K. DGM: A Deep Learning Algorithm for Solving Partial Differential Equations. J. Comput. Phys. 2018, 375, 1339–1364. [Google Scholar] [CrossRef]
- Raissi, M.; Yazdani, A.; Karniadakis, G.E. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science 2020, 367, 1026–1030. [Google Scholar] [CrossRef] [PubMed]
- Wang, R.; Kashinath, K.; Mustafa, M.; Albert, A.; Yu, R. Towards Physics-Informed Deep Learning for Turbulent Flow Prediction. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’20, Virtual Event, 6–10 July 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 1457–1466. [Google Scholar] [CrossRef]
- Olver, P. Applications of Lie Groups to Differential Equations. In The Handbook of Brain Theory and Neural Networks; Springer: New York, NY, USA, 1993. [Google Scholar]
- Fushchich, W.; Nikitin, A. Symmetries of Maxwell’s Equations; Mathematics and Its Applications; Springer: Dordrecht, The Netherlands, 2013. [Google Scholar]
- Morrison, P. Structure and structure-preserving algorithms for plasma physics. Phys. Plasmas 2016, 24, 055502. [Google Scholar] [CrossRef]
- Kraus, M. Metriplectic Integrators for Dissipative Fluids. In Geometric Science of Information; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 292–301. [Google Scholar]
- Coquinot, B.; Morrison, P.J. A general metriplectic framework with application to dissipative extended magnetohydrodynamics. J. Plasma Phys. 2020, 86, 835860302. [Google Scholar] [CrossRef]
- Luesink, E.; Ephrati, S.; Cifani, P.; Geurts, B. Casimir preserving stochastic Lie-Poisson integrators. arXiv 2021, arXiv:2111.13143. [Google Scholar]
- Zhu, A.; Jin, P.; Tang, Y. Deep Hamiltonian networks based on symplectic integrators. arXiv 2020, arXiv:2004.13830. [Google Scholar]
- Dorodnitsyn, V. Finite Difference Models Entirely Inheriting Symmetry of Original Differential Equations. Int. J. Mod. Phys. C 1994, 5, 723–734. [Google Scholar] [CrossRef]
- Shokin, Y.I. The Method of Differential Approximation; Computational Physics Series; Springer: Berlin/Heidelberg, Germany, 1983. [Google Scholar]
- Olver, P.J. Geometric Foundations of Numerical Algorithms and Symmetry. Appl. Algebra Eng. Commun. Comput. 2001, 11, 417–436. [Google Scholar] [CrossRef]
- Marx, C.; Aziz, H. Lie Symmetry Preservation by Finite Difference Schemes for the Burgers Equation. Symmetry 2010, 2, 868. [Google Scholar] [CrossRef]
- Razafindralandy, D.; Hamdouni, A. Subgrid models preserving the symmetry group of the Navier–Stokes equations. C. R. Méc. 2005, 333, 481–486. [Google Scholar] [CrossRef]
- Brandstetter, J.; Welling, M.; Worrall, D.E. Lie Point Symmetry Data Augmentation for Neural PDE Solvers. arXiv 2022, arXiv:2202.07643. [Google Scholar]
- Bronstein, M.M.; Bruna, J.; Cohen, T.; Veličković, P. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges. arXiv 2021, arXiv:2104.13478. [Google Scholar]
- Gerken, J.E.; Aronsson, J.; Carlsson, O.; Linander, H.; Ohlsson, F.; Petersson, C.; Persson, D. Geometric Deep Learning and Equivariant Neural Networks. arXiv 2021, arXiv:2105.13926. [Google Scholar]
- Cohen, T.; Welling, M. Group Equivariant Convolutional Networks. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; Balcan, M.F., Weinberger, K.Q., Eds.; PMLR: New York, NY, USA, 2016; Volume 48, pp. 2990–2999. [Google Scholar]
- Cohen, T.S.; Geiger, M.; Weiler, M. A General Theory of Equivariant CNNs on Homogeneous Spaces. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Wallach, H., Larochelle, H., Beygelzimer, A., Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32, pp. 9145–9156. [Google Scholar]
- Weiler, M.; Cesa, G. General E(2)-Equivariant Steerable CNNs. arXiv 2019, arXiv:1911.08251. [Google Scholar]
- Worrall, D.; Welling, M. Deep Scale-spaces: Equivariance Over Scale. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32. [Google Scholar]
- Kondor, R.; Trivedi, S. On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Dy, J., Krause, A., Eds.; PMLR: Stockholm, Sweden, 2018; Volume 80, pp. 2747–2755. [Google Scholar]
- Elesedy, B.; Zaidi, S. Provably Strict Generalisation Benefit for Equivariant Models. arXiv 2021, arXiv:2102.10333. [Google Scholar]
- Gerken, J.E.; Carlsson, O.; Linander, H.; Ohlsson, F.; Petersson, C.; Persson, D. Equivariance versus Augmentation for Spherical Images. arXiv 2022, arXiv:2202.03990. [Google Scholar]
- Wang, R.; Walters, R.; Yu, R. Incorporating Symmetry into Deep Dynamics Models for Improved Generalization. arXiv 2020, arXiv:2002.03061. [Google Scholar]
- Finzi, M.; Stanton, S.; Izmailov, P.; Wilson, A.G. Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data. arXiv 2020, arXiv:2002.12880. [Google Scholar]
- Cohen, T.S.; Welling, M. Steerable CNNs. arXiv 2016, arXiv:1612.08498. [Google Scholar]
- Lang, L.; Weiler, M. A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels. arXiv 2020, arXiv:2010.10952. [Google Scholar]
- Cohen, T.S.; Weiler, M.; Kicanaoglu, B.; Welling, M. Gauge Equivariant Convolutional Networks and the Icosahedral CNN. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 10–15 June 2019. [Google Scholar]
- Cohen, T.S.; Welling, M. Group Equivariant Convolutional Networks. arXiv 2016, arXiv:1602.07576. [Google Scholar]
- Cohen, T.S.; Geiger, M.; Weiler, M. Intertwiners between Induced Representations (with Applications to the Theory of Equivariant Neural Networks). arXiv 2018, arXiv:1803.10743. [Google Scholar]
- Kondor, R.; Trivedi, S. On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups. arXiv 2018, arXiv:1802.03690. [Google Scholar]
- Cohen, T.; Geiger, M.; Weiler, M. A General Theory of Equivariant CNNs on Homogeneous Spaces. arXiv 2018, arXiv:1811.02017. [Google Scholar]
- Wang, R.; Kashinath, K.; Mustafa, M.; Albert, A.; Yu, R. Towards Physics-informed Deep Learning for Turbulent Flow Prediction. arXiv 2019, arXiv:1911.08655. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).