Equivariant Neural Networks and Differential Invariants Theory for Solving Partial Differential Equations †

Abstract: This paper discusses the use of Equivariant Neural Networks (ENN) for solving Partial Differential Equations by exploiting their underlying symmetry groups. We first show that Group-Convolutional Neural Networks can be used to generalize Physics-Informed Neural Networks, and then consider the use of ENN to approximate the differential invariants of a given symmetry group, thus making it possible to build symmetry-preserving Finite Difference methods without formally deriving the corresponding numerical invariantizations. The benefit of our approach is illustrated on the 2D heat equation through the instantiation of an SE(2) symmetry-preserving discretization.


Introduction
Numerically solving Partial Differential Equations (PDEs) is of paramount importance for a wide range of applications such as physics, crowd dynamics, epidemiology and quantitative finance. Conventional methods such as Finite Element or Finite Difference methods have the main advantage of being easy to implement but are highly time-consuming. With the rise of Deep Learning in the past decade, new approximate methods based on Physics-Informed Neural Networks (PINN) have been developed [1][2][3] and significantly improve simulation capabilities [4,5].
Nonetheless, usual PDEs typically exhibit symmetries [6,7], and it is therefore natural to expect numerical solving schemes to comply with them. For Hamiltonian systems, symplectic integrators [8][9][10][11] have been introduced and have recently been combined with machine learning techniques for the sake of efficiency [12]. For more general PDEs, symmetry-preserving Finite Difference schemes have been proposed [13,14], with the underlying theory consolidated in [15]. Practical applications showing improvements over the conventional approach have been presented in [16,17]. However, the formal derivation of the required numerical invariantization of the differential operators becomes increasingly challenging as the number of variables grows, hence limiting the applicability of these methods and motivating the need for alternative approaches.
There are mainly two ways to imprint Deep Learning algorithms with symmetries. The first one, recently explored in [18] for PDE solving, generalizes the data augmentation techniques widely used for image processing tasks and aims at learning symmetries directly from the data. The second one aims at directly encoding the symmetries within the learning algorithms by leveraging the emerging field of Geometric Deep Learning [19,20]. In this context, Equivariant Neural Networks (ENN), initially introduced in [21], have been shown to be very efficient, leveraging generalized convolution operators such as Steerable Convolution or G-convolution [22][23][24][25], and therefore providing equivariance to a wide range of symmetry groups. These equivariance mechanisms are very appealing, as they provide theoretical guarantees on the algorithm's response to input variations, and they have been shown to be more efficient than data augmentation techniques from both theoretical [26] and empirical [27] standpoints in several contexts. Yet, these architectures cannot be applied directly to PDE solving as one would apply a conventional PINN and, at the time of writing, only [28] proposes to use steerable convolution to solve PDEs, but it is limited to special cases of symmetries.

Contributions
In this paper, we present two innovative ways of using ENN to solve PDEs while exploiting the associated symmetries. Building on [29], we first show that Group-Convolutional Neural Networks can be used to generalize the PINN architecture to encode generic symmetries. Leveraging differential invariant theory [6], we then propose using ENN to approximate the differential invariants of a given symmetry group, thus making it possible to build symmetry-preserving Finite Difference methods without formally deriving the corresponding numerical invariantizations. A key advantage of this approach is that it allows solving any other PDE with the same symmetry group without any retraining. Finally, we illustrate the interest of our approach on the 2D Heat Equation and show in particular that a set of fundamental differential invariants of the roto-translation group SE(2) can be efficiently approximated by ENN for arbitrary functions by training on evaluations of simple bivariate polynomials, making it easy to build SE(2) symmetry-preserving discretization schemes.

Systems of PDEs
In what follows, we are interested in solving systems of PDEs involving one time variable t, p independent space variables x = (x_1, ..., x_p) ∈ X and q dependent variables u = (u_1, ..., u_q) ∈ U, for which a solution is of the form u = f(t, x), with u_j = f_j(t, x) for j = 1, ..., q in terms of components. We denote by X = R^p, with coordinates x_1, ..., x_p, the space of the independent variables, and by U = R^q, with coordinates u, that of the dependent variables.
We call n-th order jet space J^(n) the Cartesian product of the space of the independent variables X with enough copies of the space of the dependent variables U to include coordinates for each partial derivative of order less than or equal to n. The binomial coefficient (p+n choose n) corresponds to the number of partial derivatives (assumed to be smooth enough) of order less than or equal to n. A function f : X → U represented as u = f(x) can naturally be prolonged to a function u^(n) = f^(n)(x) from X to J^(n) by evaluating f and the corresponding partial derivatives, so that u^(n) = {∂_x^α u, |α| ≤ n}, where ∂_x^α u is the spatial cross-derivative corresponding to the multi-index α = (i_1, ..., i_p) ∈ N^p. According to this formalism, a PDE system can then be written as Δ(t, x, u^(n)) = 0, where Δ is an operator from R^+ × J^(n) to R^q.
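As an illustration, the derivative count above and the prolongation map can be sketched numerically; the following snippet (function names are ours) counts the derivative coordinates of J^(n) and prolongs a bivariate function to first order with central finite differences.

```python
import math

def num_derivative_coords(p, n):
    """Number of partial derivatives of order <= n in p variables:
    the binomial coefficient (p + n choose n)."""
    return math.comb(p + n, n)

def prolong_first_order(f, x, y, h=1e-5):
    """First-order prolongation u^(1) = (u, u_x, u_y) of f at (x, y),
    approximated with central finite differences."""
    u = f(x, y)
    u_x = (f(x + h, y) - f(x - h, y)) / (2 * h)
    u_y = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (u, u_x, u_y)

# p = 2 variables, order n = 2: coordinates u, u_x, u_y, u_xx, u_xy, u_yy
print(num_derivative_coords(2, 2))  # 6

# Prolong f(x, y) = x^2 + 3y at (1, 2): (7, 2, 3) up to finite-difference error
print(prolong_first_order(lambda x, y: x**2 + 3 * y, 1.0, 2.0))
```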

Symmetry Group and Differential Invariants
We consider a Lie group G of dimension m acting as g.(x, u) on a sub-manifold M ⊆ X × U, with its Lie algebra g generated by the vector fields ζ_1, ..., ζ_m. We can define the transform of a function u = f(t, x) under the action of G by identifying f with its graph Γ_f and by setting g.f = f_g, where f_g is the function associated with the transformed graph g.Γ_f = Γ_{f_g}.
A symmetry group G of a PDE system is a group G such that if f is a solution, then its transform f_g by the group action is also a solution. We then denote by pr^(n) G the prolongation of the group action of G to J^(n), for which a prolonged transform g^(n), for g ∈ G, sends the graph Γ_{f^(n)} onto Γ_{(g.f)^(n)}, and by pr^(n) ζ_1, ..., pr^(n) ζ_m the corresponding prolonged vector fields. The algebraic invariants I_G of the prolonged group action pr^(n) G are called the differential invariants of order n of the group G and can be obtained by leveraging the infinitesimal invariance criterion pr^(n) ζ_i (I_G) = 0. A complete set of independent differential invariants of order n, in the sense of Theorem 2.17 of [6], is generically denoted by ∂φ^G_{u,n} = (∂φ^{G,1}_{u,n}, ..., ∂φ^{G,k}_{u,n}) in the sequel and is related to the symmetry group of PDE systems as illustrated in Section 4.2.
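As a quick sanity check of the invariance criterion (our own illustration), one can verify numerically that the candidate invariant u_x^2 + u_y^2 is constant along the prolonged flow of the rotation generator ζ = −y ∂_x + x ∂_y, whose prolonged action rotates the gradient:

```python
import math

def rotate_gradient(ux, uy, theta):
    """Prolonged action of a rotation by angle theta on the first-order
    jet coordinates (u_x, u_y): the gradient is rotated."""
    return (math.cos(theta) * ux - math.sin(theta) * uy,
            math.sin(theta) * ux + math.cos(theta) * uy)

def candidate_invariant(ux, uy):
    return ux**2 + uy**2

# The value of u_x^2 + u_y^2 is unchanged along the rotation flow
ux, uy = 0.3, -1.2
values = [candidate_invariant(*rotate_gradient(ux, uy, t / 10)) for t in range(10)]
print(min(values), max(values))  # both approximately 0.09 + 1.44 = 1.53
```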

Equivariant Neural Networks
To incorporate the symmetry information of PDEs into the neural network solver, it is essential to introduce equivariance into neural networks. Multiple approaches have been studied in recent years; they can be separated into two categories: G-CNN and Steerable CNN. The first one is the approach we chose to work with and to generalize; the second one can be explored in [23,28,[30][31][32].

G-CNN
The idea behind a G-CNN is to perform the convolution over the group G one wants equivariance with. This kind of convolution layer was first introduced by Cohen and Welling in [33] for discrete groups, and important work has since generalized the approach to other groups [29,[34][35][36]. Let us first start with some reminders about the group-based convolution operator and its properties.
Definition 1 (Group Convolution). Let G be a compact group and V_1, V_2 two vector spaces. Let K : G → L(V_1, V_2) be a kernel, f : G → V_1 a feature function and μ the Haar measure on G. We define the group convolution for any s ∈ G by

(K ⋆ f)(s) = ∫_G K(g^{-1} s) f(g) dμ(g).    (1)
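A minimal discrete sketch (ours, over the cyclic group C4 with scalar features and the regular action, the integral becoming a sum) shows that this convolution commutes with the group action:

```python
N = 4  # cyclic group C4: elements 0..3, composition = addition mod N

def group_conv(K, f):
    """Discrete group convolution (K * f)(s) = sum_g K(g^{-1} s) f(g)."""
    return [sum(K[(s - g) % N] * f[g] for g in range(N)) for s in range(N)]

def act(t, f):
    """Regular action on feature functions: (t.f)(s) = f(t^{-1} s)."""
    return [f[(s - t) % N] for s in range(N)]

K = [1.0, 0.5, 0.0, -0.5]
f = [2.0, -1.0, 0.0, 3.0]
# Convolving the transformed input equals transforming the convolved output
for t in range(N):
    assert group_conv(K, act(t, f)) == act(t, group_conv(K, f))
print("equivariant for all shifts")
```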

Proposition 1.
If the actions of G on V_1^G and V_2^G are regular representations, then the group convolution defined in Definition 1 is G-equivariant.
As illustrated in Figure 1b, regular representations only allow one to describe limited group actions. Indeed, this group convolution does not remove the constraint on the kernel (see [36]) if one wants equivariance to all kinds of actions, and not only to those with regular representations.

A New Convolution
Definition 2 (Representative Group Convolution). Let G be a compact group and V_1, V_2 two vector spaces. Let K : G → L(V_1, V_2) be a kernel, f : G → V_1 a feature function and μ the Haar measure on G. If ρ_1 : G → L(V_1) and ρ_2 : G → L(V_2) are the linear representations of the action of G on V_1 and V_2, respectively, we define the representative group convolution for any s ∈ G by

(K ⋆ f)(s) = ∫_G ρ_2(g) K(g^{-1} s) ρ_1(g)^{-1} f(g) dμ(g).

Remark 1.
In what follows, we keep the same definitions for G, K, V, ρ, µ and f .

Theorem 1.
With the same hypotheses as in Definition 2, let V denote either V_1 or V_2 and ρ either ρ_1 or ρ_2. If G acts on V^G by (g.f)(r) = ρ(g) f(g^{-1} r) for all g, r ∈ G and f : G → V, then the representative group convolution is G-equivariant.
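The theorem can be checked numerically on a toy example (our own sketch): take G = Z/2Z, the sign representation on the input space V_1 = R and the trivial representation on the output space V_2 = R, and verify that the representative convolution commutes with the corresponding actions:

```python
N = 2  # G = Z/2Z; composition is addition mod 2

def rho_in(g):   # representation on the input space V1: the sign representation
    return (-1) ** g

def rho_out(g):  # representation on the output space V2: the trivial representation
    return 1

def rep_conv(K, f):
    """Representative group convolution
    (K * f)(s) = sum_g rho_out(g) K(g^{-1} s) rho_in(g)^{-1} f(g)."""
    return [sum(rho_out(g) * K[(s - g) % N] * f[g] / rho_in(g) for g in range(N))
            for s in range(N)]

def act_in(t, f):
    """(t.f)(r) = rho_in(t) f(t^{-1} r) on input features."""
    return [rho_in(t) * f[(r - t) % N] for r in range(N)]

def act_out(t, F):
    """(t.F)(s) = rho_out(t) F(t^{-1} s) on output features."""
    return [rho_out(t) * F[(s - t) % N] for s in range(N)]

K = [1.5, -0.25]
f = [2.0, 0.5]
for t in range(N):
    assert rep_conv(K, act_in(t, f)) == act_out(t, rep_conv(K, f))
print("representative convolution is equivariant")
```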
This new convolution layer is thus very useful for building an estimator of an equivariant function, because it is itself equivariant by construction. However, it cannot be composed with non-equivariant operations without breaking the equivariance of the whole network. Therefore, a single convolution layer is of limited interest on its own, but a chain of multiple convolution layers is much more powerful.

Lemma 1. Any composition of G-equivariant functions is still G-equivariant.
Multiple representative group convolution layers can be composed to obtain a G-equivariant network. Note that the action of G on the output of the i-th layer must match the action of G on the input of the (i + 1)-th layer.
In Table 1, there is a chain of representations ρ_0 → ... → ρ_L, but what really matters is only the first one (ρ_0) and the last one (ρ_L). Indeed, by Lemma 1, the whole network is equivariant to G's action with ρ_0 on the input and ρ_L on the output. Thus, starting from the input N_0 = f ∈ V_0^G and chaining the convolution layers, we have full choice over the intermediate representations ρ_ℓ for 1 ≤ ℓ ≤ L − 1.

Remark 2.
One can still use non-equivariant functions between two hidden layers of the network, as long as these functions are point-wise and the representations chosen for these hidden layers are regular. This covers the main usual architectures for convolutional neural networks.

Lifting the Coordinate Space
The Representative Group Convolution cannot be used right away, since the convolution is performed on G and not on the input data space (denoted X in the sequel). The problem is circumvented by lifting the coordinates from X to G. More details on this method can be found in [29].
Definition 3 (Lifting). Let Q = X/G be the set of orbits of G. If u is a mapping from X to V, we set u↑ : G × Q → V, the lifted version of u, defined by u↑(r, q) = u(r.o_q), where o_q ∈ X denotes a chosen representative of the orbit q. An element x ∈ X is then lifted to a tuple (r_x, o_q) such that x = r_x.o_q.
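For instance (our own illustration), for G = SO(2) acting on X = R^2 by rotation, the orbits are circles indexed by the radius, a natural orbit representative is o_q = (q, 0), and the lift of a point is simply its polar decomposition:

```python
import math

def lift_point(x, y):
    """Lift x in R^2 to (r_x, o_q) for G = SO(2): r_x is the rotation angle
    and o_q = (q, 0) is the orbit representative, with q = |x|."""
    q = math.hypot(x, y)
    theta = math.atan2(y, x)
    return theta, (q, 0.0)

def lift_function(u):
    """Lifted version of u : R^2 -> V, defined by u_lift(theta, q) = u(R_theta . o_q)."""
    def u_lift(theta, q):
        return u(q * math.cos(theta), q * math.sin(theta))
    return u_lift

u = lambda x, y: x * x - y          # an arbitrary scalar field on R^2
theta, (q, _) = lift_point(3.0, 4.0)
# Evaluating the lifted function at the lifted coordinates recovers u(x, y)
print(lift_function(u)(theta, q), u(3.0, 4.0))
```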

Definition 4 (Lifted Action). If G acts on X × V, then it has an extended action on the lifted space (G × X/G) × V: if ((r, q), u↑(r, q)) is a lifted element and g ∈ G, the action is given by g.((r, q), u↑(r, q)) = ((g.r, q), ρ(g) u↑(r, q)).

Solving PDEs with ENN
In this section, we discuss two ways of using ENN for solving PDEs, starting with the use of G-CNN to generalize the PINN concept, and then building symmetry-preserving Finite Difference schemes by using ENN as differential invariant approximators.

Equivariant PINN
The idea behind PINNs is somewhat straightforward. Let us consider the PDE defined in Section 2.1, with boundary conditions added on a set B, which gives:

(E): Δ(t, x, u^(n)) = 0 on X, with u(t, x) = b(x) for x ∈ B.

We then directly estimate the solution u of (E) at t_0 + dt with an Equivariant Neural Network N_θ parameterized by θ, taking as input the initial profile of the solution, namely u at time t_0. This ENN is equivariant to the symmetry group of (E).
In order to train the ENN N_θ to approximate the solution of the PDE, we introduce the optimization problem (P): min_θ L(θ), with the loss function L defined as

L(θ) = (1/|T_f|) Σ_{x ∈ T_f} |Δ(t, x, N_θ^(n))|^2 + (1/|T_b|) Σ_{x ∈ T_b} |N_θ(x) − b(x)|^2,

where T_f and T_b are two training sets of randomly distributed points, with T_b ⊂ B and T_f ⊂ X.
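The structure of such a loss can be sketched on a toy problem (our own illustration, with an assumed parametric ansatz in place of the network and finite differences in place of automatic differentiation): for the 1D heat equation u_t = u_xx with zero Dirichlet boundaries, the ansatz u_θ(t, x) = exp(−θt) sin(x) is an exact solution for θ = 1, so the loss vanishes there and not elsewhere.

```python
import math, random

def model(theta, t, x):
    """Toy parametric ansatz u_theta(t, x) = exp(-theta t) sin(x); for theta = 1
    it solves u_t = u_xx with boundary values u(t, 0) = u(t, pi) = 0."""
    return math.exp(-theta * t) * math.sin(x)

def residual(theta, t, x, h=1e-4):
    """PDE residual u_t - u_xx, approximated with central finite differences."""
    u_t = (model(theta, t + h, x) - model(theta, t - h, x)) / (2 * h)
    u_xx = (model(theta, t, x + h) - 2 * model(theta, t, x)
            + model(theta, t, x - h)) / h**2
    return u_t - u_xx

def loss(theta, T_f, T_b):
    """PINN-style loss: mean squared PDE residual on interior points T_f
    plus mean squared boundary mismatch on boundary points T_b."""
    interior = sum(residual(theta, t, x) ** 2 for t, x in T_f) / len(T_f)
    boundary = sum(model(theta, t, x) ** 2 for t, x in T_b) / len(T_b)  # target 0
    return interior + boundary

random.seed(0)
T_f = [(random.uniform(0, 1), random.uniform(0, math.pi)) for _ in range(50)]
T_b = [(random.uniform(0, 1), random.choice([0.0, math.pi])) for _ in range(10)]
# The exact solution (theta = 1) has near-zero loss; a wrong theta does not
print(loss(1.0, T_f, T_b), loss(2.0, T_f, T_b))
```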

Remark 3.
A similar approach has been used by Wang et al. in [28], but with steerable neural networks. They used U-Net and ResNet architectures, common for these types of tasks [37], which they made equivariant in order to predict the PDE solutions. To tackle the constraints on the kernels, they manually designed transformations making their neural networks equivariant to 4 different actions. Our result is meant to be more general than this case-by-case design: we bypass the kernel constraints and obtain a network fully equivariant to any given group.

Symmetry-Preserving Finite Difference
We restrict ourselves in the following to PDE systems of order n with a linear dependency on the time differentials. According to the introduced formalism, we only consider systems of the form

∂^{k_t} u / ∂t^{k_t} = Δ(t, x, u^(n)),    (2)

where k_t ∈ N and Δ is an operator from R^+ × J^(n) to R^q. The above form covers most of the PDEs encountered in physics, ranging from the heat and wave equations to the Navier–Stokes, Schrödinger and Maxwell equations. Assuming that the above PDE system is regular enough, it admits G as a symmetry if and only if the operator Δ can be expressed as a function of a complete set of differential invariants, i.e., if and only if (2) can be rewritten as

∂^{k_t} u / ∂t^{k_t} = F(∂φ^G_{u,n}),

with F : R^{qk} → R^q. We propose in the following to approximate the differential invariants by neural networks equivariant to the corresponding group action. More precisely, let us consider a discretization x^(1), ..., x^(n_x) (resp. t^(1), ..., t^(n_t)) of the input space X (resp. of the time interval R^+) and denote f^(i,j) = f(t^(i), x^(j)) for any f : R^+ × R^p → R^q. For a given differential invariant ∂φ^G f, we are then interested in approximating the vector of its samples by a neural network N_G taking as input the samples f^(i,1), ..., f^(i,n_x), so that we have for the jth component N_G(f^(i,.))_j ≈ (∂φ^G f)^(i,j). In the following, we propose to train N_G on multivariate polynomial functions in the space variables x of degree d, but other choices could be envisioned depending on the considered problem.
The use of equivariant neural networks is motivated here by the fact that the operator f → ∂φ^G f that we are approximating is equivariant. Indeed, for a function f : R^+ × R^p → R^q, ∂φ^G f is a function from R^+ × R^p to R^q, and we can therefore consider its transform g.(∂φ^G f) according to the action of G by considering its transformed graph, as defined in Section 2.2. As the differential invariant is an algebraic invariant of the prolonged group action pr^(n) G, it is possible to write

∂φ^G (g.f) = g.(∂φ^G f),

meaning that differential invariant operators are equivariant with respect to the associated group action, as illustrated in Figure 2 for the case of SE(2). We now come back to the PDE system (2) and detail the numerical scheme that we propose for its integration. The idea is to first train an ENN N^i_G to approximate each of the k differential invariants ∂φ^{G,i}_{.,n} which are involved, and then to integrate with an explicit scheme in which the differential invariants are replaced by their ENN approximations.

Figure 2. From left to right and top to bottom: the initial function u, its rotated version ũ, the rotated version of the SE(2) differential invariant u_x^2 + u_y^2 (see Section 4.3) and the differential invariant of the rotated function. As expected, computing the differential invariant from u and applying the rotation (bottom left) gives the same result as computing the differential invariant from the rotated function (bottom right).
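The commutation property illustrated in Figure 2 can also be checked numerically on analytic functions (our own sketch, with finite-difference gradients): computing u_x^2 + u_y^2 from a rotated function agrees with rotating the invariant of the original function.

```python
import math

def invariant(f, x, y, h=1e-5):
    """SE(2) differential invariant u_x^2 + u_y^2, via central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx**2 + fy**2

def rotate(f, theta):
    """Rotated function f~(p) = f(R_theta^{-1} p)."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda x, y: f(c * x + s * y, -s * x + c * y)

u = lambda x, y: math.sin(x) + 0.5 * y * y
theta = 0.7
x, y = 0.3, -0.4
lhs = invariant(rotate(u, theta), x, y)                      # invariant of rotated u
rhs = rotate(lambda a, b: invariant(u, a, b), theta)(x, y)   # rotated invariant of u
print(lhs, rhs)  # the two values agree up to finite-difference error
```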

Approximating SE(2) Differential Invariants
Here, we consider the case of SE(2), for which a generating set of second-order differential invariants is given by

∂φ^{SE(2)}_{u,2} = (u, u_x^2 + u_y^2, u_xx + u_yy, u_x^2 u_xx + 2 u_x u_y u_xy + u_y^2 u_yy, u_xx^2 + 2 u_xy^2 + u_yy^2).    (12)

We trained two neural networks, namely one conventional Convolutional Neural Network N_{R^2} with R^2-equivariant layers and one SE(2)-ENN N_{SE(2)}, both built to have roughly the same number of parameters (≈2.2 × 10^6). The training set consists of 29 × 29 evaluations of 2D polynomials in R[X, Y] with degrees up to 10, generated from random coefficients drawn uniformly in [−1, 1]. Polynomial evaluations were performed on the discrete grid (i/29, j/29) for i, j = −14, ..., 14. An example of prediction with the trained N_{SE(2)}, together with the corresponding theoretical value, is given in Figure 3.
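The generation of such training pairs can be sketched as follows (our own illustration, shown for the Laplacian invariant u_xx + u_yy): random polynomial coefficients are drawn, the polynomial is evaluated on the grid, and the target invariant is computed analytically from the coefficients.

```python
import random

def eval_poly(c, x, y):
    """Evaluate p(x, y) = sum_{i,j} c[i][j] x^i y^j."""
    return sum(c[i][j] * x**i * y**j
               for i in range(len(c)) for j in range(len(c[i])))

def laplacian_poly(c, x, y):
    """Analytic Laplacian p_xx + p_yy of the same polynomial."""
    d = len(c)
    p_xx = sum(c[i][j] * i * (i - 1) * x**(i - 2) * y**j
               for i in range(2, d) for j in range(d))
    p_yy = sum(c[i][j] * j * (j - 1) * x**i * y**(j - 2)
               for i in range(d) for j in range(2, d))
    return p_xx + p_yy

random.seed(1)
deg = 10
grid = [k / 29 for k in range(-14, 15)]  # the 29 x 29 grid (i/29, j/29)
c = [[random.uniform(-1, 1) for _ in range(deg + 1)] for _ in range(deg + 1)]
inputs = [[eval_poly(c, x, y) for y in grid] for x in grid]
targets = [[laplacian_poly(c, x, y) for y in grid] for x in grid]
print(len(inputs), len(inputs[0]))  # 29 29
```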

Solving the 2D Heat Equation
Here, we consider the 2D heat equation u_t = u_xx + u_yy, defined on a square domain [−a, a] × [−b, b], with the boundary conditions u(t, ±a, b) = 0 and u(t, ±a, −b) = 100 and the initial condition u_{t=0} = f, for f : R^2 → R an arbitrary function. Below, we give the results that we obtained by using an FD scheme relying on the approximation of the 2D Laplacian, as described in Section 4.2, i.e., by computing the solution according to the following update rule:

u^{n+1} = u^n + δ_t N(u^n),

where u^n = u(n × δ_t, ·) and N denotes the trained approximation of the Laplacian u_xx + u_yy. We ran 10^5 steps of the simulation for δ_t ≈ 10^{−7}, using the two trained architectures considered in Section 4.3.1, namely N_{R^2} and N_{SE(2)}, and compared the obtained heat profiles with the ground truth. The boundary condition was taken into account by overriding the predicted outputs with the conventional second-order derivative approximation for the corner cases. The obtained results are depicted in Figure 4, where we can in particular observe the high benefit of preserving the SE(2) symmetry during the numerical integration.
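As a reference for what the ENN replaces, the same explicit update with the conventional 5-point Laplacian stencil in place of the network can be sketched as follows (our own sketch; the grid spacing dx and grid size are assumptions):

```python
def laplacian_stencil(u, dx):
    """Conventional 5-point finite-difference Laplacian on the interior
    of a 2D grid stored as a list of lists (zero on the border)."""
    n, m = len(u), len(u[0])
    lap = [[0.0] * m for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            lap[i][j] = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1]
                         - 4 * u[i][j]) / dx**2
    return lap

def heat_step(u, dt, dx):
    """One explicit step u^{n+1} = u^n + dt * Laplacian(u^n); in the scheme of
    Section 4.2 the stencil is replaced by the trained ENN approximation."""
    lap = laplacian_stencil(u, dx)
    return [[u[i][j] + dt * lap[i][j] for j in range(len(u[0]))]
            for i in range(len(u))]

# Dirichlet boundaries: 100 on the bottom row, 0 elsewhere on the border
n = 8
u = [[0.0] * n for _ in range(n)]
u[-1] = [100.0] * n
u = heat_step(u, dt=1e-3, dx=1.0)
print(u[-2][3])  # heat starts diffusing from the hot boundary into the interior
```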

Conclusions and Further Work
We presented two innovative ways of using ENN to solve PDEs while exploiting the associated symmetries. We first showed that G-CNN can be used to generalize the PINN architecture to encode generic symmetries, and then proposed using ENN to approximate the differential invariants of a given symmetry group, hence allowing to build symmetry-preserving Finite Difference methods. Our approach is illustrated on the 2D Heat Equation, for which we showed in particular that a set of fundamental differential invariants of SE(2) can be efficiently approximated by ENN for arbitrary functions by training on evaluations of simple bivariate polynomials, making it easy to build SE(2) symmetry-preserving discretization schemes.
Additional work will include proper benchmarking of the two approaches against more conventional numerical schemes for PDE integration. More complex PDEs with richer symmetry groups, such as the Maxwell equations, could be considered in this context.