Neural-Network-Assisted Finite Difference Discretization for Numerical Solution of Partial Differential Equations

Abstract: A neural-network-assisted numerical method is proposed for the solution of Laplace and Poisson problems. Finite differences are applied to approximate the spatial Laplacian operator on nonuniform grids. For this, a neural network is trained to compute the corresponding coefficients for general quadrilateral meshes. Depending on the position of a given grid point x_0 and its neighbors, we face a nonlinear optimization problem to obtain the finite difference coefficients at x_0. This computing step is executed with an artificial neural network. In this way, for any geometric setup of the neighboring grid points, we immediately obtain the corresponding coefficients. The construction of an appropriate training data set is also discussed, which is based on the solution of overdetermined linear systems. The method was experimentally validated on a number of numerical tests. As expected, it delivers a fast and reliable algorithm for solving Poisson problems.


Introduction
In the last decade, the widespread use of artificial neural networks (ANNs) has led to a significant step forward in computer science and computational mathematics. A wide range of practical problems, including segmentation and classification tasks, can be solved efficiently using this tool. Here, the related mathematical models fall into the class of discrete problems. At the same time, we have to keep in mind that the engine of ANN-based algorithms is a gradient-based continuous optimization algorithm enriched with stochastic elements. This gives the first motivation to also apply ANN-based techniques to the numerical solution of problems arising from continuous mathematical models. The numerical solution of partial differential equations (PDEs) constitutes an important class of these, having a central role in the natural sciences and engineering.
Accordingly, a family of ANN-based methodologies has been developed for solving partial differential equations (PDEs) numerically. The most common approach is the class of so-called physics-informed neural networks (PINNs). In this framework, the solution candidates are optimized based on the most straightforward error terms. To compute these, one has to calculate the consistency errors in the governing equations and boundary conditions. In concrete terms, the approximations are substituted into the physical laws constituting the mathematical problem. For a general description of this method, we refer to [1,2]. A number of specific forms of this method have been developed and applied to a range of classical PDE problems arising in real-life applications; see, e.g., [3]. Beyond these, the above framework can be extended to the numerical solution of stochastic [4] and inverse problems [5,6] as well.
A possible shortcut of this procedure is to seek approximations only from a class of functions that satisfy the governing equation exactly. In this case, we only have to care about parameters (unknowns) associated with the boundary, which can largely decrease the dimension of the corresponding optimization problems. Such an algorithm is introduced in [7] and applied to the numerical solution of Dirichlet-to-Neumann problems.
To summarize these developments and further approaches in detail, a number of review articles have been published in the past years. Among them, we recommend [8,9], where a particularly wide range of further specific references can be found. All of these approaches offer a complete procedure for the numerical solution of PDEs.
At the same time, in the last 75 years, a wide class of conventional numerical methods were developed for solving PDEs. A natural idea is to combine these with the efficient computational tools offered by the ANNs. This is also the aim of the present study. In other words, we propose here an ANN-assisted numerical solution of PDEs. In this framework, instead of the entire approximation procedure, only an important and technical step of a conventional numerical method is executed using an ANN.
In the course of the numerical solution of a PDE, we usually first have to discretize the underlying problem, including the governing equation and the boundary data. While one can also choose purely function-space-based discretizations, the discretization is mostly connected to a physical discretization of the computational domain where the equation is posed. Generating a suitable mesh for this purpose is still a challenging problem. At this step, one can also make use of ANN-based procedures; see, e.g., [10][11][12].
Having this mesh at hand, choosing a specific finite element or finite difference discretization method and, finally, incorporating the given boundary data, we obtain an algebraic (or linear algebraic) system of equations. The construction of such finite-difference-based approximations will be the focus of our study.
In the case of uniform rectangular meshes, we can easily derive finite difference approximations of various differential operators and quickly convert the original problem into a system of equations. Otherwise, which is the case in almost all real-life engineering problems, we need to deal with simplicial (or even hybrid) meshes and use an appropriate Galerkin-type discretization. Generating this mesh with the corresponding data structure and collecting (assembling) the system of equations may be rather time-consuming. For the details of this procedure, we refer to [13], Section 8. In the case of linear problems like the Laplace equation, this can even take longer than the solution of the system of equations. Even for nonlinear problems, including iterative solvers, the assembling may take a significant portion of the computational time [14].
Here, we consider the case of the Laplacian operator. Accordingly, our aim in the present work is to develop an ANN that generates the finite difference discretization matrix of the Laplacian and meets the following requirements: (i) It also works on nonuniform and nonrectangular grids; (ii) It is very quick and uses only neighboring relations between the grid points; (iii) It delivers an accurate discretization of the Laplacian.
Note that the numerical approximation of the Laplacian operator has a long history and a number of efficient approximations were elaborated. It has a central role not only in classical diffusion problems but also in several computationally intensive mathematical models, such as equations for wave propagation, Navier-Stokes equations or free-surface wave equations [15].
In summary, the main motivation of the present work is to utilize the tool of ANNs for the accurate discretization of the Laplacian operator. We perform this in the framework of a finite difference approximation. Our contribution here is to provide the coefficients for nonregular meshes. Instead of a pointwise optimization procedure, these coefficients will be computed by an ANN, which we train in advance.
The structure of our contribution is the following. After a short review, we discuss the principle of our algorithm. Then, the two main components, the procedure for generating learning data and the construction of the neural network, are explained. Finally, in a series of numerical experiments, an experimental analysis is carried out to test the efficiency of our algorithm.
Note in advance that the methodology we develop here can be easily extended for a number of differential operators, where a computationally efficient finite difference approximation is needed on a nonuniform spatial mesh.

Materials and Methods
We first state the mathematical problems to solve. To ease the presentation, we perform the study in the two-dimensional case, but the principle is still valid in three space dimensions.

Problem Statement and Mathematical Background
We investigate the numerical approximation of the Laplacian operator at a given point x_0. Here, {x_j}_{j=1}^J denotes the set of grid points adjacent to x_0, with given function values {u(x_j)}_{j=1}^J and u(x_0), respectively, and we seek an approximation of the form

∆u(x_0) ≈ ∑_{j=0}^J a_j u(x_j). (1)

In practice, the coefficients {a_j}_{j=0}^J are to be optimized, and they depend on the position of the grid points. Indeed, we apply this approximation for a spatial discretization of a domain Ω, which can be regarded as a grid. The aim of such an approximation is the numerical solution of some PDE. To demonstrate the principles, we consider the simple Poisson problem

∆u = f in Ω, u = g on ∂Ω, (2)

where Ω ⊂ R^2 denotes the computational domain, u : Ω → R is the unknown function and g : ∂Ω → R is given. The well-posedness of (2) is ensured in H^1(Ω) provided that Ω is a bounded Lipschitz domain, f ∈ H^{-1}(Ω) and g ∈ H^{1/2}(∂Ω) (see [16]). Concerning the numerical solution of (2), in the simplest case, using a uniform rectangular grid, we locally have the geometric setup shown on the left in Figure 1. Here, for the approximation of ∆u(x_0), we use the function values at x_1, x_2, x_3 and x_4, which are considered the neighbors of x_0.
Then, at a given grid point x_0, we have the classical five-point approximation

∆u(x_0) ≈ ∑_{j=0}^4 a_j u(x_j), (3)

so that in this case, in (1), we will have

a_1 = a_3 = 1/h_x^2, a_2 = a_4 = 1/h_y^2, a_0 = -2/h_x^2 - 2/h_y^2, (4)

where h_x and h_y denote the grid spacings in the two coordinate directions. This has the approximation order 2, i.e., for the difference of the two sides in (3), the choice in (4) delivers the approximation order 2 with respect to both variables:

|∆u(x_0) - ∑_{j=0}^4 a_j u(x_j)| = O(h_x^2 + h_y^2), (5)

provided that the function u is four-times continuously differentiable. While this may seem a strict assumption, for a number of PDEs (such as diffusion problems) including the Laplacian operator, we have smooth analytic solutions. In practice, the pointwise approximations in (3) are assembled into a matrix D_{h,FD}. Incorporating also the boundary condition and using the vector f_h for the pointwise values of f in the grid points, we finally have a linear algebraic problem to solve:

D_{h,FD} u_{h,FD} = f_h. (6)

The solution u_{h,FD} is then called a finite difference (FD) approximation of u in (2). In another well-known case, one can discretize any domain into (possibly curved) triangles and, based on this tessellation, a first-order finite element (FE) approximation can be applied. This situation in a special geometric case is presented on the right in Figure 1.
Here, the neighbors of x_0 are the points x_1, x_2, . . . , x_6. In this case, instead of the direct form of the Laplacian operator, its weak form is discretized. At the same time, in the course of the numerical solution of (2), this approach also leads to a linear system to solve:

D_{h,FE} u_{h,FE} = f_{h,FE}. (7)

Here, the unknown vector u_{h,FE} ∈ R^N consists of the coefficients of a finite element basis, while f_{h,FE} represents the scalar products of f with the basis elements. Usually, a lifting also has to be applied to incorporate the boundary conditions. In any case, again, the matrix D_h ∈ R^{N×N} can be regarded as an approximation of ∆ on the entire domain Ω. Accordingly, in the specific case shown on the right of Figure 1, we can determine the coefficients as in (4), which gives the entries in (8). In the course of computing these coefficients, we have used the ratio h_s = (5/3)·h_x. Note that this discretization leads to a second-order convergence with respect to the L^2(Ω)-norm.
To compare the above cases, we also give the position of the nonzero elements and the total number of nonzeros in the matrices D_{h,FD} and D_{h,FE} corresponding to the discretizations in (4) and (8), respectively. To visualize the nonzero entries of these matrices, we have depicted them in Figure 2. While the finite element discretization is flexible regarding the geometry of the domain, compared to the simple finite difference approximation given by D_{h,FD}, it has some drawbacks as well:
• We need more nonzero entries in the matrix D_{h,FE}, as one can observe in Figure 2. This increases both the assembly time of the matrices and the solution time of the corresponding linear systems.
• We need to generate and store the entire structure of the mesh, including a list of triangles with their nodes and faces.
• On a nonuniform grid, we need to transform the gradients of the basis functions on each of the triangles according to the transformation of the reference triangle, computing Jacobian matrices and their inverses (see [13]).
To avoid all of these, our objective is to develop a finite difference approximation, which has the beneficial properties in (i)-(iii).
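For later reference, the classical five-point approximation in (3) and (4) on a uniform grid (h_x = h_y = h) can be checked with a short sketch. This is only an illustration of the baseline formula, not code from the paper:

```python
def five_point_laplacian(u, x0, h):
    """Classical five-point approximation (3) with the uniform-grid
    coefficients (4), taking h_x = h_y = h."""
    x, y = x0
    return (u(x - h, y) + u(x + h, y) + u(x, y - h) + u(x, y + h)
            - 4.0 * u(x, y)) / h**2

# The stencil is exact for at most second-order polynomials,
# e.g. u = x^2 + 3y^2, whose Laplacian is 2 + 6 = 8:
u = lambda x, y: x**2 + 3.0 * y**2
approx = five_point_laplacian(u, (0.4, 0.7), h=0.1)
```

For smooth non-polynomial functions, the same call exhibits the second-order error stated in (5).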
A direct way of constructing such a second-order approximation is, however, impossible, even in the one-dimensional case using two neighboring grid points. This is stated, e.g., in the early contribution [17]. One should also note that, assuming a quasi-uniform one-dimensional mesh, the second-order accuracy can be preserved. Here, the difference between neighboring grid distances should be one order of magnitude smaller than the grid size.
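To make this concrete, consider the standard three-point approximation of u'' on a nonuniform 1D mesh. The following sketch (an illustration of the classical formula, not part of the proposed method) confirms that the leading truncation error is proportional to h_2 - h_1, so genuine second order is lost unless the mesh is quasi-uniform:

```python
def second_derivative_3pt(um, u0, up, h1, h2):
    """Three-point approximation of u''(x) on a nonuniform 1D mesh,
    with um = u(x - h1), u0 = u(x), up = u(x + h2)."""
    return 2.0 * (um / (h1 * (h1 + h2)) - u0 / (h1 * h2)
                  + up / (h2 * (h1 + h2)))

x, h1, h2 = 0.5, 0.10, 0.15
u = lambda t: t**3                 # u''(x) = 6x, u''' = 6
approx = second_derivative_3pt(u(x - h1), u(x), u(x + h2), h1, h2)
err = approx - 6.0 * x             # equals (h2 - h1)/3 * u''' for a cubic
```

For h_1 = h_2, the error term vanishes and the formula reduces to the standard second-order central difference.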
In a general case, a possible idea to obtain a second-order (or even higher-order) finite difference scheme without using a wide stencil (i.e., data from more neighboring grid points) is offered by the so-called compact difference schemes introduced in [18]. In this framework, the coefficients are not given explicitly; they have to be computed from a linear system. In the corresponding equalities, on the left-hand side, a linear combination of function values is given as in (1) or (3), while on the right-hand side, again, a linear combination of second-order derivatives arises. This idea was developed further; see, e.g., [19].
Another approach is to introduce further appropriate grid points, where the original simple finite differences deliver a higher-order approximation. Such a construction is analyzed in detail in [20], which also discusses the stability issues for the corresponding time-dependent problems.
We choose here an alternative way. Without introducing new variables or solving linear systems for optimal finite difference coefficients, an ANN will be utilized for constructing an optimal finite difference approximation.

Principles of the Present Algorithm
A natural choice would be to generalize (3) while preserving the second-order accuracy. As we have mentioned, we cannot simply provide an explicit formula for this. Instead, we could perform an optimization procedure to determine the coefficients, which approximate the Laplacian operator as accurately as possible.
Before digging into the details, we immediately reduce the setup of this problem:
• One can obviously assume the translation invariance of the approximation. Accordingly, we may assume that x_0 = (0, 0).
• Also, any physically meaningful approximation should be scale-invariant. In concrete terms, if we have the optimal coefficients {a_j}_{j=0}^4 for some configuration {x_j}_{j=1}^4, then we should have {(1/r^2)·a_j}_{j=0}^4 for the configuration {r·x_j}_{j=1}^4.
However, a separate optimization procedure in each of the grid points would be extremely time-consuming. This is the point where artificial neural networks come into the picture, and the following main principle is established.
An ANN should be trained to learn this optimization in the sense that it should give the coefficients {a_j}_{j=0}^4 in (1) using the positions {x_j}_{j=1}^4 as the input. Formally, we associate the mapping N with the ANN, such that

N(x_1, x_2, x_3, x_4) = (a_1, a_2, a_3, a_4).

First, we reduce the problem again by requiring that the Laplacian of any constant function is computed exactly at (0, 0). This means that ∑_{j=0}^4 a_j = 0, so that we have

a_0 = -(a_1 + a_2 + a_3 + a_4).

With this simplification, we have to optimize only the set of parameters {a_1, a_2, a_3, a_4}. Note that a second-order approximation can reconstruct the Laplacian of all at most second-order polynomials.

Construction of Learning Data
A main cornerstone for the practical construction of a corresponding ANN is to create a suitable learning data set. To obtain this, we compute the optimal coefficients {a_j}_{j=1}^4 for a number of possible positions {x_j}_{j=1}^4. In concrete terms, we start from the standard geometry x_1 = (-1, 0), x_2 = (0, -1), x_3 = (1, 0), x_4 = (0, 1). By means of the scale-invariance property, this will be sufficient for all configurations. Adding a relatively small random number to all coordinates, we perturb them to obtain the points {x̃_j}_{j=1}^4, as shown in Figure 3. This can be performed simultaneously to obtain a nonuniform tessellation of a computational domain Ω (see Figure 4). For these given data, we approximate the optimal coefficients {a_j}_{j=1}^4 as follows. In the perturbed grid points {x̃_j}_{j=1}^4, we "solve" the overdetermined system of equations (10) for the unknown coefficients (a_1, a_2, a_3, a_4) in the least-squares sense. In rough terms, we are looking for the coefficients (a_1, a_2, a_3, a_4) that can reconstruct the Laplacian of the polynomials p_1, p_2, p_3, p_4 and p_5 in (9) as accurately as possible.
Here, we have applied a weight of 100 to the equations corresponding to the first-order polynomials. This weight is a hyperparameter: the above value delivered a good performance regarding the final approximation properties. The motivation for applying it is to ensure the first-order accuracy.
In the implementation, we use the efficient least-squares solver numpy.linalg.lstsq of Python for the problem in (10). This was performed for 1600 random perturbations. Note that both the random perturbations and the least-squares solution can be implemented in a vectorized form, enhancing the efficiency of the whole procedure.
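The generation of one training sample can be sketched as follows. Since (9) and (10) are not reproduced here, this sketch assumes that the five polynomials are p = x, y, x^2, xy, y^2 and that the weight of 100 multiplies the rows belonging to the first-order polynomials; the function names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard stencil x_1, ..., x_4 around the origin, as in the text.
base = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 0.0], [0.0, 1.0]])

def training_sample(eps=0.2, w=100.0):
    """One (input, target) pair: perturbed neighbor positions and the
    least-squares finite difference coefficients (a_1, ..., a_4).
    Assumed polynomials: p = x, y, x^2, x*y, y^2 (first-order rows weighted by w)."""
    pts = base + rng.uniform(-eps, eps, size=(4, 2))
    x, y = pts[:, 0], pts[:, 1]
    # Rows: p evaluated at the neighbors; right-hand side: Laplacian of p at the origin.
    A = np.stack([w * x, w * y, x**2, x * y, y**2])
    b = np.array([0.0, 0.0, 2.0, 0.0, 2.0])
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pts.ravel(), a   # 8 input coordinates, 4 target coefficients

# Sanity check: the unperturbed stencil recovers the classical coefficients 1, 1, 1, 1.
inp, a = training_sample(eps=0.0)
```

In the actual procedure, many such samples are generated at once in vectorized form, as noted above.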

Setup of the Artificial Neural Network
Having an appropriate set of learning data at hand, the final practical issue is to design a feasible neural network for computing the above coefficients {a_j}_{j=1}^4. In any case, to keep the computational costs at a low level, it should contain relatively few parameters. Also, since the mapping N is nonlinear, we need to incorporate nonlinear activation functions. In concrete terms, we have applied the so-called SeLU activation function, which is given by

SeLU(x) = α_0 · x if x > 0, and SeLU(x) = α_0 · α_1 · (e^x - 1) if x ≤ 0.

Within the Keras package of Python, we have used the standard parameters α_0 ≈ 1.0507 and α_1 ≈ 1.6733, respectively.
After a long series of experiments, the following structure is proposed:
• The input layer is of size 8, receiving the 8 coordinates of (x_1, x_2, x_3, x_4).
• The first hidden layer, a dense one, is of size 4, using the SeLU activation function and a bias term.
• The second hidden layer is again a dense one, of size 12, using the SeLU activation function and a bias term.
• Finally, the output layer is also dense and of size 4, and it is equipped with a linear activation function without a bias term.
The above structure is displayed in Figure 5. Altogether, this setup contains 144 parameters, which have to be optimized during the learning procedure. Note that in the majority of applications of ANNs, a much larger number of parameters is used. At the same time, for the efficiency of the algorithm, we need a really simple setup, whose cost is comparable to that of a single matrix-vector product. Also, since we obtain an acceptable accuracy with 144 parameters, a more sophisticated network could suffer from overfitting and would require an extra regularization procedure.
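The forward pass of this architecture and its parameter count can be reproduced in a few lines of plain NumPy (random, untrained weights; a sketch of the layer shapes only, not the trained network):

```python
import numpy as np

def selu(x, a0=1.0507, a1=1.6733):
    # SeLU activation with the standard Keras parameters quoted in the text.
    return a0 * np.where(x > 0, x, a1 * (np.exp(x) - 1.0))

rng = np.random.default_rng(1)
# Layer sizes of the proposed network: 8 -> 4 -> 12 -> 4 (output: linear, no bias).
W1, b1 = rng.normal(size=(8, 4)),  rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 12)), rng.normal(size=12)
W3     = rng.normal(size=(12, 4))

def network(x):
    """Forward pass of the proposed architecture."""
    h1 = selu(x @ W1 + b1)
    h2 = selu(h1 @ W2 + b2)
    return h2 @ W3            # linear output layer without a bias term

# Parameter count: (8*4 + 4) + (4*12 + 12) + 12*4 = 36 + 60 + 48 = 144.
n_params = W1.size + b1.size + W2.size + b2.size + W3.size
coeffs = network(np.ones(8))  # one stencil (8 coordinates) in, 4 coefficients out
```

This confirms the count of 144 trainable parameters stated above.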
Altogether, the algorithm to build the matrix D_{h,NN} for the approximation of the Laplacian consists of the following steps:
(i) Generate randomly perturbed grid points and training data using the least-squares approximation for (10).
(ii) Construct the ANN in Figure 5 and train it using the above data.
(iii) Construct a quadrilateral spatial grid on the computational domain Ω.
(iv) For each grid point x_0 and its neighbors x_1, x_2, x_3 and x_4, apply an affine linear mapping to transform x_0 to the origin and scale one distance to be one.
(v) Apply the ANN to this normalized setup and scale back the distances to obtain the finite difference coefficients (a_1, a_2, a_3, a_4) for the neighboring points.
(vi) Compute a_0 = -a_1 - a_2 - a_3 - a_4 to have all the coefficients at hand.
(vii) For an arbitrary function u, approximate its Laplacian in all interior grid points by assembling the matrix D_{h,NN}.

Remarks:
1. Note that in Python, the above operation of the ANN can also be vectorized, leading to an efficient implementation.
2. We can incorporate any inhomogeneous Dirichlet boundary data, since these values are simply given on the boundary grid points.
3. We could make this procedure completely mesh-free, using only grid points. At the same time, using a quadrilateral mesh, the neighbors of a given grid point can be identified automatically. Also, these kinds of meshes are practically useful (see, e.g., [21]) and can be obtained using the frequently used mesh generators.
4. Here, D_{h,NN} has the same structure as D_{h,FD} (see the left of Figure 2), but the nonzero entries were computed using our ANN.
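The per-point normalization and rescaling steps can be sketched as follows; here `coeff_model` is a hypothetical stand-in for the trained network, and a dummy model returning the uniform five-point coefficients (all ones on the reference stencil) serves only as a sanity check:

```python
import numpy as np

def local_coefficients(x0, nbrs, coeff_model):
    """Compute the stencil coefficients for one grid point: normalize the
    stencil (translate x0 to the origin, scale one distance to be one),
    query the coefficient model, and scale back.
    `nbrs` is a (4, 2) array of neighbor positions."""
    shifted = nbrs - x0                        # translate x0 to the origin
    r = np.linalg.norm(shifted[0])             # scale one distance to be one
    a = coeff_model((shifted / r).ravel())     # model input: 8 coordinates
    a = a / r**2                               # scale invariance: a_j -> a_j / r^2
    a0 = -np.sum(a)                            # exactness for constant functions
    return a0, a

# Sanity check on a uniform stencil with spacing h = 0.1: we expect the
# classical values a_j = 1/h^2 = 100 and a_0 = -4/h^2 = -400.
model = lambda x: np.ones(4)                   # hypothetical stand-in model
x0 = np.array([0.3, 0.3])
nbrs = x0 + 0.1 * np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
a0, a = local_coefficients(x0, nbrs, model)
```

In the full algorithm, these local coefficients are written into the corresponding row of the sparse matrix, and the normalization is vectorized over all grid points.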
In practical cases, for the numerical solution of (2), we not only have to approximate the Laplacian but also have to solve a problem of the form given in (6).

A Summary: The ANN-Assisted Numerical Method
By incorporating the ANN-based approximation into the conventional FD numerical method, we summarize the proposed PDE solver for the Poisson problem (2) in Figure 6.

Results: An Experimental Analysis
We first test the performance of the ANN, then the approximation property of the matrix D_{h,NN} and, finally, the computational error regarding u_{h,NN}.

Performance of the ANN
In all experiments, we use the same ANN, with the corresponding hyperparameters of the training procedure fixed in advance. The performance of the learning algorithm is shown in Figure 7. According to this, the ANN could learn the coefficients of the approximation. Also, in Figure 8, one can observe a good agreement between the testing and the training loss. Note that further increasing the size of the layers does not result in a further decrease in the testing loss, while more epochs are needed to arrive at the same level of accuracy.

Performance of the ANN-Based Approximation of the Laplacian
In the first series of experiments, we investigate the accuracy of the numerical approximation of the Laplacian. In concrete terms, we compare the standard five-point approximation in (3) with the approximation provided by the trained ANN.
The computational domain is Ω = (0, 1) × (0, 1), and we apply a nonuniform grid, as shown in Figure 4. Here, we have used 21 × 21 grid points.
The approximation was tested in the case of the model problem (11), which has the analytic solution u(x, y) = x^2 · cos(xy). Our aim here is to compare the pointwise, ANN-based approximation with the exact operator ∆. To quantify this, we compute the quantity

( ∑_j ( (D_{h,NN} u_h)_j - ∆u(x_j) )^2 )^{1/2}, (12)

which is the l^2-norm of the error of the approximation D_{h,NN} ≈ ∆ on the grid. Here, the summation index j runs over all of the interior grid points x_j.
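Since the trained network is not reproduced here, the error quantity of the form (12) can be illustrated with the conventional five-point stencil on a uniform grid as a stand-in; the analytic Laplacian of u(x, y) = x^2 · cos(xy) is computed by hand:

```python
import numpy as np

def lap_u(x, y):
    # Analytic Laplacian of u(x, y) = x^2 * cos(x*y):
    # u_xx = 2cos(xy) - 4xy sin(xy) - x^2 y^2 cos(xy), u_yy = -x^4 cos(xy).
    return (2.0 * np.cos(x * y) - 4.0 * x * y * np.sin(x * y)
            - (x**2 * y**2 + x**4) * np.cos(x * y))

def l2_error(n):
    """l2-norm of (five-point Laplacian - exact Laplacian) over the interior
    points of a uniform n x n grid on (0,1) x (0,1); the conventional stencil
    serves as a stand-in for the ANN-based matrix."""
    h = 1.0 / (n - 1)
    t = np.linspace(0.0, 1.0, n)
    X, Y = np.meshgrid(t, t, indexing="ij")
    U = X**2 * np.cos(X * Y)
    D = (U[:-2, 1:-1] + U[2:, 1:-1] + U[1:-1, :-2] + U[1:-1, 2:]
         - 4.0 * U[1:-1, 1:-1]) / h**2
    return np.sqrt(np.sum((D - lap_u(X[1:-1, 1:-1], Y[1:-1, 1:-1]))**2))

e21, e41 = l2_error(21), l2_error(41)   # the error shrinks under refinement
```

On the nonuniform grids of the experiments, the same quantity is evaluated with the ANN-generated coefficients instead of the uniform stencil.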
We compare this quantity with the corresponding l^2-norm obtained by the standard five-point approximation; this is the quantity in (13). Here, according to the geometric setup in Figure 1, the indices j+1, j+2, j+3 and j+4 denote the indices of the neighboring grid points of x_j. While the first quantity is approximately 0.5, the second one is approximately 6. These values are averages over a number of experiments; the random positions in Figure 4 can affect the above error terms.
This shows a real advantage of our approach over the conventional approximation. At the same time, in practice, we also need to solve the problem (11) numerically.

Performance of the ANN-Based Numerical Solution of (2)
In this series of experiments, we investigate the model problem (14), which has the analytic solution u(x, y) = y + x · cos(4y). The domain Ω was discretized using a nonuniform grid. The motivation for this arises from the study of moving boundary problems for water waves [15], where the velocity potential exhibits more variation near the free surface. Accordingly, for an equally accurate discretization, we need more grid points there compared to the bottom region. Also, the number of grid points is frequently identical on each vertical segment. Using this convention, we can easily record the neighboring relations. A corresponding grid is shown in Figure 9. In concrete terms, we have 21 grid points along each vertical segment, and we have 21 segments in the discretization. Performing again a comparison between the ANN-based and the conventional approximation of the Laplacian, using the errors given by (12) and (13), we obtain 0.33 and 6.5, respectively. Here, our approach shows a significant advantage. We have also compared the final computational errors in the discrete L^2-norm. Here, u_{h,FD} denotes the numerical solution of (14) using the conventional finite difference method, i.e., solving (6) with the matrix D_{h,FD} based on the five-point discretization. Likewise, u_{h,NN} denotes the solution of the equation D_{h,NN} u_{h,NN} = f_h, where D_{h,NN} is the discretization matrix obtained by our ANN. The comparison of the above errors also shows the advantage of our approach.
Another way of obtaining an accurate discretization matrix is to perform the optimization procedure discussed in Section 2.3 pointwise in each x 0 . This, however, significantly increases the computational time. Regarding this, a comparison is shown in Table 1. The proposed ANN-based algorithm becomes faster only for a large number of grid points. This is typically the case in real-life situations.

Discussion and Conclusions
We have presented an ANN-based finite difference approximation for Poisson problems on nonuniform grids. The ANN given in Section 2.4 could deliver an appropriate matrix D_{h,NN} to approximate the Laplacian. In the framework of an FD approximation (see Figure 6), this could also be used to enhance the accuracy of the numerical solution of (2), as pointed out in Section 4.3. Also, our algorithm offers a fast computation of the FD coefficients (see Table 1).
The main potential of the present approach lies in its application to moving boundary problems. In the course of the numerical solution of these problems, we have to generate a new mesh in each time step, which requires a highly efficient discretization procedure.
The construction of such a complex algorithm will be the continuation of this research.