Physics-Informed Neural Networks for Solving the Coupled Stokes–Darcy Equations

In this paper, a grid-free deep learning method based on a physics-informed neural network is proposed for solving the coupled Stokes–Darcy equations with Beavers–Joseph–Saffman interface conditions. The method avoids grid generation and can greatly reduce the amount of computation when solving complex problems. Although the original physics-informed neural network algorithm has been used to solve many differential equations, we find that directly applying it to the coupled Stokes–Darcy equations does not provide accurate solutions in some cases, such as stiff terms caused by small parameters and interface discontinuity problems. In order to improve the approximation ability of a physics-informed neural network, we propose a loss-function weighting strategy, a parallel network structure strategy, and a local adaptive activation function strategy. In addition, the physics-informed neural network augmented with these strategies provides inspiration for solving other, more complicated multi-physics coupling problems. Finally, the effectiveness of the proposed strategies is verified by numerical experiments.


Introduction
The coupled Stokes-Darcy equations studied in this paper have many applications in the physical and engineering sciences. For example, in reservoir modeling, to model heterogeneous porous media, the permeability field is often assumed to be a multiscale function with high contrast and discontinuous features. There are also studies modeling the evolution of fibroblast shape and position under stress [1]. That model uses continuum mechanics to describe a stress-induced phase transition: the cell body is modeled as a linear elastic matrix, and the cell surface evolves according to a specific kinetic relationship. In this model, the stress tensor is discontinuous at the cell surface because cell contraction changes the strain tensor. The stiff terms in the Stokes-Darcy system caused by small parameters, together with the discontinuity in the normal velocity due to the imbalance of the normal stress at the interface, make our problem difficult to solve. Moreover, for problems with complex domains, curved interfaces, or high dimensions, mesh generation becomes difficult. Therefore, designing an accurate, efficient, and stable meshless numerical approximation algorithm has become a focus of the literature. Studies [2-7] are relevant here, and we refer interested readers to them.
In recent years, deep learning methods have achieved unprecedented success in application fields such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, and bioinformatics; in some cases they outperform human experts [8,9]. Driven by these exciting developments, researchers began to study in depth how to use deep neural networks to solve partial differential equations [10-13]. In particular, Raissi et al. proposed physics-informed neural networks (PINNs) for solving partial differential equations and for data-driven discovery [14]. The results show that, given suitable initial and boundary conditions, PINNs can solve some partial differential equations very well. Since then, the door to solving partial differential equations with deep neural networks has been opened, and works based on PINNs [15-24] have been published one after another. For example, Lu et al. [25] expounded the differences between the traditional finite element method and deep neural networks for solving partial differential equations in terms of the choice of basis functions, the solution process, the error sources, and the error order. Mishra et al. [26] and Jagtap et al. [27,28] have also made important contributions to this line of work. This paper mainly studies a new meshless numerical approximation method based on deep neural networks [29] for the problems we care about. Our main goal is to study strategies that improve the ability of deep neural networks, and of PINNs in particular, to solve the Stokes-Darcy model. We propose strategies to improve the accuracy and efficiency of deep neural networks and provide several numerical examples to demonstrate our approach. It should be noted that, due to the randomness of the initial parameters when training the network, our numerical results fluctuate within a certain range.
The structure of this paper is as follows: an introduction to the Stokes-Darcy fluid coupling problem and its mathematical model is given in Section 2. In Section 3, the related knowledge of the neural network and strategies to improve the approximation properties of PINNs are introduced. The accuracy and reliability of the proposed strategy are verified by numerical examples in Section 4. Concluding remarks and outlook are given in Section 5.

Problem Setup
In this part, we introduce the mathematical model of the problem and the corresponding interface conditions. The Stokes-Darcy fluid coupling system is posed on a given region Ω; the interface Γ divides Ω into Ω_s and Ω_d, representing the regions of Stokes flow and Darcy flow, respectively. For simplicity, Γ_s and Γ_d denote the boundaries of Ω_s and Ω_d, excluding the interface. n and τ denote the outward normal vector and the tangent vector, respectively. On the interface Γ, n_s denotes the normal vector pointing from the Stokes region to the Darcy region, n_d the normal vector pointing from the Darcy region to the Stokes region, and τ the tangent vector.
To describe the Stokes-Darcy fluid-coupling system mathematically, the motion of the fluid in Ω_s and Ω_d is described by the Stokes equations and Darcy's law, respectively. We often need to distinguish between physical quantities in Ω_s and Ω_d, especially at the interface Γ. Therefore, quantities in the Ω_s region carry the subscript s, and quantities in the Ω_d region carry the subscript d. The governing system [7,30,31] is as follows. Fluid region (Stokes equations): here u_s is the fluid velocity, p_s is the kinematic pressure, f_s is the external force, ν > 0 is the kinematic viscosity of the fluid, T(u_s, p_s) = 2νD(u_s) − p_s I is the stress tensor, D(u_s) = (1/2)(∇u_s + ∇u_s^T) is the deformation tensor, and I is the identity tensor.
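For reference, a standard statement of the Stokes system consistent with the symbols defined above (cf. [7,30,31]) is sketched below; the exact form used here may differ in minor details:

```latex
-\nabla \cdot \mathbb{T}(\mathbf{u}_s, p_s) = \mathbf{f}_s
  \quad \text{in } \Omega_s, \qquad
\nabla \cdot \mathbf{u}_s = 0 \quad \text{in } \Omega_s,
\]
\[
\mathbb{T}(\mathbf{u}_s, p_s) = 2\nu \mathbb{D}(\mathbf{u}_s) - p_s \mathbb{I},
\qquad
\mathbb{D}(\mathbf{u}_s) = \tfrac{1}{2}\bigl(\nabla \mathbf{u}_s + \nabla \mathbf{u}_s^{T}\bigr).
```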

Rock matrix (Darcy equations)
The above equations are the Darcy equations describing fluid flow in the porous-media region; see [2,30-32]. Here u_d is the fluid velocity in the porous medium, p_d is the dynamic pressure, and f_d is the external source term. The permeability K is a symmetric positive-definite tensor that is allowed to vary in space.
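A common statement of the Darcy system consistent with the symbols above is sketched below; the exact scaling of K and ν varies across the cited literature, so this particular form is an assumption:

```latex
\mathbf{u}_d = -\,\mathbb{K}\,\nabla p_d \quad \text{in } \Omega_d
  \qquad \text{(Darcy's law)},
\]
\[
\nabla \cdot \mathbf{u}_d = f_d \quad \text{in } \Omega_d.
```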

Outer boundary
Here, for simplicity, we consider the Dirichlet boundary conditions on the Stokes side and the Neumann boundary conditions on the Darcy side.
Obviously, the pressure is unique only up to an additive constant, so we impose the additional constraint ∫_Ω p dxdy = 0.

Interface conditions
Here, (4a) is the continuity condition for the normal velocity at the interface, obtained from conservation of mass; (4b) is the balance condition for the normal stress of the fluid at the interface, obtained from the normal force balance; and (4c) is the well-known Beavers-Joseph-Saffman (BJS) interface condition [30,33-35], where the parameter α is a constant associated with friction.
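For reference, a common statement of conditions (4a)-(4c) from the cited literature [30,33-35] is sketched below; the precise coefficient in (4c) varies between references, so this form is an assumption:

```latex
\mathbf{u}_s \cdot \mathbf{n}_s + \mathbf{u}_d \cdot \mathbf{n}_d = 0
  \quad \text{on } \Gamma, \tag{4a}
\]
\[
-\,\mathbf{n}_s \cdot \mathbb{T}(\mathbf{u}_s, p_s)\,\mathbf{n}_s = p_d
  \quad \text{on } \Gamma, \tag{4b}
\]
\[
-\,\boldsymbol{\tau} \cdot \mathbb{T}(\mathbf{u}_s, p_s)\,\mathbf{n}_s
  = \frac{\alpha \nu}{\sqrt{\boldsymbol{\tau} \cdot \mathbb{K}\,\boldsymbol{\tau}}}\,
    \mathbf{u}_s \cdot \boldsymbol{\tau}
  \quad \text{on } \Gamma. \tag{4c}
```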

Numerical Method
In this part, we will adopt the fully connected deep neural network (DNN) as our basic network to solve the problem. At the same time, the PINNs algorithm framework and some extensions of the algorithm are introduced.

Network Formation
A DNN is a network of many simple, widely parallel connected units. Its organizational structure can simulate the interactive response of a biological nervous system to real-world objects. From the perspective of computer science, a neural network can be regarded as a mathematical model with many parameters, obtained by nesting functions of the form y_i = f_act(∑_i W_i x_i + b_i). We connect the neurons of adjacent layers together. Taking an L-layer neural network as an example, the output of the network is given by (5), where W_i is the weight matrix and b_i is the bias vector of the i-th layer. All the undetermined parameters in (5) are collected in Θ, the parameter space. Equation (5) can also be written as a composition of layer maps, where N_1 = W_1 z + b_1, z is the input variable of the neural network, σ stands for the activation function, and D represents the number of layers of the neural network.
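As a concrete illustration, the layer recursion above can be sketched in a few lines of plain Python (a minimal sketch with hypothetical hand-picked weights; a real PINN would use a deep-learning framework with trainable parameters and automatic differentiation):

```python
import math

def forward(z, weights, biases):
    """Forward pass of a fully connected network: hidden layers apply
    a = sigma(W a + b) with sigma = tanh; the last layer is linear."""
    a = z
    for W, b in zip(weights[:-1], biases[:-1]):
        # a_j = tanh(sum_i W[j][i] * a[i] + b[j]) for each hidden neuron j
        a = [math.tanh(sum(w_ji * a_i for w_ji, a_i in zip(row, a)) + b_j)
             for row, b_j in zip(W, b)]
    W, b = weights[-1], biases[-1]
    return [sum(w_ji * a_i for w_ji, a_i in zip(row, a)) + b_j
            for row, b_j in zip(W, b)]

# Tiny 2-2-1 network with hand-picked (hypothetical) parameters.
weights = [[[0.5, -0.3], [0.1, 0.8]], [[1.0, -1.0]]]
biases = [[0.0, 0.1], [0.2]]
out = forward([1.0, 2.0], weights, biases)
```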

Physics-Informed Neural Networks
In [14], the authors propose using deep neural networks to approximate the solution of partial differential equations (the u-network) and then using automatic differentiation to obtain the differential operators of the equation, yielding an f-network that encodes the physical information of the equation. The boundary and interior loss functions are then established using the least-squares principle. The working process of PINNs is explained below, using the model under consideration as an example.
When solving the Stokes-Darcy equations, we use Latin hypercube sampling to draw the data points and divide them into five parts according to the problem. After the input of the neural network is determined, we use the given boundary conditions and equation information to establish the loss function. Generally, the least-squares method is used, and automatic differentiation [36] is also used in this process. Here, we divide the loss function into five parts: L(x_fs, θ) represents the interior loss in the Stokes region; L(x_fd, θ) the interior loss in the Darcy region; L(x_uΓ, θ) the loss on the interface; and L(x_us, θ) and L(x_ud, θ) the losses on the boundaries of the Stokes region and the Darcy region, respectively. Here {x_fs} are the collocation points inside the Stokes region, {x_fd} the collocation points inside the Darcy region, {x_uΓ} the training data on the interface, and {x_us} and {x_ud} the training data on the Ω_s and Ω_d boundaries, respectively. N_fs, N_fd, N_uΓ, N_us, and N_ud represent the numbers of points in the Stokes region, in the Darcy region, on the interface, on the boundary of the Stokes region, and on the boundary of the Darcy region. After establishing the loss function, we select an appropriate optimization algorithm to train it and update the parameters of the neural network through training and back-propagation. This process is repeated until the preset number of training iterations is reached or the loss function converges; we then obtain an approximate solution of the partial differential equation. Common optimization algorithms include the stochastic gradient descent method, the Newton method, and quasi-Newton methods.
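For instance, Latin hypercube sampling on a rectangular subdomain can be sketched as follows (a minimal stdlib sketch; the subdomain bounds are hypothetical, and production code would more likely use `scipy.stats.qmc.LatinHypercube`):

```python
import random

def latin_hypercube(n, bounds, seed=0):
    """Draw n samples in the box given by bounds = [(lo, hi), ...],
    placing exactly one sample per equal-width stratum in every
    dimension (Latin hypercube sampling)."""
    rng = random.Random(seed)
    dims = len(bounds)
    samples = [[0.0] * dims for _ in range(n)]
    for d, (lo, hi) in enumerate(bounds):
        # one point per stratum [i/n, (i+1)/n), shuffled across samples
        strata = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(strata)
        for i in range(n):
            samples[i][d] = lo + (hi - lo) * strata[i]
    return samples

# e.g. 100 collocation points in a (hypothetical) Stokes subdomain [0,1] x [1,2]
pts = latin_hypercube(100, [(0.0, 1.0), (1.0, 2.0)])
```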
This paper adopts the gradient-based Adam algorithm [37], which has the advantages of an adaptive learning rate and batch computation. In some examples, the Adam algorithm is combined with the L-BFGS algorithm [38]. The working process of PINNs is given in Figure 1, where x and y represent the input of the neural network; f_act in (5) and σ in the figure both represent the activation function; and u, v, and p represent the outputs of the neural network.
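For reference, a single Adam update has the following form (a self-contained sketch of the standard algorithm from [37], demonstrated here on a toy quadratic rather than a PINN loss):

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update on a flat parameter list."""
    new_theta, new_m, new_v = [], [], []
    for th, g, mi, vi in zip(theta, grad, m, v):
        mi = b1 * mi + (1 - b1) * g          # first-moment estimate
        vi = b2 * vi + (1 - b2) * g * g      # second-moment estimate
        m_hat = mi / (1 - b1 ** t)           # bias correction
        v_hat = vi / (1 - b2 ** t)
        new_theta.append(th - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_theta, new_m, new_v

# minimize f(x) = x^2 starting from x = 1.0; the gradient is 2x
theta, m, v = [1.0], [0.0], [0.0]
for t in range(1, 2001):
    theta, m, v = adam_step(theta, [2 * theta[0]], m, v, t, lr=0.05)
```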

Strategies for Improving Physics-Informed Neural Networks
PINNs have strong approximation ability and can solve many physical problems and describe many physical phenomena, but they have limitations when solving small-parameter problems, such as those involving the fluid viscosity coefficient and the permeability in the Stokes-Darcy system. If the viscosity coefficient of the problem to be solved is very small, the problem becomes harder to solve. There are also limitations when solving interface discontinuity problems: in the Stokes-Darcy equations, if the analytical solution is discontinuous across the interface, standard PINNs cannot solve the problem well. To overcome these limitations, we propose the following strategies.

Add a Weight Function to the Loss Function
One way to improve the accuracy of PINNs is to add a weight function in front of the various terms of the loss function. For small-parameter problems, the weights can be increased to balance the individual losses, so that the network does not focus on training one term and ignore the others. For our problem, according to the specific form of our loss function, we only add weight functions to L(x_fs, θ) and L(x_fd, θ); that is, we replace (7) with (8), where ϕ(ν) and ψ(ν, K) take the reciprocals of the corresponding parameters in the equations, i.e., ϕ(ν) = 1/ν and ψ(ν, K) = K/ν. Here, we do not use adaptive weighting strategies [39,40], because their purpose is to accelerate the convergence of the loss function: by adjusting the weights of the loss terms, the loss value decreases rapidly, but this has little effect on the small-parameter problems we study.
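The weighting strategy can be sketched as follows (a minimal illustration assuming a scalar permeability K and mean-squared residual terms; the residual vectors themselves are hypothetical inputs that a PINN would compute via automatic differentiation):

```python
def weighted_loss(res_s, res_d, res_interface, res_bs, res_bd, nu, K):
    """Composite PINN loss with the weights phi(nu) = 1/nu and
    psi(nu, K) = K/nu applied to the two interior residual terms."""
    mse = lambda r: sum(x * x for x in r) / len(r)
    phi, psi = 1.0 / nu, K / nu
    return (phi * mse(res_s) + psi * mse(res_d)
            + mse(res_interface) + mse(res_bs) + mse(res_bd))
```

For small ν, the factor 1/ν magnifies the Stokes interior residual so the optimizer cannot ignore it relative to the boundary terms.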

Parallel Network Architecture
Another way to improve the approximation ability of PINNs is to decompose the solution region, that is, to divide it into several sub-regions and use an independent network within each sub-region; this is the parallel network structure strategy. Common parallel network algorithms are conservative PINNs (cPINNs) [41] and XPINNs [16], both of which have parallel implementations [42]. cPINNs require nonlinear conservation laws to be satisfied, and the interface part of their loss function is relatively complex, whereas XPINNs are suitable for all differential equations and have a relatively simple interface term. For our model, we use the idea of XPINNs to divide the solution region into two sub-regions and train a neural network in each. The specific training process is shown in Figure 2. The parallel network architecture is very effective for problems whose solutions are discontinuous at the interface, as the numerical examples below will show.
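A minimal sketch of the XPINN-style stitching at prediction time, assuming a horizontal interface and using trivial stand-in functions instead of trained networks (the interface location and the averaging rule on the interface are illustrative assumptions):

```python
def xpinn_predict(x, y, net_s, net_d, interface_y=1.0, tol=1e-9):
    """Route a query point to the subdomain network that owns it
    (XPINN-style decomposition along a horizontal interface at
    y = interface_y); on the interface itself, average the two
    networks so the stitched prediction is single-valued."""
    if abs(y - interface_y) < tol:
        us, ud = net_s(x, y), net_d(x, y)
        return [(a + b) / 2.0 for a, b in zip(us, ud)]
    return net_s(x, y) if y > interface_y else net_d(x, y)

# stand-in "networks" for illustration only
net_s = lambda x, y: [x + y]   # Stokes-side surrogate
net_d = lambda x, y: [x * y]   # Darcy-side surrogate
```

During training, each network is fitted against its own interior residual plus interface terms that tie the two predictions together; only the routing logic is shown here.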

Local Adaptive Activation Function Strategy
The choice of activation function is very important for training neural networks, and a single fixed activation function can no longer meet the needs of solving complex problems. Jagtap et al. proposed the Rowdy activation function [43], which has good properties for solving partial differential equations with high-frequency components, and also proposed adaptive activation functions. In [44], an additional trainable parameter a is introduced into the activation function together with a predefined scaling factor n ≥ 1, so that the activation becomes σ(na x), where a ∈ R controls the slope of the activation function. Since the parameter a is shared by the whole network, this is called the global adaptive activation function (GAAF). Optimizing this parameter dynamically changes the value of the loss function and accelerates its convergence, but GAAF may fail on some complex problems. Therefore, a layer-wise locally defined activation function has been proposed to extend this strategy; that is, a separate slope a is added to the activation function of each hidden layer. This introduces D − 1 additional parameters, which are optimized together with the weights and biases. Unlike the global adaptive activation function, each hidden layer then has its own activation-function slope.
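The layer-wise locally adaptive activation can be sketched as follows (a minimal illustration; `n = 10` and the initialization n·a = 1 follow common practice in [44], but the exact settings used here are not stated in the text, so they are assumptions):

```python
import math

def laaf_tanh(x, a, n=10):
    """Layer-wise locally adaptive activation: sigma(n * a * x),
    with a trainable slope `a` per hidden layer and a fixed
    scale factor n >= 1."""
    return math.tanh(n * a * x)

# each hidden layer carries its own trainable slope (D - 1 extra
# parameters in total); initialized so that n * a = 1
slopes = [1.0 / 10, 1.0 / 10, 1.0 / 10]
```

With n·a = 1 at initialization, the network starts identical to a plain tanh network, and the optimizer is then free to sharpen or flatten each layer's activation independently.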

Numerical Experiments
This section presents several numerical experiments on the two-dimensional coupled Stokes-Darcy equations. First, the accuracy and validity of the numerical method are verified on constructed examples with analytical solutions, and the benefit of the weighted loss function for small-parameter physical problems is demonstrated. Then, an analytical solution that is discontinuous at the interface is constructed, and the numerical results of two different network structures are compared. Next, the more complicated curved-interface problem is solved. Finally, a numerical example without an analytical solution is designed to simulate fluid motion under different permeabilities and viscosities, and velocity streamline plots consistent with the physical laws are obtained.
In the following examples, we measure the error with the relative L2 norm

E(U) = ||U_pred − U_exact||_2 / ||U_exact||_2 = sqrt(∑_{i=1}^{N} |U_pred(x_i) − U_exact(x_i)|^2) / sqrt(∑_{i=1}^{N} |U_exact(x_i)|^2). (11)

Here, N represents the number of points used during neural network training, U_pred represents the predicted value at the corresponding coordinate point, and U_exact represents the analytical value at that point.
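Computed directly, (11) reads as follows (a straightforward stdlib implementation; for vector-valued quantities the same formula is applied to the stacked components):

```python
import math

def relative_l2(pred, exact):
    """Relative L2 error over the N training points, eq. (11):
    ||U_pred - U_exact||_2 / ||U_exact||_2."""
    num = math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, exact)))
    den = math.sqrt(sum(e ** 2 for e in exact))
    return num / den
```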

Interface Continuous Solution Problem
It is difficult to find a solution that satisfies the interface conditions (4b) and (4c) exactly. In this case, we simply extend the interface conditions to include an inhomogeneous term, based on the benchmark problem in [30,31]; in other words, we replace (4b) and (4c) with (13). Figure 3 shows the comparison between the analytical solution and the neural network prediction. Through (11), the relative L2 errors of the physical quantities are E(u_s) = 4.04 × 10^-4, E(u_d) = 1.46 × 10^-3, E(p_s) = 3.48 × 10^-3, and E(p_d) = 4.22 × 10^-5. Table 1 gives the hyper-parameters used in training, where N_us and N_ud represent the numbers of points taken on the boundaries of the Stokes region and the Darcy region; N_fs and N_fd represent the numbers of interior points (here N_fs = N_fd); N_uΓ represents the number of interface points; L_N and N_N represent the number of network layers and the number of neurons in each hidden layer; and N_P represents the number of parameters of the neural network. The optimization algorithm, learning rate η, and activation function are reused in the following examples unless otherwise specified. Next, in Tables 2 and 3, we set the permeability to K = 10^-2 I and 10^-4 I and the fluid viscosity to ν = 10^-1, 10^-2, 10^-3, and 10^-4, and calculate the relative error of each physical quantity. The results show that weighting the loss function is especially helpful for small-parameter problems. For permeability K = 10^-2 I and fluid viscosity ν = 10^-3, the evolution of the loss value and of the relative L2 error of each physical quantity during training is shown in Figure 4.
In the figure, quantities prefixed with W denote results with the weighted loss function, and those without W denote unweighted results; this further shows that our measures are effective.

Table 2. When K = 10^-2 I, the relative L2 error of velocity and pressure under different fluid viscosities.

Table 3. When K = 10^-4 I, the relative L2 error of velocity and pressure under different fluid viscosities.

Next, we study the influence of the depth and width of the neural network on the prediction accuracy, keeping the other hyper-parameters unchanged. For each network depth and width, the numbers of Adam and L-BFGS iterations are both 10,000. As shown in Tables 4 and 5, the prediction accuracy of the model increases with the width and depth of the neural network.

Interface Discontinuity Solution Problem
In this example, we solve the Stokes-Darcy equations with a solution that is discontinuous across the interface. Since the fluid is not continuous when passing through the interface, the solutions in the two regions differ greatly and are difficult to optimize jointly, so it is hard to simulate the fluid in the entire region with only one network. We therefore use the parallel network architecture: one network in the Stokes region and one in the Darcy region, with both networks contributing to the simulation at the interface. The solution region and the parameter α are the same as in the previous example. The right-hand-side terms of the equations and the inhomogeneous terms in the interface conditions are determined by the exact solution

u_s = [−sin^2(πx) sin(πy) cos(πy), sin(πx) cos(πx) sin^2(πy)], p_s = (1/2) sin(πx) cos(πy), p_d = sin(πx) cos(πy). (14)

Table 6 shows the CPU time used by the three algorithms to solve the model under the control of the relevant variables, as well as the relative L2 error of each physical variable. It can be observed that the parallel network architecture takes less time and produces smaller errors. Figure 5 shows the velocity predictions of the three algorithms, i.e., the single-network structure, the parallel network structure, and the parallel network structure with the local adaptive activation function strategy, together with their absolute errors. The absolute-error plots show that the single-network structure simulates the vicinity of the interface poorly, while the parallel network architecture captures the interface well. It should be noted that the comparisons in Figure 5 and Table 6 are made with all other training parameters held fixed. Figure 5. When the solution at the interface is discontinuous, the absolute errors of the single network structure, the parallel network structure, and the parallel network structure with the local adaptive activation function are compared.
For the curved-interface problem, Table 7 shows the hyper-parameters used in training. Figure 6 compares the simulated fluid velocity with the velocity given by the analytical solution and shows the quality of the computation through absolute-error plots. The error is relatively large only near the curved interface, and the simulation elsewhere is very good. Figure 7 shows the distribution of the data points used in each region during training and the evolution of the relative error of each physical quantity. Through (11), the relative L2 errors are E(u_s) = 3.81 × 10^-3 and E(u_d) = 3.74 × 10^-3.

No-True Solution Problem
In this example, we examine the physical phenomena described by the Stokes-Darcy system. Let f_s = 0 and f_d = 0 in (1) and (2). Table 8 shows the hyper-parameters used during training. We first simulate the phenomena exhibited when the permeability is fixed and the fluid viscosity is varied, as shown in Figure 9. The results show that as the viscosity decreases, the motion of the fluid becomes more intense, and both the amount of fluid flowing through the interface and its speed increase. We then simulate the phenomena exhibited when the viscosity is fixed and the permeability is varied, as shown in Figure 10. As the permeability decreases, the amount of fluid passing through the interface decreases, and the velocity of the fluid that has passed through the interface also decreases. The simulations shown in Figures 9 and 10 conform to the expected physical laws.

Conclusions
In this paper, based on the PINN algorithm, we propose several strategies to improve the accuracy of solving the relatively complex Stokes-Darcy model, and the effectiveness of the proposed strategies is verified on examples with small parameters and a discontinuous interface. These strategies are not only applicable to the Stokes-Darcy system but also provide a reference for solving other, more complex multiphysics coupled models. However, we did not derive the convergence rate of the algorithm during training, and the study of local minima and saddle points in the optimization problem also remains a meaningful direction for future work.