A Second-Order Well-Balanced Finite Volume Scheme for the Multilayer Shallow Water Model with Variable Density

: In this work, we consider a multilayer shallow water model with variable density. It consists of a system of hyperbolic equations with non-conservative products that takes into account the pressure variations due to density ﬂuctuations in a stratiﬁed ﬂuid. A second-order ﬁnite volume method that combines a hydrostatic reconstruction technique with a MUSCL second order reconstruction operator is developed. The scheme is well-balanced for the lake-at-rest steady state solutions. Additionally, hints on how to preserve a general class of stationary solutions corresponding to a stratiﬁed density proﬁle are also provided. Some numerical results are presented, including validation with laboratory data that show the efﬁciency and accuracy of the approach introduced here. Finally, a comparison between two different parallelization strategies on GPU is presented.

One possible approach to introduce multilayer shallow water models is the one described by Audusse et al. in [19]. There, the domain is discretized in the vertical direction with an a priori arbitrary number of layers. In each layer, similar assumptions as in the classical shallow water framework are made: the velocity is averaged in the vertical direction, vertical effects are neglected, and the pressure is assumed hydrostatic. In order to allow mass and momentum exchange between layers, an approximation of the vertical flux in the layer is incorporated into the model through non-conservative terms (see [20,21] or [4]). In this way, we are able to obtain a detailed vertical profile of the velocity. Recent developments in multilayer shallow water models concern the local change of the number of layers according to the presence of relevant vertical effects (see [22]).
Another approach to capture complex vertical profiles, but avoiding fully 3D models while calculating the free surface directly, are the so-called σ-coordinate models ( [23]). In those models, as in cost for different parallelization strategies is addressed and in Appendix B, a thorough description of the model derivation is provided.

Governing Equations and Model
The free surface Navier-Stokes equation in a d-dimensional space (d = 2, 3) are considered. We denote by v = (u, w) the velocity function with u ∈ R d−1 the horizontal components of the velocity and w the vertical one. We will consider two continuity equations: one corresponding to an incompressible flow with constant density ρ 0 and the other corresponding to the advected flow with variable density, ρ, where k = (0, 1) T , g is the constant gravity function, ρ is the total density of the fluid, and the total stress tensor is given by Σ = −pI + T, with I the identity tensor, p the pressure term, and T the tensor corresponding to the viscous terms, We shall decompose the total density ρ as the sum of a reference density, denoted by ρ 0 , and the corresponding variation from that reference density, denoted by ρ 1 , If we now divide (1) by ρ 0 , we obtain the following system: We shall now consider a multilayer shallow water approach to system (4). In the multilayer approach, the fluid domain Ω F (t) is divided along the vertical dimension into a set of M ∈ N * layers of thickness h α (t, x) with M + 1 smooth interfaces Γ α+ 1 2 (t), which shall be described as (see Figure 1): The layers will be then described as Ω α (t) = (x, z) ∈ Ω F (t), such that z  Following the usual approach in multilayer shallow water models, we consider the horizontal velocity and relative density fluctuations to be independent of the vertical variable z: where u α is the horizontal velocity and w α is the vertical one. We also assume that both w α and p α are linear in z inside each layer. Moreover, we suppose that pressure is also continuous at the interfaces and hydrostatic pressure is assumed, so that p α (t, x, z) = p with Here, the component p α+ 1 2 is the hydrostatic pressure at Γ α+ 1 2 (t) and p S denotes the pressure at the free surface. Therefore, the unknowns of the system are the depths of the layers, the density fluctuation, and the horizontal velocities in each layer. Remark that ∂ z w α = −∇ · u α and therefore it is not considered as an unknown. As it is usual in multilayer shallow water type systems, we shall consider l α ∈ (0, 1), α = 1, . . . , M such that M ∑ α=1 l α = 1. Then, the layer thickness is assumed to be proportional to the total depth, so that h α = l α h.
There is no hope for such a particular set v α , θ α , p α to be a solution of the complete equations in the layer Ω α (t). Instead, we shall consider a reduced weak formulation with particular test functions. A detailed derivation of the vertical integration procedure can be found in [20,30], or [4]. A thorough derivation of this specific model is given in Appendix B. For the sake of simplicity, from now on, we will suppose d = 2 and merely present the final version of the model, where the horizontal viscosity terms have been neglected where η = h + b is the free surface and G α± 1 2 α = 1, . . . , M − 1, are the mass exchange terms between layers, defined by We will also consider no mass exchange at the bottom and the free surface, that is, G 1 2 = G M+ 1 2 = 0. In addition, θ α+ 1 2 and u α+ 1 2 are some approximation of u and θ at the interfaces Γ (8) may be rewritten in the following form: Note that pressure terms have been rewritten and reduce to One can easily check that system (10) has non-trivial stationary solutions. Here, we are interested in the particular stationary solutions that satisfy u α = 0, α = 1, . . . , M. For this particular case, one can distinguish two families of stationary solutions: the one corresponding to the choice of a constant relative density function θ α = K, α = 1, . . . , M, and the more general case where θ α is not constant.
In the first case, the stationary solution reduces to the well-known lake-at-rest stationary solutions, characterized in this case by where K 1 ≥ 1 and K 2 are two given constants. For the second more general case, we shall consider smooth stationary solutions. Then, given two known smooth functions b(x) and h(x) ≥ 0, the stationary solutions are characterized as the regular solutions of the following ODE system: Note that the ODE (12) can be solved by an iterative procedure: starting from α = M, where the equation only depends on θ M , and then continue to solve downwards on α so that once that θ M , θ M−1 , . . . , θ α+1 have been calculated, (12) corresponds to a scalar ODE for θ α that may be solved. Note that there exists a trivial class of stationary solutions for the original system (4), which corresponds to a stratified fluid with constant free surface, that is, Unfortunately, these types of solutions are not an exact solution of (12). Therefore, they are not stationary solutions for the multilayer system (10), except for the case of constant bottom b(x) = cst. This is due to the particular choice of the vertical discretization used in the multilayer approach. The layers are proportional to the total thickness and the variables have been averaged inside the layer. As a result, averaging of stratified solutions does not result in a stratified solution (space independent of θ) for the multilayer system, unless the bottom and free surface are constant.
Nevertheless, stationary stratified solutions (13) may be approximated in the multilayer framework by the following set of stationary solutions: with Here,θ α are given constants that should be properly chosen to determine a physical stable solution. The stationary solutions (14) have been obtained from (12) by the iterative procedure described before. Figures 2 and 3 show a particular realization of these types of stationary solutions. In particular, we see the vertical distribution of θ α (x), α = 1, . . . , M, along a given channel for the values M = 5, M = 10, M = 50, and M = 100. We see from the pictures that (14) converges to (13) as M → +∞.   The study of the hyperbolicity of the proposed model (10) is not an easy task. The model is hyperbolic for the case M = 1 but remains an open question in general. Nevertheless, numerical simulations show that the eigenvalues of the system are real. It is possible to give an upper and lower bound for the largest and smallest wave speed, λ max , λ min , respectively. Following [31], the minimum and maximum of the wave speeds are bounded by: where

Numerical Scheme
The full system (10) can be written in the form of an hyperbolic system with conservative fluxes and non-conservative products as follows: where w is the vector of the conserved variables, Remark that the Einstein notation is used to represent the block structures in the previous vector.
We denote by F C (w) the advected or transport flux, given by Here, P(w, η, ∂ x w, ∂ x η) corresponds to the pressure term, which depend on the relative density and the free surface, and is defined by where Finally, the terms T(w, ∂ x w) correspond to the mass, density, and momentum exchange between layers: We recall that G α+ 1 2 is described by (9). We will consider a well-balanced second-order finite volume scheme where pressure terms are discretized using the path-conservative framework introduced by Parés [41]. First, we describe the first order solver and then present its extension to second order, which is done by means of a reconstruction operator that preserves the well-balanced properties of the first order scheme.
Let us suppose, for simplicity, a uniform discretization of the domain in computational cells . We denote by w n i the cell average approximation of the solution at time t n = n∆t given by the numerical scheme: Let us also denote by z B,i the approximation of cell average of the bottom bathymetry at cell I i , that is, In what follows, for any variable f , we shall define its average at the intercell i + 1 2 as: In particular, for a variable f α defined within the layer α, we shall write The difference at the intercell i + 1 2 will be written as We shall denote the average of a variable f α at the layer interface α + 1 2 by In addition, at the bottom or free surface interfaces, we shall assume

First Order HLL-Type Scheme
Following [42], we define a HLL-type scheme for (18) by where Remark that in order to make the notation less cumbersome, we have neglected the subindex i + 1 2 , so that here h stands for h i+ 1 2 , ∆h stands for (∆h) i+ 1 2 and so on. The term T i+ 1 2 corresponds to the approximation of the exchange between layers at the layer interfaces on the intercell i + 1 2 , , where again to simplify notation, we denote and analogously for u The coefficients α 0,i+ 1 2 and α 1,1+/2 are related with the numerical viscosity of the scheme and are defined in terms of the upper and lower bounds of the maximum and minimum of the waves speeds (see [42] for more details):

Hydrostatic Reconstruction
The numerical scheme defined previously is not well-balanced in the sense that it cannot preserve lake-at-rest steady states as the numerical viscosity term α 0,i+1/2 (w i+1 − w i ) does not vanish. In order to obtain a well-balanced scheme, we will combine it with a modified hydrostatic reconstruction technique [43]. Given the states w n i and w n i+1 , and z B,i and z B,i+1 , we shall define the reconstructed states at the interfaces w HR,± i+ 1 2 and z B,1+/2 . To do so, we consider and h HR,− where ( f ) + is the positive part of f . Using (28), we define Note that in (28), the reconstructed heights have been defined as h HR = (·) + , that is, using the positive part of the values therein. In [43], the authors show that the hydrostatic reconstruction technique is positive provided that the underlying conservative scheme used is positive. To prove this fact, the authors prove that it is essential to preserve the positivity of the reconstructed water heights, which justifies the use of the positive part. Remark that in general, for smooth topographies, one would have that the jumps of the bottom at cell interfaces are small and therefore the positive part is not needed when h is positive. Nevertheless, in the presence of big gradients of the bottom, this is needed in order to avoid non-physical negative water thickness.
In order to maintain an easier notation, we denote by the cell averages of the vectors θ α and θ α u α on the cell I i .
Remark that the averages and differences (23)- (25) are now defined in terms of the reconstructed variables, that is, and also and Now, we can redefine the numerical scheme (26) in the following way: where D ± i+ 1 2 are defined in Section 3.1. We remark that the terms ghθ α ∆η in P i+1/2 , reduce now to ghθ α ∆h. Finally, the term S ± i+ 1 2 corresponds to the corrections due to the use of the hydrostatic reconstruction and guarantee the consistency of the scheme as well as the well-balanced property: comes from the evaluation of the integral of between the center of the cell and the intercell along the path that defines the reconstruction (see [40,44]). Thanks to the definitions (29), we have that u α , θ α , and η remain constant, which justifies the given definition for P ± α,i+ 1 2 .
The term T ± i+ 1 2 is defined by We recall that f for any variable f . In addition, finally, G ± The resulting numerical scheme (33) is then first order accurate in space and time, well-balanced for the stationary solution corresponding to water at rest and constant density and positive preserving for the water high with the standard CFL − 1/2 restriction.
In fact, it is trivial to check that the scheme is well-balanced for the stationary solution corresponding to water at rest: indeed, D ± and u α,i = 0, for all i and 1 ≤ α ≤ M. The discretization of the pressure terms, P i+ 1 2 and P ± i/2 , is also zero if θ α,i are a constant value and ∆h = h HR,+ in a water at rest solution with constant relative density. Finally, the positivity of the numerical scheme is a consequence of the fact that the scheme for the equation of the total thickness corresponds to a HLL scheme, therefore positivity is granted under a suitable CFL condition.

Upwind Approximation of the Exchange Terms between Layers
The numerical scheme defined previously has a major drawback: we cannot ensure that θ α ≥ 1 for 1 ≤ α ≤ M, which could result in non-physical solutions. In fact, if we consider the following initial condition that corresponds to a dam break in relative density with M = 4 layers: the previous numerical scheme produces a numerical solution where θ 2 is clearly below the unity, which conflicts with (5). This is shown in Figure 4. According to [32], this could be solved if the exchange terms between layers T i+ 1 2 and T ± i+ 1 2 are properly discretized. In what follows, we describe an upwind discretization of those terms. The exchange terms between layers in (10) could be seen as the vertical flux between layers. However, by writing this flux as non-conservative products, we lose information about the direction of these vertical fluxes, leading to non-physical solutions. In order to incorporate this directional information, we perform an upwind approximation of the exchange terms between layers as proposed in [32].
In particular, we propose to discretize them as follows: Note that this numerical treatment is equivalent to adding the following vertical diffusion terms to the system: Now, if we reproduce the previous numerical example, we obtain that θ α ≥ 1, as it can be seen in Figure 5.

Remark 2.
Summing up all the equations for l α hθ α , the terms in T ± i+ 1 2 cancel out. This means that, in fact, the scheme reduces to a classical HLL scheme for the variable M ∑ α=1 l α hθ α . Therefore, by following the usual arguments used for HLL, one could prove that M ∑ α=1 l α θ α ≥ 1. We could not formally prove that θ α ≥ 1.
Nevertheless, the numerical experiments carried out verified that these properties are fulfilled.

Second Order Approximation
In order to achieve second order in space, we combine the previous first order numerical scheme with a second order reconstruction operator at each cell, that is, at each time step t n and at each cell is called the stencil of the reconstruction operator. We also use the standard notation: Following [39], the natural extension of the first order numerical scheme (33) may be written as follows: ).
In what follows, for the sake of simplicity, we shall drop the dependence on time.
Remark 3. The evaluation of the numerical scheme (33) on the reconstructed states w ± means that one has to perform the hydrostatic reconstruction procedure (27)-(29) from the reconstructed states. More explicitly, the hydrostatic reconstructed states become and h HR,− In the previous expression, the notation h HR = (·) + denotes the positive part while z ± i (x) = z B (x) + O(∆x 2 ), x ∈ I i and denote by, We shall denote the components of R i = (R h |R hθ α |R hθu α ). According to [39], in order to preserve the well-balanced properties of the first order numerical scheme, the reconstruction operators should be also well-balanced, that is, the reconstruction operators should preserve the stationary solutions corresponding to water at rest with constant density. More explicitly, for a lake-at-rest steady state solution, one should have Therefore, the reconstruction operator R z B i cannot be arbitrarily chosen. In practice, we consider a reconstruction operator for the free surface R η i , defined from the cell averages of the surface {η j , j ∈ S i }. Then, define the bottom reconstruction operator by This grants that the reconstruction operator satisfies and the well-balanced property is achieved provided that R η i reduces to a constant whenever the states η j in the stencil are constant.
In this work, we use the MUSCL reconstruction operator (see [45]). In each cell I i , we define a piece-wise linear operator of the form where σ i is the vector which provides the slope of the reconstruction for each variable and x i is the center of cell I i . As usual, it is also necessary to provide a slope limiter to avoid spurious oscillations at jump discontinuities, while keeping the second order accuracy when the solution is smooth. In this work, we use an average limiter (avg), defined by where the subindex k refers the k-th component of the vector and the avg operator is defined as Let us detail the reconstruction procedure for the conserved variables and the bathymetry: • First, we consider the reconstruction of the water depth h and free surface η that is R h i (x) and R η i (x) and the reconstruction of the bathymetry is recovered by setting . In order to guarantee the positivity of the water depth during the reconstruction, we use the technique introduced in [46]. • Next, we consider the reconstruction of the relative density θ α at each cell, R θ α i (x). Let us denote by σ θ α i the slope of the reconstruction of θ α and σ h i the corresponding slope for the water depth.
Again, we follow [46] to guarantee that R θ α i (x) ≥ 1, x ∈ I i . • Finally, we consider the reconstruction of the velocity u α at each cell. Let us denote by σ u α i the slope of the reconstruction of u α at the cell I i , then we define R hθ α u α The integral term in (35) is approximated by the middle point quadrature formula that results in: 1 ∆x We shall consider θG 1 2 = uθG 1 2 = 0 and θG M+ 1 2 = uθG M+ 1 2 = 0. Finally, the second order in time is achieved via a total variation diminish (TVD) Runge-Kutta method (see [47]).
The second order numerical scheme is thus well-balanced for the stationary solution corresponding to water at rest and constant density and positive preserving for the water depth.

Numerical Tests
In this section, we present several numerical simulations with the main objective to show the capabilities of the model and of the numerical scheme previously presented. Unless stated otherwise, the second order scheme is used. The first two numerical subsections will show the accuracy and well-balanced properties of the scheme. Then, an academic test case with a smooth distribution of relative density is shown in Section 4.3. The numerical strategy will be validated in Section 4.4, which corresponds to a lock-exchange in density that generates a gravity current, where laboratory data are available. In particular, the influence of the number of layers on the accuracy of the results will be discussed. The numerical test presented in Section 4.5 also corresponds to a dam break problem in density but with a non-constant bathymetry function, where we will show that the model is able to reproduce the general behaviour of this type of problem. Finally, a 2D problem will be considered in Section 4.6.

Order of Accuracy Test
The objective of this numerical test is to check numerically the order of accuracy of the numerical scheme. We consider a 5-layer simulation with an initial condition given by: and the bathymetry is given by The CFL parameter is set to 0.5 and the final time is t = 0.5 s. Periodic boundary conditions are set.
The errors and order of accuracy are shown in Tables 1 and 2 for h, hθ 1 , and hθ 1 u 1 for a horizontal discretization of 25, 50, 100, 200, and 400 volume cells for the first and second order schemes, respectively. The numerical solutions are compared to a reference solution, which has been computed with the same numerical scheme on a fine mesh of 3200 volume cells. Similar results are obtained for hθ α and hθ α u α , for α > 1. Here, hθ 1 has been chosen in particular due to the fact that the pressure term for α = 1 is the most sophisticated one, as it depends on hθ α , 1 ≤ α ≤ 5. Figure 6 shows the free surface (left) and the velocities u α , 1 ≤ α ≤ 5 at the final time t = 0.5 s for the second order scheme using the 200 cell mesh.

Well-Balanced Test
In order to show the well-balance properties of the numerical scheme, we perform a numerical simulation of a lake-at-rest steady test. We set the following bathymetry function, and a constant free surface η = 2. In this problem, the relative density is set to 1, and the initial velocity is zero. Figure 7 (left) depicts the free surface and bathymetry at t = 150 s, unchanged with respect to the initial condition. Figure 7 (right) shows a zoom on the surface and we can see that the model is exact up to machine precision, even for long time simulations.  A more interesting situation corresponds to the stratified solution described by (14). Let us consider here a particular solution of this family with three layers (1 ≤ α ≤ 3): The constant values K i are chosen in order to have a stable stratified profile in the vertical direction, that is, θ 3 (x) ≤ θ 2 (x) ≤ θ 1 (x). Here, we consider K 1 = 1.01, K 2 = 0.02 and K 3 = 0. Note that, if a stratification expression like (13) is used, then the coefficients K i tends to zero with the increasing of the number of layers.
The numerical scheme described in Section 3 cannot preserve the steady state (42). Figure 8 shows the free surface and velocities at t = 10 s when (42)   Nevertheless, it is possible to modify scheme (35) or (33) in order to improve their well-balanced properties and, in particular, to preserve this stationary solution. Following [48], it is possible to modify the reconstruction procedure in such a way that these particular solutions are preserved. Remark that here we are focusing on preserving just this particular solution. The procedure is briefly described in what follows. Future and ongoing works will aim at generalizing the technique so that the numerical scheme is able to preserve a larger family of steady states.
Let us modify (35) in order to improve its well-balanced property and, in particular, to preserve the stationary solution (42). We denote by w e (x) this particular stationary solution. According to [48], the first stage is to define a reconstruction operator that preserves w e (x). The idea is quite simple: first define the fluctuation with respect to w e (x) that is at each cell and at each time step we compute the quantities, where x i is the center of the cell I i . Note that w e (x i ) is the evaluation of the stationary solution at the center of the cell that is a second order approximation of its cell average on the cell I i . Next, we apply the standard second order reconstruction previously described to the fluctuations w n f ,i . Let us denote these reconstructions by R f i (x). Finally, the reconstruction for w is defined as follows: Special care should be taken for the quadrature formula applied for the integral appearing in (35). Again, we follow [48] in order to propose a quadrature formula that preserves the well-balanced character of the reconstruction and, therefore, of the numerical scheme. Now, if we apply (35) with the previous reconstruction procedure, the numerical scheme also preserves the stationary solution (42). Figure 9 shows the errors for the relative density and velocities at time t = 150 s when compared to the exact steady state and those are of the order of the machine precision. as the initial free surface. The relative densities θ α and velocities u α remain the same. The initial free surface and relative densities are shown in Figure 10. We consider the same domain as previously, discretized with the same number of cells. The relative densities and the free surface are fixed as the right boundary condition, while open boundary conditions are considered on the left. Figures 11 and 12 show the difference of the relative density with respect to the one given by the steady state. As it can be seen, the relative density slowly converges to the stationary solution up to machine precision (see Figure 12).   Figure 11. Difference on the relative density between the perturbed and unperturbed steady state.

Simulation for a Smooth Distribution of Relative Density
We perform now a simulation with a smooth profile for the relative density θ. We fix the domain [−4, 4], discretized with 800 volume cells and we choose M = 10 layers. The relative density θ is initially equal in all the layers and given by while the bathymetry and the free surface are constant and equal to 0.5 and 1.5, respectively. We consider open boundary conditions. The initial condition is shown in Figure 13. Figures 14-17 show the evolution of the relative density for different times. We see that the relative density tends to go downwards and towards the sides, but, at different velocities, due to the difference of pressure between layers, forming a shock (see Figure 16). Eventually, for large times, the solution becomes stratified in layers (see Figure 17). A dual representation has been chosen for the relative density. On the right-hand side of the figure, we show the graph of the relative density for a choice of some layers, so that they are representative of the evolution of the problem. On the left-hand side, we show the relative density within the layers through a heat map.  Figure 13. Relative density initial condition.  Figure 14. Evolution of the relative density at t = 2.5 s.    Figure 17. Evolution of the relative density at t = 50 s.

Simulation of a Lock-Exchange in a Flat Channel
We propose now to perform the test corresponding to the experimental data presented in [35]. There, the authors propose a laboratory experiment in a 3 m long channel, where a gatebox of length 0.1 m is placed within the channel containing a fluid with density ρ 1 . The rest of the channel contains a fluid of density ρ 0 , with ρ 1 > ρ 0 The gate box is opened and its fluid is released into the flume. The density ρ 0 is 1000 kg/m 3 while ρ 1 is 1034 kg/m 3 . Thus, we consider The initial height of the water is constant and equal to 0.3 m while the bathymetry is the constant function b = 0.5. This initial condition is shown in Figure 18. We consider the domain in [0, 3] discretized with 800 volume cells. We impose reflecting no-slip boundary conditions.
For the particular case of M = 40 layers, we show in Figures 19-25 the evolution of the density distribution at different times. We observe how a shock is formed once the fluid in the gate box is released. The propagation of this shock corresponds to the front position described earlier. The relative density profile at the left boundary of the channel is not an effect of the numerical treatment of the reflecting no-slip boundary conditions. A similar effect can be also observed in the laboratory experiment performed in [35].
In [35], the authors obtain the position of the head of the gravity current with respect to time. In Figures 26 and 27, we can see the comparison for the front position between the experimental data and the results obtained with the numerical approach presented here. The numerical simulations have been obtained using M = 15, 20, 30, and 40 layers. We see that the larger the number of layers, the better agreement we have between experimental and numerical data. Even for M = 20, the results are quite good, despite the fact that non-hydrostatic effects, usually present in such situations (see [49]), are not taken into account.

Simulation of a Dam Break Problem with a Non Constant Bathymetry Function
We consider the domain [−5, 5] and a bathymetry function given by and a constant free surface η = 2 m. A dam break problem in the relative density is considered as an initial condition by setting We fix M = 30 as the number of layers and discretize the domain with 1000 volume cells. Free-flow open boundary conditions are considered. The initial condition is shown in Figure 28 while the evolution of the fluid is depicted in Figures 29-33 for different times. In this test, we also compare the first and second order schemes. In particular, we observe a sharp transition of density over the obstacle, especially in the second order case. This kind of profile is common in gravity currents over obstacles (see [50]).        Figure 33. Distribution of the relative density θ. The first row corresponds to the first order scheme and the second row to the second order scheme.

Simulation of a Dam Break Problem in Two Dimensions
As we have said, although the models and numerical scheme are introduced for the 1D case for the sake of simplicity, everything can be easily extended to the 2D case. To illustrate this fact, we show here a 2D simulation. We fix the domain (x, y) ∈ [−5, 5] × [−1, 1] where non-constant bottom is considered described by A constant free surface equal to 2 meters is considered initially. The relative density distribution is set as The domain discretized with 36,000 uniform cells with ∆x = 1/60 and ∆y = 1/30. We impose reflecting no-slip boundary condition at the horizontal walls and free-flow boundary conditions at the vertical ones. We use M = 15 layers. The initial condition can be seen in Figure 34               The numerical simulations performed in this section show that the numerical Section 3 is both robust and accurate, and that the model Section 2 is useful to better understand flows with density fluctuations.

Conclusions
We have presented a multilayer shallow water model with variable density. We propose first and second order numerical schemes that preserve the stationary solutions corresponding to water at rest with constant density. Some insight has been briefly given on how to modify the scheme in order to preserve some stratified stationary solutions. The numerical scheme is also positive preserving for the water height. According to the numerical experiments, it seems that it also satisfies the maximum principle for θ α . The numerical tests performed are promising, showing robustness and great agreement with laboratory experimental data. The model used and the numerical strategy introduced here will certainly help with a better understanding of density driven currents in geophysical flows.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Parallelization on GPU
In this appendix, we will discuss how the numerical scheme discussed in Section 3 is implemented on GPUs. Parallelizations on GPU are useful when dealing with large computational domains, as they are able to provide a very significant speed-up with respect to a sequential version of the code. There are different ways to develop GPU parallel codes. One method consists of developing the entire code on CUDA (see [7] and the references therein). There are numerous examples in the literature where different numerical schemes and models have been written directly in CUDA, obtaining extremely good results from the computational efficiency point of view. Another way to develop GPU parallel codes is to use OpenACC directives (see [51]). While CUDA is a well-known approach to parallelize finite volume numerical schemes and has successfully been applied to solve large problems (see for example [6]), OpenACC is a newer paradigm, offering a friendlier approach to GPU-based parallelization.
Several authors have studied the results of parallelizing a code under CUDA and OpenACC (see for example [52]). In general, the speed-ups obtained are favorable to CUDA, which has a greater control over data management and load balancing. Nevertheless, OpenACC also presents some advantages over CUDA. The main one is that the work needed on adapting a sequential code to OpenACC is much less demanding than the one required for CUDA. When developing a CUDA code, the construction of the accelerator kernel, the part of the code that will be executed by the GPU, requires that the particular structure of the problem and the target accelerator hardware have to be taken into account. This means that the performance of the CUDA code highly depends on its specific adaptation to the architecture being used. It is not an easy task to develop a good performing CUDA code and the acceleration gained depends mostly on the capabilities of the developer in adapting the particular algorithm used. Contrary to CUDA, OpenACC merely needs to add some instructions in the form of pragmas to the code. In this way, most of the burden of the parallelization of the code falls into the compiler, which is the one who has to translate the information contained in the pragmas to build the accelerator kernel, and not the developer. Therefore, it is not necessary to recode the whole program, but rather instruct your compiler with enough and suitable hints in the code to generate a parallel version. While it is true that some authors have achieved similar performance with OpenACC and CUDA for some reference problems (see for instance [53]), this requires a heavy personalization of the OpenACC version, potentially losing its accessibility, and getting closer to a CUDA implementation.
Note that the usual work required to parallelize a sequential code in CUDA is also present in OpenACC. However, the tools provided by OpenACC are friendlier than the ones provided by CUDA. Specifically, OpenACC allows easy data transference to the accelerator hardware and an easy parallelization of those loops susceptible to be parallelizable. In this sense, it is somehow similar to OpenMP, the programming interface for parallelization on CPUs.
Comparisons have been made between OpenACC and CUDA version of the code for the one layer and one-dimensional case. For the CUDA code, we follow the procedure described in [54]. The results are shown in Figure A1 (left). As we can see, CUDA is two times faster than OpenACC. However, OpenACC speed-ups are also very good and the effort required to obtain an OpenACC version from the sequential code is much less demanding than the one required to obtain the CUDA version. Figure A1 (right) shows a comparison of the elapsed real time for two different versions of the code: a multi-CPU version and a GPU one (run on two different GPU architectures). We see that the GPU implementation is better by far than the CPU one and offers great scalability. Nevertheless, one should take into account that the elapsed real time is very sensitive to the GPU used.
To approximate Q, a solution of (A15), we approximate v by u that is a linear interpolation in z, such that u |z= 1 2 (z α− 1 2 +z α+ 1 2 ) = u α . Finally, we can solve the system defined by (A12) and the equation resulting from multiplying scalarly (A13) by η α+ 1 2 . In this way, we can obtain the expressions of T ± α+ 1 2 that satisfy the jump condition and the consistency condition on the interface. We can easily solve it and we obtain ], we set w +

Appendix B.3. A Particular Weak Solution with Hydrostatic Pressure
In this section, we complete the derivation of the model under the assumption of hydrostatic pressure. This means that p α (t, x, z) = p Here, the component p (t) and p S denotes the pressure at the free surface. Then, the unknowns of the system are the depths of the layers, the density fluctuation, and the horizontal velocities in each layer.