Numerical Benchmark Studies on Flow and Solute Transport in Geological Reservoirs

: Predicting and characterising groundwater ﬂow and solute transport in engineering and hydrogeological applications, such as dimensioning tracer experiments, rely primarily on numerical modelling techniques. During software selection for numerical modelling, the accuracy of the results, ﬁnancial costs of the simulation software, and computational resources should be considered. This study evaluates numerical modelling approaches and outlines the advantages and disadvantages of several simulators in terms of predictability, temporal control, and computational efﬁciency conducted in a single user and single computational resource set-up. A set of well-established ﬂow and transport modelling simulators, such as MODFLOW/MT3DMS, FEFLOW, COMSOL Multiphysics, and DuMu X were tested and compared. These numerical simulators are based on three numerical discretisation schemes, i.e., ﬁnite difference (FD), ﬁnite element (FE), and ﬁnite volume (FV). The inﬂuence of dispersivity, potentially an artefact of numerical modelling (numerical dispersion), was investigated in parametric studies, and results are compared with analytical solutions. At the same time, relative errors were assessed for a complex ﬁeld scale example. This comparative study reveals that the FE-based simulators COMSOL and FEFLOW show higher accuracy for a speciﬁc range of dispersivities under forced gradient conditions than DuMu X and MODFLOW/MT3DMS. FEFLOW performs better than COMSOL in regard to computational time both in single-core and multi-core computing. Overall computational time is lowest for the FD-based simulator MODFLOW/MT3DMS while the number of mesh elements is low (here < 12,800 elements). However, for ﬁner discretisation, FE software FEFLOW performs faster.


Introduction
Geoengineering and hydrogeological applications for groundwater flow and solute transport use tracer experiments. These are often planned and dimensioned using numerical simulators. Mathematical model prediction efficiency highly depends on the quality of the numerical simulator itself, which is evaluated in code inter-comparison studies, or so-called benchmarking experimentation. For decades, these testing experiments for the numerical software used to simulate flow and solute transport in georeservoirs have been state-of-the-art. Usually, the developers of software or analytical approximations verify and optimise their codes by conducting forward simulations for a given set of benchmark problems (e.g., [1][2][3][4][5][6][7]). They illustrate the accuracy of the numerical simulators for single-phase flow in porous or fractured porous media by comparing established analytical

Mathematical Model
The analytical solution and the numerical method solved the flow equation as follows: Solute transport in the porous medium is computed by the advection-dispersion equation of divergent form solved numerically for an incompressible fluid as in Equation (3) [16,17]: For radial symmetry (Problem 2, 2D), the transport equation can be written as [18] ∂C ∂t Here, ρ b [ML −3 ]-density of solid in porous media, n [-]-porosity, S [-]-storage coefficient, r-distance as radial coordinate from the injection well. Hence, only one velocity exists in a representative elementary volume (REV) in time and space; therefore, velocity fluctuations within the REV are neglected. They are considered in the transport equation as mechanical dispersion. The numerical simulator results are compared with the analytical solution for the 1D and 2D problems described below.
The analytical solution for the 1D problem used here are derived from [1], The adapted and modified 2D problem solution from [18] after Gelhar and Collins [19] are used for the concentration estimation: where r inj [L]-radial length the solute advances during injection, R d [-]-retardation. MODFLOW is a well-tested, FD method-based numerical simulator employed for decades for fluid flow and solute transport [15], available as open-source code from the U.S. Geological Survey (USGS). A number of modified commercial generic versions are also available. MODFLOW 6 has superseded the MODFLOW-2005 of 2018 by integrating unstructured gridding techniques for groundwater flow and transport [20,21]. The USGS also provides a pre-and post-processing graphical user interface (GUI) ModelMuse, which is also freely available. This study uses ModelMuse for model building and simulation result visualisation. The FD-based space and time discretisation method is a well-known technique with relatively low memory demands and low simulation time costs. The LKMT3 packages in the MT3DMS transport model connect the flow model interface to read model geometries and hydraulic characteristics, such as saturated thickness, fluxes across the cell, positions, and flux of the various sources and sinks through an unformatted flow transport link file saved in MODFLOW. For the flow simulation study, the preconditioned conjugated gradient solver (PCG) in MODFLOW with layer-centred grid, and for solute transport, generalised conjugate gradient solver (GCG) in MT3DMS is used. The advective flow is solved in third-order total-variation-diminishing (TVD) Runge-Kutta scheme.
2.2.2. FEFLOW 6.0 FEFLOW (FE subsurface FLOW and transport system) is an interactive groundwater modelling software for 2D and 3D fully coupled or decoupled, thermo-hydro-chemical (THC) processes in saturated or variably saturated systems [3]. The reactive multi-species transport can be modelled in subsurface water environments with or without one or multiple free surfaces (e.g., [22]). The programming interface (interface manager, IFM) allows the use and develop user-specific plug-ins to FEFLOW. It has a user-friendly model builder interface. Besides the parallelised (OpenMP) computational core, it has powerful pre-and post-processing capabilities, including 2D and 3D GIS data. FEFLOW is available for Windows systems as well as for different Linux distributions. The FEFLOW 7.0 launched with a different GUI compared to FEFLOW 6.0 version with a few new features, such as multi-layer wells and parameter visualisation. However, FEFLOW 6.0 is used for this study, which supports both old and new GUIs.

COMSOL Multiphysics
COMSOL Multiphysics (formerly FEMLAB) is an FE-based numerical simulator for various physical applications. It is mainly employed for technical applications and received increasing attention for environmental applications during recent years (e.g., [23][24][25][26][27][28]). In addition to several available interfaces for standard applications, e.g., flow in porous media, COMSOL offers the possibility to implement individual PDEs without requiring access to the source code. This makes COMSOL highly flexible. The software focuses on the interconnection and interaction between different physical processes allowing for multiphysics and multidimensional couplings. COMSOL is available for Windows, Mac, Linux, or Unix systems. It offers several direct and iterative solvers for various applications. The interfaces to external software allow easy transfer of model results and geometries, e.g., MathWorks MATLAB (https://www.comsol.com/livelink-for-matlab (accessed on 10 February 2022)). Additionally, COMSOL allows pre-and post-processing within the same interface. The "flow and solute transport" module of COMSOL Multiphysics v.4.4, used in this study, solves the flow and transport equations for saturated and unsaturated porous media. The flow and transport equations are managed in separate interfaces. COMSOL allows the equations to be solved simultaneously or step-wise, reducing simulation time. The PCG solver is used for our study since this is also the standard solver in FEFLOW.

DuMu X
DuMu X [29] is a multi-scale, multiphysics toolbox based on the Distributed and Unified Numerics Environment (DUNE) [30] for simulation flow and transport processes in porous media. DuMu X comes as an extension to DUNE, inheriting functionality from all available DUNE modules. It provides a framework for implementing porous media flow problems, including problem formulation, spatial and temporal discretisation selection, and coupling in non-linear solvers to general concepts modelling. The free and opensource academic software is available to download at www.dumux.org (accessed on 1 February 2022) from a series of ready-to-use models. DuMu X builds and runs on Linux, Unix, and Mac operating systems. Installation in Windows is possible using a virtual machine using Linux or employing Windows Subsystem for Linux. The one-phase flow two-component transport numerical model, 1p2c, implemented in DuMu X is used in this study. A BOX scheme [31] is used for the space discretisation and a standard implicit Euler scheme for time discretisation. The non-linear equations are linearised and solved using Newton-Raphson method, allowing adaptive time-stepping [31].

Set Up of the Numerical Model
We installed the four studied numerical simulators and their required packages in a Linux OS on a 4-core 2.32 GHz CPU with an 8 GB RAM computer. The parallel computing efficiency for flow and solute transport is only tested for COMSOL and FEFLOW numerical simulators. Therefore, computational performance is studied for both, single-core and multi-core computing in these simulators. Though parallel computing is available in DuMu X , it requires installing several software packages (such as UG-Grid, ALU Grid, and parallel open MPI), which were not activated for our study.
Each simulator has its requirements and technicalities, e.g., defining each model's boundary and initial conditions, which can be partly overcome when one modeller implements the same problem in all four simulators. All numerical simulators, including all necessary software packages, were installed, and all problem models were set up and run by the first author on the same machine. This is done to enhance understanding and interpretation of boundary conditions and reduce errors while transitioning from the conceptual model (on the paper) into the computer program.

Problem Definition
For selecting the benchmark problems, we considered problems where analytical solutions are readily available. We explicitly defined and described the problem including flow and transport boundary conditions in a clear way that could be implemented in all tested simulators. Moreover, we chose benchmark problems that could be implemented in all simulators, in order to assess their relative performance (e.g., the reservoirs have no fractures, and have layered structure and not are dome-shaped). Therefore, three benchmark problems with gradually increasing geometrical complexity (i.e., 1D in Problem 1, 2D in Problem 2, and 3D in Problem 3) are defined for this study.

Problem 1D-Solute Tracer Transport for Steady-State Flow in a Homogenous Aquifer Forced Head Gradient
The analytical solution for solute transport in any homogeneous aquifer with a head gradient condition was formulated in [1]. Figure 1 depicts benchmark problem 1, which is aimed at testing the quasi-one-dimensional flow and solute transport. The solute is injected at the left-hand boundary with a pressure head of 12 m. The advective-dispersive transport is monitored at the observation point. The 100 m long flow domain is assumed as a homogenous and isotropic porous medium. Hydraulic conductivity is 1 × 10 −4 m/s, and porosity 0.25. The constant head difference between the two sides is 2 m. The simulation period is 200 days. The time step is set at one day (maximum time step size), and the domain is discretised using a mesh size of 1 m. The observation point is located at a 50 m distance from the inlet (i.e., middle of the domain). At the left-hand side, the boundary is defined as fixed concentration C in = 1 mg/L (i.e., Dirichlet), and free outflow at the right-hand side. Different dispersivity values (0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, and 10 m) were simulated to conduct the comparison and sensitivity of each simulator to dispersion. distance from the inlet (i.e., middle of the domain). At the left-hand side, the boundary is defined as fixed concentration Cin = 1 mg/L (i.e., Dirichlet), and free outflow at the righthand side. Different dispersivity values (0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, and 10 m) were simulated to conduct the comparison and sensitivity of each simulator to dispersion. In this problem, the concentration change is simulated at an observation well after tracer injection at a constant rate at a fully penetrating injection well. The aquifer is confined. A constant solute concentration of 1 mg/L in the water injected by the well at a constant rate is assumed. Moreover, three sides of the domain are assigned as constant head boundaries, assuming a 'free outflow' of fluid and solute during the simulation period of 200 days. The maximum time step size is set as one day. Storage is set to zero so that steady-state flow conditions are achieved instantaneously after injection. Since symmetric radial flow and transport behaviour is expected for the stated model set-up, the numerical models developed only half of the domain, assuming symmetry at the middle of the domain at the injection point.
A numerical model with a spatial discretisation of 40 elements on the x-direction 20 elements on the y-direction is implemented in all used simulator platforms to simulate the concentration breakthrough at the observation point at a 25 m distance from the injection well. Simulation results are compared with the analytical solution given by Gelhar and Collins [19], adapted and modified by Schroth et al. [18]. The initial and boundary conditions and model domain flow and transport properties are shown in Figure 2.  In this problem, the concentration change is simulated at an observation well after tracer injection at a constant rate at a fully penetrating injection well. The aquifer is confined. A constant solute concentration of 1 mg/L in the water injected by the well at a constant rate is assumed. Moreover, three sides of the domain are assigned as constant head boundaries, assuming a 'free outflow' of fluid and solute during the simulation period of 200 days. The maximum time step size is set as one day. Storage is set to zero so that steady-state flow conditions are achieved instantaneously after injection. Since symmetric radial flow and transport behaviour is expected for the stated model set-up, the numerical models developed only half of the domain, assuming symmetry at the middle of the domain at the injection point.
A numerical model with a spatial discretisation of 40 elements on the x-direction 20 elements on the y-direction is implemented in all used simulator platforms to simulate the concentration breakthrough at the observation point at a 25 m distance from the injection well. Simulation results are compared with the analytical solution given by Gelhar and Collins [19], adapted and modified by Schroth et al. [18]. The initial and boundary conditions and model domain flow and transport properties are shown in Figure 2. distance from the inlet (i.e., middle of the domain). At the left-hand side, the boundary is defined as fixed concentration Cin = 1 mg/L (i.e., Dirichlet), and free outflow at the righthand side. Different dispersivity values (0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, and 10 m) were simulated to conduct the comparison and sensitivity of each simulator to dispersion.

Problem 2D-Solute Tracer Transport in a Forced Gradient Confined Homogenous Aquifer Point Source
In this problem, the concentration change is simulated at an observation well after tracer injection at a constant rate at a fully penetrating injection well. The aquifer is confined. A constant solute concentration of 1 mg/L in the water injected by the well at a constant rate is assumed. Moreover, three sides of the domain are assigned as constant head boundaries, assuming a 'free outflow' of fluid and solute during the simulation period of 200 days. The maximum time step size is set as one day. Storage is set to zero so that steady-state flow conditions are achieved instantaneously after injection. Since symmetric radial flow and transport behaviour is expected for the stated model set-up, the numerical models developed only half of the domain, assuming symmetry at the middle of the domain at the injection point.
A numerical model with a spatial discretisation of 40 elements on the x-direction 20 elements on the y-direction is implemented in all used simulator platforms to simulate the concentration breakthrough at the observation point at a 25 m distance from the injection well. Simulation results are compared with the analytical solution given by Gelhar and Collins [19], adapted and modified by Schroth et al. [18]. The initial and boundary conditions and model domain flow and transport properties are shown in Figure 2.

Problem 3D-Solute Transport for Confined Homogeneous Multi-Layered Forced Gradient Conditions
This section describes the field-scale application of a tracer experiment. This involves the evaluation of a tracer experiment in a dipole-forced gradient setting with multi-layered injection and pumping wells. The problem is intended to illustrate the performance of the different simulators for complex geometries and flow conditions commonly occurring in reservoir engineering applications, i.e., a layered aquifer with alternating layers of high and low permeability and porosity. The geological cross-section of the model domain represents five layers of different thicknesses [32,33]. The lateral extent of the domain is 100 × 100 m. Three layers represent sandstone aquifers with hydraulic conductivity values of 8.03 × 10 −8 , 1.97 × 10 −7 , and 4.36 × 10 −8 m/s from the top to the bottom layer, respectively. Less permeable layers or aquitards (9.69 × 10 −12 m/s) of 1 m thick silty clay lenses separate them ( Figure 3). The modelled formation is assumed to be confined due to thick clay lenses at the top and at the bottom. Hence, those layers are assumed to be impermeable and hydraulically not connected. The porosity values are 14.5% (top), 16.3% (middle), and 13.3% (bottom) for the permeable sandstone and 3.9% for the aquitards specified ( Figure 3). Injection and pumping wells are placed in the three conductive layers, and each layer is discretised with a uniform thickness of 0.5 m. For the numerical simulation, vertical symmetry is assumed; hence, only half of the domain is modelled. The model is discretised into a rectangular mesh of 2.5 m. The injection and pumping rates are 8.64 m 3 /day. The simulation period is 200 days with a maximum time step of 0.5 days used for time discretisation. However, time-stepping varied in different simulators; hence, the individual model simulation results are compared using the 'cubic interpolation' (a third-degree polynomial) method in MATLAB by interpolating the data and RMSE value between the curves calculated. Again, the concentration variation with time at the pumping well is compared against MODFLOW/MT3DMS data since no analytical solution is available for such a 3D problem. MODFLOW/MT3DMS has been chosen as a reference since it is widely used and verified for saturated groundwater flow and solute transport for many test sites worldwide.

Results
This section provides a description of the results of the numerical experiments, their interpretations, and conclusions drawn. Thus, results of the three benchmark problems are described based on geometric complexity (1D, 2D, and 3D), spatial discretisation, and computation efficiency of different software.

Problem 1D-Solute Transport in a Homogeneous Aquifer
The temporal variation of concentration of the analytical solution [1] and the numerical simulation of the breakthrough curves (BTCs) at the observation point, 50 m from the point source, is presented in Figure 4, for two different dispersivity values, i.e., 0.7 and 5 m. The dispersivity value of 5 m is used as a standard scenario since it represents a median value of the studied dispersivity range 0.01 to 10 m, and because 5 m is a common field dispersivity measured [34,35]. The relative error is estimated by comparing numerically simulated BTCs with that obtained by the analytical solution using the standard Root mean squared error method (RMSE). The two distinct sets of curves in Figure 4a reveal that all four numerical and the analytical solution are sensitive with respect to changes in dispersivity, i.e., 0.7 m and 5 m. The numerically simulated BTCs reveal a good match with the analytical solution since the RMSE value is less than 0.1 mg/l (i.e., <10% of C 0 ). The RMSE of the BTCs for different dispersivity values is shown in Figure 4b for the participating numerical simulators in problem 1D.  The simulation with a dispersivity of 0.7 m shows larger variations between the simulated breakthrough curves. The numerical dispersion is especially pronounced for Du-Mu X . It is important to note that the RMSE significantly decreases for higher dispersivity values (Figure 4b). Though DuMu X shows high RMSE values < 3 m dispersivity, the errors are comparable to the other simulators as well as for higher dispersivity values. However, the errors (maximum) value ranged between 2% and 8% of C0, being well below the threshold of 10% of peak concentration, indicating reasonable accuracy.

Problem 2D-Solute Transport in a Homogeneous Aquifer in Forced Gradient
Divergent radial flow and mass transport from an 'injection well' in a homogeneous and confined aquifer was studied. Tracer concentrations are monitored at a 25 m distance from the injection well. The change in concentration with time for the four different numerical tools revealed significant differences for the 200 days of simulation (results presented for the time slice of 120 days in Figure 5). The deviation between the numerically simulated BTCs and the analytical solution is low. For DuMu X , this is particularly true. For low dispersivity values, the best convergence is achieved at the cost of significant numerical dispersion. Since truncation error comprised major part of numerical dispersion The simulation with a dispersivity of 0.7 m shows larger variations between the simulated breakthrough curves. The numerical dispersion is especially pronounced for DuMu X . It is important to note that the RMSE significantly decreases for higher dispersivity values (Figure 4b). Though DuMu X shows high RMSE values < 3 m dispersivity, the errors are comparable to the other simulators as well as for higher dispersivity values. However, the errors (maximum) value ranged between 2% and 8% of C 0 , being well below the threshold of 10% of peak concentration, indicating reasonable accuracy.

Problem 2D-Solute Transport in a Homogeneous Aquifer in Forced Gradient
Divergent radial flow and mass transport from an 'injection well' in a homogeneous and confined aquifer was studied. Tracer concentrations are monitored at a 25 m distance from the injection well. The change in concentration with time for the four different numerical tools revealed significant differences for the 200 days of simulation (results presented for the time slice of 120 days in Figure 5). The deviation between the numerically simulated BTCs and the analytical solution is low. For DuMu X , this is particularly true. For low dispersivity values, the best convergence is achieved at the cost of significant numerical dispersion. Since truncation error comprised major part of numerical dispersion in FD schemes while mesh size is relatively small [36]. The numerical errors increases again for high dispersivity values >5 m. The lowest RMSE for COMSOL is observed for low dispersivity values (<1 m), and for FEFLOW lowest RMSE value is achieved at low to average values of dispersity (Figure 5b). The MODFLOW/MT3DMS results also reveal a similar trend with the dispersivity values, while the lowest RMSE is found for dispersivity value 1-3 m. The numerical dispersion increases for high dispersivity > 5 m. For the fixed grid size and variable velocity field, the numerical dispersion would increase after a mesh threshold as it cannot satisfy the Courant criterion and mesh Peclet number throughout the grid size range.

Problem 3: 3D Flow and Solute Transport Simulation in a Layered Georeservoir
Figure 6a-c shows the simulated breakthrough curves in the reservoir's top, middle, and lower aquifer, respectively. Since no analytical solution is available for this 3D case, the breakthrough curves were only compared. The tracer breakthrough in the different layers varies significantly in peak concentration and arrival time (Figure 6a-c). Peak concentrations are maximum in all layers simulated in COMSOL. This is especially evident in the top aquifer (Figure 6a). DuMu X indicates a significantly stronger "smoothing", i.e., higher concentrations during the rising limb and the tail and lower peak concentration, than the other simulators. This effect is already revealed in the 2D and 1D simulations for low dispersivities. FEFLOW and MODFLOW/MT3DMS range in the sharp peak between COMSOL and DuMu X . While MODFLOW/MT3DMS shows significantly lesser peak concentrations than COMSOL for all layers, FEFLOW has comparable concentrations to MODFLOW in the top layer (Figure 6a), concentrations to COMSOL in the middle layer (Figure 6b), and in the bottom layer, the concentration peak lies in between those two (Figure 6c). The peak arrival (BTC) in MODFLOW/MT3DMS is the slowest, while DuMu X is the fastest for all three layers.

Problem 3: 3D Flow and Solute Transport Simulation in a Layered Georeservoir
Figure 6a-c shows the simulated breakthrough curves in the reservoir's top, middle, and lower aquifer, respectively. Since no analytical solution is available for this 3D case, the breakthrough curves were only compared. The tracer breakthrough in the different layers varies significantly in peak concentration and arrival time (Figure 6a-c). Peak concentrations are maximum in all layers simulated in COMSOL. This is especially evident in the top aquifer (Figure 6a). DuMu X indicates a significantly stronger "smoothing", i.e., higher concentrations during the rising limb and the tail and lower peak concentration, than the other simulators. This effect is already revealed in the 2D and 1D simulations for low dispersivities. FEFLOW and MODFLOW/MT3DMS range in the sharp peak between COMSOL and DuMu X . While MODFLOW/MT3DMS shows significantly lesser peak concentrations than COMSOL for all layers, FEFLOW has comparable concentrations to MODFLOW in the top layer (Figure 6a), concentrations to COMSOL in the middle layer (Figure 6b), and in the bottom layer, the concentration peak lies in between those two (Figure 6c). The peak arrival (BTC) in MODFLOW/MT3DMS is the slowest, while DuMu X is the fastest for all three layers. COMSOL and DuMu X . While MODFLOW/MT3DMS shows significantly lesser peak concentrations than COMSOL for all layers, FEFLOW has comparable concentrations to MODFLOW in the top layer (Figure 6a), concentrations to COMSOL in the middle layer (Figure 6b), and in the bottom layer, the concentration peak lies in between those two (Figure 6c). The peak arrival (BTC) in MODFLOW/MT3DMS is the slowest, while DuMu X is the fastest for all three layers.  The difference in the tracer BTCs is studied using MODFLOW/MT3DMS as a reference since MODFLOW/MT3DMS is widely used and eventually verified for groundwater flow for the most occasion than other software. The root mean squared differences for various dispersivity values (Figure 7a-c) show significantly higher variations comparing 1D and 2D cases. While the lower dimensional (1D and 2D) problem errors range between 0.1 and 8%, the difference range between approximately 0.1 and 0.7 mg/L or 4% to 25% in 3D cases compared to the MODFLOW/MT3DMS concentration. As for the other simulations, dispersivity around 3 m achieves the "best" fit for COMSOL and FEFLOW. For DuMu X , the fit can be improved further if slightly higher dispersivity around 7 m is chosen. The difference in the tracer BTCs is studied using MODFLOW/MT3DMS as a reference since MODFLOW/MT3DMS is widely used and eventually verified for groundwater flow for the most occasion than other software. The root mean squared differences for various dispersivity values (Figure 7a-c) show significantly higher variations comparing 1D and 2D cases. While the lower dimensional (1D and 2D) problem errors range between 0.1 and 8%, the difference range between approximately 0.1 and 0.7 mg/L or 4% to 25% in 3D cases compared to the MODFLOW/MT3DMS concentration. As for the other simulations, dispersivity around 3 m achieves the "best" fit for COMSOL and FEFLOW. For Du-Mu X , the fit can be improved further if slightly higher dispersivity around 7 m is chosen.

Spatial Discretisation Effects on Computational Efficiency
The accuracy of the advective dispersive equation (Equation (3)) is estimated through BTCs by solving the transport equation only using different grid sizes and assuming the tracer is conservative. Figure 8 shows the influence of spatial discretisation on the accuracy of the solution for benchmark Problem 2. Here, grid sensitivity is conducted for a dispersivity value of 5 m. All simulators show a lower RMSE value for a mesh, 80 × 40 elements, than the 40 × 20 element mesh. For finer discretisation, the decrease of the RMSE is low compared to the difference between the 40 × 20 element mesh and the 80 × 40 element mesh. Spatial discretisation is usually a 'trade-off' to the accuracy of the results and 'resource cost'. Here, the resource cost is estimated from the computer's virtual memory requirements to generate the grid and the computational time. For COMSOL, DuMu X , and FEFLOW, the RMSE value increases again for the finer mesh. FEFLOW has revealed higher accuracy compared with other software.

Spatial Discretisation Effects on Computational Efficiency
The accuracy of the advective dispersive equation (Equation (3)) is estimated through BTCs by solving the transport equation only using different grid sizes and assuming the tracer is conservative. Figure 8 shows the influence of spatial discretisation on the accuracy of the solution for benchmark Problem 2. Here, grid sensitivity is conducted for a dispersivity value of 5 m. All simulators show a lower RMSE value for a mesh, 80 × 40 elements, than the 40 × 20 element mesh. For finer discretisation, the decrease of the RMSE is low compared to the difference between the 40 × 20 element mesh and the 80 × 40 element mesh. Spatial discretisation is usually a 'trade-off' to the accuracy of the results and 'resource cost'. Here, the resource cost is estimated from the computer's virtual memory requirements to generate the grid and the computational time. For COMSOL, DuMu X , and FEFLOW, the RMSE value increases again for the finer mesh. FEFLOW has revealed higher accuracy compared with other software.

Computational Time (CPU Time) of Single Processor and Parallelisation
The computational time of the four numerical simulators is estimated for the 2D and 3D benchmarking problems using an automatic time step refinement approach rather than enforcing an artificial time-stepping. The simulation time is 200 days with a maximum time step size of 0.5 days. The time steps used in 3D simulation by COMSOL, FEFLOW, and MT3DMS are 503, 402, and 409, respectively. Computational time is correlated to the degrees of freedom and local criteria, such as the advective part Courant number. The same grid is used for all four numerical simulators (in the 2D problem), but simulation time varies since each simulator uses a different discretisation method. The degrees of freedom signify the element numbers used in MODFLOW/MT3DMS, the number of nodes for FEFLOW and COMSOL, the number of cells for DuMu X . Generally, there are two ways to significantly shorten the computing time of a numerical simulator to solve a specific problem. The first way is to adjust the solver settings and, secondly, by parallel computing the numerical software in multiprocessor (multi-core) systems.
Recent parallel computer architectures increase computational performance and offer higher memory storage that significantly exceeds traditional single CPU computers. Among the participating simulators, parallelisation was implemented in MODFLOW by some independent developers [37,38] and compared the simulation efficiency with the established software. In addition, multithread computing has been available in FEFLOW since 2008 and COMSOL from the beginning. DuMu X also supports parallel simulations using a distributed memory model based on MPI. Flow computation time in MODFLOW is relatively short compared with transport computation time in MT3DMS for the 2D and 3D problems. Yet, parallel computing time was not estimated for MODFLOW/MT3DMS as in MT3DMS transport simulation; parallelisation was not supported.

Computational Time (CPU Time) of Single Processor and Parallelisation
The computational time of the four numerical simulators is estimated for the 2D and 3D benchmarking problems using an automatic time step refinement approach rather than enforcing an artificial time-stepping. The simulation time is 200 days with a maximum time step size of 0.5 days. The time steps used in 3D simulation by COMSOL, FEFLOW, and MT3DMS are 503, 402, and 409, respectively. Computational time is correlated to the degrees of freedom and local criteria, such as the advective part Courant number. The same grid is used for all four numerical simulators (in the 2D problem), but simulation time varies since each simulator uses a different discretisation method. The degrees of freedom signify the element numbers used in MODFLOW/MT3DMS, the number of nodes for FEFLOW and COMSOL, the number of cells for DuMu X . Generally, there are two ways to significantly shorten the computing time of a numerical simulator to solve a specific problem. The first way is to adjust the solver settings and, secondly, by parallel computing the numerical software in multiprocessor (multi-core) systems.
Recent parallel computer architectures increase computational performance and offer higher memory storage that significantly exceeds traditional single CPU computers. Among the participating simulators, parallelisation was implemented in MODFLOW by some independent developers [37,38] and compared the simulation efficiency with the established software. In addition, multithread computing has been available in FEFLOW since 2008 and COMSOL from the beginning. DuMu X also supports parallel simulations using a distributed memory model based on MPI. Flow computation time in MODFLOW is relatively short compared with transport computation time in MT3DMS for the 2D and 3D problems. Yet, parallel computing time was not estimated for MODFLOW/MT3DMS as in MT3DMS transport simulation; parallelisation was not supported. Tables 1 and 2 reveal that computational time is significantly reduced with parallel computing implementation in the simulators FEFLOW and COMSOL. MODFLOW/MT3DMS performs the best, having the fastest computation time for the coarse mesh. However, for a mesh with a higher number of elements, the transport simulation software MT3DMS computational time increases significantly due to the poor convergence, which reduces the time step size and increases the total number of time steps to complete the simulation. Overall, when running the simulations on a single processor, the computational efficiency is higher for MODFLOW/MT3DMS, while in parallel (multi-core) computing, the FEFLOW has the higher efficiency. We noted that COMSOL requires more random-access memory (RAM > 8 GB) than available on the machine to run the simulation. One problem experienced with DuMu X , was to inability to create finer grids (for a mesh with more than 22,000 elements). Therefore, the computation of the 3D problem was not successful in our computer used for running the benchmarks. Contrary to our expectation that the FE method-based simulators require more memory and have a slower computational speed than the FV method-based ones [39], we observe higher computational efficiency.

Resource Use Efficiency and Discretisation
The advantage of FE-based software (i.e., FEFLOW and COMSOL) is not explored explicitly through FE techniques that can create meshes for complex geometrical domains to avoid differences in degrees of freedom solved by different software packages. Although, recently, the unstructured mesh has been introduced to overcome the limitation of discretisation using control volume FD software MODFLOW 6 [20,40]). All simulators, except FEFLOW, show stable pressure conditions at the multi-layered injection and pumping well throughout the simulation period. Pressure values at the injection and pumping well in FEFLOW show oscillations during the early simulation period (simulation time 0-5 days) and achieve a steady condition before the tracer reaches the observation point. The fact to be noted is that in the well-boundary or highly variable boundary condition cases, FEFLOW requires a higher mesh density to simulate a stable flow condition. Numerical stability and convergence in the codes FEFLOW, COMSOL, and DuMu X respond more sensitively with respect to spatial discretisation, while this is not observed in MOD-FLOW/MT3DMS simulations. Maina et al. [9] reported as well that both FEFLOW and MODFLOW/MT3DMS suffer from numerical dispersion/diffusion. Though by using finer discretisations, the truncation and rounding errors of the numerical scheme generally decrease [41], i.e., reducing the numerical dispersion, we found a slight improvement in FD for the 2D problem ( Figure 8). Similar to [9], we found that FE-based software FEFLOW and COMSOL and FD-based MODFLOW/MT3DMS have to be applied under restricted conditions to limit problems with numerical stability. Furthermore, MODFLOW/MT3DMS has a slightly higher accuracy for a coarse discretised model domain (40 × 20) than the other three simulators (Figure 8), but by increasing the mesh discretisation, the accuracy of the other simulators becomes better.
In the 3D problem, the total number of mesh elements in FEFLOW (19,392 elements) and DuMu X (22,400 elements) was higher than those used in MODFLOW and COMSOL (19,200 elements). The vertex-centred FV-based DuMu X simulator requires the construction of the FE mesh and the assignment of flow and transport properties (i.e., porosity and permeability) in the FE nodes. The FE nodes are then the centres of the control volumes in the secondary FV mesh, constructed by uniting the barycentres of the FEs and the midpoints of the FE edges (i.e., BOX method, [31]). Therefore, imposing the FE nodes at the interface separating two layers cannot represent the hydraulic properties precisely. This problem can be solved by placing two FE nodes at equal distances from the actual interface between the layers, both assigned with the individual property of the respective layer.

Transport Simulation Efficiency
The 1D problem addresses tracer transport from point source contamination for homogenous, isotropic conditions with a constant head gradient. The FE-based FEFLOW and COMSOL time-concentration curves are the closest to the analytical solution revealed by a low RMSE (Figure 4) for this problem. For the 1D case, the differences between these two simulators are significantly lower than 5% of C 0 except for the low dispersivity values (0.1-1 m). In the 2D case, COMSOL and FEFLOW show the difference is lower than 2% of the initial concentration, C 0 . Exceptionally low dispersivity (0.1-0.7 m) cases show higher relative errors than the analytical solution that might be associated with larger mesh size than dispersivity values, i.e., higher mesh Peclet number-related numerical dispersion. Significant higher differences in concentration values were shown for lower dispersivity values (0.1-1 m) in the 2D problem. However, the difference is insignificant for higher dispersivity values (3-10 m). In that instance, it is worth mentioning that the numerical dispersion handling from different software packages varies with software and methods. Moreover, the numerical error that is estimated in relatively simple 1D cases reveals a plausible numerical solution can only be achieved for dispersion values ranging between 0.7 and 7 m or highly discretised (space and time), simple flow condition, close to a linear problem, which is sufficiently limit expectation of a numerical solute transport modelling output. We found that except for MODFLOW/MT3DMS, all software showed higher sensitivity to spatial discretisation. Hence, grid size significantly impacts solution accuracy in simulations employing FE and FV software codes, while a finer discretisation does not affect FD accuracy. However, Maina et al. [9] reported that both MODFLOW/MT3DMS and FEFLOW are sensitive to spatial discretisation. They also reported that FE simulators computed an early tracer breakthrough in homogenous uniform flow simulation. We also found similar behaviour for FV-based DuMu X and FE-based FEFLOW and COMSOL simulators in both 1D and 2D problems (Figures 4a and 5a). Moreover, for a lower dispersivity, numerical dispersion is higher with DuMu X (Figure 5b). Therefore, a difference in tracer breakthrough from different software for the different aquifer layers in the 3D problem is well expected. If they are only compared for commonly perceived dispersion cases (5 and 7 m), the observed concentration variations among the software are highly significant (e.g., Figure 7). The simulated peak arrival time variation should only be associated with a change in permeability and porosity. The 3D case demonstrates that the results vary depending on how boundary conditions are implemented in the different codes and how hydraulic parameters vary across the layers. Therefore, we observed a varied peak arrival time, peak concentration, and BTCs for all four simulators ( Figure 6). Yet, it can be observed that all software codes properly simulate all relevant processes. The efficiency of a numerical method varies depending on the PDEs (e.g., parabolic, hyperbolic) to be solved. We selected four different software codes to solve the transport equation by a hyperbolic PDE. The mathematical properties of the solute transport equation vary depending upon the terms in the equation dominating in a particular situation. Hence, numerical methods would be deficient in solving groundwater transport problems of varied flow and transport situations encountered in the field [11]. In our 3D reservoir problem, the velocity may be close to zero in low permeability layers, and the transport processes are dominated by dispersion, i.e., non-Fickian transport [19] compared to advection-dominated transport processes in high permeability zones or at the pumping wells of three high permeability layers in 3D problem. Thus, for the 3D simulation, the governing equation may show more hyperbolic character in one area, such as near the well (or at one time), and more parabolic in another area, such as at a low permeability zone (or at another time).

Computation Time and Memory Use Efficiency
The computational time of the participating software (Tables 1 and 2) revealed a significant difference in the numerical performance considering the degrees of freedom that were solved by each simulator. The same rectangular grid is used for all four simulators. However, numerous choices are available even with the same numerical simulator, such as adding restrictions on time-stepping or adding regularisations (such as a constraint controlling the pressure). These may improve the accuracy at the cost of longer runtimes. Typical for FE software, COMSOL requires a larger memory (RAM) to run the simulation [39]. Memory reliance by DuMu X mainly arises when the grid is created (UG-Grid module was used in the 3D problem). Furthermore, COMSOL requires a larger memory to store the solution for each time step during the simulation. In contrast to Huebner et al. [39], the FE simulation by COMSOL and FEFLOW is faster for 2D problems than the FV-based software DuMu X .

Implementation of BC in Software
In transport simulations, mal-defined boundary conditions are common sources of errors. When a Dirichlet BC (constant concentration) is selected in a transport model, for example, in our 1D and 3D problems, a solute flux will be forced into or out of that cell to maintain concentration in that particular cell, and the flux, which can occur by both advection and dispersion processes (e.g., [11]). Moreover, in real hydrogeological or geochemical applications, a constant concentration over time is unlikely regardless of the accompanying flow field changes or local concentration gradients. Thus, it is inappropriate to apply a constant concentration boundary condition to a field problem to represent concentration in open water bodies bounding an aquifer with a head or open flow boundary, or for a boundary far from an area containing a solute plume of interest [13]. For flux boundary conditions, such as implementing multi-layer injection well, we find that assigning a well boundary without a wellbore storage condition works better. However, with a wellbore storage condition, a higher mass transfer is estimated in MODFLOW/MT3DMS and FEFLOW (30% of the total fluid mass in the 3D model), whereas COMSOL shows a minor influence from the open flow boundary. Again, hydraulic conductivity can vary significantly over short distances, and heterogeneity can exhibit significant spatial correlations, persistence, and connectedness. Hence, simulation of the "true" velocity distribution in space and time directly correlated to the accuracy and precision of actual distribution of K in the model domain. Notably, the 3D problem revealed that, once heterogeneities and anisotropy are introduced, model predictions differ. This is probably a consequence of the differences in implementing the hydrogeological parameters (e.g., permeability, porosity) in the simulators due to different spatial discretisation methods [42]. For instance, permeability contrast layers are introduced and additionally discretised in the vertex-centred DuMu X model, while a cell-centred parameter distribution is used in MODFLOW/MT3DMS. Hence, the differences in implementation between the models are: − Implementation of the model grid; − Implementation of flow and transport boundary conditions.

User-Friendliness
The first author implemented the three problems (1D, 2D, and 3D) in each of the four simulators. Before conducting this study, the first author familiarised himself with the use and implementation of flow and solute transport in FEFLOW. The implementation of the other three simulators was achieved by discussing with the other authors, familiar/experts in other participating numerical simulators. This way, the first author has experienced and acquainted himself with all software used within the frame of this work and, hence, used that software to set up the models, run simulations, and extract the data for further analysis. Therefore, the first author was in a position to judge the user-friendliness for new users for the three simulators, COMSOL, MODFLOW/MT3DMS, and DuMu X . Moreover, several master's students working with the authors for their project familiarised themselves with the software and exchanged their experiences. Eventually, during the follow-up discussion, sources of errors and the difference in simulation results were identified, and different implementation approaches such as refined grid, time-steps, boundary condition implementation were considered. COMSOL and FEFLOW model builder UI (user interface) are beneficial and easy to grasp by a new user to implement a problem in the software package. The COMSOL program window is well organised and particularly intuitive. The model setup is tailored by defining a series of PDEs to describe the simulated physical phenomena. All the components of the constructed model can be accessed and edited in a panel (Model Tree) program window in COMSOL. Hence, the COMSOL simulation environment facilitates all steps of the modelling process: defining geometry, specifying physics, meshing, solving, and post-processing. The same applies to FEFLOW as well. Commercial user interfaces (e.g., Visual MODFLOW) or freeware, e.g., ModelMuse, supports pre-and post-processing in MODFLOW/MT3DMS. Its feature-based boundary and naming special functions (e.g., evaporation, recharge, river boundary conditions) are easy for a new modeller to understand. On the other hand, DuMu X is not supported by any pre-processing GUI, which may be regarded as a disadvantage. Though it offers higher control over the simulation process and parameter estimation through building the problem script, modifying and updating it for the desired process or simulation in any C++ editor, the result can be visualised and post-processed with specialised software, such as Paraview. Additionally, the COMSOL and FEFLOW GUIs allow for easy modification of parameters. However, the user cannot edit or view the source code, which reduces the control and overall insight into the simulation process.

Conclusions
For very low dispersivity values, such as 0.5 m or lower, numerical simulation results show significant oscillations or are not converging in FE software packages FEFLOW and COMSOL, but FD software MODFLOW/MT3DMS simulations were found to generate stable results, which also applies to the FV software code DuMu X . However, relative errors are significantly higher for low dispersivity cases than for the analytical solution. This error is especially prominent for DuMu X , for which the excellent convergence comes at the cost of more significant numerical dispersion. For 1D and 2D cases, all numerical simulations show good agreement with the analytical results. Moreover, increasing spatial discretisation (grid refinement) improves accuracy for all four software packages. COMSOL Multiphysics needs a finer mesh to produce the same level of accuracy as FEFLOW and DuMu X simulations for the 2D cases. For the choice of the appropriate simulation software, the specific demands of the problem statement need to be considered for transport simulations. For a forced gradient set-up, where a commonly higher dispersion value is expected, the FE software FEFLOW is the best choice. Due to the high requirements for mesh refinement assembling the model in virtual memory, COMSOL Multiphysics has the highest demand for computer resources. However, the more significant solution time of COMSOL is compensated by its intuitive 'user interface', implementing different problems relatively fast as it does not need any changes in the source code. On the other hand, DuMu X is an academic, open-source code that is freely distributed, which is an advantage compared to purchasing commercial software. Furthermore, with the program code being generally available, modifications can easily be made, and the source code can be adapted to specific non-standard problems. For single-phase transport problems, COMSOL represents a good choice for a simulator; however, for more complex physics (e.g., multi-phase flow [43,44], hydromechanical coupling [6,45], etc.), further benchmarking studies need to be conducted to test the efficiency of these simulators.