Hybrid neural network reduced order modelling for turbulent flows with geometric parameters

Geometrically parametrized Partial Differential Equations are nowadays widely used in many different fields as, for example, shape optimization processes or patient specific surgery studies. The focus of this work is on some advances for this topic, capable of increasing the accuracy with respect to previous approaches while relying on a high cost-benefit ratio performance. The main scope of this paper is the introduction of a new technique mixing up a classical Galerkin-projection approach together with a data-driven method to obtain a versatile and accurate algorithm for the resolution of geometrically parametrized incompressible turbulent Navier-Stokes problems. The effectiveness of this procedure is demonstrated on two different test cases: a classical academic back step problem and a shape deformation Ahmed body application. The results show into details the properties of the architecture we developed while exposing possible future perspectives for this work.


Introduction
Shape optimization in the context of turbulent flow problems is a particularly challenging task. The difficulty is linked with both the high-dimensionality of the problems that need to be solved and the number of configurations to test, the first one due to the physics, the second one due to the scope of the research. These two features make usually the problem intractable with standard numerical methods (e.g., finite element, finite volume, finite difference methods). Reduced order models [2,3] (ROMs) are a possible tool that can be used in such a setting to make the problem solvable. There exist a variety of reduced order modeling techniques but the overall principle of all of them is to unveil a low dimensional behavior of a high dimensional system to allow faster computation.
ROMs can be classified depending on the technique used to approximate the solution manifold and the method used to evolve the latent dynamics. The most used techniques to evaluate the solution manifold are based on linear approximation methods such as the reduced basis with a greedy approach ( [25,34]), the proper orthogonal decomposition ( [37]) or non-intrusive methods as exposed in [39] but more recently also nonlinear methods have been proposed ( [30,28]). For what concerns the evolution of the latent space dynamics arguably the most common approach is based on (Petrov-) Galerkin projection of the original system onto the reduced subspace/manifold [7]. Data driven techniques [10], which are solely based on the reconstruction of the mapping between input and output quantities are also a possible approach. Recently, the latter techniques received particular attention also due to the latest discoveries in machine learning. Data-driven methods are usually easier to implement and permit to obtain efficient ROMs also in the case of nonlinear/nonaffine problems and in the case of commercial codes with no access to the discretized full order system. On the other hand, they usually do not exploit information concerning the underlying physical principles and they might require a large number of training data to produce accurate results. Projection based techniques, thanks to the projection stage, incorporate in a natural way the physical knowledge but are particularly challenging to be implemented in the case of nonlinear and non-affine problems.
In this work we propose a hybrid approach where the underlying partial differential equations are partially treated using a standard POD-Galerkin approach and partially by neural networks data-driven approaches. This choice is dictated by both practical and theoretical considerations. The practical one concerns the idea of generating an approach that could be applied to any turbulence model without the need to modify the reduced order model. In incompressible turbulent flows there exist a large number of turbulence models, used to outflank the difficulty in solving the dissipative scales, and, using a projection-based technique, would require to create a new reduced order model for each of them. Secondly, despite the large amount of theoretical work behind turbulence models, there are still a number of empirical coefficients and this makes the overall formulation less rigorous in terms of physical principles. These considerations have been used to propose a reduced order model that could be applied to any eddy viscosity turbulence model and that exploit a projection based technique for mass and momentum conservation and a data driven approach for the reconstruction of the eddy viscosity field. The model is constructed extending the work done in [23,18] to geometrically parametrized problems [38] with a modification of the approach to reconstruct the eddy viscosity mapping.
In the first part of this work we present all the technicalities related to the implementation of the previously described hybrid method: subsection 2.1 contains the Finite Volume discretization of the incompressible Navier-Stokes equation employed for this work, subsection 2.2 explains the method we selected for the motion of the mesh due to geometrical parametrization, subsection 2.3 introduces the reduced order model while subsection 2.4 gives an overview on the actual algorithm used for the resolution, subsection 2.5 treats the eddy viscosity evaluation. The second part of the paper is devoted to the presentation of the results related to two different test cases: a classical academic back step with variable slope of the step into subsection 3.1 and a second, more applied, one, shown into subsection 3.2, where the flow around an Ahmed body with variable slope of the rear part is resolved, both revealing good behaviours and promising results. In the end, few considerations and possible future developments for this work are present into section 4.

The full order problem
In this work we are interested on Reynolds Averaged Navier Stokes (RANS) problems in a geometrically parametrized setting. This section is devoted to the explanation of the full order discretization employed to obtain a high fidelity solution.
The problem we want to deal with is modeled by the following equations: where u = u(t, x, µ) stands for the time averaged velocity field, p = p(t, x, µ) stands for the mean pressure field, ν is the kinematic viscosity, ν t is the eddy viscosity, g D is the boundary value to be assigned on Dirichlet boundaries while g N is the boundary value to be assigned on the Neumann boundaries. The vector µ ∈ P ⊂ R p is representing the vector of dimension p containing the parameters of the problem that, at this stage, can be both physical or geometrical without any necessity of specification. From now on we will consider just steady state problems. For this reason the time derivative into the momentum equation will be neglected. Moreover we get u(t, x, µ) = u(x, µ), p(t, x, µ) = p(x, µ) and we will refer to them as just u and p for sake of simplicity.
For these kind of applications, the use of Finite Volume techniques is common and reliable, even though Finite Element methods are widespread used (see [15]) and mixed techniques are available Figure 1: Scheme of the relation between two neighbor cells of the tessellation T . too (see [11]). To approximate the problem by the use of the Finite Volume technique, the domain Ω(µ) has to be divided into a tessellation T (µ) = {Ω i (µ)} N h 1 so that every cell Ω i is a non-convex polyhedron and . For sake of brevity, from now on, we will refer to Ω i (µ) as Ω i .
The steady-state momentum equation written in its integral form for every cell of the tessellation T , reads as follows: Let us analyze this last equation, term by term. The convective term can be treated by the use of the Gauss' theorem: where S i is the total surface related to the cell i, S ij is the oriented surface dividing the two neighbor cells i and j, u ij is the velocity evaluated at the center of the face S ij and F ij is the flux of the velocity through the face S ij (see Figure 1). Two considerations have to be underlined for this procedure. The first one is that u ij is not straight available in the sense that all the variables of the problem are evaluated at the center of the cells while here an evaluation for the velocity is required at the center of the face. Many different techniques are available to obtain it but the basic idea behind them all is that the face value is obtained by interpolating the values at the center of the cells. The second clarification is about fluxes: during an iterative process for the resolution of the equations, they are calculated by the use of the velocity obtained at previous step so that the non-linearity is easily resolved.
We now deal with the pressure term exploiting the gradient theorem: where p ij is the pressure evaluated at the center of the face S ij . The last term to be taken into consideration is the diffusive one: where (ν + ν t ) i is the viscosity for the i-th cell, (ν + ν t ) ij is the viscosity evaluated at the center of the face S ij and (∇u) ij refers to the gradient of the velocity evaluated at the center of the face S ij . Notice that the gradient of the velocity is not known at the face of the cell. If the mesh is orthogonal, the approximation of its flux is straightforward: where d is the vector connecting the centers of cells i and j. If the mesh is not orthogonal (see Figure 1), a correction has to be added: where S ij has been decomposed into a component parallel to d, namely π ij , and another one orthogonal to d, namely ω ij . The term (∇u) ij is finally evaluated by interpolation starting from the values (∇u) i and (∇u) j at the centers of the neighbor cells.
Now the complete discrete momentum equation can be written: After having applied the necessary interpolation for face centers quantities evaluation, the whole system can be rewritten into its matrix form as follow: where A u is the matrix containing all the terms related to velocity into the discretized momentum equation, B p is the matrix containing the terms related to pressure into the same equation, ∇(·) is the matrix representing the incompressibility constraint, u h is the vector where all the u i variables are collected and the same applies for p h with respect to p i having u h ∈ U h ⊂ R d N h and p h ∈ Q h ⊂ R N h with d spacial dimension of the problem. The interested reader can find deeper explanations on the Finite Volume discretization technique in [27,24,31].
In this work, for what concerns the offline phase, a segregated pressure-based approach has been selected. In particular, the Semi-Implicit Method for Pressure-Linked Equations (SIMPLE) algorithm has been employed. This choice is due to the difficulties given by velocity-pressure linked problems (see e.g. [12]).
To better understand the procedure, let us report here the crucial points about this algorithm, they will be very useful later during the description of the ROM technique in this paper.
First of all we can divide the operator related to velocity into a diagonal and an extra-diagonal parts so that After that, recalling Equation 2, we can reshape the momentum equation as follows: In an iterative algorithm, we can express both velocity and pressure as their value at previous iteration plus a correction term: where * terms are the old ones while are the corrections terms. With some approximations for the mixed terms, the following relation holds: Into the SIMPLE algorithm a big assumption is taken since the extra-diagonal term H(u ) is discarded and put to zero. Of course this makes the whole procedure no more consistent but on the counterpart it makes the resolution of the so-called pressure correction step much easier. We then get: If we now apply the divergence operator to both sides of Equation 2.1, we end up with a Poisson equation for pressure by exploiting the incompressibility constraint:

Mesh motion
When working in a finite volume environment, the geometrical parametrization matter is complex to be approached and treated. Some points have to be considered before starting: • as shown in subsection 2.1, also element-wisely, all the equation are written in their physical domain; • a finite volume mesh does not have a standard cell shape, resulting on an almost randomshaped polyhedra collection; • mapping the equations to a reference domain may require the use of a non-linear map but this choice wold lead to a change in the nature of the equations of the problem (see [16]).
For all the reasons above, it may not be a good idea to rewrite the problem into a reference geometry to map it back to the real domain at the end of the resolution. On the contrary in this work we decided to operate always on the real domains, moving the real mesh both during the offline and online phases. In fact, since no mapping is used, also at the online level everything is calculated in the real domain that has to be modeled according with the online parameter. This is the reason why we need a very efficient strategy for the mesh motion: in case it takes too much effort to be carried out, it compromises all the benefit coming from the reduction.
To move the mesh we use a Radial Basis Function (RBF) interpolation strategy [14]. The general formula for the evaluation of the displacements of the grid reads: where δ(x) is the displacement of the grid node positioned in x, N b is the number of selected control points on the moving boundary, ω i are some calculated weights, ϕ is a fixed function whose support is a round area of predetermined radius r, x b i are the coordinates of the control points and q(x) is a polynomial.
The procedure can be summarized in the following steps: 1. select the control points into the boundaries to be moved and shift their position obeying the fixed motion rule selected for the geometry modification, accordingly with the parameter dependent displace law: they can be either all the points into the boundary or just a fraction of their total amount if the dimension of the mesh is big enough (see Figure 2), since the higher is the number of control points, the bigger (and then expensive) is the resulting RBF linear problem to be solved; 2. calculate all the parameters for the RBF to ensure the interpolation capability of the scheme: resulting on the solution of the following linear problem: i for each row, α contains the coefficients for the polynomial q(x) and δ b are the displacements for the control points, known a priori (see [9]); 3. evaluate all the remaining points of the grid by applying Equation 3.
Few aspects have to be underlined about the procedure above: • Equation 3 is used not just to move the internal points of the grid but also the points located on the moving boundaries that are not selected as control points: even if their displacement could be calculated exactly, changing their position by rigid translation while all the points of the internal mesh are shifted by the use of the RBF may lead to a corrupted grid; • Equation 4 requires the resolution of a dense linear problem whose dimension is equal to Thus, the number of control points have to be carefully selected. Fortunately the resolution of Equation 4 has to be carried out just once, storing all the necessary parameters to be used in the following mesh motions; • by the use of this mesh motion strategy, one ends up with meshes having all the same topology which is an important feature when different geometries have to be compared.

The reduced order problem
The resolution of Equation 1 for many different values of the parameter may become unaffordable. For this reason, the scope of this work, is to find an efficient way to get an accurate solution at a lower computational cost, namely a Reduced Order Model (ROM). To pursue this goal, we relay on a POD-Galerkin technique. It consists on computing a certain number of full order solutions s i = s(µ i ), where µ i ∈ T for i = 1, ..., N t , being T the training collection of a certain number N t of parameter values, to obtain the maximum amount of information from this costly stage to be employed later on for a cheaper resolution of the problem. Those snapshots can be resumed at the end of the resolution all together into a matrix S ∈ R N h ×Nt so that: The idea is to perform the ROM resolution that is able to minimize the error E ROM between the obtained realization of the problem and its high fidelity counterpart. In the POD-Galerkin scheme, the reduced order solution can be exploited as follow: where N r ≤ N t is a predefined number, namely the dimension of the reduced order solution manifold, β j (µ) are some coefficients depending only on the parameter while ξ j (x) are some precalculated orthonormal functions depending only on the position.
The best performing functions ξ j are, in our case, the ones minimizing the L 2 -norm error E ROM between all the reduced order solutions s ROM i , i = 1, ..., N t and their high fidelity counterparts: Using a Proper Orthogonal Decomposition (POD) strategy, the required basis functions are obtained through the resolution of the following eigenproblem, obtained with the method of snapshots: where C ∈ R Nt×Nt is the correlation matrix between all the different training solutions, V ∈ R Nt×Nt is the matrix containing the eigenvectors and λ ∈ R Nt×Nt is the matrix where eigenvalues are located on the diagonal. All the elements of C are composed by the L 2 inner products of all the possible couples of truth solutions s i and s j . Of course the choice of a POD procedure for the creation of the modal basis functions is not the only possible one, see e.g. [17], [13] and [21].
What may result confusing about this last computation is the fact that the L 2 norm is not well defined since all the realisations are obtained for different parameter values and, thus, for different domains. In this work we overtake this problem by exploiting the fact that all the meshes have the same topology. It is then possible to define a mid-configuration by the mesh motion obtained through a specific parameter µ mid resulting from: In our case we use equispaced offline parameters to compose T leading to just µ mid = The correlation matrix can then be easily assembled as: Finally the POD basis functions are obtained as a linear combination of the training solutions as follows: All the basis functions can be collected into a single matrix: It is used to project the original problem onto the reduced subspace so that the final system dimension is just N r . Supposing N r N h , this procedure leads to a problem requiring a computational cost that is much lower with respect to the high fidelity one (see Figure 3). Many different ways can be chosen to solve the reduced problem. For example the whole system in Equation 1 can be assembled and projected in a monolitic approach or the equations can be treated one at a time in an iterative procedure. As we will see in subsection 2.4, in this work we decided to deal with a segregated approach. This means that the momentum predictor and pressure correction steps are iterated until convergence is reached. Since the solution fields during these iterations vary a lot, from the first attempt for the variables to last resolution, the information contained into the converged snapshots is not sufficient to ensure the correct reduced reconstruction of the path to the global minimum for Equation 1.
To overtake this issue, the idea proposed here is to enrich the set of snapshots for the matrix into Equation 5 by the use of some intermediate snapshots that are stored during the iterations of the full order problem, as shown in Figure 4. The matrix we obtain is: This procedure is of course somehow polluting the physical content of the resulting POD basis functions, since the intermediate steps solutions physical meaning is almost negligible, but the real gain of this procedure is to ensure a better convergence for the ROM algorithm.

The Reduced Order SIMPLE algorithm
We present here a new strategy for the resolution of the reduced problem: since for the full order solutions we rely on a segregated pressure based SIMPLE algorithm, the application of a monolithic approach for what concerns the online phase would lead to an inconsistency. In fact, the decoupling of the equations into the system reported in Equation 1, requires a slight modification of their form. For this reason we developed a Reduced Order SIMPLE algorithm, based on the full order one, that simulates the high fidelity behaviour for what concerns the convergence to the final solution, utilizing projection-based techniques. In the following Algorithm 1 we present the main steps for the implementation of this algorithm. For the interested reader, its laminar counterpart can be analyzed in more detail in [38]. Turbulence in this algorithm is treated, as it can be done for the whole SIMPLE family of algorithms, by the addition of an extra turbulent viscosity ν t (see [41]). Let us introduce here the snapshots matrices containing the full order solutions of Equation 1: where d is the space dimension of the problem and N s is the number of realizations equal to the number of provided training parameter values.
For the application of a projection-based reduction procedure of Equation 1, two different sets of basis functions have to be provided, for pressure and velocity respectively. This means that the procedure we exposed in subsection 2.3 has to be carried out for both S p and S u . Reduced pressure p r and reduced velocity u r can then be written as: where N p ≤ N s and N u ≤ N s are the selected number of modal basis functions chosen to reconstruct pressure and velocity manifolds V p and V u respectively, so that p r ∈ V p = span{θ 1 , . . . θ Np } and u r ∈ V u = span{ψ 1 , . . . ψ Nu }, being θ i the POD basis for pressure and ψ i the POD basis for velocity. Matrices Θ and Ψ contain the modal basis functions for pressure and velocity.

Algorithm 1 The Reduced Order SIMPLE algorithm
Input: first attempt reduced pressure and velocity coefficients b and a ; modal basis functions matrices for pressure and velocity Θ and Ψ Output: reduced pressure and velocity fields p r and u r 1: From b and a , reconstruct reduced fields p and u : Then correct the velocity explicitly after having reconstructed the new pressure p ; 6: Relax the pressure field and the velocity equation with the prescribed under-relaxation factors α p and α u , respectively. The under-relaxed fields are called p ur and u ur ; 7: if convergence then 8: u r = u ur and p = p ur ; 9: else 10: Assemble the conservative face fluxes F ij : 11: set u = u ur and p = p ur ; 12: iterate from step 1.

13: end if
Fluid flows projection based ROMs usually require to be stabilized in some way (see e.g. [8,26,5]). For Navier-Stokes problems, in particular, the use of stable snapshots does not guarantee the Ladyzhenskaya-Brezzi-Babushka condition fulfillment for the saddle-point problem (see [6]). The accuracy in the pressure field is of high relevance for many different configurations (see [35]). In this case, the application of a segregated approach, also at the reduced level, leads to the complete unnecessity of extra stabilization.
Into step number 2 of Algorithm 1 no explanation is provided on how to evaluate the eddy viscosity ν t . This is a crucial point of the whole procedure and requires a deeper analysis that we provide to the reader in subsection 2.5.

Neural Network eddy viscosity evaluation
Different possibilities are available for the closure of turbulent problems (see [40]); to make the ROM independent from the chosen turbulence model in the FOM, different approaches are eligible (see, e.g., [22,19]). In this case a data-driven approach is employed for the eddy viscosity ν t . Analogously as for velocity and pressure, first, the reduced eddy viscosity ν tr is computed via POD on the snapshot matrix S νt ∈ R N h ×Ns : where ζ i and c i are the POD modes and coefficients for eddy viscosity, respectively, and N νt ≤ N s denotes the selected number of modes to reconstruct the eddy viscosity.
In contrast to the POD coefficients of velocity and pressure, which are obtained by projecting the full order problem onto the respective POD modes and subsequently solving the reduced order problem, the POD coefficients for the eddy viscosity are modeled via a multilayer feedforward neural network. This neural network takes as the input the POD coefficients for velocity a and the corresponding geometrical parameters values µ and maps them to the POD coefficients of the turbulent viscosityc (Tilde denotes a prediction from the neural network) as the output (Figure 5).  Figure 5: Illustration of a neural network that maps the POD coefficients for velocity a ∈ R Nu and the parameter values µ ∈ R p as inputs to the the POD coefficients c ∈ R Nν t of the eddy viscosity ν t via N l fully connected layers.
Subsequently, the basics of multilayer feedforward neural networks and their training process are briefly reviewed; for a comprehensive description, we refer to Goodfellow et al. [20]. The input to the neural network is commonly denoted as x and for our application reads: The choice on what to use for the input is supported by the fact that the dependency of the eddy viscosity field on the velocity field is well known because of the way the RANS equations are constructed while the dependency on the geometric parameters help in the accuracy of the network. The mapping from this input vector to the coefficients for the eddy viscosityc is learned by the multilayer neural network via N l fully connected layers: where layer i (i = 1, . . . , N l ) performs an affine transformation of its input (specified by the trainable weight matrix W i and bias b i ) that is subsequently passed through the (linear or nonlinear) element-wise activation function f i . To train the weights θ in supervised learning, the empirical risk over the training data J is minimized: wherep data and n train denote the empirical distribution of the training data and the number of training samples, respectively; L(c, c) is a per-sample loss metric that describes the discrepancy between target output c (given by training data) and predicted outputc (by neural network).
As loss function, we use the squared L 2 -loss function (also known as mean squared error), the most common choice for the loss function in regression problems: Employing this loss function, the objective function J is minimized using the Adam [29] optimizer with minibatching, and the required gradients of the parameters with respect to the loss function are calculated via backpropagation [33]. The hyperparameters of the neural network, which are the parameters that are not subject to the optimization during training, were tuned for each test case separately by minimizing the loss on a designated validation data set (while the accuracy evaluation of the neural network was finally performed on a third set, referred to as test set). The hyperparameters subject to tuning were: the height and width of the neural network (i.e. the number of hidden layers and units per hidden layer, cf. Figure 5), the activation functions for each layer, and the learning rate as well as the batch size of the Adam optimizer. For the creation and training of the neural networks, we employed the Python library PyTorch [32].

Academic test case
The first test case we propose to check the effectiveness of the procedure previously described is a classical 2D back step problem where the slope of the step is parametrized and can be varied (see Figure 6).
All the results provided in this paper are obtained by the use of an in-house open source library ITHACA-FV (In real Time Highly Advanced Computational Applications for Finite Volumes) [36], developed in a Finite Volume environment based on the solver OpenFOAM [1]. Figure 6: Geometry of the domain The set of equations we want to consider are the ones reported in Equation 1 where g D = [1, 0, 0] T , g N = 0, the eddy viscosity ν t is obtained by the resolution of a k − turbulence model and ν = 1 × 10 −3 .
With reference to Figure 6, the height of the duct at the inlet, namely h 1 , is equal to one while it is equal to 1.7 in the middle of the channel, namely h 2 . The domain is divided into 14 × 10 3 hexahedral cells mesh. The mesh motion is carried out by the use of a Radial Basis Function approach, as explained in subsection 2.2.
The Reynolds number characterizing the dynamics of the problem can be evaluated taking into account both the fluid properties together with geometrical aspects as: Since the range for the Reynolds number we are working at is on the border line between laminar and turbulent flows, we are forced to consider a turbulence closure model.
For the offline phase we selected 50 equispaced values of the parameter µ ∈ [0, 75]. Those values of the angle of the step were used to solve 50 different full order problems in order to construct the snapshots matrix.
By applying a POD procedure, we can obtain the modal basis functions we need to project the equations.  By analyzing Figure 7 we can notice that at least 25 modes have to be selected for ν t in order to catch the main part of the information contained into the offline snapshots. For what regards pressure and velocity manifolds, they are here projected and then reconstructed using 35 basis functions.
Thus, a neural network has been constructed for the eddy viscosity approximation at every reduced SIMPLE algorithm step as explained in subsection 2.4.
The neural network employed here is composed by: • an input layer, whose dimension is equal to the dimension of the reduced velocity, i.e. 35, plus one for the parameter; • two hidden layers of dimension 256 and 64 respectively; • an output layer of dimension 25 for the reduced eddy viscosity coefficients.
The net is a fully connected one. Moreover the neurons of the hidden layers are characterized by the employment of ReLU activation functions. For the training procedure, the Adam optimizer has been selected and 10 4 epochs have been fixed.
The training set is composed by both the intermediate and final solutions obtained during the offline phase, randomly selected. To control the training procedure, a test set has been selected too: 10 totally random new parameter values have been chosen and their related full solutions have been calculated, saving both final and intermediate steps, coherently with the offline snapshots used for training. Looking at Figure 8, it can be noticed that there is a nice agreement between train and test loss functions. This is a good indicator for the extrapolation capability of the net.  In Figure 9, Figure 10 and Figure 11, we show the comparisons between full order model (FOM) and ROM solutions for velocity, pressure and eddy viscosity. Two random angles have been selected to show the behaviour of the model for both a very low parameter value and for a very high one. As it may be noticed, the reconstruction of the reduced order model is very accurate and the errors are pretty low. The main differences between the high fidelity and the reduced solutions are present for high values of the parameter. This is to be addressed to the fact that the mesh is really distorted for those cases and the good orthogonality properties of the original mesh are lost.
In any case the model is able to tackle the full order solution and can predict in a consistent way the correct solution.   As proof of what it has just been said, we show on Figure 12 the trend of the L 2 norm relative errors while varying the dimension of the reduced manifolds for velocity and pressure at the same time. The values presented in this plot are the mean relative errors between 10 random chosen parameters for the online phase.

Ahmed body
As the second test case, we chose an automotive external aerodynamic one: the Ahmed body [4]. The Ahmed body is a generic vehicle: the flow around the back of this bluff body contains the main flow structures that are encountered also for real-life vehicles. We defined one geometrical parameter -the slant angle -using RBF mesh morphing (see Subsection 2.2). Figure 13 shows the Ahmed body and illustrates the covered design space by the slant angle parameter. Depending on the slant angle, different flow regimes are encountered (cf. Figure 15): (1) below approximately 12°, the flow remains attached over the slant; (2) between 12°and 30°, forming c-pillar vortices as well as recirculation regions at the top and base increase drag; (3) at approximately 30°, the flow fully separates off the slant, thus leading to a sudden drag decrease. At this stage, the study is restricted to the initial part of a single flow regime ranging from 15°to 23°, which already constitutes a demanding task.
We sampled the parameter range (15°to 23°) uniformly with 20 RANS simulations using OpenFOAM ® with the Spalart-Allmaras turbulence model; these 20 simulations were decomposed into 10 for training (offline phase) the ROM and 10 to assess its accuracy (online phase). The inlet velocity for the simulations was set to 40 m s −1 , thus resulting in a Reynolds number of ≈ 2.8×10 6 based on the model length. Each mesh was created with SnappyHexMesh ® and contained  Figure 15). While from a CFD perspective the meshes are very coarse, they constitute a challenge for the ROM and are considerably larger compared with those of the academic test case (35 × 10 4 vs. 14 × 10 3 ). We saved every 20th of the total 2000 iterations as snapshots (velocity, pressure, and eddy viscosity fields), resulting in 100 snapshots per simulated slant angle. Each simulation took about 3 minutes on 16 CPU-cores.
After assembling the snapshot matrices with the intermediate as well as the converged iteration of the FOM simulations, we decomposed those matrices into modes and coefficients via POD. Figure 14 shows the corresponding cumulated eigenvalues for velocity, pressure and eddy viscosity. For the upcoming investigations, we chose to keep 30 POD modes for all three fields.  Figure 14: Cumulated eigenvalues of the POD for velocity, pressure, and eddy viscosity.
As described in subsection 2.5, the POD coefficients of the eddy viscosity are modeled via a neural network. For the present test case, the input of this neural network -for each of the 1000 training samples (10 angle values times 100 saved iterations per angle) -is given by the 30 POD coefficients of velocity and, additionally, the slant angle. The optimized neural network architecture consists of two hidden layers with 128 units each, Tanh activation functions, as well as a learning rate of 0.001 for the Adam optimizer, thereby using a batch size of 128; the training was terminated after 10 000 epochs. Analogously as for the academic test case, we assessed the model accuracy on the test data set (the 1000 samples corresponding to the 10 test geometries) and found that the model generalizes well to unseen data.
With the trained neural network for the eddy viscosity, we are enabled to solve the reduced order problem for test geometries, i.e. slant angle configurations not present in the training data. Subsequently, we evaluate the ROM accuracy quantitatively and qualitatively by comparing ROM and FOM results for the 10 test geometries. For the quantitative analysis, we (1) compare the drag coefficients and (2) compare the relative L 2 -errors between the velocity and pressure fields from ROM and FOM. For the qualitative comparison, we compare the velocity and pressure fields on two slices through the computational domain for two chosen test geometries. We start the accuracy assessment with the drag coefficient, the major quantity of interest in the development of vehicle aerodynamics. As the drag coefficient of the ROM is obtained by integrating the pressure and wall shear stress over the vehicle surface, this investigation also allows to implicitly assess the accuracy of surface field predictions for those fields. Figure 15 shows the drag coefficient c d over the slant angle for the conducted 20 FOM simulations and indicates the even distribution in the parameter space of the geometries used for training and testing. 14 Figure 15: Drag coefficient c d over slant angle for the 20 full-order simulations: the even distribution of geometries into train and test sets is illustrated. For the test geometries, additionally, the ROM prediction is shown. In black, albeit not used in the present study, the development of the drag coefficients for higher slant angles is shown.
The minimum and maximum absolute errors of the ROM are 1.5 (test sample at slant angle 22.8°) and 3.0 (15.4°) drag counts, respectively, while the mean error over all 10 test samples amounts to 2.4 drag counts. The drag coefficient in automotive vehicle aerodynamics is dominated by the pressure contribution (approximately 85 % pressure and 15 % viscous contribution for the present test case); accordingly, we found that the error in surface pressure between ROM and FOM accounts for the majority of the total error in the drag coefficient prediction. Therefore, the visible systematic offset between ROM and FOM for the drag coefficient can probably be reduced by improving the pressure field prediction, which is investigated next. Figure 16 shows the relative L 2 -errors between ROM prediction and FOM (solid lines) for velocity and pressure. As for the drag coefficient, the highest errors for both fields are encountered for the test sample with 15.4°slant angle. The errors for pressure are one magnitude higher compared with those for velocity. Additionally, the projection errors -the lower bounds for the ROM errors -are shown (dashed lines). While for the velocity a ROM prediction error close to the projection error is achieved, there is still room for improvement in the case of pressure (vertical distance between blue solid and dashed lines).
Finally, Figure 17 and Figure 18 compare the FOM and ROM fields qualitatively for velocity and pressure, respectively. We chose the test samples with the lowest and highest slant angle for this visual comparison.
For velocity and pressure, ROM and FOM results are in good agreement on both presented slices. In accordance with the quantitative results, for both fields, the errors for slant angle 15.4°a re higher compared with those at 22.8°. As the parametrization alters the vehicle geometry exclusively at the rear end, the main flow field variations are expected to occur in the wake area of the vehicle; accordingly, for velocity, the highest ROM errors are visible in this region. Additionally, smaller regions at the top of the front end exhibit higher errors for both test samples.
For the pressure, the regions of highest errors are scattered around the vehicle surface. Besides the wake region, in particular below the vehicle underbody high errors occur. The deficiencies of  Figure 15). The ROM errors (solid lines) lines are compared with those from the projection of the FOM solution into the POD subspace (dashed lines).    the pressure prediction of the ROM near the surface likely result in relatively high errors for the drag coefficients and is a topic of improvement for future work.

Discussion
In this paper we presented a new approach based on a technique that mixes up a classical projectionbased method for what concerns both the momentum equation and the incompressibility constraint with a data-driven procedure for what regards the eddy viscosity closure. This choice revealed a wide applicability and flexibility since the turbulence model selected for the offline phase does not affect in any way the computations during the online phase. Moreover the reconstruction of the eddy viscosity field is very accurate as showed in subsection 3.1.
The reduced SIMPLE algorithm we presented here in subsection 2.4, taking advantage of the coupling between the accuracy of projection-based methods and the versatility of neural networks, showed to guarantee good approximations in widely different fluid dynamics test cases. Moreover the idea of collecting converged fields together with middle iterations solutions ensures good convergence properties without showing relevant errors due to the physical information pollution of the modal basis functions, as explained in subsection 2.3.
Finally the choice of relying on an RBF approach for the mesh motion demonstrated to be effective while preserving a good shape of the modified mesh.
For what concerns the efficiency of the online phase of the problem, still some improvements are required and a natural forward step for this kind of applications would be the development of hyper reduction techniques for the reduced operators. This task could be also entrusted to neural networks approaches, trying to approximate the reduced operators by the evaluation, e.g., of an autoencoder. In any case the scope of this article was not focused on highly efficient hyper reduction techniques. Thus, even if in this procedure we are still relying on reconstructed fulldimension reduced order fields to assemble the equations, the results are in any case appreciable also in terms of time consuming.