Enhanced Drag Force Estimation in Automotive Design: A Surrogate Model Leveraging Limited Full-Order Model Drag Data and Comprehensive Physical Field Integration

Naffer-Chevassier, Kalinja; De Vuyst, Florian; Goardou, Yohann

doi:10.3390/computation12100207

by Kalinja Naffer-Chevassier ^1,2,†, Florian De Vuyst ^3,*,† and Yohann Goardou

^1 Renault Group, Technocentre Renault, 78280 Guyancourt, France
^2 LMAC Lab, Université de Technologie de Compiègne, CS 60319, 60203 Compiègne, France
^3 BMBI Lab, Université de Technologie de Compiègne, CNRS, CS 60319, 60203 Compiègne, France
^* Author to whom correspondence should be addressed.
^† These authors contributed equally to this work.

Computation 2024, 12(10), 207; https://doi.org/10.3390/computation12100207

Submission received: 16 August 2024 / Revised: 21 September 2024 / Accepted: 25 September 2024 / Published: 16 October 2024

(This article belongs to the Special Issue Synergy between Multiphysics/Multiscale Modeling and Machine Learning)

Abstract

In this paper, a novel surrogate model for shape-parametrized vehicle drag force prediction is proposed. It is assumed that only a limited dataset of high-fidelity CFD results is available, typically fewer than ten solutions for different shape samples. The idea is to take advantage not only of the drag coefficients but also of physical fields such as velocity, pressure, and kinetic energy evaluated on a cutting plane in the wake of the vehicle and perpendicular to the road. This additional “augmented” information provides a more accurate and robust prediction of the drag force than a standard response surface methodology. As a first step, an original reparametrization of the shape based on combination coefficients of shape principal components is proposed, leading to a low-dimensional representation of the shape space. The second step consists in determining principal components of the x-direction momentum flux through a cutting plane behind the car. The final step is to find the mapping between the reduced shape description and the momentum flux formula to achieve an accurate drag estimation. The resulting surrogate model is a space-parameter separated representation with shape principal component coefficients and spatial modes dedicated to drag-force evaluation. The algorithm can deal with shapes of variable mesh by using an optimal transport procedure that interpolates the fields on a shared reference mesh. The Machine Learning algorithm is challenged on a car concept with a three-dimensional shape design space. With only two well-chosen samples, the numerical algorithm is able to return a drag surrogate model with reasonable uniform error over the validation dataset. An incremental learning approach involving additional high-fidelity computations is also proposed. The learning algorithm is shown to improve the model accuracy. The study also shows the sensitivity of the results with respect to the initial experimental design. As feedback, we discuss and suggest what appear to be the correct choices of experimental designs for the best results.

Keywords: automotive engineering; drag force; surrogate model; reduced-order model; machine learning; data-driven; limited data; shape parameters; far field; optimal transport

1. Introduction

In automotive engineering, drag reduction is a crucial step aimed at improving aerodynamics to increase vehicle range. As it has a major impact on the architecture and design of the product, aerodynamics is among the first performance attributes to be studied and optimized. The design convergence is split between physical and numerical testing. While physical testing allows for faster iteration and more precise results, it is limited by the possibilities offered by a wind tunnel facility. It is often used in the final stages of development, with fine-tuning and approval in mind. In contrast, numerical simulation is used very early in a project to initiate the design convergence by eliminating non-compliant designs and proposing solutions to improve the aerodynamic performance. However, each simulation is time-consuming and costly. A typical automotive simulation involves turbulence models and unsteady computation, leading to turnaround times on the order of hours. Shape optimization is often used in this context to propose small alterations at various locations to improve the global performance. The design variables are mostly defined by the three-dimensional CAD shape parametrization. For standard vehicles, the admissible design domain can be rather large, with numerous parameters. Among the reference computational approaches for shape optimization, one can distinguish two main families of methods: gradient-based methods and surrogate modeling techniques.

1.1. Gradient-Based Approaches

The first family considers gradient-based or descent-based optimization algorithms. These iterative methods require at least the evaluation of the gradient of the cost function (the drag coefficient in the context of this paper) with respect to the shape variables. For the evaluation of the gradient, adjoint-based methods have marked significant progress in optimization techniques. The calculation of the gradient using the adjoint method was introduced by Lions [1], then further developed, e.g., by Pironneau [2] and Jameson [3,4]. The key feature of the adjoint method is that the numerical complexity to evaluate the gradient does not depend on the number of variables, enabling its application for high-dimensional parametric problems. Although extensively utilized in the aeronautical field, this method is still relatively underused in automotive aerodynamics. The adjoint method requires the development of an additional code, the so-called adjoint code. The case of adjoint methods for unsteady computations is a harder task since the primal time-dependent solution has to be stored. In this context, we mention the work by Cheylan et al. [5], where an adjoint solver was derived from a Lattice Boltzmann Method code for the Navier–Stokes equations, including a large eddy simulation (LES) turbulence model. However, adjoint-based shape optimization is limited in practical applications due to the iterative and computationally expensive nature of CFD simulations. This makes it difficult to meet the demands of fast, interactive design optimization, where rapid feedback is essential. Moreover, in the automotive industry, design is often driven by aesthetic constraints, which are difficult to quantify mathematically. This makes it challenging to integrate them directly into an optimization process.

1.2. Surrogate Modeling

The second family of methods groups together all the surrogate modeling techniques. A surrogate model relies on approximation methods such as regression to build an easy-to-compute approximate cost function. For that, a design of computer experiments (DoCE) is used. The first step is to choose a sampling strategy for the admissible design space. Then, high-fidelity (HF) computations are run to evaluate the cost function at these sample points. The so-called Response Surface Methodology (RSM) is widely used in the Computer-Aided Engineering community because of the simplicity of its design and setup. An RSM can be built with various approximation methods such as Radial Basis Functions [6,7], Gaussian Processes [8], or AI tools such as regression-based Artificial Neural Networks [9].

However, for high-dimensional design problems, a significant number of HF computations is needed. The Latin Hypercube Sampling (LHS) strategy makes it possible to sample a hypercube domain with a number of samples equal to the dimension d. However, this is known to be generally insufficient for obtaining an accurate response surface. An acknowledged empirical rule is to use a number of samples equal to d times a constant (between 3 and 5 for aerodynamic studies). In the automotive industry, this leads to an unrealistic computational budget. This is why innovative methods that require fewer input samples, such as the approach proposed in this paper, are being explored. By approximating the results of expensive and time-consuming CFD simulations, surrogate models enable the creation of rapid, interactive design tools that significantly speed up the evaluation and optimization process.
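As an illustration of the sampling budget discussed above, the following is a minimal sketch of a Latin hypercube sampler written with NumPy only. The function name `latin_hypercube`, the dimension `d = 3`, and the multiplier 4 are assumptions chosen for the example; they are not taken from the authors' implementation.

```python
import numpy as np

def latin_hypercube(n_samples, dim, rng):
    """Return an (n_samples, dim) Latin hypercube sample of the unit cube."""
    # One random point inside each of the n_samples strata, per dimension
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, dim))) / n_samples
    # Shuffle the strata independently in each dimension
    for j in range(dim):
        u[:, j] = rng.permutation(u[:, j])
    return u

rng = np.random.default_rng(0)
d = 3            # number of shape parameters (illustrative)
n = 4 * d        # empirical budget: a constant (here 4) times d
samples = latin_hypercube(n, d, rng)
```

Each column of `samples` contains exactly one point per stratum of width 1/n, which is the defining property of a Latin hypercube design. The unit-cube samples would still have to be rescaled to the admissible range of each CAD parameter.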

1.3. Shape Parametrization

Another important topic is the way to parametrize the car shapes. The performance of the optimization process can strongly depend on the shape parametrization. Some CAD parameter variations may have a very weak influence on the drag value. Of course, the “true” manifold of parameters of interest is not known a priori. Moreover, for technical and practical reasons, engineers may only consider and manipulate meshed geometries as “objects-of-study”. For different CAD samples, the mesh software may produce meshes with different mesh connectivity. Sometimes, the CAD parameters change during the study. If the surrogate model is built with those initial parameters, any changes would necessitate retraining the model. In that case, one has to imagine a reparametrization of the vehicle shapes from the available meshes. This will be discussed in this paper.

1.4. Adding Information from Available Volume Fields

As mentioned above, response surface methodologies can build regression functions of drag coefficients from HF computations at various shape parameter samples. By solely retrieving aerodynamic coefficients from high-fidelity calculations, a significant part of the information contained in the results is lost. It appears advantageous to leverage ancillary information such as velocity fields, pressure distributions, or kinetic energy. As drag coefficients are correlated with these fields, one can expect more accurate drag force predictions when this correlated information is provided. Volume fields defined on meshes are high-dimensional by nature. In order to obtain a low-dimensional representation of them, dimensionality reduction techniques such as principal component analysis (PCA [10]), proper orthogonal decomposition (POD [11]), locally linear embedding (LLE [12]), multidimensional scaling, or neural-based encoders [9] can be used. In the case of parametric problems, one can combine dimensionality reduction with response surface methods that predict the reduced variables, yielding a parametric reduced-order model of the field.

As presented later on in Section 2 (methodology), the whole drag force is the integral of the normal stresses exerted onto the vehicle skin. Thus, the drag force is a function of the pressure field and the velocity gradient. A way to derive a more physical drag force is to achieve a low-order surrogate model of the skin normal stresses. Another way to evaluate the drag force is to adopt the so-called far-field approach, where the drag force can be well-approximated by the integral of a difference of momentum fluxes over an arbitrary cutting plane, preferably located in the wake of the vehicle (this will be detailed and explained in Section 2). In this case, the drag force depends on the pressure distribution, velocity, and kinetic energy fields. A key advantage of far-field formulas is that the cutting plane can be the same for all car shapes of interest, allowing for easier principal component analysis and dimensionality reduction.

1.5. Scope, Objectives, and Structure of the Paper

Classical RSMs for drag prediction often require extensive input data, with hundreds to thousands of evaluations (simulations) needed to achieve acceptable accuracy. This evaluation-intensive requirement is not compatible with industrial processes where development times are short. Generating large datasets is not only time-consuming but also financially expensive. In the literature, there is a lack of surrogate models trained with a limited amount of input data, ideally of the order of the number of shape parameters. A major shortcoming of RSM is that it usually only makes use of quantities of interest (here, drag forces) that are often scalar, with all the physical fields being lost. Moreover, many current surrogate models are designed for predicting the drag of cars with significant geometric modifications or even different vehicle models. These models are useful in the early stages of development to assess the bulk geometry. However, during the fine styling phase, where modifications are minimal (often just a few centimeters), these models become less effective. There is a clear gap in the literature regarding models that can accurately predict drag for such small geometric changes and with only limited input data. To overcome this scientific challenge, we have to parameterize the car shape in such a way that drag force responses are strongly correlated to the shape parameters. There is also the constraint of a low-dimensional representation, since the dataset is made of few evaluations and one can only consider a few degrees of freedom for the identification of the operator that maps shape to drag force.

These constraints lead us to consider an enhanced surrogate model with the following features:

Low-dimensional reparametrization of the vehicle geometry.
Incorporation of physical fields to enrich the data and raise the information content without additional CFD computations; this represents the biggest difference from traditional methods.
Reliance on physical formulas for calculating drag forces.
Ability to compute sensitivities, i.e., accuracy for small geometry variations.

Let us also emphasize that the “limited evaluations” constraint sets the work apart from methods relying on extensive datasets (like deep neural networks and related data-intensive ML approaches). In addition, synthetic data produced by generative AI systems cannot meet the accuracy requirement for slight geometric changes (final fine styling phase).

The paper is organized as follows. Section 2 presents the methodology. It includes integral formulas of drag force (Section 2.1), the shape encoding and reparametrization (Section 2.2), the low-order representation of the momentum formula in the wake cutting plane (Section 2.3), the definition of the drag surrogate model (Section 2.4), and its use for a new “query” shape (Section 2.6). Section 3 gathers all the numerical experiments and validations conducted. The paper ends with concluding remarks and perspectives in Section 4.

1.6. Related Works

This section is intended to provide an overview of the recent literature and related works in the context of surrogate modeling and machine learning methods for automotive drag force prediction. In aerodynamic studies, the performance of a design is often evaluated based on the drag coefficient. Some models only predict this scalar coefficient, while others predict the entire flow field, from which the drag is computed. Both approaches require extensive input data to achieve high prediction accuracy. Two main categories of models emerge in the literature: those used in early-phase project development with a broad range of shape variations and those designed for more specific geometry variations. The ShapeNet dataset https://shapenet.org/ (accessed on 1 June 2023) is commonly employed for training models that need to handle diverse car shapes. For instance, ref. [13] proposed a method using the Gaussian Process with optimized weights to predict the drag coefficient along with pressure and velocity field surrounding simplified 3D cars (without spoilers, tires, and side mirrors) of this dataset. Car shapes are parameterized using a Polycube map, which allows for a fixed-length representation of all shapes. The positions of PolyCube mesh surface points are used for the GP model. Trained with 889 shapes, the model’s prediction accuracy improves as the training set size increases, with the error decreasing from 5.4% for 100 samples to 3.4% for 889 samples. Additionally, the model achieves an average Mean Squared Error of 0.48 on velocity and 8.1 on pressure when trained with the full dataset. A manifold-based approach was presented in [14], where authors use Locally Linear Embedding to predict the drag of 3D car shapes. The method assumes that the drag coefficient of a car can be obtained through the linear combination of five neighboring cars in the manifold of car shapes. 
To build the manifold, the signed distance function of 60 car shapes from the ShapeNet dataset was used. An overall average relative percent error of 11.5% was achieved with this approach. In [15], the authors applied a similar method to estimate the airflow around the vehicle using a reduced input dataset. The model was constructed using the signed distance function of 70 car shapes, including various types like sedans and sports cars. The model achieved an $L^{2}$ error on the velocity field ranging from 0.56% to 1%, depending on the car type. Neural Networks (NN), especially Convolutional Neural Networks (CNN), have been widely adopted due to their efficacy in extracting spatial features. For instance, ref. [16] proposed an improved ResNeXt model [17] that uses 2D depth and normal renderings to represent 3D car geometries. By training on 9070 shapes from ShapeNet, their model achieved an overall average drag prediction error of 2.4%. The ShapeNet dataset was also used by [18] to train a coordinate-based multilayer perceptron to predict the flow field around a 2D car profile. Trained on 1812 profiles, derived from the central vertical cross-section of 3D shapes of the ShapeNet dataset, the model achieved an average MSE of 7.15% for drag prediction. Similarly, [19] used Convolutional Neural Networks on arbitrary 2D primitives to predict airflow around 2D car silhouettes. This model returned a prediction error of 15.34%. In [20], a U-Net [21] architecture was employed to predict the drag coefficients and flow field of car shapes from the geometry [22]. This model uses a signed distance function for 3D geometry representation and is trained on up to 10,080 samples, achieving a maximum absolute error of approximately 2.5% in drag prediction and 2.8% on the $L^{2}$ error of field prediction. Alternatively, a Geodesic Convolutional Neural Network [23] was used in [24] to predict pressure and velocity distribution on the surface of a 3D car. As in [13], the car geometries were remeshed using a Polycube map for consistent representation. The model trained on approximately 2000 randomly generated car shapes reported an accuracy of 51% for new predictions. The accuracy of the model can reach up to 70% when the model training dataset is augmented with 54 real car shapes. This Geodesic Convolutional Neural Network was used in [25] for drag optimization. The GCNN-based approach outperforms traditional methods like Kriging in creating a response surface for optimal drag coefficient prediction. However, no details on the error or accuracy of the response surface were provided by the authors. An optimization study was also presented in [26], where the authors introduced MeshSDF, a model designed for the generation and optimization of 3D geometries. The model learns from point cloud representations of the 3D geometry and uses a CNN to predict the pressure field over the shape. Trained on 1400 car shapes from the ShapeNet dataset, the model was employed to generate optimized shapes that minimize pressure drag from an initial configuration. However, the authors provided no details on the accuracy of their pressure field predictions or the resulting drag estimations.

Another category of models focuses on studying specific geometric variations for fine styling adjustments. These models aim to predict drag or flow fields when a defined car geometry undergoes specific modifications. This is the focus of our paper, as such models are important for optimizing aerodynamic performance. Despite their importance, there is limited research on applying these types of models within the automotive industry. In [27], a mathematical model was built to generate new silhouettes of a sedan parameterized with 21 design parameters and then estimate their drag coefficient. The mathematical model was constructed using a two-step procedure to predict the drag coefficient from the coordinates of 52 control points. First, Principal Component Analysis (PCA) was applied to reduce the number of variables used for the geometry representation. These principal components were then used in a linear regression model to predict the drag coefficient. The performance of the regression model for drag prediction was evaluated for different training sample sizes ranging from 100 to 1000. As the input sample size increased, the mean absolute percentage error decreased from 13.86% to 9.49%. For the largest sample size, 70% of the predictions had an error below 10%. However, the error can still reach up to 65% for certain silhouette types. Scaling up to 3D geometries, a Radial Basis Function was used in [28] to map design shape parameters to the drag coefficient of a 3D car with 12 shape parameters. Initially, with 164 high-fidelity evaluations from Latin hypercube sampling, the model achieved a 3.7% prediction error for the optimal shape. By using an Adaptive Multi-Scale RSM methodology, which refines the response surface iteratively, the error was reduced to less than 1% with increased sample sizes (280 to 480). This approach also makes the model less sensitive to the number and distribution of input samples compared to traditional RSM methodologies. 
Closest to our work, ref. [29] proposes a model that combines Proper Orthogonal Decomposition (POD) reduction with kriging interpolation to evaluate pressure and velocity fields on the surface of a car. In their approach, POD was applied to the snapshot matrix of results from a Detached Eddy Simulation (DES) for a car shape with three parameters. A kriging model was then used to interpolate the POD basis coefficients based on shape modifications. The model was trained using multiple datasets with varying distributions and input sample sizes ranging from 8 to 38. As the number of input samples increased, the relative error decreased from 2.3% to 1.1% for pressure prediction and from 2.9% to 1.7% for viscosity prediction. With a Latin hypercube dataset of 20 samples, the mean error in drag prediction using the estimated fields is 0.3%. Additionally, a variable fidelity approach was explored to enhance the high-fidelity DES snapshots with lower fidelity Reynolds-Averaged Navier–Stokes (RANS) snapshots. However, this approach did not yield better prediction results for the test case considered.

The reviewed approaches highlight the use of extensive input data for developing effective models in automotive drag force prediction. They also reveal a notable gap in research concerning fine-tuning models for specific geometric variations.

2. Methodology

2.1. Drag Force Evaluation Methods

Aerodynamic simulations of automotive vehicles are conducted at very high Reynolds numbers, typically of the order of $10^{7}$, involving thin boundary layers and developed turbulence. To address these high-Reynolds-number flows, the solver relies on the Large Eddy Simulation (LES) method. LES is a numerical modeling approach aimed at directly capturing the large unsteady turbulent structures. Furthermore, it is combined with a Smagorinsky subgrid scale (SGS) model as the closure model. This combination enables the representation of turbulent features at a smaller scale than the primary grid resolution, thereby enhancing the precision of the models. These two concurrent approaches empower the solver to faithfully represent turbulent flows across various scales.

The evaluation of the drag force $f_{x}$ can then be carried out through the following equations [30,31]:

$$f = \iint_{S_{v}} \left( (p - p_{0})\, n - \tau \cdot t \right) d\sigma, \qquad f_{x} = f \cdot \hat{x}$$

(1)

where $p$ represents the static pressure, $p_{0}$ denotes the unperturbed pressure at infinity, $\hat{x}$ is the unit vector in the $x$ direction, $\tau$ represents the viscous stress tensor, $n$ stands for the unit normal vector on the vehicle’s surface $S_{v}$, and $t$ is the tangential unit vector. The viscous stress tensor is calculated as $\tau = \nu (\nabla u + \nabla u^{T})$, with $\nu$ representing the kinematic viscosity and $\nabla u$ denoting the velocity gradient tensor.

Rather than determining aerodynamic forces through the integration of stresses on the vehicle surface, the drag force can also be evaluated on another “virtual” surface inside the fluid domain. Assuming a bounding domain that is far enough from the vehicle, the drag force can be estimated on a fixed rear cutting plane $S_{cp}$, which is orthogonal to $\hat{x}$ and located in the wake of the vehicle, according to the so-called far-field formula, also known as the Onorato formula [32]:

$$f_{x} = \iint_{S_{cp}} \left( p_{0} - p + \rho \left[ u_{0} u_{x} - (u_{x})^{2} \right] \right) dS.$$

(2)

Here, $\rho$ represents the fluid density, $u_{0}$ is the freestream velocity, and $u_{x}$ is the velocity component in the direction of interest $x$. This allows us to obtain snapshot fields defined on a grid of $S_{cp}$ that does not depend on the vehicle shapes. The surface drag coefficient is computed from the aerodynamic force $f_{x}$ using the formula [31]: $C_{d} A = f_{x} / (\frac{1}{2} \rho_{0} u_{0}^{2})$.
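To make the far-field evaluation concrete, the sketch below approximates the integral of Eq. (2) on a uniform Cartesian grid with a simple rectangle rule and then converts the force to $C_{d} A$. The function names `far_field_drag` and `drag_area`, the grid, and the Gaussian wake deficit are hypothetical choices made for illustration; the fields are synthetic and do not come from a CFD solver.

```python
import numpy as np

def far_field_drag(p, ux, dy, dz, p0, rho, u0):
    """Approximate Eq. (2) on a uniform Cartesian cutting plane.

    p, ux : 2D arrays of (time-averaged) pressure and x-velocity at grid nodes
    dy, dz: grid spacings; each node is weighted by dy * dz (rectangle rule)
    """
    kappa = p0 - p + rho * (u0 * ux - ux ** 2)   # integrand of Eq. (2)
    return np.sum(kappa) * dy * dz

def drag_area(fx, rho0, u0):
    """C_d A = f_x / (0.5 * rho0 * u0**2)."""
    return fx / (0.5 * rho0 * u0 ** 2)

# Synthetic wake, for illustration only (not CFD data)
rho, u0, p0 = 1.2, 40.0, 0.0
y = np.linspace(-2.0, 2.0, 201)
z = np.linspace(0.0, 2.0, 101)
dy, dz = y[1] - y[0], z[1] - z[0]
Y, Z = np.meshgrid(y, z, indexing="ij")
deficit = 0.3 * np.exp(-(Y ** 2 + (Z - 0.7) ** 2) / 0.25)   # velocity deficit
ux = u0 * (1.0 - deficit)
p = np.full_like(ux, p0)

fx = far_field_drag(p, ux, dy, dz, p0, rho, u0)
cda = drag_area(fx, rho, u0)
```

In an undisturbed flow ($u_{x} = u_{0}$, $p = p_{0}$) the integrand vanishes everywhere, so the computed force is zero; only the wake region contributes to the drag.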

Because of the use of an LES turbulence model (see [33]), the Navier–Stokes solver returns unsteady solutions. However, when evaluating vehicle performance, metrics such as the drag coefficient $C_{d} A$ are usually sought as constant values. The LES-based Navier–Stokes solutions are expected to reach a stationary flow after a transient phase, in the sense of a stationary stochastic process (stationary flow being averaged over a suitable time window). Thus, the drag coefficient $C_{d} A$ is typically computed from time-averaged quantities to return an expectation value. When computing the mean far-field drag, careful attention must be paid to evaluating the mean values of nonlinear terms like $(u_{x})^{2}$.
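The caution about nonlinear terms can be checked on a toy signal: the time average of $(u_{x})^{2}$ differs from the square of the time-averaged $u_{x}$ by the variance of the fluctuations. The snippet below is a small sketch using a synthetic random signal (an assumption for illustration, not LES output).

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic unsteady signal at one cutting-plane node: mean 30 plus fluctuations
ux_t = 30.0 + 2.0 * rng.standard_normal(10_000)

mean_of_square = np.mean(ux_t ** 2)    # term needed by the time-averaged Eq. (2)
square_of_mean = np.mean(ux_t) ** 2    # biased shortcut
variance = np.var(ux_t)                # gap between the two quantities
```

Squaring the averaged velocity field instead of averaging the squared field would therefore omit the contribution of the velocity fluctuations in the momentum flux term.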

The shape-parametrized reduced-order surrogate model proposed in this paper is composed of three stages: (i) shape encoding; (ii) extraction of proper orthogonal modes on the cutting plane; (iii) surrogate model of force density and drag force. In what follows, we present details of the three stages.

2.2. Shape Encoding

In the context of shape design optimization in automotive engineering, the assessment of performance often requires the evaluation of numerous geometrical variations of the vehicle through a numerical design of experiments. To generate deformed geometries from a reference one, a direct morphing approach can be used. Local regions of the reference geometry are transformed by defining a control area, where specific displacements are imposed; a free area, which is free to move; and a fixed area, which remains stationary during the morphing process (Figure 1). Assume that the shape of the vehicle depends on $p$ CAD parameters $(\mu_{k})_{k = 1, \dots, p}$ that are gathered into a vector $\mu$. The domain of admissible shape parameters $\mu$ is denoted by $M \subset \mathbb{R}^{p}$. In the sequel, the car shape for parameter $\mu$ will be denoted by $\underline{S}(\cdot, \mu)$. It will be assumed that each shape $\underline{S}(\cdot, \mu)$ is a slight variation of a reference shape denoted by $\underline{S}^{ref}(\cdot) = \underline{S}(\cdot, \mu^{ref})$.

To capture as much information as possible from the entire 3D shape and to be able to discriminate different admissible shapes, it is important to have a more informative and discriminating descriptor, able to characterize the principal components of geometrical variation. For efficiency purposes, such a shape encoding also has to be set up in a rather low-dimensional space.

The procedure used in the present work is explained in the sequel. It is assumed that the morphed geometry $\underline{S}(\cdot, \mu)$ is a slight variation from the reference surface $\underline{S}^{ref}(\cdot)$. It can be characterized by a shape displacement vector field $\underline{\delta}(\cdot, \mu) \in [L^{2}(\underline{S}^{ref})]^{3}$ evaluated on the reference shape:

$$\underline{S}(\cdot, \mu) = \underline{S}^{ref}(\cdot) + \underline{\delta}(\cdot, \mu).$$

(3)

The idea is to approximate all the admissible displacements in a rather low-dimensional vector space defined by a suitable orthogonal reduced basis. This can be performed, for example, from a method of snapshots of shapes.

Let $\{\underline{S}(\cdot, \mu^{k})\}_{k = 1, \dots, N_{s}}$ be a set of $N_{s}$ precomputed deformed shapes from a suitable CAD parameter sampling $\{\mu^{1}, \dots, \mu^{N_{s}}\}$. From these shape snapshots, it is easy to find an orthogonal basis $\underline{q}_{\ell}(\cdot) \in [L^{2}(\underline{S}^{ref})]^{3}$, $\ell = 1, \dots, K$, $K \leq N_{s}$, either by means of a principal component analysis (PCA) or more simply by means of a Gram–Schmidt (GS) orthogonalization procedure. If $N_{s}$ is big enough, it is preferable to apply a PCA to find the $K$ principal components, expecting that $K < N_{s}$. In the case of limited data with $N_{s}$ being small, one can consider $K = N_{s}$ and simply use the GS algorithm (i.e., QR factorization in the discretized context). Thus, for a query parameter vector $\mu \in M$,

$$\underline{S}(\cdot, \mu) \approx \underline{S}^{ref}(\cdot) + \sum_{\ell = 1}^{K} \nu_{\ell}(\mu)\, \underline{q}_{\ell}(\cdot).$$

(4)

The linear combination coefficients $(\nu_{\ell}(\mu))_{\ell = 1, \dots, K}$ provide a set of descriptors of the shape $\underline{S}(\cdot, \mu)$. They are computed from the orthogonal projection of $\underline{\delta}(\cdot, \mu)$ onto the vector space spanned by the basis functions $\underline{q}_{\ell}(\cdot)$:

$$\nu_{\ell}(\mu) = \left( \underline{\delta}(\cdot, \mu), \underline{q}_{\ell} \right)_{[L^{2}(\underline{S}^{ref})]^{3}}.$$

(5)

The feature vector $\nu(\mu)$ then serves as the new global shape descriptor to encode the geometry $\underline{S}(\cdot, \mu)$. This approach enables the creation of new shape descriptors that offer more descriptive information than the CAD parameters. By leveraging the detailed information captured at the local level from local features, this method provides a comprehensive representation of the overall shape, resulting in a more informative and nuanced global shape descriptor set.

Discretized Formalism

For computational purposes, any car shape is discretized using a Finite Element (FE) triangular mesh, as triangular elements are better suited for meshing complex geometries, such as detailed 3D car models. Let us assume a shape triangulation made of $N_{v}$ vertices, $N_{v} \gg 1$. The FE displacement fields $\underline{\delta}(\cdot, \mu)$ as well as the orthogonal fields $\underline{q}_{\ell}(\cdot)$ are stored as large vectors $\delta_{\mu}$ and $q_{\ell}$, respectively. The shape encoding algorithm is the following:

Offline stage: assume that a snapshot database of shape displacements $D = [\delta_{\mu^{1}}, \delta_{\mu^{2}}, \dots, \delta_{\mu^{N_{s}}}] \in \mathbb{R}^{3 N_{v} \times N_{s}}$ is available. From an SVD analysis or a QR factorization of $D$, compute an orthogonal reduced basis $q_{1}, \dots, q_{K} \in \mathbb{R}^{3 N_{v}}$. Define the matrix $Q = [q_{1}, \dots, q_{K}] \in \mathbb{R}^{3 N_{v} \times K}$.

Online stage: for a query CAD parameter $\mu$, compute a mesh of the shape $\underline{S}(\cdot, \mu)$. Then, compute the discrete displacement field $\delta_{\mu}$ and the POD coefficients vector

$$\nu(\mu) = Q^{T} \delta_{\mu} \in \mathbb{R}^{K}.$$

(6)

Remark that if

N_{s}

is small and the QR factorization

D = Q R

is used (in this case

K = N_{s}

), then

R = Q^{T} D \in R^{K \times K}

is the matrix of the POD coefficients vector of each displacement vector

δ_{μ^{1}}, δ_{μ^{2}}, \dots, δ_{μ^{N_{s}}}

of D.
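The offline and online stages of the shape encoding can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic random data, not the authors' implementation: the function names `build_shape_basis` and `encode_shape` and the toy sizes are assumptions made for the example. It also checks the remark above, namely that the triangular factor R of the QR factorization holds the POD coefficients of the training displacements.

```python
import numpy as np

def build_shape_basis(D):
    """Offline stage: thin QR factorization D = Q R of the displacement snapshots."""
    Q, R = np.linalg.qr(D, mode="reduced")
    return Q, R

def encode_shape(Q, delta):
    """Online stage, Formula (6): POD coefficients nu = Q^T delta."""
    return Q.T @ delta

rng = np.random.default_rng(0)
n_v, n_s = 50, 4                      # toy sizes (hypothetical); the paper has N_v >> 1
D = rng.normal(size=(3 * n_v, n_s))   # synthetic stand-in for real displacement fields
Q, R = build_shape_basis(D)

# Encoding every training displacement column by column reproduces R = Q^T D.
nu_train = np.column_stack([encode_shape(Q, D[:, k]) for k in range(n_s)])
```

With an SVD instead of a QR factorization, `Q` would be replaced by the leading left singular vectors and K could be chosen smaller than the number of snapshots.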

2.3. Knowledge Extraction and Reduced-Order Representation in the Cutting Plane

As already seen in the previous section, for a deformed shape

\underset{̲}{S} (., μ)

, the aerodynamic drag can be estimated by integrating the physical scalar field

κ (., μ) \overset{def}{=} p_{0} - p (., μ) + ρ [u_{0} u_{x} (., μ) - u_{x}^{2} (., μ)]

over a cutting plane

S_{c p}

using the far-field formula:

C_{d} A (μ) = \frac{1}{\frac{1}{2} ρ_{0} u_{0}^{2}} {\int \int}_{S_{c p}} κ (., μ) d S_{y} .

(7)

The quantity

κ (., μ) \in L^{2} (S_{c p})

can be seen as the drag force density per unit surface over the cutting plane

S_{c p}

. As each new shape is a slight deformation of the reference shape, the resulting flow field is also a variation of that of the reference configuration. Then, the force density field

κ (\cdot, μ)

can be computed as the reference one plus a deviation

Δ κ (\cdot, μ)

:

κ (., μ) = κ^{r e f} (.) + Δ κ (., μ) .

(8)

Once again, from the different snapshot solutions

{κ (., μ^{k})}_{k = 1, \dots, N_{s}}

previously computed with a high-fidelity CFD solver, one can extract an orthogonal reduced basis

ψ_{ℓ} (.) \in L^{2} (S_{c p})

,

ℓ = 1, \dots, K

. This allows us to approximate the variation

Δ κ (., μ)

in a low-dimensional vector space spanned by the orthogonal modes

ψ_{ℓ}

, i.e.,

Δ κ (., μ) \approx \sum_{ℓ = 1}^{K} b_{ℓ} (μ) ψ_{ℓ} (.) .

(9)

For the numerical implementation, the cutting plane is discretized using a Cartesian mesh or a Finite Element mesh composed of

N_{c p}

nodes. The FE discrete fields

κ (., μ)

as well as the orthogonal basis

{(ψ_{ℓ} (.))}_{ℓ}

are stored as large vectors

κ_{μ}

and

ψ_{ℓ}

, respectively. The variation fields

Δ κ (., μ)

are also stored as vectors

Δ κ_{μ} = κ_{μ} - κ^{r e f}

. Then, the algorithm is as follows:

Compute the snapshot matrix $U \in R^{N_{c p} \times N_{s}}$ of field forces by collecting results of the high-fidelity CFD solver for the $N_{s}$ training shapes:

$U = [\begin{matrix} Δ κ_{μ^{1}}, & \dots, & Δ κ_{μ^{N_{s}}} \end{matrix}] .$

(10)
Extract the modes $ψ_{ℓ} \in R^{N_{c p}}$ , $ℓ = 1, \dots, K$ by performing either principal component analysis or QR factorization of matrix U, depending on the number of snapshots. Then, define the matrix,

$P = [ψ_{1}, . . . ψ_{K}] \in R^{N_{c p} \times K} .$

For very limited snapshot data, it is preferable to use a QR factorization. In this case, we have $K = N_{s}$ and $U = P T$ with $T \in R^{K \times K}$ , an upper triangular matrix. Since the matrix P is semi-orthogonal, we have $T = P^{T} U$ .

2.4. Parametric Surrogate Model

Equations (8) and (9) suggest seeking

κ (., μ)

in the form

κ (., μ) = κ^{r e f} (.) + \sum_{ℓ = 1}^{K} b_{ℓ} (μ) ψ_{ℓ} (.)

for a new query vector

μ

, with corresponding shape

S (., μ)

. One can then use Formula (7) to compute

C_{d} A (μ)

. But, of course, the coefficients

b_{ℓ} (μ)

are not known and a “closure” is needed.

In the previous Section 2.2, we considered a shape encoding defined by the coefficient vector

ν = ν (μ)

. The partial information we have on the shape

S (., μ)

is described by

ν (μ)

. So, it is better to consider the combination coefficients

b_{ℓ}

as functions of

ν (μ)

:

κ (., μ) = κ^{r e f} (.) + \sum_{ℓ = 1}^{K} b_{ℓ} (ν (μ)) ψ_{ℓ} (.) .

(11)

The coefficients

b_{ℓ} (ν (μ))

can be seen as nonlinear features of

ν (μ)

. In practice, however, these functions are unknown. With limited data, a large model with many degrees of freedom, such as a DNN, cannot be trained reliably. The simplest reasonable “closure” one can consider in this context is a linear dependency of each

b_{ℓ} (ν (μ))

with

ν (μ)

, meaning

b_{ℓ} (ν (μ))

chosen in the form

b_{ℓ} (ν (μ)) = \sum_{m = 1}^{K} a_{ℓ m} ν_{m} (μ) .

(12)

The

Δ κ

model proposed in this paper is

Δ κ (., μ) \approx \sum_{ℓ = 1}^{K} \sum_{m = 1}^{K} a_{ℓ m} ν_{m} (μ) ψ_{ℓ} (.)

(13)

with a constant coefficient matrix

A = {(a_{ℓ m})}_{ℓ, m} \in R^{K \times K}

. The matrix has to be identified to fit the available data. This can be performed, for example, from the available sampled data

{μ^{1}, \dots, μ^{N_{s}}}

by solving the least square problem:

min_{A \in R^{K \times K}} \frac{1}{2} \sum_{k = 1}^{N_{s}} {∥Δ κ (., μ^{k}) - \sum_{ℓ = 1}^{K} \sum_{m = 1}^{K} a_{ℓ m} ν_{m} (μ^{k}) ψ_{ℓ} (.)∥}_{L^{2} (S_{c p})}^{2} .

(14)

From the FE discretized point of view, the minimization problem (14) reads

min_{A \in R^{K \times K}} \frac{1}{2} {∥U - P A R∥}_{F}^{2}

with

R = [ν (μ^{1}), \dots, ν (μ^{N_{s}})] \in R^{K \times N_{s}}

. The first-order optimality conditions give the solution

A = \underbrace{P^{T}}_{K \times N_{c p}} \, \underbrace{U}_{N_{c p} \times N_{s}} \, \underbrace{R^{T}}_{N_{s} \times K} \, \underbrace{{(R R^{T})}^{- 1}}_{K \times K}

assuming that R has a maximal rank so that

R R^{T}

is invertible. In the case where

K = N_{s}

and under the maximal rank assumption, using the QR factorization

U = P T

, the solution simply writes

A = P^{T} U R^{- 1} = T R^{- 1} .
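As a hedged numerical check of the two closed-form expressions for A, the following sketch builds synthetic matrices U and R, computes the general least-squares solution, and compares it with the shortcut T R^{-1} that holds when K = N_s. The function name `fit_closure`, the random data, and the sizes are assumptions for illustration only.

```python
import numpy as np

def fit_closure(P, U, R):
    """General least-squares solution A = P^T U R^T (R R^T)^{-1}."""
    # Solve A (R R^T) = P^T U R^T instead of forming the inverse explicitly.
    G = R @ R.T
    rhs = P.T @ U @ R.T
    return np.linalg.solve(G.T, rhs.T).T

rng = np.random.default_rng(1)
n_v, n_cp, n_s = 50, 200, 4                      # toy sizes (hypothetical)
D = rng.normal(size=(3 * n_v, n_s))              # synthetic displacement snapshots
_, R = np.linalg.qr(D, mode="reduced")           # shape coefficients, R = Q^T D
U = rng.normal(size=(n_cp, n_s))                 # synthetic Δκ snapshots
P, T = np.linalg.qr(U, mode="reduced")           # U = P T

A_general = fit_closure(P, U, R)
A_square = T @ np.linalg.inv(R)                  # shortcut valid when K = N_s
residual = np.linalg.norm(U - P @ A_general @ R) # Frobenius norm of the misfit
```

In the square case the residual vanishes up to round-off, which means the fitted model interpolates the training snapshots exactly.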

The discretized version of (11), (12) is

κ (μ) = κ^{r e f} + P A ν (μ),

(15)

which is equivalent to

κ (μ) = κ^{r e f} + \sum_{ℓ = 1}^{K} \sum_{m = 1}^{K} a_{ℓ m} ν_{m} (μ) ψ_{ℓ} .

(16)

Finally, the surface drag is computed from Formula (7):

C_{d} A (μ) = \frac{1}{\frac{1}{2} ρ_{0} u_{0}^{2}} {\int \int}_{S_{c p}} κ (., μ) d S_{y} .

(17)

In practice, the integral is approximated using an FE quadrature on each FE element over the cutting plane. We have

C_{d} A (μ) = 〈 w, κ (μ) 〉

for some vector

w

.
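One possible construction of the vector w, assuming a Cartesian cutting-plane mesh and a trapezoidal rule rather than the FE quadrature used by the authors, is sketched below. The plane extent, the air density value, and the function names are assumptions; the normalization by ½ρu₀² from Formula (17) is folded into w.

```python
import numpy as np

def trapezoid_weights_1d(x):
    """Trapezoidal quadrature weights for sorted nodes x (possibly non-uniform)."""
    w = np.zeros_like(x, dtype=float)
    dx = np.diff(x)
    w[:-1] += 0.5 * dx
    w[1:] += 0.5 * dx
    return w

def drag_weights(y, z, rho, u0):
    """Vector w such that CdA = <w, kappa>, kappa stored row-major on the (y, z) grid."""
    w2d = np.outer(trapezoid_weights_1d(y), trapezoid_weights_1d(z))
    return w2d.ravel() / (0.5 * rho * u0**2)

y = np.linspace(-1.5, 1.5, 31)            # hypothetical plane extent in metres
z = np.linspace(0.0, 2.0, 21)
rho, u0 = 1.2, 45.88                      # rho is an assumed air density; u0 is the inlet velocity of Section 3.1
w = drag_weights(y, z, rho, u0)

kappa = np.full(y.size * z.size, 100.0)   # constant test field, in Pa
cda = w @ kappa                           # equals 100 * area / (0.5 * rho * u0^2)
```

For a constant field the trapezoidal rule is exact, so the result can be checked against the plane area (6 m² here) by hand.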

2.5. Online Stage: Drag Force Evaluation

For a new parameter vector

μ

, the deformed surface is created and discretized by a triangular mesh. If the mesh does not share the same connectivity as the reference shape mesh, a mesh matching method such as optimal transport (Appendix A) has to be used. The vector of shape features

ν (μ)

is computed by Formula (6). Then, the force density

κ (μ)

is computed from Equation (16). Finally, the drag

C_{d} A (μ)

is assessed by means of Formula (17).

2.6. Summary

The offline and online stages of the proposed surrogate model are summarized in Algorithms 1 and 2 below.

Algorithm 1 ROM far-field Offline Phase—Learning phase for a set of

N_{s}

training shapes

{S (., μ^{k})}_{k = 1, \dots, N_{s}}

with FOM results

Require: Reference geometry ${\underset{̲}{S}}^{r e f}$ and deformed shapes ${\underset{̲}{S} (., μ^{k})}_{k = 1, \dots, N_{s}}$
Ensure: Q, $ψ$ , A
1: Discretize shapes by a mesh of $N_{v}$ vertices.
2: Compute snapshot matrix of displacements $D = [δ_{μ^{1}}, δ_{μ^{2}}, \dots, δ_{μ^{N_{s}}}]$ .
3: Compute orthogonal reduced basis $Q = [q_{1}, \dots, q_{K}]$ by SVD analysis or QR factorization of D.
4: Compute new global shape descriptors $ν (μ^{k}) = Q^{T} δ_{μ^{k}}$ .
5: Compute snapshot matrix of field variations $U = [Δ κ_{μ^{1}}, Δ κ_{μ^{2}}, \dots, Δ κ_{μ^{N_{s}}}]$ .
6: Extract orthogonal modes $ψ$ by SVD analysis or QR factorization of U.
7: Compute coefficient matrix A.

Algorithm 2 ROM far-field Online Phase—Prediction of a new “query” shape

\underset{̲}{S} (., μ)

,

μ \in M

Require: Reference geometry ${\underset{̲}{S}}^{r e f}$ , deformed geometry $\underset{̲}{S} (., μ)$ , far-field modes $ψ$ , coefficients matrix A, shape basis matrix Q, reference force density $κ^{r e f}$
Ensure: Aerodynamic drag coefficient $C_{d} A (μ)$
1: Discretize deformed shape by a mesh of $N_{v}$ vertices.
2: Compute discrete displacement field $δ_{μ}$ .
3: Compute shape descriptor $ν (μ) = Q^{T} δ_{μ}$ .
4: Compute force density field $κ (μ) = κ^{r e f} + \sum_{ℓ = 1}^{K} \sum_{m = 1}^{K} a_{ℓ m} ν_{m} (μ) ψ_{ℓ}$ .
5: Compute drag coefficient $C_{d} A$ .
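Algorithms 1 and 2 can be condensed into two short functions. The sketch below is an illustration under stated assumptions, not the authors' code: it uses QR factorizations (so K = N_s), synthetic random snapshots, and a placeholder weight vector `w`; the names `offline_phase` and `online_phase` are hypothetical.

```python
import numpy as np

def offline_phase(D, kappa_snapshots, kappa_ref):
    """Algorithm 1 with QR factorizations (K = N_s). Returns Q, P, A."""
    Q, R = np.linalg.qr(D, mode="reduced")      # steps 3-4: columns of R are nu(mu^k)
    U = kappa_snapshots - kappa_ref[:, None]    # step 5: variations Δκ
    P, T = np.linalg.qr(U, mode="reduced")      # step 6: far-field modes
    A = np.linalg.solve(R.T, T.T).T             # step 7: A = T R^{-1}
    return Q, P, A

def online_phase(delta, Q, P, A, kappa_ref, w):
    """Algorithm 2: predicted force density and CdA = <w, kappa>."""
    nu = Q.T @ delta
    kappa = kappa_ref + P @ (A @ nu)
    return kappa, float(w @ kappa)

rng = np.random.default_rng(2)
n_v, n_cp, n_s = 40, 60, 4                      # toy sizes, not those of the paper
D = rng.normal(size=(3 * n_v, n_s))             # synthetic displacement snapshots
kappa_ref = rng.normal(size=n_cp)               # synthetic reference force density
kappa_snapshots = rng.normal(size=(n_cp, n_s))  # synthetic FOM force densities
w = np.full(n_cp, 1.0 / n_cp)                   # placeholder quadrature weights

Q, P, A = offline_phase(D, kappa_snapshots, kappa_ref)
kappa_pred, cda_pred = online_phase(D[:, 0], Q, P, A, kappa_ref, w)
```

Querying the model with a training displacement returns the corresponding training snapshot, which is a convenient sanity check before evaluating validation shapes.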

3. Numerical Experiments, Results, and Discussion

3.1. High-Fidelity Simulation

A CFD numerical simulation is carried out using the commercial solver ProLB, based on the Lattice Boltzmann Method (LBM). The LBM is a numerical approach to model macroscopic fluid dynamics by simulating the microscopic particle dynamics and their collisions on a discretized lattice grid. This method handles complex geometries well. To deal with turbulent flows near the vehicle, an LES model along with an LBM-based SGS (subgrid Smagorinsky) model is used. The computational domain and boundary conditions are set to accurately represent a realistic wind tunnel setup (Figure 2). The domain is set to 48.96 m long, 29.4 m high, and 46.4 m wide. At the inlet of the flow domain, a realistic upstream velocity

u_{0} = 45.88 \, m \cdot s^{- 1}

is used, corresponding to a Reynolds number of order

10^{7}

. The atmospheric pressure

p_{0} = 101,325

Pa is imposed at the outlet. On all other bounding surfaces of the domain, a frictionless boundary condition is employed. Finally, on vehicle surfaces, a no-slip boundary condition is imposed. The computational domain is composed of eight successive Adaptive Mesh Refinement (AMR) resolution domains, corresponding to refinements of the LBM mesh. The finest resolution is of the order of 2 mm near the vehicle to ensure accurate flow results. The Boltzmann equation is solved over the LBM mesh for a total of 520,000 iterations, which corresponds to 2.5 s of air flow simulation. This high number of iterations is set to ensure that the flow reaches a steady state. Pressure and velocity fields are captured on the lattice grid for the steady-state iterations. The simulation takes approximately 7 h on a cluster of 900 CPUs to complete.
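The figures quoted above imply a few orders of magnitude that are useful when budgeting a training set. The short calculation below is a back-of-the-envelope sketch: it assumes that every iteration advances the same physical time step and uses the dataset size of 30 shapes given in Section 3.3. It yields a time step of roughly 4.8 microseconds and about 6300 CPU-hours per simulation.

```python
iterations = 520_000
physical_time_s = 2.5
wall_time_h = 7.0        # approximate wall-clock time of one simulation
n_cpus = 900
n_shapes = 30            # size of the dataset described in Section 3.3

dt = physical_time_s / iterations            # physical time per LBM iteration
cpu_hours = wall_time_h * n_cpus             # cost of one simulation
total_cpu_hours = cpu_hours * n_shapes       # cost of the whole dataset
print(f"dt = {dt:.2e} s, {cpu_hours:.0f} CPU-hours per run, {total_cpu_hours:.0f} in total")
```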

3.2. Simplified Geometry “S2A”

The surrogate model has been tested on a simplified industrial car design, which is used in the Renault wind tunnel facility for aerodynamic calibration. This design has the essential structures of a real vehicle while having enough simplicity to reduce computational complexity. It is made of multiple detailed structures: a superstructure, an underbody, front and rear axles, tires, and rims. The vehicle geometry is simplified with no external appendices (such as external mirrors) or engine compartments. Parametrization of the car shape involves three independent parameters, i.e.,

μ = (μ_{1}, μ_{2}, μ_{3})

, affecting only the vehicle body superstructure. The first parameter

μ_{1}

corresponds to a translation along the

\hat{y}

axis, simulating a narrowing of the rear bumper and constrained within the range of

[- 20 mm, 20 mm]

. The second parameter

μ_{2}

represents a translation along the

\hat{x}

axis to model an extension of the vehicle’s roof, with a range of

[- 50 mm, 50 mm]

. Lastly, the third parameter

μ_{3}

denotes a rotation around the

\hat{y}

axis of the rear underbody, constrained within

[- 5^{\circ}, 5^{\circ}]

. The specific morphing area impacted by these parameters is illustrated in Figure 3. The initial geometry is represented by an FE mesh composed of

N_{v}

= 556,354 nodes.

3.3. Data Generation and Preprocessing

A total of 30 deformed shapes are generated from the initial geometry, employing a maximin space-filling strategy on the CAD parameter

μ

(Figure 4). CFD simulations are performed for each shape, and the high-fidelity results provide detailed information on the flow field around the vehicles and the corresponding drag coefficients (Table 1). The mean dimensionless surface drag of this dataset is

0.9952

with a standard deviation of

0.0127

. The low standard deviation, compared to the mean, suggests that the drag coefficients are relatively consistent across the dataset since shape variations are within a maximum range of 50 mm. However, in a design optimization study, even a small reduction in drag of

0.02

m^{2}

is important as it could lead to a reduction of 1 g of

{CO}_{2}

emissions over the Worldwide Harmonized Light Vehicles Test Procedure (WLTP) cycle.
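A maximin space-filling design can be approximated in several ways, and the text does not specify which one was used. The sketch below shows one simple variant, assumed for illustration: draw many random designs in the unit cube, keep the one whose smallest pairwise distance is largest, then rescale to the parameter ranges of Section 3.2. The function names and the number of trials are hypothetical.

```python
import numpy as np

BOUNDS = np.array([[-20.0, 20.0],    # mu_1 in mm
                   [-50.0, 50.0],    # mu_2 in mm
                   [-5.0, 5.0]])     # mu_3 in degrees

def min_pairwise_distance(X):
    """Smallest Euclidean distance between two distinct rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    return dist[np.triu_indices(len(X), k=1)].min()

def maximin_design(n_samples, n_trials=200, seed=0):
    """Keep the random design (in the unit cube) with the largest minimum distance."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_trials):
        X = rng.random((n_samples, len(BOUNDS)))
        score = min_pairwise_distance(X)
        if score > best_score:
            best, best_score = X, score
    # Rescale from the unit cube to the physical parameter ranges.
    return BOUNDS[:, 0] + best * (BOUNDS[:, 1] - BOUNDS[:, 0]), best_score

design, score = maximin_design(30)
```

Distances are measured in the unit cube so that the three parameters, which have different units and ranges, contribute equally to the criterion.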

In this test case, each deformed geometry is independently meshed; therefore, the meshes do not share the same topology as the reference shape. As explained in Appendix A, an optimal transport strategy is used to establish the mesh correspondence. The optimal transport problem was solved using the open-source Python library GeomLoss [34].

3.4. Model Performance

The performance of the model with respect to the number of input data points and their distribution is evaluated. To achieve this, cross-validation is performed using different numbers of folds. The number of folds, denoted by

N_{f}

, determines how the data is split for training and validation. The dataset is divided into

N_{f}

subsets of approximately equal size. Usually in cross-validation, for each round, one subset is used for validation while the remaining

N_{f} - 1

subsets are used for training. However, in this study, as we aim to keep a low number of training data, the typical approach is reversed: one fold is used for training and the remaining folds are used for validation. This allows us to assess how well the model performs when it is trained on a small dataset. To ensure robustness of cross-validation results, multiple cross-validation runs are performed for each choice of fold number

N_{f}

. In each run, a different random seed is used to shuffle the data, resulting in different splits. This is performed to minimize the impact of any particular split that may be suboptimal for the model. After performing these cross-validation runs, the results are averaged to provide an estimate of the model’s performance. This approach is especially important when the number of folds is small, as the randomness in splitting the data can sometimes lead to unfavorable training datasets, potentially leading to misleading performance evaluations. By repeating the process with different splits, the model’s evaluation becomes more accurate and less dependent on any single data split.

Cross-validation is performed with the number of folds

N_{f}

varying between 2 and 8. This results in training subsets containing between 15 samples (for

N_{f} = 2

) and 3 or 4 samples (for

N_{f} = 8

). For each number of folds, a total of 10 cross-validation runs are executed.
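The reversed splitting scheme can be written with plain NumPy. This is a sketch under the assumption that folds are formed by shuffling the 30 sample indices and cutting them into nearly equal parts; the function name `reversed_kfold` is hypothetical. It also lists the training-set sizes obtained for each number of folds.

```python
import numpy as np

def reversed_kfold(n_samples, n_folds, seed):
    """Yield (train_idx, val_idx) where ONE fold trains and the others validate."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i, train_idx in enumerate(folds):
        val_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx

# Distinct training-set sizes for each number of folds, with 30 samples.
train_sizes = {
    n_folds: sorted({len(tr) for tr, _ in reversed_kfold(30, n_folds, seed=0)})
    for n_folds in range(2, 9)
}
```

Repeating the loop with ten different seeds per fold count gives the repeated runs described above.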

First, the performance of the low-dimensional reparametrization of the vehicle geometry is evaluated for each cross-validation fold. To do so, the Mean Squared Error (MSE), interpreted here as the

L^{2}

projection error, is computed for each shape in the validation datasets. This error measures how well the geometry of the vehicle is reconstructed from the reduced parameter space. The MSE is then averaged across all validation shapes for each fold. As shown in Figure 5, the average MSE tends to increase with the number of folds, since fewer shapes are available for training. However, the MSE values remain relatively low (less than

0.5

mm), even with a high number of folds. This demonstrates the effectiveness of the reparametrization in capturing the geometric variations in the vehicle shapes, even with limited training data.
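One possible discrete version of this projection error is sketched below. The exact normalization used in the paper is not stated, so the root-mean-square nodal distance chosen here, the per-vertex (x, y, z) storage order, and the function name `projection_error` are all assumptions.

```python
import numpy as np

def projection_error(Q, delta):
    """Root-mean-square nodal distance between delta and its projection Q Q^T delta."""
    residual = delta - Q @ (Q.T @ delta)
    per_node = residual.reshape(-1, 3)           # assumes (x, y, z) stored per vertex
    return np.sqrt((per_node**2).sum(axis=1).mean())

rng = np.random.default_rng(4)
n_v, n_s = 100, 4                                # toy sizes (hypothetical)
D = rng.normal(size=(3 * n_v, n_s))              # synthetic training displacements
Q, _ = np.linalg.qr(D, mode="reduced")

# A displacement inside the span of the training set is reconstructed exactly.
err_in_span = projection_error(Q, D @ np.array([0.5, -1.0, 2.0, 0.1]))
# A generic displacement is not, and the error has the units of the displacement.
err_generic = projection_error(Q, rng.normal(size=3 * n_v))
```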

Then, to evaluate the performance of the model in predicting the drag coefficient for new shapes, several metrics are computed during cross-validation. First, the maximal and mean absolute error on

C_{d} A

prediction are assessed. Figure 6a shows the results for the different folds. Although the error tends to increase slightly with the number of folds, the overall error remains well within an acceptable threshold of 3%, indicating the robust performance of the model with limited input data. The Pearson coefficient is computed to quantify the linear correlation between the predicted and high-fidelity drag values (Figure 6b). The model consistently achieved high Pearson correlation coefficients across all folds. Since the model is intended to serve as a decision support tool during the design optimization process, it is important that the ranking of shapes regarding drag is preserved. The model’s ability to maintain the correct ranking is then evaluated using the Kendall rank correlation coefficient [35]. The Kendall coefficient is defined by the following equation [36]:

τ = \frac{n_{c} - n_{d}}{n},

(18)

with

n_{c}

being the number of concordant pairs,

n_{d}

the number of discordant pairs, and n the total number of pairs. A concordant pair is a pair of observations for which the predicted and high-fidelity values have the same relative ordering: if one shape has a higher predicted drag value than another and the actual drag values follow the same order, the pair is concordant. Conversely, a pair is discordant when the prediction does not preserve the relative ordering. However, this definition of discordant pairs can be too strict for aerodynamic studies. A pair is counted as discordant whenever the predicted and actual rankings are inverted, even if the difference in drag between the two configurations is very small. Such small deltas may not significantly affect the overall ranking for design studies, so it is more appropriate to classify these pairs as concordant. A relaxed methodology is therefore introduced. It uses a tolerance threshold to account for minor prediction errors, allowing pairs with a small simulated delta to remain concordant if the predicted delta is similarly small. For larger differences in drag, a more relaxed threshold is used, with an additional term proportional to the simulated delta. In the following analysis, two variants of the relaxed Kendall coefficient are used to assess the model ranking performance with different tolerance levels. The first tolerance level allows a higher margin of error in the drag prediction, representing the maximum value for the prediction to be considered acceptable. The second tolerance level is stricter, requiring a smaller margin of error to classify the prediction as accurate. These two variants will be referred to as Relaxed Kendall Acceptable and Relaxed Kendall Accurate. The tolerance for the Relaxed Kendall Acceptable coefficient is defined as follows:

threshold_{acceptable} = \begin{cases} | Δ C_{d} A | + \frac{| Δ C_{d} A |}{5} + 0.009 & \text{if } | Δ C_{d} A | > 0.005 \\ | Δ C_{d} A | + 0.01 & \text{otherwise} \end{cases}

(19)

The tolerance for the Relaxed Kendall Accurate coefficient is defined as follows:

threshold_{accurate} = \begin{cases} | Δ C_{d} A | + \frac{| Δ C_{d} A |}{5} + 0.003 & \text{if } | Δ C_{d} A | > 0.01 \\ | Δ C_{d} A | + 0.005 & \text{otherwise} \end{cases}

(20)

where

| Δ C_{d} A |

is the absolute drag difference between two configurations. A predicted drag difference

\tilde{Δ C_{d} A}

is classified as concordant if it is the same sign as

Δ C_{d} A

or if one of the following applies:

$| \tilde{Δ C_{d} A} | \leq threshold_{accurate}$ for the Relaxed Kendall Accurate coefficient;
$| \tilde{Δ C_{d} A} | \leq threshold_{acceptable}$ for the Relaxed Kendall Acceptable coefficient.
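The strict and relaxed coefficients can be computed with the same pair loop. The sketch below follows a literal reading of Equations (18) to (20) and of the concordance rule above; it is an interpretation, not the authors' code. Two choices are assumptions: pairs with a zero delta count as discordant in the strict version, and n is taken as the number of pairs actually compared. The drag values in the example are made up.

```python
from itertools import combinations

def threshold_acceptable(abs_delta):
    """Equation (19)."""
    return abs_delta + abs_delta / 5 + 0.009 if abs_delta > 0.005 else abs_delta + 0.01

def threshold_accurate(abs_delta):
    """Equation (20)."""
    return abs_delta + abs_delta / 5 + 0.003 if abs_delta > 0.01 else abs_delta + 0.005

def kendall(true, pred, threshold=None):
    """tau = (n_c - n_d) / n over all pairs; threshold=None gives the strict version."""
    n_c = n_d = 0
    for i, j in combinations(range(len(true)), 2):
        d_true, d_pred = true[i] - true[j], pred[i] - pred[j]
        same_order = d_true * d_pred > 0
        tolerated = threshold is not None and abs(d_pred) <= threshold(abs(d_true))
        if same_order or tolerated:
            n_c += 1
        else:
            n_d += 1
    return (n_c - n_d) / (n_c + n_d)

cda_true = [0.98, 0.99, 1.00, 1.02]       # made-up high-fidelity values
cda_pred = [0.985, 0.982, 1.004, 1.015]   # made-up predictions, first two shapes swapped

tau_strict = kendall(cda_true, cda_pred)
tau_accurate = kendall(cda_true, cda_pred, threshold_accurate)
tau_acceptable = kendall(cda_true, cda_pred, threshold_acceptable)
```

In this toy case one of the six pairs is inverted by a small margin, so the strict coefficient drops to 2/3 while both relaxed variants stay at 1.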

The values of the Kendall coefficients depending on the number of folds are presented in Figure 6b. The Kendall rank coefficient computed across all folds is in the range of 0.575 to 0.59. This low value is due to many pairs being classified as discordant with the strict initial definition of concordance and discordance. However, when the Relaxed Kendall is applied, using the modified definitions that account for minor prediction errors, a significant improvement in the score is observed. Indeed, across all folds, the Relaxed Kendall Acceptable and Relaxed Kendall Accurate coefficients are in a respective range of 0.93 to 0.94 and 0.86 to 0.89.

The total time of the modeling process is measured, including the three main phases: data preprocessing, model training, and inference. The preprocessing of the data consists in mesh correspondence by using optimal transport. During this step, the computational time mainly depends on the number of triangular elements representing the shape. This step can be skipped if the mesh is morphed simultaneously with the geometry. When using the Geomloss library, optimal transport for a shape with

N_{v}

= 556,354 elements takes approximately 15 min on a system with 40 CPUs. The training time depends on the number of input samples, which is linked to the number of folds in cross-validation. Figure 7 illustrates how the training time varies with different numbers of folds. As the number of folds increases, the training time decreases because the model becomes easier to solve with fewer data points in each training subset. The training of the model with 15 shapes (two folds) takes approximately 10 s, while the training time drops to under 5 s when only 3–4 shapes (eight folds) are used in the training set. For the inference of a new shape, once the new geometry is generated using a morphing tool, no additional preprocessing via optimal transport is required if the mesh shares the same topology as the reference shape. In this case, the inference time is extremely short, taking less than 1 s to obtain the results on the cutting plane and the drag force.

The cross-validation experiment demonstrated that, with a small number of folds, resulting in a large amount of input data, the performance of the model increases. However, even with limited training data, the performance of the model is relatively high. Therefore, it is advantageous to work with the smallest number of input data points, as this reduces the number of CFD simulations required.

3.5. Surrogate Model Construction

When working with a small input dataset, the choice of training input data is very important to achieve good accuracy. Therefore, in the following analysis, four deformed shapes are carefully chosen (ID numbers 17, 19, 20, and 21 in Table 1) to train the model. The

N_{s} = 4

input samples are selected to ensure minimal but comprehensive coverage of the exploration space of global parameters

μ

. Additionally, the initial reference shape (shape ID 1) is included for computation of the shape variations. All the other shapes in the dataset serve as the validation set. The distribution of the selected shapes can be seen in Figure 8.

3.5.1. Shape Encoding

Once the mesh mapping is complete, shape variation fields

δ_{μ} \in R^{3 N_{v}}

of the training geometries are used as the shape variation basis. The norm of the deformation field is shown in Figure 9 and Figure 10, for two shapes of the training set. The training shapes cover both positive and negative deformations for parameters

μ_{1}

,

μ_{2}

, and

μ_{3}

, ensuring a faithful representation of the exploration space.

The field of

δ_{μ}

is used to compute the new shape basis

q_{ℓ}

and resulting shape coefficients

ν (μ)

. The relative

L^{2}

projection error on the given basis is plotted in Figure 11 for the validation shapes. The overall error is quite small, with a maximum error of

0.002

m for all shapes in the validation database.

This new basis enables good reconstruction of the shape, indicating that it effectively captures the essential features and variations present in the original data.

3.5.2. Computation of the Flow Field Modes on a Wake Cutting Plane

The snapshot matrix U of

Δ κ

fields is built from the FOM results. These results are obtained on a cutting plane located at a distance of

0.2 m

behind the car (Figure 12a), discretized with

N_{c p} = 6930

nodes. It is worth noting that the flow on a wake plane close to the vehicle is turbulent with large unsteady eddies and lacks the stabilization observed in planes situated farther away. Therefore, the influence of shape variations becomes more pronounced and discriminating on proximity planes, providing valuable insights on the aerodynamic behavior of the vehicle. As shown in Figure 12b–d, the flow fields for the training shapes have different turbulent patterns. This difference will allow the model to learn a broader range of physical behavior to enhance its generalization ability for better predictive performance.

Far-field modes, denoted as

ψ_{1}

,

ψ_{2}

,

ψ_{3}

, and

ψ_{4}

, are computed and depicted in Figure 13. Unlike in Principal Component Analysis (PCA), where every mode is a linear combination of all the columns of matrix U, the first mode resulting from the QR decomposition of U is simply the normalized first column of U, as illustrated in Figure 13a. Each additional mode is then orthogonal to the previous ones, so it captures distinct flow field patterns that are not already represented. With this approach, each mode provides new insights on the pattern of flow behavior.
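The difference between the first QR mode and the first PCA mode can be verified numerically. The sketch below uses a synthetic random snapshot matrix in place of the real Δκ fields, which is an assumption for illustration; it measures how well each leading mode aligns with the normalized first snapshot.

```python
import numpy as np

rng = np.random.default_rng(3)
U = rng.normal(size=(500, 4))                  # synthetic stand-in for the Δκ snapshots

P_qr, T = np.linalg.qr(U, mode="reduced")
first_column = U[:, 0] / np.linalg.norm(U[:, 0])
# QR fixes the first mode up to a sign: psi_1 = +/- U[:, 0] / ||U[:, 0]||.
alignment_qr = abs(P_qr[:, 0] @ first_column)

P_svd, s, _ = np.linalg.svd(U, full_matrices=False)
# The leading SVD (PCA) mode mixes all snapshots, so it is generally
# not aligned with any single column.
alignment_svd = abs(P_svd[:, 0] @ first_column)
```

A consequence is that the QR modes depend on the ordering of the snapshots, whereas the SVD modes do not.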

3.6. Surrogate Model Evaluation

Once the reparametrization has been completed and the physical modes are computed, the constant coefficient matrix A is determined, which concludes the model training on the

N_{s} = 4

shapes. The model is then applied to predict the force density

κ (μ)

on the cutting plane for the deformed shapes in the validation dataset. Figure 14 compares the flow field from the high-fidelity simulation with the predicted one from the surrogate model for one arbitrary shape from the validation set (Shape ID 7). The model successfully captures the overall flow patterns, confirming that the surrogate model accurately represents the fluid behavior. However, as shown in Figure 14c, errors are observed in some turbulent regions, where the model struggles to capture finer details.

The MSE of the flow field predictions is given in Figure 15 for the different shapes of the validation dataset. The MSE value is relatively low for all configurations, with an average MSE of 0.38, indicating the strong overall predictive performance of the model.

The predicted force density field

κ (μ)

is integrated over the cutting plane by means of Equation (17) to obtain the drag coefficient of each deformed shape. The accuracy of the surrogate model is assessed by comparing the predicted drag coefficients against those obtained from FOM results. As shown by the correlation plot in Figure 16a, there is a strong positive correlation between predicted and real drag coefficients, with a Pearson coefficient of 0.86. All predicted coefficients lie in the admissible threshold error range of 0–3% (green dotted line), with a maximal error of 1.59%. However, it is important to note that the order is not perfectly preserved for some geometries, indicating that the model captures the overall trend but may struggle with specific variations. To further examine these nuanced behaviors at particular points, the correlation plot of delta drag between each pair of configurations is presented in Figure 16b. This plot reveals a positive correlation between predicted and real deltas. Most points are located within the threshold for acceptable (red dotted line) and accurate (green dotted line) prediction. More than 91% of deltas between configurations are predicted accurately, with only 2% of predictions falling outside the acceptable range. Moreover, the associated Kendall coefficients for this model are as follows: 0.67 for the standard Kendall coefficient, 0.95 for the Relaxed Kendall Accurate, and 0.99 for the Relaxed Kendall Acceptable. Together, these results indicate that the model predicts drag coefficient variations effectively.

4. Concluding Remarks and Perspectives

In this paper, we presented a novel ML-based surrogate model for predicting drag forces on shape-parametrized cars, addressing the constraints of limited high-fidelity CFD simulations. The model features low-dimensional reparametrization of vehicle geometry, integration of physical fields to enhance data information without extra CFD computations, reliance on physical drag formulas, and accuracy for small geometry variations.

The surrogate model showed robust predictive capabilities across various 3D deformed geometries, enabling rapid assessment of minor shape variations during optimization. The model achieves typical mean square errors of order 2% for drag prediction with only a few high-fidelity evaluations. While occasional challenges in preserving variation order were noted, the consistently low maximal error underscores its reliability for decision support and optimization studies.

Future work aims to enhance the model by incorporating more physical field modes and exploring different deformation fields for reparametrization. Additionally, refining the Design of Experiment selection could further optimize the methodology.

Author Contributions

Conceptualization, K.N.-C., F.D.V. and Y.G.; methodology, K.N.-C., F.D.V. and Y.G.; software, K.N.-C. and Y.G.; validation, K.N.-C. and Y.G.; writing (original draft preparation), K.N.-C.; writing (review and editing), K.N.-C., F.D.V. and Y.G.; visualization, K.N.-C.; supervision, F.D.V. and Y.G.; project administration, Y.G.; funding acquisition, Y.G. All authors have read and agreed to the published version of the manuscript.

Funding

This material is based upon work supported by the French ANRT (Association nationale de la recherche et de la technologie) with a CIFRE fellowship (2020/1218) granted to Renault Group.

Data Availability Statement

Data are unavailable due to privacy restrictions.

Acknowledgments

The authors would like to thank Yves Tourbier for fruitful discussions toward this topic. This work was performed using HPC resources from GENCI-IDRIS (Grant 2023-AD011012779R1).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CFD	Computational Fluid Dynamics
LBM	Lattice Boltzmann Method
POD	Proper Orthogonal Decomposition
RSM	Response Surface Model
HF	High-fidelity (computation)
AI	Artificial Intelligence
NN	Neural Network
LLE	Locally Linear Embedding
CNN	Convolutional Neural Network
SUV	Sport Utility Vehicle
$C_{d} A$	Drag coefficient times frontal area
LES	Large Eddy Simulation
SGS	Subgrid Scale
$R e$	Reynolds number
$f_{x}$	Drag force component in the x-direction
p	Static pressure
$p_{0}$	Infinite unperturbed pressure
$\hat{x}$	Unit vector in the x-direction
$τ$	Viscous stress tensor
$ν$	Kinematic viscosity
$\nabla u$	Velocity gradient tensor
$S_{v}$	Vehicle’s surface
$t$	Tangential unit vector
$S_{c p}$	Cutting plane in the wake of the vehicle
$ρ$	Fluid density
$u_{0}$	Freestream velocity
$u_{x}$	Velocity component in the x-direction
CAD	Computer-Aided Design
$μ$	Shape parameter vector
$M$	Domain of admissible shape parameters

Appendix A. Mesh Matching

Appendix A.1

If the initial geometry

\underset{̲}{S} (., μ^{r e f})

and the deformed geometry

\underset{̲}{S} (., μ)

do not share the same mesh topology, as in the case of remeshing, establishing a correspondence between the two meshes is necessary. One approach to achieve this is to employ optimal transport between the nodes or elements of the two meshes. In this context, the initial geometry

\underset{̲}{S} (., μ^{r e f})

and the deformed one

\underset{̲}{S} (., μ)

are treated as point clouds of element centroids, each consisting of

M

and

M^{μ}

points, respectively. The centroids are noted as

{(c_{j})}_{j = 1 \dots M}

for

\underset{̲}{S} (., μ^{r e f})

and

{(c_{i}^{μ})}_{i = 1 \dots M^{μ}}

for

\underset{̲}{S} (., μ)

, with each point assigned a mass

m_{j}

and

m_{i}^{μ}

, respectively. By default, point clouds are conceptualized as distributions of uniform mass, where

m_{j} = \frac{1}{M}

and

m_{i}^{μ} = \frac{1}{M^{μ}}

.

The transport plan

π^{μ}

between

\underset{̲}{S} (., μ)

and

\underset{̲}{S} (., μ^{r e f})

is a matrix of dimensions

M^{μ} \times M

, where

m_{i j}

represents the amount of mass transported from the initial point

c_{i}^{μ} \in \underset{̲}{S} (., μ)

to

c_{j} \in \underset{̲}{S} (., μ^{r e f})

(see Figure A1). The definition of the transport plan is formulated to minimize a transport cost, which is given by

cost (π^{μ}) = \sum_{i, j} m_{i j} \, \| c_{j} - c_{i}^{μ} \|^{2}

(A1)

Therefore, each point

c_{j} \in \underset{̲}{S} (., μ^{r e f})

is connected to one or multiple points

c_{i}^{μ} \in \underset{̲}{S} (., μ)

with a specified weight

m_{i j}

if

m_{i j} > 0

.
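The following sketch shows one possible way to compute such a plan and to use it. It relies on entropic regularization (Sinkhorn iterations), which is an assumption: the appendix does not state which solver is used, and the regularized plan only approximates the minimizer of (A1). The field transfer by mass-weighted averaging is likewise an assumed use of the weights $m_{ij}$, and all function names, the parameter `eps`, and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance ||a - b||^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sinkhorn_plan(c_mu: Sequence[Point], c_ref: Sequence[Point],
                  eps: float = 0.05, n_iter: int = 200):
    """Entropy-regularized transport plan (M^mu x M) between uniform clouds."""
    n_mu, n_ref = len(c_mu), len(c_ref)
    a = [1.0 / n_mu] * n_mu      # masses m_i^mu
    b = [1.0 / n_ref] * n_ref    # masses m_j
    cost = [[sq_dist(ci, cj) for cj in c_ref] for ci in c_mu]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n_mu, [1.0] * n_ref
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(n_ref))
             for i in range(n_mu)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n_mu))
             for j in range(n_ref)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(n_ref)]
            for i in range(n_mu)]
    return plan, cost


def transport_cost(plan, cost) -> float:
    """Evaluate (A1): sum over i, j of m_ij * ||c_j - c_i^mu||^2."""
    return sum(m * c for p_row, c_row in zip(plan, cost)
               for m, c in zip(p_row, c_row))


def transfer_field(plan, f_mu: Sequence[float]) -> List[float]:
    """Mass-weighted average of a field from the deformed cloud at each c_j."""
    out = []
    for j in range(len(plan[0])):
        col_mass = sum(row[j] for row in plan)
        out.append(sum(row[j] * f for row, f in zip(plan, f_mu)) / col_mass)
    return out


# Toy clouds: M = 2 reference centroids, M^mu = 3 deformed centroids.
c_ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
c_mu = [(0.0, 0.1, 0.0), (0.5, 0.1, 0.0), (1.0, 0.1, 0.0)]
f_mu = [1.0, 2.0, 3.0]  # hypothetical field sampled on the deformed cloud

plan, cost = sinkhorn_plan(c_mu, c_ref)
total_cost = transport_cost(plan, cost)
f_ref = transfer_field(plan, f_mu)
```

In this toy case the middle point of the deformed cloud splits its mass between the two reference points, so each reference point is connected to more than one point $c_{i}^{\mu}$. For realistic mesh sizes a dedicated optimal transport library would be preferable to this pure-Python loop.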

Figure A1. Transport plan towards a reference point $c_{j}$ of the reference cloud. $m_{ij}$ represents the mass transport from point $c_{i}^{\mu}$ to point $c_{j}$.

References

Lions, J.L.; Magenes, E. Non-Homogeneous Boundary Value Problems and Applications; Springer: Berlin/Heidelberg, Germany, 1972.
Pironneau, O. On optimum design in fluid mechanics. J. Fluid Mech. 1974, 64, 97–110.
Jameson, A. Aerodynamic design via control theory. J. Sci. Comput. 1988, 3, 233–260.
Jameson, A. Optimum aerodynamic design using CFD and control theory, AIAA paper 95-1729. In Proceedings of the AIAA 12th Computational Fluid Dynamics Conference, San Diego, CA, USA, 19–22 June 1995.
Cheylan, I.; Fritz, G.; Ricot, D.; Sagaut, P. Shape Optimization Using the Adjoint Lattice Boltzmann Method for Aerodynamic Applications. AIAA J. 2019, 57, 1–16.
Hardy, R.L. Multiquadric equations of topography and other irregular surfaces. J. Geophys. Res. 1971, 76, 1905–1915.
Buhmann, M.D. Radial Basis Functions: Theory and Implementation; Cambridge University Press: New York, NY, USA, 2003.
MacKay, D.J.C. Introduction to Gaussian processes. NATO ASI Ser. F Comput. Syst. Sci. 1998, 168, 133–166.
Brunton, S.L.; Noack, B.R.; Koumoutsakos, P. Machine learning for fluid mechanics. Annu. Rev. Fluid Mech. 2020, 52, 477–508.
Pearson, K. On Lines and Planes of Closest Fit to Systems of Points in Space. Philos. Mag. 1901, 2, 559–572.
Lumley, J. Stochastic Tools in Turbulence; Academic Press: New York, NY, USA, 1970.
Roweis, S.T.; Saul, L.K. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science 2000, 290, 2323–2326.
Umetani, N.; Bickel, B. Learning three-dimensional flow for interactive aerodynamic design. ACM Trans. Graph. 2018, 37, 1–10.
Li, X.; Xie, C.; Sha, Z. Part-Aware Product Design Agent Using Deep Generative Network and Local Linear Embedding. In Proceedings of the Hawaii International Conference on System Sciences, Kauai, HI, USA, 5–8 January 2021.
Badías, A.; Curtit, S.; González, D.; Alfaro, I.; Chinesta, F.; Cueto, E. An augmented reality platform for interactive aerodynamic design and analysis. Int. J. Numer. Methods Eng. 2019, 120, 125–138.
Song, B.; Yuan, C.; Permenter, F.; Arechiga, N.; Ahmed, F. Surrogate Modeling of Car Drag Coefficient with Depth and Normal Renderings. In Proceedings of the ASME 2023 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Volume 3A: 49th Design Automation Conference (DAC), Boston, MA, USA, 20–23 August 2023.
Xie, S.; Girshick, R.; Dollar, P.; Tu, Z.; He, K. Aggregated Residual Transformations for Deep Neural Networks. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5987–5995.
Rosset, N.; Cordonnier, G.; Duvigneau, R.; Bousseau, A. Interactive design of 2D car profiles with aerodynamic feedback. Comput. Graph. Forum 2023, 42, 427–437.
Guo, X.; Li, W.; Iorio, F. Convolutional Neural Networks for Steady Flow Approximation. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Association for Computing Machinery, New York, NY, USA, 13–17 August 2016; pp. 481–490.
Jacob, S.; Mrosek, M.; Othmer, C.; Köstler, H. Deep Learning for Real-Time Aerodynamic Evaluations of Arbitrary Vehicle Shapes. SAE Int. J. Passeng. Veh. Syst. 2022, 15, 77–90.
Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Med. Image Comput. Comput.-Assist. Interv. 2015, 9351, 234–241.
Heft, A.; Indinger, T.; Adams, N. Introduction of a New Realistic Generic Car Model for Aerodynamic Investigation. SAE Tech. Pap. 2012-01-0168 2012.
Monti, F.; Boscaini, D.; Masci, J.; Rodola, E.; Svoboda, J.; Bronstein, M.M. Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5115–5124.
Baqué, P.; Remelli, E.; Fleuret, F.; Fua, P. Geodesic Convolutional Shape Optimization. In Proceedings of the 35th ICML, Stockholm, Sweden, 10–15 July 2018; pp. 481–490.
Durasov, N.; Lukoyanov, A.; Donier, J.; Fua, P. DEBOSH: Deep Bayesian Shape Optimization. arXiv 2021, arXiv:2109.13337.
Remelli, E.; Lukoianov, A.; Richter, S.R.; Guillard, B.; Bagautdinov, T.; Baque, P.; Fua, P. MeshSDF: Differentiable Iso-Surface Extraction. Adv. Neural Inf. Process. Syst. 2020, 33, 22468–22478.
Gunpinar, E.; Coskun, U.C.; Ozsipahi, M.; Gunpinar, S. A Generative Design and Drag Coefficient Prediction System for Sedan Car Side Silhouettes based on Computational Fluid Dynamics. Comput.-Aided Des. 2019, 111, 1–10.
Ando, K.; Takamura, A.; Saito, I. Automotive aerodynamic design exploration employing new optimization methodology based on CFD. SAE Int. J. Passeng. Cars-Mech. Syst. 2010, 3, 398–406.
Bertram, A.; Othmer, C.; Zimmermann, R. Towards Real-time Vehicle Aerodynamic Design via Multi-fidelity Data-driven Reduced Order Modeling. In Proceedings of the 2018 AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, Kissimmee, FL, USA, 8–12 January 2018.
Clancy, L.J. Aerodynamics; Wiley: Hoboken, NJ, USA, 1975.
Anderson, J. Fundamentals of Aerodynamics, 6th ed.; McGraw-Hill: New York, NY, USA, 2016.
Onorato, M.; Costelli, A.; Garrone, A. Drag Measurement Through Wake Analysis. SAE Tech. Pap. 840302 1984, 85–93.
Sagaut, P. Large Eddy Simulation for Incompressible Flows; Springer: Berlin/Heidelberg, Germany, 2006.
Feydy, J.; Séjourné, T.; Vialard, F.-X.; Amari, S.-I.; Trouve, A.; Peyré, G. Interpolating between Optimal Transport and MMD using Sinkhorn Divergences. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, Naha, Japan, 16–18 April 2019; pp. 2681–2690.
Kendall, M.G. A new measure of rank correlation. Biometrika 1938, 30, 81–93.
Nelsen, R.B. Kendall tau metric. Encycl. Math. 2001, 3, 226–227.

Figure 1. Morphing regions on the well-known Ahmed body with one CAD parameter acting on the global height. Control areas are colored in green, free area in purple, and the boundary areas in blue.

Figure 2. Boundary conditions for the CFD simulation of the 3D vehicle. (a) Boundary conditions on the inlet and outlet surfaces of the wind tunnel. (b) Boundary conditions on the wind tunnel ceiling, floor, lateral walls, and vehicle.

Figure 3. Presentation of the S2A geometry involving three independent parameters. The variable surface part of the geometry is drawn in magenta.

Figure 4. Maximin space-filling on CAD parameter $\mu \in \mathbb{R}^{3}$.

Figure 5. Averaged Mean Squared Error of shape reconstruction depending on the number of folds in cross-validation.

Figure 6. (a) Mean and max absolute relative error on $C_{d} A$ depending on the number of folds for cross-validation. (b) Correlation scores depending on the number of folds for cross-validation.

Figure 7. Training and inference time depending on the number of folds for cross-validation.

Figure 8. Maximin space-filling on CAD parameter $\mu \in \mathbb{R}^{3}$. Orange dots are the $N_{s} = 4$ training shapes, the green dot is the reference geometry $\mu = 0$, and the remaining blue dots are used for the validation set.

Figure 9. Shape deformation field norm for training shape ID number 19.

Figure 10. Shape deformation field norm for training shape ID number 21.

Figure 11. Relative $L^{2}$ error projection on the $q_{\ell}$ POD basis for each shape of the database.

Figure 12. Flow field on cutting plane located 0.2 m from the vehicle. (a) Cutting planes of study in the wake at respective locations 0.2 m and 3.2 m from the vehicle. (b) Delta fields $\Delta \kappa(\mu)$ for shape ID 17 on the wake plane located 0.2 m from the vehicle. (c) Delta fields $\Delta \kappa(\mu)$ for shape ID 19 on the wake plane located 0.2 m from the vehicle. (d) Delta fields $\Delta \kappa(\mu)$ for shape ID 20 on the wake plane located 0.2 m from the vehicle.

Figure 13. Modes on cutting plane $S_{cp}$. (a) Contour plot of far-field mode $\psi_{1}$. (b) Contour plot of far-field mode $\psi_{2}$. (c) Contour plot of far-field mode $\psi_{3}$.

Figure 14. Prediction results of the flow field. (a) Simulated flow field. (b) Predicted flow field. (c) Absolute relative error.

Figure 15. Mean Squared Error of the flow field predictions for various shapes in the validation dataset.

Figure 16. Prediction results. (a) Correlation plot of $C_{d} A$. (b) Correlation plot of $\Delta C_{d} A$ and its distribution.

Table 1. Table of dimensionless drag results (the surface drag of the reference shape is 1). The range is $[0.9632, 1.0168]$.

Shape ID	$\frac{C_{d} A}{C_{d} A^{ref}}$	Set	Shape ID	$\frac{C_{d} A}{C_{d} A^{ref}}$	Set
1	1.0000	Reference	17	1.0075	Training
2	0.9820	Validation	18	0.9974	Validation
3	0.9939	Validation	19	1.0168	Training
4	0.9836	Validation	20	0.9632	Training
5	1.0011	Validation	21	0.9616	Training
6	1.0046	Validation	22	0.9987	Validation
7	0.9994	Validation	23	0.9761	Validation
8	0.9831	Validation	24	0.9790	Validation
9	1.0088	Validation	25	0.9952	Validation
10	0.9921	Validation	26	0.9858	Validation
11	1.0054	Validation	27	0.9712	Validation
12	0.9844	Validation	28	0.9943	Validation
13	0.9785	Validation	29	1.0101	Validation
14	0.9955	Validation	30	1.0043	Validation
15	0.9969	Validation	31	0.9775	Validation
16	1.0100	Validation

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Naffer-Chevassier, K.; De Vuyst, F.; Goardou, Y. Enhanced Drag Force Estimation in Automotive Design: A Surrogate Model Leveraging Limited Full-Order Model Drag Data and Comprehensive Physical Field Integration. Computation 2024, 12, 207. https://doi.org/10.3390/computation12100207

