1. Introduction
Deep neural networks have emerged as a fast and reliable alternative for solving many scientific problems [1,2]. Recent advancements in hardware and software have made it possible to train extensive neural networks, e.g., physics-informed neural networks [2,3,4], deep operator networks [5,6,7,8,9], and neural operators [10,11,12,13], that accurately estimate and predict complex engineering and physical systems. However, constructing these models requires a large number of data samples for training. Due to the costs associated with experiments and computations, most studies only include a limited number of high-fidelity data samples to describe a physical problem [14,15]. As a result, training neural networks with only high-fidelity data becomes prohibitively expensive and may lead to inaccurate results.
One possible solution for reducing the high cost of producing high-fidelity data is to incorporate data from models or experiments with varying accuracy into the neural network training procedure using multi-fidelity approaches. Instead of immediately resorting to expensive experimental setups or high-dimensional, high-resolution meshes to obtain accurate high-fidelity data, less accurate datasets can be acquired using prediction methods or coarse meshes to construct an initial low-fidelity surrogate. As expected, the accuracy of these initial results is insufficient for subsequent steps and needs to be improved. To mitigate this issue and ensure accuracy, multi-fidelity techniques can be employed to bridge the gap between the low-fidelity and high-fidelity spaces. This approach can offer cost-effective and precise results for various problems, such as parametric optimization, which relies on accurate data to provide optimal solutions.
In the context of multi-fidelity model optimization, various methods exist for obtaining a low-fidelity model, including simplified physics models, coarser meshes in high-fidelity models, or relaxed solver convergence criteria. However, it is crucial for surrogate models to represent the high-fidelity model accurately. To achieve this, corrections are applied to the low-fidelity models. For instance, techniques like space mapping, shape preserving, and adaptive response correction are utilized in aerodynamic models. Compared to function approximation surrogates, the main advantage of multi-fidelity model optimization is that it requires fewer high-fidelity data points to construct a physics-based model with the desired accuracy level. This reduction in data dependency improves the efficiency of the optimization algorithm, making the process faster and more cost-effective while maintaining acceptable accuracy levels [16]. For example, in [16], the low-fidelity surrogate, constructed from datasets obtained from the transonic small-disturbance equation, was corrected using the shape-preserving response correction methodology. The authors were able to reduce computational and optimization costs by cutting the required high-fidelity dataset by about 90%.
Multi-fidelity surrogates have shown promising results for several challenging problems and test conditions. For example, in [17], drag and lift coefficients were optimized for an airfoil under transonic conditions using a multi-fidelity model. The surrogate was constructed using coarse and fine meshes to collect low- and high-fidelity datasets. The authors suggested a multi-level method as a computationally effective solution by performing mapping algorithms such as multi-level optimization, space mapping, and shape-preserving response prediction techniques. Similarly, in [18], a multi-fidelity method was considered as a cheap alternative closure model for evaluating drag in shock–particle interactions, rather than using expensive fine-resolution computational solutions. The authors applied space mapping, radial basis functions, and modified Bayesian kriging as correction techniques on low-fidelity surrogates to determine the best and least expensive closure model.
In [19], the multi-fidelity method was shown to accelerate aerodynamic inverse design: a manifold-mapping-based design required fewer than 20 high-fidelity and 1000 to 2000 low-fidelity evaluations, compared with the 700 to 1200 high-fidelity model evaluations needed by direct aerodynamic inverse design via pattern search. Furthermore, in [20], a comparison of single- and multi-fidelity methods for decreasing the drag coefficient of a missile showed a significant reduction in optimization time. The authors used a faster semi-empirical missile tool to obtain the low-fidelity data.
Recognizing the potential of multi-fidelity surrogates, efforts have been made to reduce the need for high-fidelity data and achieve cost savings. One approach is to enhance the mapping methods. For example, in [21], various mapping approaches were explored to determine surrogate-based optimization schemes that can effectively handle constraints in variable-complexity design problems. The authors observed a 53% reduction in high-fidelity function calls for a wing design problem, resulting in reduced computational costs. In [22], the co-kriging method was used to create a genetic-based surrogate model for a similar purpose. The genetic-based surrogate model was found to have better accuracy than surrogates based on the low-fidelity model alone. Moreover, using the genetic-based model reduced computational costs due to fewer high-fidelity data requirements. To generalize the model, [23] focused on the residual while constructing the surrogate model for robust optimization. The authors created a surrogate model to eliminate the errors that the choice of turbulence model introduces into the accuracy and cost of a numerical solution and applied the model to optimize a diffuser geometry. In [24], the accuracy of the available multi-fidelity model was improved by combining gradient-enhanced kriging with a generalized hybrid bridge function while constructing the multi-fidelity model, and promising robustness results were reported.
To improve the accuracy of multi-fidelity surrogates and reduce the cost of the required high-fidelity data, deep learning methods have been applied to create low-fidelity surrogates and to map low- to high-fidelity spaces. The concept of an intelligent teacher–student interaction in machine learning, where student learning is accelerated with privileged information and corrected by transferring knowledge from teacher to student, is described in [25]. In [26], fidelity-weighted learning was introduced as a new student–teacher structure for deep neural networks. This approach was evaluated in natural language processing and information retrieval and provided fast and reliable mapping between weakly and strongly labeled data. Inductive transfer and bi-fidelity-weighted learning methods were utilized in [27] for uncertainty propagation by constructing neural network surrogates from low- and high-fidelity datasets. Two approaches, partial adaptation and a shallow network, were used to map between low- and high-fidelity data. The bi-fidelity-weighted method showed promising validation errors for three multidisciplinary examples.
Many works [28,29,30,31,32,33,34] have proposed scientific machine-learning-based solutions for the multi-fidelity problem. For instance, in [28], a multi-fidelity neural network model was integrated with the deep operator network (DeepONet) to reduce the required high-fidelity data, attaining an error one order of magnitude smaller. This approach was implemented to compute the Boltzmann transport equation (BTE) and provided a fast solver for the inverse design of BTE problems. To reduce computational costs for complex physical problems involving parametric uncertainty and partial unknowns, a bi-fidelity modeling approach utilizing a deep operator network was introduced in [30]. This approach was applied to three problems: a nonlinear oscillator, heat transfer, and a wind farm system. The evaluation demonstrated that the proposed method significantly reduces the validation error while increasing efficiency. In [31], the authors improved their previously proposed non-autonomous DeepONet-based framework [35] by incorporating a residual learning approach. This approach merges information from pre-existing mathematical models, enabling precise and efficient predictions in challenging power engineering problems. Finally, in [32], the authors proposed a deep operator learning framework for computing fine-scale solutions of multiscale partial differential equations (PDEs). Specifically, they trained multi-fidelity homogenization maps using mathematically motivated neural operators.
The objective of this paper is to significantly lower the computational expenses involved in simulating the time-dependent evolution of the lift and drag coefficients without compromising on result resolution. To achieve this, we design a novel physics-guided, bi-fidelity Fourier-featured deep operator learning-based framework, which is constructed using coarse and fine datasets obtained through numerical simulation. To this end, we make the following contributions.
We develop a physics-guided, bi-fidelity Fourier-featured deep operator learning-based framework (see Section 2.2). This framework takes an arbitrary undisturbed free-stream velocity as input and produces continuous, oscillatory time trajectories of the lift and drag coefficients for a cylinder within a channel. Our approach begins with the design and training of a physics-guided low-fidelity deep operator network using an extensive dataset that captures the foundational solution patterns. Subsequently, the low-fidelity deep operator network’s predictions are enhanced through a physics-guided residual deep operator network. This elevation process transitions the low-fidelity solution to a high-fidelity solution utilizing a small high-fidelity dataset. The use of the developed framework enables a comprehensive analysis of the target solution’s time evolution for any given undisturbed free-stream velocity, eliminating the need for time-consuming numerical computations and simulations.
The aforementioned deep operator learning-based framework is constructed using a novel physics-guided approach. This approach utilizes the oscillatory nature of the time trajectory of drag and lift coefficients to transform the problem into a functional inverse problem, thereby reducing the solution space for training the deep operator networks. In this paper, we compare the physics-guided approach with the traditional data-driven approach.
Within this framework, we incorporate the Fourier-featured network as the trunk network of the DeepONets, thereby leading to the development of the Fourier-featured DeepONet. This incorporation harnesses the intrinsic capabilities of the Fourier-featured network, particularly its proficiency in capturing the fluctuations in the lift and drag time trajectories. Consequently, the Fourier-featured DeepONet demonstrates superior performance compared to the vanilla DeepONet, which often struggles to understand and precisely model these oscillatory patterns.
The rest of this paper is organized as follows. Section 2 outlines the numerical method used to generate data on the lift and drag coefficient trajectories, including the specifics of the simulation setup and parameters. Subsequently, we describe the physics-guided, bi-fidelity Fourier-featured deep operator learning methodology for developing a framework to learn the target solutions. Section 3 presents numerical experiments that evaluate both the low-fidelity and the proposed bi-fidelity deep operator learning frameworks using their respective datasets. Section 4 discusses the results and their implications in detail. Finally, in the concluding Section 5, we encapsulate our key findings and conclusions.
2. Methodology and Description of the Tools
This paper proposes a novel approach called physics-guided bi-fidelity Fourier-featured deep operator learning to address the challenge of minimizing the cost associated with acquiring high-fidelity data. The approach is applied specifically to accurately estimate the fluctuations in the drag and lift coefficients over time for a cylinder in a channel operating under low-Reynolds-number conditions.
In order to obtain the necessary data, both low-fidelity and high-fidelity data are collected by simulating the flow around the cylinder using a commercial solver, Ansys Fluent. The data collection process involves varying the Reynolds number within an interval centered around 100. Coarse (low-fidelity) and fine (high-fidelity) meshes are used to capture the flow characteristics. The low-fidelity data are employed to establish an initial deep operator network that accepts velocity and time as inputs. To refine the predictions of this low-fidelity deep operator network, a secondary deep operator network utilizing high-fidelity data is integrated, leading to a more precise prediction of the time trajectories of the drag and lift coefficients. Please note that this study employs velocity as an input parameter. Recognizing the impact of the inlet velocity on the pressure field, our future research will leverage detailed pressure data surrounding the cylinder as an input parameter to further enhance understanding.
In the following sections of this paper, we provide detailed descriptions of the numerical method used for data collection. This includes the simulation setup and parameters. Additionally, we delve into the bi-fidelity Fourier-featured deep operator learning framework, explaining its architectural components and training process. These advancements together contribute to a more efficient and cost-effective approach for estimating fluctuations in the drag and lift coefficients over time of the cylinder in the channel under low-Reynolds-number conditions.
2.1. Numerical Approach
We consider a cylinder with a diameter of $D = 0.1$ m placed inside a rectangular channel of length 2.2 m and height $H = 0.41$ m, as illustrated in Figure 1 (replicated from [36]). This benchmark simulation investigates two- and three-dimensional laminar flows around a cylinder and has been studied by several research groups using various numerical approaches; it is documented in terms of the drag, lift, and Strouhal number. The x-axis (streamwise direction) and y-axis are aligned along the length and height of the channel, respectively. The center of the cylinder is positioned at $(0.2, 0.2)$ m, slightly off-center along the y-axis from the center line of the channel to induce vortex shedding. The left and right boundaries of the channel are taken as the inlet and outlet, respectively, and the involved quantities are normalized with the cylinder’s diameter, $D$. The top and bottom walls of the channel and the cylinder’s solid surface follow the no-slip boundary condition, whereas the outlet boundary condition is set for the exit. A parabolic velocity profile, defined in Equation (1), is applied to the inlet boundary.
Here, $\mathbf{u} = (u, v)$ represents the velocity vector, and $U_{\max}$ denotes the maximum velocity at the centerline of the channel. The mean velocity of the flow, $\bar{U}$, can be calculated using Equation (2).
When $U_{\max}$ is set to 1.5 m/s, the mean velocity around the cylinder is $\bar{U} = 1$ m/s. This results in a Reynolds number of 100, based on Equation (3), where $\nu = 10^{-3}$ m$^2$/s is the kinematic viscosity of the fluid.
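For reference, these relations take the following standard form in the benchmark of [36]; this is a reconstruction consistent with the values above and with Equations (1)–(3), not a verbatim quotation of the paper’s equations:

$$\mathbf{u}(0, y) = \left(\frac{4\, U_{\max}\, y\,(H - y)}{H^2},\; 0\right), \qquad \bar{U} = \frac{2}{3}\, U_{\max}, \qquad \mathrm{Re} = \frac{\bar{U}\, D}{\nu}.$$

With $U_{\max} = 1.5$ m/s, the parabolic profile gives $\bar{U} = 1$ m/s, and hence $\mathrm{Re} = (1 \times 0.1)/10^{-3} = 100$.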
To create the low- and high-fidelity datasets, the Reynolds number was adjusted to values between 90 and 110 by varying the maximum velocity. For each inlet condition, the lift force $F_L$ and drag force $F_D$ were instantaneously calculated on the cylinder surface using Equation (4). Here, $S$ denotes the cylinder surface, and $n_x$ and $n_y$ represent the $x$ and $y$ components of the normal vector on $S$, respectively. The tangential velocity on the cylinder surface, $v_t$, is obtained using the tangent vector $(n_y, -n_x)$ of $S$.
The lift coefficient $C_L$ and drag coefficient $C_D$ can be estimated using the corresponding forces on the cylinder surface, as shown in Equation (5).
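For reference, the benchmark’s definitions [36] of these quantities take the following form, with $\rho$ denoting the fluid density and $\partial v_t / \partial n$ the normal derivative of the tangential velocity; this is a reconstruction consistent with the text above (cf. Equations (4) and (5)), not a verbatim quotation:

$$F_D = \int_S \left(\rho \nu \frac{\partial v_t}{\partial n}\, n_y - p\, n_x\right) \mathrm{d}S, \qquad F_L = -\int_S \left(\rho \nu \frac{\partial v_t}{\partial n}\, n_x + p\, n_y\right) \mathrm{d}S,$$

$$C_D = \frac{2 F_D}{\rho\, \bar{U}^2 D}, \qquad C_L = \frac{2 F_L}{\rho\, \bar{U}^2 D}.$$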
The Fluent laminar solver is used for the numerical simulations with transient settings. Unstructured grids are used for the simulations, with refinement near the cylinder wall. A coarse mesh containing around 3000 cells is used to obtain the low-fidelity data, while the high-fidelity data are obtained with a refined mesh of around 87,700 cells. The Courant (CFL) number is kept below 0.9 for each mesh and inlet velocity condition. Additionally, a tight convergence criterion is imposed on the inner iterations.
Figure 2 compares the lift coefficient ($C_L$) and drag coefficient ($C_D$) of the reference [36] with the fine and coarse mesh results. It can be observed that the maximum values of $C_L$ and $C_D$ obtained with the fine mesh overlap with the reference data, thus validating the results.
2.2. Bi-Fidelity Fourier-Featured Deep Operator Learning
This section describes the proposed bi-fidelity Fourier-featured deep operator learning framework for approximating the operators that predict the drag and lift coefficients over time.
2.2.1. Deep Operator Learning
This paper proposes a bi-fidelity Fourier-featured deep operator learning framework to approximate the operator mapping $\mathcal{G}$ between the undisturbed free-stream velocity in the channel, denoted as $U$, and the time-dependent trajectories of the drag and lift coefficients of the cylinder, denoted as $C_D(t)$ and $C_L(t)$, respectively, for $t \in [t_0, t_f]$, i.e.,

$$\mathcal{G}: U \mapsto \left(C_D(\cdot),\, C_L(\cdot)\right),$$

where $[t_0, t_f]$ is the time interval of interest. The input domain for the undisturbed free-stream velocity $U$ is the interval of velocities (in m/s) corresponding to Reynolds numbers between 90 and 110, and the time domain for the coefficients is $[t_0, t_f]$ s.
To approximate the operator $\mathcal{G}$, we design a bi-fidelity Fourier-featured deep operator learning framework, denoted as $\mathcal{G}_\theta$, where $\theta$ is the vector of trainable parameters. This framework consists of two deep operator networks, which we introduce in the next sections: a low-fidelity deep operator network, denoted as $\mathcal{G}^{\mathrm{LF}}_{\theta_{\mathrm{LF}}}$, and a residual deep operator network, denoted as $\mathcal{G}^{\mathrm{R}}_{\theta_{\mathrm{R}}}$. The proposed framework then satisfies

$$\mathcal{G}_\theta(U)(t) = \mathcal{G}^{\mathrm{LF}}_{\theta_{\mathrm{LF}}}(U)(t) + \mathcal{G}^{\mathrm{R}}_{\theta_{\mathrm{R}}}(U)(t).$$
2.2.2. Low-Fidelity Deep Operator Network
Using a coarse mesh in simulations of complex systems is an effective approach to reduce the computational demands of the simulation process. This methodology simplifies the system’s complexity while generating abundant data that will significantly enhance the training process for deep operator networks. However, the adoption of a coarse mesh also introduces its own set of challenges. Notably, it omits fine-grained details in the system’s dynamics, leading to approximation errors. These errors may initially be minor but can compound over time, potentially resulting in significant discrepancies and impacting the accuracy and reliability of the deep operator network’s predictions.
The proposed bi-fidelity framework capitalizes on the advantages of the low-fidelity deep operator network while deploying a residual deep operator network to bridge the gap between the coarse and fine operators. This strategy enables the framework to benefit from the computational cost-effectiveness of the low-fidelity deep operator network without compromising the high accuracy characteristic of the fine operator.
The proposed bi-fidelity learning approach first designs a low-fidelity deep operator network with trainable parameters $\theta_{\mathrm{LF}}$, denoted as $\mathcal{G}^{\mathrm{LF}}_{\theta_{\mathrm{LF}}}$. This deep operator network maps the undisturbed free-stream velocity $U$ onto the low-fidelity solution $C^{\mathrm{LF}}(t)$ obtained using the coarse-mesh simulation. The target solution for the low-fidelity deep operator network, $C^{\mathrm{LF}}(t)$, corresponds to the temporal trajectory of either the drag or lift coefficient within the specified time domain $[t_0, t_f]$. We will design this low-fidelity deep operator network to be either a physics-guided deep operator network (see Section 2.2.5) or a data-driven deep operator network with Fourier-featured layers (see Section 2.2.4).
Training the low-fidelity deep operator network. To train the low-fidelity deep operator network, we minimize the following loss function:

$$\mathcal{L}_{\mathrm{LF}}(\theta_{\mathrm{LF}}) = \frac{1}{N}\sum_{i=1}^{N}\left| C^{\mathrm{LF}}_i - \mathcal{G}^{\mathrm{LF}}_{\theta_{\mathrm{LF}}}(U_i)(t_i)\right|^2,$$

using the dataset of $N$ triplets $\mathcal{D}_{\mathrm{LF}} = \{(U_i, t_i, C^{\mathrm{LF}}_i)\}_{i=1}^{N}$, where $C^{\mathrm{LF}}_i$ is the coarse-mesh target solution evaluated at time $t_i$ for a given undisturbed free-stream velocity $U_i$. In practice, for each velocity $U_i$, we can obtain $q$ samples of the target solution by evaluating it at $q$ different times $t_1, \dots, t_q$.
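As a concrete illustration, the following is a minimal PyTorch sketch of this objective; the `model` callable, its `(U, t)` signature, and the tensor shapes are illustrative assumptions, not the authors’ implementation.

```python
import torch

def low_fidelity_loss(model, U, t, C_lf):
    """Empirical MSE over the N triplets (U_i, t_i, C_i^LF).

    model: any map (U, t) -> predicted coefficient (here, a DeepONet);
    U, t, C_lf: tensors of shape (N, 1) holding velocities, times,
    and coarse-mesh drag or lift coefficients, respectively.
    """
    pred = model(U, t)                     # network prediction at (U_i, t_i)
    return torch.mean((pred - C_lf) ** 2)  # mean squared residual
```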
Since generating the low-fidelity dataset is computationally less expensive, we assume that this dataset can be large enough. This large dataset is particularly advantageous for training our low-fidelity deep operator network as it provides a diverse range of examples to learn from, thereby enhancing the efficacy and generalization ability of the model.
Furthermore, we note that, in our framework, the trainable low-fidelity deep operator network could alternatively be replaced with an efficient simulator, a physics-informed neural network (PINN) [2,4], or traditional machine learning approaches like spectral proper orthogonal decomposition (POD) [37]. An efficient simulator could enhance the computational process by providing faster approximations. In contrast, a PINN offers the potential to increase modeling efficiency by incorporating the simulator’s equations directly into the loss function. Additionally, employing POD might provide valuable insights due to its ability to decompose complex datasets into a set of orthogonal modes, thereby simplifying the analysis and interpretation of fluid dynamics phenomena and potentially enhancing the model’s predictive accuracy and interpretability.
2.2.3. Residual Deep Operator Network
To approximate the fine or high-fidelity operator, we propose a residual deep operator learning strategy that uses the output from the trained low-fidelity deep operator network $\mathcal{G}^{\mathrm{LF}}_{\theta^{*}_{\mathrm{LF}}}$. Figure 3 provides a visual representation of the proposed framework. The framework involves defining the residual operator

$$\mathcal{R}(U)(t) = \mathcal{G}(U)(t) - \mathcal{G}^{\mathrm{LF}}_{\theta^{*}_{\mathrm{LF}}}(U)(t),$$

which estimates the difference between the true fine operator and the output of the trained low-fidelity deep operator network.
We approximate the residual operator mentioned above using a second deep operator network. Specifically, the proposed residual deep operator network, denoted by $\mathcal{G}^{\mathrm{R}}_{\theta_{\mathrm{R}}}$, maps the undisturbed free-stream velocity $U$ to the residual solution $C^{\mathrm{R}}(t) = C^{\mathrm{HF}}(t) - \hat{C}^{\mathrm{LF}}(t)$, where $C^{\mathrm{HF}}(t)$ and $\hat{C}^{\mathrm{LF}}(t)$ are the fine solution and the predicted coarse solution, respectively. Note that this second network fine-tunes the predictions generated by the low-fidelity deep operator network, facilitating precise adjustments toward the high-fidelity solution.
Training the residual deep operator network. To train the residual deep operator network, we minimize the following loss function:

$$\mathcal{L}_{\mathrm{R}}(\theta_{\mathrm{R}}) = \frac{1}{N}\sum_{i=1}^{N}\left| C^{\mathrm{R}}_i - \mathcal{G}^{\mathrm{R}}_{\theta_{\mathrm{R}}}(U_i)(t_i)\right|^2,$$

using the dataset of $N$ triplets $\mathcal{D}_{\mathrm{R}} = \{(U_i, t_i, C^{\mathrm{R}}_i)\}_{i=1}^{N}$, where $C^{\mathrm{R}}_i$ denotes the residual error, i.e., the difference between the high-fidelity solution, obtained via the fine-mesh simulation, and the corresponding prediction generated by the trained low-fidelity deep operator network.
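A minimal sketch of how these residual targets could be assembled from the fine-mesh data and the frozen low-fidelity network follows; the names and shapes are illustrative assumptions.

```python
import torch

@torch.no_grad()  # the low-fidelity network is frozen at this stage
def residual_targets(lf_model, U, t, C_hf):
    """Residual targets C_i^R = C_i^HF - G_LF(U_i)(t_i).

    C_hf holds the fine-mesh drag or lift coefficients; the returned
    tensor is used as the training target for the residual DeepONet.
    """
    return C_hf - lf_model(U, t)
```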
Obtaining high-fidelity solutions for the dataset $\mathcal{D}_{\mathrm{R}}$ can be expensive, resulting in a smaller dataset size compared to the coarse dataset $\mathcal{D}_{\mathrm{LF}}$. This expense is due to the substantial computational resources required for high-resolution CFD simulations. While we used a uniform random sampling method to collect the data, this may not be ideal for costly high-fidelity data. Adopting a more strategic sampling approach, aimed at specific areas of complexity or interest, could potentially enhance the proposed framework’s performance. We will explore this strategic sampling approach in future work.
In constructing the residual deep operator neural network, especially for achieving a balance between efficiency and accuracy through high-fidelity evaluations, determining the number of high-fidelity training points is vital but lacks a universal guideline due to its dependence on specific problem details and data nature. This paper addresses this by considering factors like problem complexity, low-fidelity data quality, and desired error thresholds, in conjunction with the neural network’s size and structure. Through extensive hyper-parameter optimization, we have identified an optimal size for the high-fidelity dataset. This process is inherently iterative, necessitating adjustments based on the model’s validation stage performance to finely tune the balance between computational efficiency and accuracy.
Inference. After training the low-fidelity and residual deep operator networks, we predict the time evolution of the lift and drag coefficients for an arbitrary undisturbed free-stream velocity in the channel using a two-step process. First, we obtain a low-fidelity approximation of the solution using the trained low-fidelity deep operator network. This serves as our initial prediction/guess. Then, we use the residual deep operator network to predict the error between the true solution over time and our initial guess. Finally, we compute the approximation of the true solution by adding the outputs from both the trained low-fidelity and residual deep operator networks. Algorithm 1 summarizes the details of this two-step process for predicting the lift and drag coefficients for a given time partition $\{t_1, \dots, t_q\}$.
Algorithm 1: A Bi-Fidelity Fourier-Featured Deep Operator Framework for Predicting the Lift and Drag Coefficients.
1. Require: The trained low-fidelity deep operator network $\mathcal{G}^{\mathrm{LF}}_{\theta^{*}_{\mathrm{LF}}}$, the trained residual deep operator network $\mathcal{G}^{\mathrm{R}}_{\theta^{*}_{\mathrm{R}}}$, the undisturbed free-stream velocity of the fluid in the channel $U$, and a given time partition $\{t_1, \dots, t_q\}$.
2. Step 1: Use the trained low-fidelity deep operator network to predict the low-fidelity solution on $\{t_1, \dots, t_q\}$ for the given velocity $U$, i.e., $\hat{C}^{\mathrm{LF}}(t_k) = \mathcal{G}^{\mathrm{LF}}_{\theta^{*}_{\mathrm{LF}}}(U)(t_k)$, where $k = 1, \dots, q$.
3. Step 2: Use the trained residual deep operator network to predict the errors $\hat{C}^{\mathrm{R}}(t_k) = \mathcal{G}^{\mathrm{R}}_{\theta^{*}_{\mathrm{R}}}(U)(t_k)$ between the high-fidelity solution and the predicted low-fidelity solution on $\{t_1, \dots, t_q\}$ for the given velocity $U$.
4. Return: The predicted high-fidelity solution on $\{t_1, \dots, t_q\}$, i.e., $\hat{C}^{\mathrm{HF}}(t_k) = \hat{C}^{\mathrm{LF}}(t_k) + \hat{C}^{\mathrm{R}}(t_k)$, where $k = 1, \dots, q$.
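In code, Algorithm 1 reduces to a sum of two forward passes. The following is a minimal PyTorch sketch; the model names and the `(U, t)` calling convention are illustrative assumptions.

```python
import torch

@torch.no_grad()
def predict_high_fidelity(lf_model, res_model, U, t_grid):
    """Bi-fidelity inference (cf. Algorithm 1) on a time partition t_grid."""
    coarse = lf_model(U, t_grid)      # Step 1: low-fidelity prediction
    residual = res_model(U, t_grid)   # Step 2: predicted coarse-to-fine error
    return coarse + residual          # Return: high-fidelity estimate
```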
2.2.4. The Deep Operator Network (DeepONet)
In this paper, we adopt the deep operator network (DeepONet) proposed in [5] as the foundational model for constructing our novel, physics-guided, bi-fidelity Fourier-featured deep operator learning-based framework. DeepONet is based on the universal approximation theorem of nonlinear operators [38]. It can approximate any nonlinear continuous operator, i.e., any mapping between infinite-dimensional spaces.
Figure 4B illustrates the DeepONet architecture, which is designed to approximate the target operator using a trainable linear representation. To enable this representation, it is crucial to construct and train two interconnected but distinct sub-neural networks: the branch network and the trunk network. The branch network is primarily responsible for handling the input, denoted as $U$, and produces a vector of trainable basis coefficients, $b(U) = (b_1(U), \dots, b_p(U))$. The trunk network, on the other hand, decodes the output by processing the output location $t$. Its output is another vector of trainable basis functions, $\varphi(t) = (\varphi_1(t), \dots, \varphi_p(t))$. The desired linear representation is achieved by taking the dot product of the outputs from the branch and trunk networks. This effectively combines these outputs in a meaningful way to approximate the target operator as follows:

$$\mathcal{G}_\theta(U)(t) = \sum_{k=1}^{p} b_k(U)\, \varphi_k(t).$$
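To make the branch/trunk construction concrete, the following is a minimal PyTorch sketch of a DeepONet for this problem (scalar input velocity $U$ and scalar output location $t$); the layer sizes and activations are illustrative placeholders, not the settings of Table 1.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Minimal DeepONet: branch encodes U, trunk encodes t, and their
    dot product gives the prediction G_theta(U)(t)."""

    def __init__(self, p: int = 64, width: int = 128):
        super().__init__()
        self.branch = nn.Sequential(          # input U -> coefficients b_k(U)
            nn.Linear(1, width), nn.Tanh(),
            nn.Linear(width, p),
        )
        self.trunk = nn.Sequential(           # location t -> basis phi_k(t)
            nn.Linear(1, width), nn.Tanh(),
            nn.Linear(width, p),
        )

    def forward(self, U: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        b = self.branch(U)                    # shape (N, p)
        phi = self.trunk(t)                   # shape (N, p)
        return (b * phi).sum(dim=-1, keepdim=True)  # sum_k b_k(U) phi_k(t)
```

In the Fourier-featured variant used in this paper, the trunk’s first layer would be replaced by the random Fourier mapping described in Section 2.2.6.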
2.2.5. Physics-Guided Deep Operator Learning
The oscillatory behavior of the time trajectory of lift and drag coefficients suggests that a physics-guided approach could effectively decipher these patterns. Physics-guided neural networks have a unique advantage in incorporating physical intuition into deep learning, resulting in accurate and robust predictions. This advantage is especially valuable when dealing with complex phenomena, such as predicting drag and lift coefficients across a wide range of input velocities. By incorporating physical intuition into deep learning, the precision, reliability, and generalizability of the predictions can be improved compared to fully data-driven models. Although data-driven models are powerful, they may fail to capture complex dependencies if they are not adequately represented in the dataset. Physics-guided models address this potential limitation by utilizing known relationships and behaviors.
In this paper, we adopt a physics-guided approach along with deep operator neural networks to transform the problem into an operator/functional inverse problem, as illustrated in Figure 4C. In particular, the oscillatory nature of the lift and drag coefficients over time, as shown in Figure 2, suggests the suitability of the sinusoidal family of functions for modeling this behavior. Based on this observation, we propose integrating this physics-guided approach into our deep operator learning approach. More specifically, our proposed physics-guided deep operator network is defined as follows:

$$C(t) = A(U, t)\, \sin\!\big(\omega(U, t)\, t + \phi(U, t)\big).$$

Here, $C(t)$ represents either the lift or drag coefficient, while $A$, $\omega$, and $\phi$ are the outputs of three distinct DeepONets. These networks take the undisturbed free-stream velocity and time as inputs. In this configuration, $A$ learns the amplitude variations over time of the target solution, $\omega$ captures the frequency of the solution’s trajectory, and $\phi$ captures the phase angle. This division of tasks allows our model to capture both the scale and the periodicity of the system dynamics. By embedding the physics of the problem into our model, we anticipate enhancing its robustness and reliability.
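A minimal sketch of this parameterization is given below; for brevity, each of the three sub-networks is reduced to a small MLP over $(U, t)$, standing in for the three DeepONets (an illustrative simplification, not the paper’s architecture).

```python
import torch
import torch.nn as nn

class PhysicsGuidedOscillator(nn.Module):
    """Physics-guided form C(t) = A(U,t) * sin(w(U,t) * t + phi(U,t)),
    with amplitude, frequency, and phase each learned by a sub-network."""

    def __init__(self, width: int = 64):
        super().__init__()
        def subnet():
            return nn.Sequential(nn.Linear(2, width), nn.Tanh(),
                                 nn.Linear(width, 1))
        self.amp, self.freq, self.phase = subnet(), subnet(), subnet()

    def forward(self, U: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        x = torch.cat([U, t], dim=-1)         # joint (U, t) input
        A = self.amp(x)                       # amplitude A(U, t)
        w = self.freq(x)                      # angular frequency w(U, t)
        phi = self.phase(x)                   # phase angle phi(U, t)
        return A * torch.sin(w * t + phi)
```

Restricting the output to this sinusoidal family shrinks the solution space the networks must search, which is the mechanism behind the accuracy gains reported in Section 3.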
Training the Physics-Guided Deep Operator Network. To train the low-fidelity physics-guided deep operator network, we minimize the loss function

$$\mathcal{L}_{\mathrm{LF}}(\theta_{\mathrm{LF}}) = \frac{1}{N}\sum_{i=1}^{N}\left| C^{\mathrm{LF}}_i - A_{\theta_{\mathrm{LF}}}(U_i, t_i)\, \sin\!\big(\omega_{\theta_{\mathrm{LF}}}(U_i, t_i)\, t_i + \phi_{\theta_{\mathrm{LF}}}(U_i, t_i)\big)\right|^2,$$

utilizing a dataset comprising $N$ triplets, denoted as $\{(U_i, t_i, C^{\mathrm{LF}}_i)\}_{i=1}^{N}$, where $C^{\mathrm{LF}}_i$ corresponds to a specific channel undisturbed free-stream velocity $U_i$ from the training set. Furthermore, to train the residual physics-guided deep operator network, we optimize the analogous loss function

$$\mathcal{L}_{\mathrm{R}}(\theta_{\mathrm{R}}) = \frac{1}{N}\sum_{i=1}^{N}\left| C^{\mathrm{R}}_i - A_{\theta_{\mathrm{R}}}(U_i, t_i)\, \sin\!\big(\omega_{\theta_{\mathrm{R}}}(U_i, t_i)\, t_i + \phi_{\theta_{\mathrm{R}}}(U_i, t_i)\big)\right|^2,$$

using the dataset of $N$ triplets $\{(U_i, t_i, C^{\mathrm{R}}_i)\}_{i=1}^{N}$, where the term $C^{\mathrm{R}}_i$ represents the residual error.
Inference. For a comprehensive understanding, Algorithm 2 provides a thorough depiction of the inference phase for the bi-fidelity framework, built upon the physics-guided deep operator components.
Algorithm 2: A Physics-Guided Bi-Fidelity Fourier-Featured Deep Operator Framework for Predicting the Lift and Drag Coefficients.
1. Require: The trained physics-guided low-fidelity deep operator network, the trained physics-guided residual deep operator network, the undisturbed free-stream velocity of the fluid in the channel $U$, and a given time partition $\{t_1, \dots, t_q\}$.
2. Step 1: Use the trained physics-guided low-fidelity deep operator network to predict the low-fidelity solution $\hat{C}^{\mathrm{LF}}(t_k)$ on $\{t_1, \dots, t_q\}$ for the given velocity $U$, where $k = 1, \dots, q$.
3. Step 2: Use the trained physics-guided residual deep operator network to predict the errors $\hat{C}^{\mathrm{R}}(t_k)$ between the high-fidelity solution and the predicted low-fidelity solution on $\{t_1, \dots, t_q\}$ for the given velocity $U$.
4. Return: The predicted high-fidelity solution on $\{t_1, \dots, t_q\}$, i.e., $\hat{C}^{\mathrm{HF}}(t_k) = \hat{C}^{\mathrm{LF}}(t_k) + \hat{C}^{\mathrm{R}}(t_k)$, where $k = 1, \dots, q$.
2.2.6. Fourier Features Network
To design each DeepONet, we employ a feed-forward neural network for the branch network. However, traditional feed-forward networks have limitations in capturing high-frequency oscillatory patterns, making it difficult to encode the target output solutions, specifically the oscillatory time trajectory of the drag and lift coefficients. To overcome this limitation, we develop the Fourier-featured DeepONet by adopting the Fourier features network as the trunk network within the DeepONets’ architecture. This network is responsible for encoding the output and can effectively encapsulate the target solution. It is selected for its exceptional ability to capture periodic patterns and oscillatory behaviors, transcending the capabilities of conventional feed-forward layers.
The Fourier features network, as shown in Figure 4A, relies on a random Fourier mapping expressed as $\gamma(\mathbf{v}) = \left[\cos(2\pi \mathbf{B}\mathbf{v}),\, \sin(2\pi \mathbf{B}\mathbf{v})\right]^{\top}$ [39]. The matrix $\mathbf{B}$ in $\gamma$ contains values sampled from a Gaussian distribution $\mathcal{N}(0, \sigma^2)$. By integrating the random Fourier mapping with a traditional neural network, the Fourier features network effectively boosts learning capabilities by simplifying high-dimensional data. This method is distinctive in its ability to mitigate spectral bias, a common issue that hinders the learning of high-frequency data components. As a result, it significantly enhances performance across various tasks, making it an ideal choice for the trunk network in the DeepONet model used for predicting oscillatory time trajectories of the lift and drag coefficients.
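In practice, the mapping $\gamma$ can be implemented as a fixed random projection followed by sine and cosine activations, prepended to the trunk net. The sketch below makes assumptions about the feature count and the scale $\sigma$, which are placeholders rather than the paper’s settings.

```python
import torch
import torch.nn as nn

class FourierFeatures(nn.Module):
    """Random Fourier mapping gamma(v) = [cos(2*pi*B v), sin(2*pi*B v)],
    with the entries of B drawn from N(0, sigma^2) and kept fixed."""

    def __init__(self, in_dim: int = 1, n_features: int = 64,
                 sigma: float = 5.0):
        super().__init__()
        # Fixed (non-trainable) Gaussian projection matrix B.
        self.register_buffer("B", sigma * torch.randn(in_dim, n_features))

    def forward(self, v: torch.Tensor) -> torch.Tensor:
        proj = 2 * torch.pi * (v @ self.B)    # shape (N, n_features)
        return torch.cat([torch.cos(proj), torch.sin(proj)], dim=-1)

# Usage: prepend to the trunk so that time t is lifted to oscillatory features.
trunk = nn.Sequential(FourierFeatures(1, 64), nn.Linear(128, 128),
                      nn.Tanh(), nn.Linear(128, 64))
```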
3. Numerical Model Results
This section presents the training and testing procedures used in the proposed physics-guided bi-fidelity Fourier-featured deep operator learning framework to predict the drag and lift coefficients of a cylinder.
3.1. Training and Testing Datasets
To train and evaluate our physics-guided bi-fidelity Fourier-featured deep operator learning framework, two separate datasets are needed: one for the low-fidelity data and another for the high-fidelity data. The low-fidelity dataset is obtained by simulating with a coarse mesh, while the high-fidelity dataset is generated using a fine mesh. The fine mesh of the high-fidelity dataset produces more accurate results but comes with a significantly higher computational cost. Within each dataset, two-dimensional numerical simulations were conducted for each sampled value $U_i$ of the undisturbed free-stream velocity within the channel. To obtain the training targets, we monitored and recorded the lift and drag coefficients of the cylinder over time, collecting the values of these coefficients within the specified time domain. Another key difference between these datasets, apart from the mesh resolution, is their size: the low-fidelity dataset is much larger than its high-fidelity counterpart, primarily due to its cost-effectiveness and affordability.
In this paper, we utilized 150 sets of simulation data to train the low-fidelity DeepONet model and 50 sets to train the residual DeepONet. For testing purposes, we allocated 10% of the data, while the remaining 90% were used for training. To ensure independence between the splits, we partitioned the input functions into separate training and test samples via simple random selection. This approach fosters the development of a more generalized model, enhancing its applicability under diverse operating conditions.
3.2. Neural Networks and Training Protocols
To ensure the model’s effectiveness and establish an efficient training process, we conducted routine hyper-parameter optimization. This optimization aimed to identify the optimal architecture and suitable settings for the training process; the results are presented in Table 1. For the training of all DeepONets discussed in this paper, both physics-guided and data-driven, we employed the settings detailed in Table 1, along with the Adam optimizer coupled with a “reduce-on-plateau” learning rate scheduler.
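For concreteness, a minimal PyTorch sketch of this training protocol follows; the stand-in model, learning rate, and scheduler settings are placeholders rather than the values reported in Table 1.

```python
import torch

model = torch.nn.Sequential(              # stand-in for a DeepONet model
    torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=100)

inputs = torch.randn(256, 2)              # placeholder (U, t) samples
targets = torch.randn(256, 1)             # placeholder coefficient targets
for epoch in range(1000):
    optimizer.zero_grad()
    loss = torch.mean((model(inputs) - targets) ** 2)
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())           # lower the LR when the loss plateaus
```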
3.3. Low-Fidelity Deep Operator Learning Results
For low-fidelity learning, we designed both physics-guided and data-driven low-fidelity deep operator networks and assessed their performance. After training each operator network on the low-fidelity training dataset, we evaluated its effectiveness using the low-fidelity test dataset. This test dataset constitutes 10 percent of the total low-fidelity data and was excluded from the training phase. It includes trajectories spanning the time steps within the output time domain $[t_0, t_f]$.
To visualize the performance of the proposed low-fidelity models, we randomly selected an undisturbed free-stream velocity $U$ from the low-fidelity test dataset and predicted the drag and lift coefficients using both the physics-guided and data-driven deep operator networks. Figure 5 and Figure 6 depict these coefficients for the chosen velocity. The displayed results underscore the exceptional predictive prowess of the low-fidelity DeepONet models; their predictions align closely with the true values derived from CFD simulations using a coarse mesh.
3.4. Bi-Fidelity Fourier-Featured Deep Operator Learning Results
In our proposed bi-fidelity learning framework, both the low-fidelity and residual models can be either a physics-guided or data-driven Fourier-featured DeepONet. To rigorously assess the effectiveness of this bi-fidelity structure and the performance of the introduced physics-guided Fourier-featured DeepONet, we constructed three distinct bi-fidelity configurations:
First Configuration: Both the low-fidelity and the residual models are physics-guided Fourier-featured DeepONets.
Second Configuration: The low-fidelity model is a data-driven Fourier-featured DeepONet, while the residual model is a physics-guided Fourier-featured DeepONet.
Third Configuration: Both the low-fidelity and residual models are data-driven Fourier-featured DeepONets.
Through these diverse configurations, we aim to provide a comprehensive analysis of the data-driven and physics-guided approaches within our bi-fidelity learning framework.
This section focuses on the assessment of these frameworks through their predictions on the high-fidelity test dataset. For this purpose, after training both the low-fidelity and residual DeepONets, we employ the methodology outlined in Algorithms 1 and 2 to approximate the high-fidelity solution for each framework. These solutions correspond with the respective bi-fidelity frameworks’ predictions on the high-fidelity test dataset.
To demonstrate the performance of our proposed bi-fidelity learning framework, we randomly selected an undisturbed free-stream velocity $U$ from the high-fidelity test dataset. Utilizing this velocity, we predicted the time trajectory of both the drag and lift coefficients by employing each of the three previously outlined bi-fidelity learning framework configurations. Figure 7, Figure 8 and Figure 9 present the drag and lift coefficients’ time trajectories for the chosen velocity, corresponding to predictions obtained from the first, second, and third configurations, respectively. As observed in these figures, the fine solution for both the drag and lift coefficients aligns closely with the bi-fidelity Fourier-featured deep operator learning frameworks’ predictions, indicative of a strong match. Such observations imply not only the superior capability of the residual deep operator network in discerning the residuals between coarse predictions and fine solutions but also underscore the outstanding performance of our proposed bi-fidelity learning framework.
3.4.1. Comparative Performance Analysis of Bi-Fidelity Learning Frameworks
To comprehensively evaluate the generalization performance of the proposed bi-fidelity learning frameworks, Table 4 and Table 5 show the $L_1$- and $L_2$-relative errors corresponding to each framework’s predictions of the fine solution. These statistics underscore that all three bi-fidelity configurations achieve small relative errors when predicting the fine solution for the target coefficients. Interestingly, the results indicate a superior predictive capability for the physics-guided approach, attributable to its leveraging of the system’s physical equations. Among the configurations, the first, which employs a physics-guided Fourier-featured DeepONet for both the low-fidelity and residual models, yields the lowest error. Furthermore, the second configuration, utilizing the physics-guided Fourier-featured DeepONet for the residual model and the data-driven DeepONet for the low-fidelity model, exhibits lower errors than the third configuration, which relies on the data-driven Fourier-featured DeepONet for both models. This clearly showcases the efficacy of the proposed physics-guided deep operator network for both the low-fidelity and residual models in the bi-fidelity framework.
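For reference, the relative error metrics can be computed as follows; this is the standard definition, stated here as an assumption rather than quoted from the paper.

```python
import torch

def l1_relative_error(pred: torch.Tensor, true: torch.Tensor) -> torch.Tensor:
    """L1-relative error between predicted and reference trajectories."""
    return torch.sum(torch.abs(pred - true)) / torch.sum(torch.abs(true))

def l2_relative_error(pred: torch.Tensor, true: torch.Tensor) -> torch.Tensor:
    """L2-relative error between predicted and reference trajectories."""
    return torch.linalg.norm(pred - true) / torch.linalg.norm(true)
```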
3.4.2. Comparison of Physics-Guided Bi-Fidelity Deep Operator Learning Framework: With vs. without Fourier Feature Trunk Network
As detailed in Section 2.2.6, we employed Fourier feature networks as the trunk net for both the data-driven and physics-guided DeepONets in our low-fidelity and residual models. Referring to Figure 10, it becomes evident that the bi-fidelity deep operator learning framework constructed using the vanilla DeepONet struggles to capture the oscillatory characteristics of the lift and drag time trajectories. In contrast, the proposed bi-fidelity Fourier-featured deep operator learning framework adeptly captures the fluctuations in these trajectories. This underlines the effectiveness and appropriateness of integrating the Fourier-featured trunk net with the DeepONet, especially for predicting the time evolution of lift and drag.
4. Discussion
The developed physics-guided bi-fidelity Fourier-featured deep operator learning framework has demonstrated outstanding performance, achieving low error rates for both the coarse and fine operator approximations. This indicates a highly accurate prediction of the time trajectories of the drag and lift coefficients. This achievement is particularly noteworthy considering that only 45 high-fidelity input trajectories were used in the training process. It highlights the high efficiency of the proposed framework, which effectively utilizes the available information, such as the inferred physics, especially in scenarios where obtaining high-fidelity data is expensive. These results clearly emphasize the potential of our framework for tackling complex computational challenges while optimizing resource utilization.
Based on our comprehensive comparative analysis, it is evident that the proposed physics-guided Fourier-featured DeepONet outperforms its data-driven counterpart, primarily due to its enhanced predictive power and efficiency in narrowing the solution space. This makes it an optimal tool within the bi-fidelity learning framework, particularly for the low-fidelity and residual models when predicting lift and drag coefficients. However, it is crucial to acknowledge several key distinctions between these two methodologies.
As observed in Table 4 and Table 5, incorporating the physics-guided DeepONet in the residual model has a more noticeable impact on enhancing the bi-fidelity learning framework’s performance compared to its use as the low-fidelity model. The primary reason for this is the differing complexity of the relationship between the input and the solution in low-fidelity and residual modeling. In the context of low-fidelity modeling, the relationship is relatively straightforward, with changes primarily in amplitude and without significant alterations in phase shift and frequency. Conversely, the residual model, which is tasked with predicting the residual solution, presents a more complex scenario: its behavior varies across all parameters, namely amplitude, frequency, and phase shift. Thus, employing a physics-guided DeepONet, which concentrates on narrowing the solution space along these dimensions, enhances the framework’s ability to accurately map the input to the output.
Physics-guided models are notable for their ability to incorporate underlying physics-based insights into their framework. This incorporation has the potential to generate more robust and accurate predictions. This feature is particularly advantageous when dealing with complex phenomena, such as predicting drag and lift coefficients across a wide range of input values. By integrating physics-based insights into the model, we can potentially improve the precision and reliability of predictions, even when dealing with a diverse and complex parameter space.
On the other hand, while data-driven models are undoubtedly powerful, they may overlook intricate dependencies if these subtleties are not adequately captured within the dataset. Physics-guided models address this potential limitation by leveraging known relationships and behaviors, thereby reducing the risk of overlooking such details.
Future Work
While recognizing the unique strengths and limitations of both data-driven and physics-guided approaches, our research establishes a foundation for future advancements in the bi-fidelity modeling of lift and drag coefficients. This paves the way for exploring other bi-fidelity learning methods, such as input augmentation [28]. This approach has the potential to improve the accuracy of predicting fine solutions, offering a promising avenue for more precise and reliable predictions in bi-fidelity learning.
Looking toward future research, a promising direction is to explore the integration of physics-informed neural networks (PINNs) and traditional low-order methods, such as proper orthogonal decomposition (POD), within our bi-fidelity learning framework. By applying these methods to both low-fidelity and residual learning, we plan to conduct a comparative analysis against our DeepONet-based bi-fidelity framework. This comparison is expected to yield a deeper understanding of the strengths and limitations inherent in both the DeepONet-based configurations and the traditional model-based approaches.
Additionally, an exciting future direction involves the implementation of the recently introduced POD-DeepONet model, which employs the POD modes of the training data as the trunk net [40]. Employing this model in both the low-fidelity and residual aspects of our framework could enhance predictive accuracy and efficiency, leading to a more robust and versatile framework for accurately predicting complex fluid dynamics phenomena.
DeepONets are proficient in mapping inputs to outputs, yet they encounter limitations due to their fixed output domain. This restriction affects their performance when dealing with data outside their trained output domain range. In response to this challenge, recent advancements have proposed the use of a non-autonomous DeepONet, designed to operate beyond this fixed domain and improve prediction capabilities [35]. Accordingly, another significant direction for future research involves replacing the standard DeepONet with the non-autonomous variant in both the low-fidelity and residual models of our framework. Implementing this change is expected to enhance the framework’s capacity to navigate various output domains, potentially increasing the predictive accuracy for the lift and drag coefficients. The adoption of a non-autonomous DeepONet could lead to the development of a more adaptable and robust bi-fidelity learning framework.
While neural networks, such as DeepONets, are adept at interpolation tasks, they encounter difficulties in extrapolation scenarios, as highlighted in [41]. Our current study does not explore situations where inputs extend beyond the established range of the training set. Nevertheless, [41] proposes several methods to address these extrapolation challenges. One approach is extrapolation via fine-tuning with sparse new observations, which involves fine-tuning the network using sparsely acquired new data. Another technique is extrapolation via multi-fidelity learning with sparse new observations, incorporating multi-fidelity learning strategies to integrate data of varying accuracy and detail. These methods hold the potential to significantly improve the performance of DeepONets in extrapolation tasks. Integrating them into our current research would substantially change and expand our paper’s scope. Therefore, we plan to apply these techniques in future work, building upon the foundational research that we have conducted thus far.
Furthermore, we aim to expand our research from bi-fidelity to multi-fidelity and Bayesian operator learning [28,42,43,44]. This expansion aims to optimize the use of data while mitigating the computational costs associated with predicting computationally expensive data. The ultimate goal is to enhance the model’s predictions by maximizing generalization and accuracy. As a result, our research not only contributes to the current understanding but also lays the foundation for future innovations in predicting the time trajectories of the lift and drag coefficients.
Finally, we plan to use novel Bayesian multi-fidelity operator learning frameworks to optimize complex dynamical systems [45] or predict network dynamical systems [46] in a distributed and federated [47] manner.