Article

Adaptive Multitask Neural Network for High-Fidelity Wake Flow Modeling of Wind Farms

by Dichang Zhang 1, Christian Santoni 2, Zexia Zhang 2, Dimitris Samaras 1 and Ali Khosronejad 2,*
1 Department of Computer Science, Stony Brook University, Stony Brook, NY 11794, USA
2 Department of Civil Engineering, Stony Brook University, Stony Brook, NY 11794, USA
* Author to whom correspondence should be addressed.
Energies 2025, 18(11), 2897; https://doi.org/10.3390/en18112897
Submission received: 29 April 2025 / Revised: 28 May 2025 / Accepted: 29 May 2025 / Published: 31 May 2025

Abstract

Wind turbine wake modeling is critical for the design and optimization of wind farms. Traditional methods often struggle with the trade-off between accuracy and computational cost. Recently, data-driven neural networks have emerged as a promising solution, offering both high fidelity and fast inference speeds. To advance this field, a novel machine learning model has been developed to predict wind farm mean flow fields through an adaptive multi-fidelity framework. This model extends transfer-learning-based high-dimensional multi-fidelity modeling to scenarios where varying fidelity levels correspond to distinct physical models, rather than merely differing grid resolutions. Built upon a U-Net architecture and incorporating a wind farm parameter encoder, our framework integrates high-fidelity large-eddy simulation (LES) data with a low-fidelity engineering wake model. By directly predicting time-averaged velocity fields from wind farm parameters, our approach eliminates the need for computationally expensive simulations during inference, achieving real-time performance ($1.32 \times 10^{-5}$ GPU hours per instance with negligible CPU workload). Comparisons against field-measured data demonstrate that the model accurately approximates high-fidelity LES predictions, even when trained with limited high-fidelity data. Furthermore, its end-to-end extensible design allows full differentiability and seamless integration of multiple fidelity levels, providing a versatile and scalable solution for various downstream tasks, including wind farm control co-design.

1. Introduction

Wind energy has become a crucial component of the global transition to renewable energy, driven by the demand for sustainable, low-carbon power generation [1]. However, wind farm performance is significantly affected by wake effects, where downstream turbines experience reduced wind speeds and increased turbulence due to the influence of upstream turbines [2]. To mitigate these effects, researchers have proposed deliberately adjusting the yaw of the turbine to redirect the wakes away from the downstream turbines [3,4,5,6,7,8,9]. Accurate prediction of turbine wakes is essential for minimizing power losses and maximizing wind farm efficiency.
High-fidelity wind turbine models typically solve the momentum equations with turbine loads represented as body forces [10,11,12,13,14,15,16,17,18,19,20]. One of the earliest and most widely used methods is the actuator disk model, which treats the rotor as a porous disk [10,11]. This approach was extended by modeling individual blades as lines [12], and later refined into the actuator surface model, which projects blade geometry onto a 2D surface for improved force distribution [18]. Further enhancements account for tower and nacelle effects using either the immersed boundary or actuator surface methods [18,21,22].
Although high-fidelity models accurately capture flow in wind turbines and farms, their computational cost often limits their use in layout optimization and control. To address this, simplified analytical wake models have been developed. Early efforts to model wakes for turbine placement optimization were carried out by Jensen [23]. This model assumes a uniform velocity profile behind the rotor that results in a top hat shape and a linearly expanding wake to account for energy entrainment. However, experimental measurements have shown that the top hat velocity profile is not realistic, and a Gaussian distribution provides a more accurate representation [24]. To overcome this limitation, Bastankhah and Porté-Agel [25] introduced an analytical model assuming a Gaussian velocity deficit in the wake. This model showed reasonable agreement with the experimental measurements of Chamorro and Porté-Agel [26] and high-fidelity simulations of Wu and Porté-Agel [16,27].
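The contrast between the two analytical wake shapes can be illustrated with a minimal sketch. The formulas below are the commonly quoted forms of the Jensen top-hat model and the Bastankhah and Porté-Agel Gaussian model; they are not given in this text, and the thrust coefficient `ct` and the wake growth rates `k` and `k_star` are illustrative assumptions rather than values used in this study.

```python
import math


def jensen_deficit(x_over_d, r_over_d, ct=0.8, k=0.075):
    """Top-hat deficit: uniform inside a linearly expanding wake, zero outside.

    x_over_d: downstream distance in rotor diameters.
    r_over_d: radial distance from the wake centreline in rotor diameters.
    ct, k: assumed thrust coefficient and wake expansion rate.
    """
    wake_radius = 0.5 + k * x_over_d  # wake radius / D grows linearly with x
    if abs(r_over_d) > wake_radius:
        return 0.0
    return (1.0 - math.sqrt(1.0 - ct)) / (1.0 + 2.0 * k * x_over_d) ** 2


def gaussian_deficit(x_over_d, r_over_d, ct=0.8, k_star=0.03):
    """Gaussian deficit; only meaningful in the far wake (roughly x/D >= 3 here)."""
    beta = 0.5 * (1.0 + math.sqrt(1.0 - ct)) / math.sqrt(1.0 - ct)
    sigma = k_star * x_over_d + 0.2 * math.sqrt(beta)  # wake width / D
    centreline = 1.0 - math.sqrt(1.0 - ct / (8.0 * sigma ** 2))
    return centreline * math.exp(-r_over_d ** 2 / (2.0 * sigma ** 2))


# Normalized velocity deficit five diameters downstream at three radial positions.
for r in (0.0, 0.5, 1.0):
    print(r, jensen_deficit(5.0, r), gaussian_deficit(5.0, r))
```

The top-hat profile is constant across the wake and drops abruptly to zero at its edge, whereas the Gaussian profile decays smoothly with radial distance.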
To capture the effects of yaw misalignment and wake steering, several extensions of the Gaussian wake model have been proposed. Bastankhah and Porté-Agel [28] modified the Gaussian wake model to account for the wake displacement due to yaw misalignment, which was shown to predict the wake redirection with fair accuracy for a single turbine. However, high-fidelity simulations have shown that yaw misalignment generates a counter-rotating vortex, leading to asymmetric wake deflection and secondary steering of downstream turbine wakes [6]. To account for the secondary steering, the curled wake model of Martínez-Tossas et al. [29,30] was introduced into the Gaussian wake model to compute the cross-wind velocity and obtain an effective yaw angle for the computation of the wake displacement [31].
While analytical wake models provide valuable insights, they often struggle to capture the complex flow dynamics of utility-scale wind farms. To address the limitations of conventional wake modeling, data-driven approaches can leverage high-fidelity numerical simulations and LiDAR measurements to improve wake characterization and enhance predictive accuracy. Renganathan et al. [32] developed a machine learning (ML) model trained on LiDAR measurements collected from a wind farm in Texas. Their framework employed an autoencoder convolutional neural network (ACNN) to compress high-dimensional wake flow fields into a compact latent space. The encoder was subsequently replaced by a multi-layer perceptron and a Gaussian process regressor to map input parameters directly to the latent representation and reconstruct the original measurements. This approach, referred to as the neural compression paradigm, requires abundant high-fidelity supervision—5000 labeled instances were used in their experiments.
To reduce the need for extensive high-fidelity data, Zhang et al. [33] and Santoni et al. [34] proposed methods that exploit multi-fidelity training and leverage the super-resolution capabilities of ACNNs. These models are trained to reconstruct time-averaged high-fidelity velocity fields from their low-fidelity counterparts. Specifically, Zhang et al. used a limited number of instantaneous high-fidelity snapshots as degraded input data, while Santoni et al. employed velocity fields generated by the Gauss–curl hybrid (GCH) model developed by King et al. [31] as low-fidelity inputs. Compared to the flow-to-latent mapping used in neural compression, the flow-to-flow mapping of the super-resolution paradigm is simpler and requires fewer high-fidelity examples to train.
However, super-resolution methods face two major limitations: (i) inference relies on the availability of low-fidelity simulations, which creates a bottleneck for real-time deployment; and (ii) training requires paired datasets with one-to-one correspondence between fidelity levels, which restricts data flexibility and limits applicability.
Recent works [35,36,37,38,39] have explored applying transfer learning to high-dimensional multi-fidelity modeling in other scientific domains with promising results. These approaches first train a neural network surrogate model to map input parameters to low-fidelity simulations using abundant training data, and then fine-tune it to map input parameters to high-fidelity simulations with scarce high-fidelity supervision. This study seeks to develop a novel framework for wind farm surrogate modeling based on transfer learning, aiming to replace the super-resolution paradigm and overcome the aforementioned limitations. To facilitate parameter-to-wind-flow prediction, an efficient encoding scheme is proposed to enable ACNN processing of physical parameters. While this encoding enables parameter-to-flow predictions, empirical studies (Section 3.4) show that vanilla transfer learning performs poorly in the context of wind farm mean flow modeling. It is hypothesized that this suboptimal performance arises from the heterogeneous spatial similarities between high- and low-fidelity simulations. As a solution, a more adaptive transfer learning approach is proposed—one that gradually transfers learned representations by initially focusing on regions with higher similarity and progressively adapting to regions with lower similarity. Empirical results confirm that the proposed method outperforms vanilla transfer learning and achieves an accuracy level sufficient to support downstream wind farm design and control tasks.
The developed ML model directly predicts the high-fidelity time-averaged velocity field from wind farm information, eliminating the need for simulations during inference and achieving real-time prediction. Furthermore, the model can be trained on any combination of data and can be extended to more than two levels of fidelity, without requiring correspondence between high-fidelity data and their degraded versions. This flexibility expands the range of potential training datasets.
The end-to-end network is fully differentiable, allowing for seamless integration into gradient-based control or optimization algorithms. This approach not only overcomes the computational bottlenecks of previous methods but also provides a scalable and versatile solution for time-averaged velocity prediction in complex scenarios. A comparison of training data requirements and inference characteristics between the proposed method and previous methods is provided in Table 1 and Table 2.
The main contributions of this paper are summarized as follows:
  • An adaptive transfer learning framework is proposed for high-dimensional multi-fidelity modeling, effectively integrating data from varying fidelity levels corresponding to distinct physical models.
  • A GPU-efficient scheme is designed to encode wind farm physical parameters into representations compatible with the ACNN, facilitating rapid data processing and real-time inference, thereby significantly reducing computational overhead.
  • By combining the adaptive transfer learning framework with the efficient encoding scheme, a surrogate model is developed for wind farm mean flow prediction. This model seamlessly integrates high-fidelity large eddy simulation (LES) data with low-fidelity engineering wake models, demonstrating effectiveness, generalizability, and extensibility.
The paper is organized as follows. Section 2 describes the proposed methodology, including efficient parameter encoding, the neural network architecture, and the adaptive transfer learning framework. Section 3 elaborates on the experimental setup, including data simulations, dataset creation, training specifics, and the obtained results. Finally, Section 4 summarizes the findings and discusses potential future research directions inspired by this work.

2. Methodology

The proposed framework incorporates a novel neural network architecture and an improved training procedure. The network architecture is based on a U-Net backbone [40], augmented with an efficient mechanism for encoding wind farm information at the input stage and two linear layers at the output for multi-task prediction. The training procedure enhances the standard transfer learning paradigm [41] and is specifically tailored to address the challenges of high-dimensional transfer scenarios.

2.1. Efficient Encoding of Wind Farm

Previous studies have demonstrated the effectiveness of ACNNs for processing fluid velocity fields. However, these architectures are primarily suited for high-dimensional mappings, such as flow-field-to-flow-field transformations, and cannot directly process physical wind farm parameters, which are represented as a set of the form $\{(X_i, Y_i, Z_i, \gamma_i)\}_{i=1}^{N}$, where $(X_i, Y_i, Z_i)$ and $\gamma_i$ denote the rotor center location and yaw angle of the $i$-th turbine in a wind farm with a total of $N$ turbines. To address this limitation, a GPU-efficient encoding scheme that maps wind farm parameters to a high-dimensional space is proposed, enabling ACNN processing. This approach separates inputs into two groups, one representing wind conditions and the other describing the wind farm configuration, which are processed independently.
To approximate the free-stream flow field, the wind condition parameters are used in conjunction with the law of the wall. Specifically, the freestream velocity at height $y$ is $U_y = U_{\text{hub}} \left( y / y_{\text{hub}} \right)^{\alpha}$, where $y_{\text{hub}}$, $U_{\text{hub}}$, and $\alpha$ are the hub height, the hub-height velocity, and the wind shear, respectively. Although this provides only a coarse approximation of wind farm flow without turbines, it is computationally efficient on GPUs and is a robust representation for ACNN-based learning.
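As a minimal sketch of this wind-condition encoding, the profile above can be evaluated on a vector of grid heights and broadcast over the streamwise and spanwise axes. NumPy is used here in place of the PyTorch tensors mentioned in the text; the function names, the grid shape, and the shear value of 0.14 are illustrative assumptions.

```python
import numpy as np


def freestream_profile(y, u_hub, y_hub, alpha):
    """Freestream velocity U_y = U_hub * (y / y_hub) ** alpha at height(s) y."""
    y = np.asarray(y, dtype=float)
    return u_hub * (y / y_hub) ** alpha


def freestream_field(shape, heights, u_hub, y_hub, alpha):
    """Broadcast the 1D profile to a (nx, ny, nz) grid; axis 1 is height."""
    nx, ny, nz = shape
    profile = freestream_profile(heights, u_hub, y_hub, alpha)  # shape (ny,)
    return np.broadcast_to(profile[None, :, None], (nx, ny, nz))


# Hypothetical grid: 50 heights between 2 m and 100 m, hub height of 32.1 m.
heights = np.linspace(2.0, 100.0, 50)
field = freestream_field((8, 50, 6), heights, u_hub=7.5, y_hub=32.1, alpha=0.14)
```

Because the field depends on height only, broadcasting avoids any loop over the horizontal directions, which is consistent with the stated goal of a cheap, parallelizable encoding.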
The wind turbine placement is encoded by representing each turbine as a 3D Gaussian-shaped distribution in the wind farm. Each Gaussian has its mean at the turbine center and variance determined by the yaw angle.
The encoded value at position $(x, y, z)$ of the entire simulation domain for a set of $N$ turbines located at $\{(X_i, Y_i, Z_i)\}_{i=1}^{N}$ with yaw angles $\{\gamma_i\}_{i=1}^{N}$ is given by:
$$v_{\text{encoded}}(x, y, z) = \sum_{i=1}^{N} \exp\left( -\frac{1}{2} \left[ \frac{\left( (x - X_i)\cos\gamma_i + (z - Z_i)\sin\gamma_i \right)^2}{A^2} + \frac{(y - Y_i)^2}{B^2} + \frac{\left( (z - Z_i)\cos\gamma_i - (x - X_i)\sin\gamma_i \right)^2}{C^2} \right] \right) \tag{1}$$
where $x$, $y$, and $z$ are the positions along the free-stream flow direction, height, and spanwise direction, respectively. $A$, $B$, and $C$ are selected based on the characteristic scale of each dimension, and $\gamma_i$ is the rotor yaw angle of turbine $i$. As the mean represents the center of the turbine, the covariance of the encoding effectively rotates the Gaussian according to the yaw direction of the turbine, allowing the representation to be both direction-aware and spatially continuous. It is worth noting that this representation is not a probability density function, as the total sum equals the number of turbines and thus depends on the configuration, rather than being normalized over space. Figure 1c shows the encoded representation of a three-turbine system at hub height, and the zoom-in view highlights a turbine yaw of $\gamma = 20^\circ$. Working with object locations and poses as Gaussian distributions is a common technique in computer vision and facilitates neural network learning. Additionally, this representation aligns with the velocity deficit observed in wind turbine wakes, as demonstrated by experimental and field measurements [24,26,42]. This alignment helps simplify the learning process.
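The turbine encoding can be sketched as a sum of yaw-rotated Gaussians evaluated on coordinate arrays. The snippet is a NumPy illustration under stated assumptions: the function name `encode_turbines`, the grid, the scales `a`, `b`, `c`, and the turbine positions are hypothetical, and yaw angles are taken in radians.

```python
import numpy as np


def encode_turbines(x, y, z, turbines, a, b, c):
    """Sum of yaw-rotated Gaussians, one per turbine.

    x, y, z: broadcastable coordinate arrays (streamwise, height, spanwise).
    turbines: iterable of (X, Y, Z, gamma) with gamma in radians.
    a, b, c: characteristic scales of the three rotated directions.
    """
    field = np.zeros(np.broadcast(x, y, z).shape)
    for X, Y, Z, gamma in turbines:
        dx, dy, dz = x - X, y - Y, z - Z
        # Rotate the horizontal offsets by the yaw angle.
        along = dx * np.cos(gamma) + dz * np.sin(gamma)
        across = dz * np.cos(gamma) - dx * np.sin(gamma)
        field += np.exp(-0.5 * (along**2 / a**2 + dy**2 / b**2 + across**2 / c**2))
    return field


# Hypothetical three-turbine layout on a small grid (lengths in rotor diameters).
xs, ys, zs = np.meshgrid(
    np.linspace(-5.0, 5.0, 21), np.linspace(0.0, 3.0, 7), np.linspace(-5.0, 5.0, 21),
    indexing="ij",
)
layout = [(-3.0, 1.2, 0.0, np.deg2rad(20.0)), (0.0, 1.2, 1.0, 0.0), (3.0, 1.2, -1.0, 0.0)]
encoded = encode_turbines(xs, ys, zs, layout, a=0.25, b=0.5, c=0.5)
```

Each turbine contributes a value of one at its own center, so the field is not normalized over space, in line with the remark above that it is not a probability density.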
Both encoding processes are computationally efficient and parallelizable on GPUs. They involve only basic arithmetic operations, which are highly optimized in modern deep learning frameworks like PyTorch (v2.0.1) [43]. Specifically, they are implemented using standard PyTorch (v2.0.1) tensor operations—including element-wise addition, subtraction, exponentiation, and power transformation—which are GPU accelerated and fully integrated into the computational graph. The only data transfer between CPU and GPU consists of copying the physical input parameters, which is minimal in size. Once these parameters are on the GPU, all subsequent encoding computations and neural network operations are executed entirely within the GPU memory without requiring additional CPU–GPU communication.
The computational graph in this context refers to the directed graph of tensor operations that defines the complete data flow, from the raw input parameters through the proposed encoding mechanism to the neural network’s outputs and ultimately to the loss function. This graph is automatically constructed by PyTorch during the forward pass, with each tensor operation registered as a node. Because the entire process, including the encoding, is built from differentiable operations, the backward pass can compute gradients across the whole graph using automatic differentiation. While automatic differentiation facilitates neural network training, it also provides direct gradient calculation with respect to wind farm layouts during inference and can be leveraged for gradient-based wind farm design and control.
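To illustrate why gradients with respect to the layout are available, the sketch below differentiates the single-turbine encoding with respect to the turbine's streamwise position by hand and compares the result with a central finite difference. This is a stand-in for automatic differentiation, not the PyTorch mechanism itself; all function names and numerical values are hypothetical.

```python
import math


def _rotated_offsets(x, z, X, Z, gamma):
    """Horizontal offsets from the turbine centre, rotated by the yaw angle."""
    dx, dz = x - X, z - Z
    along = dx * math.cos(gamma) + dz * math.sin(gamma)
    across = dz * math.cos(gamma) - dx * math.sin(gamma)
    return along, across


def encoded_value(x, y, z, X, Y, Z, gamma, a, b, c):
    """Encoding of one turbine evaluated at a single point."""
    along, across = _rotated_offsets(x, z, X, Z, gamma)
    q = along**2 / a**2 + (y - Y) ** 2 / b**2 + across**2 / c**2
    return math.exp(-0.5 * q)


def d_encoded_dX(x, y, z, X, Y, Z, gamma, a, b, c):
    """Analytic derivative of the encoded value with respect to X."""
    along, across = _rotated_offsets(x, z, X, Z, gamma)
    value = encoded_value(x, y, z, X, Y, Z, gamma, a, b, c)
    return value * (along * math.cos(gamma) / a**2 - across * math.sin(gamma) / c**2)


# Compare the analytic derivative with a central finite difference.
point = dict(x=1.3, y=0.4, z=-0.7, Y=0.0, Z=0.2, gamma=math.radians(20.0), a=1.0, b=1.0, c=2.0)
step = 1e-6
finite_diff = (encoded_value(X=0.5 + step, **point) - encoded_value(X=0.5 - step, **point)) / (2 * step)
analytic = d_encoded_dX(X=0.5, **point)
```

In an automatic differentiation framework the same derivative would be obtained without the hand-written formula, and it would propagate through the network to any scalar objective defined on the predicted flow field.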

2.2. Model Architecture

Figure 1 illustrates the general framework of the proposed machine learning model. After encoding the wind farm parameters into an ACNN-processable representation, they are concatenated and passed through a U-Net backbone [40]. Figure 1a shows the transformation of the wind farm representation within the U-Net architecture.
U-Net follows an encoder-decoder design with skip connections that directly transfer feature maps from corresponding layers of the encoder to the decoder. This structure helps preserve spatial information essential for capturing small-scale fluid features. The encoder extracts high-level features from input data through a series of convolutional layers, each followed by ReLU activations and max-pooling operations. These layers progressively reduce spatial dimensions while increasing feature depth, enabling the model to learn abstract representations of fluid flow patterns. The bottleneck connects the encoder and decoder, offering the deepest layer where the model consolidates global and local features. This stage captures complex interactions. The decoder gradually reconstructs the spatial resolution using upsampling layers. Skip connections from the encoder reintroduce high-resolution features, allowing the model to reconstruct detailed flow features that might otherwise be lost due to downsampling.
Finally, the U-Net output is processed by two linear prediction heads (equivalent to two 1 × 1 convolutional layers). The outputs from these two prediction heads are trained under the supervision of different fidelity datasets, as specified by the multitask loss in Equation (2).
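The equivalence between a linear prediction head and a 1 × 1 convolution can be sketched as a per-voxel matrix product applied to shared backbone features. The feature shape, the number of output channels, and the random weights below are illustrative assumptions, and NumPy stands in for the deep learning framework.

```python
import numpy as np


def linear_head(features, weight, bias):
    """Apply the same linear map at every voxel (a 1x1 convolution).

    features: (C, D, H, W) backbone output.
    weight: (out, C), bias: (out,). Returns an array of shape (out, D, H, W).
    """
    return np.einsum("oc,cdhw->odhw", weight, features) + bias[:, None, None, None]


rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 5, 6))  # hypothetical shared U-Net features

# Two independent heads read the same features: one per fidelity level.
w_low, b_low = rng.normal(size=(3, 8)), rng.normal(size=3)
w_high, b_high = rng.normal(size=(3, 8)), rng.normal(size=3)
y_low = linear_head(features, w_low, b_low)
y_high = linear_head(features, w_high, b_high)
```

Since both heads act on the same features, everything the two fidelity levels share is learned in the backbone, while the heads only account for the final per-voxel linear mapping.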

2.3. Adaptive Transfer Learning for High-Dimensional Multi-Fidelity Modeling

In this section, a preliminary discussion of few-shot learning is given to contextualize the ML problem setting. The algorithm section then provides a detailed description of the method.

2.3.1. Preliminary

Multi-fidelity surrogate modeling aims to approximate complex systems governed by partial differential equations by leveraging a combination of a few high-fidelity data points and abundant low-fidelity data points. Traditional approaches such as multi-fidelity Kriging (MFK) [44,45] integrate simulations of different fidelity levels using an auto-regressive Gaussian process framework. While MFK has been widely used for low-dimensional problems, it does not scale well for high-dimensional settings due to the curse of dimensionality.
Few-shot transfer learning aims to adapt a model pre-trained on large-scale datasets for a downstream task with a limited amount of data. Such adaptation is often realized by fine-tuning. Intuitively, the fine-tuned model will have better performance if the pre-trained features are more related to the downstream task. Zhou et al. [46] provide an upper bound on the test error of fine-tuning an empirical risk minimizer (ERM), which depends on the $L_2$-distance between pre-trained model weights and fine-tuned model weights. Hu et al. [47,48] explicitly define a model-agnostic method to calculate “task distance” as a measurement for task similarity in classification problems. Zamir et al. [49] and later works [50,51,52] consider heterogeneous similarities among pre-training tasks. They divide pre-training tasks into different subsets based on task similarity and choose the best subset of pre-training sub-tasks for different downstream tasks. These methods focus on building a better pre-trained model by choosing more related pre-training sub-tasks and abandoning less related ones. They consider the downstream task as a whole and compare it to different subsets of pre-training tasks. In contrast, the method proposed in this work considers the similarity of the physics underlying the high- and low-fidelity simulations at different regions of the simulation for the same task and proposes a better fine-tuning strategy.
Multi-fidelity surrogate modeling through transfer learning pre-trains a deep neural network on low-fidelity data and fine-tunes the network on high-fidelity data. ACNNs have shown great ability as data-driven surrogate models. The computational cost of collecting enough training data for neural-network training urges us to adapt these data-driven models to multi-fidelity surrogate modeling. Transfer learning is a natural approach and has been studied in different problems [35,36,37,38,39]. Unlike passive learning, multi-fidelity active learning [53,54,55] tries to balance between information gain and computational cost and actively decides the fidelity level of the next data to acquire. However, all of these works treat “multi-fidelity” as “multi-resolution” for experiments with high-dimensional outputs and collect data from the same algorithms with different resolutions of grids.
In the context of wind farm mean flow modeling, low-fidelity simulations are derived from simplified models compared to high-fidelity simulations, neglecting or approximating difficult-to-compute physics. As a result, they offer significant speed improvements beyond merely using coarser grids.
A key challenge arises because the wind farm flow field comprises distinct regions—such as free stream interacting with turbines, newly formed wakes, far wakes, and wakes interacting with downstream turbines—and each of these regions is affected differently by the physics neglected in low-fidelity models. While previous studies have successfully applied vanilla transfer learning in multi-resolution settings, empirical results (Section 3.4) confirm that standard transfer learning performs poorly under these conditions.
The hypothesis is that the relationship between the pre-trained features and the high-fidelity data varies across these different regions of the flow field. In regions where critical physical processes are missing or heavily approximated in the low-fidelity simulations, the pre-trained features are poorly aligned with the true high-fidelity behavior, and fine-tuning must effectively relearn the correct features. However, in regions where the preserved physics in low-fidelity simulations still closely match the high-fidelity data, the pre-trained features remain useful for accurate prediction. Unfortunately, indiscriminate fine-tuning across all regions risks corrupting these useful representations, ultimately degrading performance in areas where transfer learning should offer benefits.

2.3.2. Algorithm

To address the issues mentioned above, an adaptive transfer learning method based on a multi-task network is proposed. The key idea is to gradually adjust model flexibility by increasing the weight on high-fidelity outputs and reducing the weight on low-fidelity outputs during training. This progressive fine-tuning allows the model to first fit highly related regions before adapting to less related ones.
Additionally, patch-level pseudo-high-fidelity data [56] are selected to stabilize training. These patch-level pseudo-data are generated during the gradual relaxation of constraints, ensuring that when the model gains flexibility, the uncertainty in well-fitted regions does not increase. This method effectively mitigates the corruption of pre-trained features, leading to more accurate and robust predictions across diverse multi-fidelity scenarios.
The model constraint and pseudo-data selection are implemented as described in Algorithm 1. Let $S_L = \{(x_i, y_i^L)\}$ be the low-fidelity dataset, $S_H = \{(x_j, y_j^H)\}$ be the high-fidelity dataset, $f_\theta$ be some neural network backbone, and $M_L$ and $M_H$ be two linear layers, so that $\hat{y}_i^L = M_L f_\theta(x_i)$ are the low-fidelity predictions and $\hat{y}_j^H = M_H f_\theta(x_j)$ are the high-fidelity predictions. The network is trained to minimize a multi-task loss:
$$\mathcal{L}_\alpha = \frac{\alpha}{|S_L|} \sum_i \left\| \hat{y}_i^L - y_i^L \right\|^2 + \frac{1 - \alpha}{|S_H|} \sum_j \left\| \hat{y}_j^H - y_j^H \right\|^2, \tag{2}$$
where $\alpha$ is a constraint control parameter. When $\alpha = 1$, the model matches low-fidelity simulations, and when $\alpha = 0$, it matches high-fidelity ones. Using only $\alpha = 1, 0$ without pseudo-labeling corresponds to vanilla transfer learning. As $\alpha \to 1$, the model minimizes $\| \hat{y}^H - y^H \|^2$ subject to $\hat{y}^H = A_L A_H^{\dagger} y^L$, where $A_H^{\dagger}$ is the pseudo-inverse of $A_H$, enforcing high-fidelity predictions as linear transformations of low-fidelity ones. For intermediate $\alpha$, the multi-task loss constrains high-fidelity predictions roughly within a subspace spanned by a linear transformation of low-fidelity data. In the proposed method, $\alpha$ is gradually decreased from 1 to 0, training the model to convergence at each step.
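The multi-task loss above can be sketched directly from its definition: each term is the mean over a dataset of the squared norm of the prediction error, and the two terms are blended by the constraint control parameter. The array shapes and values below are illustrative assumptions.

```python
import numpy as np


def multitask_loss(pred_low, true_low, pred_high, true_high, alpha):
    """alpha * mean_i ||err_i^L||^2 + (1 - alpha) * mean_j ||err_j^H||^2.

    Every array has a leading sample axis; the remaining axes are flattened.
    """
    def mean_squared_norm(pred, true):
        diff = (pred - true).reshape(len(pred), -1)
        return np.sum(diff**2, axis=1).mean()

    return alpha * mean_squared_norm(pred_low, true_low) + (1.0 - alpha) * mean_squared_norm(
        pred_high, true_high
    )


# Toy data: two low-fidelity samples and one high-fidelity sample.
pred_low, true_low = np.zeros((2, 3)), np.ones((2, 3))
pred_high, true_high = np.zeros((1, 3)), np.full((1, 3), 2.0)

for alpha in (1.0, 0.5, 0.0):  # the schedule decreases alpha from 1 to 0
    print(alpha, multitask_loss(pred_low, true_low, pred_high, true_high, alpha))
```

At the start of the schedule only the low-fidelity term is active, at the end only the high-fidelity term, and intermediate values trade one against the other.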
To select patch-level pseudo-data, high-dimensional outputs are partitioned into patches. Let $V$ be the output space, $P(V) = \{v_k\}$ be some partition over $V$, and $n(v_k)$ be the relative volume of $v_k$ with respect to $V$. For some patch $v_k$ and some dissimilarity measurement $\mathcal{F}$, the patch-level dissimilarity is $\frac{1}{n(v_k)} \int_{v_k} \mathcal{F}(y^L, y^H)\, dv$. When $y$ for $x$ is not available, the estimate $\hat{y} = A f_\theta(x)$ is used.
As flexibility gradually increases, pseudo-data need to be collected at the patches where high- and low-fidelity simulations are less relevant. Therefore, an adaptive threshold $\frac{1}{(\delta + \alpha)\eta}$ is used, where $\alpha$ is the constraint control parameter and $\delta$ and $\eta$ are pre-selected threshold parameters. If the estimated dissimilarity is smaller than this threshold at patch $v$ of input $x$, $(x, M_H f_\theta(x), v)$ is added to the high-fidelity training set.
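The patch-level test can be sketched as follows. Two assumptions are made here and should be read as an interpretation rather than the authors' implementation: the adaptive threshold is read as 1 / ((δ + α) η), and on a uniform grid the volume-normalized integral is approximated by the mean over the voxels of the patch, with the pointwise dissimilarity taken as the Euclidean norm of the velocity difference.

```python
import numpy as np


def patch_dissimilarity(pred_high, low_fi, patch_mask):
    """Mean pointwise distance between two fields inside one patch.

    pred_high, low_fi: (C, D, H, W) fields; patch_mask: boolean (D, H, W).
    Assumes a uniform grid, so the patch average replaces the integral.
    """
    pointwise = np.linalg.norm(pred_high - low_fi, axis=0)
    return pointwise[patch_mask].mean()


def adaptive_threshold(alpha, delta, eta):
    """Assumed form 1 / ((delta + alpha) * eta); requires (delta + alpha) * eta > 0."""
    return 1.0 / ((delta + alpha) * eta)


# Toy fields that agree in the first half of the domain and differ in the second.
pred = np.zeros((3, 2, 2, 2))
low = np.zeros((3, 2, 2, 2))
low[0, 1] = 0.3
near = np.zeros((2, 2, 2), dtype=bool)
near[0] = True
far = ~near

threshold = adaptive_threshold(alpha=0.5, delta=0.0, eta=50.0)
accept_near = patch_dissimilarity(pred, low, near) < threshold
accept_far = patch_dissimilarity(pred, low, far) < threshold
```

Under this reading the threshold grows as α decreases, so patches with larger disagreement between fidelity levels become eligible later in the schedule.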
Algorithm 1 Adaptive Transfer Learning
  1: Input: low-fidelity dataset $S_L = \{(x_i, y_i^L)\}$, high-fidelity dataset $S_H = \{(x_j, y_j^H, V)\}$, neural network backbone $f_\theta$, linear layers $M_L$ and $M_H$, multi-task loss function $\mathcal{L}_\alpha$, dissimilarity measurement $\mathcal{F}$, output space partition $P(V)$, threshold parameters $\delta$, $\eta$
  2: for $\alpha = 1$ to $0$ do
  3:    Minimize $\mathcal{L}_\alpha$
  4:    if $\alpha \neq 1$ then
  5:       for each $(x, y^L) \in S_L$ do
  6:          for each $v \in P(V)$ do
  7:             if $\frac{1}{n(v)} \int_{v} \mathcal{F}(M_H f_\theta(x), y^L)\, dv < \frac{1}{(\delta + \alpha)\eta}$ then
  8:                Add $(x, M_H f_\theta(x), v)$ to $S_H$
  9:             end if
 10:          end for
 11:       end for
 12:    end if
 13: end for
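The control flow of Algorithm 1 can be sketched as a plain loop in which training, prediction, and the dissimilarity measure are supplied by the caller. All argument names are hypothetical, the threshold is read as 1 / ((δ + α) η), and a vanishing denominator is treated as an infinite threshold; both choices are assumptions of this sketch.

```python
import math


def adaptive_transfer(alphas, minimize, predict_high, dissimilarity,
                      low_fi_data, patches, delta, eta):
    """Skeleton of the adaptive schedule with patch-level pseudo-data selection.

    minimize(alpha, pseudo_data): trains the model on the loss for this alpha.
    predict_high(x): current high-fidelity head prediction for input x.
    dissimilarity(y_hat, y_low, patch): patch-level dissimilarity value.
    Returns the list of selected (x, prediction, patch) pseudo-data triples.
    """
    pseudo_data = []
    for alpha in alphas:  # e.g. (1.0, 0.5, 0.0)
        minimize(alpha, pseudo_data)
        if alpha == 1.0:
            continue  # no pseudo-data is selected at alpha = 1
        denominator = (delta + alpha) * eta
        threshold = 1.0 / denominator if denominator > 0 else math.inf
        for x, y_low in low_fi_data:
            y_hat = predict_high(x)
            for patch in patches:
                if dissimilarity(y_hat, y_low, patch) < threshold:
                    pseudo_data.append((x, y_hat, patch))
    return pseudo_data
```

Passing the training and prediction steps as callables keeps the schedule independent of any particular network or framework, which makes the selection logic easy to test in isolation.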
After this, patch-level pseudo-data are added to the training set. Not all data points in the high-fidelity dataset have ground truth over the entire output space $V$. The new high-fidelity training set is represented as $S_H = \{(x_j, y_j^H, v_j)\}$, where $v_j$ denotes the confident patches in $V$ for input $x_j$, and $v_j = V$ if the data point is from the original training set. The multi-task loss is modified as
$$\mathcal{L}_\alpha = \frac{\alpha}{|S_L|} \sum_i \left\| \hat{y}_i^L - y_i^L \right\|^2 + \frac{1 - \alpha}{\sum_j n(v_j)} \sum_j \left\| \left( \hat{y}_j^H - y_j^H \right) \mathbb{1}_{v_j} \right\|^2, \tag{3}$$
where $\mathbb{1}_{(\cdot)}$ represents the indicator function of confident patches.
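The high-fidelity term of the modified loss can be sketched with 0/1 masks playing the role of the indicator function. The sketch assumes a uniform grid, so that the relative volume of the confident region equals the mean of its mask; the function name and toy arrays are hypothetical.

```python
import numpy as np


def masked_high_fidelity_term(preds, targets, masks):
    """Sum of masked squared errors divided by the summed relative volumes.

    preds, targets, masks: lists of equally shaped arrays, one per sample.
    Each mask holds 1 where the target is trusted and 0 elsewhere.
    """
    total = sum(np.sum(((p - t) * m) ** 2) for p, t, m in zip(preds, targets, masks))
    relative_volume = sum(m.mean() for m in masks)  # uniform-grid assumption
    return total / relative_volume


# One original sample (trusted everywhere) and one pseudo-sample (trusted on half).
full = np.ones((2, 2))
half = np.array([[1.0, 1.0], [0.0, 0.0]])
preds = [np.zeros((2, 2)), np.zeros((2, 2))]
targets = [np.ones((2, 2)), np.full((2, 2), 2.0)]
value = masked_high_fidelity_term(preds, targets, [full, half])
```

Errors outside the confident patches are multiplied by zero and therefore contribute neither to the loss nor to its gradient.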

3. Experiments

To evaluate the effectiveness of the proposed model presented in the previous section, a series of experiments is conducted using LES and analytical models. The primary results highlight the model’s accuracy and generalization capabilities. In addition, an ablation study is performed to validate the effectiveness of the adaptive training procedure, while an extensibility study assesses the model’s ability to incorporate and leverage diverse types of training data.

3.1. Computational Details

3.1.1. Analytical Wake Model

The analytical wake model used for the training of the ML model is the Gauss–curl hybrid (GCH) model [31]. This model integrates the Gaussian wake model [25,28] with the curl model [29]. This allows for the computation of the downstream turbine effective yaw angle due to counter-rotating vortices induced by the turbine upstream. These spanwise velocity components from the curl model are incorporated to consider the added recovery due to yaw misalignment and secondary wake steering of a downstream turbine. A full description of the GCH model can be found in King et al. [31]. The velocity fields were computed using the GCH model as implemented in the Flow Redirection and Induction in Steady State package (FLORIS v3.4) and developed by the National Renewable Energy Laboratory (NREL) [57]. The input parameters were taken from the example input file provided by the FLORIS (v3.4) package. The turbulence intensity (TI) was set to 15%.

3.1.2. Large Eddy Simulations

The high-fidelity simulations of the wind farm were performed with the VFS-Wind LES solver. The governing equations of wind farm flow are given by the filtered Navier–Stokes equations. The sub-grid stresses are parameterized using the dynamic eddy viscosity model [58]. The central-difference scheme was used to discretize the momentum equations, and the Crank–Nicolson scheme to advance in time. The fractional step method was used to project the resulting non-solenoidal velocity field into a solenoidal space [59].
The actuator surface model (ASM), developed by Yang and Sotiropoulos [18], was used to parameterize the turbine rotor blades and nacelle. The ASM models the turbine by computing the lift and drag coefficient based on blade element theory [60] over a two-dimensional unstructured grid. The resulting body force is distributed over the flow grid using a smoothed four-point cosine function proposed by Yang et al. [61]. Moreover, the wind turbine angular velocity is obtained from the balance of the aerodynamic and generator torque [62]. A full description of the wind turbine control system can be found in Santoni et al. [63,64].
A series of large eddy simulations were conducted to generate high-fidelity data on the wake flow field at the SWiFT facility under different wind and yaw conditions. Located in Lubbock, Texas, the SWiFT facility features three Vestas V27 experimental turbines, each with a nameplate capacity of 225 kW at a rated wind speed of 14 m/s [65]. The turbines have a rotor diameter of $D = 27$ m and a hub height of 32.1 m. The computational domain consists of flat terrain with dimensions of $80D \times 24.9D \times 37D$ along the streamwise ($x$), spanwise ($z$), and vertical ($y$) directions, respectively, and $451 \times 143 \times 281$ computational grid points along the same directions. The resolution is $\Delta x / D = 0.16$ and $\Delta z / D = 0.08$ in the streamwise and spanwise directions, respectively. The grid has a uniform resolution of $\Delta y / D = 0.08$ up to a height of $y / D = 4.4$ along the vertical direction. Above this height, the grid is stretched to the top of the domain.
A free-slip boundary condition is applied at the top boundary, while periodic boundary conditions are enforced in the spanwise direction. The bottom boundary follows the logarithmic law of the wall to model surface effects. An inflow boundary condition is prescribed in the streamwise direction. A precursor simulation with periodic boundary conditions was first conducted to generate a fully developed neutral atmospheric boundary layer.
Four wind directions were considered: $150^\circ$ (northeast), $0^\circ$ (south), $330^\circ$ (southwest), and $274^\circ$ (west), each evaluated at five hub-height wind speeds, $U_{\text{hub}} = 4.8$, $6.1$, $7.5$, $8.8$, and $10.2$ m/s (see Figure 2). For wake steering cases, the computational domain was refined to $40.2D \times 12.4D \times 18.5D$ in the streamwise, spanwise, and wall-normal directions, respectively. This refinement resulted in a resolution of $\Delta x / D = 0.08$, $\Delta z / D = 0.04$, and $\Delta y / D = 0.04$ in the rotor region. Simulations for wake steering were conducted for the south wind direction at $U_{\text{hub}} = 5.4$, $7.0$, $8.5$, $10.1$, and $11.6$ m/s. For each wind speed, fixed yaw angles were imposed on the most upstream turbine ($T_1$) with $\gamma_{T_1} = 0^\circ$, $\pm 10^\circ$, $\pm 15^\circ$, and $\pm 20^\circ$. A grid sensitivity analysis for the actuator surface model was performed by Santoni et al. [66].

3.2. Training

The neural network is trained using all GCH simulations along with a sparsely sampled subset of LES simulations. In each experiment, only three high-fidelity LES cases are included in the training set, while the remaining LES cases are used as test data to evaluate the model’s performance.
For the wind direction experiment, the training set contains the LES of the following cases:
  • U hub = 4.8 m/s and southwest wind direction;
  • U hub = 7.5 m/s and south wind direction;
  • U hub = 10.2 m/s and west wind direction.
For the wake steering experiment, the training set contains the LES of the following cases:
  • U hub = 5.4 m/s and yaw angle γ T 1 = 20 ;
  • U hub = 8.5 m/s and yaw angle γ T 1 = 0 ;
  • U hub = 11.6 m/s and yaw angle γ T 1 = 15 .
The proposed algorithm is configured with parameter values $\alpha = 1, 0.5, 0$; $\delta = 0$; and $\eta = 50$. The values of $\delta$ and $\eta$ were selected based on empirical observations of the error distribution between the high-fidelity and low-fidelity flow fields in the training set. The $L_2$-norm is used as the dissimilarity measurement. For each value of $\alpha$, the neural network is trained for 100,000 iterations using the Adam optimizer [67] with a learning rate of $10^{-4}$. All training instances are utilized in each iteration.
Previous work on semi-supervised learning of high-dimensional outputs [68] uses patch-level pseudo-labels. The present method follows this convention and uses patch-level pseudo-data. Rather than evenly dividing the parameter space into patches, finer patches are defined around wind turbines. For each turbine, three patches are defined, each $3D$ wide along the z-axis and $1D$ tall along the y-axis, where $D$ is the turbine rotor diameter. Along the wind flow direction (the x-axis), the patches are delimited by their distance to the turbine center: the overhead patch ($2D$ to $0.5D$ ahead), the turbulence patch ($0.5D$ ahead to $1D$ behind), and the wake patch ($1D$ behind to $10D$ behind). Furthermore, if patch $A$ of one turbine intersects patch $B$ of another turbine, the two are divided into $A \setminus B$, $B \setminus A$, and $A \cap B$. After the turbine-related patches are created, the remaining space is evenly divided into background patches. Each background patch is a cuboid with (length, width, height) = $\frac{1}{5}$ (length, width, height) of the whole simulation box. For a background patch $A$ and a turbine-related patch $B$, if $A \cap B \neq \emptyset$, $A$ is reassigned as $A \setminus B$. This ensures that no patches overlap.
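The turbine-related patch layout described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function names `turbine_patch` and `split_overlap`, the label strings, the assumption that the $3D \times 1D$ cross-section is centered on the rotor hub, and the half-open intervals used at patch boundaries are all hypothetical.

```python
from typing import Optional, Set, Tuple


def turbine_patch(dx: float, dz: float, dy: float, D: float) -> Optional[str]:
    """Return the turbine-related patch that contains a point, or None.

    dx, dz, dy are offsets from the rotor center (streamwise, spanwise,
    vertical), with dx > 0 pointing downstream. D is the rotor diameter.
    """
    # Cross-section: 3D wide in z and 1D tall in y (assumed centered on the hub).
    if abs(dz) > 1.5 * D or abs(dy) > 0.5 * D:
        return None
    if -2.0 * D <= dx < -0.5 * D:
        return "overhead"      # 2D to 0.5D ahead of the rotor
    if -0.5 * D <= dx < 1.0 * D:
        return "turbulence"    # 0.5D ahead to 1D behind the rotor
    if 1.0 * D <= dx <= 10.0 * D:
        return "wake"          # 1D to 10D behind the rotor
    return None                # the point belongs to a background patch


def split_overlap(a: Set[int], b: Set[int]) -> Tuple[Set[int], Set[int], Set[int]]:
    """Split two intersecting patches (sets of grid-cell indices) into
    A \\ B, B \\ A, and A ∩ B, so that no cell belongs to two patches."""
    return a - b, b - a, a & b
```

For example, with $D = 27$ m, a point one diameter upstream of the rotor center falls in the overhead patch, while a point five diameters downstream falls in the wake patch.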

3.3. Main Results

The model performance is evaluated by the root mean squared error (RMSE) between model-predicted normalized velocities and high-fidelity normalized velocities, defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( \frac{\bar{U}_{\mathrm{pred}} - \bar{U}_{\mathrm{les}}}{U_{\mathrm{hub}}} \right)_n^2 \mathbf{1}_n},$$
where $N$ is the total number of grid points and $\mathbf{1}_n$ is an indicator function that identifies whether the $n$-th grid point lies in any zoom-in region of a turbine, as shown in Figure 2a. This prevents the metric from being dominated by the free stream. Table 3 and Table 4 present the results for wake steering and wind direction cases, respectively. Despite being trained on sparsely sampled high-fidelity simulations, the ML model achieves an error of approximately 2% for both wind direction and wake steering cases, significantly lower than the GCH model, which exhibits an average error of around 6.5%. Furthermore, in none of the test cases does the ML model exceed 50% of the error of the low-fidelity model, demonstrating a consistent improvement across the entire parameter space, even when high-fidelity training data are limited. Given that power output is proportional to the cube of the wind speed, the consistent improvement in velocity prediction accuracy around turbines directly translates to more reliable energy yield estimates in practical applications.
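As a minimal sketch of how this masked metric could be evaluated on flattened velocity fields, consider the function below. The name `masked_rmse` and its argument names are hypothetical, and the sketch assumes that the sum is normalized by the total number of grid points $N$ (as the text states) and that a square root is taken over the whole mean.

```python
import math
from typing import Sequence


def masked_rmse(u_pred: Sequence[float], u_les: Sequence[float],
                in_zoom: Sequence[bool], u_hub: float) -> float:
    """RMSE of hub-normalized velocity, restricted to near-turbine points.

    in_zoom[n] plays the role of the indicator function: it is True when
    grid point n lies inside the zoom-in region of any turbine.
    """
    n_total = len(u_pred)  # N counts all grid points, masked or not
    squared_error = sum(
        ((p - l) / u_hub) ** 2
        for p, l, inside in zip(u_pred, u_les, in_zoom)
        if inside
    )
    return math.sqrt(squared_error / n_total)


# Four grid points, two of them inside a zoom-in region.
print(masked_rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 2.0, 4.0],
                  [False, False, True, True], u_hub=1.0))  # 0.5
```

Points outside the zoom-in regions contribute nothing to the sum, so errors in the free stream cannot mask errors in the wakes.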
Detailed predictions and comparisons with high- and low-fidelity simulations for turbines under the southwest wind direction with U hub = 6.1 m / s , as well as for aligned wind turbines at U hub = 7.0 m / s with γ T 1 = 15 , are shown in Figure 3 and Figure 4. The rotor-averaged streamwise velocity is defined as:
$$U_{RA} = \frac{1}{A} \int_A \bar{U} \, dA,$$
where A is the rotor-swept area.
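On a discrete grid, the integral above is typically approximated by averaging over the grid points that fall inside the rotor disc. The sketch below assumes a uniform grid in the rotor plane, so that every cell carries the same area weight; the function name `rotor_average` and its arguments are hypothetical and not taken from the paper.

```python
from typing import Sequence


def rotor_average(u_plane: Sequence[Sequence[float]],
                  z_coords: Sequence[float], y_coords: Sequence[float],
                  z_hub: float, y_hub: float, D: float) -> float:
    """Approximate U_RA = (1/A) * integral of U over the rotor-swept area.

    u_plane[j][k] is the time-averaged streamwise velocity at
    (y_coords[j], z_coords[k]) in a cross-flow plane. A uniform grid is
    assumed, so the area-weighted mean reduces to a plain mean over the
    points inside the disc of diameter D centered on the hub.
    """
    radius_sq = (0.5 * D) ** 2
    total, count = 0.0, 0
    for j, y in enumerate(y_coords):
        for k, z in enumerate(z_coords):
            if (z - z_hub) ** 2 + (y - y_hub) ** 2 <= radius_sq:
                total += u_plane[j][k]
                count += 1
    if count == 0:
        raise ValueError("no grid point lies inside the rotor disc")
    return total / count
```

On a stretched or non-uniform grid, each term would need to be weighted by its cell area instead.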
Figure 3 illustrates a case where the wind speed is not included in any high-fidelity training cases, while Figure 4 presents a case where neither the wind speed nor the yaw angle appears in any high-fidelity training cases. In both cases, the GCH model overestimates the velocity deficit in the middle of the wakes for both upstream and downstream turbines, whereas the machine-learned model does not. This demonstrates that the machine learning model effectively learns the corrections between high- and low-fidelity simulations and generalizes them to unseen scenarios where high-fidelity examples are absent. This generalizability is key for real-world deployment, where one cannot retrain the model for every new operating condition. Figure 3d and Figure 4d show that the rotor-averaged velocities predicted by the neural network exhibit minimal error compared to high-fidelity simulations.
Another important observation is that the GCH model does not account for the presence of the nacelle. In contrast, the ML model not only learns to represent the nacelle but also captures how its influence varies with different yaw angles. As shown in Figure 4c,f, the machine-learned model accurately represents the pose of the yawed nacelle, even under a yaw angle not present in the high-fidelity training cases.
Figure 4e also shows that the GCH model first underestimates and then overestimates the velocity deficit along the spanwise direction in the far wake of the upstream turbines, indicating inaccuracies in wake deflection modeling. However, as shown in Figure 4f, this issue does not appear in the machine-learned model. To further demonstrate the ability to model wake deflection, Figure 5 presents the wake centerline trajectories for upstream turbines at $U_{\mathrm{hub}} = 10.1$ m/s with $\gamma_{T_1} = 20^\circ$ and $-20^\circ$. These wake centerline trajectories are determined by fitting a Gaussian to the spanwise profile of the velocity deficit at hub height at each streamwise grid location and recording the position of its maximum. Since one of these two yaw angles lies outside the range covered by the high-fidelity training cases, predicting wake deflection in that case is a challenging test of the model’s generalizability.
While the GCH model overestimates the wake deflection for both yaw angles ($\gamma_{T_1} = \pm 20^\circ$), the ML model closely follows the high-fidelity trajectories, accurately capturing both the deflection scale and the subtle “S” shape. Moreover, both LES and the ML model exhibit asymmetry in the wake centerlines for positive and negative yaw angles, consistent with the findings of Ciri et al. [69] and Fleming et al. [70], whereas the GCH model incorrectly predicts symmetric wake deflections for positive and negative yaw angles. Practically, this asymmetry is critical: failure to account for direction-specific deflection may result in mirrored wake steering strategies that perform poorly in one direction. The ML model’s ability to capture this enables more reliable planning for yaw-based wind farm control.
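The centerline extraction described above (a Gaussian fit to the spanwise deficit profile at each streamwise station) can be illustrated as follows. The paper does not state which fitting routine was used; this sketch substitutes a log-parabola least-squares fit, which is exact for a noise-free Gaussian, and the names `gaussian_center` and `wake_centerline` as well as the `floor` threshold are hypothetical.

```python
import math
from typing import List, Sequence


def _det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def gaussian_center(z: Sequence[float], deficit: Sequence[float],
                    floor: float = 1e-6) -> float:
    """Spanwise position of the maximum of a Gaussian fitted to the deficit.

    Fits ln(deficit) = a + b*z + c*z**2 by least squares; for a Gaussian
    the peak lies at z = -b / (2c). Points with deficit <= floor are dropped.
    """
    pts = [(zi, math.log(di)) for zi, di in zip(z, deficit) if di > floor]
    if len(pts) < 3:
        raise ValueError("need at least three points with positive deficit")
    # Normal equations of the quadratic fit: sums of z^k and of ln(d) * z^k.
    s = [sum(zi ** k for zi, _ in pts) for k in range(5)]
    t = [sum(li * zi ** k for zi, li in pts) for k in range(3)]
    a_mat = [[s[i + j] for j in range(3)] for i in range(3)]
    det = _det3(a_mat)

    def replaced(col: int) -> List[List[float]]:
        # Cramer's rule: swap one column of the matrix for the right-hand side.
        return [[t[i] if j == col else a_mat[i][j] for j in range(3)]
                for i in range(3)]

    b = _det3(replaced(1)) / det
    c = _det3(replaced(2)) / det
    if c >= 0.0:
        raise ValueError("profile has no interior maximum; fit failed")
    return -b / (2.0 * c)


def wake_centerline(z: Sequence[float],
                    deficit_rows: Sequence[Sequence[float]]) -> List[float]:
    """Wake center at each streamwise station (one deficit row per station)."""
    return [gaussian_center(z, row) for row in deficit_rows]
```

With noisy LES or ML fields, a weighted or nonlinear Gaussian fit may be more robust than the unweighted log fit used here.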

3.4. Ablation Study

To demonstrate the necessity of adaptive transfer learning, an additional model is trained using the vanilla transfer learning approach for both cases. The settings are identical to those described in Section 3.2, except that α takes only the values of 1 and 0, and no pseudo-data selection is performed.
Table 5 compares the performance of the two machine learning models and the GCH model for both cases, showing that the proposed method achieves the best results in all scenarios. For wake steering cases, the model trained using the vanilla transfer learning method performed even worse than the low-fidelity simulations. This suggests that the pre-trained features learned from low-fidelity simulations were corrupted during fine-tuning. This result further validates the hypothesis presented in Section 2.3.
Figure 6 provides detailed predictions and comparisons between high- and low-fidelity simulations and the two machine learning models under the northeast wind direction with $U_{\mathrm{hub}} = 8.8$ m/s, where neither the wind direction nor wind speed is included in any high-fidelity training cases. The error between high- and low-fidelity predictions, as shown in Figure 6i–v, indicates that the similarity between them varies across different regions. Consequently, the vanilla model produces inaccurate predictions. While it performs slightly better than GCH, it consistently overestimates the wake velocity deficit and exaggerates the effect of the nacelle.
In contrast, the proposed model’s predictions closely resemble high-fidelity results in most regions. However, Figure 6iii presents the only velocity profiles where the proposed model’s prediction performs worse than the GCH or vanilla models. This discrepancy may arise from the selection of pseudo-data before they are properly fitted, as the dissimilarity measurement is an estimation. Nevertheless, overall, the proposed model still significantly outperforms both the GCH model and the vanilla model.

3.5. Extensibility Study

To demonstrate the ability of the proposed model to deal with different types and numbers of low-fidelity simulations, two additional models are trained with extra sources of low-fidelity simulation for yaw cases. Following the settings of Zhang et al. [33], the average of five LES snapshots is used as another source of low-fidelity simulations. The first model is trained with the same setting described in Section 3.2, except that GCH is replaced with LES snapshot averages. The second model is trained with both GCH and LES snapshot averages as low-fidelity simulations. Three modifications are made to add an extra source of low-fidelity simulations. Let $S_g = \{(x_i, y_i^g)\}$ be the GCH dataset, $S_s = \{(x_i, y_i^s)\}$ be the snapshot average dataset, and $S_l = \{(x_j, y_j^l)\}$ be the high-fidelity converged LES dataset, with $f_\theta$ as the neural network backbone.
The first modification is adding an extra linear prediction head. Now, there are three linear prediction heads: $M_g$, $M_s$, and $M_l$. The predictions are given by:
$$\hat{y}_i^g = M_g f_\theta(x_i), \quad \hat{y}_i^s = M_s f_\theta(x_i), \quad \hat{y}_j^l = M_l f_\theta(x_j).$$
Here, $\hat{y}_i^g$ represents the GCH predictions, $\hat{y}_i^s$ represents the snapshot average predictions, and $\hat{y}_j^l$ represents the high-fidelity converged LES predictions.
The second modification involves using the average of the MSE calculated from different sources of low-fidelity simulations as the low-fidelity part of the multitask loss function. The new loss function is written as:
$$\mathcal{L}_\alpha = \frac{1}{2} \left( \frac{\alpha}{|S_g|} \sum_i \left\lVert \hat{y}_i^g - y_i^g \right\rVert^2 + \frac{\alpha}{|S_s|} \sum_i \left\lVert \hat{y}_i^s - y_i^s \right\rVert^2 \right) + \frac{1 - \alpha}{|S_l|} \sum_j \left\lVert \hat{y}_j^l - y_j^l \right\rVert^2,$$
which is a modification of Equation (2).
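A plain-Python sketch of this multi-source loss is given below to make the weighting explicit. It is an illustration rather than the authors' training code: the names `mean_squared_norm` and `multitask_loss` are hypothetical, fields are represented as flat lists of floats, and the low-fidelity term is taken to be the unweighted mean over the low-fidelity sources, as described above.

```python
from typing import List, Sequence, Tuple

Field = Sequence[float]
Dataset = Tuple[Sequence[Field], Sequence[Field]]  # (predictions, targets)


def mean_squared_norm(preds: Sequence[Field], targets: Sequence[Field]) -> float:
    """(1/|S|) * sum over samples of the squared L2 norm of (pred - target)."""
    per_sample = [
        sum((p - t) ** 2 for p, t in zip(y_hat, y))
        for y_hat, y in zip(preds, targets)
    ]
    return sum(per_sample) / len(per_sample)


def multitask_loss(alpha: float, low_fidelity: List[Dataset],
                   high_fidelity: Dataset) -> float:
    """alpha * (mean over low-fidelity sources) + (1 - alpha) * high-fidelity."""
    low_term = sum(mean_squared_norm(p, t) for p, t in low_fidelity)
    low_term /= len(low_fidelity)
    high_term = mean_squared_norm(*high_fidelity)
    return alpha * low_term + (1.0 - alpha) * high_term


# Toy data: one sample per source, two grid points per field.
gch = ([[1.0, 1.0]], [[0.0, 0.0]])        # squared error 2.0
snapshots = ([[0.0, 0.0]], [[0.0, 2.0]])  # squared error 4.0
les = ([[1.0, 0.0]], [[0.0, 0.0]])        # squared error 1.0
for alpha in (1.0, 0.5, 0.0):
    print(alpha, multitask_loss(alpha, [gch, snapshots], les))
```

With $\alpha = 1$ only the low-fidelity sources contribute, and with $\alpha = 0$ only the converged LES data contribute, which mirrors the schedule $\alpha = 1, 0.5, 0$ used in training.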
The third modification introduces a new dissimilarity measurement by taking the minimum dissimilarity between low-fidelity and high-fidelity simulations:
$$F(y^g, y^s, y^l) = \min \left\{ F(y^g, y^l), \; F(y^s, y^l) \right\},$$
where $F$ is the original dissimilarity measurement mentioned in Section 2.3. More sources of low-fidelity simulations can be added similarly.
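The extended measurement can be sketched as follows, using the $L_2$-norm that Section 3.2 names as the dissimilarity measurement. The function names are hypothetical, fields are treated as flat lists of values, and whether the measurement is applied per patch or per field follows Section 2.3 rather than this sketch.

```python
import math
from typing import Sequence


def l2_dissimilarity(y_low: Sequence[float], y_high: Sequence[float]) -> float:
    """L2 norm of the difference between a low- and a high-fidelity field."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_low, y_high)))


def min_dissimilarity(low_fields: Sequence[Sequence[float]],
                      y_high: Sequence[float]) -> float:
    """Smallest dissimilarity to the high-fidelity field over all
    low-fidelity sources; extra sources are simply appended to the list."""
    return min(l2_dissimilarity(y_low, y_high) for y_low in low_fields)


y_les = [0.0, 0.0]
y_gch = [3.0, 4.0]       # dissimilarity 5.0
y_snapshot = [1.0, 0.0]  # dissimilarity 1.0
print(min_dissimilarity([y_gch, y_snapshot], y_les))  # 1.0
```

Taking the minimum means a region counts as similar as soon as any one low-fidelity source agrees well with the high-fidelity data there.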
Table 6 compares the performance of machine learning models trained with different low-fidelity data against the GCH model and snapshot averages. $\mathrm{ML}_{s\&g}$ is trained with converged LES and both GCH and snapshot averages, $\mathrm{ML}_g$ is trained with converged LES and GCH, and $\mathrm{ML}_s$ is trained with converged LES and LES snapshot averages. Although GCH and snapshot averages exhibit similar RMSE values, the nature of their errors differs. The GCH model, based on simplified governing equations, produces highly biased predictions, whereas snapshot averages, derived from the instantaneous flow snapshots, result in highly variable predictions. All models exhibit minor errors, demonstrating the robustness of the proposed approach across different scenarios and types of low-fidelity data. Moreover, the model trained with both low-fidelity simulations outperforms those trained on a single source of low-fidelity data. This highlights the model’s ability to extract valuable information from multiple sources, thereby enhancing its predictive accuracy on high-fidelity simulations.

4. Conclusions

This study presents an adaptive multi-fidelity framework for high-dimensional surrogate modeling, extending traditional transfer learning to address complex scenarios where simulations of varying fidelities involve distinct physics. Applied to wind farm mean flow prediction, the framework integrates the U-Net architecture and a novel encoding scheme for wind farm physical parameters, utilizing sparse high-fidelity data alongside abundant low-fidelity data. By adaptively regulating the similarity between high- and low-fidelity predictions and enriching the training set with patch-level synthesized pseudo-high-fidelity data, the model achieves real-time high-fidelity inference with demonstrated generalizability and extensibility, surpassing traditional paradigms including neural compression and super resolution.
The performance of the model is evaluated under two different scenarios: one involving variations in wind direction and wind speed, and the other involving different yaw angles and wind speeds. For both scenarios, only three high-fidelity data points are used for training, with $\alpha = 1, 0.5, 0$, and the model is trained for 100,000 iterations using the Adam optimizer with a learning rate of $10^{-4}$ for each value of $\alpha$. Once trained, the model requires no additional high- or low-fidelity simulations during inference, giving an inference cost of $1.32 \times 10^{-5}$ GPU hours per instance on a single RTX A6000 GPU. The CPU workload is negligible, as the CPU is solely used for transferring wind farm parameters to the GPU.
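For reference, the quoted inference cost can be converted to wall-clock time. The conversion below is simple arithmetic on the figure of $1.32 \times 10^{-5}$ GPU hours per instance and assumes the whole cost is spent on a single device:

```latex
1.32 \times 10^{-5}\ \mathrm{h} \times 3600\ \mathrm{s/h}
  \approx 4.75 \times 10^{-2}\ \mathrm{s}
  \approx 48\ \mathrm{ms\ per\ instance}
```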
Results demonstrate that the proposed model closely resembles high-fidelity simulations, achieving an average error rate of approximately 2% in close-rotor regions, with no single case exceeding a 3% error. It also successfully captures features absent in low-fidelity simulations, such as nacelle effects and the asymmetry between positive and negative yaw angles. Additionally, it exhibits strong generalizability, accurately predicting cases with physical parameters not present in any high-fidelity training data. Beyond the main results, ablation and extensibility studies were conducted to highlight the method’s advantages over vanilla transfer learning and its ability to handle varying numbers and types of fidelity sources.
The framework’s real-time inference capability and full differentiability, enabled by neural network backpropagation, offer promising avenues for wind farm control co-design optimization, integrating real-time decision making with flow modeling.
Additionally, its ability to handle diverse fidelity sources opens the possibility of leveraging extensive data from various engineering models, simulations, and LiDAR measurements to develop a robust foundational wind farm model in the future. In the broader machine learning community, foundational models—large, general-purpose models pre-trained on diverse datasets and adaptable to a wide range of downstream tasks—are transforming domains such as natural language processing and computer vision. By enabling the integration of heterogeneous wind farm data across fidelities and physics, our framework lays the groundwork for such a foundational model in wind farm modeling, offering robustness, transferability, and scalability that go beyond existing multi-fidelity approaches.

Author Contributions

Conceptualization, D.Z., C.S., Z.Z., D.S. and A.K.; Methodology, D.Z., C.S., D.S. and A.K.; Software, D.Z., C.S., Z.Z. and A.K.; Validation, D.Z. and A.K.; Formal analysis, D.Z., C.S. and Z.Z.; Investigation, D.Z. and D.S.; Resources, A.K.; Writing—original draft, D.Z. and C.S.; Writing—review and editing, D.Z., C.S., D.S. and A.K.; Visualization, D.Z.; Supervision, D.S.; Project administration, D.S. and A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by grants from the U.S. Department of Energy’s Office of Energy Efficiency and Renewable Energy (EERE) under the Water Power Technologies Office (WPTO) Award Numbers DE-EE0009450 and DE-EE00011379. Partial support was provided by NSF (grant number 2233986). The computational resources for the simulations of this study were partially provided by the Institute for Advanced Computational Science at Stony Brook University. The views expressed herein do not necessarily represent the views of the U.S. Department of Energy or the United States Government.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

ACNN                Autoencoder convolutional neural network
ASM                 Actuator surface model
FLORIS              Flow Redirection and Induction in Steady State modeling tool
GCH                 Gauss–curl hybrid model
LES                 Large eddy simulation
ML                  Machine learning
MFK                 Multi-fidelity Kriging
RMSE                Root mean square error
$F$                 Dissimilarity measurement function
$f_\theta$          Neural network backbone
$\gamma_{T_1}$      Yaw angle of the upstream turbine T1
$\mathcal{L}$       Training loss/objective function
$M$                 Linear prediction head (e.g., $M_L$, $M_H$)
$n(v_k)$            Relative volume of patch $v_k$
$P(V)$              Partitioning of the output space $V$
$\bar{U}$           Streamwise velocity
$U_{\mathrm{hub}}$  Hub-height freestream velocity
$U_{RA}$            Rotor-averaged streamwise velocity
$y$, $\hat{y}$      Ground truth and predicted data

References

  1. Gielen, D.; Gorini, R.; Wagner, N.; Leme, R.; Gutierrez, L.; Prakash, G.; Asmelash, E.; Janeiro, L.; Gallina, G.; Vale, G.; et al. Global Energy Transformation: A Roadmap to 2050 (2019 Edition); International Renewable Energy Agency (IRENA): Masdar City, United Arab Emirates, 2019.
  2. El-Asha, S.; Zhan, L.; Iungo, G. Quantification of power losses due to wind turbine wake interactions through SCADA, meteorological and wind LiDAR data. Wind Energy 2017, 20, 1823–1839.
  3. Medici, D.; Alfredsson, P. Measurements on a wind turbine wake: 3D effects and bluff body vortex shedding. Wind Energy 2006, 9, 219–236.
  4. Jiménez, A.; Crespo, A.; Migoya, E. Application of a LES technique to characterize the wake deflection of a wind turbine in yaw. Wind Energy 2010, 13, 559–572.
  5. Wagenaar, J.; Machielse, L.; Schepers, J. Controlling Wind in ECN’s Scaled Wind Farm. In Proceedings of the Europe’s Premier Wind Energy Event, Copenhagen, Denmark, 16–19 April 2012; Volume 1.
  6. Fleming, P.; Gebraad, P.; Lee, S.; van Wingerden, J.; Johnson, K.; Churchfield, M.; Michalakes, J.; Spalart, P.; Moriarty, P. Evaluating techniques for redirecting turbine wakes using SOWFA. Renew. Energy 2014, 70, 211–218.
  7. Herbert-Acero, J.; Probst, O.; Réthoré, P.; Larsen, G.; Castillo-Villar, K. A Review of Methodological Approaches for the Design and Optimization of Wind Farms. Energies 2014, 7, 6930–7016.
  8. Gebraad, P.; Teeuwisse, F.; van Wingerden, J.; Fleming, P.; Ruben, S.; Marden, J.; Pao, L. Wind plant power optimization through yaw control using a parametric model for wake effects-a CFD simulation study. Wind Energy 2016, 19, 95–114.
  9. Boersma, S.; Doekemeijer, B.; Gebraad, P.; Fleming, P.; Annoni, J.; Scholbrock, A.; Frederik, J.; van Wingerden, J. A tutorial on control-oriented modeling and control of wind farms. In Proceedings of the 2017 American Control Conference (ACC), Seattle, WA, USA, 24–26 May 2017; pp. 1–18.
  10. Sørensen, J.; Myken, A. Unsteady actuator disc model for horizontal axis wind turbines. J. Wind Eng. Ind. Aerodyn. 1992, 39, 139–149.
  11. Sørensen, J.; Kock, C. A model for unsteady rotor aerodynamics. J. Wind Eng. Ind. Aerodyn. 1995, 58, 259–275.
  12. Sørensen, J.; Shen, W.; Munduate, X. Analysis of wake states by a full-field actuator disc model. Wind Energy 1998, 1, 73–88.
  13. Sørensen, N.; Michelsen, J. Aerodynamic predictions for the Unsteady Aerodynamics Experiment Phase-II rotor at the National Renewable Energy Laboratory. In Proceedings of the 2000 ASME Wind Energy Symposium, Reno, NV, USA, 10–13 January 2000.
  14. Sørensen, J.N.; Shen, W.Z. Numerical Modeling of Wind Turbine Wakes. J. Fluids Eng. 2002, 124, 393–399.
  15. Jimenez, A.; Crespo, A.; Migoya, E.; Garcia, J. Large-eddy simulation of spectral coherence in a wind turbine wake. Environ. Res. Lett. 2008, 3, 015004.
  16. Wu, Y.T.; Porté-Agel, F. Large-Eddy Simulation of Wind-Turbine Wakes: Evaluation of Turbine Parametrisations. Bound.-Layer Meteorol. 2011, 138, 345–366.
  17. Ciri, U.; Rotea, M.; Santoni, C.; Leonardi, S. Large-eddy simulations with extremum-seeking control for individual wind turbine power optimization. Wind Energy 2017, 20, 1617–1634.
  18. Yang, X.; Sotiropoulos, F. A new class of actuator surface models for wind turbines. Wind Energy 2018, 21, 285–302.
  19. Shapiro, C.; Gayme, D.; Meneveau, C. Filtered actuator disks: Theory and application to wind turbine models in large eddy simulation. Wind Energy 2019, 22, 1414–1420.
  20. Bontempo, R.; Manna, M. A ring-vortex actuator disk method for wind turbines including hub effects. Energy Convers. Manag. 2019, 195, 672–681.
  21. Kang, S.; Yang, X.; Sotiropoulos, F. On the onset of wake meandering for an axial flow turbine in a turbulent open channel flow. J. Fluid Mech. 2014, 744, 376–403.
  22. Santoni, C.; Carrasquillo, K.; Arenas-Navarro, I.; Leonardi, S. Effect of tower and nacelle on the flow past a wind turbine. Wind Energy 2017, 20, 1927–1939.
  23. Jensen, N. A Note on Wind Generator Interaction; Number 2411 in Risø-M, Risø National Laboratory: Roskilde, Denmark, 1983.
  24. Chamorro, L.; Porté-Agel, F. A Wind-Tunnel Investigation of Wind-Turbine Wakes: Boundary-Layer Turbulence Effects. Bound.-Layer Meteorol. 2009, 132, 129–149.
  25. Bastankhah, M.; Porté-Agel, F. A new analytical model for wind-turbine wakes. Renew. Energy 2014, 70, 116–123.
  26. Chamorro, L.; Porté-Agel, F. Effects of Thermal Stability and Incoming Boundary-Layer Flow Characteristics on Wind-Turbine Wakes: A Wind-Tunnel Study. Bound.-Layer Meteorol. 2010, 136, 515–533.
  27. Wu, Y.; Porté-Agel, F. Atmospheric Turbulence Effects on Wind-Turbine Wakes: An LES Study. Energies 2012, 5, 5340–5362.
  28. Bastankhah, M.; Porté-Agel, F. Experimental and theoretical study of wind turbine wakes in yawed conditions. J. Fluid Mech. 2016, 806, 506–541.
  29. Martinez-Tossas, L.; Annoni, J.; Fleming, P.; Churchfield, M. The aerodynamics of the curled wake: A simplified model in view of flow control. Wind Energy Sci. 2019, 4, 127–138.
  30. Martinez-Tossas, L.; King, J.; Quon, E.; Bay, C.; Mudafort, R.; Hamilton, N.; Howland, M.; Fleming, P. The Curled Wake Model: A Three-Dimensional and Extremely Fast Steady-State Wake Solver for Wind Plant Flows. Wind Energy Sci. 2021, 6, 555–570.
  31. King, J.; Fleming, P.; King, R.; Martínez-Tossas, L.; Bay, C.; Mudafort, R.; Simley, E. Control-oriented model for secondary effects of wake steering. Wind Energy Sci. 2021, 6, 701–714.
  32. Ashwin Renganathan, S.; Maulik, R.; Letizia, S.; Iungo, G. Data-Driven Wind Turbine Wake Modeling via Probabilistic Machine Learning. Neural Comput. Appl. 2022, 34, 6171–6186.
  33. Zhang, Z.; Santoni, C.; Herges, T.; Sotiropoulos, F.; Khosronejad, A. Time-Averaged Wind Turbine Wake Flow Field Prediction Using Autoencoder Convolutional Neural Networks. Energies 2021, 15, 41.
  34. Santoni, C.; Zhang, D.; Zhang, Z.; Samaras, D.; Sotiropoulos, F.; Khosronejad, A. Toward ultra-efficient high-fidelity predictions of wind turbine wakes: Augmenting the accuracy of engineering models with machine learning. Phys. Fluids 2024, 36, 065159.
  35. Chen, W.; Stinis, P. Feature-adjacent multi-fidelity physics-informed machine learning for partial differential equations. J. Comput. Phys. 2024, 498, 112683.
  36. De, S.; Britton, J.; Reynolds, M.; Skinner, R.; Jansen, K.; Doostan, A. On transfer learning of neural networks using bi-fidelity data for uncertainty propagation. Int. J. Uncertain. Quantif. 2020, 10, 6.
  37. Zhang, Y.; Gong, Z.; Zhou, W.; Zhao, X.; Zheng, X.; Yao, W. Multi-fidelity surrogate modeling for temperature field prediction using deep convolution neural network. Eng. Appl. Artif. Intell. 2023, 123, 106354.
  38. Lyu, Y.; Zhao, X.; Gong, Z.; Kang, X.; Yao, W. Multi-fidelity prediction of fluid flow and temperature field based on transfer learning using Fourier Neural Operator. arXiv 2023, arXiv:2304.06972.
  39. Liao, P.; Song, W.; Du, P.; Zhao, H. Multi-fidelity convolutional neural network surrogate model for aerodynamic optimization based on transfer learning. Phys. Fluids 2021, 33, 127121.
  40. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; pp. 234–241.
  41. Weiss, K.; Khoshgoftaar, T.M.; Wang, D. A survey of transfer learning. J. Big Data 2016, 3, 9.
  42. Liu, X.; Li, L.; Shi, S.; Chen, X.; Wu, S.; Lao, W. Three-Dimensional LiDAR Wake Measurements in an Offshore Wind Farm and Comparison with Gaussian and AL Wake Models. Energies 2021, 14, 8313.
  43. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. In Proceedings of the 33rd International Conference on Neural Information Processing Systems—NeurIPS 2019, Vancouver, BC, Canada, 8–14 December 2019; pp. 8026–8037.
  44. Kennedy, M.; O’Hagan, A. Predicting the Output from a Complex Computer Code When Fast Approximations Are Available. Biometrika 2000, 87, 1–13.
  45. Kennedy, M.C.; O’Hagan, A. Bayesian calibration of computer models. J. R. Stat. Soc. Ser. B 2001, 63, 425–464.
  46. Zhou, P.; Zou, Y.; Yuan, X.T.; Feng, J.; Xiong, C.; Hoi, S. Task similarity aware meta learning: Theory-inspired improvement on MAML. In Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, Online, 27–30 July 2021; Volume 161, pp. 23–33.
  47. Hu, M.; Chang, H.; Guo, Z.; Ma, B.; Shan, S.; Chen, X. Understanding Few-Shot Learning: Measuring Task Relatedness and Adaptation Difficulty via Attributes. In Proceedings of the 37th International Conference on Neural Information Processing Systems—NeurIPS 2023, New Orleans, LA, USA, 10–16 December 2023; pp. 19397–19409.
  48. Hu, M.; Chang, H.; Guo, Z.; Ma, B.; Shan, S.; Chen, X. Task Attribute Distance for Few-Shot Learning: Theoretical Analysis and Applications. arXiv 2024, arXiv:2403.03535.
  49. Zamir, A.R.; Sax, A.; Shen, W.B.; Guibas, L.; Malik, J.; Savarese, S. Taskonomy: Disentangling Task Transfer Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–22 June 2018; pp. 3712–3722.
  50. Dwivedi, K.; Roig, G. Representation similarity analysis for efficient task taxonomy & transfer learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, CA, USA, 15–20 June 2019; pp. 12387–12396.
  51. Sun, Q.; Liu, Y.; Chua, T.S.; Schiele, B. Meta-transfer learning for few-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, CA, USA, 15–20 June 2019; pp. 403–412.
  52. Liu, C.; Wang, Z.; Sahoo, D.; Fang, Y.; Zhang, K.; Hoi, S.C. Adaptive task sampling for meta-learning. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; pp. 752–769.
  53. Li, S.; Phillips, J.M.; Yu, X.; Kirby, R.; Zhe, S. Batch Multi-Fidelity Active Learning with Budget Constraints. In Proceedings of the 36th International Conference on Neural Information Processing Systems—NeurIPS 2022, New Orleans, LA, USA, 28 November–9 December 2022; pp. 995–1007.
  54. Wu, D.; Niu, R.; Chinazzi, M.; Ma, Y.; Yu, R. Disentangled multi-fidelity deep bayesian active learning. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 37624–37634.
  55. Li, S.; Wang, Z.; Kirby, R.; Zhe, S. Deep Multi-Fidelity Active Learning of High-Dimensional Outputs. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, Valencia, Spain, 28–30 March 2022; Volume 151, pp. 1694–1711.
  56. Lee, D.H. Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks. In Proceedings of the 2013 ICML Workshop on Challenges in Representation Learning, Atlanta, GA, USA, 16–21 June 2013.
  57. Fleming, P.; King, J.; Bay, C.; Simley, E.; Mudafort, R.; Hamilton, N.; Farrell, A.; Martínez-Tossas, L. Overview of FLORIS updates. J. Phys. Conf. Ser. 2020, 1618, 022028.
  58. Germano, M.; Piomelli, U.; Moin, P.; Cabot, W. A dynamic subgrid-scale eddy viscosity model. Phys. Fluids A Fluid Dyn. 1991, 3, 1760–1765.
  59. Kim, J.; Moin, P. Application of a fractional-step method to incompressible Navier-Stokes equations. J. Comput. Phys. 1985, 59, 308–323.
  60. Froude, W. On the Elementary Relation Between Pitch, Slip and Propulsive Efficiency; Institution of Naval Architects: London, UK, 1878.
  61. Yang, X.; Zhang, X.; Li, Z.; He, G. A smoothing technique for discrete delta functions with application to immersed boundary method in moving boundary simulations. J. Comput. Phys. 2009, 228, 7821–7836.
  62. Burton, T.; Sharpe, D.; Jenkins, N.; Bossanyi, E. Wind Energy Handbook; Wiley: Hoboken, NJ, USA, 2011.
  63. Santoni, C.; Khosronejad, A.; Yang, X.; Seiler, P.; Sotiropoulos, F. Coupling Turbulent Flow with Blade Aeroelastics and Control Modules in Large-Eddy Simulation of Utility-Scale Wind Turbines. Phys. Fluids 2023, 35, 015140.
  64. Santoni, C.; Khosronejad, A.; Seiler, P.; Sotiropoulos, F. Toward control co-design of utility-scale wind turbines: Collective vs. individual blade pitch control. Energy Rep. 2023, 9, 793–806.
  65. Berg, J.; Bryant, J.; LeBlanc, B.; Maniaci, D.; Naughton, B.; Paquette, J.; Resor, B.; White, J.; Kroeker, D. Scaled Wind Farm Technology Facility Overview. In Proceedings of the 32nd ASME Wind Energy Symposium, Reston, VA, USA, 13–17 January 2014; pp. 1–15.
  66. Santoni, C.; Sotiropoulos, F.; Khosronejad, A. A Comparative analysis of actuator-based turbine structure parametrizations for high-fidelity modeling of utility-scale wind turbines under neutral atmospheric conditions. Energies 2024, 17, 753.
  67. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
  68. Li, C.; Hu, X.; Abousamra, S.; Chen, C. Calibrating uncertainty for semi-supervised crowd counting. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 16685–16695.
  69. Ciri, U.; Rotea, M.A.; Leonardi, S. Effect of the turbine scale on yaw control. Wind Energy 2018, 21, 1395–1405.
  70. Fleming, P.; Annoni, J.; Churchfield, M.; Martinez-Tossas, L.; Gruchalla, K.; Lawson, M.; Moriarty, P. A Simulation Study Demonstrating the Importance of Large-Scale Trailing Vortices in Wake Steering. Wind Energy Sci. 2018, 3, 243–255.
Figure 1. General framework for wind farm flow prediction. Input parameters, including wind and wind farm conditions, are processed through the law of the wall and Gaussian encoding. These encodings are concatenated and fed into a U-Net architecture for hierarchical feature extraction and decoding. The outputs are passed through different linear layers to generate predictions at multiple fidelity levels. (a) U-Net architecture transformation illustrating the hierarchical encoding and decoding of the wind farm neural representation across different layers. The numbers above each representation indicate the number of output channels (feature maps) produced by each layer. (b) Representation of the constructed free-stream velocity field, where velocities increase logarithmically along the y-axis (height). (c) Gaussian representation of a wind farm (bottom) and a zoomed-in view of a yawed turbine representation (top), both at the hub-height plane.
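The two input encodings named in the Figure 1 caption, a logarithmic free-stream profile and a Gaussian turbine representation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the roughness length, the two Gaussian widths, and the choice of an anisotropic Gaussian rotated by the yaw angle are all assumptions made for illustration only.

```python
import math


def log_law_velocity(y, u_hub, hub_height, roughness_length):
    """Free-stream velocity at height y from a logarithmic (law-of-the-wall) profile.

    The profile is scaled so that it equals u_hub at hub_height, which
    cancels the friction velocity and the von Karman constant.
    roughness_length is an assumed parameter, not a value from the paper.
    """
    if y <= roughness_length:
        return 0.0
    return u_hub * math.log(y / roughness_length) / math.log(hub_height / roughness_length)


def yawed_turbine_gaussian(x, z, x_t, z_t, yaw_deg, sigma_axial, sigma_rotor):
    """Hypothetical Gaussian footprint of one turbine in the hub-height plane.

    (x, z) is the query point, (x_t, z_t) the turbine position. The blob is
    narrow along the rotor axis (sigma_axial) and wide along the rotor
    plane (sigma_rotor); both axes rotate with the yaw angle.
    """
    yaw = math.radians(yaw_deg)
    dx, dz = x - x_t, z - z_t
    # Project the offset onto the rotor axis and onto the rotor plane.
    axial = dx * math.cos(yaw) + dz * math.sin(yaw)
    in_plane = -dx * math.sin(yaw) + dz * math.cos(yaw)
    return math.exp(-0.5 * ((axial / sigma_axial) ** 2 + (in_plane / sigma_rotor) ** 2))


if __name__ == "__main__":
    # Illustrative numbers only (80 m hub height, 0.01 m roughness length).
    for height in (20.0, 80.0, 140.0):
        print(height, round(log_law_velocity(height, 8.0, 80.0, 0.01), 3))
    print(yawed_turbine_gaussian(0.0, 40.0, 0.0, 0.0, 15.0, 5.0, 40.0))
```

Evaluating both functions on the same grid and stacking the results as channels would give an image-like input of the kind a U-Net consumes. How the paper actually discretizes and combines the encodings is not specified in this section.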
Figure 2. (a) Illustration of the south wind direction LES configuration and a zoomed-in region surrounding a turbine. This turbine is located downwind of another turbine, and therefore, it is affected by the upwind turbine wake. (b) Illustration of the simulation box under different wind directions.
Figure 3. Time-averaged velocity contours $\overline{U}$ of (a) LES, (b) GCH, and (c) ML and errors $\delta$ of (e) LES-GCH and (f) LES-ML. Velocity contours are at the hub-height plane for the southwest wind direction with $U_{hub} = 6.1$ m/s. (d) Rotor-averaged streamwise velocity ($U_{RA}$) of LES, GCH, and ML corresponding to the velocity contours.
Figure 4. Time-averaged velocity contours $\overline{U}$ of (a) LES, (b) GCH, and (c) ML and errors $\delta$ of (e) LES-GCH and (f) LES-ML. Velocity contours are at the hub-height plane for the aligned wind turbines with $U_{hub} = 7.0$ m/s and $\gamma_{T1} = 15^\circ$. (d) Rotor-averaged streamwise velocity ($U_{RA}$) of LES, GCH, and ML corresponding to the velocity contours.
Figure 5. Wake centerline for the upstream turbine $T_1$ at a hub wind speed of $U_{hub} = 10.1$ m/s with $\gamma_{T1} = \pm 20^\circ$. The trajectories are shown for LES, GCH, and ML.
Figure 6. Time-averaged velocity contours ($\overline{U}$) of (a) a zoomed-in region of the LES and (b) the entire LES domain. Velocity contours are at the hub-height plane for the northeast wind direction with $U_{hub} = 8.8$ m/s. The position of the zoomed-in region is labeled with the dashed rectangle in (a). (iv) Velocity profiles along the spanwise direction, corresponding to the dashed lines in (a), for LES, GCH, the proposed method, and vanilla transfer learning.
Table 1. Training data requirements of different ML paradigms for wind farm surrogate modeling. One-to-one indicates whether the high- and low-fidelity training data require one-to-one correspondence.

| ML Paradigm | Number of Fidelities | Amount of High-Fidelity Data | One-to-One |
|---|---|---|---|
| Neural Compression [32] | 1 | Abundant | Not required |
| Super Resolution [33,34] | 2 | Scarce | Required |
| Adaptive Transfer Learning | 2 or more | Scarce | Not required |
Table 2. Inference characteristics of different ML paradigms for wind farm surrogate modeling.

| ML Paradigm | Input | End-to-End Differentiable |
|---|---|---|
| Neural Compression [32] | Physical parameters | ✓ |
| Super Resolution [33,34] | Low-fidelity simulations | × |
| Adaptive Transfer Learning | Physical parameters | ✓ |
Table 3. Comparison of GCH and ML performance in terms of RMSE (Equation (4)) across yaw angles and hub wind speeds ($U_{hub}$). Each entry is a GCH / ML pair. The yaw-angle column headers survive only as magnitudes (20, 15, 10, 0, 10, 15, 20), and rows with six pairs have one empty cell whose column is not recoverable, so pairs are listed in their original left-to-right order.

| $U_{hub}$ (m/s) | GCH / ML RMSE pairs |
|---|---|
| 5.4 | 6.7% / 1.3%, 6.6% / 1.5%, 6.7% / 1.5%, 6.6% / 1.6%, 6.6% / 1.8%, 6.7% / 2.1% |
| 7.0 | 6.7% / 1.3%, 6.6% / 1.7%, 6.6% / 1.7%, 6.6% / 1.5%, 6.6% / 1.7%, 6.6% / 1.9%, 6.6% / 2.2% |
| 8.5 | 6.8% / 1.5%, 7.1% / 2.9%, 7.1% / 3.0%, 6.8% / 1.5%, 6.8% / 1.7%, 6.9% / 2.1% |
| 10.1 | 6.0% / 1.9%, 6.1% / 1.9%, 5.9% / 1.5%, 5.7% / 1.7%, 6.1% / 1.4%, 6.1% / 1.5%, 6.1% / 1.9% |
| 11.6 | 5.1% / 2.0%, 5.1% / 1.9%, 5.0% / 1.8%, 5.0% / 1.9%, 5.0% / 1.3%, 5.2% / 1.5% |
Table 4. Comparison of GCH and ML performance in terms of RMSE (Equation (4)) across wind directions and hub wind speeds ($U_{hub}$). Each entry is a GCH / ML pair. The wind-direction columns are Northeast, South, Southwest, and West; rows with three pairs have one empty cell whose column is not recoverable, so pairs are listed in their original left-to-right order.

| $U_{hub}$ (m/s) | GCH / ML RMSE pairs |
|---|---|
| 4.8 | 6.4% / 2.4%, 5.8% / 2.8%, 6.0% / 2.9% |
| 6.1 | 6.9% / 2.5%, 6.8% / 2.2%, 7.3% / 2.2%, 7.5% / 2.7% |
| 7.5 | 7.0% / 2.3%, 6.8% / 2.4%, 7.2% / 2.5% |
| 8.8 | 7.1% / 2.2%, 6.7% / 2.3%, 7.0% / 2.4%, 7.0% / 2.8% |
| 10.2 | 6.0% / 2.2%, 5.7% / 2.4%, 5.9% / 2.4% |
Table 5. Performance comparison in terms of RMSE (Equation (4)) of GCH, ML1 (proposed method), and ML2 (vanilla transfer learning).

| Case | GCH | ML1 | ML2 |
|---|---|---|---|
| Wind Direction | 6.7% | 2.3% | 2.6% |
| Yaw | 6.3% | 1.8% | 11.4% |
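To make the Table 5 comparison easier to read, the short sketch below turns the reported RMSE values into relative error reductions with respect to the GCH baseline. The RMSE numbers are copied from Table 5. The function name and the relative-reduction metric itself are illustrative assumptions and are not quantities defined in the paper.

```python
def relative_rmse_reduction(baseline_rmse, model_rmse):
    """Fractional RMSE reduction of a model relative to a baseline.

    Positive values mean the model has lower error than the baseline;
    negative values mean it is worse.
    """
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    return (baseline_rmse - model_rmse) / baseline_rmse


# RMSE values (in percent) as reported in Table 5.
TABLE_5_RMSE = {
    "Wind Direction": {"GCH": 6.7, "ML1": 2.3, "ML2": 2.6},
    "Yaw": {"GCH": 6.3, "ML1": 1.8, "ML2": 11.4},
}


def summarize(table):
    """Return {case: {model: reduction relative to GCH}} for the ML models."""
    summary = {}
    for case, row in table.items():
        baseline = row["GCH"]
        summary[case] = {
            model: relative_rmse_reduction(baseline, rmse)
            for model, rmse in row.items()
            if model != "GCH"
        }
    return summary


reductions = summarize(TABLE_5_RMSE)

if __name__ == "__main__":
    for case, models in reductions.items():
        for model, value in models.items():
            print(f"{case:15s} {model}: {value:+.1%}")
```

With these inputs, the proposed method (ML1) reduces the RMSE relative to GCH by roughly 66% in the wind-direction case and roughly 71% in the yaw case. Vanilla transfer learning (ML2) gives a comparable reduction for wind direction but a negative value for yaw, where its error exceeds that of GCH.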
Table 6. Performance comparison in terms of RMSE (Equation (4)) of GCH, snapshot averages, and machine learning models trained with different low-fidelity data.

| GCH | Snapshots | ML$_{s\&g}$ | ML$_{s}$ | ML$_{g}$ |
|---|---|---|---|---|
| 6.3% | 6.3% | 1.1% | 1.5% | 1.8% |