1. Introduction
In recent years, the widespread use of unmanned aerial vehicles (UAVs) and the growth of general aviation have accelerated progress in aerodynamic shape optimization (ASO), driving it towards higher efficiency, greater intelligence, and multi-mission capability. The main challenges in ASO are the high computational cost of the optimization process and the large number of design variables. Moreover, conventional fixed-wing designs have inherent limitations that prevent them from achieving optimal aerodynamic performance across the full range of low-speed flight conditions. Efficient optimization methods for ASO problems in aircraft design are therefore both of great interest and necessary.
A typical ASO process consists of four stages: shape parameterization, mesh creation or deformation, flow solution, and optimization algorithm [1]. The dimensionality of design variables in the optimization process is directly linked to the number of parameterized variables. The shape parameterization stage plays a crucial role in determining optimization efficiency [2,3]. The Free-Form Deformation technique [4] allows flexible manipulation of geometric surfaces by displacing control points to alter the surface mesh. However, it generally requires a large number of design variables and lacks clear geometric interpretability. The Class-Shape Transformation (CST) [5] parameterization method offers strong shape control capabilities. It is characterized by low design dimensionality, high adaptability, and excellent geometric accuracy. Wang et al. [6] introduced an enhanced CST method by incorporating B-spline basis functions to overcome limitations in local expressiveness and shape adjustment capability. However, this modification significantly increased the number of design variables.
Too many design variables can greatly reduce optimization efficiency. An effective remedy is to reduce their number with a dimension reduction technique such as principal component analysis (PCA). Cinquegrana et al. [7] applied PCA-based airfoil parameterization for aerodynamic layout optimization and introduced an adaptive optimization method for design variables. Oyama et al. [8,9] used PCA on the optimized airfoil dataset to reduce dimensionality and conducted reverse analysis to assess the influence of individual modes on aerodynamic characteristics. Yu et al. [10] introduced a PCA-based airfoil parameterization method and examined how various factors impact the effectiveness of PCA modeling. Asouti et al. [11] used the PCA technique to better guide the application of evolution operators and train metamodels in Metamodel-Assisted Evolutionary Algorithms.
In the flow solution stage, employing surrogate models [12,13] or neural networks [14,15,16] as alternatives to computational fluid dynamics is regarded as an effective approach to enhance optimization efficiency. The success of Artificial Intelligence (AI) in fields such as natural language processing suggests that it has substantial potential to transform aircraft design. A key AI-based approach is the development of intelligent prediction models for aerodynamic parameters, which function as surrogate models that relate aerodynamic characteristics to aircraft shape parameters. These models allow for rapid aerodynamic evaluations, significantly speeding up the design process [17]. As design technologies advance and are integrated more rapidly, using intelligent methodologies to shorten the design cycle has become a crucial trend in the development of morphing aircraft.
To achieve multi-mission capability, the concept of morphing wings has gained significant attention as a promising method. Morphing wings are capable of adjusting their shape in response to changing flight conditions, thereby enabling optimal aerodynamic performance across various flight missions. This paper primarily focuses on the optimization of airfoils with variable camber and thickness. As morphing mechanisms become increasingly complex, with higher deformation dimensionality and greater demands for high-fidelity modeling in morphing aircraft, it is essential to develop parameterization techniques with reduced dimensionality and high geometric accuracy.
In traditional passive sampling approaches, a batch of airfoils is typically generated in a single step using methods like Latin Hypercube Sampling (LHS) [18] or random sampling. These samples are then evaluated through aerodynamic simulations, and a surrogate model is trained on this fixed dataset. However, this approach has several drawbacks. The design space is often extensive, and the regions corresponding to high performance are only sparsely covered. As a result, the trained model may lack sufficient sample support in high-performance regions, leading to prediction errors. Moreover, purely grid-based or large-scale random sampling methods tend to generate numerous low-performance samples, leading to unnecessary simulation costs and computational inefficiencies.
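For reference, the one-shot LHS step described above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `latin_hypercube`, the sample count, and the variable bounds are assumptions chosen for the example, not values from this study.

```python
import numpy as np

def latin_hypercube(n_samples, dim, rng):
    """Return n_samples points in [0, 1)^dim with one point per stratum per axis."""
    u = rng.random((n_samples, dim))           # jitter inside each stratum
    samples = np.empty((n_samples, dim))
    for j in range(dim):
        perm = rng.permutation(n_samples)      # random stratum order for axis j
        samples[:, j] = (perm + u[:, j]) / n_samples
    return samples

rng = np.random.default_rng(0)
n_samples, dim = 10, 18                        # 18 matches the PCA weight count; 10 is arbitrary
unit = latin_hypercube(n_samples, dim, rng)

# Hypothetical bounds on the design variables (assumed for illustration).
lower, upper = -1.0, 1.0
designs = lower + unit * (upper - lower)
```

Every design variable is covered evenly, but nothing in this procedure steers samples towards high-performance regions, which is the limitation noted above.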
To overcome the limitations outlined above, this study proposes an aerodynamic optimization framework that combines PCA with deep learning (DL). An optimization-guided data augmentation strategy is introduced, in which the initial surrogate model guides a search for high-performance airfoil candidates over a small number of iterations. These candidates are then evaluated through high-fidelity aerodynamic simulations, which provide accurate coefficients for retraining the predictive model, thereby improving its reliability in performance-critical regions. Finally, SHapley Additive exPlanations (SHAP) analysis is applied to quantify the contribution of each principal component to the predicted aerodynamic coefficients, thus enhancing the interpretability of the optimization results.
2. Methodology
Figure 1 illustrates the proposed optimization framework for airfoil design, which integrates deep learning with an optimization-guided data augmentation loop. The process starts by constructing an initial airfoil database through a Design of Experiments (DoE) approach, where airfoil geometries are parameterized via PCA, and their aerodynamic coefficients are calculated using XFOIL. These samples are used to train an initial surrogate model, built on a deep neural network (DNN). With this model as the performance estimator, a multi-island genetic algorithm (MIGA) is employed to identify candidate airfoils with potentially superior lift or lift-to-drag ratios. The selected candidates are then evaluated through high-fidelity XFOIL simulations, and the validated samples are added to the training dataset. The surrogate model is retrained using this augmented dataset, thereby improving its predictive accuracy, particularly in high-performance regions of the design space. Finally, the airfoil optimization process commences.
Step 1: the PCA method is employed to parameterize the input airfoil.
Step 2: import the airfoil parameters into the DL-based prediction model and obtain the aerodynamic coefficients.
Step 3: update the design variables using the optimization algorithm.
Step 4: use a short-term MIGA run to produce high-performance candidate airfoils, and then employ XFOIL to obtain accurate aerodynamic coefficients. Incorporate these new data points into the training set and retrain the DL model.
Step 5: determine whether the new airfoil satisfies the convergence condition. If so, output the final airfoil; if not, repeat steps 2 and 3.
The following subsections describe in detail the surrogate model for predicting the aerodynamic coefficients of airfoils, which is built on the optimization-guided data augmentation loop, as well as the optimization algorithm for airfoil design.
2.1. Parameterization Method
2.1.1. CST Parameterization Method
CST [5] not only reduces the number of optimization parameters but also keeps the airfoil surface smooth. The method approximates the airfoil geometry using polynomials. The upper surface of an airfoil is described as
$$y_u = C(x)\,S_u(x) + x\,\Delta y_u$$
and the lower surface of an airfoil is described as
$$y_l = C(x)\,S_l(x) + x\,\Delta y_l$$
where $x$ and $y$ denote the dimensionless values of the x-axis and y-axis, respectively. The subscripts u and l represent the upper and lower surfaces of the airfoil, respectively. $\Delta y_u$ and $\Delta y_l$ are the thickness ratios of the upper and lower surfaces of the airfoil trailing edge. The class function $C(x)$ is defined as
$$C(x) = x^{N_1}\,(1 - x)^{N_2}$$
where the parameters $N_1$ and $N_2$ are related to the geometric shape of the airfoil. For a general class of geometric shapes, $N_1 = 0.5$ and $N_2 = 1.0$. The shape function equations are as follows:
$$S_u(x) = \sum_{i=0}^{n} A_u^i\,B_{i,n}(x), \qquad S_l(x) = \sum_{i=0}^{n} A_l^i\,B_{i,n}(x)$$
where the shape functions for Bernstein polynomials of order n are defined as
$$B_{i,n}(x) = \frac{n!}{i!\,(n-i)!}\,x^{i}\,(1 - x)^{n-i}$$
in which $A_u^i$ and $A_l^i$ are the required parameters for optimization. The order is 12 in the present paper; so, there are 24 design parameters that control the airfoil shape.
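As an illustration of how a CST surface of this kind can be evaluated, the following NumPy sketch assumes the standard class-function and Bernstein-polynomial form with $N_1 = 0.5$ and $N_2 = 1.0$. The function names and the coefficient values are hypothetical and chosen only to produce a plausible shape; they are not taken from this study.

```python
import numpy as np
from math import comb

def class_function(x, n1=0.5, n2=1.0):
    """Class function x^N1 * (1 - x)^N2 (round nose, pointed aft end)."""
    return x**n1 * (1.0 - x)**n2

def shape_function(x, coeffs):
    """Weighted sum of Bernstein polynomials; the order is len(coeffs) - 1."""
    n = len(coeffs) - 1
    s = np.zeros_like(x, dtype=float)
    for i, a in enumerate(coeffs):
        s += a * comb(n, i) * x**i * (1.0 - x)**(n - i)
    return s

def cst_surface(x, coeffs, y_te=0.0):
    """Surface ordinate: class function times shape function plus trailing-edge term."""
    return class_function(x) * shape_function(x, coeffs) + x * y_te

x = np.linspace(0.0, 1.0, 101)                 # dimensionless chordwise stations
# Hypothetical coefficients (assumed for illustration only).
upper = cst_surface(x, [0.17, 0.16, 0.15, 0.14], y_te=0.001)
lower = cst_surface(x, [-0.17, -0.16, -0.15, -0.14], y_te=-0.001)
thickness = upper - lower
```

Because the Bernstein polynomials sum to one, setting all coefficients to the same value reduces the shape function to that constant, which is a convenient sanity check.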
2.1.2. PCA Dimensionality Reduction Method
PCA [19,20] is a multivariate analysis method that reduces the number of variables. PCA is implemented on the basis of a database, and its idea is to transform the original set of parameters into a new set with a lower dimension while preserving the intrinsic information of the original data.
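Assume that the size of the database is n, and the dimension of the input data is m. The input data can be written in the form of a matrix, as follows:
$$A = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}$$
PCA for the database is implemented as per the following procedures.
(1) Obtain a new matrix as follows:
$$DA = [d_1, d_2, \ldots, d_m], \qquad d_{ki} = x_{ki} - \bar{x}_i, \qquad \bar{x}_i = \frac{1}{n}\sum_{k=1}^{n} x_{ki}$$
It is obvious that the mean value of the elements in the vector $d_i$ (i = 1, 2, …, m) is zero.
(2) Calculate the covariance matrix of DA:
$$C = [c_{ij}]_{m \times m}$$
where $c_{ij}$ (i, j = 1, 2, …, m) is the covariance of $d_i$ and $d_j$, and it is expressed as
$$c_{ij} = \frac{1}{n-1}\sum_{k=1}^{n} d_{ki}\,d_{kj}$$
(3) Calculate eigenvectors $v_i$ (i = 1, 2, …, m) and the corresponding eigenvalues $\lambda_i$ (i = 1, 2, …, m) of matrix C through
$$C\,v_i = \lambda_i\,v_i$$
Then, rank the eigenvalues from largest to smallest as $\tilde{\lambda}_1$, $\tilde{\lambda}_2$, …, $\tilde{\lambda}_m$, where the corresponding eigenvectors are $\tilde{v}_1$, $\tilde{v}_2$, …, $\tilde{v}_m$.
(4) Calculate the contribution rate of each principal component:
$$\eta_i = \frac{\tilde{\lambda}_i}{\sum_{j=1}^{m} \tilde{\lambda}_j}$$
Then, calculate the cumulative contribution rate of the principal components:
$$\eta_{\Sigma}(p) = \sum_{i=1}^{p} \eta_i$$
Select the largest p (p < m) principal components through
$$\eta_{\Sigma}(p) \geq q$$
In this study, q is set to be 1 to transform the original input data into a lower dimension.
(5) Generate the new dataset using the largest p principal components:
$$[y_1, y_2, \ldots, y_p] = DA\,[\tilde{v}_1, \tilde{v}_2, \ldots, \tilde{v}_p]$$
where $y_1$, $y_2$, …, $y_p$ are called the principal components of the original data. As can be seen, the dimension of the transformed dataset is p, which is lower than the dimension of the original dataset, namely, m. Thus, the design number is reduced from m to p. For a set of reduced design variables [$y_1$, $y_2$, …, $y_p$], it can be returned to the original design variables [$x_1$, $x_2$, …, $x_m$] through
$$[x_1, x_2, \ldots, x_m] = [y_1, y_2, \ldots, y_p]\,[\tilde{v}_1, \tilde{v}_2, \ldots, \tilde{v}_p]^{T} + [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_m]$$
The PCA airfoil parameterization modeling process studied in this paper is illustrated in Figure 2.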
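The five PCA steps above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the database is synthetic (rank-3 data embedded in 24 dimensions), the helper names `fit_pca`, `encode`, and `decode` are hypothetical, and the cumulative-contribution threshold `q = 0.999` is used only because it suits this synthetic example.

```python
import numpy as np

def fit_pca(A, q):
    """Centre the data, eigen-decompose its covariance, keep p components."""
    mean = A.mean(axis=0)
    DA = A - mean                                   # step (1): zero-mean columns
    C = np.cov(DA, rowvar=False)                    # step (2): m x m covariance (1/(n-1) scaling)
    eigvals, eigvecs = np.linalg.eigh(C)            # step (3): eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # re-rank from largest to smallest
    eigvals = np.clip(eigvals[order], 0.0, None)    # clip tiny negative round-off
    eigvecs = eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum() # step (4): cumulative contribution rate
    p = min(int(np.searchsorted(cumulative, q)) + 1, len(eigvals))
    return mean, eigvecs[:, :p], cumulative

def encode(A, mean, V):
    """Step (5): project centred data onto the retained eigenvectors."""
    return (A - mean) @ V

def decode(Y, mean, V):
    """Map reduced variables back to the original design variables."""
    return Y @ V.T + mean

# Synthetic database: 200 samples, 24 variables, intrinsic dimension 3 (assumed).
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 24)) + 0.5

mean, V, cumulative = fit_pca(A, q=0.999)
Y = encode(A, mean, V)
A_rec = decode(Y, mean, V)
```

In this example the 24 original variables collapse to 3 reduced variables, and decoding recovers the original data to round-off because the synthetic data have exactly three non-zero eigenvalues.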
2.2. Deep Neural Network
A DNN is a model composed of multiple layers of interconnected neurons, where each layer contains several neurons that learn to establish complex mappings between inputs and outputs.
Figure 3 presents a schematic diagram of a single neuron. The output of the neuron is defined as:
$$y = \sigma\left(\sum_{i=1}^{n} w_i x_i + b\right)$$
In the equation, $x_i$ represents the input to the neuron; $w_i$ denotes the weights corresponding to each input; b represents the bias; and σ is the activation function.
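A single neuron, and a layer built from several such neurons, can be written directly from this definition. The sketch below is illustrative: the choice of `tanh` as the activation function and all numerical values are assumptions, since the activation and architecture are not specified in this section.

```python
import numpy as np

def neuron(x, w, b, activation=np.tanh):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    return activation(np.dot(w, x) + b)

def dense_layer(x, W, b, activation=np.tanh):
    """A layer is a stack of neurons: one row of W and one entry of b per neuron."""
    return activation(W @ x + b)

x = np.array([0.5, -1.0, 2.0])       # example inputs (assumed)
w = np.array([0.2, 0.4, 0.1])        # example weights (assumed)
single = neuron(x, w, b=0.1)

W = np.vstack([w, -w])               # two neurons sharing the same inputs
layer = dense_layer(x, W, b=np.array([0.1, 0.0]))
```

Stacking such layers, with the output of one layer used as the input of the next, gives the multi-layer mapping described above.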
2.3. Optimization Method
2.3.1. Multi-Island Genetic Algorithm
The MIGA [21] is used to optimize the airfoil. This algorithm has better global optimization ability and higher computational efficiency than the Genetic Algorithm (GA). The algorithm divides a large population into several subpopulations called islands. On each island, the traditional GA is applied for subpopulation evolution. The GA is inspired by the survival of the fittest during natural selection. First, the object to be optimized is encoded in the solution domain. The algorithm then generates high-quality solutions through genetic operators such as selection, crossover, and mutation. A large population is used to search for the optimal solution [22].
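The island structure can be illustrated with a compact sketch. The code below is not the implementation used in this study: the operators (binary tournament selection, uniform crossover, Gaussian mutation, elitism), the ring migration scheme, the parameter values, and the toy objective are all assumptions chosen to keep the example short.

```python
import numpy as np

def evaluate(fitness, pop):
    return np.array([fitness(ind) for ind in pop])

def miga(fitness, dim, bounds, n_islands=4, island_size=20, generations=30,
         migration_interval=5, mutation_scale=0.05, seed=0):
    """Maximize `fitness` with several GA subpopulations and periodic migration."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    islands = [rng.uniform(lo, hi, size=(island_size, dim)) for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        for k, pop in enumerate(islands):
            scores = evaluate(fitness, pop)
            elite = pop[np.argmax(scores)].copy()
            # Selection: binary tournament between randomly drawn pairs.
            a = rng.integers(island_size, size=island_size)
            b = rng.integers(island_size, size=island_size)
            parents = np.where((scores[a] >= scores[b])[:, None], pop[a], pop[b])
            # Crossover: uniform mixing with a shuffled partner.
            partners = parents[rng.permutation(island_size)]
            mask = rng.random((island_size, dim)) < 0.5
            children = np.where(mask, parents, partners)
            # Mutation: Gaussian perturbation, clipped to the bounds.
            children = children + rng.normal(0.0, mutation_scale * (hi - lo), size=children.shape)
            children = np.clip(children, lo, hi)
            children[0] = elite                      # elitism keeps the island's best
            islands[k] = children
        if gen % migration_interval == 0:
            # Ring migration: the best of island k replaces the worst of island k + 1.
            bests = [pop[np.argmax(evaluate(fitness, pop))].copy() for pop in islands]
            for k in range(n_islands):
                target = islands[(k + 1) % n_islands]
                target[np.argmin(evaluate(fitness, target))] = bests[k]
    everyone = np.vstack(islands)
    scores = evaluate(fitness, everyone)
    return everyone[np.argmax(scores)], float(scores.max())

def toy_fitness(x):
    """Hypothetical objective with its maximum (zero) at x = 0.3 in every dimension."""
    return -float(np.sum((x - 0.3) ** 2))

best_x, best_score = miga(toy_fitness, dim=3, bounds=(-1.0, 1.0))
```

Each island evolves independently between migrations, so the islands can settle in different regions of the search space before exchanging their best individuals.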
2.3.2. Optimization Objective and Constraint Conditions
The optimization procedure focuses on two key flight conditions: climb and low-speed cruise. During the climb phase, enhancing lift is necessary to support efficient upward motion. During the low-speed cruise phase, the goal is to maximize the lift-to-drag ratio. Achieving this improves aerodynamic performance, lowers fuel consumption, and increases the aircraft’s operational range. The optimization variables are the PCA weights of the upper and lower surfaces, 18 components in total: the first 9 are the PCA weights for the upper surface, and the remaining 9 are those for the lower surface. The airfoil designed in this study is intended for a typical military UAV and has a chord length of 600 mm. The boundary conditions and optimization objectives for the two flight states are summarized in Table 1.
The optimization enforces two-level thickness constraints:
1. Local Constraints: At 11 chordwise positions χ∈{0.1, 0.2, 0.3, 0.4, 0.45, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9}:
2. Global Constraint: maximum thickness:
2.4. Optimization-Guided Data Augmentation
To expand the high-quality training set for the surrogate model, the MIGA is executed 100 times, with each run initialized using a distinct pseudo-random seed k∈{1,…,100}. At the start of each restart, the default Pseudo-Random Number Generator engines in both the Python 3.9.13 random module and the NumPy library are reseeded using the same value k. This setup ensures that initialization, crossover, mutation, and reproduction follow a distinct and reproducible path in each restart. Each path corresponds to a specific realization of the Markov chain [23].
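The reseeding step can be sketched as follows, assuming that the "default engines" are the module-level generators behind `random.seed` and `numpy.random.seed`. The helper names `reseed` and `draw` are hypothetical and exist only to show that equal seeds reproduce the same random stream while different seeds do not.

```python
import random
import numpy as np

def reseed(k):
    """Reseed both global generators with the same integer k."""
    random.seed(k)
    np.random.seed(k)

def draw(k):
    """Draw a few numbers from both generators after reseeding with k."""
    reseed(k)
    return random.random(), np.random.rand(3)

first = draw(7)
repeat = draw(7)      # same seed: identical stream
other = draw(8)       # different seed: different stream

for k in range(1, 101):
    reseed(k)
    # One short MIGA restart would be launched here (omitted in this sketch).
```

Seeding both generators with the same value is sufficient only if every stochastic operator draws from one of these two global generators; code that creates its own generator objects would need to be seeded separately.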
Each restart generates ten loosely coupled islands, with each island consisting of 30 individuals. The evolutionary process is then carried out for exactly 10 generations. This island-based structure enables the algorithm to explore multiple regions of the search space in parallel while keeping the communication overhead between islands relatively low. Each individual in the population is represented by a stacked vector that encodes the design variables used in the optimization process.
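Let $\mathbf{a}_i^{u} \in \mathbb{R}^{9}$ and $\mathbf{a}_i^{l} \in \mathbb{R}^{9}$ denote the upper-surface and lower-surface PCA coefficients of the i-th individual, respectively. Each individual is thus represented by an 18-dimensional vector $\mathbf{z}_i = [\mathbf{a}_i^{u}; \mathbf{a}_i^{l}] \in \mathbb{R}^{18}$ formed by stacking these two components. The optimization process is therefore conducted within this 18-dimensional PCA coefficient space.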
All candidate airfoils were subsequently evaluated using XFOIL to obtain accurate aerodynamic coefficients. The corresponding surrogate model predictions were then replaced with these high-fidelity simulation results. This correction step compensates for the surrogate model’s limited accuracy in regions associated with high aerodynamic performance. The validated samples were added to the original training dataset and used to retrain the neural network. Using the retrained surrogate model, a final round of global optimization was conducted to identify the optimal airfoil geometry with enhanced reliability in aerodynamic predictions. By introducing targeted high-performance samples and refining the distribution of training data, the approach effectively mitigated the original model’s deficiencies in critical subspaces of the design space.
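The augment-and-retrain cycle described in this subsection can be sketched end to end with stand-ins for each expensive component. Everything in the block is an assumption made for illustration: `expensive_eval` is an analytic function standing in for XFOIL, the surrogate is a least-squares quadratic fit rather than a DNN, and `propose_candidates` uses a seeded random search in place of the short MIGA restarts.

```python
import numpy as np

def expensive_eval(X):
    """Stand-in for the XFOIL evaluation (hypothetical analytic objective)."""
    return -np.sum((X - 0.5) ** 2, axis=1) + 0.1 * np.sum(np.sin(5.0 * X), axis=1)

def features(X):
    return np.hstack([np.ones((len(X), 1)), X, X ** 2])

def fit_surrogate(X, y):
    """Stand-in for DNN training: least-squares fit on quadratic features."""
    coef, *_ = np.linalg.lstsq(features(X), y, rcond=None)
    return lambda Xq: features(Xq) @ coef

def propose_candidates(model, n_restarts, dim):
    """Stand-in for short MIGA restarts: best surrogate-predicted point per seed."""
    picks = []
    for k in range(1, n_restarts + 1):
        batch = np.random.default_rng(k).uniform(-1.0, 1.0, size=(300, dim))
        picks.append(batch[np.argmax(model(batch))])
    return np.array(picks)

dim = 4
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(40, dim))      # initial passive sample
y_train = expensive_eval(X_train)
model = fit_surrogate(X_train, y_train)

X_new = propose_candidates(model, n_restarts=20, dim=dim)
y_new = expensive_eval(X_new)                         # evaluated values replace predictions
X_aug = np.vstack([X_train, X_new])
y_aug = np.concatenate([y_train, y_new])
model = fit_surrogate(X_aug, y_aug)                   # retrain on the augmented set
```

The added samples concentrate where the surrogate predicts high performance, so the retrained model sees more data in exactly the region that the final optimization will search.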
4. Discussion
(1) The airfoil parameterization approach based on PCA serves two main purposes. First, it reduces the dimensionality of the design variables. Second, it retains the essential geometric characteristics required for accurate airfoil representation. However, selecting an insufficient number of principal modes may limit the ability to capture complex geometric features. This reduction in expressiveness can negatively impact the model’s predictive accuracy. On the other hand, incorporating too many modes may introduce components dominated by noise. These components do not contribute meaningful information and may degrade prediction performance.
(2) The aerodynamic surrogate model for airfoils integrates deep learning with PCA. This combination enables efficient dimensionality reduction of the input geometry. At the same time, it preserves a high level of predictive accuracy in modeling aerodynamic performance.
(3) After the initial training phase of the machine learning model, the MIGA is applied to explore a wider range of airfoil candidates that are predicted to exhibit high aerodynamic performance. These predicted high-performance airfoils are subsequently evaluated using XFOIL to obtain more accurate aerodynamic coefficients. The refined samples are then used to enrich the dataset with targeted high-performance data. This approach reduces the inefficiency associated with purely random sampling and helps alleviate the bias introduced by insufficient data in regions of high aerodynamic quality.
(4) The interpretability workflow employs a combination of visualization tools to construct a clear and organized local explanation system. These tools include waterfall plots, stacked contribution charts, comparative bar plots, and SHAP correlation diagrams. Together, they provide detailed insight into the prediction behavior of the surrogate model. This structured framework is used to support the evaluation of the model’s reliability in specific prediction scenarios.
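To make the attribution idea behind SHAP concrete, the sketch below computes exact Shapley values by enumerating feature subsets for a small model. It is a simplified illustration and not the SHAP library: absent features are replaced by a single baseline vector rather than averaged over background data, the function `shapley_values` and both toy models are hypothetical, and the cost grows exponentially with the number of features, so it is practical only for a handful of inputs.

```python
from itertools import combinations
from math import factorial
import numpy as np

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with missing features set to the baseline."""
    n = len(x)
    phi = np.zeros(n)

    def value(subset):
        z = baseline.copy()
        idx = np.array(subset, dtype=int)
        z[idx] = x[idx]                      # features in the subset take their real values
        return f(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                phi[i] += weight * (value(S + (i,)) - value(S))
    return phi

w = np.array([0.5, -1.0, 2.0, 0.25])
x = np.array([1.0, 2.0, 3.0, 4.0])
baseline = np.zeros(4)

linear = lambda z: float(w @ z)
with_interaction = lambda z: float(w @ z + z[0] * z[1])

phi_linear = shapley_values(linear, x, baseline)
phi_inter = shapley_values(with_interaction, x, baseline)
```

For the linear model each attribution equals the weight times the deviation from the baseline, and in both cases the attributions sum to the difference between the prediction at `x` and the prediction at the baseline. That additivity is what waterfall and stacked-contribution plots display.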