Aerospace
Article | Open Access

11 March 2026

Binary Icing Shapes Prediction via Principal Component Analysis and Deep Learning Method

1 School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, China
2 School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
* Author to whom correspondence should be addressed.

Abstract

Aircraft icing prediction is crucial for aerodynamic design and airworthiness assessment. Traditional physics-based models struggle with complex multi-physical processes, while existing AI methods (function-based characterization or direct image learning) face issues such as multi-valued mapping, high data dependency, or a lack of physical interpretability. This study proposes a deep learning framework based on a point set displacement description, transforming the icing process into the movement of airfoil boundary points. PCA dimensionality reduction mitigates the curse of dimensionality while retaining physical meaning, and a neural network maps environmental parameters to the low-dimensional principal components. Comparative analysis shows that the 64 × 64 network achieves the best fit; 2000 samples reproduce complex ice shapes, while 800 samples suffice to characterize simple ones. Balancing efficiency, accuracy, and interpretability with reduced data dependency, this method provides a new approach for rapid engineering icing prediction.

1. Introduction

Ice prediction constitutes one of the most significant technical approaches for aircraft aerodynamic design and airworthiness assessment [1], and plays an indispensable role in evaluating the survivability of aircraft under severe weather conditions. However, predicting an aircraft's icing characteristics involves multiple complex physical processes [2], including supercooled droplet movement, water impingement and collection, film flow, heat transfer, and phase change. Over the past few decades, researchers have carried out in-depth studies on this issue and developed a series of physical models [3,4,5,6,7,8,9,10,11,12,13,14] to improve prediction accuracy; yet, a unified framework for wide-ranging applications has not been established. From the perspective of development trends, it has become increasingly challenging to achieve rapid advancements in icing prediction capabilities by relying exclusively on physical models. Research hotspots have therefore shifted toward data-integrated methodologies, which presents a significant challenge to the application of artificial intelligence (AI) in this field.
In early research, AI-based ice prediction usually characterized ice shapes with functions. For instance, Ogretim et al. [14] represented experimental ice shapes as normalized Fourier series, and trained predictive models via the General Regression Neural Network (GRNN) and back propagation (BP) neural network approaches to construct a mapping between the geometric parameters of icing shapes and a five-dimensional set of environmental variables (freestream velocity V, temperature T, liquid water content (LWC), median volume diameter (MVD), and exposure time). This work preliminarily validated the effectiveness of neural networks (NNs) for predicting binary icing shapes. In comparison with LEWICE, the NN approach exhibits superior computational efficiency and lower resource consumption, yet it generates inaccurate predictions under some complex conditions, such as glaze ice. Building on the Fourier series characterization, Yi et al. [14] designed a network structure based on deep fully connected layers (DNN) and a stacked auto-encoder (SAE). Judged on icing characteristic parameters such as horn peak thickness, horn angle, icing limit, leading edge minimum thickness, and total ice area, the DNN-SAE results characterize the nonlinear features of icing shapes better than the traditional BP approach. Qu et al. [15] attempted to extend this methodology to arbitrary initial airfoils. When establishing the airfoil icing prediction network, they additionally introduced the pressure coefficient of the initial airfoil surface for training, which partially realizes the transformation of icing shape prediction from a single airfoil to arbitrary airfoils. However, this method is only applicable to low-speed incompressible flows. Chang et al. [16] employed wave functions at multiple frequencies with the Wavelet Packet Transform (WPT) to characterize two-dimensional icing shapes.
The results indicated that WPT exhibits significant advantages over LEWICE and Fourier series in capturing ice horn morphologies and surface details. A notable limitation is that although the selection of fourth-order WPT balances prediction accuracy and computational efficiency, the truncation of high-frequency components may lead to the loss of details regarding the surface roughness of glaze ice.
An alternative research paradigm leans toward the direct learning of icing images. For instance, He [17] proposed an image-based prediction method for airfoil icing shapes via Transposed Convolutional Neural Networks (TCNN) to address the multi-valued icing thickness that complex icing shapes exhibit in the normal direction at a given position on the airfoil surface. It takes icing conditions as inputs and 512 × 256-pixel grayscale images of icing shapes as outputs, enabling the rapid generation of airfoil icing images. He et al. [18] also constructed an end-to-end prediction model from images to lift and drag coefficients using Convolutional Neural Networks (CNNs). With grayscale images of iced airfoils as input and lift/drag coefficients as output, this model realizes the rapid prediction of aerodynamic characteristics directly from icing shape images, with the relative error of lift/drag coefficient prediction controlled within 8%. The results indicate that increasing the number of convolutional layers captures more high-frequency features, while increasing the number of convolution kernels extracts more icing shape features; nevertheless, an excessive number introduces redundant features and reduces the generalization performance of the model. Yu et al. [19] introduced several new AI approaches to icing image learning. By integrating a Feature Extraction Autoencoder (FEAE), a Multi-Layer Perceptron (MLP) network, an Image Generation Neural Network (IGNN), and a Noise Reduction Autoencoder (NRAE), they established a Multi-Autoencoder Fusion Network (MAEFN) method based on a seven-dimensional icing condition vector. This method achieves an average pixel accuracy of 98.85% on the validation set and is 25,000 times faster than the LEWICE algorithm. A notable limitation is that the neural network is an interpolation model, which relies heavily on large-sample sparse matrices.
When the icing conditions exceed the training conditions, the reliability will quickly decrease. Yu [20] further developed the deep learning method with the Fused Super-Resolution Network (FSRN) and Vision Transformer (ViT). FSRN decomposes the training process into two stages: major feature generation and minor detail generation. When the sample size is reduced to 1000, they can achieve higher prediction accuracy than the MLP architecture, implying potential for low-sample requirements. However, the limitations of this fully image-based learning method are also prominent: first, it has a strong dependence on data; second, it lacks physical interpretability.
From the above, it can be concluded that the data dependence of AI-based ice prediction remains an urgent problem to be solved, whether characterizing ice shapes with functions or directly learning ice images. When functions are used to characterize complex iced airfoil morphologies, model failure may occur due to multi-valued characteristics in the normal direction. Direct image learning, in contrast, is highly dependent on data and sacrifices the physical interpretability of icing. This study therefore proposes a novel deep learning method based on a point set displacement description, combining Principal Component Analysis (PCA) and a Multi-Layer Perceptron (MLP). Section 2 elaborates on the design concept and evaluation methodology, Section 3 conducts a comparative analysis, and Section 4 presents the conclusions and future work.

2. Numerical Methodology

2.1. Generalized Difference-in-Differences

Unlike traditional methods (either functional characterization or direct image learning), this methodology extracts icing features using several boundary point sets; the icing process is thus described as the movement of these point sets from a clean airfoil to an iced airfoil, as shown in Figure 1. By subtracting the coordinates of corresponding points, the resulting point set displacement matrix effectively characterizes the boundary growth of the ice shape, thereby transforming the complex icing growth process into binary displacement vectors. The displacement of the ith boundary point in each sample can be written as:
pi = (ui, vi) = (xi′ − xi, yi′ − yi)
in which ui and vi are the displacements of the ith point in the x and y directions, respectively. From a physical perspective, this can be interpreted as the icing growth amount under the corresponding environmental conditions Em. That is to say, the mapping relationship between environmental parameters and icing growth characteristics here is physically interpretable. However, direct use of such high-dimensional point set displacement matrices for deep neural network (DNN) training might give rise to a severe curse of dimensionality, primarily including: (1) the model is prone to overfitting to the training data, leading to a degradation in the generalization performance of predictions; (2) the number of training samples required grows exponentially, which is often difficult to satisfy in practical engineering applications; (3) the scale of network parameters and the consumption of computing resources increase substantially, resulting in low training efficiency; and (4) the mapping relationship becomes excessively complex, which may cause the network to converge to a local optimum and render it highly susceptible to noise interference. To address this issue, the method adopts Principal Component Analysis (PCA) [21,22,23,24,25,26] for dimensionality reduction, extracting several principal components as the low-dimensional spatial features.
Figure 1. Generalized difference-in-differences in the iced process.
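To make the description concrete, the displacement extraction can be sketched in a few lines of NumPy; the row-wise point correspondence between the clean and iced boundaries is an assumption for illustration, not necessarily the authors' exact data layout.

```python
import numpy as np

def displacement_vectors(clean, iced):
    """Point-set displacements p_i = (u_i, v_i) = (x_i' - x_i, y_i' - y_i).

    clean, iced : (n_points, 2) arrays of corresponding boundary points.
    Returns a flat vector [u_1, v_1, ..., u_n, v_n], i.e., one sample row
    of the displacement matrix P.
    """
    return (iced - clean).reshape(-1)

# toy example: three boundary points, each displaced by (0.01, -0.02)
clean = np.array([[0.0, 0.0], [0.5, 0.1], [1.0, 0.0]])
iced = clean + np.array([0.01, -0.02])
p = displacement_vectors(clean, iced)
```

Stacking one such row per icing sample yields the n × s displacement matrix Pn×s used below.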

2.2. Principal Component Analysis

As shown in Figure 2, PCA projects high-dimensional data onto a set of orthogonal basis vectors (i.e., principal components) via linear transformation, and measures the importance of the corresponding principal components by the magnitude of the projection variance in each direction, thus maximizing the retention of the main information of the original data with the fewest possible dimensions. PCA retains the data information while reducing the data dimension, thereby improving the efficiency of the model and maintaining its physical interpretability.
Figure 2. Extraction of the principal components with PCA.

2.2.1. Data Centering

The first step of PCA is data centering. Let the original high-dimensional dataset be Pn×s, where n is the sample number and s is the dimension of the original data. Data centering requires computing the mean vectors for each displacement component separately:
ua = (Σin ui)/n
va = (Σin vi)/n
(pi)a = (ui − ua, vi − va)
where ua and va are the mean values and (pi)a is the centered vector. The centering matrix (PC)n×s does not alter the distributional properties of the data, but simplifies the derivation of the PCA algorithm. For high-dimensional data, directly performing eigenvalue decomposition on the covariance matrix may incur high computational complexity. The Singular Value Decomposition (SVD) is therefore adopted to extract the principal components indirectly:
PC = UΛQT
where Λ is an n × s rectangular diagonal matrix of positive numbers σk, called the singular values of PC; U is an n × n orthogonal matrix whose columns are the left singular vectors of PC; and Q is an s × s orthogonal matrix whose columns are the right singular vectors of PC. In terms of this factorization, and using the orthogonality UTU = I, the matrix (PC)TPC can be written as:
(PC)TPC = QΛTUTUΛQT = QΣ2QT
where Σ denotes the s × s square diagonal matrix consisting of the singular values σk of PC. Here σk = (λk)0.5, where λk is the kth eigenvalue of (PC)TPC, and Q is equivalent to the eigenvector matrix of (PC)TPC.
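As a sanity check on the SVD route, the centering and decomposition steps can be sketched with NumPy; the random matrix below merely stands in for a real displacement dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(100, 40))        # n = 100 samples, s = 40 displacement dims

mean = P.mean(axis=0)                  # component-wise means (ua, va stacked)
P_C = P - mean                         # centered matrix

# thin SVD: P_C = U diag(sigma) Qt, singular values in descending order
U, sigma, Qt = np.linalg.svd(P_C, full_matrices=False)

# sigma_k**2 are exactly the eigenvalues of (P_C)^T (P_C)
eigvals = np.linalg.eigvalsh(P_C.T @ P_C)[::-1]   # sorted descending
```

Here `np.allclose(sigma**2, eigvals)` holds, confirming numerically the σk = (λk)0.5 relation quoted above.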

2.2.2. Principal Component Extraction

With SVD, we can further extract principal components (PCs) and project the high-dimensional data into a low-dimensional subspace, which is the goal of PCA. First, the singular values σ1 ≥ σ2 ≥ … ≥ σr (r denotes the rank of PC) are arranged in descending order, with the corresponding right singular vectors q1, q2, …, qr, i.e., the columns of Q, permuted accordingly. The first L right singular vectors are selected to form the s × L projection matrix QL, where each column corresponds to a PC direction; the orthogonality of these right singular vectors guarantees that the projected principal components are linearly independent. The n × L low-dimensional principal component score matrix can be written as:
ZL = PCQL
Each row of matrix ZL corresponds to the low-dimensional representation of an original sample, and each column represents a PC. The orthogonality of QL guarantees that the columns of ZL are mutually uncorrelated, thereby simultaneously achieving the two core objectives of dimensionality reduction and decorrelation.
It is noteworthy that PCA is merely employed as a mathematical preprocessing tool for dimensionality reduction herein, and the derived PCs are linear combinations of the original displacement vectors with no independent physical meaning. Nevertheless, they collectively preserve the geometric information of the physically interpretable boundary point displacement dataset that characterizes the transformation from a clean airfoil to an iced one. The physical interpretability of the proposed method does not stem from PCA itself, but from the fact that both the inputs and outputs are constructed based on physically meaningful quantities: the inputs are icing environmental parameters, and the outputs are the boundary point displacements reconstructed via inverse PCA.

2.3. Multi-Layer Perceptron

A Multi-Layer Perceptron (MLP) [27] artificial neural network is set up between the environmental parameters and the low-dimensional principal component score matrix ZL. The input layer is the environmental matrix consisting of the velocity, temperature, liquid water content (LWC), median volume diameter (MVD), freezing time, etc., while the output layer consists of the L principal component scores. Several fully connected hidden layers lie between them, as shown in Figure 3.
Figure 3. The structures of the Multi-Layer Perceptron neural network.
The forward propagation realizes feature mapping and prediction from input to output, while the back propagation updates network parameters via the gradient descent method [27] to minimize the loss function between predicted values and true values, thereby driving the network training process to convergence. The mathematical expression of forward propagation can be written as:
hj = σ(Wjhj−1 + bj)
where hj is the hidden state of jth layer, σ(.) is a nonlinear activation function, Wj is the weight matrix, and bj is the bias vector. For regression tasks, the Mean Squared Error (MSE) is frequently adopted as the loss function J:
MSE = 1/N ΣjN{1/n Σin[(zji)predzji]2}
where N is the number of samples, n is the number of neurons in the output layer, zji is the true value of the jth sample at the ith output neuron, and (zji)pred is the predicted value.
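Under the PyTorch stack the paper reports, the environmental-parameter-to-score regression could be set up roughly as below. The 7 → 64 → 64 → L layout mirrors the 64 × 64 architecture examined later; L = 20 retained components and the random batch are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

L = 20                                  # assumed number of retained PCs
model = nn.Sequential(                  # two 64-neuron hidden layers
    nn.Linear(7, 64), nn.SiLU(),
    nn.Linear(64, 64), nn.SiLU(),
    nn.Linear(64, L),                   # linear output for regression
)

E = torch.randn(32, 7)                  # batch of environmental parameter vectors
Z_true = torch.randn(32, L)             # matching PC scores (placeholder data)

Z_pred = model(E)                       # forward propagation
loss = F.mse_loss(Z_pred, Z_true)       # the MSE loss J
```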
Back propagation (BP) is the core algorithm for MLP training. Its purpose is to compute the gradients of the loss function with respect to the parameters of each layer, then update the parameters via gradient descent to minimize the loss. Writing aj = Wjhj−1 + bj for the pre-activation input of layer j, it can be written as:
δk = ∂J/∂zpred ⊙ σ′(ak)
δj = (Wj+1)Tδj+1 ⊙ σ′(aj)
∂J/∂Wj = δj(hj−1)T
∂J/∂bj = δj
where δk is the error term of the output layer k, σ′(.) is the derivative of the activation function, and ⊙ is the Hadamard product of two matrices or vectors of the same dimensions, defined as the element-wise multiplication of their corresponding elements. The weights and biases of each layer are then updated via gradient descent using the computed gradients:
(Wj)update = Wj − η(∂J/∂Wj)
(bj)update = bj − η(∂J/∂bj)
where η is the learning rate. The cycle of forward propagation, back propagation, and parameter update is repeated until the loss function converges or the predefined maximum number of iterations is attained.
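The forward/backward/update cycle can be exercised with a minimal NumPy sketch for a single hidden layer; tanh is used as a stand-in activation, the output layer is linear, and the toy input/target pair is random.

```python
import numpy as np

rng = np.random.default_rng(0)
act = np.tanh
dact = lambda a: 1.0 - np.tanh(a) ** 2             # derivative of the activation

x = rng.normal(size=(7, 1))                        # toy environmental input
t = rng.normal(size=(20, 1))                       # toy target scores
W1, b1 = 0.1 * rng.normal(size=(64, 7)), np.zeros((64, 1))
W2, b2 = 0.1 * rng.normal(size=(20, 64)), np.zeros((20, 1))
eta = 0.05                                         # learning rate

losses = []
for _ in range(200):
    a1 = W1 @ x + b1                               # pre-activation of hidden layer
    h1 = act(a1)                                   # forward propagation
    z = W2 @ h1 + b2                               # linear output layer
    losses.append(0.5 * np.sum((z - t) ** 2))      # loss J
    d2 = z - t                                     # output-layer error term
    d1 = (W2.T @ d2) * dact(a1)                    # back-propagated hidden error
    W2 -= eta * d2 @ h1.T; b2 -= eta * d2          # gradient-descent updates
    W1 -= eta * d1 @ x.T;  b1 -= eta * d1
```

On this toy problem the loss decays steadily toward zero, illustrating the convergence of the cycle described above.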
This study employs the Kolmogorov–Arnold Network (KAN) [28], a variant of MLP. The fundamental architectural distinction lies in the design of activation functions; while MLP deploys nonlinear activations (e.g., SiLU) at the nodes (neurons), with edges performing merely linear transformations, KAN instead places learnable univariate functions (typically parameterized via B-splines) on the edges, leaving nodes to conduct only simple summation of inputs.
Regarding the univariate functions, our KAN model employs a linear combination of a base residual function and a parameterized spline. Specifically, we utilized the SiLU (Sigmoid Linear Unit) function as the base activation, combined with cubic B-splines (degree k = 3) to form the learnable functions on the edges. This setup aligns with the standard architecture proposed in the foundational KAN literature to ensure both expressivity and smooth optimization.
As for the software stack, our machine learning pipeline was developed in Python 3.9.25 relying primarily on the PyTorch 2.8.0 deep learning framework. The KAN architecture itself was implemented utilizing the open-source official pykan library within the PyCharm 2025.2.4 Integrated Development Environment (IDE). Furthermore, all model training and inference procedures were accelerated using CUDA on an NVIDIA RTX 4090 GPU. Each model was trained for 20 min, and the prediction time for a single sample was 1 s.
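For intuition about the node/edge swap, a deliberately simplified KAN-style layer is sketched below. This is not the pykan implementation: pykan places cubic B-spline functions on the edges, whereas this toy substitutes a Gaussian radial-basis expansion to keep the code short; only the SiLU residual path matches the base activation described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyKANLayer(nn.Module):
    """Each edge (input i -> node j) carries its own learnable univariate
    function: a weighted SiLU base plus a learnable combination of fixed
    Gaussian bumps; nodes simply sum their incoming edges."""

    def __init__(self, in_dim, out_dim, n_basis=8):
        super().__init__()
        self.register_buffer("grid", torch.linspace(-2.0, 2.0, n_basis))
        self.coef = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, n_basis))
        self.base_w = nn.Parameter(0.1 * torch.randn(out_dim, in_dim))

    def forward(self, x):                                    # x: (batch, in_dim)
        basis = torch.exp(-(x.unsqueeze(-1) - self.grid) ** 2)   # (B, in, n_basis)
        spline = torch.einsum("bim,oim->bo", basis, self.coef)   # edge functions, summed at nodes
        base = F.silu(x) @ self.base_w.T                         # residual SiLU path
        return base + spline

layer = ToyKANLayer(7, 64)
out = layer(torch.randn(5, 7))
```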

2.4. Predictive Reconstruction

After completing the learning process, we can obtain the prediction matrix (ZL)pred based on the updated weight/bias matrices W/b and any input environmental parameters Ex. The predictive output (ZL)pred of the MLP is the low-dimensional principal component score matrix rather than the original high-dimensional matrix, thus we perform reconstruction via inverse PCA:
(PC)pred = (ZL)pred(QL)T
The centered matrix PC is derived by subtracting the mean vector from the original matrix P, so it is necessary to add the mean vector back to (PC)pred to recover the original data Ppred. We can reconstruct the icing process based on this matrix Ppred.

2.5. Error Evaluation

To comprehensively evaluate the performance of the models from multiple perspectives, the Intersection over Union (IoU), Coefficient of Determination (R2), Mean Absolute Error (MAE), Chamfer Distance (CD) and other metrics were adopted for comparative analysis, in addition to the Mean Squared Error (MSE).
MSE measures the average squared deviation between predicted and actual values of low-dimensional principal component scores in PCA. In the ice prediction task, it reflects the fitting accuracy of the neural network for the low-dimensional feature mapping from environmental parameters to ice displacement characteristics. Smaller MSE values indicate more accurate fitting of physically meaningful low-dimensional features by the network.
As a dimensionless metric for evaluating the overlap of spatial regions, IoU quantifies the consistency between predicted ice shapes and ground-truth ice contours (e.g., segmentation regions of iced airfoil boundaries): a higher IoU value indicates that the geometry of the predicted ice accretion is more consistent with the actual one. For two regions S1 (predicted ice shape) and S2 (ground-truth ice shape), where ‖.‖ denotes the area of the region in 2D space, the IoU is calculated as:
IoU = ‖S1S2‖/‖S1S2
R2 measures the degree of linear correlation between predicted and actual PCA principal component scores, with values in [0, 1]. It intuitively represents the proportion of variance in ice displacement characteristics that the neural network prediction model can explain. An R2 value approaching 1 indicates that the model effectively captures the variation patterns of ice displacement characteristics driven by environmental parameters:
R2 = 1 − {Σin[(zji)pred − zji]2}/{Σin[(zji)mean − zji]2}
MAE is defined as the arithmetic mean of the absolute deviations between predicted and actual principal component scores. For ice shape prediction, it physically characterizes the average magnitude of displacement deviations at airfoil boundary points reconstructed through inverse PCA, directly reflecting the mean error of predicted ice thickness relative to the ground truth:
MAE = 1/n Σin|(zji)predzji|
CD quantifies the geometric similarity between discrete point sets by averaging the minimum Euclidean distances between predicted and actual airfoil boundary point sets. It is particularly suitable for evaluating the prediction accuracy of complex ice morphology details, such as ice horn tips, local protrusions, and surface roughness. Smaller CD values indicate higher microscopic geometric similarity between predicted and actual ice shapes. For the predicted set Zpred = {(zj1)pred, (zj2)pred, …, (zjn)pred} and the ground-truth set Z = {zj1, zj2, …, zjn}, where ‖.‖2 denotes the Euclidean distance:
Chamfer = (1/n) Σin min z∈Z ‖(zji)pred − z‖2² + (1/n) Σin min z′∈Zpred ‖zji − z′‖2²

3. Results and Discussions

3.1. Training and Validating Dataset

This study employs the comprehensive icing dataset established by the SJTU group [19,20], which integrates numerical simulations and experimental measurements to ensure physical accuracy and data diversity. The dataset covers a full spectrum of icing morphologies, ranging from rime ice (an opaque, low-density structure formed under low-temperature conditions) to complex glaze ice (a transparent, horned structure with surface roughness closely dependent on freezing conditions), as well as transitional rime–glaze mixed ice states. This extensive coverage enables the proposed method to be rigorously validated for icing scenarios encountered in the actual operational conditions of aircraft. Based on the operational velocity range (80.0–120.0 m/s), the corresponding Reynolds number (Re) spans 10^6–10^7. All training and validating data are generated for the NACA0012 airfoil, while the detailed ranges of the icing environmental parameters are shown in Table 1.
Table 1. Ranges of icing environmental parameters in the sample set.
A total of 10,000 ice shape samples are obtained, from which we randomly selected a determined number of samples for training and validation according to computational requirements. Each sample is composed of two parts: (1) seven icing parameters (velocity, temperature, liquid water content (LWC), median volume diameter (MVD), freezing time, angle of attack (AOA), and altitude); (2) two groups of (x, y) datasets describing the initial and iced airfoil, respectively. Prior to utilizing these data, obviously questionable icing shape data were excluded via data cleaning, leaving a total of 8709 valid samples [29]. In this paper, we selected three groups of sample sizes (800, 1200, and 2000) to test the learning ability under low-sample conditions. The test set and the three training sets employ parameter ranges consistent with the cleaned dataset, thereby avoiding potential biases in model performance evaluation that may arise from unbalanced sample selection. The ranges for the three training sample sizes are presented in Table 1.
The parameter ranges listed in Table 1 are chosen to cover critical flight phases in accordance with FAR 25 and CS 25 airworthiness certification standards. The angle of attack range (1.0–6.0°) covers typical conditions for cruise (1–2°), approach (3–4°), and landing (5–6°). The altitude range (2000–5000 m) corresponds to the approach, landing, and initial climb phases, where icing encounters pose the most significant threat to flight safety. Together with the velocity range 80–120 m/s, the selected parameter bounds ensure that the dataset fully covers representative operating conditions across multiple critical flight phases.

3.2. Effect of the Neuron Networks

Since increasing the size of the hidden layers can lead to overfitting and long training times, four network architectures are compared in this subsection, and the error caused by the different networks is discussed below. The icing environmental parameters of six representative examples are presented in Table 2. These six samples cover the full spectrum of icing regimes: examples 7358 and 34 represent rime ice conditions with low temperatures (−7.6 °C and −5.2 °C) that promote immediate freezing upon impact, producing streamlined, opaque ice shapes. Examples 3942, 6012, and 1827 represent glaze ice conditions, characterized by temperatures near freezing (−1.9 °C to −3.8 °C) combined with moderate-to-high LWC (0.33–0.53 g/m3). These conditions lead to delayed freezing, water runback, and the formation of complex double-horn geometries. Example 2102 represents a transitional case between the rime and glaze regimes, with a moderate temperature (−2.8 °C) but relatively low LWC (0.14 g/m3), resulting in ice shape complexity between the two extremes. This selection ensures that the model's performance is evaluated across both simple (rime) and complex (glaze) ice morphologies, as well as intermediate transitional forms.
Table 2. Icing environmental parameters of six samples.
Figure 4 presents the predicted ice shapes of the 32 × 32, 32 × 64, 64 × 32, and 64 × 64 hidden layer configurations under these six typical conditions. The black solid line represents the NACA0012 airfoil, the red solid line represents the ice shape in the dataset, and the remaining dashed lines correspond to the model predictions. The comparison reveals that, with a training set of 800 samples, the four network architectures demonstrate comparable prediction performance across the six examples and effectively characterize the ice shape features, thereby validating the feasibility of the proposed method. In cases 3942 and 7358, all four models generate satisfactory predictions. In cases 6012, 2102, and 34, the 64 × 64 architecture predicts both the ice shape contour and extent better than the other three models. Notably, in example 1827, although the 64 × 64 architecture still exhibits noticeable prediction deviations, it achieves the best reconstruction quality for the ice shape contour. This suggests that increasing the hidden layer dimensions enhances the network's nonlinear fitting capacity, with the observed deviations likely attributable to the limited sample size. Consequently, Section 3.3 examines the influence of sample size on prediction accuracy specifically for the 64 × 64 architecture.
Figure 4. Comparison of the predicted ice shapes of six samples with the dataset ice shapes under the 32 × 32, 32 × 64, 64 × 32, and 64 × 64 hidden layer configurations.
Figure 5 compares the IoU, MSE, R2, CD and MAE of the four models using a normalized radar chart. The 64 × 64 architecture achieves the best performance across all five metrics, simultaneously attaining accurate low-dimensional feature learning (high R2, low MSE) and precise geometric reconstruction (high IoU, low MAE and CD). In contrast, the 64 × 32 architecture exhibits contradictory behavior: acceptable MSE and R2 but significantly degraded IoU, MAE and CD. This performance gap stems from a dimensional mismatch: expanding the first hidden layer to 64 neurons for a seven-dimensional input creates 448 (7 × 64) learnable nonlinear combinations which, when not matched by a comparably wide second layer, cause overfitting under the constraint of limited training data (n = 800). These results indicate that balanced layer widths preserve both feature learning capacity and reconstruction stability under limited data conditions (n = 800), whereas the unbalanced 64 × 32 expansion causes dimensional mismatch and overfitting. The data sufficiency issue for the 64 × 64 architecture is further discussed in Section 3.3.
Figure 5. Error comparison of 32 × 32, 32 × 64, 64 × 32, and 64 × 64 hidden layer.

3.3. Effect of the Sample Size

In this section, the effect of the sample size is discussed. Three groups of sample sizes (800, 1200, and 2000), consistent with the ranges specified in Table 1, are selected to test the learning and predictive ability under low-sample conditions. Performance across the three training sample sizes is compared using the same six examples as in Section 3.2, with the line styles and colors consistent with those used previously.
As revealed by the comparison in Figure 6, all three models are essentially capable of accomplishing the prediction task. In cases 3942, 6012, and 7358, the three models produce favorable predictions with minimal difference between different models. In cases 2102 and 34, the 800- and 1200-sample models exhibit certain deviations in predicting the ice shape contour and extent, respectively: the 800-sample model effectively captures the ice shape contour, while the 1200-sample model performs better in predicting the ice shape extent, whereas the 2000-sample model accurately reproduces both. In case 1827, both the 800- and 1200-sample models show noticeable discrepancies from the ground truth, while the prediction accuracy improves substantially with the 2000-sample model. For simple ice shapes, the 800-sample model already demonstrates excellent performance. For complex ice shapes, increasing the sample size enables the models to characterize the ice shape features adequately. Nevertheless, even the 2000-sample size represents a considerably low requirement, indicating that the proposed method maintains commendable learning and predictive capabilities under low-sample conditions.
Figure 6. Comparison of the predicted ice shapes of six samples with the dataset ice shapes under sample sizes of 800, 1200, and 2000.
Figure 7 presents the prediction results and five metric values (IoU, MSE, R2, MAE, CD) of the 64 × 64 architecture under three sample sizes (800, 1200, 2000). It shows that the 800-sample model exhibits significant fluctuations in prediction errors, which reflects its insufficient prediction stability in samples involving complex or extreme ice shapes. In contrast, the model with 2000 samples maintains the minimum prediction errors and can retain good prediction accuracy even in some extreme cases. This indicates that the model’s prediction performance improves significantly with the increase in sample size, and this conclusion is consistent with the analysis results presented earlier.
Figure 7. Error comparison for sample sizes of 800, 1200, and 2000.

4. Conclusions

This study presents a novel deep learning methodology for binary icing shape prediction, addressing the limitations of existing function-based and image-based approaches. The main conclusions are summarized as follows:
(1)
A point set displacement description framework is proposed, transforming the icing process into boundary point movements from clean to iced airfoils. By integrating Principal Component Analysis (PCA) with deep neural networks, the method successfully converts high-dimensional geometric prediction into a tractable low-dimensional regression problem while preserving physical interpretability.
(2)
The hidden layer structure significantly influences prediction performance. The 64 × 64 architecture achieves optimal results across multiple metrics (IoU, MSE, R², etc.) owing to its stronger nonlinear fitting capability, clearly outperforming the other architectures (32 × 32, 32 × 64, and 64 × 32).
(3)
The method demonstrates excellent learning ability under low-sample scenarios (800–2000 samples) across diverse icing regimes, including rime ice, complex glaze ice with pronounced horns, and transitional mixed ice. For accurate prediction of simple rime ice shapes, 800 samples suffice, while 2000 samples enable stable reproduction of complex glaze ice details and significantly improve prediction stability under extreme operating conditions characterized by near-freezing temperatures and high liquid water content. Increasing the sample size effectively enhances the prediction accuracy and consistency of complex ice morphologies.
(4)
Compared with traditional function-based characterization and image learning methods, this method integrates three key advantages: physical interpretability (displacement vectors correspond to actual icing growth), low data dependency (substantially reducing sample requirements), and efficiency (avoiding large-scale image processing and lowering computational costs).
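The pipeline summarized in conclusions (1) and (2) can be sketched end to end. The snippet below uses random stand-ins for the real dataset (which is not public): it assumes each sample stores 7 environmental parameters and the flattened (dx, dy) displacements of the airfoil boundary points; the array names and the choice of 10 principal components are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_samples, n_points, n_pc = 800, 200, 10

# Hypothetical stand-ins: 7 environmental parameters per sample
# (e.g. LWC, MVD, temperature, AOA, ...) and the flattened (dx, dy)
# displacements of the boundary points from clean to iced airfoil.
X = rng.normal(size=(n_samples, 7))
D = rng.normal(size=(n_samples, 2 * n_points))

# Step 1: PCA compresses the high-dimensional displacement field.
pca = PCA(n_components=n_pc)
Z = pca.fit_transform(D)                       # (n_samples, n_pc)

# Step 2: a 64 x 64 MLP maps environmental parameters to principal components.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
net.fit(X, Z)

# Step 3: invert the PCA to recover a full displacement field for a new case.
z_pred = net.predict(X[:1])                    # (1, n_pc)
d_pred = pca.inverse_transform(z_pred)         # (1, 2 * n_points)
iced_offsets = d_pred.reshape(n_points, 2)     # per-point (dx, dy) movement
print(iced_offsets.shape)                      # (200, 2)
```

Because the network regresses only n_pc values rather than 2 × n_points coordinates, the regression stays tractable at low sample counts, while `inverse_transform` maps predictions back to physically meaningful boundary displacements.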
Future work will investigate the generalization capability to arbitrary airfoils and the extrapolation performance for parameters beyond the established training data range, while extending the framework to three-dimensional icing scenarios. Specifically, expanding the input parameter space to include airfoil geometric characteristics (e.g., chord length, camber, thickness distribution) will be essential for multi-airfoil generalization, contingent upon the availability of comprehensive training databases. The present seven-parameter framework establishes the methodological foundation for such extensions. This methodology provides a promising foundation for real-time icing prediction systems in aircraft design and flight safety assessment.

5. Patents

Some materials in this report are protected by a Chinese patent currently under application, application number 202511846233.8.

Author Contributions

Conceptualization, C.Z.; methodology, Y.L.; software, Y.W.; writing—original draft preparation, C.Z. and Y.L.; writing—review and editing, C.Z.; visualization, Y.W.; funding acquisition, C.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 52302439.

Data Availability Statement

The data supporting this study will be made available on request.

Acknowledgments

The authors thank Zhang Bin for help in setting up the training dataset.

Conflicts of Interest

The authors declare no competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

The following abbreviations are used in this manuscript:
AI	Artificial Intelligence
PCA	Principal Component Analysis
MLP	Multi-Layer Perceptron
GRNN	General Regression Neural Network
BP	Back Propagation
LWC	Liquid Water Content
MVD	Median Volume Diameter
NN	Neural Network
DNN	Deep Neural Network
SAE	Stacked Auto-Encoder
WPT	Wavelet Packet Transform
TCNN	Transposed Convolutional Neural Network
CNN	Convolutional Neural Network
FEAE	Feature Extraction Autoencoder
IGNN	Image Generation Neural Network
NRAE	Noise Reduction Autoencoder
MAEFN	Multi-Autoencoder Fusion Network
FSRN	Fused Super-Resolution Network
ViT	Vision Transformer
SVD	Singular Value Decomposition
PC	Principal Component
KAN	Kolmogorov–Arnold Network
SiLU	Sigmoid Linear Unit
IDE	Integrated Development Environment
IoU	Intersection over Union
R²	Coefficient of Determination
MAE	Mean Absolute Error
CD	Chamfer Distance
MSE	Mean Squared Error
SJTU	Shanghai Jiao Tong University
Re	Reynolds number
AOA	Angle of Attack

References

  1. Zhang, C.; Liu, H.; Wang, F.; Kong, W. Supercooled Large Droplet Icing Accretion and its Unsteady Aerodynamic Characteristics on High-lift Devices. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2017, 231, 2391–2403.
  2. Zhao, Y.; Guo, Q.; Lin, T.; Cheng, P. A review of recent literature on icing phenomena: Transport mechanisms, their modulations and controls. Int. J. Heat Mass Transf. 2020, 159, 120074.
  3. Dai, H.; Zhu, C.L.; Zhao, H.; Liu, S. A New Ice Accretion Model for Aircraft Icing Based on Phase-Field Method. Appl. Sci. 2021, 11, 5693.
  4. Wang, C.; Chang, S.L.; Leng, M.; Wu, H.; Yang, B. A two-dimensional splashing model for investigating impingement characteristics of supercooled large droplets. Int. J. Multiph. Flow 2016, 80, 131–149.
  5. Zhang, C.; Liu, H. Effect of drop size on the impact thermodynamics for Supercooled Large Droplet in aircraft icing. Phys. Fluids 2016, 28, 062107.
  6. Myers, T.G. Extension to the Messinger model for aircraft icing. AIAA J. 2001, 39, 211–218.
  7. Shen, X.; Xiao, C.; Ning, Y.; Wang, H.; Lin, G.; Wang, L. Research on the methods for obtaining droplet impingement characteristics in the Lagrangian framework. Aerospace 2024, 11, 172.
  8. Shen, K.; Zeng, D.; Wang, C.; Wang, L.; Dong, Y. A Novel Multi-Step Numerical Framework for Ice Accretion Prediction Based on Unsteady Water Film Dynamics. Front. Heat Mass Transf. 2025, 23, 1958–1979.
  9. Wang, J.; Guo, R.; Zhao, N.; Zhu, C. An Experimental Investigation of the Effect of a Supercooled Large Droplet Impingement on Freezing Behaviour. Aeronaut. J. 2025, 129, 2556–2574.
  10. Jia, W.; Zhang, F. Numerical Investigation of Supercooled Large Droplets Impingement Characteristics of the Rotating Spinner. Int. J. Aerosp. Eng. 2024, 2024, 1683744.
  11. Myers, T.G.; Charpin, J.P.F.; Thompson, C.P. Slowly Accreting Ice Due to Supercooled Water Impacting on a Cold Surface. Phys. Fluids 2002, 14, 240–256.
  12. Chang, S.; Tang, H.; Wu, H.; Su, X.; Lewis, A.; Ji, C. Three-Dimensional Modelling and Simulation of the Ice Accretion Process on Aircraft Wings. Int. J. Astronaut. Aeronaut. Eng. 2018, 3, 020.
  13. Ogretim, E.; Huebsch, W.; Shinn, A. Aircraft ice accretion prediction based on neural networks. J. Aircr. 2006, 43, 233–240.
  14. Yi, X.; Wang, Q.; Chai, C.; Guo, L. Prediction model of aircraft icing based on deep neural network. Trans. Nanjing Univ. Aeronaut. Astronaut. 2021, 38, 535–544.
  15. Qu, J.G.; Peng, B.; Yi, X.; Ma, Y. Icing prediction method for arbitrary airfoil using deep neural networks. Acta Aerodyn. Sin. 2023, 41, 48–55.
  16. Chang, S.; Leng, M.; Wu, H.; Thompson, J. Aircraft ice accretion prediction using neural network and wavelet packet transform. Aircr. Eng. Aerosp. Technol. 2016, 88, 128–136.
  17. He, L.; Qian, W.; Yi, X.; Wang, Q.; Zhang, X. Graphical prediction method of airfoil ice shape based on transposed convolution neural networks. J. Natl. Univ. Def. Technol. 2021, 43, 98–106. (In Chinese)
  18. He, L.; Qian, W.; Dong, K.; Yi, X.; Chai, C. Aerodynamic characteristics modeling of iced airfoil based on convolution neural networks. Acta Aeronaut. Astronaut. Sin. 2023, 44, 126434. (In Chinese)
  19. Yu, D.; Han, Z.; Zhang, B.; Zhang, M.; Liu, H.; Chen, Y. A multi-autoencoder fusion network for fast image prediction of aircraft ice accretion. Phys. Fluids 2022, 34, 076107.
  20. Yu, D.; Han, Z.; Zhang, B.; Zhang, M.; Liu, H.; Chen, Y. A fused super-resolution network and a vision transformer for airfoil ice accretion image prediction. Aerosp. Sci. Technol. 2024, 144, 108811.
  21. Greenacre, M.; Groenen, P.J.F.; Hastie, T.; D’Enza, A.I.; Markos, A.; Tuzhilina, E. Principal Component Analysis. Nat. Rev. Methods Primers 2022, 2, 100.
  22. Alsoufi, M.S.; Bawazeer, S.A. Mechanistic Prediction of Machining-Induced Deformation in Metallic Alloys Using Property-Based Regression and Principal Component Analysis. Machines 2026, 14, 37.
  23. Qaedi, K.; Abdullah, M.; Yusof, K.A.; Hayakawa, M. Feasibility of Principal Component Analysis for Multi-Class Earthquake Prediction Machine Learning Model Utilizing Geomagnetic Field Data. Geosciences 2024, 14, 121.
  24. Wang, C.; Zhang, X.; Wang, X.; Chang, G. Prediction of Earthquake Death Toll Based on Principal Component Analysis, Improved Whale Optimization Algorithm, and Extreme Gradient Boosting. Appl. Sci. 2025, 15, 8660.
  25. Lei, D.; Zhang, Y.; Lu, Z.; Lin, H.; Fang, B.; Jiang, Z. Slope Stability Prediction Using Principal Component Analysis and Hybrid Machine Learning Approaches. Appl. Sci. 2024, 14, 6526.
  26. Li, K. Research on the Factors Influencing the Spatial Quality of High-Density Urban Streets: A Framework Using Deep Learning, Street Scene Images, and Principal Component Analysis. Land 2024, 13, 1161.
  27. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning, 1st ed.; MIT Press: Cambridge, MA, USA, 2016; pp. 84–111, 189–222.
  28. Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov–Arnold Networks. In Proceedings of the International Conference on Learning Representations (ICLR), Singapore, 24–26 April 2025; pp. 1–41.
  29. Côté, P.-O.; Nikanjam, A.; Ahmed, N.; Humeniuk, D.; Khomh, F. Data cleaning and machine learning: A systematic literature review. Autom. Softw. Eng. 2024, 31, 54.
