Prediction of Transonic Flow over Cascades via Graph Embedding Methods on Large-Scale Point Clouds

Abstract: In this research, we introduce a deep-learning framework for predicting transonic flow through a linear cascade from large-scale point-cloud data. In our test cases, prediction is nearly four times faster than traditional CFD calculation while maintaining good accuracy. Taking advantage of a multilayer graph structure, the framework extracts global and local information from the cascade flow field simultaneously and makes predictions over unstructured data. Building on the results obtained from the test datasets, we analyzed in depth the geometric attributes of the cascades as reconstructed by our framework by adjusting the geometric information of the point cloud. We fine-tuned the input at 1603 data points and quantified the contribution of each point. The outcomes reveal that variations on the suction side of the cascade influence the field results substantially more than variations on the pressure side. They also explain how graph neural networks work for cascade flow-field prediction, which improves developers' understanding of graph-based flow-field prediction and demonstrates the potential of graph neural networks for flow-field prediction on large-scale point clouds and for design.


Introduction
For engine-fan cascades, localized complex flows in the flow field, such as shocks and wakes, are the main sources of fan aerodynamic losses [1][2][3][4]. Studies [5,6] have shown that losses increase significantly in transonic and supersonic flow regimes, with shock losses dominating the overall losses in the linear cascade. Inadequate design can lead to the generation of shocks and shock wave/boundary layer interaction [7], resulting in energy dissipation and decreased efficiency. Wake losses are also a principal contributor to losses in transonic blades [8,9]. In addition, the wake influences the flow field within the downstream blade row passage [3] and therefore affects the aerodynamic performance of the fan. Thus, a fine design of the cascade profile is required to improve fan efficiency. During the design process, deep learning is commonly used to construct surrogate models that quickly predict the flow field [10,11], while exploring the relationship between cascade geometry and flow losses can guide the design. A framework that supports an in-depth understanding of both flow-field characteristics and flow-field prediction results can therefore meet the design requirement for fast, low-cost flow-field prediction and also guide subsequent design.
Multiple methods have been proposed for fast flow-field prediction. Among them, the convolutional neural network (CNN) is frequently employed to predict flow around airfoil profiles owing to its potent nonlinear mapping [12][13][14] and feature extraction [15][16][17][18] capabilities. Sekar et al. [19] trained on a set of airfoils using a deep CNN and a deep multilayer perceptron (MLP): the CNN was employed for parameterization, while the deep MLP network predicted the flow field around the airfoil with high accuracy. The research demonstrated the excellent feature extraction capabilities of CNN, which effectively extracted airfoil features and fitted the airfoil. Meanwhile, the use of the deep MLP network avoided the loss of accuracy in fitting the airfoil boundary that occurs in traditional image-to-image regression scenarios. Hui et al. [20] developed a CNN-based model to predict the pressure distribution over an airfoil. The proposed model achieved a mean squared error of less than 2% for test cases. Wu et al. [21] proposed a CNN-DCNN model, tested the influence of training parameters, and quantified the feature extraction capabilities of the presented model. Although CNN demonstrates excellent predictive performance, precision, and the ability to capture inherent flow characteristics, particularly for airfoil flow-field prediction, its capability in handling unstructured flow-field data remains suboptimal, especially in practical applications with irregular flow path structures [22,23].
Due to the intricate flow patterns around three-dimensional turbine blades, researchers have introduced linear cascade testing to approximate blade performance. This approach extracts a specific cross-sectional blade profile from the overall blade and unfolds the profile circumferentially to create a linear structure [24]. Within the linear structure, profiles are arranged linearly to simulate the motion of annular blades in the flow field. For numerical simulations of airfoils, the equations are typically solved over the entire surface of the airfoil. In contrast, numerical simulations of flow over cascades are often conducted within a single flow path of the linear cascade, as illustrated in Figure 1, where the upper boundary in the numerical simulation corresponds to the pressure side of the cascade and the lower boundary corresponds to the suction side, forming a linear cascade through periodic configuration. Standard 2D image-to-image regression scenarios based on CNN commonly handle images in the regular shape of (height, width, depth), as the filters are fixed. For the irregular flow field depicted in Figure 1, however, conventional CNN-based methods may not be well suited: ordinary CNN approaches generalize poorly to unstructured data because it is difficult to select a fixed convolution kernel that accommodates the various grid sizes, shapes, and irregular boundaries.
Aerospace 2023, 10, x FOR PEER REVIEW
Moreover, GCN effectively captures both topological structures [25] and flow features [26]. Additionally, GCN leverages sparse matrices for computation, enabling the handling of larger matrices and accommodating extensive discrete flow-field points. Meanwhile, convolutional networks aggregate features from neighboring nodes, optimizing the utilization of topological information between these nodes [27]. Consequently, GCN finds application in the realm of flow-field reconstruction. Economon et al. [28] combined traditional GCN with CFD simulations, which significantly accelerated prediction speed. To address non-Euclidean flow problems, Wang et al. [29] integrated GCN with traditional numerical solvers and proposed the FlowGCN solver, which significantly sped up the convergence of the entire program and secured accurate predictions. Peng et al. [26] proposed a data-driven flow prediction framework based on GraphSAGE, which builds on the basic architecture of GCN. This framework learned latent features by sampling and aggregating features from the local neighborhoods of vertices, demonstrating good adaptability to non-uniformly distributed grid data. Furthermore, because graph neural networks use sparse matrices, GCN can effectively process and predict large-scale flow fields. Strönisch et al. [30] found that GCN could predict flow fields over NACA airfoils and handle a large number of flow-field data points, which reduced computational runtime by providing initial flow distributions for CFD. However, current research mainly focuses on cases such as airfoil and cylinder flow, with less emphasis on turbine blade cascades. Transonic/supersonic blade cascade flow fields are more complex and involve shock waves and multiple flow interactions [31,32], which result in spatial non-uniformity and temporal non-stationarity in the flow field. It is therefore essential to establish a prediction framework with higher-resolution flow-field data to improve predictions of the characteristics of turbine blade cascade flow.
Furthermore, despite the significant progress made by GCN in predicting fluid fields, further research is still needed to elucidate how GCN predicts these fields. Various methods for interpreting graph neural networks (GNNs) have been developed. Ying et al. [33] analyzed the impact of node features and of the linking process of node information aggregation on model predictions and proposed GNNExplainer, which identifies crucial subgraph structures and node features within GNN predictions and is general and model-agnostic. SubgraphX [34] focuses on the substructures of the graph, interpreting GNNs by exploring and identifying significant subgraphs. GNN Prediction Interpreter (GPI) [35] studies the correlation between node features and GNN predictions and elucidates the impact of node features on those predictions. However, explanations for graph neural networks have primarily focused on important subgraph structures and node features [27,36,37], and explanations for fluid-field regression tasks are yet to be fully developed.
Currently, some studies represent the flow field with geometric points and aerodynamic information [38], effectively avoiding the impact of pixelation on data accuracy and the high costs associated with increasing flow-field resolution [22]. Kashefi et al. [39] proposed a novel deep-learning framework based on PointNet for predicting steady incompressible flow on multiple sets of irregular geometries and tested the effectiveness of the physics-informed PointNet (PIPN) for incompressible flows and thermal fields. To reduce the computational cost of numerical simulations, Xiong et al. [40] designed a point-cloud deep neural network based on the PointNet architecture and established a mapping between the spatial positions on the ONERA M6 wing and the CFD-computed values to predict the aerodynamic characteristics of the three-dimensional geometry. The results indicate that the computational cost can be reduced by approximately 23% at comparable predictive accuracy. However, while existing point-cloud-based techniques have been evaluated in many flow scenarios, little work has been done on engine flow fields.
Based on the aforementioned research, this work constructs graph-based prediction models using GCN, since CNN primarily collects characteristics from two-dimensional images, whereas the data structures of cascade flow fields are often more complicated. Grid resolution should be enhanced to capture flow features in blade cascades. The graph-based method allows flow fields to be predicted on large-scale, non-uniform grids while maintaining the benefits of feature extraction. We deliver a point-cloud and GCN-based deep-learning architecture in this research. This framework aims to predict the turbulent viscosity and pressure fields of the flow around the fan cascade. It employs a GCN-based model to extract geometric information and deliver aerodynamic information at different positions in the flow field from point-cloud inputs with up to 295,035 points. This work first generates 1000 distinct cascade samples with varying disturbances using the Hicks-Henne parameterization approach; these are then subjected to CFD simulations and data processing to generate the point-cloud dataset. The parameters of the GCN-based model are adjusted to provide predictions for the pressure and turbulent viscosity fields. Subsequently, we analyze in depth how the model based on graph encoding methods understands the flow field. The key characteristics of this work are as follows:
• A novel framework has been devised to predict flow fields over the cascade, combining GCN with point clouds to enhance prediction accuracy;
• This framework facilitates swift and precise predictions across an extensive grid containing 295,035 flow-field points, ensuring efficient large-scale flow-field analysis;
• A detailed investigation has been conducted to unravel the underlying mechanisms of GCN in the context of flow-field prediction, shedding light on how it understands and represents the flow field.
The paper is structured as follows: Section 2 explains the cascade geometry generation and numerical simulation, Section 3 introduces the structure of the framework and implementation of deep learning, Section 4 presents the results, followed by a discussion of the findings and limitations of the current approach in Section 5, while Section 6 provides the conclusions.

Cascade Geometry Generation
The subject of this research is a specific type of linear cascade profile. In this study, the Hicks-Henne bump function is applied as the parameterization method, through which the linear superposition of the perturbation function and the midrib analytic function characterizes the cascade profile. The expression for this function is:

$$y_{\mathrm{top}} = y_{\mathrm{top0}} + \sum_{i=1}^{n} c_i f_i(x), \qquad y_{\mathrm{low}} = y_{\mathrm{low0}} + \sum_{i=1}^{n} c_i f_i(x)$$

where $y_{\mathrm{top}}$ and $y_{\mathrm{low}}$ stand for the suction side and pressure side of the cascade; $y_{\mathrm{top0}}$ and $y_{\mathrm{low0}}$ represent the y-coordinates on the suction and pressure sides of the original cascade; $x$ represents the position along the mean aerodynamic chord, which ranges from 0 to 1; $i$ stands for the sequence number of the design variable; $n$ represents the number of shape functions; and $c_i$ stands for the weight of the i-th shape function, which determines the thickness distribution. $f_i(x)$ is the shape function, which can be expressed as:

$$f_i(x) = \sin^{w}\left(\pi x^{\ln 0.5 / \ln x_i}\right), \quad 0 \le x \le 1$$

where $w$ represents the width of the bump and $x_i$ stands for the location of the bump.
In this paper, the perturbations on the suction and pressure sides of the cascade are generated with the Hicks-Henne function. Three perturbation points on each surface are positioned at relative chord lengths of 0.05, 0.4, and 0.7, with mean values corresponding to the original profile data at these relative chord positions and a variance of 0.0577. To ensure a uniform distribution of geometric parameter samples, the Latin Hypercube Sampling (LHS) method is employed to select the specific parameter values. Moreover, a constraint is enforced so that the thickness variation at each profile point does not exceed 10% of the initial thickness. Under this constraint, 1000 profile shapes are created, as depicted in Figure 3.
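The geometry-generation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the usual Hicks-Henne bump form sin^w(pi * x^(ln 0.5 / ln x_i)), a hand-written Latin Hypercube sampler, and three bump locations at 0.05, 0.4, and 0.7 as stated in the text. The bump width `w = 3.0`, the weight range of ±0.005, and the sinusoidal baseline surface are placeholders assumed for the example, and the 10% thickness constraint is not enforced here.

```python
import numpy as np

def hicks_henne(x, x_i, w):
    """Hicks-Henne bump sin^w(pi * x^(ln 0.5 / ln x_i)); equals 1 at x = x_i."""
    x = np.asarray(x, dtype=float)
    exponent = np.log(0.5) / np.log(x_i)
    return np.sin(np.pi * x**exponent) ** w

def latin_hypercube(n_samples, n_dims, rng):
    """Unit-cube LHS: exactly one sample per stratum in every dimension."""
    strata = np.arange(n_samples)[:, None]
    u = (strata + rng.random((n_samples, n_dims))) / n_samples
    for j in range(n_dims):
        u[:, j] = rng.permutation(u[:, j])  # decouple the dimensions
    return u

def perturbed_surface(x, y0, weights, locations=(0.05, 0.4, 0.7), w=3.0):
    """Baseline surface plus a weighted sum of bumps (w is an assumed width)."""
    y = np.array(y0, dtype=float)
    for c, x_i in zip(weights, locations):
        y = y + c * hicks_henne(x, x_i, w)
    return y

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 201)
y_top0 = 0.05 * np.sin(np.pi * x)       # placeholder baseline, not the real profile
unit = latin_hypercube(1000, 3, rng)    # 1000 samples of 3 bump weights
weights = (unit - 0.5) * 2.0 * 0.005    # assumed range [-0.005, 0.005]
profiles = np.array([perturbed_surface(x, y_top0, c) for c in weights])
```

In a fuller version, samples violating the thickness constraint would be rejected or rescaled before being passed to the CFD solver.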

CFD Simulation and Dataset Generation
For the generated 1000 geometric shapes, the computational domain is divided as shown in Figure 4, which calculates a single flow channel of the periodic flow field with a Reynolds number of approximately $1.9 \times 10^{6}$. The grid over the blade surface is controlled as $y^{+} \approx 1/2$, with a cell size on the order of $10^{-6}$ m. Over the surface of the cascade, 1603 grid points are set, and the far-field length is nearly four times the length of the cascade.
As illustrated in Figure 1, the flow channel is divided into three parts: the leading-edge inlet channel, the cascade passage, and the trailing-edge outlet channel. The inlet and outlet channels each extend 1 chord length beyond the leading and trailing edges of the profile, respectively. For the inlet and outlet channels, periodic boundary conditions are applied to the upper and lower boundaries. The inlet and outlet are set as pressure boundaries, with an inlet total pressure of 119,950 Pa, an inlet total temperature of 293 K, an outlet static pressure of 101,325 Pa, a total temperature of 293 K, a turbulence intensity of 0.2%, and a turbulent viscosity ratio of 10. The no-slip boundary condition is set at the cascade surface.
During the simulation, the Reynolds-Averaged Navier-Stokes (RANS) equations are solved with the transition SST four-equation model [41]. An implicit solver and a second-order upwind discretization scheme are chosen. Grid independence verification is conducted, and the numerical results are presented in Table 1, which shows that once the total number of grid cells reaches 170 K, the relative change in the total pressure loss coefficient $\eta$ and the inlet static pressure $P_{st}$ falls within 0.4%, meeting the grid independence requirement. To predict the cascade flow field accurately with GCN, a grid of 295,035 points is ultimately selected for the subsequent optimization database construction, since the results remain essentially unchanged as the grid is refined further. Numerical simulations are performed for the 1000 generated cases to produce arrays of flow-field information, containing the coordinates of each grid vertex together with the corresponding static pressure and turbulent viscosity, stored as point clouds. Each case consists of a point cloud of 295,035 points. The dataset is split into training, testing, and validation sets at a ratio of 8:1:1.
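For reference, the RANS equations mentioned in this section can be written in a common textbook form. The block below is a generic Favre-averaged statement of mass and momentum conservation with a Boussinesq eddy-viscosity closure, given here as an assumption about the general form rather than as the exact expression or notation used by the authors; the energy equation and the four transport equations of the transition SST model are omitted. The eddy viscosity $\mu_t$ in the closure is the turbulent viscosity field that the framework later predicts.

```latex
\begin{align}
\frac{\partial \bar{\rho}}{\partial t}
  + \frac{\partial (\bar{\rho}\,\tilde{u}_i)}{\partial x_i} &= 0, \\
\frac{\partial (\bar{\rho}\,\tilde{u}_i)}{\partial t}
  + \frac{\partial (\bar{\rho}\,\tilde{u}_i \tilde{u}_j)}{\partial x_j}
  &= -\frac{\partial \bar{p}}{\partial x_i}
  + \frac{\partial}{\partial x_j}\left(\bar{\tau}_{ij}
  - \overline{\rho u_i'' u_j''}\right), \\
-\overline{\rho u_i'' u_j''}
  &= \mu_t\left(\frac{\partial \tilde{u}_i}{\partial x_j}
  + \frac{\partial \tilde{u}_j}{\partial x_i}
  - \frac{2}{3}\,\delta_{ij}\frac{\partial \tilde{u}_k}{\partial x_k}\right)
  - \frac{2}{3}\,\bar{\rho}\,k\,\delta_{ij}.
\end{align}
```

Here an overbar denotes Reynolds averaging, a tilde denotes Favre (density-weighted) averaging, $\bar{\tau}_{ij}$ is the mean viscous stress tensor, and $k$ is the turbulent kinetic energy.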

The Structure of the Framework
To process the discrete data representation of the output flow field, it must first be transformed into graph form. A graph is defined as G = (V, E), where V represents the set of 295,035 nodes and E represents the set of edges. In this research, each node corresponds to a discrete grid point in the flow field, and its feature vector comprises coordinates and aerodynamic parameters. Edges are formed by connecting points on the surface of the cascade with the grid points in the flow field according to their relative relations. The generated graph comprises multiple subgraphs, each of the form illustrated in Figure 5. In this representation, node 0 is the original node; the light brown nodes 1, 2, and 3 represent the direct neighborhood, corresponding to the 3 spatial neighbors in the grid and the 1603 points on the profile surface; and the green nodes 4, 5, 6, 7, and 8 represent the indirect neighborhood. In addition, a global node containing the Mach number and the direction of the stream is added to the graph and connected to every node to support model generalization. An edge is defined as the relationship between the original node and one of its neighbors, so each node has a total of 1606 edges (3 spatial neighbors plus 1603 surface points).
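The edge construction described above can be sketched on a toy point cloud. This is an illustrative assumption-laden example rather than the authors' preprocessing: the 3 spatial neighbors are taken here as the 3 nearest points by Euclidean distance (the paper uses grid connectivity), the function name `build_edges` is hypothetical, and the global node is left out for brevity.

```python
import numpy as np

def build_edges(points, surface_idx, k=3):
    """Directed edges (source -> target) as a (2, E) integer array.

    Every node receives k edges from its nearest spatial neighbours and one
    edge from each surface point, i.e. k + len(surface_idx) incoming edges.
    """
    n = len(points)
    surface_idx = np.asarray(surface_idx)
    # Dense pairwise distances: fine for a toy cloud, too large for 295,035 points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)            # a node is not its own neighbour
    knn = np.argsort(dist, axis=1)[:, :k]     # (n, k) nearest-neighbour indices

    src_knn = knn.ravel()
    dst_knn = np.repeat(np.arange(n), k)
    src_surf = np.tile(surface_idx, n)
    dst_surf = np.repeat(np.arange(n), len(surface_idx))
    return np.stack([np.concatenate([src_knn, src_surf]),
                     np.concatenate([dst_knn, dst_surf])])

rng = np.random.default_rng(1)
points = rng.random((20, 2))                  # toy 2D point cloud
surface_idx = np.arange(5)                    # pretend the first 5 points lie on the blade
edges = build_edges(points, surface_idx, k=3)
incoming = np.bincount(edges[1], minlength=len(points))
```

With 3 spatial neighbors and 1603 surface points, the same counting gives the 1606 edges per node stated in the text. For the full point cloud, a spatial index such as a k-d tree or the solver's own mesh connectivity would replace the dense distance matrix.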
The pressure and turbulent viscosity values for each grid point in the flow field are calculated using weighted propagation based on the feature vectors of each node. The message-passing scheme can be expressed mathematically as follows:

$$h_v^{(k+1)} = \sigma\left(W_k \cdot \mathrm{AGG}\left(\left\{h_u^{(k)},\ \forall u \in N(v)\right\}\right) + B_k h_v^{(k)}\right)$$

where $h$ stands for the embedding of the nodes, $v$ and $u$ are node indices, $N(v)$ is the set of neighbor nodes of node $v$, $k$ represents the layer number, $\sigma$ is the activation function, $W_k$ and $B_k$ stand for the weight matrices, and AGG stands for the generalized aggregation function. In this study, the aggregation and update functions can be expressed as:

$$h_v^{(k+1)} = \sigma\left(W_k \sum_{u \in N(v) \cup \{v\}} \frac{h_u^{(k)}}{\sqrt{|N(u)|\,|N(v)|}}\right)$$

The aggregation function shows that the process accounts not only for the number of nodes adjacent to a given node but also for the number of neighbors that those adjacent nodes have: it computes a weighted sum over the target node and all its neighboring nodes. This also indicates that GCN is effective in handling non-Euclidean discrete data from the flow field [42].
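The weighted-sum aggregation described above can be sketched in a few lines of NumPy. This is a minimal illustration of one degree-normalized graph-convolution layer under stated assumptions, not the authors' code: the function name `gcn_layer` is hypothetical, self-loops are added so that the target node contributes to its own update, and node degrees are counted including the self-loop.

```python
import numpy as np

def gcn_layer(h, edges, weight, relu=True):
    """One graph-convolution step with symmetric degree normalisation.

    h      : (n, f_in) float array of node features
    edges  : (2, E) int array of directed edges, source -> target
    weight : (f_in, f_out) weight matrix
    """
    n = h.shape[0]
    loops = np.arange(n)
    src = np.concatenate([edges[0], loops])   # self-loops include the target node
    dst = np.concatenate([edges[1], loops])
    deg = np.bincount(dst, minlength=n).astype(float)
    coef = 1.0 / np.sqrt(deg[src] * deg[dst]) # weight depends on both endpoint degrees
    agg = np.zeros_like(h)
    np.add.at(agg, dst, coef[:, None] * h[src])   # scatter-add messages to targets
    out = agg @ weight
    return np.maximum(out, 0.0) if relu else out

# Path graph 0 - 1 - 2, stored as directed edges in both directions.
edges = np.array([[0, 1, 1, 2],
                  [1, 0, 2, 1]])
h = np.ones((3, 1))
out = gcn_layer(h, edges, np.eye(1))
```

Only the edge list is stored, which corresponds to the sparse-matrix formulation that makes graphs with hundreds of thousands of nodes tractable.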
In this study, the point cloud is fed into the model displayed in Figure 6 after undergoing the preprocessing steps detailed above to create a graph. The output consists of the pressure and turbulent viscosity of each node. Specifically, three GCN layers are employed, as previous research has demonstrated that stacking convolutional layers is advantageous for feature extraction [43]. Moreover, a smoothing layer is added at the end to average over the output graph and create a continuous flow field [44]. ReLU activation is applied after the first two convolutional layers. The loss function is then evaluated on the outputs of the smoothing layer and the last convolutional layer to train the parameters of the model.
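The layer arrangement described above (three graph-convolution layers, ReLU after the first two, and a final smoothing layer) can be sketched as a forward pass on a toy graph. Everything specific here is an assumption made for illustration: the dense adjacency matrix, the hidden width of 8, the random weights, and the interpretation of the smoothing layer as a mean over each node and its neighbors. The actual layer sizes and smoothing operator are defined by the model in Figure 6.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalised adjacency with self-loops added."""
    a = adj + np.eye(adj.shape[0])
    d = a.sum(axis=1)
    return a / np.sqrt(np.outer(d, d))

def smooth_field(h, adj):
    """Assumed smoothing layer: mean over each node and its neighbours."""
    a = adj + np.eye(adj.shape[0])
    return (a @ h) / a.sum(axis=1, keepdims=True)

def forward(x, adj, weights):
    """Stacked graph convolutions with ReLU after all but the last layer."""
    a_hat = normalized_adjacency(adj)
    h = x
    for i, w in enumerate(weights):
        h = a_hat @ h @ w
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h, smooth_field(h, adj)    # last-conv output and smoothed output

# Toy ring graph with 4 nodes; inputs are (x, y) coordinates per node.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(2)
x = rng.random((4, 2))
weights = [rng.normal(size=s) for s in [(2, 8), (8, 8), (8, 2)]]
raw, smoothed = forward(x, adj, weights)   # two columns: pressure, turbulent viscosity
```

Returning both `raw` and `smoothed` mirrors the statement that the loss uses the output of the last convolutional layer as well as that of the smoothing layer.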
Normalization has been shown to speed up convergence [45]. Additionally, the two output fields in the data structure presented in this paper differ significantly in magnitude, which may cause the loss to oscillate. As a result, each field, including the inputs and outputs, is normalized separately using maximum-minimum scaling.
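Per-field maximum-minimum scaling can be sketched as below. The helper names and the sample values are illustrative assumptions (only 101,325 Pa and 119,950 Pa are taken from the boundary conditions in the text); in practice the minimum and maximum would typically be computed on the training set only and reused for validation and test data.

```python
import numpy as np

def fit_minmax(field):
    """Return the (min, max) of one field, computed on training data."""
    return float(field.min()), float(field.max())

def scale(field, lo, hi):
    """Map a field to [0, 1] using previously fitted bounds."""
    return (field - lo) / (hi - lo)

def unscale(scaled, lo, hi):
    """Invert the scaling to recover physical units."""
    return scaled * (hi - lo) + lo

# Illustrative values: the two fields differ by many orders of magnitude.
pressure = np.array([95_000.0, 101_325.0, 119_950.0])   # Pa
nu_t = np.array([1e-5, 3e-4, 2e-3])                     # turbulent viscosity

p_lo, p_hi = fit_minmax(pressure)
n_lo, n_hi = fit_minmax(nu_t)
pressure_scaled = scale(pressure, p_lo, p_hi)
nu_t_scaled = scale(nu_t, n_lo, n_hi)
```

Scaling each field with its own bounds puts both outputs on the same [0, 1] range, so neither dominates the loss.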

Training
The choice of the loss function has a significant impact on the prediction results in regression problems [46], such as flow-field prediction. Commonly used loss functions include the mean squared error (MSE), the mean absolute error (MAE), the Log-Cosh loss function, and the Huber loss function. These four loss functions were tested separately to compare their effectiveness.

With MSE, the loss converges unstably and gradient explosion occurs, which makes training unfeasible. Changing the learning rate makes no discernible difference to this problem. For the turbulent viscosity field, the values in the wake area differ substantially from those in other regions, and because the errors are driven by the data themselves, prediction errors may accumulate and amplify continually, resulting in gradient explosion. Analogously, the Log-Cosh loss function suffers from a gradient issue during training and notable oscillation during the convergence phase. When MAE is used as the loss function, the gradient is the same for all prediction sites, and convergence is sluggish. The constant-gradient problem can be mitigated by dynamically decreasing the learning rate as the number of iterations increases. For the Huber loss function, the adjustment of the hyperparameter δ is of utmost importance. The loss can be written as:

$$L_\delta(y, \hat{y}) = \begin{cases} \frac{1}{2}(y - \hat{y})^2, & |y - \hat{y}| \le \delta \\ \delta\left(|y - \hat{y}| - \frac{1}{2}\delta\right), & \text{otherwise} \end{cases} \tag{9}$$

where y and ŷ are the reference and predicted values of the pressure or turbulent viscosity at each node. The Huber loss function is more robust than MSE and converges faster than MAE because it reduces the gradient around the minimum value.

After training and adjusting the hyperparameters with δ set to 1.35, Figure 7 displays the results predicted with MAE (with a dynamic learning rate) and with the Huber loss function as the loss function, respectively. To display the prediction fields more clearly, a periodic operation is performed on the contours so that three flow channels are displayed simultaneously. The figure shows that the predicted results for the pressure field are satisfactory, while for the turbulent viscosity field, the results obtained with MAE are significantly worse than those obtained with the Huber loss function, despite the dynamically decreasing learning rate. In high-turbulent-viscosity regions, the model based on the MAE loss function does not achieve the desired prediction effect and shows incomplete learning, while the model based on the Huber loss function has a stronger learning ability for these regions. Therefore, this article recommends using the Huber loss function, defined in Equation (9) with the pressure and turbulent viscosity on each node as variables, for subsequent research.
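The four candidate loss functions can be compared with a short sketch written in their standard textbook forms. The value δ = 1.35 is taken from the text; the function names and the sample numbers are illustrative only.

```python
import math


def mse(err):
    return err * err


def mae(err):
    return abs(err)


def log_cosh(err):
    # Numerically stable form of log(cosh(err)) that avoids overflow.
    a = abs(err)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def huber(err, delta=1.35):
    # Quadratic near zero (like MSE), linear for large errors (like MAE).
    a = abs(err)
    if a <= delta:
        return 0.5 * err * err
    return delta * (a - 0.5 * delta)


def mean_loss(loss, predictions, targets):
    """Average a pointwise loss over all nodes."""
    return sum(loss(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Illustrative values: one small and one large error.
example = mean_loss(huber, [1.0, 3.0], [0.0, 0.0])
```

For the large error the Huber loss grows linearly while MSE grows quadratically, which is why it is less sensitive to the large wake-region values that destabilize training with MSE.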
Once the loss function has been determined, the hyperparameters for the batch of nodes, such as the learning rate, convolutional kernel size, and number of convolutional layers, should be adjusted. However, adjusting all parameters for large-scale learning of the entire flow field can be time-consuming. To address this issue, a graph is constructed only in the highly characteristic high-turbulent-viscosity region of the flow field shown in Figure 8, where incomplete learning occurs frequently, and a grid search method is used to perform automatic hyperparameter tuning on it. In the grid search method, a grid containing all candidate values is created for the selected parameters. Each iteration tries one combination in a fixed order and records the prediction performance, and the model with the best performance is ultimately returned. This article conducts a grid search over the learning rate, epoch, batch size, dropout rate, dimensionality of the output space, and optimizer, and chooses the hyperparameters with the highest prediction accuracy. The search sets are as follows: learning rate, epoch {100, 200, 500, 1000}, batch size {10, 50, 100, 500}, dropout rate {0.1, 0.2, 0.3, 0.4}, dimensionality of the output space {64, 32, 16, 8, 4}, and optimizer {Adam, SGD}.

After testing, the following hyperparameters yield the best performance: learning rate = 0.01, epoch = 1000, batch size = 50, dropout rate = 0.2, dimensionality of the output space = 16, 8, and 2 for the three convolutional layers, respectively, and the Adam optimizer. After the first two convolutional layers, a ReLU activation function is added to improve the network's ability to express nonlinear features and predict the fields more accurately. Additionally, a smoothing layer is included after the last convolutional layer to maintain the continuity of the flow field.
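The grid search procedure can be sketched as below. The candidate sets for epoch, batch size, dropout rate, output dimensionality and optimizer are those listed in the text; the learning-rate candidates are hypothetical because the text does not list them, and `dummy_evaluate` is a placeholder standing in for training and validating the model on the high-turbulent-viscosity subgraph.

```python
import itertools

search_space = {
    "learning_rate": [0.1, 0.01, 0.001],  # hypothetical: not listed in the text
    "epochs": [100, 200, 500, 1000],
    "batch_size": [10, 50, 100, 500],
    "dropout": [0.1, 0.2, 0.3, 0.4],
    "output_dim": [64, 32, 16, 8, 4],
    "optimizer": ["Adam", "SGD"],
}


def grid_search(space, evaluate):
    """Try every combination in order and return the best-scoring one."""
    keys = list(space)
    best_score, best_config = float("-inf"), None
    for combo in itertools.product(*(space[k] for k in keys)):
        config = dict(zip(keys, combo))
        score = evaluate(config)  # higher is better
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score


def dummy_evaluate(config):
    """Placeholder score; a real run would train the model and return accuracy."""
    return -abs(config["learning_rate"] - 0.01) - abs(config["dropout"] - 0.2)


best_config, best_score = grid_search(search_space, dummy_evaluate)
num_combinations = len(list(itertools.product(*search_space.values())))
```

Because the number of combinations is the product of the set sizes, restricting the search to a small, feature-rich subgraph is what keeps the cost of each evaluation manageable.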
Training is conducted on the dataset with the determined hyperparameters and the model described above. Figure 9 illustrates the convergence of the model. After multiple training epochs, the losses on both the training and validation sets converge, indicating the effectiveness of the training. The converged validation loss also lies within an acceptable range, which supports the ability of the trained model to predict the flow field accurately.

Fields Prediction Performance
Different cases are considered to study the prediction results of the framework described in the second section, in which the leading-edge point of the cascade is located at the origin. The results are presented in the form of point clouds. To make the predicted results more comprehensible, contours are used to display the predicted pressure and turbulent viscosity fields. Figure 10 shows the flow-field prediction results, and Figure 11 compares the predicted values with the CFD values at each point in the flow field, showing their deviation from the y = x line.

Figure 10 indicates that the main structural and physical features of the flow field are successfully captured, while the areas with significant errors are mainly concentrated at the contour edges in the pressure fields and in the high-turbulent-viscosity areas. Figure 11 likewise shows that the prediction errors are concentrated in the low-pressure and high-turbulent-viscosity regions. In the pressure field, the pressure gradient at the leading edge of the cascade is much larger than in the rest of the flow field, so the contour edges there cannot be clearly reproduced and the prediction errors are larger. The remaining parts exhibit high prediction accuracy, including the high-pressure areas that appear at the suction side in certain cases. In the turbulent viscosity field, the low-turbulent-viscosity area on the surface of the cascade and the high-turbulent-viscosity feature at the trailing edge are reproduced, which guarantees an accurate representation of the features of the cascade. Although the predictions of the high-turbulent-viscosity regions at the trailing edge and at the leading edge are not fully adequate, both are still within an acceptable range. This benefits from the combination of GCN and point cloud, which enables the framework to predict dominant regions at high resolution without increasing the global resolution. Overall, the accuracy of the pressure field prediction is over 99%, while that of the turbulent viscosity field is more than 96%, as indicated in Figure 11, with a prediction time of 87 s, about four times faster than the nearly 8 min required by CFD.

An additional CNN-based flow-field prediction model is used for comparison. To maintain a roughly consistent total number of points, the resolution of the flow-field images input to the CNN is set to 1000 × 500. The comparative results are illustrated in Figure 12. Due to resolution limitations, the CNN-based model performs worse in identifying high-gradient boundaries within the flow field. In contrast, the GCN-based model is not affected by resolution constraints and can accurately predict pressure values in low-pressure regions. However, it shows suboptimal performance in predicting turbulent viscosity fields in regions with sparse nodes.

Prediction of the Trained Model on Cascade with Different Nodes Selection Approach
In general, researchers interpret models by explaining the importance of specific indicators [47,48]. If the removal of a certain node significantly changes the prediction results, that node is considered important. To investigate which part of the cascade is more crucial in predicting cascade flow fields based on graph neural networks, global points are created during the graph generation stage for the 1603 points constituting the cascade surface in the initial data. This allows the framework to learn the characteristics of different flow channels. The flow field is projected, and the cascade surface points are rearranged. By removing nodes at different intervals, this process aims to analyze the features of the output flow field based on GCN predictions and to understand the contribution of the cascade surface points to the flow field. As observed in the prediction results in Section 4.1, the predictions for the inlet and outlet of this flow field tend to converge, so particular emphasis is placed on the leading edge of the cascade and the cascade wake. Consequently, additional research is conducted on these two regions, which contain 5797 and 1223 nodes, respectively.

The nodes are selected by removing surface points at different intervals. Specifically, global nodes are removed from the 1603 global points on the suction side and pressure side, which are ordered sequentially from the trailing edge to the leading edge, at intervals of 2 to 10. The predicted pressure and turbulent viscosity fields are compared with the originally predicted results in the regions of the cascade leading edge and the wake. For quantitative comparison, the ratio of the weighted new prediction values to the weighted original values is defined as the contribution, with y denoting either pressure or turbulent viscosity in its expression.

The results are shown in Figure 13. After nodes are removed at these intervals, the predicted pressure field at the leading edge remains the same as the original values, and there is no significant change in the predicted contribution values between the different removal intervals. The prediction results for the turbulent viscosity field in the wake region, with global nodes removed at the same intervals, are shown in Figure 13b. Although the change is somewhat larger than for the pressure field as the interval increases, it is still minor. The study also investigated the effect of removing only a few global points on the predicted flow field, which indicates that removing one or two nodes has almost no impact on the outcomes.

The output of the convolutional layers has been analyzed to learn more about the learning pattern of the convolutional network. In the selected area, predictions with various starting locations for interval 10 are explored. The ratios of the outputs of the first and second convolutional layers at various starting positions to the original convolution outputs are displayed in Figure 14a,b, respectively. According to the results, removing nodes at the same interval but from different starting points causes only slight changes in the prediction, as shown in Figure 14. The consistency of the findings remains nearly the same after the first convolutional layer, demonstrating the GCN learning pattern for data processing in flow-field prediction. When the adjacency matrix and the features are multiplied, the features of the nodes neighboring the target nodes are included, along with the aggregation of features over global nodes. Consequently, the results in Figure 13 show that changing the interval produces no appreciable deviation from the anticipated outcomes.
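The node-selection experiment can be sketched as follows. Two points are assumptions rather than details confirmed by the text: that "interval k" means dropping every k-th surface point from a chosen starting index, and that the contribution is the ratio of summed new predictions to summed original predictions over the region of interest. The function names are illustrative.

```python
def remove_at_interval(num_points, interval, start=0):
    """Indices kept after dropping every `interval`-th point from `start`.

    Points are assumed to be ordered from the trailing edge to the leading edge.
    """
    removed = set(range(start, num_points, interval))
    return [i for i in range(num_points) if i not in removed]


def contribution(new_values, original_values):
    """Assumed form: ratio of summed new predictions to summed original ones."""
    return sum(new_values) / sum(original_values)


# 1603 global surface points, as stated in the text.
kept_by_interval = {k: remove_at_interval(1603, k) for k in range(2, 11)}

# Same interval, different starting positions (cf. the interval-10 study).
kept_by_start = {s: remove_at_interval(1603, 10, start=s) for s in range(10)}
```

A contribution close to 1 for every interval and starting position would correspond to the observation that evenly thinned global nodes leave the predictions essentially unchanged.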

Explanation of Graph Embedding Approach Based on the Framework
The convolution processing procedure on the flow-field data is presented in Section 4.2. Further feature analysis is carried out on different parts of the cascade surface with 20, 50, 100, and 200 sequential nodes removed, as shown in Figure 15, to investigate the impact of the specific learning techniques of the GCN-based framework on the flow channel properties of cascades. When the interval size is set to 10 instead of the corresponding step, a similar prediction trend is shown in Figure 17, indicating that the weights learned by the framework for nodes near the cascade surface contribute almost uniformly to the prediction of the flow field.

Figure 18 illustrates the impact of various cascade surface points on the wake region. Compared with the suction side, the changes caused by the pressure side are much more subtle. The trailing edge of the suction side is the main contributor to the field, and its impact on the wake is substantially greater than its impact on the pressure field. Additionally, for the same interval, the prediction weights agree with those of the pressure field prediction.

Discussion and Limitations
With the proposed framework, the pressure and turbulent viscosity fields along the cascade can be predicted with accuracies of over 99% and 96%, respectively. The outcomes demonstrate that the framework is capable of handling large-scale point-cloud inputs and the graph structures built on them, accurately capturing the characteristic structure of the fan cascade flow and predicting the pressure region at the leading edge of the cascade and the turbulent viscosity region in the wake. As the learning of partial flow fields in the grid search and the final flow-field prediction results show, the most distinctive portions of the flow field can be chosen for learning, removing the need to solve the full flow field as in CFD. Especially for engine flow situations, where the flow field is more complex, this framework is more flexible and does not require costly global resolution refinement to resolve local flow characteristics.
Nevertheless, this framework also has certain limitations for flow-field prediction, such as the relatively poor prediction precision in wake zones. The sparse grid in this area is most likely responsible for the inaccurate turbulent viscosity prediction: the framework captures features better where nodes are relatively dense, while nodes in the relatively sparse portion of the grid have lower feature values and are therefore more susceptible to the influence of neighboring nodes during the learning process. In addition, further investigation is required into extrapolation to other operating conditions. The learning results show that global nodes with smaller magnitudes do not substantially affect the outcomes of the trained model. Consequently, more research is required to confirm the efficacy of the global points defined in the framework, which carry the Mach number and the inlet angle of attack as features.
The purpose of this study is to elucidate the mechanism by which the GCN-based framework learns flow path features. To this end, nodes with various positional characteristics are removed from the graph, and the resulting variations in the prediction outcomes are recorded, serving as the foundation for the GCN explanation. The results show that, in the GCN-based model, learning global node features involves adding the features of neighboring nodes. As a result, the model still produces outputs with high precision when given fewer global node inputs with evenly distributed positional information. The study of global nodes with non-uniformly distributed positions shows that the nodes at the trailing edge of the cascade suction side have a substantial impact on the turbulent viscosity field predicted by the framework. Despite having a negligible effect on the turbulent viscosity field, the suction side also influences the pressure field prediction to some extent. When predicting the turbulent viscosity field for a certain cascade with a thickness of 10% and given loading requirements, the pressure side has a lesser influence, and its impact on the field prediction is negligible.
This study has focused exclusively on 2D profiles and needs to be extended to the analysis of blades. Moreover, the existing data are derived from solving the RANS equations. In future investigations, higher-precision data will be pursued through more advanced techniques such as large eddy simulation (LES) or direct numerical simulation (DNS). Additionally, in the computational setup of this paper, including a subgraph to reconsider the effects of the neighboring nodes increases the computational cost. Applying graph summarization methods, such as graph compression or graph feature extraction (e.g., using autoencoders), during the preprocessing stage may effectively reduce the computational cost [49]. These methods compress large-scale graph data into a more concise form, reducing redundancy and enabling more effective analysis and understanding of large-scale graph data.

Conclusions
Our study proposes a deep-learning framework that utilizes point clouds and GCN to accurately predict the flow field of cascades. The method covers the conversion of CFD grid data into point-cloud data, the detailed procedure for feeding the point cloud into a GCN-based model, and the tuning of the network hyperparameters and training process. Utilizing the framework, we can predict the flow field and employ the trained model to help explain how the GCN interprets the cascade flow field, thus enhancing the understanding of the flow-field features.
Based on the results gathered, the proposed framework is capable of effectively predicting the flow in the cascade, establishing a mapping between flow-field position information and aerodynamic information, and efficiently processing large-scale point-cloud data. It also provides valuable data support for learning local flow characteristics instead of solving the entire flow field as in CFD simulations. For the given graph as the model input, the results suggest that the trailing-edge points of the cascade are the crucial part that significantly impacts the important features of the cascade flow field, and they should therefore be treated as important input global nodes.
In addition, the loss function and hyperparameters of the framework are also tested.The outcomes suggest that the selection of loss function significantly affects the convergence

Figure 1. Linear cascade single flow path schematic diagram.

Graph Convolutional Network (GCN) can directly extract spatial features from topological graphs, showcasing superior adaptability and flexibility in swiftly generating flow fields, especially for flow over irregular geometries. Figure 2 illustrates the transonic cascade Mach number field employed in this paper for flow-field prediction, in which the grid-based model outperforms the CNN-based model, which is limited to pixelation at a globally consistent resolution, in identifying details in the flow field over the cascade. It also indicates that, in the case of transonic cascades, the complex flow patterns and irregular flow path structure may result in the loss of crucial flow-field information in CNN-based field prediction.

Figure 2 .
Figure 2. Identifications of the details in the flow field over the cascade based on different m

Figure 2 .
Figure 2. Identifications of the details in the flow field over the cascade based on different models.

Figure 3 .
Figure 3.The geometry of the 1000 generated cascades.

Figure 3 .
Figure 3.The geometry of the 1000 generated cascades.

Figure 4 .
Figure 4. Grids of outline and magnified details at dense grids.

Figure 4 .
Figure 4. Grids of outline and magnified details at dense grids.

Figure 5 .
Figure 5.The generation and the structure of the input graph of the model.

Figure 5 .
Figure 5.The generation and the structure of the input graph of the model.

Figure 6 .
Figure 6.The framework for the cascade flow-field prediction.

Figure 6 .
Figure 6.The framework for the cascade flow-field prediction.

Figure 7 .
Figure 7. Flow-field prediction based on the models utilizing MAE and Huber loss function as loss functions, respectively.(a,f) are the reference pressure field and turbulent viscosity field based on the CFD solution.(b,c,g,h) are the predicted flow fields and absolute error using MAE as the loss function.(d,e,i,j) are the predicted flow fields and absolute errors based on the Huber loss function.

Figure 7 .
Figure 7. Flow-field prediction based on the models utilizing MAE and Huber loss function as loss functions, respectively.(a,f) are the reference pressure field and turbulent viscosity field based on the CFD solution.(b,c,g,h) are the predicted flow fields and absolute error using MAE as the loss function.(d,e,i,j) are the predicted flow fields and absolute errors based on the Huber loss function.

Figure 8 .
Figure 8.The area where grid search works and the diagram of the grid search method.

Figure 8 .
Figure 8.The area where grid search works and the diagram of the grid search method.

Figure 9 .
Figure 9.The loss value for the model with the given hyperparameters on each epoch.

Figure 9 .
Figure 9.The loss value for the model with the given hyperparameters on each epoch.

Figure 10 .
Figure 10.Flow prediction based on a set model over different geometry.(a,c,e,g), respectively, display the CFD (left) and predicted pressure fields (middle), along with the absolute errors (right) for different geometries.(b,d,f,h), respectively, display the CFD (left) and predicted turbulent viscosity fields (middle), along with the absolute errors (right) for different geometries.

Figure 10 .
Figure 10.Flow prediction based on a set model over different geometry.(a,c,e,g), respectively, display the CFD (left) and predicted pressure fields (middle), along with the absolute errors (right) for different geometries.(b,d,f,h), respectively, display the CFD (left) and predicted turbulent viscosity fields (middle), along with the absolute errors (right) for different geometries.Aerospace 2023, 10, x FOR PEER REVIEW 13 of 22

Figure 11 .
Figure 11.Comparison of predicted fields and CFD fields value.(a,b) sequentially display the results of the pressure field and turbulent viscosity field.

Figure 11 .
Figure 11.Comparison of predicted fields and CFD fields value.(a,b) sequentially display the results of the pressure field and turbulent viscosity field.

Figure 12. Comparison of CFD fields, GCN-based predicted fields, and CNN-based predicted fields. (a-c) sequentially display the results of CFD, the GCN-based model, and the CNN-based model.

4.2. Prediction of the Trained Model on Cascade with Different Node Selection Approaches

In general, researchers interpret models by explaining the importance of specific indicators [47,48]. If the removal of a certain node significantly changes the prediction results, that node is considered important. To investigate which part of the cascade is more crucial in predicting cascade flow fields with graph neural networks, global points are created during the graph generation stage for the 1603 points that constitute the cascade surface in the initial data. This allows the framework to learn the characteristics of different flow channels. The flow field is projected, and the cascade surface points are rearranged. By removing different intervals of nodes, we analyze the features of the output flow field predicted by the GCN and assess the contribution of the cascade surface points to the flow field. As observed in the prediction results in Section 4.1, the predictions for the inlet and outlet of this flow field tend to converge, with a particular emphasis on the leading edge of the cascade and the cascade wake. Consequently, additional research is conducted on these two regions, which contain 5797 and 1223 nodes, respectively.
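
The node-removal analysis described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the framework used in this work: the two-layer GCN with random weights stands in for the trained model, the ring graph stands in for the point cloud, "removal" is implemented by cutting a node's edges and zeroing its features, and the names `gcn_predict`, `remove_nodes`, and `window_contributions` are hypothetical.

```python
import numpy as np

def gcn_predict(adj, feats, w1, w2):
    """Toy two-layer GCN (stand-in model): A_norm ReLU(A_norm X W1) W2."""
    a_hat = adj + np.eye(len(adj))                 # add self-loops
    deg = a_hat.sum(axis=1)
    a_norm = a_hat / np.sqrt(np.outer(deg, deg))   # D^-1/2 (A + I) D^-1/2
    hidden = np.maximum(a_norm @ feats @ w1, 0.0)
    return a_norm @ hidden @ w2

def remove_nodes(adj, feats, removed):
    """Assumed removal rule: cut the nodes' edges and zero their features."""
    adj, feats = adj.copy(), feats.copy()
    adj[removed, :] = 0.0
    adj[:, removed] = 0.0
    feats[removed] = 0.0
    return adj, feats

def window_contributions(adj, feats, w1, w2, surface, region, step):
    """Mean absolute change over `region` per window of `step` surface nodes."""
    baseline = gcn_predict(adj, feats, w1, w2)[region]
    scores = []
    for start in range(0, len(surface) - step + 1, step):
        removed = surface[start:start + step]
        a, x = remove_nodes(adj, feats, removed)
        pred = gcn_predict(a, x, w1, w2)[region]
        scores.append(np.abs(pred - baseline).mean())
    return np.array(scores)

# Synthetic data: a ring of 60 nodes; nodes 0-39 play the role of surface
# points and nodes 40-59 the role of the region being evaluated.
rng = np.random.default_rng(0)
n = 60
idx = np.arange(n)
adj = np.zeros((n, n))
adj[idx, (idx + 1) % n] = 1.0
adj[(idx + 1) % n, idx] = 1.0
feats = rng.normal(size=(n, 3))
w1 = rng.normal(size=(3, 8))
w2 = rng.normal(size=(8, 1))
surface = np.arange(0, 40)
region = np.arange(40, 60)

scores = window_contributions(adj, feats, w1, w2, surface, region, step=10)
print(scores)
```

In this toy setting only the windows adjacent to the evaluated region change its prediction, because a two-layer GCN without global nodes propagates information over two hops at most. A larger score marks a window of surface nodes that contributes more to the predicted field in that region.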

Figure 13. The contribution based on prediction over the cascade leading edge and wake regions with different intervals of global nodes removed. (a,b) represent the results of the pressure and turbulent viscosity fields, respectively.

the prediction, as shown in Figure 14. The consistency of the findings remains nearly the same after the first layer of convolution output, demonstrating the GCN learning pattern on data processing in flow-field prediction. When the adjacency matrix and the features are multiplied, the features of the nodes neighboring a given node are included, along with the aggregation of features over global nodes. As a result, the results provided in Figure 13 display a change interval without appreciable deviations from the anticipated outcomes.
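
The aggregation step mentioned above can be sketched with a single weight-free propagation. This is a minimal illustration under stated assumptions, not the architecture used in this work: the symmetric normalization is the one used by standard GCN layers, the five-node graph is synthetic, node 4 is a hypothetical global node linked to every other node, and `propagate` is an illustrative helper name.

```python
import numpy as np

def propagate(adj, feats):
    """One GCN aggregation step without weights: D^-1/2 (A + I) D^-1/2 X."""
    a_hat = adj + np.eye(len(adj))      # self-loops keep each node's own features
    deg = a_hat.sum(axis=1)
    return (a_hat / np.sqrt(np.outer(deg, deg))) @ feats

# Path graph 0-1-2-3 plus a hypothetical global node 4 linked to every node.
n = 5
adj = np.zeros((n, n))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
adj[4, :4] = adj[:4, 4] = 1.0

feats = np.ones((n, 2))
feats[1] = 0.0                          # node 1 carries no information of its own

after_one_layer = propagate(adj, feats)
print(after_one_layer[1])               # nonzero: filled in by nodes 0, 2, and 4
```

After one multiplication, the row of the emptied node already holds a weighted sum of its neighbors' features and of the global node's features, which is one way to read why the outputs stay consistent after the first convolution layer.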

Figure 14. The outcomes of the first and second convolution layers with different starting points.

Figure 16 illustrates how removing 20, 50, 100, and 200 consecutive nodes on the cascade surface affects the estimated pressure field in comparison to the initial predicted fields. As can be seen from Figure 15, points on the suction side have a considerable impact on the prediction results when consecutive nodes are excluded. On the other hand, the flow field is less affected by the pressure-side channel properties that the trained model learns.

Figure 16. The contribution based on prediction over the cascade leading edge with continuous steps of global nodes removed, where (a-d) represent the results with steps 20, 50, 100, and 200.

Figure 17. The contribution shown in the same intervals based on prediction over the cascade leading edge with continuous steps of global nodes removed, where (a-d) represent the results with steps 20, 50, 100, and 200.

Figure 18. The contribution based on prediction over the wake region with continuous steps of global nodes removed, where (a-d) represent the results with steps 20, 50, 100, and 200, while (e-h) stand for the plotting intervals of 10 with different steps.

Table 1. Grid independence of the linear cascade.
