Review

A Review on Optimal Design of Fluid Machinery Using Machine Learning Techniques

1 Research Center of Fluid Machinery Engineering and Technology, Jiangsu University, Zhenjiang 212013, China
2 Jiangsu Province Engineering Research Center of High-Level Energy and Power Equipment, Changzhou University, Changzhou 213164, China
3 Wenling Fluid Machinery Technology Institute of Jiangsu University, Wenling 317525, China
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2023, 11(5), 941; https://doi.org/10.3390/jmse11050941
Submission received: 9 March 2023 / Revised: 24 April 2023 / Accepted: 27 April 2023 / Published: 28 April 2023
(This article belongs to the Special Issue CFD Simulation of Floating Offshore Structures)

Abstract

The design of fluid machinery is a complex task that requires careful consideration of various factors that are interdependent. The correlation between performance parameters and geometric parameters is highly intricate and sensitive, displaying strong nonlinear characteristics. Machine learning techniques have proven to be effective in assisting with optimal fluid machinery design. However, there is a scarcity of literature on this subject. This study aims to present a state-of-the-art review on the optimal design of fluid machinery using machine learning techniques. Machine learning applications primarily involve constructing surrogate models or reduced-order models to explore the correlation between design variables or the relationship between design variables and performance. This paper provides a comprehensive summary of the research status of fluid machinery optimization design, machine learning methods, and the current application of machine learning in fluid machinery optimization design. Additionally, it offers insights into future research directions and recommendations for machine learning techniques in optimal fluid machinery design.

1. Introduction

Fluid machinery, such as hydraulic turbines, pumps, torque converters, wind turbines, and compressors, is widely used in many critical sectors of the national economy, including aerospace, agricultural engineering, the petrochemical industry, water conservancy projects, and medical equipment. However, in engineering applications, fluid machinery often faces issues such as low equipment efficiency, unstable operation, and incompatibility with the system [1]. Consequently, the development of more efficient and stable fluid machinery has become a crucial area of research. Optimization is a central aspect of fluid machinery design, and it requires addressing the complex, nonlinear relationship between performance and geometric parameters. To tackle these problems, some researchers have proposed applying machine learning to the optimal design of fluid machinery.
Owing to the rapid development of data science, processing hardware, neural-network-based technology, and sensing, machine learning has become an essential research method for development and innovation in many fields [2]. Its ability to handle nonlinear problems makes it a powerful and intelligent data processing framework for the optimal design of fluid machinery, where it is typically applied to numerical simulation and structural optimization tasks. Machine learning has recently achieved significant success in fluid machinery, moving beyond traditional research methods and greatly enriching industrial applications; conversely, the development of fluid machinery has posed new challenges and prompted breakthroughs in machine learning. Convolutional neural networks [3], genetic algorithms, back propagation (BP) algorithms, deep learning, and other methods have produced notable achievements in fluid machinery research. With abundant data, enhanced computing power, and improved data analysis accuracy, machine learning will play an increasingly important role in future development.
Machine learning is a branch of artificial intelligence that primarily simulates human learning. It continuously learns and trains, updates the training model, and analyzes and classifies the characteristics of sample data in order to make predictions or issue execution commands. Machine learning is usually classified into three categories: supervised learning, unsupervised learning, and semi-supervised learning. Supervised learning algorithms can automatically and effectively distinguish normal from abnormal data given a large number of labeled samples, whereas unsupervised learning algorithms train on unlabeled samples, which helps preserve the model's generalization performance and reduces labeling costs. Semi-supervised learning builds models from a small number of labeled samples together with a large number of unlabeled samples and makes predictions through algorithms. With the growth of machine learning and data-driven technology in design and flow control, these methods provide a new way to address the nonlinear relationships between parameters in optimization design.
In conclusion, the application of machine learning in the optimization design of fluid machinery has significant theoretical and engineering value. This paper first summarizes the research status of fluid machinery optimization design and machine learning theory. Then, it provides an overview of the application status of machine learning in fluid machinery optimization design. Finally, the future prospects of machine learning in the optimal design of fluid machinery are discussed.

2. Review Methodology

Nowadays, most research begins with the internet and online databases. However, the information available is overwhelmingly abundant and varies widely in authenticity, reliability, and usefulness, and it can be difficult to distinguish effective, authenticated, and reliable sources from the rest. It is therefore crucial to exercise caution and discretion when selecting sources, to ensure that the information obtained is beneficial to the research.
Google Scholar was used to search for high-quality research papers, initially using the keyword "machine learning". However, many of the downloaded papers were not related to the optimal design of fluid machinery. To refine the search, the keywords "fluid machinery" and "optimal design" were added, and relevant papers were downloaded. Additional papers were reviewed based on cross-references and their importance in the development of the optimal design of fluid machinery using machine learning techniques. Each paper was assigned to the most suitable category depending on whether the search terms appeared in the title, abstract, or body. This empirical research included literature review papers, conceptual papers, descriptive papers, and research papers. The primary databases searched were the Taylor & Francis, Elsevier, IEEE, and Springer publishing groups.
This paper provides an overview of the various definitions and approaches proposed by different researchers. To conduct this study, we reviewed a total of 89 research papers to gain a comprehensive understanding of the research contributions in this field.

3. Research Status of Optimal Design of Fluid Machinery

Fluid machinery is a type of energy conversion machinery that uses fluid as a working medium to achieve fluid pressurization or transportation. The traditional research process for fluid machinery includes modeling, performance testing and analysis, and optimization design. The quality of 3D modeling typically depends on the designer's experience. In performance testing, building physical test models can be time-consuming and expensive, and errors introduced in manufacturing and testing can require additional time and effort to resolve. Therefore, optimization design has become a critical aspect of fluid machinery research. Given the complex flow phenomena and the intricate relationship between flow and structural parameters (e.g., secondary flow caused by rotation and curvature, stratified flow, and boundary layer flow), creating the corresponding model is particularly challenging. In response to these issues, long-term research has focused on improving the internal flow field and optimizing mechanical geometric parameters. Commonly used optimization design methods include the inverse design optimization method, the multidisciplinary coupling optimization method, the genetic algorithm, and neural networks.
In the optimization design of fluid machinery, improving the internal flow field is a challenging problem. However, few studies have simultaneously considered the mechanical problems associated with the design of optimized structures, especially in cases where turbulence, compressibility, or different physics are involved [4]. To explore the influence of the internal flow field on performance and the relationship between them, some scholars have proposed applying the Inverse Design Method (IDM) to the optimization design of fluid machinery. The optimization idea of IDM is to reconstruct the model from previously collected data. Compared to traditional optimization design methods, IDM-based optimization exhibits significant advantages in terms of time cost and generality. It is usually combined with turbulence simulation technology and mathematical optimization algorithms to improve the hydrodynamic performance of fluid machinery, and it has been studied both domestically and internationally. For example, Yang et al. [5] applied IDM to the optimization design of fluid machinery, effectively suppressing secondary flow and cavitation by controlling the loading parameters and stacking conditions of blades, thereby improving the internal flow field. Moghadassian et al. [6] used IDM to calculate the geometry of wind turbine blades, with an iterative inverse algorithm solving the optimization problem, thus improving the performance of single-rotor and double-rotor wind turbines. However, gray scale in topology-optimized flow field models can make the contour of the fluid region inaccurate, which often occurs in the design of double channels and elbows. To avoid gray scale and obtain clear boundaries, Souza et al. [7] proposed applying the Topology of Binary Structures (TOBS) method in fluid flow design: within a density-method setting, the material distribution characteristics are preserved, the gray problem is eliminated, and the boundary between fluid and solid becomes clear. Addressing the same boundary-definition problem, Wildey et al. [8] exploited the uncertainty of the discontinuity position and generated robust boundaries based on estimates of specific probability quantities of samples.
For fluid machinery, optimizing the overall structural parameters is essential. These parameters not only ensure that the impeller can generate the required pressure, but also have a global impact on the flow in the impeller passage. The interface between the impeller and fluid largely determines the performance of fluid machinery. Therefore, blade shape selection is a crucial aspect of impeller design [9]. Blade design is a complex multidisciplinary optimization problem, but innovative solutions have been proposed to address this issue. For instance, Meng et al. [10] developed a multidisciplinary optimization strategy that uses surrogate models to obtain quantities such as adiabatic efficiency, equivalent stress, and the total pressure ratio, improving the reliability, safety, and performance of impellers. Munk et al. [11] proposed a multidisciplinary coupling optimization framework that incorporated fluid and structure into the topology optimization framework. By coupling a finite element solver with the lattice Boltzmann method, they demonstrated that adjusting the degree of coupling can significantly improve the algorithm's computational efficiency and reduce the influence of fluid–structure coupling on the final optimization design. Moreover, Ghosh et al. [12] adopted a Probabilistic Machine Learning (PML) framework to overcome challenges related to ill-posed inverse problems of turbine blades. This method addressed the sparsity of data available for training such models and produced explicit inverse designs.
When optimizing the design of fluid machinery, it is important to consider the interaction between performance and geometric parameters. To explore this relationship, Yu et al. [13] used computational fluid dynamics and neural networks to optimize the design of a blood pump, analyzing the influence of each parameter on performance and completing the parameter optimization. The research results indicate that this optimization method can be effectively applied to the design and research of complex, high-precision, multi-parameter, and multi-objective axial spiral vane pumps. In addition, Yu et al. [14] also studied the influence of structural factors of splitter blades on the performance and flow fields of axial flow pumps, analyzing and optimizing the pump’s flow field and performance by using the orthogonal array method. Shi et al. [15] adopted a multidisciplinary optimization design method based on an approximate model, considering blade mass and efficiency as objective functions, and head, efficiency, maximum stress, and maximum deformation as constraint conditions under small flow conditions. This method fully considers the interaction and mutual influence between hydraulic and structural design, improving the comprehensive performance of axial flow pump impellers. Lastly, Xu et al. [16] proposed a global optimization method for annular jet pump design by combining computational fluid dynamics simulation, Kriging approximation models, and experimental data. Experimental results demonstrate that this method can improve the efficiency of annular jet pumps.
In recent years, the use of the genetic algorithm as an optimization algorithm has led to significant progress in the optimal design of fluid machinery [17]. It is widely used to find the design parameters that optimize the performance of fluid machinery. The optimization idea of the genetic algorithm is to take the impeller performance as the objective function, determine the flow parameters related to the impeller performance, and combine the CFD calculation to predict the objective function result. Kim et al. [18] used a commercial computational fluid dynamics program and the response surface method to optimize the design of a mixed-flow pump impeller, analyzed the design variables and performance changes of the inlet part of the mixed-flow pump impeller, and obtained the best shape, thereby improving the suction performance and efficiency of the mixed-flow pump. Peng et al. [19] used a multi-objective genetic algorithm (MOGA) to perform multi-objective optimization on the optimal response surface model, and proved that this method significantly improved the efficiency of multiphase pumps under the condition of a large mass flow rate. In conclusion, the genetic algorithm is a popular and effective optimization method in fluid machinery design. The research status of the optimal design of fluid machinery for the internal flow field and the structure parameter is summarized in Table 1.
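To make this workflow concrete, the following minimal sketch evolves two hypothetical impeller design variables against an analytic stand-in for a CFD-predicted efficiency. The variable names, bounds, and fitness surface are illustrative assumptions, not values from the cited studies; in practice, each fitness evaluation would be a CFD run or a surrogate prediction.

```python
# Minimal genetic-algorithm sketch for the workflow described above.
# The fitness function is a hypothetical analytic stand-in for a CFD
# evaluation of impeller efficiency.
import numpy as np

rng = np.random.default_rng(0)
BOUNDS = np.array([[15.0, 45.0],   # blade outlet angle [deg] (assumed range)
                   [10.0, 30.0]])  # outlet width [mm] (assumed range)

def fitness(x):
    # Hypothetical surrogate for CFD-predicted efficiency (to maximize).
    angle, width = x
    return -((angle - 28.0) ** 2) / 50.0 - ((width - 18.0) ** 2) / 20.0

def evolve(pop_size=30, generations=40, mut_sigma=0.5):
    dim = len(BOUNDS)
    pop = rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(pop_size, dim))
    for _ in range(generations):
        fit = np.array([fitness(ind) for ind in pop])
        # Tournament selection: each parent is the fitter of two random picks.
        idx = rng.integers(pop_size, size=(pop_size, 2))
        parents = pop[np.where(fit[idx[:, 0]] > fit[idx[:, 1]],
                               idx[:, 0], idx[:, 1])]
        # Uniform crossover between consecutive parents.
        mask = rng.random((pop_size, dim)) < 0.5
        children = np.where(mask, parents, np.roll(parents, 1, axis=0))
        # Gaussian mutation, clipped to the design bounds.
        children += rng.normal(0.0, mut_sigma, children.shape)
        pop = np.clip(children, BOUNDS[:, 0], BOUNDS[:, 1])
    return pop[np.argmax([fitness(ind) for ind in pop])]

print("best design (angle, width):", evolve())
```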

4. Machine Learning Algorithms

With the rapid development of computing technology, machine learning has become an important research tool in many fields. It originated from in-depth research on artificial intelligence. Machine learning simulates human learning and behavior, selecting appropriate algorithms to analyze data according to the type of phenomenon under study. It trains and learns continuously to capture the characteristics and underlying patterns of the data, thereby making judgments and predictions. Its intelligence and automation make machine learning a powerful tool for simplifying data analysis problems in many fields. The process of estimating the associations between the inputs, outputs, and parameters of a system using a limited number of observations can be described as the learning problem.
Machine learning can be categorized into three types based on different focuses: supervised learning, unsupervised learning, and semi-supervised learning. Supervised learning involves predicting samples through algorithm training models based on the classification information of known data. Unsupervised learning involves building models directly without the classification information of data. Semi-supervised learning involves using a small number of existing data with categorical information and most data without categorical information to build models. At present, machine learning research mainly focuses on algorithm development. Strengthening research on machine learning algorithms can significantly improve machine learning efficiency. Commonly used machine learning algorithms include support vector machine (SVM), random forest (RF), naive Bayesian (NB), back propagation (BP), K-means clustering algorithm, generative adversarial networks (GAN), artificial neural networks (ANN), decision tree, autoencoder (AE), deterministic policy gradient (DPG), and others [20]. The method flow is shown in Figure 1.

4.1. Supervised Learning

In recent years, supervised learning has increasingly become a research focus in the field of machine learning. Supervised learning involves using known partial sample data to obtain a model through algorithm training, which can then be used to predict or classify unknown data. This process is not only suitable for simple sample judgments, but it can also lead to the development of a more accurate mathematical model through testing and optimizing another part of the sample data [22].

4.1.1. Regression and Classification Algorithm

In the supervised learning process, the computer first classifies the input known sample data, then builds a model based on these data, and finally makes a prediction. When the output variable is continuous, this type of problem is called a regression problem; when the output variable is discrete, it is called a classification problem. Commonly used algorithms for such problems include neural networks, the support vector machine (SVM), random forest (RF), naive Bayesian (NB), and others, and they have been widely used in industry. Al Shahrani et al. [23] applied an elaborative stepwise stacked artificial neural network algorithm for automating industrial processes, significantly improving the control and monitoring of industrial environments in the automation industry. Nassif et al. [24] conducted a comprehensive study on the application of deep learning in speech recognition, demonstrating that machine learning can achieve particularly strong performance in this domain. Zhang et al. [25] applied machine learning to a cloud-based medical diagnostic platform, using a secure support vector machine (SVM) model to ensure safety and effectiveness in clinical diagnosis, highlighting the effectiveness and practicality of machine learning in medical diagnosis.
The focus of research on supervised learning algorithms is improving their performance and efficiency. Jafari-Marandi et al. [26] studied the assumptions made to promote efficient pattern recognition, and theoretically explained the reasons for the lack of effectiveness of supervised learning. They illustrated that inherent assumptions in the process of building models for classification algorithms have a significant impact on the credibility of the built models and resulting predictions. Hu et al. [27] proposed a spike optimization mechanism and demonstrated that it can improve the learning accuracy of the multi-layer Spiking Neural Network (SNN) and shorten the running time of the algorithm.
The support vector machine (SVM) is a popular supervised learning algorithm widely used in various fields. It demonstrates strong stability in nonlinear classification and is often employed for classification tasks. Wang et al. [28] discussed the use of the SVM algorithm to process fMRI data and showed that supervised model training can link brain imaging and stimulation. The study compared the performance of linear SVM, ridge regression, and Lasso regression on multivariate classification; the results showed that SVM is more efficient for small-scale data classification. In classification tasks of fault diagnosis and defect prediction, Zhou et al. [29] proposed the Mean-ReSMOTE algorithm, an extension of the traditional SMOTE oversampling method (Synthetic Minority Oversampling Technique), to oversample the dataset and reduce over-generalization of the machine learning model. They also used the Hybrid-RFE+ algorithm to perform feature selection on the sampled data and obtain the optimal feature subset. The supervised defect prediction model built using SVM addresses the category imbalance and feature redundancy problems of current cross-domain defect prediction. Xie [30] established a fault diagnosis model using the SVM algorithm, which showed promising results in terms of accuracy, calculation time, and memory consumption, accurately classifying existing faults. Bordoloi et al. [31] diagnosed the faults of centrifugal pumps using the SVM algorithm, analyzing the vibration signals associated with pump fault conditions and classifying their time-domain features. The SVM algorithm is often combined with other machine learning algorithms to address specific problems. Zhou et al. [32] designed a supervised learning algorithm based on a multi-scale convolutional neural network (CNN) model, which combined the multi-scale CNN with the Multiple-Objective Generative Adversarial Active Learning (MO-GAAL) algorithm, an unsupervised adversarial active learning method. This solved the problem of large loss function values in the test set and provided a further solution for fault diagnosis in different scenarios.
The performance of SVM is dependent on the choice of penalty and kernel parameters. To address the multi-parameter optimization problem of SVM, researchers have proposed various methods. For example, Yang et al. [33] proposed the Cultural Emperor Penguin Optimizer (CEPO), which integrates the Cultural Algorithm (CA) and the Emperor Penguin Optimizer (EPO) for the parameter optimization of SVM. Experiments have shown that CEPO can improve the classification accuracy, convergence speed, stability, robustness, and operating efficiency of SVM. Additionally, Hu et al. [34] developed a Fractional-order-PCA-SVM coupling algorithm for digital image recognition, which demonstrated effectiveness in digital medical image recognition. Tharwat et al. [35] proposed a Chaotic Antlion Optimization (CALO) algorithm to optimize the kernel and penalty parameters of the SVM classifier, thereby reducing the classification error. Lastly, Tan [36] utilized an improved Particle Swarm Optimization (PSO) algorithm to optimize the penalty factor and kernel function parameters in the SVM model, resulting in the PSO-SVM algorithm, which significantly improved the accuracy of the electric load forecasting model, according to experimental results.
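As a concrete illustration of what these metaheuristics optimize, the sketch below tunes the same two quantities, the penalty parameter C and the RBF kernel parameter gamma, using a plain cross-validated grid search on a synthetic dataset. The grid values and data are illustrative assumptions; CEPO, CALO, or PSO would replace the exhaustive grid with a guided search over the same parameter space.

```python
# Minimal sketch of tuning the SVM penalty parameter C and the RBF
# kernel parameter gamma, the two quantities targeted by the
# metaheuristics cited above. A cross-validated grid search stands in
# for CEPO/CALO/PSO here; the synthetic dataset is purely illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

param_grid = {"C": [0.1, 1, 10, 100],          # penalty parameter
              "gamma": [0.001, 0.01, 0.1, 1]}  # RBF kernel width
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X_tr, y_tr)

print("best parameters:", search.best_params_)
print("test accuracy:  ", search.score(X_te, y_te))
```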
In order to enhance the classification accuracy of SVM classifiers in various fields, scholars have proposed several approaches. Wang et al. [37] suggested the use of Raman spectroscopy and an improved SVM to screen for thyroid dysfunction. They also introduced a Genetic Particle Swarm Optimization algorithm based on partial least squares, which can improve the classification accuracy of the SVM model. Xie et al. [38] proposed a cancer classification algorithm based on the Dragonfly Algorithm and SVM, which can optimize the parameters of the SVM classifier and yield better classification accuracy. Li et al. [39] developed a new Differential Evolution Algorithm for SVM parameter selection, which attains faster convergence and higher classification accuracy. Lastly, Ding et al. [40] proposed an Improved Sparrow Search Algorithm (ISSA) for SVM fault diagnosis, establishing a fault diagnosis model, ISSA-SVM, for dissolved gas analysis; the results demonstrated that the algorithm can accurately determine the current operating state of a transformer. Random forest, by contrast, is an ensemble learning method that combines multiple decision trees into a forest and uses them jointly to predict the final outcome.
The random forest (RF) algorithm incorporates random attribute selection during the training process of decision trees. It is an ensemble algorithm that has been widely used in many applications, such as classification and regression. However, the current theoretical research on random forest lags far behind practical applications. Although the existing parallel random forest algorithms have been researched for a long time, they still have problems such as long execution times and low parallelism. To address these problems, Wang et al. [41] proposed a parallel random forest (PRF) optimization algorithm based on distance weights, which has proven to perform better and more efficiently than previous PRF algorithms. Additionally, Wang et al. [42] proposed a Post-Selection Boosting Random Forest (PBRF) algorithm that combines the RF and Lasso regression methods, and this algorithm has been verified to improve model performance. To further enhance the performance of the RF algorithm, Wang et al. [43] utilized the Spark platform to propose a method that calculates feature weights to distinguish between strong and weak correlation features, and obtain feature subspaces through hierarchical sampling. This method has been proven to improve the classification accuracy and data calculation efficiency of the RF algorithm. However, these studies did not fully consider the problem of data imbalance. To make the RF algorithm more widely applicable, Sun et al. [44] proposed a Banzhaf Random Forest algorithm (BRF) based on cooperative game theory, which has proven to be consistent, thus narrowing the gap between random forest theory and practical applications. Furthermore, to overcome the shortcomings of low learning efficiency and local optima, Wang et al. [45] proposed an epistasis detection algorithm based on Artificial Fish Swarm Optimizing Bayesian Network (AFSBN), which has outperformed other methods in terms of epistasis detection accuracy across various datasets.
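The following minimal sketch shows the basic random forest workflow on a synthetic dataset; its impurity-based feature importances offer a simple analogue of the feature-weighting ideas discussed above. The dataset and hyperparameters are illustrative assumptions.

```python
# Minimal random-forest sketch: an ensemble of decision trees with
# random attribute selection, whose impurity-based feature importances
# give a simple analogue of feature weighting. Synthetic data only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=8,
                           n_informative=3, random_state=0)
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                            random_state=0)
print("CV accuracy:", cross_val_score(rf, X, y, cv=5).mean())
rf.fit(X, y)
print("feature importances:", rf.feature_importances_.round(3))
```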

4.1.2. Evolutionary Algorithm

Evolutionary algorithms, of which genetic algorithms (GA) are the most widely used, build models by simulating chromosome crossover and mutation in biological evolution and search for optimal solutions algorithmically. Genetic algorithms possess excellent global search capabilities and are widely used in various fields. However, their use in different application scenarios presents several challenges, and scholars have proposed various solutions. Maionchi et al. [46] used a neural network to train a dataset with the diameter of an obstacle and its offset as inputs and the mixing percentage and pressure drop as outputs, to obtain the optimal geometry of circular obstacles in the channel of a micromixer. A genetic algorithm was then used to find the geometry providing the maximum mixing percentage and minimum pressure drop, demonstrating the effectiveness of combining a neural network with a genetic algorithm in optimization problems. However, when the sample size in the dataset is small, directly inputting it into the network for training may lead to overfitting, making it necessary to expand the existing samples using data augmentation techniques. To address this issue, Huang et al. [47] proposed a generalized regression neural network optimized by a genetic algorithm. Additionally, Chui et al. [48] proposed a general support vector machine model for deep multi-kernel learning optimized using a multi-objective genetic algorithm. The results show that the algorithm not only achieves higher accuracy but also addresses typical dataset problems in simulated environments, such as the unreliability of cross-validation and input signals. The summary for supervised learning is provided in Table 2.

4.2. Unsupervised Learning

Unsupervised learning is a machine learning method that explores the inherent laws and characteristics of sample data without labeled information. In order to achieve correct classification, the feature vector needs to contain sufficient category information, but it is often difficult to determine if the feature contains enough information. Unsupervised learning can effectively address this issue by selecting the best features for classifier training and automatically classifying all samples into different categories. This approach can be used to solve various problems in pattern recognition where class information is not available.
Stoudenmire et al. [49] utilized an unsupervised learning algorithm to compute most of the layers of layered tree tensor networks, and then optimized only the top layer for supervised classification on the MNIST dataset. They demonstrated that combining prior guesses for supervised weights with unsupervised representations maintains good performance. Li et al. [50] applied Particle Swarm Optimization (PSO) to predict the biological self-organization and thermodynamic properties of living systems, and showed that the model requires relatively little prior sample knowledge while ensuring the accuracy of the analysis. To further improve prediction accuracy, Liu et al. [51] combined Kernel Principal Component Analysis (KPCA), Linear Discriminant Analysis (LDA), and Extreme Gradient Boosting (XGBoost) into a learning model termed KPCA-LDA-XGB. Hamadeh et al. [52] optimized a multivariate statistical model by combining PCA and LDA, improving the ability to process and analyze complex images using machine learning algorithms.

4.2.1. Dimensionality Reduction

In high-dimensional settings, dimensionality reduction becomes the first step in extracting predictive features from complex data, owing to problems such as sparse data samples and difficult distance calculations. Ge et al. [53] proposed a domain adversarial neural network model for learning a dimensionality-reduced representation of single-cell RNA sequencing data; the reduced representation allows the model to focus better on the feature types of the data. Similarly, Deng et al. [54] proposed a Tensor Envelope Mixture Model (TEMM) for tensor data clustering and multidimensional dimensionality reduction, which reduces the number of free parameters and estimation variability. They also developed an expectation-maximization algorithm that obtains likelihood estimates of cluster means and covariances.
Moreover, despite extensive research on feature extraction methods, little attention has been paid to reducing the complexity of data. As a solution, Charte et al. [55] proposed an autoencoder-based approach to reduce the complexity. By improving the shape and distribution of different classes, the complexity is further reduced, while preserving most information about classes in the encoded features. Experiments show that the proposed class-informed autoencoders outperform traditional unsupervised feature extraction techniques for classification tasks. Additionally, Li et al. [56] proposed a new unsupervised robust discriminative manifold embedding method that addresses the problem of low performance on noisy data.

4.2.2. Clustering

Clustering refers to grouping similar data together and is often applied in computer vision and pattern recognition. Based on different learning strategies, scholars have designed various clustering algorithms, such as K-means, t-SNE-based, and DBSCAN methods. Gan et al. [57] proposed a general deep clustering framework, integrating representation learning and clustering into a single pipeline for the first time. This method has shown superior performance on benchmark datasets for pattern recognition and has received widespread attention. Clustering algorithms play an important role in the segmentation of digital images: Basar et al. [58] proposed a new adaptive initialization method that determines the optimal initialization parameters of the traditional K-means clustering technique, optimizing segmentation quality and reducing classification error. t-distributed Stochastic Neighbor Embedding (t-SNE) is often used for data dimensionality reduction and visualization; Kimura et al. [59] extended t-SNE using the framework of information geometry, and with a carefully selected set of parameters, the generalized t-SNE outperforms the original. In unsupervised machine learning, density-based algorithms can both reduce the complexity of algorithm operation and improve the accuracy of clustering results. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is one of the most popular density-based clustering methods. In a study of variable-density spatial clustering for high-dimensional data, Unver et al. [60] proposed the definition of fuzzy core points so that DBSCAN can consider two different density modes in the same run, introducing the DBSCAN extension IFDBSCAN. The summary for unsupervised learning is provided in Table 3.

4.3. Semi-Supervised Learning

4.3.1. Regression, Dimension Reduction, and Clustering

Semi-supervised learning is a method that uses a small amount of labeled data and a large amount of unlabeled data to build a model and predict unknown data through algorithms. It covers various fields, such as regression, dimensionality reduction, and clustering. Despite the significant progress made in semi-supervised regression, there is still a lack of systematic and comprehensive research in this area. In semi-supervised clustering, unlabeled data are used to obtain more precise clustering results and enhance the performance of clustering methods. Semi-supervised classification improves classifier performance by using unlabeled data to augment the training of labeled data. As a result, semi-supervised classifiers can utilize more data than those trained using only labeled data, leading to better generalization performance. Common semi-supervised learning methods include neural networks, reinforcement learning, and deep learning, with deep learning being widely used and researched in this field.
A neural network is an algorithm that uses functions to map input parameters to output results. It computes errors through a loss function and uses the chain rule to propagate errors backward and correct the network weights. This process is repeated until the neural network fits the data well. Neural networks can learn many different attributes to process various aspects of instances, which can be continuous-valued, discrete-valued, or vectors. They are versatile and can be used for a wide range of tasks, such as image and speech recognition, natural language processing, and game playing [61].
The back propagation (BP) neural network is a type of multi-layer perceptron (MLP), or feedforward neural network (FNN). Compared to a single-layer neural network, it can learn more complex nonlinear functions, and as the number of layers increases, the ability of the multilayer network to fit complex functions also increases. Zhao et al. [62] designed an intelligent fault identification system for rotary fluid machinery based on the genetic algorithm and a set of BP neural networks. Experiments proved that the system improved the fault recognition rate and accuracy, effectively diagnosed the fault types of rotating machinery, and demonstrated high generalization ability. Ling [63] used a multilayer feedforward neural network to reproduce flow fields and calculate the aerodynamic characteristics of an airfoil: a well-performing model of the flow around a cylinder was used to predict the flow around the airfoil, the velocity vector field was predicted by an independent neural network, and the two-dimensional airfoil flow field was posed as a regression problem, achieving rapid prediction. Lu et al. [64] proposed a semi-supervised extreme learning machine (SSELM) method based on improved SMOTE to address the scarcity of labeled samples during model construction. They showed that the stacked denoising autoencoder can preserve and obtain better features, and further demonstrated that the ELM can increase the learning rate of the model, resulting in better generalization performance.
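A minimal numpy sketch of the BP mechanism described above, forward pass, loss evaluation, chain-rule backward pass, and gradient-descent weight correction, is shown below. The toy target function and network size are assumptions for illustration only.

```python
# Minimal numpy sketch of a one-hidden-layer BP network: forward pass,
# squared-error loss, chain-rule backward pass, gradient-descent update.
# It fits a toy nonlinear function; real applications would use richer
# data and deeper models.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X)                        # toy nonlinear target

W1, b1 = rng.normal(0, 0.5, (1, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 0.5, (16, 1)), np.zeros(1)
lr = 0.05

for epoch in range(5000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y                       # dLoss/dpred for 0.5 * MSE
    # Backward pass (chain rule).
    dW2 = h.T @ err / len(X)
    db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)     # tanh derivative
    dW1 = X.T @ dh / len(X)
    db1 = dh.mean(axis=0)
    # Gradient-descent weight correction.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final MSE:", float((err ** 2).mean()))
```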

4.3.2. Deep Reinforcement Learning

Reinforcement learning (RL) is a method that seeks to maximize long-term rewards by adapting behavior to a specific environment. It has found widespread application in various fields, including physics, chemistry, and biology [65]. However, existing parallel reinforcement learning methods suffer from two issues: the number of running algorithms cannot be reduced, and the algorithm does not necessarily converge to the optimal solution. To address these problems, Ding et al. [66] developed a new asynchronous reinforcement learning algorithm, APSO-BQSA, by combining the backward Q-learning-based Sarsa algorithm with Asynchronous Particle Swarm Optimization (APSO). The proposed algorithm effectively searches for the optimal solution. In a related study, Kumar et al. [67] compared the performance of multiple linear regression, multiple nonlinear regression, and artificial neural networks (ANN) in predicting the optimal configuration parameters of jet aerators, finding that artificial neural networks outperform both regression techniques. Another study by Dalca et al. [68] established a connection between classical and machine learning methods: the authors proposed a probabilistic generative model and derived an inference algorithm based on unsupervised machine learning. By combining classical registration methods and convolutional neural networks (CNN), the algorithm significantly improves accuracy and reduces running time.
Deep learning (DL) has developed rapidly in the supervised setting, and its combination with reinforcement learning has led to the emergence of deep reinforcement learning (DRL). As one of the latest achievements in machine learning, DRL has found widespread application, and DRL research has produced many classic algorithms and typical application areas. DRL can achieve high-level control in intelligent manufacturing, making it a technology with great potential [69]. Li et al. [70] proposed a deep learning model based on statistical features for the vibration measurement of rotating machinery. They used the model to classify faults in three domain feature sets and demonstrated its efficiency for fault classification. In terms of structural optimization design, Viquerat et al. [71] applied DRL to direct shape optimization for the first time, demonstrating that an artificial neural network trained with DRL can generate optimal shapes on its own in a finite amount of time, without any prior knowledge. This demonstrates the effectiveness of reinforcement learning research in hydrodynamic shape optimization. Young et al. [72] introduced an open-source, distributed Bayesian model optimization algorithm called HyperSpace, showing that it consistently outperformed standard hyperparameter optimization techniques across three DRL algorithms. The summary for semi-supervised learning is provided in Table 4.

5. Application of Machine Learning in Optimal Design of Fluid Machinery

In applying machine learning to fluid machinery, two complementary techniques can be identified: dimensionality reduction and reduced-order modeling. Dimensionality reduction aims to extract essential features and dominant patterns from the fluid, which can be used to describe its behavior compactly and efficiently using reduced coordinates. Reduced-order modeling, on the other hand, focuses on developing a parametrized dynamical system that captures the spatiotemporal evolution of the flow; it may also involve creating a statistical map relating the model parameters to averaged quantities.

5.1. Flow Feature Extraction

Machine learning (ML) is highly regarded for its strengths in pattern recognition and data mining, making it a valuable tool for analyzing spatiotemporal fluid data. The ML community has developed several techniques that can be easily applied in this field. In this discussion, we will delve into both linear and nonlinear dimensionality reduction methods, clustering, and classification techniques. Furthermore, we will examine approaches for expediting measurement and computation, and for processing experimental flow field data.

5.1.1. Dimensionality Reduction

Dimensionality reduction is a technique that reduces the dimensionality of data while preserving most of the information present in the original data. It can be employed in the preprocessing stage to minimize redundant information and noise. One of the most widely used dimensionality reduction algorithms is Proper Orthogonal Decomposition (POD), particularly in fluid mechanics and structural mechanics. POD reduces the dimensionality of the data by performing singular value decomposition (SVD) on the high-dimensional data matrix and keeping only the first r decomposition components. These components contain the most critical features or geometric structures of the data, and a rank-r reconstruction recovers most of the original dataset. Similarly, principal component analysis (PCA) is used in machine learning for dimensionality reduction by generating a set of orthogonal principal components from the covariance matrix of the dataset; a combination of principal components captures most of the variance in the original data.
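In code, the POD procedure described above reduces to an SVD of the mean-subtracted snapshot matrix, as in the following sketch; the synthetic snapshot matrix and the mode count r are illustrative assumptions.

```python
# Minimal POD sketch: stack flow snapshots as columns, take the SVD,
# and keep the first r modes. The snapshot matrix here is synthetic;
# in practice its columns would be CFD or experimental flow fields.
import numpy as np

rng = np.random.default_rng(0)
n_points, n_snapshots, r = 1000, 50, 5

# Synthetic snapshots: a few coherent spatial modes plus noise.
modes_true = rng.normal(size=(n_points, 3))
coeffs = rng.normal(size=(3, n_snapshots))
Q = modes_true @ coeffs + 0.01 * rng.normal(size=(n_points, n_snapshots))

Q_mean = Q.mean(axis=1, keepdims=True)           # subtract the mean flow
U, s, Vt = np.linalg.svd(Q - Q_mean, full_matrices=False)

Phi = U[:, :r]                                   # first r POD modes
a = np.diag(s[:r]) @ Vt[:r]                      # modal coefficients
Q_rec = Q_mean + Phi @ a                         # rank-r reconstruction

energy = (s[:r] ** 2).sum() / (s ** 2).sum()
print(f"energy captured by {r} modes: {energy:.4f}")
print("reconstruction error:",
      np.linalg.norm(Q - Q_rec) / np.linalg.norm(Q))
```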
In a study by Mendez et al. [73], the differences and connections between autoencoders and manifold learning methods were emphasized. They investigated the impact of nonlinear techniques such as kernel principal component analysis, isometric feature learning, and local linear embedding on filtering, oscillation pattern recognition, and data compression.

5.1.2. Clustering and Classification

In the field of fluid machinery, flow field data refer to information describing the fluid’s motion state in a specific area. This information includes parameters such as velocity, pressure, and density. Clustering and classifying flow field data under different physical or boundary conditions can aid in understanding the fluid’s characteristics and its physical laws, and facilitate fluid control or optimal design. Machine learning offers various approaches, including K-means, which divides data points into K clusters, allowing for the identification of different fluid elements moving in the flow field. The K-means algorithm iteratively adjusts the positions of the clusters to minimize the error function, and the clustering results can be interpreted using the center point. Mi et al. [74] applied K-means clustering to density logging and P-wave velocity data from three wells. The density log equation was also used to calculate the porosity of each cluster. The main lithologies, pore fluids, and fluid contacts were identified based on the center of mass of each cluster. However, this algorithm may lead to poor clustering results for complex structures.
Alternatively, the Gaussian Mixture Model (GMM) is suitable for more complex, nonlinear structural problems. GMM is a clustering method based on the probability density model and uses labeled particles, such as droplets or bubbles, as data points to achieve clustering by modeling the density relationship between them. GMM can adapt and fit to various shapes of clustering structures and avoids wrong classifications due to distribution instability. Zeng et al. [75] used a deep autoencoder (DAE) and a Gaussian Mixture Model (GMM) to cluster trajectories and mine the main traffic flow patterns in the terminal airspace. The feature representations extracted by the DAE from historical high-dimensional trajectory data were used as input to the GMM and used for clustering.
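The sketch below contrasts the two clustering approaches on synthetic three-dimensional "flow feature" vectors (e.g., velocity components per point); the anisotropic clusters illustrate why a GMM with full covariances can separate structures that the spherical-cluster assumption of K-means handles poorly. The data are purely illustrative.

```python
# Minimal sketch contrasting K-means and a Gaussian Mixture Model on
# synthetic flow-feature vectors. GMM's full covariances fit elongated,
# anisotropic clusters where K-means assumes roughly spherical ones.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two anisotropic point clouds standing in for distinct flow regimes.
c1 = rng.normal([0, 0, 0], [2.0, 0.2, 0.2], size=(300, 3))
c2 = rng.normal([3, 2, 1], [0.2, 2.0, 0.2], size=(300, 3))
X = np.vstack([c1, c2])

km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(X)
gmm_labels = gmm.predict(X)

print("K-means cluster sizes:", np.bincount(km_labels))
print("GMM cluster sizes:    ", np.bincount(gmm_labels))
```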
Convolutional neural networks (CNNs) are commonly used for classification and can automatically extract features from input data. In the field of fluid mechanics, CNNs can detect fluid elements and track them across multiple image frames. Furthermore, CNNs can perform semi-supervised learning, which enables the clustering of unlabeled data points into different categories. He et al. [76] applied a CNN to feature extraction in multidimensional data and used a Long Short-Term Memory Network (LSTM) to identify the relationships between different time steps, which overcomes the limitation of high-quality feature dependence.

5.1.3. Sparse and Randomized Methods

In machine learning, there are various approaches to choose from. For instance, the sparse method reduces the feature dimension by selecting some of the most representative features from the original features. This improves the accuracy and interpretability of the model prediction and is suitable for situations where only some features impact the model prediction results.
Compared to the sparse method, the random method is more appropriate for situations where the feature dimension is high, but the data are sparse and noisy. Techniques such as low-rank decomposition and random projection are used to randomly reduce the high-dimensional data to a low-dimensional space, thereby speeding up calculations and reducing noise. Commonly used methods include Proper Orthogonal Decomposition (POD) and Dynamic Mode Decomposition (DMD).
Krah et al. [77] used the biorthogonal wavelet to compress data for wavelet feature extraction and reduce the amount of data analyzed, resulting in a sparse representation while controlling compression errors. DMD extracts key features that reflect system movement through the analysis of a large amount of time series data. These features can be used as input features for machine learning models and are valuable for problems such as time series forecasting and control.
In fluid machinery optimization design, these methods are often used in combination with machine learning to improve the efficiency and accuracy of data analysis. For example, DMD can decompose high-dimensional dynamic systems, and after extracting features, machine learning algorithms can perform tasks such as classification or prediction. Naderi et al. [78] used the Hybrid Dynamic Mode Decomposition (HDMD) method to analyze unsteady fluid flow on a moving structure. Using the K-Nearest Neighbors (KNN) algorithm, numerical data of the dynamic grid are interpolated to a single stationary grid at each time step, providing the required fixed spatial domain for DMD.
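A minimal exact-DMD sketch is given below: the best-fit linear operator between successive snapshots is computed in a truncated SVD basis, and its eigenvalues yield growth rates and frequencies. The traveling-wave snapshot data and the truncation rank are illustrative assumptions.

```python
# Minimal (exact) DMD sketch: given time-resolved snapshots, fit the
# best-fit linear operator between successive states and extract its
# eigenvalues (growth/frequency) and modes. Snapshot data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 400, 60, 4
t = np.linspace(0, 4 * np.pi, m)
x = np.linspace(-1, 1, n)[:, None]
# Synthetic data: two standing/traveling-wave structures plus noise.
Q = (np.sin(2 * np.pi * x) * np.cos(3 * t)
     + 0.5 * np.cos(np.pi * x) * np.sin(5 * t)
     + 0.01 * rng.normal(size=(n, m)))

X1, X2 = Q[:, :-1], Q[:, 1:]                  # snapshot pairs
U, s, Vt = np.linalg.svd(X1, full_matrices=False)
Ur, Sr, Vr = U[:, :r], np.diag(s[:r]), Vt[:r].T
A_tilde = Ur.T @ X2 @ Vr @ np.linalg.inv(Sr)  # reduced operator
eigvals, W = np.linalg.eig(A_tilde)
Phi = X2 @ Vr @ np.linalg.inv(Sr) @ W         # DMD modes

dt = t[1] - t[0]
print("continuous-time eigenvalues:",
      np.log(eigvals.astype(complex)) / dt)
```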

5.1.4. Super-Resolution and Flow Cleansing

Super-resolution technology refers to using computer algorithms to increase the resolution of images, videos, or audio signals. This technology employs mathematical models, statistical knowledge, and machine learning methods to reconstruct missing pixels or signals, producing high-quality and clear results. In the field of computer vision and image processing, super-resolution is widely used to improve image or video quality, restore blurred or low-quality images, and enlarge small images. Similarly, in fluid mechanics, super-resolution techniques can help to visualize flow fields, leading to better process understanding.
One of the machine learning-based super-resolution techniques is the three-dimensional (3D) Enhanced Deep Super-Resolution (EDSR) convolutional neural network, which Jackson et al. [79] developed. This network enhances the low-resolution images of large samples first, and then generates high-resolution images from those images, which is helpful in reducing micro-CT hardware or reconstruction defects often seen in high-resolution images. The EDSR technique can help to identify fluid phenomena, such as flow lines and vortices, and improve the efficiency of hydrodynamic analysis and visualization.
Flow cleansing refers to the process of selecting relatively important features using algorithms or models for dimensionality reduction. In data processing, missing-value handling, outlier detection, and noise filtering are usually employed to gradually clean the data, which can reduce computational complexity and improve algorithm accuracy. Montes et al. [80] used the Evolutionary Polynomial Regression-Multi-Objective Genetic Algorithm (EPR-MOGA) to collect data from different steady-state flows, process them, and derive three new self-cleansing models based on their optimization strategies, each using a different set of potential input parameters to describe the modified Froude number. The summary for flow feature extraction is provided in Table 5.

5.2. Optimal Design of Fluid Machinery

To achieve optimal design for fluid machinery, it is important to consider various interconnected factors. The relationship between performance and geometric parameters is highly sensitive and complex, displaying strong nonlinearity. Some scholars suggest using machine learning technology to simulate and better understand the behavior and mechanism of action, leveraging deep information and knowledge to improve the practicality and efficiency of optimization. Specifically, in the optimization design of fluid machinery, machine learning can be applied to construct surrogate or reduced-order models to explore the correlations between design variables or the relationship between design variables and performance.

5.2.1. Surrogate Model

A surrogate model approximates the relationship between inputs and outputs. In fluid machinery, surrogate models are commonly built from experimental or simulation data using methods such as Gaussian process regression, the radial basis function (RBF) network, and the multilayer perceptron (MLP). The resulting models can then quickly predict objective functions from design parameters, partially or completely replacing CFD analysis.
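As a minimal illustration, the sketch below fits a Gaussian process surrogate to a handful of evaluations of a hypothetical analytic stand-in for a CFD objective, then predicts new designs with uncertainty estimates. The design variables and response function are assumptions, not taken from the cited studies.

```python
# Minimal surrogate-model sketch: Gaussian process regression mapping
# design parameters to a performance metric, trained on a handful of
# "expensive" evaluations. The objective is a hypothetical analytic
# stand-in for a CFD run.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def cfd_stand_in(x):
    # Hypothetical efficiency response over two design variables.
    return np.sin(3 * x[:, 0]) * np.cos(2 * x[:, 1]) + 0.5 * x[:, 0]

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, (25, 2))          # 25 "CFD" samples
y_train = cfd_stand_in(X_train)

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                              normalize_y=True).fit(X_train, y_train)

X_query = rng.uniform(0, 1, (5, 2))
mean, std = gp.predict(X_query, return_std=True)
for xq, m, s in zip(X_query, mean, std):
    print(f"design {np.round(xq, 2)} -> predicted {m:.3f} +/- {s:.3f}")
```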
Si et al. [81] optimized the impeller of a pump using an orthogonal experimental design and the multi-island genetic algorithm (MIGA). They obtained the best impeller geometry by using the response surface method and analyzed the mixed flow design variables and performance variations in the inlet section of the pump impeller to obtain the optimal shape. Zhu et al. [82] applied the Artificial Bee Colony (ABC) algorithm to find the optimal parameters in the traditional Support Vector Regression (SVR) model. They established an improved SVR model, the Improved Support Vector Regression Model (ISRM), and used the Multiple Population Genetic Algorithm (MPGA) to solve the optimization model and program of the ISRM method. This approach improved the reliability of optimization design for complex structures, particularly turbine blades, by increasing the modeling accuracy and optimization efficiency. Bouhlel et al. [83] used an artificial neural network based on gradient enhancement to simulate airfoils under different flight conditions. They simulated the aerodynamic coefficients of subsonic and transonic airfoils by gradually introducing gradient information, allowing airfoil design optimization within seconds. Compared to traditional CFD-based optimization models, this approach greatly reduces the calculation time while producing similar results.

5.2.2. Reduced-Order Model

The reduced-order model uses machine learning methods to simplify the Navier–Stokes (N–S) equations for flow optimization, resulting in a streamlined model that can extract flow features and guide optimized design.
Yao et al. [84] used Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (Light GBM) algorithms to model the filtered density function of the mixture fraction in a turbulent evaporative spray. Their integrated model achieved high accuracy. Aversano et al. [85] combined PCA with Kriging to build a low-level model. They used PCA to identify system invariants and separate them from coefficients related to characteristic operating conditions. The Kriging correspondence method was then used to find the response surface for these coefficients.
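A minimal sketch of this PCA-plus-Kriging pattern follows: field snapshots are compressed with PCA, and the retained coefficients are regressed on an operating-condition parameter with Gaussian process (Kriging) regression. All data and the component count are illustrative assumptions.

```python
# Minimal PCA-plus-Kriging reduced-order-model sketch: compress field
# snapshots with PCA, then regress the PCA coefficients on the
# operating-condition parameter with a Gaussian process. Synthetic data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
params = np.linspace(0, 1, 20)[:, None]           # operating conditions
x = np.linspace(0, 1, 300)
# Synthetic fields that vary smoothly with the operating parameter.
fields = np.array([np.sin(2 * np.pi * (x - 0.3 * p)) * (1 + p)
                   for p in params[:, 0]])

pca = PCA(n_components=4).fit(fields)
coeffs = pca.transform(fields)                    # per-snapshot scores

gp = GaussianProcessRegressor(kernel=1.0 * RBF(length_scale=0.2),
                              normalize_y=True).fit(params, coeffs)

p_new = np.array([[0.37]])                        # unseen condition
field_pred = pca.inverse_transform(gp.predict(p_new))
print("predicted field shape:", field_pred.shape)
```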

5.2.3. Deep Learning

In recent years, there have been remarkable developments and achievements in machine learning methods, especially in deep learning. Compared to traditional machine learning, deep learning utilizes a more complex network structure and improved training processes, resulting in greatly improved inductive ability. Deep learning models can independently select and eliminate useless features from a large pool of candidate features, and perform regression and classification tasks. Reinforcement learning, in particular, offers an interactive learning method that can be used to optimize design.
Renganathan et al. [86] utilized machine learning to create a predictive model for the radar-based measurement of wind turbine wakes. They employed probabilistic machine learning techniques, specifically Gaussian process modeling, to learn the mapping between the parameter space and the latent space and to account for epistemic and statistical uncertainty in the data. This approach provided an accurate approximation of the wind turbine wake field. Maulik et al. [87] used a Probabilistic Neural Network (PNN) to develop a machine-learning-based surrogate model for fluid flow that also quantified the model's uncertainty, enhancing its reliability. Li et al. [88] employed a deep reinforcement learning algorithm to reduce the aerodynamic drag of a supercritical airfoil. They pre-trained the initial strategy of reinforcement learning through imitation learning and trained the policy in a proxy-model-based environment, which effectively improved the mean drag reduction across 200 airfoils. Zheng et al. [89] combined the Bayesian optimization algorithm with a specified control action and used a Gaussian process regression surrogate model to predict the amplitude of vortex-induced vibration. They also applied the soft actor–critic deep reinforcement learning algorithm to build a real-time control system, providing a novel concept for typical flow control problems. The summary for the optimal design of fluid machinery using machine learning techniques is provided in Table 6.

6. Outlook

After reviewing recent research on the application of machine learning methods in optimizing fluid machinery, this paper concludes that there are still many areas in this field that warrant further exploration.
Firstly, as science and technology advance, there is a growing need for more accurate turbulence models. Although machine learning techniques offer new possibilities for the development of such models, their industrial applications are still in their infancy. Current methods, such as using statistical models and data to compensate for prediction differences, often suffer from poor convergence, low learning speed, and reduced accuracy caused by the introduction of human knowledge. Therefore, further research is necessary to improve the effectiveness of machine learning in optimizing fluid machinery.
Secondly, the issue of sparse design sample data is common in fluid machinery, and accurately quantifying the underlying physical mechanisms is crucial for analysis. Probabilistic machine learning has significant potential for data-driven prediction, fluid control, shape optimization, reconstruction, and model reduction in fluid mechanics applications. Current research shows that it can be used to create surrogate models for turbulent and compressible fluids, representing a promising research direction.
Finally, in machine learning, it is essential to consider various factors when selecting an appropriate algorithm, such as the quality and quantity of data, the expected inputs and outputs, the cost function to be optimized, whether the task involves interpolation or extrapolation, and the interpretability of the model. Failure to account for these factors may result in overfitting or in a model that fails to converge. Therefore, it is vital to establish an appropriate cross-validation procedure and to optimize the algorithm in order to improve the accuracy and effectiveness of machine learning in optimizing fluid machinery.
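As a minimal sketch of the cross-validation practice recommended above, the following Python example compares three candidate surrogate models by five-fold cross-validation with scikit-learn; the synthetic design-variable and performance data are placeholders for an actual design-of-experiments dataset.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)
X = rng.uniform(size=(80, 3))     # e.g., blade angle, wrap angle, outlet width (assumed)
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=80)   # synthetic performance

# Candidate surrogate models; hyperparameters here are illustrative defaults.
candidates = {
    "SVR": SVR(C=10.0),
    "Random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "Kriging": GaussianProcessRegressor(normalize_y=True),
}
for name, model in candidates.items():
    # Five-fold cross-validation guards against choosing a model that merely overfits.
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: R^2 = {scores.mean():.3f} +/- {scores.std():.3f}")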

Author Contributions

Conceptualization, B.X. and J.D.; methodology, B.X.; software, J.D.; validation, J.D. and X.L.; formal analysis, J.D.; investigation, X.L.; resources, A.C.; data curation, J.C.; writing—original draft preparation, J.D.; writing—review and editing, B.X.; visualization, J.D.; supervision, B.X. and D.Z.; project administration, B.X. and D.Z.; funding acquisition, B.X. and D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. U2106225), Natural Science Foundation of Jiangsu Province (No. 21KJA470002), Jiangsu Province Engineering Research Center of High-Level Energy and Power Equipment (No. JSNYDL-202204), Senior Talent Foundation of Jiangsu University (No. 18JDG034), and Postdoctoral Science Foundation of Jiangsu Province (No. 2018K102C).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the editors and the anonymous reviewers for their helpful and constructive comments and for the considerable time they dedicated to helping us improve this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Luo, X.; Ye, W.; Song, X.; Geng, C. Future fluid machinery supporting “double-carbon” targets. J. Tsinghua Univ. Sci. Technol. 2022, 62, 678–692.
2. San, O.; Pawar, S.; Rasheed, A. Prospects of federated machine learning in fluid dynamics. AIP Adv. 2022, 12, 095212.
3. Li, Y.; Chang, J.; Kong, C.; Bao, W. Recent progress of machine learning in flow modeling and active flow control. Chin. J. Aeronaut. 2022, 35, 14–44.
4. Zhao, J.; Zhang, M.; Zhu, Y.; Cheng, R.; Wang, L.; Li, X. Topology optimization of planar heat sinks considering out-of-plane design-dependent deformation problems. Meccanica 2021, 56, 1693–1706.
5. Yang, W.; Liu, B.; Xiao, R. Three-dimensional inverse design method for hydraulic machinery. Energies 2019, 12, 3210.
6. Moghadassian, B.; Sharma, A. Designing wind turbine rotor blades to enhance energy capture in turbine arrays. Renew. Energy 2020, 148, 651–664.
7. Souza, B.C.; Yamabe, P.V.M.; Sa, L.F.N.; Ranjbarzadeh, S.; Picelli, R.; Silva, E.C.N. Topology optimization of fluid flow by using Integer Linear Programming. Struct. Multidiscip. Optim. 2021, 64, 1221–1240.
8. Wildey, T.; Gorodetsky, A.A.; Belme, A.C.; Shadid, J.N. Robust uncertainty quantification using response surface approximations of discontinuous functions. Int. J. Uncertain. Quantif. 2019, 9, 415–437.
9. Chelabi, M.A.; Saga, M.; Kuric, I.; Basova, Y.; Dobrotvorskiy, S.; Ivanov, V.; Pavlenko, I. The effect of blade angle deviation on mixed inflow turbine performances. Appl. Sci. 2022, 12, 3781.
10. Meng, D.; Liu, M.; Yang, S.; Zhang, H.; Ding, R. A fluid-structure analysis approach and its application in the uncertainty-based multidisciplinary design and optimization for blades. Adv. Mech. Eng. 2018, 10, 1687814018783410.
11. Munk, D.J.; Kipouros, T.; Vio, G.A.; Parks, G.T.; Steven, G.P. On the effect of fluid-structure interactions and choice of algorithm in multi-physics topology optimisation. Finite Elem. Anal. Des. 2018, 145, 32–54.
12. Ghosh, S.; Padmanabha, G.A.; Peng, C.; Andreoli, V.; Atkinson, S.; Pandita, P.; Vandeputte, T.; Zabaras, N.; Wang, L. Inverse aerodynamic design of gas turbine blades using probabilistic machine learning. J. Mech. Des. 2022, 144.
13. Yu, Z.; Tan, J.; Wang, S.; Guo, B. Multiple parameters and target optimization of splitter blades for axial spiral blade blood pump using computational fluid mechanics, neural networks, and particle image velocimetry experiment. Sci. Prog. 2021, 104, 00368504211039363.
14. Yu, Z.; Tan, J.; Wang, S. Multi-parameter analysis of the effects on hydraulic performance and hemolysis of blood pump splitter blades. Adv. Mech. Eng. 2020, 12, 1687814020921299.
15. Shi, L.; Zhu, J.; Tang, F.; Wang, C. Multi-disciplinary optimization design of axial-flow pump impellers based on the approximation model. Energies 2020, 13, 779.
16. Xu, K.; Wang, G.; Wang, L.; Yun, F.; Sun, W.; Wang, X.; Chen, X. Parameter analysis and optimization of annular jet pump based on Kriging model. Appl. Sci. 2020, 10, 7860.
17. Kavuri, C.; Kokjohn, S.L. Exploring the potential of machine learning in reducing the computational time/expense and improving the reliability of engine optimization studies. Int. J. Engine Res. 2020, 21, 1251–1270.
18. Kim, S.; Kim, Y.-I.; Kim, J.-H.; Choi, Y.-S. Three-objective optimization of a mixed-flow pump impeller for improved suction performance and efficiency. Adv. Mech. Eng. 2019, 11, 1687814019898969.
19. Peng, C.; Zhang, X.; Gao, Z.; Wu, J.; Gong, Y. Research on cooperative optimization of multiphase pump impeller and diffuser based on adaptive refined response surface method. Adv. Mech. Eng. 2022, 14, 16878140211072944.
20. Wang, Y.R.; Zhao, W.J.; Liang, L.G.; Wu, Q.; Jin, H.H.; Wang, C.X. Research status and prospects of aerodynamic optimization design of fluid machinery based on machine learning methods. Chin. J. Turbomach. 2020, 62, 77–90.
21. Brunton, S.L.; Noack, B.R.; Koumoutsakos, P. Machine learning for fluid mechanics. In Annual Review of Fluid Mechanics; Davis, S.H., Moin, P., Eds.; Annual Reviews: San Mateo, CA, USA, 2020; Volume 52, pp. 477–508.
22. Liu, F.N.; Shi, J.X.; Wang, W.J.; Zhao, R. Overview of machine learning algorithms in material science. New Chemical Mater. 2022, 50, 42–46, 52.
23. Al Shahrani, A.M.M.; Alomar, M.A.; Alqahtani, K.N.N.; Basingab, M.S.; Sharma, B.; Rizwan, A. Machine learning-enabled smart industrial automation systems using internet of things. Sensors 2023, 23, 324.
24. Nassif, A.B.; Shahin, I.; Attili, I.; Azzeh, M.; Shaalan, K. Speech recognition using deep neural networks: A systematic review. IEEE Access 2019, 7, 19143–19165.
25. Zhang, M.; Song, W.; Zhang, J. A secure clinical diagnosis with privacy-preserving multiclass support vector machine in clouds. IEEE Syst. J. 2022, 16, 67–78.
26. Jafari-Marandi, R. Supervised or unsupervised learning? Investigating the role of pattern recognition assumptions in the success of binary predictive prescriptions. Neurocomputing 2021, 434, 165–193.
27. Hu, T.; Lin, X.; Wang, X.; Du, P. Supervised learning algorithm based on spike optimization mechanism for multilayer spiking neural networks. Int. J. Mach. Learn. Cybern. 2022, 13, 1981–1995.
28. Wang, Y.Y.; Yu, H.F.; Li, B.; Lu, X.M. How to analyze fMRI data with open source tools: An introduction to supervised machine learning algorithm for multi-voxel patterns analysis. J. Psychol. Sci. 2022, 45, 718–724.
29. Chao, Z.; Zheng, W.; Futong, Q.; Yi, L. An enhanced supervised cross-domain protocol defect prediction algorithm. Comput. Eng. Appl. 2022, 1–7.
30. Xun, Y.Y. Research on Fault Identification of Rolling Bearing in Rotating Fluid Machinery. Master’s Thesis, Zhejiang Sci-Tech University, Hangzhou, China, 2020.
31. Bordoloi, D.J.; Tiwari, R. Identification of suction flow blockages and casing cavitations in centrifugal pumps by optimal support vector machine techniques. J. Braz. Soc. Mech. Sci. Eng. 2017, 39, 2957–2968.
32. Feng, Z.; Yi, Y.; Xin, L.; Yaguang, J.; Rufang, L. Equipment fault diagnosis technology based on supervised and unsupervised learning algorithms and investigation on algorithm fusion. Ind. Technol. Innov. 2022, 9, 30–38.
33. Yang, J.; Gao, H. Cultural emperor penguin optimizer and its application for face recognition. Math. Probl. Eng. 2020, 2020, 9579538.
34. Hu, L.; Cui, J. Digital image recognition based on Fractional-order-PCA-SVM coupling algorithm. Measurement 2019, 145, 150–159.
35. Tharwat, A.; Hassanien, A.E. Chaotic antlion algorithm for parameter optimization of support vector machine. Appl. Intell. 2018, 48, 670–686.
36. Tan, X.; Yu, F.; Zhao, X. Support vector machine algorithm for artificial intelligence optimization. Clust. Comput. J. Netw. Softw. Tools Appl. 2019, 22, 15015–15021.
37. Wang, D.; Jiang, J.; Mo, J.; Tang, J.; Lv, X. Rapid screening of thyroid dysfunction using Raman spectroscopy combined with an improved support vector machine. Appl. Spectrosc. 2020, 74, 674–683.
38. Xie, T.; Yao, J.; Zhou, Z. DA-based parameter optimization of combined kernel support vector machine for cancer diagnosis. Processes 2019, 7, 263.
39. Li, J.; Fang, G. A novel differential evolution algorithm integrating opposition-based learning and adjacent two generations hybrid competition for parameter selection of SVM. Evol. Syst. 2021, 12, 207–215.
40. Ding, C.; Ding, Q.; Wang, Z.; Zhou, Y. Fault diagnosis of oil-immersed transformers based on the improved sparrow search algorithm optimised support vector machine. IET Electr. Power Appl. 2022, 16, 985–995.
41. Wang, Q.; Chen, H.H. Optimization of parallel random forest algorithm based on distance weight. J. Intell. Fuzzy Syst. 2020, 39, 1951–1963.
42. Wang, H.; Wang, G.Z. Improving random forest algorithm by Lasso method. J. Stat. Comput. Simul. 2021, 91, 353–367.
43. Wang, S.Z.; Zhang, Z.F.; Geng, S.S.; Pang, C.Y. Research on optimization of random forest algorithm based on Spark. CMC Comput. Mater. Contin. 2022, 71, 3721–3731.
44. Sun, J.; Zhong, G.; Huang, K.; Dong, J. Banzhaf random forests: Cooperative game theory based random forests with consistency. Neural Netw. 2018, 106, 20–29.
45. Wang, L.; Wang, Y.; Fu, Y.; Gao, Y.; Du, J.; Yang, C.; Liu, J. AFSBN: A method of artificial fish swarm optimizing Bayesian network for epistasis detection. IEEE ACM Trans. Comput. Biol. Bioinform. 2021, 18, 1369–1383.
46. Maionchi, D.D.; Ainstein, L.; dos Santos, F.P.; de Souza, M.B. Computational fluid dynamics and machine learning as tools for optimization of micromixers geometry. Int. J. Heat Mass Transf. 2022, 194, 655–661.
47. Huang, H.-B.; Xie, Z.-H. Generalized regression neural network optimized by genetic algorithm for solving out-of-sample extension problem in supervised manifold learning. Neural Process. Lett. 2019, 50, 2567–2593.
48. Chui, K.T.; Lytras, M.D.; Liu, R.W. A generic design of driver drowsiness and stress recognition using MOGA optimized deep MKL-SVM. Sensors 2020, 20, 1474.
49. Stoudenmire, E.M. Learning relevant features of data with multi-scale tensor networks. Quantum Sci. Technol. 2018, 3, 034003.
50. Li, J.; Xie, F. Self-organized criticality of molecular biology and thermodynamic analysis of life system based on optimized particle swarm algorithm. Cell. Mol. Biol. 2020, 66, 177–192.
51. Liu, Y.; Wang, H.; Fei, Y.; Liu, Y.; Shen, L.; Zhuang, Z.; Zhang, X. Research on the prediction of green plum acidity based on improved XGBoost. Sensors 2021, 21, 930.
52. Hamadeh, L.; Imran, S.; Bencsik, M.; Sharpe, G.R.; Johnson, M.A.; Fairhurst, D.J. Machine learning analysis for quantitative discrimination of dried blood droplets. Sci. Rep. 2020, 10, 3313.
53. Ge, S.; Wang, H.; Alavi, A.; Xing, E.; Bar-Joseph, Z. Supervised adversarial alignment of single-cell RNA-seq data. J. Comput. Biol. 2021, 28, 501–513.
54. Deng, K.; Zhang, X. Tensor envelope mixture model for simultaneous clustering and multiway dimension reduction. Biometrics 2022, 78, 1067–1079.
55. Charte, D.; Charte, F.; Herrera, F. Reducing data complexity using autoencoders with class-informed loss functions. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 9549–9560.
56. Li, J. Unsupervised robust discriminative manifold embedding with self-expressiveness. Neural Netw. 2019, 113, 102–115.
57. Gan, Y.; Dong, X.; Zhou, H.; Gao, F.; Dong, J. Learning the precise feature for cluster assignment. IEEE Trans. Cybern. 2022, 52, 8587–8600.
58. Basar, S.; Ali, M.; Ochoa-Ruiz, G.; Zareei, M.; Waheed, A.; Adnan, A. Unsupervised color image segmentation: A case of RGB histogram based K-means clustering initialization. PLoS ONE 2020, 15, e0240015.
59. Kimura, M. Generalized t-SNE through the lens of information geometry. IEEE Access 2021, 9, 129619–129625.
60. Unver, M.; Erginel, N. Clustering applications of IFDBSCAN algorithm with comparative analysis. J. Intell. Fuzzy Syst. 2020, 39, 6099–6108.
61. Huiwei, X. Research on Network Traffic Analysis and Prediction Based on Decision Tree Integration and Width Forest. Master’s Thesis, Nanjing University of Posts and Telecommunications, Nanjing, China, 2020.
62. Peng, Z.; Zhaolong, C.; Zhili, C. Design of intelligent fault diagnosis system for fluid machinery. Petrochem. Equip. Technol. 2022, 43, 52–58, 62.
63. Ling, Z. High-Precision Numerical Simulation of Airfoil and Prediction of Airfoil Flow Field Based on Machine Learning Method; Lanzhou University of Technology: Lanzhou, China, 2021.
64. Zihao, L.; Xiaoyuan, J. Defect prediction of semi-supervised extreme learning machine based on improved SMOTE. Comput. Technol. Dev. 2021, 31, 21–25.
65. Martin-Guerrero, J.D.; Lamata, L. Reinforcement learning and physics. Appl. Sci. 2021, 11, 8589.
66. Ding, S.; Du, W.; Zhao, X.; Wang, L.; Jia, W. A new asynchronous reinforcement learning algorithm based on improved parallel PSO. Appl. Intell. 2019, 49, 4211–4222.
67. Kumar, M.; Ranjan, S.; Tiwari, N.K. Oxygen transfer study and modeling of plunging hollow jets. Appl. Water Sci. 2018, 8, 121.
68. Dalca, A.V.; Balakrishnan, G.; Guttag, J.; Sabuncu, M.R. Unsupervised learning of probabilistic diffeomorphic registration for images and surfaces. Med. Image Anal. 2019, 57, 226–236.
69. Kong, S.; Liu, C.; Shi, Y.; Xie, Y.; Wang, K. Review of application prospect of deep reinforcement learning in intelligent manufacturing. Comput. Eng. Appl. 2021, 57, 49–59.
70. Li, C.; Sanchez, R.V.; Zurita, G.; Cerrada, M.; Cabrera, D. Fault diagnosis for rotating machinery using vibration measurement deep statistical feature learning. Sensors 2016, 16, 895.
71. Viquerat, J.; Rabault, J.; Kuhnle, A.; Ghraieb, H.; Larcher, A.; Hachem, E. Direct shape optimization through deep reinforcement learning. J. Comput. Phys. 2021, 428, 110080.
72. Young, M.T.; Hinkle, J.D.; Kannan, R.; Ramanathan, A. Distributed Bayesian optimization of deep reinforcement learning algorithms. J. Parallel Distrib. Comput. 2020, 139, 43–52.
73. Mendez, M.A. Linear and nonlinear dimensionality reduction from fluid mechanics to machine learning. Meas. Sci. Technol. 2023, 34, 042001.
74. Mi, A.; Chen, S.-C. Characterization of well logs using K-mean cluster analysis. J. Pet. Explor. Prod. Technol. 2020, 10, 2245–2256.
75. Zeng, W.; Xu, Z.; Cai, Z.; Chu, X.; Lu, X. Aircraft trajectory clustering in terminal airspace based on deep autoencoder and Gaussian mixture model. Aerospace 2021, 8, 266.
76. He, W.; Li, J.; Tang, Z.; Wu, B.; Luan, H.; Chen, C.; Liang, H. A novel hybrid CNN-LSTM scheme for nitrogen oxide emission prediction in FCC unit. Math. Probl. Eng. 2020, 2020, 8071810.
77. Krah, P.; Engels, T.; Schneider, K.; Reiss, J. Wavelet adaptive proper orthogonal decomposition for large-scale flow data. Adv. Comput. Math. 2022, 48, 10.
78. Naderi, M.H.; Eivazi, H.; Esfahanian, V. New method for dynamic mode decomposition of flows over moving structures based on machine learning (hybrid dynamic mode decomposition). Phys. Fluids 2019, 31, 127102.
79. Jackson, S.J.; Niu, Y.; Manoorkar, S.; Mostaghimi, P.; Armstrong, R.T. Deep learning of multiresolution X-ray micro-computed-tomography images for multiscale modeling. Phys. Rev. Appl. 2022, 17, 054046.
80. Montes, C.; Berardi, L.; Kapelan, Z.; Saldarriaga, J. Predicting bedload sediment transport of non-cohesive material in sewer pipes using evolutionary polynomial regression—multi-objective genetic algorithm strategy. Urban Water J. 2020, 17, 154–162.
81. Si, Q.; Lu, R.; Shen, C.; Xia, S.; Sheng, G.; Yuan, J. An intelligent CFD-based optimization system for fluid machinery: Automotive electronic pump case application. Appl. Sci. 2020, 10, 366.
82. Zhu, Z.-Z.; Feng, Y.-W.; Lu, C.; Fei, C.-W. Reliability optimization of structural deformation with improved support vector regression model. Adv. Mater. Sci. Eng. 2020, 2020, 3982450.
83. Bouhlel, M.A.; He, S.; Martins, J.R.R.A. Scalable gradient-enhanced artificial neural networks for airfoil shape design in the subsonic and transonic regimes. Struct. Multidiscip. Optim. 2020, 61, 1363–1376.
84. Yao, S.; Kronenburg, A.; Stein, O.T. Efficient modeling of the filtered density function in turbulent sprays using ensemble learning. Combust. Flame 2022, 237, 111722.
85. Aversano, G.; Bellemans, A.; Li, Z.; Coussement, A.; Gicquel, O.; Parente, A. Application of reduced-order models based on PCA & Kriging for the development of digital twins of reacting flow applications. Comput. Chem. Eng. 2019, 121, 422–441.
86. Renganathan, S.A.; Maulik, R.; Letizia, S.; Iungo, G.V. Data-driven wind turbine wake modeling via probabilistic machine learning. Neural Comput. Appl. 2022, 34, 6171–6186.
87. Maulik, R.; Fukami, K.; Ramachandra, N.; Fukagata, K.; Taira, K. Probabilistic neural networks for fluid flow surrogate modeling and data recovery. Phys. Rev. Fluids 2020, 5, 104401.
88. Li, R.; Zhang, Y.; Chen, H. Learning the aerodynamic design of supercritical airfoils through deep reinforcement learning. AIAA J. 2021, 59, 3988–4001.
89. Zheng, C.; Ji, T.; Xie, F.; Zhang, X.; Zheng, H.; Zheng, Y. From active learning to deep reinforcement learning: Intelligent active flow control in suppressing vortex-induced vibration. Phys. Fluids 2021, 33, 063607.
Figure 1. Machine learning method flow [21].
Table 1. Research status of optimal design of fluid machinery.

Problems | Authors | Approach Summary
Internal flow field | Yang et al. [5] | They applied IDM to optimize the design of fluid machinery, effectively suppressing secondary flow and cavitation. This was achieved by controlling the loading parameters and stacking conditions of the blades.
| Moghadassian et al. [6] | They used IDM to calculate the geometry of wind turbine blades. The iterative inverse algorithm solved the optimization problem, improving the performance of both single-rotor and double-rotor wind turbines.
| Souza et al. [7] | They applied the Topology Optimization of Binary Structures (TOBS) method in fluid flow design. This method preserves the material distribution characteristics through the density method, successfully eliminates the gray problem, and clarifies the boundary between fluid and solid.
| Wildey et al. [8] | They utilized the uncertainty of the discontinuity position and generated robust boundaries by estimating specific probability quantities of samples.
Structure parameter | Meng et al. [10] | They developed a multidisciplinary optimization strategy that uses surrogate models to obtain solutions for adiabatic efficiency, equivalent stress, and total pressure ratio. This improved the reliability, safety, and performance of impellers.
| Munk et al. [11] | They demonstrated that coupling a finite element solver with the lattice Boltzmann method can significantly improve the algorithm’s computational efficiency and reduce the influence of fluid–structure coupling on the final optimization design. The degree of coupling can be adjusted to achieve this.
| Ghosh et al. [12] | To overcome challenges related to ill-posed inverse problems of turbine blades, they adopted a Probabilistic Machine Learning (PML) framework. This method produced explicit inverse designs and solved the issue of sparse data required for training such models.
| Yu et al. [13] | They optimized the design of a blood pump using computational fluid dynamics and neural networks. They analyzed the influence of each parameter on performance and completed the parameter optimization.
| Yu et al. [14] | They studied the influence of splitter blade structural factors on the performance and flow field of axial flow pumps. Using the orthogonal array method, they analyzed and optimized the pump’s flow field and performance.
| Shi et al. [15] | They used a multidisciplinary optimization design method based on an approximate model. The objective functions were blade mass and efficiency, and the constraint conditions included head, efficiency, maximum stress, and maximum deformation, all under small flow conditions.
| Xu et al. [16] | They proposed a global optimization method for the design of annular jet pumps, which combines computational fluid dynamics simulations, Kriging approximation models, and experimental data.
| Kim et al. [18] | The mixed-flow pump impeller design was optimized by utilizing a commercial computational fluid dynamics program and a response surface method. The analysis involved the identification of design variables and performance changes in the inlet part of the impeller.
| Peng et al. [19] | A multi-objective genetic algorithm (MOGA) was used to perform multi-objective optimization on the optimal response surface model.
Table 2. Research status of supervised learning.

Method | Authors | Approach Summary
Regression and classification algorithm | Jafari-Marandi et al. [26] | They studied the assumptions that promote efficient pattern recognition and provided a theoretical explanation for why supervised learning may lack effectiveness.
| Hu et al. [27] | They proposed a spike optimization mechanism and demonstrated its ability to improve the accuracy of multi-layer Spiking Neural Networks (SNNs) while also reducing the algorithm’s running time.
| Wang et al. [28] | They discussed the use of support vector machine (SVM) algorithms for processing fMRI data and demonstrated that training supervised learning models can establish a link between brain imaging and stimulation.
| Zhou et al. [29] | The supervised defect prediction model, built using SVM, addresses issues of category imbalance and feature redundancy that exist in current defect prediction approaches.
| Xie [30] | He developed a fault diagnosis model using the SVM algorithm, which accurately classifies existing faults while demonstrating promising results in terms of accuracy, calculation time, and memory consumption.
| Bordoloi et al. [31] | They used the SVM algorithm to diagnose faults in centrifugal pumps, analyzed the conditions under which vibration signals caused these faults, and classified their time-domain features.
| Zhou et al. [32] | They combined the multi-scale CNN model with the MO-GAAL algorithm in an unsupervised learning approach to address active learning, effectively solving the problem of large loss function values in the test set.
| Yang et al. [33] | They proposed the Cultural Emperor Penguin Optimizer (CEPO) to improve the classification accuracy, convergence speed, stability, robustness, and operating efficiency of SVM.
| Hu et al. [34] | They developed a coupling algorithm, using Fractional-order-PCA-SVM, for digital image recognition, which was effective in recognizing digital medical images.
| Tharwat et al. [35] | They proposed the Chaotic Antlion Optimization (CALO) algorithm to optimize the kernel and penalty parameters of the SVM classifier, thereby reducing classification errors.
| Tan [36] | He utilized an improved Particle Swarm Optimization (PSO) algorithm to optimize the penalty factor and kernel function parameters in the SVM model.
| Wang et al. [37] | They introduced a Genetic Particle Swarm Optimization algorithm based on partial least squares to improve the classification accuracy of the SVM model.
| Xie et al. [38] | They proposed a cancer classification algorithm that combines the Dragonfly Algorithm with SVM. The algorithm optimizes the parameters of the SVM classifier and improves the classification accuracy.
| Li et al. [39] | They developed a new Differential Evolution Algorithm (DEA) for selecting parameters in SVM models, which achieves faster convergence speeds and higher classification accuracy.
| Ding et al. [40] | They proposed the Improved Sparrow Search Algorithm (ISSA) for SVM fault diagnosis. They applied this algorithm to establish a fault diagnosis model, ISSA-SVM, for dissolved gas analysis.
| Wang et al. [41] | They proposed the Distance-Weighted Parallel Random Forest (DW-PRF) optimization algorithm. The algorithm incorporates distance weights and parallelization techniques, leading to better results in terms of accuracy and computational time.
| Wang et al. [42] | They proposed a Post-Selection Boosting Random Forest (PBRF) algorithm that combines RF and Lasso regression methods, and this algorithm has been verified to improve model performance.
| Wang et al. [43] | They proposed a method to distinguish between strong and weak correlation features by calculating feature weights using the Spark platform. The method obtains feature subspaces through hierarchical sampling.
| Sun et al. [44] | They proposed the Banzhaf Random Forest (BRF) algorithm, which is based on cooperative game theory. This approach has shown consistency and can narrow the gap between random forest theory and practical applications.
| Wang et al. [45] | They proposed an epistasis detection algorithm, called Artificial Fish Swarm Optimizing Bayesian Network (AFSBN), which has shown superior performance in accurately detecting epistasis across various datasets when compared to other methods.
Evolutionary algorithm | Maionchi et al. [46] | The optimal geometry of circular obstacles in the channel of the micromixer was obtained using a neural network. The genetic algorithm was utilized to identify the geometry that can offer the maximum mixing percentage and minimum pressure drop values.
| Huang et al. [47] | They proposed a generalized regression neural network (GRNN) optimized using a genetic algorithm.
| Chui et al. [48] | They proposed a generalized support vector machine (SVM) model for deep multi-kernel learning, optimized using a multi-objective genetic algorithm. The results demonstrate that the algorithm achieves higher accuracy while also addressing typical issues with datasets in simulated environments, such as unreliable cross-validation and input signals.
Table 3. Research status of unsupervised learning.

Method | Authors | Approach Summary
Dimensionality reduction | Ge et al. [53] | They proposed a domain adversarial neural network model to learn a dimensionality reduction representation of single-cell RNA sequencing data.
| Deng et al. [54] | They proposed a Tensor Envelope Mixture Model (TEMM) for clustering and reducing the dimensionality of tensor data. This model reduces the number of free parameters and estimation variability. Additionally, they developed an expectation-maximization algorithm to obtain likelihood estimates of cluster means and covariances.
| Charte et al. [55] | They proposed an autoencoder-based approach for reducing complexity. This approach improves the shape and distribution of different classes, which further reduces complexity while preserving the most information about the classes in the encoded features.
| Li et al. [56] | They proposed a new unsupervised method for robust and discriminative manifold embedding. This method addresses the issue of low performance on noisy data.
Clustering | Gan et al. [57] | They proposed a general deep clustering framework that integrates representation learning and clustering into a single pipeline for the first time.
| Basar et al. [58] | They proposed a new adaptive initialization method for determining the optimal initialization parameters of the traditional K-means clustering technique. This method optimizes segmentation quality and reduces classification error.
| Kimura et al. [59] | They extended t-SNE using the framework of information geometry.
| Unver et al. [60] | They proposed defining fuzzy core points to enable DBSCAN to operate with two different density modes simultaneously. This proposal also includes the initial concept of a DBSCAN extension called IFDBSCAN.
Table 4. Research status of semi-supervised learning.

Method | Authors | Approach Summary
Regression, Dimension Reduction, and Clustering | Zhao et al. [62] | They developed an intelligent fault identification system for rotary fluid machinery. The system was based on a set of BP neural networks and utilizes genetic algorithms for optimization.
| Ling [63] | He utilized a multilayer feedforward neural network model and a machine learning approach called flow field reproduction to compute the aerodynamic features of the airfoil.
| Lu et al. [64] | They proposed a semi-supervised extreme learning machine (SSELM) method based on improved SMOTE to solve the problem of scarce labeled samples in the model construction process.
Deep Reinforcement Learning | Ding et al. [66] | They created a novel algorithm, APSO-BQSA, for asynchronous reinforcement learning. This algorithm combines backward Q-learning and Asynchronous Particle Swarm Optimization (APSO) techniques to improve its effectiveness.
| Kumar et al. [67] | They evaluated the effectiveness of multiple linear regression, multiple nonlinear regression, and artificial neural network (ANN) models in forecasting the best configuration parameters for jet aerators.
| Dalca et al. [68] | They proposed a probabilistic generative model and derived an inference algorithm based on unsupervised machine learning.
| Li et al. [70] | They developed a deep learning model that utilizes statistical features for measuring vibrations in rotating machinery.
| Viquerat et al. [71] | They utilized deep reinforcement learning (DRL) for direct shape optimization. This study shows the potential of reinforcement learning research for hydrodynamic shape optimization and its effectiveness in this application.
| Young et al. [72] | They introduced a new, open-source algorithm for distributed Bayesian model optimization, named HyperSpace.
Table 5. Research status of flow feature extraction.

Method | Authors | Approach Summary
Dimensionality reduction | Mendez et al. [73] | The distinctions and associations between autoencoders and manifold learning methods were highlighted. The study explored how nonlinear techniques, such as kernel principal component analysis, isometric feature learning, and local linear embedding, affect filtering, oscillation pattern recognition, and data compression.
Clustering and classification | Mi et al. [74] | They applied K-means clustering to density logging and P-wave velocity data from three wells. Additionally, they used the density log equation to calculate the porosity of each cluster. By identifying the center of mass of each cluster, they were able to determine the main lithologies, pore fluids, and fluid contacts.
| Zeng et al. [75] | To cluster trajectories and identify the main traffic flow patterns in the terminal airspace, they used a deep autoencoder (DAE) and a Gaussian Mixture Model (GMM). Specifically, they utilized the feature representations extracted by the DAE from historical high-dimensional trajectory data as input for the GMM clustering.
| He et al. [76] | They applied a convolutional neural network (CNN) to extract features from multidimensional data. Additionally, they utilized a Long Short-Term Memory Network (LSTM) to identify the relationships between different time steps. This overcomes the limitation of high-quality feature dependence.
Sparse and randomized methods | Krah et al. [77] | They applied biorthogonal wavelets to compress data for wavelet feature extraction, resulting in a sparse representation that reduces the amount of data analyzed while controlling compression errors.
| Naderi et al. [78] | They utilized the Hybrid Dynamic Mode Decomposition (HDMD) method to analyze the unsteady fluid flow on a moving structure. To achieve a fixed spatial domain for DMD, they used the K-Nearest Neighbors (KNN) algorithm to interpolate numerical data of the dynamic grid onto a single stationary grid at each time step. This approach allowed them to effectively study the fluid dynamics in a more manageable and accurate manner.
Super-resolution and flow cleansing | Jackson et al. [79] | They developed a three-dimensional (3D) Enhanced Deep Super-Resolution (EDSR) convolutional neural network.
| Montes et al. [80] | They employed the Evolutionary Polynomial Regression-Multi-Objective Genetic Algorithm (EPR-MOGA) to gather data from various steady-state flows, process them, and create three new self-cleaning models based on their optimization strategies.
Table 6. Research status of optimal design of fluid machinery using machine learning techniques.

Method | Authors | Approach Summary
Surrogate model | Si et al. [81] | The impeller of a pump was optimized using an orthogonal experimental design in combination with the multi-island genetic algorithm (MIGA).
| Zhu et al. [82] | An improved Support Vector Regression Model (ISRM) was developed, and the Multiple Population Genetic Algorithm (MPGA) was used to optimize the ISRM model and its corresponding program.
| Bouhlel et al. [83] | An artificial neural network based on gradient enhancement was utilized to simulate airfoils under various flight conditions. The aerodynamic coefficients of subsonic and transonic airfoils were simulated by gradually introducing gradient information, enabling airfoil design optimization within seconds.
Reduced-order model | Yao et al. [84] | They applied the XGBoost and LightGBM algorithms to model the filtered density function of the mixture fraction in a turbulent evaporative spray.
| Aversano et al. [85] | They employed principal component analysis (PCA) to distinguish system invariants from coefficients associated with characteristic operating conditions. Subsequently, they utilized Kriging to determine the response surface for these coefficients.
Deep learning | Renganathan et al. [86] | They employed probabilistic machine learning techniques, specifically Gaussian process modeling, to learn the mapping between the parameter space and the latent space while accounting for both epistemic and statistical uncertainty.
| Maulik et al. [87] | They used probabilistic neural networks to build a surrogate model for fluid flow and to quantify the uncertainty in the model’s predictions, enhancing its reliability.
| Li et al. [88] | They analyzed the aerodynamic drag of a supercritical airfoil using a deep reinforcement learning algorithm. The initial strategy of reinforcement learning was pre-trained through imitation learning, and the policy was subsequently trained in a proxy-model-based environment to reduce drag.
| Zheng et al. [89] | The study combined a Bayesian optimization algorithm with a specified control action to predict the amplitude of vortex-induced vibration. They used a Gaussian process regression surrogate model for the prediction and also developed a real-time control system using the soft actor–critic deep reinforcement learning algorithm.
