Interfacial Friction Prediction in a Vertical Annular Two-Phase Flow Based on Support Vector Regression Machine

: Accurate prediction of interfacial friction factor is critical for calculation of pressure drop and investigation of ﬂow mechanism of vertical annular two-phase ﬂows. Theoretical models of interfacial friction factor based on physical insight have been developed; however, these are inconvenient in engineering practice as too many parameters need to be measured. Although many researchers have proposed various empirical correlations to improve computation efﬁciency, there is no generally accepted simple formula. In this study, an efﬁcient prediction model based on support vector regression machine (SVR) is proposed. Through sensitivity analysis, ﬁve factors are determined as the input parameters to train the SVR model, relative liquid ﬁlm thickness, liquid Reynolds number, gas Reynolds number, liquid Froude number and gas Froude number. The interfacial friction factor is chosen as the output parameter to check the overall performance of the model. With the help of particle swarm algorithm, the optimization process is accelerated considerably, and the optimal model is obtained through iterations. Compared with other correlations, the optimal model shows the lowest average absolute error ( AAE of 0.0004), lowest maximum absolute error ( MAE of 0.006), lowest root mean square error ( RMSE of 0.00076) and highest correlation factor ( r of 0.995). The analysis using various data in the literature demonstrates its accuracy and stability in interfacial friction prediction. In summary, the proposed machine learning model is effective and can be applied to a wider range of conditions for vertical annular two-phase ﬂows.


Introduction
Vertical annular two-phase flow (as shown in Figure 1) is a type of gas-liquid two-phase flow in a vertical pipe [1]. A vertical annular flow occurs in the processes of steam generation, gas-liquid mixing and separation, oil and gas production and transportation. In particular, the gas occupies the bulk volume of the pipeline, and the liquid mainly distributes on the wall of pipeline in the form of liquid film. Entrained droplets could be formed in the liquid, and they flow in the gas stream along the pipeline [2]. In the process of a vertical annular flow, liquid droplets generated by the liquid film could enter the gas core and droplets embedded in the gas core would deposit into the liquid film. The entire complex process results in continuous exchange of mass, momentum and energy at the gas-liquid interface [3]. In order to explore in-depth the physics of vertical annular flows, many models have been proposed [4,5]. In these models, the pressure drop is dominated by friction force and gravity. Interfacial shear stress is a significant parameter for calculation of friction force, which remains difficult to measure. Experimental observations in the literature show that there were a few types of fluctuations at the gas-liquid interface, leading to the complex interfacial structure [6]. These fluctuation waves are generally classified into two categories, namely the large wave and the wavelet, according to parameters, such as relative wave height (relative average liquid film thickness), distribution continuity and Water 2021, 13, 3609 2 of 17 average liquid film flow rate [7,8]. Wavelets are waves of low velocity and small relative wave height. Large waves have relatively large wave height and long fluctuation distance, and they usually occur when the gas velocity is high and have dominant influence on the behavior at the gas-liquid interface, also known as disturbance wave. Because of the existence of these fluctuations, the flow structure of the liquid film is quite similar to the rough wall of a single-phase pipe flow. Therefore, many researchers study annular flow by analogy with turbulent flow of a single-phase rough pipe flow. Specifically, the liquid film is comparable to the rigid wall surface, the fluctuation to the tube wall roughness and the air core to the pipe flow core [9]. Based on this hypothesis, researchers have established a numerical model considering the interfacial shear stress. However, a large number of parameters have to be carefully measured and calibrated in order to apply the model. parameters, such as relative wave height (relative average liquid film thickness), distribution continuity and average liquid film flow rate [7,8]. Wavelets are waves of low velocity and small relative wave height. Large waves have relatively large wave height and long fluctuation distance, and they usually occur when the gas velocity is high and have dominant influence on the behavior at the gas-liquid interface, also known as disturbance wave. Because of the existence of these fluctuations, the flow structure of the liquid film is quite similar to the rough wall of a single-phase pipe flow. Therefore, many researchers study annular flow by analogy with turbulent flow of a single-phase rough pipe flow. Specifically, the liquid film is comparable to the rigid wall surface, the fluctuation to the tube wall roughness and the air core to the pipe flow core [9]. Based on this hypothesis, researchers have established a numerical model considering the interfacial shear stress. However, a large number of parameters have to be carefully measured and calibrated in order to apply the model. The interfacial shear stress is directly related to the interfacial friction factor. Accurate prediction of the interfacial friction factor could greatly simplify the calculation of the shear stress. Therefore, empirical formulas of the interfacial friction factor of an annular flow have been proposed by a number of researchers. Wallis et al. [10] proposed an empirical formula assuming that the interfacial friction factor was proportional to the relative liquid film thickness, and the Wallis formula is widely used in the literature. However, this formula is only applicable to cases of small relative liquid film thickness. Since then, researchers have made great improvements to the Wallis formula and developed different prediction formulas [1,[11][12][13][14][15][16][17][18][19][20][21][22]. These extended formulas however have limitations for different ranges of superficial liquid and gas velocity. Therefore, it is necessary to develop a general and reliable method to more accurately predict the interfacial friction factor.
Machine learning (ML) techniques have been developed in recent years to make predictions for problems with a large number of influencing factors. One of the well-established ML techniques is the support vector machine (SVM) algorithm, which was firstly developed by Vapnik et al. at AT&T Bell laboratories [23][24][25][26][27][28]. As a matter of fact, it is difficult to find solutions of nonlinear problems directly, and they can be solved as linear separation problems. The SVM maps training samples to the high-dimensional characteristic space through nonlinear transformation, where a classification hyperplane is established and these samples can be separated linearly. As a general ML method, it follows the principle of structural risk minimization and shows a number of excellent advantages in solving problems of small sample size and nonlinear pattern recognition and can be applied to nonlinear regression problems. Based on applications of different problems, The interfacial shear stress is directly related to the interfacial friction factor. Accurate prediction of the interfacial friction factor could greatly simplify the calculation of the shear stress. Therefore, empirical formulas of the interfacial friction factor of an annular flow have been proposed by a number of researchers. Wallis et al. [10] proposed an empirical formula assuming that the interfacial friction factor was proportional to the relative liquid film thickness, and the Wallis formula is widely used in the literature. However, this formula is only applicable to cases of small relative liquid film thickness. Since then, researchers have made great improvements to the Wallis formula and developed different prediction formulas [1,[11][12][13][14][15][16][17][18][19][20][21][22]. These extended formulas however have limitations for different ranges of superficial liquid and gas velocity. Therefore, it is necessary to develop a general and reliable method to more accurately predict the interfacial friction factor.
Machine learning (ML) techniques have been developed in recent years to make predictions for problems with a large number of influencing factors. One of the wellestablished ML techniques is the support vector machine (SVM) algorithm, which was firstly developed by Vapnik et al. at AT&T Bell laboratories [23][24][25][26][27][28]. As a matter of fact, it is difficult to find solutions of nonlinear problems directly, and they can be solved as linear separation problems. The SVM maps training samples to the high-dimensional characteristic space through nonlinear transformation, where a classification hyperplane is established and these samples can be separated linearly. As a general ML method, it follows the principle of structural risk minimization and shows a number of excellent advantages in solving problems of small sample size and nonlinear pattern recognition and can be applied to nonlinear regression problems. Based on applications of different problems, SVM is divided into support vector classification (SVC) and support vector regression (SVR). The application of SVM for regression prediction analysis is known as the support vector regression machine method. In order to find the best parameters for SVR, particle swarm Water 2021, 13, 3609 3 of 17 optimization (PSO) could be employed to optimize the searching process and improve the efficiency. PSO is an evolutionary algorithm developed by Kennedy et al. [29] and is of fast convergence and high precision. Similar to the simulated annealing algorithm, PSO starts from a random initial solution, searches for the optimal solution through iterations, evaluates the adaptability of solutions by a fitting function and updates the global optimal value after comparing with the current optimal value. Instead of applying empirical formulas, in this work, we develop a combined PSO-SVR model to compute the optimal friction factor for the vertical annular flow. In contrast to empirical formulas, the proposed PSO-SVR model has the capabilities of dealing with a wider range of flow conditions. It is able to analyze the influence of various factors on the change of interfacial friction factor due to the nature of the ML techniques.
We collect the empirical formulas for vertical annular flow available in the open literature and discuss the limitation of these formulas under different experimental conditions. Then, we analyze factors correlated with the friction factor and develop a prediction model for the interfacial friction factor in vertical annular flows using ML techniques. An examination of published models for the interfacial friction factor is carried out using data collected from the literature for vertical annular flows. By comparison with results from empirical formulas, we show that the proposed model is promising for calculating the friction factor for vertical annular flows in a wide range of parameters.

Interfacial Friction Factor Calculation Model and Empirical Formulas
With the assumption that the droplet entrainment rate of the gas core and deposition rate in the gas core were equal and the axial flow rate was constant in the fully developed vertical film region, the momentum conservation equation for the vertical annular flow can be derived and be simplified to the interfacial shear stress formula [21,30] as where dp/dz is the flow pressure difference, ρ C is the air core density (when there are small droplets in the air core, it represents the mixture density of the air core and the droplets), R E is the entrainment rate of droplets atomized from liquid film to gas core and R D is the deposition rate of droplets to liquid film in the gas core. A C is the cross section area of the gas core, S C is the perimeter of the cross section of the gas core, U E is the average axial velocity of liquid droplets and U D is the droplet deposition velocity. From the perspective of the gas flow, the liquid film can be taken as a rigid and rough wall with the velocity of U G + U i , where U G is the gas flow velocity and U i is the interface velocity. Compared with the gas flow velocity, the interface velocity is much smaller and can been neglected. Therefore, the friction factor [1,10] is expressed as: where f i is the interfacial friction factor of the vertical annular flow, τ i is the interfacial shear stress, and ρ G is the gas density. Substituting the shear stress in Equation (1) into the correlation, the interfacial friction factor could be obtained. Through experimental analysis, researchers found that the friction factor correlates with relative liquid film thickness, gas Reynolds number, liquid Reynolds number and other parameters. Wallis [10] proposed an empirical formula to calculate friction factor using the relative liquid film thickness, which was then extended by a number of authors in a variety of forms for different conditions, as summarized in Table 1.  Moeck, 1970) [15] f i = 0.005 1 + 1458(t/D) 1.42 (Hori et al., 1978) [16] Fr G = u sg / gD, Re G = ρ g u g D/µ g Given a set of training data (x 1 , y 1 ), . . . , (x i , y i ), y ∈ [+1, −1], i = 1, 2, . . . , l, linear regression function can be used to fit the data as where ω represents a multidimensional vector and b represents a real number. For a nonlinear regression problem, samples will be analyzed in a feature space of higher dimension (Hilbert space) through the non-linear mapping function ϕ: X→ϕ (X), as shown in Figure 2.  where represents a multidimensional vector and represents a real number. For a nonlinear regression problem, samples will be analyzed in a feature space of higher dimension (Hilbert space) through the non-linear mapping function φ: X→φ (X), as shown in Figure 2. According to the structural risk minimization criterion, the regression estimation problem can be transformed into an optimization problem with a ε-insensitive loss function as where the constant C is positive and represents the penalty degree of samples outside the ε interval illustrated in Figure 3a. The ε is used to control the complexity of the regression According to the structural risk minimization criterion, the regression estimation problem can be transformed into an optimization problem with a ε-insensitive loss function as where the constant C is positive and represents the penalty degree of samples outside the ε interval illustrated in Figure 3a. The ε is used to control the complexity of the regression functions and the ε-insensitivity loss function is explained in Figure 3b. The slack variables ξ i , ξ i are used to consider the fitting error.
Water 2021, 13, x FOR PEER REVIEW 5 of 18 functions and the ε-insensitivity loss function is explained in Figure 3b. The slack variables , are used to consider the fitting error.
(a) (b) Kernel functions, as the projecting function, are introduced to make the calculation directly on the input space, which is a real symmetric positive function and satisfies the Mercer condition [31][32][33][34] , Common kernel functions include linear kernel function, polynomial kernel function, Gaussian kernel function, etc. Among them, Gaussian kernel function has a strong learning ability and is selected as kernel function of the model developed in this study. The formula is as follows:

Particle Swarm Optimization
The PSO algorithm was firstly proposed by Kennedy and Eberhart in 1995, which was originated from the study on bird behaviors [35]. Each particle in the algorithm represents a potential solution to the problem, which corresponds to a "fitness" determined by an optimization function. The function has no mass and volume but only speed and position. The velocity of a particle determines its flight direction and distance. The target function is gradually optimized through the updated information of velocity and position. The initial state of a particle swarm is a group of random particles, and iterations are carried out to find the optimal position of the whole population, that is, the global optimum. At every time step when a particle position is updated, a 'fitness' value needs to be calculated. The fitness values of the individual optimal position and the group optimal position are compared with that of the new position, and updated accordingly. After multiple iterations, the whole group gradually moves to the optimal solution of the target function. The process is shown in Figure 4. Kernel functions, as the projecting function, are introduced to make the calculation directly on the input space, which is a real symmetric positive function and satisfies the Mercer condition [31][32][33][34] Common kernel functions include linear kernel function, polynomial kernel function, Gaussian kernel function, etc. Among them, Gaussian kernel function has a strong learning ability and is selected as kernel function of the model developed in this study. The formula is as follows:

Particle Swarm Optimization
The PSO algorithm was firstly proposed by Kennedy and Eberhart in 1995, which was originated from the study on bird behaviors [35]. Each particle in the algorithm represents a potential solution to the problem, which corresponds to a "fitness" determined by an optimization function. The function has no mass and volume but only speed and position. The velocity of a particle determines its flight direction and distance. The target function is gradually optimized through the updated information of velocity and position. The initial state of a particle swarm is a group of random particles, and iterations are carried out to find the optimal position of the whole population, that is, the global optimum. At every time step when a particle position is updated, a 'fitness' value needs to be calculated. The fitness values of the individual optimal position and the group optimal position are compared with that of the new position, and updated accordingly. After multiple iterations, the whole group gradually moves to the optimal solution of the target function. The process is shown in Figure 4.
Suppose a population of m particles is distributed in the D-dimensional search space, which is X = (X 1 , X 2 , . . . X m ). At the time t, the position of the ith particle is , the optimal position of each particle in iteration is P i (t) = [P i1 (t), P i2 (t), . . . , P iD (t)], and the optimal position of the population is P g (t) = P g1 (t), P g2 (t), . . . , P gD (t) . In order to better balance the global and local search ability and ensure convergence, Shi and Eberhart [36] introduced an inertia weight ω into the velocity equation and obtained  Suppose a population of m particles is distributed in the D-dimensional search space, which is , , … . At the time t, the position of the ith particle is , , … , the velocity is , , … , the optimal position of each particle in iteration is , , … , and the optimal position of the population is , , … . In order to better balance the global and local search ability and ensure convergence, Shi and Eberhart [36] introduced an inertia weight ω into the velocity equation and obtained (7) where is the initial inertia weight, the final inertia weight, and the maximum iteration number. (8) where is cognitive coefficient and is social coefficient; , are random numbers in (0,1). To avoid blind search of particles, the position and the speed are generally limited to the interval , and , , respectively.

Model Development and Parametric Analysis
The SVR algorithm is employed to establish the model, with parameters optimized by a PSO algorithm. The model development process involves data processing, SVR input variable analysis and parameter optimization.

Data Processing
Data processing is needed before model training and prediction, which includes data collection and data normalization and denormalization. The database is established by collecting the data from available literature. Data normalization helps to accelerate numerical convergence, and denormalization is used to obtain the real values from predicted results.

Data Collection
For our analysis of vertical annular two-phase flow, 268 data points were collected from 4 sources, with pipe diameter D ranging from 16-50.8 mm. Most of the data sets were recorded at atmospheric pressure, with gas speed ranging from 9-132 m/s and liquid speed from 0.001-0.200 m/s. Details of the data are shown in Table 2.
where ω max is the initial inertia weight, ω min . the final inertia weight, and t max the maximum iteration number.
v t+1 where c 1 is cognitive coefficient and c 2 is social coefficient; r 1 , r 2 are random numbers in (0,1). To avoid blind search of particles, the position and the speed are generally limited to the interval [−X max ,X max ] and [−V max ,V max ], respectively.

Model Development and Parametric Analysis
The SVR algorithm is employed to establish the model, with parameters optimized by a PSO algorithm. The model development process involves data processing, SVR input variable analysis and parameter optimization.

Data Processing
Data processing is needed before model training and prediction, which includes data collection and data normalization and denormalization. The database is established by collecting the data from available literature. Data normalization helps to accelerate numerical convergence, and denormalization is used to obtain the real values from predicted results.

Data Collection
For our analysis of vertical annular two-phase flow, 268 data points were collected from 4 sources, with pipe diameter D ranging from 16-50.8 mm. Most of the data sets were recorded at atmospheric pressure, with gas speed ranging from 9-132 m/s and liquid speed from 0.001-0.200 m/s. Details of the data are shown in Table 2. The selected data points have a wide range of distribution. The Hewitt & Roberts (HR) flow pattern graph has been the most widely used flow pattern diagram, where different  In our ML model, the collected data are grouped into a training set and a testing set. The training set accounts for 90% (241) of the data bank, which is utilized to optimize the developed ML model by searching the best parameters. Normally, multiple-fold validation approach is employed to optimize the training process. The remaining 10% (27) is applied as the testing set to validate the accuracy of the optimized model. For each of the sources in Table 2, data points from the same source are randomly mixed and allocated into the training set and the testing set in proportion. For instance, with the 161 data points from Asali [37], 145 data points were randomly selected as the training set and the rest 16 as the testing set. With the splitting of each data source, the training sets and testing sets from different sources are merged respectively to form the final training set and testing set for our model.

Data Normalization and Denormalization
Before model training, all data should be normalized to eliminate large prediction errors caused by the order difference. In this paper, the normalization formula is given as follows (9) where is the original input data, the minimum value of the original data, the maximum value of the original input data, and the normalized solution. The model training and prediction are performed after data normalization. With  [39] showing all data are in the annular flow regime. In our ML model, the collected data are grouped into a training set and a testing set. The training set accounts for 90% (241) of the data bank, which is utilized to optimize the developed ML model by searching the best parameters. Normally, multiple-fold validation approach is employed to optimize the training process. The remaining 10% (27) is applied as the testing set to validate the accuracy of the optimized model. For each of the sources in Table 2, data points from the same source are randomly mixed and allocated into the training set and the testing set in proportion. For instance, with the 161 data points from Asali [37], 145 data points were randomly selected as the training set and the rest 16 as the testing set. With the splitting of each data source, the training sets and testing sets from different sources are merged respectively to form the final training set and testing set for our model.

Data Normalization and Denormalization
Before model training, all data should be normalized to eliminate large prediction errors caused by the order difference. In this paper, the normalization formula is given as follows where x is the original input data, x min the minimum value of the original data, x max the maximum value of the original input data, and x original the normalized solution. The model training and prediction are performed after data normalization. With normalization, the input data and the calculated results are all in the range of 0 to 1. In order to obtain the actual results, the predicted data shall be recovered by the following equation where y predicted is the output predicted by the ML model and y is the denormalized output.

SVR Input Variable Analysis
To ensure dimensional consistency, the input variables of the SVR should be dimensionless. However, there are no definite and general models to determine the factors that influence the interfacial friction factor. Based on the existing empirical formulas, five most relevant parameters are selected, i.e., relative film thickness, gas Reynolds number, liquid Reynolds number, gas Froude number and liquid Froude number. To investigate the relationship between these five parameters and the interfacial friction factor, sensitivity analysis is conducted.
As can be seen from Figure 6, the overall trend shows that the interfacial friction factor is gradually enlarged with the increase of the relative liquid film thickness (the ratio of film thickness to pipe diameter). There is a clear correlation between the interfacial friction factor and the relative film thickness, which has been presented by Wallis et al. [10], Belt et al. [13] and Moeck [15]. However, for the data obtained under different experimental conditions, the exact relationship might be different. Therefore, it will not be sufficient to predict a wide range of interfacial friction factors based on this correlation alone.
To ensure dimensional consistency, the input variables of the SVR should be dimen sionless. However, there are no definite and general models to determine the factors tha influence the interfacial friction factor. Based on the existing empirical formulas, five mos relevant parameters are selected, i.e., relative film thickness, gas Reynolds number, liquid Reynolds number, gas Froude number and liquid Froude number. To investigate the re lationship between these five parameters and the interfacial friction factor, sensitivity analysis is conducted.
As can be seen from Figure 6, the overall trend shows that the interfacial friction fac tor is gradually enlarged with the increase of the relative liquid film thickness (the ratio of film thickness to pipe diameter). There is a clear correlation between the interfacial fric tion factor and the relative film thickness, which has been presented by Wallis et al. [10] Belt et al. [13] and Moeck [15]. However, for the data obtained under different experi mental conditions, the exact relationship might be different. Therefore, it will not be suf ficient to predict a wide range of interfacial friction factors based on this correlation alone Normally, the vertical annular flow is modelled as a single-phase flow through a rough pipe as seen in Figure 1. In the analogy, the uneven liquid film can be regarded as the rough wall surface where the film surface motion represents the variation of the wal roughness. The difference between the gas and liquid velocity would affect the motion o the interface, such that the interfacial "friction" factor can be changed at different flow status. In addition, the fluid density and viscosity are important physical quantities for the flow velocities. Fukano and Furukawa [19] proposed an empirical formula of interfa cial friction factor with consideration of fluid viscosity. In order to describe the common influence of fluid density, viscosity and flow rate, the Reynolds numbers of gas and liquid can be utilized to characterize the flow condition. In a single-phase flow, the friction factor of the transitional rough zone is affected by the roughness height and the Reynolds num ber. With the reference to single-phase flow analysis, the Reynolds number is included in Figure 6. Variation of interfacial friction factor with relative liquid film thickness.
Normally, the vertical annular flow is modelled as a single-phase flow through a rough pipe as seen in Figure 1. In the analogy, the uneven liquid film can be regarded as the rough wall surface where the film surface motion represents the variation of the wall roughness. The difference between the gas and liquid velocity would affect the motion of the interface, such that the interfacial "friction" factor can be changed at different flow status. In addition, the fluid density and viscosity are important physical quantities for the flow velocities. Fukano and Furukawa [19] proposed an empirical formula of interfacial friction factor with consideration of fluid viscosity. In order to describe the common influence of fluid density, viscosity and flow rate, the Reynolds numbers of gas and liquid can be utilized to characterize the flow condition. In a single-phase flow, the friction factor of the transitional rough zone is affected by the roughness height and the Reynolds number. With the reference to single-phase flow analysis, the Reynolds number is included in many empirical formulas for annular flow. Fore et al. [21] noted that the Wallis formula could overpredict the friction factor at very small values of the relative film thickness and modified the interfacial friction factor formula with the introduction of the gas Reynolds number. Compared with the Wallis formula, the suggested formula by Fore et al. [21] showed better prediction accuracy in a wider range of film thickness. Similarly, the modification was adopted in the formula of Wongwises and Kongkiatwanitch [1]. Liquid Reynolds number was also included in the formulas proposed by Hori et al. [16] and Pan et al. [22]. Here, the Reynolds numbers for liquid and gas are defined as Re G = ρ G v G D/µ G and Re L = ρ L v L D/µ L , respectively, where the subscripts G and L represent gas and liquid, respectively, and ρ, v, µ are the density, velocity and viscosity of gas and liquid.
The vertical annular flow is mainly governed by gravity, inertial force and friction force. When calculating the interfacial friction factor, it is necessary to consider the influence of gravity and inertial force on the flow. The fluid Froude number is defined as the ratio of inertial force to gravity of fluids in the vertical pipe. The dimensionless gas and liquid Froude numbers can properly represent the influence of gravity and inertial force on the change of friction. The empirical formula proposed by Hori et al. [16] has introduced gas and liquid Froude numbers into the calculation of the interfacial friction factor and produces good predictions. Similarly, we introduce the Froude numbers into our prediction model. The gas and liquid Froude numbers can be written as Fr G = v G / gD and Fr L = v L / gD. With the collected data from available studies, plots of f i versus Re G , Re L , Fr G and F L values are illustrated in Figure 7, respectively. It is clear that the f i magnifies as the gas Reynolds number increases, which shows the same trend as that of f i with the gas Froude number. However, it goes conversely for the relationship between f i and liquid Reynolds number, as well as liquid Froude number. As can readily be seen, differences of viscosity and diameter could only influence the slope and point distribution, while the variance of the interfacial friction factor mainly depends on gas velocity and liquid velocity.
force. When calculating the interfacial friction factor, it is necessary to consider the influence of gravity and inertial force on the flow. The fluid Froude number is defined as the ratio of inertial force to gravity of fluids in the vertical pipe. The dimensionless gas and liquid Froude numbers can properly represent the influence of gravity and inertial force on the change of friction. The empirical formula proposed by Hori et al. [16] has introduced gas and liquid Froude numbers into the calculation of the interfacial friction factor and produces good predictions. Similarly, we introduce the Froude numbers into our prediction model. The gas and liquid Froude numbers can be written as / and / . With the collected data from available studies, plots of versus , , and values are illustrated in Figure 7, respectively. It is clear that the magnifies as the gas Reynolds number increases, which shows the same trend as that of with the gas Froude number. However, it goes conversely for the relationship between and liquid Reynolds number, as well as liquid Froude number. As can readily be seen, differences of viscosity and diameter could only influence the slope and point distribution, while the variance of the interfacial friction factor mainly depends on gas velocity and liquid velocity. The above analysis indicates that f i shows apparent relevance with relative thickness, Re G , Re L , Fr G and F L . Thus, we select these five parameters as the input variables of our PSO-SVR model.

Parameter Optimization of PSO-SVR Model
The prediction accuracy of the SVR model is determined by the model's control parameters. In this paper, the PSO algorithm is used to determine the optimal control parameters of the SVR model. Firstly, we construct the PSO-SVR model structure by programming, and we set the control parameters of the PSO algorithm, including weight ω, global search coefficient c 1 , local search coefficient c 2 , maximum evolution algebra max gen and population size pop . Then, we initialize the SVR model operation parameters, including the penalty coefficient C, g of the kernel function and the loss coefficient P. A larger C means it is easier for the model to be overfitted. g is a parameter of the RBF kernel, which implicitly determines the distribution of data mapped to a new feature space. A larger g indicates a smaller support vector. In this paper, we set C [10 −3 , 10 3 ], g [10 −3 , 10 3 ] and P as 0.01 and use the four-order cross validation method to obtain the mean square error of solutions from each generation of the PSO-SVR model. Under different SVR parameter combination, the prediction accuracy of the SVR model is calculated and taken as the fitness for the PSO iteration process. As discussed before, the relative liquid film thickness, gas Reynolds number, liquid Reynolds number, gas Froude number and liquid Froude number are chosen as input parameters, and the interfacial friction factor is the desired output solution. After sufficient training, the built model reaches its optimal prediction performance with the best SVR parameter combination. Finally, the performance of the optimal SVR model is assessed by comparing the testing data and the result predicted from the optimal model. The procedure to make prediction of the interfacial friction factor using the PSO-SVR model is demonstrated in Figure 8.
the penalty coefficient C, g of the kernel function and the loss coefficient P. means it is easier for the model to be overfitted. g is a parameter of the RBF ke implicitly determines the distribution of data mapped to a new feature space indicates a smaller support vector. In this paper, we set C ϵ [10 −3 , 10 3 ], g ϵ [10 − as 0.01 and use the four-order cross validation method to obtain the mean squ solutions from each generation of the PSO-SVR model. Under different SVR combination, the prediction accuracy of the SVR model is calculated and take ness for the PSO iteration process. As discussed before, the relative liquid film gas Reynolds number, liquid Reynolds number, gas Froude number and liq number are chosen as input parameters, and the interfacial friction factor is output solution. After sufficient training, the built model reaches its optima performance with the best SVR parameter combination. Finally, the perform optimal SVR model is assessed by comparing the testing data and the resu from the optimal model. The procedure to make prediction of the interfacial fr using the PSO-SVR model is demonstrated in Figure 8.

Result
In order to achieve better prediction, we assess the model performance u ent combinations of parameters, including parameter combination 1 (ℎ/ , parameter combination 2 (ℎ/ , , , , ). The prediction accuracy

Result
In order to achieve better prediction, we assess the model performance under different combinations of parameters, including parameter combination 1 (h/D, Re G , Re L ) and parameter combination 2 (h/D, Re G , Re L , Fr G , F L ). The prediction accuracy of our PSO-SVR model is compared with that of formulas collected from previous studies shown in Table 1. Several parameters are selected as the evaluation indexes to numerically report the performance of the model, containing the correlation coefficient (r), coefficient of determination (R 2 ), root mean square error (RMSE), maximum absolute error (MAE), maximum relative error (MRE) and average relative error (AAE) as defined in Table 3. Among them, r represents correlation between the experimental f i and the predicted f i . RMSE is used to quantify the stability of prediction performance, and MAE, AAE and MRE are used to quantify the overall prediction accuracy [40]. The PSO algorithm is applied to optimize the SVR model with control parameters shown in Table 4. Table 3. Formulas of evaluation index.

Evaluation Index Value
Maximum absolute error,

Comparison of Different Input Parameter Combinations
Under different input parameter combinations, the optimal control parameters of the SVR model are set separately. The numerical analysis of the built model on training, testing and total data sets are listed in Table 5. The model based on parameter combination 2 (h/D, Re G , Re L , Fr G , F L ) performed much better with higher correlation coefficient (r = 0.995) and coefficient of determination (R 2 = 0.989), lower maximum absolute error (MAE = 0.006),average absolute error (AAE = 0.0015) and lower root mean square error (RMSE = 0.00074) compared with the other model. The agreement between experimental f i and predicted f i is significantly improved, since the maximum absolute error (MAE) and average absolute error (AAE) decline from 0.019 to 0.006 and 0.0015 to 0.0004, respectively. It is concluded that the introduction of gas and liquid Froude number significantly improves the accuracy and stability of the model. Thus, the optimal model should be determined on parameter combination 2, whose control parameters are optimized as C = 83.78, g = 1 in this study. The searching process of the PSO algorithm is illustrated in Figure 9, and we see that the SVR model eventually becomes stable after 120 iterations. Figure 10 shows the comparison of the experimental f i and the predicted f i by the optimal model. We see that the prediction has only noticeable errors at certain large values of the interfacial friction factor.

Comparison of Optimal Model with Empirical Formulas
To test the reliability of the optimal model, a comparison between the existing pirical formulas and the optimal PSO-SVR model is conducted. The available empir formulas from 9 sources are gathered, with detailed information listed in Table 1. Sca diagrams in Figure 11 show the comparison between the calculated and the exp mental . The evaluation results of each formula have been summarized in Table 6.

Comparison of Optimal Model with Empirical Formulas
To test the reliability of the optimal model, a comparison between the exi pirical formulas and the optimal PSO-SVR model is conducted. The available formulas from 9 sources are gathered, with detailed information listed in Table  diagrams in Figure 11 show the comparison between the calculated and th mental . The evaluation results of each formula have been summarized in Tab

Comparison of Optimal Model with Empirical Formulas
To test the reliability of the optimal model, a comparison between the existing empirical formulas and the optimal PSO-SVR model is conducted. The available empirical formulas from 9 sources are gathered, with detailed information listed in Table 1. Scatter diagrams in Figure 11 show the comparison between the calculated f i and the experimental f i . The evaluation results of each formula have been summarized in Table 6.
In Figure 11, the points distributed on the diagonal line means the predicted results are perfectly consistent with the actual results in experiments, and the points away from the diagonal tell the increasing prediction error. The interval between dash lines represents a 50% error band, directly displaying the prediction accuracy of each method, with visualization of points falling in this region.   In Figure 11, the points distributed on the diagonal line means the predicted results are perfectly consistent with the actual results in experiments, and the points away from the diagonal tell the increasing prediction error. The interval between dash lines represents a 50% error band, directly displaying the prediction accuracy of each method, with visualization of points falling in this region.
As shown in Figure 11i, the classic formula developed by Wallis [10] makes good predictions, with AAE equaling 0.0038, and most of the data points uniformly lie in the 50% error band. However, with the expansion of the interfacial friction factor range, its prediction error increases gradually. Additionally, the calculated of formulas proposed by Moeck and Fore et al. display good agreements with experimental results (AAE  As shown in Figure 11i, the classic formula developed by Wallis [10] makes good predictions, with AAE equaling 0.0038, and most of the data points uniformly lie in the 50% error band. However, with the expansion of the interfacial friction factor range, its prediction error increases gradually. Additionally, the calculated f i of formulas proposed by Moeck and Fore et al. display good agreements with experimental results (AAE lower than 0.0060 and more than 80% of points in the 50% error band) with a better coefficient of determination (R2 over than 0.50). Nevertheless, others in Figure 11a-h predict a smaller number of points in the 50% error band and show higher AAE (over 0.0060). For our PSO-SVR model, it can be seen from Figure 11j that the prediction is stable with almost all points lying along the diagonal line. The accuracy is not sensitive to the change of f i , which demonstrates its robustness in application. Compared with other formulas, the AAE of the present model remains the lowest in both the testing data set (0.0007) and the total data set (0.0004), with MAE lower than 0.01. To date, there is no widely used formula to calculate interfacial friction factor in all ranges of flow conditions. Nevertheless, with the development of computer science, it is possible to develop ML-based numerical models without any explicit formula. The above analysis suggests that the proposed PSO-SVR model is stable and can be applied in a wider range of parameters for prediction of the interfacial friction factor.
In order to further demonstrate the prediction stability of the model, 4 extra cases are established with the same sampling method described before. In Cases 1 and 2, 90% of the database is used as the training set, and 10% of the database is used as the testing set. In Cases 3 and 4, 80% of the database is the training set, and 20% of the database is the testing set. The prediction results in each case are shown in Figure 12. interfacial friction factor.
In order to further demonstrate the prediction stability of the model, 4 extra cases are established with the same sampling method described before. In Cases 1 and 2, 90% of the database is used as the training set, and 10% of the database is used as the testing set. In Cases 3 and 4, 80% of the database is the training set, and 20% of the database is the testing set. The prediction results in each case are shown in Figure 12. As shown in Figure 12, all data points for each case fall into the 50% error band and are distributed closely to the diagonal line. The AAEs for Cases 1 and 2 are the same (AAE = 0.0004), and it increases slightly to 0.0005 and 0.0006 for Case 3 and Case 4. The MAEs are 0.0046 and 0.0045 for Case 1 and Case 2, and 0.0054 and 0.0066 for Case 3 and Case 4. Due to limited amount of data, we have to reduce the number of training data when allocating more data for prediction. Nevertheless, the prediction error increases slightly, suggesting that reducing slightly the training data does not affect the overall prediction.

Conclusions
In this paper, we develop a machine learning model based on the SVR-PSO algorithms to predict the interfacial friction factor of a vertical annular two-phase flow.
From analysis of the input variables, five parameters (the relative film thickness, the Reynolds numbers of gas and liquid, and the Froude numbers of gas and liquid) show significant correlation with the interfacial friction factor and are considered as the input parameters of our model. By comparisons of different parameter combinations, it is concluded that the model considering the five parameters performs the best, with high accuracy and stability. With the well-trained model, we compare the predicted results with those by the empirical formulas. The ability of the model learning and generalization is shown to be satisfactory, and the prediction accuracy is obviously improved compared with the empirical formulas. In order to confirm the stability and sensitivity of the model, four different cases are generated based on random sampling of the database. All predicted points of the four cases fall into the 50% error band, and error analysis shows that the prediction accuracy is insensitive to the variation of the dataset.
It has to state that the method proposed in this paper only considers the influence of liquid film and gas core. The entraining and deposition of droplets corresponding to energy transfer at the gas-liquid interface could have an important impact on the liquid film thickness and the gas-liquid velocities. In order to further improve the prediction accuracy of the interfacial friction factor, the influence of entrained droplets in the gas core could be taken into account in the following work in the near future.
Funding: This research is funded by the Shenzhen City Technology and Innovation Committee, grant number JCYJ20210324104606017.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: All data was presented in this manuscript.