Data-Based Flow Rate Prediction Models for Independent Metering Hydraulic Valve

Abstract: Accurate valve flow rate prediction is essential for the flow control of independent metering (IM) hydraulic valves. Traditional estimation methods struggle to meet high-precision requirements within the restricted space of the valve. This study therefore proposes a data-based flow rate prediction method for IM valves. We took a four-spool IM valve as the research object and carried out IM valve experiments to generate labeled data. Using the post-valve pressure and valve opening as inputs, we developed and compared eight data-based estimation models, covering both machine learning and deep learning. The results indicated that SVR and the DNN with three hidden layers performed better than the other models on the whole dataset in the trade-off between overfitting and precision, with a MAPE close to 4% for both models. This study provides further guidelines for high-precision flow rate prediction of hydraulic valves and has application value for the development of digital and intelligent hydraulic systems in construction machinery.


Introduction
At present, multi-way valves in which the inlet and outlet orifices are mechanically linked are used in most hydraulic systems of construction machinery for flow rate control [1][2][3][4]. Unlike the traditional multi-way valve, the IM valve, with two or more spools, decouples the inlet and outlet throttling areas, which increases the control freedom of the system and makes it possible to further improve the control performance of the valve [5,6]. The application prospects of this valve are therefore broad.
To guarantee the flow rate control quality of IM valves, accurate flow rate estimation and prediction are essential. There are three traditional methods of estimating the flow rate in valves: direct measurement with a flowmeter, mechanism modeling and numerical simulation. Turbine flowmeters have been widely used in hydraulic systems to estimate the flow rate with a defined precision, but their large volume makes them unsuitable for compact spaces [7,8]. Mechanism modeling estimates the valve flow rate from mechanism formulas, but the accuracy of the model is difficult to guarantee because the flow field inside the valve is highly nonlinear and involves multiple physical fields [9]. The flow field in the valve can also be modeled by numerical simulation based on computational fluid dynamics (CFD), but the results are significantly affected by the mesh quality and boundary conditions [10,11]. A new method is therefore needed for accurate prediction of the in-valve flow rate.
Data-based machine learning (ML) and deep learning methods, which can improve model accuracy, have been introduced into various fields [12][13][14]. These methods comprise supervised and unsupervised learning. Unsupervised learning is mainly used for clustering or grouping data with unknown patterns [15,16], whereas supervised learning uses a set of labeled data for regression or classification [17,18]. These algorithms have been applied in the hydraulic field. Jia et al. [19] employed the support vector machine (SVM) to estimate the comprehensive characteristics of a hydraulic valve with geometric factors as the input, and reduced the repair and rejection rate of the hydraulic valve. Zhao et al. [20] used a machine learning model for fault diagnosis of an aero hydraulic pump, and the experimental results showed that the method has high accuracy. Guo et al. [21] used machine learning and deep learning models to calculate the leakage of hydraulic cylinders, and compared the prediction performance of the BP neural network (BPNN), convolutional neural network (CNN), support vector regression (SVR) and radial basis function (RBF) network. However, reports on the development and application of data-based machine learning and deep learning methods for flow rate prediction of hydraulic valves are limited.
Therefore, this study took a four-spool IM valve as the research object and proposed a data-based method to predict the flow rate of the valve. The labeled data were generated through experiments. Different ML models were developed with the post-valve pressure and valve opening as input variables.

The Tested Valve
The research object of this study is the four-spool IM valve manufactured by Jiangsu Advanced Construction Machinery Innovation Center Ltd. The diameter of the valve is 32 mm. Figures 1 and 2 show the tested valve and its hydraulic schematic, respectively. In Figure 2, port P is the oil inlet connected to the pump, ports T1 and T2 are the oil outlets connected to the tank, and ports A and B are the oil ports connected to the load. The valve can be simplified into four 2/2 spool valves. The negative overlap of each spool valve is 2.5 mm, and the maximum stroke of the spool is 14.5 mm. The valve has two main working modes: in one, the oil flows through P-B and A-T1; in the other, the oil flows through P-A and B-T2. The two oil loops are symmetrical except for machining errors, so only one case needs to be considered in our study.

Data Generation by Experiments
A dataset is needed to develop a data-based method. In our study, we generated the dataset through experiments on the tested valve. First, the variables to be collected had to be determined. The flow rate through the hydraulic valve is calculated by Equation (1), as follows:

$$Q = C_d A \sqrt{\frac{2\,(p_1 - p_2)}{\rho}} \qquad (1)$$

where Q is the flow rate, Cd is the discharge coefficient, A is the flow area, p1 is the pre-valve pressure, p2 is the post-valve pressure, and ρ is the density of the oil. The flow rate in the valve is thus affected by all the variables in Equation (1). Apart from the pressures p1 and p2, these parameters cannot be measured directly. However, Cd, A and ρ are related to the valve opening and oil temperature, which can be measured easily. Therefore, the oil temperature, oil pressure, valve opening and flow rate were chosen as the collected variables.
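The orifice relation behind Equation (1), written here as Q = Cd·A·sqrt(2(p1 − p2)/ρ), can be sketched in a few lines of Python. The function name and all numerical values below are illustrative assumptions, not measurements from the experiments; the sketch only shows how the measured pressures enter the flow rate and why Cd, A and ρ must be supplied from elsewhere.

```python
import math


def orifice_flow_rate(cd, area_m2, p1_pa, p2_pa, rho):
    """Flow rate in m^3/s through a throttling orifice (orifice equation).

    cd      : discharge coefficient (dimensionless)
    area_m2 : flow area in m^2
    p1_pa   : pre-valve pressure in Pa
    p2_pa   : post-valve pressure in Pa
    rho     : oil density in kg/m^3
    """
    dp = p1_pa - p2_pa
    if dp < 0:
        raise ValueError("pre-valve pressure must not be below post-valve pressure")
    return cd * area_m2 * math.sqrt(2.0 * dp / rho)


# Hypothetical operating point, chosen only to illustrate the units.
q = orifice_flow_rate(cd=0.62, area_m2=1.0e-4, p1_pa=20.0e6, p2_pa=18.0e6, rho=870.0)
q_l_min = q * 60_000.0  # convert m^3/s to L/min
print(f"Q = {q_l_min:.1f} L/min")
```

Because Cd and A vary with the valve opening and ρ varies with the oil temperature, a fixed-parameter call like this one is only a rough estimate, which is the motivation for learning the mapping from data instead.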
We then took the multi-channel valve test platform of Jiangsu Advanced Construction Machinery Innovation Center Ltd, shown in Figure 3, as the foundation, and built the test system whose experimental principle is shown in Figure 4. In the experiments, the oil supplied by the pump passed through the P-B and A-T1 circuits of the tested valve and then returned to the tank. During the experiments, the oil temperature, pressure and flow rate and the valve opening in the P-B valve cavity were measured by the flowmeter, the miniature integrated temperature and pressure sensors, and the displacement sensor connected to the valve stem.
The dataset generated by the experiments is shown in Figure 5. The dataset contains six features, namely P port temperature, B port temperature, P port pressure, B port pressure, valve opening and valve flow rate, with a total of 8244 samples. For convenience, these six features are denoted t1, t2, p1, p2, x and Q, respectively. The first 70% of the experimental dataset was used to train the models, and the remaining 30% was used for testing and evaluation.
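The ordered 70/30 split described above can be sketched as follows. The helper name is hypothetical, and truncating the boundary index with `int()` is an assumption, since the rounding rule is not stated in the text.

```python
def sequential_split(rows, train_fraction=0.7):
    """Split rows in their recorded order: the first part trains, the rest tests."""
    n_train = int(len(rows) * train_fraction)  # assumed rounding: truncate
    return rows[:n_train], rows[n_train:]


# Stand-in for the 8244 recorded samples; real rows would hold (t1, t2, p1, p2, x, Q).
rows = list(range(8244))
train_rows, test_rows = sequential_split(rows)
print(len(train_rows), len(test_rows))
```

Unlike a shuffled split, this keeps the temporal order of the recording, so the test set contains only samples collected after every training sample.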

Dataset Analysis
The dataset collected from the test valve experiments contains six features. Table 1 [...] slight owing to the short sampling time. Pressure pulsations and losses are generated by the turbulence in the valve. Hence the coefficient of variation (CV) of p2 is larger than that of p1, whereas the minimum, maximum and average values of p2 are lower than those of p1.

Input Selection
Selecting the features with the strongest predictive power is a crucial step before applying an ML method. In our study, the Pearson correlation coefficient was used as the metric of predictive power for input selection. It is calculated by Equation (2), as follows:

$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \qquad (2)$$

where r is the Pearson correlation coefficient, n is the number of samples, xi and yi are the i-th values of the two variables, and x̄ and ȳ are their means. The coefficient ranges from −1 to +1, with larger absolute values indicating stronger correlations. A change in scale of the two variables has no effect on the coefficient.
Pearson correlation coefficients between the features were calculated according to Equation (2), and the chord diagram is given in Figure 6. The r values of x and p2 with Q are 0.970 and 1, respectively, showing high correlation. Hence x and p2 have a greater ability than the other features to predict the flow rate in the valve, and they were selected as the input variables of the regression models in this study. In addition, x and p2 are highly correlated with each other, with r reaching 0.974, which indicates possible multicollinearity among the selected inputs.
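As a minimal sketch of the Pearson coefficient, the following Python function computes r from two equally long samples and checks the scale-invariance property mentioned above. The sample values are made up for illustration and are not taken from the experimental dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical valve openings (mm) and flow rates (L/min).
opening = [2.0, 4.0, 6.0, 8.0, 10.0]
flow = [11.0, 19.5, 31.0, 42.0, 49.5]

r_raw = pearson_r(opening, flow)
# Rescaling and shifting one variable (e.g. mm -> micrometres plus an offset)
# leaves the coefficient unchanged up to floating-point error.
r_scaled = pearson_r([1000.0 * v + 5.0 for v in opening], flow)
print(r_raw, r_scaled)
```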

Machine Learning Techniques
Since the collected data are labeled, supervised regression models are suitable for our research.

Linear Regression
Linear Regression (LR) fits the relationship between variables with a linear equation [22,23]. The LR hypothesis function and the cost function based on the ordinary least squares method are given by Equations (3) and (4), respectively:

$$h_\beta(x) = \beta^{T} x \qquad (3)$$

$$J(\beta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\beta(x_i) - y_i\right)^2 \qquad (4)$$

where x is the vector of independent variables, β is the coefficient vector, m is the number of samples, and y is the target value. Near-linear relationships among the independent variables, called multicollinearity, seriously affect the results of LR. Ridge regression adds a penalty to the ordinary least squares cost to reduce the impact of multicollinearity. The cost function of Ridge is given by Equation (5):

$$J(\beta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\beta(x_i) - y_i\right)^2 + \alpha\sum_{j=1}^{p}\beta_j^2 \qquad (5)$$

where α is the regularization parameter, and p is the number of independent variables.
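The effect of the Ridge penalty under multicollinearity can be illustrated with a small two-feature example. The sketch below solves the penalized normal equations (XᵀX + λI)β = Xᵀy without an intercept; the unscaled penalty λ, the function name and the data are assumptions made for illustration and do not reproduce the scaling or the data of the study.

```python
def ridge_two_features(x1, x2, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y for two features, no intercept."""
    a11 = sum(u * u for u in x1) + lam
    a22 = sum(v * v for v in x2) + lam
    a12 = sum(u * v for u, v in zip(x1, x2))
    b1 = sum(u * t for u, t in zip(x1, y))
    b2 = sum(v * t for v, t in zip(x2, y))
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("singular system: features are exactly collinear and lam is 0")
    # Cramer's rule for the 2x2 system.
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


# Two nearly collinear made-up features and a target close to x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.01, 1.99, 3.02, 3.98, 5.01]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols = ridge_two_features(x1, x2, y, lam=0.0)    # ordinary least squares
ridge = ridge_two_features(x1, x2, y, lam=1.0)  # penalized solution
print("OLS  :", ols)
print("Ridge:", ridge)
```

With these values the unpenalized solution is roughly (−8, 10), large coefficients of opposite sign that nearly cancel, whereas the penalized solution stays close to (1, 1). This is the instability that the Ridge penalty is meant to suppress.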

K-Nearest Neighbor
K-Nearest Neighbor (KNN) is an instance-based learning method. The model finds the k nearest neighbors of a query point in the sample space and takes the average or weighted average of their target values as the prediction. The Euclidean distance is used as the distance metric and as the basis of the weights in the model. It is calculated by Equation (6), as follows:

$$d(x,y) = \sqrt{\sum_{i=1}^{n}(x_i-y_i)^2} \qquad (6)$$

where x and y are two points with n features, and xi and yi are their i-th feature values.
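A minimal sketch of KNN regression with the Euclidean distance is given below. The inverse-distance weighting in the weighted variant is an assumption, since the exact weighting scheme is not specified in the text, and the training points are invented for illustration.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two points with the same number of features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train_x, train_y, query, k, weighted=False):
    """Predict the target at `query` from its k nearest training points."""
    nearest = sorted(
        (euclidean(point, query), target) for point, target in zip(train_x, train_y)
    )[:k]
    if not weighted:
        return sum(target for _, target in nearest) / len(nearest)
    # Assumed weighting: inverse distance; an exact match returns its own target.
    for dist, target in nearest:
        if dist == 0.0:
            return target
    weight_sum = sum(1.0 / dist for dist, _ in nearest)
    return sum(target / dist for dist, target in nearest) / weight_sum


# Hypothetical (valve opening, post-valve pressure) -> flow rate samples.
train_x = [(2.0, 5.0), (4.0, 9.0), (6.0, 14.0), (8.0, 18.0)]
train_y = [10.0, 20.0, 30.0, 40.0]
print(knn_predict(train_x, train_y, (5.0, 11.0), k=2))
print(knn_predict(train_x, train_y, (5.0, 11.0), k=2, weighted=True))
```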

Decision Tree
Decision Tree (DT) is good at expressing the associations within data. The Classification and Regression Tree (CART) algorithm was first proposed by Breiman in 1984 [24]. CART uses binary splits and is suitable for continuous, multi-feature variables. In addition, the algorithm has simple rule extraction, high accuracy and strong interpretability [25]. Hence, CART was used in the DT prediction model of our study.

Random Forest
Random Forest (RF) is an ensemble learning method that combines multiple decision trees. RF can effectively mitigate the overfitting to which a single DT is prone [26]. RF builds multiple decision trees and takes the average of the outputs of all trees as the prediction. The performance of RF improves as the number of decision trees increases, but the computation time also becomes longer. Figure 7 displays the calculation process of RF.

Support Vector Regression
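For a given dataset of the independent variable x, Support Vector Regression (SVR) determines a linear function that approximates the dependent variable y within a given tolerance [27]. The function is given by Equation (7):

$$f(x) = w^{T}x + b \qquad (7)$$

where w is the slope and b is the intercept. To keep the function in Equation (7) as flat as possible, the norm of w must be minimized. With accuracy ε and slack variables ξ and ξ*, the estimation problem of Equation (7) is formulated by Equations (8) and (9):

$$\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{m}\left(\xi_i+\xi_i^*\right) \qquad (8)$$

$$\text{s.t.}\quad y_i - w^{T}x_i - b \le \varepsilon + \xi_i,\qquad w^{T}x_i + b - y_i \le \varepsilon + \xi_i^*,\qquad \xi_i,\ \xi_i^* \ge 0 \qquad (9)$$

where C is the trade-off between the flatness of w and the deviations ξ, ξ*. Figure 8 shows the principle by which SVR finds the linear function according to Equations (7)-(9). For nonlinear problems, the regression function is expressed with a kernel by Equation (10):

$$f(x) = \sum_{i=1}^{m}\left(\alpha_i - \alpha_i^*\right)K(x_i, x) + b \qquad (10)$$

where αi and αi* are the Lagrange multipliers and K(xi, x) is the kernel. The radial basis function (RBF) is commonly used as the kernel function and is given by Equation (11):

$$K(x_i, x) = \exp\left(-\gamma\lVert x_i - x\rVert^{2}\right) \qquad (11)$$

where γ is a free parameter. The hyperparameters of SVR (RBF) are C and γ, which should be tuned carefully.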
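The building blocks of SVR with an RBF kernel can be sketched as follows: the kernel K(xi, x) = exp(−γ‖xi − x‖²), the ε-insensitive loss that ignores errors inside the tolerance band, and a prediction written as a kernel expansion f(x) = Σ ci·K(xi, x) + b. The support vectors and coefficients below are hypothetical placeholders; obtaining them requires solving the optimization problem, which this sketch does not attempt.

```python
import math


def rbf_kernel(xi, x, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(support_vectors, coefficients, intercept, x, gamma):
    """Kernel expansion: sum_i c_i * K(x_i, x) + b, with c_i given (not fitted here)."""
    return intercept + sum(
        c * rbf_kernel(sv, x, gamma) for sv, c in zip(support_vectors, coefficients)
    )


# Hypothetical support vectors and coefficients, for illustration only.
support_vectors = [(0.2, 0.4), (0.6, 0.7), (0.9, 0.8)]
coefficients = [1.5, -0.4, 2.0]
prediction = svr_predict(support_vectors, coefficients, 0.1, (0.5, 0.6), gamma=2.0)
print(prediction, epsilon_insensitive_loss(3.0, prediction, epsilon=0.1))
```

A larger γ makes each support vector influence a narrower neighbourhood, while a larger C penalizes points outside the ε tube more heavily, which is why the two are tuned together.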

Artificial and Deep Neural Network
Artificial Neural Network (ANN) is a feedforward network with one hidden layer, which approximates the nonlinear mapping of multivariate functions through repeated composition of univariate functions [28]. The difference between ANN and Deep Neural Network (DNN) is the number of hidden layers. Figure 9 displays the basic structure of ANN and DNN.

Hyperparameter Optimization
Grid search and random search are two widely used hyperparameter optimization methods. Grid search traverses all parameter combinations, so its computational cost is high when the number of parameters is large [29]. Random search samples parameter combinations at random within the specified ranges, which can significantly reduce the computational cost. In our study, both methods were used for hyperparameter optimization. Depending on the number of hyperparameters, grid search was used for Ridge, KNN, DT, RF and SVR, and random search was used for ANN and DNN.
In Ridge, alpha (α) was searched from 10⁻¹⁰ to 10. In KNN, k was searched in the range of 1-50. In DT, the maximum depth and the minimum leaf nodes control the size and complexity of the tree to avoid overfitting; in our DT model, the maximum depth ranged from 10 to 100 and the minimum leaf nodes ranged from 1 to 5. In RF, besides the maximum depth and the minimum leaf nodes, the number of trees also needs to be considered to avoid overfitting. The number of trees was searched in the range of 10-100, the maximum depth ranged from 10 to 100, and the minimum leaf nodes ranged from 1 to 5. In SVR, the value of C affects the generalization ability of the model. We searched C from 2⁻⁵ to 2⁵, and gamma over the same range of 2⁻⁵ to 2⁵. In ANN and DNN, the number of neurons in each hidden layer was searched in the range of 1-2 times the number of neurons in the previous layer, the activation function was searched among logistic, tanh and relu, the solver was adam, the batch size ranged from 100 to 1000, the learning rate ranged from 10⁻⁵ to 10⁻¹, and the maximum number of iterations was 20,000. Training was stopped early when the accuracy no longer improved, to avoid overfitting.
K-fold cross-validation was used to find the hyperparameters with the best generalization performance. The training data were divided into K subsets; at each iteration, a different subset was held out for testing and the remaining K−1 subsets were used for training. In this study, K was 10. Table 2 displays the hyperparameters searched for the models.
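The combination of grid search and K-fold cross-validation can be sketched with a one-dimensional KNN regressor. Everything here is an illustrative assumption: the interleaved fold assignment, the use of RMSE as the selection score, the synthetic data and the helper names are not taken from the study.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx); sample i is held out in fold i % n_folds."""
    for fold in range(n_folds):
        val_idx = [i for i in range(n_samples) if i % n_folds == fold]
        train_idx = [i for i in range(n_samples) if i % n_folds != fold]
        yield train_idx, val_idx


def knn_mean(train_x, train_y, query, k):
    """Unweighted 1-D KNN regression: mean target of the k closest points."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def cross_val_rmse(xs, ys, k, n_folds=10):
    """Average validation RMSE of KNN with k neighbours over all folds."""
    fold_scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), n_folds):
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        squared = [(knn_mean(train_x, train_y, xs[i], k) - ys[i]) ** 2 for i in val_idx]
        fold_scores.append((sum(squared) / len(squared)) ** 0.5)
    return sum(fold_scores) / len(fold_scores)


def grid_search_k(xs, ys, candidates, n_folds=10):
    """Evaluate every candidate k and return the one with the lowest CV RMSE."""
    scores = {k: cross_val_rmse(xs, ys, k, n_folds) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data: a smooth curve with a deterministic alternating disturbance.
xs = [i / 10.0 for i in range(60)]
ys = [x * x + (0.3 if i % 2 else -0.3) for i, x in enumerate(xs)]
best_k, scores = grid_search_k(xs, ys, candidates=range(1, 11))
print("best k:", best_k, "CV RMSE:", round(scores[best_k], 4))
```

Grid search evaluates every candidate, so its cost grows with the product of the grid sizes of all hyperparameters; random search would instead draw a fixed number of candidates from the same ranges.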

Performance Evaluation Criteria of Models
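In this study, four statistical measures were adopted to evaluate the performance of our models: (i) the coefficient of determination, R², for which a high value indicates good model performance; (ii) the mean absolute error (MAE); (iii) the root mean square error (RMSE); and (iv) the mean absolute percentage error (MAPE). Lower MAE, RMSE and MAPE values indicate smaller prediction errors.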
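The evaluation measures named in this work (R², MAE, RMSE and MAPE) can be computed with their standard textbook definitions as sketched below. The definitions are assumed to match those used in the study, and the sample values are invented for illustration.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; true values must be nonzero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured and predicted flow rates (L/min).
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
for name, metric in [("R2", r2_score), ("MAE", mae), ("RMSE", rmse), ("MAPE", mape)]:
    print(name, round(metric(measured, predicted), 4))
```

MAPE weights each error by the size of the measured value, so the same absolute error counts more at small flow rates than at large ones, whereas MAE and RMSE are expressed in the units of the flow rate itself.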

Comparison of Regression Models
In the previous section, we obtained the optimal hyperparameters of each model. In this section, the models were trained and tested. Figure 10 shows the performance evaluation criteria of each model on the training set and test set. M1-M8 represent Ridge, KNN, DT, RF, SVR, ANN, DNN with two hidden layers and DNN with three hidden layers, respectively.
On the training set, KNN, DT and RF had the highest R², which was 1, followed by SVR and the DNN with three hidden layers. The R² values of ANN and the DNN with two hidden layers were lower than those of the above models. The LR model's R² was only 0.992, which may be caused by the collinearity between p2 and x. KNN, DT and RF also had the smallest MAE, RMSE and MAPE. SVR and the DNN with three hidden layers ranked second. Ridge, ANN and the DNN with two hidden layers performed worst in terms of MAE, RMSE and MAPE.
On the test set, KNN again had the highest R², followed by DT, RF, SVR and the DNN with three hidden layers. The R² values of Ridge, ANN and the DNN with two hidden layers were lower than those of the other models. The MAE, RMSE and MAPE of KNN, DT and RF were the lowest, followed by SVR and the DNN with three hidden layers. Ridge, ANN and the DNN with two hidden layers were the poorest models.
On the whole dataset, KNN, DT and RF had the highest R² and the lowest MAE, RMSE and MAPE of all models. However, their RMSE on the training set was far lower than on the test set, which indicates serious overfitting. Almost no overfitting occurred in LR, ANN and the DNN with two hidden layers, but their accuracy was much lower than that of the other models. Only SVR and the DNN with three hidden layers combined high accuracy with little overfitting.
A Taylor diagram graphically summarizes how well model outputs match observations [30]. The similarity of each model to the reference is displayed in terms of the correlation, the centered root mean square error (CRMSE) and the standard deviation, where the correlation is calculated in the same way as Equation (2). Figure 11 shows the Taylor diagrams of M1-M8 on the training set and test set. Compared with the other models, KNN, DT and RF had higher correlation coefficients and smaller CRMSE and standard deviation relative to the actual reference flow rate in the valve on the entire dataset. SVR and the DNN with three hidden layers had moderate correlation coefficients and CRMSE. In addition, the standard deviation values of these two models were close to zero on both the training set and the test set, indicating that the model results were consistent with the actual values on the whole dataset.
Overall, SVR and the DNN with three hidden layers performed best in the trade-off between accuracy and overfitting. Their MAPE on the entire dataset was 3.926% and 4.535%, respectively.
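The three statistics shown on a Taylor diagram are linked by the relation CRMSE² = σm² + σr² − 2·σm·σr·R, where σm and σr are the standard deviations of the model output and the reference and R is their correlation. The sketch below computes the statistics and checks this relation numerically; the population (divide-by-n) standard deviation and the sample values are assumptions for illustration.

```python
import math


def taylor_statistics(reference, model):
    """Return (std_ref, std_model, correlation, crmse) for two equal-length series."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_model = sum(model) / n
    std_ref = math.sqrt(sum((r - mean_ref) ** 2 for r in reference) / n)
    std_model = math.sqrt(sum((m - mean_model) ** 2 for m in model) / n)
    if std_ref == 0.0 or std_model == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    covariance = sum((m - mean_model) * (r - mean_ref) for m, r in zip(model, reference)) / n
    correlation = covariance / (std_ref * std_model)
    # Centered RMSE: both series have their means removed before differencing.
    crmse = math.sqrt(
        sum(((m - mean_model) - (r - mean_ref)) ** 2 for m, r in zip(model, reference)) / n
    )
    return std_ref, std_model, correlation, crmse


# Hypothetical reference and predicted flow rates (L/min).
reference = [10.0, 20.0, 35.0, 50.0]
model = [12.0, 18.0, 37.0, 46.0]
std_ref, std_model, corr, crmse = taylor_statistics(reference, model)
law_of_cosines = std_model ** 2 + std_ref ** 2 - 2.0 * std_model * std_ref * corr
print(round(crmse ** 2, 6), round(law_of_cosines, 6))
```

Because the means are removed, a constant bias between model and reference does not affect CRMSE, so a Taylor diagram complements rather than replaces bias-sensitive measures such as MAE and MAPE.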

Conclusions
In view of the current difficulty of accurately estimating the in-valve flow rate under space constraints, this study took a four-spool IM valve as the research object and proposed a data-based flow rate prediction method. IM valve experiments were carried out to generate labeled data. The post-valve pressure and valve opening were then chosen as input variables according to their predictive power. Different data-based models, including machine learning and deep learning models, were developed and compared. The results indicate that SVR and the DNN with three hidden layers performed best in the trade-off between accuracy and overfitting, with MAPE values on the entire dataset of 3.926% and 4.535%, respectively.
Our study also has some limitations. The sample size affects the accuracy of ML methods, so it is essential to train the ML models on more samples in the future. Nevertheless, this study demonstrates the feasibility of high-precision flow rate prediction for multi-way valves under limited installation space, and has reference value for promoting the digitalization, networking and intelligence of hydraulic systems in construction machinery.