The Effectiveness of Ensemble-Neural Network Techniques to Predict Peak Uplift Resistance of Buried Pipes in Reinforced Sand

Abstract: Buried pipes are extensively used for oil transportation from offshore platforms. Under unfavorable loading combinations, the pipe's uplift resistance may be exceeded, which may result in excessive deformations and significant disruptions. This paper presents findings from a series of small-scale tests performed on pipes buried in geogrid-reinforced sands, with the measured peak uplift resistance being used to calibrate advanced numerical models employing neural networks. Multilayer perceptron (MLP) and Radial Basis Function (RBF) primary structure types have been used to train two neural network models, which were then further developed using bagging and boosting ensemble techniques. Correlation coefficients in excess of 0.954 between the measured and predicted peak uplift resistance have been achieved. The results show that the design of pipelines can be significantly improved using the proposed novel, reliable and robust soft computing models.


Introduction
Buried pipes are extensively used to transport oil and gas from offshore platforms. Under unfavorable loading combinations, the pipe's uplift resistance may be exceeded, which may lead to excessive deformations [1-5] and significant disruptions. The peak uplift resistance of buried pipes is typically determined using a combination of laboratory tests, field tests and numerical modelling [1-10]. It has been shown that the peak uplift resistance of buried pipes depends on the density of the surrounding soil, although the relative pipe-to-soil movement required to mobilize the peak uplift resistance is not significantly affected by soil density [11]. The peak uplift resistance increases with pipe embedment depth up to a critical embedment depth, beyond which no further increase in the normalized peak uplift resistance is recorded [5].
Laboratory-scale and field tests for the prediction of the peak uplift resistance of buried pipes are time-consuming and costly to perform [7], and in this context advanced numerical modelling may offer a viable alternative. A variety of artificial intelligence techniques have been employed in the literature for the prediction of the uplift resistance of buried pipes, including artificial neural network (ANN), neuro-genetic network (NGN), genetic programming and simulated annealing (GP-SA), multivariate adaptive regression spline (MARS), support vector machine (SVM), intelligent fuzzy radial basis function neural network inference method (IFRIM) and group method of data handling-harmony search (GMDH-HS). A summary of the various artificial intelligence (AI) and soft computing (SC) techniques used in the literature to predict the uplift resistance of buried pipes in various soil types is shown in Table 1. Although ANNs have been widely used for the prediction of the peak uplift resistance of buried pipes [6,12-14], their capabilities may not have been fully explored. For example, only a limited number of studies developed ANN models in combination with other techniques, which may have enhanced their accuracy [13]. It has been shown that when ANNs are developed with boosting and bagging ensemble techniques, the prediction accuracy of artificial intelligence models may significantly increase.
This paper presents a series of small-scale tests on pipes buried in geogrid-reinforced sands; the measured peak uplift resistance was used to calibrate advanced numerical models employing neural networks. To this end, multilayer perceptron (MLP) and Radial Basis Function (RBF) primary structure types were used to train two neural network models, which were then further developed using bagging and boosting ensemble techniques.

Table 1. Artificial intelligence and soft computing techniques used in the literature to predict uplift resistance.
Pai [45]. Objective: To predict the uplift capacity of suction caissons using a neuro-genetic network. Technique: NGN. Performance: -
Alavi et al. [46]. Objective: To obtain an empirical model for evaluating the complicated behavior of suction caissons' uplift capacity by the hybrid GP-SA technique. Technique: GP-SA. Performance: R²: train = 0.858; test = 0.759
Alavi et al. [47]. Objective: To examine the robustness of the conventional tree-based GP and its suitable modifications, i.e., LGP and GEP, for evaluating the complicated behavior of the uplift capacity of suction caissons.
To develop a model to obtain the minimum liquefaction potential.

Laboratory Test Procedure
A series of 44 small-scale uplift resistance tests were performed and the measured peak uplift resistance was used to calibrate advanced numerical models employing neural networks. Figure 1 shows the layout of the pullout system used for the small-scale uplift resistance tests. The pullout system comprised a rectangular test chamber and a pulling arm system joined to an AC motor, which applied an uplift force to the buried pipes. More details regarding the testing methodology can be found in Faizi et al. [8] and Armaghani et al. [9]. The small-scale tests investigated the effect of varying the pipe diameter, pipe burial depth, and geogrid geometry on the uplift resistance of buried pipes, so as to identify the most suitable input parameters affecting the peak uplift resistance of pipes. Table 2 lists the range of parameter values investigated in the 44 small-scale uplift resistance tests performed in this research. In addition, Figures 2 and 3 show typical uplift resistance tests in reinforced sand conducted in this study.

ANN
Artificial neural networks (ANNs) are advanced numerical models that evolved from the idea of simulating the interaction of the synapses of brain neurons. An ANN is an adaptive method that alters its structure according to the external or internal information that flows through the network during the learning stage [55-58]. In an ANN, the connections among neurons generally carry different weights that show how influential each connection is. Each neuron computes a weighted sum of its inputs, that is, the activation of each input neuron is multiplied by the corresponding connection weight. ANNs are categorized into two main types: Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) [59]. The popularity of ANN models is associated with their robustness compared to linear numerical modeling methods [60]. Both MLP and RBF networks are non-linear universal approximators with feed-forward structures [61,62].
An MLP is a type of feed-forward ANN, which comprises at least three layers of nodes: (1) input layer, (2) hidden layer and (3) output layer. Every node, except for the input nodes, is a neuron that utilizes a non-linear activation function. It is worth noting that the MLP employs a supervised learning procedure named backpropagation training [63,64]. The MLP can be differentiated from a linear perceptron through its multiple layers and non-linear activation. An MLP can also distinguish data which are not linearly separable [65].
The RBF network is an ANN that employs radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Similar to the MLP, the RBF network typically comprises three layers: (1) input layer, (2) hidden layer with a non-linear RBF activation function and (3) linear output layer. The input is represented as a vector of real numbers, and the output of the network is then a scalar function of the input vector.
MLP and RBF networks have significant architectural differences [66]. An MLP-NN may comprise one or more non-linear hidden layers, while an RBF-NN contains only one non-linear hidden layer, which is followed by a linear output layer. Regarding the activation of every hidden neuron, an MLP-NN applies its activation function to a dot product of the input and weight vectors, while an RBF-NN applies it to a vector norm, namely the distance between the input vector and the neuron's centre. Due to the nature of the activation functions, MLPs can create global input-output non-linear mappings, whereas RBF-NNs can only produce local estimates. Figure 4 shows the significant differences between the architecture of MLP and RBF.
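The difference between the two hidden-neuron types can be illustrated with a minimal sketch. The choice of a tanh activation for the MLP unit and a Gaussian kernel for the RBF unit, as well as the names `mlp_unit`, `rbf_unit`, `w`, `b`, `c` and `sigma`, are assumptions made for illustration only; the paper does not state which functions were used.

```python
import math


def mlp_unit(x, w, b):
    """MLP hidden neuron: activation applied to the dot product w.x + b.

    tanh is an assumed activation, chosen only for illustration.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return math.tanh(z)


def rbf_unit(x, c, sigma):
    """RBF hidden neuron: Gaussian of the Euclidean distance ||x - c||.

    c is the neuron's centre and sigma its width (both assumed names).
    """
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))


if __name__ == "__main__":
    # The RBF unit responds only near its centre (local estimate) ...
    print(rbf_unit([1.0, 2.0], [1.0, 2.0], 1.0))
    print(rbf_unit([100.0, 100.0], [0.0, 0.0], 1.0))
    # ... whereas the MLP unit stays active arbitrarily far along w (global mapping).
    print(mlp_unit([100.0, 100.0], [1.0, 1.0], 0.0))
```

The RBF response decays with distance from the centre in every direction, which is why such a unit contributes only locally, while the MLP response depends on position along the weight direction and does not decay.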

Ensemble Approach
Ensemble learning (EL) is an advanced numerical modelling technique which uses several models, usually termed "weak learners", which are trained to solve the same problem and are then combined to generate more reliable results. The main underlying assumption is that if weak models are appropriately combined, then the results will be more accurate and the combined model is likely to be more robust. Employing ensembles to conduct predictions may be justified by their capability to generalize outcomes and decrease model variance, owing to their lower susceptibility to data noise [67]. In this paper, two primary ensemble techniques, bagging and boosting, were employed. Boosting is a method applied to resolve classification and regression problems. Boosting techniques primarily focus on bias mitigation and on enhancing the model's fit to the data. Compared with bagging techniques, which primarily decrease variance, boosting is a technique that involves sequentially fitting various weak learners in a highly adaptive process. Every model in the series is fitted giving higher importance to the observations in the dataset that were poorly handled by the earlier models in the series. Each new model concentrates on fitting the most challenging observations and the outcome is a strong learner with a lower bias.
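The contrast between the two ensemble strategies can be sketched in a few lines. This is only an illustration under stated assumptions: the weak learner is a one-split regression stump on a single input (a stand-in for the ANN base models of this paper), bagging averages models fitted to bootstrap samples, and boosting is shown in its least-squares residual-fitting form rather than the observation-reweighting form described above. The names `fit_stump`, `bagging`, `boosting`, `n_models` and `rate` are hypothetical.

```python
import random


def avg(values):
    values = list(values)
    return sum(values) / len(values)


def fit_stump(xs, ys):
    """Weak learner: one-split regression stump minimising squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # thresholds that leave both sides non-empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = avg(left), avg(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # all inputs identical: fall back to the mean
        m = avg(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr


def bagging(xs, ys, n_models=25, seed=0):
    """Fit each learner to a bootstrap sample and average the predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: avg(m(x) for m in models)


def boosting(xs, ys, n_models=25, rate=0.5):
    """Fit learners sequentially, each one to the residuals left by the previous ones."""
    base = avg(ys)
    residuals = [y - base for y in ys]
    models = []
    for _ in range(n_models):
        m = fit_stump(xs, residuals)
        models.append(m)
        residuals = [r - rate * m(x) for x, r in zip(xs, residuals)]
    return lambda x: base + rate * sum(m(x) for m in models)
```

In the bagging function the learners are independent of one another and the averaging reduces variance, whereas in the boosting function each learner depends on the errors of its predecessors, which is what reduces bias.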
Bagging, or bootstrap aggregating, is one of the most established methods for generating ensemble models [67]. While this technique was primarily developed to solve classification problems, it was later employed to analyze regression data. Bagging is defined by producing various samples from an identical set of data by means of the bootstrap method and refitting the model to each of them. As a result, it is possible to create many different models for an identical predictor and employ them for creating an aggregate forecast. The prediction can be achieved by voting for classification problems or by averaging for regression problems [68]. The benefit of utilizing this method for ensemble creation is that it reduces the error of the baseline predictors. The predictive performance of the bagging technique can be assessed using cross-validation and testing-set measures [69].
Figure 5 shows the effect of pipe diameter and burial depth on the peak uplift resistance of pipes buried in non-reinforced sand. The results show that the peak uplift resistance increased with increasing pipe diameter. For 100 mm burial depths, the peak uplift resistance increased from 36.48 to 49.8 N as the pipe diameter increased from 25 mm to 55 mm. Similar trends were observed for 150 mm burial depths, with the peak uplift resistance increasing from 49.92 to 76.8 N as the pipe diameter increased from 25 to 55 mm. For both burial depths, the highest peak uplift resistance values were obtained for the pipe with a diameter of 55 mm. It is clear from this figure that pipe diameter and burial depth have a great influence on the peak uplift resistance, which increases directly with pipe diameter.
Specifically, it was found that there is a large difference in the peak uplift resistance values from D = 25 mm to D = 35 mm, while this difference decreases significantly from D = 45 mm to D = 55 mm. In this regard, Dickin [70] mentioned that the difference in the peak uplift resistance values becomes smaller for greater pipe diameters. Figure 6 shows the effect of the geogrid layout on the peak uplift resistance of pipes buried in sand. The geogrid layout was varied by including different geogrid lengths and numbers of layers. The results in Figure 6 show that increasing the geogrid length and number of layers significantly increased the peak uplift resistance of the buried pipe. Inclusion of geogrid reinforcement changed the uplift resistance-displacement response of the sand. In the non-reinforced sand, the uplift resistance increased with increasing strain up to a peak value, followed by strain softening, whereas in the reinforced soil the uplift resistance increased monotonically with increasing strain. The results in Figures 5 and 6 show the influence of pipe diameter, burial depth, geogrid length and number of geogrid layers on the uplift resistance of pipes and justify their use as input parameters for the artificial neural network modeling of the peak uplift resistance.

Soft Computing Predictive Models
This paper applied two ensemble approaches, boosting and bagging, to the two most commonly used ANN models for predicting the peak uplift resistance of pipelines buried in sand. Two standard models (STANDARDANNMLP and STANDARDANNRBF), without applying the ensemble approaches, were developed to benchmark the ANN-ensemble models. Before the models' development, the data were divided into train and test partitions with an 80:20 proportion. The results of these models were then analyzed using a number of performance criteria, including the mean absolute error (MAE), correlation coefficient (R), and gains. R and MAE are computed from the measured values y, their mean ȳ, the corresponding predicted values y′, and the total number of data N.
This research initially applied standard ANN models based on MLP and RBF structures to a database of 44 laboratory tests. In these two models, a single hidden layer, a maximum training time of 5 min, and an over-fit prevention set of 30% were used. It is worth mentioning that the neural network modelling internally divides samples into a model building set and an over-fit prevention set. The structures of these models are shown in Figure 7. Thereafter, bagging and boosting techniques were applied to the trained ANNs, to assess the effectiveness of these techniques when applied to standard ANN models. For these four models (ANNMLP-BOOSTED, ANNMLP-BAGGED, ANNRBF-BOOSTED, and ANNRBF-BAGGED), the "mean" value was set as the combination rule for the target variable. In addition, the quantity of component models for boosting and bagging was established as in [10]. The measured and predicted peak uplift resistances of the buried pipes for all six models are shown in Figure 8.
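The two scalar performance criteria named above can be computed as in the following sketch. It assumes that R is the standard Pearson correlation coefficient between measured and predicted values and that MAE is the mean of the absolute differences; the function names `mae` and `pearson_r` are hypothetical and the exact expressions used by the authors may differ.

```python
import math


def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    n = len(measured)
    return sum(abs(y - p) for y, p in zip(measured, predicted)) / n


def pearson_r(measured, predicted):
    """Pearson correlation coefficient (assumed definition of R)."""
    n = len(measured)
    mean_y = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(measured, predicted))
    var_y = sum((y - mean_y) ** 2 for y in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_y * var_p)
```

A higher R and a lower MAE both indicate closer agreement between measured and predicted peak uplift resistance, which is why the two criteria are ranked in opposite directions when models are compared.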
The performance of these models was assessed using the criteria mentioned above. In addition, a simple ranking system was deployed to obtain a comprehensive assessment. In this system, each model's performance is evaluated based on R and MAE within each partition. Then, each model is ranked according to the value of each criterion. Thus, for R, the highest rank is assigned to the model which obtains the highest value; and for MAE, the highest rank is given to the model which acquires the lowest value. It should be noted that, in this case, the highest rank is six because this research deployed six models. The cumulative rank of each model is calculated using the following expression:
$\text{Cumulative rank} = \alpha_{tr} + \beta_{tr} + \alpha_{te} + \beta_{te}$
where α shows the ranking of R, β represents the ranking of MAE, with the subscripts "tr" and "te" denoting training and testing values respectively. Table 3 shows the results of each performance criterion and the rankings and cumulative ranking of each model. As can be seen, the ANNMLP-BOOSTED model obtained the highest cumulative, training R, and training and testing MAE rankings. Conversely, the ANNRBF-BAGGED model achieved the lowest cumulative ranking as well as the lowest training and testing R and MAE rankings. For the total ranking of the training phase, the highest ranking was achieved by ANNMLP-BOOSTED, while ANNMLP-BOOSTED and ANNMLP-BAGGED achieved the same, highest total ranking for the testing phase. For a better understanding of the ensemble models developed in this study, a diagram is provided in Figure 9, which compares the models with other ML models, such as KNN, SVM, and CHAID models.
Figure 9. Comparison between the ANN models developed in this study and other ML models.
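The ranking system described above can be sketched as follows, assuming that the cumulative rank is the plain sum of the four individual ranks (R and MAE, training and testing). The function names, the dictionary keys and the three placeholder models with their values are hypothetical and are not taken from Table 3; ties are not treated specially in this sketch.

```python
def rank_scores(values, higher_is_better):
    """Return ranks 1..n so that the best value receives rank n."""
    order = sorted(
        range(len(values)),
        key=lambda i: values[i],
        reverse=not higher_is_better,  # worst first, so it gets rank 1
    )
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


def cumulative_ranks(results):
    """Sum the ranks of R and MAE over the training and testing partitions."""
    names = list(results)
    total = {name: 0 for name in names}
    for key in ("R_tr", "MAE_tr", "R_te", "MAE_te"):
        values = [results[name][key] for name in names]
        ranks = rank_scores(values, higher_is_better=key.startswith("R"))
        for name, rank in zip(names, ranks):
            total[name] += rank
    return total


# Hypothetical values for three placeholder models (not the paper's results).
example = {
    "model_a": {"R_tr": 0.99, "MAE_tr": 1.0, "R_te": 0.97, "MAE_te": 2.0},
    "model_b": {"R_tr": 0.96, "MAE_tr": 2.0, "R_te": 0.98, "MAE_te": 2.5},
    "model_c": {"R_tr": 0.95, "MAE_tr": 3.0, "R_te": 0.95, "MAE_te": 3.0},
}
```

With six models, as in this study, the same functions would assign ranks from one to six for each criterion, so the cumulative rank of a model would lie between 4 and 24.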
In addition to using R and MAE criteria for evaluating the models' performance, a series of gain charts were employed to allow a more in-depth performance assessment (Figure 10). Gains are measured as (quantity of hits in quantile/whole quantity of hits) × 100%. At this point, it is essential to state that "hits" are the cases whose value is greater than the middle point of the field's range (peak uplift resistance > 78.92 N), so that the gains show how well a predictive model identifies these cases. In the diagram mentioned above, the blue line is the perfect model with full confidence (where hits = 100% of cases), the oblique red line signifies the at-chance model, and the other three lines in the center depict the models applied in this study. In general, the higher lines indicate more reliable models, especially on the chart's left side. To compare an applied model with the at-chance model, the region between the model's line and the red line can be utilized; this area signifies the superiority of the applied model over the at-chance model. The region between an applied model and the perfect model illustrates the room for improvement of the applied model. According to the results of the gains charts, for the ANNMLP-ensemble models in both testing and training phases, the ANNMLP-BOOSTED models performed better than all the other models developed in this study. Moreover, for the ANNRBF-ensemble models, all models showed similar performance in the testing phase. On the other hand, for the training phase, the ANNRBF-BOOSTED model outperformed all the other models.
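The gains measure defined above can be illustrated with a short sketch. It assumes that cases are sorted by predicted value in descending order, that gains are cumulative over the top fraction of cases, and that a hit is a case whose measured value exceeds a given threshold (the midpoint of the measured range); the function name `cumulative_gains` and its parameters are hypothetical, and the data in the usage example are placeholders rather than test results.

```python
def cumulative_gains(measured, predicted, threshold, fractions=(0.25, 0.5, 0.75, 1.0)):
    """Percentage of all hits captured in the top fraction of cases.

    Cases are ordered by predicted value (highest first); a hit is a case
    whose measured value is greater than the threshold.
    """
    hits = [y > threshold for y in measured]
    total_hits = sum(hits)
    if total_hits == 0:
        raise ValueError("no hits: gains are undefined")
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    gains = []
    for f in fractions:
        k = int(f * len(order) + 0.5)  # number of cases in the top fraction
        captured = sum(hits[i] for i in order[:k])
        gains.append(100.0 * captured / total_hits)
    return gains


if __name__ == "__main__":
    # Placeholder data: eight cases, threshold at the midpoint of the range.
    measured = [10, 20, 30, 40, 50, 60, 70, 80]
    midpoint = (min(measured) + max(measured)) / 2
    print(cumulative_gains(measured, measured, midpoint))        # perfect ordering
    print(cumulative_gains(measured, measured[::-1], midpoint))  # worst ordering
```

A model whose predictions order the cases correctly captures all hits within the first quantiles, which corresponds to a curve that rises steeply on the left side of the chart.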
This study shows that boosting and bagging techniques can improve the prediction performance (both the accuracy level and system error) of the MLP model. However, these techniques were not able to improve the performance of the RBF model, which may imply that both ensemble techniques are more suitable for the MLP-ANN model than the RBF-ANN model. The boosting technique showed better effectiveness than the bagging technique when applied to MLP and RBF models. The better performance of the boosting technique may be associated with its ability to weight the samples that are predicted with higher and lower accuracy differently and to combine the results at a later stage. Furthermore, the net error is assessed in every learning run, and the method operates well with interactions. Moreover, the boosting method is more appropriate when users deal with under-fitting or bias in the data.

Limitations and Future Research
The proposed neural network is applied in the five-dimensional space defined by the five parameters which influence the development of the peak uplift resistance of buried pipes in reinforced sand. The neural network is applicable for any parameter values ranging between the lowest and highest values of each parameter, as presented in Table 2. It should be mentioned that any soft computing-based model, such as an artificial neural network, can reliably predict an output only for parameter values ranging between the lowest and highest values of each parameter used during its training stage.
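This applicability condition can be checked before a model is queried, as in the following sketch. The function names and dictionary keys are hypothetical; the example uses only two of the five parameters, with the pipe diameters (25 and 55 mm) and burial depths (100 and 150 mm) quoted in the text, because the full ranges of Table 2 are not reproduced here.

```python
def training_ranges(rows):
    """Per-parameter (lowest, highest) values over the training inputs."""
    keys = rows[0].keys()
    return {k: (min(r[k] for r in rows), max(r[k] for r in rows)) for k in keys}


def within_ranges(query, ranges):
    """True if every parameter of the query lies inside its training range."""
    return all(lo <= query[k] <= hi for k, (lo, hi) in ranges.items())


# Illustrative inputs built from values quoted in the text (two parameters only).
rows = [
    {"pipe_diameter_mm": 25, "burial_depth_mm": 100},
    {"pipe_diameter_mm": 55, "burial_depth_mm": 100},
    {"pipe_diameter_mm": 25, "burial_depth_mm": 150},
    {"pipe_diameter_mm": 55, "burial_depth_mm": 150},
]
ranges = training_ranges(rows)
```

A query such as a 35 mm pipe at 120 mm depth would pass this check, whereas a 60 mm pipe would be flagged as extrapolation beyond the range covered in training.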
At this point it must be stressed that, in certain areas, more data are required in order to achieve the optimum fitting of the proposed neural network to the data. Future tests should be designed with pipe diameters, burial depths, geogrid lengths and numbers of geogrid layers chosen to reveal the development of the peak uplift resistance of buried pipes in reinforced sand for intermediate values of these parameters, which are not included in the database presented and used herein. In this way, the influence of each parameter on the peak uplift resistance will be further elaborated, and the neural network, having been trained over the full ranges of the input parameters, will be able to predict intermediate values more accurately.

Summary and Conclusions
A series of small-scale laboratory tests on buried pipes in reinforced and non-reinforced sands were conducted, and the measured peak uplift resistance was used to calibrate advanced numerical models employing neural network methods. Many studies have employed ANN modelling techniques to predict the peak uplift resistance. However, only a few studies hybridized the ANN with other modelling techniques, and the capabilities of ANNs were not explored completely in previous studies. Thus, this study investigated the feasibility of employing ANN modelling techniques while considering different capabilities of ANNs. Multilayer perceptron (MLP) and Radial Basis Function (RBF) primary structure types were used to train two neural network models, which were then further developed using bagging and boosting ensemble techniques.
The main conclusions of this study are the following:
• The results of the numerical modelling revealed that boosting and bagging techniques can significantly improve the prediction performance (both the accuracy level and the system error) of the MLP model. However, these techniques were not able to improve the performance of the RBF model, which may imply that both ensemble techniques are more suitable for the MLP-ANN than the RBF-ANN model.
• The boosting technique performed better than the bagging technique when applied to MLP and RBF models. In addition, correlation coefficients in excess of 0.954 were achieved between the measured and predicted peak uplift resistance.
• The proposed models reveal the complicated non-linear response of the peak uplift resistance of buried pipes in reinforced sand. Furthermore, they can be useful tools for researchers and engineers, and for supporting the teaching and interpretation of the peak uplift resistance of buried pipes in reinforced sand.
It is important to mention that the models proposed in this study should be utilized only within the conditions and ranges of the input variables considered here. Peak uplift resistance predictions for other situations may differ and be unreliable. Although it is common for geotechnical databases to be small in size and still be used for predicting the variables of interest, further studies can expand the database with more data to achieve better accuracies and fewer errors.