The relationship between the input variables (including pipe properties, corrosion location, corrosion size, corrosion type, etc.) and the corrosion growth is very complex, so it is difficult to find a formula that describes it. Because the back-propagation (BP) neural network has a strong ability to deal with nonlinear problems, as well as strong self-learning and self-adaptive abilities, a BP neural network is used to predict the corrosion growth of the pipeline. This section describes the structure, the modeling process, and the performance assessment of the BP neural network model.
2.1. Structure of the BP Neural Network
The BP neural network is composed of many neurons, each with an operation function, and its structure includes the input layer, the hidden layers, and the output layer [19], as shown in Figure 1. The input layer consists of $p$ neurons represented by $x_i$, $i = 1, 2, 3, \ldots, p$, where $p$ is the number of input variables. The output layer consists of $q$ neurons represented by $y_j$, $j = 1, 2, 3, \ldots, q$, where $q$ is the number of output variables. Each node of the input layer is connected to all the nodes of the first hidden layer. Each node of a hidden layer is connected to all the nodes of the next hidden layer. Similarly, each node of the last hidden layer is connected to all the nodes of the output layer. Each connection has a weight associated with it. In this paper, 11 input variables (service life, pipe segment length, pipe wall thickness, corrosion type, corrosion location (distance to upstream girth weld, distance to downstream girth weld, inner/outer, clock direction), and corrosion size (length, width, depth)) and 3 output variables (corrosion growth coefficients (length, width, depth)) are used to construct the BP neural network. Thus, the BP neural network has 11 input neurons and 3 output neurons, that is, $p = 11$ and $q = 3$.
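The output of the $k$-th neuron in the first hidden layer, $h_k^{(1)}$, $k = 1, 2, \ldots, H_1$, where $H_1$ is the number of neurons in the first hidden layer, is expressed as follows.

$$\mathrm{net}_k^{(1)} = \sum_{i=1}^{p} w_{ik}^{(1)} x_i \tag{1}$$

$$h_k^{(1)} = f\left(\mathrm{net}_k^{(1)}\right) \tag{2}$$

where $f$ is a monotonically increasing function whose value is within (0, 1), and $w_k^{(1)} = \left(w_{1k}^{(1)}, \ldots, w_{pk}^{(1)}\right)$ is a vector of weights whose initial values are all within [−1, 1]. Similarly, the output of the $k$-th neuron in the $i$-th ($i > 1$) hidden layer, $h_k^{(i)}$, $k = 1, 2, \ldots, H_i$, where $H_i$ is the number of neurons in the $i$-th hidden layer, is expressed as follows,

$$\mathrm{net}_k^{(i)} = \sum_{m=1}^{H_{i-1}} w_{mk}^{(i)} h_m^{(i-1)} \tag{3}$$

$$h_k^{(i)} = f\left(\mathrm{net}_k^{(i)}\right) \tag{4}$$

where $w_k^{(i)} = \left(w_{1k}^{(i)}, \ldots, w_{H_{i-1}k}^{(i)}\right)$ is also a vector of weights whose initial values are within [−1, 1]. The output of the model $y_j$, namely the output of the $j$-th output neuron, is

$$\mathrm{net}_j^{\mathrm{out}} = \sum_{k=1}^{H_{\mathrm{last}}} w_{kj}^{\mathrm{out}} h_k^{(\mathrm{last})} \tag{5}$$

$$y_j = f\left(\mathrm{net}_j^{\mathrm{out}}\right) \tag{6}$$

where $w_j^{\mathrm{out}} = \left(w_{1j}^{\mathrm{out}}, \ldots, w_{H_{\mathrm{last}}j}^{\mathrm{out}}\right)$ is a vector of weights whose initial values are within [−1, 1], and $H_{\mathrm{last}}$ is the number of neurons in the last hidden layer. In this paper, the sigmoid function is selected as the activation function $f$ used in (2), (4), and (6). This function is expressed as follows.

$$f(x) = \frac{1}{1 + e^{-x}} \tag{7}$$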
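The layer-by-layer forward computation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this paper: it assumes no bias terms (the text mentions only weights), takes the hidden-layer sizes 5, 7, 9, and 11 reported in Section 2.2, and uses a placeholder input vector instead of real inspection data. The names `init_weights` and `forward` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid f(z) = 1 / (1 + exp(-z)), with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(layer_sizes, rng):
    """One weight matrix per pair of adjacent layers, entries drawn from [-1, 1]."""
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(x, weights):
    """Propagate one input vector through every layer (assumption: no biases)."""
    activation = list(x)
    for layer in weights:
        # Each neuron applies the sigmoid to the weighted sum of the previous layer.
        activation = [
            sigmoid(sum(w * a for w, a in zip(neuron_weights, activation)))
            for neuron_weights in layer
        ]
    return activation


# p = 11 inputs and q = 3 outputs as in the text; hidden sizes from Section 2.2.
LAYER_SIZES = [11, 5, 7, 9, 11, 3]
rng = random.Random(0)  # fixed seed, an assumption made only for repeatability
weights = init_weights(LAYER_SIZES, rng)
x = [rng.random() for _ in range(LAYER_SIZES[0])]  # placeholder input, not real data
y = forward(x, weights)
print(len(y), all(0.0 < v < 1.0 for v in y))  # prints: 3 True
```

Because the sigmoid is also applied at the output layer, every predicted value lies in (0, 1), so in practice the target variables would need to be scaled to that range before training.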
Constructing a BP neural network is essentially the process of determining the weights of these connections. When a BP neural network works, it mainly transmits two kinds of data: the forward-propagating signal and the back-propagating error. The input data flow from the input layer through the hidden layers to the output layer. The BP algorithm then compares the actual outputs with the target outputs and propagates the error in the opposite direction. The error is apportioned among the nodes of each layer, and the weight of each connection is adjusted by the back-propagation learning rule until the objective function reaches its minimum value. At that point, the process of establishing the BP neural network is finished.
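The forward and backward passes described above can be illustrated with a minimal sketch. It assumes plain gradient descent on a squared-error objective for a single sample, no bias terms, and a toy 2-3-1 network. The learning rate, the objective, the network size, and the function name `backprop_step` are illustrative assumptions, not the settings of this paper.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, weights):
    """Return the activations of every layer, starting with the input itself."""
    activations = [list(x)]
    for layer in weights:
        prev = activations[-1]
        activations.append(
            [sigmoid(sum(w * a for w, a in zip(row, prev))) for row in layer]
        )
    return activations


def backprop_step(x, target, weights, lr):
    """One gradient-descent update on E = 0.5 * sum((y - t)^2); returns E before the update."""
    activations = forward(x, weights)
    output = activations[-1]
    # Output-layer error signal: (y - t) * f'(net), where f'(net) = y * (1 - y).
    delta = [(y - t) * y * (1.0 - y) for y, t in zip(output, target)]
    for idx in range(len(weights) - 1, -1, -1):
        prev = activations[idx]
        prev_delta = None
        if idx > 0:
            # Propagate the error to the previous layer using the old weights.
            prev_delta = [
                sum(weights[idx][k][m] * delta[k] for k in range(len(delta)))
                * prev[m] * (1.0 - prev[m])
                for m in range(len(prev))
            ]
        # Move each connection weight against its error gradient.
        for k, row in enumerate(weights[idx]):
            for m in range(len(row)):
                row[m] -= lr * delta[k] * prev[m]
        if prev_delta is not None:
            delta = prev_delta
    return 0.5 * sum((y - t) ** 2 for y, t in zip(output, target))


rng = random.Random(0)
sizes = [2, 3, 1]  # toy network, far smaller than the model in this paper
weights = [
    [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]
x, target = [0.5, 0.2], [0.8]  # made-up sample for illustration
losses = [backprop_step(x, target, weights, lr=0.5) for _ in range(200)]
print(losses[-1] < losses[0])
```

Repeating the step drives the error on this single sample down, which is the behavior the weight-adjustment rule is meant to produce. A real training run would loop over all training samples and stop at a target error or an iteration limit.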
2.2. Modeling Process
The BP neural network is a data-driven model, and the modeling process is as follows:
(1) Obtain the database and determine the number of neurons in the input layer and the output layer of the BP neural network;
(2) Randomly sort the collected data (data size = 11,103) and select 70% of the samples (viz. 7772 data points) as the training samples. The remaining 3331 samples are used as the testing samples;
(3) Train the BP neural network with the training samples and evaluate the performance of the model on the testing samples.
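The random 70/30 split in the steps above can be sketched as follows. The sample list is a stand-in of integer indices, and the function name `split_samples` and the seed are assumptions introduced only for illustration.

```python
import random


def split_samples(samples, train_fraction, seed=None):
    """Shuffle a copy of the samples and cut it into training and testing parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


samples = list(range(11103))  # stand-in indices for the 11,103 collected records
train, test = split_samples(samples, 0.7, seed=42)
print(len(train), len(test))  # prints: 7772 3331
```

Rounding down 0.7 × 11,103 = 7772.1 gives the 7772 training samples stated in the text, which leaves 3331 samples for testing.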
The BP model has two main limitations. The first is overfitting: the trained BP model fits the training set with high precision but has a relatively large prediction error on the testing set. In the proposed method, the target error of the BP model is not set too small and redundant samples are deleted, so that this limitation is avoided. The second limitation is inherent to the BP model and cannot be avoided. In flat regions of the error surface, the gradient is small, so the weights change very little in each update. This makes the convergence of the BP model relatively slow and lengthens the training process.
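A short worked derivation may make the second limitation concrete. It assumes plain gradient descent with a learning rate $\eta$ on an error function $E$; both symbols are introduced here for illustration only. Under that assumption, each weight $w$ changes by

$$\Delta w = -\eta \frac{\partial E}{\partial w}.$$

For the sigmoid $f(z) = 1/(1 + e^{-z})$, the derivative is

$$f'(z) = f(z)\left(1 - f(z)\right) \le \frac{1}{4}, \qquad f'(z) \to 0 \ \text{as} \ |z| \to \infty.$$

By the chain rule, the back-propagated gradient $\partial E / \partial w$ contains one factor $f'(z)$ for every layer between the weight and the output. When neurons saturate, these factors are close to zero, so $\Delta w$ is small and many iterations are needed to leave the flat region.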
In establishing a BP neural network, the main task is to determine the network parameters, including the number of hidden layers, the number of nodes in each hidden layer, the learning rate, the learning objective (target error), and the number of training iterations.
To determine the network parameters, the number of hidden layers is fixed first. The number of neurons in the $i$-th hidden layer, $H_i$, is then obtained from an empirical formula.
In theory, a BP neural network with three hidden layers gives a good fitting result. In this paper, BP neural networks with one, two, three, four, and five hidden layers are tested using the collected data and a simple linear growth model. The simulation result is best when the BP neural network has four hidden layers. When the number of hidden layers is too large, such as five, overfitting occurs. Therefore, the number of hidden layers is set to four in this paper, and the numbers of neurons in these four hidden layers are 5, 7, 9, and 11, respectively.
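As a quick check on the size of the selected architecture, the sketch below counts its connection weights. It assumes a fully connected network without bias terms, since the text mentions only weights. `count_weights` is a hypothetical helper.

```python
def count_weights(layer_sizes):
    """Number of connections between adjacent, fully connected layers (no biases)."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


# 11 inputs, four hidden layers of 5, 7, 9, and 11 neurons, 3 outputs.
selected = [11, 5, 7, 9, 11, 3]
print(count_weights(selected))  # 55 + 35 + 63 + 99 + 33 = 285
```

Under this assumption the selected network has 285 adjustable weights, compared with the 7772 training samples used to fit them.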
Meanwhile, the other parameters of the BP neural network are set to their default values [24]. The parameter selection of the BP neural network is shown in Table 1.
Based on the chosen parameters, the initial BP neural network model is built. After training the BP neural network with the training samples, the weights of the connections between neurons are optimized and the final BP neural network structure is determined. Then, the BP neural network is evaluated on the testing samples to verify its validity. After that, the established BP neural network can be used to predict the corrosion growth of the pipeline.