Location Detection Method of Detector in Pipeline Using VMD Algorithm and Machine Learning Classifier

Abstract: The internal detector in a pipeline relies on ground markers that record its passing time for accurate positioning. Most existing ground markers use the magnetic flux leakage testing principle to detect whether the internal detector has passed. In contrast, this paper tracks and locates the internal detector by detecting vibration signals. The Variational Mode Decomposition (VMD) algorithm is used to extract features, which mitigates the heavy noise and frequent disturbances found in vibration signals. In this way, the detection range is expanded, and internal detectors that do not use magnetic flux leakage can also be located. Firstly, the acquired vibration signals are denoised by the VMD algorithm; then, the kurtosis value and power value are extracted from the intrinsic mode functions (IMFs) to form feature vectors; finally, the feature vectors are input into random forest and Multilayer Perceptron (MLP) classifiers. Experimental research shows that the proposed method, which combines VMD with a machine learning classifier, can effectively use vibration signals to locate the internal detector, with high accuracy and good adaptability.


Introduction
Pipeline transportation is one of the most economical ways of transporting energy. After years of use, the pipe wall becomes highly susceptible to corrosion caused by the fluids it carries. Serious corrosion damages the pipeline and causes leakage, which not only harms the natural environment but also causes economic losses [1].
Internal detectors are often used to detect abnormalities in pipelines. It is very important to obtain the operating position of the internal detector when it is detecting inside the pipe [2]. There are various methods for positioning the internal detector: odometer wheel [3], radioactive elements [4], magnetic leakage field [5], etc. The ground marker is a common external positioning device used for internal detection, which can identify whether the internal detector passes through the pipeline. It generally uses the magnetic principle for positioning.
A ground marker using the magnetic principle can only locate Magnetic Flux Leakage (MFL) detectors, so its range of application is relatively small. In this paper, the ground marker uses vibration sensors to locate the internal detector. The vibration sensor picks up the vibration signal generated by the friction between the internal detector and the pipe wall. Vibration signals are often used as the basis for diagnosis [6,7] because they are simple to acquire and carry a significant amount of information on machinery condition. Compared with the ground marker based on the magnetic principle, the ground marker with vibration sensors can detect more kinds of internal detectors and has greater applicability.
To increase the positioning accuracy, suitable methods are needed to reduce noise and extract features from the vibration signals. Common denoising methods include the Fourier transform and the wavelet transform. However, these methods rely on fixed basis functions and cannot adapt the denoising to the characteristics of the signal. Empirical Mode Decomposition (EMD) has been widely used for analyzing non-stationary and nonlinear signals. The basic idea of EMD is that any time series can be represented as a sum of Intrinsic Mode Functions (IMFs), which are arranged in order of frequency from high to low [8]. However, when EMD is applied to vibration signals, it suffers from mode aliasing and end effects. The Variational Mode Decomposition (VMD) method proposed by Dragomiretskiy and Zosso can better suppress the mode aliasing phenomenon [9]. VMD acts as a bank of adaptive Wiener filters and therefore shows better noise robustness. The instantaneous amplitude and frequency can be calculated from each decomposed IMF, and each IMF can also serve as a characteristic of the original signal. The authors of [10] combined VMD with the Discrete Wavelet Transform (DWT) and Constrained Least Squares (CLS) optimization to denoise physiological signals. In [11], VMD is used to denoise the signals of unknown nonlinear systems before they are fed into an Extreme Learning Machine (ELM) network. Therefore, VMD can be used to denoise vibration signals, and the feature vectors of the vibration signals can be obtained from each IMF.
Anomaly recognition in fault diagnosis using vibration signals can be roughly classified into two categories: (1) traditional mathematical analysis; and (2) machine learning. Traditional analysis depends heavily on prior knowledge and experience, and its practical application faces many difficulties. Intelligent recognition methods have made great progress in recent years. The authors of [12] used an Artificial Neural Network (ANN) to classify faults from machine vibrations, with Root Mean Square (RMS), kurtosis, and Power Spectrum Density (PSD) as the features fed to the ANN. The authors of [13] used wavelet packet analysis to preprocess the input of a radial basis function neural network for fault diagnosis.
For intelligent recognition methods, the selection and extraction of fault features has always been the core of the recognition problem, and different feature selection and extraction methods strongly influence the recognition results. Compared with other methods, the features extracted by VMD are more representative. Because the instantaneous amplitude and frequency can be calculated from each decomposed IMF, VMD is well suited to the preprocessing and feature extraction of non-stationary signals, and the additional feature values it provides better reflect the attributes of the sample. The authors of [14] proposed a rolling bearing fault feature extraction method based on VMD and singular value decomposition. Compared with the EMD feature extraction method, it achieves better classification performance for the same fault diagnosis task.
To discuss the adaptability of the features extracted by the VMD algorithm, this paper adopts two different classifiers. One is random forest, which is an algorithm based on decision tree, and the other is Multilayer Perceptron (MLP). The classification results show that the features extracted by the VMD algorithm have good adaptability to different classifiers.
The main innovations of this paper are as follows: (1) To address the problem of internal detector positioning, a ground marker with vibration sensors is adopted to collect the vibration signals produced by the internal detector as it passes by. The ground marker with vibration sensors can locate internal detectors regardless of their detection principle. It not only achieves the same effect as traditional ground markers based on the magnetic principle but also improves the universality of application. (2) A method of extracting characteristic parameters from the vibration signal by using the VMD algorithm is proposed. This method can obtain more feature information in a shorter time. Using a small number of labeled samples, this method combined with different intelligent classifiers can produce more accurate recognition results.

Using the VMD Algorithm to Extract Feature Values
The VMD algorithm formulates signal decomposition as a variational problem and is suitable for processing non-stationary and nonlinear signals. The input signal is decomposed into K modal functions, and each mode is assumed to have a limited bandwidth around a different center frequency.
The goal of the VMD algorithm is to minimize the sum of the estimated bandwidths of the modes. To solve this variational problem, VMD adopts the Alternating Direction Method of Multipliers (ADMM) to update the modes and their center frequencies, so that each mode is demodulated to its corresponding baseband during the updating process. To convert the constrained variational problem into an unconstrained one, the algorithm introduces a quadratic penalty factor and a Lagrange multiplier. The quadratic penalty reduces the interference of Gaussian noise, while the Lagrange multiplier enforces the constraint strictly. In general, the process by which VMD adaptively decomposes the vibration signal consists mainly of the construction and the solution of the variational problem.

Variational Problem Construction
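(1) Firstly, a variational constraint model is established, as shown in the following formula:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \tag{1}$$

where $\{u_k\}$ is the set of decomposed modal functions, $\{\omega_k\}$ is the set of center frequencies of the corresponding modes, and $f(t)$ is the original signal. $*$ denotes the convolution operation, $\partial_t$ denotes the derivative with respect to time, and $\delta(t)$ is the unit impulse function.

(2) To enforce the constraint strictly during the solution process, the Lagrange multiplier $\lambda$ is introduced. Thus, the augmented Lagrangian is obtained as follows:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \tag{2}$$

where $\alpha$ is the penalty factor and $\lambda$ is the Lagrange multiplier.

(3) Then, by applying the alternating direction method of multipliers and continuously updating $\hat{u}_k^{n+1}$, $\omega_k^{n+1}$, and $\hat{\lambda}^{n+1}$, the saddle point of the above formula can be found, which is the optimal solution of Equation (1). The formulas for updating the variables during iteration are as follows:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2} \tag{3}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k(\omega) \right|^2 d\omega} \tag{4}$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right) \tag{5}$$

where the hat denotes the Fourier transform, $n$ is the iteration index, and $\tau$ is the update step of the Lagrange multiplier.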

Implementation Process of VMD Specific Algorithm
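The implementation process of the algorithm is given in Table 1.

Table 1. The implementation process of VMD.
(1) Initialize each mode $\hat{u}_k^1$, the center frequency $\omega_k^1$, and the Lagrange multiplier $\hat{\lambda}^1$.
(2) Update each mode $\hat{u}_k$ and its center frequency $\omega_k$ according to their update formulas.
(3) Update the Lagrange multiplier $\hat{\lambda}$ according to its update formula.
(4) For a given identification accuracy $\varepsilon > 0$, if $\sum_k \| \hat{u}_k^{n+1} - \hat{u}_k^{n} \|_2^2 / \| \hat{u}_k^{n} \|_2^2 < \varepsilon$ is not satisfied, return to Step 2; otherwise, stop the iteration.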
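The iteration described above (initialize, update the modes and center frequencies, update the multiplier, check convergence) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `vmd`, the default values of `alpha`, `tau`, and `tol`, the uniform initialization of the center frequencies, and the use of a one-sided real FFT without mirror extension of the signal are all choices made here for brevity.

```python
import numpy as np

def vmd(signal, K, alpha=2000.0, tau=0.0, tol=1e-7, max_iter=500):
    """Simplified VMD sketch: returns (modes, center frequencies in cycles/sample)."""
    f = np.asarray(signal, dtype=float)
    n = f.size
    freqs = np.fft.rfftfreq(n)            # one-sided frequency axis, 0 .. 0.5
    f_hat = np.fft.rfft(f)                # one-sided spectrum of the real signal
    u_hat = np.zeros((K, freqs.size), dtype=complex)
    omega = 0.5 * np.arange(K) / K        # assumed initialization: uniform spacing
    lam = np.zeros(freqs.size, dtype=complex)

    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Wiener-filter-like mode update around the current center frequency
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Center frequency update: power-weighted mean frequency of the mode
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / (np.sum(power) + 1e-12)
        # Multiplier update (tau = 0 disables it, which tolerates noise)
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))
        # Convergence check: relative change of all modes
        change = sum(
            np.sum(np.abs(u_hat[k] - u_prev[k]) ** 2) / (np.sum(np.abs(u_prev[k]) ** 2) + 1e-12)
            for k in range(K)
        )
        if change < tol:
            break

    order = np.argsort(omega)             # report modes from low to high frequency
    modes = np.fft.irfft(u_hat[order], n=n, axis=1)
    return modes, omega[order]

# Synthetic two-tone test signal (not pipeline data), sampled at 250 Hz
fs = 250.0
t = np.arange(500) / fs
x = np.cos(2 * np.pi * 10 * t) + 0.5 * np.cos(2 * np.pi * 60 * t)
modes, omega = vmd(x, K=2)
print(omega * fs)  # center frequencies in Hz, close to 10 and 60 for this signal
```

Multiplying the returned center frequencies by the sampling frequency converts them to Hz, which is the form needed when comparing the spacing between neighboring modes.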

Feature Extraction by the VMD Algorithm
In practical industrial settings, vibration signals are usually non-stationary, and it is difficult to obtain accurate results with traditional statistical methods. The VMD method performs adaptive time-frequency decomposition according to the local time-varying characteristics of the signal, so it is well suited to decomposing the nonlinear and non-stationary signals in pipelines. In this paper, the signal is decomposed into several IMFs by the VMD algorithm, and the kurtosis and power of each IMF after decomposition are calculated to form the extracted feature vector.

Introduction of Kurtosis Value and Power Value
As a dimensionless parameter, kurtosis reflects the peakedness of the waveform and the distribution characteristics of the vibration signal. The mathematical expression is shown in Equation (6):

$$\mathrm{Kurt}(x) = \frac{E\left[(x - \mu)^4\right]}{\sigma^4} \tag{6}$$

where µ is the mean value of x, σ is the standard deviation of x, and E is the expectation.

The frequency distribution of the vibration signal changes as the internal detector passes through the pipeline. At the same time, the power distribution of the signal also changes. The internal detector has a great influence on the signal power in each frequency band, and the power component of each frequency band carries abundant information. Therefore, the power of each IMF is extracted as a feature of the pipeline state. The mathematical expression is shown in Equation (7):

$$E_i = \int_{-\infty}^{+\infty} \left| c_i(t) \right|^2 dt \tag{7}$$

where i = 1, 2, . . . , n, c_i is the i-th IMF component, and E_i is the corresponding power value.
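The two features can be computed in a few lines. The sketch below assumes that kurtosis is the fourth central moment divided by the fourth power of the standard deviation, and that the power value of a sampled IMF is the sum of its squared samples (a discrete stand-in for the continuous definition). The function names `kurtosis` and `imf_power` are illustrative.

```python
import numpy as np

def kurtosis(x):
    """Kurtosis E[(x - mu)^4] / sigma^4 of a sampled signal."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()  # population standard deviation
    return np.mean((x - mu) ** 4) / sigma ** 4

def imf_power(c):
    """Power value of one IMF, taken here as the sum of squared samples."""
    c = np.asarray(c, dtype=float)
    return np.sum(c ** 2)

# A pure sine over whole periods has kurtosis 1.5; impulsive signals score higher
n = np.arange(1000)
sine = np.sin(2 * np.pi * 5 * n / 1000)
print(kurtosis(sine))                       # approximately 1.5
print(imf_power([1.0, -1.0, 1.0, -1.0]))    # 4.0
```

A sharp transient superimposed on background vibration raises the kurtosis well above the sine-wave value, which is why kurtosis is useful for flagging impact-like events.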

Steps for Extracting Features
The specific steps for feature extraction of kurtosis value and power value are as follows:
(1) Use the VMD method to initially denoise the signal and decompose the denoised signal.
(2) Determine the decomposed modal number K and select N IMF components containing the main information.
(3) Calculate the kurtosis value and power value of each selected IMF component.
(4) Combine the kurtosis values and power values into a feature vector T and normalize it.
(5) Use the normalized feature vector T as the input of the classifier.
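The feature-building steps can be sketched as follows. The normalization scheme is not specified in the text, so column-wise min-max scaling across samples is assumed here; the helper names `imf_features` and `min_max_normalize` are hypothetical, and random arrays stand in for real IMFs.

```python
import numpy as np

def imf_features(imfs):
    """Feature vector [kurtosis_1..kurtosis_N, power_1..power_N] for an (N, length) IMF array."""
    imfs = np.asarray(imfs, dtype=float)
    centered = imfs - imfs.mean(axis=1, keepdims=True)
    kurt = np.mean(centered ** 4, axis=1) / np.std(imfs, axis=1) ** 4
    power = np.sum(imfs ** 2, axis=1)
    return np.concatenate([kurt, power])

def min_max_normalize(features):
    """Scale every feature column to [0, 1] across samples (assumed normalization)."""
    features = np.asarray(features, dtype=float)
    lo = features.min(axis=0)
    span = features.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (features - lo) / span

# Placeholder data: 4 samples, each with 5 "IMFs" of 256 points (random, not measured)
rng = np.random.default_rng(0)
samples = [rng.normal(size=(5, 256)) * rng.uniform(0.5, 2.0) for _ in range(4)]
T = min_max_normalize([imf_features(s) for s in samples])
print(T.shape)  # (4, 10)
```

With five IMFs per sample, each row of `T` holds ten values, five kurtosis values followed by five power values.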

Random Forest
Random forest was formalized by Leo Breiman in 2001, extending the random decision forests that Tin Kam Ho introduced in 1995. It is an ensemble learning algorithm based on decision trees [15]. Unlike a single decision tree, a random forest is composed of many decision trees that are combined by bagging, with random feature selection for each tree. The result is determined by the votes output by the individual trees. Therefore, random forest is less prone to overfitting and has strong anti-interference capability. For large amounts of complex data, random forest gives more efficient and accurate classification results than a single classifier, yielding an improved estimate with better performance.
The steps of random forest are shown in Figure 1: (1) Draw data samples repeatedly with replacement (bootstrap sampling) to obtain many data subsets. Specifically, for each tree, N samples are drawn at random with replacement from the original N samples, so the subset may contain duplicates. These N samples are used as the training set of the tree; K training sets are generated by repeating this process K times.
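The sampling step and the final vote can be illustrated with a short sketch. It covers only bootstrap index generation and majority voting, not tree construction; the function names `bootstrap_indices` and `majority_vote` and the toy predictions are assumptions made for illustration.

```python
import numpy as np

def bootstrap_indices(n_samples, n_trees, seed=0):
    """For each tree, draw n_samples indices with replacement from 0..n_samples-1."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_samples, size=(n_trees, n_samples))

def majority_vote(tree_predictions):
    """Combine per-tree class labels (rows = trees, columns = samples) by voting."""
    votes = np.asarray(tree_predictions)
    return np.array([np.bincount(column).argmax() for column in votes.T])

idx = bootstrap_indices(n_samples=8, n_trees=3)
print(idx.shape)  # (3, 8): one training set of 8 indices per tree

# Toy predictions from 3 trees for 3 samples
print(majority_vote([[0, 1, 1], [0, 1, 0], [1, 1, 0]]))  # [0 1 0]
```

Because indices are drawn with replacement, some original samples appear several times in a tree's training set while others are left out, which is what makes the trees differ from one another.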

Multilayer Perceptron
A neural network is a system formed by connecting a number of simple processing units. The system processes information with the dynamic response of its state to external input information. An artificial neural network is an information processing system aiming at imitating the structure and function of the human brain. Back Propagation (BP) is the most widely used algorithm for supervised learning using multilayer feedforward networks.
Multilayer Perceptron (MLP) is an extension of the neural network model. Its original idea is to construct a multilayer neural network model by increasing the number of hidden layers. The MLP neural network is one of the most mature artificial neural networks at present [16]. It is mainly composed of three layers of perceptron structure: input layer, hidden layer, and output layer. Each layer of perceptron contains several neurons.
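The output of a layer of neurons can be written as $y = \phi(Wx + b)$, where x is the input vector of the layer, y is its output, W is the network weight of the neurons, b is the network offset (bias), and ϕ is the activation function. The simplified structure is shown in Figure 2.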
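A forward pass through such a network amounts to applying the weights, offset, and activation function layer by layer. The sketch below is only illustrative: the function name `mlp_forward`, the tanh activation on every layer, the random weights, and the layer sizes (10 inputs, two hidden layers of 25 neurons, 3 outputs) are assumptions, and training by back propagation is not shown.

```python
import numpy as np

def mlp_forward(x, layers, activation=np.tanh):
    """Propagate x through a list of (W, b) layers, applying phi(W a + b) at each one."""
    a = np.asarray(x, dtype=float)
    for W, b in layers:
        a = activation(W @ a + b)
    return a

# Illustrative network with random weights (not trained)
rng = np.random.default_rng(0)
sizes = [10, 25, 25, 3]
layers = [(rng.normal(size=(m, n)), np.zeros(m)) for n, m in zip(sizes, sizes[1:])]
y = mlp_forward(rng.normal(size=10), layers)
print(y.shape)  # (3,)
```

In practice the output layer usually uses a different activation (for example softmax for classification), and the weights are learned from labeled samples rather than drawn at random.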

Test Environment
In the experiment field, an IEPE 50g sensor was used to detect the vibration signal generated when the internal detector passes through the pipeline, as shown in Figure 3. The sampling frequency of the sensor was 250 Hz. The analysis shows that there are three kinds of vibration signals: signals caused by incorrect operations by researchers, signals caused by the sensor falling off, and signals generated by the internal detector.

Feature Extraction Using VMD Algorithm
Before using the VMD algorithm to decompose the signal, the number of decomposed modes (K) must be determined. The VMD algorithm decomposes the preprocessed mixed signal to obtain multiple variational modal components and the corresponding center frequencies of each component. Because each variational modal component is distinguished by the center frequency, the most suitable K value can be determined by observing, comparing, and analyzing the center frequency of each modal component.
In this study, a decomposition is considered an over-decomposition if the difference between the center frequencies of two components is less than 20 Hz. In this experiment, K values from 1 to 6 were used to decompose the vibration signal, and the center frequency of each IMF component was obtained for each K value. By analyzing and comparing how close the center frequencies are to each other, the appropriate mode number K could be judged.
The corresponding center frequencies obtained after decomposition with K from 1 to 6 are shown in Table 2. Table 2 shows that, when K = 6, the center frequencies of IMF5 and IMF6 are 96.56 and 114.33 Hz, respectively. The difference between them, 17.77 Hz, is less than 20 Hz, so this decomposition is judged to be over-decomposed. Therefore, the K value of the VMD algorithm in this experiment was set to K = 5. Figure 4 shows the waveforms of the five IMFs obtained by decomposing the vibration signals through the VMD algorithm.
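The 20 Hz rule for choosing K can be written as a small check. In the sketch below, only the two values 96.56 Hz and 114.33 Hz come from the experiment; all other center frequencies are hypothetical placeholders chosen so the example runs, and the function names are illustrative.

```python
def is_over_decomposed(center_freqs_hz, min_gap_hz=20.0):
    """True if any two neighboring center frequencies are closer than min_gap_hz."""
    ordered = sorted(center_freqs_hz)
    return any(b - a < min_gap_hz for a, b in zip(ordered, ordered[1:]))

def select_mode_number(centers_by_k, min_gap_hz=20.0):
    """Largest K reached before the first over-decomposition, or None if K=min already fails."""
    best = None
    for k in sorted(centers_by_k):
        if is_over_decomposed(centers_by_k[k], min_gap_hz):
            break
        best = k
    return best

# Placeholder table: only 96.56 and 114.33 (K = 6) are reported values
centers_by_k = {
    5: [10.0, 35.0, 60.0, 85.0, 110.0],
    6: [8.0, 30.0, 52.0, 74.0, 96.56, 114.33],
}
print(select_mode_number(centers_by_k))  # 5
```

For K = 6 the gap between the last two modes is 17.77 Hz, which falls below the threshold, so the search stops and the previous value K = 5 is kept.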
The kurtosis value and power value of each IMF were extracted and normalized to form feature vectors. The same operation was done for the other kinds of interference vibration signals. In total, 300 samples were selected for each kind of vibration signal, and the feature dimension of each sample was 10 (the kurtosis and power of each of the five IMFs).

Analysis of Random Forest Classification Results
To verify the advantages of the VMD algorithm in feature extraction, the feature vectors formed by VMD and EMD were input into a random forest classifier. The number of trees in the random forest was set to 3, 5, 10, 50, and 100 for comparison. The relationship between classification accuracy and the number of trees is shown in Figure 5. The accuracy refers to the rate of successfully distinguishing the positioning signal from interference signals. Figure 5 shows that the classification accuracy using the VMD algorithm is higher than that using the EMD algorithm for every number of trees. When the number of trees reaches 100, the proposed method achieves an accuracy of 98.43%.

Analysis of Classification Results of MLP Neural Network
To further verify the superiority of the VMD algorithm in extracting features and to show its wide adaptability, the feature vectors formed by the VMD and EMD algorithms were input into the MLP classifier. The classification effects obtained with different numbers of neurons in the hidden layers were compared and analyzed. The MLP classifier used two hidden layers, and the number of neurons in each hidden layer was set to 5, 10, 15, 20, and 25, respectively. The relationship between classification accuracy and the number of neurons in the hidden layers is shown in Figure 6. The accuracy refers to the rate of successfully distinguishing the positioning signal from interference signals. Figure 6 shows that, as the number of neurons in the hidden layers increases, the classification accuracy becomes higher. For any number of neurons, the results of VMD are more accurate than the results of EMD. With features extracted by VMD, MLP obtains its highest accuracy of 99.35% when the number of neurons in each hidden layer reaches 25. With features extracted by EMD, however, the accuracy only reaches 92.69% under the same condition.
Comparing Figures 5 and 6 shows that, when classifying vibration signals, MLP classifies better than random forest. Moreover, the results show that using the VMD algorithm to extract the features of vibration signals helps a classifier obtain high accuracy.

Comparison of Classification Effects with Different Methods
To make the experimental conclusion more universal, we tested 100 groups of pipeline vibration data measured in the experiment field. The signal length of each group is 2 h. To further prove the superiority of using the VMD algorithm, it was compared with two other methods. One extracts the kurtosis value and power value from each component obtained by EMD, and the other extracts wavelet coefficients using the Morlet wavelet basis. We input the feature vectors formed by the different methods into the classifiers and compared the accuracy. Random forest and MLP were used as classifiers. The number of trees in the random forest was set to 100, and the number of neurons in each hidden layer of the MLP was set to 25. The MLP classifier used two hidden layers. The classification effect is shown in Table 3. From the results in Table 3, it can be concluded that VMD combined with the MLP classifier has the highest average accuracy, which is 99.36% over 100 repeated experiments. In terms of the highest average accuracy, the other methods are not as good as VMD: EMD is 7.38% lower than VMD and the Morlet wavelet is 26.1% lower than VMD. Overall, the results show that the classification effect of random forest is worse than that of MLP. Among all combinations, the Morlet wavelet combined with random forest has the lowest accuracy, which is 68.32%.

Conclusions
To detect the position of the internal detector in a pipeline, a ground marker using the magnetic principle is generally used. However, its range of detection is relatively narrow, because it can only track and locate in-pipeline detectors that use magnetic leakage signals to find defects. In this paper, we use a new ground marker with vibration sensors to position the in-pipeline detector. The features of the vibration signals are extracted by the VMD algorithm as input for the classifier, and the machine learning model then distinguishes the positioning signal from abnormal signals. Firstly, VMD with different preset numbers of modal components is used to denoise and decompose the signal. After obtaining and analyzing the center frequency of each component, the most suitable modal number K is judged. Then, the power value and kurtosis value extracted from each IMF component are used as the classification basis and are input into random forest and MLP for classification. The features extracted by EMD are also input for experimental comparison. The classification results show that the features extracted by VMD are a good basis for classifying vibration signals. For the random forest and MLP classifiers, the VMD algorithm increases the accuracy by 11.69% and 7.38%, respectively, compared to the EMD algorithm. In total, 100 repeated experiments were performed to make the conclusion more universal. It was found that, compared with the other methods, VMD combined with MLP has the best classification effect, increasing the accuracy by 26.1% compared to the Morlet wavelet method. This shows that combining VMD feature extraction with a machine learning classifier can solve the problem of positioning detectors in pipelines by using vibration signals.

Conflicts of Interest:
The authors declare no conflict of interest.