3.3.1. Time Domain Analysis
The preceding analysis shows that different operating states correspond to different motor power curves; therefore, the operating condition of a pumping unit can be diagnosed from the variation of electric power over time. Nine features are defined in the time domain as follows:
1. Upstroke power. The total work done by the motor during the upstroke of the pumping unit is called the upstroke work, expressed as follows:
2. Downstroke power. The total work done by the motor during the downstroke of the pumping unit is called the downstroke work. Under fault conditions, the downstroke portion of the electric power diagram tends to show noticeable fluctuations. The downstroke work is expressed as follows:
3. Cycle work deviation. Under normal operating conditions, the upstroke and downstroke work of a beam-pumping unit are generally balanced. However, when a fault occurs, a significant disparity in work between the two strokes may emerge, expressed as follows:
4. Degree of balance. This feature is computed from the average upstroke power and the average downstroke power. Generally, a pumping unit system is considered balanced when the degree of balance falls between 0.8 and 1.1 (80% to 110%). It is commonly accepted that if the degree of balance exceeds 1.1, the pumping unit is underbalanced, and if the degree of balance is less than 0.8, the unit is overbalanced.
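The stroke-level quantities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the work over a stroke is approximated as the sum of sampled power times the sampling interval, the split between upstroke and downstroke samples is taken as known, and the sample values are hypothetical. The degree of balance is passed in as a number because its formula is not shown in this section; only the 0.8 and 1.1 thresholds come from the text.

```python
def stroke_work(power_kw, dt_s):
    """Approximate work over one stroke (kJ) as sum(P_i) * dt.

    Assumption: power is sampled at a fixed interval dt_s (seconds).
    """
    return sum(power_kw) * dt_s


def average_power(power_kw):
    """Average power over one stroke (kW)."""
    return sum(power_kw) / len(power_kw)


def classify_balance(degree_of_balance):
    """Apply the 0.8 / 1.1 thresholds quoted in the text."""
    if degree_of_balance > 1.1:
        return "underbalanced"
    if degree_of_balance < 0.8:
        return "overbalanced"
    return "balanced"


# Hypothetical samples for one cycle, 50 ms apart (values are illustrative only).
dt = 0.05
upstroke = [4.0, 6.0, 8.0, 6.0]
downstroke = [3.0, 5.0, 5.0, 3.0]

w_up = stroke_work(upstroke, dt)
w_down = stroke_work(downstroke, dt)
mean_up = average_power(upstroke)
mean_down = average_power(downstroke)
```

Keeping the threshold check separate from the work computation means the same classifier can be reused with whichever definition of the degree of balance is adopted.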
5. First-order moment statistics: The mean, quantile and triple mean. The mean describes the central location of the data and is considered a stable component of the signal. The sample mean is generally calculated according to the following formula:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Sort the data from smallest to largest as $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$. Let $0 < P < 1$; then, the P-quantile is expressed as follows:
where $[nP]$ stands for n multiplied by P and then rounded.
The triple mean is the weighted average of the upper quartile, median and lower quartile, calculated as follows:
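A minimal Python sketch of the three first-order statistics follows. The quantile rule used here is an assumption, since the equation is not shown in this section: it is the common order-statistic definition that averages two neighbouring order statistics when nP is an integer and otherwise takes the order statistic at position floor(nP) + 1. The triple mean is computed with the usual trimean weights (1, 2, 1)/4 on the lower quartile, median and upper quartile.

```python
import math


def sample_mean(data):
    """Arithmetic mean of the samples."""
    return sum(data) / len(data)


def p_quantile(data, p):
    """P-quantile from order statistics, for 0 < p < 1 (assumed definition)."""
    xs = sorted(data)
    n_p = len(xs) * p
    if abs(n_p - round(n_p)) < 1e-9:
        k = round(n_p)
        # nP is an integer: average the k-th and (k+1)-th order statistics.
        return 0.5 * (xs[k - 1] + xs[k])
    # Otherwise take the (floor(nP) + 1)-th order statistic (1-based).
    return xs[math.floor(n_p)]


def triple_mean(data):
    """Weighted average of lower quartile, median and upper quartile."""
    q1 = p_quantile(data, 0.25)
    q2 = p_quantile(data, 0.50)
    q3 = p_quantile(data, 0.75)
    return (q1 + 2.0 * q2 + q3) / 4.0


samples = [5, 1, 8, 3, 7, 2, 6, 4]  # hypothetical power samples
mean_value = sample_mean(samples)
lower_quartile = p_quantile(samples, 0.25)
trimean_value = triple_mean(samples)
```

Because the triple mean depends only on three order statistics, it is less sensitive to isolated power spikes than the plain mean.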
6. Second-order moment statistics: The mean square value and variance. The mean square value represents the energy of a vibration signal and is expressed as follows:
Variance, also known as the second-order central moment, is a statistic that describes the relative dispersion of data values:
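The two second-order statistics can be sketched as follows. The variance is written here in its population form (division by n), which is an assumption; the paper may use the n − 1 form instead. With this choice the variance equals the mean square value minus the squared mean, which gives a quick consistency check.

```python
def mean_square(data):
    """Mean of the squared samples (signal energy per sample)."""
    return sum(x * x for x in data) / len(data)


def variance(data):
    """Second-order central moment, population form (assumed: divide by n)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / len(data)


samples = [1.0, 2.0, 3.0, 4.0]  # hypothetical power samples
msv = mean_square(samples)
var = variance(samples)
mean = sum(samples) / len(samples)
```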
3.3.2. Time–Frequency Domain Analysis
The shape of the electric power diagram can accurately reflect the working state of the pumping unit under certain faults, but it has limitations. Sand production failure occurs when sand particles enter the pump, creating significant resistance to the movement of the plunger. This resistance causes the pumping rod to vibrate, which appears as strong oscillations in the electric power diagram. Reduction gearbox failure primarily involves broken gear teeth. Such damage can cause violent impacts, leading to large high-frequency fluctuations in motor power, so that the frequency content of the power diagram becomes rich with information. A typical power diagram is shown in Figure 10d,e. These two conditions cannot be effectively identified solely from the changes in the power diagram during the upstroke and downstroke. Therefore, it is necessary to analyze the time–frequency information of the power diagram to diagnose them accurately. To this end, feature extraction from the electric power diagram using the improved Hilbert–Huang transform [23,24] is explored to enhance the characterization of the electric power diagram.
The improved Hilbert–Huang Transform (HHT) consists of the following two components: the CEEMDAN decomposition and the Hilbert transform. HHT is a method used to analyze non-linear and non-stationary signals, which are commonly encountered in real-world fault diagnosis scenarios. The CEEMDAN decomposition enhances the traditional Empirical Mode Decomposition (EMD) algorithm by addressing the issue of mode mixing through the addition of adaptive Gaussian white noise and the iterative averaging of the primary Intrinsic Mode Function (IMF) components. This refinement improves the accuracy and stability of signal decomposition, allowing for the more precise extraction of features from complex signals. The Hilbert transform is then applied to the decomposed IMFs to obtain instantaneous frequency and amplitude information, which can be used for further analysis and fault detection. By combining these two components, HHT offers a powerful tool for analyzing signals that exhibit time-varying and frequency-varying characteristics, such as those found in oilfield pumping systems.
The specific steps of CEEMDAN decomposition are as follows:
First, add Gaussian white noise components with zero mean and unit variance, decomposed by EMD, to the original signal x. The resulting sequence for local mean decomposition is as follows:
Obtain the local mean signal of the decomposed sequence, and use this to obtain the first residue, as follows:
At this point, the first mode at the first stage (i.e., when k = 1) can be obtained as follows:
Next, to estimate the second residue, white noise is added to the first residue and the second local mean is computed; the second mode component is then defined as follows:
Furthermore, until stage k cannot be decomposed further, the kth residual is computed as follows:
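For reference, the steps above match the structure of the improved CEEMDAN recursion commonly given in the literature. The block below is a hedged sketch of that standard form, not a reproduction of the paper's own equations. The notation is assumed: $w^{(i)}$ is the ith white-noise realization, $E_k(\cdot)$ extracts the kth EMD mode, $M(\cdot)$ is the local mean operator, $\langle\cdot\rangle$ denotes averaging over the realizations, $\beta_k$ is the noise amplitude at stage k, $r_k$ is the kth residue and $d_k$ is the kth mode.

```latex
\begin{align}
x^{(i)} &= x + \beta_0 E_1\!\left(w^{(i)}\right), \qquad i = 1, \dots, I \\
r_1 &= \left\langle M\!\left(x^{(i)}\right) \right\rangle \\
d_1 &= x - r_1 \\
r_2 &= \left\langle M\!\left(r_1 + \beta_1 E_2\!\left(w^{(i)}\right)\right) \right\rangle,
\qquad d_2 = r_1 - r_2 \\
r_k &= \left\langle M\!\left(r_{k-1} + \beta_{k-1} E_k\!\left(w^{(i)}\right)\right) \right\rangle,
\qquad d_k = r_{k-1} - r_k
\end{align}
```

Working with averaged local means rather than averaged modes is what reduces residual noise and spurious modes relative to the original CEEMDAN.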
The Hilbert transform is applied to each of the above IMF components separately to obtain the following:
Construct an analytic signal as follows:
Expanding Equation (34) yields the Hilbert spectrum of the signal, denoted as follows:
Then, the Hilbert marginal spectrum is expressed as follows:
where T is the total signal duration. The Hilbert marginal spectrum clearly reveals how the original signal is distributed over frequency: it reflects how the amplitude of the signal varies at each frequency point within the entire frequency range. The IMF components and marginal spectra for typical working conditions obtained by the improved Hilbert–Huang transform are shown in Figure 11.
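The chain from IMF components to the marginal spectrum can be summarized in the standard Hilbert–Huang form below. This is a sketch under assumed notation rather than the paper's exact equations: $c_i(t)$ is the ith IMF, $\hat{c}_i(t)$ its Hilbert transform, $a_i(t)$ and $\theta_i(t)$ the instantaneous amplitude and phase, $\omega_i(t)$ the instantaneous frequency, and m the number of IMF components.

```latex
\begin{align}
\hat{c}_i(t) &= \frac{1}{\pi}\,\mathrm{P.V.}\!\int_{-\infty}^{\infty} \frac{c_i(\tau)}{t-\tau}\,d\tau \\
z_i(t) &= c_i(t) + j\,\hat{c}_i(t) = a_i(t)\,e^{j\theta_i(t)} \\
a_i(t) &= \sqrt{c_i^2(t) + \hat{c}_i^2(t)}, \qquad
\theta_i(t) = \arctan\frac{\hat{c}_i(t)}{c_i(t)}, \qquad
\omega_i(t) = \frac{d\theta_i(t)}{dt} \\
H(\omega, t) &= \mathrm{Re}\sum_{i=1}^{m} a_i(t)\,e^{\,j\int \omega_i(t)\,dt} \\
h(\omega) &= \int_{0}^{T} H(\omega, t)\,dt
\end{align}
```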
Since the sampling period is 50 ms, the sampling frequency is 20 Hz; according to Shannon’s sampling theorem, the marginal spectrum therefore contains only components from 0 to 10 Hz. This can be seen in the marginal spectrum in Figure 12.
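The sketch below illustrates, in plain Python, how a marginal spectrum limited to the 0–10 Hz band can be accumulated from instantaneous amplitude and frequency. Several things here are assumptions made for illustration: a single synthetic 2.5 Hz cosine stands in for an IMF (CEEMDAN itself is not implemented), the analytic signal is built with a direct DFT, the instantaneous frequency is estimated from phase differences between neighbouring samples, and the ten 1 Hz bins are an arbitrary choice.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal z = x + j*H[x], built by zeroing negative DFT frequencies."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]

    def gain(k):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            return 1.0  # DC and Nyquist bins are kept once
        return 2.0 if k < (n + 1) // 2 else 0.0  # double positive, drop negative

    return [sum(gain(k) * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def instantaneous(z, fs):
    """Instantaneous amplitude and frequency (Hz) from neighbouring samples."""
    amp = [abs(z[t]) for t in range(len(z) - 1)]
    freq = [cmath.phase(z[t + 1] * z[t].conjugate()) * fs / (2.0 * math.pi)
            for t in range(len(z) - 1)]
    return amp, freq


def marginal_spectrum(components, fs, n_bins=10):
    """Accumulate amplitude * dt into frequency bins covering 0 .. fs/2."""
    nyquist = fs / 2.0
    width = nyquist / n_bins
    h = [0.0] * n_bins
    for c in components:
        amp, freq = instantaneous(analytic_signal(c), fs)
        for a, f in zip(amp, freq):
            if 0.0 <= f < nyquist:
                h[min(int(f / width), n_bins - 1)] += a / fs
    return h


fs = 20.0  # 50 ms sampling period -> 20 Hz, so the Nyquist limit is 10 Hz
n = 80     # 4 s of data, i.e. exactly 10 periods of the 2.5 Hz test tone
component = [math.cos(2.0 * math.pi * 2.5 * t / fs) for t in range(n)]
h = marginal_spectrum([component], fs)
peak_bin = max(range(len(h)), key=h.__getitem__)  # lands in the 2-3 Hz bin
```

The direct DFT keeps the example dependency-free; for real signals an FFT-based routine such as `scipy.signal.hilbert` would be the more practical choice.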
The energy distribution of the electric power diagram differs among working conditions. The energy of the low-frequency components from 0 to 2 Hz mainly originates from the movement of the pumping rods, whereas the higher-frequency components from 2 to 10 Hz are mainly due to reduction gearbox failure and the mechanical vibration caused by sand production. Although the amplitude of the abnormal frequencies is much smaller than that of the base frequency, the associated long-term wear may waste power. When sand production in the reservoir is severe, it may reduce production or force a shutdown, seriously affecting oil well production and causing economic loss. Therefore, it is crucial to analyze the electric power signal in both the time and frequency domains.
Kurtosis, which is very sensitive to vibration signals, is chosen to screen the characteristic signal parameters for determining the vibration. For a given discrete vibration signal, the kurtosis coefficient K is expressed as follows:
$$K = \frac{1}{N}\sum_{i=1}^{N}\frac{\left(x_i - \bar{x}\right)^4}{\sigma^4}$$
where i represents the position of the discrete point in the component; $x_i$ represents the signal value; $\bar{x}$ represents the signal mean; N represents the sampling length; and $\sigma$ represents the standard deviation.
The kurtosis values of all IMF components were computed and normalized according to the following equation:
where i represents the index of the IMF component, and m represents the number of IMF components.
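A minimal sketch of the kurtosis screening follows. Two points are assumptions: the standard deviation is taken in its population form (division by N), and the normalization divides each IMF's kurtosis by the sum over all m components, which is one plausible reading of the normalization described above. The two short sequences are hypothetical stand-ins for IMF components.

```python
def kurtosis(x):
    """Kurtosis K = (1/N) * sum((x_i - mean)^4) / sigma^4, population sigma."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    fourth = sum((v - mean) ** 4 for v in x) / n
    return fourth / (var ** 2)


def normalized_kurtosis(imfs):
    """Kurtosis of each component divided by the sum over all components (assumed)."""
    values = [kurtosis(c) for c in imfs]
    total = sum(values)
    return [v / total for v in values]


# Hypothetical components: a smooth oscillation and an impulsive one.
smooth = [-1.0, 1.0, -1.0, 1.0]
impulsive = [0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, -4.0]
weights = normalized_kurtosis([smooth, impulsive])
```

The impulsive component receives the larger weight, which is the behaviour that makes kurtosis useful for picking out impact-like vibration.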
In addition, the Hilbert marginal spectrum can reflect the cumulative distribution of the energy of the vibration generated at different frequencies during the occurrence of faults. Therefore, frequency features and energy region change features are extracted for the Hilbert marginal spectrum. Three frequency features are defined as follows:
(1) Center of mass frequency (FC):
(2) Mean square frequency (MSF):
(3) Frequency variance (VF):
where N is the length of the data, $f_i$ is the ith frequency value, $A(f_i)$ is the amplitude corresponding to $f_i$, and $\Delta f$ is the frequency resolution.
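The three frequency features can be sketched with their usual amplitude-weighted definitions. These forms are an assumption, since the equations are not shown here: FC is the amplitude-weighted mean frequency, MSF is the amplitude-weighted mean of the squared frequency, and VF is the amplitude-weighted spread around FC. On a uniform frequency grid the frequency resolution cancels between numerator and denominator, so it does not appear in the code. The spectrum values are hypothetical.

```python
def frequency_features(freqs, amps):
    """Return (FC, MSF, VF) of a sampled marginal spectrum (assumed definitions)."""
    total = sum(amps)
    fc = sum(f * a for f, a in zip(freqs, amps)) / total
    msf = sum(f * f * a for f, a in zip(freqs, amps)) / total
    vf = sum((f - fc) ** 2 * a for f, a in zip(freqs, amps)) / total
    return fc, msf, vf


# Hypothetical marginal spectrum sampled at 1, 2 and 3 Hz.
freqs = [1.0, 2.0, 3.0]
amps = [1.0, 2.0, 1.0]
fc, msf, vf = frequency_features(freqs, amps)
```

Under these definitions VF equals MSF minus FC squared, which is a convenient sanity check on any implementation.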
The steps for the energy region change feature extraction are as follows:
Step 1: Observe how the energy of the marginal spectrum changes with frequency, and divide the frequency range equally into five sub-intervals with different energy distributions.
Step 2: Sum the energies in each sub-interval. Let $E_i$ be the sum parameter for the ith sub-interval,
where $a_i$ represents the amplitude of the ith discrete point, and N is the number of sampling points.
Step 3: Compute the sum parameter of the energy over the entire frequency range, denoted E.
Step 4: Normalize using E as the baseline; the characteristic parameter of the ith sub-interval is expressed as follows:
Step 5: The marginal spectrum energy transformation feature vector X is given by the following:
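Steps 1 to 5 can be sketched as a single function. The assumptions are as follows: the marginal spectrum is sampled on a uniform frequency grid, so splitting the samples into five equal index ranges corresponds to five equal-width frequency sub-intervals; the energy of a point is taken as its squared amplitude; and the function name and sample values are hypothetical.

```python
def energy_region_features(amplitudes, n_regions=5):
    """Normalized energy per frequency sub-interval (feature vector X)."""
    n = len(amplitudes)
    energies = []
    for r in range(n_regions):
        start = r * n // n_regions
        stop = (r + 1) * n // n_regions
        # Step 2: energy sum of one sub-interval (assumed: squared amplitude).
        energies.append(sum(a * a for a in amplitudes[start:stop]))
    total = sum(energies)                 # Step 3: total energy E
    return [e / total for e in energies]  # Steps 4-5: normalize and collect


# Hypothetical marginal spectrum with most energy at low frequency.
spectrum = [2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
features = energy_region_features(spectrum)
```

Because the features are normalized by E, they describe how energy is distributed across the bands rather than its absolute level, which makes wells with different power ratings comparable.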