Unsupervised Fault Diagnosis of Sucker Rod Pump Using Domain Adaptation with Generated Motor Power Curves

Abstract: The poor real-time performance and high maintenance costs of dynamometer card (DC) sensors have been significant obstacles to timely fault diagnosis in the sucker rod pumping system (SRPS). In contrast to DCs, motor power curves (MPCs), which are easily accessible and closely associated with the entire system, have been used in recent years to predict the working conditions of the SRPS. However, the lack of labeled MPCs limits their successful application in industrial scenarios. Therefore, this paper presents an unsupervised fault diagnosis methodology that leverages generated MPCs of different working conditions to diagnose actual unlabeled MPCs. Firstly, the MPCs of six working conditions are generated with an integrated dynamics mathematical model. Secondly, a framework named mechanism-assisted domain adaptation network (MADAN) is proposed to minimize the distribution discrepancy between the generated and actual MPCs. Specifically, by introducing mechanism analysis to label the collected MPCs preliminarily, a conditional distribution discrepancy metric is defined to guarantee more accurate distribution matching with respect to different working conditions. Finally, validation experiments are performed to evaluate the mathematical model and the diagnosis method with a set of actual MPCs collected by a self-developed device. The experimental results demonstrate that the proposed method offers a promising approach for the unsupervised diagnosis of the SRPS.


Introduction
The sucker rod pumping system (SRPS) plays an indispensable role in the field of oil exploitation [1]. Due to long-term operation and a harsh working environment, some faults will inevitably occur, resulting in economic loss and increased energy consumption [2]. With the rapid development of machine learning, many data-driven fault diagnosis methods have been utilized to guarantee manufacturing security and improve production efficiency in the SRPS [3,4]. However, most traditional and commonly used diagnostic methods depend on the dynamometer card (DC), which is measured by the load sensor installed on the "horse head". These DC-based methods inevitably suffer from high maintenance cost and low detection frequency, resulting in poor real-time diagnostic capability for the SRPS.
Owing to power's advantages of accessibility and high correlation with the SRPS, the motor power-based diagnosis methods have received ever-increasing attention [5]. Ref. [6] distilled seven features from the motor power curves (MPCs) and utilized improved hidden conditional random fields to diagnose different working conditions. An MPC-based broad learning method was proposed in [7].
Although notable progress has been made, these methodologies lack applicability because of their reliance on massive labeled data, which is invalid in [...] the help of the adversarial domain adaptation, and a 1-D CNN constructs the feature generator network to extract the features of the time-series signal.
The main contributions of this paper can be summarized as follows:
1. An integrated dynamics mathematical model is established to generate the MPCs under the normal and five faulty working conditions. The model calculates the "four-bar" linkage movement, the sucker rod vibration, the pump chamber pressure, and the liquid flow rate. The adjustment strategies of the model and the relevant parameters under different working conditions are also presented.
2. A novel domain adaptation (DA) method named MADAN is proposed to exploit the knowledge learned from the generated MPCs to facilitate diagnosing the MPCs collected in practical scenarios. The mechanism-assisted pseudo-label learning is constructed to realize better conditional distribution alignment of the collected and generated MPCs under different working conditions. Furthermore, the domain classifier is designed for the marginal distribution alignment of the collected and generated MPCs.
3. Experiments demonstrate the superiority of the proposed fault diagnosis methodology with the MPCs collected by self-developed portable devices in the practical application scenario. The model's validity is verified by analyzing crucial downhole parameters and comparing the generated and actual MPCs. Furthermore, we experimentally show that MADAN outperforms five other state-of-the-art methods in terms of diagnostic accuracy and distribution alignment.
The rest of this article is organized as follows. Section 2 presents the integrated dynamics mathematical model used to generate the MPCs under various working conditions. Section 3 describes the proposed MADAN method. Section 4 shows the effectiveness of the proposed method through experimental verification. Finally, Section 5 concludes this article.

Generation of the Motor Power Curves
Driven by the motor, the pump, connected through a series of transmissions, reciprocates up and down to lift the oil from the down-hole to the ground in the SRPS. As the energy source of the whole system, the motor power carries information about the working properties of the SRPS. Accordingly, the MPCs of the well can be obtained by simulating the individual components in turn. To generate supplementary power waveforms of different working states for fault diagnosis, a detailed and integrated mathematical model of the SRPS is presented in this section. Since the MPCs are composed of power vs. time, the mathematical model follows the order of the red arrows in Figure 1: time → crank angle → polished rod motion → plunger motion → pump pressure → polished rod load → crank torque → power. The prediction of SRPS behavior involves calculating the "four-bar" linkage movement, the sucker rod vibration, the down-hole pump behavior, etc. Of these items, the operation of the down-hole pump, the polished rod motion, and the vibrations of the rod string are the most difficult but of primary importance. The establishment of the mathematical model therefore centers on these difficulties.

Polished Rod Motion Simulation
As the crank angular velocity is approximately constant in practice, the crank angle θ varies linearly with time. From trigonometrical considerations, the displacement of the polished rod s(t) is calculated from the crank angle and the geometry of the "four-bar" linkage. The torque acting on the crank is the sum of the torque produced by the polished rod load and the counterbalance torque arising from the balance weight and the net weight of the crank. The crank torque is derived backward from this relation, where µ = 1 when TF < 0 and µ = −1 when TF ≥ 0, and the torque factor TF is obtained from the mechanics of the linkage.
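As a minimal sketch of the first step in this chain, the crank angle can be computed from the stroke rate under the constant-speed assumption stated above. The function names `crank_angle` and `stroke_period`, and the convention that n is given in strokes per minute (so that one stroke lasts 60/n seconds, as the loop bound of Algorithm 1 suggests), are assumptions made for illustration. The polished rod displacement depends on the linkage geometry and is not reproduced here.

```python
import math


def stroke_period(n):
    """Duration of one stroke in seconds for n strokes per minute (assumed unit)."""
    return 60.0 / n


def crank_angle(t, n):
    """Crank angle in radians at time t (seconds).

    Assumes constant angular velocity: theta = omega * t,
    with omega = 2 * pi * n / 60 rad/s.
    """
    omega = 2.0 * math.pi * n / 60.0
    return omega * t
```

With these definitions, the crank completes one full revolution (2π rad) over one stroke period.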

Rod String Simulation
Considering that the rod string is up to thousands of meters long, its elastic deformation and vibration should not be neglected during the reciprocating up-and-down motion. Simulation of the rod string involves solving a boundary value problem. The boundary at the ground is treated as a prescribed motion, which is determined by the motion of the polished rod. The boundary at the down-hole is determined by the pump pressure acting on the plunger and the elastic force of the rod string. Since the rod string acts as a spring-mass-damping system with multiple degrees of freedom, it is divided into individual segments connected by equivalent springs, as illustrated in Figure 1.
Combined with buoyancy, gravity, and the frictional damping of the rod and tubing, the dynamic equation of the rod string can be derived, where $\dot{P}_i$ and $\ddot{P}_i$ denote the first and second derivatives of $P_i$ with respect to time, respectively. $C_l$ can be calculated as $C_l = \pi(d_p + d_r)$.
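The lumped spring-mass-damper idea described above can be illustrated with a short simulation. This is only a sketch under simplifying assumptions: all segments share the same hypothetical `mass`, `stiffness`, and `damping` values, gravity and buoyancy are omitted, and a semi-implicit Euler step is used for simplicity. It is not the dynamic equation of this paper.

```python
import numpy as np


def simulate_rod_string(top_disp, dt, n_seg=5, mass=1.0, stiffness=100.0,
                        damping=2.0, bottom_force=None):
    """Simulate a chain of n_seg lumped masses joined by equal springs.

    top_disp     : prescribed displacement of the top boundary per time step
    bottom_force : optional force applied to the last mass per time step
    Returns an array of shape (steps, n_seg) with the segment displacements.
    """
    steps = len(top_disp)
    x = np.zeros(n_seg)  # displacement of each lumped mass
    v = np.zeros(n_seg)  # velocity of each lumped mass
    history = np.zeros((steps, n_seg))
    for k in range(steps):
        # The neighbor above the first mass is the prescribed top boundary.
        above = np.concatenate(([top_disp[k]], x[:-1]))
        force = stiffness * (above - x)              # spring to the segment above
        force[:-1] += stiffness * (x[1:] - x[:-1])   # spring to the segment below
        force -= damping * v                         # viscous damping
        if bottom_force is not None:
            force[-1] += bottom_force[k]             # pump-side boundary force
        v = v + dt * force / mass                    # update velocity first
        x = x + dt * v                               # then position
        history[k] = x
    return history
```

With a constant top displacement and no bottom force, every segment settles at the same displacement as the top boundary, which is a quick sanity check of the boundary handling.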

Down-Hole Pump Simulation
Equal to the force on the bottom of the rod string $F_n$, the force on the plunger can also be deduced from the down-hole pump simulation; it is mainly related to the pressure in the pump. Because of the various coupled variables and sophisticated processes in the down-hole, the simulation of the pressure remains a difficult issue but is the core of the whole model. To get around this impasse, the basic concepts of iterative algorithms are applied in this subsection. The pressure is deduced as proportional to the mass per unit volume of free gas. The variation of gas, liquid, and oil in the pump is calculated from the flow rate. When the standing valve is closed, the flow rate is zero; when the standing valve is open, the flow rate is calculated from the flow-rate equation of the model. Considering that a small amount of gas is dissolved in the oil, the solubility of the gas is calculated based on Henry's Law, on the assumption that the water-oil-gas mass ratio of the flows is constant within one stroke. Finally, the pump pressure $P_p$ is estimated with an iterative equation.

Motor and Gearbox Simulation
Considering the energy loss in the gearbox and the motor, a simplified relation between the crank torque and the motor power is used.

Dynamic Implementation of the Overall Model
Following the order of the red arrows in Figure 1, Algorithm 1 outlines the whole procedure of the power generation method described above, combining the switching states of the standing and traveling valves. With the proposed mathematical model, the theoretical MPC can be obtained from the mechanical parameters of the specific oil well.

Algorithm 1: Generation of motor power waveforms.
Input: Times of stroke: n; a series of mechanical parameters of the system
Output: P m (t)
for t = 1 to 60/n do
    S(t) ← t, refer to Equations (1) and (2);
    ... refer to Equations (3) and (4);
    P m (t) ← F c (t), refer to Equation (11);
end
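The final step of Algorithm 1 converts the crank torque into motor power. A minimal sketch of that conversion is given below, assuming a single lumped efficiency `eta` for the gearbox and motor and a constant crank speed derived from n strokes per minute. The direction-dependent treatment of losses (dividing by `eta` when the motor drives the crank, multiplying when the crank drives the motor) is an illustrative assumption, not the exact form of Equation (11).

```python
import math


def motor_power(crank_torque, n, eta=0.9):
    """Motor power (W) for a crank torque (N*m) at n strokes per minute.

    eta is a hypothetical lumped gearbox-and-motor efficiency in (0, 1].
    """
    omega = 2.0 * math.pi * n / 60.0   # crank angular velocity, rad/s
    shaft_power = crank_torque * omega
    if shaft_power >= 0.0:
        return shaft_power / eta       # motor must supply the losses
    return shaft_power * eta           # losses reduce the power fed back


def power_curve(torques, n, eta=0.9):
    """Apply motor_power to a sequence of crank torque samples."""
    return [motor_power(tq, n, eta) for tq in torques]
```

Treating the two torque directions differently keeps negative (regenerative) power values in the curve, which matter for the gas locking and parting rod conditions discussed later.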

Generation for Faulty Working States
Based on the mathematical model of the SRPS, the MPCs of five faulty working states are analyzed in this subsection. The causes of their characteristics and their representation in the model are discussed in detail.

Traveling Valve Leakage
After repeated opening and closing, the traveling valve wears out so that the oil in the sucker rod leaks into the chamber at a rate related to $P_p(t)$. To simulate this state, the leaked oil is divided into a static part and a dynamic part. The pressure and the flow rate are the same as in the normal state when the traveling valve is open. When the traveling valve is closed, the static part is caused by the pressure difference between the top and bottom of the plunger, and the dynamic part is caused by the motion of the plunger. The leaked oil is obtained as the sum of the static part and the dynamic part.

Insufficient Liquid Supply
After a long extraction period, the reservoir formation pressure usually decreases, resulting in insufficient fluid supply capacity. In this working state, the submergence pressure P s is less than the pressure under the normal working state. There will be less oil flowing into the pump, and the traveling valve will open in a shorter time. To simulate this working state, only the submergence pressure P s needs to be set as a smaller value.

Gas Affected
During the oil production process, the remaining free gas accumulates due to the sealing performance of the pump. In the upstroke, the remaining free gas will slow down the reduction of pressure in the pump, which in turn delays the opening of the standing valve, resulting in a low fluid intake. Analogously, in the downstroke, the remaining free gas will also delay the opening of the traveling valve because of the deferred increase of pressure in the pump. In the simulation, the initial mass of the free gas in the pump and the gas mass ratio of flows are set as higher proportions than the normal working state.

Gas Locking
This working state is a special case of the gas-affected state. When the remaining free gas accumulates to a threshold, the pressure $P_p(t)$ stays greater than the submergence pressure $P_s$, so that the valves remain closed without any inflow or outflow. To simulate this state, the pressure is constrained to $P_s \le P_p(t) \le P_d$.
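The valve logic implied by this description can be sketched as follows. The opening rules used here (the standing valve opens when the pump pressure falls below the submergence pressure, and the traveling valve opens when the pump pressure exceeds $P_d$) are an assumption inferred from the gas locking condition above, and the function and argument names are hypothetical.

```python
def valve_states(p_pump, p_sub, p_dis):
    """Return (standing_open, traveling_open) for one pressure sample.

    p_pump : pump chamber pressure P_p
    p_sub  : submergence pressure P_s
    p_dis  : upper pressure bound P_d
    """
    standing_open = p_pump < p_sub    # inflow from the well into the pump
    traveling_open = p_pump > p_dis   # outflow from the pump past the plunger
    return standing_open, traveling_open


def is_gas_locked(pump_pressures, p_sub, p_dis):
    """True if both valves stay closed for every sample of one stroke."""
    return all(valve_states(p, p_sub, p_dis) == (False, False)
               for p in pump_pressures)
```

A stroke whose pressure samples all lie between `p_sub` and `p_dis` is reported as gas locked, matching the constraint used in the simulation.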

Parting Rod
The rod string may break because of corrosion, mechanical vibration, and friction in the down-hole after a long period of continuous work. In this working state, the polished rod load is only related to the rod weight, vibration, and friction above the breakpoint because the pump is detached from the rod string. Therefore, $P_p(t)$ is equal to 0, and only the part above the breakpoint needs to be calculated in Equation (5) during the simulation.

Domain Adaptation Based on Generated Motor Power Curves
Although the labeled MPCs are supplemented with the mechanism model, the distribution discrepancies between generated and collected MPCs limit the diagnosis accuracy. A novel domain adaptation diagnostic network combining the mechanism analysis is proposed in this section to tackle this issue. Considering the load characteristics of the SRPS in one period, pseudo-labels are preliminarily assigned to the collected MPCs. Then, the conditional and marginal probability distributions of the generated and collected MPCs are aligned by distilling domain-invariant features. This allows the knowledge acquired from the generated MPCs to facilitate the diagnosis of the collected MPCs. The detailed architecture and training process of the method are discussed in the following subsections.

Problem Setting
Benefiting from the dynamic mechanism analysis, the MPCs of different working conditions are generated. However, data-driven diagnosis methods trained with such generated curves may fail to diagnose actual curves, even though the waveforms show the same trends under the different conditions. The simplification and idealization in the mechanism simulation are likely the main reason for the misdiagnosis. Take the vibration simulation of the sucker rod as an example. The rod is divided into several individually connected segments to simulate elastic deformation and vibration. The simulated MPCs with different numbers of segments and a similar actual MPC are illustrated in Figure 2. Dividing the rod into a different number of segments changes the vibration analysis of the rod, which in turn affects the transformation of the DCs to the MPCs. Other simplified features, e.g., gearbox vibrations, liquid flow rate, and crankshaft speed, collectively contribute to the difference between the generated and actual curves.
Inspired by the idea of domain adaptation, which can project the data from various domains into a shared subspace, this section proposes an innovative fault diagnosis approach. It can leverage the knowledge learned from the generated MPCs with labels to facilitate diagnosing actual unlabeled MPCs. In order to promote the features between the generated and actual MPCs to be aligned, a domain classifier is built for marginal distribution adaptation. What is more, a conditional distribution discrepancy metric is employed for conditional distribution adaptation. Therefore, the proposed domain adaptive method not only considers all in-domain features as a whole for feature alignment but also ensures category features of different domains to be aligned.

Network Architecture
According to the above description, the generated MPCs with labels are denoted as the source domain $D_s = \{(\chi_i^s, y_i^s)\}_{i=1}^{n_s}$, covering six categories of working conditions, and the actual MPCs are denoted as the target domain $D_t = \{\chi_j^t\}_{j=1}^{n_t}$ without labels. The proposed framework, which leverages the knowledge learned from $D_s$ to facilitate diagnosis on $D_t$, is illustrated in Figure 3. Overall, the methodology contains a feature generator network $f_g$ with parameters $\theta_g$, a domain classifier $f_d$ with parameters $\theta_d$, a label classifier $f_c$ with parameters $\theta_c$, a conditional distribution discrepancy metric $M$, and a pseudo-label learning layer $f_P$. The details of the methodology are as follows:

Pseudo-Label Learning Layer
Different from the marginal distribution, which does not require the category label, the conditional distribution needs labels to adapt the category-level discrepancy. Unfortunately, the samples in the target domain are unlabeled. Many existing approaches assign pseudo-labels to these samples based on maximum predictive probability, clustering algorithms, or pre-trained models trained with source domain samples. However, since the initial pseudo-labels learned by these methods are inaccurate, errors caused by incorrect labels accumulate during training, with negative effects on fault diagnosis. In this respect, a novel pseudo-label learning method combining the mechanism analysis of the SRPS and the source domain samples is proposed to tackle this problem. The mechanism analysis of the MPCs under different working conditions is summarized as follows:
1. Normal working condition $Y_0$: The MPCs of the upstroke and the downstroke are relatively full with similar peaks.
2. Traveling valve leakage $Y_1$: The leakage will delay the increase of the pressure during the upstroke, resulting in the delayed opening of the standing valve. Therefore, the power of the upstroke will be less than in the normal working condition.
3. Insufficient liquid supply $Y_2$: The pump chamber cannot be filled in the upstroke. During the downstroke, the load reduces in the initial stage of the opening of the traveling valve. The load increases rapidly when the plunger hits the oil interface, resulting in double peaks in the power curve. The average value of the MPC is also lower than the normal power.
4. Gas affected $Y_3$: Similar to the condition of insufficient liquid supply, the pump chamber also cannot be filled because of the excess gas dissolved in the oil, resulting in lower average power in the downstroke. The difference is that more gas is present to act as a buffer for the plunger, so there is no second peak in the downstroke.
5. Gas locking $Y_4$: The gas in the chamber makes the pressure insufficient to open the standing and traveling valves, so the oil cannot be adequately discharged. During the downstroke, the motor power curve will have negative values due to the gravity of the oil in the sucker rod.
6. Parting rod $Y_5$: The motor load is mainly caused by the crank and the weight of the rod above the breakpoint. During the upstroke, the energy stored in the crank is more than that required to lift the remaining rod, resulting in apparent negative power in the MPC.
On the basis of the above analysis, the mechanistic pseudo-labels of the source domain $\{\bar{y}_i^s\}_{i=1}^{n_s}$ and the target domain $\{\bar{y}_j^t\}_{j=1}^{n_t}$ are obtained as shown in Figure 4, where $P_u$ and $P_d$ denote the power points of the upstroke and the downstroke in one stroke, respectively, $N_u$ and $N_d$ denote the numbers of points in the upstroke and the downstroke, and $a_1$ and $a_2$ are set to 0.9 and 1.1, respectively.
With the help of mechanism analysis, the accuracy of the initial pseudo-labels is improved. However, as the training continues, the accuracy of the classifier gradually exceeds that of the mechanism analysis. Therefore, we design the pseudo-label learning layer based on a comparison, on the data of the source domain, between the accuracy of the mechanism analysis $P_m$ and the accuracy of the classifier in the current epoch $P_c$. Both $P_m$ and $P_c$ are calculated with the cross-entropy loss function $L(\cdot,\cdot)$. The final pseudo-labels in the target domain $\hat{y}_j^t$ are then selected according to this comparison. The target domain with the pseudo-labels is defined as $\hat{D}_t = \{(\chi_j^t, \hat{y}_j^t)\}_{j=1}^{n_t}$.
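The selection logic of the pseudo-label learning layer can be sketched as below. Note the simplification: this sketch compares plain classification accuracy on the labeled source data, whereas the paper computes $P_m$ and $P_c$ with the cross-entropy loss. The function name and arguments are hypothetical.

```python
import numpy as np


def select_pseudo_labels(mech_src, clf_src, y_src, mech_tgt, clf_tgt):
    """Choose target pseudo-labels from the more accurate label source.

    mech_src, clf_src : mechanism-based / classifier labels on source samples
    y_src             : true source labels
    mech_tgt, clf_tgt : mechanism-based / classifier labels on target samples
    Returns (labels, name_of_chosen_source).
    """
    y_src = np.asarray(y_src)
    acc_mech = float(np.mean(np.asarray(mech_src) == y_src))
    acc_clf = float(np.mean(np.asarray(clf_src) == y_src))
    # Early in training the mechanism analysis usually wins; once the
    # classifier becomes more accurate on the source data, switch to it.
    if acc_clf > acc_mech:
        return np.asarray(clf_tgt), "classifier"
    return np.asarray(mech_tgt), "mechanism"
```

Evaluating both label sources on the source domain is what makes the comparison possible, since only the source samples carry ground-truth labels.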

Feature Generator Network
Inspired by the strong nonlinear representation capability of the convolutional neural network (CNN), a 1-D CNN, which is well suited to time-series signals, is selected to extract features from the generated and actual power curves. The feature generator is implemented as a 3-layer 1-D CNN followed by a fully connected (FC) layer, whose structure is detailed in Table 1.

Label Classifier
The label classifier aims to recognize the working condition and to direct the feature generator to retain the information of each working condition. As illustrated in Figure 3, the label classifier consists of one hidden layer with 256 neurons and one output layer with Softmax as the activation function. The dropout ratio is set to 0.5. For the source domain $D_s = \{(\chi_i^s, y_i^s)\}_{i=1}^{n_s}$, the objective function of the classifier is the classification loss on the labeled source samples. It is noteworthy that the classifier of the target domain is not involved in the backpropagation. It is only used for pseudo-label learning, and its parameters are kept the same as those of the source domain label classifier.

Domain Classifier
In order to direct the feature generator to extract domain-invariant features, a domain classifier $f_d$ is designed following the idea of DANN [21]. The $f_d$ consists of three FC layers with 1024, 256, and 1 neurons. Its output is a binary prediction that should be 0 for all target samples and 1 for all source samples. The objective function $L_d$ is defined in terms of the domain label $\dot{y}_i$.
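A common choice for such a domain objective, and the one used in DANN-style training, is the binary cross-entropy. The sketch below assumes that form with domain label 1 for source samples and 0 for target samples, as stated above. The function name and the clipping constant `eps` are illustrative assumptions.

```python
import numpy as np


def domain_loss(p_source, p_target, eps=1e-12):
    """Binary cross-entropy of the domain classifier.

    p_source : predicted probability of "source" for source samples (label 1)
    p_target : predicted probability of "source" for target samples (label 0)
    """
    p_s = np.clip(np.asarray(p_source, dtype=float), eps, 1.0 - eps)
    p_t = np.clip(np.asarray(p_target, dtype=float), eps, 1.0 - eps)
    loss_s = -np.log(p_s)          # penalize source predicted as target
    loss_t = -np.log(1.0 - p_t)    # penalize target predicted as source
    return float(np.concatenate([loss_s, loss_t]).mean())
```

When the classifier cannot tell the domains apart and outputs 0.5 everywhere, the loss equals ln 2, which is the value the adversarial training pushes toward.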

Conditional Distribution Discrepancy Metrics
Regarding all the samples in one domain as one class, the marginal distributions can be well aligned by the domain classifier. However, adapting only the marginal distributions is insufficient, since the discriminative hyperplane may differ across domain tasks. Conditional distribution adaptation, which aims to match the discriminative structures between source and target data, is also indispensable and highly effective. With the aid of the pseudo-label learning layer, pseudo-labels for the target data can be preliminarily supplied. Defining $C$ as the total number of categories and the category $c \in \{Y_0, Y_1, \dots, Y_5\}$, the maximum mean discrepancy (MMD) is used as the distance index to measure the discrepancy between the conditional distributions of $D_s$ and $\hat{D}_t$, where $\|\cdot\|_{\mathcal{H}}$ denotes the norm in the reproducing kernel Hilbert space (RKHS).
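A class-wise MMD of this kind can be sketched as follows. The Gaussian kernel, its width `gamma`, the biased estimator, and the averaging over the C classes are assumptions chosen for illustration; the exact kernel and normalization of the metric $M$ are not specified in this section.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(xs, xt, gamma=1.0):
    """Biased estimate of the squared MMD between two feature sets."""
    return float(rbf_kernel(xs, xs, gamma).mean()
                 + rbf_kernel(xt, xt, gamma).mean()
                 - 2.0 * rbf_kernel(xs, xt, gamma).mean())


def conditional_mmd(fs, ys, ft, yt_pseudo, num_classes=6, gamma=1.0):
    """Average class-wise squared MMD between source and target features.

    fs, ft    : feature arrays of shape (n_s, d) and (n_t, d)
    ys        : source labels; yt_pseudo : target pseudo-labels
    """
    total = 0.0
    for c in range(num_classes):
        xs, xt = fs[ys == c], ft[yt_pseudo == c]
        if len(xs) == 0 or len(xt) == 0:
            continue  # skip classes missing from either domain
        total += mmd2(xs, xt, gamma)
    return total / num_classes
```

Because the target labels are pseudo-labels, the value of this metric is only as reliable as the pseudo-label learning layer that supplies them.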

Optimization
According to the network losses discussed above, the optimization objective of the proposed MADAN combines the label classification loss, the domain classification loss $L_d$, and the conditional distribution discrepancy, where the hyperparameters $\lambda_1$ and $\lambda_2$ are the penalty coefficients of the different loss functions. A gradient reversal layer (GRL) [33] is placed before the domain classifier, which multiplies the gradient of $L_d$ by a negative factor during backpropagation. The network is updated with the adaptive moment estimation (Adam) optimizer with the learning rate τ set to 0.001. The parameters $\theta_g$, $\theta_c$, and $\theta_d$ are updated simultaneously at each step. With these updates, the extracted features become domain-invariant and discriminative simultaneously. The label classifier can therefore predict labels not only for the generated MPCs but also for the collected MPCs.
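The behavior of the gradient reversal layer can be shown with a framework-free sketch. In a deep learning framework this would normally be written as a custom autograd function; the class below only mimics the forward and backward rules with NumPy arrays, and the name `GradientReversal` and its `factor` argument are illustrative.

```python
import numpy as np


class GradientReversal:
    """Identity in the forward pass, sign-flipped gradient in the backward pass."""

    def __init__(self, factor=1.0):
        self.factor = factor  # positive scale applied to the reversed gradient

    def forward(self, features):
        # Features pass through unchanged to the domain classifier.
        return features

    def backward(self, grad_output):
        # The gradient of the domain loss is negated before it reaches the
        # feature generator, so the generator learns to confuse the classifier.
        return -self.factor * grad_output
```

This is why a single backward pass can train the domain classifier to minimize $L_d$ while the feature generator is pushed in the opposite direction.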

Industrial Experiments
A series of industrial experiments are conducted in this section with the MPCs collected in the SRPS with self-developed equipment to verify the feasibility of the proposed mathematical model and the diagnosis method in practical application scenarios. The MPCs generated with the mathematical model are discussed in terms of their mechanical characteristics and compared with the collected MPCs under different working conditions. Moreover, we compare the MADAN with several baseline methods in the field of DA to demonstrate the effectiveness of the improvement in practical applications.

Data Collection
As illustrated in Figure 5, the portable device developed by the authors' team at Northeastern University acquires the MPCs by collecting the three-phase current and voltage of the motor. The device consists of five core units as follows:
1. Power acquisition unit: calculates the motor power with the help of the ATT7022B.
2. Transmission unit: enables remote query and parameter adjustment on mobile phones and computers.
3. Human-machine interaction unit: a touch screen is equipped to facilitate parameter entry, data query, and data display.
4. Data storage unit: stores the collected and calculated data and parameters.
5. Data processing unit: with the help of the XC7Z020CLG400 chip, it implements the core calculations, including the trained diagnostic model, device operation, etc.
After long-term practice, 300 groups of MPCs are collected from seven oil wells with the same mechanical parameters, as shown in Table 2. All simulations are implemented in MATLAB and the PyTorch framework and conducted on a workstation with a Core i7-9700K CPU @ 3.60 GHz and a GeForce RTX 2080 Ti GPU with 11 GB of memory.

Validation of the Generated Motor Power Curves
According to the working and mechanical parameters listed in Table 2, the analysis results of six working conditions generated with the model in Section 2 are illustrated in Figures 6-11. Each working state contains four sub-figures. The first sub-figures express the variation curves of the crucial variables in the pump containing the chamber volume, oil and water volume, the pressure, and the flow rate through the standing valve. The second and third sub-figures illustrate the generated DCs and MPCs under different working conditions. The fourth sub-figures are typical MPCs selected from the collected samples in practical scenarios. As illustrated in the figures, the variation of essential parameters is consistent with the settings in Section 2.2. The characteristics embodied in the generated DCs under the different working conditions conform to the historical experience learned from the extensive data collected in different practical application scenarios. The generated and measured power curves have similar trends, and their characteristics are consistent with the previous mechanical analysis in Section 3.2.1. These results verify the rationality of the model on the mechanism analysis.
To validate the model more thoroughly through quantitative analysis, 50 samples of the MPCs under each working condition are generated as the training data by adjusting the downhole parameters, and the 300 groups of collected samples are diagnosed as the testing data. The diagnostic method employs the mechanical feature extraction combined with the conditional random field (MCRF) described in [6]. The experimental result is presented in Figure 12, where the diagnostic accuracy reaches 73% without using any collected samples for training. This demonstrates the effectiveness of the generated data. However, the diagnostic accuracy does not meet the industrial requirement, for two main reasons. On the one hand, limited by the insufficiency of the mechanism feature extraction method, some MPCs of critical working conditions are difficult to identify. On the other hand, the generated samples deviate from the distribution of the actual samples because of the model's simplifications and interference in the data acquisition.
Moreover, the collected data are divided into two parts, where 240 groups are randomly selected as the training set and the remaining 60 groups form the testing set. To comprehensively investigate the generated data, we set up various scenarios in which different amounts of generated samples are added to the training set of collected data to monitor the working conditions in the SRPS. Three methods, namely 1-D CNN, CNN, and MCRF, are selected from the three perspectives of time series, image, and mechanism to conduct the experiments. The diagnostic results are shown in Table 3.
As illustrated in Table 3, the diagnosis accuracy presents an upward trend as the generated samples are added to the original training set. Machine learning is more capable of extracting features than mechanistic feature analysis, and the time-series-based approach is more applicable to the MPCs than treating the curves as images.

Diagnosis Based on Domain Adaptation
In this section, the proposed MADAN is employed to minimize the distribution discrepancy across domains in practical application scenarios. Since the new conditional metric and the pseudo-label learning strategy are added to the objective function for distribution alignment, a convergence analysis is needed to illustrate the stability and transfer ability. As shown in Figure 13a, the gap in diagnostic accuracy between the source and target domains gradually decreases as the optimization proceeds, which illustrates the effectiveness of the feature generator network in bridging the distribution discrepancy. In addition, the accuracy curves converge rapidly and finally approach 1, which demonstrates the effectiveness of this method in industrial diagnosis. Furthermore, the training losses, including the classification loss (classifier_loss), the domain classifier error (adversarial_loss), and the conditional distribution loss (distance_loss), are plotted in Figure 13b. The classification loss gradually decreases as the training epochs increase and finally approaches 0. The back-and-forth oscillation of the adversarial loss illustrates that the domain classifier effectively guides the feature generator network to explore domain-invariant features. This is because the feature generator network keeps improving its information extraction capability to reduce the classification error, which in turn pushes the domain classifier to keep improving its ability to discriminate domain features, so that the feature generator network is inhibited from retaining domain-related information. The conditional distribution loss shows a gradually declining trend, which demonstrates that the conditional distribution discrepancy is gradually diminishing.
For comparison, several state-of-the-art methods are considered alongside the MADAN, including 1-D CNN, DANN [21], DATLN [22], DTN [30], and MiDAN [24]. To make a fair comparison, all the compared methods adopt the same 1-D CNN architecture to extract features. The details of the compared methods are presented in Table 4, where MDA denotes marginal distribution alignment and CDA denotes conditional distribution alignment. The diagnostic result is an average of five random tests, where the testing set is 60 groups randomly split from the 300 groups of collected MPCs. To comprehensively show the capabilities of the proposed method, three evaluation indicators, namely Accuracy, F1-score, and the Matthews correlation coefficient (MCC), are selected to assess the performance of each method. The results are listed in Table 5. As the results show, the MADAN performs better than the other diagnostic methodologies on all evaluation indicators. In addition, t-distributed stochastic neighbor embedding (t-SNE) is employed to give visual insight into the distribution discrepancy of the features distilled by the different methods from the generated and collected MPCs. The t-SNE visualizations of the original data and of the features after alignment by the methods mentioned above are illustrated in Figure 14.
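For reference, the MCC indicator used above can be computed from confusion-matrix counts as sketched below. This is the standard binary form; how it is extended to the six working conditions (for example, a multiclass generalization or one-vs-rest averaging) is not stated in this section, so the sketch should be read as an illustration of the indicator rather than the exact evaluation code.

```python
import math


def mcc_binary(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts.

    Returns a value in [-1, 1]; 0.0 is returned when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike accuracy, the MCC stays near 0 for a classifier that ignores a minority class, which makes it a useful complement when the working conditions are unevenly represented.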
From Table 5 and Figure 14, several observations can be made. Firstly, in terms of classification performance, there are far fewer outlier source samples with the help of transfer learning. In addition, the MADAN clusters samples of the same category and separates different categories better than the other methods. Secondly, in terms of marginal distribution alignment, adversarial training is superior to the MMD, as seen by comparing Figure 14c,e with Figure 14b,d, respectively. Thirdly, in terms of conditional distribution alignment, the data from different domains within each category are more evenly distributed, as seen by comparing Figure 14d-f with Figure 14a-c. Fourthly, in terms of pseudo-label learning, although MiDAN achieves good results, our MADAN performs better within the same number of iterations because the assisting mechanism analysis yields higher pseudo-label accuracy during the initial training. From the analysis and discussion above, it can be seen that the proposed MADAN can effectively bridge the distribution discrepancy, resulting in better diagnosis performance in practical application scenarios.

Conclusions
The motor power as an easily collected signal contains information about the working status of the SRPS. In order to tackle the issue of an insufficiently labeled MPC database due to the early stage of the electrical parameters research on the SRPS, this paper has proposed an unsupervised fault diagnosis methodology named MADAN to leverage the generated MPCs of different working conditions to diagnose the actual MPCs. Firstly, an integrated dynamics mathematical model has been established to generate the MPCs under different working conditions. Secondly, a mechanism-assisted pseudo-label learning strategy and a conditional distribution discrepancy metric have been added to the adversarial domain adaptation model to bridge the marginal and conditional distribution discrepancy of the generated and collected MPCs. Finally, a set of actual MPCs collected by self-developed portable devices has been utilized to verify the feasibility of the proposed methodology. The experimental results indicated that the generated and the actual MPCs had similar trends, and the MADAN can effectively utilize the generated and actual unlabeled MPCs to realize the power fault diagnosis of oil wells.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: