Fault Diagnosis Approach of Main Drive Chain in Wind Turbine Based on Data Fusion

Abstract: The construction and operation of wind turbines have become an important part of the development of smart cities. However, faults in the main drive chain often cause wind turbine outages, which seriously affect the normal operation of wind turbines in smart cities. To overcome the shortcoming of commonly used main drive chain fault diagnosis methods, which rely on a single data source, a fault feature extraction and fault diagnosis approach based on data source fusion is proposed. By fusing two data sources, the supervisory control and data acquisition (SCADA) real-time monitoring system data and the main drive chain vibration monitoring data, the fault features of the main drive chain are jointly extracted, and an intelligent fault diagnosis model for the main drive chain in wind turbines based on data fusion is established. The diagnosis results of actual cases confirm that the fault diagnosis model based on the fusion of two data sources can locate faults of the main drive chain in the wind turbine accurately and provide solid technical support for the highly efficient operation and maintenance of wind turbines.


Introduction
The smart city concept is a leading trend in urban development today, and crucial technologies such as the Internet of Things (IoT), renewable energy, and smart grids are integrated to build the intelligent energy system of a smart city [1][2][3][4][5]. To increase the share of renewable energy in electricity generation and avoid the challenges caused by the centralized construction, centralized grid connection, and long-distance transmission of large-scale wind farms far away from the load center, distributed wind turbines are being widely used in the development of smart cities [6][7][8]. The operation and maintenance of wind turbines distributed across a smart city are more difficult than those of wind turbines in a centralized large-scale wind farm. In wind turbines, electrical components have the highest fault frequency, followed by the main drive chain components. However, electrical component faults can be located quickly, and the recovery time is short. By comparison, main drive chain component faults take longer to locate. Because these components are large and heavy, replacing them is cumbersome and recovery takes more time. The outage of a wind turbine before the faulty main drive chain components are replaced severely affects operational reliability. Wind turbines are therefore in urgent need of economical, accurate, and highly efficient remote intelligent operation and maintenance services, and it is imperative to establish a high-precision fault diagnosis system for the wind turbine main drive chain [9][10][11].
To improve the accuracy of main drive chain fault diagnosis in wind turbines, scholars worldwide have carried out extensive research. Most researchers raise out-of-limit alarms for the main drive chain based on the real-time monitoring data of the wind farm supervisory control and data acquisition (SCADA) system [12][13][14][15][16][17] so as to maintain the normal operating state of wind turbines. Based on the real-time monitoring data of the SCADA system and the out-of-limit alarms of the main drive chain, a fault diagnosis framework for the wind turbine is established, and the out-of-limit diagnosis indexes of typical faults are given [18]. High-frequency SCADA data are used to extract core technical indicators to improve the performance of wind turbines [19]. Further, the real-time monitoring data of the SCADA system are used to establish a fault prediction model of wind turbines [20]. More approaches based on the analysis of SCADA data are in use today [21,22]. For instance, a data-driven method is proposed to diagnose the pitch fault of wind turbines [23]. The authors of reference [24] also use a data-based prognostic system that needs no sensors beyond those of the SCADA system. Reference [25] summarizes various types of real-time monitoring systems of wind turbines and states the advantages and disadvantages of various methods for fault monitoring of wind turbines based on the real-time monitoring data of the SCADA system. To improve the accuracy of main drive chain fault diagnosis further, researchers began to use dedicated main drive chain fault diagnosis systems for high-frequency vibration data acquisition and fault analysis of the main components [26]. Damage can be detected based on the differences between modified modal displacements in the undamaged and damaged states [27].
Based on the high-frequency vibration signal analysis of the main drive chain fault diagnosis system in reference [28], the gearbox faults under nonstationary speed and load were statistically analyzed. Compared with the out-of-limit alarm signal of the low-frequency real-time monitoring of the SCADA system, the high-frequency signal analysis can locate gearbox faults more accurately. In references [29][30][31][32][33][34], the high-frequency resonance vibration signal of the bearing is extracted by wavelet analysis. Combined with the classification ability of the support vector machine (SVM) and the dynamic time series processing ability of the hidden Markov model, a new bearing fault diagnosis scheme is proposed to improve the accuracy of bearing fault diagnosis. In reference [35], wavelet packet energy entropy is combined with empirical mode decomposition (EMD) to enhance the noise elimination of original vibration data and improve the accuracy of diagnosis. References [36][37][38] proposed, respectively, a typical fault diagnosis method of the gearbox based on wavelet decomposition and support vector machine classification, a wind turbine bearing vibration fault diagnosis method based on noise suppression, and a fault diagnosis method for the planetary gearbox of the main drive chain of the wind turbine under non-stationary conditions based on adaptive optimal kernel time-frequency analysis. References [39][40][41] show other methods of fault diagnosis based on the wavelet transform. They all show that more effective typical fault features can be extracted from the high-frequency vibration signals of a dedicated main drive chain vibration fault diagnosis system. Reference [42] began to introduce the influence of different working conditions on wind turbine fault analysis. Recently, more scholars have used advanced algorithms such as neural networks and machine learning for fault diagnosis in wind turbines [43][44][45][46].
References [47,48] compare the advantages and disadvantages of the neural network model and the traditional condition monitoring analysis model in the fault diagnosis of wind turbines. Some other methods perform fault diagnosis using different technologies such as thermal imaging [49]. They are non-invasive but limited to specific tested objects and are less universal than the commonly used SCADA and vibration fault diagnosis systems. For instance, reference [50] uses thermal imaging to evaluate the condition of only angle grinders in wind turbines. Moreover, wind turbines usually operate in outdoor environments with variable temperatures, and, therefore, an evaluation of the wind turbine based only on thermal imaging is likely to be disturbed by the operating conditions.
Reference [51] uses ultrasonic reflectometry for detection, but the targeted fault is limited to the lubrication failure of the wind turbine bearing.
The main drive chain vibration fault diagnosis system and wind turbine SCADA system are two of the most popular diagnosis systems for wind turbines. However, their suppliers are independent of each other so the data of these two systems cannot be shared, and, therefore, each of the research projects above is based on one of these two data sources for fault diagnosis.
In this paper, a data interface between the two systems is established through the technical transformation of the two systems by the wind farm owners. An approach of using a data fusion method of two types of data to extract fault features of the wind turbine main drive chain is proposed and paves a new way to improve the accuracy of fault diagnosis of the main drive chain in the wind turbine.
The contributions of this study are as follows: (1) Proposing a fault diagnosis strategy for the main drive chain in wind turbines based on data fusion, considering both the real-time monitoring data from the SCADA system and the high-frequency vibration data of the main drive chain. (2) Proposing a detailed method to classify and extract the fault features based on two types of data and a method for fault diagnosis using the deep autoencoder model. (3) Conducting case studies in a real wind farm to verify the effectiveness of the proposed strategy, and analyzing the experimental results and the benefits for the highly efficient operation and maintenance of wind turbines.

Fault Features Extraction of Wind Turbine Main Drive Chain Based on Data Fusion
The entire process of the proposed data-fusion based method is given in Figure 1. The whole process can be divided into two steps: fault features extraction, and fault diagnosis based on data fusion. Details of each step are described separately in Sections 2 and 3.

This section demonstrates the method of fusing two types of real-time monitoring systems to extract the features of main drive chain faults. This method takes advantage of the globality of the wind turbine SCADA system and the pertinence and depth of the main drive chain vibration fault diagnosis system for the real-time monitoring of the wind turbine main drive chain.

Process of Fault Features Extraction of the Main Drive Chain in Wind Turbine
The supervisory control and data acquisition (SCADA) system of a wind farm comprehensively collects real-time operating status data of each wind turbine in the wind farm, with a sampling interval of 1 s. Specifically, it monitors the parameters and operating status of each operation control module in a wind turbine, including pitch, yaw, gearbox, generator, hydraulic pump station, nacelle, converter, power grid, safety chain, torque, main shaft, tower base, anemometer, and other modules. By analyzing and judging each module's operating status parameters, faults, and trends, normal operation of the turbine is maintained through over-limit alarms and over-limit shutdowns. However, a fault has already developed to a certain extent by the time the parameters and trends of the wind turbine modules exceed their limits, and, therefore, how to trigger early warning has become a core concern of wind farm owners. In recent years, many wind farm owners have installed a high-frequency vibration signal acquisition system specifically for the main drive chain in order to improve the accuracy of fault diagnosis of the main drive chain of the wind turbine. They use acceleration sensors and other high-speed sensors to collect the high-frequency vibration signals at the main points of the main drive chain and perform time-domain analysis, frequency-domain analysis, and time-frequency-domain analysis to extract more detailed fault features and locate main drive chain faults more accurately. However, the added high-frequency vibration signal acquisition system sometimes produces noisy data because it is susceptible to interference from different operating conditions, such as the yaw state of the unit, the rotational speed, and the icing of blades or anemometers.
The added high-frequency vibration signal acquisition system cannot take the operating status of the wind turbine into account or remove the noise of the vibration data in a well-targeted manner. These deficiencies make it difficult for a fault eigenvector extracted from a single data source to reflect the fault state of the main drive chain accurately and comprehensively, which in turn reduces the accuracy of the diagnosis results. Therefore, this paper proposes to technically transform two systems, the main drive chain vibration fault diagnosis system and the wind turbine SCADA system, to establish a data interface between them, and to fuse the two kinds of data to extract the fault features of the wind turbine main drive chain. On the one hand, the wind turbine SCADA system can monitor the overall situation of the wind turbine in real time and is used to extract the low-frequency vibration signals related to drive chain faults, the rotational speed of the main shaft and generator, and the operation control mode of the wind turbine. The last two types of signals, the rotational speed of the main shaft and generator and the operation control mode of the wind turbine, are highly related to the vibration mode of the main drive chain and provide supplementary knowledge for denoising the high-frequency vibration signals of the main drive chain. On the other hand, taking advantage of the pertinence and depth of the main drive chain vibration fault diagnosis system for real-time monitoring of the wind turbine main drive chain, the high-frequency vibration signals of all added measurement points of the main drive chain can be extracted, and the two types of signals, the rotational speed of the main shaft and generator and the wind turbine operation control mode, are used to classify the background noise of the high-frequency vibration signals. The sensors used and the way they are installed are shown in Figures 2 and 3.
High-frequency vibration signals are clearly different between different speed ranges of the main shaft and generator and between the power-up and power-down operation intervals of the wind turbine. The effective removal of background noise is beneficial to extracting the high-frequency vibration features of the main drive chain itself.
When using the SCADA system to extract low-frequency vibration signals related to drive chain faults, it is also necessary to use two types of signals, the speed of the main shaft and generator of the wind turbine and the operation control mode of the wind turbine, to classify its background noise. Between different speed ranges of the main shaft and generator and between the power-up and power-down operation intervals of the wind turbine, background noises are classified and removed to extract the low-frequency vibration characteristics of the main drive chain itself. Table 1 shows the types and causes of typical faults in the main drive chain of wind turbines. For the typical types of wind turbine main drive chain faults, the low-frequency vibration signal and high-frequency vibration signal of the main drive chain are denoised, respectively, according to the fault features extraction flowchart given in Figure 4. On the basis of this procedure, it is necessary to extract the low-frequency fault features and high-frequency fault features of each typical fault in the main drive chain, and then eliminate redundant fault features to reduce the dimensionality of the fault features and form eigenvectors that characterize the typical faults of the main drive chain.

Fault Features Extraction of Wind Turbine Main Drive Chain Based on Data Fusion
Among them, the noise reduction of low-frequency and high-frequency vibration signals of the main drive chain, the extraction of low-frequency and high-frequency fault features of typical faults, and the dimensionality reduction of fault features are the core procedures of fault features extraction of the wind turbine main drive chain based on data fusion:
(1) Noise reduction of low-frequency and high-frequency vibration signals of the main drive chain
The nacelle of the wind turbine still shakes and vibrates during normal operation, even when the main drive chain is not vibrating. The frequency and amplitude of this shaking and vibration are closely related to the rotational speed of the main shaft and generator and the operation control mode of the wind turbine. Experimental data indicate that the relationship is nonlinear. To simplify the calculation, this article first segments the rotational speed of the main shaft and generator of the wind turbine by magnitude and classifies the operation control mode of the wind turbine into power-up and power-down operation. Based on the combination of these two classifications, we classify the nacelle shake and vibration during normal operation of the wind turbine and give the frequency and amplitude of the nacelle shake and vibration background noise for each combination of rotational speed range of the main shaft and generator and power-up or power-down operation of the wind turbine.
Assume that the background noise of the nacelle shaking and vibration in the first rotational speed range of the main shaft and generator, with the wind turbine in power-up operation, is $x_1(t)$. The Fourier transform of $x_1(t)$ is:
$$X_1(f) = \int_{-\infty}^{+\infty} x_1(t)\, e^{-j 2 \pi f t}\, dt$$
This will be used as the background noise of the low-frequency and high-frequency vibration signals of the main drive chain in the first rotational speed range of the main shaft and generator under wind turbine power-up operation. Before extracting the low-frequency and high-frequency fault features, these background noises are eliminated separately.
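The condition-dependent noise removal in item (1) can be illustrated with a short sketch. The text does not state how the background noise is eliminated once its spectrum is known, so magnitude spectral subtraction is used here purely as an assumed stand-in. The function names, the speed segment edges, and the synthetic signals are all hypothetical.

```python
import numpy as np

def condition_key(speed_rpm, power_now, power_prev, speed_edges):
    """Return (speed segment index, 'up' or 'down') for one SCADA sample."""
    segment = int(np.digitize(speed_rpm, speed_edges))
    mode = "up" if power_now >= power_prev else "down"
    return segment, mode

def noise_magnitude(healthy_windows):
    """Average magnitude spectrum of equal-length healthy-state windows."""
    windows = np.asarray(healthy_windows, dtype=float)
    return np.abs(np.fft.rfft(windows, axis=1)).mean(axis=0)

def remove_background(signal, noise_mag):
    """Assumed method: subtract the reference magnitude, keep the measured phase."""
    spectrum = np.fft.rfft(signal)
    clean_mag = np.maximum(np.abs(spectrum) - noise_mag, 0.0)
    clean_spectrum = clean_mag * np.exp(1j * np.angle(spectrum))
    return np.fft.irfft(clean_spectrum, n=len(signal))

# Synthetic demonstration: a 50 Hz nacelle background plus a 120 Hz fault tone.
fs = 1000                                   # hypothetical sampling rate in Hz
t = np.arange(fs) / fs                      # one second of samples
background = np.sin(2 * np.pi * 50 * t)
fault = 0.5 * np.sin(2 * np.pi * 120 * t)

# One reference spectrum is stored per (speed segment, power mode) class.
key = condition_key(1200.0, 900.0, 800.0, speed_edges=[1000.0, 1400.0, 1800.0])
noise_by_condition = {key: noise_magnitude([background, background])}

cleaned = remove_background(background + fault, noise_by_condition[key])
cleaned_mag = np.abs(np.fft.rfft(cleaned))  # 1 Hz per bin for this window
```

In this sketch the dictionary key plays the role of the combined speed-range and power-mode class, so each measured window is denoised with the reference spectrum of its own operating condition.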
(2) Extraction of low-frequency and high-frequency fault features of typical faults
After getting the low-frequency vibration signal and high-frequency vibration signal without background noise in the previous step, the frequency domain analysis method is adopted to calculate the following parameters as the low-frequency fault features: the low-frequency radial vibration, the axial vibration amplitude, and the vibration phase difference of the main frequency band. For high-frequency vibration data, it is necessary to calculate dimensional parameters, such as effective value, average amplitude, mean square error, kurtosis, and slope, and non-dimensional parameters, such as kurtosis index, impulse index, and margin index. With the help of the spectrum analysis method, the radial vibration amplitude, axial vibration amplitude, and phase difference of the main frequency band of each frequency band can be extracted as high-frequency fault features.
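As an illustration of the time-domain parameters listed in item (2), the following sketch computes several of them for one vibration window. The text gives no formulas, so the definitions below are the conventional condition-monitoring ones and should be read as assumptions. The function name `time_domain_features` is hypothetical, and "mean square error" and "slope" are left out because their intended definitions are not stated.

```python
import numpy as np

def time_domain_features(x):
    """Common time-domain statistics of one vibration window (assumed definitions)."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    centered = x - x.mean()
    peak = abs_x.max()
    rms = np.sqrt(np.mean(x ** 2))              # effective value
    mean_abs = abs_x.mean()                     # average amplitude
    kurtosis = np.mean(centered ** 4)           # dimensional fourth central moment
    return {
        "rms": rms,
        "mean_abs": mean_abs,
        "kurtosis": kurtosis,
        # Non-dimensional indices
        "kurtosis_index": kurtosis / x.std() ** 4,
        "impulse_index": peak / mean_abs,
        "margin_index": peak / np.mean(np.sqrt(abs_x)) ** 2,
    }

# Demonstration on a unit-amplitude sine sampled over whole periods.
t = np.arange(1000) / 1000.0
features = time_domain_features(np.sin(2 * np.pi * 5 * t))
```

Impulsive faults such as a broken tooth tend to raise the kurtosis and impulse indices well above their sine-wave values, which is why such indices are commonly used as fault features.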

(3) Dimensionality reduction of fault features
For each type of typical fault, enough fault features should be extracted to determine the type of fault accurately. However, too many redundant fault features will not help increase the accuracy of fault determination, and contradictory samples inside will reduce the accuracy of fault diagnosis. Therefore, it is necessary to reduce the dimensionality of fault features.
In order to judge each type of typical fault, we require not only low-frequency and high-frequency fault feature data, but also the combination of the speed segment of the main shaft and generator of the wind turbine and the power-up or power-down class of the operation control mode of the wind turbine. Therefore, it is necessary to take the rotational speed of the main shaft and generator and the operation control mode of the wind turbine as fault features when reducing the dimensionality of fault features. In this paper, Principal Component Analysis (PCA) is used to reduce the dimensionality of the fused eigenvectors. Based on the original n-dimensional features, k-dimensional orthogonal eigenvectors are extracted as the principal components through centering and calculation of the covariance. PCA uses an orthogonal transformation as the mapping matrix: it calculates the covariance matrix of the data matrix, obtains the eigenvalues and eigenvectors of the covariance matrix, and then selects the eigenvectors corresponding to the k largest eigenvalues (that is, the largest variances). In this way, the data matrix can be transformed into a new space, and dimensionality reduction of data characteristics can be realized. The main processing step for a high-dimensional data sample $x \in \mathbb{R}^d$ is to use the orthogonal matrix $A \in \mathbb{R}^{k \times d}$ to map the sample to a low-dimensional space, $Ax \in \mathbb{R}^k$, where $k \ll d$. The purpose of dimensionality reduction is to alleviate the curse of dimensionality and classify data better. The specific algorithm is as follows:
Input: n-dimensional sample set $D = \{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}$; the dimension to be reduced to is k.
Output: the sample set $D'$ after dimensionality reduction.
(1) Centralize all samples: $x^{(i)} \leftarrow x^{(i)} - \frac{1}{m} \sum_{j=1}^{m} x^{(j)}$, where m is the number of samples $x^{(i)}$,
(2) Calculate the covariance matrix $XX^T$ of the samples,
(3) Perform singular value decomposition on the matrix $XX^T$,
(4) Take out the eigenvectors $w_1, w_2, \cdots, w_k$ corresponding to the largest k singular values, and normalize all the eigenvectors to form an eigenvector matrix $W$,
(5) For each sample $x^{(i)}$ in the sample set, transform it into a new sample: $z^{(i)} = W^T x^{(i)}$,
(6) Obtain the output sample set: $D' = \{z^{(1)}, z^{(2)}, \ldots, z^{(m)}\}$.
Through the dimensionality reduction processing using the PCA algorithm, the dimension of the fault characteristic vector can be reduced from hundreds to dozens, which markedly reduces the complexity of the following data processing step.
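The PCA steps above can be sketched in a few lines of NumPy. This is a minimal illustration on random data, not the authors' implementation. The function name `pca_reduce` and the data shapes are assumptions, samples are stored as rows rather than columns, and `numpy.linalg.eigh` is used for the decomposition because the scatter matrix is symmetric.

```python
import numpy as np

def pca_reduce(samples, k):
    """Project the rows of `samples` (m x n) onto the top-k principal axes."""
    X = np.asarray(samples, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                               # step (1): centralize
    scatter = Xc.T @ Xc                         # step (2): n x n scatter matrix
    eigvals, eigvecs = np.linalg.eigh(scatter)  # step (3): eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # step (4): indices of the k largest
    W = eigvecs[:, order]                       # n x k, columns have unit norm
    Z = Xc @ W                                  # step (5): row form of z = W^T x
    return Z, W, mean                           # step (6): reduced sample set

# Demonstration: 200 five-dimensional samples driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2)) * np.array([3.0, 1.0])
mixing = rng.normal(size=(2, 5))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 5))
Z, W, mean = pca_reduce(data, k=2)
```

Returning `W` and `mean` along with `Z` lets new samples be projected with the same mapping, which is needed if the reduced features are later fed to a trained diagnosis model.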

Fault Diagnosis of Wind Turbine Main Drive Chain Based on Fusion of Two Types of Data
Based on the low-frequency and high-frequency fault features obtained from the fusion of two types of data and the fault characteristic variables obtained by combining the main shaft and generator rotational speed of the wind turbine and the operation control mode of the wind turbine, dozens of typical characteristics for fault early warning are generated after dimensionality reduction. However, they are still multi-variable and large-
The deep autoencoder (DA) model training process above is unsupervised learning on the sample data set. The parameters obtained by training can be used as prior information for supervised learning of the DA model. The DA model can be further optimized by using the labeled data set for supervised learning to improve the accuracy of fault diagnosis. This fine-tuning process is designed as follows:
Assume that the sample data are:
$$\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$$
where the category status corresponding to $x^{(i)}$ is $y^{(i)} \in \{1, 2, \cdots, k\}$, which is generally given in label-encoded form, and k represents the total number of categories.
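According to the analysis above, the k-dimensional vector output by the classifier gives, for each category j, the conditional probability $p(y^{(i)} = j \mid x^{(i)}; \theta)$ that the input $x^{(i)}$ belongs to that category, and its main form is:
$$h_\theta(x^{(i)}) = \begin{bmatrix} p(y^{(i)} = 1 \mid x^{(i)}; \theta) \\ p(y^{(i)} = 2 \mid x^{(i)}; \theta) \\ \vdots \\ p(y^{(i)} = k \mid x^{(i)}; \theta) \end{bmatrix} = \frac{1}{\sum_{c=1}^{k} e^{\theta_c^T x^{(i)}}} \begin{bmatrix} e^{\theta_1^T x^{(i)}} \\ e^{\theta_2^T x^{(i)}} \\ \vdots \\ e^{\theta_k^T x^{(i)}} \end{bmatrix}$$
Here θ is not a column vector but a matrix:
$$\theta = \begin{bmatrix} \theta_1^T \\ \theta_2^T \\ \vdots \\ \theta_k^T \end{bmatrix}$$
where each row of the matrix represents the parameters corresponding to one category in the classifier, and the total number of categories is k. The supervised global fine-tuning stage aims to adjust the parameters further to minimize the value of the objective optimization function. The objective optimization function (or, more exactly, the cost function) is:
$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{c=1}^{k} e^{\theta_c^T x^{(i)}}}$$
where $1\{\cdot\}$ is the indicator function. The function value is 1 when the statement in braces is true; otherwise it is 0. We have to minimize the value of $J(\theta)$, and we again use the stochastic gradient descent method to solve it here. The iterative formula is:
$$\theta_{jl} := \theta_{jl} - \alpha \frac{\partial J(\theta)}{\partial \theta_{jl}}$$
The stochastic gradient descent method is a popular method in the field of machine learning. We repeat the iteration above until convergence. In the iterative formula, $\theta_{jl}$ is the l-th element of $\theta_j$, $\alpha$ is the learning rate, and $\frac{\partial J(\theta)}{\partial \theta_{jl}}$ is the partial derivative of the cost function with respect to $\theta_{jl}$. At the convergence point, the partial derivative is 0, and therefore the cost function is minimized.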
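The softmax cost and its gradient update can be sketched as follows. This is an illustrative stand-alone classifier on synthetic data, not the fine-tuned DA network described in the text. It uses full-batch gradient descent instead of the stochastic variant for brevity, labels run from 0 to k-1 instead of 1 to k, and all names, the learning rate, and the toy clusters are assumptions.

```python
import numpy as np

def softmax_probs(theta, X):
    """Row i holds p(y = j | x_i; theta) for j = 0..k-1."""
    scores = X @ theta.T
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

def cost(theta, X, y):
    """J(theta): mean negative log-probability of the true class."""
    probs = softmax_probs(theta, X)
    return -np.mean(np.log(probs[np.arange(len(y)), y]))

def gradient(theta, X, y):
    """Partial derivatives of J with respect to every theta_jl (k x n matrix)."""
    one_hot = np.eye(theta.shape[0])[y]
    return -(one_hot - softmax_probs(theta, X)).T @ X / len(y)

def fit(X, y, k, alpha=0.1, iters=2000):
    """Full-batch gradient descent: theta := theta - alpha * dJ/dtheta."""
    theta = np.zeros((k, X.shape[1]))
    for _ in range(iters):
        theta -= alpha * gradient(theta, X, y)
    return theta

# Toy data: three well-separated 2-D clusters plus a bias column.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 4.0], [-4.0, 4.0]])
y = np.repeat(np.arange(3), 30)
X = np.hstack([centers[y] + 0.5 * rng.normal(size=(90, 2)), np.ones((90, 1))])

theta = fit(X, y, k=3)
initial_cost = cost(np.zeros((3, 3)), X, y)
final_cost = cost(theta, X, y)
accuracy = np.mean(softmax_probs(theta, X).argmax(axis=1) == y)
```

With all parameters at zero every class gets probability 1/k, so the initial cost equals log k. Any useful training run should end below that value.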

Results
In this section, we analyze the actual data of 66 doubly fed generators of 2 MW in a wind farm. This wind farm is fully equipped with a SCADA system and a vibration fault diagnosis system for the main drive chain. Some user interfaces of the upgraded software system are shown in Figures 6 and 7. The upgraded fault diagnosis module for the main drive chain in the software uses the data-fusion-based fault diagnosis approach discussed in this article.
After the training of the fault diagnosis DA model, the test set data are used to test the gearbox broken tooth diagnosis model, and the verified results are shown in Figures 8 and 9.
As shown in Figures 8 and 9, before the gearbox teeth of the main drive chain of wind turbines A and B break at a data point of about 150, the model output is within the monitoring threshold range and fluctuates little, indicating that the main drive chain gearboxes of the two turbines are operating normally. However, the output of the model begins to increase and exceeds the preset monitoring threshold after the tooth break. Hence, it is judged that the two wind turbines are malfunctioning, and fault diagnosis as well as early warning are performed.
Consistent with the actual data, the time domain and frequency spectrum diagrams of the vibration signal of the medium-speed shaft of the gearbox of turbine A in the normal state (9 May 2019) are shown in Figures 10 and 11.
In contrast, the time domain and frequency spectrum diagrams of the vibration signal of the medium-speed shaft of the gearbox in the early warning state (12 May 2019) are shown in Figures 12 and 13.
As shown in Figures 10 and 12, the time domain diagrams under the normal and warning conditions are similar and show little valuable information. However, the hidden differences are clearly revealed in the frequency spectrum diagrams in Figures 11 and 13. The details are described below.
Figure 13 shows a sideband modulation signal (in the red box in Figure 13) spaced at 5.99 Hz, the rotational frequency of the medium-speed shaft, near the gear mesh frequency (670.833 Hz) between the medium-speed shaft and the high-speed shaft of the gearbox. However, there is no such sideband signal under the normal operating state in Figure 11. This difference indicates an abnormal condition of the meshing gear of the shaft. The operation and maintenance personnel disassembled the on-site gearbox and found that the fault was a broken tooth on the medium-speed gear, as shown in Figure 14. The model thus correctly gave early warning of the fault in the main drive chain gearbox of turbine A and prevented the failure from expanding further.
The operation and maintenance personnel disassembled the on-site gearbox and found that the fault was the breakage of tooth on the medium-speed gear, as shown in quency (670.833 HZ) from the medium-speed shaft to the high-speed shaft of the gearbox. However, there is no sideband signal such as this under the normal operating state in Figure 12. This difference can be judged as an abnormal condition of the meshing gear of the shaft. The operation and maintenance personnel disassembled the on-site gearbox and found that the fault was the breakage of tooth on the medium-speed gear, as shown in Figure 14. This model correctly warned the early fault of the main drive chain gearbox of turbine A and avoided further expansion of this failure.  The details of all experiments are not shown completely here. However, a brief table is shown below as Table 2 to demonstrate the results of diagnoses to different typical faults in the main drive chain in wind turbines. As shown in the table above, maintenances are not taken immediately after the early warnings emitted by the diagnosis system in some scenes, and there are considerable interval times before we got alarms from the traditional SCADA system. Additionally, the following maintenances indicate that there is no false alarm. That clearly indicates the efficiency and accuracy of the diagnosis method proposed.
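The sideband check used in the gearbox case above can be illustrated with a short sketch. It is not the diagnosis model of this article. It only shows, on synthetic data, how sidebands spaced at the shaft rotational frequency (5.99 Hz) around the gear mesh frequency (670.833 Hz) can be quantified from a spectrum. The sampling rate, signal length, modulation depth, noise level, the function name sideband_ratio, and the 0.1 decision threshold are all assumptions made for illustration.

```python
import numpy as np

def sideband_ratio(signal, fs, f_mesh, f_shaft, tol=1.0):
    """Mean amplitude at f_mesh +/- f_shaft divided by the amplitude at f_mesh."""
    n = len(signal)
    window = np.hanning(n)  # Hann window to limit spectral leakage
    # Single-sided amplitude spectrum, corrected for the window gain
    spectrum = 2.0 * np.abs(np.fft.rfft(signal * window)) / np.sum(window)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    def peak(f):
        # Largest spectral line within +/- tol Hz of the target frequency
        band = (freqs >= f - tol) & (freqs <= f + tol)
        return spectrum[band].max()

    carrier = peak(f_mesh)
    sidebands = 0.5 * (peak(f_mesh - f_shaft) + peak(f_mesh + f_shaft))
    return sidebands / carrier

# Synthetic signals (assumed values, not measured data)
fs = 4096.0                      # assumed sampling rate in Hz
t = np.arange(int(8 * fs)) / fs  # 8 s record, 0.125 Hz resolution
f_mesh, f_shaft = 670.833, 5.99  # frequencies quoted in the text
rng = np.random.default_rng(0)

mesh_tone = np.sin(2 * np.pi * f_mesh * t)
normal = mesh_tone + 0.05 * rng.standard_normal(t.size)
# A damaged tooth is modelled here as amplitude modulation once per shaft turn
faulty = (1 + 0.5 * np.sin(2 * np.pi * f_shaft * t)) * mesh_tone \
    + 0.05 * rng.standard_normal(t.size)

ratio_normal = sideband_ratio(normal, fs, f_mesh, f_shaft)
ratio_fault = sideband_ratio(faulty, fs, f_mesh, f_shaft)

THRESHOLD = 0.1  # hypothetical warning threshold
print(f"normal: {ratio_normal:.3f}, warning={ratio_normal > THRESHOLD}")
print(f"faulty: {ratio_fault:.3f}, warning={ratio_fault > THRESHOLD}")
```

For amplitude modulation with depth m, each sideband has amplitude m/2 relative to the mesh-frequency line, so the modulated signal here gives a ratio close to 0.25, while the unmodulated signal stays near the noise floor. In practice, the threshold would have to be calibrated on healthy-state spectra of the monitored gearbox.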

Conclusions
This article points out the shortcomings of commonly used main drive chain fault diagnosis methods that rely on a single data source, and proposes a method of fault feature extraction and fault diagnosis based on data source fusion. The new method combines the global coverage of the wind turbine SCADA system with the specificity and depth of the vibration fault diagnosis system for the main drive chain, which addresses the difficulty of handling background noise in a single data source.
In the proposed method, SCADA real-time monitoring data are integrated with main drive chain vibration monitoring data, so that the fault features of the main drive chain are jointly extracted and a deep auto-encoding network fault diagnosis model based on data fusion is established. The parameters obtained by unsupervised training of the deep auto-encoding network serve as prior information for the subsequent supervised learning model. Supervised learning on labeled data sets then further optimizes the deep auto-encoding network model and improves the accuracy of fault diagnosis.
The experimental results show that the diagnosis system using the proposed method accurately located a broken-tooth fault in the gearbox of a wind turbine at a very early stage, before the traditional SCADA system raised any alarm. This diagnosis prevented the failure from developing further and causing greater loss. The new approach therefore provides strong technical support for more timely and efficient operation and maintenance of wind turbines.
Fusing data from multiple sources may also be applicable to other problems. However, the scenarios and methods presented in this article are specific and narrowly focused, so more complete and targeted analysis and experiments are needed before the approach is applied to another specific problem.
Future research will focus on classifying and recognizing the possible root causes of the detected faults. Various fault types and their root causes will be analyzed. This will allow pre-analysis and early warning of faults before manual inspection and will improve the efficiency of wind turbine operation and maintenance.