Wind Turbine Fault Diagnosis by the Approach of SCADA Alarms Analysis

Featured Application: The research proposed in this paper could be useful in wind turbine condition monitoring and fault diagnosis.

Abstract: Wind farm operators are overwhelmed by a large number of supervisory control and data acquisition (SCADA) alarms when faults occur. This paper presents an online root fault identification method for SCADA alarms to assist operators in wind turbine fault diagnosis. The proposed method is based on the similarity analysis between an unknown alarm vector and the feature vectors of known faults. The alarm vector is obtained from segmented alarm lists, which are filtered and simplified. The feature vector, which is a unique signature representing the occurrence of a fault, is extracted from the alarm lists belonging to the same fault. To mine the coupling correspondence between alarms and faults, we define the weights of the alarms in each fault. The similarity is measured by the weighted Euclidean distance and the weighted Hamming distance, respectively. One year of SCADA alarms and maintenance records are used to verify the proposed method. The results show that the performance of the weighted Hamming distance is better than that of the weighted Euclidean distance; 84.1% of alarm lists are labeled with the right root fault.


Introduction
As wind power installations continue to grow worldwide, wind power is in a rapid transition toward becoming a fully commercialized, unsubsidized technology. It is thus vital to reduce the levelized cost of wind energy to enhance the competitiveness of wind farms during the transition to fully commercial, market-based operations. Due to the remote and harsh operational environment, the operation and maintenance (O&M) costs of wind farms are high. Statistics show that the O&M costs account for 10-15% of total wind farm project costs [1]. For an offshore wind farm, the O&M costs account for up to 14-30% [2]. To reduce O&M costs, it is necessary to improve the reliability of wind turbines. Therefore, condition monitoring and fault diagnosis methods are commonly developed and employed [3].
Supervisory control and data acquisition (SCADA) systems are a standard installation on large wind turbines and provide a wide range of operational information for almost all subcomponents. As a potentially low-cost and wide-coverage solution, SCADA data have been used in many studies for condition monitoring and fault diagnosis [4,5]. Moreover, the SCADA system also provides alarms to operators when a key process variable crosses a pre-fixed threshold or a fault of a subcomponent occurs. These alarms can be used as emergency event indicators that assist operators in mitigating risk. However, these SCADA alarms are often overlooked in industrial applications for the following reasons: (1) The occurrence of a fault usually raises alarm floods. An alarm flood refers to a situation during which tens or hundreds of alarms appear in a short time [6]. The operator is overwhelmed by these alarm floods because they exceed the operator's response capability. (2) Because of the poor alarm configuration and the causal relationships among the measured variables,

1. We segment alarms into alarm lists by an information alarm and represent alarm lists by vectors to simplify alarms;
2. We extract the feature vector from alarm vectors as a unique signature representing an occurring fault;
3. We define the weights of alarms to establish the coupling correspondence between alarms and faults.
The remainder of this paper is organized as follows: Section 2 introduces SCADA alarms and maintenance logs; Section 3 presents the online root fault identification method; Section 4 presents the results and the discussion; and Section 5 presents the conclusions.

SCADA Alarms
Alarm systems [21] play an important role in process monitoring. Due to advanced technologies, modern wind turbines use hundreds of sensors and actuators as parts of their many control loops. This situation can result in a large number of measured variables and their corresponding configured alarms. Thus, alarms can be generated at a high rate. A wind turbine SCADA system is integrated with an alarm function, which monitors the condition of wind turbines and their subcomponents.
Alarms are stored in the SCADA database. A list of the alarms used in this paper is shown in Table 1. They are recorded in chronological order. There are three types of alarms: information alarms, warning alarms, and fault alarms. Information alarms are generally used to communicate changes in certain operating conditions, e.g., wind turbine is reset, or a manual switch is engaged. Warning alarms are generated when the monitored variables come close to exceeding thresholds. Fault alarms are generated when these thresholds are exceeded. The attribute 'code' is the unique code of an alarm. The attribute 'flag' represents the start and the end of each alarm. Hence, each alarm has two records.

Maintenance Log
The maintenance log collects the repair activities carried out by the maintenance engineers. A record of this is shown in Table 2. The attribute 'type of faults' reveals the actual fault of this repair activity. It is the root fault of the corresponding alarms.

Online Root Fault Identification Method
This study proposes an online root fault identification method based on a similarity analysis. The idea of the method is based on the assumption that similar alarms tend to have the same root faults. The flowchart of the proposed method is shown in Figure 1. There are two processes: feature vector extraction and online root fault identification.
Briefly, in the process of feature vector extraction, alarms are firstly segmented into alarm lists, and the information and chattering alarms are removed. Subsequently, the alarm lists and their root faults are matched. Finally, the feature vectors of faults are extracted and the fault-template database is built. To establish the coupling correspondence between an alarm and a fault, the weight of an alarm in a fault is defined. In the process of online fault identification, firstly, the online alarms are preprocessed and represented by vectors. Afterward, the weighted distance between an online alarm vector and the feature vectors is calculated. The root fault of the online alarm vector is deduced by the value of similarity.

Segmenting Alarm Lists
The alarms are recorded continuously in chronological order. First of all, we need to segment continuous alarms into alarm lists. The information alarm 'I2' is used to segment alarm lists in this paper.
A SCADA system not only monitors process variables and triggers alarms, but also changes the operating condition of a wind turbine to deal with alarms. The actions in response to alarms are different according to the alarm levels. When the alarm level is low, no operation is performed, or the wind turbine is restarted to try to eliminate alarms. On the contrary, when the alarm level is high, the wind turbine is shut down, awaiting manual maintenance. Therefore, every time one wind turbine is shut down due to a fault, it needs manual maintenance.
The information alarm 'I2' indicates that the wind turbine is started. When the flag of I2 is 'start', the wind turbine is started from a shutdown. When the flag of I2 is 'end', the wind turbine is shut down from running; that is to say, the start of I2 indicates that the wind turbine has returned to normal operation, and the end of I2 indicates that the wind turbine is shut down due to faults. All the alarms generated between the start of I2 and the end of I2 are related to the following shutdown. Therefore, we segment alarms into alarm lists using I2. The alarms generated between the start of I2 and the end of I2 make an alarm list. The total number of alarm lists is expressed as M.
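The segmentation rule above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this study: the record layout `(timestamp, code, flag)` and the function name `segment_alarm_lists` are assumptions introduced here, and alarms raised while the turbine is shut down (outside an I2 interval) are simply skipped.

```python
from typing import Iterable, List, Tuple

# Assumed record layout: (timestamp, alarm code, flag), with flag "start" or "end".
Record = Tuple[float, str, str]


def segment_alarm_lists(records: Iterable[Record], marker: str = "I2") -> List[List[Record]]:
    """Split chronologically ordered alarm records into alarm lists.

    A list is opened by the 'start' record of the marker alarm and closed by
    its 'end' record; the marker records themselves are not stored.
    """
    alarm_lists: List[List[Record]] = []
    current = None  # None while the turbine is shut down (outside an I2 interval)
    for rec in sorted(records, key=lambda r: r[0]):
        _, code, flag = rec
        if code == marker and flag == "start":
            current = []                      # turbine started: open a new alarm list
        elif code == marker and flag == "end":
            if current is not None:
                alarm_lists.append(current)   # turbine shut down: close the list
            current = None
        elif current is not None:
            current.append(rec)               # alarm raised while the turbine is running
    return alarm_lists
```

The number of lists returned corresponds to M in the text.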

Matching Alarm Lists and Their Root Faults
Every time one wind turbine is shut down due to a fault, manual maintenance is needed. Therefore, the root fault of each alarm list is recorded in the maintenance log. The total number of records is Q. In theory, Q should be equal to the total number of alarm lists M. In practice, however, Q is smaller than M, because operators do not always log their work and some maintenance activities are missing from the records.
The end times of alarm lists and the start times of maintenance activities are used to match alarm lists with their root faults. The schematic diagram of the match criterion is shown in Figure 2. The end time of an alarm list should be earlier than the start time of the maintenance activity. The alarm list corresponding to a maintenance activity is the last list that ends before the activity starts. Ultimately, we obtain Q pairs of data, each made up of an alarm list and its root fault.
Figure 2. The match criterion of alarm lists and maintenance records.
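As a sketch of the match criterion, the following Python function pairs each maintenance record with the last alarm list that ends strictly before the maintenance activity starts. The function name and the data layout (a sorted list of end times, and `(start_time, fault)` tuples) are assumptions made for illustration; in practice the matching would be done per turbine.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def match_lists_to_records(list_end_times: Sequence[float],
                           maintenance: Sequence[Tuple[float, str]]) -> List[Tuple[int, str]]:
    """Return (alarm_list_index, root_fault) pairs.

    list_end_times must be sorted in ascending order. A maintenance record
    with no earlier alarm list is left unmatched.
    """
    pairs = []
    for start_time, fault in maintenance:
        # Number of alarm lists whose end time is strictly earlier than start_time.
        pos = bisect_left(list_end_times, start_time)
        if pos > 0:
            pairs.append((pos - 1, fault))  # the last such list is the match
    return pairs
```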

Removing Information and Chattering Alarms
The information alarms generally communicate changes in the operating conditions of wind turbines. We are not interested in such alarms. We focus on the warning alarms and fault alarms that indicate the abnormalities of the wind turbine. Thus, the information alarms are removed.
A chattering alarm [22] is an alarm that appears repeatedly during a short time. The reasons for a chattering alarm are that the monitored process variable is close to the alarm threshold and a noise is present. In this paper, we keep the first alarm of the chattering alarms in one alarm list and remove the following repeated alarms.
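The two filtering steps can be illustrated as follows. This is a minimal sketch under the assumption that an alarm list is available as a sequence of alarm codes and that the set of information-alarm codes is known; the function name and the code 'I5' in the usage example are hypothetical.

```python
def clean_alarm_list(codes, info_codes):
    """Drop information alarms and keep only the first occurrence of each other alarm."""
    seen = set()
    cleaned = []
    for code in codes:
        if code in info_codes:
            continue          # information alarm: not of interest
        if code in seen:
            continue          # repeated (chattering) alarm: keep only the first one
        seen.add(code)
        cleaned.append(code)
    return cleaned
```

For example, `clean_alarm_list(["I5", "T309", "T309", "T724", "T309"], {"I5"})` keeps one entry each for T309 and T724.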

Representing Data by Vectors
The occurrence of an alarm can be recorded as a binary value. If the alarm is present, the value is equal to one; if the alarm is not present, the value is equal to zero. The binary value is expressed as:
$$v_i = \begin{cases} 1, & \text{if alarm } i \text{ is present} \\ 0, & \text{otherwise} \end{cases}$$
where $i = 1, 2, \ldots, N$ and $N$ is the total number of alarm types configured in the SCADA system.
A list of alarms may be represented either by a sequence of alarms or by a vector of alarms. In a sequence, the alarms are ordered by their time of appearance. In a vector, their time of appearance is not considered; only the fact that the alarms are present is considered. In this paper, an alarm list is represented by a vector. The j-th alarm vector is expressed as:
$$V^j = [v_1^j, v_2^j, \ldots, v_N^j]$$
where $v_i^j$ is the binary value of alarm i in the j-th alarm vector; $j = 1, 2, \ldots, M$; and M is the total number of alarm vectors. It should be noted that each alarm has two records: one record represents the start of the alarm, and the other represents the end of the alarm. In an alarm vector, as long as an alarm occurs, the binary value of the alarm is set to one.
Fault k in the r-th maintenance record is expressed as:
$$f_k^r, \quad r = 1, 2, \ldots, Q; \quad k = 1, 2, \ldots, P$$
where Q is the total number of maintenance records and P is the number of fault types. Because the same fault can happen multiple times, P is smaller than Q. All the faults in the records are expressed as:
$$\{f_1^1, \ldots, f_1^{l_1}, \ldots, f_k^1, \ldots, f_k^{l_k}, \ldots, f_P^1, \ldots, f_P^{l_P}\}$$
where $f_k^{l_k}$ is fault k in the $l_k$-th maintenance record; $l_k$ is the number of records belonging to fault k; and $l_P$ is the number of records belonging to fault P. Thus, the set of records belonging to fault k is expressed as:
$$\{f_k^1, f_k^2, \ldots, f_k^{l_k}\}$$
where $k = 1, 2, \ldots, P$.
After the match process, the alarm vectors are labeled with their root faults. The pairs of alarm vectors and their root faults are expressed as:
$$\{(V_k^1, f_k^1), (V_k^2, f_k^2), \ldots, (V_k^{l_k}, f_k^{l_k})\}$$
where $V_k^{l_k}$ is the $l_k$-th alarm vector of fault k; $f_k^{l_k}$ is fault k in the $l_k$-th maintenance record; and $k = 1, 2, \ldots, P$.
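The vector representation can be sketched as below. The mapping `alarm_index` from alarm code to vector position is an assumption introduced for illustration, and the code 'T100' in the usage example is hypothetical.

```python
def to_alarm_vector(codes, alarm_index):
    """Represent an alarm list as a binary vector of length N.

    alarm_index maps each configured alarm code to a position 0..N-1.
    The order and the number of occurrences of the alarms are ignored.
    """
    vector = [0] * len(alarm_index)
    for code in codes:
        vector[alarm_index[code]] = 1   # set to one as long as the alarm occurs
    return vector
```

With `alarm_index = {"T309": 0, "T724": 1, "T100": 2}`, the list `["T724", "T724"]` (the start and end records of the same alarm) maps to `[0, 1, 0]`.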

Feature Vector Extraction
The same fault of the wind turbine can happen multiple times. However, the alarm lists generated during the same fault are not always the same, since the physical processes are not deterministic, and the environmental conditions may differ when a fault occurs. This section aims to extract the feature vector of alarm vectors belonging to the same fault. The feature vector is used as a unique signature representing the occurrence of a fault.
There are $l_k$ alarm lists generated when fault k occurs. The j-th alarm vector belonging to fault k is expressed as:
$$V_k^j = [v_{k1}^j, v_{k2}^j, \ldots, v_{kN}^j]$$
where $v_{ki}^j$ is the binary value of alarm i in the j-th alarm vector of fault k and $j = 1, 2, \ldots, l_k$. The feature vector of fault k is expressed as:
$$C_k = [c_{k1}, c_{k2}, \ldots, c_{kN}]$$
where $c_{ki}$ is the binary value of alarm i in fault k; $k = 1, 2, \ldots, P$; and P is the number of fault types. The feature vector $C_k$ is built using the $l_k$ alarm vectors belonging to fault k. The binary value of alarm i in the feature vector of fault k is calculated as follows:
$$c_{ki} = \begin{cases} 1, & \dfrac{1}{l_k} \sum_{j=1}^{l_k} v_{ki}^j \ge f_r \\ 0, & \text{otherwise} \end{cases}$$
where $v_{ki}^j$ is the binary value of alarm i in the j-th alarm vector of fault k, and $f_r \in [0, 1]$ is a frequency. If alarm i is frequently triggered by fault k, the corresponding alarm in the feature vector is set to one; otherwise, it is set to zero. The value of $f_r$ is set to 0.5 in this paper. It is determined according to the final performance.
Appl. Sci. 2022, 12, 69
A pair of a fault and its feature vector is expressed as $(f_k, C_k)$, where $f_k$ is fault k; $C_k$ is the feature vector of fault k; and $k = 1, 2, \ldots, P$. The fault-template database is composed of faults and their feature vectors. It is expressed as:
$$\{(f_1, C_1), (f_2, C_2), \ldots, (f_P, C_P)\}$$
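Feature vector extraction can be sketched as follows. This is an illustrative sketch rather than the code used in this study: the function names are hypothetical, and the comparison is assumed to be "greater than or equal to" the frequency threshold.

```python
def extract_feature_vector(alarm_vectors, fr=0.5):
    """Build the feature vector of one fault from its binary alarm vectors.

    Alarm i is set to one when its trigger frequency over the alarm vectors
    of the fault reaches fr (assumption: the comparison is '>=').
    """
    n_lists = len(alarm_vectors)
    n_alarms = len(alarm_vectors[0])
    feature = []
    for i in range(n_alarms):
        frequency = sum(vector[i] for vector in alarm_vectors) / n_lists
        feature.append(1 if frequency >= fr else 0)
    return feature


def build_fault_template_database(vectors_by_fault, fr=0.5):
    """Map each fault name to its feature vector."""
    return {fault: extract_feature_vector(vectors, fr)
            for fault, vectors in vectors_by_fault.items()}
```

A lower value of `fr` admits more alarms into the signature, while a higher value keeps only alarms that are triggered almost every time the fault occurs.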

Weights of Alarms
When a fault occurs, alarms respond differently. To explore the coupling correspondence between alarms and faults, we define the weights of alarms in each fault. The weight of alarm i in fault k, $w_{ki}$, is built from three component weights: $\lambda_{1i}$, defined according to the alarm type; $\lambda_{2ki}$, defined according to the significance of alarm i in fault k; and $\lambda_{3ki}$, defined according to the specificity of alarm i in fault k. Here, $k = 1, 2, \ldots, P$ and $i = 1, 2, \ldots, N$, where P is the number of fault types and N is the number of alarm types configured in the SCADA system.

1. Alarm type
Warning alarms and fault alarms play different roles in the SCADA system. Warning alarms are triggered when the monitored variables come close to exceeding thresholds. Fault alarms are triggered when these thresholds are exceeded. Thus, fault alarms are more important than warning alarms, and the weight $\lambda_{1i}$ is defined to be larger for fault alarms than for warning alarms.
2. The significance of an alarm
When fault k occurs, some alarms are always triggered or never triggered. These alarms are strongly indicative of fault k, so their significance is high. In contrast, other alarms are triggered only some of the time when fault k occurs. These alarms are weakly indicative of fault k, so their significance is low. The weight $\lambda_{2ki}$ is used to enhance the alarms that are significant to a fault and to discard the nonsignificant ones. It is calculated from $v_{ki}^j$, the binary value of alarm i in the j-th alarm vector of fault k, and $c_{ki}$, the binary value of alarm i in the feature vector of fault k, where $k = 1, 2, \ldots, P$; $i = 1, 2, \ldots, N$; P is the number of fault types; and N is the number of alarm types configured in the SCADA system.
3. The specificity of an alarm
When an alarm is significant only to fault k and nonsignificant to the other faults, we consider the alarm to be unique to fault k. The weight $\lambda_{3ki}$ is used to enhance the alarms that are unique to fault k. It is calculated from the following quantities: P, the number of faults; $F_k$, the finite set of faults except fault k; $v_{gi}^j$, the binary value of alarm i in the j-th alarm vector of fault g; $c_{ki}$, the binary value of alarm i in the feature vector of fault k; and $l_g$, the number of alarm lists belonging to fault g.
The weight $\lambda_{3ki}$ decreases in the following situations: (1) Alarm i is frequently triggered by fault k and frequently triggered by the other faults. (2) Alarm i is seldom triggered by fault k or by the other faults. Therefore, $\lambda_{3ki}$ decreases when alarm i is shared by several faults, but increases when alarm i is more specific to fault k than to the other faults. The value of $\lambda_{3ki}$ lies between zero and one.
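To make the qualitative behaviour of the significance and specificity weights concrete, the sketch below uses illustrative stand-in formulas. They are assumptions chosen only to reproduce the properties described above (significance is highest for alarms that are always or never triggered; specificity lies in [0, 1] and drops when an alarm behaves the same way in the other faults). They should not be read as the formulas of this paper, and all function names are hypothetical.

```python
def trigger_frequency(alarm_vectors, i):
    """Fraction of the alarm vectors in which alarm i is present."""
    return sum(vector[i] for vector in alarm_vectors) / len(alarm_vectors)


def significance(alarm_vectors_k, c_k, i):
    """Assumed stand-in: 1 when alarm i is always or never triggered by fault k,
    smaller when it is triggered only some of the time."""
    return 1.0 - abs(c_k[i] - trigger_frequency(alarm_vectors_k, i))


def specificity(c_k, alarm_vectors_other_faults, i):
    """Assumed stand-in: mean gap between c_ki and the trigger frequency of
    alarm i in each of the other faults; the result lies in [0, 1]."""
    gaps = [abs(c_k[i] - trigger_frequency(vectors, i))
            for vectors in alarm_vectors_other_faults]
    return sum(gaps) / len(gaps)
```

Under these stand-ins, an alarm that is frequent in fault k and equally frequent in every other fault receives a specificity of zero, as does an alarm that is rare everywhere.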

Online Root Fault Diagnosis
The first three steps in the process of online root fault identification are the same as those in the process of feature vector extraction.

1. Distance measure
The similarity is typically measured by computing a distance metric. When compared with a threshold, the resulting score determines whether an alarm vector belongs to a root fault. Choosing a suitable distance measure increases the overall performance of the online diagnosis. An online unknown alarm vector V is expressed as:
$$V = [v_1, v_2, \ldots, v_N]$$
where $v_i$ is the binary value of alarm i. The distance between the alarm vector and the feature vector of fault k is expressed as $D(V, C_k)$, where $C_k$ is the feature vector of fault k and $k = 1, 2, \ldots, P$. The Euclidean distance [23] and the Hamming distance [24] are often used as metrics. The Euclidean distance between the unknown alarm vector and the feature vector of fault k is calculated as:
$$D_E(V, C_k) = \sqrt{\sum_{i=1}^{N} (v_i - c_{ki})^2}$$
where $v_i$ is the binary value of alarm i and $c_{ki}$ is the binary value of alarm i in the feature vector of fault k. The Hamming distance between two vectors is defined as the number of positions in which they differ. The Hamming distance between an unknown alarm vector and the feature vector of fault k is calculated as:
$$D_H(V, C_k) = \sum_{i=1}^{N} |v_i - c_{ki}|$$
where $v_i$ and $c_{ki}$ are defined as above. We used both distances to measure the similarity. The performance of the distances is compared and analyzed in the next sections.
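Both unweighted distances are standard and can be written directly for binary vectors. The following is a minimal sketch with hypothetical function names.

```python
import math


def euclidean_distance(v, c):
    """Euclidean distance between an alarm vector v and a feature vector c."""
    return math.sqrt(sum((vi - ci) ** 2 for vi, ci in zip(v, c)))


def hamming_distance(v, c):
    """Number of positions in which v and c differ."""
    return sum(1 for vi, ci in zip(v, c) if vi != ci)
```

For binary vectors the squared Euclidean distance equals the Hamming distance, so the two unweighted measures always rank candidate faults in the same order; they can only behave differently once weights are introduced.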

2. Weighted distance
The above similarity measures treat all alarms in the feature vectors equally; the coupling correspondence between alarms and faults is not considered. We therefore define a weighted distance based on the weights of alarms to measure the similarity. A weight vector is associated with each fault. The weight vector of fault k is expressed as:
$$[w_{k1}, w_{k2}, \ldots, w_{kN}]$$
where $w_{ki}$ is the weight assigned to alarm i in fault k.
The weighted distance is expressed as $D_W(V, C_k)$, where $k = 1, 2, \ldots, P$. Two weighted distances are used: the weighted Euclidean distance $D_{WE}(V, C_k)$ and the weighted Hamming distance $D_{WH}(V, C_k)$. Both are defined in terms of $w_{ki}$, the weight assigned to alarm i in fault k; $v_i$, the binary value of alarm i; and $c_{ki}$, the binary value of alarm i in the feature vector of fault k.
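One possible way to weight the two distances is sketched below. The forms are assumptions for illustration, not necessarily the definitions used in this paper: the weighted Euclidean distance scales each coordinate difference by its weight before squaring, and the weighted Hamming distance sums the weights of the differing positions. The function names are hypothetical.

```python
import math


def weighted_euclidean(v, c, w):
    """Assumed form: scale each coordinate difference by its weight, then take the L2 norm."""
    return math.sqrt(sum((wi * (vi - ci)) ** 2 for vi, ci, wi in zip(v, c, w)))


def weighted_hamming(v, c, w):
    """Assumed form: sum of the weights of the positions where v and c differ."""
    return sum(wi for vi, ci, wi in zip(v, c, w) if vi != ci)
```

Under these assumed forms a mismatch on alarm i contributes the squared weight to the squared Euclidean distance but the plain weight to the Hamming distance, so the two measures can rank candidate faults differently.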

Root Fault Label
To identify the root fault of an alarm list, the weighted distances between the alarm vector and the feature vectors of every fault should be calculated. Thus, we obtain P weighted distances $D_W(V, C_k)$, where $k = 1, 2, \ldots, P$. The smaller the weighted distance is, the higher the similarity is. Thus, the minimum weighted distance is selected and expressed as:
$$D_W(V, C_\mu) = \min_{k = 1, 2, \ldots, P} D_W(V, C_k)$$
where $D_W(V, C_\mu)$ is the weighted distance between the unknown alarm vector and the feature vector of fault µ. The detection threshold is expressed as $T_\mu$. If $D_W(V, C_\mu) \le T_\mu$, the root fault of the alarm list is labeled as fault µ. If $D_W(V, C_\mu) > T_\mu$, the root fault of the alarm list does not belong to the known P faults. The detection threshold $T_\mu$ is determined using the available fault cases of fault µ in the fault-template database; $T_\mu$ is the maximum weighted distance between the fault cases of fault µ and the feature vector of fault µ.
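The labeling rule can be sketched as follows. The function names and the dictionary-based layout are assumptions introduced for illustration; `distance_fn` stands for whichever weighted distance is in use, with the weights of the fault already bound to it.

```python
def detection_threshold(distance_fn, fault_vectors, feature_vector):
    """Maximum distance between the known cases of a fault and its feature vector."""
    return max(distance_fn(vector, feature_vector) for vector in fault_vectors)


def label_root_fault(distances, thresholds):
    """Return the fault with the minimum distance, or None when that distance
    exceeds the fault's detection threshold (root fault unknown).

    distances and thresholds are dictionaries keyed by fault name.
    """
    best = min(distances, key=distances.get)
    return best if distances[best] <= thresholds[best] else None
```

A return value of `None` corresponds to the case in which the root fault does not belong to the known P faults.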

Data Description
The data used in this study are from a wind farm located in southern China. There are 24 wind turbines on the wind farm, installed with direct-drive, variable-speed, variable-pitch generators. One year of alarm data and maintenance records are available. The SCADA system in the wind turbines is configured with 102 warning alarms and 266 fault alarms. There are a total of 240 maintenance records. After matching alarm lists and their root faults, we obtain 240 pairs of data. Each pair of data consists of an alarm list and its root fault.
For the sake of verification, we select the faults that have more than five records in order to extract the feature vectors. Ultimately, six faults are selected. They are the pitch-motor driver fault, pitch-system communication fault, hub speed encoder fault, high temperature of generator stator, wind vane fault, and vibration sensor fault. The number of alarm lists belonging to each fault is shown in Table 3. Forty-six alarm lists are used to extract the feature vectors of faults. Forty-four alarm lists are used to test the proposed method. These alarm lists are referred to as test set one. The other 150 alarm lists, the root faults of which are not among the selected six faults, are also used in the test phase. These alarm lists are referred to as test set two.
Table 3. Faults and the number of their alarm lists.

Case Study: Pitch-Motor Driver Fault
Ten alarm lists belong to the pitch-motor driver fault. Five alarm lists are used to extract the feature vector. The obtained feature vector is $C_1 = [c_{1,1}, c_{1,2}, \ldots, c_{1,T309}, c_{1,T724}, \ldots$
Table 4. Vibration sensor fault 5 0 0
The weight of an alarm for a pitch-motor driver fault is expressed as $w_{1,i}$, where $i = 1, 2, \ldots, 368$. The weights of alarms T309 and T724 are $w_{1,T309}$ and $w_{1,T724}$, respectively. The calculation processes of $w_{1,T309}$ and $w_{1,T724}$ are provided as examples:
1.

Performance Evaluation
Three indicators are defined to evaluate the performance of the proposed method: true detection (TD), false detection (FD), and missed detection (MD). The performance-evaluation results are shown in Table 5. The similarity between an alarm list and a feature vector is measured by the weighted Euclidean distance and the weighted Hamming distance, respectively. Test set one consists of 44 alarm lists. The root faults of these lists are among the selected six faults. Test set two consists of 150 alarm lists. The root faults of these lists are not among the selected six faults. Thus, the indicator TD for test set two does not exist. The overall performance of the weighted Hamming distance is better than that of the weighted Euclidean distance. For test set one, the percentage of TD of the weighted Hamming distance is higher, and the percentages of FD and MD are lower. For test set two, the percentages of FD and MD of the weighted Hamming distance are lower.
A multidimensional information processing method proposed in reference [25] is also applied in this paper. The Dempster-Shafer evidence theory is applied to the selected six faults. Each alarm list is labeled with the most probable fault. True detection and false detection can be used to evaluate the performance. The results are shown in Table 6. The percentage of TD is 81.8%. The percentage of FD is 18.2%. The size of the data set has a great influence on this method, which is based on probability analysis. The percentage of TD is a little lower than that of the proposed method.
Table 6. The performance of the multidimensional information processing method.

                        Test Set One
The percentage of TD    81.8%
The percentage of FD    18.2%

A more detailed analysis of test set one, with the weighted Hamming distance applied, is given as follows. Two alarm lists are labeled with a wrong fault. One alarm list, the actual root fault of which is the pitch-motor driver fault, is wrongly labeled with the pitch-system communication fault. The pitch-motor driver fault and the pitch-system communication fault both belong to the faults of the pitch system. They are sensitive to the same alarms. Another alarm list, the actual root fault of which is the hub speed encoder fault, is wrongly labeled with the wind vane fault. This is because wind speed has a great influence on the hub speed. The coupling of alarms is responsible for both cases. Five root faults are not detected. All of them are wind vane faults. This is because the description of the wind vane fault in the maintenance records is not detailed and accurate. The alarm lists generated when the fault occurs are more dispersed, and the extracted feature vector cannot represent the occurrence of the fault well.
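The counts reported for test set one with the weighted Hamming distance (37 correctly labeled lists, 2 wrongly labeled lists, and 5 undetected faults out of 44) can be turned into percentages with a short helper. The function name is hypothetical; the percentages for false and missed detection are derived here from those counts.

```python
def detection_rates(n_true, n_false, n_missed):
    """Return the TD, FD, and MD percentages, rounded to one decimal place."""
    total = n_true + n_false + n_missed
    return tuple(round(100 * n / total, 1) for n in (n_true, n_false, n_missed))


td, fd, md = detection_rates(37, 2, 5)
print(td, fd, md)  # 84.1 4.5 11.4
```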
The value of $f_r$ is crucial for extracting the feature vectors of faults; $f_r$ is set to 0.5 in this paper. It is determined according to the percentage of TD. Figure 3 shows the percentage of TD for $D_{WH}$ and $D_{WE}$ with different values of $f_r$. When $f_r$ is 0.5, the percentage of TD is the greatest.

Discussion
The proposed method in this paper is based on a similarity analysis. The key steps are feature vector extraction and the selection of the weighted distance. The number of fault cases and the quality of maintenance records greatly influence the feature vector extraction. One year of maintenance records are used in this paper, and the number of repeated faults is relatively small. The method performs better with more fault cases; it can be self-optimizing with more fault cases in the fault-template database.
If the similarity between an online alarm list and each feature vector is small, we consider the root fault of this online alarm list to be unknown. There are two possible reasons for this situation. First, the root fault is not in the fault-template database. Second, the available fault cases of this fault are relatively few, so the coupling correspondence between the fault and its alarm lists is not well established. However, the identification process is not over. In this case, manual maintenance is needed, and the fault-template database is updated according to the maintenance results. In the later identification process, the identification accuracy is improved with new and more fault cases. This self-optimizing process is shown in Figure 4.
Choosing a suitable distance measure also increases the overall performance of the proposed method. Other distances, aside from the Euclidean distance and the Hamming distance, can also be used in the similarity measure.
Figure 4. The self-optimizing process of the proposed method.

Conclusions
This study proposes an online method to simplify the alarm lists generated during the occurrence of wind turbine faults, explore the alarm patterns, and identify the root faults. It does not require a time-consuming training procedure and is easy to apply. The proposed method is based on the similarity analysis between an unknown alarm vector and the feature vectors of known faults. This similarity is measured by the weighted Euclidean distance and weighted Hamming distance. The weights are determined by the alarm types and the specificity of alarms to the known faults. One year of SCADA alarms and maintenance records are used to verify the method. The results show that the performance of the weighted Hamming distance is better than that of the weighted Euclidean distance. The percentage of TD when the weighted Hamming distance is used is 84.1%, which means 37 out of 44 alarm lists are labeled with the right root fault. The proposed method can effectively assist the operator in identifying the root faults when confronted with a large number of alarms. With more fault cases, the method can be self-optimizing, and the detection accuracy can be improved in the future.
Author Contributions: Conceptualization, methodology, software, validation and writing-original draft preparation, L.W.; formal analysis, investigation, resources and funding acquisition, Z.Q.; writing-review and editing, Y.P. and J.W. All authors have read and agreed to the published version of the manuscript.