Using SCADA Data for Wind Turbine Condition Monitoring: A Systematic Literature Review

Abstract: Operation and maintenance (O&M) activities represent a significant share of the total expenditure of a wind farm. Of these expenses, costs associated with unexpected failures account for the highest percentage. Therefore, it is clear that early detection of wind turbine (WT) failures, which can be achieved through appropriate condition monitoring (CM), is critical to reduce O&M costs. The use of Supervisory Control and Data Acquisition (SCADA) data has recently been recognized as an effective solution for CM since most modern WTs record large amounts of parameters using their SCADA systems. Artificial intelligence (AI) techniques can convert SCADA data into information that can be used for early detection of WT failures. This work presents a systematic literature review (SLR) with the aim to assess the use of SCADA data and AI for CM of WTs. To this end, we formulated four research questions as follows: (i) What are the current challenges of WT CM? (ii) What are the WT components to which CM has been applied? (iii) What are the SCADA variables used? and (iv) What AI techniques are currently under research? Further to answering the research questions, we identify the lack of accessible WT SCADA data towards research and the need for its standardization. Our SLR was developed by reviewing more than 95 scientific articles published in the last three years.


Introduction
The problems associated with global warming have led the scientific community to search for new sources of energy, with renewable resources emerging as a promising alternative to conventional energy from fossil fuels. Wind energy stands out among the different renewable energy sources for its excellent characteristics, being defined as "the fastest growing renewable energy source in the world" with an average annual growth of 30% over the past two decades [1].
The last annual report published by the Global Wind Energy Council (GWEC) showed that installed wind power capacity had reached 650 GW by the end of 2019 [2] (Figure 1). The report estimated that installed wind power would have to be 800 GW by the end of 2021 and 840 GW by the end of 2022 in order to achieve the global CO2 emission reduction targets. If the wind industry continues to grow, the technology will also continue to improve, and the costs of installing wind projects will continue to decline [3–5].
Although offshore wind farms (WFs) have certain advantages over onshore ones, such as more stable and stronger winds, they operate in extreme conditions (remote marine environments). However, certain onshore WFs can also be found in harsh conditions [6], with some operating in complex terrains, extreme temperatures (very low or very high), high humidity, and hard-to-reach areas. Thus, further challenges are posed in the transport, installation, and operation of such wind turbines (WTs) [7]. The difficult access to WFs and their remoteness from control centers give rise to high costs of operation and maintenance (O&M) activities. The scientific literature reports different estimates of O&M costs for WFs. The authors in [8] stated that for a useful life of 20 years, the O&M costs of offshore WFs accounted for 30% of the total expenditure. O&M tasks are said to account for up to 23% of the total costs for offshore WFs in [9]. In [10], it was mentioned that O&M costs represented approximately 20% to 25% of the total power generation costs for offshore WFs and 10% to 15% of the total generation costs for onshore ones. The authors in [11–16] broadly agreed that WF O&M costs could account for approximately 25% to 35% of the total cost of power generation. These O&M cost percentages could be increased by recurring failures in the different components of a WT. Therefore, early detection of potential WT failures [17], as well as establishing an appropriate maintenance strategy, are imperative in the decision making of WF owners and operators. Thus, appropriate WT condition monitoring (CM) is of paramount importance.
According to [18], CM can be defined as an activity, performed manually or automatically, for observing the actual state of a component. Thus, CM can provide a reliable indication of a failure at an early stage, so that actions can be planned and downtime minimized.
Many authors have attempted to identify the most critical components in a WT. Few reliability studies, however, have been conducted to resolve this issue. The authors in [19] suggested that the control system, gearbox, electric system, generator, as well as hub and blades are the most critical components. A recent review published in [20] presented and compared thirteen reliability studies (including the breakdown of failure rate and downtime for thirteen WT sub-assemblies), an average of which has been calculated and is presented in Figure 2 below. Figure 2 shows that the electric and control systems and the others category present the highest failure rates. The gearbox is the most critical component regarding downtime, followed by the electric system, the generator, and the braking system.
Condition monitoring systems (CMSs) are used to provide data on different components of a WT based on different types of information (parameters or signals) obtained from many types of sensors. Technical solutions, application, approaches, benefits and other elements of CMSs have been extensively investigated in several studies [2,9,19–24], with most studies concluding that cost and technical complexity have limited their use. The use of Supervisory Control and Data Acquisition (SCADA) data has recently been proposed as an attractive solution for CM of WTs [25]. Most modern WTs record more than 200 variables in intervals of 1 to 10 min using their SCADA systems [14], generating rich historical data. Using appropriate data treatment solutions, the dataset can be converted into useful information for CM and, thus, WT fault prediction.
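As an illustration of the kind of data treatment mentioned above, the following minimal Python sketch averages higher-rate readings into 10 min windows, the coarsest interval cited in the text. The function name `ten_minute_means`, the variable (gearbox oil temperature), and the sample values are hypothetical and are not taken from any of the reviewed studies; the sketch also assumes that the window length divides 60.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def ten_minute_means(samples, minutes=10):
    """Average (timestamp, value) samples into fixed windows.

    Assumes `minutes` divides 60 so that windows align with the hour.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Floor the timestamp to the start of its window.
        start = ts.replace(minute=ts.minute - ts.minute % minutes,
                           second=0, microsecond=0)
        buckets[start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical 1 min gearbox oil temperature readings (degC), 20 samples.
t0 = datetime(2020, 1, 1, 0, 0)
readings = [(t0 + timedelta(minutes=i), 55.0 + 0.1 * i) for i in range(20)]

averages = ten_minute_means(readings)
for window_start, avg in averages.items():
    print(window_start.isoformat(), round(avg, 2))
```

Averaging is only one possible aggregation; SCADA systems commonly also store the minimum, maximum, and standard deviation of each window, which could be added in the same way.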
In the present work, a systematic literature review (SLR) was conducted, revealing numerous studies on CM and early failure prediction in WTs using SCADA data. The contributions of this work are summarized as follows:
1. An SLR is performed with the aim of assessing the methods and algorithms most commonly used in CM and in predicting WT failures from SCADA data. The proposed methodology for the SLR can be applied to other fields of wind energy research.
2. Four research questions are posed and answered to perform the SLR and to establish a classification for the most relevant research articles based on the following:
• The challenges of current WT CM are identified and sorted per technique and type of data used.
• The different WT components or subsystems to which CM has been applied are identified and classified for different methods.
• The most common SCADA variables used for CM of WTs are analyzed and classified per amount and sampling frequency.
• The most significant articles published in the last three years on artificial intelligence (AI) techniques used in CMSs to detect failures in WTs are included, compared, and sorted.
3. Furthermore, the authors found there is a need for accessible SCADA data from actual operating WTs for research and for the standardization of such data.
The remainder of this paper is structured as follows: Section 2 provides a detailed description of the SLR method, wherein the conceptual mind map, research questions and semantic search structure are presented; Section 3 describes the results obtained for each research question; and finally, Section 4 presents the conclusions.

Methodology
This work uses the SLR methodology proposed by Torres in [26], which is a method adapted from Kitchenham [27] and Bacca [28]. The steps are shown in the flowchart in Figure 3, and the protocol is described below.

Identification of the Need for a Review
As mentioned, wind power has experienced rapid growth over the past decade. One of the main contributors to O&M costs in WTs is unscheduled maintenance due to unexpected failures. Therefore, being able to forecast failures is critical for reducing O&M costs and maintaining the competitiveness of wind energy [29]. An exhaustive bibliographic search in the scientific literature was conducted with the aim of identifying the AI techniques based on SCADA data currently associated with the CM and early detection of WT failures. The use of SCADA data has the advantage of avoiding additional expenses from the installation and maintenance of sensors, cables, and dedicated acquisition systems. Under this premise, it was also necessary to understand which elements or systems of the WT are analyzed using CM techniques and which variables from the SCADA system of a WT are most often used for CM.

Research Questions
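According to [27], the most important activity of the SLR protocol is to pose the research questions. However, the most common shortcoming of an SLR is the lack of an explicit statement of the research questions posed [30]. To focus the bibliographic search on the current topic and area of interest, the following research questions (RQ) were formulated:
• RQ1: What are the current challenges of WT CM?
• RQ2: What are the WT components to which CM has been applied?
• RQ3: What are the SCADA variables used?
• RQ4: What AI techniques are currently under research?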

Conceptual Mind Map
A mind map is a hierarchical cognitive diagram used to organize and preserve knowledge, where fundamental ideas are embodied and secondary ideas are discarded. To this end, a conceptual mind map performs two functions; it organizes the propositions and preserves the concepts, thus, storing them in a simple hierarchical diagram [31]. According to [26], a conceptual mind map is necessary to ensure detailed reading and learning. Moreover, the conceptual mind map tool is used in pedagogy to represent concepts. Furthermore, according to [31], the mind map is an ideogram or a graphic sketch that represents something, i.e., it assumes a complex idea and conceptualizes it, and is structured by classes as follows:
• Supraordinate: refers to a type of proposition that entirely encompasses others. It serves to identify and discover the most important qualities of the concept.
• Isoordinate: establishes non-total correspondence, highlights relationships and links between adjacent propositions and links ideas with one another; proposals precede concepts and allow them to be structured.
• Infraordinate: contains several subclasses or derivations and is divided by illustration and the order in which propositions, notions, concepts and categories appear evolutionarily.
• Exclusions: these are classes that are mutually opposed or mutually exclusive and associated with the operation of excluding or denying a link between two adjacent classes.
Figure 4 presents the conceptual mind map that helped to direct the search tasks of the bibliography towards the research problem. At the top, we locate the supraordinate "wind turbine", which allows the central concept "CM" to be more specified if reading from top to bottom. On the right side, the exclusions are defined; these are concepts considered to be related to the central concept, but which differ from an operational point of view. On the left side, we locate the isoordinates, which are the concepts that helped the bibliographic search to determine the characteristics of the central concept. The infraordinates are defined at the bottom; these contain the main components of a WT to which CM techniques are applied.
Energies 2020, 13, 3132

Related Systematic Reviews
To accurately and efficiently perform a bibliographic search on Scopus and Web of Science (WOS) databases, a search script was built. The semantic structure was designed from the conceptual mind map and with the help of a scientific thesaurus. It should be noted that the basic function of thesauruses is to neutralize synonymy and polysemy, both natural characteristics of the language, which hinder the accuracy of indexing and retrieval of information [32]. Table 1 presents the semantic search structure, which consists of five levels described as follows:

Table 1. Semantic search structure.
Level | Concept | Search terms
Level 1 | Condition Monitoring | ("condition monitoring" OR "condition-based maintenance")
Level 2 | + Artificial Intelligence | AND ("machine learning" OR "AI" OR "prognostic techniques" OR "signal processing methods")
Level 3 | + Fault Detection | AND (detection W/1 (fault OR failure OR anomaly OR diagnosis))
Level 4 | + Wind Turbine | AND ("wind turbine" OR "wind turbines" OR "wind farm" OR "wind power plant")
Level 5 | + SCADA | AND ("SCADA" OR "signals" OR "real conditions")

The first level was taken directly from the conceptual mind map; the second was oriented towards searching for studies related to AI; the third was intended for the search for work related to fault detection; the fourth was oriented towards searching for WT; and finally, the fifth was directed towards searches in the SCADA systems. The five levels are directly related to the four research questions. In addition, it should be mentioned that in each of the levels, the terms provided by the thesaurus were taken into account, which allowed us to cover all terms related through synonymy.
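The five levels of the semantic search structure can be assembled into a single query string programmatically. The following Python sketch shows one way to do so; the term groups are copied from Table 1, whereas the helper name `build_query` and the choice to wrap the result in the `TITLE-ABS-KEY` field are assumptions made for illustration and do not reproduce the authors' tooling.

```python
# Search terms per level, copied from Table 1 (semantic search structure).
LEVELS = [
    '("condition monitoring" OR "condition-based maintenance")',
    '("machine learning" OR "AI" OR "prognostic techniques" '
    'OR "signal processing methods")',
    '(detection W/1 (fault OR failure OR anomaly OR diagnosis))',
    '("wind turbine" OR "wind turbines" OR "wind farm" OR "wind power plant")',
    '("SCADA" OR "signals" OR "real conditions")',
]


def build_query(levels, field="TITLE-ABS-KEY"):
    """Join the level expressions with AND and wrap them in a field tag.

    `build_query` is a hypothetical helper, not part of any database API.
    """
    return f"{field}({' AND '.join(levels)})"


query = build_query(LEVELS)
print(query)
```

Keeping each level as a separate list entry makes it straightforward to add or drop a level, for example to test how strongly a single group of terms narrows the result set.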
Moreover, the search script can help to determine the existence of related SLRs on the subject under research and to gather scientific literature to answer the research questions. Some exclusion criteria were also taken into account in the development of the script. First, the search was restricted to studies developed in recent years (2017-2020) and only to those published in English. In addition, items that did not belong to the area of interest or related areas were excluded. Table 2 shows the search script used.

Table 2. Search script.
TITLE-ABS-KEY(("condition monitoring" OR "condition-based maintenance") AND ("machine learning" OR "AI" OR "prognostic techniques" OR "signal processing methods") AND ("wind turbine" OR "wind turbines" OR "wind farm" OR "wind power plant") AND (slr OR review OR survey OR "meta-analysis") AND (detection W/1 (fault OR failure OR anomaly OR diagnosis)) AND ("SCADA" OR "signals" OR "real conditions") AND

With the script detailed in Table 2, the search on Scopus resulted in four articles listed as SLRs [19,22,33,34], as shown in Figure 5.
In [33], the authors suggested that the main limitation to progress was a lack of large public datasets where new models could be developed, evaluated, and compared. The studies reviewed in this article included analyses of different components of the WT and common ways of performing CM. The authors also explained that most of the models used for WT CM in the literature consulted used SCADA, simulated or, rarely, experimental data. The recent literature (after 2011) on machine learning (ML) models used for WT CM was reviewed. It was concluded that artificial neural networks (ANNs), support vector machines (SVMs), and decision trees are the most commonly used.
The authors in [34] presented a concise review of the current effort in the field of prognosis and remaining useful life estimation methods applied specifically to components of WTs and identified the areas in need of development in the field of prognostic techniques applied to WT (e.g., a platform to collect and transfer SCADA data, share information, and execute analysis). The work also described the critical components of the WTs and a list of references classified in the following order: author, year of publication, technique, and input SCADA parameters. Finally, ANN and particle filter were found to be the most promising techniques, given the maturity stage and the results obtained.
The review presented in [19] showed that there was still a significant research gap in developing comprehensive, cost-effective, online CM techniques for WTs. This paper presented a comprehensive review of common faults, signals, and signal processing methods for CM and fault diagnosis of WT components. An extensive analysis of conventional CM techniques applied to several WT components was also presented. The aim was to provide the reader with the overall features of WT CM and fault diagnosis, including various potential fault types and locations along with the signals to be analyzed with different signal processing methods.
The work in [22] reviewed supervised and unsupervised applications of ANNs and, in particular, deep learning (DL) for CM of WTs. ANNs were frequently used as an ML tool for this purpose and DL was a deep neural network (DNN)-based ML paradigm that has proven to be highly successful in various applications in recent years. The main finding was that despite a promising performance of supervised methods, unsupervised approaches prevailed in the literature. In addition, the application of DL to SCADA data was found to be limited by its relatively low dimensionality. Thus, ways to work with larger SCADA data were suggested.
In Table 3, we present a summary of the main characteristics and findings in relation to the RQs presented in this study. It can be concluded that the literature review articles mentioned do not respond to the research questions posed, which evidences the relevance of our SLR.

Definition of Inclusion and Exclusion Criteria
The general inclusion criteria were as follows:
• Studies related to CM and early WT failure detection;
• Studies published between 2017 and 2020;
• Studies in English.
The specific inclusion criteria were as follows:
• Studies describing the architecture of ML models, classification methods (supervised and unsupervised), as well as AI techniques applied to early detection of failures in WTs;
• Studies detailing which WT components are most frequently submitted to CM techniques;
• Studies detailing the variables acquired by SCADA that are most often used in the training of ML models;
• Studies using SCADA system variables for CM and early WT fault detection.
The following exclusion criteria were also considered:
• Fault prediction studies using parametric methods;
• Publications that are not considered scientific journal articles, including editorials, book reviews, technical reports, datasets, etc.
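The criteria above can be read as a screening predicate applied to each candidate record. The following Python sketch is a simplified illustration under stated assumptions: the `Study` record, its field names, and the sample entries are hypothetical, and the four specific inclusion criteria are collapsed into a single `uses_scada` flag for brevity. In practice, screening of this kind is usually performed manually on titles and abstracts.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical bibliographic record used only for this illustration."""
    title: str
    year: int
    language: str
    doc_type: str          # e.g. "article", "editorial", "technical report"
    uses_scada: bool       # stands in for the specific inclusion criteria
    parametric_only: bool  # fault prediction based on parametric methods


EXCLUDED_TYPES = {"editorial", "book review", "technical report", "dataset"}


def is_included(s: Study) -> bool:
    """Apply the general, specific, and exclusion criteria in turn."""
    general = 2017 <= s.year <= 2020 and s.language == "English"
    specific = s.uses_scada
    excluded = s.parametric_only or s.doc_type in EXCLUDED_TYPES
    return general and specific and not excluded


# Hypothetical candidate records.
candidates = [
    Study("ANN-based gearbox monitoring", 2019, "English", "article", True, False),
    Study("Editorial on wind O&M", 2018, "English", "editorial", True, False),
    Study("Vibration-only study", 2015, "English", "article", False, False),
]
selected = [s.title for s in candidates if is_included(s)]
print(selected)
```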

Definition of Categories
According to [26], it is necessary to define a series of categories of analyses that allow the studies to be grouped together to answer the research questions. For RQ1, the following categories were considered:
• Challenges and new trends, which included sensorless and non-intrusive CM systems, real-time CM methodologies, online CM techniques, large public SCADA datasets, standardization of the SCADA data, public high-resolution SCADA data, and assessment methods for monitoring the overall conditions of WTs.
For RQ2, the following categories were considered:
• The components or subsystems of the WT to which CM techniques can be applied, which included gearbox, electrical and electronic components, blades and pitch angle, tower, drive train and bearing, yaw system and hydraulic system;
• Measurement techniques used, which included temperature-based, SCADA data, vibration signals, electrical signals, acoustic emission, strain measurements, and other non-destructive testing.
For RQ3, the following categories were considered:
• Variable names, which included wind speed, rotor speed, nacelle temperature, active power, alarms and faults, pitch angle, generator winding or bearing temperature, transformer temperature, gearbox temperature, yaw, environment temperature, other electrical signals, wind direction;
• Sampling frequencies of one second, one minute, five minutes, ten minutes, etc.;
• The analysis periods of one year, longer than one year, and shorter than one year.
For RQ4, the following categories were considered:
Table 4 shows the list of journals consulted and their principal metrics, such as the impact factor (IF), the location quartile according to JCR (Journal Citation Report) and SJR (Scimago Journal Rank), and the value according to the Google Scholar h5 indicator. We also detailed the number of articles consulted from each journal and the order of importance, which was calculated using the following expression:

Review Report
This section presents the results of the bibliographic search, using tables and graphs containing the research questions and the categories associated with each question, the bibliographic references, and the frequency of appearance.

RQ1: What Are the Challenges of WT CM?
RQ1 was proposed with the aim of understanding the challenges and research trends in the current field of interest. As can be seen in Table 5, a significant number of the consulted works state the need for a large publicly accessible set of data from the SCADA system. In this context, the authors in [33] stated that a major obstacle to the development of CM was the lack of large public datasets where new models could be developed, evaluated, and compared. In addition, the authors in [7] stated that the main reason for requiring an open SCADA and CMS data platform for academics and manufacturers was the difficulty in verifying theoretical research caused by insufficient field data. As shown in Figure 6, "large public SCADA datasets" is the category with the highest frequency, appearing in six of the articles consulted, followed by the categories "real-time CM methodologies", "online CM techniques" and "standardization of the SCADA data" with two articles each.

Figure 6. Distribution of number of papers on current challenges and trends for CM of WTs.

RQ2: What Are the Main Components and/or Subsystems of the WTs to Which CM Techniques Are Applied?
CM of WT components has emerged as an important field of study in recent years. The goal is to increase the life expectancy of the components while reducing their operating and maintenance costs [40]. Articles related to RQ2 are presented in Table 6. According to the literature consulted, a large percentage of works use data collected by the SCADA system for the CM and diagnosis of WT failures. The gearbox and pitch systems are the WT components that appear most frequently in the reviewed studies associated with CM.
The gearbox appears frequently because it has a high failure rate and causes long WT downtime [95]. For example, in [48][49][50], different AI techniques such as support vector regression (SVR), SVM, and DNN were used in CM to identify anomalies and predict gearbox failures from gearbox temperature and oil pressure data acquired by the SCADA system. Another approach to predicting the faults and remaining useful life of gearboxes was presented in [58], where the failure was predicted up to a month before it occurred by applying ML techniques to large amounts of SCADA data. Further, a new approach was proposed in [53], where automatic text mining was used as an emerging and complementary tool to predict possible failures in WTs from the automatic processing of O&M information. The generator and gearbox were analyzed using the relevant words from the WT service history. The experimental results were promising, with the accuracy and F-score above 90% in some cases.
Figure 7 shows that 68 of the consulted articles refer to SCADA as a technique associated with the CM of WTs, confirming the findings of [25], where it was mentioned that CMSs were installed with the aim of providing specific information on WT components to WF operators and that the use of SCADA data was a possible solution for CM due to its availability.

RQ4: Which AI Techniques Are Currently under Research for WT CM?
The results related to RQ4 are presented in Table 8. It can be observed that ANN is the most frequently used technique in the consulted studies. This result is consistent with [83], where a method for sensor validation and fault detection was proposed. In that work, different ANN architectures were built, having been chosen because of their high performance in nonlinear environments. The results showed that the proposed method was feasible and effective for sensor fault detection and isolation.
In relation to the "computer tools" category, the most commonly used software is MATLAB (R2018a, MathWorks, Natick, MA, USA). In addition, it is important to assess the accuracy of the methods used. The metrics most frequently found in the consulted articles were the root mean square error (RMSE) and the mean absolute error (MAE). In some articles, more than one metric was used to evaluate the accuracy of the proposed models, as can be found in [16,92].
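As a brief illustration of the two metrics, the following minimal sketch computes RMSE and MAE for a model's predictions. The function names and the temperature values are hypothetical and are not taken from any of the reviewed articles.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical gearbox oil temperatures (degrees Celsius):
# values recorded by SCADA versus values estimated by a model.
measured = [60.0, 62.0, 65.0, 63.0]
estimated = [61.0, 62.0, 63.0, 66.0]

print(f"RMSE = {rmse(measured, estimated):.3f}")
print(f"MAE  = {mae(measured, estimated):.3f}")
```

Because RMSE squares each error before averaging, it penalizes occasional large deviations more strongly than MAE, which is one reason the two metrics are often reported together.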
Additionally, in [12], a model for detecting incipient failure conditions of a WT based on SCADA data was proposed. The prediction model was developed using different data mining algorithms, namely the back propagation neural network (BPNN), the radial basis function neural network (RBFNN), and the least-squares support vector machine (LSSVM). The proposed method was applied to actual 1.5 MW WTs, and its effectiveness was verified. Meanwhile, in [81], a regression model based on the SVM was proposed to estimate the pitch angle curve of the blade from the SCADA data, and its application to the detection of WT anomalies was explored. Finally, the advantages and limitations of these techniques were presented. In [71], a new approach to WT fault detection based on quantile regression neural networks (QRNN) was proposed under the framework of a normal behavior model. On the basis of the acquired SCADA data, the QRNN model was found to outperform multiple linear regression (MLR) and BPNNs in terms of the MAE.
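The normal behavior model idea can be sketched as follows: a model is fitted on data from healthy operation, and new observations whose residuals exceed a threshold are flagged. This is only a minimal sketch under stated assumptions. An ordinary least-squares line stands in for the neural network and SVM models of the cited works, the three-sigma threshold is an illustrative choice, and all data values and names are hypothetical.

```python
import statistics


def fit_line(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_anomalies(train_x, train_y, new_x, new_y, k=3.0):
    """Flag new samples whose residual exceeds k standard deviations
    of the residuals observed on the healthy training data."""
    slope, intercept = fit_line(train_x, train_y)
    train_residuals = [y - (slope * x + intercept)
                       for x, y in zip(train_x, train_y)]
    threshold = k * statistics.pstdev(train_residuals)
    return [abs(y - (slope * x + intercept)) > threshold
            for x, y in zip(new_x, new_y)]


# Hypothetical healthy-period data: active power (kW) versus
# gearbox bearing temperature (degrees Celsius).
healthy_power = [200, 400, 600, 800, 1000, 1200]
healthy_temp = [50.5, 54.8, 60.3, 64.9, 70.2, 74.6]

# Hypothetical new observations; the last one runs unusually hot.
new_power = [500, 900, 1100]
new_temp = [57.6, 67.4, 80.0]

flags = flag_anomalies(healthy_power, healthy_temp, new_power, new_temp)
print(flags)  # [False, False, True]
```

In practice, the reviewed studies use several input variables and nonlinear models, and the threshold is usually tuned to balance false alarms against missed detections.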
As can be seen in Figure 9, ANN is the most commonly applied ML approach, appearing in 35 reviewed papers, followed by SVM (23 articles) and DL (11 articles). In addition, it is important to note that ANNs and SVMs have undergone rapid development over the past decade, driving their application for the diagnosis of WT failures [96].

Conclusions
On the basis of the scientific literature reviewed by the authors, it can be concluded that the most frequently found AI techniques for CM and WT fault prediction are ANN and SVM, appearing in 39% and 27% of the total articles, respectively. As explained in the Introduction section, different ANN architectures have been successfully used in predicting the faults of different components of the WT for their high performance in modeling nonlinear environments. In addition, RMSE is the most commonly used failure prediction model evaluation metric.
Early fault detection and the estimation of the remaining useful life of the gearbox using AI techniques are widely researched topics. In this SLR, 26% of the consulted articles refer to the gearbox. This is largely because failure of this component, despite not being the most frequent, causes approximately 20% of total WT downtime.
CM of WTs is a widely researched field, and thus there are many commercial solutions, techniques, and methods available in the market. However, the cost and complexity involved in installing additional equipment in the WT have limited their use. The rapid development of AI techniques in recent years as a tool for CM and the diagnosis of WT failures has been proposed as a technically and economically viable solution. The economic aspects are especially attractive if the data provided by the SCADA system can serve as the main input for CM, thus avoiding the need for extra sensors, acquisition equipment, etc.
To further analyze CM of WTs based on SCADA data, it is necessary for owners and operators of WFs to provide detailed information and high-frequency data. Ideally, this information should be made accessible to researchers, even if under confidentiality agreements. In addition, it is necessary to mention the lack of standardization in terms of how the SCADA system reports and event data (such as the names of the different variables) are gathered. Thus, given the large number of WT manufacturers, it is necessary to standardize the SCADA system data with the aim of facilitating traceability of CM methods and techniques.