Detecting and Mitigating Adversarial Examples in Regression Tasks: A Photovoltaic Power Generation Forecasting Case Study

Abstract: With data collected by Internet of Things sensors, deep learning (DL) models can forecast the generation capacity of photovoltaic (PV) power plants. This functionality is especially relevant for PV power operators and users, as PV plants exhibit irregular behavior related to environmental conditions. However, DL models are vulnerable to adversarial examples, which may lead to increased predictive error and wrong operational decisions. This work proposes a new scheme to detect adversarial examples and mitigate their impact on DL forecasting models. The approach is based on one-class classifiers and features extracted from the data inputted to the forecasting models. Tests were performed using data collected from a real-world PV power plant along with adversarial samples generated by the Fast Gradient Sign Method under multiple attack patterns and magnitudes. One-class Support Vector Machine and Local Outlier Factor were evaluated as detectors of attacks against Long Short-Term Memory and Temporal Convolutional Network forecasting models. According to the results, the proposed scheme showed a high capability of detecting adversarial samples, with an average F1-score close to 90%. Moreover, the detection and mitigation approach strongly reduced the prediction error increase caused by adversarial samples.


Introduction
Wind and solar energy are the most acceptable and promising resources of renewable energy due to their potential and availability. In particular, photovoltaic (PV) facilities have experienced an enormous technological advance over the last few years, exploiting the advantages of using recent architectures such as Internet of Things (IoT) and Cloud Computing [1]. IoT sensors can collect variables such as weather conditions, system temperature, and generated power in PV power plants, which may indicate faults and contribute to the understanding of the plant's generation capacity. By accessing this information online, operators can be prepared for promptly handling unexpected events and variations [2].
Machine learning (ML) is an important building block for successfully integrating PV power plants into smart grids. ML algorithms can underpin solutions to analyze and predict the power grid behavior from data collected by IoT sensors. In PV systems, alongside other goals, ML has been explored to forecast their generation capacity. Accurate generation predictions make power grids more reliable amid fluctuations in demand and capacity, avoid power outages, prevent plant managers from penalties, and save costs [3]. More specifically, deep learning (DL) models have been applied to forecast PV power generation with encouraging results [4,5]. Although the use of these forecasting models contributes to more active, flexible, and intelligent smart grids [6], they may be vulnerable to adversarial examples. In these attacks, adversaries add maliciously crafted noise to legitimate input samples, driving the DL model to make wrong predictions [7]. This fact draws attention to the physical and cyber security of this kind of facility [8], especially when considering that industry practitioners are not equipped with measures to protect, detect, and respond to attacks on their ML models [9].
Different schemes [10][11][12] have been proposed recently to defend ML algorithms against adversarial examples. Studies [10,11] made use of adversarial training. In this technique, the data used for model training include adversarial samples especially crafted to make the model more resilient against this kind of attack. Conversely, Abdu-Aguye et al. [12] proposed an approach that detects adversarial samples during the test phase. Despite their encouraging results, these studies focused only on protecting ML models designed for classification tasks. The literature still lacks defense schemes for regression models, which can also be deeply affected by these attacks. In PV systems, these attacks represent a severe threat. An adversarial sample might make the forecasting model predict a much higher or lower generation capacity than the correct one. As operators use these predictions to coordinate multiple power plants that operate together to meet the energy demand, a high prediction error will eventually lead to wrong decisions, which can cause large-scale failures [13].
This work proposes a novel scheme to detect and mitigate adversarial samples inputted into DL regression models that forecast PV power generation. First, the approach extracts multiple features from the inputs forwarded to the forecasting system. These inputs are observations about the power plant generation capacity over time. The extracted features range from basic statistics, such as minimum, maximum, and mean, to spectral measures, such as the Hurst exponent. Their objective is to build a time series profile that allows for distinguishing natural observations from maliciously crafted ones. Then, a one-class classifier is employed to classify the feature vector as legitimate or malicious. If malicious behavior is detected, the observations are replaced by the last set of observations classified as legitimate. This means that the approach mitigates the attack, preventing the adversarial samples from reaching the forecasting system. The results showed that the proposed scheme could detect most of the adversarial samples and significantly reduce the error increase caused by the attacks.
The main contributions of this paper are: The remainder of this paper is organized as follows: Section 2 presents the background on time series and adversarial ML along with the related work. In Section 3, the proposed approach is discussed. Section 4 shows the materials and methods used during the proposal's evaluation, while Section 5 discusses the results. Finally, Section 6 draws the final conclusions.

Time Series
A time series is a data sequence over a particular period, in which different values can be produced at distinct moments in time. Formally, it can be defined as an ordered set of observations X = [x_1, x_2, . . . , x_T], in which T corresponds to the length of the series [14]. The forecasting task consists of finding a function f that predicts the h-th future value at any time t, i.e., x_{t+h}, based on i past values:

x_{t+h} = f(x_t, x_{t-1}, . . . , x_{t-i+1}),    (1)

where i represents the input window size and h the forecast horizon. When the latter is equal to one, the forecasting task is referred to as a one-step-ahead forecast. Otherwise, it is known as a multi-step-ahead forecast. In a supervised training of f, t must also satisfy the condition t ≤ T − h. Moreover, time series can present seasonality, which occurs when regular patterns are captured in the series. Seasonal events are phenomena that occur, for instance, daily at a certain time, every day, or in a certain month every year.
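As an illustration, the supervised (input, target) pairs used to train f can be built with a simple sliding window over the series; the function name and toy series below are ours, not from the paper:

```python
import numpy as np

def make_windows(series, i, h):
    """Build (input, target) pairs for supervised forecasting.

    Each input holds i consecutive past values; the target is the
    value h steps ahead of the window's last element (illustrative sketch).
    """
    X, y = [], []
    for t in range(i - 1, len(series) - h):
        X.append(series[t - i + 1 : t + 1])   # window ending at time t
        y.append(series[t + h])               # value h steps ahead
    return np.array(X), np.array(y)

series = np.arange(10.0)                      # toy series: 0, 1, ..., 9
X, y = make_windows(series, i=3, h=1)
# First pair: inputs [0, 1, 2] -> target 3 (one-step-ahead forecast)
```

With h > 1 the same construction yields the multi-step-ahead targets described above.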

Adversarial Machine Learning
Solutions that rely on ML might suffer attacks based on adversarial examples [7]. Adversarial inputs are very similar to benign ones but tailored to maximize the model's prediction error. Three aspects of these attacks are worthy of discussing in this section: their classification, adversarial example generation, and defense strategies.

Attack Classification
An attack may be classified according to its specificity. In targeted attacks, the attacker focuses on specific system instances (e.g., specific users, periods, or inputs). Conversely, in untargeted attacks, the attacker aims at any instance, attacking indiscriminately.
Adversarial examples can be used at the training or test phases of the ML pipeline. A poisoning attack, also known as causative, occurs when the attacker can access and modify training data. Data access attacks are also related to the training phase but are more restricted. In these attacks, the attacker can access but not modify the training data. They may then use the retrieved data to induce substitute learning models useful for attacks in the test phase. An exploratory attack occurs when the attacker can modify only the test data [15].
The attacker's knowledge is another relevant feature, which might differ according to the level of access to the system components: training data, feature space, and learning algorithm. The latter may also involve knowledge of the loss function and the trained hyper-parameters. The attacker's knowledge can be classified according to the access to these three components [16]:
• A white-box attack implies that the attacker has access to the entire set of components.
• A black-box attack implies that the attacker lacks substantial knowledge about the system components.
• A gray-box attack lies between the previous two. In this case, the attacker may have partial access to the training data, knowing the training algorithm or the feature space.
When the attacker lacks knowledge about the learning algorithm, an alternative is defining a surrogate/substitute model. This leads to the concept of transferability, which means that adversarial examples designed for a specific model can also affect another model [17].

Adversarial Examples Generation
The attacker generates an adversarial input to fool the ML model based on their knowledge about the target. By accessing training data or gathering information about the model, the attacker can craft inputs that look like legitimate ones but carry a perturbation specially designed to exploit the model's vulnerabilities [18]. Most methods for crafting adversarial examples were originally designed for images. The Fast Gradient Sign Method (FGSM) [19] is one of the most notable, being also the basis for later methods [20,21]. In an FGSM attack, the perturbation η is given by Equation (2):

η = ε · sign(∇_x J(θ, x, y)),    (2)

where ε corresponds to the coefficient that controls the perturbation magnitude, x to the input to the model, y to the output associated with x, θ to the weights of the adversarial model, and J(·) to the loss function. The malicious sample to be inputted to the target results from adding η to x. FGSM is computationally cheap since it only needs the gradient sign, which can be quickly obtained. Although it was designed to compute adversarial image perturbations, Santana et al. [13] showed that FGSM is also effective at crafting adversarial examples that degrade the prediction performance of DL models in PV power generation forecasting.
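As a minimal sketch, the FGSM perturbation can be computed in closed form for a linear model under squared-error loss, where the gradient with respect to the input is known analytically; the weights and input below are hypothetical, not taken from the paper:

```python
import numpy as np

def fgsm_perturbation(x, y, w, eps):
    """FGSM for a linear model f(x) = w . x under squared-error loss.

    grad_x J = 2 * (w . x - y) * w; the perturbation keeps only the
    gradient sign, scaled by eps (illustrative sketch).
    """
    grad = 2.0 * (np.dot(w, x) - y) * w
    return eps * np.sign(grad)

w = np.array([0.5, -0.3, 0.8])    # hypothetical substitute-model weights
x = np.array([1.0, 2.0, 3.0])     # legitimate input window
y = 1.0                           # true output for x
eta = fgsm_perturbation(x, y, w, eps=0.1)
x_adv = x + eta                   # adversarial example: x plus the perturbation
```

For DL models the gradient would come from automatic differentiation instead, but the sign-and-scale structure of Equation (2) is the same.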

Defense Approaches
In the image processing literature, defense approaches include network distillation [22], adversarial retraining [23], randomisation [24], denoising [25], and adversarial example detection during test time [26]. The main idea behind adversarial example detection relies on training a classifier to detect adversarial inputs, distinguishing them from legitimate ones.
ML algorithms might be used for this task. They have been successful in detecting attacks in traditional computer systems and are also promising for detecting attacks in smart grids and in time series [27][28][29]. In this kind of proposal, data about the target's behavior are gathered and then used to feed an ML-based classifier, which learns to classify the behavior as malicious or legitimate according to its characteristics [30,31]. As the type of attack changes, so does the type of analyzed data. For example, to detect network-based attacks, variables related to network traffic, such as the packets-per-second rate or average packet size, are investigated. False data injection in smart grids, on the other hand, can be detected through the analysis of measurements of the power grid state, such as current flow and voltage magnitude. In short, these proposals gather data that are sensitive to an attack and use them to distinguish legitimate from malicious instances. The same rationale may be successfully applied to detect adversarial examples in the domain of PV generation.
Addressing the topic of adversarial examples in smart grids and time series, Chen et al. [32] evaluated the adversarial examples impact on feed-forward Neural Network (NN) and Recurrent Neural Network (RNN) models for simulated data on power quality classification and load forecasting, respectively. Based on the results, the authors encouraged more discussion towards increasing the robustness of models implemented in power systems.
Fawaz et al. [7] adapted FGSM and Basic Iterative Method to univariate time series classification and performed attacks against DL models. These attacks achieved an average reduction in the model's accuracy of 43.2% and 56.89%, respectively, and the experiments showed that FGSM allows real-time adversarial sample generation. The authors claim that their work is the first to consider the vulnerability of DL models concerning time series examples.
Niazazari and Livani [10] performed attacks on a multiclass Convolutional Neural Network (CNN) trained on simulated data. The targeted model classifies power grid events such as line energization, capacitor bank energization, or fault. The attacks were generated using FGSM and Jacobian-based Saliency Map Attack (JSMA) algorithms and showed significant potential to make the CNN-based model misclassify the tampered input.
Karim et al. [11] proposed using an adversarial transformation network to attack 1-Nearest Neighbor Dynamic Time Warping (1-NN DTW) and Fully Convolutional Network (FCN) models, trained on 42 classification datasets, showing their susceptibility to adversaries. They used the retraining defense strategy to improve the models' robustness. Abdu-Aguye et al. [12] proposed using a One-class Support Vector Machine (OCSVM) to classify samples as original or perturbed. The work was based on the attacks and datasets presented in [7]. The authors claimed to reach 90% detection accuracy on most datasets and up to 97% in the best case. Table 1 summarizes the comparisons among the reviewed studies.

This work addresses an important limitation found in the reviewed literature: the lack of protection for regression models. In other words, most research on adversarial ML is devoted to tackling classification, usually focused on image processing tasks. As observed in a previous work [13], adversarial examples can also affect regression models, which are usually the core of PV generation forecasting. Among the related works, only Chen et al. [32] addressed this possibility, but they did not propose a defense solution against these attacks. All other proposals are aimed at attacks against classification models.

Proposed Approach
Attacks involving adversarial examples against forecasting models consist of multiple steps. They begin with the attacker exploring any vulnerability that allows for accessing training data. Using these data, the attacker induces a model to craft malicious perturbations. Then, the attacker needs to find a breach to tamper with the data inputted into the forecasting model. After achieving this goal, the attacker can add malicious perturbations into the input data and complete the attack. As this attack requires breaking into multiple systems through various steps, multiple defense mechanisms are needed to tackle it. Detecting malicious inputs and preventing forecasting models from processing them may provide protection when other defense lines have already been violated.
This work proposes an approach based on one-class classifiers to detect and replace malicious inputs over power generation data in a PV plant. It assumes that adversarial examples can be distinguished from legitimate ones because they are intentional anomalies [33]. Even malicious inputs crafted to be as similar as possible to legitimate ones might carry distinguishable characteristics. Figure 1 provides an overview of the approach. When new data instances from the power plant are forwarded to the Generation Forecasting Module, they are first assessed by the Attack Detection Module. This module organizes the data instances in windows of length i, which is the input window length of the forecasting model, as explained in Section 2.1. Then, the Attack Detection Module extracts the following features from each window: Minimum, Mean, Median, Maximum, Standard deviation, Ratio between Mean and Maximum, Ratio between Minimum and Maximum, Entropy, Correlation, Detrended fluctuation analysis (DFA), and Hurst exponent. They make up a statistical profile of each window, which is intended to highlight the differences between legitimate and maliciously crafted data.
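A sketch of the window-profiling step might look as follows. Only the basic statistical features are implemented here, since DFA and the Hurst exponent require dedicated estimators; the lag-1 autocorrelation stands in for the paper's correlation feature and is our simplification:

```python
import numpy as np

def extract_features(window):
    """Statistical profile of one input window (subset of the paper's
    feature set; entropy, DFA, and the Hurst exponent are omitted
    from this sketch as they need dedicated estimators)."""
    w = np.asarray(window, dtype=float)
    mx = w.max()
    return {
        "min": w.min(),
        "mean": w.mean(),
        "median": np.median(w),
        "max": mx,
        "std": w.std(),
        "ratio_mean_max": w.mean() / mx if mx else 0.0,
        "ratio_min_max": w.min() / mx if mx else 0.0,
        # lag-1 autocorrelation as a simple correlation feature
        "corr": np.corrcoef(w[:-1], w[1:])[0, 1] if w.std() else 0.0,
    }

feats = extract_features([0.0, 1.0, 2.0, 3.0, 4.0])
```

In the full scheme, the resulting vector is what the one-class classifier consumes.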
After being extracted, the feature vector feeds an ML-based detector, more specifically, a one-class classifier. This kind of ML model is usually employed for anomaly detection. The most important characteristic of one-class classifiers is that they require samples from only one class for training. In this work, the one-class classifier is trained using only legitimate data. As it might be hard to find samples of malicious data, this aspect of one-class algorithms is particularly useful for the proposed approach.
The one-class classification model then analyzes the feature vector extracted from the input window and classifies it as legitimate or malicious. When a malicious input is detected, the window is replaced by the most recent window classified as legitimate. Therefore, it prevents malicious data from being forwarded to the Generation Forecasting Module, while ensuring that the forecasting process keeps receiving inputs. Finally, the Generation Forecasting Module employs a DL model to make the predictions.
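Putting the detection and replacement logic together, the module's control flow can be sketched as follows; the detector, forecaster, and raw-window "features" here are stand-ins for the real components, not the paper's implementation:

```python
import numpy as np

class _ConstDetector:
    """Stand-in detector: +1 = legitimate, -1 = malicious (sklearn convention)."""
    def __init__(self, verdict):
        self.verdict = verdict
    def predict(self, X):
        return [self.verdict]

def guarded_forecast(window, detector, forecaster, state):
    """Classify the window; on a malicious verdict, substitute the most
    recent window judged legitimate before forecasting (sketch)."""
    feats = np.asarray(window, dtype=float).reshape(1, -1)
    if detector.predict(feats)[0] == 1:
        state["last_legit"] = window        # remember legitimate input
    else:
        window = state["last_legit"]        # mitigation: replace malicious input
    return forecaster(window)

state = {"last_legit": [1.0, 2.0, 3.0]}
mean_forecast = lambda w: sum(w) / len(w)   # toy forecaster
clean = guarded_forecast([4.0, 5.0, 6.0], _ConstDetector(+1), mean_forecast, state)
blocked = guarded_forecast([9.0, 9.0, 9.0], _ConstDetector(-1), mean_forecast, state)
```

Note that the forecaster always receives some input, so predictions keep flowing even while an attack is being blocked, mirroring the behavior described above.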

Dataset
The power generation samples are obtained from a PV plant that started to operate in November 2019 at the State University of Londrina campus (Brazil). This power plant is a typical IoT system that contains sensors connected to the Internet through a wireless network and transmits data to be processed in the cloud. More specifically, the plant has 1020 solar panels and sensors that collect observations about solar power generation every 15 min. Thus, 96 observations about the plant performance are collected each day. The plant's generation capacity is 489.6 MWh/year. Some variations in the collected samples may occur, primarily due to two factors. Firstly, they depend on the weather conditions: samples collected on rainy or cloudy days show a considerable disparity compared to those collected on sunny days. Secondly, the quality of the collection is also subject to interference from dirt that can accumulate on the solar panels, such as leaves from the trees that surround the PV plant. These variations are also meaningful for calibrating the forecasting models.
All collected data are transmitted online to a private cloud maintained by the plant vendor, where the data are stored and can be accessed for operation and control. For the training of forecasting models, data from December 2019 to June 2020 were chosen. Moreover, 20% of the training data was used for hyper-parameter tuning of each model. For testing, data from July and August 2020 were employed.

Threat Model
Our threat model is based on targeted gray-box attacks that use FGSM for crafting adversarial examples. A targeted attack means that not all inputs are maliciously manipulated. The attacker picks specific inputs, or targeted instances, to manipulate according to some criteria. Modeling the behavior of attackers is an intricate task, and exhaustive options are possible. For practical purposes, three different patterns were defined to select attacked instances: (1) Random: the attacker picks the targeted instances at random. In this pattern, the attack can be confused with the plant's intrinsic noise; (2) Intermittent (inter): every targeted instance is followed by a non-attacked instance and vice versa. In [34], this pattern proved hard to detect with an estimation-based detector; (3) Sinusoidal (sin): a group of targeted instances is followed by a group of non-attacked instances and vice versa. The sine function takes the instance index (in radians) as its argument; if the result is negative or zero, the instance is attacked. This pattern corresponds to a smoother variation of the intermittent pattern. Figure 2 depicts the attack patterns.

To understand how gray-box FGSM attacks can be launched against a forecasting model, it is necessary to recall first that the models addressed in this work have a training and a test phase. During the training phase, they use the training data to induce a regression model F. Then, during the test phase, they make predictions by using historical data inputted to F. In gray-box attacks [15], the attacker has limited knowledge about the target. Following this idea, it is assumed that the attacker can access a significant portion of the training data but has no knowledge about the model F induced by the target. Moreover, the attacker cannot modify the training data but can tamper with inputs during the test phase.
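The three selection patterns described above can be sketched as boolean masks over instance indices; the 50% rate for the random pattern is our assumption, as the paper does not fix it:

```python
import numpy as np

def attack_mask(n, pattern, rng=None):
    """Boolean mask of attacked instances for the three patterns (sketch)."""
    idx = np.arange(n)
    if pattern == "random":
        rng = rng or np.random.default_rng(0)
        return rng.random(n) < 0.5          # assumed 50% attack rate
    if pattern == "inter":                  # alternate attacked / non-attacked
        return idx % 2 == 0
    if pattern == "sin":                    # attack while sin(index) <= 0
        return np.sin(idx) <= 0
    raise ValueError(pattern)

mask = attack_mask(8, "inter")
# attacked indices: 0, 2, 4, 6
```

The sinusoidal mask produces runs of attacked and clean instances, the smoother variation of the intermittent pattern mentioned above.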
To overcome the lack of knowledge about F, the attacker induces a substitute model F′, exploiting cross-technique transferability. This means that the attacker can analyze the attack circumstances and choose the algorithm that best fits their need to induce F′. In this work's scenario, an attacker could install malware or plug in a rogue device at different points, ranging from the PV power plant to the cloud-based servers. If the algorithm behind the attack is a big consumer of CPU, memory, disk, or network resources, the defense systems that monitor these parameters can detect it. In this sense, employing a lightweight solution is an attacker's strategy to stay unnoticed. A costly ML algorithm can also make the requirements to run the attack very strict, hindering its execution. Because it is simpler than DL, has been successful in other adversarial scenarios [17], and is still differentiable, logistic regression (LR) was adopted to build the substitute model.
Based on this strategy, the attacker uses the first half of their training data to build F′. Then, to compute the perturbation η in Equation (2), the attacker uses the second half of the training data as x and y along with the F′ model. After calculating η, the attacker is ready to manipulate inputs and generate adversarial examples in the forecasting model's test phase. For a given legitimate input x_test that is forwarded to the forecasting model, the attacker will make an adversarial sample x′_test according to Equation (3):

x′_test = x_test + η.    (3)

This adversarial example x′_test is inputted to the forecasting model instead of x_test, increasing F's prediction error. During the experiments, the ε value in Equation (2) was varied from 0.05 to 2 in steps of 0.05. In FGSM attacks, ε determines the attack magnitude.
As for the one-class classifier in the Attack Detection Module's core, OCSVM and Local Outlier Factor (LOF) were explored. This kind of classifier has been successfully applied to fault detection in smart electric power systems [6]. OCSVM [36] creates hyperplanes (n-dimensional planes) that set boundaries around a region containing as much of the training data as possible. By doing so, OCSVM can identify whether an instance lies within this area. LOF estimates a score, named Outlier Factor, which reflects the level of abnormality of each observation in a dataset [37]. It works based on the idea of local density: the k-Nearest Neighbors algorithm is applied to the data, and each data instance is given a locality, which is used to estimate the clusters' density.
During the experiments, the OCSVM hyper-parameter ν varied from 0.1 to 0.4 in steps of 0.05. As for LOF, the contamination alternated between 2.5 × 10⁻⁴ and 5 × 10⁻⁴, and the number of neighbors varied from 20 to 45 in steps of 5.
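Using scikit-learn, the two detectors can be instantiated with hyper-parameters inside the explored ranges; the synthetic feature vectors below are illustrative only, not plant data:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
legit = rng.normal(0.0, 1.0, size=(500, 4))      # synthetic "legitimate" feature vectors

# Hyper-parameter values fall within the ranges explored in the experiments
ocsvm = OneClassSVM(nu=0.1).fit(legit)
lof = LocalOutlierFactor(n_neighbors=20,
                         contamination=5e-4,
                         novelty=True).fit(legit)  # novelty=True enables predict()

suspect = np.full((1, 4), 8.0)                    # far from the training data
# Both detectors return +1 for inliers (legitimate) and -1 for outliers (malicious)
print(ocsvm.predict(suspect), lof.predict(suspect))
```

Training uses only legitimate data, matching the one-class setting described in Section 3; new feature vectors are then scored at test time.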

Generation Forecasting Module
Studies related to data analysis in time series have been carried out for a long time [38,39]. ML-based applications have become more popular due to their high performance on data inference, outperforming even classical statistical models [40]. More specifically, DL techniques have played a fundamental role in reducing the error of regression approaches. This work explores the Temporal Convolutional Network (TCN) and Long Short-Term Memory (LSTM), both DL models, to make predictions in time series.
LSTM is a type of RNN. Unlike some traditional neural networks, LSTM can remember the most useful information. This is possible thanks to its architecture: the networks that comprise the LSTM are connected in the form of loops, allowing information to persist in the network. It also has a gating mechanism for learning long-term dependencies without losing short-term capability [41]. In particular, this neural network has made important contributions to photovoltaic power generation forecasting [42].

Evaluation Metrics
To compute the forecasting model performance, the Root Mean Squared Error (RMSE) was assessed for the test sets. This error metric penalizes undesirably large deviations more heavily [43]. The F1-score was calculated to evaluate the detector. This metric combines two other classifier metrics, precision and recall. Precision measures the percentage of samples classified as adversarial that are truly malicious. Recall measures the effectiveness of the approach in identifying adversarial examples.
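Both metrics are straightforward to compute; a minimal sketch (F1 here is written from raw true/false positive and negative counts):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large deviations quadratically."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)   # fraction of flagged samples truly malicious
    recall = tp / (tp + fn)      # fraction of malicious samples that were flagged
    return 2 * precision * recall / (precision + recall)

print(rmse([1.0, 2.0], [1.0, 4.0]))   # sqrt(mean([0, 4])) = sqrt(2)
```

In practice these are available in standard libraries (e.g., scikit-learn), but the explicit forms make the precision/recall trade-off behind the F1-score visible.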

Adversarial Examples' Mitigation
This section assesses whether the Attack Detection Module effectively reduces the adversarial examples' impact over the prediction error. Alongside the detection mechanism, the mitigation approach is evaluated here. The mitigation function blocks samples classified as malicious and, at the same time, has to be able to replace them with samples that keep the prediction error low.
The tests were carried out as follows. First, the Generation Forecasting Module was executed to make predictions under non-attack and attack scenarios without the Attack Detection Module's aid. The same data used for the tests in Section 5.2 were employed here. The results show that LSTM had a better performance in terms of RMSE. LSTM obtained the lowest error in several scenarios: without attack and for ε values of 0.05, 0.15, and 0.2. TCN outperformed LSTM only for ε = 0.1. The second part of the tests reintroduced the Attack Detection Module in the pipeline. A remarkable reduction of RMSE for both models (LSTM and TCN) was observed using OCSVM or LOF at the Attack Detection Module's core. Figure 3 presents the results for all these scenarios.

Table 3 presents the RMSE obtained with all attack patterns, grouped by ε. TCN outperformed LSTM in all scenarios where the Attack Detection Module was present in the pipeline. This result suggests that TCN benefits more from the mitigation scheme than LSTM. The fact that LSTM is solidly grounded on the time series' sequential information can explain this outcome. As the mitigation scheme uses the most recent legitimate input when the current input is malicious, the time series' sequence is eventually broken. TCN, which uses local and global information of the time series, handles this characteristic of the mitigation strategy better.
It is noteworthy that the error increase for the scenario with ε = 0.15 was substantially reduced when the Attack Detection Module was used. In this scenario, the Attack Detection Module based on LOF reduced the increase in TCN's prediction error caused by adversarial examples from 711.21% to 19.70%.

Attack Detection Module's Efficacy in Detecting Adversarial Examples
OCSVM and LOF were applied as detectors using different hyper-parameters to find the most suitable classifier for the Attack Detection Module. Figure 4 shows box plots for LOF F1-scores obtained by varying the Number of Neighbors (20, 25, 30, 35, 40, and 45) and Contamination (2.5 × 10⁻⁴ and 5 × 10⁻⁴) over different attack patterns (random, intermittent, and sinusoidal) and ε values (0.05 to 2 in 0.05 steps). In box plots, boxes depict the range between the upper and lower quartiles, while horizontal lines inside the boxes represent the median. Vertical lines extending from the boxes illustrate the variability outside the quartiles. Individual points represent outliers.
The results for different numbers of neighbors show that lower values for this hyper-parameter deliver better results. The best average F1-score found in these tests was 86.05%, obtained with 20 neighbors. In contrast, the contamination hyper-parameter tests pointed out that the best average performances were obtained with the highest value for this hyper-parameter. With Contamination = 5 × 10⁻⁴, the detector reached an average F1-score of 87.94%. The standout LOF outcome was obtained by combining 40 neighbors and Contamination = 5 × 10⁻⁴, which resulted in an F1-score of 95.86% for detecting adversarial examples following a sinusoidal pattern.

To check whether there is a statistically significant difference between the performance of the two classifiers, LOF and OCSVM, Friedman's statistical test and the Nemenyi post-hoc test were used. In this evaluation, three metrics were compared: F1-score, precision, and recall. The Critical Difference (CD) indicates that the difference between two algorithms is significant if the gap between their ranks is larger than the CD; otherwise, no significant difference is found between them. Diagrams for these three metrics are presented in Figures 6-8. The metrics were collected considering all attack patterns and ε values, and the tests were run at a 95% confidence level. According to the statistical tests, there was a statistically significant difference between both models, since the distance between their ranks (1) is larger than the CD (0.57).
The CD value of 0.57 is the same for all scenarios, as the number of experiments and algorithms used are also the same. Consequently, there is a statistical difference between the two models for all three metrics in all cases. The detector efficacy was also analyzed focusing on the influence of ε values and attack patterns. The experiments showed that LOF reached a higher F1-score for lower ε, while OCSVM outperformed LOF for higher ε values, as Figure 9 shows. Clear differences in detection performance were not observed for each attack pattern. Figure 10 presents a box plot that depicts F1-score results obtained with LOF and OCSVM considering the three attack patterns (random, inter, and sin). OCSVM achieved a higher median F1-score than LOF for the three attack patterns. In fact, the OCSVM median was very close to the third quartile of LOF, and the minimum values of OCSVM were very close to the LOF median. Lastly, the influence of the detection model, attack magnitude (ε), and attack pattern on the Attack Detection Module's efficacy was investigated. The Pearson correlation coefficient was employed to identify a linear relationship between each factor and the detection performance. A coefficient value of 0 means no correlation, whereas a value close to −1 or 1 indicates full correlation. The obtained correlations were 0.016, 0.244, and 0.651 for attack pattern, detection model, and attack magnitude, respectively. This result suggests that the detection performance is most affected by the attack magnitude, while the attack pattern and the detection model have a low correlation with the detector efficacy.

Feature Importance
Seeking to provide more insights into what distinguishes FGSM adversarial samples from legitimate ones, the importance of each feature inputted to the Attack Detection Module was analyzed. Spectral Feature Selection for Supervised and Unsupervised Learning (SPEC) [44] was used to this end. Figure 11 shows the features sorted by their importance. The Hurst exponent (hurst), median, entropy, ratio between mean and maximum (ratio-mean-max), and correlation (corr) were the most promising ones, with roughly the same importance. The mean, standard deviation (std), and maximum (max) showed slightly lower importance than the best ones. DFA, while still informative, is clearly less important than the others. In short, the computed feature importances suggest that most of them contribute significantly to distinguishing legitimate from adversarial samples, except the features related to the minimum value (the minimum itself and the ratio between minimum and maximum).

Discussion
Considering two different classifiers (LOF and OCSVM), three attack patterns (random, intermittent, and sinusoidal), and a broad range of attack magnitudes, the results showed that the proposed approach was consistently effective at detecting adversarial examples over several situations. The approach successfully detected low-magnitude attacks, which are particularly challenging due to their small difference from legitimate samples, and achieved excellent performance in detecting high-magnitude attacks. The variation of attack patterns did not affect the detection capacity. Moreover, OCSVM and LOF both had a good performance, but OCSVM was statistically superior to LOF. Abdu-Aguye et al. [12] also achieved high accuracy at detecting FGSM adversarial examples with an OCSVM-based scheme. Unlike this work, their scheme focused on defending classification models and did not vary attack patterns and magnitudes. Despite these methodological differences, the high efficacy reported by both studies suggests that one-class classifiers are a promising option to address this issue.
Almost all features extracted from the forecasting model input proved to be good indicators of artificial interference within the analyzed data. First, this suggests that adversarial examples affect the analyzed window's basic features, such as the minimum, maximum, and median. Moreover, this result implies that features related to the time series' spectral behavior, such as the Hurst exponent, are also influenced by these artificial manipulations.
Combining the detection approach with a mitigation mechanism allowed a significant reduction in the error increase caused by adversarial samples. For high-magnitude attacks, the error increase plummeted from figures above 700% to roughly 20%. The results were also relevant for low-magnitude attacks, dropping from above 200% to around 45% in the worst case. Both TCN and LSTM could benefit from using the attack detection and mitigation mechanism, but slightly better results were found for TCN. Other studies [10,11] that followed a different mitigation strategy (e.g., adversarial training) also reported a positive impact on the target model robustness towards adversarial examples. Nevertheless, with a simple mitigation scheme backed by an effective detection approach, this work achieved a positive outcome for attack mitigation without requiring adversarial examples during the training phase.

Conclusions
DL models are great options to forecast the generation capacity of PV power plants, but they are vulnerable to adversarial examples: as the results showed, the forecasting error under attack increased by up to 962.21% when compared to the forecasting error under non-attacked conditions.
On the other hand, detecting and discarding these examples reduces the damage to the forecasting model's accuracy: in the worst case, the error increased by 77.60% when compared to the forecasting error under non-attacked conditions. In this sense, schemes that detect and mitigate adversarial examples should not be neglected, to avoid the malfunction of the power plant.
Future work includes investigating other methods to defend regression models against adversarial examples and testing the proposed scheme against different attack methods and in other domains. Furthermore, the proposed mitigation approach will be extended, as it is a promising point of improvement to further reduce the error increase caused by attacks.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: