Early Detection of Photovoltaic Panel Degradation through Artificial Neural Network

Abstract: In this paper, an artificial neural network (ANN) is used for isolating faults and degradation phenomena occurring in photovoltaic (PV) panels. In the literature, it is well known that the values of the single diode model (SDM) parameters associated with the PV source are strictly related to degradation phenomena, and their variation is an indicator of panel degradation. On the other hand, the parameter values that identify degraded conditions are not known a priori, because they can differ from panel to panel and depend strongly on environmental conditions, PV technology and the manufacturing process. For these reasons, to correctly detect the presence of degradation, the effect of environmental conditions and fabrication processes must be properly filtered out. The approach proposed in this paper exploits the intrinsic capability of the ANN to map in its architecture two effects: (1) the non-linear relations existing among the SDM parameters and the environmental conditions, and (2) the effect of the degradation phenomena on the I − V curves and, consequently, on the SDM parameters. The ANN architecture is composed of two stages that are trained separately: one predicts the SDM parameters under the hypothesis of healthy operation and the other predicts them for the degraded condition. The variation of each parameter, calculated as the difference between the outputs of the two ANN stages, gives a direct identification of the type of degradation that is occurring in the PV panel. The method was initially tested by using the experimental I − V curves provided by the NREL database, where the degradation was introduced artificially, and was later tested by using some degraded experimental I − V curves. of series resistance for different induced degradation effect and for different environmental conditions.


Introduction
The penetration of photovoltaic (PV) generation in the urban environment is significantly growing, owing to its ability to reduce the power bills of owners and support the grid with local generation [1].
In this scenario, PV system degradation and failures are less tolerated: not only do they reduce the return on investment, but they can also lead to a grid power imbalance if the actual energy production differs from the expected energy production that comes, for instance, from a digital twin of the system.
One of the key factors for increasing the reliability and service life of a PV system is the development of methodologies and technical solutions for accurately monitoring the state of health of PV panels. Indeed, PV modules are often exposed to harsh environmental conditions, or operate in abnormal conditions that lead to fast PV degradation or unexpected failures. For example, especially in urban areas, residential PV plants are likely to be subject to panel mismatches, partial shading, hot spots, and mechanical stress, which accelerate the degradation phenomena. Another example is shown in [2], where the authors prove that the combined effect of PV delamination, water penetration into the delaminated area and high string voltage operation leads to many failures in PV panels and inverters. Since the severity of delamination increases gradually, this phenomenon can be detected early so that the affected PV panel can be replaced with a new one. This preventive maintenance preserves the inverter operation, avoids further damage and, consequently, keeps the PV plant up and running, thereby increasing the PV plant's energy yield over the system lifetime.
PV modules also exhibit natural aging that reduces the annual PV energy production with a more or less flat degradation rate. For crystalline silicon PV modules, a reduction of 0.8-0.9% per year was estimated [3]. An exhaustive review of degradation phenomena occurring in PV modules is reported in [4]. The early detection of PV degradation supports decisions about system maintenance and PV panel replacement. In some cases, it can prevent catastrophic consequences, such as fires.
At present, there are many studies related to PV fault diagnosis, but most of them are focused on the PV system level. In [5], different methods are reviewed and discussed in detail, highlighting their feasibility, complexity, cost-effectiveness and generalization capability for large-scale integration. In [6], the discussion is focused on the use of artificial intelligence (AI) and the Internet of Things (IoT) for the remote sensing of solar photovoltaic systems to improve PV diagnosis.
On the other hand, models such as the single-diode model (SDM), double-diode model (DDM), and triple-diode model (TDM) are widely used for the modeling, simulation, performance evaluation, and design optimization of PV systems as well as for monitoring and diagnosis purposes [7]. Indeed, the accurate identification of the parameters of the equivalent PV electrical model makes it possible to study the characteristics of a PV source [8] in all operating conditions. Therefore, instead of analyzing the shape of each I − V curve, it is easier to detect degradation by evaluating the variations of such parameters with respect to their values in healthy conditions.
The use of photovoltaic models in combination with AI methods has already been proposed in the literature for PV fault detection. In [9], an artificial neural network (ANN) is first trained on numerical simulations, provided by a PV model, for the classification and isolation of eight types of faults, and is then used on field measurements to identify possible faulty operating conditions. A field programmable gate array (FPGA) implementation of the method is also proposed in that paper for online operation. In [10], the kernel-based extreme learning machine (KELM) is employed to train a single hidden-layer feed-forward neural network (SLFN) to classify degradation faults, short-circuit faults, open-circuit faults and partial shading faults in PV arrays. The SLFN needs as input electrical values taken from the I − V curve, the environmental conditions and the SDM parameters previously calculated from the whole I − V curve. An ANN with a radial basis function (RBF) requiring only irradiance and PV output power as input is developed in [11]. Testing the developed ANN on a PV installation of 2.2 kW capacity gives an accuracy of 97.9% in fault identification. In this case, no model is used; instead, long-term measurements are taken while different kinds of faults are reproduced on the PV installation.
The ANNs training phase usually requires a large number of observations, which are not always available. This problem might be mitigated by using probabilistic neural networks (PNNs), which learn on-line with a small number of observations [12,13].
The rapid growth of IoT technologies is expected to enable diagnosis at the PV panel level with an acceptable additional cost. Moreover, by exploiting edge computing sensors [14], it will be possible to process the data on site and to transmit to the final user only synthetic information related to the state of health of each PV module. Module-level monitoring devices are already available on the market for monitoring and controlling a single PV panel, thereby improving the system performance and the planning of the system's operation and maintenance activities [15]. Such devices are also able to perform I − V curve tracing, but the final user is in charge of analyzing these data. The offline data analysis is time consuming and requires an operator with a specific PV background. Moreover, the types of faults and degradation mechanisms that can be identified are very limited. To overcome this drawback, an automatic fault detection method that processes the I − V curves of PV panels online is proposed in [16]. The diagnosis is focused on the identification of current mismatch due to partial shading, hot spots, and cell cracks. The method calculates the concavity and convexity along the I − V curves because the three analyzed faults produce steps on the I − V curve; thus, non-trivial data processing is required for proper detection of the occurring fault.
Unlike the previous works, which are mainly focused on fault classification or partial shading identification, the ANN-based method proposed in this paper is aimed at estimating PV degradation through the identification of SDM parameter values that can be easily related to the PV panel state of health. The approach exploits the intrinsic capability of the ANN to map in its architecture two effects: (1) the non-linear relations existing among the SDM parameters and the environmental conditions, and (2) the effect of the degradation phenomena on the I − V curves and, consequently, on the SDM parameters. The joint processing of these two relationships makes it possible to quantify the PV degradation due to aging, corrosion, cracks, or hot spots, among others.
With respect to existing fault diagnosis methods, the technique introduced in this paper has the advantage of recognizing faults quantitatively by using only three points on the I − V curve around the maximum power point (MPP), irradiance and panel temperature, and thus, does not need the whole I − V curve, which may not always be available during the normal operation of the PV system. Moreover, due to the fast elaboration of the ANN results, it can be easily implemented on an embedded system for online elaboration once the training phase is performed offline.

Single Diode Model for Describing Degradation Phenomena
The SDM, shown in Figure 1, is the most used PV model due to its trade-off between simplicity and accuracy [17]. It is described by Equation (1) and has a set of five parameters to be identified: the photoinduced current $I_{ph}$, the saturation current $I_{sat}$, the ideality factor $\eta$, the series resistance $R_s$ and the parallel resistance $R_{sh}$.

$$I = I_{ph} - I_{sat}\left[\exp\left(\frac{V + I R_s}{n_s\,\eta\,V_t}\right) - 1\right] - \frac{V + I R_s}{R_{sh}} \qquad (1)$$

The number of PV cells in the module ($n_s$) and the thermal voltage ($V_t = kT/q$) are known once the cell temperature ($T$), Boltzmann's constant ($k$) and the electron charge ($q$) are given. The identification of the parameters is a relevant task since they are not available on the manufacturer data sheet. Moreover, they vary with the operating conditions and with degradation phenomena, and the model is non-linear [18]. Methodologies for solving this task are commonly grouped into three categories [19]: iterative (numerical), non-iterative (analytical), and AI-based optimization approaches.
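Because the current appears on both sides of the SDM equation, evaluating the model at a given voltage requires a numerical solution. The following Python sketch is only an illustration, assuming the usual implicit form of the SDM, $I = I_{ph} - I_{sat}[\exp((V + I R_s)/(n_s \eta V_t)) - 1] - (V + I R_s)/R_{sh}$; the function names and the bisection solver are choices made here, not taken from the paper.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def thermal_voltage(temp_celsius):
    """Thermal voltage V_t = k*T/q, with T converted to kelvin."""
    return K_BOLTZMANN * (temp_celsius + 273.15) / Q_ELECTRON


def sdm_residual(v, i, params, n_s, v_t):
    """Right-hand side of the SDM equation minus the current I."""
    i_ph, i_sat, eta, r_s, r_sh = params
    v_d = v + i * r_s  # voltage across the diode and the shunt branch
    return i_ph - i_sat * math.expm1(v_d / (n_s * eta * v_t)) - v_d / r_sh - i


def sdm_current(v, params, n_s, v_t, iterations=100):
    """Solve the implicit SDM equation for the current at a voltage v >= 0.

    The residual decreases monotonically with the current, so bisection works.
    """
    hi = params[0]            # residual <= 0 at I = I_ph when v >= 0
    lo, step = 0.0, 1.0
    while sdm_residual(v, lo, params, n_s, v_t) <= 0.0:
        lo -= step            # widen the bracket toward negative currents
        step *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sdm_residual(v, mid, params, n_s, v_t) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative parameter values (I_ph, I_sat, eta, R_s, R_sh), not from the paper.
example_params = (5.0, 1e-9, 1.2, 0.3, 300.0)
example_current = sdm_current(20.0, example_params, 36, thermal_voltage(25.0))
```

Bisection is slower than Newton-Raphson but does not depend on an initial guess, which keeps the sketch robust over the whole voltage range.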
Analytical methods use equations solved symbolically or explicitly by using key-points information from data sheets or I − V curve data. These approaches are characterized by the simplicity of their implementation and computational efficiency [20].
Numerical methods seek to fit the points of the I − V curve by using systems of equations that are solved numerically. Commonly, trial and error approaches or numerical solvers, such as Newton-Raphson and curve-fitting methods, are used. The accuracy, reliability, and convergence of these methodologies are strongly linked to the selection of the initial conditions [19,21].
Optimization approaches group different kinds of algorithms based on artificial intelligence and heuristic methods. The development of computational intelligence has improved the implementation of these algorithms to solve highly non-linear and complex problems. These approaches offer several advantages: they do not need a preliminary identification of the parameter search space, they achieve high accuracy and, in some cases, they do not need a mathematical model. Nevertheless, they have a high computational complexity [8].
In this last category, artificial neural networks have proven to be an effective tool for SDM parameter identification. In [22], a multilayer perceptron (MLP) is implemented for identifying the $I_{ph}$, $R_s$, and $I_{sat}$ parameters. This work uses an input vector of five elements, $I_{sc}$, $V_{oc}$, $I_{mpp}$, $V_{mpp}$ and $P_{mpp}$, taken from the data sheet at the panel level. Another study [23] focuses on identifying some important points of the I − V curve, such as $V_{oc}$, $I_{sc}$, $P_{max}$, $V_{max}$, and $I_{max}$, at the panel level. For this purpose, it uses the irradiance and temperature as inputs in a configuration with two hidden layers. In [24], a recurrent neural network is implemented for predicting the output current of the cell by using temperature, irradiance, and voltage as inputs. In [25], the input vector combines environmental and electrical variables, such as irradiance, temperature, $V_{oc}$, $I_{sc}$, $V_{mpp}$, $I_{mpp}$, and $P_{mpp}$. In this case, the parameters identified are $R_s$, $R_{sh}$, and $\eta$ for an application at the panel level. Finally, the authors of [26] and [27] propose configurations based on feed-forward neural networks for identifying the full set of five parameters by means of a two-stage identification. Both works use only irradiance and temperature as inputs, but the work in [27] is focused on a single cell, while [26] concentrates on an entire PV panel.
In this paper, a numerical method is presented that combines I − V curve fitting and ANN optimization to estimate the variation in PV panels' SDM parameters when the PV panels are subject to degradation.

Effect of PV Degradation on the I − V Characteristic and the SDM Parameters
In [28], the most common degradation effects and failures that are detectable by inspecting the I − V curve, together with the related variations of the SDM parameters, are described. They are shown in Figure 2. A brief comment on these effects is reported in the following.

• S1 effect: The I − V curve exhibits a lower short-circuit current ($I_{sc}$) than expected. This degradation effect may be caused by the following: loss of transparency of the encapsulation, due to browning or yellowing; glass corrosion, which reduces the light trapping of the module; or delamination, which causes optical uncoupling of the layers. This is mainly reflected in the reduction of the photoinduced current ($I_{ph}$) parameter.

• S2 and S3 effects: The I − V curve has an open-circuit voltage ($V_{oc}$) lower than expected, and all points shift homogeneously to the left, while the I − V curve preserves its slope around $V_{oc}$. This anomaly may be due to failed cell interconnections, short circuits from cell to cell or a failure of the bypass diode. Such failures can be associated with the SDM ideality factor ($\eta$) because the number of cells ($n_s$) in a PV module appears directly in the SDM Equation (1). Thus, the effect of cell failures can be expressed as follows:

$$\eta = \eta_H \cdot \frac{n_H}{n_s} \qquad (2)$$

where $\eta_H$ is the healthy ideality factor and $n_H$ is the number of healthy cells inside the PV panel made of $n_s$ cells. For example, if $n_s = 36$, one failed cell has an impact of almost −3% on the $\eta$ parameter. The open-circuit voltage of the module can also be reduced by the light-induced degradation (LID) of crystalline silicon modules or by potential induced degradation (PID). Since the leakage current inside the PV cell is an indicator of such phenomena, they can be directly associated with the variation of the saturation current $I_{sat}$. A small variation of $I_{sat}$ does not significantly affect the I − V characteristic. It can be observed that $I_{sat}$ and $\eta$ have opposite effects on the curve. This can cause a multimodal problem in the parameter identification, which means that the same I − V curve may be reproduced with different pairs of $I_{sat}$ and $\eta$. Therefore, in some cases, the same degradation phenomenon can be attributed almost interchangeably to a variation of $I_{sat}$ or of $\eta$.

• S4 effect: The slope of the I − V curve near $V_{oc}$ is lower, indicating an increase in the series resistance $R_s$ of the PV module. $R_s$ can rise because of an increase in the interconnection resistance, corrosion in the junction box or interconnects, and slack joints.

• S5 effect: The slope around $I_{sc}$ is mostly associated with the parallel resistance $R_{sh}$. The variation of this parameter is due to shunt paths in the PV cells and/or the interconnections. Slight cell mismatch or slight non-uniform yellowing may be additional causes.

• S6 effect: The presence of steps in the curve is likely caused by the activation of one or more bypass diodes that are connected in parallel to a block of cells to protect them from inverse polarization under mismatched operating conditions. It can be due to irregular soiling, a shadow affecting only a few cells in the PV module, or the breakage of one or more cells protected by the same bypass diode. This effect cannot be reproduced with a single diode model, and thus the variations of the SDM parameters associated with this effect do not have a physical meaning.
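The link between failed cells and the ideality factor described for the S2 and S3 effects can be checked with a few lines of Python. The sketch assumes that the apparent ideality factor scales with the ratio of healthy cells to total cells, which is consistent with the "almost −3%" figure quoted for one failed cell out of 36; the function names are illustrative.

```python
def degraded_ideality_factor(eta_healthy, n_healthy_cells, n_s):
    """Apparent ideality factor when only n_healthy_cells of n_s cells work."""
    if not 0 < n_healthy_cells <= n_s:
        raise ValueError("n_healthy_cells must be in the range 1..n_s")
    return eta_healthy * n_healthy_cells / n_s


def ideality_change_percent(n_healthy_cells, n_s):
    """Relative change of eta in percent; it does not depend on eta_healthy."""
    return 100.0 * (n_healthy_cells / n_s - 1.0)


# One failed cell in a 36-cell module.
change = ideality_change_percent(35, 36)
```

For one failed cell out of 36 the relative change is 100·(35/36 − 1), about −2.78%.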
Since all degradation phenomena have an impact on the delivered power, the normalized sensitivity ($S_n$) of the PV output power with respect to the SDM parameter variations is calculated as follows:

$$s_{n,k} = \frac{(P - P_H)/P_H}{\Delta k / k_H} \qquad (3)$$

where $s_{n,k}$ is the normalized sensitivity calculated by introducing the variation $\Delta k$ to the $k$-parameter and $P$ is the corresponding PV power. The subscript $H$ refers to the values in healthy conditions. Due to the non-linearity of the PV power with respect to the SDM parameters, the sensitivity is not constant and it should be calculated locally and for the different environmental operating conditions. The normalized sensitivity values of the delivered PV power with respect to the variations of the five parameters shown in Figure 2 are reported in (4).
The results show that the sensitivity of the PV output power with respect to $I_{ph}$ and $\eta$ is close to one. As a consequence, a variation of a few percent in $I_{ph}$ or $\eta$ is reflected in a significant variation of the PV power. $R_{sh}$ and $I_{sat}$ have lower impact at less than one order of magnitude. The sensitivity to $R_s$ is somewhere in between. This means that a small variation of these last three parameters can be tolerated since the corresponding degradation process is not yet detrimental.
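The sensitivity analysis can be reproduced numerically. The sketch below is an illustration under stated assumptions: the sensitivity is taken as $s_{n,k} = (\Delta P / P_H)/(\Delta k / k_H)$, the power is evaluated at the maximum power point, and the parameter values are invented for the example, so the numbers it produces are not the ones the paper reports in (4).

```python
import math

PARAM_NAMES = ("I_ph", "I_sat", "eta", "R_s", "R_sh")


def sdm_current(v, params, n_s, v_t):
    """Current of the single diode model at voltage v >= 0, found by bisection."""
    i_ph, i_sat, eta, r_s, r_sh = params

    def residual(i):
        v_d = v + i * r_s
        return i_ph - i_sat * math.expm1(v_d / (n_s * eta * v_t)) - v_d / r_sh - i

    lo, hi, step = 0.0, i_ph, 1.0
    while residual(lo) <= 0.0:   # move the lower bound until the sign changes
        lo -= step
        step *= 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if residual(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)


def mpp_power(params, n_s, v_t):
    """Maximum power, located by ternary search on the unimodal P(V) curve."""
    i_ph, i_sat, eta = params[:3]
    lo, hi = 0.0, n_s * eta * v_t * math.log(i_ph / i_sat + 1.0)  # about V_oc
    for _ in range(80):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        p1 = m1 * sdm_current(m1, params, n_s, v_t)
        p2 = m2 * sdm_current(m2, params, n_s, v_t)
        if p1 < p2:
            lo = m1
        else:
            hi = m2
    v_mpp = 0.5 * (lo + hi)
    return v_mpp * sdm_current(v_mpp, params, n_s, v_t)


def normalized_sensitivities(params_healthy, n_s, v_t, rel_step=0.01):
    """(dP / P_H) / (dk / k_H) for a relative step applied to one parameter at a time."""
    p_h = mpp_power(params_healthy, n_s, v_t)
    result = {}
    for k, name in enumerate(PARAM_NAMES):
        perturbed = list(params_healthy)
        perturbed[k] *= 1.0 + rel_step
        result[name] = (mpp_power(perturbed, n_s, v_t) - p_h) / p_h / rel_step
    return result


# Illustrative healthy parameters at 25 degrees Celsius for a 36-cell module.
V_T = 1.380649e-23 * 298.15 / 1.602176634e-19
sensitivities = normalized_sensitivities((5.0, 1e-9, 1.2, 0.3, 300.0), 36, V_T)
```

With these example values the sensitivities to $I_{ph}$ and $\eta$ come out close to one, while those to $I_{sat}$, $R_s$ and $R_{sh}$ are much smaller in magnitude, which matches the qualitative picture described above.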

Description of the Proposed ANN Architecture
In the previous section, the relation between PV panel degradation and SDM parameter variation was highlighted by analyzing the I − V curves. On the other hand, the parameters of the PV equivalent circuit change with the irradiance and the temperature of the solar cells. The relationship between them is nonlinear and cannot be easily expressed by analytical equations; nonlinear regression methods can fail when no preliminary information about the input-output relationships is provided. Many papers have tried to characterize such behavior. However, it is strongly dependent on the PV panel under test, and no general rules can be applied. Some examples are reported in [29]. In [30], the authors highlight that the fitting curve, which maps each SDM parameter as a function of the environmental conditions, also varies with the seasons, due to the influence of the variable weather and environmental conditions.
As a consequence, a reliable identification of PV degradation phenomena cannot be achieved if relationships among SDM parameters and the environmental condition are not properly accounted for.
The capability of the ANN to learn non-linear and unknown relations among variables, and to generalize these relationships when new input is provided, is exploited for solving this task. Since the ANN does not require knowledge of the internal system parameters, it implies a reduced computational effort and represents a compact solution for multivariable problems. It is also a good candidate for implementation on an embedded system for online operation. Figure 3 shows the flow chart of the procedure proposed in this paper. In particular, it describes the main steps for selecting the ANN input and target datasets and performing the ANN training phases.

In the first stage, the ANN predicts the equivalent circuit parameters from the measured irradiance and temperature only. Such parameters are assumed to be reference values for the healthy operating condition. Indeed, for training this ANN, a proper number of healthy I − V curves, acquired under different environmental conditions, are selected, and the corresponding SDM parameters are used as the target dataset. As shown in [26], an ANN can provide a very good estimation of the healthy SDM parameters for every environmental condition. This estimation of the SDM parameters is used as the baseline for detecting the possible presence of degradation.
In a second stage, a more complex ANN architecture is trained to account for the different types of degradation. For achieving this task, a modified set of I − V curves is generated by using the single diode model with different sets of parameters that are associated to realistic degradation phenomena. In this stage, the ANN receives as input not only the irradiance and cell temperature, but also some points of the I − V curve, which are necessary for taking into account the modification of the I − V curve shape due to the degradation.
It is worth noting that in this paper, only three points around the maximum power point are used. This makes it possible to monitor the PV source's state of health during normal operation without the need for a complete scan of the I − V curve: measurement of the voltage and current around the MPP suffices for the detection of degradation. More details on the generation of degraded datasets are given in Section 4.2.
The two training phases are completely independent, having in common only part of the input data (G and T) and providing two sets of SDM parameters. One output represents the vector of estimated SDM parameters under the hypothesis that the PV panel is not degraded (that is, in a healthy condition), and the other one is the vector of estimated SDM parameters associated with the real state of health. The difference between the two estimations gives a measure of which parameter is changing and, consequently, which type of degradation is occurring inside the PV panel. Details on how to configure the ANN architecture are reported in Section 5.
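The comparison between the two stage outputs can be sketched in Python as follows. This is a hedged illustration: the relative difference, the 10% thresholds and the example parameter vectors are assumptions introduced here, since the text only states that the difference between the two estimations is used.

```python
PARAM_NAMES = ("I_ph", "I_sat", "eta", "R_s", "R_sh")


def parameter_variation(p_healthy, p_actual):
    """Relative difference between the outputs of the two ANN stages."""
    return {
        name: (actual - healthy) / healthy
        for name, healthy, actual in zip(PARAM_NAMES, p_healthy, p_actual)
    }


def degraded_parameters(variation, thresholds):
    """Names of the parameters whose relative variation exceeds its threshold."""
    return [name for name in PARAM_NAMES if abs(variation[name]) > thresholds[name]]


# Hypothetical stage outputs for one measurement (G, T, three points near the MPP).
p_healthy = (5.0, 1e-9, 1.2, 0.30, 300.0)   # first stage: healthy baseline
p_actual = (5.0, 1e-9, 1.2, 0.45, 300.0)    # second stage: actual state of health
thresholds = {name: 0.10 for name in PARAM_NAMES}  # illustrative 10 % tolerance

variation = parameter_variation(p_healthy, p_actual)
flagged = degraded_parameters(variation, thresholds)
```

In this example only the series resistance changes (by +50%), so `flagged` contains only `"R_s"`, which would point to an S4-type effect. In practice the thresholds would have to reflect the different sensitivities of the power to each parameter.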

Description of Experimental I − V Curves Database
To cover as many of the outdoor operating conditions of real PV arrays as possible, a large database of experimental data is selected. The I − V curve dataset provided by the National Renewable Energy Laboratory (NREL) was used at the beginning to develop the proposed method. NREL has a public database with data measured for flat-plate photovoltaic (PV) modules installed in three different cities in the USA (Cocoa, Florida; Eugene, Oregon; and Golden, Colorado). The experimental campaign collected PV module current-voltage curves and meteorological data samples from 2010 until 2014 [31]. It covered different PV technologies, such as single-crystalline silicon (c-Si), multicrystalline silicon (m-Si), cadmium telluride (CdTe), copper indium gallium selenide (CIGS), amorphous silicon (a-Si) tandem and triple junction, amorphous silicon/crystalline silicon or heterojunction with intrinsic thin layer (HIT), and amorphous silicon/microcrystalline silicon. The database does not report specific commercial or manufacturer information, to avoid any legal conflict. To describe the procedure proposed in this manuscript, without loss of generality, the multi-crystalline silicon PV module information is used.
The variables extracted and used for the procedure are the plane-of-array irradiance (W/m²), the PV module back surface temperature (°C), and the corresponding current-voltage curve, represented by 180 to 190 samples depending on the voltage resolution set on the tracer device. Table 1 shows the features and ranges of measurements of the panel chosen for performing the current analysis. The irradiance and temperature ranges also have a high impact on the performance of the PV model. From the literature, it is well known that the single diode model is not suitable for characterizing PV devices at low irradiance values [32]. For this reason, this work refers to irradiance values in the medium-to-high range, and only I − V curves acquired with an irradiance level above 200 W/m² are used.
As for the temperature, the single-diode model imposes no particular restrictions on the range. This work therefore takes I − V curves acquired over a wide temperature range (see Table 1).

High-quality datasets are a key factor for training the ANN efficiently. To achieve this, the data must first be collected and cleaned to remove errors (bad data), outliers, and samples with excessive noise. If these practices are skipped or poorly executed, it becomes difficult for the ANN to detect the true underlying models.
In certain cases, partial shading or measurement issues in the tracer device produce I − V curve shapes that lead to wrong SDM parameters. For this reason, the I − V curves in the NREL database are preliminarily analyzed, and the ones having an abnormal profile are discarded. At the end, more than 20,000 I − V curves are available in the filtered database.
It is worth noting that only a part of the available NREL database is necessary for the ANN training phases; thus, the proposed approach can be applied in practical applications where enough I − V curves are available for different irradiance and temperature conditions. More details on the used I − V curve are given in the sections discussing the simulation and experimental cases.
During normal operation, only a few values of voltage and current ($V_{pv}$, $I_{pv}$) around the MPP, together with the irradiance ($G$) and PV panel temperature ($T$) measurements, are used as the inputs to the neural network. They are chosen since they are already measured on photovoltaic installations. Therefore, it is possible to take advantage of such information for online monitoring of the PV source's state of health by means of the proposed ANN architecture.

Generation of Training Set and Validation Set for a Healthy PV Panel
The reference values of the SDM parameters, the so-called target values, must be calculated for each experimental I − V curve that is used during the ANN training and validation processes. Since the ANN training phase is performed offline, in this paper the target dataset is generated by using the nonlinear least-squares solver of MATLAB to ensure a high-quality fit between the experimental I − V curves and the ones generated by the single diode model. The solver needs boundaries, a representative function, and initial values for the parameters. The boundaries represent the upper and lower limits that every parameter can take. Table 2 shows the boundaries and the initial parameter values configured for this work. They are based on the indications given in [33].

Concerning the initial conditions summarized in Table 2, three special values are defined there. The first is the short-circuit current $I_{sc}$, which is a characteristic value of every PV device. The second and third are $R_{so}$ and $R_{sho}$, the slopes of the tangent lines to the I − V curve at the open-circuit and short-circuit points. They are defined as follows:

$$R_{so} = -\left.\frac{dV}{dI}\right|_{V = V_{oc}}, \qquad R_{sho} = -\left.\frac{dV}{dI}\right|_{I = I_{sc}}$$

Finally, the representative function that describes the PV generator is based on Equation (1):

$$f(V_{pv}, I_{pv}, p) = I_{ph} - I_{sat}\left[\exp\left(\frac{V_{pv} + I_{pv} R_s}{n_s\,\eta\,V_t}\right) - 1\right] - \frac{V_{pv} + I_{pv} R_s}{R_{sh}} - I_{pv}$$

For every selected experimental I − V curve, and thus for the known $(G, T)$, the fitting procedure calculates the set of five parameters $p = [I_{ph}, I_{sat}, \eta, R_s, R_{sh}]$ that minimizes the mean square error between the experimental data and the I − V curve generated by using the single-diode model. In this way, the $p$ vector associated with the healthy I − V curve is the target used to train the neural network.
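The two tangent slopes used as initial conditions can be estimated directly from a sampled curve. The Python sketch below assumes the common definitions $R_{so} = -dV/dI$ at open circuit and $R_{sho} = -dV/dI$ at short circuit, and estimates each slope with a least-squares line through a few end samples; the number of samples and the function names are illustrative choices, not the paper's procedure.

```python
def negative_slope_dv_di(voltages, currents):
    """Least-squares slope of V versus I, returned with the sign changed."""
    n = len(voltages)
    mean_v = sum(voltages) / n
    mean_i = sum(currents) / n
    cov = sum((i - mean_i) * (v - mean_v) for v, i in zip(voltages, currents))
    var = sum((i - mean_i) ** 2 for i in currents)
    return -cov / var


def tangent_resistances(voltages, currents, n_points=5):
    """Estimate R_so (near V_oc) and R_sho (near I_sc) from a sampled curve.

    The samples are assumed to be sorted by increasing voltage, starting
    close to short circuit and ending close to open circuit.
    """
    r_sho = negative_slope_dv_di(voltages[:n_points], currents[:n_points])
    r_so = negative_slope_dv_di(voltages[-n_points:], currents[-n_points:])
    return r_so, r_sho


# Synthetic end segments with known slopes (0.5 ohm and 300 ohm), for illustration.
v_low = [0.0, 1.0, 2.0, 3.0, 4.0]
i_low = [5.0 - v / 300.0 for v in v_low]
v_high = [24.0, 24.25, 24.5, 24.75, 25.0]
i_high = [(25.0 - v) / 0.5 for v in v_high]
r_so, r_sho = tangent_resistances(v_low + v_high, i_low + i_high)
```

Fitting a line through several samples rather than differencing two adjacent points makes the estimate less sensitive to measurement noise in the tracer data.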
It is worth highlighting that the single diode model describes the electrical constraint between the I − V measurements ($V_{pv}$, $I_{pv}$) and the SDM parameters in the form $f(V_{pv}, I_{pv}, p) = 0$, whereas the ANN makes it possible to learn the unknown relation between the environmental conditions and the parameters, $(G, T) \mapsto p_H(G, T)$, where the vector $p_H(G, T)$ is the estimation of the healthy SDM parameters for the environmental conditions $G$, $T$.

Generation of Training Set and Validation Set for a Degraded PV Panel
Although the NREL database collects I − V curves of PV panels operating in outdoor conditions, it gives no indication of which I − V curves are degraded. In this paper, the I − V curves are selected by analyzing their shape, slopes and operating conditions in order to take only the healthy I − V curves from the NREL database. On the other hand, experimental degraded I − V curves are not easy to obtain because of the difficulty of reproducing the large variety of degraded conditions and the long time needed to record these kinds of phenomena. For these reasons, the degraded I − V curves are reproduced artificially by still using the single diode model, where the variations of the SDM parameters are fixed and known a priori. In this way, it is easy to generate enough I − V curves for the ANN training process. A similar approach has already been adopted for emulating PV faults and mismatched operating conditions in other fault identification methods [10,12,34].
The database of degraded I − V curves shares the same environmental conditions as the database of healthy I − V curves. The new database is generated by applying the pseudo code shown in Table 3, where $N_{healthy}$ is the number of experimental I − V curves taken from the healthy database.

Table 3. Pseudo code for generating the degraded database.

    for n = 1 : N_healthy
        load irradiance and temperature G, T
        load I − V curve as vectors V_pv, I_pv
        load the vector p of healthy SDM parameters
        for k = 1 : 5
            assign a random variation (α_k ∈ [α_k,min, α_k,max]) to the k-th parameter: p(k)_deg = α_k · p(k)
            calculate the degraded I − V curve by using Equation (1)
            save G, T, I_pv,deg, V_pv and p_deg in the degraded database
        end
    end

It is worth noting that a degradation effect is applied separately to each parameter; thus, the database containing the degraded I − V curves is five times larger than the healthy database. Moreover, to introduce a detectable I − V curve deformation, the applied parameter degradation is randomly chosen within the boundaries shown in Table 4. Such boundaries are chosen by taking into account that the sensitivity of the I − V curve with respect to each SDM parameter is strongly different, as highlighted in Section 2.1.
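A runnable Python counterpart of the pseudo code in Table 3 might look as follows. It is a sketch under stated assumptions: the `ALPHA_BOUNDS` values are hypothetical placeholders (the actual boundaries are those of Table 4), the healthy database is represented as a list of tuples, and the degraded curve is computed with a bisection solver chosen here for simplicity.

```python
import math
import random

K_OVER_Q = 1.380649e-23 / 1.602176634e-19  # Boltzmann constant over charge, V/K

# Hypothetical (alpha_min, alpha_max) per parameter; Table 4 holds the real values.
ALPHA_BOUNDS = [(0.80, 0.98),   # I_ph decreases
                (2.0, 10.0),    # I_sat increases
                (0.90, 0.98),   # eta decreases
                (1.2, 3.0),     # R_s increases
                (0.2, 0.8)]     # R_sh decreases


def sdm_current(v, params, n_s, v_t):
    """Current of the single diode model at voltage v >= 0, found by bisection."""
    i_ph, i_sat, eta, r_s, r_sh = params

    def residual(i):
        v_d = v + i * r_s
        return i_ph - i_sat * math.expm1(v_d / (n_s * eta * v_t)) - v_d / r_sh - i

    lo, hi, step = 0.0, i_ph, 1.0
    while residual(lo) <= 0.0:
        lo -= step
        step *= 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if residual(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)


def build_degraded_database(healthy_db, n_s, rng):
    """Apply one random parameter variation at a time to every healthy record.

    healthy_db: list of (G, T in degrees Celsius, V_pv list, p tuple) records.
    """
    degraded_db = []
    for g, t_celsius, v_pv, p in healthy_db:
        v_t = K_OVER_Q * (t_celsius + 273.15)
        for k in range(5):
            alpha = rng.uniform(*ALPHA_BOUNDS[k])
            p_deg = list(p)
            p_deg[k] = alpha * p[k]           # only the k-th parameter changes
            i_deg = [sdm_current(v, p_deg, n_s, v_t) for v in v_pv]
            degraded_db.append((g, t_celsius, v_pv, i_deg, tuple(p_deg)))
    return degraded_db


# One illustrative healthy record; real records would come from the fitted curves.
healthy = [(800.0, 45.0, [0.0, 10.0, 20.0], (4.0, 1e-9, 1.2, 0.3, 300.0))]
degraded = build_degraded_database(healthy, 36, random.Random(0))
```

Each healthy record produces five degraded records, so the output list is five times as long as the input, as stated above.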
An example of degraded I − V curve obtained artificially by starting from a healthy, experimental I − V curve is reported in Figure 4. The figure also shows the only three points that are used by the ANN to estimate the SDM parameters.

Configuration of the Proposed Double Level ANN Architecture
This work uses a multi-layer feed-forward neural network. It comprises one input layer, one hidden layer, and one output layer. The number of neurons in the input layer is equal to the number of elements in the input vector. The number of neurons in the output layer is fixed by the number of parameters to identify, in this case five neurons (one for each of the five SDM parameters). The number of neurons in the hidden layer is not fixed. It depends on the complexity of the problem, but some works conclude that the best option is to choose the smallest configuration that reaches the desired performance and accuracy [26].
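The structure of such a network can be illustrated with a plain-Python forward pass. This is a structural sketch only: the tanh hidden activation, the linear output layer and the random weight initialization are assumptions made here, and the hidden-layer sizes of 20 and 50 neurons are the ones reported for the two levels in this section. Training is not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer, activation=None):
    """Weighted sum plus bias for every neuron, with an optional activation."""
    weights, biases = layer
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def forward(inputs, hidden_layer, output_layer):
    """One hidden layer (tanh, an assumption) followed by a linear output layer."""
    return dense(dense(inputs, hidden_layer, math.tanh), output_layer)


rng = random.Random(0)
first_level = (init_layer(2, 20, rng), init_layer(20, 5, rng))    # inputs: G, T
second_level = (init_layer(8, 50, rng), init_layer(50, 5, rng))   # inputs: G, T, 3 V, 3 I

# Inputs are expected to be normalized before they reach the network.
p_first = forward([0.8, 0.4], *first_level)
p_second = forward([0.8, 0.4, -0.1, 0.0, 0.1, 0.3, 0.2, 0.1], *second_level)
```

Both levels return five values, one per SDM parameter; only the input size and the hidden-layer width differ.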
The developed ANN is shown in Figure 5; it is composed of two levels trained independently. The first level is devoted to estimating the parameters of the single diode model by using as input only the irradiance and the PV panel temperature. It is trained by using as target values the SDM parameters extracted from the healthy I − V curves with the MATLAB fitting procedure. A number of $N_{trial} = 5000$ experimental curves are selected randomly from the NREL database in order to cover the different environmental conditions. The selected dataset is split into 70% for the training set, 15% for the validation set and 15% for the testing set. A hidden layer with 20 neurons is used.
The second level of the ANN architecture is trained by using as target values the SDM parameters associated with the degraded I − V curves. In this case, the input is a vector of eight elements, including the irradiance, the temperature, and the voltages and currents of the three points around the MPP. The points are equally spaced by 1 V with respect to the MPP, as shown in Figure 4.
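Assembling the eight-element input vector from a sampled curve can be sketched as below. The sketch assumes that the three points are the MPP itself and the points 1 V below and above it, that currents at those voltages are obtained by linear interpolation, and that the vector is ordered as [G, T, V1, V2, V3, I1, I2, I3]; none of these details is fixed by the text.

```python
import math


def interpolate_current(voltages, currents, v):
    """Linear interpolation of the current at voltage v on a sampled I-V curve."""
    for k in range(len(voltages) - 1):
        v0, v1 = voltages[k], voltages[k + 1]
        if v0 <= v <= v1:
            weight = (v - v0) / (v1 - v0)
            return currents[k] + weight * (currents[k + 1] - currents[k])
    raise ValueError("voltage outside the sampled range")


def ann_input_vector(g, t, voltages, currents, spacing=1.0):
    """Return [G, T, V1, V2, V3, I1, I2, I3] with the points centered on the MPP."""
    k_mpp = max(range(len(voltages)), key=lambda k: voltages[k] * currents[k])
    v_mpp = voltages[k_mpp]
    v_points = [v_mpp - spacing, v_mpp, v_mpp + spacing]
    i_points = [interpolate_current(voltages, currents, v) for v in v_points]
    return [g, t] + v_points + i_points


# Synthetic curve used only to exercise the function (not measured data).
voltages = [0.5 * k for k in range(51)]                                  # 0 V to 25 V
currents = [5.0 * (1.0 - math.exp((v - 25.0) / 1.5)) for v in voltages]
x = ann_input_vector(800.0, 45.0, voltages, currents)
```

During normal operation the three points would come from measurements taken around the operating point rather than from a full curve scan.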
A number of 5 · N trial curves are selected randomly from the degraded database in order to cover the different environmental conditions and the different kinds of parameter degradation. Also in this case, the selected dataset is divided into 70% for the training set, 15% for the validation set and 15% for the testing set. A hidden layer with 50 neurons is used.
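The dataset handling described above can be sketched as follows. The helper names are hypothetical, the ordering [G, T, V 1 , V 2 , V 3 , I 1 , I 2 , I 3 ] of the eight-element input is an assumption based on the description in the text, and the random split only imitates the 70/15/15 division (it is not the toolbox routine).

```python
import random


def split_indices(n_samples, rng, fractions=(0.70, 0.15, 0.15)):
    """Randomly split sample indices into training, validation and testing sets."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_train = round(fractions[0] * n_samples)
    n_val = round(fractions[1] * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def level2_input(g, t, v_pts, i_pts):
    """Assemble the eight-element input vector of the second ANN level."""
    if len(v_pts) != 3 or len(i_pts) != 3:
        raise ValueError("exactly three I-V points are expected")
    return [g, t, *v_pts, *i_pts]
```

For N trial = 5000, the split yields 3500 training, 750 validation and 750 testing curves.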
The MATLAB Neural Network Toolbox ® is employed for configuring, training and testing the proposed architecture.
It is worth noting that, although in this manuscript all the computations are performed on a PC, the trained ANNs can be exported as Open Neural Network Exchange (ONNX) files [35] and executed by the most common open-source platforms (e.g., TensorFlow ® ) for running on embedded systems.
Alternatively, the MATLAB Compiler SDK ® [36] can be used for compiling the MATLAB ® functions into a shared library for C/C++, .NET, Java, or Python projects and executed on the most common development boards, e.g., Raspberry ® , BeagleBone ® or DSP/FPGA-based architectures.
Moreover, microcontroller manufacturers allow users to train ANNs and develop optimized code directly with their programming tools, thereby optimizing performance and reducing development costs [37].

Dataset Normalization
Before passing the inputs and targets to the neural network architecture, it is necessary to preprocess the dataset values to improve the performance of the training process. The normalization process is important for neural network training because it maps the different input and output ranges to a common normalized range before applying them to the neural network. In MATLAB, the normalization process is applied by default and the values are adjusted to fall in the range of [−1,1]. However, in this case, a problem associated with the default normalization process is found: it does not handle well inputs or targets that are too small, producing errors in the training process. For instance, the saturation current I sat is typically in the order of micro- or nanoamperes. Such small values prevent the training process from finding a suitable fit for the targets.
For solving this issue, the normalization process is implemented manually, and the inputs and targets are adjusted in the following way; for ANN level #2:

$$\text{target} = \left[ \frac{I_{ph}}{I_{ph,max}},\ \frac{\log_{10}(I_{sat})}{\log_{10}(I_{sat,max})},\ \ldots \right]$$

For the saturation current, the normalization is given on a logarithmic scale for better representing the large range of variations of this parameter.
Since the neural network produces its outputs inside the same normalized range, it is also necessary to convert the ANN results back into the original ranges of the inputs and targets.
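The manual normalization and its inverse can be sketched as follows for the two targets whose scaling is given above (I ph and I sat ). The function names are hypothetical, and the maximum values are assumed to be taken from the training dataset; the remaining targets are omitted because their scaling is not reproduced here.

```python
import math


def normalize_targets(i_ph, i_sat, i_ph_max, i_sat_max):
    """Scale I_ph linearly and I_sat on a logarithmic scale.

    i_sat_max must differ from 1 A, otherwise log10(i_sat_max) is zero.
    """
    return i_ph / i_ph_max, math.log10(i_sat) / math.log10(i_sat_max)


def denormalize_targets(x_iph, x_isat, i_ph_max, i_sat_max):
    """Map normalized ANN outputs back to physical units."""
    return x_iph * i_ph_max, 10.0 ** (x_isat * math.log10(i_sat_max))
```

Applying `denormalize_targets` to the output of `normalize_targets` with the same maxima returns the original values, which is the conversion step required after the ANN evaluation.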

Overfitting and Generalization
Another common problem of the neural network training process is overfitting. This concept is associated with the way that the neural network learns the process and adjusts a model for representing it. In a training process with overfitting, the neural network finds a model that fits the set of data. Although the error in this process could be set as very small, the neural network builds an overly complex model that is unable to identify the right outputs for new data presented to the input. Therefore, the neural network memorizes the behavior of the training data instead of building a model that generalizes the outputs for testing or validation data.
A regularization method consists of modifying the performance function. In this case, the default performance function used by the MATLAB toolbox is the mean square error (MSE), defined in (9). This performance function can be tuned for focusing on generalization by using the weights and biases of the neural network. Here, it is necessary to add the mean of the squared weights (MSW) of the neural network to the performance function. Equation (11) expresses the way to tune this configuration. The parameter γ (performance ratio) allows the user to define the level of impact of the regularization. This parameter must be defined in the range of [0, 1]. In this case, the user must use their expertise to find a trade-off between generalization and performance [38,39].
where N is the number of trials, and n is the total number of weights w i for all the ANN nodes.
Here, the challenge is to choose the correct value for the performance ratio parameter (γ). If the user uses a parameter that is too large, there is a risk of overfitting. On the contrary, if the performance ratio parameter is too small, the neural network does not fit the training data adequately.
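For reference, a form of the regularized performance function that is consistent with the description above can be written as follows. This is a reconstruction under assumptions, not a verbatim copy of Equations (9)-(11): e j denotes the error on the j-th trial (a symbol introduced here), and γ is taken to weight the MSE term, which agrees with the statement that a value of γ that is too large leads to overfitting.

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{j=1}^{N} e_j^{2}, \qquad
\mathrm{MSW} = \frac{1}{n} \sum_{i=1}^{n} w_i^{2}, \qquad
\mathrm{MSE}_{reg} = \gamma \cdot \mathrm{MSE} + (1 - \gamma) \cdot \mathrm{MSW}
```

With γ = 1 the regularization term vanishes and only the fitting error is minimized; with γ close to 0 the training mainly penalizes large weights.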
Bayesian regularization is a neural network training algorithm that updates the weight and bias values. The main characteristic of this algorithm is that it automatically determines the optimal regularization parameters and the correct combination for producing neural networks that generalize well. In the MATLAB toolbox, this function uses the Jacobian for its calculations; therefore, the performance must be the mean or the sum of squared errors. As a consequence, the training process must be assessed by the MSE or by the sum of squared errors (SSE) performance functions [22,39].
The Bayesian regularization method does not require a performance ratio parameter to be configured. Instead, it automatically calculates the best parameters by focusing on generalization.
In the following, the results obtained with the ANN trained with Bayesian regularization are presented; they exhibit good identification of the SDM parameters in both healthy and degraded conditions.

ANN Identification Results for Healthy Conditions
The continuous lines in Figures 6-10 are the SDM parameters estimated by the ANN; they refer to the healthy conditions and highlight the intrinsic relationships between the SDM parameters and the environmental conditions for the PV panel under test. It is worth noting that, apart from I ph , which is almost linear with G and practically insensitive to the temperature, the behavior of the remaining parameters is completely different from the cases analyzed in [26,30]. This result is not very surprising, given that the relationships among parameters and the environmental conditions change significantly from panel to panel.
To demonstrate the quality of the ANN parameter estimation, Figures 11 and 12 compare experimental data, selected randomly from the NREL database, with the reconstructed I − V curves obtained with the estimated parameters at low and high irradiance conditions.
The error area, defined as the difference in the area below the reconstructed I − V curves and the area below the corresponding experimental I − V curve, is calculated for the tested cases; a 5% maximum error is found for a few cases at low irradiance conditions. This corresponds to the plots shown in Figure 11; however, the error can be considered acceptable since, as is already remarked in [32], the single-diode model is less precise for low irradiance conditions.
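The error area can be computed numerically as sketched below. The sketch assumes trapezoidal integration over a common voltage grid and expresses the error as a percentage of the experimental area; both choices, as well as the function names, are assumptions and not necessarily the procedure used for the results above.

```python
def trapz_area(v, i):
    """Area below an I-V curve by the trapezoidal rule."""
    return sum(0.5 * (i[k] + i[k + 1]) * (v[k + 1] - v[k]) for k in range(len(v) - 1))


def area_error_percent(v, i_exp, i_rec):
    """Absolute difference between the reconstructed and experimental areas, in percent."""
    a_exp = trapz_area(v, i_exp)
    a_rec = trapz_area(v, i_rec)
    return 100.0 * abs(a_rec - a_exp) / a_exp
```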

ANN Results with Simulated Degradation on I − V Curves
The capability of the ANN to detect the degraded SDM parameters is tested in this section by still using emulated degraded curves. Even if this is a limitation with respect to using real degraded curves, it allows the method to be corroborated with a well-controlled degradation effect introduced artificially. For each parameter k (k ∈ [1, ..., 5]), the analysis is carried out by using the following procedure:
• Select randomly N test experimental healthy I − V curves from the NREL database (not used during the ANN training phases) and save the related SDM parameters as the vector p and environmental conditions (G, T).
• For each selected case, do the following:
1. Apply a fixed degradation factor α k to the k-th parameter of p.
2. Generate the degraded I − V curve by using Equation (1).
3. From the degraded I − V curve, select the voltage and current of 3 points equally spaced around MPP and ordered in the vector [V 1 , V 2 , V 3 , I 1 , I 2 , I 3 ].
4. ANN (level #1) estimates the healthy SDM parameters p H with [G, T] as input vector.
5. ANN (level #2) estimates the degraded SDM parameters p deg with [G, T, V 1 , V 2 , V 3 , I 1 , I 2 , I 3 ] as input vector.
6. Calculate the percentage of parameters variation as follows: $\Delta p_{\%}(k) = 100 \cdot \frac{p_{deg}(k) - p_{H}(k)}{p_{H}(k)}$

Figures 13-16 show some comparisons between the degraded I − V curves (blue lines) and the reconstructed I − V curves (light blue lines) obtained by using the p deg parameters estimated by means of the ANN. In each figure, the healthy I − V curves (red lines) used to generate the degraded curves and the 3 points passed to the ANN for estimating the degraded SDM parameters are also reported. Of course, in an on-board operation, only steps 3-6 of the previous procedure are necessary since all ANN inputs are provided by the real-time measurements.
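The comparison between the outputs of the two ANN levels reduces to an element-wise percentage difference, as sketched below. The function names and the parameter ordering in `PARAM_NAMES` are assumptions made for this example.

```python
PARAM_NAMES = ("I_ph", "I_sat", "eta", "R_s", "R_sh")  # assumed ordering of the vector p


def percent_variation(p_deg, p_h):
    """Percentage variation of each SDM parameter with respect to the healthy estimate."""
    return [100.0 * (d - h) / h for d, h in zip(p_deg, p_h)]


def most_varied(delta_p):
    """Name and value of the parameter with the largest absolute percentage variation."""
    k = max(range(len(delta_p)), key=lambda j: abs(delta_p[j]))
    return PARAM_NAMES[k], delta_p[k]
```

Because the I − V curve is not equally sensitive to all parameters, the largest percentage variation does not necessarily identify the degradation with the largest impact on the delivered power.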
As mentioned in Section 2.1, the sensitivity of the I − V curve with respect to each SDM parameter, especially close to the maximum power point, is strongly different. In particular, a variation of a few percentage points on I ph and η produces a significant modification of the I − V curves, while the effect of I sat , R s and R sh is visible on the I − V curve only for larger percentage variations. For this reason, different percentages of degradation are considered in the examples shown in Figures 13-16. In Tables 5-9, the corresponding ANN identification results are reported. The vector (p deg ) of degraded SDM parameters estimation is compared with the healthy SDM parameters estimation vector (p H ) to calculate (∆p % ) and find out which parameter has the most significant percentage variation.
It is worth noting that, in order to establish which kind of degradation is most relevant, the vector of maximum power variations ∆P mpp,% , due to each parameter variation, is also shown in the tables. It is estimated numerically as follows:
• Calculate the maximum power in healthy condition P H mpp by using the single diode model and healthy parameters (p H ).
• For k ∈ [1,5], do the following:
- Replace the parameter p H (k) with the corresponding degraded value p deg (k).
- Estimate the maximum power P k mpp by still using the SDM.
- Calculate the percentage variation of the maximum power: $\Delta P_{mpp,\%}(k) = 100 \cdot \frac{P^{k}_{mpp} - P^{H}_{mpp}}{P^{H}_{mpp}}$

The percentage error evaluation allows a rapid assessment of the ANN capability to detect the degradation on each SDM parameter and the related impact on P mpp .
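The numerical estimation of the maximum power variations can be sketched as follows. As before, this is an illustrative sketch under assumptions: the standard implicit single-diode equation is assumed, it is solved by bisection, the maximum power is located with a simple voltage scan, and the function names and the number of series-connected cells `n_cells` are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q_E = 1.602176634e-19   # elementary charge [C]


def sdm_current(v, p, t_kelvin, n_cells):
    """Current of the single-diode model at voltage v >= 0 (assumed standard form)."""
    i_ph, i_sat, eta, r_s, r_sh = p
    v_t = n_cells * K_B * t_kelvin / Q_E

    def residual(i):
        v_d = v + i * r_s
        return i_ph - i_sat * (math.exp(v_d / (eta * v_t)) - 1.0) - v_d / r_sh - i

    if residual(0.0) <= 0.0:
        return 0.0  # beyond open circuit: current clipped to zero
    lo, hi = 0.0, i_ph
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def max_power(p, t_kelvin, n_cells, n_points=400):
    """Maximum of V * I over a uniform voltage scan from 0 to an upper bound of V_oc."""
    i_ph, i_sat, eta, _, _ = p
    v_t = n_cells * K_B * t_kelvin / Q_E
    v_max = eta * v_t * math.log(i_ph / i_sat + 1.0)  # V_oc when R_sh is neglected
    return max(
        (v_max * j / n_points) * sdm_current(v_max * j / n_points, p, t_kelvin, n_cells)
        for j in range(n_points + 1)
    )


def power_variation(p_h, p_deg, t_kelvin, n_cells):
    """Percentage variation of P_mpp when one healthy parameter at a time is degraded."""
    p_ref = max_power(p_h, t_kelvin, n_cells)
    delta = []
    for k in range(len(p_h)):
        p_mix = list(p_h)
        p_mix[k] = p_deg[k]  # replace only the k-th parameter
        delta.append(100.0 * (max_power(p_mix, t_kelvin, n_cells) - p_ref) / p_ref)
    return delta
```

Replacing one parameter at a time isolates the contribution of each estimated variation to the power loss, so that a large percentage variation on a low-sensitivity parameter can be recognized as having a small effect on P mpp .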

Figure 13. ANN identification of degraded curve with −5% of variation on I ph for two different irradiance and temperature conditions. (Vertical axis: PV current [A]; legend: estimation with ANN SDM parameters, degraded I − V curve, healthy I − V curve, 3 points passed to ANN.)

From the results reported in Tables 5-9, in particular ∆p % , it is evident that the ANN allows the degradation effect introduced on the I − V curve to be associated, with a good approximation, with the corresponding SDM parameter. Nevertheless, in some cases, the results of the ANN parameter identification are not completely satisfactory. For instance, in the first example of Table 5, the ANN estimates a −43.5% reduction in R sh , which does not correspond to a real degradation of such a parameter. The wrong estimation of R sh , which may also occur for the other SDM parameters, is due to the intrinsic nature of the ANN to provide generalized results when the input data change. Moreover, since the I − V curve sensitivity with respect to some parameters is very low, errors on the estimation of these parameters are more likely and more frequent. The results can be improved if the proposed procedure is repeated and the parameter degradation is detected by considering the average values. Some examples are reported in the following section.

Table 9. SDM parameters estimated with ANN for I − V curves of Figure 17.

Improving the ANN Results with Repeated Tests
By assuming that a degradation phenomenon is occurring permanently, the ANN parameter identification method can be executed frequently (e.g., more than once per day) without affecting the normal operation of the PV system, and ∆p % can be estimated for all cases. Since the effective degradation of the PV panel is not related to the changes in the environmental conditions, the average values of ∆p % are considered for all tests collected in a short period (e.g., one day). Table 10 shows the average percentage variation of the SDM parameters estimated with the ANN when the process described in the previous section is repeated for a number of trials N test = 100 selected randomly among different environmental conditions. Each row in the table refers to the parameter variation reported in the first column. For example, for the first row, a −4% induced degradation on I ph is estimated, on average, as −3.73%.
It is worth noting that some residual cross-coupled variations appear in the estimation of the other parameters. Nevertheless, if we take into account the different sensitivities of the I − V curve with respect to each parameter, these crossed variations can be acceptable. Indeed, by considering the sensitivity values reported in (4) and by referring to the first row of Table 10, the −3.73% reduction in I ph is reflected in a reduction of 3.76% in the delivered power, while a −22.3% variation in R sh corresponds to a 0.69% power reduction.
Another small anomaly is in the second row, where the variation of the saturation current is not identified accurately (+103%) with respect to the induced degradation of +150% on I sat . In this case, part of the induced degradation on I sat is translated into a variation of η. This can be easily justified by the fact that variations of the I sat and η parameters produce the same deformation on the I − V curve (see S2 and S3 effects in Figure 2); thus, it is more difficult for the ANN to detect the origin of the degradation when different parameters produce similar deformations on the I − V curve.
Finally, the last row indicates the average errors on the estimated SDM parameters when no degradation is applied. Here, it is evident that in the presence of healthy curves, the estimation of the SDM parameters variation tends toward small values, confirming that no degradation is occurring.
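The averaging over repeated tests can be sketched as follows. The function name is hypothetical and the numbers used in any call are illustrative only; they are not taken from Table 10.

```python
def average_variation(delta_p_runs):
    """Element-wise mean of the percentage variations collected over repeated tests."""
    n_runs = len(delta_p_runs)
    n_params = len(delta_p_runs[0])
    return [sum(run[k] for run in delta_p_runs) / n_runs for k in range(n_params)]
```

Each entry of `delta_p_runs` is one vector ∆p % obtained from a single execution of the identification procedure; averaging attenuates the estimation errors that change from test to test, while a permanent degradation remains visible.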

Comparison with Other ANN Solutions
In Table 11, the main characteristics of the proposed ANN architecture are compared with other ANN solutions proposed in recent years, which are briefly discussed in the introduction section. The table includes only the methods suitable for the online diagnosis and fault detection of PV sources.
The comparison is made in terms of the ANN architecture, inputs required during online operation, data for the training phase (usually performed in the offline mode), and PV granularity, which means the level of applicability of the method (panel, string or array level). The different types of detected faults and the level of complexity, which could have a significant impact on the embedded system implementation, are also included in the comparison.
It is worth noting that the solution described in [13] uses two independent ANNs, similar to the approach developed in this manuscript, but the ANN architectures and the type of detected faults are completely different. The other methods are mainly devoted to string or array diagnosis; thus, they are not suitable for detecting degradation in a single PV panel.
The comparison also shows that the selection of the appropriate method strongly depends on the size of the PV source and on what types of faults must be detected.

ANN Results with Experimental Degraded I − V Curves
The developed method is also tested with experimental I − V curves, where the series resistance degradation is applied by connecting in series to the PV module a small resistance of value (∆R s ).
The experimental data refer to an Isofotón I-53 PV module installed on the roof of the Department of Applied Physics II at the University of Málaga (latitude: 36.715° N; longitude: 4.478° W; elevation: 60 m). The main data are summarized in Table 12. The measurement equipment simultaneously acquires the I − V curves, the in-plane irradiance (G) and the PV module temperature (T). Figure 18 shows the effect of the induced R s degradation on the I − V curves under the same environmental conditions. We assume that the acquired I − V curves with ∆R s = 0 correspond to the healthy conditions. The red points on the curves are the only values passed to the ANN together with G and T for estimating the degraded SDM parameters. The ANN is trained only by using the healthy I − V curves in combination with the single-diode model for emulating the degraded curves, as described in the flowchart of Figure 3.
The experimental degraded curves are obtained with ∆R s = 300 mΩ, ∆R s = 1 Ω and ∆R s = 1.5 Ω. For the healthy conditions, R s = 364 mΩ; thus, the induced degradation is 82%, 274%, and 412%, respectively. The SDM parameter variations estimated with the proposed ANN architecture are reported in Figure 19 for different irradiance conditions. In that figure, the R s parameter shows a trend that is in agreement with the expected values. The saturation current changes slightly only at high irradiance and thus cannot be associated with a permanent degradation effect; the other parameters do not exhibit a significant variation with respect to the values estimated for healthy conditions. It is worth noting that the variation of I sat for high irradiance values, which is not associated with a real degradation effect, could be due to the limited dataset used to train the ANN. Indeed, only 75 healthy experimental I − V curves are available for this experimental example, and they are not enough to cover all the operating conditions. An exhaustive experimental campaign could lead to further improvement of the performance of the proposed method. Nevertheless, even with this reduced dataset, the proposed approach is able to isolate the main degradation effects by using the SDM parameter estimation as an indicator of possible faults that could occur in the PV modules.

Figure 19. ANN identification of series resistance for different induced degradation effect and for different environmental conditions.
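As a check of the induced degradation levels, the percentage variation of the series resistance follows directly from the ratio between the added resistance and the healthy value R s = 364 mΩ. The worked values below are a simple arithmetic illustration; they agree with the quoted 82%, 274% and 412% when the decimals are dropped.

```latex
\frac{\Delta R_s}{R_s} \cdot 100:\qquad
\frac{0.3}{0.364} \cdot 100 \approx 82.4\%, \qquad
\frac{1.0}{0.364} \cdot 100 \approx 274.7\%, \qquad
\frac{1.5}{0.364} \cdot 100 \approx 412.1\%
```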

Conclusions
In this paper, the MLP artificial neural network is used for isolating faults and degradation phenomena affecting photovoltaic panels. The parameters of the single diode model are used as indicators of the main degradation phenomena. The SDM parameters differ strongly from panel to panel and depend on the environmental conditions, PV technology and manufacturing process. To identify the PV degradation through the SDM parameters, the proposed method exploits two independent MLP-ANN architectures. The first one is trained to estimate the SDM parameters of the healthy PV panel for the measured environmental conditions. Since only G and T are the inputs for this MLP-ANN, it is able to reproduce the non-linear relations existing among the SDM parameters of the healthy PV panel and the environmental conditions. The second MLP is trained to estimate the SDM parameters of the PV panel in the presence of degradation phenomena affecting the I − V curve for the measured environmental conditions. This second MLP-ANN requires as inputs G, T and three points of the I − V curve measured close to the MPP; thus, it estimates the SDM parameters, including the environmental and degradation effects. To isolate the degradation effect, the difference between the outputs of the two MLP-ANNs is used. The main benefits of the proposed solution are as follows:
• Simple ANN architectures that allow easy implementation on an embedded system.
• The ANN training process requires only experimental healthy I − V curves.
• The complete I − V scan is not required during the online operation since the ANN accepts as input only three experimental points measured around the MPP.
The method is validated with simulation and experimental results showing a good agreement between induced and estimated degradation. In line with the recent expansion of IoT technologies for PV monitoring, the proposed approach represents a useful and relevant AI-based diagnosis tool that can be used to optimize operation and maintenance activities as well as enhance decision-making processes, thereby facilitating the integration of PV systems in smart grids.
Author Contributions: All the authors participated in the conceptualization of the proposed methodology and in the preparation of the submitted manuscript draft. Data analysis, software development and validation were performed by R.A.G.B. with the supervision of G.P. and P.M. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the "Ministero dell'Università e Ricerca" in the frame of PRIN 2017 project: Holistic approach to EneRgy-efficient smart nanOGRIDS (HEROGRIDS #2017WA5ZT3_003), and by FARB funds of the University of Salerno.