Abstract
In this paper, a new methodology for fault detection and diagnosis in photovoltaic systems is proposed. This method employs a novel Euclidean distance-based tree algorithm to classify various considered faults. Unlike the decision tree, which requires the use of the Gini index to split the data, this algorithm mainly relies on computing distances between an arbitrary point in the space and the entire dataset. Then, the minimum and the maximum distances of each class are extracted and ordered in ascending order. The proposed methodology requires four attributes: Solar irradiance, temperature, and the coordinates of the maximum power point (Impp, Vmpp). The developed procedure for fault detection and diagnosis is implemented and applied to classify a dataset comprising seven distinct classes: normal operation, string disconnection, short circuit of three modules, short circuit of ten modules, and three cases of string disconnection, with 25%, 50%, and 75% of partial shading. The obtained results demonstrate the high efficiency and effectiveness of the proposed methodology, with a classification accuracy reaching 97.33%. A comparison study between the developed fault detection and diagnosis methodology and Support Vector Machine, Decision Tree, Random Forest, and K-Nearest Neighbors algorithms is conducted. The proposed procedure shows high performance against the other algorithms in terms of accuracy, precision, recall, and F1-score.
1. Introduction
The worldwide demand for electrical energy continues to increase, and the governments of different nations must face several challenges to effectively respond to this demand. The first challenge is to provide energy for the growing proportion of the world’s population [1,2,3], while the second challenge lies in the production of this energy without causing environmental pollution or causing climatic problems such as global warming [4,5,6].
One of the ways to reduce greenhouse gas emissions is to use renewable energy, such as wind and solar energies. The wind and the sun provide infinite amounts of energy without generating greenhouse gases, unlike fossil-fuel-burning electric power stations. Solar energy allows producing electricity from photovoltaic panels or solar thermal power stations, thanks to sunlight captured by solar panels. Solar energy is clean, does not emit any greenhouse gases, and its source—the sun—is free, inexhaustible, and available everywhere in the world. Several countries around the globe are already at the forefront of renewable energy technologies and generate a large part of their electricity from photovoltaic systems (PVSs).
Like all other industrial processes, a PVS can be subjected, during its operation, to various faults and anomalies, leading to a drop in the performance of the system and even to its total unavailability. These faults reduce the productivity of the installation [7] and generate an additional maintenance cost to restore the system to normal conditions. Hence the importance of a system that detects and diagnoses faults in photovoltaic installations, which contributes to raising production efficiency and reducing maintenance time and cost [8].
Over the past decades, many research contributions have developed methods and algorithms for detecting and diagnosing faults in PV systems [9,10,11]. According to references [12,13], these algorithms can be classified into three distinct categories, whereas in reference [14], they are grouped into six categories. In this brief overview, the first categorization is adopted.
The first category encompasses all algorithms that use mathematical analysis and signal processing. Methods within this category heavily rely on the information extracted solely from the I–V characteristic, whether it pertains to a photovoltaic module (PVM), a PV string, or a photovoltaic array (PVA). Time Domain Reflectometry (TDR) is a technique utilized for identifying faulty photovoltaic modules within a photovoltaic array [15,16]. It has been employed to detect open circuits in grid-connected photovoltaic systems (GCPV) [17,18]. Earth Capacitance Measurement (ECM) has been added to the TDR to identify the PV module disconnected from the PV string [18]. Reference [18] has demonstrated the applicability of the ECM algorithm in PV strings made of both silicon and amorphous silicon.
The second category comprises several algorithms characterized by two main phases: the detection phase, utilizing a PV model, and the subsequent diagnosis phase, employing various methods, such as artificial intelligence [19,20,21,22]. These algorithms detect faults by comparing the measured values extracted from the considered PV generator with the simulated values from the PV model. The residual signal derived from this comparison can be utilized to detect degradation faults [23] as well as various cases of line-to-line faults [24].
The third category encompasses artificial intelligence and machine learning algorithms, including support vector machine (SVM) [25,26,27,28,29], decision tree (DT) [7,30], random forest (RF) [31,32,33], K-nearest neighbors (KNN) [34,35,36], and artificial neural network (ANN) [37,38,39] algorithms. Reference [25] provides a comparison of efficiency and execution time among various multiclass strategies utilizing SVM, such as one vs. all (OVA), adaptive directed acyclic graph (ADAG), and decision-directed acyclic graph (DDAG). The goal of the SVM classification is to categorize data into four classes: module short circuit, inverse bypass diode module, shunted bypass module, and shadowing effect in a module. The OVA strategy has demonstrated significant superiority over the others in terms of efficiency, achieving an 88.33% accuracy rate. In reference [26], the SVM algorithm was employed to detect series faults of 10%, 50%, 70%, and 90% under sunny, cloudy, and rainy weather conditions. The recorded accuracies were 88.3%, 91.5%, and 75.3%, respectively. In reference [27], both the CPA and SVM algorithms were utilized to identify four operating states: normal, open circuit, short circuit, and partial shading. The authors concluded that with k = 6 (number of dimensions), the algorithm achieved an accuracy rate of 100%. In reference [28], the authors used the SVM algorithm to detect faults such as open circuit, short circuit, and lack of solar radiation. The algorithm requires four inputs: short-circuit current Isc, open-circuit voltage Voc, and coordinates of the maximum power point Impp and Vmpp. The algorithm's efficiency and accuracy were enhanced by employing k-fold cross-validation. A drawback of the studies mentioned above is that their authors relied solely on accuracy as the evaluation criterion, whereas additional metrics such as precision and recall would offer a more comprehensive assessment of the algorithms.
In [7], a novel approach based on the DT algorithm is presented. This approach comprises two models: the first model detects faults, while the second model diagnoses four different fault types: short circuit, string, line-to-line, and free faults. The accuracy rates for the first and second models are 99.86% and 99.80%, respectively. Notably, although a confusion matrix was calculated, precision and recall metrics were not evaluated. The utilization of the random forest algorithm for fault detection and classification in PV systems is highlighted in [31]. The method introduced in that study requires the current from each string in the PV array, along with the PV array voltage, as features. The algorithm successfully detects and diagnoses four different faults: degradation, partial shading, line-to-line, and short-circuit faults. The authors employed the grid-search method to optimize the random forest parameters. Experimental and simulation samples were used to evaluate this method, whose accuracy reached 99%. Another method was developed based on RF to detect and diagnose faults in photovoltaic systems [32]. Three criteria were used to evaluate the method: computation time, accuracy, and F1-score. In [34], a modified KNN algorithm was proposed and applied to photovoltaic systems for fault detection and diagnosis. The main modification made by the researchers in this work is to facilitate the selection of the appropriate K value in addition to the distance function. This modification greatly contributed to the increase in the classification speed. Moreover, in [35], an interesting technique that uses the KNN algorithm was developed to detect multiple faults, including line-to-line and partial shading faults. Remarkably, this method relies only on data from the datasheet and achieves an accuracy rate of 99%. Another model was developed based on the combination of KNN and the Exponential Weighted Moving Average (EWMA) [36]. The KNN aims to detect faults on the DC side of the PV system, while the EWMA works to diagnose those faults. Researchers in [37] built a two-stage classifier: the first stage was a model of the PV system to detect faults, while the second stage was devoted to diagnosis, in which two artificial neural networks were used to identify eight different faults. In [38], another approach based on artificial neural networks for fault detection and diagnosis in PV systems was introduced. The authors utilized an ANN with radial basis function (RBF) architecture, relying on two features: power generated and solar irradiance. The achieved accuracy in this study was 97.9% for the 2.2 kW PV system and 97% for the 4.16 kW PV system.
In this work, a novel fault detection and diagnosis algorithm is developed for the DC side of PV systems. This methodology is based on an innovative tree algorithm that mainly depends on calculating Euclidean distances to detect faults effectively when they occur. At first, the algorithm treats the data as two classes and calculates all the distances between a random point in space and the entire dataset. Then, in each class, the minimum and maximum distances are extracted. After that, these distances are arranged in ascending order, which yields one of five possible cases. Based on the case that arises, the data are classified. The algorithm needs four features to function properly: solar irradiance, temperature, and current and voltage at the maximum power point. This algorithm has been applied to seven different classes: normal operation, string disconnection, short circuit of 3 and 10 modules, and three other classes for string disconnection with partial shading of 25%, 50%, and 75%. The efficiency and effectiveness of the methodology are demonstrated by the accuracy rate achieved, which exceeds 97%. In order to further evaluate the algorithm, a comparative study was conducted between the proposed algorithm and several well-known algorithms (support vector machine, K-nearest neighbors, decision tree, and random forest). The comparison results show a clear superiority of the developed algorithm in terms of accuracy, precision, recall, and F1-score.
This paper is organized as follows: Section 2 is dedicated to introducing the developed algorithm, while Section 3 explains the database used and its different categories. Section 4 presents the classification strategy followed in this work. In Section 5, the results obtained are presented and discussed. The last section summarizes the work.
2. Proposed Euclidean-Based Decision Tree Classification Algorithm
Despite the similarities between the proposed algorithm and the decision trees in their data splitting approach, the key distinction lies in using the Euclidean distance for partitioning data instead of the Gini index.
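Initially, a training dataset comprising the feature values of the samples of each of the two classes (class 0 and class 1) is created. Then, the following steps are performed:
- (a) Choose an arbitrary point $p = (p_1, \dots, p_n)$ in the $n$-dimensional feature space.
- (b) Using Equations (1) and (2), compute the Euclidean distances between the chosen point and all samples within the training dataset for each respective class:
$$d_i^{0} = \sqrt{\sum_{j=1}^{n} \left(x_{ij}^{0} - p_j\right)^2} \qquad (1)$$
$$d_i^{1} = \sqrt{\sum_{j=1}^{n} \left(x_{ij}^{1} - p_j\right)^2} \qquad (2)$$
where $x_{ij}^{c}$ is the value of feature $j$ for the $i$-th training sample of class $c$.
- (c) Determine the minimum and maximum distances for each class:
$$d_{\min}^{c} = \min_i d_i^{c}, \qquad d_{\max}^{c} = \max_i d_i^{c}, \qquad c \in \{0, 1\}$$
The minimum and maximum values ($d_{\min}^{0}$, $d_{\max}^{0}$, $d_{\min}^{1}$, $d_{\max}^{1}$) as well as the arbitrary point coordinates are considered the algorithm parameters, which are needed in the testing phase.
- (d) Merge the two vectors $[d_{\min}^{0}, d_{\max}^{0}]$ and $[d_{\min}^{1}, d_{\max}^{1}]$ into one vector. Then, arrange the vector in ascending order. One of the following five cases arises: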
- case 1: $d_{\min}^{0} < d_{\min}^{1} < d_{\max}^{0} < d_{\max}^{1}$. Figure 1 shows a graphical representation of the first case.
Figure 1. Data splitting for case 1.
  - Training samples having distances within the interval $[d_{\min}^{0}, d_{\min}^{1})$ belong to class 0 (pure data in class 0).
  - Training samples having distances within the interval $(d_{\max}^{0}, d_{\max}^{1}]$ belong to class 1 (pure data in class 1).
  - Training samples having distances within the interval $[d_{\min}^{1}, d_{\max}^{0}]$ cannot be classified; therefore, another random point must be chosen for their classification.
- case 2: $d_{\min}^{1} < d_{\min}^{0} < d_{\max}^{1} < d_{\max}^{0}$. Figure 2 shows a graphical representation of the second case.
Figure 2. Data splitting for case 2.
  - Training samples having distances within the interval $[d_{\min}^{1}, d_{\min}^{0})$ belong to class 1 (pure data in class 1).
  - Training samples having distances within the interval $(d_{\max}^{1}, d_{\max}^{0}]$ belong to class 0 (pure data in class 0).
  - Training samples having distances within the interval $[d_{\min}^{0}, d_{\max}^{1}]$ cannot be classified; therefore, another random point must be chosen for their classification.
- case 3: $d_{\min}^{0} < d_{\min}^{1} < d_{\max}^{1} < d_{\max}^{0}$. Figure 3 shows a graphical representation of the third case.
Figure 3. Data splitting for case 3.
  - Training samples having distances within the interval $[d_{\min}^{0}, d_{\min}^{1})$ or $(d_{\max}^{1}, d_{\max}^{0}]$ belong to class 0.
  - Training samples having distances within the interval $[d_{\min}^{1}, d_{\max}^{1}]$ cannot be classified; therefore, another random point must be chosen for their classification.
- case 4: $d_{\min}^{1} < d_{\min}^{0} < d_{\max}^{0} < d_{\max}^{1}$. Figure 4 shows a graphical representation of the fourth case.
Figure 4. Data splitting for case 4.
  - Training samples having distances within the interval $[d_{\min}^{1}, d_{\min}^{0})$ or $(d_{\max}^{0}, d_{\max}^{1}]$ belong to class 1.
  - Training samples having distances within the interval $[d_{\min}^{0}, d_{\max}^{0}]$ cannot be classified; therefore, another random point must be chosen for their classification.
- case 5: $d_{\max}^{0} < d_{\min}^{1}$ or $d_{\max}^{1} < d_{\min}^{0}$. Figure 5 shows a graphical representation of the fifth case.
Figure 5. Data splitting for case 5.
  - Training samples having distances within the interval $[d_{\min}^{0}, d_{\max}^{0}]$ belong to class 0.
  - Training samples having distances within the interval $[d_{\min}^{1}, d_{\max}^{1}]$ belong to class 1.
- (e) If the case that occurred in the previous step is case 1, 2, 3, or 4:
  - Choose another random point.
  - Using Equations (1) and (2), compute the Euclidean distances between the chosen point and the unclassified samples within the training dataset for each respective class.
  - Go to step (c).
- (f) The algorithm iterates through steps (c) to (e) until all data are classified (case 5) or the stopping criterion is met. It employs early stopping as its stopping criterion to effectively mitigate overfitting without compromising the accuracy of the algorithm [40,41,42].
To address the overfitting issue, the difference between test accuracy and training accuracy is calculated. This difference should be minimal (less than 3% for example). If it exceeds this threshold, the training process is halted.
Figure 6 provides a graphical illustration of the proposed algorithm depicting a given possible situation.
Figure 6.
Graphical illustration of the proposed algorithm.
The flowchart of the algorithm is given in Figure 7.
Figure 7.
Flowchart of the proposed algorithm.
Algorithm 1 presents the pseudo-code of the proposed algorithm.
Algorithm 1. Pseudo-code of the proposed algorithm
STEP (a): Generate a random point.
STEP (b): Using Equations (1) and (2), calculate the distances $d^{0}$ and $d^{1}$.
STEP (c): Find $d_{\min}^{0}$, $d_{\max}^{0}$, $d_{\min}^{1}$, $d_{\max}^{1}$, the minimal and maximal distances of each class.
STEP (d): Store the computed distances in a vector and organize it in ascending order.
STEP (e): Set the counter for unclassified data to zero.
If case 1 ($d_{\min}^{0} < d_{\min}^{1} < d_{\max}^{0} < d_{\max}^{1}$)
    For each distance $d$
        If $d \in [d_{\min}^{0}, d_{\min}^{1})$: the point associated with $d$ belongs to class 0
        Elseif $d \in (d_{\max}^{0}, d_{\max}^{1}]$: the point associated with $d$ belongs to class 1
        Else (unclassified): increment the counter
        End if
    End for
    If the stopping criterion is not verified
        Choose a new arbitrary point.
        Calculate the distances $d^{0}$ and $d^{1}$ for the unclassified data.
        Go to step (c).
    Else
        Go to step (f).
    End if
Elseif case 2 ($d_{\min}^{1} < d_{\min}^{0} < d_{\max}^{1} < d_{\max}^{0}$)
    For each distance $d$
        If $d \in [d_{\min}^{1}, d_{\min}^{0})$: the point associated with $d$ belongs to class 1
        Elseif $d \in (d_{\max}^{1}, d_{\max}^{0}]$: the point associated with $d$ belongs to class 0
        Else (unclassified): increment the counter
        End if
    End for
    Apply the same stopping test as in case 1 (go to step (c) with a new arbitrary point, or go to step (f)).
Elseif case 3 ($d_{\min}^{0} < d_{\min}^{1} < d_{\max}^{1} < d_{\max}^{0}$)
    For each distance $d$
        If $d \in [d_{\min}^{0}, d_{\min}^{1})$ or $d \in (d_{\max}^{1}, d_{\max}^{0}]$: the point associated with $d$ belongs to class 0
        Else (unclassified): increment the counter
        End if
    End for
    Apply the same stopping test as in case 1.
Elseif case 4 ($d_{\min}^{1} < d_{\min}^{0} < d_{\max}^{0} < d_{\max}^{1}$)
    For each distance $d$
        If $d \in [d_{\min}^{1}, d_{\min}^{0})$ or $d \in (d_{\max}^{0}, d_{\max}^{1}]$: the point associated with $d$ belongs to class 1
        Else (unclassified): increment the counter
        End if
    End for
    Apply the same stopping test as in case 1.
Else (case 5: $d_{\max}^{0} < d_{\min}^{1}$ or $d_{\max}^{1} < d_{\min}^{0}$)
    For each distance $d$
        If $d \in [d_{\min}^{0}, d_{\max}^{0}]$: the point associated with $d$ belongs to class 0
        If $d \in [d_{\min}^{1}, d_{\max}^{1}]$: the point associated with $d$ belongs to class 1
    End for
    Go to step (f).
End if
STEP (f): End (all data are classified or the stopping criterion is met).
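The steps above can be sketched in a few lines of Python. This is only an illustrative reading of the procedure, not the authors' implementation: the function names (`fit_distance_tree`, `predict_one`), the choice of drawing the arbitrary point inside the bounding box of the data, the iteration cap `max_iter`, and the fallback label for samples that are never resolved are all assumptions. The five cases collapse into one rule here: a sample is resolved as soon as its distance lies inside the range of one class and outside the range of the other.

```python
import math
import random

def fit_distance_tree(X0, X1, max_iter=50, seed=0):
    """Train on class-0 samples X0 and class-1 samples X1 (lists of points)."""
    rng = random.Random(seed)
    dims = range(len(X0[0]))
    # Assumption: the arbitrary point is drawn inside the data bounding box.
    lo = [min(x[j] for x in X0 + X1) for j in dims]
    hi = [max(x[j] for x in X0 + X1) for j in dims]
    nodes = []
    for _ in range(max_iter):                      # assumed iteration cap
        if not X0 or not X1:                       # nothing left to separate
            break
        p = [rng.uniform(lo[j], hi[j]) for j in dims]      # step (a)
        d0 = [math.dist(x, p) for x in X0]                 # step (b)
        d1 = [math.dist(x, p) for x in X1]
        min0, max0, min1, max1 = min(d0), max(d0), min(d1), max(d1)  # step (c)
        nodes.append((p, min0, max0, min1, max1))
        # Steps (d)-(e): samples whose distance lies inside the other class's
        # range stay unclassified and are passed on to the next point.
        X0 = [x for x, d in zip(X0, d0) if min1 <= d <= max1]
        X1 = [x for x, d in zip(X1, d1) if min0 <= d <= max0]
    # Assumption: leftover samples take the majority label of what remains.
    default = 0 if len(X0) >= len(X1) else 1
    return nodes, default

def gap(d, low, high):
    """Distance from d to the interval [low, high]; 0 when d lies inside it."""
    return max(low - d, d - high, 0.0)

def predict_one(nodes, default, x):
    """Walk the stored points until one of them resolves the sample."""
    for p, min0, max0, min1, max1 in nodes:
        d = math.dist(x, p)
        g0, g1 = gap(d, min0, max0), gap(d, min1, max1)
        if g0 < g1:        # inside (or closer to) the class-0 range only
            return 0
        if g1 < g0:        # inside (or closer to) the class-1 range only
            return 1
    return default
```

Assigning a test sample that falls outside both ranges to the nearer range is a design choice of this sketch; the text only states that the stored point and the four extreme distances are the parameters needed in the testing phase.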
3. Dataset Description
The PV array used to generate the dataset, for both healthy and faulty states, consists of two parallel strings. Each string comprises fifteen series-connected Isofoton PVM (106 W–12 V), modeling a realistic photovoltaic (PV) system located at a research center in Bouzareah, Algeria. The Simulink/MATLAB 2015a platform is utilized to simulate the current (Impp) and voltage (Vmpp) at the maximum power point of this PV array under both healthy and faulty states, considering various values of cell temperature (T) and irradiance (G). Through this simulation, 753 samples are generated for each of the considered classes, consisting of the four physical quantities (T, G, Impp, Vmpp). In this study, besides the normal operating state, six faulty states are considered. These states and their corresponding labels are given in Table 1. Environmental factors such as dust accumulation, adverse weather conditions, and snowfall often lead to partial shading faults. The algorithm’s ability to detect and classify such faults has been rigorously tested (refer to Table 1 for the examined faults). Results demonstrate a high degree of accuracy in diagnosing these issues. However, module aging, which gradually degrades PV system performance, was not included in this study. Future research will incorporate this factor to enhance fault detection capabilities further.
Table 1.
Operating states and their labels.
Before fitting the classifier, the dataset needs to be preprocessed. Three steps are included: attribute normalization, where input data are scaled to preserve the consistency in computing distances; data structuring by adding labels for each data point to create a labeled dataset; and filtering out of outlier data.
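Normalization matters here because irradiance values are numerically far larger than currents, so an unscaled Euclidean distance would be dominated by a single attribute. The text does not state which scaling is used; the sketch below assumes a simple min-max scaling to [0, 1], and the function name and sample values are hypothetical.

```python
def min_max_scale(rows):
    """Scale every attribute (column) of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    # Guard against constant columns to avoid a division by zero.
    spans = [h - l if h > l else 1.0 for l, h in zip(lows, highs)]
    return [[(v - l) / s for v, l, s in zip(row, lows, spans)] for row in rows]

# Hypothetical samples in the order (T, G, Impp, Vmpp).
samples = [
    [25.0, 200.0, 1.0, 100.0],
    [45.0, 1000.0, 7.0, 250.0],
    [35.0, 600.0, 4.0, 175.0],
]
scaled = min_max_scale(samples)
# Data structuring: attach a class label to every scaled sample.
labeled = [(row, 0) for row in scaled]
```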
The proposed algorithm is a good candidate for implementation in large-scale PV farms. It relies on only four attributes (T, G, Impp, and Vmpp), which keeps fault detection fast and makes it well suited for real-time monitoring and diagnosis in large-scale installations.
As shown in Figure 8, utilizing the Impp as a feature makes it possible to distinguish between three faults: string disconnection, string disconnection with 50% shading, and string disconnection with 75% shading. Meanwhile, in Figure 9, it appears that the Vmpp feature can be used to classify faults such as string disconnection with 25% shading, short circuits of three modules, and short circuits of ten modules. To detect the healthy state class, both Impp and Vmpp features must be used simultaneously.
Figure 8.
Impp for various operating states of the PV array.
Figure 9.
Vmpp for various operating states of the PV array.
The proposed algorithm was first tested on a dataset containing four classes. Subsequently, it was generalized to the more complex dataset with seven classes presented in this study.
4. Fault Detection and Diagnosis Methodology
The flowchart of the classification strategy used is shown in Figure 10. In order for the algorithm to function effectively, the multi-class dataset must be adapted to a bi-class dataset by isolating one class at a time, starting from class 0 and going up to class 6. Therefore, six classifiers must be designed for this purpose.
Figure 10.
Fault detection and diagnosis flowchart.
The first, the second, and the third classifiers isolate classes 0, 1, and 2 from the rest of the classes, respectively. Then, the fourth classifier isolates class 4 from classes 3, 5, and 6. The fifth classifier separates class 3 from classes 5 and 6. Finally, the sixth classifier distinguishes between classes 5 and 6.
Each classifier is designed based on the classification algorithm described previously and uses the four specified features (T, G, Impp, and Vmpp).
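The cascade described above can be sketched as follows. The order of isolated classes (0, 1, 2, 4, 3, 5, with class 6 as the remainder) is taken from the text; the function name `diagnose` and the convention that each binary classifier is a callable returning `True` for the isolated class are assumptions made for illustration.

```python
# Class isolated by classifiers 1 to 6, in the order described in the text.
CASCADE = [0, 1, 2, 4, 3, 5]

def diagnose(x, classifiers):
    """Return the class label of sample x.

    `classifiers` is a list of six callables; classifiers[i](x) is assumed to
    return True when x belongs to the class isolated at stage i.
    """
    for isolated, is_member in zip(CASCADE, classifiers):
        if is_member(x):
            return isolated
    return 6  # whatever remains after the sixth classifier
```

With this layout, a healthy sample (class 0) is resolved by the first classifier alone, while a class 5 or class 6 sample passes through all six stages.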
5. Results and Discussion
The confusion matrix is a well-known and important mathematical tool in the field of machine learning for evaluating algorithms. The elements of this matrix play a role in calculating the accuracy, precision, and recall metrics. This matrix has two rows and two columns, as illustrated in Table 2.
Table 2.
Confusion matrix used to evaluate the algorithm.
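The four metrics are computed as follows, where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives:
- Accuracy: The proportion of all samples that the algorithm classifies correctly. It is given by $\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$
- Precision: The proportion of correctly predicted positive samples among all samples predicted as positive. It is given by $\text{Precision} = \dfrac{TP}{TP + FP}$
- Recall: The proportion of samples that are really in class 1 which the classifier correctly predicted to be in class 1. It is given by $\text{Recall} = \dfrac{TP}{TP + FN}$
- F1-score: The harmonic mean of the precision and the recall. It is given by $\text{F1} = \dfrac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$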
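As a quick check of these definitions, the four metrics can be computed directly from the entries of a two-by-two confusion matrix. The function name and the counts used in the example are hypothetical and are not taken from the paper's tables.

```python
def binary_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard the ratios against empty denominators.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

# Hypothetical counts: 8 true positives, 6 true negatives,
# 2 false positives, 4 false negatives.
accuracy, precision, recall, f1 = binary_metrics(tp=8, tn=6, fp=2, fn=4)
```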
5.1. Training the Fault Detection and Diagnosis Model Using the Proposed Algorithm
Like any other statistical learning algorithm, the proposed algorithm firstly needs to be trained using a training dataset. Following training, its performance is evaluated using a separate testing set. The dataset is partitioned into two subsets: the training set comprises 87% of the global dataset, while the testing set encompasses 13% of the global dataset. As mentioned earlier, six classifiers are necessary to detect and diagnose the specified faults. To mitigate overfitting effectively without compromising the algorithm’s accuracy, the early stopping criterion is employed to stop the training process of each classifier.
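A minimal sketch of the 87%/13% partition and of the early-stopping test is given below. The text does not say how the split is drawn, so a seeded random shuffle is assumed; the 3% gap is the example threshold quoted in Section 2, and both function names are hypothetical.

```python
import random

def split_indices(n_samples, test_fraction=0.13, seed=0):
    """Shuffle sample indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def should_stop(train_accuracy, test_accuracy, max_gap=0.03):
    """Early stopping: halt when train and test accuracy drift apart."""
    return abs(train_accuracy - test_accuracy) > max_gap

# Seven classes of 753 samples each give 5271 samples in total.
train_idx, test_idx = split_indices(7 * 753)
```

With 5271 samples, this rounding yields 685 test samples and 4586 training samples.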
The accuracy metric for each classifier is calculated at every iteration and illustrated in Figure 11, Figure 12, Figure 13, Figure 14, Figure 15 and Figure 16. As can be seen, for all classifiers, the accuracy value increases over iterations. Classifiers 1 to 6 of the trained model require 23, 9, 4, 6, 16, and 17 steps, respectively, to separate a class from the other classes.
Figure 11.
Evolution of accuracy for the first classifier.
Figure 12.
Evolution of accuracy as a function of iterations in the second classifier.
Figure 13.
Evolution of accuracy as a function of iterations in the third classifier.
Figure 14.
Evolution of accuracy as a function of iterations in the fourth classifier.
Figure 15.
Evolution of accuracy as a function of iterations in the fifth classifier.
Figure 16.
Evolution of accuracy as a function of iterations in the sixth classifier.
5.2. Evaluating the Performance of the Obtained Model Using the Proposed Algorithm
The performance of the proposed approach is evaluated using the average values of accuracy, precision, and recall. The higher these values, the better the performance of the proposed approach. The confusion matrices and the values of the three metrics are calculated from the test dataset. The results are presented in Table 3 and Table 4, respectively.
Table 3.
Confusion matrices for the obtained model.
Table 4.
Metric values for the obtained model.
Table 3 presents the confusion matrix values for each classifier within the resulting model. These values were used to calculate the precision, accuracy, and recall measures for all six classifiers and are presented in Table 4. The last row of Table 4 shows the average values of the three measures, which represent the measures of the resulting model.
Two additional train–test variants were conducted, and the results are presented in Table 5 and Table 6.
Table 5.
Metric values for the obtained model with a second train–test variant.
Table 6.
Metric values for the obtained model with a third train–test variant.
5.3. Comparative Study of Various Machine Learning Algorithms
In this comparative study, the fault detection and diagnosis model depicted in the flowchart of Figure 10 is constructed using various statistical methods, namely, the SVM algorithm [27], the DT algorithm [8,31], the RF algorithm [32,33,34], and the KNN algorithm [35,36,37].
The confusion matrices for the obtained model using the aforementioned algorithms are provided in Table 7, while Table 8 presents the values for accuracy, precision, recall, F1-score, and execution time, along with the average values of these metrics.
Table 7.
Confusion matrices for the obtained model using the four algorithms.
Table 8.
Metric values for the obtained model using the four algorithms.
Table 9 collects the average values of the metrics for the proposed algorithm as well as for the SVM, DT, RF, and KNN algorithms.
Table 9.
Metric average values.
From the table, it can be seen that the performance of the proposed algorithm is superior to the rest of the algorithms in terms of accuracy, precision, recall, and F1-score. Although the proposed algorithm is slower in fault detection compared to other algorithms, it remains suitable for industrial application and real-time operation.
Figure 17, Figure 18, Figure 19, Figure 20 and Figure 21 display the fault detection and diagnosis results using the proposed algorithm-based model and those based on the SVM, DT, RF, and KNN algorithms, respectively. It can be seen from these figures that the smallest number of incorrectly classified samples is obtained with the RF algorithm-based model and the proposed algorithm-based model. The models fail to correctly classify all data due to data overlap and overfitting issues.
Figure 17.
Fault detection and diagnosis results using the proposed algorithm-based model.
Figure 18.
Fault detection and diagnosis results using the SVM algorithm-based model.
Figure 19.
Fault detection and diagnosis results using the DT algorithm-based model.
Figure 20.
Fault detection and diagnosis results using the RF algorithm-based model.
Figure 21.
Fault detection and diagnosis results using the KNN algorithm-based model.
6. Conclusions
In this work, an enhanced approach was proposed for identifying and diagnosing PV array faults. A comparative study was conducted between the proposed algorithm-based model and models based on four statistical learning algorithms: SVM, DT, RF, and KNN. Unlike the decision tree algorithm, which uses the Gini index to split the data into two classes, the proposed algorithm calculates Euclidean distances between an arbitrary point and the dataset samples. It then utilizes the minimal and maximal distances to separate the samples belonging to each class.
In this study, four features, namely, cell temperature, irradiance, and current and voltage at the maximum power point, were utilized. The proposed methodology effectively distinguishes the normal operating condition from the abnormal states, achieving a classification accuracy of 97.33%. The comparative investigation demonstrated that the proposed approach outperformed the other methods considered in this work in terms of accuracy, precision, recall, and F1-score, although it is slower than the other algorithms in fault detection.
By increasing the number of classifiers, the proposed technique can be easily extended to encompass additional faults.
Author Contributions
Conceptualization, Y.M.; methodology, K.K. and A.C.; software, Y.M. and A.A.; validation, Y.M., K.K. and A.C.; formal analysis, K.K.; investigation, K.K. and A.A.; resources, Y.M., K.K. and S.S.; data curation, Y.M.; writing—original draft preparation, Y.M.; writing—review and editing, K.K., A.C. and S.S.; visualization, Y.M. and A.A.; supervision, K.K., A.C. and S.S.; project administration, K.K. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The data are not available.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| ADAG | Adaptive directed acyclic graph |
| ANN | Artificial neural network |
| DDAG | Decision directed acyclic graph |
| DT | Decision Tree |
| FDD | Fault Detection and Diagnosis |
| FN | False Negative |
| FP | False Positive |
| G | Irradiance |
| GCPV | Grid connected photovoltaic |
| Impp | Current at the maximum power point |
| Isc | Current of short circuit |
| KNN | K-Nearest Neighbors |
| OVA | One vs. all |
| PV | Photovoltaic |
| PVM | Photovoltaic module |
| PVS | Photovoltaic system |
| RF | Random Forest |
| SVM | Support Vector Machine |
| T | Temperature |
| TDR | Time domain reflectometry |
| TN | True Negative |
| TP | True Positive |
| Vmpp | Voltage at the maximum power point |
| Voc | Voltage of open circuit |
References
- Sohani, A.; Sayyaadi, H.; Cornaro, C.; Shahverdian, M.; Pierro, M.; Moser, D.; Karimi, N.; Doranehgard, M.; Li, L.K. Using machine learning in photovoltaics to create smarter and cleaner energy generation systems: A comprehensive review. J. Clean. Prod. 2022, 364, 132701. [Google Scholar]
- Mughal, S.; Sood, Y.R.; Jarial, R. A review on solar photovoltaic technology and future trends. Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol. 2018, 4, 227–235. [Google Scholar]
- Madeti, S.R.; Singh, S. A comprehensive study on different types of faults and detection techniques for solar photovoltaic system. Sol. Energy 2017, 158, 161–185.
- Hernandez, J.; Velasco, D.; Trujillo, C. Analysis of the effect of the implementation of photovoltaic systems like option of distributed generation in Colombia. Renew. Sustain. Energy Rev. 2011, 15, 2290–2298.
- Qais, M.H.; Hasanien, H.M.; Alghuwainem, S.; Nouh, A.S. Coyote optimization algorithm for parameters extraction of three-diode photovoltaic models of photovoltaic modules. Energy 2019, 187, 116001.
- Kumar, B.P.; Ilango, G.S.; Reddy, M.J.B.; Chilakapati, N. Online fault detection and diagnosis in photovoltaic systems using wavelet packets. IEEE J. Photovolt. 2017, 8, 257–265.
- Benkercha, R.; Moulahoum, S. Fault detection and diagnosis based on C4.5 decision tree algorithm for grid connected PV system. Sol. Energy 2018, 173, 610–634.
- Villarini, M.; Cesarotti, V.; Alfonsi, L.; Introna, V. Optimization of photovoltaic maintenance plan by means of a FMEA approach based on real data. Energy Convers. Manag. 2017, 152, 1–12.
- Pillai, D.S.; Rajasekar, N. A comprehensive review on protection challenges and fault diagnosis in PV systems. Renew. Sustain. Energy Rev. 2018, 91, 18–40.
- Zhao, Q.; Shao, S.; Lu, L.; Liu, X.; Zhu, H. A new PV array fault diagnosis method using fuzzy C-mean clustering and fuzzy membership algorithm. Energies 2018, 11, 238.
- Hazra, A.; Das, S.; Basu, M. An efficient fault diagnosis method for PV systems following string current. J. Clean. Prod. 2017, 154, 220–232.
- Khelil, C.K.M.; Amrouche, B.; Soufiane Benyoucef, A.; Kara, K.; Chouder, A. New intelligent fault diagnosis (IFD) approach for grid-connected photovoltaic systems. Energy 2020, 211, 118591.
- Khelil, C.K.M.; Amrouche, B.; Kara, K.; Chouder, A. The impact of the ANN’s choice on PV systems diagnosis quality. Energy Convers. Manag. 2021, 240, 114278.
- Mellit, A.; Tina, G.M.; Kalogirou, S.A. Fault detection and diagnosis methods for photovoltaic systems: A review. Renew. Sustain. Energy Rev. 2018, 91, 1–17.
- Takashima, T.; Yamaguchi, J.; Otani, K.; Kato, K.; Ishida, M. Experimental studies of failure detection methods in PV module strings. In Proceedings of the 2006 IEEE 4th World Conference on Photovoltaic Energy Conference, Waikoloa, HI, USA, 7–12 May 2006; Volume 2, pp. 2227–2230.
- Takashima, T.; Yamaguchi, J.; Ishida, M. Fault detection by signal response in PV module strings. In Proceedings of the 2008 33rd IEEE Photovoltaic Specialists Conference, San Diego, CA, USA, 11–16 May 2008; pp. 1–5.
- Takashima, T.; Yamaguchi, J.; Otani, K.; Oozeki, T.; Kato, K.; Ishida, M. Experimental studies of fault location in PV module strings. Sol. Energy Mater. Sol. Cells 2009, 93, 1079–1082.
- Takashima, T.; Yamaguchi, J.; Ishida, M. Disconnection detection using earth capacitance measurement in photovoltaic module string. Prog. Photovolt. Res. Appl. 2008, 16, 669–677.
- Chouder, A.; Silvestre, S. Automatic supervision and fault detection of PV systems based on power losses analysis. Energy Convers. Manag. 2010, 51, 1929–1937.
- Silvestre, S.; Chouder, A.; Karatepe, E. Automatic fault detection in grid connected PV systems. Sol. Energy 2013, 94, 119–127.
- Spataru, S.; Sera, D.; Kerekes, T.; Teodorescu, R. Photovoltaic array condition monitoring based on online regression of performance model. In Proceedings of the 2013 IEEE 39th Photovoltaic Specialists Conference (PVSC), Tampa, FL, USA, 16–21 June 2013; pp. 0815–0820.
- Drews, A.; De Keizer, A.; Beyer, H.G.; Lorenz, E.; Betcke, J.; Van Sark, W.; Heydenreich, W.; Wiemken, E.; Stettler, S.; Toggweiler, P.; et al. Monitoring and remote failure detection of grid-connected PV systems based on satellite observations. Sol. Energy 2007, 81, 548–564.
- Bastidas-Rodriguez, J.D.; Franco, E.; Petrone, G.; Ramos-Paja, C.A.; Spagnuolo, G. Quantification of photovoltaic module degradation using model based indicators. Math. Comput. Simul. 2017, 131, 101–113.
- Dhoke, A.; Sharma, R.; Saha, T.K. An approach for fault detection and location in solar PV systems. Sol. Energy 2019, 194, 197–208.
- Mandal, R.K.; Kale, P.G. Assessment of different multiclass SVM strategies for fault classification in a PV system. In Proceedings of the 7th International Conference on Advances in Energy Research, Singapore, 13–15 August 2019; Springer: Singapore, 2021; pp. 747–756.
- Cho, K.H.; Jo, H.C.; Kim, E.S.; Park, H.A.; Park, J.H. Failure diagnosis method of photovoltaic generator using support vector machine. J. Electr. Eng. Technol. 2020, 15, 1669–1680.
- Chen, L.; Lin, P.; Zhang, J.; Chen, Z.; Lin, Y.; Wu, L.; Cheng, S. Fault diagnosis and classification for photovoltaic arrays based on principal component analysis and support vector machine. IOP Conf. Ser. Earth Environ. Sci. 2018, 188, 012089.
- Wang, J.; Gao, D.; Zhu, S.; Wang, S.; Liu, H. Fault diagnosis method of photovoltaic array based on support vector machine. Energy Sources Part A Recovery Util. Environ. Eff. 2019, 45, 5380–5395.
- Yi, Z.; Etemadi, A.H. Line-to-line fault detection for photovoltaic arrays based on multiresolution signal decomposition and two-stage support vector machine. IEEE Trans. Ind. Electron. 2017, 64, 8546–8556.
- Dhibi, K.; Mansouri, M.; Bouzrara, K.; Nounou, H.; Nounou, M. An enhanced ensemble learning-based fault detection and diagnosis for grid-connected PV systems. IEEE Access 2021, 9, 155622–155633.
- Chen, Z.; Han, F.; Wu, L.; Yu, J.; Cheng, S.; Lin, P.; Chen, H. Random forest based intelligent fault diagnosis for PV arrays using array voltage and string currents. Energy Convers. Manag. 2018, 178, 250–264.
- Dhibi, K.; Fezai, R.; Mansouri, M.; Trabelsi, M.; Kouadri, A.; Bouzara, K.; Nounou, H.; Nounou, M. Reduced kernel random forest technique for fault detection and classification in grid-tied PV systems. IEEE J. Photovolt. 2020, 10, 1864–1871.
- Dhibi, K.; Fezai, R.; Bouzrara, K.; Mansouri, M.; Nounou, H.; Nounou, M.; Trabelsi, M. Enhanced RF for Fault Detection and Diagnosis of Uncertain PV systems. In Proceedings of the 2021 IEEE 18th International Multi-Conference on Systems, Signals & Devices (SSD), Monastir, Tunisia, 22–25 March 2021; pp. 103–108.
- Wang, L.; Qiu, H.; Yang, P.; Gao, J. Fault Diagnosis Method Based on An Improved KNN Algorithm for PV strings. In Proceedings of the 2021 IEEE 4th Asia Conference on Energy and Electrical Engineering (ACEEE), Virtual, 10–12 September 2021; pp. 91–98.
- Madeti, S.R.; Singh, S. Modeling of PV system based on experimental data for fault detection using kNN method. Sol. Energy 2018, 173, 139–151.
- Harrou, F.; Taghezouit, B.; Sun, Y. Improved kNN-based monitoring schemes for detecting faults in PV systems. IEEE J. Photovolt. 2019, 9, 811–821.
- Karatepe, E.; Hiyama, T. Controlling of artificial neural network for fault diagnosis of photovoltaic array. In Proceedings of the 2011 IEEE 16th International Conference on Intelligent System Applications to Power Systems, Crete, Greece, 25–28 September 2011; pp. 1–6.
- Chine, W.; Mellit, A.; Lughi, V.; Malek, A.; Sulligoi, G.; Pavan, A.M. A novel fault diagnosis technique for photovoltaic systems based on artificial neural networks. Renew. Energy 2016, 90, 501–512.
- Hussain, M.; Dhimish, M.; Titarenko, S.; Mather, P. Artificial neural network based photovoltaic fault detection algorithm integrating two bi-directional input parameters. Renew. Energy 2020, 155, 1272–1292.
- Bai, Y.; Yang, E.; Han, B.; Yang, Y.; Li, J.; Mao, Y.; Niu, G.; Liu, T. Understanding and improving early stopping for learning with noisy labels. Adv. Neural Inf. Process. Syst. 2021, 34, 24392–24403.
- Prechelt, L. Automatic early stopping using cross validation: Quantifying the criteria. Neural Netw. 1998, 11, 761–767.
- Zhang, T.; Yu, B. Boosting with early stopping: Convergence and consistency. Ann. Stat. 2005, 33, 1538–1579.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).