Adaptive Deep Belief Networks and LightGBM-Based Hybrid Fault Diagnostics for SCADA-Managed PV Systems: A Real-World Case Study
Abstract
1. Introduction
1.1. Importance of Fault Detection in PV Systems
1.2. Challenges in PV Fault Diagnosis
1.3. Latest Trends in PV Fault Detection Using Hybrid Models
- Development of an improved hybrid DBN–LightGBM framework for fault diagnosis in photovoltaic (PV) systems, using real-time operational data collected from large-scale grid-connected PV systems.
- Combination of DBN-based deep feature extraction with LightGBM classification, leveraging the strengths of both deep representation learning and gradient boosting, and using PV operational parameters such as DC current, DC voltage, irradiance, module temperature, and inverter output power.
- Utilization of the QASP PV Fault Detection Variables (QPV-FDV) Dataset, a practical dataset collected from a 100 MW operational PV plant, offering real fault scenarios rarely available in open-source databases.
- The proposed model demonstrates superior classification accuracy and reliability for various PV faults, while ensuring lightweight deployment feasibility for real-time PV monitoring systems.
- The framework is designed to scale to larger PV systems and to support real-time deployment in smart grid environments, enabling early fault detection and efficient maintenance planning.
2. Preliminaries
2.1. Photovoltaic (PV) System Overview
2.2. PV Fault Types
2.3. Research Hypotheses and Variable Structure
- Independent Variables: Environmental conditions, including solar irradiance (GHI), temperature, wind speed, humidity, and soiling loss (SL).
- Dependent Variables: Power output (kW), performance ratio (PR), and overall energy efficiency of the solar plant.
- Control Variables: Installed plant capacity (100 MW), module specifications (JA Solar 255 W), and system-configuration and location-specific constants (e.g., tilt angle and fixed orientation).
- Moderating Variables: Temporal factors such as time of day, season, and daily irradiance fluctuation, which influence the strength and direction of the relationships between the key variables.
2.4. Adaptive Deep Belief Network (A-DBN)
3. Development of Hybrid Model (Adaptive DBN + LightGBM) for PV Fault Diagnosis
Algorithm 1. SCADA-based hybrid PV fault diagnosis using DBN + LightGBM
Start
Input Parameters (SCADA features): The operational variables described in Section 4.4, such as DC voltage, DC current, irradiance, module temperature, and inverter output power.
Output Parameters: The predicted fault class for each sample.
Data Preprocessing: Normalize each input feature $x$ using min-max scaling, $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$.
Deep Feature Extraction using Adaptive DBN: The DBN is a stack of 3 Restricted Boltzmann Machines (RBMs) trained layer by layer in an unsupervised fashion. Each RBM layer is trained to model the probability distribution of its input.
Adaptive Training Strategy:
Fault Classification using LightGBM: The extracted feature vector is used as input to the LightGBM classifier, which is trained by minimizing its loss objective.
Output: The predicted fault class.
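The equations referred to in Algorithm 1 did not survive in this version of the text. As a hedged reminder only, the standard textbook forms of an RBM's joint distribution and of a regularized gradient-boosting objective are sketched below. The symbols ($\mathbf{v}$, $\mathbf{h}$, $W$, $\mathbf{a}$, $\mathbf{b}$, $\ell$, $\Omega$, $f_k$) are assumptions chosen for illustration and may differ from the notation used by the authors.

```latex
\begin{align}
  % RBM energy over visible units v and hidden units h (assumed notation)
  E(\mathbf{v}, \mathbf{h}) &= -\mathbf{a}^{\top}\mathbf{v} - \mathbf{b}^{\top}\mathbf{h} - \mathbf{v}^{\top} W \mathbf{h}, \\
  % Joint probability modelled by each RBM layer, with partition function Z
  P(\mathbf{v}, \mathbf{h}) &= \frac{\exp\bigl(-E(\mathbf{v}, \mathbf{h})\bigr)}{Z},
  \qquad Z = \sum_{\mathbf{v}, \mathbf{h}} \exp\bigl(-E(\mathbf{v}, \mathbf{h})\bigr), \\
  % Generic boosting objective: data loss plus a complexity penalty per tree f_k
  \mathcal{L} &= \sum_{i=1}^{N} \ell\bigl(y_i, \hat{y}_i\bigr) + \sum_{k=1}^{K} \Omega(f_k), \\
  % Multi-class log loss over C fault classes with predicted probabilities p_{i,c}
  \ell\bigl(y_i, \hat{y}_i\bigr) &= -\sum_{c=1}^{C} \mathbb{1}[y_i = c] \, \log p_{i,c}.
\end{align}
```

Here $N$ is the number of training samples, $K$ the number of boosted trees, and $C$ the number of fault classes (seven in the QPV-FDV labels).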
Hyperparameter Optimization
4. Project Profile and Specifications of Quaid-e-Azam Solar Park (QASP)—100 MWp
4.1. JA 255-Watt Solar Module Description
4.2. I-V and P-V Characteristics of the JA 255 W Solar Panel Under Varying Temperature and Irradiance Conditions
4.3. QASP PV Fault Detection Dataset
4.4. QASP PV Fault Detection Variables (QPV-FDV) Dataset
- DC Voltage: This is the fundamental parameter that shows the electrical potential difference across a PV module. Voltage fluctuations indicate how modules behave under various environmental and operating conditions.
- DC Current: This is the current generated by the solar modules as a result of the photovoltaic effect. Current measurements help determine how efficiently the modules generate power under different conditions.
- Solar Irradiance: This is the rate of solar radiation reaching the surface of the PV module and is usually expressed in W/m². It has a direct influence on the power generation of the PV system. The total solar irradiance is the sum of the direct (beam), diffuse, and ground-reflected components.
- Module Temperature: This is the operating temperature at the surface of the PV module. Module temperature has a strong effect on energy conversion efficiency, since output voltage tends to decrease at higher temperatures.
- Performance Ratio (PR): The PR is a dimensionless measure of a PV plant's actual output relative to its maximum potential output, accounting for losses due to temperature, shading, and other inefficiencies. It is calculated as the ratio of the actual output to the product of the active area and the reference irradiance.
- Inverter Output Power: This is the alternating current (AC) power produced after conversion from direct current (DC); it is the usable electrical power delivered to the grid.
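The variables listed above can be combined into simple derived quantities. The following is a minimal sketch, not the authors' implementation: the `ScadaSample` class, its field names, and the helper functions are hypothetical. The performance ratio shown is an instantaneous, rated-power-normalised variant (AC power divided by rated power scaled by irradiance), which is an assumption and differs from the area-based description in the list. The example values reuse the JA 255 W voltage and current at maximum power from the module table.

```python
from dataclasses import dataclass

STC_IRRADIANCE_W_M2 = 1000.0  # reference irradiance at standard test conditions


@dataclass
class ScadaSample:
    """One hypothetical SCADA reading for a single module (illustrative only)."""
    dc_voltage_v: float
    dc_current_a: float
    irradiance_w_m2: float
    module_temp_c: float
    ac_power_w: float


def dc_power_w(s: ScadaSample) -> float:
    """DC power is the product of DC voltage and DC current."""
    return s.dc_voltage_v * s.dc_current_a


def inverter_efficiency(s: ScadaSample) -> float:
    """Ratio of AC output power to DC input power (0.0 if there is no DC input)."""
    p_dc = dc_power_w(s)
    return s.ac_power_w / p_dc if p_dc > 0 else 0.0


def performance_ratio(s: ScadaSample, rated_power_w: float) -> float:
    """Assumed instantaneous PR: AC power over irradiance-scaled rated power."""
    if s.irradiance_w_m2 <= 0 or rated_power_w <= 0:
        return 0.0
    expected_w = rated_power_w * s.irradiance_w_m2 / STC_IRRADIANCE_W_M2
    return s.ac_power_w / expected_w


# Example reading at the module's maximum power point under 1000 W/m^2.
sample = ScadaSample(
    dc_voltage_v=30.29,
    dc_current_a=8.42,
    irradiance_w_m2=1000.0,
    module_temp_c=25.0,
    ac_power_w=204.0,  # made-up AC output for illustration
)
p_dc = dc_power_w(sample)
eta_inv = inverter_efficiency(sample)
pr = performance_ratio(sample, rated_power_w=255.0)
```

A sustained drop in such a ratio while irradiance stays high is the kind of deviation a downstream classifier could learn to associate with a fault, although the source does not state which derived features were actually used.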
4.5. Data Acquisition (SCADA System) and Data Preprocessing Steps
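As an illustration of the min-max normalization step named in Algorithm 1, the sketch below scales each feature to the range [0, 1] using bounds learned from training rows. The function names, feature names, and sample values are hypothetical and are not taken from the QPV-FDV dataset.

```python
from typing import Dict, List, Tuple

Bounds = Dict[str, Tuple[float, float]]


def fit_min_max(rows: List[Dict[str, float]]) -> Bounds:
    """Record the per-feature minimum and maximum seen in the training rows."""
    bounds: Bounds = {}
    for row in rows:
        for name, value in row.items():
            lo, hi = bounds.get(name, (value, value))
            bounds[name] = (min(lo, value), max(hi, value))
    return bounds


def apply_min_max(row: Dict[str, float], bounds: Bounds) -> Dict[str, float]:
    """Scale one row with x' = (x - min) / (max - min), feature by feature."""
    scaled = {}
    for name, value in row.items():
        lo, hi = bounds[name]
        span = hi - lo
        # A constant feature has zero span; map it to 0.0 to avoid dividing by zero.
        scaled[name] = (value - lo) / span if span > 0 else 0.0
    return scaled


# Made-up training rows for illustration.
train = [
    {"dc_voltage": 28.0, "irradiance": 200.0},
    {"dc_voltage": 30.0, "irradiance": 600.0},
    {"dc_voltage": 32.0, "irradiance": 1000.0},
]
bounds = fit_min_max(train)
scaled_train = [apply_min_max(r, bounds) for r in train]
```

Fitting the bounds on training data only and reusing them for validation and live data avoids leaking test-set statistics into the model. Values outside the training range will then fall outside [0, 1], which may itself be a useful signal.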
4.6. Proposed Methodology of Hybrid DBN–LightGBM Model for PV Fault Diagnosis
5. Results and Discussion
6. Conclusions and Future Work
Future Work
- Apply the wavelet transform or the Hilbert–Huang transform during time-series preprocessing to extract time-frequency patterns from SCADA signals.
- Expand the fault dataset by capturing additional data streams under varying seasonal conditions, irradiance levels, and inverter loads to improve the accuracy of the hybrid DBN–LightGBM model.
- Deploy real-time data processing frameworks and validate the proposed model against live SCADA data streams under diverse physical conditions.
- Develop an integrated version of the model for deployment in diagnostic systems located at the inverter or plant controller level, optimizing the model for edge computing environments.
- Explore cross-site or cross-plant model validation using transfer learning to assess model performance across multiple PV installations or geographic regions.
- Establish benchmarks for diagnostic accuracy and computational efficiency by comparing the proposed approach with emerging ensemble and deep hybrid models, such as CNN-XGBoost and transformer-based classifiers.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Mohamed Abd El Razik, A. The importance of solar energy projects in achieving sustainable development. Int. J. Adv. Res. Plan. Sustain. Dev. 2022, 5, 11–42.
- Maka, A.O.; Alabid, J.M. Solar energy technology and its roles in sustainable development. Clean Energy 2022, 6, 476–483.
- Hong, Y.Y.; Pula, R.A. Methods of photovoltaic fault detection and classification: A review. Energy Rep. 2022, 8, 5898–5929.
- Mehmood, A.; Sher, H.A.; Murtaza, A.F.; Al-Haddad, K. Fault detection, classification and localization algorithm for photovoltaic array. IEEE Trans. Energy Convers. 2021, 36, 2945–2955.
- Pimpalkar, R.; Sahu, A.; Patil, R.B.; Roy, A. A comprehensive review on failure modes and effect analysis of solar photovoltaic system. Mater. Today Proc. 2023, 77, 687–691.
- Seghiour, A.; Ait Abbas, H.; Chouder, A.; Rabhi, A. Deep learning method based on autoencoder neural network applied to faults detection and diagnosis of photovoltaic system. Simul. Model. Pract. Theory 2023, 123, 102704.
- Kaitouni, S.I.; AitAbdelmoula, I.; Es-sakali, N.; Mghazli, M.O.; Er-retby, H.; Zoubir, Z.; El Mansouri, F.; Ahachad, M.; Brigui, J. Implementing a Digital Twin-based fault detection and diagnosis approach for optimal operation and maintenance of urban distributed solar photovoltaics. Renew. Energy Focus 2024, 48, 100530.
- Kandeal, A.W.; Elkadeem, M.R.; Thakur, A.K.; Abdelaziz, G.B.; Sathyamurthy, R.; Kabeel, A.E.; Yang, N.; Sharshir, S.W. Infrared thermography-based condition monitoring of solar photovoltaic systems: A mini review of recent advances. Sol. Energy 2021, 223, 33–43.
- Umar, S.; Nawaz, M.U.; Qureshi, M.S. Deep learning approaches for crack detection in solar PV panels. Int. J. Adv. Eng. Technol. Innov. 2024, 1, 50–72.
- Abouobaida, H.; Abouelmahjoub, Y. New Diagnosis and Fault-Tolerant Control Strategy for Photovoltaic System. Int. J. Photoenergy 2021, 2021, 8075165.
- Buffa, S.; Fouladfar, M.H.; Franchini, G.; Lozano Gabarre, I.; Andrés Chicote, M. Advanced control and fault detection strategies for district heating and cooling systems: A review. Appl. Sci. 2021, 11, 455.
- Saberironaghi, A.; Ren, J.; El-Gindy, M. Defect detection methods for industrial products using deep learning techniques: A review. Algorithms 2023, 16, 95.
- Alrifaey, M.; Lim, W.H.; Ang, C.K.; Natarajan, E.; Solihin, M.I.; Juhari, M.R.M.; Tiang, S.S. Hybrid deep learning model for fault detection and classification of grid-connected photovoltaic system. IEEE Access 2022, 10, 13852–13869.
- Yousif, H.; Al-Milaji, Z. Fault detection from PV images using hybrid deep learning model. Sol. Energy 2024, 267, 112207.
- Qu, J.; Qian, Z.; Pei, Y.; Wei, L.; Zareipour, H.; Sun, Q. An unsupervised hourly weather status pattern recognition and blending fitting model for PV system fault detection. Appl. Energy 2022, 319, 119271.
- Abubakar, A.; Jibril, M.M.; Almeida, C.F.; Gemignani, M.; Yahya, M.N.; Abba, S.I. A novel hybrid optimization approach for fault detection in photovoltaic arrays and inverters using AI and statistical learning techniques: A focus on sustainable environment. Processes 2023, 11, 2549.
- Garud, K.S.; Jayaraj, S.; Lee, M.Y. A review on modeling of solar photovoltaic systems using artificial neural networks, fuzzy logic, genetic algorithm and hybrid models. Int. J. Energy Res. 2021, 45, 6–35.
- Chahine, K. Tree-Based Algorithms and Incremental Feature Optimization for Fault Detection and Diagnosis in Photovoltaic Systems. Eng 2025, 6, 20.
- Berghout, T.; Benbouzid, M.; Bentrcia, T.; Ma, X.; Djurović, S.; Mouss, L.H. Machine learning-based condition monitoring for PV systems: State of the art and future prospects. Energies 2021, 14, 6316.
- Mellit, A.; Kalogirou, S. Assessment of machine learning and ensemble methods for fault diagnosis of photovoltaic systems. Renew. Energy 2022, 184, 1074–1090.
- Levent, İ.; Şahin, G.; Işık, G.; van Sark, W.G. Comparative Analysis of Advanced Machine Learning Regression Models with Advanced Artificial Intelligence Techniques to Predict Rooftop PV Solar Power Plant Efficiency Using Indoor Solar Panel Parameters. Appl. Sci. 2025, 15, 3320.
- Nawaz, R.; Wadood, A.; Mehmood, K.K.; Bukhari, S.B.A.; Albalawi, H.; Alatwi, A.M.; Sajid, M. Gradient Boosting Feature Selection For Integrated Fault Diagnosis in Series-Compensated Transmission Lines. IEEE Access 2025, 13, 63640–63670.
- Abdelsattar, M.; AbdelMoety, A.; Emad-Eldeen, A. Advanced machine learning techniques for predicting power generation and fault detection in solar photovoltaic systems. Neural Comput. Appl. 2025, 37, 8825–8844.
- Suliman, F.; Anayi, F.; Packianather, M. Electrical faults analysis and detection in photovoltaic arrays based on machine learning classifiers. Sustainability 2024, 16, 1102.
- Mansouri, M.; Trabelsi, M.; Nounou, H.; Nounou, M. Deep learning-based fault diagnosis of photovoltaic systems: A comprehensive review and enhancement prospects. IEEE Access 2021, 9, 126286–126306.
- Veerasamy, V.; Wahab, N.I.A.; Othman, M.L.; Padmanaban, S.; Sekar, K.; Ramachandran, R.; Hizam, H.; Vinayagam, A.; Islam, M.Z. LSTM recurrent neural network classifier for high impedance fault detection in solar PV integrated power system. IEEE Access 2021, 9, 32672–32687.
- Amiri, A.F.; Kichou, S.; Oudira, H.; Chouder, A.; Silvestre, S. Fault detection and diagnosis of a photovoltaic system based on deep learning using the combination of a convolutional neural network (cnn) and bidirectional gated recurrent unit (Bi-GRU). Sustainability 2024, 16, 1012.
- Kumari, P.; Toshniwal, D. Long short term memory–convolutional neural network based deep hybrid approach for solar irradiance forecasting. Appl. Energy 2021, 295, 117061.
- Yuan, Z.; Xiong, G.; Fu, X. Artificial neural network for fault diagnosis of solar photovoltaic systems: A survey. Energies 2022, 15, 8693.
- Eldeghady, G.S.; Kamal, H.A.; Hassan, M.A.M. Fault diagnosis for PV system using a deep learning optimized via PSO heuristic combination technique. Electr. Eng. 2023, 105, 2287–2301.
- Ghorbani, N.; Kasaeian, A.; Toopshekan, A.; Bahrami, L.; Maghami, A. Optimizing a hybrid wind-PV-battery system using GA-PSO and MOPSO for reducing cost and increasing reliability. Energy 2018, 154, 581–591.
- Qian, Y.L.; Zhang, H.; Peng, D.G.; Huang, C.H. Fault diagnosis for generator unit based on RBF neural network optimized by GA-PSO. In Proceedings of the 2012 8th International Conference on Natural Computation, Chongqing, China, 29–31 May 2012; IEEE: New York, NY, USA, 2012; pp. 233–236.
- Karthikeyan, G.; Jagadeeshwaran, A. Enhancing solar energy generation: A comprehensive machine learning-based PV prediction and fault analysis system for real-time tracking and forecasting. Electr. Power Compon. Syst. 2024, 52, 1497–1512.
- Leite, D.; Andrade, E.; Rativa, D.; Maciel, A.M. Fault Detection and Diagnosis in Industry 4.0: A Review on Challenges and Opportunities. Sensors 2024, 25, 60.
- Aghaei, M.; Kolahi, M.; Nedaei, A.; Venkatesh, N.S.; Esmailifar, S.M.; Moradi Sizkouhi, A.M.; Aghamohammadi, A.; Oliveira, A.K.; Eskandari, A.; Parvin, P.; et al. Autonomous Intelligent Monitoring of Photovoltaic Systems: An In-Depth Multidisciplinary Review. Prog. Photovolt. Res. Appl. 2025, 33, 381–409.
- Iqbal, S.; Hasan, S.M.; Ayaz, Y.; Din, E.U.; Waqas, A.; Sajid, M. Condition Monitoring of Photovoltaic Panels through Electrical Impedance Spectroscopy and Machine Learning Focusing on Temperature, Dust and Microcracks. IEEE Access 2025, 13, 53039–53052.
- Sairam, S.; Seshadhri, S.; Marafioti, G.; Srinivasan, S.; Mathisen, G.; Bekiroglu, K. Edge-based Explainable Fault Detection Systems for photovoltaic panels on edge nodes. Renew. Energy 2022, 185, 1425–1440.
- Noura, H.N.; Allal, Z.; Salman, O.; Chahine, K. Explainable artificial intelligence of tree-based algorithms for fault detection and diagnosis in grid-connected photovoltaic systems. Eng. Appl. Artif. Intell. 2025, 139, 109503.
- Hassan, I.; Alhamrouni, I.; Younes, Z.; Azhan, N.H.; Mekhilef, S.; Seyedmahmoudian, M.; Stojcevski, A. Explainable deep learning model for grid connected photovoltaic system performance assessment for improving system relaibility. IEEE Access 2024, 12, 120729–120746.
- Li, B.; Delpha, C.; Migan-Dubois, A.; Diallo, D. Fault diagnosis of photovoltaic panels using full I–V characteristics and machine learning techniques. Energy Convers. Manag. 2021, 248, 114785.
- Wang, J.; Gao, D.; Zhu, S.; Wang, S.; Liu, H. Fault diagnosis method of photovoltaic array based on support vector machine. Energy Sources Part A Recovery Util. Environ. Eff. 2023, 45, 5380–5395.
- Lu, X.; Lin, P.; Cheng, S.; Lin, Y.; Chen, Z.; Wu, L.; Zheng, Q. Fault diagnosis for photovoltaic array based on convolutional neural network and electrical time series graph. Energy Convers. Manag. 2019, 196, 950–965.
- Ferlito, S.; Ippolito, S.; Santagata, C.; Schiattarella, P.; Di Francia, G. A Study on an IoT-Based SCADA System for Photovoltaic Utility Plants. Electronics 2024, 13, 2065.
- Ahsan, L.; Baig, M.J.A.; Iqbal, M.T. Low-Cost, Open-Source, Emoncms-Based SCADA System for a Large Grid-Connected PV System. Sensors 2022, 22, 6733.
- Qays, M.O.; Ahmed, M.M.; Parvez Mahmud, M.A.; Abu-Siada, A.; Muyeen, S.M.; Hossain, M.L.; Yasmin, F.; Rahman, M.M. Monitoring of renewable energy systems by IoT-aided SCADA system. Energy Sci. Eng. 2022, 10, 1874–1885.
- Vodapally, S.N.; Ali, M.H. Overview of intelligent inverters and associated cybersecurity issues for a grid-connected solar photovoltaic system. Energies 2023, 16, 5904.
- Hassan, Y.B.; Orabi, M.; Gaafar, M.A. Failures causes analysis of grid-tie photovoltaic inverters based on faults signatures analysis (FCA-B-FSA). Sol. Energy 2023, 262, 111831.
- Song, R.; Wang, Z.; Guo, L.; Zhao, F.; Xu, Z. Deep belief networks (DBN) for financial time series analysis and market trends prediction. World J. Innov. Mod. Technol. 2024, 7, 1–10.
- Wang, H.Y.; Chen, B.; Pan, D.; Lv, Z.A.; Huang, S.Q.; Khayatnezhad, M.; Jimenez, G. Optimal wind energy generation considering climatic variables by Deep Belief network (DBN) model based on modified coot optimization algorithm (MCOA). Sustain. Energy Technol. Assess. 2022, 53, 102744.
- Zambra, M.; Testolin, A.; Zorzi, M. A developmental approach for training deep belief networks. Cogn. Comput. 2023, 15, 103–120.
- Hartanto, A.D.; Kholik, Y.N.; Pristyanto, Y. Stock price time series data forecasting using the light gradient boosting machine (LightGBM) model. JOIV Int. J. Inform. Vis. 2023, 7, 2270–2279.
- Sivagamasundari, S.; Rayudu, M.S. IoT based solar panel fault and maintenance detection using decision tree with light gradient boosting. Meas. Sens. 2023, 27, 100726.
- Rajalakshmi, D.; Sudharson, K.; Suresh Kumar, A.; Vanitha, R. Advancing Fault Detection Efficiency in Wireless Power Transmission with Light GBM for Real-Time Detection Enhancement. Int. Res. J. Multidiscip. Technovation 2024, 6, 54–68.
- Adhya, D.; Chatterjee, S.; Chakraborty, A.K. Performance assessment of selective machine learning techniques for improved PV array fault diagnosis. Sustain. Energy Grids Netw. 2022, 29, 100582.
- Kellil, N.; Aissat, A.; Mellit, A. Fault diagnosis of photovoltaic modules using deep neural networks and infrared images under Algerian climatic conditions. Energy 2023, 263, 125902.
- Joshua, S.R.; Yeon, A.N.; Park, S.; Kwon, K. A Hybrid Machine Learning Approach: Analyzing Energy Potential and Designing Solar Fault Detection for an AIoT-Based Solar–Hydrogen System in a University Setting. Appl. Sci. 2024, 14, 8573.
- Cai, X.; Wai, R.J. Intelligent DC arc-fault detection of solar PV power generation system via optimized VMD-based signal processing and PSO–SVM classifier. IEEE J. Photovolt. 2022, 12, 1058–1077.
- Syed, S.S.; Li, B.; Zheng, A. Detection and Classification of Physical and Electrical Fault in PV Array System by Random Forest-Based Approach. Int. J. Electr. Energy Power Syst. Eng. 2024, 7, 67–84.
- Teta, A.; Korich, B.; Bakria, D.; Hadroug, N.; Rabehi, A.; Alsharef, M.; Bajaj, M.; Zaitsev, I.; Ghoneim, S.S. Fault detection and diagnosis of grid-connected photovoltaic systems using energy valley optimizer based lightweight CNN and wavelet transform. Sci. Rep. 2024, 14, 18907.
- Prasshanth, C.V.; Venkatesh, N.; Sugumaran, V.; Aghaei, M. Enhancing photovoltaic module fault diagnosis: Leveraging unmanned aerial vehicles and autoencoders in machine learning. Sustain. Energy Technol. Assess. 2024, 64, 103674.
- Hu, Z.; Xia, K.; Fan, Z.; Chang, K.; Wu, D. A novel switch open-circuit fault diagnostic method for three-phase inverter based on PSO-DBN. In Proceedings of the 2022 9th International Forum on Electrical Engineering and Automation (IFEEA), Zhuhai, China, 4–6 November 2022; IEEE: New York, NY, USA, 2022; pp. 802–806.
- Et-taleby, A.; Chaibi, Y.; Allouhi, A.; Boussetta, M.; Benslimane, M. A combined convolutional neural network model and support vector machine technique for fault detection and classification based on electroluminescence images of photovoltaic modules. Sustain. Energy Grids Netw. 2022, 32, 100946.
- Alhanaf, A.S.; Farsadi, M.; Balik, H.H. Fault detection and classification in ring power system with DG penetration using hybrid CNN-LSTM. IEEE Access 2024, 12, 59953–59975.
- Aslam, S.; Kumar, K.V.; Babu, T.A.; Rajesh, P. Hamiltonian deep neural network technique optimized with lyrebird optimization algorithm for detecting and classifying power quality disturbances in PV combined DC microgrids system. Environ. Dev. Sustain. 2025, 1–24.
- Sridharan, N.V.; Sugumaran, V. Visual fault detection in photovoltaic modules using decision tree algorithms with deep learning features. Energy Sources Part A Recovery Util. Environ. Eff. 2025, 47, 2020379.
- Kuo, R.J.; Xu, Z.X. Predictive maintenance for wire drawing machine using MiniRocket and GA-based ensemble method. Int. J. Adv. Manuf. Technol. 2024, 134, 1661–1676.
- Zhang, X.; Yang, K.; Zheng, L. Transformer fault diagnosis method based on timesnet and informer. Actuators 2024, 13, 74.
- Wang, Z.; Wang, C.; Ke, Q.; Zhang, B.; Wang, Y.; Zeng, S.; Kang, T.; Lan, T.; Liu, Z.; Liu, C. A fault diagnosis method based on TCN-LSTM-SE neural networks for distributed PV systems. In Proceedings of the 2024 IEEE 2nd International Conference on Sensors, Electronics and Computer Engineering (ICSECE), Jinzhou, China, 29–31 August 2024; IEEE: New York, NY, USA, 2024; pp. 183–189.
- Liu, B.; Sun, K.; Wang, X.; Zhao, J.; Hou, X. Fault diagnosis of photovoltaic strings by using machine learning-based stacking classifier. IET Renew. Power Gener. 2024, 18, 384–397.
Hybrid Model | Feature Extraction | Classifier | Key Strengths | Accuracy (%) | Macro-F1 (%) | Computational Efficiency | Dataset Used |
---|---|---|---|---|---|---|---|
CNN + SVM | CNN | SVM | Excellent for image-based faults | 97.6 | 97.2 | High training cost, moderate inference | QPV-FDV |
LSTM + XGBoost | LSTM | XGBoost | Handles sequential/time-series data | 97.9 | 97.5 | High training cost, slower inference | QPV-FDV |
Autoencoder + RF | Autoencoder | RF | Robust anomaly detection, noise tolerance | 97.2 | 96.8 | Moderate | QPV-FDV |
PCA + GBM | PCA | GBM | Dimensionality reduction + boosting | 96.9 | 96.5 | Very efficient | QPV-FDV |
DBN + LightGBM (Proposed) | DBN | LightGBM | High accuracy, fast training, interpretability | 98.21 | 98.0 | Moderate training, fast inference | QPV-FDV |
Sr. No. | Capability | Associated Component | Contribution of the Proposed Approach | Benefit for PV Fault Diagnosis | Complexity Level |
---|---|---|---|---|---|
1 | Adaptability to Varying PV Fault Patterns | DBN | Learns hierarchical features adaptively for diverse and evolving PV fault signatures. | Detects a wide range of fault modes without manual tuning | Moderate |
2 | Enhanced Classification Performance | DBN + LightGBM | Hybrid integration ensures superior accuracy and generalization over standalone models. | Improves diagnostic reliability across fault conditions | High |
3 | Robustness to Noisy and Non-Linear Data | DBN + LightGBM | DBN captures hidden dependencies; LightGBM manages noisy and irregular PV signals. | Stable performance even under sensor noise | Moderate |
4 | Hybrid Deep–Shallow Learning Synergy | DBN + LightGBM | Combines representational strength of deep networks with efficient boosting. | Achieves optimal trade-off between accuracy and speed | High |
5 | Improved Model Transparency | LightGBM | Provides feature importance metrics for interpretability and engineering insights. | Helps engineers understand the root causes of PV faults | Low |
Methodology | Feature Extraction | Classification | Accuracy | Highlights | Year | Reference |
---|---|---|---|---|---|---|
ResNet–XGBoost | ResNet | XGBoost | 97.0% | Combines deep ResNet features with powerful XGBoost classification. | 2023 | [23] |
CNN–SVM | CNN | SVM | 93.5% | Leverages CNN for spatial features and SVM for robust classification. | 2023 | [24] |
Bi-LSTM–XGBoost | Bi-LSTM | XGBoost | 94.9% | Captures temporal patterns with Bi-LSTM and efficient boosting with XGBoost | 2023 | [25] |
RNN–SVM | RNN | SVM | 92.1% | Sequential learning of RNN with the generalizing power of SVM | 2023 | [26] |
Hybrid PCA–XGBoost | PCA | XGBoost | 91.8% | Dimensionality reduction using PCA followed by XGBoost classification | 2023 | [27] |
ResNet + RF | ResNet | Random Forest | 96.9% | Uses deep feature extraction with ensemble RF for high accuracy | 2023 | [28] |
Autoencoder–SVM | Autoencoder | SVM | 92.5% | Unsupervised feature learning via autoencoder, classified by SVM | 2023 | [29] |
CNN–RF–XGBoost | CNN | RF + XGBoost | 96.4% | A tri-level hybrid integrating deep and ensemble learners | 2024 | [30] |
GRU–SVM | GRU | SVM | 93.2% | Temporal modeling using GRU, paired with efficient SVM | 2024 | [31] |
DenseNet–RF | DenseNet | Random Forest | 95.6% | Dense connections for better feature reuse, classified by RF | 2024 | [32] |
CNN–GBM | CNN | Gradient Boosting | 95.1% | Deep CNN features with a strong boosting-based classifier | 2024 | [33] |
AE–RF hybrid | Autoencoder | Random Forest | 93.4% | Combines unsupervised encoding with ensemble classification | 2024 | [34] |
VGG16–XGBoost | VGG16 | XGBoost | 97.2% | Strong visual feature extractor with accurate boosting | 2024 | [35] |
GRU–LightGBM | GRU | LightGBM | 94.3% | Sequential data modeling with fast and accurate LightGBM | 2024 | [36] |
Hybrid CNN–RF model | CNN | Random Forest | 96.5% | Merges convolutional features with RF ensemble for performance boost | 2025 | [37] |
LSTM–CatBoost model | LSTM | CatBoost | 95.8% | Temporal modeling via LSTM and category-aware boosting with CatBoost | 2025 | [38] |
LSTM–RF model | LSTM | Random Forest | 94.7% | Long-term temporal learning with a robust ensemble classifier | 2025 | [39] |
PCA + Gradient Boosting | Principal Component Analysis (PCA) | Gradient Boosting Machine | 97.3% | Efficient feature reduction with powerful ensemble boosting | 2025 | [40] |
CNN–LightGBM | CNN | LightGBM | 96.0% | CNN-driven feature maps classified with fast LightGBM | 2025 | [41] |
AE–CatBoost | Autoencoder | CatBoost | 94.8% | Efficient representation learning with fast gradient boosting | 2025 | [42] |
DBN–LightGBM | Deep Belief Network | LightGBM | 98.2% | Deep feature learning with DBN and superior boosting via LightGBM | 2025 | Proposed work |
Hyperparameter | Description | Model Component | Value |
---|---|---|---|
learning_rate | Controls the speed of DBN weight updates; smaller values improve stability. | DBN | 0.01 |
batch_size | Number of samples per iteration; affects convergence stability. | DBN | 64 |
n_hidden_layers | Number of RBM layers; deeper networks enhance representation. | DBN | 3 |
hidden_units | Neurons per hidden layer; determines model capacity. | DBN | [128, 64, 32] |
activation_function | Enables non-linear feature learning. | DBN | ReLU |
n_estimators | Total number of boosted trees; balances accuracy and speed. | LightGBM | 200 |
learning_rate | Tree contribution per round; lower values improve generalization. | LightGBM | 0.05 |
max_depth | Tree depth captures complexity but may overfit. | LightGBM | 9 |
subsample | The data fraction used per tree reduces overfitting risk. | LightGBM | 0.8 |
eval_metric | Loss function for optimization; suitable for multi-class tasks. | LightGBM | log loss |
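
For reference, the tabulated hyperparameters can be gathered into plain configuration dictionaries with a basic consistency check. This is a sketch under assumptions: the dictionary names and the `validate_params` helper are hypothetical, and `"multi_logloss"` is assumed to be the LightGBM metric name corresponding to the table's "log loss" entry.

```python
# Values copied from the hyperparameter table; the names of the containers are illustrative.
DBN_PARAMS = {
    "learning_rate": 0.01,
    "batch_size": 64,
    "n_hidden_layers": 3,
    "hidden_units": [128, 64, 32],
    "activation_function": "relu",
}

LGBM_PARAMS = {
    "n_estimators": 200,
    "learning_rate": 0.05,
    "max_depth": 9,
    "subsample": 0.8,
}

# Assumed mapping of the table's "log loss" to LightGBM's multi-class metric name.
EVAL_METRIC = "multi_logloss"


def validate_params(dbn: dict, lgbm: dict) -> list:
    """Return a list of human-readable problems; an empty list means the settings are consistent."""
    problems = []
    if len(dbn["hidden_units"]) != dbn["n_hidden_layers"]:
        problems.append("hidden_units must list one size per hidden layer")
    if any(units <= 0 for units in dbn["hidden_units"]):
        problems.append("hidden layer sizes must be positive")
    if not 0.0 < dbn["learning_rate"] <= 1.0:
        problems.append("DBN learning_rate must be in (0, 1]")
    if not 0.0 < lgbm["learning_rate"] <= 1.0:
        problems.append("LightGBM learning_rate must be in (0, 1]")
    if not 0.0 < lgbm["subsample"] <= 1.0:
        problems.append("subsample must be in (0, 1]")
    if lgbm["n_estimators"] <= 0 or lgbm["max_depth"] <= 0:
        problems.append("n_estimators and max_depth must be positive")
    return problems


issues = validate_params(DBN_PARAMS, LGBM_PARAMS)
```

If these settings were passed to LightGBM's scikit-learn interface, note that row subsampling generally takes effect only when a bagging frequency (`subsample_freq`) greater than zero is also set; the table does not list that parameter, so whether it was used is unknown.
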
Parameter | Symbol | Value | Unit |
---|---|---|---|
Peak Power | | 255 | W |
Open Circuit Voltage | | 37.82 | V |
Voltage at Maximum Power | | 30.29 | V |
Short Circuit Current | | 8.98 | A |
Current at Maximum Power | | 8.42 | A |
Power Tolerance | — | 0 to +5 | W |
Sr. No. | Fault Type | Cause Description | Data Points | Label |
---|---|---|---|---|
1 | Healthy | Normal operating conditions with no visible or electrical fault. | 19,945 | Healthy |
2 | Open Circuit | A break in the circuit path caused by disconnected wiring or cracked cell connections. | 19,782 | OC |
3 | PVG Fault | Ground fault in which one or more conductors make unintended contact with earth. | 19,925 | PVG |
4 | Partial Shading | Caused by clouds, dust, trees, or nearby structures blocking solar irradiance. | 20,101 | PS |
5 | Busbar Fault | Caused by micro-cracks or corrosion, interrupting current flow in busbars. | 19,832 | BBF |
6 | Soiling Fault | Due to dust, bird droppings, or pollution accumulating on the panel surface. | 19,793 | SF |
7 | Hotspot Fault | Localized overheating due to cell damage or shading leads to reduced output. | 19,917 | HSF |
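
The class counts in the table above can be checked for balance with a few lines of arithmetic. The sketch below is illustrative only; the variable names are hypothetical and the counts are copied from the table.

```python
# Data points per label, copied from the fault dataset table.
CLASS_COUNTS = {
    "Healthy": 19945,
    "OC": 19782,
    "PVG": 19925,
    "PS": 20101,
    "BBF": 19832,
    "SF": 19793,
    "HSF": 19917,
}

total_samples = sum(CLASS_COUNTS.values())

# Share of each class in the full dataset.
class_share = {label: n / total_samples for label, n in CLASS_COUNTS.items()}

# Ratio of the largest to the smallest class; 1.0 would be perfectly balanced.
imbalance_ratio = max(CLASS_COUNTS.values()) / min(CLASS_COUNTS.values())

# Accuracy of a trivial classifier that always predicts the most common class.
majority_baseline = max(CLASS_COUNTS.values()) / total_samples
```

With seven classes of nearly equal size, a majority-class baseline reaches only about 14% accuracy, so plain accuracy is a reasonable headline metric here and heavy re-sampling is unlikely to be needed.
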
Rank | Feature Name | Importance Score (%) | Interpretation in PV Fault Context |
---|---|---|---|
1 | Irradiance | 18.7 | Strongly affects power generation; deviations often indicate shading or panel soiling. |
2 | Module Temperature | 16.5 | High sensitivity to thermal faults, hotspots, and cooling inefficiencies. |
3 | DC Current () | 13.2 | Fluctuations reflect string-level mismatch or partial faults. |
4 | DC Voltage () | 12.8 | Drop in voltage indicates disconnection or bypass diode failures. |
5 | AC Power Output | 11.3 | Direct measure of energy loss and overall system health. |
6 | Inverter Efficiency | 8.9 | Critical for identifying inverter-related degradation. |
7 | Ambient Temperature | 7.5 | External condition influencing the thermal behavior of modules. |
8 | Reactive Power (Q) | 6.1 | Useful for capturing inverter imbalance and grid compliance issues. |
9 | Frequency (Hz) | 3.0 | Stability indicator: variations may point to grid disturbances. |
10 | Wind Speed | 2.0 | Secondary factor influencing cooling and structural stress. |
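
One way to use such a ranking is to keep the smallest set of top-ranked features whose cumulative importance reaches a chosen threshold. The sketch below assumes the scores from the table above; the `top_features` helper and the 80% threshold are illustrative choices, not part of the reported method.

```python
from typing import List, Tuple

# (feature name, importance score in %), copied from the ranking table.
IMPORTANCE: List[Tuple[str, float]] = [
    ("Irradiance", 18.7),
    ("Module Temperature", 16.5),
    ("DC Current", 13.2),
    ("DC Voltage", 12.8),
    ("AC Power Output", 11.3),
    ("Inverter Efficiency", 8.9),
    ("Ambient Temperature", 7.5),
    ("Reactive Power (Q)", 6.1),
    ("Frequency (Hz)", 3.0),
    ("Wind Speed", 2.0),
]


def top_features(importance: List[Tuple[str, float]], coverage_pct: float) -> List[str]:
    """Return the highest-ranked features whose scores sum to at least coverage_pct."""
    selected = []
    cumulative = 0.0
    for name, score in sorted(importance, key=lambda item: item[1], reverse=True):
        selected.append(name)
        cumulative += score
        if cumulative >= coverage_pct:
            break
    return selected


selected_80 = top_features(IMPORTANCE, coverage_pct=80.0)
total_importance = sum(score for _, score in IMPORTANCE)
```

Under these scores, six features account for just over 80% of the total importance, which suggests a reduced sensor set could be evaluated for lightweight deployment. That trade-off would need to be validated experimentally.
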
Fault Class | PCN (%) | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) | Specificity (%) | Support |
---|---|---|---|---|---|---|---|
Healthy | 97.6 | 97.6 | 96.66 | 92.6 | 94.59 | 98.5 | 5983 |
Open Circuit | 98.8 | 98.8 | 99.82 | 99.38 | 99.6 | 99.9 | 5935 |
PVG | 97.8 | 97.8 | 97.31 | 97.31 | 97.31 | 98.6 | 5978 |
Partial Shading | 98.1 | 98.1 | 97.85 | 96.89 | 97.36 | 98.9 | 6030 |
BBF | 98.0 | 98.0 | 97.8 | 95.95 | 96.86 | 98.7 | 5949 |
SF | 98.6 | 98.6 | 100.0 | 99.75 | 99.88 | 99.8 | 5937 |
HSF | 99.6 | 99.57 | 98.78 | 99.43 | 99.1 | 99.9 | 5975 |
Macro Avg | 98.07 | 98.06 | 98.32 | 97.61 | 97.96 | 99.06 | - |
Weighted Avg | 98.06 | 98.06 | 98.33 | 97.69 | 97.95 | 99.02 | - |
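
The per-class F1 scores in the table follow from precision and recall through the harmonic mean. The sketch below recomputes them from the tabulated values as a sanity check; the dictionary keys use the dataset's short labels, and the function and variable names are hypothetical.

```python
# (precision %, recall %, reported F1 %) per class, copied from the results table.
REPORTED = {
    "Healthy": (96.66, 92.60, 94.59),
    "OC": (99.82, 99.38, 99.60),
    "PVG": (97.31, 97.31, 97.31),
    "PS": (97.85, 96.89, 97.36),
    "BBF": (97.80, 95.95, 96.86),
    "SF": (100.00, 99.75, 99.88),
    "HSF": (98.78, 99.43, 99.10),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    denominator = precision + recall
    return 2.0 * precision * recall / denominator if denominator else 0.0


# Recompute F1 per class and record the gap to the reported value.
recomputed_f1 = {label: f1_score(p, r) for label, (p, r, _) in REPORTED.items()}
f1_gap = {label: abs(recomputed_f1[label] - REPORTED[label][2]) for label in REPORTED}

# Macro average gives every class equal weight regardless of its support.
macro_precision = sum(p for p, _, _ in REPORTED.values()) / len(REPORTED)
```

The recomputed values agree with the table to within rounding of the published precision and recall figures. The reported macro-average F1 also coincides with applying the same formula to the macro-averaged precision and recall, which is one of several conventions in use for macro F1.
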
Reference | Model Type | Achieved Accuracy | Pros | Cons | Key Achievements | Computational Efficiency |
---|---|---|---|---|---|---|
[57] | SVM | 92.5% | Simple, fast, and effective for small datasets | Limited scalability, sensitive to feature selection | Effective for linearly separable fault classes | Very fast training/inference on small datasets but scales poorly with large samples. |
[58] | Random Forest | 94.3% | Handles high-dimensional data, robust to noise | Prone to overfitting with small datasets | Good generalization for non-linear patterns | Moderate training speed, inference is efficient but memory-heavy for many trees. |
[59] | CNN | 96.8% | Automatic feature extraction from raw signals | Requires large data, computationally heavy | Accurate in identifying complex fault patterns | High GPU demand, slow training; inference moderate |
[60] | LSTM | 97.2% | Captures time dependencies, ideal for time series | Slower training, overfitting risk with long sequences | Strong temporal learning for vibration-based signals | Training slower due to sequential processing; inference is moderate |
[61] | DBN | 96.0% | Layer-wise feature learning, unsupervised pretraining | Complex structure, slower convergence | Good hierarchical abstraction for feature representations | Training time is high, inference moderate |
[62] | CNN + SVM | 97.6% | Combines deep features with a simple classifier | SVM still needs careful tuning | The hybrid improved both training time and accuracy | Training is costly (CNN), but inference is faster after SVM integration |
[63] | LSTM + RF | 98.0% | Combines temporal features with ensemble prediction | Increased model complexity | Strong for multivariate sequence input | Training moderately slow; inference slower than single models |
[64] | CNN-LSTM Hybrid | 97.9% | Learns both spatial and temporal dependencies | Computationally demanding, tuning complexity | Effective in extracting spatiotemporal features | Very high GPU demand for training, inference is slower than pure CNN/LSTM |
[65] | InceptionTime | 98.0% | Strong multivariate time-series classifier, captures multi-scale patterns | Requires large training data, heavy model size | State-of-the-art accuracy on industrial time-series datasets | Training is heavy but parallelizable; inference is fast once trained |
[66] | MiniROCKET + Ridge | 97.8% | Very fast training, lightweight, high accuracy | Limited interpretability, feature-based, not fully deep | Efficient TS classification with competitive accuracy | Extremely fast training and inference; CPU-friendly |
[67] | Transformer (TST/TimesNet) | 98.1% | Captures long-range dependencies, flexible for sequence modeling | Requires more data, computationally expensive | Cutting-edge performance in time-series tasks | Training is very expensive; inference is slower than CNN/LSTM |
[68] | Temporal Conv. Network (TCN) | 97.6% | Parallelizable, handles long sequences with dilated convolutions | Requires careful kernel/dilation tuning | Good balance of accuracy and efficiency | Faster training than LSTM, inference is efficient |
[69] | CatBoost | 97.5% | Handles categorical features, with less tuning than LightGBM | Slightly slower training than LightGBM | Robust gradient boosting with stable accuracy | Training moderate; inference efficient, CPU-friendly |
Proposed | DBN + LightGBM | 98.21% | Efficient training, high precision, and interpretable via feature importance | Slightly more preprocessing and model-integration effort | Highest accuracy among the compared models | Balanced training cost; inference is very fast due to LightGBM |

Reference | Model Type | Training Time (Relative) | Inference Time per Sample | Memory/Resource Demand | Deployment Feasibility |
---|---|---|---|---|---|
[59] | CNN | High (~1.5 h) | Moderate (~25 ms) | High GPU required | Limited (GPU needed) |
[60] | LSTM | High (~1.2 h) | Moderate (~30 ms) | Moderate-High | Feasible with GPU/High CPU |
[63] | LSTM + RF | Very High (~2.1 h) | Slow (~35 ms) | High | Limited, not edge-suitable |
[64] | CNN-LSTM Hybrid | Very High (~1.5 h) | Slow (~40 ms) | Very High GPU demand | Limited (GPU only) |
[65] | InceptionTime | High (~1.8 h) | Fast (~15 ms) | High, parallelizable | Feasible with GPU/Server |
[66] | MiniROCKET + Ridge | Very Low (~10 min) | Very Fast (~5 ms) | Low, CPU-friendly | Excellent for Edge deployment |
[67] | Transformer (TST/TimesNet) | Very High (~4 h) | Slow (~30 ms) | Very High (GPU clusters) | Limited (Data center preferred) |
[68] | Temporal Conv. Network | Moderate (~1.5 h) | Fast (~12 ms) | Moderate | Good balance; feasible with CPU/GPU |
[69] | CatBoost | Moderate (~45 min) | Fast (~8 ms) | Low-Moderate | Edge/Server feasible |
Proposed | DBN + LightGBM | Moderate (~35 min) | Very Fast (~12 ms) | Moderate, CPU/GPU | Highly feasible for SCADA edge deployment |
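The per-sample inference times listed above depend on how latency is measured. The source does not describe the measurement procedure, so the following is only a minimal sketch of one common approach: time repeated passes over a batch of samples after a few warm-up calls and report the median. The `dummy_predict` function is a hypothetical stand-in for a trained DBN + LightGBM pipeline, and all names and parameter values are assumptions.

```python
import statistics
import time


def per_sample_latency_ms(predict, samples, warmup=10, repeats=5):
    """Return the median per-sample latency of `predict` in milliseconds."""
    if not samples:
        raise ValueError("samples must not be empty")
    # Warm-up calls are not timed; they absorb one-off start-up costs.
    for sample in samples[:warmup]:
        predict(sample)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for sample in samples:
            predict(sample)
        elapsed = time.perf_counter() - start
        # Convert the total elapsed seconds to milliseconds per sample.
        timings.append(1000.0 * elapsed / len(samples))
    return statistics.median(timings)


def dummy_predict(features):
    """Hypothetical placeholder model: returns the index of the largest feature."""
    return max(range(len(features)), key=features.__getitem__)


# Synthetic feature vectors standing in for SCADA measurements (assumption).
samples = [[0.1 * i, 0.5, 0.2, 0.9] for i in range(200)]
latency = per_sample_latency_ms(dummy_predict, samples)
print(f"median latency: {latency:.4f} ms per sample")
```

Reporting the median over several passes makes the figure less sensitive to occasional scheduler or garbage-collection pauses than a single timed run.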
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kull, K.; Khan, M.A.; Asad, B.; Naseer, M.U.; Kallaste, A.; Vaimann, T. Adaptive Deep Belief Networks and LightGBM-Based Hybrid Fault Diagnostics for SCADA-Managed PV Systems: A Real-World Case Study. Electronics 2025, 14, 3649. https://doi.org/10.3390/electronics14183649