1. Introduction
The global shift towards renewable energy sources, driven by the urgent need to mitigate environmental impacts and ensure sustainable energy generation, has significantly increased the deployment of solar photovoltaic (PV) systems. At the core of these PV systems are solar inverters, critical devices responsible for converting direct current (DC) produced by solar panels into alternating current (AC) compatible with the electric grid [
1]. Due to their crucial role, the operational reliability and performance efficiency of solar inverters directly influence the overall system effectiveness, financial return, and grid stability [
2]. However, these inverters are susceptible to various faults, including open-circuit, short-circuit, insulation degradation, overheating, and grid synchronization issues, potentially reducing system efficiency, compromising safety, and resulting in substantial financial losses [
3,
4,
5].
Detecting inverter faults rapidly and accurately is vital to seamless operation, system reliability, and long-term profitability. Multiple factors, including weather, operating conditions, and manufacturing defects, can significantly affect inverter reliability [
6].
With advancements in hardware storage technologies and the declining cost of high-speed data storage and IoT devices [
7], both the volume and velocity of data have increased substantially. This increase in data volume and velocity has created new opportunities for machine learning (ML) approaches to explore, as the increased speed of data writing and reading enables faster processing, real-time analytics, and the development of more sophisticated models capable of handling large-scale, dynamic datasets. ML algorithms have found widespread applications in various domains, including cybersecurity, anomaly detection, and load forecasting [
3,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18]. The availability of such large-scale data has enabled the development of predictive models, contributing to significant progress in these emerging areas.
Recent studies highlight the efficacy of ML-based fault detection in PV systems [
19,
20]. For example, recognizing patterns in power system data enables supervised learning models for cyberattack detection [
21,
22]. Additionally, advanced anomaly detection methods, such as stacked autoencoders and PCA [
21,
23,
24], effectively identify deviations from normal operating conditions, enabling early detection of False Data Injection Attacks (FDIAs) in smart grids.
Figure 1 presents a hierarchical tree structure for fault detection in solar inverters. It categorizes faults into electrical, mechanical, and environmental types. Fault detection methods are divided into traditional methods (rule-based monitoring, statistical analysis) and intelligent methods (machine learning and deep learning-based approaches). Data acquisition involves sensor-based collection, SCADA & IoT monitoring, and synthetic data generation. Performance evaluation is classified into classification metrics and regression metrics. The intelligent methods include supervised and unsupervised learning, CNNs for image detection, LSTMs for time series [
25], multi-layer perceptrons, and autoencoders with generative adversarial networks (GANs) for anomaly detection [
26,
27].
1.1. Related Work
Extensive research conducted over the past decade (2015–2025) has consistently demonstrated that ML methods effectively detect and diagnose inverter faults in grid-connected solar systems, frequently achieving accuracies exceeding 95%. For instance, recent studies highlight methods such as the improved SE-ResNet18 (Squeeze-and-Excitation Residual Network) [
28], which used techniques such as Conditional Variational Autoencoders (CVAE) and signal denoising via Wavelet Packet Decomposition (WPD). In that study, SE-ResNet18 reached an accuracy of 98.18% on the original dataset, which rose to 100% after the dataset was augmented.
Deep learning models are highly accurate, but deploying them in practice is challenging. In particular, real-time detection is difficult because these models demand substantial computing power, including fast processing hardware and long training times [
29]. In addition, deep learning models are generally regarded as black boxes that are difficult to interpret [
30,
31]. Their deep, multi-layered structures make it hard to determine why a particular fault prediction was made. This limitation is especially problematic in critical systems, where clear communication, regulatory compliance, and explainability are essential. In contrast, alternative ML methods such as random forest (RF) classifiers offer simpler, more interpretable, and computationally efficient solutions [
31]. These ensemble-based models employ decision trees (DTs) that inherently facilitate transparency through explicit rule-based structures, providing insights into feature importance and decision-making criteria. A recent application of RF classifiers in inverter fault detection [
32] showed an accuracy of up to 99%, validating the reliability and usability of tree-based algorithms in scenarios where model interpretability and resource efficiency are prioritized.
A 2023 study [
33] developed a hybrid AI-based strategy to improve fault detection in PV arrays and inverters. It identified inverter failures by predicting AC power with regression models, namely Elman neural networks (ENN), boosted tree algorithms (BTA), multi-layer perceptrons (MLP), and Gaussian process regression (GPR). The authors obtained high accuracy using real-world datasets containing operational characteristics such as daily energy production, ambient and module temperatures, solar radiation, and DC and AC power measurements. The optimized GPR model (GPR-M4) achieved low errors, with a mean absolute percentage error (MAPE) of 3.9% and a mean absolute error (MAE) of 0.002 for inverter faults and a similar result for PV array faults (MAPE of 0.091 and negligible MAE). Another study in 2024 [
34] investigated ML-based methods for monitoring and classifying faults in solar photovoltaic (PV) inverters. Using real operational data from two solar plants (140 kWp and 590 kWp), they applied supervised learning algorithms, including fine, medium, and coarse DT models. The fine tree algorithm achieved the highest accuracy (up to 98.4%) in classifying common faults like grid voltage abnormalities and output overloads. A semi-supervised VAE-based method [
35] detects PV system faults by identifying latent-space deviations using one-class SVM (1SVM), isolation forest (iForest), elliptic envelope (EE), and local outlier factor (LOF) algorithms.
Recent studies [
36,
37] focus on fault detection in grid-connected PV systems operating under MPPT and IPPT modes. These studies used high-frequency data covering seven realistic fault scenarios. The approach in [
36] combines PCA, kernel density estimation (KDE), and Kullback-Leibler divergence (KLD) with an adaptive mechanism to account for environmental variations. Although it achieved a false alarm rate below one percent and a fast processing time, the study did not report standard classification metrics. In contrast, ref. [
37] used the GPVS-Faults dataset and evaluated supervised ML models including RF, LR, and NB. RF achieved an F1-score of 0.96, while LR slightly outperformed it in accuracy but required more training time. The PV array voltage was found to be the most important predictor.
An advanced fault diagnosis method for photovoltaic (PV) systems using cascaded multilevel H-bridge inverters was proposed in [
19]. A two-stage classification framework using PCA and Support Vector Machine was developed to distinguish two groups of similar open-circuit faults in power switching devices (IGBTs). They tested 37 fault scenarios—one normal, eight single IGBT faults, and twenty-eight double IGBT faults. Each fault type had 200 samples with 10,000 inverter output voltage sampling points under different solar irradiance and temperature conditions. After PCA reduced dimensionality, SVM classified fault types. A second PCA-SVM classification distinguished similar faults that were initially difficult to classify. The proposed PCA-SVM secondary classification strategy outperformed traditional methods like PCA-SVM (94.59%) and PCA-ELM (89.0%) with a diagnostic accuracy of 99.95%. By eliminating ambiguities between similar fault groups, the method improved classification accuracy.
These findings show that ML improves inverter fault detection. However, given the large volume of sensor data, dimensionality reduction must be balanced against model accuracy and complexity. A balanced dataset is also needed to find meaningful patterns in the faulty classes. SMOTE, a simple oversampling technique, produces synthetic samples with limited diversity, making GAN-based approaches a viable alternative for imbalanced datasets. PCA is commonly used to reduce dimensionality during feature engineering, but stacked autoencoders should also be considered. Real-time deployment requires critical analysis of model training and inference times.
The proposed study integrates autoencoder-based nonlinear feature extraction with conventional ML classifiers, unlike previous studies that focused on deep learning classifiers or anomaly detection frameworks using autoencoders or VAEs. We also provide a benchmarking framework that compares the accuracy, AUC, training/inference time, and confusion matrix metrics of multiple classifiers after dimensionality reduction (via PCA and autoencoders), making it ideal for real-time and embedded fault detection systems. This structured analysis addresses the accuracy-computational feasibility trade-offs in resource-constrained environments, which are often overlooked in the literature.
1.2. Objective
The objective of this study is to develop an efficient inverter fault detection framework for grid-connected photovoltaic systems using dimensionality reduction and ML classifiers. We employ PCA and autoencoders to reduce the high-dimensional sensor data while preserving critical fault-related features. Scree analysis is performed to determine the optimal number of principal components (PCs). Multiple classifiers, including DT, RF, k-nearest neighbors (KNN), and logistic regression (LR), are trained on both the full feature set and the reduced feature sets (8 PCs from PCA and 8 features from autoencoders). The study also evaluates model performance across different dimensionality settings.
The key contributions of this paper are:
Comprehensive Classifier Benchmarking: Conducted a comparative analysis of multiple ML classifiers (RF, DT, LR, KNN) for inverter fault detection, highlighting trade-offs in accuracy, inference time, and deployment feasibility.
Evaluation of Dimensionality Reduction Techniques: Assessed the impact of PCA and autoencoders on classifier performance, including feature space reduction, optimal component selection, and preservation of critical fault signatures.
Real-Time Suitability Analysis: Investigated model training and inference times across different feature sets to identify configurations suitable for real-time deployment in resource-constrained PV systems.
2. System Design, Data Acquisition, and Methodology
This section provides an overview of the experimental setup and data used in this study. It begins with a discussion of inverter topologies commonly found in grid-connected PV systems, along with their typical fault mechanisms. This context establishes the relevance of the faults investigated. Following this, we describe the laboratory-based grid-connected PV system (GCPVS) used to generate fault data, the instrumentation involved, and the structure of the dataset. This information forms the basis for the ML-based fault classification framework proposed in subsequent sections.
2.1. Inverter Topologies and Fault Mechanisms in Grid-Connected PV Systems
Grid-connected photovoltaic (PV) systems commonly use three types of inverter topologies: string inverters, central inverters, and microinverters [
38]. These topologies differ in architecture, scale of deployment, and associated failure modes as shown in
Figure 2.
2.2. GCPVS for Inverter/IGBT Fault Analysis
String inverters are widely used in residential and small commercial applications. They connect multiple PV modules in series (forming a string), with each string feeding a single inverter. String inverters are prone to faults such as maximum power point tracking (MPPT) failures, DC-link capacitor degradation, IGBT open/short circuit faults, and ground leakage issues [
39,
40].
Central inverters, used in large utility-scale installations, aggregate power from multiple strings or arrays. Their larger size and centralized architecture make them susceptible to cooling system failures, control board malfunctions, synchronization issues with the grid, and bulk capacitor degradation. IGBT-related faults and DC overvoltage or undervoltage issues are also frequent [
39,
41].
Microinverters operate at the individual panel level, providing module-level conversion. While they offer enhanced fault isolation, common issues include communication loss, islanding faults, overheating due to enclosure design, and intermittent MPPT failures [
42].
In this study, the IGBT open-circuit, IGBT short-circuit, open-switch, and normal operating conditions modeled in the benchmark dataset are representative of faults typically seen in both string and central inverters. These faults directly impact parameters such as the input DC voltage, the input PV current, and the three-phase output currents, which are used as features in our ML framework. Thus, the classification tasks and fault detection methods explored in this paper are well-aligned with real-world inverter configurations used in grid-connected PV systems.
The dataset utilized in this study is derived from experimental fault scenarios in grid-connected photovoltaic (PV) systems operating under both Maximum Power Point Tracking (MPPT) and Intermediate Power Point Tracking (IPPT) modes. The dataset, known as GPVS-Faults, was obtained from laboratory experiments that systematically introduced faults into a PV system, including inverter faults and Insulated Gate Bipolar Transistor (IGBT) failures, among other fault types [
43]. This study validates a fault detection method using a lab-implemented GCPVS, evaluating its performance under experimental conditions in two operational modes: MPPT and IPPT. The objective is to assess the method’s effectiveness in accurately identifying Inverter/IGBT faults under these modes.
The GCPV system employed a Programmable DC Power Supply Chroma 62150H-1000S (1000 V/15 A/15 kW) with Solar Array Simulator software to simulate PV array outputs under varying irradiance (Gi) and PV cell temperature (Tc) conditions [
39]. The Chroma 62150H-1000S enabled the emulation of crystalline, multi-crystalline, and thin-film PV arrays with distinct fill factors. A Programmable AC Source Chroma 61511 (0–300 V, 15 Hz–1.5 kHz/12 kVA) replicated AC grid conditions and captured critical PV inverter dynamics [
36]. The AC load ensured system protection during the intentional introduction of Inverter/IGBT faults into the GCPVS, maintaining experimental safety and reliability.
Data acquisition and control algorithms were implemented using dSPACE 1104 hardware with MATLAB/Simulink’s RTI [
39]. Voltage Oriented Control (VOC) and Space Vector Pulse Width Modulation (SVPWM) regulated active/reactive power, while a Phase-Locked Loop (PLL) synchronized the inverter output with the grid. A Particle Swarm Optimization (PSO)-based controller switched between operating modes depending on the available power.
2.3. Grid-Connected PV System Fault Description
This study specifically addresses inverter-related faults, particularly Insulated-Gate Bipolar Transistor (IGBT) failures within the implemented GCPVS. Real inverter fault data were generated by deliberately inducing a failure in one of the six IGBTs and were collected to rigorously test and validate fault detection algorithms. Fault scenarios were manually introduced during multiple independent trials lasting 10–15 s, with faults injected around the 7th or 8th second of each trial. The data were sampled every 100 microseconds (µs), ensuring high-resolution measurements for both faulty and fault-free conditions. These experimentally generated faults provide reliable, realistic data suitable for training and validating fault detection algorithms aimed at inverter protection and predictive maintenance tasks. Additional scenario descriptions are provided in [
43].
Table 1 presents a complete list of the features in the dataset.
Figure 3 presents correlations between the input features and the target variable for inverter fault detection. The features most strongly correlated with the target are Vpv (positive correlation of 0.72) and Ipv (negative correlation of −0.51), indicating that these variables are closely associated with the fault label. Iabc also exhibits a moderate negative correlation (−0.49).
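The feature-to-target correlations can be illustrated with a short sketch. The source does not state which correlation estimator was used, so the Pearson coefficient is an assumption here, and the sample values are hypothetical toy data rather than records from GPVS-Faults.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Unnormalized covariance and variances; the 1/n factors cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: a voltage-like feature and a 0/1 fault label.
vpv = [100.0, 101.0, 99.0, 140.0, 142.0, 139.0]
fault = [0, 0, 0, 1, 1, 1]
r = pearson(vpv, fault)
```

A coefficient near +1 or −1 means the feature separates the two classes almost linearly; values near 0 mean it carries little linear information about the label on its own.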
2.4. Methodology
This study presents a comprehensive methodology for evaluating and comparing the effectiveness of various ML algorithms in the classification of inverter faults. The dataset contains 272,727 records with 15 features, including the target. Of these records, 143,715 (52.7%) are healthy instances and 129,012 (47.3%) are fault instances. The methodology is summarized in
Figure 4 and Algorithm 1, which cover data preprocessing, dimensionality reduction, and model evaluation.
Algorithm 1 Inverter fault diagnosis.
1: Step 1: Dataset Loading
2:   Load dataset D containing inverter operational data from data repository.
3:   Log dataset dimensions and structure.
4: Step 2: Exploratory Data Analysis (EDA)
5:   Analyze data summary, descriptive statistics, and check for missing values.
6:   Visualize data distribution and correlation between features.
7: Step 3: Data Preprocessing
8:   Standardize input features using StandardScaler.
9: Step 4: Feature Importance Analysis
10:   Train a RF classifier on standardized data.
11:   Compute and plot feature importance scores.
12: Step 5: Dimensionality Reduction using PCA
13:   Apply PCA to standardized data.
14:   Reduce dimensionality to 8 principal components.
15: Step 5: Autoencoder Training
16:   Define Autoencoder neural network architecture:
        Input layer matching input feature dimensions.
        Hidden layers with ReLU activation functions and Batch Normalization.
        Bottleneck layer of 8 dimensions.
17:   Train Autoencoder for 30 epochs with batch size 32.
18:   Extract encoded features from the trained Autoencoder for classification.
19: Step 6: Feature Importance Analysis
20:   Train RF classifier on standardized original features.
21:   Compute and visualize feature importance scores.
22: Step 7: Model Training and Evaluation
23:   Split dataset into training and test sets (80:20 ratio).
24:   Split training data in training and validation set (80:20 ratio)
25:   for each classifier in {LR, DT, RF, KNN} do
26:     Train classifier on: Original standardized features, PCA-transformed features, Autoencoder-encoded features.
27:     Evaluate models using accuracy, confusion matrices, classification reports, ROC curves, and ROC-AUC scores.
28:   end for
29: Step 8: Results and Comparison
30:   Summarize and compare classifier performance metrics in tabular form.
31:   Visualize ROC curves for model comparison.
32: Step 9: Model Selection and Saving
33:   Identify best-performing model based on evaluation metrics.
34:   Save the selected model for future deployment.
The initial phase involved loading of the dataset for preprocessing and initial data analysis. The dataset comprises a series of features and a binary target variable indicating the presence or absence of inverter faults, as shown in
Table 1. Following data loading, the independent features and the target variable were separated, facilitating independent analysis and preprocessing. The independent numerical variables were then standardized using the StandardScaler from scikit-learn v1.6.1. Scaling is a crucial step: by rescaling each variable to zero mean and unit variance, it ensures that each feature contributes on a comparable scale to model training.
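The standardization step amounts to a per-feature z-score. The sketch below is a hand-rolled, standard-library equivalent of what StandardScaler computes for a single column (mean removal and division by the population standard deviation); the function name and the sample voltages are hypothetical, and the study itself used the scikit-learn implementation.

```python
import statistics


def standardize(column):
    """Z-score one feature column: subtract the mean, divide by the population std."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population std (ddof = 0)
    if std == 0:
        std = 1.0  # constant column: leave it centered instead of dividing by zero
    return [(value - mean) / std for value in column]


# Hypothetical DC-voltage readings, for illustration only.
vdc = [700.0, 710.0, 690.0, 705.0, 695.0]
z = standardize(vdc)
```

After this transform every column has zero mean and unit variance, which matters for the distance-based KNN classifier and for PCA, both of which are sensitive to feature scale.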
In the next step, the standardized data was split into training, validation, and testing subsets using a two-stage train-test split. In the first stage, the dataset was split into an 80% training set and a 20% testing set. The training subset was then further split into training and validation subsets, comprising 80% and 20% of the initial training data, respectively. This multi-step splitting allowed for effective model training, hyperparameter tuning, and unbiased evaluation on unseen test data, and helped guard against overfitting.
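The two-stage split can be sketched with index arithmetic alone. The helper below is a hypothetical stand-in for a library split routine; rounding the held-out share up, the seeds, and the absence of stratification are all assumptions, since the paper does not report them.

```python
import math
import random


def split_indices(n_samples, holdout_fraction, seed=0):
    """Shuffle 0..n-1 and hold out ceil(holdout_fraction * n) indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_holdout = math.ceil(holdout_fraction * n_samples)
    return indices[n_holdout:], indices[:n_holdout]


n_records = 272_727  # dataset size reported in this section

# Stage 1: 80% train+validation, 20% test.
train_val_idx, test_idx = split_indices(n_records, 0.20, seed=0)

# Stage 2: split the 80% again into 80% train, 20% validation.
train_pos, val_pos = split_indices(len(train_val_idx), 0.20, seed=1)
train_idx = [train_val_idx[i] for i in train_pos]
val_idx = [train_val_idx[i] for i in val_pos]
```

Under these assumptions the test subset holds 54,546 records, which agrees with the sum of the confusion-matrix entries reported for RF with all features (25,860 + 28,614 + 21 + 51), and the final training subset holds roughly 64% of the data.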
During the training phase, four commonly applied ML classifiers were employed: RF, LR, DT, and KNN. These models were selected for their different types of learning approaches—ensemble-based, linear, tree-based, and instance-based, respectively. This provides a wide range of comparisons. Initial training was performed using the entire standardized feature set.
Following the first phase of the training, a feature importance analysis was performed to analyze the contribution of each feature to correctly classifying the target label, as shown in
Figure 5. We also used PCA to reduce dimensionality and improve performance and computational efficiency. The PCA transformation reduced the feature space to eight principal components while retaining as much of the data variance as possible. This procedure resulted in new derived feature sets, which were subsequently used to retrain the ML models. The PCA-derived features followed the same train-test split structure for consistency in comparison.
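Choosing the number of principal components from a scree plot comes down to inspecting how much variance each eigenvalue explains. The sketch below picks the smallest number of components whose cumulative explained variance reaches a threshold. The eigenvalues are made-up illustrative numbers and the 95% threshold is an assumption; the paper reports the choice of eight components but not the criterion applied to the scree plot.

```python
def explained_variance_ratio(eigenvalues):
    """Share of total variance carried by each component, largest first."""
    total = sum(eigenvalues)
    return [ev / total for ev in sorted(eigenvalues, reverse=True)]


def components_for_threshold(eigenvalues, threshold=0.95):
    """Smallest k whose cumulative explained variance reaches the threshold."""
    cumulative = 0.0
    for k, ratio in enumerate(explained_variance_ratio(eigenvalues), start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k
    return len(eigenvalues)


# Hypothetical covariance eigenvalues for 14 standardized features.
eigenvalues = [4.2, 2.8, 1.9, 1.4, 1.1, 0.8, 0.6, 0.6,
               0.25, 0.15, 0.1, 0.05, 0.03, 0.02]
ratios = explained_variance_ratio(eigenvalues)
k = components_for_threshold(eigenvalues, threshold=0.95)  # 8 for these made-up values
```

In practice the ratios would come from the fitted PCA model rather than a hand-written list, and the elbow of the scree plot may be judged visually instead of by a fixed threshold.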
Moreover, an autoencoder-based feature extraction method was introduced to explore a deep learning-based dimensionality reduction technique. The autoencoder network architecture comprised an input layer corresponding to the standardized feature set’s dimensionality, an intermediate dense layer with 32 neurons utilizing a ReLU activation function, followed by a Batch Normalization layer to stabilize learning. This was succeeded by an encoding layer compressing the features down to eight dimensions. The decoding process mirrored the encoding pathway but aimed at reconstructing the original input. The autoencoder was trained for 30 epochs with a batch size of 32 and a mean-squared error loss function. Upon training completion, the encoded features were extracted and employed as inputs for retraining the selected ML classifiers.
For each feature representation (standardized features, PCA-transformed features derived using scree plot analysis as shown in
Figure 6, and autoencoder-extracted features), four ML models (RF, LR, DT, and KNN) were systematically trained and evaluated. The RF classifier, with 100 estimators, leveraged ensemble learning to mitigate overfitting and capture complex feature interactions. LR, configured for binary classification with a maximum iteration limit set to ensure convergence, provided insights into linear relationships within the data. DT classification offered intuitive model interpretability, and KNN, with k = 5, utilized proximity-based classification to identify patterns in the transformed feature spaces. The packages used in this work include NumPy, pandas, scikit-learn, TensorFlow, tabulate, and Matplotlib.
Key performance indicators included accuracy, confusion matrices, classification reports detailing precision, recall, and F1-scores, and the computation of the Receiver Operating Characteristic (ROC) curve along with the Area Under the Curve (AUC) score. The ROC analysis was particularly insightful, providing a comprehensive view of the trade-off between true positive rates and false positive rates across varying classification thresholds.
Furthermore, the autoencoder employed for feature extraction was validated by analyzing training convergence across epochs to ensure the stability of the encoding process. Training spanned 30 epochs with a batch size of 32, with the optimizer iteratively adjusting the model parameters to minimize reconstruction error, thus supporting the robustness of the extracted features.
Figure 7 shows the variation of the loss with respect to epochs during autoencoder training.
Detailed visualizations significantly enhanced the interpretability of the results. Confusion matrices were generated for each model, clearly illustrating the distinction between true positives, false positives, true negatives, and false negatives. These visual tools were instrumental in diagnosing the strengths and limitations of each model with respect to classification accuracy and types of errors made.
To complement this analysis, the ROC curves for each model provided graphical representations of their discriminative capabilities, plotting sensitivity (true positive rate) against 1 − specificity (false positive rate). These curves, supplemented by the numerical AUC scores, facilitated straightforward comparisons among models, underscoring their relative performance and predictive capability in binary inverter fault classification.
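How an ROC curve and its AUC are obtained from classifier scores can be shown in a few lines. This is a minimal standard-library sketch, assuming binary 0/1 labels with both classes present and a trapezoidal area estimate; the function names and the four-sample input are illustrative, and the study presumably relied on library routines instead.

```python
def roc_curve_points(labels, scores):
    """Sweep the decision threshold from high to low and collect (FPR, TPR) points."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
        for i in range(1, len(fpr))
    )


# Illustrative labels (1 = fault) and predicted fault probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_curve_points(labels, scores)
area = auc(fpr, tpr)  # 0.75 for this toy input
```

An AUC of 1.0 means every faulty sample is scored above every healthy one, while 0.5 corresponds to random ranking.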
3. Results and Discussion
The comparative study of dimensionality reduction methods for inverter fault detection in grid-connected solar photovoltaic (PV) systems yielded a comprehensive evaluation of various ML models, both with and without dimensionality reduction techniques. The experimental results are summarized in
Table 2 and
Table 3, which present the performance metrics of the models in terms of training time, prediction time, area under the curve (AUC) score, accuracy across training, validation, and test sets, as well as detailed classification metrics including TP, TN, FP, FN, accuracy, precision, recall, and F1 score. These metrics collectively provide insight into the effectiveness and efficiency of the four classifiers (RF, LR, DT, and KNN) when applied to the original feature set (All Features), PCA-reduced features, and autoencoder (AE)-reduced features. The ROC curves of the ML models for these three types of input features are shown in
Figure 6. The RF model with all features demonstrated the best classification ability, with an AUC of almost 1.0. Other high-performing models include RF with PCA (AUC of 0.9994), RF with autoencoder features (0.9993), KNN with all features (0.9988), and KNN with PCA and AE features (AUCs > 0.997).
3.1. Model Performance with All Features
The baseline performance of the models utilizing the full feature set demonstrated high predictive accuracy and robustness across all classifiers. As shown in
Table 2, the RF with all features model achieved the highest test accuracy of 0.99, with a training time of 36.47 s and a prediction time of 0.39 s. The AUC score of 0.99 further corroborates its excellent discriminative ability. Detailed results in
Table 3 indicate an accuracy of 0.9987, with a precision of 0.9992, recall of 0.9980, and F1 score of 0.9986. The model correctly identified 25,860 TP and 28,614 TN instances, with only 21 FP and 51 FN, as shown in the confusion matrices of all models in
Figure 8, which highlights the superior performance of RF in minimizing misclassifications.
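The reported accuracy, precision, recall, and F1 score follow directly from the four confusion-matrix counts. The sketch below recomputes them for RF with all features from the TP, TN, FP, and FN values quoted above; the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Derive the standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)  # share of predicted faults that are real faults
    recall = tp / (tp + fn)     # share of real faults that were detected
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts reported for RF with all features.
rf_all = classification_metrics(tp=25_860, tn=28_614, fp=21, fn=51)
rounded = {name: round(value, 4) for name, value in rf_all.items()}
# rounded == {"accuracy": 0.9987, "precision": 0.9992, "recall": 0.998, "f1": 0.9986}
```

For fault detection, recall is the metric tied to missed faults (FN), while precision is tied to false alarms (FP), so the two are worth reading separately rather than only through accuracy.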
The LR with all features model, while computationally efficient with a training time of 0.81 s and a prediction time of 0.006 s, exhibited a slightly lower test accuracy of 0.97. Its AUC score remained high at 0.99, but the classification metrics in
Table 2 reveal a drop in performance, with an accuracy of 0.9776, precision of 0.9815, recall of 0.9707, and F1 score of 0.9761. Relative to RF, FP rose from 21 to 474 and FN from 51 to 759, which suggests that LR struggled to capture the full complexity of the data.
The DT with all features model performed comparably to RF, with a test accuracy of 0.99, a training time of 3.82 s, and a prediction time of 0.008 s. Its AUC score of 0.99 and classification metrics (accuracy: 0.9963, precision: 0.9961, recall: 0.9961, F1 score: 0.9961) indicate strong performance, though it recorded slightly higher FP (101) and FN (100) than RF with all features.
The KNN with all features model, despite its high test accuracy of 0.99 and AUC score of 0.99, incurred a significant computational cost during prediction, with a time of 21.815 s. This is likely due to the distance computation required for all features in the original high-dimensional space. Its classification metrics (accuracy: 0.9951, precision: 0.9955, recall: 0.9942, F1 score: 0.9948) were slightly lower than RF and DT, with 116 FP and 151 FN.
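The prediction cost of KNN is easiest to see in a brute-force version, where every query is compared against every stored training sample. The sketch below is an illustrative k = 5 majority-vote classifier on hypothetical two-dimensional points; library implementations may use tree-based indexes instead, so the linear scan is an assumption made for clarity.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Classify one query point by majority vote among its k nearest neighbours."""
    # One Euclidean distance per stored sample: cost grows with both the
    # number of training rows and the number of features.
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical standardized samples: class 0 = healthy, class 1 = fault.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
           (1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

near_healthy = knn_predict(train_x, train_y, (0.15, 0.15))
near_fault = knn_predict(train_x, train_y, (1.05, 1.0))
```

Because the model stores the training set instead of fitting parameters, training is nearly free while each prediction pays for the distance computations, which is consistent with the timing pattern reported for KNN.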
3.2. Model Performance with PCA
Applying PCA as a dimensionality reduction technique resulted in varied impacts on model performance. The RF based PCA model maintained a high test accuracy of 0.99 and an AUC score of 0.99, though its training time increased to 57.20 s, likely due to the additional computational overhead of PCA transformation. Prediction time rose to 0.750 s, reflecting a trade-off between dimensionality reduction and inference speed.
Table 3 shows an accuracy of 0.9900, precision of 0.9905, recall of 0.9885, and F1 score of 0.9895, with 246 FP and 297 FN, indicating a slight decline in classification performance compared to RF with all features.
The LR based PCA model exhibited the most significant reduction in performance, with a test accuracy of 0.92 and an AUC score of 0.98. Its training and prediction times were notably low (0.11 s and 0.002 s, respectively), making it the most computationally efficient model in this category. However, its classification metrics (accuracy: 0.9277, precision: 0.9217, recall: 0.9265, F1 score: 0.9241) reflect a substantial increase in FP (2039) and FN (1904), suggesting that PCA may have discarded critical features necessary for effective fault detection with LR.
The DT based PCA model retained a test accuracy of 0.98 and an AUC score of 0.98, with a training time of 4.68 s and a prediction time of 0.008 s. Its classification metrics (accuracy: 0.9811, precision: 0.9810, recall: 0.9792, F1 score: 0.9801) indicate a modest decline from DT with all features, with 492 FP and 538 FN, suggesting that PCA preserved most of the DT’s discriminative power.
The KNN based PCA model benefited significantly from dimensionality reduction, reducing its prediction time from 21.815 s to 9.129 s while maintaining a test accuracy of 0.99 and an AUC score of 0.99. Its classification metrics (accuracy: 0.9928, precision: 0.9927, recall: 0.9920, F1 score: 0.9924) declined slightly compared to KNN with all features, with 189 FP and 206 FN, demonstrating that PCA reduced computational cost at only a marginal loss of accuracy.
3.3. Model Performance with Autoencoder
The use of an autoencoder (AE) for dimensionality reduction produced results that were generally competitive with PCA, with some notable differences. The RF based AE model achieved a test accuracy of 0.98 and an AUC score of 1.00, with a training time of 49.54 s and a prediction time of 0.379 s. Its classification metrics (accuracy: 0.9893, precision: 0.9901, recall: 0.9873, F1 score: 0.9887) indicate robust performance, though slightly below RF with all features, with 257 FP and 328 FN.
The LR based AE model showed a test accuracy of 0.94 and an AUC score of 0.98, with exceptionally low training and prediction times (0.12 s and 0.000 s, respectively). However, its classification metrics (accuracy: 0.9492, precision: 0.9504, recall: 0.9422, F1 score: 0.9463) reflect a higher error rate than LR with all features, with 1273 FP and 1499 FN. These errors are nonetheless fewer than those of the LR based PCA model (2039 FP and 1904 FN), suggesting that the AE features retained more of the information LR relies on than the PCA features did.
The DT based AE model recorded a test accuracy of 0.97 and an AUC score of 0.97, with a training time of 2.80 s and a prediction time of 0.000 s. Its classification metrics (accuracy: 0.9790, precision: 0.9784, recall: 0.9774, F1 score: 0.9779) indicate a slight decline from DT with all features, with 559 FP and 586 FN, reflecting a minor loss of discriminative capability.
The KNN based AE model achieved a test accuracy of 0.99 and an AUC score of 0.99, with a training time of 0.15 s and a prediction time of 2.200 s. Its classification metrics (accuracy: 0.9923, precision: 0.9921, recall: 0.9917, F1 score: 0.9919) were comparable to KNN based PCA, with 205 FP and 216 FN, demonstrating that the AE effectively reduced dimensionality while preserving KNN’s performance.
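As a quick consistency check on the figures above, the F1 score can be recomputed as the harmonic mean of the reported precision and recall. The sketch below is illustrative only; the helper name `f1_from_precision_recall` is an assumption and is not part of the study's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the AE-based models in this section.
reported = {
    "RF": (0.9901, 0.9873, 0.9887),
    "LR": (0.9504, 0.9422, 0.9463),
    "DT": (0.9784, 0.9774, 0.9779),
    "KNN": (0.9921, 0.9917, 0.9919),
}

for name, (precision, recall, f1_reported) in reported.items():
    f1 = f1_from_precision_recall(precision, recall)
    print(f"{name}: recomputed F1 = {f1:.4f}, reported F1 = {f1_reported:.4f}")
```

Because the inputs are already rounded to four decimals, agreement to within about 0.0001 is the most that can be expected.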
3.4. Comparative Analysis
Across all configurations, RF consistently outperformed other models in terms of accuracy, precision, recall, and F1 score, particularly when using the full feature set. Dimensionality reduction with PCA and AE maintained high accuracy for RF and KNN, though with minor trade-offs in classification performance and increased training times. LR exhibited the greatest sensitivity to dimensionality reduction, with significant drops in accuracy and increases in misclassifications, suggesting that it relies heavily on the original feature space. DT showed moderate resilience to both PCA and AE, with slight reductions in performance metrics.
In terms of efficiency, PCA and AE significantly reduced prediction times for KNN, making it a more practical choice for real-time applications despite its high computational cost in the original feature space. LR remained the fastest model across all configurations, though its lower accuracy limits its suitability for critical fault detection tasks. RF, while computationally intensive during training, offered a balanced trade-off between accuracy and inference speed, particularly with AE.
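To make the KNN efficiency gain concrete, the speed-up can be expressed as the ratio of prediction times. The sketch below uses the 9.129 s (PCA) and 2.200 s (AE) figures reported above together with the 21.82 s full-feature KNN inference time quoted in the Conclusions; the function and variable names are illustrative assumptions.

```python
def speedup(baseline_seconds: float, reduced_seconds: float) -> float:
    """Ratio of baseline to reduced prediction time (values above 1 mean faster)."""
    if reduced_seconds <= 0:
        raise ValueError("reduced_seconds must be positive")
    return baseline_seconds / reduced_seconds


knn_full_s = 21.82   # KNN, all features (as quoted in the Conclusions)
knn_pca_s = 9.129    # KNN with PCA-derived features
knn_ae_s = 2.200     # KNN with AE-derived features

print(f"PCA speed-up: {speedup(knn_full_s, knn_pca_s):.2f}x")
print(f"AE speed-up:  {speedup(knn_full_s, knn_ae_s):.2f}x")
```

On these figures, PCA roughly halves KNN prediction time, while the AE representation reduces it by about an order of magnitude.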
3.5. Comparison with Prior Work Using the GPVS-Faults Dataset
This study investigates the problem of accurately and efficiently detecting inverter faults in grid-connected photovoltaic systems through the application of dimensionality reduction techniques. As summarized in Table 4, recent literature utilizing the GPVS-Faults dataset has explored a range of methodologies. Ref. [35] employed a semi-supervised approach based on variational autoencoders combined with anomaly detection algorithms, achieving competitive AUC values; however, the absence of supervised classifier comparisons and limited model interpretability constrain its broader applicability. Ref. [44] developed a two-tier framework using Extra Trees with explainable artificial intelligence, which demonstrated high classification accuracy, yet did not incorporate dimensionality reduction, potentially affecting computational scalability. Ref. [45] proposed a hybrid model combining Modified Independent Component Analysis with RFs to address class imbalance, yielding high predictive accuracy, though the complexity of the ICA component may limit its suitability for real-time deployment.
In response to these limitations, the present work introduces a supervised learning framework that systematically benchmarks four classifiers (RF, DT, KNN, and LR) under both principal component analysis and autoencoder-derived feature sets. Among the evaluated models, RF achieved the highest accuracy and AUC with low inference latency, demonstrating strong potential for real-time applications. Furthermore, for computationally lightweight classifiers such as LR, autoencoder-derived features preserved accuracy better than PCA-derived features (94.92% versus 92.77%), although both remained below the 97.76% obtained with the original features. These results highlight the proposed method’s capacity to balance predictive accuracy, interpretability, and operational efficiency, contributing a novel comparative perspective to the existing body of work on PV inverter fault detection.
3.6. Strategies to Mitigate Prediction Errors
Despite the high classification accuracy achieved by the proposed models (e.g., RF with 99.87% accuracy), the discrepancies between predicted and actual fault states, reported as FP and FN in Table 3, make it essential to adopt strategies that enhance system reliability.
First, classification thresholds can be adjusted to favor recall over precision in safety-critical environments, as illustrated by the ROC curves in Figure 9. Second, a hybrid ensemble strategy combining RF and DT predictions through majority voting can reduce misclassification by exploiting the models’ complementary strengths.
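The two strategies above can be sketched as follows. This is a minimal illustration on made-up scores, not the study's implementation: the toy labels and probabilities are invented, all function names are assumptions, and because hard majority voting between only two models can tie, the sketch substitutes soft voting (averaging the two models' fault probabilities).

```python
def recall_at(threshold, y_true, scores):
    """Recall when every score >= threshold is labeled as a fault (1)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    if not positives:
        return 0.0
    return sum(s >= threshold for s in positives) / len(positives)


def false_positives_at(threshold, y_true, scores):
    """Number of non-fault samples (0) flagged as faults at this threshold."""
    return sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)


def pick_threshold(y_true, scores, target_recall):
    """Highest observed score usable as a threshold that meets the target recall."""
    for candidate in sorted(set(scores), reverse=True):
        if recall_at(candidate, y_true, scores) >= target_recall:
            return candidate
    return 0.0  # no positives in y_true: flag everything


def soft_vote(scores_a, scores_b):
    """Average two models' fault probabilities (soft voting, not hard majority)."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]


# Toy validation data (invented for illustration only).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
rf_scores = [0.05, 0.20, 0.35, 0.40, 0.55, 0.80, 0.10, 0.95]
dt_scores = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0]

default_recall = recall_at(0.5, y_true, rf_scores)
safe_threshold = pick_threshold(y_true, rf_scores, target_recall=1.0)
print("recall at 0.5:", default_recall)
print("threshold for full recall:", safe_threshold)
print("false positives at that threshold:",
      false_positives_at(safe_threshold, y_true, rf_scores))

ensemble_scores = soft_vote(rf_scores, dt_scores)
ensemble_threshold = pick_threshold(y_true, ensemble_scores, target_recall=1.0)
print("ensemble false positives:",
      false_positives_at(ensemble_threshold, y_true, ensemble_scores))
```

In practice the threshold would be chosen on a held-out validation split rather than on the test set, so that the reported FP and FN counts remain unbiased.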
Furthermore, low-confidence predictions can be flagged for manual verification, while periodic model retraining on updated operational data can mitigate concept drift. Finally, integrating the ML-based detection pipeline with rule-based heuristics (e.g., real-time monitoring of critical parameters such as ) provides an additional layer of validation, thereby minimizing the operational impact of classification errors.
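A layered decision of this kind might look like the sketch below. Everything specific in it is hypothetical: the measurement name `dc_voltage`, its limits, the 0.5 threshold, and the 0.15 confidence margin are placeholders chosen for illustration, not values from this study.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    label: str   # "fault", "normal", or "review"
    reason: str


def triage(fault_probability, measurement, limits, threshold=0.5, margin=0.15):
    """Combine a rule-based check with a confidence-aware classifier decision."""
    # Rule-based layer: any monitored parameter outside its limits is a fault,
    # regardless of what the classifier predicts.
    for name, (low, high) in limits.items():
        value = measurement[name]
        if not (low <= value <= high):
            return Decision("fault", f"{name}={value} outside [{low}, {high}]")

    # Low-confidence layer: probabilities close to the threshold go to a human.
    if abs(fault_probability - threshold) < margin:
        return Decision("review", "low-confidence prediction")

    # Otherwise trust the classifier.
    if fault_probability >= threshold:
        return Decision("fault", "classifier")
    return Decision("normal", "classifier")


# Hypothetical limits for a hypothetical monitored parameter.
limits = {"dc_voltage": (300.0, 500.0)}

print(triage(0.90, {"dc_voltage": 400.0}, limits))
print(triage(0.55, {"dc_voltage": 400.0}, limits))
print(triage(0.10, {"dc_voltage": 600.0}, limits))
```

Letting the rule-based layer override the classifier is one possible design choice; an alternative is to route disagreements between the two layers to manual review as well.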
3.7. Strengths, Limitations, and Future Directions
This study presents a comprehensive evaluation of dimensionality reduction methods, including principal component analysis and autoencoders, combined with multiple classifiers for inverter fault detection using the GPVS-Faults dataset. One key strength of the proposed framework is its suitability for real-time deployment, supported by low inference latency and reliance on standard inverter measurements such as , , and . Additionally, the comparative benchmarking of model accuracy, training and inference time, and classification metrics provides a transparent and reproducible foundation for future studies.
However, certain limitations should be acknowledged. The current analysis focuses on binary classification of fault and non-fault states, which may not fully capture the complexity of real-world inverter behavior. Furthermore, the experimental validation is limited to a controlled dataset, and its generalizability across diverse inverter types, environmental conditions, and grid configurations remains to be explored. Although autoencoders improved model efficiency, their latent representations are less interpretable compared to more transparent methods such as DTs or feature ranking techniques.
Future research may extend this work by incorporating multi-class fault classification, validating models on field data from different PV installations, and exploring explainable ML approaches. In addition, real-time deployment on edge computing platforms such as Raspberry Pi or embedded controllers, along with adaptive retraining mechanisms to handle data drift, could enhance the robustness and applicability of the proposed method.
4. Conclusions
This study comprehensively explored the effectiveness of four ML algorithms (RF, LR, DT, and KNN) in diagnosing inverter faults using binary-labeled (fault versus non-fault) inverter data. Performance was assessed on three distinct feature sets (original standardized features, PCA-derived features, and AE-derived features), with accuracy, training and inference times, and AUC scores as evaluation metrics.
RF consistently outperformed other models across all feature extraction techniques, achieving exceptional accuracy and near-perfect AUC scores. Specifically, RF with the complete original feature set reached an impressive accuracy of 99.87% and an AUC score of approximately 0.99, emphasizing its superior capacity to model complex interactions between features. However, this high performance comes at the cost of increased computational demands during training. Despite longer training durations (ranging up to 57.20 s with PCA), its inference speed remained practical for real-world applications.
LR exhibited the fastest computational performance, with minimal training times (0.11–0.81 s) and near-instantaneous inference. While LR with original features retained high accuracy (approximately 97.76%), its performance decreased notably after dimensionality reduction with PCA (accuracy dropping to 92.77%). AE-derived features recovered part of this loss (94.92% accuracy), suggesting that autoencoder-based nonlinear transformations retain more of the information LR relies on than PCA does.
DTs provided a strong balance between computational efficiency and accuracy. DT models trained on original features maintained excellent accuracy (99.63%) and rapid inference speed, making them especially suitable for real-time prediction environments. Their performance slightly diminished when PCA or AE-derived features were used, highlighting DT’s sensitivity to transformed features.
KNN demonstrated robust accuracy (approximately 99%) but experienced substantial computational overhead during inference due to intensive distance computations. Dimensionality reduction significantly improved its inference time, particularly with autoencoder-extracted features, which dropped inference duration dramatically (from 21.82 s to 2.20 s), indicating that AE-derived features effectively preserve essential data characteristics for proximity-based classifiers.
The analysis underscores the value of dimensionality reduction techniques, particularly autoencoders, for balancing model performance with computational efficiency, which is essential for deploying predictive systems in resource-constrained environments. Ultimately, model selection should align explicitly with operational priorities: RF is optimal when accuracy is paramount; DT and KNN with AE-derived features are ideal for real-time inference; and LR with AE-derived features is recommended for applications demanding extreme computational efficiency with acceptable accuracy. This study focuses on binary fault classification of inverters; future work will extend the methodology to a multi-class fault detection framework that captures different fault types, including weather-induced anomalies and equipment failures, as well as failure severity. Additionally, the proposed method is computationally lightweight and relies on standard inverter measurements such as , , and , making it suitable for real-time deployment. From our analysis, the RF and DT models demonstrated inference times suitable for real-time applications. The trained model can be deployed on a single-board computer that receives live sensor input, performs inference locally, and raises alerts based on the output, without requiring major changes to existing PV system hardware.