1. Introduction
The global energy landscape has shifted markedly toward renewable sources, with solar energy emerging as a crucial contributor to sustainable power generation [1]. Photovoltaic (PV) systems convert sunlight directly into electrical energy, yet their performance and longevity are frequently compromised by faults that develop during manufacturing, installation, or operation. These defects not only reduce energy yield but also create potential safety hazards, highlighting the necessity of effective fault detection and diagnosis (FDD) systems [2,3,4,5].
Traditional approaches to PV fault detection have relied on electrical parameter measurements and visual inspection techniques [6,7]. While these methods provide valuable diagnostic information, they face substantial limitations in scalability and resource requirements, particularly for large-scale solar installations. Infrared (IR) thermography has emerged as a promising alternative, offering non-invasive and rapid assessment capabilities. However, accurately interpreting thermal imagery under varying environmental conditions remains challenging, necessitating advanced analytical frameworks [8,9].
Recent years have witnessed significant advancements in applying deep learning techniques to PV fault detection [10]. Convolutional neural networks (CNNs) have demonstrated remarkable capabilities in processing complex imaging data for fault classification [11,12,13]. Despite these developments, existing solutions continue to face challenges related to detection accuracy, model generalization, and computational efficiency. These challenges are most pronounced for subtle visual fault patterns across diverse operational environments, including the varying lighting conditions, multiple panel configurations, and different viewing angles encountered in real-world drone-based aerial surveys [14,15,16].
To address these limitations, we introduce SolarFaultAttentionNet, a channel-wise and spatial attention-based deep learning architecture specifically designed for PV module fault detection. Our approach combines advanced feature extraction with targeted attention mechanisms to enhance discrimination between various fault types while maintaining efficient processing capabilities. The principal contributions of this research include the following:
We propose SolarFaultAttentionNet, a novel multi-path CNN architecture that incorporates channel-wise and spatial attention mechanisms, achieving 99.14% classification accuracy across six fault categories (electrical damage, physical damage, snow-covered, dusty, bird-drop, and clean), as shown in Figure 1.
We developed a data augmentation pipeline utilizing Albumentations [17] to address environmental variability. Our experimental evaluation shows that this approach contributed to perfect scores (100%) on all metrics for dust accumulation and a 99.12% F1 score for electrical damage detection.
We present a systematic comparative evaluation against recent PV fault detection approaches [18,19] and state-of-the-art models [20], including VGG16/19, MobileNetV2, ResNet50V2, InceptionV3, DenseNet variants, and InceptionResNetV2. SolarFaultAttentionNet outperforms the next best model by 5.14 percentage points in accuracy.
We demonstrate the computational efficiency of our model by achieving an inference time of 0.0160 s—a performance that is on par with a range of state-of-the-art lightweight architectures. This efficiency facilitates practical deployment in resource-constrained environments.
We achieved an optimal detection balance by attaining 98.24% sensitivity and 99.91% specificity. This performance minimizes both false negatives and false positives, thereby ensuring reliable deployment in practical monitoring applications.
The remainder of this paper is organized as follows: Section 2 reviews existing literature on PV fault detection methodologies with emphasis on deep learning approaches. Section 3 details the dataset characteristics, preprocessing techniques, and the proposed architectural framework. Section 4 presents experimental findings and performance comparisons. Section 5 examines the implications of our results within the broader context of solar energy monitoring. Finally, Section 6 summarizes our contributions and identifies promising directions for future research.
2. Related Works
The rapid expansion of PV systems has necessitated the development of robust fault detection techniques to ensure high operational efficiency and safety [6]. Early methods predominantly relied on electrical measurements, such as current–voltage (I–V) analysis, and on electroluminescence imaging to detect anomalies at the module or cell level [6,7]. Although these approaches are effective for small-scale inspections, they are often labor-intensive and lack the sensitivity required to capture subtle defects (e.g., micro-cracks or partial shading) in extensive PV installations [8].
IR thermography emerged as a non-invasive alternative, utilizing surface temperature variations to identify hotspots, cracks, and soiling. Integration with unmanned aerial vehicles (UAVs) has further enhanced its scalability, enabling rapid, wide-area inspections of large PV farms [9,21,22]. However, despite its advantages, IR imaging can struggle with precise fault localization in heterogeneous environments [8,9].
Amaral et al. [23] developed a framework for diagnosing faults in PV tracking systems by integrating machine learning with image processing and principal component analysis. Similarly, Abubakar et al. [24] introduced a novel approach that combines Elman neural networks with boosted tree algorithms and statistical learning techniques to enhance fault detection capabilities.
The advent of deep learning has significantly advanced PV fault detection [25]. CNNs such as VGG-16, EfficientDet, and YOLO variants have been employed to automatically extract discriminative features from large thermal or visible-light datasets, thereby improving both the classification and the localization of faults [11,12,13]. Additionally, hybrid approaches that integrate classical statistical methods (such as principal component analysis and boosted tree algorithms) with deep learning have been explored to address the complexity of fault patterns [23,24]. Despite these advances, many current methods still struggle with imbalanced datasets, environmental noise, and limited sensitivity to subtle defect features.
More recent studies have combined segmentation frameworks with classification networks to enhance spatial fault localization [11,13]. For instance, segmentation-based models leveraging U-Net architectures integrated with advanced backbones (e.g., InceptionV3) and auxiliary modules such as squeeze-and-excitation blocks have demonstrated improved accuracy in delineating fault regions [19]. Yet, these methods often incur high computational costs and do not systematically address the generalization challenges posed by variable environmental conditions [18,26].
Furthermore, data augmentation has been recognized as a crucial strategy for mitigating issues related to limited and imbalanced datasets. While several works have incorporated augmentation techniques (such as rotation, scaling, flipping, and noise injection) to increase training sample diversity and improve model robustness [23,24], many studies lack a systematic evaluation of the impact of these strategies on model generalization.
Building on these advancements and addressing the aforementioned limitations, our work offers several key contributions. First, we propose a novel deep learning model that integrates channel-wise and spatial attention mechanisms within a CNN backbone to enhance feature discrimination, enabling the detection of subtle defects with high class-specific accuracy (99%). Second, we employ an extensive data augmentation pipeline that systematically increases dataset diversity, thereby improving robustness across varied environmental conditions. Third, our architecture is optimized for fast inference—critical for real-time monitoring—without compromising detection precision. This comprehensive approach aims to overcome previous challenges in subtle fault detection, dataset imbalance, and computational efficiency, thereby setting a new benchmark for scalable and reliable PV fault detection.
Table 1 summarizes recent deep learning approaches for PV fault detection, revealing significant methodological diversity with accuracy rates ranging from 79% to 96.0%. The analysis shows that thermal infrared imaging dominates the field, while visible light approaches remain underexplored despite offering practical advantages including lower costs and simpler deployment.
3. Materials and Methods
3.1. Overview
The proposed methodology establishes a comprehensive framework for automated fault detection in PV modules using visible-light-based imagery. As illustrated in Figure 2, our approach implements a sequential pipeline that begins with rigorous data preprocessing and augmentation, continues through feature extraction via a Multi-Path CNN architecture, and culminates in fault classification through an innovative dual-attention mechanism. The SolarFaultAttentionNet architecture is specifically designed to identify subtle anomalies associated with various PV module faults, enhancing detection accuracy while maintaining computational efficiency for practical deployment scenarios. Our approach addresses the challenges of dataset imbalance, environmental variability, and the need for fine-grained feature discrimination that are inherent to PV fault detection applications.
3.2. Data Collection and Preprocessing
Our study utilizes a dataset comprising 885 image samples [29] of PV modules distributed across six distinct fault classes (as shown in Figure 1): Physical-Damage (69 images, 7.80%), Electrical-Damage (102 images, 11.53%), Snow-Covered (119 images, 13.45%), Clean (192 images, 21.69%), Dusty (188 images, 21.24%), and Bird-Drop (215 images, 24.29%). The dataset exhibits significant class imbalance with an imbalance ratio of 3.12:1 between the most represented class (Bird-Drop) and the least represented class (Physical-Damage). Our classification system identifies both permanent defects (Electrical-Damage, Physical-Damage) and temporary performance-reducing conditions (Snow-Covered, Dusty, Bird-Drop) for comprehensive automated monitoring. Electrical-Damage in visible imagery manifests as burn marks, cell discoloration, dark spots, and visible degradation patterns resulting from electrical faults. We recognize the dataset size constraints inherent to specialized industrial imaging applications. While our augmentation strategy addresses immediate practical needs, future work with larger, naturally balanced datasets will provide stronger validation of our approach.
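The class shares and the imbalance ratio quoted above follow directly from the per-class counts. The short sketch below reproduces that arithmetic; the helper names `class_shares` and `imbalance_ratio` are illustrative and not part of the original work.

```python
# Class counts as reported in the text for the 885-image dataset.
CLASS_COUNTS = {
    "Physical-Damage": 69,
    "Electrical-Damage": 102,
    "Snow-Covered": 119,
    "Clean": 192,
    "Dusty": 188,
    "Bird-Drop": 215,
}


def class_shares(counts):
    """Return each class's share of the dataset in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def imbalance_ratio(counts):
    """Ratio of the largest class size to the smallest class size."""
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    for name, share in class_shares(CLASS_COUNTS).items():
        print(f"{name:18s} {CLASS_COUNTS[name]:4d} images  {share:5.2f}%")
    print(f"imbalance ratio: {imbalance_ratio(CLASS_COUNTS):.2f}:1")
```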
To address this imbalance and enhance model generalization capabilities, we implemented an extensive data augmentation pipeline using the Albumentations [17] library. This approach significantly expands upon standard augmentation techniques through the following:
Advanced Color Transformations: CLAHE (Contrast Limited Adaptive Histogram Equalization), brightness/contrast adjustments, and HSV shifts to simulate various lighting and environmental conditions.
Geometric Transformations: affine and perspective transforms, rotations at varying angles, and scale modifications to account for different viewing perspectives.
Occlusion Simulations: cutout and mosaic techniques to mimic partial shading and obstructions commonly encountered in outdoor PV installations.
Noise and Blur Operations: various noise patterns and blur effects to simulate different image quality conditions.
Through this comprehensive augmentation strategy, we balanced each class to approximately 1000 images, creating a more robust training dataset that enhances the model’s ability to detect faults under diverse environmental conditions. All images were normalized to the [0, 1] range using Equation (
1) and resized to a uniform input dimension of
pixels to ensure compatibility with the Multi-Path CNN backbone architecture.
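To give a sense of what balancing to roughly 1000 images per class implies, the sketch below estimates how many augmented variants each original image would need. It is an illustration under stated assumptions: the exact per-class targets are not given in the text, `augmentation_plan` is a hypothetical helper, and `normalize_pixel` assumes 8-bit images scaled by 255, which may differ from the normalization actually used in Equation (1).

```python
import math

TARGET_PER_CLASS = 1000  # "approximately 1000 images" per class, from the text


def normalize_pixel(value, max_value=255):
    """Map an intensity to [0, 1]; max_value=255 assumes 8-bit images."""
    return value / max_value


def augmentation_plan(counts, target=TARGET_PER_CLASS):
    """For each class, return (extra images to synthesize, variants per original)."""
    plan = {}
    for name, n in counts.items():
        extra = max(target - n, 0)
        per_image = math.ceil(extra / n)
        plan[name] = (extra, per_image)
    return plan


counts = {"Physical-Damage": 69, "Electrical-Damage": 102, "Snow-Covered": 119,
          "Clean": 192, "Dusty": 188, "Bird-Drop": 215}
plan = augmentation_plan(counts)

if __name__ == "__main__":
    for name, (extra, per_image) in plan.items():
        print(f"{name:18s} +{extra:4d} images (about {per_image} variants per original)")
```

Under these assumptions the smallest class needs about 14 variants per original image and the largest about 4, so augmented minority-class samples are more strongly correlated with one another than those of the majority classes.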
3.3. Proposed Model Architecture
The SolarFaultAttentionNet architecture, depicted in Figure 3, integrates a Multi-Path CNN backbone with specialized attention mechanisms [30] to enable precise fault classification in PV modules. Our design consists of three primary components that work synergistically to extract and emphasize discriminative features for fault detection.
3.3.1. Multi-Path CNN Block
The foundation of our architecture is an InceptionV3 backbone that extracts hierarchical features from input images. As shown in Figure 3, this component consists of multiple parallel convolutional paths, each capturing distinct aspects of fault signatures at different scales and levels of abstraction. The multi-path design incorporates the following:
Convolutional layers with varying filter sizes ( and ).
Batch normalization for training stability.
Different dilation rates to capture multi-scale information.
Residual connections to facilitate gradient flow during backpropagation.
The feature maps generated by this block, denoted as $F \in \mathbb{R}^{C \times H \times W}$ (where $C$ represents channels, and $H$ and $W$ represent spatial dimensions), serve as input to the subsequent attention modules.
3.3.2. Channel-Wise Attention Block
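The core innovation of SolarFaultAttentionNet is the channel attention module that dynamically recalibrates feature map channels based on their importance to fault detection. This attention mechanism has also been used in other applications [31]. The channel attention mechanism can be formalized as follows:
$$M_c(F) = \sigma\left(\mathrm{MLP}(F^{c}_{avg}) + \mathrm{MLP}(F^{c}_{max})\right)$$
where $\sigma$ represents the sigmoid activation function, and $\mathrm{MLP}$ refers to a multi-layer perceptron with a bottleneck architecture:
$$\mathrm{MLP}(x) = W_1\,\delta(W_0\,x)$$
In this equation, $W_0 \in \mathbb{R}^{C/r \times C}$ and $W_1 \in \mathbb{R}^{C \times C/r}$ are learnable parameters, r is the reduction ratio (set to 16 in our implementation), and $\delta$ denotes the ReLU activation function.
The global average pooling operation aggregates spatial information to form a channel descriptor, $F^{c}_{avg}$:
$$F^{c}_{avg}(n) = \frac{1}{H W}\sum_{i=1}^{H}\sum_{j=1}^{W} F_n(i, j), \quad n = 1, \dots, C$$
Similarly, the max pooling operation captures the most prominent features to form a complementary descriptor, $F^{c}_{max}$:
$$F^{c}_{max}(n) = \max_{i, j} F_n(i, j), \quad n = 1, \dots, C$$
The final channel attention map, $M_c(F)$, is applied to the original feature maps through element-wise multiplication:
$$F' = M_c(F) \otimes F$$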
This mechanism effectively assigns higher weights to channels that contain discriminative information for fault detection, enhancing the model’s ability to focus on subtle visible patterns such as localized texture variations, irregular color distributions, and surface irregularities associated with specific fault types.
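The channel attention computation described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the weights are arbitrary rather than learned, the toy example uses C = 4 and r = 2 instead of r = 16, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def bottleneck_mlp(v, W0, W1):
    """Shared two-layer MLP: W1 * ReLU(W0 * v)."""
    hidden = [max(0.0, h) for h in matvec(W0, v)]
    return matvec(W1, hidden)


def channel_attention(F, W0, W1):
    """F is a list of C channels, each an H x W list of lists.

    Returns (per-channel attention weights, reweighted feature maps).
    """
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in F]  # global average pool
    mx = [max(map(max, ch)) for ch in F]                            # global max pool
    logits = [a + m for a, m in zip(bottleneck_mlp(avg, W0, W1),
                                    bottleneck_mlp(mx, W0, W1))]
    weights = [sigmoid(z) for z in logits]
    refined = [[[w * x for x in row] for row in ch] for w, ch in zip(weights, F)]
    return weights, refined


# Toy example: C = 4 channels of size 2 x 2, reduction ratio r = 2.
F = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [0.0, 8.0]],
     [[5.0, 5.0], [5.0, 5.0]],
     [[-1.0, 1.0], [-1.0, 1.0]]]
W0 = [[0.1, 0.0, -0.1, 0.2], [0.0, 0.3, 0.1, -0.2]]      # shape (C/r, C), arbitrary values
W1 = [[0.5, -0.5], [0.2, 0.1], [-0.3, 0.4], [0.0, 0.6]]  # shape (C, C/r), arbitrary values
weights, refined = channel_attention(F, W0, W1)
```

Because the same MLP is applied to both pooled descriptors, the module adds only 2C²/r weights, which is small relative to a convolutional backbone.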
3.3.3. Spatial Attention Block
Complementing the channel attention, the spatial attention module focuses on relevant regions within the feature maps. As illustrated in Figure 3, this module first aggregates channel information through both average and maximum pooling operations along the channel dimension, generating two 2D spatial feature maps that are concatenated and processed through a convolutional layer to produce a spatial attention map, $M_s(F')$:
$$M_s(F') = \sigma\left(f^{k \times k}\left([F^{s}_{avg};\, F^{s}_{max}]\right)\right)$$
where $f^{k \times k}$ represents a convolutional operation with a $k \times k$ kernel, and [; ] denotes the concatenation operation. The average and max pooling operations along the channel dimension are expressed as follows:
$$F^{s}_{avg}(i, j) = \frac{1}{C}\sum_{n=1}^{C} F'_n(i, j), \qquad F^{s}_{max}(i, j) = \max_{n} F'_n(i, j)$$
The resulting spatial attention map is applied to the channel-refined features to produce the final refined feature representation:
$$F'' = M_s(F') \otimes F'$$
This dual-attention mechanism enables the model to focus on both the most informative feature channels and the most relevant spatial regions for discriminating between different fault types, significantly enhancing classification performance.
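A plain-Python sketch of the spatial attention step is given below for illustration. It assumes zero padding and an odd kernel size, uses arbitrary kernel values rather than learned ones, and implements the convolution as cross-correlation, as deep learning frameworks typically do. The function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_pool(F):
    """Average and max over the channel axis; F is C x H x W nested lists."""
    C, H, W = len(F), len(F[0]), len(F[0][0])
    avg = [[sum(F[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(F[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    return avg, mx


def conv2d_same(maps, kernels, bias=0.0):
    """Single-output 2D cross-correlation over stacked maps with zero padding.

    maps: list of H x W maps; kernels: one k x k kernel per map (k odd).
    """
    H, W, k = len(maps[0]), len(maps[0][0]), len(kernels[0])
    pad = k // 2
    out = [[bias] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = bias
            for m, ker in zip(maps, kernels):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:  # positions outside count as zero
                            acc += ker[di][dj] * m[y][x]
            out[i][j] = acc
    return out


def spatial_attention(F, kernels, bias=0.0):
    """Return (H x W attention map, spatially reweighted C x H x W features)."""
    avg, mx = channel_pool(F)
    logits = conv2d_same([avg, mx], kernels, bias)
    attn = [[sigmoid(z) for z in row] for row in logits]
    refined = [[[a * x for a, x in zip(arow, frow)] for arow, frow in zip(attn, ch)]
               for ch in F]
    return attn, refined


# Toy example: C = 2 channels of size 2 x 3 and an illustrative 3 x 3 kernel.
F = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
     [[6.0, 5.0, 4.0], [3.0, 2.0, 1.0]]]
k = 3
kernels = [[[0.1] * k for _ in range(k)], [[-0.05] * k for _ in range(k)]]
attn, refined = spatial_attention(F, kernels)
```

The same attention map is broadcast over every channel, so this step changes where the network looks without changing which channels it favors.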
3.3.4. Classification Head
The refined features from the attention blocks are fed into a classification head consisting of the following:
Global average pooling to reduce spatial dimensions.
Dense layers with dropout regularization (rate = 0.5) to prevent overfitting.
A final softmax layer that outputs class probabilities across the six fault categories.
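The entire network is trained end-to-end using categorical cross-entropy loss with label smoothing:
$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}\left[(1 - \epsilon)\,y_{i,c} + \frac{\epsilon}{C}\right]\log \hat{y}_{i,c}$$
where N is the batch size, C is the number of classes, $y_{i,c}$ is the ground truth label, $\hat{y}_{i,c}$ is the predicted probability, and $\epsilon$ is the smoothing parameter, set to 0.1 in our implementation.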
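The effect of label smoothing on the loss can be checked numerically with a short sketch. The version below assumes the common formulation in which the one-hot target is mixed with a uniform distribution over the classes; `smoothed_cross_entropy` is an illustrative name, and predicted probabilities are assumed to be strictly positive, as softmax outputs are.

```python
import math


def smoothed_cross_entropy(probs, labels, num_classes, eps=0.1):
    """Mean cross-entropy with label smoothing.

    probs: list of per-sample probability lists (each sums to 1, all entries > 0).
    labels: list of integer class indices.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        for c in range(num_classes):
            one_hot = 1.0 if c == y else 0.0
            target = (1.0 - eps) * one_hot + eps / num_classes  # smoothed target
            total -= target * math.log(p[c])
    return total / len(labels)


# One confident, correct prediction over six classes.
probs = [[0.70, 0.10, 0.05, 0.05, 0.05, 0.05]]
labels = [0]
plain = smoothed_cross_entropy(probs, labels, 6, eps=0.0)
smoothed = smoothed_cross_entropy(probs, labels, 6, eps=0.1)
```

With smoothing the loss stays above zero even for a confident correct prediction, which discourages the network from pushing probabilities to exactly 0 or 1.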
3.4. Experimental Setup
We constructed SolarFaultAttentionNet by incorporating a pre-trained InceptionV3 backbone and our custom channel and spatial attention modules. The dataset was split using a stratified approach with an 80:10:10 ratio for training, validation, and testing, respectively, ensuring balanced representation across all fault classes.
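A stratified 80:10:10 split can be sketched as follows. This is an illustration only: the text does not state whether the split was applied before or after augmentation, so the example uses the original class counts, and `stratified_split` is a hypothetical helper rather than the routine used in this study.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split sample indices into train/val/test while keeping class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_val = round(ratios[1] * len(idxs))
        n_test = round(ratios[2] * len(idxs))
        n_train = len(idxs) - n_val - n_test  # remainder goes to training
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]
    return train, val, test


counts = {"Physical-Damage": 69, "Electrical-Damage": 102, "Snow-Covered": 119,
          "Clean": 192, "Dusty": 188, "Bird-Drop": 215}
labels = [name for name, n in counts.items() for _ in range(n)]
train, val, test = stratified_split(labels)
```

Splitting per class guarantees that even the smallest class (69 images) contributes samples to the validation and test sets.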
For training, we employed the following hyperparameters:
Optimizer: Adam with an initial learning rate of and weight decay of .
Batch size: 32.
Loss function: categorical cross-entropy with label smoothing (smoothing parameter of 0.1).
Early stopping: patience of 15 epochs monitoring validation loss.
Learning rate scheduler: reduce on plateau with a factor of 0.5 and patience of 5.
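The interaction between the reduce-on-plateau scheduler (factor 0.5, patience 5) and early stopping (patience 15) can be illustrated by replaying a validation-loss history. The sketch below is a simplified model of these callbacks, not the framework code used for training, and the initial learning rate of 1e-3 in the example is a placeholder chosen only for illustration.

```python
def simulate_schedule(val_losses, initial_lr, factor=0.5, lr_patience=5, stop_patience=15):
    """Replay a validation-loss history through reduce-on-plateau and early stopping.

    Returns (learning rate after each epoch run, index of the last epoch run).
    """
    best = float("inf")
    lr = initial_lr
    since_lr_change = 0  # epochs without improvement since the last LR reduction
    since_best = 0       # epochs without improvement since the best loss
    lrs = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            since_lr_change = 0
            since_best = 0
        else:
            since_lr_change += 1
            since_best += 1
            if since_lr_change >= lr_patience:
                lr *= factor
                since_lr_change = 0
        lrs.append(lr)
        if since_best >= stop_patience:
            return lrs, epoch
    return lrs, len(val_losses) - 1


# Loss improves for three epochs and then stalls.
history = [1.0, 0.8, 0.7] + [0.7] * 20
lrs, last_epoch = simulate_schedule(history, initial_lr=1e-3)
```

In this example the learning rate is halved three times before training stops, so the scheduler gets several chances to escape a plateau before early stopping ends the run.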
To evaluate the effectiveness of our approach, we compared SolarFaultAttentionNet against several state-of-the-art CNN architectures [20], including VGG16/19 [32], MobileNetV2 [33], ResNet50V2 [34], InceptionV3 [35], InceptionResNetV2 [36], Xception [37], DenseNet201 [38], ResNet152V2, DenseNet121 [38], and EfficientNetV2B3 [39]. Each model was trained on identical data splits with optimized hyperparameters to ensure a fair comparison.
3.5. Evaluation Metrics
A comprehensive evaluation framework is essential for assessing the performance of fault detection models in photovoltaic systems. In this study, we employ a diverse set of metrics to provide a thorough analysis of SolarFaultAttentionNet’s capabilities compared to existing approaches.
To quantify classification performance across the six fault categories, we utilize the following standard metrics.
Accuracy serves as our primary metric, representing the overall proportion of correctly classified instances across all classes:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.
While accuracy provides a general assessment, it can be misleading in imbalanced datasets. Therefore, we incorporate additional metrics to capture more nuanced aspects of model performance.
Precision quantifies the proportion of true positives among all positive predictions, providing insight into the reliability of the model’s fault detections:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Recall (also known as sensitivity) measures the model’s ability to correctly identify all instances of a particular fault type:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
F1 score represents the harmonic mean of precision and recall, offering a balanced measure particularly valuable for fault detection, where both false positives and false negatives carry significant operational consequences:
$$F_1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
Specificity evaluates the model’s ability to correctly identify negative cases (non-faulty modules or different fault types), which is critical for minimizing unnecessary maintenance interventions:
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
For multi-class scenarios, these metrics are calculated using both averaging strategies and class-specific evaluations. Class-specific metrics provide targeted insights into model performance for each fault category, which is particularly important for identifying weaknesses in detecting rare but critical fault types.
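The class-specific and averaged metrics can be derived from a confusion matrix using a one-vs-rest view of each class. The sketch below uses a small hypothetical three-class matrix, not data from this study, and assumes macro averaging since the averaging strategy is not specified in the text.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    n = len(cm)
    total = sum(map(sum, cm))
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # class k samples predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp    # other classes predicted as class k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        specificity = tn / (tn + fp) if tn + fp else 0.0
        results.append({"precision": precision, "recall": recall,
                        "f1": f1, "specificity": specificity})
    return results


def accuracy(cm):
    """Proportion of samples on the diagonal of the confusion matrix."""
    return sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))


def macro_average(results, key):
    """Unweighted mean of one metric over all classes."""
    return sum(r[key] for r in results) / len(results)


# Hypothetical 3-class confusion matrix used purely for illustration.
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 0, 10]]
metrics = per_class_metrics(cm)
```

Macro averaging weights every class equally, so a weak result on a rare class lowers the average as much as a weak result on a common one.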
Beyond classification performance, practical deployment considerations necessitate an evaluation of computational efficiency. We evaluate models based on inference time, which measures the average time required to process a single image and generate a prediction, reported in seconds. This metric is crucial for assessing real-time monitoring capabilities, particularly in drone-based inspection scenarios.
These efficiency metrics are particularly relevant for photovoltaic monitoring systems, where models may need to operate on resource-constrained edge devices or process large volumes of imagery in near real-time during aerial inspections of extensive solar installations. By employing this comprehensive evaluation framework, we ensure a fair and thorough comparison between SolarFaultAttentionNet and existing state-of-the-art architectures, addressing both detection performance and practical deployment considerations.
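Average per-image inference time can be measured with a simple wall-clock loop. The sketch below is one possible protocol under stated assumptions: the text does not describe the hardware, batch size, or warm-up procedure used, and `mean_inference_time` and `throughput` are illustrative helpers that accept any callable in place of a trained model.

```python
import time


def mean_inference_time(predict, inputs, warmup=3, repeats=5):
    """Average wall-clock seconds per input for a callable `predict`."""
    for x in inputs[:warmup]:
        predict(x)  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        for x in inputs:
            predict(x)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(inputs))


def throughput(seconds_per_image):
    """Images processed per second at a given per-image latency."""
    return 1.0 / seconds_per_image


# Stand-in workload: summing a list plays the role of a model forward pass.
dummy_inputs = [list(range(1000)) for _ in range(8)]
t = mean_inference_time(sum, dummy_inputs)

if __name__ == "__main__":
    print(f"mean time per input: {t:.6f} s")
    print(f"throughput at 0.0160 s per image: {throughput(0.0160):.1f} images/s")
```

A per-image latency of 0.0160 s corresponds to roughly 62.5 images per second when images are processed one at a time.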
4. Results
This section presents a comprehensive evaluation of the proposed SolarFaultAttentionNet architecture with channel-wise and spatial attention mechanisms for photovoltaic fault detection. We analyze the model’s performance through multiple metrics and provide comparisons with state-of-the-art deep learning architectures [20] and recently proposed approaches [18,19].
4.1. Comparison with Baseline Models
Table 2 presents a detailed comparison between SolarFaultAttentionNet and several established deep learning architectures for solar panel fault detection.
The VGG family architectures (VGG16 and VGG19) demonstrate moderate performance, achieving accuracies of 89.00% and 83.00%, respectively. MobileNetV2, designed for computational efficiency, achieves 85.00% accuracy with a balanced precision–recall profile. The EfficientNet variants and InceptionV3 show improved performance with 91.00% accuracy, while InceptionResNetV2 represents the strongest baseline at 94.00% accuracy with 92.88% F1 score.
SolarFaultAttentionNet substantially outperforms all comparison models, achieving 99.14% accuracy, which represents a 5.14 percentage point improvement over the next best model. This superior performance extends across precision (98.63%), recall (98.24%), and F1 score (98.42%). Notably, SolarFaultAttentionNet maintains competitive inference time (0.0160 s), comparable to models like VGG16 (0.0161 s), while delivering significantly higher accuracy. The memory footprint of 2697.07 MB further demonstrates the efficiency of the proposed architecture despite its more sophisticated design. To address potential overfitting concerns, given the high accuracy achieved, Figure 4 presents the training and validation accuracy curves for SolarFaultAttentionNet. The curves demonstrate stable convergence with minimal divergence (0.23% gap), indicating effective generalization, rather than the memorization of training patterns. The smooth progression and early convergence around epoch 47 support the model’s ability to learn genuine discriminative features for fault detection.
4.2. Class-Specific Performance Analysis
Table 3 presents SolarFaultAttentionNet’s performance metrics across different fault categories, revealing consistent detection capabilities across diverse fault types.
The class-specific metrics demonstrate exceptional detection capabilities across all fault types. The Dusty class achieves perfect scores across all metrics (100% sensitivity, specificity, precision, and F1 score), indicating the model’s proficiency in identifying dust accumulation—a common issue significantly impacting energy generation efficiency.
Electrical-Damage detection shows outstanding performance with 99.19% sensitivity, 99.80% specificity, and 99.12% F1 score, highlighting the model’s reliability in identifying critical electrical faults requiring immediate attention. The Physical-Damage and Snow-Covered classes also demonstrate excellent detection performance with F1 scores of 98.59% and 98.13%, respectively.
The consistently high specificity values (98.33% to 100.00%) indicate minimal false positive rates across all classes, which is crucial for practical deployment to avoid unnecessary maintenance interventions. This balanced performance across different fault types validates the effectiveness of the channel-wise attention mechanism in capturing subtle, discriminative features specific to each fault category. The dual-attention mechanism distinguishes bird droppings from dust by leveraging their visual differences, where bird droppings present localized irregular patterns with concentrated coverage, while dust exhibits diffuse and uniform surface attenuation.
SolarFaultAttentionNet achieves 99.14% accuracy with 98.63% precision, 98.24% recall, and 98.42% F1 score. These metrics indicate the model’s exceptional balance between minimizing both false positives and false negatives across all fault categories, making it highly reliable for practical deployment in solar panel monitoring systems.
The integration of channel-wise and spatial attention within the InceptionV3 backbone significantly enhances the model’s ability to focus on discriminative features while suppressing less relevant information. This targeted feature extraction enables SolarFaultAttentionNet to effectively distinguish subtle defects across diverse fault categories, establishing a new benchmark for photovoltaic fault detection systems.
4.3. Comparative Analysis of State-of-the-Art PV Fault Detection Models
Table 4 presents a comparison between SolarFaultAttentionNet and other recent approaches for solar panel fault detection.
A detailed comparison between SolarFaultAttentionNet and other recent approaches reveals significant performance advantages of our proposed model. The CNN-PyQt5 implementation by Ledmaoui et al. [18] achieves 91.46% accuracy with practical deployment capabilities through a graphical interface, but it falls short in detection accuracy compared to our model. SPF-Net by Rudro et al. [19] employs a U-Net segmentation architecture that performs well on satellite imagery with 94.35% accuracy, yet SolarFaultAttentionNet outperforms it by 4.79 percentage points.
The performance gap can be attributed to several key differences in approach. Ledmaoui et al. [18] focus on implementation practicality with their PyQt5 integration but utilize a standard CNN architecture without advanced attention mechanisms. Rudro et al. [19] leverage the segmentation capabilities of U-Net, which are effective for satellite imagery but may not capture the subtle surface patterns that a visible-light-based approach detects. SolarFaultAttentionNet’s channel-wise attention mechanism specifically enhances feature discrimination by recalibrating channel importance, allowing it to identify subtle fault patterns with greater precision.
All three models classify similar fault categories (Clean, Dusty, Bird-Drop, Electrical-Damage, Physical-Damage, and Snow-Covered), providing a fair basis for comparison. However, SolarFaultAttentionNet demonstrates superior performance across all metrics, particularly in precision (98.63%) and recall (98.24%), indicating its robust capabilities in minimizing both false positives and false negatives—critical factors for practical deployment in solar monitoring systems.
SolarFaultAttentionNet’s exceptional performance across both overall and class-specific metrics, combined with its efficient inference time and balanced precision–recall characteristics, establishes it as a robust solution for automatic fault detection in photovoltaic systems. The model’s ability to accurately identify diverse fault types while maintaining minimal false positives makes it particularly valuable for real-world applications where reliable fault detection is critical for maximizing energy production and minimizing maintenance costs.
5. Discussion
The comprehensive evaluation of SolarFaultAttentionNet demonstrates the significant impact of integrating channel-wise and spatial attention mechanisms within convolutional neural networks for photovoltaic fault detection. The experimental results reveal several key insights regarding the effectiveness of our proposed approach. The substantial performance gap between SolarFaultAttentionNet (99.14% accuracy) and conventional CNN architectures (maximum 94.00% accuracy for InceptionResNetV2) can be attributed to three primary factors.
The channel-wise attention mechanism enhances feature discrimination by dynamically recalibrating channel weights based on their relevance to specific fault patterns. This allows the model to focus on subtle signatures that distinguish different fault types, particularly for categories with similar visual characteristics, such as the Dusty and Bird-Drop classes. The mechanism’s ability to highlight discriminative channels while suppressing less informative ones addresses a limitation of standard CNNs, which pass all channels forward without any explicit, input-dependent reweighting.
Furthermore, the spatial attention component complements channel attention by identifying regions of interest within feature maps that contain fault-related information. This dual-attention approach creates a synergistic effect, as demonstrated by the exceptional class-specific performance across diverse fault categories. The perfect detection achieved for the Dusty class (100% sensitivity, specificity, precision, and F1 score) exemplifies how targeted feature extraction can overcome the challenges of environmental noise and variable imaging conditions.
The extensive data augmentation pipeline significantly contributes to the model’s robustness. The performance improvement on previously unseen test images aligns with findings from Bommes et al. [27] and Yan et al. [40], who emphasized the importance of diverse training examples for generalizing across variable environmental conditions. Our augmentation approach, which incorporates advanced color transformations, geometric modifications, occlusion simulations, and noise operations, enables the model to adapt to the wide range of imaging scenarios encountered in real-world PV installations.
The comparative analysis with Ledmaoui et al. [18] and Rudro et al. [19] highlights notable differences in approach and performance. The CNN-PyQt5 implementation (91.46% accuracy) prioritizes deployment practicality through a graphical interface but employs a standard CNN architecture without advanced attention mechanisms. SPF-Net (94.35% accuracy) leverages U-Net for segmentation capabilities but primarily focuses on satellite imagery. SolarFaultAttentionNet outperforms these approaches by 7.68 and 4.79 percentage points in accuracy, respectively, while maintaining a competitive inference time.
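The accuracy gaps quoted in this comparison are differences between percentages, which are distinct from relative gains. The sketch below computes both from the reported accuracies; `accuracy_gaps` is an illustrative helper.

```python
def accuracy_gaps(ours, baseline):
    """Return (percentage-point gap, relative gain in percent)."""
    points = ours - baseline
    relative = 100.0 * points / baseline
    return points, relative


OURS = 99.14  # reported accuracy of SolarFaultAttentionNet, in percent
REPORTED = {
    "CNN-PyQt5 [18]": 91.46,
    "SPF-Net [19]": 94.35,
    "InceptionResNetV2": 94.00,
}

if __name__ == "__main__":
    for name, acc in REPORTED.items():
        points, relative = accuracy_gaps(OURS, acc)
        print(f"{name:18s} +{points:.2f} points ({relative:.1f}% relative)")
```

For example, the 7.68-point gap over the CNN-PyQt5 implementation corresponds to a relative accuracy gain of about 8.4%.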
These performance gains are particularly remarkable, considering the challenging nature of visible-light imagery analysis. Visible images require the extraction of subtle surface patterns that can be influenced by factors such as lighting variability, reflections, and environmental noise. The channel-wise attention mechanism addresses this challenge by emphasizing the feature channels most responsive to discriminative visual characteristics. The model’s exceptional performance on challenging fault categories, such as Electrical-Damage (99.12% F1 score) and Physical-Damage (98.59% F1 score), demonstrates its ability to capture fine-grained discriminative features that conventional architectures can often fail to detect.
A critical consideration for practical deployment is computational efficiency, particularly for real-time monitoring applications such as drone-based inspections. Despite its sophisticated architecture, SolarFaultAttentionNet maintains an inference time of 0.0160 s, comparable to that of VGG16 (0.0161 s) and about 4.7 times faster than the next most accurate model, InceptionResNetV2 (0.0756 s). This efficiency can be attributed to the targeted nature of the attention mechanisms, which focus computational resources on the most informative features, rather than processing all features equally. The memory footprint of 2697.07 MB further reinforces the model’s suitability for deployment on edge devices with limited computational resources, enabling on-site analysis during inspection operations. The model’s efficiency supports deployment in drone-based inspection systems [27] for large-scale PV installations, where the memory requirements are manageable within typical UAV computational payloads while enabling real-time fault detection during aerial surveys.
The longer training time (99.09 min) compared to other models is a reasonable trade-off, considering the substantial performance gains and the one-time nature of the training process. For practical implementations, this suggests that model updates can be performed periodically on centralized servers, with the optimized model then deployed to field devices for efficient inference.
Despite SolarFaultAttentionNet’s strong performance, several limitations warrant consideration. Although our dataset encompasses six common fault categories, it may not capture the full spectrum of anomalies encountered in diverse PV installations across different geographical regions and climatic conditions. Additionally, the model’s performance on extremely degraded modules and on composite faults (where multiple fault types co-occur) requires further investigation. Finally, although the model performs robustly on our test set, long-term validation in operational environments is necessary to assess its resilience to temporal degradation patterns and seasonal variations.
These limitations suggest several promising directions for future research. Expanding the dataset to include a broader range of fault types, environmental conditions, and PV module technologies would enhance generalization capabilities. Incorporating temporal information through sequential imagery could enable the detection of progressive degradation patterns, potentially facilitating predictive maintenance, rather than reactive fault detection. Furthermore, exploring parameter-efficient fine-tuning techniques such as Spatially Aligned-and-Adapted Visual Prompt (SA²VP) [41] and E²VPT [42] could further optimize the model for deployment on resource-constrained edge devices.
The practical implications of SolarFaultAttentionNet extend beyond technical performance metrics to tangible improvements in solar energy generation efficiency. The model’s high sensitivity (98.24%) ensures early detection of faults that might otherwise progress to more severe conditions, while its high specificity (99.91%) minimizes unnecessary maintenance interventions. Early fault detection directly impacts energy yield and operational costs by reducing energy losses associated with undetected faults, enabling targeted maintenance that prioritizes high-impact issues, extending module lifespan through timely intervention before fault propagation, and improving return on investment by optimizing maintenance resource allocation.
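For readers who want to see how sensitivity and specificity arise in a multi-class setting, the sketch below computes both per class from a confusion matrix in a one-vs-rest fashion and then macro-averages them. The averaging scheme is an assumption, since the text does not state how the aggregate values were obtained, and the 3-class matrix is a hypothetical toy example, not the paper's data.

```python
def one_vs_rest_rates(confusion):
    """Per-class sensitivity and specificity.

    confusion[i][j] = number of samples with true class i predicted as class j.
    Assumes every class has at least one true sample and one negative sample.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    sensitivity, specificity = [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # class k missed
        fp = sum(confusion[i][k] for i in range(n)) - tp   # others flagged as k
        tn = total - tp - fn - fp
        sensitivity.append(tp / (tp + fn))
        specificity.append(tn / (tn + fp))
    return sensitivity, specificity

def macro(values):
    """Unweighted mean across classes."""
    return sum(values) / len(values)

# Hypothetical 3-class confusion matrix (illustrative values only).
toy = [
    [48, 1, 1],
    [2, 47, 1],
    [0, 1, 49],
]
sens, spec = one_vs_rest_rates(toy)
print(f"macro sensitivity: {macro(sens):.4f}")
print(f"macro specificity: {macro(spec):.4f}")
```

In the toy example, specificity exceeds sensitivity because each class's true negatives pool all correctly handled samples from the other classes. The same effect tends to push multi-class specificity above sensitivity as the number of classes grows.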
6. Conclusions
This paper has presented SolarFaultAttentionNet, a channel-wise and spatial attention-based deep learning framework for robust and interpretable PV cell fault detection. By integrating an InceptionV3 backbone into a multi-path CNN, applying comprehensive data preprocessing, and adding a dual-attention mechanism, the proposed model surpasses various state-of-the-art CNNs in accurately identifying faults such as micro-cracks, hotspots, and shading in PV images.
The experimental evaluations demonstrate that SolarFaultAttentionNet achieves exceptional performance metrics with 99.14% accuracy, 98.63% precision, 98.24% recall, and 98.42% F1 score across six fault categories. Notably, the model maintains perfect detection capability for dust accumulation (100% across all metrics) and exhibits outstanding performance for critical electrical damage detection (99.12% F1 score). The comparative analysis with eleven established CNN architectures confirms the superiority of our approach, with a 5.14% accuracy improvement over the next best model while maintaining a competitive inference time.
The integration of channel-wise attention significantly enhances the model’s ability to focus on discriminative features while suppressing less relevant information. This targeted feature extraction enables SolarFaultAttentionNet to effectively distinguish subtle defects across diverse fault categories, establishing a new benchmark for photovoltaic fault detection systems. Furthermore, the comprehensive data augmentation pipeline improves model robustness on previously unseen data, demonstrating enhanced generalization capability across variable environmental conditions.
From a practical perspective, SolarFaultAttentionNet’s balanced performance profile, with high sensitivity (98.24%) and specificity (99.91%), ensures reliable fault detection with few false positives and false negatives, making it well suited for deployment in large-scale solar monitoring systems. The model’s efficiency in terms of inference time (0.0160 s) and reasonable memory requirements (2697.07 MB) further supports its practical utility in real-world applications, including drone-based inspections of extensive solar installations.
Future work will focus on refining SolarFaultAttentionNet for resource-constrained, real-time applications and on improving its interpretability with explainable AI techniques such as Local Interpretable Model-Agnostic Explanations (LIME) [43]. We will extend the framework to handle composite faults through multi-label classification and incorporate temporal analysis to track fault progression over time for comprehensive PV monitoring systems. We aim to expand the dataset to include a broader array of fault types and environmental conditions, allowing the model to generalize more effectively across diverse PV installations. To reduce computational overhead without sacrificing accuracy, we plan to explore parameter-efficient fine-tuning techniques such as Spatially Aligned-and-Adapted Visual Prompt (SA²VP) [41] and other visual prompt tuning approaches [42, 44], which show promise in adapting large-scale networks with minimal parameter updates.