This research introduces a novel hybrid machine learning framework for automated quality prediction and classification of silicon solar modules in production lines. Unlike conventional approaches that rely solely on encapsulation loss rate (ELR) for performance evaluation, a method limited to assessing encapsulation-related power loss, our framework integrates unsupervised clustering and supervised classification to achieve a comprehensive analysis. By leveraging six critical performance parameters (open circuit voltage (VOC), short circuit current (ISC), maximum output power (Pmax), voltage at maximum power point (VPM), current at maximum power point (IPM), and fill factor (FF)), we first employ k-means clustering to dynamically categorize modules into three performance classes: excellent performance (ELR: 0–0.77%), good performance (ELR: 0.77–8.39%), and poor performance (ELR: >8.39%). This multidimensional clustering approach overcomes the narrow focus of traditional ELR-based methods by incorporating photoelectric conversion efficiency and electrical characteristics. Subsequently, five machine learning classifiers (decision trees (DT), random forest (RF), k-nearest neighbors (KNN), naive Bayes classifier (NBC), and support vector machines (SVMs)) are trained to classify modules, achieving 98.90% accuracy, with RF demonstrating superior robustness. Pearson correlation analysis further identifies VOC, Pmax, and VPM as the most influential quality determinants, exhibiting strong negative correlations with ELR (−0.953, −0.993, and −0.959, respectively). The proposed framework not only automates module quality assessment but also enhances production line efficiency by enabling real-time anomaly detection and yield optimization. This work represents a significant advancement in solar module evaluation, bridging the gap between data-driven automation and holistic performance analysis in photovoltaic manufacturing.
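The clustering and correlation steps summarized above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the module data are synthetic, the way each parameter degrades with ELR is an assumed toy model, the 20% upper bound on ELR is arbitrary, and the helper names (`synthetic_module`, `standardize`, `kmeans`, `pearson`) are hypothetical. The only relationship taken as given is the standard definition FF = Pmax / (VOC × ISC). The supervised stage (DT, RF, KNN, NBC, SVM) is omitted here; in practice it would be trained on the cluster labels with a machine learning library.

```python
import math
import random

NAMES = ["VOC", "ISC", "Pmax", "VPM", "IPM", "FF"]
rng = random.Random(42)  # fixed seed so the sketch is reproducible


def synthetic_module(elr_pct):
    """Toy model (assumption): each parameter degrades linearly with ELR."""
    voc = 40.0 * (1 - 0.003 * elr_pct) + rng.gauss(0, 0.05)
    isc = 9.5 * (1 - 0.002 * elr_pct) + rng.gauss(0, 0.01)
    vpm = 33.0 * (1 - 0.005 * elr_pct) + rng.gauss(0, 0.05)
    ipm = 9.0 * (1 - 0.005 * elr_pct) + rng.gauss(0, 0.01)
    pmax = vpm * ipm              # power at the maximum power point
    ff = pmax / (voc * isc)       # standard fill factor definition
    return [voc, isc, pmax, vpm, ipm, ff]


def standardize(rows):
    """Z-score each column so no single parameter dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def kmeans(rows, k=3, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    centers = random.Random(seed).sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda j: math.dist(r, centers[j]))
                  for r in rows]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# ELR ranges follow the three classes quoted in the abstract;
# the 20% ceiling for the "poor" class is an arbitrary choice.
tiers = [(0.0, 0.77), (0.77, 8.39), (8.39, 20.0)]
elr = [rng.uniform(lo, hi) for lo, hi in tiers for _ in range(30)]
modules = [synthetic_module(e) for e in elr]

labels = kmeans(standardize(modules), k=3)
corr = {name: pearson(col, elr) for name, col in zip(NAMES, zip(*modules))}

for name, r in corr.items():
    print(f"{name:>4} vs ELR: r = {r:+.3f}")
```

Standardizing before clustering is a design choice of this sketch: Pmax is on the order of hundreds of watts while FF is below 1, so unscaled Euclidean distances would be driven almost entirely by Pmax. The printed correlations reflect only the toy degradation model and should not be compared with the values reported in the abstract.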