3.5.1. Linear Regression
To evaluate the effectiveness and generalizability of Linear Regression in predicting capacity retention, the model was tested under two configurations:
- Using only cycler-derived features (test data).
- Using cycler + material features (test and unseen data).
The training dataset consisted of cycling results and material characterizations from five different commercial supercapacitors. The data were merged by sample ID and stratified by supercapacitor type to ensure balanced sampling across the classes. Feature values were standardized using StandardScaler, and models were trained using a 70/30 train-test split. The results are summarized in Figure 10, which presents performance metrics and error distributions for both test and unseen datasets.
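The preprocessing and training steps above can be sketched as follows. This is a minimal illustration, not the authors' actual code: column names such as sample_id, supercap_type, and capacity_retention are hypothetical placeholders, since the exact schema is not given in the text.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def train_linear_model(cycler_df, material_df):
    # Merge cycling results with material characterizations by sample ID
    data = cycler_df.merge(material_df, on="sample_id")
    X = data.drop(columns=["sample_id", "supercap_type", "capacity_retention"])
    y = data["capacity_retention"]
    # 70/30 split, stratified by supercapacitor type for balanced sampling
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=data["supercap_type"], random_state=42
    )
    # Standardize features before fitting the linear model
    scaler = StandardScaler().fit(X_train)
    model = LinearRegression().fit(scaler.transform(X_train), y_train)
    return model, scaler, X_test, y_test
```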
In the test dataset, adding material features to the cycler data led to a clear improvement in model accuracy. The R2 value increased from 0.88 to 0.955, while MAE and RMSE dropped from 3.93% to 2.49% and 5.08% to 3.12%, respectively. This indicates that incorporating material and physical properties helps the model capture additional degradation-related information not present in electrical data alone.
Importantly, the inclusion of material features also improved prediction robustness on unseen data. While the model trained on cycler-only features struggled to generalize beyond the training set, adding material inputs yielded an R2 of 0.917 and maintained a low MAE of 2.76% and RMSE of 3.74%. These results highlight the positive contribution of intrinsic material characteristics, such as porosity, electrode composition, and thermal stability, to predictive generalization.
The error distributions further support these observations. As seen in Figure 10 (bottom row), the cycler-only model exhibits wider error spreads with multiple large deviations, particularly on unseen samples. In contrast, the inclusion of material features yields tighter, more symmetric error distributions, suggesting not only better model fit but also improved consistency when predicting new, untested devices.
Table 4 presents a quantitative summary of the Linear Regression model’s performance across three scenarios: using only cycler data (Train Cycler), using combined cycler and material features (Train Cycler + Material), and evaluating the trained model on unseen supercapacitors (Unseen Cycler + Material). As shown, the inclusion of material features substantially improved all performance metrics during training, with R2 rising from 0.879 to 0.955, and MAE and RMSE dropping to 2.49% and 3.12%, respectively. Notably, the model maintained high accuracy even on unseen data, achieving an R2 of 0.917 and a Pearson correlation of 0.971, confirming the material features’ contribution to enhanced generalization. These results validate the value of integrating structural and compositional properties into predictive frameworks, even within a simple linear modeling approach.
3.5.2. Random Forest
To further investigate the predictive capability of classical ensemble methods, a Random Forest Regressor was developed to estimate the capacity retention of commercial supercapacitors. To mitigate overfitting and enhance generalization, hyperparameter tuning was conducted using GridSearchCV with 5-fold cross-validation. The parameter grid included constraints such as limiting the number of estimators (n_estimators = 100, 200), setting maximum tree depth (max_depth = 10 or 20), and enforcing a minimum number of samples per leaf node (min_samples_leaf = 100, 300). Additionally, the max_features parameter was tuned (sqrt and 0.5) to reduce input dimensionality per split. These regularization strategies were selected to balance model flexibility with robustness, particularly in scenarios involving a high-dimensional input space due to material features. Despite this, the model showed signs of overfitting when tested on unseen data, indicating that further techniques such as feature selection or model ensemble averaging may be necessary in future work. As shown in Figure 11a, incorporating material features significantly improved performance on the test dataset. The R2 increased from 0.63 to 0.84, and both MAE and RMSE decreased from 5.93% to 3.89% and from 9.56% to 6.30%, respectively. These improvements suggest that Random Forest effectively leveraged material inputs such as porosity, electrode thickness, and thermal degradation profiles to capture nonlinear relationships influencing degradation behavior.
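The hyperparameter search described above can be expressed as a short sketch. The grid values are those reported in the text; X_train and y_train are assumed placeholders for the scaled feature matrix and capacity-retention targets.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Grid values as reported in the text; min_samples_leaf is kept large
# to regularize the trees against overfitting.
param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [10, 20],
    "min_samples_leaf": [100, 300],
    "max_features": ["sqrt", 0.5],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=5,                                # 5-fold cross-validation
    scoring="neg_mean_absolute_error",
    n_jobs=-1,
)
# search.fit(X_train, y_train)
# best_rf = search.best_estimator_
```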
However, this benefit did not fully translate to the unseen dataset. When evaluated on supercapacitors not included in training, the model trained with cycler + material features experienced a drop in performance, with R2 decreasing to 0.48 and RMSE increasing to 9.34%, nearly reverting to cycler-only levels. This decline suggests that while material features helped the model fit the training distribution more accurately, they may have also introduced overfitting or dataset-specific noise that hindered generalization.
The error distributions in Figure 11b reinforce this observation. For the test data, adding material features clearly narrowed the error spread and centered it around zero, indicating better accuracy and reduced variance. In contrast, the unseen data error distribution showed wider tails and more outliers, highlighting the model’s reduced stability when applied to new samples. These findings point to Random Forest’s sensitivity to feature noise and suggest that additional regularization or feature selection may be needed to improve robustness.
Table 5 presents the quantitative performance metrics of the Random Forest model across three scenarios: training with only cycler features, training with cycler + material features, and evaluation on unseen supercapacitors using the cycler + material model. When trained on cycler-only features, the model achieved a modest performance with an R2 of 0.633 and a relatively high RMSE of 9.56%, indicating limited capacity to capture the degradation behavior using cycling data alone.
Incorporating material features during training substantially improved predictive power, raising the R2 to 0.841 and reducing RMSE to 6.30%. This demonstrates that Random Forest effectively utilized material-level inputs such as porosity, electrode thickness, and thermal properties to capture complex degradation mechanisms. Furthermore, the Pearson correlation increased significantly to 0.920, reinforcing the model’s ability to align well with true capacity retention trends.
However, on the unseen validation dataset, performance dropped noticeably. The R2 declined to 0.482 and RMSE increased to 9.34%, which is comparable to the cycler-only model. Although the test data showed strong predictive accuracy, this decline on unseen samples suggests potential overfitting to training patterns or dataset-specific material characteristics. The decrease in Pearson correlation to 0.695 further highlights a loss in generalization. These results suggest that while material features boost learning on known data, additional regularization or broader training diversity may be necessary to enhance robustness across new supercapacitor types.
3.5.3. MLP (Multi-Layer Perceptron)
To explore the effectiveness of deep learning in modeling supercapacitor degradation, a Multi-Layer Perceptron (MLP) Regressor was employed. MLP is a type of feedforward artificial neural network capable of capturing complex nonlinear relationships through its hidden layers. Two models were trained: one using only cycling features and another using an augmented feature set that included selected material properties and engineered interaction terms. Feature engineering involved the creation of nonlinear interaction variables such as Current × Voltage and Charge Time × Discharge Time, enhancing the model’s capacity to learn meaningful patterns. As shown in Figure 12, the inclusion of material inputs significantly enhanced the model’s learning capacity and predictive accuracy.
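The interaction-feature engineering and MLP setup can be sketched as below. Column names (current, voltage, charge_time, discharge_time) and the hidden-layer architecture are assumptions for illustration; the text does not specify either.

```python
import pandas as pd
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def add_interaction_terms(df):
    # Nonlinear interaction variables as described in the text
    df = df.copy()
    df["current_x_voltage"] = df["current"] * df["voltage"]
    df["charge_x_discharge"] = df["charge_time"] * df["discharge_time"]
    return df

def build_mlp():
    # Scaling is bundled into the pipeline; the (64, 32) architecture is
    # an assumption, as the paper does not report layer sizes here.
    return make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=42),
    )
```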
On the training dataset, the model trained with cycler-only features performed poorly, exhibiting a negative R2 of −3.26, an MAE of 26.6%, and an RMSE of over 33.4%—worse than a baseline mean predictor. After incorporating material and physical features, however, the model’s performance improved dramatically, achieving an R2 of 0.976, MAE of 1.97%, and RMSE of 2.50%. This indicates that MLP relies heavily on the nonlinear and complementary nature of material descriptors to learn degradation behavior effectively.
The error distribution plots in Figure 12b further illustrate this transformation. The cycler-only model produced a broad, skewed error profile with extreme deviations exceeding 60%. In contrast, the cycler + material configuration yielded a narrow, symmetric distribution centered near zero, demonstrating improved stability and accuracy.
Despite being trained only on known samples, the MLP model also performed well on unseen data, as indicated in Table 6. While some increase in error was expected, the model maintained a high R2 of 0.941 and RMSE of 3.15%, indicating that the incorporated material features improved generalization. The corresponding error histogram for unseen data (Figure 12b) remained compact, with most predictions falling within a ±3% range. This shows that the MLP not only captured complex patterns during training but also generalized well when supported by carefully selected structural and material attributes.
Table 7 presents a comprehensive comparison of model performance across the three machine learning algorithms (Linear Regression, Random Forest, and MLP) under three scenarios: training with cycler features only, training with cycler plus material features, and evaluation on unseen data using cycler plus material features. Performance is assessed using four key metrics: R2, MAE, RMSE, and Pearson correlation. The superior performance of the MLP model can be attributed to its deep nonlinear learning capability and its ability to automatically capture complex feature interactions. Unlike linear models that assume additive relationships or tree-based models that rely on axis-aligned splits, MLPs utilize multiple hidden layers and activation functions to learn nonlinear transformations of the input space. This is especially advantageous in this study, where temporal degradation behavior, structural parameters, and material features interact in nontrivial ways. For instance, the effect of porosity may depend on electrode thickness or cycling voltage, interactions that are naturally learned by neural networks. Additionally, the inclusion of engineered features (e.g., Current × Voltage, Charge Time × Discharge Time) further enhanced the model’s ability to generalize by enriching the representation space. These characteristics make MLPs well-suited for predicting supercapacitor degradation, where both linear trends and subtle multivariate patterns co-exist.
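The four metrics used throughout this comparison can be computed with standard library calls; y_true and y_pred are assumed to be capacity-retention values in percent.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def retention_metrics(y_true, y_pred):
    # R2, MAE, RMSE, and Pearson correlation for capacity-retention predictions
    return {
        "R2": r2_score(y_true, y_pred),
        "MAE": mean_absolute_error(y_true, y_pred),
        "RMSE": np.sqrt(mean_squared_error(y_true, y_pred)),
        "Pearson": pearsonr(y_true, y_pred)[0],
    }
```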
Across all three models, the addition of material features consistently enhances model performance. For instance, in the Linear Regression model, the R2 increased from 0.879 (cycler only) to 0.955 (cycler + materials), and MAE dropped from 3.93 to 2.49, demonstrating the value of including electrochemical and physical material descriptors. A similar trend is seen in the Random Forest model, where R2 improved from 0.633 to 0.841 and MAE dropped from 5.93 to 3.89 upon adding material features.
The MLP model benefited the most from the inclusion of material features. Without them, it failed to generalize and exhibited poor performance (R2 = −3.26, RMSE = 33.41), but once material features were included, it achieved the best overall results, with an R2 of 0.976, MAE of 1.97, and RMSE of 2.49 on the training set. Even on unseen data, the MLP retained strong predictive performance (R2 = 0.941, Pearson = 0.994).
To evaluate the contribution of material and physical features to the capacity retention prediction, we analyzed feature importance rankings for all three models: Linear Regression, Random Forest, and Multilayer Perceptron (MLP). While electrical parameters such as voltage and current consistently emerged as dominant predictors, several material and physical features also demonstrated significant influence. Specifically, features such as electrode weight, density, thickness, and Fluorine percentage appeared prominently in the Random Forest and MLP models, highlighting their relevance to the degradation behavior of supercapacitors. Additionally, the Linear Regression model assigned meaningful coefficients to features like remaining weight in the TGA test, DTG temperature, and derivative weight, indicating that thermal and compositional properties affect long-term performance. These findings underscore the complementary role of material characterization data in enhancing model interpretability and predictive accuracy beyond conventional electrical metrics. The detailed visualization of feature importance for each model is provided in the Supplementary Materials (Figure S5).
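One way such rankings can be obtained is sketched below: coefficient magnitudes for Linear Regression, impurity-based importances for Random Forest, and model-agnostic permutation importance for the MLP. The function and argument names are hypothetical; the paper does not describe its exact procedure.

```python
import numpy as np
from sklearn.inspection import permutation_importance

def rank_features(model, X, y, feature_names, kind):
    # Return (feature, score) pairs sorted from most to least important
    if kind == "linear":
        scores = np.abs(model.coef_)          # coefficient magnitudes
    elif kind == "forest":
        scores = model.feature_importances_   # impurity-based importance
    else:
        # Permutation importance works for any fitted estimator (e.g. MLP)
        scores = permutation_importance(
            model, X, y, n_repeats=10, random_state=42
        ).importances_mean
    order = np.argsort(scores)[::-1]
    return [(feature_names[i], float(scores[i])) for i in order]
```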
In summary, this analysis underscores the critical role of material features in improving model performance for capacity retention prediction. Across all three machine learning models—Linear Regression, Random Forest, and MLP—the inclusion of material descriptors led to consistent and often substantial enhancements in accuracy. These descriptors, which include properties such as electrode thickness, porosity, and thermal degradation behavior, provide essential physical and structural context that cycling data alone cannot capture. Their presence enables models to better understand and predict the long-term electrochemical behavior of supercapacitors.
While Linear Regression showed improved performance with material inputs, the effect was most pronounced in more complex models. For Random Forest, material features reduced prediction error on the test set by a wide margin, though challenges in generalization remained. In contrast, the MLP model, which initially struggled with cycler-only features, exhibited exceptional performance once material features were included, achieving the best results across all metrics, both in training and on unseen data. This indicates that neural networks, with their ability to model nonlinear relationships, are particularly well-suited to harness the predictive power of diverse material attributes.
Overall, these findings highlight that material features are not merely supplementary but essential for accurate and robust modeling of supercapacitor degradation. Their integration transforms simple data-driven models into more physically informed predictive tools, enabling better generalization across different device chemistries and operational histories. As such, future work on predictive modeling in energy storage should prioritize the systematic collection, selection, and integration of material-level data to maximize both model performance and reliability.
In addition to the discussion, it should be noted that the machine learning-based approach presented in this study is applicable not only to commercial supercapacitors but also to non-commercial ones, including those fabricated in laboratory environments. As demonstrated in a study by the same authors (Khosravinia et al., iScience 2023) [23], a similar framework was successfully applied to predict the performance of pseudo-capacitors produced using ultra-short laser pulses for direct electrode fabrication. This supports the generalizability of the approach across a wide range of supercapacitor systems, irrespective of their commercial status.