On the Sufficiency of Direct Regression for Perovskite Solar Cell Degradation Forecasting
Abstract
1. Introduction
- A physics-grounded characterization of PSC degradation complexity: We demonstrate that 150 h MPPT degradation trajectories are predominantly single-exponential, with device-specific decay rates identifiable within the first 30 h, making the forecasting task low-dimensional and amenable to direct regression.
- A rigorous three-way benchmark: We compare NHITS [13], a hierarchical MLP with direct multi-horizon regression; P-NHITS, which uses the same architecture with multi-quantile output [14]; and TimeDiff [5], a conditional diffusion model incorporating the full CSDI backbone, autoregressive initialization, mode conditioning, and classifier-free guidance. The evaluation is conducted on 2245 devices from the Hartono et al. dataset [1]. Despite TimeDiff receiving every architectural advantage described in the literature, its point RMSE is 17% higher than that of NHITS (0.863 vs. 0.738 PCE%, p < 10⁻¹⁵, Wilcoxon signed-rank test).
- Evidence that uncertainty quantification does not require generative models in this context: P-NHITS attains 77% coverage on nominal 80% prediction intervals with negligible loss in point accuracy (0.744 vs. 0.738 PCE%). In contrast, TimeDiff’s sample-based intervals cover only 63%, indicating overconfidence despite 50 DDPM samples. For T90 lifetime prediction over forecast-window crossings, both NHITS variants achieve an MAE of 16.2–16.9 h, compared to TimeDiff’s 22.5 h.
- Practical guidelines linking degradation physics to model selection: We delineate when diffusion models add value (multimodal futures, cycling-dependent dynamics) and when they do not (smooth, unimodal, physics-constrained trajectories).
2. Related Work
2.1. Perovskite Degradation Modeling and the Data Landscape
2.2. Time-Series Forecasting for Degradation in Energy Devices
2.3. Conditional Diffusion Models Versus Direct Regression
3. Dataset and Degradation Physics
3.1. The HySPRINT Aging Dataset
3.2. Preprocessing Pipeline
- Duration filter: Devices with a total measurement duration less than 149 h are excluded, ensuring that only devices with complete aging runs are retained.
- Causal outlier detection and repair: A backward-looking rolling median filter with a window of 42 samples (approximately 7 h) is applied to each device’s PCE series. At each time step, the local median and median absolute deviation (MAD) are calculated from the preceding window. Data points deviating by more than 5 × 1.4826 × MAD from the rolling median are identified as outliers and replaced with the local median. Devices exhibiting more than 5% outliers, or containing values outside the physical range (PCE below −1% or above 50%), are excluded. Since the filter relies solely on past data, the temporal structure necessary for forecasting is maintained.
- Temporal resampling: The irregularly spaced samples (nominal spacing 10 min) are resampled to a uniform 1 h grid using Akima interpolation [21]. Akima interpolation fits piecewise cubic functions whose slopes are determined locally at each data point, which limits overshoot near abrupt changes and avoids the Runge oscillations associated with global polynomial methods. It does not strictly guarantee monotonicity. This procedure produces 150 evenly spaced data points per device.
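The causal outlier step above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `causal_mad_repair`, the `min_history` warm-up, the use of raw (unrepaired) past values for the window statistics, and the decision to skip points whose past window has zero MAD are all hypothetical choices that the text does not specify.

```python
from statistics import median

def causal_mad_repair(series, window=42, k=5.0, min_history=5):
    """Flag and repair outliers using only the samples that precede each point.

    Returns (repaired_series, flagged_indices).
    Assumptions (not stated in the text): statistics are computed on the raw
    past values, the first `min_history` points are never tested, and a point
    is not tested when the past window has zero MAD.
    """
    repaired = list(series)
    flagged = []
    for i in range(min_history, len(series)):
        past = series[max(0, i - window):i]            # backward-looking window only
        med = median(past)                             # local median
        mad = median(abs(v - med) for v in past)       # median absolute deviation
        threshold = k * 1.4826 * mad                   # 5 x 1.4826 x MAD
        if mad > 0 and abs(series[i] - med) > threshold:
            repaired[i] = med                          # replace with local median
            flagged.append(i)
    return repaired, flagged

def exceeds_outlier_budget(series, flagged, max_fraction=0.05):
    """True if more than 5% of the samples were flagged (device would be excluded)."""
    return len(flagged) / len(series) > max_fraction
```

Because each decision depends only on earlier samples, the same function could be applied online as new measurements arrive.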
3.3. Degradation Modes and Their Physical Origins
- Mode 0: Initial gain then plateau (∼56%). Light soaking and beneficial ion redistribution during early operation enhance PCE before stabilization. Mobile ions (I⁻, MA⁺) redistribute to establish favorable built-in fields at interfaces [11]. The gain phase is transient, while the plateau reflects the device’s intrinsic steady-state efficiency. This mode is most frequently observed among high-efficiency devices.
- Mode 1: Slow exponential decay (∼30%). This mode exhibits gradual, monotonic efficiency loss at less than 0.5% per day, consistent with slow irreversible processes such as trap-state accumulation at grain boundaries, progressive contact oxidation, or slow halide phase segregation.
- Mode 2: Medium exponential decay (∼10%). This mode is characterized by a steeper decline at 0.5 to 2% per day, potentially involving multiple concurrent degradation mechanisms or devices with less robust interface engineering.
- Mode 3: Fast exponential decay (∼4%). This mode involves rapid failure, often reaching near-zero PCE within 50 to 100 h. It is consistent with catastrophic interface failure, delamination, or severe phase instability. Hartono et al. [1] reported no representation of this mode among devices with maximum PCE above 19.2%.
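To make the rate bands above concrete, the following sketch fits a single exponential to a PCE series by least squares on log(PCE) and converts the fitted rate to a loss per day. It is an illustration only: the function names are hypothetical, "% per day" is interpreted here as relative loss, the fast band (above 2% per day) is inferred from the upper edge of the medium band, and the mode labels in the source are not derived this way.

```python
import math

def fit_exponential_rate(hours, pce):
    """Least-squares fit of ln(PCE) = ln(A) - k * t. Returns (A, k), k in 1/h.

    Requires strictly positive PCE values (the logarithm is undefined otherwise).
    """
    logs = [math.log(p) for p in pce]
    n = len(hours)
    t_mean = sum(hours) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in hours)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(hours, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope

def daily_loss_percent(k):
    """Relative efficiency loss over 24 h for decay rate k (1/h), in percent."""
    return 100.0 * (1.0 - math.exp(-24.0 * k))

def rate_band(loss_per_day):
    """Map a daily loss to the rate bands quoted above (illustrative thresholds)."""
    if loss_per_day <= 0.0:
        return "gain or plateau"
    if loss_per_day < 0.5:
        return "slow decay"
    if loss_per_day <= 2.0:
        return "medium decay"
    return "fast decay"  # assumption: anything steeper than the medium band

# Usage: a synthetic device with k = 0.0005 per hour, observed for 30 h.
hours = list(range(30))
pce = [18.0 * math.exp(-0.0005 * t) for t in hours]
amplitude, k = fit_exponential_rate(hours, pce)
band = rate_band(daily_loss_percent(k))
```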
3.4. Batch Structure and Generalization
3.5. Task Formulation
- Point accuracy: Per-device RMSE is computed over the 120 h forecast horizon and pooled across three random seeds (42, 123, and 456). All RMSE and MAE values are reported in absolute percentage points of PCE (denoted PCE%), that is, the difference between predicted and measured efficiency on the original scale. For example, an RMSE of 0.738 PCE% means that predictions deviate from the measured efficiency by 0.738 absolute percentage points in the root-mean-square sense. This serves as the primary evaluation metric.
- Lifetime milestones: T80 and T90 [22] are the times at which PCE first falls below 80% or 90% of the reference PCE (the maximum PCE in the first 30 h). Crossing times are computed using linear interpolation between hourly grid points, and the error between predicted and actual crossing times is reported as MAE in hours. These milestones are particularly relevant for stability screening, where T90 indicates whether a device will maintain more than 90% of its initial efficiency over a required duration (e.g., 100 h).
- Uncertainty calibration: For models that provide prediction intervals (P-NHITS and TimeDiff), 80% coverage is reported as the proportion of actual values within the 10th to 90th percentile band, along with the mean band width in PCE%. Narrower bands at sufficient coverage indicate more informative uncertainty estimates.
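The three metrics above can be sketched as follows. This is a minimal illustration rather than the evaluation code used in the study: the function names are hypothetical, and returning the first timestamp for a device that already starts below the threshold is an assumption the text does not address.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error in the units of the inputs (PCE%)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def first_crossing_time(hours, pce, fraction=0.9, ref_hours=30):
    """Time at which PCE first falls below fraction * reference PCE, else None.

    The reference is the maximum PCE observed before `ref_hours`.
    """
    reference = max(p for t, p in zip(hours, pce) if t < ref_hours)
    threshold = fraction * reference
    if pce[0] < threshold:
        return hours[0]  # assumption: already below threshold at the first sample
    for i in range(1, len(pce)):
        if pce[i] < threshold <= pce[i - 1]:
            # Linear interpolation between the two neighbouring grid points.
            frac = (pce[i - 1] - threshold) / (pce[i - 1] - pce[i])
            return hours[i - 1] + frac * (hours[i] - hours[i - 1])
    return None  # no crossing inside the series

def interval_coverage(actual, lower, upper):
    """Fraction of actual values inside [lower, upper] and the mean band width."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(actual)
    return inside / len(actual), mean_width

# Usage: a linear decline from 20% PCE at 0.05 points per hour crosses 18% near hour 40.
hours = list(range(150))
pce = [20.0 - 0.05 * t for t in hours]
t90 = first_crossing_time(hours, pce, fraction=0.9)
```

The T90 error for one device would then be the absolute difference between `first_crossing_time` applied to the predicted and to the measured trajectory, averaged over devices to give the MAE.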
4. Methods
4.1. NHITS: Direct Multi-Horizon Regression
4.2. Probabilistic NHITS: Quantile Regression
4.3. TimeDiff: Conditional Diffusion with Full Enhancements
- A 128-dim sinusoidal positional encoding of timestamps (matching the CSDI reference implementation);
- A 16-dim learned feature embedding ( for univariate);
- A 16-dim learned mode embedding from causal slope-quartile labels (see below);
- A 1-dim binary conditioning mask (1 for observed, 0 for target).
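The four components listed above sum to 128 + 16 + 16 + 1 = 161 dimensions, which matches the side-information dimension reported in the hyperparameter table. The sketch below assembles such a vector for a single time step. It is an assumption-laden illustration: the function names, the concatenation order, and the base of 10000 in the sinusoidal encoding are not taken from the source, and the two embeddings would be learned parameters in practice (zero placeholders are used here).

```python
import math

def sinusoidal_encoding(position, dim=128):
    """Interleaved sin/cos encoding of a scalar timestamp (assumed base 10000)."""
    encoding = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000.0 ** (i / dim))
        encoding.append(math.sin(position * freq))
        encoding.append(math.cos(position * freq))
    return encoding

def side_info(position, feature_emb, mode_emb, is_observed):
    """Concatenate time encoding, feature embedding, mode embedding and mask bit."""
    mask = [1.0 if is_observed else 0.0]  # 1 for observed, 0 for target
    return sinusoidal_encoding(position) + list(feature_emb) + list(mode_emb) + mask

# Usage with placeholder (untrained) 16-dim embeddings.
vec = side_info(position=0, feature_emb=[0.0] * 16, mode_emb=[0.0] * 16, is_observed=True)
```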
5. Results
5.1. Point Forecast Accuracy
5.2. Per-Mode Analysis
Mode 0 (Initial Gain, )
Modes 1–2 (Exponential Decay, )
Mode 3 (Fast Decay, )
5.3. T80/T90 Lifetime Milestone Prediction
5.4. Uncertainty Quantification
5.5. Representative Trajectories
5.6. Input-Window Sensitivity
6. Discussion
6.1. Why Degradation Physics Determines Model Selection
The Root Cause: PSC Degradation Is Low-Dimensional
Why Diffusion Adds Overhead Without Payoff
The Normalization Asymmetry Is a Symptom, Not a Cause
6.2. The Probabilistic NHITS Result in Context
6.3. Connecting to DiffBatt and the Broader Generative-AI Debate
6.4. Practical Recommendations
- For point predictions: NHITS with identity scaler is recommended. Training requires approximately 5 min on a T4 GPU. Inference is immediate, requiring only a single forward pass per device without iterative sampling. An RMSE of 0.738 PCE% over 120 h corresponds to an average prediction error below 1 percentage point, which is within measurement uncertainty for many device architectures.
- For uncertainty-aware screening: P-NHITS with multi-quantile loss is recommended. Training time matches that of deterministic NHITS. The 10th to 90th quantile interval (77% empirical coverage) directly addresses practical questions, such as whether a device is likely to maintain greater than 15% PCE at hour 100. This information supports go/no-go decisions in screening workflows. Coverage can be increased toward the nominal 80% using conformal calibration if required.
  A concrete screening decision rule using these intervals is as follows. Let $\hat{q}^{0.9}_{t}$ and $\hat{q}^{0.1}_{t}$ denote the 90th and 10th percentile forecasts at hour $t$, and let $\tau$ be the T90 stability threshold (90% of the reference PCE). For a target assessment time $t^{*}$ (e.g., hour 100):
  - Accept if $\hat{q}^{0.1}_{t^{*}} \geq \tau$ (even the pessimistic bound stays above the threshold);
  - Reject if $\hat{q}^{0.9}_{t^{*}} < \tau$ (even the optimistic bound falls below the threshold);
  - Continue testing otherwise (the interval straddles the threshold, indicating insufficient certainty for a decision).

  This three-outcome rule uses the 80% prediction interval directly as a decision boundary. In practice, the quantile levels should be selected based on the application’s tolerance for incorrect acceptance versus incorrect rejection, noting that the empirical coverage of the 80% interval is 77% (Section 5.4).
- For T80/T90 estimation: When restricted to devices whose T90 falls in the forecast window (after hour 30), both NHITS variants achieve 16.2 to 16.9 h MAE at T90. Crossings that occur inside the 30 h input window are identified exactly from the observed data. For devices requiring genuine forecasting, the model therefore predicts, with a mean absolute error of roughly 16 to 17 h, when a device will fall below 90% of its initial efficiency. Because only 30 h of a 150 h aging run need to be observed, this still enables meaningful screening decisions with 80% time savings, though the timing precision for late-crossing devices is coarser than for crossings inside the input window.
- When to consider diffusion: Two scenarios justify generative modeling for PSC degradation: (a) if per-device metadata (absorber composition, architecture, aging temperature) becomes available and reveals distinct degradation pathways that create multimodal futures, and (b) if synthetic trajectory generation for data augmentation is needed to bootstrap models for new fabrications with limited real data.
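The three-outcome screening rule from the uncertainty-aware recommendation above reduces to two comparisons against the stability threshold. The sketch below is one possible reading: the function names are hypothetical, and the strictness of the inequalities (accept when the lower bound is at or above the threshold, reject when the upper bound is strictly below it) is an assumption.

```python
def t90_threshold(observed_pce, fraction=0.9):
    """Stability threshold: a fraction of the maximum PCE in the observation window."""
    return fraction * max(observed_pce)

def screen_device(q10, q90, threshold):
    """Return 'accept', 'reject' or 'continue' for one device at the assessment hour.

    q10 / q90 are the 10th and 90th percentile forecasts at that hour.
    """
    if q10 >= threshold:
        return "accept"    # even the pessimistic bound stays above the threshold
    if q90 < threshold:
        return "reject"    # even the optimistic bound falls below the threshold
    return "continue"      # interval straddles the threshold

# Usage: a device whose best observed PCE in the first 30 h is 18.0%.
threshold = t90_threshold([17.2, 17.8, 18.0, 17.9])
decision = screen_device(q10=15.8, q90=17.0, threshold=threshold)
```

Widening or narrowing the quantile pair passed to `screen_device` shifts the balance between incorrect acceptance and incorrect rejection, as discussed above.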
6.5. Limitations and Future Work
6.5.1. Single Dataset
6.5.2. No Per-Device Metadata
6.5.3. Window Sensitivity and Stability
6.5.4. Denoiser Architecture
6.5.5. Beyond 150 h
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Hartono, N.T.P.; Köbler, H.; Graniero, P.; Khenkin, M.; Schlatmann, R.; Ulbrich, C.; Abate, A. Stability follows efficiency based on the analysis of a large perovskite solar cells ageing dataset. Nat. Commun. 2023, 14, 4869.
2. Khenkin, M.V.; Katz, E.A.; Abate, A.; Bardizza, G.; Berry, J.J.; Brabec, C.; Brunetti, F.; Bulović, V.; Burlingame, Q.; Di Carlo, A.; et al. Consensus statement for stability assessment and reporting for perovskite photovoltaics based on ISOS procedures. Nat. Energy 2020, 5, 35–49.
3. Köbler, H.; Neubert, S.; Jankovec, M.; Glažar, B.; Haase, M.; Hilbert, C.; Topič, M.; Rech, B.; Abate, A. High-Throughput Aging System for Parallel Maximum Power Point Tracking of Perovskite Solar Cells. Energy Technol. 2022, 10, 2200234.
4. Tashiro, Y.; Song, J.; Song, Y.; Ermon, S. CSDI: Conditional score-based diffusion models for probabilistic time series imputation. In Proceedings of the 35th International Conference on Neural Information Processing Systems, NIPS ’21, Red Hook, NY, USA, 6–14 December 2021; pp. 24804–24816.
5. Shen, L.; Kwok, J. Non-autoregressive Conditional Diffusion Models for Time Series Prediction. In Proceedings of the 40th International Conference on Machine Learning, PMLR, Honolulu, HI, USA, 23–29 July 2023; pp. 31016–31029.
6. Eivazi, H.; Hebenbrock, A.; Ginster, R.; Blömeke, S.; Wittek, S.; Herrmann, C.; Spengler, T.S.; Turek, T.; Rausch, A. DiffBatt: A Diffusion Model for Battery Degradation Prediction and Synthesis. arXiv 2024, arXiv:2410.23893.
7. Ho, J.; Jain, A.; Abbeel, P. Denoising Diffusion Probabilistic Models. arXiv 2020, arXiv:2006.11239.
8. Ho, J.; Salimans, T. Classifier-Free Diffusion Guidance. arXiv 2022, arXiv:2207.12598.
9. Severson, K.A.; Attia, P.M.; Jin, N.; Perkins, N.; Jiang, B.; Yang, Z.; Chen, M.H.; Aykol, M.; Herring, P.K.; Fraggedakis, D.; et al. Data-driven prediction of battery cycle life before capacity degradation. Nat. Energy 2019, 4, 383–391.
10. Domanski, K.; Alharbi, E.A.; Hagfeldt, A.; Grätzel, M.; Tress, W. Systematic investigation of the impact of operation conditions on the degradation behaviour of perovskite solar cells. Nat. Energy 2018, 3, 61–67.
11. Di Girolamo, D.; Phung, N.; Kosasih, F.U.; Di Giacomo, F.; Matteocci, F.; Smith, J.A.; Flatken, M.A.; Köbler, H.; Turren Cruz, S.H.; Mattoni, A.; et al. Ion Migration-Induced Amorphization and Phase Segregation as a Degradation Mechanism in Planar Perovskite Solar Cells. Adv. Energy Mater. 2020, 10, 2000310.
12. Kohonen, T. Self-Organizing Maps; Springer Series in Information Sciences; Springer: Berlin/Heidelberg, Germany, 2001; Volume 30.
13. Challu, C.; Olivares, K.G.; Oreshkin, B.N.; Garza Ramirez, F.; Mergenthaler Canseco, M.; Dubrawski, A. NHITS: Neural Hierarchical Interpolation for Time Series Forecasting. Proc. AAAI Conf. Artif. Intell. 2023, 37, 6989–6997.
14. Koenker, R.; Bassett, G. Regression Quantiles. Econometrica 1978, 46, 33.
15. Jacobsson, T.J.; Hultqvist, A.; García-Fernández, A.; Anand, A.; Al-Ashouri, A.; Hagfeldt, A.; Crovetto, A.; Abate, A.; Ricciardulli, A.G.; Vijayan, A.; et al. An open-access database and analysis tool for perovskite solar cells based on the FAIR data principles. Nat. Energy 2021, 7, 107–115.
16. Graniero, P.; Khenkin, M.; Köbler, H.; Hartono, N.T.P.; Schlatmann, R.; Abate, A.; Unger, E.; Jacobsson, T.J.; Ulbrich, C. The challenge of studying perovskite solar cells’ stability with machine learning. Front. Energy Res. 2023, 11, 1118654.
17. Zhang, Z.; Wang, H.; Jacobsson, T.J.; Luo, J. Big data driven perovskite solar cell stability analysis. Nat. Commun. 2022, 13, 7639.
18. Oreshkin, B.N.; Carpov, D.; Chapados, N.; Bengio, Y. N-BEATS: Neural Basis Expansion Analysis for Interpretable Time Series Forecasting. In Proceedings of the Eighth International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020.
19. Olivares, K.G.; Challú, C.; Garza, A.; Canseco, M.M.; Dubrawski, A. NeuralForecast: User Friendly State-of-the-Art Neural Forecasting Models; PyCon: Salt Lake City, UT, USA, 2022.
20. Rasul, K.; Seward, C.; Schuster, I.; Vollgraf, R. Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting. In Proceedings of the 38th International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 8857–8868.
21. Akima, H. A New Method of Interpolation and Smooth Curve Fitting Based on Local Procedures. J. ACM 1970, 17, 589–602.
22. Saliba, M.; Stolterfoht, M.; Wolff, C.M.; Neher, D.; Abate, A. Measuring Aging Stability of Perovskite Solar Cells. Joule 2018, 2, 1019–1024.






Model hyperparameters:

| Hyperparameter | NHITS/P-NHITS | TimeDiff |
|---|---|---|
| Input/Horizon | 30/120 h | 30/120 h |
| Architecture | 3 stacks, MLP | 4 residual blocks, Transformer |
| Hidden units/Channels | | 64 channels, 8 heads |
| Pool kernels | | — |
| Freq. downsample | | — |
| Scaler | Identity | Per-device (cond. mean) |
| Loss | Huber/MQLoss | -prediction MSE |
| Training steps/epochs | 2000 steps | 30 AR + 200 diffusion epochs |
| Batch size | 128 | 32 (oversampled) |
| Learning rate | | (AR)/ (diff.) |
| Optimizer | Adam | Adam () |
| Scheduler | — | MultiStepLR () |
| EMA | — | decay |
| Early stopping | — | Patience 25 |
| Grad. clip | — | 1.0 |
| Side info dim | — | 161 |
| Diffusion steps | — | |
| Inference samples | 1 | 50 (median) |
| CFG dropout/weight | — | 0.15/1.0 |
| Mix-up rate | — | 0.3 |

Point RMSE (PCE%) by random seed:

| Seed | NHITS | P-NHITS | TimeDiff |
|---|---|---|---|
| 42 | 0.756 | 0.763 | 0.912 |
| 123 | 0.741 | 0.763 | 0.825 |
| 456 | 0.715 | 0.706 | 0.853 |
| Mean | 0.738 | 0.744 | 0.863 |

Point RMSE (PCE%) by degradation mode:

| Mode | Description | NHITS | P-NHITS | TimeDiff | N |
|---|---|---|---|---|---|
| 0 | Initial gain | 0.568 | 0.584 | 0.678 | 440 |
| 1 | Slow decay | 0.885 | 0.866 | 0.993 | 277 |
| 2 | Medium decay | 0.953 | 0.983 | 1.197 | 140 |
| 3 | Fast decay | 0.798 | 0.800 | 0.847 | 58 |
| All | | 0.738 | 0.744 | 0.863 | 915 |

T90 prediction error for forecast-window crossings:

| Mode | Model | N | MAE (h) | RMSE (h) |
|---|---|---|---|---|
| Initial gain | NHITS | 92 | 16.9 | 24.8 |
| Initial gain | P-NHITS | 88 | 18.2 | 25.8 |
| Initial gain | TimeDiff | 126 | 23.8 | 31.6 |
| Slow exp. decay | NHITS | 53 | 15.1 | 25.8 |
| Slow exp. decay | P-NHITS | 53 | 14.7 | 24.8 |
| Slow exp. decay | TimeDiff | 52 | 19.5 | 28.4 |
| Forecast-only | NHITS | 145 | 16.2 | 25.2 |
| Forecast-only | P-NHITS | 141 | 16.9 | 25.4 |
| Forecast-only | TimeDiff | 178 | 22.5 | 30.7 |

T80 prediction error for forecast-window crossings:

| Mode | Model | N | MAE (h) | RMSE (h) |
|---|---|---|---|---|
| Initial gain | NHITS | 14 | 38.9 | 46.6 |
| Initial gain | P-NHITS | 13 | 42.1 | 48.5 |
| Initial gain | TimeDiff | 12 | 25.9 | 35.8 |
| Slow exp. decay | NHITS | 150 | 15.3 | 22.0 |
| Slow exp. decay | P-NHITS | 148 | 14.6 | 21.4 |
| Slow exp. decay | TimeDiff | 137 | 20.6 | 27.1 |
| Medium exp. decay | NHITS | 5 | 2.6 | 4.2 |
| Medium exp. decay | P-NHITS | 5 | 2.9 | 4.8 |
| Medium exp. decay | TimeDiff | 5 | 9.7 | 14.6 |
| Fast exp. decay | NHITS | 1 | 24.1 | 24.1 |
| Fast exp. decay | P-NHITS | 1 | 27.5 | 27.5 |
| Fast exp. decay | TimeDiff | 1 | 66.9 | 66.9 |
| Forecast-only | NHITS | 170 | 16.9 | 24.7 |
| Forecast-only | P-NHITS | 167 | 16.5 | 24.4 |
| Forecast-only | TimeDiff | 155 | 21.0 | 28.0 |

Empirical coverage and mean width of the nominal 80% prediction interval:

| Model | Coverage (%) | Mean Width (PCE%) |
|---|---|---|
| P-NHITS | 77.2 | 1.78 |
| TimeDiff | 62.8 | 1.49 |

Point RMSE (PCE%) by input-window length:

| Window | NHITS | P-NHITS | TimeDiff |
|---|---|---|---|
| 20 h | 0.976 | 0.966 | 1.185 |
| 30 h | 0.735 | 0.744 | 0.888 |
| 40 h | 0.642 | 0.642 | 1.464 |

© 2026 by the authors. Published by MDPI on behalf of the International Institute of Knowledge Innovation and Invention. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Chahine, K.; Noura, H.N. On the Sufficiency of Direct Regression for Perovskite Solar Cell Degradation Forecasting. Appl. Syst. Innov. 2026, 9, 116. https://doi.org/10.3390/asi9060116

