Reinforced Dynamic Degradation Evolution Modeling Network: A New Remaining Useful Life Prediction Method for Lithium-Ion Batteries

Zhao, Xin; Ye, Fuqian; Wen, Changkun; Li, Jipu

doi:10.3390/en19102408

by Xin Zhao ¹,*, Fuqian Ye ¹, Changkun Wen ¹ and Jipu Li ²,*

¹ School of Optoelectronic Information and Physical Science, Jiangnan University, Wuxi 214122, China

² The Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong

* Authors to whom correspondence should be addressed.

Energies 2026, 19(10), 2408; https://doi.org/10.3390/en19102408

Submission received: 17 April 2026 / Revised: 6 May 2026 / Accepted: 14 May 2026 / Published: 17 May 2026

(This article belongs to the Section F5: Artificial Intelligence and Smart Energy)

Abstract

To address the limitations of insufficient multi-scale feature mining and rigid fusion strategies in existing remaining useful life (RUL) prediction methods, this paper proposes a novel approach based on a Reinforced Dynamic Degradation Evolution Modeling Network (RDDEMN). The proposed model integrates a Dynamic State Transition Network (DST-Net) and Sequence Pattern Attention (SPA) to jointly capture local capacity fluctuations and global degradation trends while adaptively weighting critical temporal patterns. Furthermore, a reinforcement learning-based adaptive gating mechanism is introduced to intelligently adjust feature fusion ratios according to the current degradation states. Extensive experiments on the NASA dataset demonstrate that RDDEMN significantly outperforms mainstream models across MSE, RMSE, and MAE metrics. The results verify the model’s high accuracy and robustness under complex degradation scenarios, providing a reliable solution for battery health management.

Keywords:

lithium-ion battery; remaining useful life prediction; Dynamic State Transition Network (DST-Net); Sequence Pattern Attention (SPA); reinforcement learning

1. Introduction

Against the backdrop of the accelerating global transition toward sustainable energy structures, lithium-ion batteries, as core energy storage components, are playing an increasingly crucial role [1]. From portable electronic devices to electric vehicles, and from distributed energy storage systems to smart grid peak regulation, lithium-ion batteries have become key technologies supporting the electrification of modern society due to their high energy density, long cycle life, and low self-discharge rate [1,2].

It is noteworthy that the performance degradation mechanism of lithium-ion batteries is extremely complex, involving interdisciplinary fields such as electrochemical reaction kinetics, material structural evolution, and interfacial chemistry [3]. In practical applications, battery capacity inevitably decays with increasing charge–discharge cycles and service time, while internal resistance rises simultaneously. This not only directly shortens device endurance and reduces the available capacity of energy storage systems—thus significantly affecting economic benefits—but may also trigger severe safety accidents such as thermal runaway due to performance deterioration, posing potential threats to life and property. Therefore, accurate prediction of the remaining useful life (RUL) of lithium-ion batteries is of great significance for optimizing battery management strategies, improving utilization efficiency, reducing operation and maintenance costs, and ensuring safe and stable system operation. It also promotes the efficient circulation of the battery’s full life-cycle value chain [4,5].

However, due to the complex coupling of electrochemical reactions inside batteries, diverse operating conditions, and performance variability caused by manufacturing differences among individual cells, traditional RUL prediction methods based on empirical models, physicochemical mechanism models, or shallow data-driven models struggle to accurately capture complex degradation patterns and long-term degradation trends. Both prediction accuracy and generalization ability face significant challenges. Therefore, developing advanced RUL prediction models capable of deeply mining degradation features and adapting to complex working conditions has become a key scientific problem and research hotspot in the field of battery management. It holds profound theoretical and engineering value for promoting the widespread application of lithium-ion battery technology and achieving efficient and sustainable energy systems [6].

Existing lithium-ion battery RUL prediction methods can be broadly divided into model-based methods and data-driven methods [7]. Model-based methods predict battery lifetime through various modeling strategies, ranging from simplified equivalent circuit models (ECM) to electrochemical–thermal–aging coupled models. By incorporating internal physical mechanisms (e.g., lithium-ion diffusion, electrode activity decay, and thermal effects), these methods improve RUL prediction accuracy and provide a foundation for interpretable and generalizable prediction models [8]. However, model-based approaches are constrained by simplified assumptions regarding complex degradation mechanisms, making it difficult to accurately capture nonlinear aging characteristics under multi-factor coupling [9,10,11,12,13,14]. They are prone to cumulative errors under dynamic conditions and extreme environments. Furthermore, parameter identification depends heavily on prior knowledge, and the identified parameters drift as the battery degrades, limiting universality across battery types and full life cycles [15,16].

In contrast, data-driven methods do not require modeling complex internal electrochemical processes. Instead, they analyze operational data to extract features reflecting battery health status and achieve RUL prediction, often requiring only partial external characteristics for efficient estimation [10,11,17].

In the RUL prediction domain, operational data exhibit significant temporal characteristics. Although Convolutional Neural Networks (CNNs) [18] possess strong spatial feature extraction capabilities, they are less effective for processing time-series degradation data. Recurrent Neural Networks (RNNs) [19,20], on the other hand, are naturally suited for modeling sequential data and can effectively capture dynamic features such as capacity decay during charge–discharge cycles, thereby significantly improving prediction accuracy. Long Short-Term Memory (LSTM) networks [21,22] introduce gating mechanisms and cell states to selectively remember and forget information, alleviating gradient vanishing and explosion problems in RNNs and enhancing long-sequence modeling capability. Gated Recurrent Units (GRUs) [23], as simplified variants of LSTMs, reduce structural complexity while maintaining performance and shortening training time. Although these methods capture temporal dependencies to some extent, they typically lack explicit modeling of the internal dynamic state evolution of batteries [24,25,26]. Moreover, they often adopt static weight allocation mechanisms when handling multi-scale degradation features, making it difficult to adapt to dynamic changes across different aging stages (e.g., linear decay phase and nonlinear accelerated phase).

In summary, although data-driven methods improve prediction accuracy, they still face the following limitations [27,28,29]:

(1): Strong dependence on high-quality labeled data: Full life-cycle degradation data are costly and time-consuming to obtain in practical scenarios, and uneven sample distributions may lead to model overfitting.
(2): Insufficient multi-scale feature extraction and state modeling capability: Existing networks often struggle to jointly model local capacity fluctuations and global degradation trends and are highly sensitive to noise.
(3): Lack of adaptive feature fusion mechanisms: Most methods employ fixed fusion strategies when combining features from different time steps or scales, making it difficult to adaptively adjust fusion weights according to the current degradation state, thus limiting robustness under complex dynamic conditions.

To address these issues, this paper proposes a lithium-ion battery RUL prediction method based on a Reinforced Dynamic Degradation Evolution Modeling Network (RDDEMN). The proposed method first utilizes a Dynamic State Transition Network (DST-Net) to jointly model multi-scale degradation features in capacity sequences, then introduces a Sequence Pattern Attention (SPA) mechanism to highlight key temporal contributions, and finally employs a reinforcement learning-based adaptive gating mechanism to dynamically fuse multi-scale features with the current degradation state. The main contributions of this paper are as follows:

(1): A novel RDDEMN architecture is proposed. By integrating a dedicated DST-Net module with input-conditioned state transitions and local temporal convolution, the model accurately captures multi-scale degradation features and dynamic state evolution processes.
(2): A reinforcement learning-based adaptive gating fusion mechanism is designed. Combined with SPA, the mechanism intelligently adjusts fusion weights according to the current degradation state, significantly enhancing adaptability and robustness across different aging stages.
(3): Extensive validation on full life-cycle datasets is conducted. The objective is to comprehensively evaluate the proposed RDDEMN’s prediction accuracy and robustness under various degradation scenarios, providing a reliable methodological premise for practical battery health management and fault diagnosis.

The remainder of this paper is organized as follows: Section 2 introduces the theoretical background. Section 3 presents the proposed network model. Section 4 describes experimental validation and result analysis. Section 5 concludes the paper and discusses future work.

2. Theoretical Background

2.1. Recurrent Neural Networks and State-Space Modeling Theory

In full life-cycle battery data analysis, capacity sequences exhibit pronounced temporal dependence and nonlinear dynamic evolution characteristics. The Recurrent Neural Network (RNN), as a deep learning architecture specifically designed for sequential data, introduces hidden states to retain historical information. Its core mechanism lies in establishing a recursive mapping relationship between the current output and the hidden state at the previous time step. For a given time series input

X = \{x_{1}, x_{2}, \dots, x_{T}\}

, the hidden state

h_{t}

update equation of a standard RNN at time step

t

can be expressed as:

h_{t} = σ (W_{i h} x_{t} + b_{i h} + W_{h h} h_{t - 1} + b_{h h})

(1)

where

W_{i h}

and

W_{h h}

denote the weight matrices of the input layer and hidden layer, respectively;

b_{i h}

and

b_{h h}

represent the bias terms;

σ

is the activation function. However, standard RNNs encounter difficulties in capturing long-term dependencies due to gradient vanishing or exploding problems.
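As a minimal illustration of Eq. (1), the recursion can be sketched in plain Python. The use of tanh for σ, the list-based matrices, the toy weights, and the function names `rnn_step` and `rnn_forward` are assumptions made for this sketch and are not specified in the paper.

```python
import math

def rnn_step(x_t, h_prev, W_ih, W_hh, b_ih, b_hh):
    """One hidden-state update of Eq. (1); tanh stands in for the activation (assumption)."""
    h_t = []
    for i in range(len(h_prev)):
        pre = b_ih[i] + b_hh[i]
        pre += sum(w * x for w, x in zip(W_ih[i], x_t))      # W_ih x_t
        pre += sum(w * h for w, h in zip(W_hh[i], h_prev))   # W_hh h_{t-1}
        h_t.append(math.tanh(pre))
    return h_t

def rnn_forward(xs, h0, W_ih, W_hh, b_ih, b_hh):
    """Apply rnn_step along the sequence and return every hidden state."""
    states, h = [], h0
    for x_t in xs:
        h = rnn_step(x_t, h, W_ih, W_hh, b_ih, b_hh)
        states.append(h)
    return states

# Toy usage: scalar input, two hidden units, hand-picked weights (illustrative only).
states = rnn_forward(
    xs=[[1.0], [0.9], [0.8]],
    h0=[0.0, 0.0],
    W_ih=[[0.5], [-0.5]],
    W_hh=[[0.1, 0.0], [0.0, 0.1]],
    b_ih=[0.0, 0.0],
    b_hh=[0.0, 0.0],
)
```

Because each state depends on the previous one through repeated multiplication by W_{hh}, gradients propagated back through many steps can shrink or grow, which is the long-term dependency difficulty noted above.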

State Space Models (SSMs) provide a more physically interpretable framework for describing complex dynamic systems. In an SSM, the system is assumed to be driven by latent state variables, while the observed data are regarded as mappings of these latent states. The general discrete-time formulation is given as:

\left\{ \begin{array}{l} s_{t} = F s_{t - 1} + G u_{t} + ω_{t} \\ y_{t} = H s_{t} + ν_{t} \end{array} \right.

(2)

where

s_{t}

denotes the latent state vector of the system;

u_{t}

is the external input;

y_{t}

represents the observed output; F, G and H are the state transition matrix, input matrix, and observation matrix, respectively;

ω_{t}

and

ν_{t}

represent process noise and observation noise.

In this study, the idea of state recursion provides the theoretical basis for constructing a dynamic state recursion network that adaptively captures the degradation trend.
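The state recursion of Eq. (2) can be illustrated with a small simulation. The sketch below restricts the model to a scalar state and zero-mean Gaussian noise, both of which are simplifying assumptions; the function name `simulate_ssm` and all numeric values are hypothetical.

```python
import random

def simulate_ssm(F, G, H, inputs, s0=0.0, process_std=0.0, obs_std=0.0, seed=0):
    """Simulate a scalar version of Eq. (2) and return (states, observations)."""
    rng = random.Random(seed)
    s = s0
    states, observations = [], []
    for u_t in inputs:
        s = F * s + G * u_t + rng.gauss(0.0, process_std)   # state equation
        y = H * s + rng.gauss(0.0, obs_std)                 # observation equation
        states.append(s)
        observations.append(y)
    return states, observations

# Toy usage: a slowly decaying latent state observed with a little noise.
states, observations = simulate_ssm(
    F=0.98, G=0.0, H=1.0, inputs=[0.0] * 5, s0=2.0, obs_std=0.01
)
```

With G = 0 and F slightly below 1, the latent state decays geometrically, a crude stand-in for a smooth capacity fade observed through noisy measurements.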

2.2. Reinforcement Learning

Reinforcement learning (RL) is a machine learning paradigm in which an agent learns an optimal decision-making policy through interaction with an environment. It is typically formalized as a Markov Decision Process (MDP), defined by a five-tuple

< S, A, P, R, γ >

. At time step

t

, the agent observes the current state

s_{t} \in S

, selects an action

a_{t} \in A

according to a policy

π (a_{t} | s_{t})

, and the environment transitions to a new state

s_{t + 1}

based on the state transition probability

P (s_{t + 1} | s_{t}, a_{t})

, while providing an immediate reward

r_{t} = R (a_{t} | s_{t})

. The objective of reinforcement learning is to find an optimal policy

π^{*}

that maximizes the expected cumulative discounted reward:

J (π) = E_{π} [\sum_{k = 0}^{\infty} γ^{k} r_{t + k}]

(3)

where

γ \in [0, 1]

is the discount factor, which balances the importance between immediate rewards and long-term returns. In deep reinforcement learning, a policy network is usually introduced to approximate the policy function, and the network parameters

θ

are updated through the policy gradient algorithm to achieve effective exploration and decision-making in complex continuous action spaces. Drawing on this idea, this paper designs an adaptive gating mechanism and uses reinforcement learning to dynamically adjust the fusion weights of multi-scale features.
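For a finite reward sequence, the discounted sum inside Eq. (3) can be computed with a single backward pass. This is a small worked sketch; the function name `discounted_return` and the reward values are illustrative assumptions.

```python
def discounted_return(rewards, gamma):
    """Return sum_k gamma**k * rewards[k] using the recursion G_k = r_k + gamma * G_{k+1}."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Toy usage: three unit rewards with gamma = 0.5 give 1 + 0.5 + 0.25.
example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

A discount factor close to 0 makes the agent focus on immediate rewards, while a value close to 1 weights long-term returns almost as heavily as the current reward.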

3. The Proposed Remaining Useful Life Prediction Method for Lithium-Ion Battery

To address the problems of insufficient mining of multi-scale degradation features and the inability of static fusion strategies to adapt to dynamic degradation processes in existing data-driven methods under complex working conditions, this paper proposes a battery RUL prediction method based on the RDDEMN. On the basis of extracting multi-scale degradation features, this method innovatively introduces the SPA and a reinforcement learning-based adaptive gating mechanism to achieve accurate feature extraction and dynamic fusion prediction.

3.1. The Proposed RDDEMN Algorithm

The architecture of the proposed RDDEMN network is shown in Figure 1, which mainly consists of three core functional modules: the DST-Net temporal encoding module, the SPA temporal weight assignment module, and the reinforcement learning-based adaptive gating fusion module.

First, the full life-cycle capacity data of each battery are normalized, and a sliding window slices them into fixed-length sequence samples for constructing the training and test sets. To prevent data leakage and temporal overlap, an 80/20 chronological split is applied over the discharge cycles: for each battery, the first 80% of the life-cycle sequence is allocated to the training set, and the remaining 20% is used exclusively for testing. The sliding window moves chronologically without crossing the boundary between the two sets, so that predictive performance is evaluated solely on unseen, future degradation trajectories.

Subsequently, the DST-Net jointly models the multi-scale degradation features of the capacity time series to capture local fluctuations and global trends; the SPA mechanism then adaptively assigns higher weights to key time steps; finally, a reinforcement learning agent dynamically generates gating coefficients according to the current degradation state, adaptively fuses the multi-scale features with the state features, and outputs the final RUL prediction through the fully connected layer.
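The data preparation described above can be sketched as follows. The paper does not give the normalization formula, the window length, or the exact target definition, so min-max scaling with bounds taken from the training segment, a window of 3, and a next-step target are all assumptions of this sketch, as are the function names and the synthetic capacity values.

```python
def chronological_split(series, train_ratio=0.8):
    """Split one battery's capacity series in time order (first part train, rest test)."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]

def min_max_scale(series, lo, hi):
    """Scale values to [0, 1] using bounds from the training segment only (assumption)."""
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in series]

def make_windows(series, window):
    """Slide a fixed-length window; the value right after each window is its target."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

# Synthetic, linearly fading capacity curve in Ah (made-up values, not NASA data).
capacity = [2.0 - 0.02 * i for i in range(20)]
train_raw, test_raw = chronological_split(capacity)
lo, hi = min(train_raw), max(train_raw)

# Windows are built inside each segment, so none of them crosses the split boundary.
train = make_windows(min_max_scale(train_raw, lo, hi), window=3)
test = make_windows(min_max_scale(test_raw, lo, hi), window=3)
```

Building the windows separately per segment is one simple way to guarantee that no training sample contains a test-period cycle; it costs the first few test cycles, which serve only as window history.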

3.2. Dynamic State Transition Encoder

The DST-Net is designed to collaboratively extract dynamic state evolution features and local temporal features in time series. Let the input window sequence be

X = [x_{1}, x_{2}, \dots, x_{L}]^{T}

, where

x_{t}

is the capacity value at the

t

-th time step and

L

is the window length.

First, linear mapping is performed on the input to obtain the feature vector

u_{t}

:

u_{t} = W_{i n} x_{t} + b_{i n}

(4)

where

W_{i n}

and

b_{i n}

are the weight matrix and bias of the input projection, respectively.

To simulate the dynamic process of battery degradation, input-conditional state transition parameters are constructed, including the state attenuation coefficient

α_{t}

and the state injection vector

β_{t}

:

α_{t} = σ (W_{α} u_{t} + b_{α})

(5)

β_{t} = \tanh (W_{β} u_{t} + b_{β})

(6)

where

σ

is the Sigmoid activation function, tanh is the hyperbolic tangent activation function,

W_{α}

,

W_{β}

are mapping weights, and

b_{α}, b_{β}

are biases. Based on the above parameters, a dynamic state recurrence equation is constructed to update the hidden state

h_{t}

, which is then mapped back to the feature space to obtain

o_{t}

:

h_{t} = α_{t} ⊙ h_{t - 1} + (1 - α_{t}) ⊙ β_{t}

(7)

o_{t} = W_{o} h_{t} + b_{o}

(8)

where

h_{0}

is the initial state vector, and ⊙ denotes element-wise multiplication. Meanwhile, to capture local temporal dependencies, a depthwise separable 1D convolution is introduced to process the input features to obtain

y_{t}

, and the state features and convolution features are fused through a gating mechanism:

y_{t} = \mathrm{Conv1D} (u_{t})

(9)

g_{t} = σ (W_{g} [o_{t}, y_{t}] + b_{g})

(10)

z_{t} = g_{t} ⊙ o_{t} + (1 - g_{t}) ⊙ y_{t}

(11)

where

z_{t}

is the multi-scale degradation feature output by the DST-Net, and

g_{t}

is the feature fusion gating vector. By stacking multiple layers of DST-Net units, a multi-scale feature sequence containing rich historical degradation information

Z = [z_{1}, z_{2}, \dots, z_{L}]

can be obtained.
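A single DST-Net unit following Eqs. (4)-(11) can be sketched in plain Python. Several details are assumptions of this sketch rather than statements from the paper: the hidden state and feature dimensions are equal, h_0 is zero, the convolution is causal and keeps only the depthwise part of a depthwise separable convolution, and the parameter names, `init_params`, and `dst_unit` are hypothetical.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, x, b):
    """Return W x + b for a list-of-rows matrix W and list vectors x, b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def init_params(d, kernel_size=3, seed=0):
    """Small random parameters for feature size d (shapes are assumptions)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": mat(d, 1), "b_in": [0.0] * d,
        "W_alpha": mat(d, d), "b_alpha": [0.0] * d,
        "W_beta": mat(d, d), "b_beta": [0.0] * d,
        "W_o": mat(d, d), "b_o": [0.0] * d,
        "W_g": mat(d, 2 * d), "b_g": [0.0] * d,
        "kernel": mat(d, kernel_size),   # one temporal filter per channel
    }

def dst_unit(xs, p):
    """Run one DST-Net unit over a window of scalar capacities xs."""
    d = len(p["b_in"])
    k = len(p["kernel"][0])
    us = [affine(p["W_in"], [x], p["b_in"]) for x in xs]                       # Eq. (4)
    h = [0.0] * d                                                              # h_0 = 0 (assumption)
    zs = []
    for t, u in enumerate(us):
        alpha = [sigmoid(v) for v in affine(p["W_alpha"], u, p["b_alpha"])]    # Eq. (5)
        beta = [math.tanh(v) for v in affine(p["W_beta"], u, p["b_beta"])]     # Eq. (6)
        h = [a * hp + (1.0 - a) * b for a, hp, b in zip(alpha, h, beta)]       # Eq. (7)
        o = affine(p["W_o"], h, p["b_o"])                                      # Eq. (8)
        # Eq. (9): each channel c filters only its own current and past values.
        y = [sum(p["kernel"][c][j] * us[t - j][c] for j in range(min(k, t + 1)))
             for c in range(d)]
        g = [sigmoid(v) for v in affine(p["W_g"], o + y, p["b_g"])]            # Eq. (10)
        zs.append([gi * oi + (1.0 - gi) * yi for gi, oi, yi in zip(g, o, y)])  # Eq. (11)
    return zs, h

# Toy usage on a four-step window of normalized capacities (made-up values).
zs_demo, h_last_demo = dst_unit([1.00, 0.98, 0.97, 0.95], init_params(d=4))
```

Because Eq. (7) is a convex combination of the previous state and a tanh-bounded injection, every component of the hidden state stays inside (-1, 1) when h_0 is zero, which keeps the recursion numerically stable over long windows.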

3.3. Sequence Pattern Attention

In the battery degradation process, the contribution of capacity fluctuations at different time steps to RUL prediction varies. To highlight the information of key time steps, this paper designs the SPA mechanism. First, the feature sequence output by the DST-Net is mapped to the attention space to calculate the intermediate representation:

e_{t} = v^{T} \tanh (W_{a} z_{t} + b_{a})

(12)

where

W_{a}, b_{a}

are attention mapping parameters, and

v

is the weight vector. Subsequently, the Softmax function is used to calculate the normalized weight

γ_{t}

of each time step:

γ_{t} = \frac{\exp (e_{t})}{\sum_{k = 1}^{L} \exp (e_{k})}

(13)

Finally, a weighted summation is performed on the feature sequence to obtain the multi-scale degradation feature

C_{m u l t i}

focusing on key degradation patterns:

C_{m u l t i} = \sum_{t = 1}^{L} γ_{t} z_{t}

(14)
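Eqs. (12)-(14) amount to additive attention pooling over the time axis, which can be sketched as below. The list-based parameter layout, the function name `spa_pool`, and the subtraction of the maximum score before exponentiation (a standard numerical-stability step that leaves the softmax unchanged) are choices of this sketch.

```python
import math

def spa_pool(zs, W_a, b_a, v):
    """Return (weights, context) for a feature sequence zs of shape [L][d]."""
    scores = []
    for z in zs:
        hidden = [math.tanh(sum(w * zi for w, zi in zip(row, z)) + b)
                  for row, b in zip(W_a, b_a)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))     # Eq. (12)
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [e / total for e in exps]                              # Eq. (13)
    dim = len(zs[0])
    context = [sum(w * z[c] for w, z in zip(weights, zs))
               for c in range(dim)]                                  # Eq. (14)
    return weights, context

# Toy usage: three time steps with two features each (made-up numbers).
weights, context = spa_pool(
    zs=[[0.1, 0.2], [0.4, 0.1], [0.9, 0.7]],
    W_a=[[1.0, 0.0], [0.0, 1.0]],
    b_a=[0.0, 0.0],
    v=[1.0, 1.0],
)
```

The weights are non-negative and sum to one, so C_{multi} is a convex combination of the per-step features; time steps with larger scores contribute more.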

3.4. The Designed Reinforcement Learning-Based Adaptive Gated Fusion Mechanism

To achieve dynamic fusion between multi-scale features and the current degradation state, this paper proposes an adaptive gating mechanism based on reinforcement learning. This mechanism intelligently adjusts the proportion of feature fusion by perceiving the current degradation state.

3.4.1. Definition of State and Action Space

The hidden state

h_{L}

at the end of the sequence is extracted as the current degradation state feature. The reinforcement learning state vector

S

is defined as the statistical characteristics of the battery capacity within the current window to characterize the degradation level and fluctuation degree:

S = [μ_{w i n}, σ_{w i n}^{2}, \min (X), \max (X)]

(15)

where

μ_{w i n}, σ_{w i n}^{2}

are the mean and variance of the capacity within the window, respectively. A policy network

π_{θ}

is constructed to generate actions. To ensure exploration and stability, actions

a_{r l}

are sampled from a Gaussian distribution and then clipped to a bounded range:

μ_{a c t i o n} = λ \tanh (π_{θ} (S))

(16)

a_{r l} \sim \mathcal{N} (μ_{a c t i o n}, σ_{s t d}^{2})

(17)

a_{f i n a l} = clip (a_{r l}, - λ, λ)

(18)

where

λ

denotes the upper bound of action adjustment amplitude, and

σ_{s t d}

represents the standard deviation used for exploration.
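The state construction of Eq. (15) and the action sampling of Eqs. (16)-(18) can be sketched as follows. The policy network itself is not reproduced: its scalar output is passed in as a placeholder number. The population variance, the values of λ and σ_{std}, and the function names are assumptions of this sketch.

```python
import math
import random

def window_state(window):
    """Eq. (15): mean, variance, minimum and maximum of the capacities in one window."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n    # population variance (assumption)
    return [mean, var, min(window), max(window)]

def sample_action(policy_output, lam, sigma_std, rng):
    """Eqs. (16)-(18): bounded mean, Gaussian exploration, then clipping to [-lam, lam]."""
    mu_action = lam * math.tanh(policy_output)        # Eq. (16)
    a_rl = rng.gauss(mu_action, sigma_std)            # Eq. (17)
    a_final = max(-lam, min(lam, a_rl))               # Eq. (18)
    return mu_action, a_rl, a_final

rng = random.Random(0)
state = window_state([1.90, 1.88, 1.89, 1.85])
# A real policy network would map `state` to a scalar; 0.3 is a placeholder output.
mu_action, a_rl, a_final = sample_action(0.3, lam=0.5, sigma_std=0.1, rng=rng)
```

The tanh bounds the mean of the action by λ, and the final clip bounds the sampled action itself, so exploration noise can never push the gate adjustment outside [-λ, λ].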

3.4.2. Adaptive Fusion and Loss Function

The final gating value

G_{f i n a l}

is jointly determined by the learnable base gating parameter

G_{b a s e}

and the reinforcement learning action

a_{f i n a l}

:

G_{f i n a l} = σ (G_{b a s e} + a_{f i n a l})

(19)

Using this adaptive gate, the multi-scale degradation features

C_{m u l t i}

and the current degradation state features

h_{L}

are fused, and the RUL prediction value

\hat{y}

for the next cycle is output through a fully connected layer:

F_{f u s i o n} = G_{f i n a l} ⊙ C_{m u l t i} + (1 - G_{f i n a l}) ⊙ h_{L}

(20)

\hat{y} = W_{f c} F_{f u s i o n} + b_{f c}

(21)

To jointly optimize prediction accuracy and the policy network, the model adopts a combined loss function

L_{t o t a l}

for training:

L_{t o t a l} = L_{m s e} + ξ L_{r l}

(22)

L_{r l} = - \log (π_{θ} (a_{f i n a l} | S)) \cdot R

(23)

where

L_{m s e}

denotes the mean squared error prediction loss,

L_{r l}

represents the policy gradient loss,

R

is the reward function defined based on prediction error, and

ξ

is the balancing coefficient. This design enables the model to adaptively learn the optimal feature fusion strategy according to prediction feedback.
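Eqs. (19)-(23) can be traced for a single sample with the sketch below. The paper states only that the reward is defined from the prediction error, so using the negative squared error as R is an assumption here, as are the scalar gate shared by all feature dimensions, the use of the Gaussian log-density as log π_θ, and every function name and numeric value.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fuse(g_base, a_final, c_multi, h_last):
    """Eqs. (19)-(20) with one scalar gate for all feature dimensions (assumption)."""
    g_final = sigmoid(g_base + a_final)
    fused = [g_final * c + (1.0 - g_final) * h for c, h in zip(c_multi, h_last)]
    return g_final, fused

def predict(fused, w_fc, b_fc):
    """Eq. (21): fully connected layer producing a scalar prediction."""
    return sum(w * f for w, f in zip(w_fc, fused)) + b_fc

def gaussian_log_prob(a, mu, sigma):
    """Log-density of N(mu, sigma^2) at a, standing in for log pi_theta(a | S)."""
    return -0.5 * ((a - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def total_loss(y_hat, y_true, a_final, mu_action, sigma_std, xi):
    """Eqs. (22)-(23) for one sample; reward = negative squared error (assumption)."""
    l_mse = (y_hat - y_true) ** 2
    reward = -l_mse
    l_rl = -gaussian_log_prob(a_final, mu_action, sigma_std) * reward   # Eq. (23)
    return l_mse + xi * l_rl                                             # Eq. (22)

# Toy usage with made-up features and parameters.
g_final, fused = fuse(0.0, 0.2, c_multi=[0.8, 0.1], h_last=[0.6, 0.3])
y_hat = predict(fused, w_fc=[0.5, 0.5], b_fc=0.0)
loss = total_loss(y_hat, y_true=0.45, a_final=0.2, mu_action=0.15, sigma_std=0.1, xi=0.1)
```

In an automatic-differentiation framework the reward would normally be treated as a constant when differentiating the policy term, so that only the log-probability carries gradient to the policy parameters.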

4. Experimental Validation and Analysis

To evaluate the performance of the proposed network model in the lithium-ion battery life prediction task, the publicly available full-life lithium-ion battery experimental dataset released by the National Aeronautics and Space Administration (NASA) was selected for validation.

The lifetime of lithium-ion batteries cannot be directly measured and is typically indirectly estimated through related parameters such as capacity, voltage, and current. Therefore, to rapidly determine battery health status and its remaining useful life, it is first necessary to identify key indicators that can represent battery performance. Battery performance degradation is the comprehensive manifestation of various internal physical and chemical changes during cycling. The variation in performance parameters is mainly reflected in capacity decrease and internal resistance increase. Therefore, battery capacity or internal resistance is commonly used to define RUL. In practical research, defining RUL based on capacity is more common. Accordingly, this study estimates battery lifetime based on the remaining capacity, and the life label is defined as follows:

RUL (t) = \frac{C_{i}}{C_{0}} \times 100\%

(24)

where

C_{0}

denotes the rated capacity of the battery;

C_{i}

represents the current actual capacity of the battery, which is measured under standard charge–discharge conditions. When the battery is new, the current actual capacity equals the rated capacity, and the RUL is 100%.
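As a small worked example of Eq. (24), the label can be computed directly from measured capacities. The 2.0 Ah rated capacity used here is an assumption inferred from the 1.4 Ah, 70% end-of-life threshold used in the experiments; the capacity values and the function name `rul_percent` are illustrative.

```python
def rul_percent(capacity_ah, rated_capacity_ah):
    """Eq. (24): remaining capacity as a percentage of the rated capacity."""
    return capacity_ah / rated_capacity_ah * 100.0

RATED_AH = 2.0   # assumed rated capacity, consistent with 1.4 Ah at 70%
labels = [rul_percent(c, RATED_AH) for c in [2.0, 1.8, 1.6, 1.4]]
# A new cell maps to 100%, and a cell at 1.4 Ah maps to about 70%.
```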

4.1. Experimental Description and Data Acquisition

This study adopts the publicly available lithium-ion battery degradation dataset provided by NASA to evaluate the predictive performance of the proposed model under real-world conditions. The dataset, released by NASA Ames Research Center, is widely used in battery health management and RUL prediction research and is highly authoritative and representative.

The dataset includes full-life discharge cycle data of multiple lithium-ion batteries under different operating conditions, covering various sensor signals such as discharge capacity, voltage, current, and temperature. In this paper, four battery cells (B0005, B0006, B0007, and B0018) are selected as experimental subjects. These batteries were charged in constant current–constant voltage (CC–CV) mode and discharged at constant current to a specified threshold. Throughout their lifetime, the batteries experienced hundreds of complete charge–discharge cycles until their capacity degraded below 70% of the rated capacity, at which point they were considered failed.

The discharge capacity recorded in each cycle reflects the gradual degradation trend of battery performance, exhibiting clear temporal characteristics and nonlinear degradation patterns. Due to its authenticity, reliability, and inclusion of multiple degradation trajectories, this dataset provides an excellent validation platform for lithium-ion battery lifetime modeling and prediction methods. Table 1 presents the dataset description.

Charging process: The four lithium-ion batteries (models B0005, B0006, B0007, and B0018) were operated at room temperature and charged at a constant current (CC) of 1.5 A until reaching 4.2 V, followed by constant voltage (CV) charging until the charging current decreased to 20 mA.

Discharging process: Discharge was conducted at a constant current (CC) level of 2 A until the battery voltages of B0005, B0006, B0007, and B0018 dropped to 2.7 V, 2.5 V, 2.2 V, and 2.5 V, respectively. Figure 2 illustrates the full-life capacity degradation curves of the four batteries.

4.2. Experimental Setup and Analysis

4.2.1. Model Parameter Settings

In this study, the hyperparameter configuration for model training is as follows: batch size is set to 32, total training epochs are 1000, the initial learning rate is 0.001, and the Adam optimizer is used for gradient descent and parameter updating. Unlike traditional regression tasks, to jointly optimize prediction accuracy and gating strategy, the network is trained using a combined loss function composed of supervised prediction loss and reinforcement learning policy loss.

Model construction and training are implemented in Python 3.9 based on the PyTorch 2.1.0 framework. The overall structural parameters are shown in Table 2. During data processing, the data are organized into tensors of shape (batch_size, sequence_length, input_size) and fed into the network model, where batch_size represents the batch size, sequence_length denotes the input sequence length, and input_size indicates the feature dimension of each sample. The true RUL value corresponding to each time step is used as the supervised label, and end-to-end optimization of the RDDEMN model parameters is achieved via backpropagation.

4.2.2. Comparative Experiments

To assess the proposed algorithm, comparative experiments were conducted on the NASA dataset against four mainstream baseline models: CNN, CNN + LSTM, RNN, and GRU. To ensure a fair comparison, all models were configured with identical parameter settings. Three classical error metrics, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE), were selected to evaluate the predictive performance of each model, with all metrics expressed in percentage form to facilitate comparison across models. The visualization of the prediction results for the test samples from the NASA dataset is presented in Figure 3.
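The three metrics follow their standard definitions and can be computed as in the sketch below. How the paper converts them to percentage form is not stated, so that step is omitted; the function name `regression_metrics` and the sample values are illustrative.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and MAE for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    return {"MSE": mse, "RMSE": rmse, "MAE": mae}

# Toy usage with made-up RUL labels and predictions (in percent).
metrics = regression_metrics([90.0, 85.0, 80.0], [89.0, 86.0, 78.0])
```

RMSE is expressed in the same unit as the label and penalizes large errors more strongly than MAE, which is why the two are usually reported together.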

From the overall trends illustrated in Figure 3, the proposed RDDEMN model (OUR) achieves the best performance across all four battery datasets. In particular, RDDEMN demonstrates significant advantages in the RMSE and MAE metrics, maintaining the lowest error levels among the compared models. This indicates that, by introducing the reinforcement learning-based adaptive gating mechanism, the model possesses superior generalization capability and stability when handling complex degradation patterns and varying battery operating conditions. In contrast, both CNN and CNN + LSTM exhibit relatively poor performance across the three evaluation metrics; notably, the CNN model shows abnormally high prediction errors for battery B0018, indicating poor stability. This further suggests that feature extraction based solely on CNN is not well suited to time-series tasks such as RUL prediction, which exhibit strong temporal dependencies. Meanwhile, although the RNN and GRU models can capture temporal features, they show large fluctuations in error across different samples. Overall, the proposed RDDEMN method not only maintains high prediction accuracy on individual battery samples but also demonstrates robust cross-sample and cross-condition generalization. The combination of high accuracy and strong stability enables the RDDEMN model to provide a reliable technical solution for fault diagnosis in practical battery health management tasks.

To intuitively demonstrate the prediction reliability across the battery life cycle, Figure 2 illustrates the degradation trajectories of the true capacity versus the RDDEMN predicted capacity over discharge cycles. A horizontal dashed line is plotted at the 1.4 Ah threshold, representing the 70% capacity retention End of Life (EOL) criterion. Furthermore, Table 3 details the exact EOL prediction cycle results for each battery cell. As observed, the predicted trajectories closely track the true capacity dynamics even in highly nonlinear degradation phases. The absolute cycle errors at the specific EOL points are remarkably small (ranging from 0 to 2 cycles), with the relative prediction errors controlled within 2.1%. These quantitative EOL results verify that the proposed RDDEMN model provides highly transparent and reliable early warnings for battery failure, fulfilling the practical requirements of real-world battery health management systems.
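The EOL comparison described above can be sketched as follows. Taking the EOL cycle as the first cycle whose capacity falls below the 1.4 Ah threshold, and measuring the relative error against the true EOL cycle, are assumptions of this sketch; the paper does not spell out either definition. The capacity values and function names are illustrative.

```python
def eol_cycle(capacities, threshold_ah=1.4):
    """Return the 1-based index of the first cycle below the threshold, or None."""
    for cycle, cap in enumerate(capacities, start=1):
        if cap < threshold_ah:
            return cycle
    return None

def eol_errors(true_caps, pred_caps, threshold_ah=1.4):
    """Return (absolute cycle error, relative error in percent), or None if EOL is not reached."""
    true_eol = eol_cycle(true_caps, threshold_ah)
    pred_eol = eol_cycle(pred_caps, threshold_ah)
    if true_eol is None or pred_eol is None:
        return None
    abs_err = abs(pred_eol - true_eol)
    return abs_err, abs_err / true_eol * 100.0

# Toy usage with made-up trajectories (Ah): true EOL at cycle 3, predicted at cycle 4.
result = eol_errors([1.60, 1.50, 1.39, 1.30], [1.60, 1.50, 1.45, 1.38])
```

Because capacity regeneration can briefly lift the capacity back above the threshold, a first-crossing rule is only one possible convention; a stricter rule could require several consecutive cycles below 1.4 Ah.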

4.2.3. Ablation Experiments

To validate the modeling capability of the proposed attention mechanism along the temporal dimension, further analysis and visualization were conducted on the attention weight distribution learned by the model during the testing phase. As shown in Figure 4, the learned average attention weights exhibit a clearly non-uniform distribution along the time dimension. The most recent capacity states are assigned higher weights, indicating their dominant role in future capacity prediction. Meanwhile, early historical information retains a certain level of attention weight, whereas the intermediate time steps contribute relatively less. These results demonstrate that the proposed attention mechanism can adaptively model multi-scale temporal dependencies in the battery capacity degradation process, rather than relying solely on a fixed window or single time-step information.

From the overall trend, the attention weights of the four battery groups all exhibit a distinctly non-uniform distribution over time, generally assigning higher weights to capacity states closer to the prediction time. This indicates that the model relies more heavily on recent degradation information during capacity prediction, which is consistent with the inherent time-correlated physical characteristics of battery capacity evolution. Differences in attention distributions are observed among different batteries. Battery B0006 shows a more concentrated attention allocation in the later stages of its lifetime, whereas B0005, B0007, and B0018 display gradually increasing attention weights over time, reflecting their relatively smooth degradation patterns. It is noteworthy that although the model emphasizes recent information, early historical states still maintain a certain degree of contribution. This indicates that the proposed attention mechanism does not degenerate into a single-step prediction model relying only on the latest observation, but instead adaptively balances short-term variations and long-term degradation characteristics across the temporal dimension, thereby effectively modeling multi-scale temporal dependencies in battery capacity degradation.

To further verify the effectiveness of the proposed model components, ablation experiments were conducted on the RDDEMN model using the NASA dataset. The objective was to determine the impact of the SPA module and the RL module on prediction accuracy and stability. Table 4 presents the prediction results after removing different components of the model. A0 denotes the model without SPA, A1 denotes the model without RL, and A2 denotes the model with both components removed. The analysis shows that removing any single component leads to performance degradation, which fully verifies the effectiveness of each constructed module in the model.

Figure 5 illustrates the visualization results of the ablation experiments. Across the MSE, RMSE, and MAE metrics, the full RDDEMN model shows a clear overall advantage. It achieves the lowest RMSE on all four batteries and the lowest MAE on B0005 and B0006. On B0007 and B0018 the single-component variants reach slightly lower MAE (A1: 6.03 vs. 6.13 on B0007; A0: 9.65 and A1: 10.52 vs. 10.75 on B0018, all ×10⁻³). The A2 model, with both components removed, exhibits a sharp increase in prediction error on B0006 and B0007, where its RMSE rises to 19.81 and 14.09 (×10⁻³) against 11.86 and 8.38 for the full model. This indicates poor stability and supports the combined use of SPA and RL in the proposed architecture.
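
For reference, the three error metrics used in this comparison follow their standard definitions. The sketch below computes them in plain Python and rescales them by 10³ to match the presentation in Table 4. The capacity values are hypothetical and serve only to show the calculation.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical true and predicted capacities (illustrative values only).
y_true = [1.80, 1.75, 1.70, 1.66]
y_pred = [1.81, 1.74, 1.72, 1.65]

# Scale by 1e3 so the numbers read like the (x10^-3) columns of Table 4.
report = {
    "MSE (x1e-3)": 1e3 * mse(y_true, y_pred),
    "RMSE (x1e-3)": 1e3 * rmse(y_true, y_pred),
    "MAE (x1e-3)": 1e3 * mae(y_true, y_pred),
}
print(report)
```

Because RMSE is the square root of MSE, the two always rank models identically on a given battery, whereas MAE weights large errors less heavily and can therefore rank models differently.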

5. Conclusions

To address the limitations of insufficient multi-scale feature mining and rigid fusion strategies in existing data-driven models, this paper proposes a Reinforced Dynamic Degradation Evolution Modeling Network (RDDEMN) for lithium-ion battery remaining useful life (RUL) prediction. Comprehensive validation and ablation studies on the NASA dataset confirm that RDDEMN significantly outperforms mainstream baseline models across MSE, RMSE, and MAE metrics, exhibiting superior robustness during nonlinear degradation stages. The core innovations and contributions of this study are summarized as follows:

(1): Novel RDDEMN Framework: By integrating a customized DST-Net, the model synergizes input-conditioned state transitions and local convolutions to overcome the limitations of traditional RNNs, enabling the precise extraction of multi-scale degradation features from both short-term fluctuations and long-term state evolution.
(2): RL-Based Adaptive Gated Fusion: A dynamic policy network replaces static fusion by intelligently adjusting the fusion weights of multi-scale and state features based on current degradation statistics, substantially enhancing adaptability across diverse aging stages.
(3): Exceptional Cross-Condition Generalization: RDDEMN demonstrates remarkable accuracy and stability in cross-battery scenarios, effectively handling complex phenomena like capacity regeneration to provide reliable technical support for fault diagnosis in practical battery management systems.
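
The idea behind contribution (2) can be illustrated with a minimal sketch of gated fusion driven by degradation statistics. This is not the authors' implementation: the choice of statistics, the linear-plus-sigmoid policy, the single scalar gate, and all numeric values are assumptions made for illustration, and the reinforcement learning step that would train the policy weights is omitted.

```python
import math

def degradation_stats(window):
    """Mean, standard deviation, and mean per-step change of a capacity window."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    slope = (window[-1] - window[0]) / (n - 1)
    return [mean, math.sqrt(var), slope]

def gate(stats, weights, bias):
    """Hypothetical linear policy squashed by a sigmoid to a ratio in (0, 1)."""
    z = sum(w * s for w, s in zip(weights, stats)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def fuse(multi_scale, state_feat, g):
    """Convex combination of two equally sized feature vectors."""
    return [g * a + (1.0 - g) * b for a, b in zip(multi_scale, state_feat)]

# Hypothetical capacity window and policy parameters (illustrative only).
window = [1.85, 1.84, 1.82, 1.79, 1.75]
policy_weights = [0.0, 5.0, -20.0]
policy_bias = 0.0

stats = degradation_stats(window)
g = gate(stats, policy_weights, policy_bias)
fused = fuse([0.2, 0.9, -0.4], [0.5, 0.1, 0.3], g)
print(round(g, 3), [round(f, 3) for f in fused])
```

With these assumed parameters, a steeper capacity decline or higher variability pushes the gate toward the multi-scale features, while a flat window keeps it near an even split. A static fusion scheme would instead fix `g` for all degradation stages.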

To mitigate the high costs of acquiring full life-cycle battery data, future work will explore lightweight prediction schemes using transfer learning and meta-learning. Reusing RDDEMN’s robust features for cross-condition transfer will address data scarcity in small-sample scenarios and broaden the model’s industrial application boundaries.

Author Contributions

Conceptualization, X.Z.; methodology, X.Z.; writing – original draft preparation, X.Z., F.Y. and C.W.; writing – review and editing, J.L.; supervision, J.L.; funding acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Natural Science Foundation of Jiangsu Province, China (Youth Fund Project), under Grant BK20221063 and by the Sichuan Science and Technology Program (Central Guidance for Local Science and Technology Development Fund Projects) under Grant 2024ZYD0076.

Data Availability Statement

The data presented in this study are available on request from the corresponding author as the data are derived from the National Science and Technology Major Project of China, which is still under execution and subject to confidentiality and proprietary restrictions.

Acknowledgments

The authors would like to thank the reviewers for their valuable feedback. We are grateful to Jipu Li for supervision and to our research group for technical support. In addition, special thanks go to Xin Zhang from Jiangnan University for his guidance and funding support for this paper. Gratitude is also extended to NASA Ames Research Center for providing the dataset and to the National Science and Technology Major Project of China for financial support.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Flowchart of RDDEMN network architecture.

Figure 2. The full-life capacity curves of the four groups of batteries.

Figure 3. Visualization of the experimental results comparison.

Figure 4. Visualization of attention weights.

Figure 5. Visualization of ablation experiment results.

Table 1. Data declaration.

Battery ID	Temperature (°C)	Discharge Cutoff Voltage (V)	Discharge Current (A)	Charge Current (A)	EIS Frequency (Hz)
B0005	24	2.7	2	1.5	0.1–5 k
B0006	24	2.5	2	1.5	0.1–5 k
B0007	24	2.2	2	1.5	0.1–5 k
B0018	24	2.5	2	1.5	0.1–5 k

Table 2. Architecture parameters of RDDEMN.

Module	Key Configuration (Hyperparameters)	Input → Output	Parameters
INPUT	window size = 20	(B,20,1)	0
DST-Net	d_model = 128, d_state = 64, n_layers = 2, conv_kernel = 5	(B,20,1) → (B,20,256)	235,520
SPA + RL	attn_size = 16, dropout = 0.1	(B,20,256) → (B,256)	70,555
FC	Linear (256 → 1)	(B,256) → (B,1)	257
Total			306,332
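
The parameter counts in Table 2 can be cross-checked with a few lines of arithmetic. The sketch below assumes the FC layer is a standard fully connected layer with a bias term, so that its count is weights plus biases; the DST-Net and SPA + RL counts are taken from the table as given rather than derived.

```python
def linear_params(in_features, out_features, bias=True):
    """Trainable parameters of a fully connected layer: weights plus optional biases."""
    return in_features * out_features + (out_features if bias else 0)

# Counts for DST-Net and SPA + RL are copied from Table 2, not recomputed.
module_params = {
    "INPUT": 0,
    "DST-Net": 235_520,
    "SPA + RL": 70_555,
    "FC": linear_params(256, 1),
}
total = sum(module_params.values())
print(module_params["FC"], total)
```

Under this assumption the FC count and the overall sum agree with the values listed in the table.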

Table 3. End of Life (EOL) prediction results of the RDDEMN model on the NASA dataset.

Battery ID	Actual EOL (Cycles)	Predicted EOL (Cycles)	Absolute Error (Cycles)	Relative Error (%)
B0005	124	125	1	0.81%
B0006	108	107	1	0.93%
B0007	164	164	0	0.00%
B0018	96	98	2	2.08%
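
The error columns in Table 3 follow from the actual and predicted EOL cycles. As a minimal sketch, assuming the relative error is the absolute error divided by the actual EOL and expressed in percent, the values can be reproduced as follows.

```python
# Actual and predicted EOL cycles copied from Table 3.
eol = {
    "B0005": (124, 125),
    "B0006": (108, 107),
    "B0007": (164, 164),
    "B0018": (96, 98),
}

def eol_errors(actual, predicted):
    """Absolute error in cycles and relative error in percent of the actual EOL."""
    abs_err = abs(predicted - actual)
    rel_err = 100.0 * abs_err / actual
    return abs_err, rel_err

rows = {bid: eol_errors(actual, pred) for bid, (actual, pred) in eol.items()}
for bid, (abs_err, rel_err) in rows.items():
    print(f"{bid}: {abs_err} cycles, {rel_err:.2f}%")
```

Because the relative error is normalized by the actual EOL, the same absolute error weighs more heavily on short-lived cells such as B0018.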

Table 4. The predicted results of the ablation experiments.

Network Model	B0005			B0006			B0007			B0018
Network Model	MSE (×10⁻³)	RMSE (×10⁻³)	MAE (×10⁻³)	MSE (×10⁻³)	RMSE (×10⁻³)	MAE (×10⁻³)	MSE (×10⁻³)	RMSE (×10⁻³)	MAE (×10⁻³)	MSE (×10⁻³)	RMSE (×10⁻³)	MAE (×10⁻³)
RDDEMN (ours)	0.11	10.20	6.90	0.14	11.86	7.67	0.07	8.38	6.13	0.26	15.98	10.75
A0	0.13	11.23	7.66	0.17	13.20	10.86	0.09	9.58	8.57	0.29	17.13	9.65
A1	0.13	11.53	9.77	0.20	14.01	8.83	0.07	8.41	6.03	0.27	16.49	10.52
A2	0.15	12.26	8.60	0.39	19.81	16.71	0.20	14.09	13.36	0.31	17.55	11.89
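
One way to read Table 4 is to ask which variant is best per battery and metric, and how much the RMSE grows when components are removed. The sketch below uses the table values as given (all ×10⁻³); the key "RDDEMN" stands for the full model in the first data row, and the helper names are introduced here for illustration.

```python
METRICS = ("MSE", "RMSE", "MAE")

# Values copied from Table 4 as (MSE, RMSE, MAE), all in units of 1e-3.
table4 = {
    "RDDEMN": {"B0005": (0.11, 10.20, 6.90), "B0006": (0.14, 11.86, 7.67),
               "B0007": (0.07, 8.38, 6.13), "B0018": (0.26, 15.98, 10.75)},
    "A0": {"B0005": (0.13, 11.23, 7.66), "B0006": (0.17, 13.20, 10.86),
           "B0007": (0.09, 9.58, 8.57), "B0018": (0.29, 17.13, 9.65)},
    "A1": {"B0005": (0.13, 11.53, 9.77), "B0006": (0.20, 14.01, 8.83),
           "B0007": (0.07, 8.41, 6.03), "B0018": (0.27, 16.49, 10.52)},
    "A2": {"B0005": (0.15, 12.26, 8.60), "B0006": (0.39, 19.81, 16.71),
           "B0007": (0.20, 14.09, 13.36), "B0018": (0.31, 17.55, 11.89)},
}
batteries = ("B0005", "B0006", "B0007", "B0018")

def best_variant(battery, metric):
    """Variant with the lowest error; ties resolve to the first variant listed."""
    idx = METRICS.index(metric)
    return min(table4, key=lambda name: table4[name][battery][idx])

def rmse_increase(variant, battery):
    """Relative RMSE increase of a variant over the full model, in percent."""
    base = table4["RDDEMN"][battery][1]
    return 100.0 * (table4[variant][battery][1] - base) / base

for b in batteries:
    winners = {m: best_variant(b, m) for m in METRICS}
    print(b, winners, f"A2 RMSE +{rmse_increase('A2', b):.1f}%")
```

Reading the table this way separates two effects: the full model leads on RMSE for every battery, while the largest relative RMSE increases for A2 occur on B0006 and B0007.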
