Article

IVCLNet: A Hybrid Deep Learning Framework Integrating Signal Decomposition and Attention-Enhanced CNN-LSTM for Lithium-Ion Battery SOH Prediction and RUL Estimation

School of Information Engineering, Henan University of Science and Technology, Luoyang 471023, China
*
Author to whom correspondence should be addressed.
Energies 2025, 18(21), 5677; https://doi.org/10.3390/en18215677
Submission received: 15 September 2025 / Revised: 25 October 2025 / Accepted: 25 October 2025 / Published: 29 October 2025

Abstract

Accurate prediction of the degradation trajectory and estimation of the remaining useful life (RUL) of lithium-ion batteries are crucial for ensuring the reliability and safety of modern energy storage systems. However, many existing approaches rely on deep or highly complex models to achieve high accuracy, often at the cost of computational efficiency and practical applicability. To tackle this challenge, we propose a novel hybrid deep-learning framework, IVCLNet, which predicts the battery’s state-of-health (SOH) evolution and estimates RUL by identifying the end-of-life threshold (SOH = 80%). The framework integrates Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN), Variational Mode Decomposition (VMD), and an attention-enhanced Long Short-Term Memory (LSTM) network. IVCLNet leverages a cascade decomposition strategy to capture multi-scale degradation patterns and employs multiple indirect health indicators (HIs) to enrich feature representation. A lightweight Convolutional Block Attention Module (CBAM) is embedded to strengthen the model’s perception of critical features, guiding the one-dimensional convolutional layers to focus on informative components. Combined with LSTM-based temporal modeling, the framework ensures both accuracy and interpretability. Extensive experiments conducted on two publicly available lithium-ion battery datasets demonstrated that IVCLNet significantly outperforms existing methods in terms of prediction accuracy, robustness, and computational efficiency. The findings indicate that the proposed framework is promising for practical applications in battery health management systems.

1. Introduction

Lithium-ion batteries (LIBs) have found extensive use across electric mobility, energy storage infrastructures, and industrial domains, owing to their high specific energy, durable cycling performance, and low self-discharge behavior [1]. Despite these advantages, a cell’s capacity inevitably deteriorates during long-term operation, and once it is reduced below 80% of its rated capacity, the cell is considered to have reached end-of-life (EOL). Prognostics and Health Management (PHM) techniques are, thus, employed to evaluate reliability and avert unplanned failures, with remaining useful life (RUL) estimation being a pivotal task [2].
In recent years, researchers have explored various approaches to improve the accuracy and practicality of RUL prediction for lithium-ion batteries. These methods are broadly classified into model-based and data-driven categories. Model-based approaches include electrochemical models [3], equivalent circuit models [4], and empirical models. Guha and Patra [5] employed a particle filter in conjunction with a fractional-order equivalent circuit model for RUL estimation, though the algorithm imposes high computational demands. Lui et al. [6] proposed an empirical method based on voltage and current analysis. However, such methods often require accurate physical modeling, which is difficult to achieve due to the complexity and variability of electrochemical processes, limiting their applicability.
Data-driven approaches circumvent detailed physical modeling and instead infer degradation behaviors from historical observations [7]. Representative methods include support vector machines (SVMs), relevance vector machines (RVMs) [8], convolutional neural networks (CNNs) [9], and long short-term memory (LSTM) networks [10]. Among these, the LSTM model has been widely applied for battery health assessment and RUL prediction due to its ability to capture long-term temporal dependencies [11].
To further enhance prediction performance, several improvements have been introduced. Wang et al. [12] constructed health indicators through nonlinear dimension reduction of charging current curves and combined them with LSTM. Liu et al. [13] optimized LSTM hyperparameters via an improved sparrow search algorithm, while Ren et al. [14] combined CNN and LSTM with autoencoder-based feature expansion to jointly exploit local and temporal features. These studies demonstrate the potential of deep learning, but also highlight challenges such as noisy inputs, weak feature interpretability, and difficulties in handling non-stationary degradation signals.
Signal decomposition techniques are increasingly explored to tackle these challenges. Variational Mode Decomposition (VMD) has been employed to suppress high-frequency noise, but it still suffers from residual interference. Similarly, CEEMDAN mitigates mode aliasing through noise-assisted decomposition but may retain high-order delays and residual noise. To address these drawbacks, the Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN) [15] method has been proposed; it alleviates endpoint effects and enhances robustness when handling nonlinear, non-stationary time series.
Despite progress in both model-based and data-driven RUL prediction, several gaps remain. Model-based approaches, while interpretable, are sensitive to parameter uncertainties [5,6]. Purely data-driven methods such as SVM, CNN, and LSTM [8,9,10] can capture complex patterns but often fail to fully exploit multi-scale temporal information and are vulnerable to noise. Hybrid strategies have been proposed [13,14], yet most rely on a single feature extraction method or lack mechanisms to adaptively emphasize informative temporal patterns.
To overcome these limitations, this paper proposes IVCLNet, a hybrid prediction framework that integrates ICEEMDAN and VMD for multi-level signal decomposition with an attention-enhanced CNN-LSTM network. By fusing health indicators with decomposed capacity features, the framework captures both global and local temporal information. The inclusion of a Convolutional Block Attention Module (CBAM) further enables selective focus on informative features, thereby improving robustness and predictive accuracy.
The main contributions of this paper are summarized as follows:
  • A novel hybrid framework, IVCLNet, is proposed. It employs ICEEMDAN and VMD for multi-level signal decomposition and utilizes a fusion model to learn time-series degradation features. This design enhances the modeling of complex non-stationary signals and improves RUL prediction accuracy.
  • A hybrid input strategy is designed by integrating indirect health indicators and decomposed capacity features, boosting feature sensitivity and predictive robustness.
  • A CNN-LSTM network enhanced by a Convolutional Block Attention Module (CBAM) is introduced to adaptively extract key temporal features. Experiments demonstrate that IVCLNet outperforms baseline models in both accuracy and robustness.
The remainder of this paper is structured as follows: Section 2 presents the proposed method, Section 3 describes data preprocessing and feature construction, Section 4 reports experimental results, and Section 5 concludes the study.

2. Method

This section describes the methods underlying the proposed framework, including ICEEMDAN, VMD, the CNN-LSTM temporal model, CBAM, and the evaluation criteria.

2.1. Improved CEEMDAN (ICEEMDAN)

In time series forecasting, the traditional empirical mode decomposition (EMD) method [16] has been improved by introducing noise, leading to the development of enhanced decomposition models such as EEMD [17], CEEMD [18], and CEEMDAN [19]. These improved decomposition methods are more effective at handling non-stationary and complex signals. Colominas et al. [20] further improved the CEEMDAN method by proposing the ICEEMDAN method, which substitutes conventional mode estimation with local mean operations and derives k-order modes from the signal’s local averages rather than directly employing white noise. This effectively avoids mode aliasing and provides more accurate and stable signal decomposition.
Let the original sequence be $y(t)$, let $\eta_i(t)$ be Gaussian white noise with zero mean and unit variance, let $\beta_k$ be the noise amplitude factor, let $E_k(\cdot)$ denote the operator that extracts the $k$-th mode via EMD, let $M(\cdot)$ be the operator that generates the local mean of a sequence, and let $\langle \cdot \rangle$ denote averaging over the noise realizations. The decomposition process of ICEEMDAN is as follows:
Step 1: Add adaptive noise to the original signal to obtain the noisy realizations:

$$y_i(t) = y(t) + \beta_0 E_1\big(\eta_i(t)\big)$$
Step 2: Compute the first residual as the ensemble average of local means, $r_1(t) = \langle M(y_i(t)) \rangle$, and extract the first IMF component:

$$\mathrm{IMF}_1(t) = y(t) - r_1(t)$$

The second IMF component $\mathrm{IMF}_2(t)$ is obtained by subtracting the new local mean $r_2(t)$ from the residual $r_1(t)$ of the previous step:

$$\mathrm{IMF}_2(t) = r_1(t) - r_2(t) = r_1(t) - \big\langle M\big(r_1(t) + \beta_1 E_2(\eta_i(t))\big) \big\rangle$$
Step 3: Calculate the $j$th-order residual term $r_j(t)$, where $j = 2, 3, \ldots, N$, using the residual from the previous step plus noise for local mean estimation:

$$r_j(t) = \big\langle M\big(r_{j-1}(t) + \beta_{j-1} E_j(\eta_i(t))\big) \big\rangle$$
Step 4: The $j$-th IMF is the difference between adjacent residuals:

$$\mathrm{IMF}_j(t) = r_{j-1}(t) - r_j(t)$$
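As a rough illustration of Steps 1-4, the sketch below implements the residual/IMF recursion. A simple moving average stands in for the EMD-based local-mean operator $M(\cdot)$ (a real ICEEMDAN implementation uses envelope means of the noise-perturbed realizations), and the function names, window size, and noise settings are purely illustrative:

```python
import numpy as np

def local_mean(x, win=11):
    # Stand-in for the EMD local-mean operator M(.): a centered moving
    # average. Real ICEEMDAN uses upper/lower envelope means instead.
    pad = win // 2
    xp = np.pad(x, pad, mode="edge")
    return np.convolve(xp, np.ones(win) / win, mode="valid")

def iceemdan_sketch(y, n_modes=4, n_real=20, beta=0.2, seed=0):
    rng = np.random.default_rng(seed)
    # r1: ensemble average of local means of noise-perturbed signals
    r = np.mean([local_mean(y + beta * rng.standard_normal(len(y)))
                 for _ in range(n_real)], axis=0)
    imfs = [y - r]                                  # IMF1 = y - r1
    for _ in range(1, n_modes):
        r_next = np.mean([local_mean(r + beta * rng.standard_normal(len(y)))
                          for _ in range(n_real)], axis=0)
        imfs.append(r - r_next)                     # IMF_j = r_{j-1} - r_j
        r = r_next
    return np.array(imfs), r                        # IMFs and final residual
```

By the telescoping structure of Steps 2-4, the extracted IMFs plus the final residual reconstruct the original sequence exactly, regardless of how the local mean is estimated.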

2.2. Variational Mode Decomposition (VMD)

In order to further extract non-stationary features from the lithium-ion battery capacity sequence and enhance the input quality of the prediction model, this paper introduces VMD as a secondary decomposition method after ICEEMDAN decomposition; VMD is an adaptive signal processing method based on a variational framework [21]. The basic idea is to decompose the input signal $f(t)$ into a finite number of intrinsic mode functions (IMFs) $\{u_k(t)\}_{k=1}^{K}$ such that each mode has the narrowest spectral bandwidth near its corresponding center frequency $\omega_k$. The optimization objective can be expressed as the following constrained variational problem:
$$\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t} \right\|_2^2 \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k = f$$

Here, $K$ is the number of decomposed modes, $*$ denotes convolution, $\delta(t)$ is the Dirac function, and $\partial_t$ is the first-order derivative operator.
Introducing the Lagrange multiplier $\lambda(t)$ and the penalty factor $\alpha$, the augmented Lagrangian can be expressed as follows:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$
VMD reconstructs the problem and iterates variables in the Fourier domain. The update steps are as follows:
Modal function spectrum update:
$$\hat{u}_k^{\,n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha (\omega - \omega_k^n)^2}$$
Center frequency update:
$$\omega_k^{\,n+1} = \frac{\int_0^{\infty} \omega \, |\hat{u}_k^{\,n+1}(\omega)|^2 \, d\omega}{\int_0^{\infty} |\hat{u}_k^{\,n+1}(\omega)|^2 \, d\omega}$$
Lagrangian multiplier operator update:
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{\,n+1}(\omega) \right)$$
Repeat updates (8)–(10) and stop the iteration when the convergence condition $\sum_{k=1}^{K} \|\hat{u}_k^{\,n+1} - \hat{u}_k^{\,n}\|_2^2 \,/\, \|\hat{u}_k^{\,n}\|_2^2 < \epsilon$ is met, yielding the $K$ modal functions, where $\epsilon > 0$ is the convergence threshold.
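The Fourier-domain updates above can be sketched as a compact ADMM-style loop. This is a simplified illustration, not a reference VMD implementation: it omits the usual mirror extension of the signal, and the defaults for $\alpha$, $\tau$, and the initial center frequencies are assumptions chosen for demonstration.

```python
import numpy as np

def vmd(f, K, alpha=2000.0, tau=0.1, n_iter=500, tol=1e-7):
    """Minimal VMD sketch: mode, center-frequency, and multiplier updates
    carried out on the positive half-spectrum, then symmetrized."""
    T = len(f)
    freqs = np.arange(T) / T - 0.5                 # fftshifted normalized freqs
    f_hat = np.fft.fftshift(np.fft.fft(f))
    f_hat_plus = f_hat.copy()
    f_hat_plus[: T // 2] = 0                       # keep positive frequencies only
    u_hat = np.zeros((K, T), dtype=complex)
    omega = np.linspace(0.0, 0.5, K + 2)[1:-1]     # spread initial center freqs
    lam = np.zeros(T, dtype=complex)
    for _ in range(n_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]  # Wiener-filter mode update
            u_hat[k] = (f_hat_plus - others + lam / 2) / (
                1 + 2 * alpha * (freqs - omega[k]) ** 2)
            power = np.abs(u_hat[k, T // 2:]) ** 2  # center-of-mass frequency
            omega[k] = (freqs[T // 2:] * power).sum() / (power.sum() + 1e-12)
        lam = lam + tau * (f_hat_plus - u_hat.sum(axis=0))  # dual ascent
        num = sum((np.abs(u_hat[k] - u_prev[k]) ** 2).sum() for k in range(K))
        den = sum((np.abs(u_prev[k]) ** 2).sum() for k in range(K)) + 1e-12
        if num / den < tol:
            break
    # enforce Hermitian symmetry so the time-domain modes are real
    full = np.zeros_like(u_hat)
    full[:, T // 2:] = u_hat[:, T // 2:]
    full[:, 1: T // 2] = np.conj(u_hat[:, :T // 2:-1])
    modes = np.real(np.fft.ifft(np.fft.ifftshift(full, axes=-1), axis=-1))
    order = np.argsort(omega)                      # sort modes by center freq
    return modes[order], omega[order]
```

For a clean two-tone test signal, the recovered modes should sum back to the input once the multiplier update has driven the reconstruction constraint toward satisfaction.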

2.3. Temporal Feature Modeling: CNN and LSTM Integration

To extract local patterns from sequence data, a one-dimensional convolutional neural network (1D CNN) is employed as a shallow encoder. It performs convolution operations on the temporal axis and uses ReLU activation to enhance non-linear representation [22,23]. This configuration enables the network to learn degradation-related regional trends from battery data.
The output of the k-th filter at time t is given by the following:
$$f_k^{(l)}(t) = \sigma \left( \sum_{i=1}^{p} \sum_{j=1}^{K} w_{k,i,j}^{(l)} \cdot x_i^{(l-1)}(t - j + 1) + b_k^{(l)} \right)$$
where x i ( l 1 ) is the input to the ( l 1 ) -th layer, w k , i , j ( l ) is the convolution weight, and σ ( · ) denotes the ReLU function.
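The filter response formula can be checked directly with a naive implementation. The function below is an illustrative sketch (the filter shapes and the "valid" output range are assumptions, not the paper's configuration); for a single channel it reduces to an ordinary discrete convolution:

```python
import numpy as np

def conv1d_relu(x, w, b):
    """Direct evaluation of the filter-response formula:
    x: (p, T) input channels, w: (K, p, kw) filters, b: (K,) biases."""
    p, T = x.shape
    K, _, kw = w.shape
    out = np.zeros((K, T - kw + 1))
    for k in range(K):
        for t in range(T - kw + 1):
            # sum over input channels i and taps j of w[k,i,j] * x_i(t - j + 1)
            acc = sum(w[k, i, j] * x[i, t + kw - 1 - j]
                      for i in range(p) for j in range(kw))
            out[k, t] = max(acc + b[k], 0.0)       # ReLU activation
    return out
```

Because the formula indexes the input as $x_i(t - j + 1)$, the kernel is applied flipped, i.e., this is a true convolution rather than cross-correlation.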
For long-term dependency modeling, LSTM layers are stacked after the CNN. LSTM uses memory cells and gated operations to retain temporal context over long sequences [24]. The integrated framework enhances the model’s capacity to represent short-term variations as well as long-term degradation trends.
The overall behavior of the hybrid module can be expressed as follows:
$$h_t = \mathrm{LSTM}\big( \mathrm{ReLU}\big( \mathrm{Conv1D}(x_t) \big) \big)$$
where x t is the input at time t, and h t is the LSTM output state.
This joint architecture leverages CNN’s local pattern detection and LSTM’s sequence memory to effectively extract temporal features for RUL prediction.

2.4. Key Feature Enhancement Mechanism: Convolutional Block Attention Module (CBAM)

CBAM [25] is a lightweight yet effective attention module that sequentially applies channel and spatial attention to refine feature maps, thereby enhancing the representation of key information. As shown in Figure 1, CBAM integrates both attention mechanisms in a cascaded manner.
Given a feature map F, CBAM sequentially applies channel and spatial attention using the following:
$$F' = M_c(F) \otimes F, \qquad F'' = M_s(F') \otimes F'$$
where ⊗ denotes element-wise multiplication, and M c ( · ) and M s ( · ) represent channel and spatial attention weights, respectively.
Specifically, channel attention aggregates information through average and max pooling operations, which are then processed by a shared MLP:
$$M_c(F) = \sigma \big( \mathrm{MLP}\big( \mathrm{AvgPool}(F) \big) + \mathrm{MLP}\big( \mathrm{MaxPool}(F) \big) \big)$$
Spatial attention is computed by concatenating pooled features along the channel axis and applying a convolution:
$$M_s(F) = \sigma \big( \mathrm{Conv}\big( \big[ \mathrm{AvgPool}(F);\, \mathrm{MaxPool}(F) \big] \big) \big)$$
This module enables the model to adaptively focus on informative channels and spatial regions, improving robustness and prediction accuracy.
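The two attention steps can be sketched in NumPy for a 1D feature map of shape (channels, time), which matches the sequence setting used in this paper. The weight shapes, reduction ratio, and kernel size below are illustrative assumptions; in CBAM these weights are learned during training:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_1d(F, W1, W2, w_conv):
    """F: (C, T) feature map. W1: (C//r, C) and W2: (C, C//r) form the
    shared MLP; w_conv: (2, k) is the spatial-attention convolution kernel."""
    # channel attention: shared MLP applied to avg- and max-pooled descriptors
    avg_c, max_c = F.mean(axis=1), F.max(axis=1)
    Mc = sigmoid(W2 @ np.maximum(W1 @ avg_c, 0) + W2 @ np.maximum(W1 @ max_c, 0))
    Fp = Mc[:, None] * F                           # F' = Mc(F) (x) F
    # spatial attention: conv over channel-pooled [avg; max] descriptors
    desc = np.stack([Fp.mean(axis=0), Fp.max(axis=0)])   # (2, T)
    k = w_conv.shape[1]
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad)))
    Ms = sigmoid(np.array([(w_conv * padded[:, t:t + k]).sum()
                           for t in range(F.shape[1])]))
    return Ms[None, :] * Fp                        # F'' = Ms(F') (x) F'
```

Since both attention maps lie in (0, 1), the refined feature map is an element-wise reweighting that can only attenuate, never amplify, the input features.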

2.5. Fusion Prediction Framework IVCLNet

This paper combines the advantages of multiple modeling components to propose a hybrid lithium-ion battery state-of-health (SOH) prediction and remaining useful life (RUL) estimation framework, called IVCLNet, whose overall structure includes a signal decomposition module and a deep learning prediction module.
The historical capacity sequence up to cycle t is first decomposed using the Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN) method to obtain multiple intrinsic mode functions (IMFs). Then, the modes are frequency-divided using sample entropy and K-means clustering, and the high-frequency components are further decomposed with Variational Mode Decomposition (VMD). All retained modal components from the historical sequence are integrated into a unified multi-modal feature matrix, denoted as C t :
$$C_t \in \mathbb{R}^{N \times K},$$
where N denotes the sliding time window length and K represents the number of retained decomposition modes. This matrix captures the multi-scale degradation dynamics of the battery using only the information available up to cycle t.
The C t features are then concatenated with six indirect health indicators (HIs) that are highly correlated with SOH, including constant-current charging time (CCCT), constant-voltage charging time (CVCT), constant-current discharging time (CCDT), time interval of equal charging voltage difference (TIECVD), time interval of equal discharging voltage difference (TIEDVD), and mean voltage fade (MVF), forming the final model input:
$$X_t \in \mathbb{R}^{N \times (K + M)},$$
where M = 6 is the number of selected health indicators. By concatenating the HIs along the feature dimension with C t , the CNN–LSTM–CBAM module can jointly learn both the historical degradation patterns captured by the decomposition features and the auxiliary health information provided by the HIs. This integration enables the model to leverage complementary information correlated with SOH, thereby improving prediction accuracy and robustness, particularly during early degradation stages or under noisy capacity conditions.
During the prediction phase, the CNN–LSTM–CBAM deep learning module directly models the multi-modal input and outputs the predicted SOH at a future cycle t + Δ , denoted as y ^ t + Δ . Based on a predefined end-of-life (EOL) threshold (SOH = 80% of the initial capacity), the predicted SOH degradation trajectory can be further transformed into the corresponding remaining useful life (RUL), defined as the number of remaining cycles until the predicted SOH reaches the EOL criterion.
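The SOH-to-RUL conversion described above amounts to locating the first crossing of the EOL threshold in the predicted trajectory. A minimal sketch (function name and return convention are illustrative):

```python
def rul_from_soh(soh_traj, eol=0.80):
    """RUL = number of cycles ahead at which the predicted SOH trajectory
    first falls to or below the EOL threshold (80% of initial capacity).
    Returns None if the threshold is not reached within the horizon."""
    for cycles_ahead, soh in enumerate(soh_traj, start=1):
        if soh <= eol:
            return cycles_ahead
    return None
```

Here `soh_traj` is the predicted SOH sequence for cycles t+1, t+2, ... produced by the deep learning module.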
As shown in Figure 2, the proposed SOH prediction and RUL estimation method comprises four sequential stages.
This comprehensive pipeline ensures that each stage—from signal decomposition to attention-guided temporal modeling—contributes to a coherent and interpretable RUL estimation framework.

3. Health Feature Extraction and Processing

3.1. Data Description

This paper uses widely recognized battery datasets provided by NASA and CALCE, both of which are commonly adopted in LIB-related investigations, as shown in Figure 3.

3.1.1. NASA Lithium Battery Dataset

In this study, capacity degradation data from three batteries, B5, B6, and B7, rated at 2 Ah from the NASA PCoE Research Center, were used [26]. The aging process was accelerated through repeated cycling, conducted at room temperature in three stages, as follows: first, charging at a constant current of 1.5 A (0.75 C) up to 4.2 V; then switching to constant-voltage charging until the current decreased to 20 mA; finally, discharging at a constant current of 2 A until the voltage reached 2.7 V, 2.5 V, and 2.2 V for B5, B6, and B7, respectively. The EOL criterion was set at a 30% decay of the rated capacity, i.e., to 1.4 Ah [27]. The selected NASA batteries were aged under well-controlled conditions, allowing for a reproducible evaluation of the proposed IVCLNet framework. While real-life operating profiles may involve variable currents, partial charge/discharge cycles, and fluctuating temperatures, the framework is designed to be general and can be adapted to accommodate more diverse conditions in future studies.

3.1.2. CALCE Lithium Battery Dataset

The CALCE lithium-ion battery dataset was obtained from the University of Maryland Center for Advanced Life Cycle Engineering (CALCE). The rated capacity of the battery is 1.1 Ah. The aging process was accelerated through constant-current and constant-voltage cycling, conducted at room temperature, with an initial constant-current charge of 0.55 A (0.5 C) up to 4.2 V, then continued in constant-voltage mode until the current dropped to 55 mA, followed by discharge in constant-current mode at 2 A to a cutoff voltage of 2.7 V. The change in capacity directly reflects the degree of degradation of the battery during the charge–discharge cycle. As the number of cycles increases, the capacity of the battery decreases, so it can be considered a key health indicator for predicting the RUL of a battery based on its performance degradation. Similarly, the CALCE batteries were aged under controlled conditions, ensuring reproducibility. The IVCLNet framework is general and can be applied to more complex real-world operating conditions in future work.

3.2. Health Indicator Extraction

This experiment draws on a series of prior studies [28,29,30,31,32] to extract health indicators (HIs). The purpose of extracting HIs is to obtain a concise characterization from complex raw data: capacity is the standard metric for evaluating battery performance, but it cannot be measured directly during online operation. The high correlation between the extracted HIs and capacity indicates that they can effectively characterize the state of the battery.
As shown in Figure 4, the voltage undergoes significant changes as the number of cycles increases. Based on these changes, several indirect health indicators (HIs) were extracted through sliding-window analysis and multi-segment division of the voltage-time characteristics in constant-current and constant-voltage modes. Finally, Pearson correlation coefficient (PCC) analysis was conducted to assess these candidates, leading to the following selected HIs: constant-current charging time (CCCT), constant-voltage charging time (CVCT), constant-current discharging time (CCDT), time interval of equal charging voltage difference (TIECVD), time interval of equal discharging voltage difference (TIEDVD), and mean voltage fade (MVF).
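For instance, CCCT and CVCT can be read off a charging record by locating the transition from the constant-current to the constant-voltage phase. The sketch below is an illustrative helper (its name, signature, and thresholds are assumptions chosen to mirror the protocols of Section 3.1, not the paper's extraction code):

```python
import numpy as np

def charge_phase_times(t, v, i, v_max=4.2, i_cut=0.02):
    """CCCT: elapsed time until the voltage first reaches v_max (end of the
    constant-current phase); CVCT: remaining time until the current falls
    to i_cut (end of the constant-voltage phase)."""
    cc_end = int(np.argmax(v >= v_max))            # first sample at v_max
    cv_end = cc_end + int(np.argmax(i[cc_end:] <= i_cut))
    ccct = t[cc_end] - t[0]
    cvct = t[cv_end] - t[cc_end]
    return ccct, cvct
```

As the cell ages, rising internal resistance shifts this transition earlier, so CCCT shrinks while CVCT grows, which is why these durations track capacity fade.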
PCC is a commonly employed statistical method to quantify the degree of linear association between HIs and SOHs [33]. Its values span from −1 to 1, where a magnitude closer to 1 indicates a stronger linear dependence between the two variables. The mathematical expression for PCC is as follows:
$$\mathrm{PCC}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \cdot \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
where $X_i$ and $Y_i$ represent the values of the selected HI sequence and the battery SOH sequence, respectively; $\bar{X}$ and $\bar{Y}$ are their corresponding means, and $n$ is the sample size.
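The PCC formula translates directly into code and agrees with NumPy's built-in correlation coefficient:

```python
import numpy as np

def pcc(x, y):
    """Pearson correlation coefficient between an HI sequence and SOH."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()            # center both sequences
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
```

A value whose magnitude approaches 1 indicates a near-linear relationship between the indicator and SOH, which is the selection criterion applied in Table 1.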
As shown in Table 1, the PCCs of the extracted HI values are all higher than 0.95, indicating a high degree of linear correlation with the battery degradation trend.
The high correlation between the selected health indicators (HIs) and state of health (SOH) can be attributed to the intrinsic electrochemical and physical processes during battery degradation. Specifically, the selected indicators include CCDT, CCCT, CVCT, TIECVD, TIEDVD, and MVF.
CCDT, CCCT, and CVCT reflect the duration of charge and discharge under constant-current or constant-voltage conditions. As the battery ages, internal resistance increases and effective capacity decreases, leading to longer charging/discharging times to reach the same voltage thresholds; hence, these time-based indicators are directly sensitive to capacity fade.
TIECVD and TIEDVD capture intervals corresponding to equal voltage differences during charging or discharging. These metrics effectively reflect subtle changes in voltage dynamics caused by electrode polarization and the growth of internal resistance, which indirectly indicate battery health deterioration.
MVF measures the overall decline in average voltage over cycles, integrating the cumulative effects of material degradation and electrolyte aging, thereby providing a global indicator of battery performance.
It should be noted that these HIs are particularly effective under full-cycle conditions, as represented by the NASA and CALCE datasets used in this study. In real-life applications, batteries often operate under non-repetitive profiles with varying depths of discharge, and constant-voltage charging may not always be applied. In such cases, the linear correlation between certain time-based indicators (CCCT, CVCT, CCDT) and battery capacity may be reduced. Nevertheless, the proposed IVCLNet framework is flexible and can accommodate alternative or additional HIs extracted from partial cycles or other measurable voltage/current features, allowing application to a broader range of operating conditions.
In summary, these selected HIs encode both direct effects (capacity loss, resistance increase) and indirect effects (voltage fluctuation patterns) of battery aging, which explains their high linear correlation with SOH as quantified by the PCC values in Table 1.

3.3. Dataset Decomposition

To extract frequency-specific features, the raw capacity data is decomposed using a two-stage ICEEMDAN-VMD strategy. First, ICEEMDAN produces five IMFs and a residual that collectively represent the signal’s multi-scale behavior.
Sample entropy is calculated for each IMF, followed by K-means clustering to categorize them into high-, medium-, and low-frequency groups, corresponding to different temporal dynamics.
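The complexity score used for this grouping can be computed with a plain (unoptimized) sample-entropy routine. The sketch below uses the common Chebyshev-distance formulation with tolerance $r = 0.2\,\sigma$; the parameter choices are illustrative:

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A/B), where B and A count pairs of templates of length
    m and m+1 that match within tolerance r (Chebyshev distance)."""
    x = np.asarray(x, float)
    r = r_factor * x.std()
    n = len(x)

    def match_count(mm):
        tpl = np.array([x[i:i + mm] for i in range(n - mm)])
        c = 0
        for i in range(len(tpl)):
            d = np.max(np.abs(tpl - tpl[i]), axis=1)
            c += int(np.sum(d <= r)) - 1           # exclude the self-match
        return c

    B, A = match_count(m), match_count(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf
```

Regular (low-frequency) IMFs yield low sample entropy while noisy high-frequency IMFs yield high values, which is what allows K-means to separate the frequency groups.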
The high-frequency group undergoes further VMD decomposition to enhance its resolution. Finally, selected IMFs and the residuals are aligned and concatenated to form a feature matrix of size N × 5 .
After the two-stage ICEEMDAN–VMD decomposition, all selected intrinsic mode functions (IMFs) and residual components are aligned and reconstructed into a multi-scale feature matrix, denoted as Co _ data . This feature matrix is then concatenated with the extracted health indicators (HIs) described in Section 3.2 to form the final model input X t . In this way, the proposed method jointly integrates frequency domain decomposition features and time-domain degradation indicators, ensuring that both physical interpretability and statistical sensitivity to SOH degradation are retained before entering the deep learning module.
To avoid any form of data leakage, it is important to emphasize that the ICEEMDAN–VMD decomposition is applied only to the historical portion of each capacity sequence available within the prediction horizon. During both training and inference, the decomposition operates on the observed capacity data up to the current cycle t, without accessing any future capacity information ( t > t ). The resulting multi-scale feature matrix Co _ data , therefore, represents historical signal characteristics rather than the target variable itself. This design ensures that the model input strictly depends on observable degradation trends, maintaining the integrity and causality of the prediction process.
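The causality constraint can be made concrete with a window builder that only ever indexes past cycles when forming a training or inference sample. The function below is a sketch; the (features, target) layout and names are assumptions for illustration:

```python
import numpy as np

def sliding_windows(features, target, N, delta=1):
    """Causal sample construction: the window ending at cycle t covers
    rows [t-N+1, t] only, and the label is the target at cycle t+delta,
    so no future information leaks into the input."""
    X, y = [], []
    for t in range(N - 1, len(features) - delta):
        X.append(features[t - N + 1: t + 1])       # last N observed cycles
        y.append(target[t + delta])                # future SOH label
    return np.array(X), np.array(y)
```

Applied to the concatenated [Co_data; HI] rows, this yields inputs of shape (samples, N, K + M) matching the definition of X_t above.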

4. Experiments and Analysis

This section presents a detailed analysis of the performance of four models (IVCLNet, Transformer, LSTM-Transformer, and LSTM) on publicly available battery datasets. In the experiments, 50% of each battery's data is designated for model training, while the remaining portion is used for testing. The results below demonstrate the effectiveness of the proposed framework.
The training process is conducted on a GIGABYTE AORUS PC with an Intel Core i5-13600KF processor (24 MB cache, up to 5.10 GHz), an NVIDIA GeForce RTX 4060Ti graphics card, and 32 GB of RAM. The experiments were implemented using MATLAB R2023b and PyTorch 2.1.0 with CUDA Toolkit 12.2 for GPU acceleration.

4.1. The Evaluation Criteria

To evaluate the prediction accuracy of the model, a set of commonly used metrics was employed to quantitatively examine its performance: mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination R 2 . Here, n is the number of periods from the start of prediction to the end of the cycle, y i is the true capacity, and y ^ i is the predicted capacity. The formulas for these metrics are as follows:
$$\mathrm{MSE}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$

$$\mathrm{RMSE}(y, \hat{y}) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

$$\mathrm{MAE}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

$$\mathrm{MAPE}(y, \hat{y}) = \frac{100\%}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{\max(\varepsilon, |y_i|)}$$
These four error metrics take values in $[0, +\infty)$, with larger errors producing larger values; the smaller the MSE, RMSE, MAE, and MAPE, the more reliable and accurate the predictions are.
$$R^2(y, \hat{y}) = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
The goodness of the model can be assessed based on the value of R 2 . The closer the result is to 1, the smaller the model error.
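The five metrics translate directly into code; the small constant $\varepsilon$ guards the MAPE denominator exactly as in the formula above:

```python
import numpy as np

def regression_metrics(y, y_hat, eps=1e-8):
    """Compute MSE, RMSE, MAE, MAPE (%), and R^2 for capacity predictions."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    err = y - y_hat
    mse = np.mean(err ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "MAPE": 100.0 * np.mean(np.abs(err) / np.maximum(eps, np.abs(y))),
        "R2": 1.0 - (err ** 2).sum() / ((y - y.mean()) ** 2).sum(),
    }
```

A perfect prediction drives the four error metrics to zero and $R^2$ to one, consistent with the interpretation given above.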

4.2. SOH Prediction and RUL Estimation

Although the model outputs the predicted SOH trajectory, the RUL can be derived by identifying the cycle index at which the SOH curve crosses the 80% threshold. Therefore, the evaluation metrics in this section reflect the accuracy of the SOH-based degradation prediction, which directly determines the reliability of RUL estimation.

4.2.1. Hyperparameter Setting

The specific parameter configurations for each model are listed in Table 2. To ensure fair comparison between models, all models were evaluated using uniform training hyperparameters and structural hyperparameters. In the table, e, l r , b, d e , d d , and n denote, in order, the number of iterations, learning rate, batch size, embedding size, dimensionality of the MLP’s dense layer or the LSTM’s hidden state, and the total number of layers.
All models were trained using the Adam optimizer with a maximum of 300 epochs. The gradient threshold was set to 1 to prevent gradient explosion, and L2 regularization of 0.0001 was applied to mitigate overfitting. The initial learning rate was 0.0001, and training samples were shuffled at every epoch to enhance generalization. Loss values were monitored during training, and verbose output was enabled to display progress. All models used the mean squared error (MSE) as the loss function. These settings ensured consistent training across all models and enabled fair comparison of performance.

4.2.2. Comparison of the NASA and CALCE Datasets

A comparative experiment was conducted between IVCLNet and three alternative models, whose layer configurations are shown in Table 3.
It should be noted that the comparison models were deliberately selected to represent the key categories of existing approaches rather than to exhaustively include all available architectures. Specifically, they cover three dominant paradigms in data-driven battery prognostics: (1) recurrent sequence learning (e.g., LSTM); (2) hybrid convolutional–recurrent modeling (e.g., CNN–LSTM); and (3) attention-based temporal learning (e.g., Transformer). This selection ensures that the benchmark comparison spans both fundamental and state-of-the-art methodologies while maintaining fairness and computational consistency. Therefore, the chosen set of baselines is sufficiently comprehensive to validate the effectiveness of IVCLNet without introducing redundant architectures.
Figure 5 and Table 4 present the prediction results and errors of IVCLNet, Transformer, LSTM-Transformer, and LSTM on the NASA dataset and CALCE dataset, respectively. As shown in the figure, in the validation results for the CS2-35 battery, although all models exhibited some degree of error accumulation, IVCLNet maintained stable prediction performance. Compared to other models, IVCLNet exhibited smaller fluctuations in prediction error magnitude and distribution, demonstrating higher prediction accuracy and robustness.
As shown in Table 4, in the lithium-ion battery life prediction task, the proposed IVCLNet model significantly outperforms the baseline deep learning methods, including Transformer, LSTM, and LSTM-Transformer. Specifically, IVCLNet achieves the lowest average prediction error across all battery cells. Compared to those models, its MAE improves by an average of 76.44%, 77.01%, and 72.41%, its RMSE by 71.12%, 71.48%, and 65.93%, its MSE by 62.55%, 61.51%, and 49.32%, and its MAPE by 77.46%, 77.99%, and 73.51%, respectively. Notably, IVCLNet demonstrates exceptional performance in applications requiring long-term predictions.
Due to the lack of an explicit local feature extraction mechanism, LSTM has certain limitations in identifying fine-grained degradation patterns, resulting in relatively low overall prediction accuracy. Although the Transformer model has global modeling capabilities, without the introduction of auxiliary structures, its perception of local features is weak, causing large fluctuations in the error curve. The LSTM-Transformer fusion model demonstrates greater stability and robustness across multiple datasets, capable of capturing both local and global temporal dependencies simultaneously. However, its local modeling remains constrained by the structural expressive power of LSTM, and the absence of a convolutional perception mechanism limits its ability to characterize complex feature patterns. Introducing a structure with a larger receptive field may improve performance, but it would also increase computational costs.
Furthermore, from the perspective of input modeling, the most significant difference between IVCLNet and other models lies in its joint ICEEMDAN-VMD decomposition processing of the original capacity sequence prior to prediction, which separates the mixed noise and degradation trends in the original SOH sequence to improve modeling clarity. The introduction of VMD processing in the high-frequency component further refines complex oscillation patterns and enhances the model’s ability to perceive local nonlinear features.
In contrast, the other baseline models are affected by high-frequency noise and sudden changes, making it difficult for them to capture the true degradation patterns. This difference is particularly evident in typical samples such as the CS2-35 battery, where the error curves of the undecomposed models fluctuate dramatically and their long-term prediction performance deteriorates significantly.
Meanwhile, after decomposition-based feature extraction, IVCLNet strengthens its focus on key time steps and significant feature channels by introducing a convolutional layer with the CBAM attention mechanism, thereby reinforcing effective information at the input end. In contrast, although the LSTM structure has sequential modeling capabilities, it lacks an explicit spatial perception mechanism; and although the Transformer has global modeling capabilities, it struggles to focus on details without a local guidance structure. While LSTM-Transformer partially alleviates this issue, it still performs no preprocessing at the input data level, so residual noise continues to interfere with predictions.
Although the comparison does not include every variant reported in recent literature, the selected baselines already capture the key modeling paradigms used in battery RUL prediction, ensuring that the demonstrated improvements of IVCLNet are both representative and meaningful.
In summary, IVCLNet combines data decomposition-based feature enhancement capabilities with attention-guided local-global modeling mechanisms in its structural design, resulting in stronger adaptability and generalization capabilities. It can significantly improve prediction accuracy and stability, especially in samples with high signal noise interference or complex degradation patterns.

4.2.3. Potential Health Indicators Under Varying Conditions

Although this study primarily focuses on constant operational conditions, the IVCLNet framework can accommodate alternative health indicators (HIs) extracted from measurable voltage, current, and temperature features under variable environments. Specifically, several potential HIs have been reported in the literature as robust across varying conditions, including the following: (1) differential voltage–capacity (dQ/dV) peak shift; (2) temperature rise rate during dynamic discharge; and (3) internal resistance growth estimated from voltage–current transients. These indicators capture electrochemical degradation behaviors that are less sensitive to specific cycling protocols, making them promising candidates for integration into IVCLNet.
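As a purely illustrative sketch (not taken from the paper), the first of these indicators, the dQ/dV peak shift, can be extracted from a charge curve by finite differences; the voltage/capacity arrays below are synthetic stand-ins for measured data:

```python
import math

# Illustrative sketch: estimate the dQ/dV peak voltage and its shift between
# a fresh and an aged cycle. The charge curves here are synthetic sigmoids,
# with the steep capacity rise near 3.7 V when fresh, drifting toward 3.8 V
# (with capacity loss) as the cell ages.
def dqdv_peak_voltage(voltage, capacity):
    """Voltage at which dQ/dV is largest, via finite differences."""
    dqdv = [(capacity[i + 1] - capacity[i]) / (voltage[i + 1] - voltage[i])
            for i in range(len(voltage) - 1)]
    i_max = max(range(len(dqdv)), key=dqdv.__getitem__)
    return 0.5 * (voltage[i_max] + voltage[i_max + 1])  # midpoint of the step

v = [3.0 + 0.01 * k for k in range(120)]
q_fresh = [1.0 / (1.0 + math.exp(-(x - 3.7) / 0.05)) for x in v]
q_aged = [0.9 / (1.0 + math.exp(-(x - 3.8) / 0.05)) for x in v]

shift = dqdv_peak_voltage(v, q_aged) - dqdv_peak_voltage(v, q_fresh)
print(f"dQ/dV peak shift: {shift:+.2f} V")  # positive shift of about 0.10 V
```

Such a scalar per-cycle indicator could feed IVCLNet in place of, or alongside, the constant-condition HIs of Table 1.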
To preliminarily verify the framework’s adaptability, a small-scale demonstration was conducted using partially sampled dynamic loading data from the NASA B0005 cell. The results confirm that IVCLNet can process and learn from these alternative HIs without requiring structural modifications, validating the flexibility of the proposed architecture under varying operational conditions.

4.2.4. Efficiency Comparison

In practical applications, the computational efficiency of a model is as important as its prediction accuracy. We, therefore, evaluate the efficiency of IVCLNet and the baseline models in terms of both training cost and hardware cost. The evaluation is based on four widely used metrics:
  • Floating-point operations (FLOPs), which quantify the computational complexity of a single forward pass.
  • Training time, which reflects the actual runtime required for model convergence.
  • Number of parameters, indicating the total trainable weights and thus memory demand.
  • Storage size, representing the practical deployment cost on edge devices.
These metrics jointly capture both algorithmic complexity (FLOPs and parameters) and system-level cost (training time and storage), providing a comprehensive evaluation of computational efficiency. To ensure fairness, all models were trained with the same hyperparameters—a batch size of 25, a learning rate of 0.001, and 300 epochs, using the Adam optimizer and MSE loss function, as summarized in Table 2.
Model-specific configurations, including hidden dimensions, embedding sizes, and attention heads, are listed separately in Table 3. LSTM-based models adopt the same hidden dimensions and layer numbers, while Transformer-based models use standardized embedding sizes, attention heads, and encoder/decoder depths. For IVCLNet, the CNN-LSTM backbone with CBAM attention is applied, and the input feature dimensions are aligned with those of the baseline models to ensure comparability.
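As a rough cross-check of the parameter metric, the trainable weights of a single LSTM layer follow the standard four-gate formula. The dimensions below are illustrative only; the exact counts in Table 5 also depend on input width, the surrounding linear layers, and the framework's bias conventions:

```python
# Standard trainable-parameter count for one LSTM layer: each of the four
# gates holds an input weight matrix, a recurrent weight matrix, and a bias.
def lstm_layer_params(d_in: int, d_hidden: int) -> int:
    return 4 * (d_in * d_hidden + d_hidden * d_hidden + d_hidden)

# Illustrative: a layer with scalar input (e.g., capacity) and 32 hidden units.
print(lstm_layer_params(1, 32))  # → 4352
```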
Experimental results show that IVCLNet maintains competitive efficiency. Although the inclusion of CNN and attention modules slightly increases FLOPs and parameter counts compared with plain LSTM, the training time remains comparable, and the storage overhead is moderate. More importantly, the accuracy gain achieved by IVCLNet significantly outweighs this minor cost increase, highlighting a favorable trade-off between computational efficiency and predictive performance.
Under nearly identical configurations and the same dataset, as shown in Table 5, the plain LSTM network exhibits the shortest training time among the baseline models due to its inherent sequential structure; however, this architecture limits its ability to capture complex degradation patterns, resulting in lower prediction accuracy. In contrast, IVCLNet achieves a favorable trade-off between accuracy and computational cost. Specifically, although the FLOPs and number of parameters of IVCLNet are 92.6% and 50.7% higher than those of LSTM, respectively, the training time only increases by 25.8%, and the storage requirement increases moderately by 12.1%. Compared with the Transformer model, IVCLNet has 67.7% higher FLOPs and 36.4% more parameters, yet the training time is reduced by 19.4% and storage by 7.7%. Additionally, when compared with the LSTM-Transformer, IVCLNet achieves 14.8%, 14.4%, 26.8%, and 22.7% reductions in FLOPs, parameters, training time, and storage size, respectively, while maintaining superior predictive performance.

4.2.5. Ablation Experiment

To quantitatively evaluate the contribution of each component in IVCLNet, a set of ablation experiments was conducted on the NASA battery dataset (Battery B0005). Three reduced variants were compared with the proposed full model, as follows: (1) w/o ICEEMDAN, where the raw capacity sequence is directly used without the ICEEMDAN decomposition; (2) w/o VMD, where only ICEEMDAN is applied as the single-stage decomposition; and (3) w/o CBAM, where the CNN–LSTM module is used without the attention mechanism.
All models were trained under identical settings to ensure fair comparison. Table 6 reports the evaluation results for five common regression metrics: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). The results demonstrate that each module contributes positively to the final performance. In particular, ICEEMDAN and VMD jointly enhance multi-scale signal decomposition, while CBAM adaptively refines spatial–temporal feature representations, further improving prediction accuracy and robustness.
The results clearly indicate that removing any of the three modules leads to a noticeable performance degradation. Compared with the baseline model without decomposition (w/o ICEEMDAN), IVCLNet achieves a 52.7% reduction in RMSE and improves R² from 0.897 to 0.974, confirming the effectiveness of the multi-stage signal decomposition strategy. Moreover, introducing the CBAM attention mechanism provides an additional 18.8% improvement in RMSE compared with the plain CNN–LSTM model (w/o CBAM), highlighting its ability to enhance attention-guided feature learning. Overall, the ablation analysis demonstrates that ICEEMDAN, VMD, and CBAM contribute synergistically to improving the accuracy, robustness, and interpretability of battery SOH prediction and RUL estimation.
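The two RMSE reductions quoted above can be recomputed directly from the RMSE column of Table 6:

```python
# Reductions reported in the ablation analysis, recomputed from Table 6 RMSE.
rmse = {"w/o ICEEMDAN": 0.0110, "w/o CBAM": 0.0064, "IVCLNet": 0.0052}

drop_vs_no_decomp = 100 * (1 - rmse["IVCLNet"] / rmse["w/o ICEEMDAN"])
drop_vs_no_cbam = 100 * (1 - rmse["IVCLNet"] / rmse["w/o CBAM"])
print(f"{drop_vs_no_decomp:.1f}%")  # → 52.7%
print(f"{drop_vs_no_cbam:.1f}%")    # → 18.8%
```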

5. Conclusions

This paper presents IVCLNet, a hybrid deep learning framework integrating ICEEMDAN-VMD signal decomposition with an attention-augmented CNN-LSTM architecture, tailored for the accurate prediction of the remaining useful life (RUL) of lithium-ion batteries. By leveraging hierarchical frequency decomposition and channel-wise attention mechanisms, the proposed model effectively captures both global degradation trends and local fluctuation patterns. Experimental results on two publicly available datasets demonstrate that IVCLNet achieves lower prediction errors compared to several baseline models, particularly under complex degradation dynamics. The proposed framework performs SOH trajectory prediction and estimates RUL indirectly by identifying the end-of-life point, ensuring consistency between model prediction and practical prognostic interpretation.
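The indirect RUL estimation described here reduces to a simple threshold search over the predicted SOH trajectory; the sketch below uses a synthetic linear-fade sequence purely for illustration:

```python
# Minimal sketch of the indirect RUL estimation used in this work: predict
# the SOH trajectory, then count the cycles until it first falls below the
# end-of-life threshold (SOH = 80%). The predicted sequence is synthetic.
EOL_THRESHOLD = 0.80

def rul_from_soh(soh_pred):
    """Cycles until the predicted SOH first drops below the EOL threshold."""
    for k, soh in enumerate(soh_pred):
        if soh < EOL_THRESHOLD:
            return k
    return None  # EOL not reached within the prediction horizon

soh_pred = [0.85 - 0.004 * k for k in range(30)]  # illustrative linear fade
print(rul_from_soh(soh_pred))  # → 13
```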
Despite its promising performance, the method has several limitations. First, the current approach relies on fixed decomposition parameters, which may affect its adaptability across diverse battery types or operating conditions. Second, while the model exhibits robustness on the test sets, its generalization capability under transfer learning scenarios remains unexplored. In addition, the feature fusion strategy primarily focuses on temporal alignment, without incorporating potential cross-feature dependencies.
Future work will focus on enhancing model adaptability by introducing dynamic decomposition parameter selection and improving generalization through domain-aware training strategies. In addition, more efficient feature integration modules and lightweight model variants will be explored to support real-time deployment in battery management systems. Furthermore, since the IVCLNet framework is modular and feature-agnostic, it can flexibly accommodate alternative or additional health indicators (HIs) derived from voltage, current, temperature, or partial-cycle features under varying operational conditions. This flexibility provides a theoretical and practical foundation for extending the model to real-world scenarios with fluctuating environments, such as variable temperature and dynamic load profiles. Future studies will also validate the proposed framework under these varying conditions to demonstrate its adaptability and robustness.

Author Contributions

Methodology, S.K.; Resources, H.H.; Writing—original draft, Y.P.; Writing—review & editing, Y.G.; Visualization, J.X.; Project administration, H.H.; Funding acquisition, H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (61672210), the Major Science and Technology Program of Henan Province (221100210500), and the Central Government Guiding Local Science and Technology Development Fund Program of Henan Province (Z20221343032).

Data Availability Statement

The experimental data used in this paper are from the National Aeronautics and Space Administration (NASA) Prognostics Center of Excellence and the Center for Advanced Life Cycle Engineering (CALCE).

Conflicts of Interest

The authors declare no competing interests.

Figure 1. CBAM network structure diagram.
Figure 2. Flowchart of the developed SOH prediction and RUL estimation approach.
Figure 3. Two different kinds of battery datasets.
Figure 4. Voltage curves and voltage intervals for different SOHs.
Figure 5. The experimental results of different models on the NASA and CALCE datasets.
Table 1. Comparison of PCC values for various HIs across NASA and CALCE datasets.

| Battery Series | CCDT | CCCT | CVCT | TIECVD | TIEDVD | MVF |
|---|---|---|---|---|---|---|
| B0005 | 0.9999 | 0.9980 | −0.9735 | 0.9971 | 0.9989 | −0.9859 |
| B0006 | 0.9999 | 0.9948 | −0.9697 | 0.9928 | 0.9958 | −0.9655 |
| B0007 | 0.9999 | 0.9980 | −0.9845 | 0.9918 | 0.9988 | −0.9668 |
| CS2-35 | 0.9999 | 0.9967 | −0.9833 | 0.9954 | 0.9774 | −0.9540 |
| CS2-36 | 0.9999 | 0.9976 | −0.9938 | 0.9960 | 0.9815 | −0.9780 |
| CS2-37 | 0.9999 | 0.9968 | −0.9649 | 0.9956 | 0.9771 | −0.9370 |
Table 2. Hyperparameters for different models. The first three columns are training hyperparameters; the remaining three are model hyperparameters.

| Models | e (epochs) | lr (learning rate) | b (batch size) | d_e (embedding size) | d (hidden size) | n (layers) |
|---|---|---|---|---|---|---|
| IVCLNet | 300 | 0.001 | 25 | - | 32 | 2 |
| Transformer | 300 | 0.001 | 25 | 64 | 32 | 2 |
| LSTM | 300 | 0.001 | 25 | - | 32 | 2 |
| LSTM-Transformer | 300 | 0.001 | 25 | 64 | 32 | 2 |
Table 3. Layer configurations of different deep learning models.

| IVCLNet | Transformer | LSTM | LSTM-TF |
|---|---|---|---|
| Input | Input | Input | Input |
| Conv1D(1 × 3, 32) | Embed layer (32) | Linear | LSTM(64) |
| BN + ReLU | Positional Encoder | LSTM(32) | LSTM(32) |
| CBAM Attention | Softmax Attention | LSTM(32) | Softmax Attention |
| LSTM(64) | Add&Norm | LSTM(32) | Add&Norm |
| LSTM(32) | Softmax Attention | Linear | Feedforward |
| Linear | Add&Norm | | Linear |
| | MLP | | |
| | Add&Norm | | |
Table 4. RUL prediction results on the NASA and CALCE datasets.

| Cell | Methods | MAE | RMSE | MSE | MAPE | R² |
|---|---|---|---|---|---|---|
| B0005 | IVCLNet | 0.0030 | 0.0052 | 0.0001 | 0.0043 | 0.9803 |
| | Transformer | 0.0139 | 0.0169 | 0.0002 | 0.0206 | 0.8127 |
| | LSTM | 0.0139 | 0.0165 | 0.0002 | 0.0205 | 0.8197 |
| | LSTM-Transformer | 0.0100 | 0.0126 | 0.0001 | 0.0148 | 0.8906 |
| B0006 | IVCLNet | 0.0048 | 0.0089 | 0.0001 | 0.0071 | 0.9641 |
| | Transformer | 0.0154 | 0.0201 | 0.0004 | 0.0244 | 0.8288 |
| | LSTM | 0.0147 | 0.0198 | 0.0003 | 0.0234 | 0.8303 |
| | LSTM-Transformer | 0.0119 | 0.0159 | 0.0002 | 0.0189 | 0.8901 |
| B0007 | IVCLNet | 0.0026 | 0.0041 | 0.0001 | 0.0035 | 0.9805 |
| | Transformer | 0.0075 | 0.0095 | 0.0001 | 0.0103 | 0.9002 |
| | LSTM | 0.0089 | 0.0109 | 0.0001 | 0.0121 | 0.8715 |
| | LSTM-Transformer | 0.0073 | 0.0090 | 0.0001 | 0.0099 | 0.9114 |
| CS2-35 | IVCLNet | 0.0032 | 0.0048 | 0.0001 | 0.0042 | 0.9966 |
| | Transformer | 0.0277 | 0.0446 | 0.0019 | 0.0401 | 0.7359 |
| | LSTM | 0.0233 | 0.0360 | 0.0012 | 0.0336 | 0.8266 |
| | LSTM-Transformer | 0.0212 | 0.0336 | 0.0011 | 0.0305 | 0.8441 |
| CS2-36 | IVCLNet | 0.0041 | 0.0058 | 0.0001 | 0.0054 | 0.9927 |
| | Transformer | 0.0224 | 0.0300 | 0.0009 | 0.0306 | 0.8267 |
| | LSTM | 0.0204 | 0.0277 | 0.0007 | 0.0278 | 0.8476 |
| | LSTM-Transformer | 0.0171 | 0.0233 | 0.0005 | 0.0231 | 0.8879 |
| CS2-37 | IVCLNet | 0.0032 | 0.0046 | 0.0001 | 0.0042 | 0.9940 |
| | Transformer | 0.0133 | 0.0184 | 0.0003 | 0.0177 | 0.9089 |
| | LSTM | 0.0155 | 0.0202 | 0.0004 | 0.0206 | 0.8922 |
| | LSTM-Transformer | 0.0156 | 0.0205 | 0.0004 | 0.0207 | 0.8892 |
Table 5. Comparison of computational efficiency between IVCLNet and baseline models.

| Models | FLOPs (Million) | Training Time (s) | Parameters | Storage Size (KB) |
|---|---|---|---|---|
| IVCLNet | 0.52 | 11.2 | 9843 | 42.7 |
| Transformer | 0.31 | 13.9 | 7215 | 46.3 |
| LSTM | 0.27 | 8.9 | 6527 | 38.1 |
| LSTM-Transformer | 0.61 | 15.3 | 11,492 | 55.2 |
Table 6. Ablation analysis of IVCLNet components on the NASA dataset (Battery B0005).

| Model Variant | MAE | MSE | RMSE | MAPE (%) | R² |
|---|---|---|---|---|---|
| w/o ICEEMDAN | 0.0089 | 1.21 × 10⁻⁴ | 0.0110 | 1.92 | 0.897 |
| w/o VMD | 0.0067 | 6.58 × 10⁻⁵ | 0.0081 | 1.46 | 0.925 |
| w/o CBAM | 0.0054 | 4.10 × 10⁻⁵ | 0.0064 | 1.21 | 0.946 |
| IVCLNet | 0.0035 | 2.70 × 10⁻⁵ | 0.0052 | 0.65 | 0.974 |

