1. Introduction
In recent years, lithium-ion batteries, as a primary representative of clean energy, have found widespread application in various fields, including electric ships (which encompass all-electric propulsion systems, conventional fuel-driven systems, and, as an intermediate solution, hybrid drive systems) [1], as well as electric vehicles, owing to their high energy density, absence of a memory effect, and overall reliability. Moreover, with the development of new technologies for mobile systems, reducing battery mass has become increasingly important [2], which further favors lithium-ion batteries in applications requiring light weight and high performance. For comparison, Table 1 lists the key characteristics of several battery types used in electric ships and vehicles. However, over time, aging mechanisms within the battery (e.g., loss of active material) degrade performance and may even create safety hazards. Therefore, there is an urgent need for an effective battery management system that monitors the lithium-ion battery lifespan in real time [3]. Currently, battery capacity prediction methods can be broadly categorized into three types: experimental measurement-based methods, model-based methods, and data-driven methods.
Experimental measurement-based methods utilize battery parameters (e.g., resistance, impedance, and capacity) derived from direct measurements to predict the capacity [4]. This approach, however, is costly and time-consuming and may damage the battery, making it unsuitable for real-time applications. Model-based methods, including equivalent circuit models, empirical models, and electrochemical models, focus on the internal mechanisms of the battery. These methods require mathematical models that describe the capacity degradation and use the model parameters to predict the battery’s capacity [5]. However, accurately capturing battery characteristics and operating conditions is challenging, and the complex degradation behavior of lithium-ion batteries makes it difficult for model-based methods to fully characterize and quantify the degradation process.
With the accumulation of battery data and the rapid advancement of artificial intelligence, data-driven methods for battery capacity prediction have gained significant attention [6]. These methods primarily rely on capacity decay data and characterization data (e.g., current, voltage, and temperature) to train prediction models, which include techniques such as support vector machines [7], recurrent neural networks [8], least squares support vector machines [9], and long short-term memory networks [10]. These methods do not require an in-depth understanding of the battery’s internal mechanisms and are therefore well-suited for predicting the capacity of lithium-ion batteries [11]. In addition, a growing number of researchers are combining machine learning with digital twins to build high-accuracy models for battery capacity prognostics [2]. However, the accuracy of data-driven methods depends significantly on the features extracted from the data and the way the model is trained [12]. To improve the prediction accuracy, it is essential to optimize three aspects: feature extraction, feature processing, and model training.
A. Feature Extraction
Simplifying battery charging and discharging data into a series of features can reduce the model’s training burden and enhance its efficiency. However, under certain dynamic operating conditions, extracting effective features from the discharge process is challenging, whereas the charging process typically follows a more predictable pattern, making it easier to analyze. As such, it is more effective to extract features that reflect the degradation of battery capacity from charging data [13]. Moreover, the variety of charging methods leads to diverse ways of extracting features. To enhance the versatility of the prediction model, a standardized approach for feature extraction is required. Peng et al. [14] developed a series of features that accurately represent the capacity degradation using a unified standard based on time, energy, and incremental capacity (IC) features. To predict the battery capacity degradation in electric vehicles (EVs), Deng et al. [15] extracted statistical features from the charging data, ensuring both methodological consistency and a comprehensive feature set. Guo et al. [16] combined rational analysis and principal component analysis (PCA) to derive features from charging data that are adaptable to various operating conditions, thus strengthening the versatility of their capacity prediction method. For high-precision capacity prediction across different lithium-ion battery datasets, Dai et al. [17] extracted six statistical features from charging data and determined the optimal feature combination by comparing various combinations of these features, thus reducing computational complexity. When effective features can be consistently extracted from charging data according to a unified standard, the method’s versatility is demonstrated, and the workload in feature extraction is minimized. However, extracting too few features may fail to capture the full degradation process, while too many features may result in redundancy, thereby increasing the computational burden and reducing model efficiency. Therefore, the extracted features must accurately reflect capacity degradation across different operating conditions while being computationally efficient.
B. Feature Processing
Selecting feature sequences that are highly correlated with the battery capacity can significantly improve the prediction’s accuracy. In cases where two feature sequences are highly correlated with both the capacity and each other, redundancy can be reduced by eliminating one of the features, thus easing the computational burden. For example, in [18], the Box–Cox transformation (BCT) was used to enhance the correlation between the extracted features and battery capacity. In [15], Pearson’s correlation coefficient and gray correlation were employed to identify and remove redundant features, yielding the optimal set of features. In [14], principal component analysis (PCA) and empirical mode decomposition (EMD) were applied to the experimental curves of battery charging and discharging, as well as the incremental capacity curves, to extract features that strengthened the correlation between features and capacity. Furthermore, ref. [19] employed a two-step feature engineering approach (feature dimensionality reduction and seasonal fluctuation decoupling) to select the most relevant features for the capacity prediction while eliminating interfering components, thereby improving the model’s prediction accuracy.
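The selection principle described above (keep features that correlate strongly with capacity, then drop one of any pair that is strongly correlated with each other) can be illustrated with a minimal sketch. It uses Pearson’s coefficient only for simplicity; the function names and the two thresholds (`relevance`, `redundancy`) are hypothetical choices for illustration and are not taken from any of the cited works.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant sequence carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def select_features(features, capacity, relevance=0.8, redundancy=0.95):
    """Return names of features that are relevant to capacity and mutually non-redundant.

    features: dict mapping a feature name to its per-cycle sequence.
    relevance / redundancy: assumed thresholds on |r| (illustrative values).
    """
    score = {name: abs(pearson(seq, capacity)) for name, seq in features.items()}
    # Visit the most capacity-relevant candidates first.
    candidates = sorted(
        (name for name in features if score[name] >= relevance),
        key=lambda name: -score[name],
    )
    kept = []
    for name in candidates:
        # Keep a candidate only if it is not nearly collinear with a kept feature.
        if all(abs(pearson(features[name], features[k])) < redundancy for k in kept):
            kept.append(name)
    return kept
```

Visiting candidates in descending order of relevance means that, within a redundant pair, the feature more strongly correlated with capacity is the one retained.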
C. Model Training
Battery capacity prediction based on data-driven methods is influenced not only by the effectiveness of the extracted features but also by the choice of machine learning algorithm and the configuration of its hyperparameters. In [16], an adaptive relevance vector machine (RVM) model based on particle swarm optimization (PSO) was proposed, demonstrating high robustness and effectiveness for estimating the remaining capacity of lithium-ion batteries. Gong et al. [20] developed a battery capacity prediction model by combining empirical mode decomposition (EMD) and backpropagation with a long short-term memory (LSTM) recurrent network. In [21], a hybrid capacity estimation model was proposed by integrating the Arrhenius degradation equation and a lightweight Transformer architecture tailored for different operating conditions. Zhang et al. [22] employed a temporal convolutional network combined with Gaussian process regression to establish a novel capacity estimation method capable of automatically extracting capacity decay features from partial charging segments. Furthermore, improper hyperparameter settings can significantly degrade the performance of machine learning algorithms, thereby reducing the accuracy of capacity prediction. To address this, ref. [23] adopted an improved dung beetle optimization (IDBO) algorithm to optimize the hyperparameters of temporal convolutional networks (TCNs), obtaining optimal hyperparameter combinations quickly and accurately, which notably enhanced the accuracy of battery capacity predictions. It is important to note that if the test set is involved in hyperparameter tuning during the model training process, the measured performance on the test set may overstate the model’s true capability, leading to evaluation errors. Thus, using the validation set for model tuning is recommended to preserve the independence of the test set.
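The recommendation above can be made concrete with a minimal sketch of a chronological split in which hyperparameter candidates are ranked on the validation set only. The 60/20/20 ratio and the `fit` and `score` callables are hypothetical placeholders; they are assumptions for illustration, not settings or functions from the cited works.

```python
def split_cycles(n_cycles, ratios=(0.6, 0.2, 0.2)):
    """Split cycle indices chronologically into training, validation, and test sets.

    ratios is an assumed (train, validation, test) split that must sum to 1.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    n_train = round(n_cycles * ratios[0])
    n_val = round(n_cycles * ratios[1])
    idx = list(range(n_cycles))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def tune_on_validation(candidates, fit, score, train, val):
    """Return the candidate with the lowest validation error.

    fit(hp, train) -> model and score(model, val) -> error are user-supplied
    (hypothetical) callables. The test set is deliberately not an argument,
    so it cannot leak into the tuning loop.
    """
    return min(candidates, key=lambda hp: score(fit(hp, train), val))
```

Only after `tune_on_validation` has fixed the hyperparameters would the chosen model be evaluated once on the held-out test indices.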
Building upon these principles, this paper first extracts a series of features from battery data. The correlations between the extracted feature sequences, as well as between these features and the capacity sequences, are then analyzed. Features that exhibit a strong correlation with the capacity are retained, while redundant features are removed, resulting in an optimal feature set. Subsequently, the improved crested porcupine optimization (ICPO) algorithm is employed to optimize the hyperparameters of the bidirectional long short-term memory (Bi-LSTM) network, thus constructing the ICPO-Bi-LSTM model for accurate prognostics of the lithium-ion battery capacity. The dataset is divided into training, validation, and test sets in a specified ratio. The training and validation sets are used for model training, while the test set is reserved for the final performance evaluation. Finally, this paper investigates the impacts of working conditions, the dataset ratio, and the different models on the capacity prediction results by analyzing batteries discharged under complex and simple conditions, thus demonstrating the generality and robustness of the proposed ICPO-Bi-LSTM method.
The main contributions of this paper are as follows:
A unified statistical feature extraction method is proposed, i.e., calculating the mean, sum, and standard deviation values of current, voltage, energy, and power in the charging data for each charging and discharging cycle of a battery. These features apply to different batteries under complex and simple operating conditions, which solves the difficulty of needing to adjust the feature extraction method according to changes in battery conditions. The voltage difference between the battery before and after the simulated operating conditions in each cycle is extracted as another type of feature to fully reflect the capacity decay trend of the battery. The above-extracted features show a strong correlation with the battery capacity.
To overcome the challenge of determining the hyperparameters of the Bi-LSTM model, an improved crested porcupine optimization (ICPO) algorithm is proposed to identify the optimal hyperparameter combination. It integrates an improved Chebyshev chaotic mapping initialization, which ensures diversity within the initial population and improves the early-stage search speed, and a random differential mutation strategy, which helps the search avoid local optima, thereby enhancing the algorithm’s overall efficiency.
The ICPO-Bi-LSTM model is developed using the optimal feature set to predict the capacity of lithium-ion batteries accurately. The dataset is divided into training, validation, and test sets; the training and validation sets are used during model training, and the test set is reserved exclusively for the final performance evaluation, which prevents evaluation errors.
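As an illustration of the first contribution, the following minimal sketch computes the mean, sum, and standard deviation of current, voltage, power, and energy for the charging data of one cycle, plus the voltage difference across the simulated operating condition. The function names, the fixed sampling interval `dt`, the use of the population standard deviation, and the sign convention of the voltage difference are assumptions made for illustration; the paper’s own implementation may differ.

```python
import math
from statistics import fmean, pstdev


def cycle_features(current, voltage, dt=1.0):
    """Statistical features of one charging segment.

    current [A] and voltage [V] are equal-length samples taken every dt seconds
    (a uniform sampling interval is assumed).
    """
    if len(current) != len(voltage) or not current:
        raise ValueError("current and voltage must be non-empty and equal in length")
    power = [i * v for i, v in zip(current, voltage)]   # instantaneous power, W
    energy = [p * dt / 3600.0 for p in power]           # energy per sample, Wh
    signals = {"current": current, "voltage": voltage, "power": power, "energy": energy}
    feats = {}
    for name, xs in signals.items():
        feats[f"{name}_mean"] = fmean(xs)
        feats[f"{name}_sum"] = math.fsum(xs)
        feats[f"{name}_std"] = pstdev(xs)  # population std; sample std is equally plausible
    return feats


def voltage_difference(v_before, v_after):
    """Voltage before minus voltage after the simulated condition (assumed sign convention)."""
    return v_before - v_after
```

Calling `cycle_features` once per cycle yields twelve feature sequences whose correlation with the capacity sequence can then be analyzed.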
The remainder of the paper is organized as follows:
Section 2 describes the experimental apparatus and dataset;
Section 3 details the feature engineering process;
Section 4 presents capacity prognostics based on the ICPO-Bi-LSTM method;
Section 5 provides the results of battery capacity prognostics; and
Section 6 concludes the paper.
2. Battery Data Analysis
2.1. Experimental Equipment
To comprehensively analyze the operational characteristics of lithium-ion batteries under real-world conditions, an experimental platform was developed to collect data from various battery types. The experimental equipment used in this study is the NEWARE CTE-4008D-5V30A tester, a battery tester manufactured by NEWARE whose main function is to measure the capacity, efficiency, cycle life, and other performance characteristics of a battery by simulating the charging and discharging process. The platform consists of a battery testing system, a host computer with software (BTS Client 8.0.0.516), and the battery under test. The physical schematic of the experimental apparatus is shown in Figure 1. After the battery charge/discharge data were obtained with the NEWARE CTE-4008D-5V30A tester, both model construction and battery capacity prognostics were carried out on the Python 3.11 (64-bit) platform.
2.2. Description of Experimental Data
This experiment used A123 APR18650M1A LFP/C batteries (B1; manufacturer: A123 Systems LLC, Waltham, USA) and OXUN IFR26650 LFP/C batteries (B2; manufacturer: OXUN Energy, Changzhou, China) to simulate real-world operating conditions. The research focused on the “Jun Lv Hao”, a 300-passenger all-electric ferry operating in Wuhan. The ship’s battery system comprises multiple clusters connected in parallel, offering a total capacity of 2240 kWh. The specific topology of the battery system is illustrated in Figure 2. The battery system of the “Jun Lv Hao” is divided into two sections (left and right), with each section containing six battery clusters. Once the six battery clusters are connected in parallel, they supply power to the pod and other loads via the ship’s DMSB. The experiment was designed to replicate the actual operating conditions of the “Jun Lv Hao” and assess the capacity degradation of lithium-ion batteries under complex operational scenarios based on the rated capacity of the selected batteries. To verify the effectiveness of the battery capacity prediction method developed in this study, we also employed LISHEN LR18650LA NCM/C batteries (B3; manufacturer: Tianjin Lishen Battery Joint-Stock Co., Ltd., Tianjin, China) for a simplified discharge test under controlled conditions. Figure 3 displays the current variation curves observed during a typical “Jun Lv Hao” voyage and under simulated conditions. The specifications of the “Jun Lv Hao” are provided in Table 2.
The simulated working condition is constructed as follows. First, the capacity released under the actual working condition is calculated by ampere-hour integration. This capacity is then scaled down by a constant factor so that it does not exceed the rated capacity of the battery used in the experiment, and the output current of the actual working condition is scaled down by the same factor. In addition, the sampling interval of the original working condition, 5 s, is shortened to 1 s. Together, these steps constitute the simulated working condition. The capacity released under the simulated condition is then recalculated by ampere-hour integration to confirm that it is less than or equal to the rated capacity of the battery used in the experiment. Additionally, to ensure the safety and efficiency of the experiment, the charging currents and the duration of a single cycle were constrained. The specific charging and discharging protocols for the three batteries are as follows:
B1:
① Charge the battery with a constant current of 7.7 A to a cut-off voltage of 3.26 V;
② Charge the battery with a constant current of 5.28 A to a cut-off voltage of 3.32 V;
③ Charge the battery with a constant current of 5.28 A to a cut-off voltage of 3.33 V;
④ Charge the battery with a constant current of 4.015 A to a cut-off voltage of 3.36 V;
⑤ Leave the battery to stand for 5 min;
⑥ Charge the battery with a constant current of 4 A at 3.6 V to a cut-off current of 0.4 A;
⑦ Leave the battery to stand for 5 min;
⑧ Apply the simulated working conditions shown in Figure 3b;
⑨ Discharge the battery with a constant current of 4.4 A to a cut-off voltage of 2 V;
⑩ Leave the battery to stand for 5 min;
⑪ Repeat the above steps (①–⑩) until 200 cycles are completed;
B2:
① Charge the battery with a constant current of 7.2 A at 3.65 V to a cut-off current of 0.72 A;
② Leave the battery to stand for 5 min;
③ Apply the simulated working conditions shown in Figure 3b;
④ Discharge the battery with a constant current of 14.4 A to a cut-off voltage of 2 V;
⑤ Leave the battery to stand for 5 min;
⑥ Repeat the above steps (①–⑤) until 276 cycles are completed;
B3:
① Charge the battery with a constant current of 8 A at 4.2 V to a cut-off current of 0.1 A;
② Leave the battery to stand for 10 min;
③ Discharge the battery with a constant current of 8 A to a cut-off voltage of 2.75 V;
④ Leave the battery to stand for 10 min;
⑤ Repeat the above steps (①–④) until 528 cycles are completed.
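The construction of the simulated working condition described before the protocols (ampere-hour integration, scaling the current by a constant factor, and shortening the sampling interval from 5 s to 1 s) can be sketched as follows. The choice of an integer reduction factor obtained by rounding up, the rectangle-rule integration, and the treatment of discharge current as positive are assumptions for illustration; the text does not state how the factor was chosen.

```python
import math


def released_capacity_ah(current_a, dt_s):
    """Ampere-hour integration (rectangle rule) of a uniformly sampled current profile."""
    return math.fsum(current_a) * dt_s / 3600.0


def simulate_condition(current_a, rated_capacity_ah, dt_actual_s=5.0, dt_sim_s=1.0):
    """Scale an actual current profile so that it fits a laboratory cell.

    Returns (scaled_current, factor, simulated_capacity_ah).
    """
    q_actual = released_capacity_ah(current_a, dt_actual_s)
    # Assumed rule: smallest integer factor that brings the capacity under the rating.
    factor = max(1, math.ceil(q_actual / rated_capacity_ah))
    scaled = [i / factor for i in current_a]
    # Replaying the same samples at the shorter interval compresses the profile in time.
    q_sim = released_capacity_ah(scaled, dt_sim_s)
    if q_sim > rated_capacity_ah:
        raise ValueError("simulated condition exceeds the rated capacity")
    return scaled, factor, q_sim
```

The final check mirrors the verification step in the text: the capacity released under the simulated condition is recomputed and compared against the rated capacity before the profile is used.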
Figure 4 presents the capacity degradation curves for three types of batteries, as obtained from the experiments described earlier.
Table 3 provides the details of the battery data experimentally obtained, which are relevant to the capacity prediction of lithium-ion batteries.
6. Conclusions
Capacity prognostics using data-driven methods can be inaccurate when the extracted features fail to sufficiently capture the degradation trend of lithium-ion batteries or when the model’s hyperparameters are improperly specified. This paper proposes a capacity prognostic method for marine lithium-ion batteries, which extracts features from battery charging and discharging data collected under dynamic operating conditions and utilizes the ICPO-Bi-LSTM model for capacity prognostics. First, a series of features are extracted from the charging and discharging data to ensure the adequate capture of battery capacity degradation. Then, the gray correlation degree and Spearman correlation coefficient are calculated to select features that are strongly correlated with capacity, while eliminating redundant features to obtain an optimal feature set. Additionally, the issues of an uneven population distribution and slow search efficiency during the early stages of the original CPO algorithm are addressed. The tendency of the original CPO algorithm to prematurely converge and fall into local optima in later stages, which hinders the identification of the optimal hyperparameter combination, is also mitigated. To this end, an improved CPO algorithm is proposed, combining enhanced Chebyshev chaotic mapping and a Random Differential Mutation strategy, which improve the population initialization and iterative search strategies of the original CPO algorithm, respectively.
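To make the role of chaotic initialization concrete, the following minimal sketch seeds a population with the standard Chebyshev map, x_{k+1} = cos(a · arccos(x_k)), and rescales each value into the search bounds. This is the textbook map, not the enhanced variant proposed in this paper, whose details are not restated in this section; the map order, the seed value, and the example hyperparameter bounds are hypothetical.

```python
import math


def chebyshev_population(pop_size, bounds, order=4, seed_value=0.7):
    """Generate an initial population from the standard Chebyshev chaotic map.

    bounds: list of (low, high) pairs, one per hyperparameter.
    order, seed_value: assumed map parameters (illustrative values).
    """
    x = seed_value
    population = []
    for _ in range(pop_size):
        individual = []
        for low, high in bounds:
            # One map iteration; clamp to [-1, 1] to guard against rounding error.
            x = max(-1.0, min(1.0, math.cos(order * math.acos(x))))
            value = low + (x + 1.0) / 2.0 * (high - low)
            individual.append(min(high, max(low, value)))
        population.append(individual)
    return population
```

For example, `chebyshev_population(20, [(16, 256), (1e-4, 1e-1)])` would produce 20 candidate pairs for a hypothetical two-dimensional search over hidden units and learning rate.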
The accuracy of our method is validated by predicting the capacities of two battery models (B1 and B2), which are discharged under dynamic operating conditions but charged using different methods. Additionally, we evaluate the method’s generalization ability using a third battery model (B3), which is discharged under simpler conditions. The experimental results confirm the method’s ability to predict battery capacity accurately across various training, validation, and test set ratios, as well as under different charging and discharging conditions, demonstrating both high accuracy and robustness. Specifically, the mean absolute error (MAE) and root mean square error (RMSE) of the predicted capacities for the different batteries across various dataset ratios are consistently below 1%. Furthermore, our method achieves the smallest MAE and RMSE values when compared to other methods (e.g., CPO-Bi-LSTM, PSO-Bi-LSTM, Bi-LSTM, and LSTM), and the ICPO algorithm used in our approach requires the shortest model training time compared to the original CPO algorithm and the PSO algorithm, resulting in the most accurate and efficient capacity prediction. In all model comparison experiments, the maximum MAEs and RMSEs of the predicted capacities using this method remain consistently below 0.6%.
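For reference, the two error metrics quoted above can be computed as in the following minimal sketch. It assumes that capacities are normalized (for example, by the rated capacity) so that multiplying the result by 100 gives a percentage; this normalization is an assumption, since this section does not state how the percentages were obtained.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted capacities."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted capacities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

Because RMSE squares each residual before averaging, it is never smaller than MAE on the same data and penalizes occasional large errors more heavily.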
Future work will include extensive experiments using various marine battery models under diverse dynamic operating conditions to further validate the effectiveness of our proposed method. We will progress from controlled laboratory environments to real-world marine applications, assessing the method in increasingly complex and dynamic conditions. One major challenge is the variability in battery charging and discharging conditions, which could impact data integrity. To address this, we plan to enhance both data acquisition techniques and the robustness of the model. Furthermore, we will investigate the effect of ambient temperature on battery performance, particularly during the charging process. Additionally, we aim to integrate the proposed method into an online prediction system for continuous monitoring, with a focus on ensuring scalability and real-time performance.