General Machine Learning Approaches for Lithium-Ion Battery Capacity Fade Compared to Empirical Models

Mayemba, Quentin; Ducret, Gabriel; Li, An; Mingant, Rémy; Venet, Pascal

doi:10.3390/batteries10100367

by Quentin Mayemba ¹,²,³,*, Gabriel Ducret ⁴, An Li ¹, Rémy Mingant ² and Pascal Venet ³

¹ Siemens Digital Industries Software, 19 Boulevard Jules Carteret, 69007 Lyon, France
² IFP Energies Nouvelles, Rond-Point de L’échangeur de Solaize, 69360 Solaize, France
³ Université Claude Bernard Lyon 1, Ampère, UMR5005, INSA Lyon, Ecole Centrale de Lyon, CNRS, F-69100 Villeurbanne, France
⁴ IFP Energies Nouvelles, 1-4 Avenue du Bois Préau, 92500 Rueil-Malmaison, France
* Author to whom correspondence should be addressed.

Batteries 2024, 10(10), 367; https://doi.org/10.3390/batteries10100367

Submission received: 6 September 2024 / Revised: 11 October 2024 / Accepted: 14 October 2024 / Published: 16 October 2024

(This article belongs to the Special Issue Artificial Intelligence-Based State-of-Health Estimation of Lithium-Ion Batteries—2nd Edition)

Abstract

Today’s growing demand for lithium-ion batteries across various industrial sectors has introduced a new concern: battery aging. This issue necessitates the development of tools and models that can accurately predict battery aging. This study proposes a general framework for constructing battery aging models using machine learning techniques and compares these models with two existing empirical models, including a commercial one. To build the models, the databases produced by EVERLASTING and Bills et al. were utilized. The aim is to create universally applicable models that can address any battery-aging scenario. In this study, three types of models were developed: a vanilla neural network, a neural network inspired by extreme learning machines, and an encoder coupled with a neural network. The inputs for these models are derived from established knowledge in battery science, allowing the models to capture aging effects across different use cases. The models were trained on cells subjected to specific aging conditions and they were tested on other cells from the same database that experienced different aging conditions. The results obtained during the test for the vanilla neural network showed an RMSE of 1.3% on the Bills et al. test data and an RMSE of 2.7% on the EVERLASTING data, demonstrating similar or superior performance compared to the empirical models and proving the ability of the models to capture battery aging.

Keywords: capacity loss; battery aging; empirical model; machine learning; artificial neural network; autoencoder

1. Introduction

The automotive industry is transitioning toward electric mobility. Fossil fuels are not suitable for storing energy in electric vehicles, which is why lithium-ion batteries are, nowadays, the common preference. In addition, this trend has been emphasized by the need to store the intermittent electricity produced by renewables and by an increase in the number of portable electronic devices being sold. These factors contribute to a growing market in the field of energy storage. In this context, lithium-ion batteries emerge as a viable solution [1]. They are energetically dense and financially affordable, and they also have a relatively high cycle life and efficiency.

In spite of their relatively high cycle life, numerous degradation phenomena can occur in these accumulators, as shown in Table 1, adapted from Ref. [2]. The working principle of a fresh cell, along with several aging phenomena, is illustrated in Figure 1 (the crystal structure of the LCO active material has been represented using chemtube3d [3]). Several factors significantly influence the aging of the cells, including the current, temperature, state of charge (SOC), voltage, and mechanical stress. These factors are referred to as stress factors. In the models developed for this study, only the current, temperature, and SOC are considered.

This degradation results in a loss of capacity and an increase in the internal resistance of the cell. The capacity corresponds to the electric charge (expressed in Amp-hours) that can be stored in the battery. It is correlated to the amount of lithium that can pass from one electrode to the other. Predicting the aging process of lithium-ion cells is crucial because it allows manufacturers to ensure that their lithium-ion cells can consistently meet the product’s energy specifications over time. Lithium-ion battery aging forecasting is also key to assessing batteries’ potential for reuse, in order to give them a second life [21] or to assess their safety [22]. The development of battery degradation models, therefore, has become necessary. The literature categorizes these models into three main categories:

physical models [23,24];
empirical models [25,26];
machine learning (ML) models [27,28].

It should, however, be emphasized that some models rely on several of these categories, such as grey-box models that combine physics and data-driven methods, as discussed in the review by Guo et al. in 2022 [29]. Physical models have the advantage of being highly explainable, but they also have drawbacks. They require a significant amount of experimental data, often obtained by dismantling the battery cell, and their application necessitates a high computational cost. This cost can be explained by the resolution of numerous partial differential equations in electrochemical models that consider aging phenomena. Examples of such models are the single particle model, the single particle model with electrolyte dynamics, and the pseudo-2 dimensional model [24]. It is important to note that the current state-of-the-art physical models do not encompass all the aging phenomena described in Table 1. Instead, they focus on a select few phenomena [10,23,24], such as:

SEI growth;
lithium-plating;
particle cracking due to mechanical stress (and the associated growth of the SEI on the cracks);
loss of active material (LAM) due to mechanical stress and the internal cracks of the particles;
oxidation of the electrolyte at the positive electrode.

The two other categories of aging models are data-driven, meaning that they rely on battery-aging data and more specifically on capacity fade measurements. These models are consequently able to capture any aging phenomenon occurring in the cell but are unable to attribute the degradation to specific phenomena. The distinction between empirical and machine learning models is that the ML models utilize the tools developed by the machine learning community to create agnostic models. These ML models do not assume any pre-existing knowledge, nor do they make specific assumptions about the data. In contrast, empirical models rely on mathematical formulas created by the designer of the model and are based on the available data. ML models often have more parameters to optimize than empirical models. The scope of this study is limited to models that predict the aging of cells regardless of the specific aging phenomena occurring within them. This study consequently focuses on data-driven models, and more specifically those implementable in simulation software to estimate the aging of lithium-ion cells under various aging conditions. Only battery aging models based on non-linear regression with exogenous variables are considered, which means that previous outputs are not utilized as inputs. Furthermore, the models discussed in this paper do not depend on the response of the cell to a specific stimulation. Thus, these models may rely on the current that a cell delivers during aging but not on its electrothermal response to a specific test conducted during a check-up, for instance. Part of the incremental capacity curves or any other health indicators (presented in Section 4.3.1 of the review by Khaleghi et al. in 2024 [30]) cannot, therefore, be used as inputs of the models considered. That is why models that rely on previous capacity measurements are also excluded [31]. 
In this study, neural networks are employed to forecast the capacity loss of the cells, based on the time they spent under specific conditions. New features have been proposed and applied to databases where such methods had never been previously implemented, thus distinguishing our approach from the state of the art. The effectiveness of this method was assessed by the use of two diverse databases that encompass a wide range of applications, including driving profiles, calendar aging, constant current cycling, and electric vertical take-off and landing (eVTOL) aircraft operations. The results thus obtained were then compared with those from two empirical models. To the extent of the authors’ knowledge, this is the first instance in which this method is compared to empirical models and applied to open-source data covering various ambient temperatures and current profiles.

This paper is organized as follows: Section 2 provides an overview of state-of-the-art machine-learning models that forecast the capacity loss of lithium-ion batteries, while Section 3 describes the databases, the empirical models developed, and the machine-learning models used here. Finally, Section 4 presents the results obtained from each database using the methods mentioned.

2. State of the Art on Machine Learning Models to Predict Battery Aging Using Time Passed by the Cell between Thresholds

Building on an understanding of the challenges related to lithium-ion batteries, approaches that harness machine learning techniques are discussed below. These methods predict battery degradation accurately by utilizing extensive datasets to effectively model complex aging dynamics [32]. The literature on the application of machine learning to study the state of health of lithium-ion cells is extensive, with numerous reviews having been published in recent years [27,28,30,33]. While listing all the methodologies would be tedious due to their vast diversity, the review of von Bülow et al. in 2023 [27] highlights that fewer papers focus on developing models that can predict the SOH of cells as tools for engineers. Such a tool could help identify which types of real-world usage most significantly impact a battery’s health. The present study specifically targets such models. The most versatile models employ historical features, sometimes also referred to as histograms. The fundamental concept behind these models is to make categories, or zones, in the cells’ stress factors (such as the temperature, current, voltage, SOC, or mechanical stress). As inputs to the ML model, these zones consider how long the cell has remained in each category, either through the duration spent in the zone or the count of measurements taken while the cell was in that zone. To the best of the authors’ knowledge, no other information has been extracted from these zones to be used as input into machine learning models. As discussed in Section 3.3 of the present study, additional features have been incorporated, such as the time integral of temperature within the zone, the time integral of the current, and the time integral of the SOC. This information allows the model to evaluate the potential damage associated with the duration for which the cell remained in each zone.

The first authors to propose the methodology of using thresholds to create categories and utilize information from these categories to predict a cell’s aging were Nuhic et al. in 2013 [34]. Their objective was to develop battery aging models that were not specific to any particular discharge pattern. To achieve this, they studied what they referred to as the time the cell remained within specific parameter threshold values. They established 9 classes of SOC, 7 classes of temperature, and 13 classes of current. The authors then utilized the time spent by the cell in each combination of SOC and temperature classes (resulting in 7 times 9 inputs) and in each combination of current and temperature classes (resulting in 7 times 13 inputs) as the inputs for their model. In addition, they examined the rainflow counting of the SOC to obtain inputs that could represent the number of cycles without explicitly defining a cycle in the aging profile. These inputs were used to build a support vector regression system. They not only predicted the cell capacity from the beginning until each check-up but also between each check-up.
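The class-combination inputs described above can be sketched with a time-weighted two-dimensional histogram. This is only a minimal illustration, not the implementation of Nuhic et al.: the class edges, the synthetic signals, and the function name `time_in_class_pairs` are assumptions made for the example.

```python
import numpy as np

def time_in_class_pairs(x, y, dt, x_edges, y_edges):
    """Time spent in each (x-class, y-class) combination.

    x, y : sampled stress factors (e.g. SOC and temperature), same length
    dt   : duration represented by each sample, in seconds
    """
    hist, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=dt)
    return hist

# Hypothetical class edges: 9 SOC classes and 7 temperature classes.
soc_edges = np.linspace(0.0, 100.0, 10)    # 10 edges -> 9 classes
temp_edges = np.linspace(-10.0, 60.0, 8)   # 8 edges -> 7 classes

# Synthetic signals, for illustration only.
rng = np.random.default_rng(0)
n = 1000
soc = rng.uniform(0.0, 100.0, n)           # %
temp = rng.uniform(-10.0, 60.0, n)         # degC
dt = np.full(n, 10.0)                      # each sample lasts 10 s

features = time_in_class_pairs(soc, temp, dt, soc_edges, temp_edges)
print(features.shape)   # one entry per SOC/temperature class combination
print(features.sum())   # total logged time in seconds
```

Flattening `features` gives the 7 times 9 inputs for the SOC and temperature classes; the same function applied to current and temperature with 13 current classes would give the 7 times 13 inputs.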

In 2016, You et al. [35] also conducted a similar study in which they utilized the current, voltage, and temperature measurements of cells. They applied the K-means clustering algorithm to partition the resulting three-dimensional space into 80 distinct regions. Then, they employed the densities of each of these 80 clusters as inputs for support vector machines (SVMs) and an artificial neural network (ANN) to predict the SOH of the cells. To create their dataset, they subjected the cells to cycling profiles inspired by the various driving conditions encountered with electric vehicles, such as highway and urban driving. This approach is comprehensive as it allows for the consideration of various aging profiles. However, it is important to note that analyzing the density of the regions may introduce bias if the sampling frequency varies throughout the cells’ lifespan. For instance, measurements may be gathered with a larger timestep during constant current charging, compared to during driving cycles.

Considering functions of time rather than the number of measurements appears to be more reliable. This idea was proposed by Richardson et al. in 2019 [36], although they did not use such features to construct their Gaussian process regression (GPR) model using the NASA randomized battery dataset. In their study, they suggested considering the time spent between the two thresholds of a specific parameter, such as current, temperature, voltage, or power. Consequently, their approach can be referred to as one-dimensional histograms, or 1D models, in contrast to the three-dimensional models made by You et al. Their GPR model used the Ah-throughput of the cells, their age, and the time between two consecutive capacity measurements to predict the corresponding capacity change, expressed in Ah. It is noteworthy that such approaches require knowledge of the distribution of the stress factors; this is why they are often referred to as histograms.
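The sampling bias mentioned above can be made concrete with a small synthetic example. The sketch below assumes a log in which charging is recorded every 60 s and driving every 1 s; the signal values and zone edges are hypothetical. It compares a count-based histogram (densities) with a time-weighted one.

```python
import numpy as np

# Synthetic log: 1 h of charging sampled every 60 s, then 1 h of driving sampled every 1 s.
t_charge = np.arange(0.0, 3600.0, 60.0)
t_drive = np.arange(3600.0, 7200.0, 1.0)
t = np.concatenate([t_charge, t_drive])
current = np.concatenate([
    np.full(t_charge.size, 1.0),    # +1 A while charging
    np.full(t_drive.size, -2.0),    # -2 A while driving
])

# Duration represented by each sample (the last sample is assumed to last 1 s).
dt = np.diff(t, append=t[-1] + 1.0)

edges = np.array([-5.0, 0.0, 5.0])  # two zones: discharge, charge
counts, _ = np.histogram(current, bins=edges)
time_in_zone, _ = np.histogram(current, bins=edges, weights=dt)

print(counts / counts.sum())              # share of measurement points per zone
print(time_in_zone / time_in_zone.sum())  # share of time per zone
```

Both phases last one hour, so the time-weighted shares are equal, whereas the count-based shares are dominated by the densely sampled driving phase.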

In 2020, Song et al. [37] used a similar method. They utilized real-world data from 300 battery-powered electric vehicles and 400 hybrid vehicles, collected between January 2018 and December 2018, to predict the capacity of these batteries. The variables considered included the mileage of the vehicles and the distribution of measurement points across several thresholds for current, SOC, and temperature. These inputs were then transformed using principal component analysis [38] and the resulting components were ordered according to the variance they explained in the dataset. Only those components with a cumulative variance below 90% were kept as inputs of the neural network. Given that the vehicles were being used by the customers, the cells experienced no capacity measurement per se. The capacities were estimated during charging at between 40% and 80% SOC.
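The dimensionality-reduction step described above can be sketched as follows. This is not the code of Song et al.: it is a minimal NumPy version of principal component analysis in which components are ordered by explained variance and kept while their cumulative explained variance stays below a threshold. The function name `pca_reduce`, the rule of always keeping at least one component, and the synthetic data are assumptions.

```python
import numpy as np

def pca_reduce(X, max_cumulative_variance=0.90):
    """Project X onto its leading principal components.

    Components are ordered by explained variance; those whose cumulative
    explained-variance ratio stays below the threshold are kept
    (at least one component is always kept, which is an assumption).
    """
    Xc = X - X.mean(axis=0)
    # SVD of the centred data: the rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    cumulative = np.cumsum(ratio)
    k = max(1, int(np.sum(cumulative < max_cumulative_variance)))
    return Xc @ Vt[:k].T, ratio[:k]

# Hypothetical histogram features: 200 vehicles x 12 bins with correlated columns.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 12)) + 0.05 * rng.normal(size=(200, 12))

Z, kept_ratio = pca_reduce(X)
print(Z.shape, kept_ratio.sum())
```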

In 2021, von Bülow et al. [39] compared two-dimensional and three-dimensional models using the Severson et al. database [40] to predict SOH changes for a given number of cycles. The three stress factors that they focused on were current, temperature, and SOC. Their two-dimensional models relied on the time spent by the cell between the thresholds of current and temperature, then those on current and SOC, and finally those on SOC and temperature. One model considered all these features as inputs. They studied the influence of window width (or the number of cycles for which $\Delta SOH$ was predicted) and different thresholds of the stress factors to retrieve their inputs, then they built an ANN and a GPR. The hyperparameters of the ANN were optimized using a tree-structured Parzen estimator (TPE). The hyperparameters in a machine learning model are the characteristics that define the model, such as the number of layers in an ANN or the number of neurons per layer [41]. They found that combining multiple window widths during training (i.e., training the model to predict capacity loss for not only 50 equivalent full cycles ahead but also 25 or 100 equivalent full cycles ahead) enhanced its ability to generalize. In their experiment, the 3D models did not yield better results than the 2D ones. However, it is worth noting that the dataset they employed did not include any variation in the temperature of the thermal chamber. Therefore, temperature differences between the cells were attributed solely to their electro-thermal behavior.

In 2022, Zhang et al. [42] utilized historical features from three datasets: the study by Severson et al. and Attia et al. [40,43], the NASA randomized battery dataset [44], and a dataset based on actual measurements from plug-in hybrid electric vehicles (PHEV). The models employed included random forest regressions, support vector regressions, ANN, and GPR. The authors used the historical features of the cells to forecast their degradation trajectories.

In 2023, Greenbank et al. [45] adopted a histogram-based approach in which the thresholds for voltage, temperature, and current were set at the 1st, 33rd, 67th, and 99th percentiles of these stress factors. However, the temperature-related features were excluded from the model because they did not pass the feature selection process. They utilized datasets from Severson et al. [40] and Attia et al. [43] to predict the knee-point and remaining useful life of the cells. Specifically, they estimated the cells’ capacity loss every 12 h, which roughly corresponds to 9 to 19 cycles, depending on the charging profiles. The knee point is defined as the moment when the slope of aging over time significantly increases [46].
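Distribution-based thresholds of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the pipeline of Greenbank et al.: the voltage trace is synthetic, each sample is assumed to last one second, and the helper name `percentile_thresholds` is hypothetical.

```python
import numpy as np

def percentile_thresholds(signal, percentiles=(1, 33, 67, 99)):
    """Thresholds placed at the given percentiles of the measured signal."""
    return np.percentile(signal, percentiles)

# Synthetic voltage trace (illustration only).
rng = np.random.default_rng(0)
voltage = rng.uniform(3.0, 4.2, 10_000)   # V
dt = np.full(voltage.size, 1.0)           # each sample lasts 1 s

edges = percentile_thresholds(voltage)
# Time spent between consecutive thresholds (three zones for four thresholds).
time_between, _ = np.histogram(voltage, bins=edges, weights=dt)
print(edges)
print(time_between)
```

Because such thresholds follow the data distribution, they change from one dataset to another, which is the property the dataset-independent thresholds of the present study are meant to avoid.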

The contents of these works, along with what is proposed here, are summarized in Table 2.

In conclusion, machine learning models provide a robust framework for the predictive analysis of lithium-ion battery aging. These models utilize historical data to generate accurate forecasts. Building upon the literature, three models based on the time spent in various zones are presented in Section 3.3. These models have been compared with the empirical models discussed in Section 3.2. Both modeling approaches have been applied to the data presented in Section 3.1, which are open-source data wherein cells were aged under different cycling procedures and storage temperatures. The method relies on the time spent by the cell between temperature thresholds, among other parameters. The following points make these models original:

The thresholds selected in previous studies were specific to the datasets used. In contrast, the present work proposes general thresholds applicable to any battery aging dataset, thus contributing to moving a step closer to a general battery aging model.
Except for the study by Zhang et al. [42], which employed the NASA randomized battery dataset [44], the models available in the literature have not been applied to open-source databases where all cells do not experience the same ambient temperature during aging. Consequently, this paper presents the results from applying the method to two additional public datasets, wherein temperature thresholds are associated not only with cell heating but also with varying aging conditions, resulting in sparser data.
This paper demonstrates the robustness of the method when applied to public datasets, thereby advancing the development of generalized battery aging models. These machine learning methods have not been previously applied to these datasets, and their usage introduces challenges that are distinct from those encountered before.
Furthermore, to the best of the authors’ knowledge, this study is the first application of autoencoders for reducing the dimensionality of inputs based on the time spent.
This study introduces novel inputs derived from zones that are not based on the time spent in a zone or the zone’s density. Instead, these novel inputs include the time integral of the current in the zone, that of the SOC, and that of the temperature. This approach enables the model to account for parameter variations within a zone.

3. Materials and Methods

3.1. Battery Aging Datasets Used

In the present study, the models were applied to two databases: EVERLASTING [47] and that of Bills et al. from 2023 [48]. Although additional battery aging databases exist, such as those referenced in Refs. [2,49,50], the focus of this study is on these two databases because they encompass a range of scenarios inspired by real-world usage cases. The description of these datasets can be found in Ref. [2].

3.1.1. The EVERLASTING Dataset

The EVERLASTING project studied commercial INR18650 MJ1 cells, manufactured by LG Chem, with a nominal capacity of 3.5 Ah. The positive electrode active material of these cells is NMC 811, and the negative electrode active material is a blend of graphite intercalation compound (GIC) and SiO_x. The test matrix of this project is summarized in Tables 3–5. The resulting capacity losses for 40 cells are illustrated in Figure 2. In this figure, it can be seen that, generally speaking, calendar aging at 0 °C and 10 °C produced similar results, with less degradation than calendar aging at 25 °C. The most severe calendar aging occurred at 45 °C. Across all temperatures, capacity retention was superior during calendar aging compared to CC cycling. The driving profiles, along with CC cycling at 10 °C, exhibited the most heterogeneous cell degradation.

All the cells were included in the training set, except those in the test set described below. In this database, the test data comprised cells 19, 20, 34, 37, 63, 68, 69, 72, 78, 79, and 96. These cells are represented in Figure 2 by dashed lines marked with crosses. The aging conditions for these cells were as follows:

Cells 19 and 20 experienced CC cycle aging at 10 °C;
Cell 34 underwent calendar aging at 10 °C and SOC = 90%;
Cell 37 underwent calendar aging at 0 °C and SOC = 70%;
Cell 63 experienced CC cycle aging at 45 °C;
Cells 68 and 69 experienced driving aging at 45 °C between 70 and 90% SOC;
Cell 72 underwent calendar aging at 45 °C and SOC = 10%;
Cells 78 and 79 experienced CC cycle aging at 25 °C;
Cell 96 underwent calendar aging at 25 °C and SOC = 90%.

The commercial model presented in Section 3.2 could not be calibrated using driving profiles. Therefore, for this model, the test data also included all driving measurements. It is important to note that the training and testing sets of the models were composed of different aging conditions.

3.1.2. The Bills Dataset

The database of Bills et al. [48] was inspired by the behavior of electric vertical take-off and landing aircraft (also referred to as eVTOLs) [51]. In this database, 22 NCA cells with a nominal capacity of 3 Ah were cycled. The negative electrode was made of pure graphite. The baseline power and current profiles are illustrated in Figure 3, and each phase of the baseline profile is explained in detail in Table 6. The temperature for the baseline profile was 25 °C.

The test matrix is presented in Table 7 and includes variations from the baseline scenario. The table specifies the number of data points, which is equal to the number of capacity measurements. The bold values represent the number of data points used as test data. Notably, only one test cell is aging in the conditions that were present in the training data.

The resulting capacity loss versus time is illustrated in Figure 4. The tests continued until the cells either reached 70 °C or 2.5 V during discharge. As can be seen, aging is rather similar for all cells, regardless of aging parameters such as the room temperature or the maximal voltage of the cells. This can be explained by the very demanding discharge profile that causes all cells to heat up and reach temperatures above 40 °C.

In this dataset, the cells VAH02, VAH10, VAH15, VAH22, and VAH28 served as test data, while the remaining 17 cells were used as training data, as illustrated by the dashed line in Figure 4.

3.2. Empirical Models

This study employed two distinct empirical models to assess battery degradation:

a basic aging model;
a generic aging model, implementing the battery aging identification tool in the commercial software Simcenter Amesim V2310 [53]. This model is inspired by the work of Mingant et al., 2021 [25].

These two models were selected for comparison because one is a very simple model while the other is a state-of-the-art model implemented in commercial simulation software. The first one models degradation using a power-law function of time:

$$Q_{loss} = B \times t^{z} \qquad (1)$$

where B and z are constants.
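As an illustration of how the two constants of Equation (1) could be identified, the sketch below fits B and z by linear least squares in log-log space, since log(Q_loss) = log(B) + z log(t). This is one common choice and is an assumption here; the calibration procedure actually used in the study is not described in this section. The check-up times and capacity losses are synthetic.

```python
import numpy as np

def fit_power_law(t, q_loss):
    """Fit Q_loss = B * t**z by linear regression of log(Q_loss) on log(t).

    Requires strictly positive t and q_loss.
    """
    z, log_b = np.polyfit(np.log(t), np.log(q_loss), deg=1)
    return np.exp(log_b), z

# Synthetic check-ups (hypothetical values): time in days, capacity loss in %.
t = np.array([10.0, 30.0, 60.0, 120.0, 240.0, 480.0])
q_loss = 0.4 * t**0.5

B, z = fit_power_law(t, q_loss)
print(B, z)   # close to the values used to generate the data (0.4 and 0.5)
```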

The commercial model is inspired by the following equation [25]:

$$\frac{dQ_{loss}}{dt} = B \times z \times \left(\frac{Q_{loss}}{B}\right)^{\frac{z - 1}{z}} \qquad (2)$$
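For constant B and z, Equation (2) is the time derivative of Equation (1): differentiating $B \times t^{z}$ gives $B \times z \times t^{z-1}$, and substituting $t = (Q_{loss}/B)^{1/z}$ yields the right-hand side of Equation (2). The sketch below checks this numerically with an explicit Euler scheme. The constants, the step size, and the function names are assumptions chosen for the example.

```python
def capacity_loss_rate(q_loss, B, z):
    """Right-hand side of Equation (2)."""
    return B * z * (q_loss / B) ** ((z - 1.0) / z)

def integrate_capacity_loss(q0, t0, t_end, B, z, dt=0.01):
    """Explicit Euler integration of Equation (2) from (t0, q0) to t_end."""
    n_steps = int(round((t_end - t0) / dt))
    q = q0
    for _ in range(n_steps):
        q += dt * capacity_loss_rate(q, B, z)
    return q

B, z = 0.4, 0.5            # hypothetical constants
t0, t_end = 1.0, 100.0     # start after t = 0, where the rate is singular for z < 1
q0 = B * t0**z             # initial condition taken from Equation (1)

q_numeric = integrate_capacity_loss(q0, t0, t_end, B, z)
q_closed_form = B * t_end**z
print(q_numeric, q_closed_form)
```

The practical interest of the differential form is that B and z can be re-evaluated at every step when the operating conditions change, which the closed form of Equation (1) does not allow.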

where B and z follow the equations:

$$z = z_{min} + (z_{max} - z_{min}) \times \frac{1 + \tanh(z_{poly})}{2} \qquad (3)$$

$$B = \begin{cases} B_{poly} & \text{if } B_{poly} \geq 20 \\ 20 \times \exp\left(\frac{B_{poly} - 20}{20}\right) & \text{otherwise} \end{cases} \qquad (4)$$

with $z_{poly}$ and $B_{poly}$ given by:

$$z_{poly} = z_{1} + z_{2} \times T + z_{3} \times SOC + z_{4} \times I_{ch} + z_{5} \times I_{dch} \qquad (5)$$

$$\begin{array}{l} B_{poly} = B_{1} + B_{2} \times \frac{1}{T} + B_{3} \times SOC + B_{4} \times SOC \times T + B_{5} \times T \times I_{ch} + B_{6} \times \frac{I_{ch}}{T} + B_{7} \times \frac{I_{dch}}{T} + B_{8} \times \frac{SOC}{T} + B_{9} \times T^{2} \\ + B_{10} \times I_{ch}^{2} + B_{11} \times I_{dch}^{2} + B_{12} \times SOC^{2} \end{array} \qquad (6)$$

where $T$ corresponds to the temperature of the cell at time $t$, $I_{dch}$ corresponds to the discharge current at the same time $t$, and $I_{ch}$ corresponds to the charge current at the same time $t$. In this work, this model is referred to as the commercial model.
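Equations (3) to (6) can be transcribed directly into code, which makes their role easier to see: Equation (3) bounds z between $z_{min}$ and $z_{max}$, and Equation (4) keeps B positive while remaining continuous at $B_{poly} = 20$. The sketch below is a plain transcription and not the Simcenter Amesim implementation. The function names are hypothetical, and the coefficient values, $z_{min}$, $z_{max}$, and units in the usage lines are placeholders, not calibrated values.

```python
import numpy as np

def z_from_poly(z_poly, z_min, z_max):
    """Equation (3): map z_poly smoothly into the interval (z_min, z_max)."""
    return z_min + (z_max - z_min) * (1.0 + np.tanh(z_poly)) / 2.0

def b_from_poly(b_poly):
    """Equation (4): identity above 20, exponential tail below, so B > 0."""
    b_poly = np.asarray(b_poly, dtype=float)
    # np.minimum prevents overflow in the branch that np.where discards.
    tail = 20.0 * np.exp((np.minimum(b_poly, 20.0) - 20.0) / 20.0)
    return np.where(b_poly >= 20.0, b_poly, tail)

def eval_z_poly(T, soc, i_ch, i_dch, c):
    """Equation (5); c = (z1, ..., z5)."""
    return c[0] + c[1] * T + c[2] * soc + c[3] * i_ch + c[4] * i_dch

def eval_b_poly(T, soc, i_ch, i_dch, c):
    """Equation (6); c = (B1, ..., B12)."""
    return (c[0] + c[1] / T + c[2] * soc + c[3] * soc * T + c[4] * T * i_ch
            + c[5] * i_ch / T + c[6] * i_dch / T + c[7] * soc / T
            + c[8] * T**2 + c[9] * i_ch**2 + c[10] * i_dch**2 + c[11] * soc**2)

# Placeholder coefficients and operating point (hypothetical values and units).
z_coeffs = (0.1, 0.0, -0.01, 0.0, 0.0)
b_coeffs = (40.0,) + (0.0,) * 11
T, soc, i_ch, i_dch = 298.15, 50.0, 1.0, 0.0

z = float(z_from_poly(eval_z_poly(T, soc, i_ch, i_dch, z_coeffs), z_min=0.3, z_max=1.0))
B = float(b_from_poly(eval_b_poly(T, soc, i_ch, i_dch, b_coeffs)))
print(z, B)
```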

In this commercial model, the coefficients written in blue are optimized during the calibration process. The software is limited to optimizing these coefficients with a single charge current and a single discharge current. Therefore, the optimization was performed using the root mean square discharge current from the Bills database. However, the model was not calibrated on dynamic profiles, such as the driving profile; consequently, all driving profiles were utilized as test data. It is important to note that there are numerous other empirical aging models available [26,54], but only the aforementioned models were applied in this study.

The optimization of this model on the EVERLASTING data yielded the following expressions:

$$z_{poly} = 6.137 - 2.451 \times 10^{-2} \times T + 3.551 \times 10^{-1} \times I_{dch} + 8.777 \times 10^{-3} \times SOC \qquad (7)$$

$$\begin{array}{l} B_{poly} = 627.5 - 1.345 \times 10^{5} \times \frac{1}{T} - 17.72 \times SOC + 2.926 \times 10^{-2} \times SOC \times T - 0.5205 \times T \times I_{ch} + 4.663 \times 10^{4} \times \frac{I_{ch}}{T} \\ + 2.443 \times 10^{4} \times \frac{I_{dch}}{T} + 2.736 \times 10^{3} \times \frac{SOC}{T} - 1.926 \times 10^{-3} \times T^{2} - 18.39 \times I_{ch}^{2} - 40.20 \times I_{dch}^{2} \\ - 1.655 \times 10^{-3} \times SOC^{2} \end{array} \qquad (8)$$

This model has also been optimized on the Bills data, yielding the following results:

$$z_{poly} = 8.211 \times 10^{-2} - 8.530 \times 10^{-3} \times SOC \qquad (9)$$

$$B_{poly} = 42.34 + 0.4137 \times T \times I_{ch} - 4.239 \times 10^{4} \times \frac{I_{ch}}{T} + 14.46 \times I_{ch}^{2} \qquad (10)$$

3.3. Machine Learning Models

Building upon the existing literature, models based on the time that the cells spent under specific conditions were created. To achieve this, thresholds for the three stress factors (i.e., state of charge, temperature, and current) considered in our models were defined. These thresholds were arbitrarily chosen but corresponded to battery science concepts, such as a cell with a low SOC or a high SOC, for instance. Importantly, these thresholds were consistent across the two databases studied. The model utilized the time the cell has spent between these thresholds to predict capacity loss. It was designed to perform effectively with various data sources, thereby ensuring consistent and reliable predictions. Three variants of this approach are considered here (see Figure 5).

The first neural network (NN₁), or 1D zone, is designed to determine the duration the cell has spent with one stress factor between two thresholds, independent of the values of the other two factors. For example, the time spent by the cell with a SOC of between 20% and 80% was used as input. The architecture of the neural networks built for neural network 1 is displayed in Figure 6.

The other two neural networks adopt a 3D approach to data transformation. In this approach, the thresholds are applied simultaneously to every stress factor. This creates distinct zones within the space of stress factors. Within this space, the inputs selected for the neural networks include the time spent by the cell in these zones, along with the time integral of the current, the temperature, and the SOC in each zone. All these inputs are utilized by neural networks 2 and 3 to predict the capacity losses. Neural network 2 (NN₂) features an architecture similar to that of neural network 1, but it does not train its first layer, which reduces the number of parameters needing optimization. The architecture of NN₂ is illustrated in Figure 7. This neural network draws inspiration from an extreme learning machine (ELM) [55,56]. The decision to exclude training of the first layer is pertinent given the high dimensionality of the 3D input data. Neural network 3 (NN₃), however, incorporates a different mechanism: it performs data reduction through an encoder that is followed by a neural network. Autoencoders consist of two components: an encoder and a decoder. The encoder transforms the input into a lower dimensional space, while the decoder reconstructs the original input, based on the encoded representation. By using only the encoder part of the autoencoder, a dimensionality reduction technique is applied [57,58]. As shown in Figure 8, an autoencoder was constructed using the input data, and its encoding component was reutilized in the model to predict capacity loss. To develop the autoencoder, the data were augmented by multiplying the inputs by factors ranging from 0.1 to 3 in 0.1 increments, effectively increasing the number of data points by a factor of 29. This augmentation is feasible because the inputs pertain to the time spent by the cell under specific conditions. 
Thus, multiplying the inputs by certain factors simulates having a cell that ages under specific conditions for various durations. A possible physical interpretation of these 3D input features is that a given degradation phenomenon occurs only when the conditions make it thermodynamically feasible. Providing the model with information about the time spent in these conditions allows it to implicitly deduce the kinetics of the reactions contributing to capacity loss. Including additional features, such as the integral of current, expands the range of conditions considered by the models, as variations within a condition can now be accounted for. Figure 9 depicts the thresholds established for each of the stress factors across both databases. In all three neural networks, the output of the machine learning model is multiplied by the hyperbolic tangent of the time in days, divided by two (this mathematical expression is illustrated in Figures 6–8). This step guarantees that the capacity loss for a fresh cell is exactly zero, while after a few days the gated output aligns with that of the ML model, with a smooth transition between these two states.
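The 3D inputs, the scaling-based augmentation, and the output gate can be sketched together as follows. Several points are assumptions rather than details confirmed by the text: the 3D zones are taken to use the same thresholds as those listed for the one-dimensional inputs, the integrals are approximated with a rectangle rule, the gate is read as tanh(t / 2) with t in days, and all function names and the synthetic trace are hypothetical.

```python
import numpy as np

SOC_EDGES = np.array([20.0, 80.0])                          # %
TEMP_EDGES = np.array([15.0, 35.0])                         # degC
RATE_EDGES = np.array([-1.0, -0.1, -1e-9, 1e-9, 0.1, 1.0])  # C-rate, negative = discharge

def zone_features(soc, temp, c_rate, dt):
    """Per-zone time and time integrals of current, temperature and SOC.

    Returns an array of shape (3, 3, 7, 4); the last axis holds
    [time, integral of current, integral of temperature, integral of SOC].
    Each sample is assumed to hold its value for its duration dt.
    """
    # right=True gives zones of the form low < x <= high.
    i = np.digitize(soc, SOC_EDGES, right=True)
    j = np.digitize(temp, TEMP_EDGES, right=True)
    k = np.digitize(c_rate, RATE_EDGES, right=True)
    feats = np.zeros((3, 3, 7, 4))
    per_sample = np.stack([dt, c_rate * dt, temp * dt, soc * dt], axis=-1)
    np.add.at(feats, (i, j, k), per_sample)  # unbuffered accumulation per zone
    return feats

def gated_capacity_loss(raw_output, age_days):
    """Multiply the network output by tanh(age / 2): zero loss for a fresh cell."""
    return raw_output * np.tanh(age_days / 2.0)

# Synthetic trace (illustration only).
rng = np.random.default_rng(0)
n = 500
soc = rng.uniform(0.0, 100.0, n)
temp = rng.uniform(0.0, 50.0, n)
c_rate = rng.uniform(-2.0, 2.0, n)
dt = np.full(n, 1.0)

feats = zone_features(soc, temp, c_rate, dt)
x = feats.ravel()  # 3 x 3 x 7 zones times 4 features per zone

# Augmentation: every feature is a time integral, so scaling all durations by f
# scales every feature by f.
factors = np.arange(1, 31) / 10.0  # 0.1, 0.2, ..., 3.0
augmented = np.stack([f * x for f in factors])
print(augmented.shape)
print(gated_capacity_loss(5.0, np.array([0.0, 1.0, 7.0])))
```

With this grid there are 30 factors, one of which (1.0) leaves the sample unchanged. Under the tanh(t / 2) reading, the gate is already above 0.99 after about a week, which is consistent with the gated output matching the network output after a few days.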

As mentioned above, the three variants of the method first concentrated on using thresholds to discretize the stress factors, which are real values that change over time. To achieve this, specific thresholds were established, and models were constructed based on the time spent by the cell between these thresholds. The aim was to develop a general model; therefore, the thresholds were not selected based on the distribution of the measurements. You et al. [35] employed a clustering method to delineate their zones, while other researchers, such as Greenbank et al. [45], used the distribution of the stress factors to determine the thresholds. Von Bülow et al. selected the thresholds without considering the distribution of the measurements [39]. In the present study, the thresholds were also chosen arbitrarily to represent the conditions experienced by the cell. This approach resulted in the following inputs for the one-dimensional models, where negative currents indicate discharge currents (see also Figure 5):

1. Age of the cell
2. Time spent by the cell with low SOC: $\int_{\mathrm{SOC} \leq 20\%} dt$
3. Time spent by the cell with medium SOC: $\int_{20\% < \mathrm{SOC} \leq 80\%} dt$
4. Time spent by the cell with high SOC: $\int_{80\% < \mathrm{SOC}} dt$
5. Time spent by the cell at a low temperature: $\int_{T \leq 15\,^{\circ}\mathrm{C}} dt$
6. Time spent by the cell at a medium temperature: $\int_{15\,^{\circ}\mathrm{C} < T \leq 35\,^{\circ}\mathrm{C}} dt$
7. Time spent by the cell at a high temperature: $\int_{35\,^{\circ}\mathrm{C} < T} dt$
8. Time spent by the cell with a high discharge current: $\int_{I \leq -1\,\mathrm{C}} dt$
9. Time spent by the cell with a medium discharge current: $\int_{-1\,\mathrm{C} < I \leq -0.1\,\mathrm{C}} dt$
10. Time spent by the cell with a low discharge current: $\int_{-0.1\,\mathrm{C} < I \leq -10^{-9}\,\mathrm{C}} dt$
11. Time spent by the cell almost at rest (calendar aging): $\int_{-10^{-9}\,\mathrm{C} < I \leq 10^{-9}\,\mathrm{C}} dt$
12. Time spent by the cell with a low charge current: $\int_{10^{-9}\,\mathrm{C} < I \leq 0.1\,\mathrm{C}} dt$
13. Time spent by the cell with a medium charge current: $\int_{0.1\,\mathrm{C} < I \leq 1\,\mathrm{C}} dt$
14. Time spent by the cell with a high charge current: $\int_{1\,\mathrm{C} < I} dt$
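The fourteen inputs listed above can be computed from a sampled time series as in the following sketch. It is a minimal illustration, not the authors' implementation: the function names are hypothetical, the current is assumed to be expressed as a C-rate, and each time step is attributed to the zone of the sample at its start (a left-hold rule, since the integration scheme is not specified in the text).

```python
import numpy as np

# Thresholds from the list above: SOC in %, temperature in degrees C, current in C-rate.
SOC_EDGES = [20.0, 80.0]
T_EDGES = [15.0, 35.0]
I_EDGES = [-1.0, -0.1, -1e-9, 1e-9, 0.1, 1.0]

def time_in_zones(t, x, edges):
    """Time spent with x in (-inf, e0], (e0, e1], ..., (e_last, +inf)."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    dt = np.diff(t)
    # right=True makes each interval closed on the right, matching "x <= threshold"
    zone = np.digitize(x[:-1], edges, right=True)
    return np.bincount(zone, weights=dt, minlength=len(edges) + 1)

def features_1d(t, soc, temp, c_rate):
    """Return the 14 inputs: age, 3 SOC zones, 3 temperature zones, 7 current zones."""
    t = np.asarray(t, dtype=float)
    age = t[-1] - t[0]
    return np.concatenate([
        [age],
        time_in_zones(t, soc, SOC_EDGES),
        time_in_zones(t, temp, T_EDGES),
        time_in_zones(t, c_rate, I_EDGES),
    ])

# Hypothetical four-day record sampled once per day
t = [0.0, 1.0, 2.0, 3.0, 4.0]
soc = [10.0, 20.0, 50.0, 90.0, 90.0]
temp = [25.0, 25.0, 25.0, 25.0, 25.0]
c_rate = [0.0, 0.0, 0.5, 0.5, 0.0]
x = features_1d(t, soc, temp, c_rate)
```

In this toy record the cell spends two days at low SOC, one day at medium SOC and one day at high SOC, so the three SOC features sum to the age of the cell.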

As an example, the second feature, $\int_{\mathrm{SOC} \leq 20\%} dt$, corresponds to the time the cell has spent with an SOC at or below 20%. Features 3 to 14 should be understood in the same way. As only one parameter is used to build a zone with this method, this data transformation is referred to as 1D zones. In contrast, the 3D approaches rely on the three stress factors at the same time, as can be seen in the example below of five features that are used in the 3D models (see also Figure 5 and Figure 9):

$\int_{τ = 0}^{t} f (τ) d τ$ where $\{\begin{array}{l} f (τ) = 1 i f T \leq 15 ° C a n d S O C > 80 % a n d - 10^{- 9} C < I \leq 10^{- 9} C \\ f (τ) = 0 e l s e \end{array}$
$\int_{τ = 0}^{t} f (τ) d τ$ where $\{\begin{array}{l} f (τ) = 1 i f T > 35 ° C a n d S O C > 80 % a n d - 10^{- 9} C < I \leq 10^{- 9} C \\ f (τ) = 0 e l s e \end{array}$
$\int_{τ = 0}^{t} f (τ) d τ$ where $\{\begin{array}{l} f (τ) = S O C i f T \leq 15 ° C a n d S O C > 80 % a n d - 10^{- 9} C < I \leq 10^{- 9} C \\ f (τ) = 0 e l s e \end{array}$
$\int_{τ = 0}^{t} f (τ) d τ$ where $\{\begin{array}{l} f (τ) = I i f T \leq 15 ° C a n d S O C \leq 20 % a n d - 10^{- 9} C < I \leq 10^{- 9} C \\ f (τ) = 0 e l s e \end{array}$
$\int_{τ = 0}^{t} f (τ) d τ$ where $\{\begin{array}{l} f (τ) = T i f 15 ° C < T \leq 35 ° C a n d 20 % < S O C \leq 80 % a n d 1 C < I \\ f (τ) = 0 e l s e \end{array}$
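The 3D features exemplified above can be sketched as follows. This is an illustration under assumptions: the function name `features_3d` is hypothetical, the thresholds are those appearing in the five examples (the full set is given in Figure 9 and may differ), the current is expressed as a C-rate, and each time step is attributed to the zone of the sample at its start.

```python
import numpy as np

# Thresholds assumed from the five example features above.
SOC_EDGES = [20.0, 80.0]                        # % -> 3 SOC zones
T_EDGES = [15.0, 35.0]                          # degrees C -> 3 temperature zones
I_EDGES = [-1.0, -0.1, -1e-9, 1e-9, 0.1, 1.0]   # C-rate -> 7 current zones
SHAPE = (3, 3, 7)                               # (SOC zone, T zone, I zone)

def features_3d(t, soc, temp, c_rate):
    """Integrate 1, SOC, T and I over time inside each (SOC, T, I) zone."""
    t, soc, temp, c_rate = (np.asarray(a, dtype=float) for a in (t, soc, temp, c_rate))
    dt = np.diff(t)
    s, T, i = soc[:-1], temp[:-1], c_rate[:-1]
    # Zone index along each axis; intervals are closed on the right
    zs = np.digitize(s, SOC_EDGES, right=True)
    zt = np.digitize(T, T_EDGES, right=True)
    zi = np.digitize(i, I_EDGES, right=True)
    flat = np.ravel_multi_index((zs, zt, zi), SHAPE)
    n_zones = SHAPE[0] * SHAPE[1] * SHAPE[2]

    def integrate(f):
        return np.bincount(flat, weights=f * dt, minlength=n_zones).reshape(SHAPE)

    return {
        "time": integrate(np.ones_like(dt)),
        "soc": integrate(s),
        "temp": integrate(T),
        "current": integrate(i),
    }

# Hypothetical record: two days at rest (cold, high SOC), then one day of fast charging
feats = features_3d(
    t=[0.0, 1.0, 2.0, 3.0],
    soc=[90.0, 90.0, 50.0, 50.0],
    temp=[10.0, 10.0, 25.0, 25.0],
    c_rate=[0.0, 0.0, 2.0, 2.0],
)
```

Here `feats["time"][2, 0, 3]` plays the role of the first example feature (cold, high SOC, at rest) and `feats["temp"][1, 1, 6]` the role of the fifth one. With these assumed thresholds there are 63 zones and four integrands per zone, which illustrates why the 3D input is high-dimensional.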

The ML models illustrated in Figure 6, Figure 7 and Figure 8 are characterized by their hyperparameters, which include the number of layers, the number of neurons per layer, or the activation functions employed. Hyperparameters differ from the weights and biases of the model because they remain unchanged during the model’s training phase. Numerous techniques exist to select the set of hyperparameters that yields the best performance on a given dataset [36]. In the models presented in Section 4, the hyperparameters were selected using a trial-and-error method, which involved iteratively tuning the model’s hyperparameters. The other techniques mentioned, like Bayesian optimization or the hyperband, are more automatic. Their core idea is to conduct meta-optimization to adapt the hyperparameters of a model to a given dataset. This is referred to as meta-optimization because it is the optimization of a process that includes an optimization (in the context of machine learning, the included optimization mentioned consists of training the models). In this work, the aim was to propose architectures that are general enough to be used on any database, so they must be well-adapted to working with several databases. This task would, thus, consist of a multi-objective meta-optimization problem. Multi-objective optimization often results in a set of optimal solutions, known as a Pareto set. Having this goal in mind, the trial-and-error approach was chosen because it enables the authors to see the different optima and choose the best trade-off, rather than having an automatic method that yields only one optimum. All models presented in this work were trained and utilized on a Dell Precision 7530 laptop, which was equipped with an Intel Core i7 8850H CPU. The training durations for these models were under 10 s, while the calculations of the capacity losses on the training data took less than 0.2 s for all models. 
Due to the data augmentation technique implemented for the autoencoders, their training durations were longer: approximately 45 s for the EVERLASTING dataset and around 25 s for the Bills dataset.
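The notion of a Pareto set mentioned above can be made concrete with a short sketch. It assumes that each candidate hyperparameter configuration has been scored by one error value per database (lower is better); the function name `pareto_set` and the error values are purely hypothetical and are not results from this study.

```python
import numpy as np

def pareto_set(errors):
    """Return the indices of the non-dominated rows of `errors`.

    errors[i, j] is the error of configuration i on database j. A configuration
    is dominated if another one is at least as good on every database and
    strictly better on at least one.
    """
    errors = np.asarray(errors, dtype=float)
    keep = []
    for i, e in enumerate(errors):
        no_worse = np.all(errors <= e, axis=1)
        strictly_better = np.any(errors < e, axis=1)
        if not np.any(no_worse & strictly_better):
            keep.append(i)
    return keep

# Hypothetical errors (in %) of four configurations on two databases
candidate_errors = [
    [1.0, 3.0],
    [1.5, 2.5],
    [1.6, 3.1],  # worse than the first configuration on both databases
    [0.9, 3.6],
]
front = pareto_set(candidate_errors)
```

In this example three configurations remain on the front, and choosing among them is the trade-off that the trial-and-error approach leaves to the modeler.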

These model architectures and these model inputs could be used to estimate other battery-aging-related information such as the internal resistance, the remaining useful life, or even the loss of active materials and the loss of lithium inventory. In the present work, only the capacity loss was estimated.

4. Results and Discussion

In this section, the developed models are compared using performance metrics. The metrics utilized were the mean absolute error (MAE), the root mean squared error (RMSE), and the coefficient of determination (R²). Both the MAE and RMSE are expressed in the same units as the capacity loss, namely as a percentage of the initial capacity. The mathematical definitions of these metrics are as follows:

$$\mathrm{MAE} = \frac{1}{N} \sum_{k = 1}^{N} \left| y_{k} - \hat{y}_{k} \right| \tag{11}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{k = 1}^{N} \left( y_{k} - \hat{y}_{k} \right)^{2}} \tag{12}$$

$$R^{2} = 1 - \frac{\sum_{k = 1}^{N} \left( y_{k} - \hat{y}_{k} \right)^{2}}{\sum_{k = 1}^{N} \left( y_{k} - \bar{y} \right)^{2}} \tag{13}$$

where $N$ is the number of Q_loss measurements, $y_{k}$ is the measured Q_loss, $\hat{y}_{k}$ is the estimated Q_loss, and $\bar{y}$ is the average of all the Q_loss measurements on which R² is calculated. These metrics are presented for both the training and the testing datasets. The objective was to develop models that are capable of predicting battery aging under conditions that differ from those in the training data. Therefore, only the metrics from the test data were utilized for model comparison.
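As an illustration, Equations (11) to (13) translate directly into NumPy. The function names and the sample values below are hypothetical; the values are chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error, Equation (11)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    """Root mean squared error, Equation (12)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.sqrt(np.mean((y - y_hat) ** 2))

def r2(y, y_hat):
    """Coefficient of determination, Equation (13)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical capacity losses in % of initial capacity
y_meas = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
scores = (mae(y_meas, y_pred), rmse(y_meas, y_pred), r2(y_meas, y_pred))
```

For these values, a single error of 2% among four points gives MAE = 0.5%, RMSE = 1% and R² = 0.2, which shows how much more strongly the RMSE and R² penalize one large deviation than the MAE does.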

4.1. Aging Models on the EVERLASTING Dataset

In this section, the performance of the models on the EVERLASTING database is assessed. The results of the different models are presented in Figure 10 and Figure 11. Figure 10 shows the parity plots of the different models. The x-axes of these plots represent the capacity loss measurements, and the y-axes represent the corresponding capacity loss estimated by the model. The identity function $y = x$ is represented on the graphs because, with a perfect model, all points would be on that line. Figure 11 depicts the metrics associated with each model using the EVERLASTING data. Using this database, the 3D model with an untrained layer (NN₂) outperformed all other models, achieving R²_test = 0.9. The other two ML models yielded similar results, which were superior to those of the empirical models. The commercial model was calibrated using the calendar and constant current data from the training database, as it could not be optimized for driving profiles. Its results surpassed those of the basic model. As expected, the basic model exhibited poor predictive capability; its parity plot revealed horizontal lines where the model predicted similar values for capacity measurements taken on cells of similar ages, despite their differing capacity losses. Figure 10d, for the commercial model, displays more test data points than the other panels because the other models were trained on driving profiles, which the commercial model could not use for calibration. Because the conditions differ between the training and the testing sets, the results reflect the models’ ability to generalize. As a result, the ML models performed slightly worse on the test data than on the training data, although their overall performance demonstrates that they effectively captured the degradation of the cells. The empirical models were more constrained, possessing a predefined aging profile and fewer degrees of freedom. The authors assume that this explains the slight improvement observed in the commercial model between the training and testing data.

4.2. Aging Models on the Bills Dataset

The models described in Section 3.2 and Section 3.3 were implemented on the Bills et al. database. These models were applied to the dataset using the cells VAH02, VAH10, VAH15, VAH22, and VAH28 as test data, while the remaining 17 cells served as training data. The resulting parity plots are presented in Figure 12 and the corresponding performance metrics are shown in Figure 13. The results indicate that all models exhibited an R²_test value between 0.92 and 0.96, suggesting similar performance and good approximations of capacity loss within this database. However, the model utilizing the auto-encoder performed slightly worse on both the training and test sets compared to the other two ML models. Although the differences were minor, the NN₁ model demonstrated the best performance on both training and test data. Notably, NN₁ was the only ML model for which performance did not decrease when generalizing to the test data. In contrast, the empirical models demonstrated a slight improvement in performance from the training data to the test data.

All models demonstrated lower performance on the EVERLASTING data compared to the Bills data. This outcome was to be expected, as the EVERLASTING database exhibits greater variability in the graph of Q_loss versus time than the Bills database (see Figure 2 and Figure 4). The lower variability of the Bills data also explains the similar performance of all models on that database. In short, all models yielded comparable results on the less diverse database, while the proposed machine learning models achieved better performance than the empirical models on the more diverse EVERLASTING dataset. This suggests that the machine learning models possess a superior ability to capture battery degradation.

4.3. Overall Comparison of the Models

A metric can be derived from the results obtained from the test data in both databases. This metric is the total RMSE. The definition of this metric, $\mathrm{RMSE}_{tot}$, and its relationship with the previously mentioned RMSE are shown in the following equations:

$$\mathrm{RMSE}_{tot} = \sqrt{\frac{\sum_{\text{all test measurements}} \left( y - \hat{y} \right)^{2}}{N_{tot}}} \tag{14}$$

$$\mathrm{RMSE}_{tot} = \sqrt{\frac{\sum_{\text{Bills test}} \left( y - \hat{y} \right)^{2} + \sum_{\text{EVERLASTING test}} \left( y - \hat{y} \right)^{2}}{N_{tot}}} \tag{15}$$

$$\mathrm{RMSE}_{tot} = \sqrt{\frac{\mathrm{RMSE}_{1}^{2} \times N_{1} + \mathrm{RMSE}_{2}^{2} \times N_{2}}{N_{1} + N_{2}}} \tag{16}$$

In these equations, $y$ is the measured Q_loss, $\hat{y}$ is the estimated Q_loss, $N_{tot}$ is the total number of capacity losses considered, $\mathrm{RMSE}_{1}$ is the RMSE of the test data from Bills, $\mathrm{RMSE}_{2}$ is the RMSE of the test data from EVERLASTING, $N_{1}$ is the number of capacity losses in the test data from Bills, and $N_{2}$ is the number of capacity losses in the test data from EVERLASTING.
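The equivalence between Equation (14) and Equation (16) can be checked numerically with a small sketch. The function name `pooled_rmse` and the residuals below are hypothetical and do not correspond to the test sets of this study.

```python
import math

def pooled_rmse(rmse_1, n_1, rmse_2, n_2):
    """Total RMSE over two test sets from their individual RMSEs, Equation (16)."""
    return math.sqrt((rmse_1 ** 2 * n_1 + rmse_2 ** 2 * n_2) / (n_1 + n_2))

def rmse(residuals):
    """RMSE computed directly from a list of residuals y - y_hat."""
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))

# Hypothetical residuals (in %) for two test sets of different sizes
res_1 = [1.0, -1.0]
res_2 = [2.0, 2.0, 2.0]

direct = rmse(res_1 + res_2)                                      # Equation (14)
pooled = pooled_rmse(rmse(res_1), len(res_1), rmse(res_2), len(res_2))  # Equation (16)
```

Both routes give the same value, the square root of 14/5. Because each RMSE is weighted by its number of measurements, the larger test set has more influence on the total score.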

When applied to a model, this metric corresponds to the accuracy of the model over every case in which it has been tested. This overall score allows the models to be compared. Figure 14 shows a comparison of the five models considered in this study. The figure shows that the best-ranking model is the model with the untrained layer (NN₂), inspired by extreme learning machines, closely followed by the 1D model (NN₁). The 3D model involving the autoencoder (NN₃) is less accurate, although it is more accurate than the commercial model. The basic model is ranked last.

A possible explanation for the results obtained from each of the proposed machine learning models is that the 3D inputs contain more information relevant to understanding battery aging than the 1D inputs. This difference may account for the superior performance of NN₂ compared to NN₁, particularly since the 1D inputs can be derived from the 3D inputs, while the reverse is not feasible. To explain the lower performance of NN₃, one hypothesis is that the variance in the aging data that the encoder passes on to the network does not include all the information in the 3D inputs that relates to capacity loss, and therefore not all the information that NN₂ uses to calculate the capacity loss. Another potential explanation is that this variance is compressed into too few variables, preventing the network that follows the encoder from accurately interpreting the information drawn from the latent space.

5. Conclusions

In conclusion, three machine learning frameworks were presented and applied to two distinct datasets. When used without modifications, these frameworks delivered good results and demonstrated robustness, highlighting their reliability. The proposed models outperformed the commercial model on the EVERLASTING dataset. In contrast, all models exhibited similar performances on the Bills dataset. When calculating the overall RMSE across all test data points from both databases, the one-dimensional model and the model inspired by an extreme learning machine performed best, followed by the other 3D model, which outperformed the two existing methods: the commercial model and the basic model. The basic model had the largest overall error.

The proposed method for constructing zones in the battery’s stress factor space considers the integral over time of several functions as the inputs for machine learning models to predict battery degradation. This method has been validated on two databases, using the same thresholds to separate the stress factors. Such general approaches have not been applied to these datasets in the existing literature. In addition, novel features were incorporated, autoencoders were utilized, and the obtained results were compared with those of existing models, which distinguishes this work from previous studies. The datasets included various temperatures and aging profiles, such as calendar aging, constant current cycling data, and profiles inspired by real-world scenarios, including driving profiles and eVTOL missions. The proposed ML approaches yielded competitive or superior results compared to the commercial model on both datasets, demonstrating their versatility.

Several perspectives can be considered for this work. These include predicting, alongside capacity loss, the increase in internal resistance of the cells, the loss of active material at each electrode, and the loss of lithium inventory. All these predictions could use the same model inputs and architectures. Another perspective is to compare multilayer perceptrons with recurrent neural networks.

Author Contributions

Conceptualization, Q.M., G.D., A.L., R.M. and P.V.; data curation, Q.M.; methodology, Q.M.; software, Q.M.; validation, Q.M.; formal analysis, Q.M.; investigation, Q.M.; writing—original draft preparation, Q.M.; writing—review and editing, G.D., A.L., R.M. and P.V.; visualization, Q.M.; supervision, G.D., A.L., R.M. and P.V.; project administration, Q.M., G.D., A.L., R.M. and P.V.; funding acquisition, G.D., A.L., R.M. and P.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Siemens Industries Software, with subsidies from the Association Nationale de la Recherche et de la Technologie (ANRT) under a CIFRE convention.

Data Availability Statement

Data from the Bills dataset are available here: https://figshare.com/articles/dataset/eVTOL_Battery_Dataset/14226830 (accessed on 13 October 2024). Data from the EVERLASTING dataset are accessible through the following links: https://data.4tu.nl/datasets/e42bca59-f1dd-495a-92c9-8b01d6b64040 (accessed on 13 October 2024), https://data.4tu.nl/datasets/e19fe272-4f46-450c-9125-6545c4c1a98b (accessed on 13 October 2024), https://data.4tu.nl/collections/EVERLASTING_Electric_Vehicle_Enhanced_Range_Lifetime_And_Safety_Through_INGenious_battery_management_/5065445 (accessed on 13 October 2024). Data can also be made available upon request.

Conflicts of Interest

Authors Quentin Mayemba and An Li were employed by the company Siemens Digital Industries Software. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Peng, J.; Meng, J.; Chen, D.; Liu, H.; Hao, S.; Sui, X.; Du, X. A Review of Lithium-Ion Battery Capacity Estimation Methods for Onboard Battery Management Systems: Recent Progress and Perspectives. Batteries 2022, 8, 229.
2. Mayemba, Q.; Mingant, R.; Li, A.; Ducret, G.; Venet, P. Aging datasets of commercial lithium-ion batteries: A review. J. Energy Storage 2024, 83, 110560.
3. University of Liverpool. Lithium Cobalt Oxide–LiCoO2–Conduction Animation. Available online: https://www.chemtube3d.com/lib_lco-2/ (accessed on 10 April 2024).
4. Maher, K.; Yazami, R. A study of lithium ion batteries cycle aging by thermodynamics techniques. J. Power Sources 2014, 247, 527–533.
5. McBrayer, J.D.; Rodrigues, M.-T.F.; Schulze, M.C.; Abraham, D.P.; Apblett, C.A.; Bloom, I.; Carroll, G.M.; Colclasure, A.M.; Fang, C.; Harrison, K.L.; et al. Calendar aging of silicon-containing batteries. Nat. Energy 2021, 6, 866–872.
6. Pelletier, S.; Jabali, O.; Laporte, G.; Veneroni, M. Battery degradation and behaviour for electric vehicles: Review and numerical analyses of several models. Transp. Res. Part B Methodol. 2017, 103, 158–187.
7. Fath, J.P.; Dragicevic, D.; Bittel, L.; Nuhic, A.; Sieg, J.; Hahn, S.; Alsheimer, L.; Spier, B.; Wetzel, T. Quantification of aging mechanisms and inhomogeneity in cycled lithium-ion cells by differential voltage analysis. J. Energy Storage 2019, 25, 100813.
8. Ahn, Y.; Jo, Y.N.; Cho, W.; Yu, J.-S.; Kim, K.J. Mechanism of Capacity Fading in the LiNi₀.₈Co₀.₁Mn₀.₁O₂ Cathode Material for Lithium-Ion Batteries. Energies 2019, 12, 1638.
9. Chen, C.-F.; Barai, P.; Mukherjee, P.P. An overview of degradation phenomena modeling in lithium-ion battery electrodes. Curr. Opin. Chem. Eng. 2016, 13, 82–90.
10. Reniers, J.M.; Mulder, G.; Howey, D.A. Review and Performance Comparison of Mechanical-Chemical Degradation Models for Lithium-Ion Batteries. J. Electrochem. Soc. 2019, 166, A3189–A3200.
11. Rowden, B.; Garcia-Araez, N. A review of gas evolution in lithium ion batteries. Energy Rep. 2022, 6, 10–18.
12. Kabir, M.M.; Demirocak, D.E. Degradation mechanisms in Li-ion batteries: A state-of-the-art review. Int. J. Energy Res. 2017, 41, 1963–1986.
13. Gao, H.; Yan, Q.; Holoubek, J.; Yin, Y.; Bao, W.; Liu, H.; Baskin, A.; Li, M.; Cai, G.; Li, W.; et al. Enhanced Electrolyte Transport and Kinetics Mitigate Graphite Exfoliation and Li Plating in Fast-Charging Li-Ion Batteries. Adv. Energy Mater. 2023, 13, 2202906.
14. Winter, M.; Barnett, B.; Xu, K. Before Li Ion Batteries. Chem. Rev. 2018, 118, 11433–11456.
15. Zhang, Z.; Yang, J.; Huang, W.; Wang, H.; Zhou, W.; Li, Y.; Li, Y.; Xu, J.; Huang, W.; Chiu, W.; et al. Cathode-Electrolyte Interphase in Lithium Batteries Revealed by Cryogenic Electron Microscopy. Matter 2021, 4, 302–312.
16. Guo, L.; Thornton, D.B.; Koronfel, M.A.; Stephens, I.E.L.; Ryan, M.P. Degradation in lithium ion battery current collectors. J. Phys. Energy 2021, 3, 32015.
17. Vetter, J.; Novák, P.; Wagner, M.R.; Veit, C.; Möller, K.-C.; Besenhard, J.O.; Winter, M.; Wohlfahrt-Mehrens, M.; Vogler, C.; Hammouche, A. Ageing mechanisms in lithium-ion batteries. J. Power Sources 2005, 147, 269–281.
18. Chae, S.; Kim, N.; Ma, J.; Cho, J.; Ko, M. One-to-One Comparison of Graphite-Blended Negative Electrodes Using Silicon Nanolayer-Embedded Graphite versus Commercial Benchmarking Materials for High-Energy Lithium-Ion Batteries. Adv. Energy Mater. 2017, 7, 15.
19. Lin, C.; Tang, A.; Mu, H.; Wang, W.; Wang, C. Aging Mechanisms of Electrode Materials in Lithium-Ion Batteries for Electric Vehicles. J. Chem. 2015, 2015, 104673.
20. Yang, R.; Yu, G.; Wu, Z.; Lu, T.; Hu, T.; Liu, F.; Zhao, H. Aging of lithium-ion battery separators during battery cycling. J. Energy Storage 2023, 63, 107107.
21. Wheeler, W.; Venet, P.; Bultel, Y.; Sari, A.; Riviere, E. Aging in First and Second Life of G/LFP 18650 Cells: Diagnosis and Evolution of the State of Health of the Cell and the Negative Electrode under Cycling. Batteries 2024, 10, 137.
22. Abada, S.; Petit, M.; Lecocq, A.; Marlair, G.; Sauvant-Moynot, V.; Huet, F. Combined experimental and modeling approaches of the thermal runaway of fresh and aged lithium-ion batteries. J. Power Sources 2018, 399, 264–273.
23. O’Kane, S.E.J.; Ai, W.; Madabattula, G.; Alonso-Alvarez, D.; Timms, R.; Sulzer, V.; Edge, J.S.; Wu, B.; Offer, G.J.; Marinescu, M. Lithium-ion battery degradation: How to model it. Phys. Chem. Chem. Phys. 2022, 24, 7909–7922.
24. Lopetegi, I.; Plett, G.L.; Trimboli, M.S.; Yeregui, J.; Oca, L.; Rojas, C.; Miguel, E.; Iraola, U. Lithium-ion Battery Aging Prediction with Electrochemical Models: P2D vs. SPMe. In Proceedings of the 2023 IEEE Vehicle Power and Propulsion Conference (VPPC), Milan, Italy, 24–27 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–7.
25. Mingant, R.; Petit, M.; Belaïd, S.; Bernard, J. Data-driven model development to predict the aging of a Li-ion battery pack in electric vehicles representative conditions. J. Energy Storage 2021, 39, 102592.
26. Vermeer, W.; Chandra Mouli, G.R.; Bauer, P. A Comprehensive Review on the Characteristics and Modeling of Lithium-Ion Battery Aging. IEEE Trans. Transp. Electrif. 2022, 8, 2205–2232.
27. von Bülow, F.; Meisen, T. A review on methods for state of health forecasting of lithium-ion batteries applicable in real-world operational conditions. J. Energy Storage 2023, 57, 105978.
28. Sui, X.; He, S.; Vilsen, S.B.; Meng, J.; Teodorescu, R.; Stroe, D.-I. A review of non-probabilistic machine learning-based state of health estimation techniques for Lithium-ion battery. Appl. Energy 2021, 300, 117346.
29. Guo, W.; Sun, Z.; Vilsen, S.B.; Meng, J.; Stroe, D.I. Review of “grey box” lifetime modeling for lithium-ion battery: Combining physics and data-driven methods. J. Energy Storage 2022, 56, 105992.
30. Khaleghi, S.; Hosen, M.S.; van Mierlo, J.; Berecibar, M. Towards machine-learning driven prognostics and health management of Li-ion batteries. A comprehensive review. Renew. Sustain. Energy Rev. 2024, 192, 114224.
31. Cao, L.; Xu, R.; Bi, Y. Research on Life Prediction of Lithium-ion Battery based on WEMD-ARIMA Model. In Proceedings of the 2022 34th Chinese Control and Decision Conference (CCDC), Hefei, China, 15–17 August 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 2774–2779.
32. Jorge, I.; Mesbahi, T.; Samet, A.; Boné, R. Time Series Feature extraction for Lithium-Ion batteries State-Of-Health prediction. J. Energy Storage 2023, 59, 106436.
33. Lombardo, T.; Duquesnoy, M.; El-Bouysidy, H.; Årén, F.; Gallo-Bueno, A.; Jørgensen, P.B.; Bhowmik, A.; Demortière, A.; Ayerbe, E.; Alcaide, F.; et al. Artificial Intelligence Applied to Battery Research: Hype or Reality? Chem. Rev. 2022, 122, 10899–10969.
34. Nuhic, A.; Terzimehic, T.; Soczka-Guth, T.; Buchholz, M.; Dietmayer, K. Health diagnosis and remaining useful life prognostics of lithium-ion batteries using data-driven methods. J. Power Sources 2013, 239, 680–688.
35. You, G.; Park, S.; Oh, D. Real-time state-of-health estimation for electric vehicle batteries: A data-driven approach. Appl. Energy 2016, 176, 92–103.
36. Richardson, R.R.; Osborne, M.A.; Howey, D.A. Battery health prediction under generalized conditions using a Gaussian process transition model. J. Energy Storage 2019, 23, 320–328.
37. Song, L.; Zhang, K.; Liang, T.; Han, X.; Zhang, Y. Intelligent state of health estimation for lithium-ion battery pack based on big data analysis. J. Energy Storage 2020, 32, 101836.
38. Greenacre, M.; Groenen, P.J.F.; Hastie, T.; D’Enza, A.I.; Markos, A.; Tuzhilina, E. Principal component analysis. Nat. Rev. Methods Primers 2020, 2, 100.
39. von Bülow, F.; Mentz, J.; Meisen, T. State of health forecasting of Lithium-ion batteries applicable in real-world operational conditions. J. Energy Storage 2021, 44, 103439.
40. Severson, K.A.; Attia, P.M.; Jin, N.; Perkins, N.; Jiang, B.; Yang, Z.; Chen, M.H.; Aykol, M.; Herring, P.K.; Fraggedakis, D.; et al. Data-driven prediction of battery cycle life before capacity degradation. Nat. Energy 2019, 4, 383–391.
41. Yang, L.; Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 2020, 415, 295–316.
42. Zhang, Y.; Wik, T.; Bergström, J.; Pecht, M.; Zou, C. A machine learning-based framework for online prediction of battery ageing trajectory and lifetime using histogram data. J. Power Sources 2022, 526, 231110.
43. Attia, P.M.; Grover, A.; Jin, N.; Severson, K.A.; Markov, T.M.; Liao, Y.-H.; Chen, M.H.; Cheong, B.; Perkins, N.; Yang, Z.; et al. Closed-loop optimization of fast-charging protocols for batteries with machine learning. Nature 2020, 578, 397–402.
44. NASA. Available online: https://papers.phmsociety.org/index.php/phmconf/article/view/2490 (accessed on 3 April 2024).
45. Greenbank, S.; Howey, D.A. Piecewise-linear modelling with automated feature selection for Li-ion battery end-of-life prognosis. Mech. Syst. Signal Process. 2023, 184, 109612.
46. Attia, P.M.; Bills, A.; Brosa Planella, F.; Dechent, P.; dos Reis, G.; Dubarry, M.; Gasper, P.; Gilchrist, R.; Greenbank, S.; Howey, D.; et al. Review—“Knees” in Lithium-Ion Battery Aging Trajectories. J. Electrochem. Soc. 2022, 169, 60517.
47. Trad, K.; Govindarajan, J. D2.3-Report Containing Aging Test Profiles and Test Results; EVERLASTING: Yakima, WA, USA, 2020; Available online: https://everlasting-project.eu/wp-content/uploads/2020/03/EVERLASTING_D2.3_final_20200228.pdf (accessed on 13 October 2024).
48. Bills, A.; Viswanathan, V.; Sripad, S.; Frank, E.; Charles, D.; Fredericks, W.L. eVTOL Battery Dataset; Carnegie Mellon University: Pittsburgh, PA, USA, 2023.
49. dos Reis, G.; Strange, C.; Yadav, M.; Li, S. Lithium-ion battery data and where to find it. Energy AI 2021, 5, 100081.
50. Hassini, M.; Redondo-Iglesias, E.; Venet, P. Lithium–Ion Battery Data: From Production to Prediction. Batteries 2023, 9, 385.
51. Yang, X.-G.; Liu, T.; Ge, S.; Rountree, E.; Wang, C.-Y. Challenges and key requirements of batteries for electric vertical takeoff and landing aircraft. Joule 2021, 5, 1644–1659.
52. CC BY. Available online: https://creativecommons.org/licenses/by/4.0/ (accessed on 13 October 2024).
53. Siemens Digital Industries Software. Simcenter AMESim V2310 (Advanced Modeling Environment for Performing Simulations); Siemens Digital Industries Software: Plano, TX, USA, 2023.
54. Jafari, M.; Khan, K.; Gauchia, L. Deterministic models of Li-ion battery aging: It is a matter of scale. J. Energy Storage 2018, 20, 67–77.
55. Huang, G.-B.; Zhu, Q.-Y.; Siew, C.-K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501.
56. Wang, J.; Lu, S.; Wang, S.-H.; Zhang, Y.-D. A review on extreme learning machine. Multimed. Tools Appl. 2022, 81, 41611–41660.
57. Michelucci, U. An Introduction to Autoencoders. arXiv 2022, arXiv:2201.03898.
58. Li, P.; Pei, Y.; Li, J. A comprehensive survey on design and application of autoencoder in deep learning. Appl. Soft Comput. 2023, 138, 110176.

Figure 1. Schematic representation of some aging phenomena occurring in a lithium-ion cell. In the figure, the red disks correspond to lithium atoms and lithium ions, the grey disks correspond to carbon atoms, the purple disks correspond to cobalt atoms, and the green disks correspond to oxygen atoms.

Figure 2. Capacity loss versus time in the EVERLASTING database. Half of the test conditions were never seen in the training data (considering the SOC, charge, and discharge currents not represented here). The other half correspond to conditions that can be found in the training data but were experienced by cells that were not in the training data.

Figure 3. Baseline profile of the Bills et al. data. The graph was obtained from cell VAH27. The color corresponds to what is being imposed on the cell (i.e., the CC charge imposes the charge current, the CV charge imposes the charge voltage, and the discharge profile is imposed in power over a given time).

Figure 4. Capacity losses according to conditions in the Bills database.

Figure 5. The two data transformation approaches and the three model architectures (neural networks). The graphs shown are for the VAH25 cell in the Bills dataset. The 3D plot represents all measurements taken on this cell, and those that satisfy the zone defined on the right are represented in red, whereas the others are in blue. The 3D models are represented in green and the 1D model is represented in blue.

Figure 6. Neural network architecture 1 (NN₁): the model from the 1D data.

Figure 7. Neural network architecture 2 (NN₂): the model with an untrained layer.

Figure 8. Neural network architecture 3 (NN₃): the model with the encoder. The activation functions of the green layers were the hyperbolic tangent.

Figure 9. The thresholds applied in this study.

Figure 10. Parity plots of the capacity loss of (a) the 1D model (NN₁), (b) the model with an untrained layer (NN₂), (c) the model with the auto-encoders (NN₃), (d) the commercial model, and (e) the basic model on the EVERLASTING data.

Figure 11. Comparison of the performance metrics for the different models on the EVERLASTING database.

Figure 12. Parity plots of the capacity loss of (a) the 1D model (NN₁), (b) the model with an untrained layer (NN₂), (c) the model with the auto-encoders (NN₃), (d) the commercial model, and (e) the basic model on the Bills data.

Figure 13. Histogram of the performances of the models on the Bills dataset.

Figure 14. Overall model comparison.

Table 1. Lithium-ion accumulator’s degradation phenomena, adapted from Ref. [2].

Number	Phenomenon	Reference
1	Solid Electrolyte Interphase (SEI) growth, which can be considered the most important degradation phenomenon (the composition and behavior are different in silicon-containing negative electrodes)	[4,5]
2	Lithium plating and dendrite growth	[6,7]
3	Particle cracking, which is due to the volume changes during lithiation	[7,8,9]
4	Gas bubble formation and electrolyte drying (mainly the formation of H₂, due to the decomposition of the organic molecules in the electrolyte)	[7,10,11]
5	Structural changes in the active material, which occur when its crystal structure changes and lithium can no longer be inserted	[12]
6	Transition metal dissolution (mainly Ni, Co, and Mn dissolving into the electrolyte)	[8]
7	Graphite exfoliation and solvent co-intercalation, which occur when the electrolyte solvent inserts itself into the graphite along with the lithium ions and separates the graphite sheets	[13,14]
8	The growth of the positive electrode–electrolyte interface	[15]
9	The corrosion or dissolution of the current collectors	[16]
10	The loss of electric contact	[12,17,18]
11	The decomposition of the binders	[19]
12	The decomposition of the electrolyte	[12]
13	The degradation of the separator	[20]

Table 2. Summary of previous, similar approaches.

Study	Dataset	Open-Source Data?	Method to Fix Thresholds	Inputs	Outputs	Model	Temperature
Nuhic 2013 [34]	5 batteries, no reference to another work or the data publication	No	Not shared	2D inputs (I&T and SOC&T) and rainflow counting of the SOC	$Δ Q$	SVR	Yes
You 2016 [35]	Private	No	K-Means clustering	Density of the 3D zones of (I, T, SOC)	SOH	SVM, ANN	Yes
Richardson 2019 [36]	NASA Randomized Battery Dataset	Yes	Arbitrary (Proposed 1D data but did not implement it)	Ah-throughput, $Δ t$ , and time.	$Δ Q$	GPR	Proposed to consider it but did not implement
Song 2020 [37]	BHEV and BEV data from real vehicles	No	Arbitrary	Principal components of the mileage and the 1D distribution of current, SOC, and temperature	Q	ANN	Yes
von Bülow 2021 [39]	Severson [40]	Yes	Arbitrary	2D, 3D	$Δ SOH$	ANN and GPR	Yes
Zhang 2022 [42]	Severson [40], Attia [43], NASA randomized battery dataset [44], and a dataset based on actual measurements from plug-in hybrid electric vehicles (PHEV)	Partly	Not shared	Derived from the histogram data	$Δ Q_{k\ \text{cycles ahead}}$	Random forest regression, support vector regression, ANN, and GPR	Yes
Greenbank 2023 [45]	Severson [40], Attia [43]	Yes	Based on the distribution	Time spent (1D)	$Q_{\text{loss},\, t+12\,\text{h}}$	Piecewise linear model, Gaussian process regression	Yes
Present work	EVERLASTING [47], Bills [48]	Yes	Arbitrary	Time spent (1D) and time integrals (3D)	$Q_{\text{loss}}$	ANN, ELM, auto-encoders	Yes
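
The "time spent" inputs listed in Table 2 and illustrated in Figure 5 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the threshold values, and the sample data are hypothetical, and the 3D feature is simplified here to the time spent inside one box-shaped zone of (current, temperature, SOC).

```python
import numpy as np

def time_spent_1d(values, dt, thresholds):
    """Time spent in each interval delimited by ascending `thresholds`.

    Returns len(thresholds) + 1 durations: below the first threshold,
    between consecutive thresholds, and at or above the last one.
    """
    idx = np.digitize(values, thresholds)  # interval index of each sample
    return np.bincount(idx, weights=dt, minlength=len(thresholds) + 1)

def time_spent_in_zone(current, temperature, soc, dt, zone):
    """Time spent inside one 3D zone given as (low, high) bounds per variable."""
    (i_lo, i_hi), (t_lo, t_hi), (s_lo, s_hi) = zone
    inside = (
        (current >= i_lo) & (current < i_hi)
        & (temperature >= t_lo) & (temperature < t_hi)
        & (soc >= s_lo) & (soc < s_hi)
    )
    return float(dt[inside].sum())

# Hypothetical measurements: current (A), temperature (°C), SOC (-), step (s)
current = np.array([0.5, 0.5, -1.0, -1.0, 0.0])
temperature = np.array([25.0, 26.0, 30.0, 31.0, 25.0])
soc = np.array([0.2, 0.4, 0.9, 0.8, 0.8])
dt = np.array([10.0, 10.0, 5.0, 5.0, 20.0])

# Hypothetical thresholds, not the ones shown in Figure 9
spent_current = time_spent_1d(current, dt, [0.0])
spent_temperature = time_spent_1d(temperature, dt, [27.0])
zone_time = time_spent_in_zone(
    current, temperature, soc, dt,
    zone=((-2.0, 0.0), (27.0, 40.0), (0.5, 1.0)),
)
```

The 1D vectors for each variable would then be concatenated into a single input vector, whereas the 3D variant produces one value per zone.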

Table 3. Number of cells for life cycle conditions using driving profiles in EVERLASTING, sourced from Ref. [2].

Temperature	I Profile 70–90% SOC	I Profile 10–90% SOC	P Profile 10–90% SOC
0 °C	2	2
10 °C	2	2
25 °C	2	2	2
45 °C	2	2

Table 4. Constant DC cycle aging conditions in EVERLASTING, sourced from Ref. [2].

Temperature	Charge C-Rate	Discharge C-Rate	Number of Cells
0 °C	0.5	1.5	2
0 °C	1	1.5	2
10 °C	0.5	1.5	2
10 °C	0.5	0.5	2
10 °C	0.5	3	2
10 °C	1	1.5	2
25 °C	0.5	1.5	2
25 °C	0.5	0.5	2
25 °C	0.5	3	2
25 °C	1	1.5	2
45 °C	0.5	1.5	2
45 °C	0.5	0.5	2
45 °C	0.5	3 ^(a)	2
45 °C	1	1.5	2

^(a): These cells could not cycle because the temperature (45 °C) and the discharge C-rate (3C) caused the cells to heat up to 55 °C too rapidly. Only one capacity measurement was available under these conditions.

Table 5. Number of cells under calendar aging conditions in EVERLASTING, sourced from Ref. [2].

Temperature \ SOC	10%	70%	90%
0 °C	2	2	2
10 °C	2	2	2
25 °C	2	2	2
45 °C	2	2	2

Table 6. Baseline mission parameters from Ref. [48], used under the license CC BY [52].

Phase	Definition	End Criteria
Take-off	P = 54 W	t = 75 s
Cruise	P = 16 W	t = 800 s
Landing	P = 54 W	t = 105 s
Rest 1	I = 0 A	T < 27 °C
CC Charge	I = 1 C	U > 4.2 V
CV Charge	U = 4.2 V	I < C/30
Rest 2	I = 0 A	T < 35 °C
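
As a worked check on Table 6, the energy drawn per mission follows directly from the three constant-power discharge phases. The sketch below is only an illustration: the function name and the `cruise_duration_s` parameter are hypothetical, and it assumes the power setpoints are held exactly for the stated durations.

```python
# Discharge phases of the baseline mission (Table 6): (name, power in W, duration in s)
DISCHARGE_PHASES = [
    ("Take-off", 54.0, 75.0),
    ("Cruise", 16.0, 800.0),
    ("Landing", 54.0, 105.0),
]

def mission_discharge_energy_wh(phases, cruise_duration_s=None):
    """Sum power * duration over the discharge phases and convert J to Wh.

    `cruise_duration_s` optionally overrides the cruise duration, which is
    how the short and extended cruise variants differ from the baseline.
    """
    total_j = 0.0
    for name, power_w, duration_s in phases:
        if name == "Cruise" and cruise_duration_s is not None:
            duration_s = cruise_duration_s
        total_j += power_w * duration_s
    return total_j / 3600.0

baseline_wh = mission_discharge_energy_wh(DISCHARGE_PHASES)
extended_wh = mission_discharge_energy_wh(DISCHARGE_PHASES, cruise_duration_s=1000.0)
print(f"baseline: {baseline_wh:.2f} Wh, extended cruise: {extended_wh:.2f} Wh")
```

The baseline mission therefore corresponds to 22,520 J, or about 6.26 Wh, per discharge, and the 1000 s cruise variant to about 7.14 Wh.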

Table 7. Aging profiles in the Bills et al. dataset [48].

Mission Profile	Cells Impacted	Number of Data Points
Baseline	VAH01, VAH17, VAH27	52
Short cruise (400 s)	VAH12	47
Short cruise (600 s)	VAH13, VAH26	45
Extended cruise (1000 s)	VAH02, VAH15, VAH22	36
10% power reduction during discharge	VAH05, VAH28	30 and 24
20% power reduction during discharge	VAH11	44
CC charge current reduced to C/2	VAH06, VAH24	37
CC charge current brought up to 1.5 C	VAH16, VAH20	24
CV charge voltage reduced to 4.0 V	VAH07	6
CV charge voltage reduced to 4.1 V	VAH23	15
Thermal chamber temperature reduced to 20 °C	VAH09, VAH25	35
Thermal chamber temperature brought up to 30 °C	VAH10	29
Thermal chamber temperature brought up to 35 °C	VAH30	19


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Mayemba, Q.; Ducret, G.; Li, A.; Mingant, R.; Venet, P. General Machine Learning Approaches for Lithium-Ion Battery Capacity Fade Compared to Empirical Models. Batteries 2024, 10, 367. https://doi.org/10.3390/batteries10100367

