Battery State of Health Estimate Strategies: From Data Analysis to End-Cloud Collaborative Framework

Abstract: Lithium-ion batteries have become the primary electrical energy storage device in commercial and industrial applications due to their high energy/power density, high reliability, and long service life. Estimating the state of health (SOH) of batteries is essential to ensure safety, improve energy efficiency, and enhance battery life-cycle management. This paper presents a comprehensive review of SOH estimation methods, including experimental approaches, model-based methods, and machine learning algorithms. A critical and in-depth analysis of the advantages and limitations of each method is presented. The various techniques are systematically classified and compared to facilitate understanding and further research. Furthermore, the paper emphasizes the prospect of using a knowledge graph-based framework for battery data management, multi-model fusion, and a cooperative edge-cloud platform for intelligent battery management systems (BMS).


Introduction
In an effort to combat the harmful effects of climate change and transition to a more sustainable future, achieving carbon neutrality has gained widespread acceptance across the globe [1][2][3][4]. One key strategy is to integrate clean and renewable energy sources into the power grid. Because these sources are intermittent, energy storage devices are critical to bridging the gap between energy production and consumption [5,6]. Lithium-ion batteries are recognized as a preferred option for energy storage due to their high energy/power density, high energy efficiency, long operational lifespan, and safety features. They have also been extensively used in the transportation industry as a power source.
Battery aging is one of the primary challenges hindering the widespread adoption of electric vehicles [7]. Batteries degrade with time and usage, which reduces the system's performance, service life, and safety. The main aging mechanisms have been reviewed in Refs. [8,9]. The state of health (SOH) of a battery, which reflects its ability to store and deliver energy relative to its initial state, is a key indicator of aging. Accurate estimation of the SOH of the battery system is an essential function of the BMS in electric vehicles, since it provides knowledge of battery performance, enables battery fault diagnosis, and helps to achieve an accurate estimation of the battery state of charge (SOC). Additionally, the long-term prediction of performance degradation and remaining useful life (RUL) is highly desirable to guide the management, maintenance, and recycling of the battery throughout its full lifecycle [10,11].
However, accurate estimation of SOH remains a challenging task given the complexity of the internal chemical reactions and the difficulty of obtaining accurate parameters. The assessment of battery aging is primarily based on the analysis of capacity degradation and impedance increase. The primary mechanisms of battery degradation can be categorized into two modes: reduction of lithium inventory caused by side reactions (the formation of solid electrolyte interface (SEI) [12,13], lithium plating [14,15], etc.), and loss of active material in the electrodes (graphite exfoliation, electrode cracking, etc.). Through the use of advanced characterization techniques (e.g., scanning electron microscopy, atomic force microscopy, X-ray computed tomography, etc.) [16], a more comprehensive understanding of the aging process in various types of batteries has been developed [17]. However, accurately characterizing the internal information directly remains challenging; the underlying aging mechanisms need to be inferred through the analysis of external signals (e.g., terminal voltage and current, surface temperature, etc.) collected by the battery management system (BMS).
Considerable efforts are devoted to estimating the SOH to keep track of battery performance, and different methods are deployed to reach an accurate SOH estimation. These techniques mainly fall into three groups: experimental analysis methods, model-based methods, and data-driven methods.
The assessment of battery SOH involves identifying health indicators (HIs) and tracking their changes as the battery ages. Various electrochemical analysis techniques (e.g., ICA, EIS) [18,19] have been used in previous studies on SOH estimation and the identification of battery aging modes. Usually, HIs are first extracted from the raw test data and are then used for the estimation of SOH. The second group of SOH estimation methods relies on numerical models and adaptive filtering, which have shown high accuracy in real-time estimation of the battery's capacity, resistance, and other characteristics. A balance between computational complexity and accuracy is usually required in model-based techniques, especially for onboard BMS implementation. The third group of SOH estimation methods, i.e., machine learning algorithms, benefits from big data analytics, cloud computing, and other innovative technologies such as IoT and AI. These methods have become increasingly popular in this field due to their high potential to significantly enhance the accuracy of SOH estimation and enable optimal management of the entire battery life cycle.
A comprehensive review of SOH estimation methods is presented in this paper, including experiment-based methods, model-based methods, and data-driven methods. The features, advantages, and disadvantages of each method are analyzed. Graph technology, as a prevalent analytical approach, possesses inherent capabilities in representing intricate relationships and managing unstructured data effectively. In the realm of battery research, researchers have applied graph techniques to tasks such as circuit optimization, energy management optimization, and fault diagnosis. Nevertheless, the literature is scant in terms of comprehensive evaluations of graph techniques in this domain. To address this research gap, this study delves into the potential integration of knowledge graphs and graph neural networks in state estimation for battery SOH evaluation. Lastly, the potential of leveraging model fusion technology for real-time vehicle SOH prediction is discussed, with a focus on employing end-cloud fusion battery management technology as the implementation framework [20,21].

Experimental Analysis Methods for SOH Estimation
Laboratory testing is key to the investigation of the battery's aging characteristics and the estimation of SOH. Some experimental analysis methods are challenging to implement on a BMS due to the disparity between laboratory environments and real-life conditions. However, experimental methods are the foundation for model-based and data-based techniques for SOH prediction. Moreover, they are critical in decision-making for lithium-ion battery recycling and second-life reuse [22].
The experimental techniques for battery SOH estimation are divided into direct parameter measurement and indirect parameter estimation techniques based on whether the measured parameter is an electrical signal or not (Figure 1). By processing and analyzing parameters that provide aging information and comparing them with standard values, it becomes possible to quantify the degradation state of the battery, thereby enabling the estimation of battery SOH (Figure 1).

Direct Experimental Methods
There are three direct indexes that characterize battery aging behavior: changes in capacity, internal resistance, and impedance [23,24]. These parameters are directly measurable if the test requirements are met.

Capacity Test Technique
The main characteristic of battery aging is the reduction of capacity [23]. The definition of SOH is
$$\mathrm{SOH} = \frac{Q_M}{Q_{\mathrm{rated}}} \times 100\%$$
where $Q_{\mathrm{rated}}$ is the nominal capacity of the new battery and $Q_M = \int_0^T i(t)\,dt$ is the maximum available capacity of the aged battery.
Ampere-hour (Ah) counting, or current integration, is a common technique for measuring battery capacity as the battery ages, while the maximum initial capacity provided by battery manufacturers is usually taken as the nominal capacity.
A widely used static capacity test procedure for measuring the battery's capacity with aging is specified as follows: 1.
Step 2: CC discharge till 2.5 V at 0.1 C.
The maximum capacity of the battery at this stage of the aging process is calculated by integrating the current over the charge/discharge time. Because it strictly follows the definition of SOH, the result is usually accepted as the reference value against which other SOH estimation methods are verified. However, the method relies on the maximum initial capacity provided by battery manufacturers, whose accuracy is usually difficult to verify [25]. A related study shows that the method has been used to re-calibrate the maximum capacity of an operating battery [26]. This method is easy to implement, but it has a few disadvantages. First, the sampling accuracy and sampling interval directly affect the accuracy, and the current measurement bias error accumulates with time [27]. Second, the method is time-consuming because a complete charge-discharge cycle is required. This additional cycle is usually inconvenient and may cause extra damage to the battery. In addition, the capacity testing technique cannot express transient characteristics. Therefore, some researchers have also calculated SOH by integrating the current over a period of time and dividing the result by the product of the SOC difference and the rated capacity of the battery [28].
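As an illustration of Ah counting and of the partial-window variant described above, the following minimal Python sketch integrates a sampled current trace with the trapezoidal rule. The function names, the use of NumPy, and the sign handling are assumptions made for illustration and are not taken from the cited studies. SOH is returned as a fraction, so 0.9 corresponds to 90%.

```python
import numpy as np

def capacity_ah(current_a, time_s):
    """Integrate current (A) over time (s) with the trapezoidal rule; result in Ah."""
    current_a = np.asarray(current_a, dtype=float)
    time_s = np.asarray(time_s, dtype=float)
    dt = np.diff(time_s)
    coulombs = np.sum(0.5 * (current_a[1:] + current_a[:-1]) * dt)
    # Assumption: the sign convention of the current is ignored, only the
    # magnitude of the transferred charge matters here.
    return abs(coulombs) / 3600.0

def soh_full_cycle(current_a, time_s, q_rated_ah):
    """SOH = Q_M / Q_rated from one complete charge or discharge."""
    return capacity_ah(current_a, time_s) / q_rated_ah

def soh_partial(current_a, time_s, soc_start, soc_end, q_rated_ah):
    """SOH from a partial window: integrated charge / (delta SOC * Q_rated)."""
    delta_soc = abs(soc_end - soc_start)
    if delta_soc == 0.0:
        raise ValueError("SOC window must have non-zero width")
    return capacity_ah(current_a, time_s) / (delta_soc * q_rated_ah)

# Hypothetical example: a 10 h constant-current discharge at 0.225 A
# of a cell rated at 2.5 Ah, sampled every 100 s.
t = np.linspace(0.0, 36000.0, 361)
i = np.full_like(t, -0.225)
print(soh_full_cycle(i, t, q_rated_ah=2.5))
```

The partial-window form depends on the accuracy of the two SOC values, so any SOC error propagates directly into the SOH estimate.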

Ohmic Resistance
Another major characteristic of battery aging is the increase in internal resistance. Ohmic resistance reflects the conductivity of electron and lithium-ion transfer inside the battery. As a battery ages, both the voltage drop across its internal resistance and its heat generation rate increase. Therefore, the DC resistance test and Joule effect evaluation are two suggested methods for measuring ohmic resistance [29].
Hybrid Pulse Power Characterization (HPPC) is the commonly accepted test for ohmic resistance analysis [30]. The test calculates the resistance from the battery's voltage change after applying a high current for a short period of time,
$$R = \frac{\Delta U_t}{\Delta I}$$
where $\Delta U_t$ is the pulse voltage and $\Delta I$ is the applied current.
Because the current is applied instantaneously, the calculated R does not include the polarization effect, and ohmic resistance is the major reason for the potential drop across the battery [31]. Generally, the HPPC test needs to be conducted in a laboratory environment to ensure high accuracy of current and voltage measurements, and its application under industrial conditions requires careful consideration. Zhou et al. proposed a new method for LiFePO$_4$ battery SOH estimation by combining incremental capacity analysis (ICA) and ohmic resistance, using support vector regression and the recursive least-squares method [32]. Chiang et al. utilized the Lyapunov stability criteria to develop an adaptation algorithm that extracts OCV and internal resistance without limitations on the input signal of battery systems [33]. The Joule effect is an alternative method for evaluating internal resistance. Heat is generated when current flows through a battery, leading to changes in battery temperature. A calorimeter can be used to estimate temperature changes. Liu et al. found that internal resistances measured by the HPPC method have a closer relationship with the heat generation behavior compared to the impedance [34].
The correlation between ohmic resistance and SOH is influenced by SOC and temperature. The measured resistance also varies with the time scale of the evaluation method. Therefore, it is essential to design a multi-time scale estimator with a decoupling mechanism to mitigate the interference between SOC and SOH and improve the accuracy of resistance estimation.
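The pulse-based resistance calculation described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the step-detection threshold, and the sample values are hypothetical, and the resistance is taken from the first pair of samples spanning the current step, which presumes a sampling rate fast enough that polarization is negligible over one sample.

```python
import numpy as np

def pulse_ohmic_resistance(voltage_v, current_a, threshold_a=0.5):
    """Estimate R = |dU / dI| across the first current step above threshold_a."""
    v = np.asarray(voltage_v, dtype=float)
    i = np.asarray(current_a, dtype=float)
    # Indices k where the current jumps between sample k and k + 1.
    steps = np.flatnonzero(np.abs(np.diff(i)) > threshold_a)
    if steps.size == 0:
        raise ValueError("no current step found in the record")
    k = steps[0]
    delta_u = v[k + 1] - v[k]   # instantaneous voltage change
    delta_i = i[k + 1] - i[k]   # applied current step
    return abs(delta_u / delta_i)

# Hypothetical record: a 3 A discharge pulse causes a 60 mV drop.
voltage = [3.70, 3.70, 3.64, 3.63, 3.62]
current = [0.0, 0.0, -3.0, -3.0, -3.0]
r_ohm = pulse_ohmic_resistance(voltage, current)  # about 0.02 ohm
```

Using later samples in the pulse instead of the first one would fold polarization into the result, which is one reason the measured resistance depends on the time scale of the evaluation.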

Electrochemical Impedance Spectroscopy
Electrochemical Impedance Spectroscopy (EIS) is a high-precision impedance measurement technique [35][36][37]. The principle behind EIS is that different reaction rates in the electrochemical process can be identified through their corresponding frequency characteristics when alternating signals are applied [38]. To facilitate rapid impedance measurement, the signals applied to the batteries typically comprise a rich set of harmonic components, including multi-frequency hybrid waves, pseudo-random sequences, and random noise [39][40][41]. Fourier transform analysis is then usually performed to decompose the measured AC signal into its individual frequency components. The wavelet transform is a suitable alternative for AC impedance analysis [39,42,43]: it provides better time-frequency resolution and is well suited to analyzing non-stationary signals (especially step signals).
The battery's EIS test usually covers a wide frequency span; therefore, in addition to ohmic resistance, parameters such as double-layer capacitance, charge transfer resistance, and SEI resistance can also be measured. In the low-frequency region, the EIS spectrum is dominated by mass transfer, whereas in the mid-frequency region it is dominated by charge transfer; both contributions appear as a capacitive response. Various equivalent circuit models (ECMs) have been developed to fit the battery's EIS response. Compared with integer-order ECMs, fractional-order models (FOMs) can better describe the battery's diffusion behavior [44,45]. Moreover, the fitted parameters can represent some aging mechanisms, including loss of active material (LAM), loss of lithium inventory (LLI), and conductivity loss [46][47][48].
The correspondence between the Nyquist plot of the battery's EIS test data and a typical Randles equivalent circuit is shown in Figure 2, along with aging mechanisms.
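To make the link between an equivalent circuit and its Nyquist plot concrete, the sketch below evaluates the impedance of a simple Randles-type circuit: an ohmic resistance in series with a double-layer capacitance that is in parallel with a charge transfer resistance and a diffusion element. The circuit topology, the semi-infinite Warburg form sigma_w(1 - j)/sqrt(omega), and all parameter values are illustrative assumptions and are not the fitted model of any cited work.

```python
import numpy as np

def randles_impedance(freq_hz, r_ohm, r_ct, c_dl, sigma_w):
    """Complex impedance of R_ohm + [C_dl || (R_ct + Z_warburg)]."""
    omega = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    z_warburg = sigma_w * (1.0 - 1.0j) / np.sqrt(omega)  # diffusion (low frequency)
    z_faradaic = r_ct + z_warburg                        # charge transfer branch
    z_parallel = 1.0 / (1.0 / z_faradaic + 1.0j * omega * c_dl)
    return r_ohm + z_parallel

# Hypothetical parameters for a small cell (ohm, ohm, farad, ohm*s^-0.5).
freq = np.logspace(-2, 4, 61)
z = randles_impedance(freq, r_ohm=0.02, r_ct=0.03, c_dl=0.5, sigma_w=0.002)

# Nyquist coordinates: real part on x, negative imaginary part on y.
nyquist_x, nyquist_y = z.real, -z.imag
```

At high frequency the curve approaches the real axis at the ohmic resistance, the mid-frequency arc is set by the charge transfer resistance and double-layer capacitance, and the low-frequency tail comes from the diffusion element.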

Apart from the frequency domain method, time domain analysis of EIS data is also suitable for decomposing kinetic processes inside the cells [49,50]. The distribution of relaxation times (DRT) analysis method is introduced to distinguish the main time constants of electrochemical processes [51][52][53], offering an intuitive description of the complicated properties using multiple RC (resistor-capacitor) networks [54]. In comparison to the conventional circuit fitting approach, the DRT method resolves the impedance spectrum directly and differentiates kinetic characteristics by their time constants, thus mitigating the subjectivity of manual circuit fitting, which can result in analytical inaccuracies. Additionally, because the method is algorithmically driven, it reduces the reliance on electrochemical knowledge and detailed ECM design, thereby enhancing the rigor and objectivity of the analysis. However, DRT decomposition is a typical ill-posed problem, and the uniqueness of the solution cannot be guaranteed. The DRT solution is mainly obtained through Tikhonov regularization, also known as ridge regression [52,55,56].
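A minimal sketch of a regularized DRT inversion is given below. It assumes the common discretization Z(omega) = R_inf + sum_n gamma_n * dln(tau) / (1 + j*omega*tau_n) on a logarithmic grid of time constants and solves the resulting least-squares problem with a zeroth-order Tikhonov (ridge) penalty. The function names, the grid, and the regularization weight are hypothetical. Practical DRT tools typically add non-negativity constraints and derivative-based penalties, which are omitted here.

```python
import numpy as np

def drt_matrix(freq_hz, tau_s):
    """Stacked real/imaginary design matrix for unknowns [R_inf, gamma_1..gamma_N]."""
    omega = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    tau = np.asarray(tau_s, dtype=float)
    dlntau = np.gradient(np.log(tau))                       # grid spacing in ln(tau)
    kernel = dlntau / (1.0 + 1.0j * np.outer(omega, tau))   # shape (M, N)
    a_re = np.hstack([np.ones((omega.size, 1)), kernel.real])
    a_im = np.hstack([np.zeros((omega.size, 1)), kernel.imag])
    return np.vstack([a_re, a_im])

def drt_tikhonov(freq_hz, z, tau_s, lam=1e-3):
    """Ridge estimate of R_inf and gamma(ln tau) from a complex spectrum z."""
    a = drt_matrix(freq_hz, tau_s)
    b = np.concatenate([np.real(z), np.imag(z)])
    reg = lam * np.eye(a.shape[1])
    reg[0, 0] = 0.0                       # R_inf is not penalised
    x = np.linalg.solve(a.T @ a + reg, a.T @ b)
    return x[0], x[1:]

# Synthetic spectrum of a single RC element (hypothetical values).
freq = np.logspace(-2, 4, 61)
tau_grid = np.logspace(-4, 1, 51)
z_meas = 0.02 + 0.03 / (1.0 + 1.0j * 2.0 * np.pi * freq * 0.01)
r_inf, gamma = drt_tikhonov(freq, z_meas, tau_grid)
```

The regularization weight trades fidelity against smoothness: a larger value suppresses spurious oscillations in gamma but also broadens genuine peaks, which reflects the ill-posedness noted above.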
EIS measurement requires a long test time and a relatively stable environment. To reduce the cost of EIS applications and improve the accuracy of SOH estimation, integrated electrochemical analysis systems need to be designed [57]. Raijmakers et al. completed impedance measurements at the pack level using a commercial BMS and analyzed the battery crosstalk interferences [58]. Howey et al. presented a fast and cost-effective method for online battery impedance measurement using onboard power electronics as the excitation source [59]. Wang et al. designed and verified an impedance measuring system that combines a high-power dual active bridge (DAB) converter with distributed sampling units, realizing a low-cost onboard battery impedance measuring system [60]. Even so, the online application of EIS faces numerous challenges. The difficulties in power device design and electromagnetic interference shielding are only secondary factors. More significantly, due to noise interference and inadequate rest time, the response signal of the battery fails to accurately reflect impedance information.
After obtaining the EIS measurement, HI features can be extracted for SOH estimation [61,62]. Jiang et al. compared different EIS features, including broadband EIS, model parameters, and fixed-frequency impedance features, for SOH estimation using Gaussian process regression [63]. The results indicate that the fixed-frequency impedance feature offers the best performance metrics and that the mid-frequency region is suitable for SOH regression. Fu et al. investigated the validity of various frequency regions and implemented real-time, fast acquisition of EIS using the Fast Fourier Transform (FFT) technique [64]. The SOH evaluation model was then developed with an extreme learning machine (ELM) with a regularization mechanism, and the data from the capacity test were used for model training. Zhang et al. used the DRT method and low-frequency impedance fitting to extract the parameters, Pearson correlation analysis to filter the parameters, and Gaussian process regression to establish a staged SOH estimation model [65]. In addition, the adaptability of the model to different SOC ranges was also considered.

Incremental Capacity Analysis (ICA) and Differential Voltage Analysis (DVA)
Incremental capacity analysis (ICA) and differential voltage analysis (DVA) are established techniques to qualitatively identify the major degradation modes [18,66]. Both techniques are based on identifying how features of the battery's open-circuit voltage (OCV) drift with aging [67,68]. Due to the non-stationary nature of the OCV data, the identification of various reaction processes within the cell is a challenging task, making the adoption of differential methods a necessary step. The same concept also applies to the differential thermal voltammetry (DTV) technique [69]. By transforming the voltage plateaus of the OCV-SOC curve into clearly identifiable peaks and valleys in the first-order differential curve, ICA and DVA curves can detect a gradual change in battery behavior during aging [70,71].
The concept of ICA was first proposed for the study of the electrochemical reactivity of manganese dioxide. This technique was then expanded to identify the battery aging mechanism under different conditions [72][73][74].
The IC curve is calculated as ∆Q/∆V within a fixed time or voltage interval [75]. The integral of the IC curve gives the capacity of the battery in the corresponding voltage interval. The IC peaks reflect the different phase transition stages of the positive and negative electrode materials [76]. For instance, during the charging process, a cathode made of nickel cobalt aluminum (NCA) material experiences three phase transitions (H1-M-H2-H3), while the anode material undergoes four phase transitions [77]. The combined result of these two sets of phase transitions is reflected in the four peaks of the IC curve. The contribution of the cathode and anode of the battery to the characteristics of the IC curve can be described by the peak indexation method developed in [68].
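The ∆Q/∆V calculation over a fixed voltage interval can be sketched as follows. The function name, the 5 mV interval, and the synthetic charge curve are assumptions for illustration. The sketch also assumes a constant-current charge segment whose voltage rises essentially monotonically, since the samples are sorted by voltage before interpolation.

```python
import numpy as np

def incremental_capacity(voltage_v, capacity_ah, dv=0.005):
    """Return (voltage midpoints, dQ/dV in Ah/V) on a uniform voltage grid."""
    v = np.asarray(voltage_v, dtype=float)
    q = np.asarray(capacity_ah, dtype=float)
    order = np.argsort(v)                 # np.interp needs increasing x values
    v, q = v[order], q[order]
    grid = np.arange(v[0], v[-1], dv)     # fixed voltage interval
    q_grid = np.interp(grid, v, q)        # capacity at each grid voltage
    ic = np.diff(q_grid) / dv             # finite-difference dQ/dV
    v_mid = grid[:-1] + dv / 2.0
    return v_mid, ic

# Synthetic charge curve with one plateau near 3.6 V (hypothetical data).
volts = np.linspace(3.0, 4.2, 2401)
cap = 1.0 + np.tanh((volts - 3.6) / 0.05)
v_mid, ic = incremental_capacity(volts, cap)
v_peak = v_mid[np.argmax(ic)]             # location of the IC peak
charged_ah = np.sum(ic) * 0.005           # integral of the IC curve
```

Summing the IC values times the voltage interval recovers the capacity charged across the voltage window, which is the integral property stated above. On measured data the finite difference amplifies noise, so filtering is normally applied before peak extraction.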
Dubarry et al. developed a battery degradation diagnosis method to capture the aging modes of LLI, LAM, and resistance increase (RI) at different aging stages. In addition, the ICA technique can diagnose capacity loss due to undercharge (UC) and underdischarge (UD) [18,78]. The influence of LLI, RI, and LAM on the IC curves of an LFP cell is shown in Figure 3 [79].

The number of identifiable peaks/valleys of the IC curve depends not only on the phase transition properties of the electrode materials but also on the charge/discharge rate [81,82]. As for peak shifting, some phase transition stages are hindered by the increase in battery resistance, so the peaks shift toward a higher potential [67]. A decrease in the amplitude of a peak indicates a reduction in capacity, which is usually caused by LAM and LLI.
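The DV curve plots ∆V/∆Q against capacity (Figure 3e,f) and is the inverse of the IC curve (∆Q/∆V). In IC curves, the peaks represent phase transitions, whereas the peaks in DV curves are due to phase equilibria. Since the cell voltage is the difference between the cathode and anode potentials,
$$\frac{dV_{\mathrm{cell}}}{dQ} = \frac{dV_{\mathrm{cathode}}}{dQ} - \frac{dV_{\mathrm{anode}}}{dQ}$$
the peaks and valleys of the DV curve can be attributed to each electrode. By measuring the distance between the peaks, capacity loss can be obtained [83].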
A higher peak on the DV curve implies higher lithium loss in localized areas of the electrode, which might be caused by SEI growth or material-induced uneven current density distribution [84,85]. It is speculated that the shortening of the distance between the two peaks is caused by the degradation of the carbon electrode [86]. As the battery ages, all peaks on the DV curves shift toward a lower depth of discharge (DOD), and the amplitude of the peaks may increase as the homogeneity of the lithium distribution increases [66,87]. Bloom et al. performed calendar aging tests on 18650 batteries and separated electrode information using DVA [88]. The study indicates that the major cause of capacity degradation is the side reaction at the anode side. In their later work [89], the LiNi$_{1-x-y}$Mn$_y$Co$_x$O$_2$ (NMC) electrode was tested using the same method, and the results suggest that the side reactions which consume cyclable lithium occur mainly at the negative electrode. Liu et al. used DVA and impedance measurements to verify that the major degradation mechanism of LiFePO$_4$ cells is LAM rather than impedance increase [90].
ICA and DVA are recognized as reliable SOH estimation techniques since the shape parameters of the curves, such as the width, area, height, and number of peaks, are closely related to the aging mechanism. However, differential methods are strongly affected by noise, and it is difficult to obtain an accurate and smooth curve with a direct numerical derivative. Therefore, data pre-processing is essential. The common pre-processing methods include smoothing splines, peak fitting functions, voltage window filters, Gaussian filters, Butterworth filters, Kalman filters, and moving average filters [91][92][93][94]. Filter-based algorithms are easy to implement but may cause a loss of information. Support vector regression uses a maximum classification interval and can tolerate a certain degree of outliers, but it is computationally costly. The peak fitting function method can track each peak of the IC and DV curves independently but cannot accurately capture the part between the peaks, and its assumption of peak symmetry reduces accuracy [95]. The spline curve-based method is easy to differentiate and has high fidelity, but it is greatly affected by measurement errors and noise.
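Of the pre-processing options listed above, the Gaussian filter is among the simplest to sketch. The version below builds a normalized kernel truncated at four standard deviations and convolves it with an edge-padded signal so that the output has the same length as the input. The function name, the truncation width, and the padding mode are choices made for this illustration, not a prescription from the cited works.

```python
import numpy as np

def gaussian_smooth(y, sigma_samples=3.0):
    """Smooth a 1-D curve (e.g. an IC or DV curve) with a Gaussian kernel."""
    y = np.asarray(y, dtype=float)
    half = int(np.ceil(4.0 * sigma_samples))          # kernel half-width
    k = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (k / sigma_samples) ** 2)
    kernel /= kernel.sum()                            # unit gain for constant input
    padded = np.pad(y, half, mode="edge")             # repeat end values at both edges
    return np.convolve(padded, kernel, mode="valid")  # same length as y

# Hypothetical noisy IC curve: one peak plus sample-to-sample noise.
rng = np.random.default_rng(0)
x = np.linspace(3.0, 4.2, 241)
ic_clean = 20.0 * np.exp(-0.5 * ((x - 3.6) / 0.05) ** 2)
ic_noisy = ic_clean + rng.normal(0.0, 1.0, x.size)
ic_smooth = gaussian_smooth(ic_noisy, sigma_samples=3.0)
```

The filter width illustrates the information-loss trade-off mentioned above: a wider kernel removes more noise but also lowers and broadens the peaks whose height and position serve as health indicators.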
SOH prediction is achieved in two steps: extracting features from IC/DV curves and using mathematical methods to establish the correlation between the parameters of each feature and SOH. Weng et al. developed an SOH monitoring framework that provides a quantitative correlation between the IC peak and battery capacity decay [96]. Within this framework, support vector regression is applied to fit the capacity-voltage curve with partial charging data, and IC peak identification is then reduced to an SV-based parameter identification problem. The ICA technique is used to establish the correspondence between features of IC curves and capacity degradation and to provide a robust signature for SOH monitoring. Wang et al. designed a radial basis function neural network (RBFNN) to predict the IC peak value from numerous indirect data such as mileage, average charge current, initial charge SOC, and average discharge temperature [94]. The ICA method was used to find reliable characteristics on the curve. The advantage of ICA over DVA for SOH estimation is its ability to represent capacity degradation and impedance increase together. Extracting features related to cell capacity degradation from DV curves is not as intuitive as from IC curves. Berecibar et al. explained the correlation between capacity decay and the DV characteristic peak [71]. The battery capacity degradation is assessed based on the drift of a characteristic peak on the DV curve. Wang et al. used an improved center least squares method to smooth the DV curves and then selected the characteristic peaks that drift significantly with battery aging for SOH estimation [97]. In addition, the problem that the maximum peak value cannot be detected because of inconsistency was discussed.

Advanced Sensors Experiment
Indirect parameter measurement methods for SOH estimation rely on the measurement of the battery's material and structural parameters during charge/discharge. Battery parameters such as temperature and stress change during charging and discharging, and these parameters can reflect the status of the solid-liquid and solid-solid interfaces and the electrochemical reaction kinetics inside the battery, which in turn helps to predict the battery SOH and lifespan. These methods can also monitor the battery's safety indicators and aid second-life battery reuse to improve the economic value.

Ultrasonic Technique
Ultrasonic inspection technology is an effective tool for detecting various small defects in materials.This is an intuitive, non-contact, non-destructive battery aging characterization technique [98,99].The principle is based on the sensitivity of the propagation of ultrasonic waves in the liquid-filled porous media to electrode tortuosity, porosity, thickness, density, elastic modulus, fluid density, and ion concentration [100,101].It should be noted that not all ultrasound signals within the full frequency range exhibit a strong correlation with the state of a battery.For instance, the slow wave component was found to have a strong correlation with SOC.Recently, ultrasonic time of flight (TOF) has emerged as a relatively new method for interrogating battery behavior [102][103][104].
The growth of a solid electrolyte interface (SEI) is a major factor in the consumption of intercalation lithium during battery aging. In addition, the battery aging process is accompanied by an increase in internal temperature, pressure, and cell volume and even the expansion of manufacturing defects. These manifestations result in material-related parameter changes such as density and modulus variations [105]. To investigate these changes, the ultrasonic signals during the charging and discharging cycles are monitored and analyzed in the time or frequency domain, which provides indicators of battery aging, safety, and small defects [100,101]. Many researchers have utilized acoustic signals to probe Li-ion batteries. Hsieh et al. conducted a comprehensive study using commercial batteries such as 18650 Li-ion and alkaline LR6 (AA) batteries to verify the correlation of changes in sound speed with the state of charge and aging [106]. Davies et al. demonstrated that acoustic analysis of the through-thickness bulk wave could be used to estimate SOC [107]. An SOH regression model was also developed using TOF and voltage data in that study. In an attempt to reduce the intervention of the sensor on cells, a novel approach based on acousto-ultrasonic guided waves has been proposed [108]. Ladpli et al. developed a technique based on built-in ultrasonic guided waves to monitor the state of lithium-ion batteries [109]. An important nonlinear relationship between the TOF signal and the maximum remaining capacity is used to develop an a priori model for an onboard SOH determination technique. In a follow-up work [110], Matching Pursuit with the Gabor dictionary was applied to decompose the guided wave. The SOC/SOH prediction model was generated by Generalized Additive Models (GAMs) with the regression splines technique.
The ultrasonic signals are sensitive to the internal material parameters of the battery, which are linked with SOC. Thus, ultrasonic monitoring of battery SOH should be performed at a fully charged state in order to characterize the degradation of maximum capacity [100,105,111].

Fiber Bragg Grating Technique
Optical fiber sensors (OFS) are passive fiber devices that feature multiplexing and multipoint measurement with excellent temporal and spatial resolution. OFS are nonelectrical and immune to electromagnetic interference [112,113]. Among OFS, fiber Bragg grating is superior in monitoring the surface temperature and strain of a cell [114,115].
Fiber Bragg Grating (FBG) sensors use the reflection of the optical signal and the expansion and contraction of the fiber to measure temperature and strain. The selected wavelength of the reflected signal is defined as $\lambda_B = 2 n_{\mathrm{eff}} \Lambda$, where $n_{\mathrm{eff}}$ is the effective core index of refraction and $\Lambda$ is the Bragg grating period [116]. In Ref. [117], the single-mode fiber FBG (SMF-FBG) technique was used to measure the temperature and stress of a cell simultaneously with high accuracy. At the same time, an FBG sensor written in a microstructured optical fiber (MOF-FBG) was proposed to detect and decouple temperature and stress signals. The optical sensor could monitor the thermal events occurring inside the cell and their associated heat generation. It could also track parasitic side reactions of SEI formation, which enabled cell lifespan assessment. Wu et al. developed an in-situ FBG temperature sensor array film (FBGAF) to detect the uneven internal temperature field of the battery and its evolution pattern [118]. In practical implementations, to decouple stress and temperature, two optical FBG sensors were arranged at one sampling position. One FBG sensor was firmly attached to the cell so that the wavelength shift of the reflected signal was caused by both strain and temperature. The second optical sensor, as a reference, was placed in a specialized tube detached from the cell. Therefore, the wavelength shift of the signal received from the second sensor was mainly influenced by temperature. Thus, the strain contribution could be isolated by subtracting the wavelength offset of the reference sensor from the total wavelength offset. Nascimento et al. evaluated the efficacy of thermocouples (TCs) and FBG sensors in detecting variations in temperature. The findings demonstrated that the FBG sensors detected higher temperature variations and presented a 1.2 times higher response rate than the TCs [119]. In the follow-up work, a hybrid sensing network consisting of fiber Bragg gratings and Fabry-Perot cavities was proposed [120]. As a non-invasive measurement scheme, it has a low impact on cell behavior.
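The two-sensor compensation scheme can be written out as a short sketch. It assumes the Bragg condition (reflected wavelength equal to twice the effective index times the grating period), a linear wavelength response, and equal temperature sensitivity for both gratings. The function names and the default sensitivities (about 10 pm/°C and 1.2 pm per microstrain, typical orders of magnitude near 1550 nm) are illustrative assumptions, not values from the cited studies.

```python
def bragg_wavelength_nm(n_eff, period_nm):
    """Bragg condition: reflected wavelength = 2 * n_eff * grating period."""
    return 2.0 * n_eff * period_nm

def decouple_fbg(shift_bonded_pm, shift_reference_pm,
                 k_temp_pm_per_c=10.0, k_strain_pm_per_ue=1.2):
    """Split two FBG wavelength shifts into temperature and strain.

    The reference grating is strain-free, so its shift is attributed to
    temperature alone. Subtracting it from the bonded grating's shift
    leaves the strain-induced part. Sensitivities are assumed values.
    """
    delta_t_c = shift_reference_pm / k_temp_pm_per_c
    strain_shift_pm = shift_bonded_pm - shift_reference_pm
    strain_ue = strain_shift_pm / k_strain_pm_per_ue
    return delta_t_c, strain_ue

# Hypothetical readings: bonded grating shifts 86 pm, reference shifts 50 pm.
delta_t, strain = decouple_fbg(86.0, 50.0)
print(f"temperature change: {delta_t:.2f} C, strain: {strain:.1f} microstrain")
```

In practice the two sensitivities would come from calibration of the specific gratings, and the reference grating must sit close enough to the bonded one to see the same temperature.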
Optical grating technology can also be applied for battery state prediction [121][122][123]. Ganguli et al. developed large-format pouch cells with embedded fiber-optic sensors and verified their seal integrity and capacity retention. Building on this work, Ganguli et al. discussed the details of the method for separating the strain and temperature signals of the interlayer and used extended Kalman filtering (EKF) and dynamic time warping for battery SOC and SOH estimation [124,125]. Li et al. used the automatic extraction (AE) method to extract features from the measurements of FBG sensors and then used Gaussian process regression combined with a convolutional neural network to develop a multivariate regression model that predicts battery capacity and quantifies the prediction uncertainty [126].
The implementation cost of fiber grating sensors remains high, which hinders their wide adoption in BMS. Further, the correlation between battery reliability and built-in sensors has not been accurately evaluated [127].

Other Experimental Method
Several other indirect assessment techniques have also been developed, such as the parity relation technique and mechanical stress method for SOH estimation [128,129]. These experimental methods can either directly estimate SOH or provide data for other SOH estimation methods.

Model-Based Methods for SOH Estimation
Model-based methods estimate SOH by tracking how the parameters of a battery model change with aging. Usually, the real-time measurement data of the BMS is used to update the model parameters. Since the model usually integrates physical knowledge of the battery's dynamics, model-based SOH estimation methods are more interpretable than data-driven methods and can achieve higher accuracy using fewer samples.
Based on the different theories of model construction and principles of state prediction algorithms, they can be divided into three categories: empirical models [130], equivalent circuit models [131,132], and electrochemical models based on physical chemistry [133,134].

Empirical Models
Empirical models for predicting the performance of electrochemical cells do not require detailed knowledge of the cell's design or material properties. Instead, they rely on fitting functions such as polynomial or exponential functions to approximate the cell's behavior based on known data (Figure 4a,b). While these models can be useful for making predictions, they are ultimately only approximate representations of the cell's true behavior [140]. Han et al. derived a power law battery aging model based on the Arrhenius equation, which includes temperature as a parameter and is a function of the number of cycles [141]. Zhang et al. developed a battery RUL prediction framework based on an exponential empirical model and particle filter. The nonlinear least squares technique was used to fit an exponential model to the capacity degradation data of lithium-ion batteries [135]. The outcome of this analysis showed that the exponential model was suited to fit the capacity degradation data. The parameters of the empirical resistance model depend on the SOH. Ecker et al. developed an empirical resistance model to describe the battery's aging process and to predict its lifespan [142,143]. Empirical models have relatively few parameters and are easy to implement for onboard SOH monitoring, but they tend to generalize poorly outside the scope of the test data used for model training.
Empirical models, being open-loop models, tend to accumulate significant errors. Their primary objective is to quantify battery degradation while minimizing computational demands. As a result, empirical models often exhibit limited stability and are insensitive to environmental conditions. Thus, observer models are needed to correct the estimated values or parameters derived from the empirical model [144].
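As an illustration of how such a fitting function is used, the sketch below fits a single-exponential capacity-loss curve and extrapolates it to an end-of-life threshold. The model form `loss(k) = a * exp(b * k)`, the log-linearized least-squares fit (a simplification of the nonlinear least squares mentioned above), the function names, and the synthetic data are all assumptions made for this example.

```python
import math

def fit_exponential_fade(cycles, capacity_loss):
    """Fit loss(k) = a * exp(b * k) by ordinary least squares on log(loss)."""
    n = len(cycles)
    y = [math.log(q) for q in capacity_loss]
    mean_k = sum(cycles) / n
    mean_y = sum(y) / n
    sxx = sum((k - mean_k) ** 2 for k in cycles)
    sxy = sum((k - mean_k) * (yi - mean_y) for k, yi in zip(cycles, y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_k)
    return a, b

def cycles_to_threshold(a, b, loss_limit):
    """Cycle number at which the fitted loss reaches loss_limit."""
    return math.log(loss_limit / a) / b

# Synthetic, noise-free data: fractional capacity loss every 50 cycles.
cycles = list(range(0, 501, 50))
loss = [0.01 * math.exp(0.004 * k) for k in cycles]

a_hat, b_hat = fit_exponential_fade(cycles, loss)
eol_cycle = cycles_to_threshold(a_hat, b_hat, loss_limit=0.20)
print(f"a={a_hat:.4f}, b={b_hat:.5f}, 20% loss reached near cycle {eol_cycle:.0f}")
```

The extrapolation step is exactly where the open-loop weakness shows: any error in the fitted parameters is carried forward uncorrected, which is why such fits are often paired with a filter or observer.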

Equivalent Circuit Models
An equivalent circuit model (ECM) describes the dynamic behavior of a lithium-ion battery with a combination of electrical components, including resistors, inductors, capacitors, and power sources. It is an electrical black box model that constructs a circuit based on the relationship between the input and output signals of the system (Figure 4c). The equivalent circuit model allows engineers to perform simulations, study the system behavior, and optimize its performance without having to build a physical battery prototype. Extrapolating from the model facilitates SOH assessment. The equivalent circuit model is easy to apply in practice because it requires minimal physical and chemical knowledge and has a low computing load.
There are different types of equivalent circuit models, including the Rint model, Thevenin model, PNGV model, and multi-stage RC model [45]. The model parameters can be identified online or offline using recursive least squares (RLS), genetic algorithms (GA), particle swarm optimization (PSO), etc. The ECM parameter identification process relies on data provided by direct or indirect experiments such as EIS. The ECM offers several advantages, including its ability to provide physical interpretations, computational efficiency, and parameter adjustability. It can capture changes in battery health by modifying its parameters accordingly. By integrating adaptive algorithms such as the Kalman filter, dual sliding mode observer, and dual Kalman filter, the ECM can enhance its capability to model nonlinear systems. Consequently, the model becomes more adaptable and accurate. These techniques enable closed-loop estimation of the equivalent circuit component parameters, wherein deviations between predicted and actual values are used to refine the estimation results. Adaptive models have attracted a lot of attention in battery parameter and state estimation due to their robustness.
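A minimal sketch of a first-order Thevenin ECM shows how the model links current to terminal voltage and how one parameter, the ohmic resistance, can be read from a current step. The constant open-circuit voltage, the parameter values, and the function names are simplifying assumptions for illustration; a practical ECM would make the open-circuit voltage depend on SOC and identify parameters with a method such as RLS.

```python
import math

def simulate_thevenin(current, dt, ocv, r0, r1, c1):
    """Terminal voltage of a first-order Thevenin ECM.

    Discharge current is positive. v1 is the voltage across the RC pair,
    updated with the exact zero-order-hold discretization.
    """
    alpha = math.exp(-dt / (r1 * c1))
    v1 = 0.0
    voltage = []
    for i in current:
        voltage.append(ocv - i * r0 - v1)
        v1 = alpha * v1 + r1 * (1.0 - alpha) * i
    return voltage

def ohmic_resistance_from_step(voltage, current, k):
    """Estimate R0 from the instantaneous voltage drop at a current step at index k."""
    return (voltage[k - 1] - voltage[k]) / (current[k] - current[k - 1])

# Hypothetical test profile: 10 s rest, then a 2 A discharge for 600 s.
current = [0.0] * 10 + [2.0] * 600
voltage = simulate_thevenin(current, dt=1.0, ocv=3.7, r0=0.05, r1=0.02, c1=1000.0)
r0_est = ohmic_resistance_from_step(voltage, current, k=10)
print(f"estimated R0 = {r0_est * 1000:.1f} mOhm, final voltage = {voltage[-1]:.3f} V")
```

Repeating such an identification over the battery's life and tracking the growth of the resistance is one way the ECM parameters can be turned into a health indicator.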

Electrochemical Models
The electrochemical modeling approach is supported by robust theoretical foundations and is well suited to determining the internal electrochemical reaction process and reaction intensity during battery aging. This approach can capture the dynamics of lithium-ion movement inside the battery (Figure 4e) and the varying parameters of the active materials in the anode and cathode at various SOH levels. The majority of electrochemical models for lithium-ion batteries are based on the work of John Newman and his colleagues [145,146]. The pseudo-two-dimensional (P2D) model established by them uses a set of partial differential equations to describe the electrochemical reactions inside the battery. The P2D model combines porous electrode theory and concentrated solution theory to provide a fundamental theoretical framework for the physical and electrochemical processes occurring within lithium-ion batteries [147]. The P2D model can explain much of the internal electrochemical kinetics and mass transport of the battery and can be used to investigate capacity decay by accurately simulating the internal electrochemical processes (Figure 4f) [148]. Besides the impedance and recyclable Li-ion concentration, the electrochemical model is able to provide many other features for SOH estimation. Liu et al. developed a simplified P2D model through the Parabolic Profile (PP) approximation method and volume integration [139]. Subsequently, the simplified model is converted into a state-space form. The solid-phase average lithium-ion concentration at the electrode was estimated in real time by a particle filter (PF) algorithm using partial charge/discharge data and was then used for estimation of SOH and SOC. Li et al. developed a single-particle (SP)-based degradation model incorporating the effect of SEI formation and crack propagation caused by volume expansion [149]. The model shows high accuracy of SOC/SOH prediction at a high charge/discharge rate. The solid-state diffusion coefficient and diffusion time have been identified and verified in the literature as excellent indicators of SOH, which increase monotonically with cell aging [150,151]. Prasad et al. developed a simplified electrochemical model, which relates the physical parameters of the cell to the coefficients of a Padé-approximated transfer function [152]. The cell's resistance and the diffusion time of the Li-ion were estimated using the current and voltage data at a 5C charge or discharge cycle. The varying parameters are then used for SOH prediction.
Compared to ECM and machine learning models, electrochemical models offer distinct advantages. They capture the battery's charging and discharging behavior more accurately by accounting for intricate microscopic dynamics. This precision is particularly important for battery state estimation in complex scenarios. Additionally, electrochemical models can concurrently simulate multiple microscopic phenomena occurring within the battery cell. These phenomena include electrode particle fragmentation, lithium precipitation and intercalation, and SEI thickening. Such micro-level events significantly impact battery longevity, capacity degradation, and safety. By simulating and analyzing these phenomena, a deeper understanding of the battery's performance decay mechanisms can be attained, enabling the characterization of aging behavior. Katrašnik et al. incorporated various processes in their electrochemical model, including SEI formation, SEI decomposition, lithium reaction with the electrolyte at the anode interface, lithium plating, cathode active material degradation, electrolyte decomposition, heat generation resulting from Li+ transport, and heat generation stemming from side reactions [153]. These equations establish a coherent connection between the crystal structure of the cathode, electrode topology, heat generation, and degradation phenomena. As a result, the model was able to calculate the SOH and make predictions regarding thermal runaway events at the same time.
Despite the rich functionality of the electrochemical model, it has limitations, particularly when confronted with charge-discharge scenarios that go beyond its established physical or mathematical assumptions. For instance, in scenarios involving local deactivation and uneven particle distribution, the uniformity assumption of the electrochemical model fails to hold, while scenarios such as high-rate charge/discharge and elevated temperature operation alter the kinetics of interface reactions. Moreover, the electrochemical model encompasses a significant number of parameters whose impact on the results varies considerably. These parameters can change under different operating conditions, so parameter identification is required before the electrochemical model is applied. Li et al. developed a framework for the identification of physical parameters of electrochemical models based on the cuckoo search algorithm [154]. By means of sensitivity analysis and multi-step parameter identification, 20 electrochemical model parameters were identified using only current-voltage data as input.
Given the abundant parameters and noticeable environmental variations in electrochemical models, it becomes imperative to employ online parameter updating methodologies [155][156][157][158]. Combining electrochemical model-based approaches with adaptive filters enables the online updating of parameters and the estimation of battery SOH. However, this technique encounters two main challenges. Firstly, SOH cannot be measured directly, so it must be calculated indirectly through the assessment of capacity or resistance decay. Secondly, the accuracy and stability of general model estimation are compromised when dealing with nonlinear dynamic systems. To address these challenges, this technique commonly employs proportional integral observers, multi-time scale observers, adaptive partial differential equation observers, and extended Kalman filters to identify electrochemical model parameters, such as battery capacity and Li-ion concentration. Then, an electrochemical model or other equations are used to establish their correlation with SOH. Zhou et al. utilized an EKF model to estimate the count of recyclable lithium ions at specific time instances based on the current and voltage profiles during battery charging and discharging. By comparing these estimates at various time points throughout the battery's lifespan, they obtained insights into the degradation trend of the battery. In this study, the count of recyclable lithium ions was regarded as a parameter of the battery, which is inherently a nonlinear system. The electrochemical model was employed to simulate the actual system behavior in this research [159]. Zou et al. developed nonlinear observers that incorporate fast and slow gains for the estimation of state variables related to fast and slow dynamics, specifically the Li-ion concentration in solid particles, Li-ion concentration in the electrolyte, and volume-averaged concentration flux [156]. The electrochemical model was employed as a simulation platform to validate the proposed theory.
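The closed-loop idea behind these filters, predicting a parameter and then correcting it with the measurement residual, can be shown with a scalar linear Kalman filter. This is a deliberately simplified stand-in, not the EKF or observers of the cited works: the parameter is assumed to follow a random walk and to be measured directly with noise, and the function name, noise levels, and synthetic data are hypothetical.

```python
import random

def track_parameter(measurements, x0, p0, q, r):
    """Scalar Kalman filter for a slowly drifting parameter.

    x0, p0: initial estimate and its variance.
    q: process noise variance (how fast the parameter may drift).
    r: measurement noise variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                # predict: value unchanged, uncertainty grows
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # correct using the innovation z - x
        p = (1.0 - k) * p        # updated variance
        estimates.append(x)
    return estimates

# Synthetic noisy capacity readings (Ah) around an assumed true value of 1.8 Ah.
rng = random.Random(1)
true_capacity = 1.8
noisy = [true_capacity + rng.gauss(0.0, 0.05) for _ in range(200)]
estimates = track_parameter(noisy, x0=2.0, p0=0.01, q=1e-6, r=0.05 ** 2)
print(f"last raw reading {noisy[-1]:.3f} Ah, filtered estimate {estimates[-1]:.3f} Ah")
```

With an electrochemical model the measurement is a nonlinear function of the parameters, so the same predict-and-correct loop is typically run with a linearized measurement model, which is what an EKF does.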
The major challenges with the P2D model are the difficulty of model parametrization and the high computational load for onboard implementation. Various reduced-order methods (such as Padé approximation [160]) have been employed to approximate the solution of the solid-liquid phase diffusion equation in order to simplify the solution of the partial differential equations within electrochemical models. Other simplifying methods, such as the single particle model (SPM), simplify the electrochemical model by making specific assumptions, including a uniform potential distribution, simplified ion diffusion within the electrode, and neglect of liquid phase dynamics. In the SPM, the anode and cathode are each modeled as a single particle by neglecting the spatial dependence of the current density within the electrode [161][162][163]. As a result, the SPM focuses on solving two partial differential equations, one for the positive and one for the negative electrode, along with associated algebraic equations. This simplification significantly reduces the complexity.
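For orientation, the two partial differential equations of the SPM are commonly written as spherical diffusion equations, one per electrode. The notation below is the standard textbook form and is an assumption here rather than the notation of the cited works: $c_{s,i}$ is the solid-phase lithium concentration, $D_{s,i}$ the solid-phase diffusion coefficient, $R_i$ the particle radius, and $j_i(t)$ the molar flux at the particle surface, which is set by the applied current.

$$
\frac{\partial c_{s,i}}{\partial t} = \frac{D_{s,i}}{r^{2}} \frac{\partial}{\partial r}\left( r^{2} \frac{\partial c_{s,i}}{\partial r} \right), \qquad i \in \{\mathrm{n}, \mathrm{p}\}
$$

$$
\left.\frac{\partial c_{s,i}}{\partial r}\right|_{r=0} = 0, \qquad -D_{s,i} \left.\frac{\partial c_{s,i}}{\partial r}\right|_{r=R_i} = j_i(t)
$$

Because each equation has only the radial coordinate $r$ as a spatial variable, the model is far cheaper to solve than the full P2D system, at the cost of the simplifying assumptions listed above.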
Although different aging mechanisms, such as SEI growth and electrode material cracking, can be integrated into the electrochemical models, the implementation in vehicles is quite challenging due to high model complexity. Therefore, it is not conducive to online real-time SOH estimation and prediction. When electrochemical models are applied, the evaluation criteria for model performance often revolve around fitting output voltages or other indicators. However, relying solely on such criteria may cause the parameter values of simplified models to deviate from their actual physical counterparts and thus lose their physical significance. Hence, it is crucial to analyze the sources of error inherent in simplified models and to explore the development of real-time electrochemical models suitable for online applications. This research direction holds substantial importance for future work.

Machine Learning Methods for SOH Estimation
Machine learning algorithms can construct and analyze models directly from data on battery aging. Machine learning-based approaches for battery health assessment do not require in-depth knowledge of electrochemical principles or degradation mechanisms. This can result in better dynamic accuracy, universality, and robustness compared to model-based methods. Using machine learning for SOH prediction involves four steps: data processing, feature extraction, parameter training, and model validation. Machine learning algorithms can be divided into three categories: probabilistic algorithms, non-probabilistic algorithms, and semi-probabilistic algorithms. The schematic diagram of the model and the classification framework are shown in Figure 5.

Probabilistic-Based Algorithms
A probabilistic model uses probability distributions to characterize the potential values of variables and is a tool used to describe and quantify uncertainty. The aging process of lithium-ion batteries involves many uncertain parameters, which can be incorporated into a probabilistic modeling framework [164].
Bayes' rule is a classical probabilistic analysis method that uses the prior probability distribution to calculate the posterior probability. This allows Bayesian models to leverage prior knowledge and information more effectively, enabling more accurate modeling and inference of data. The naïve Bayes model, Monte Carlo method, Bayesian network, and Gaussian process regression (GPR) are widely used machine learning methods based on Bayes' rule. The Bayesian approach has distinctive advantages in few-shot learning and in leveraging prior knowledge. By incorporating prior probabilities, Bayesian methods can alleviate the issue of data sparsity and effectively perform parameter estimation and prediction tasks when confronted with limited data.
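For reference, the rule can be written for a generic model parameter $\theta$ (for example, a degradation rate) and observed data $D$ (for example, measured capacities). The symbols are chosen here for illustration and do not come from the cited works.

$$
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}, \qquad p(D) = \int p(D \mid \theta)\, p(\theta)\, \mathrm{d}\theta
$$

Here $p(\theta)$ is the prior, $p(D \mid \theta)$ the likelihood, and $p(\theta \mid D)$ the posterior. When $D$ contains only a few cycles of data, the posterior stays close to the prior, which is how prior knowledge compensates for sparse data.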
A naïve Bayes (NB) model is a probabilistic machine learning model based on the Bayes theorem. The model is "naïve" because it assumes all the features in the data are independent of each other, which is not always true in real-world scenarios. However, naïve Bayes models are often effective in practice and are widely used in a variety of applications. This can be attributed to the fact that the naïve Bayes model is capable of providing valuable probability estimates by learning the statistical relationships between classes and features, despite the violation of the conditional independence assumption. Furthermore, in high-dimensional problems, the naïve Bayes model estimates parameters more efficiently because it relies on the assumption of conditional independence. Consequently, it facilitates more efficient training and inference. Ng et al. proposed a naïve Bayes model for estimating the battery's RUL using constant discharge data [165]. Unlike other studies that use capacity as the response variable, in this research, the cycle number of a battery was directly used for non-parametric regression. The results are analyzed in detail in comparison with the SVM model. The NB model achieved an average RMSE of 16.1 cycles (0.17%) with a standard deviation of 10.7 cycles (0.15%), better than the SVM models, which had an average RMSE of 26.5 cycles (0.27%) and a standard deviation of 13.2 cycles (0.20%). This suggests that SVMs are more susceptible to changes in operational parameters.
The Gaussian process regression (GPR) method is a non-parametric regression method that uses the Gaussian process to describe the relationship between the input and output variables. The Gaussian process is a stochastic process that uses a covariance function to describe the correlation between continuous random variables. One key feature of GPR is that it provides a posterior distribution of the output prediction, which is very useful for quantifying the prediction uncertainty when the output variables are uncertain, for example because of noisy measurements. GPR has been widely adopted to predict a battery's SOH due to its non-parametric and probabilistic nature [166][167][168][169]. However, the limitations of GPR include high computational expense, sensitivity to the choice of the covariance function, and potentially poor scalability to large datasets.
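The posterior distribution mentioned above has a closed form in the standard textbook setting. Assuming a zero-mean prior, a covariance function $k(\cdot,\cdot)$, training inputs with kernel matrix $K$, training targets $\mathbf{y}$, and Gaussian measurement noise of variance $\sigma_n^{2}$ (all notation introduced here for illustration), the prediction at a new input $x_*$ is

$$
\bar{f}_* = \mathbf{k}_*^{\top} \left( K + \sigma_n^{2} I \right)^{-1} \mathbf{y}, \qquad
\operatorname{var}(f_*) = k(x_*, x_*) - \mathbf{k}_*^{\top} \left( K + \sigma_n^{2} I \right)^{-1} \mathbf{k}_*
$$

where $\mathbf{k}_*$ collects the covariances between $x_*$ and the training inputs. The predictive variance is what provides the uncertainty band around an SOH estimate, and the matrix inverse, whose cost grows cubically with the number of training points, is the source of the scalability limitation.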
Yang et al. proposed a new method for estimating the SOH of a battery using charging curves [170]. First, the charging and discharging data at various cycle numbers were obtained during cell cycling experiments. Second, four key points were extracted from the charging and discharging curves. The median filtering algorithm was used to eliminate noise. Third, a grey system theory was developed to measure the relational grade between these points. Finally, the four features were used as a training dataset to develop a Gaussian process (GP) model for SOH estimation. Liu et al. conducted research on SOH prediction and prognostics for lithium-ion batteries using GPR and a combined Gaussian process functional regression (GPFR) [171]. They aimed to address uncertainty in evaluation and prediction by employing GPR, with the mean and variance values representing the SOH uncertainty. Because the GPR model performed poorly in capturing battery capacity regeneration, the researchers employed a combined approach known as linear Gaussian process functional regression (LGPFR) for multi-step prognostics of the health condition of lithium-ion batteries. As a result, the prediction RMSE based on the LGPFR and QGPFR was only 1.71 and 1.5, respectively.
GPR has also performed well in long-term prediction, such as remaining useful life (RUL) estimation. Richardson et al. proposed a multi-output Gaussian regression model for SOH and RUL prediction and optimized the kernel functions by minimizing the negative log marginal likelihood (NLML) [172]. Furthermore, the compound kernel functions enhanced the ability of the GPR algorithm to utilize the empirical degradation model based on explicit mean functions. This improves the ability of the GP to capture complex behavior and the accuracy of SOH prediction. In the follow-up work [173], they proposed using a GPR conversion model to establish the relationship between current, voltage, temperature, and capacity. This model was found to be effective in estimating battery capacity degradation and predicting remaining useful life under dynamic conditions. The GPR algorithm can also be used to construct a state space. Li et al. developed a flexible state space model for estimating the SOH of a battery based on the GPR algorithm [174]. The model uses a nonlinear function to establish a relationship between the number of cycles and various feature variables, which are then input into the GPR model to predict the RUL over a long time scale.
The Monte Carlo method is a mathematical approach in which random quantities are used to provide estimates of deterministic quantities. Monte Carlo simulation models a possible outcome for any variable with inherent uncertainty by using a probability distribution, such as a uniform or normal distribution. Many Bayesian Monte Carlo methods applied to SOH prediction are realized through simulated aging trajectories.
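The idea of simulated aging trajectories can be sketched with a plain Monte Carlo simulation. This is not the Bayesian Monte Carlo scheme of the cited studies: the linear fade per cycle, the normally distributed fade rate, the parameter values, and the function name are all assumptions chosen to keep the example short.

```python
import random

def simulate_rul(capacity_now, eol_capacity, fade_mean, fade_std,
                 n_paths=2000, max_cycles=5000, seed=0):
    """Sample cycles-to-end-of-life under an uncertain per-cycle capacity fade.

    Each trajectory draws its own fade rate from a normal distribution
    (assumed), then loses that much capacity every cycle until the
    end-of-life capacity is reached. Returns the sorted RUL samples.
    """
    rng = random.Random(seed)
    ruls = []
    for _ in range(n_paths):
        rate = max(rng.gauss(fade_mean, fade_std), 1e-9)  # keep the rate positive
        q, cycles = capacity_now, 0
        while q > eol_capacity and cycles < max_cycles:
            q -= rate
            cycles += 1
        ruls.append(cycles)
    ruls.sort()
    return ruls

# Hypothetical cell: 90% of nominal capacity now, end of life at 80%.
ruls = simulate_rul(capacity_now=0.90, eol_capacity=0.80,
                    fade_mean=2e-4, fade_std=2e-5)
median = ruls[len(ruls) // 2]
low, high = ruls[int(0.05 * len(ruls))], ruls[int(0.95 * len(ruls))]
print(f"median RUL {median} cycles, 90% interval [{low}, {high}]")
```

The spread of the sampled RUL values is the point of the exercise: instead of a single number, the simulation yields a distribution from which confidence intervals can be read. In a Bayesian Monte Carlo setting, the fade-rate distribution would itself be updated as new measurements arrive.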
He et al. applied the Dempster-Shafer evidence theory (DST) in combination with an empirical model based on electrochemical impedance spectroscopy (EIS) to estimate the degradation of lithium-ion batteries [175]. This model was used to determine an initial set of parameters, which were then updated using Bayesian Monte Carlo techniques to track the degradation trend of the battery and accurately predict its capacity decay, ultimately leading to the estimation of SOH and remaining useful life (RUL). Tang et al. solved the battery state estimation problem using model migration techniques [176]. The Bayesian Monte Carlo (BMC) method was used to approximate the posterior distribution in parameter identification and model-based aging trajectory prediction, which reduced the modeling error. Their results show that the BMC approach is able to predict the health evolution over the entire lifespan with a high degree of accuracy (RMSE less than 2%) based on 15% of the aging data. In another of their works [177], the same method was adopted. Accelerated aging simulations were established by migration processes, whose migration factors were determined using Bayesian Monte Carlo methods and hierarchical resampling techniques. Liu et al. considered a jump-diffusion model with an exponential distribution of jumps to describe the jump phenomenon in the capacity decay process [178]. Then, a Monte Carlo sampling algorithm was used to estimate the parameters of the jump and diffusion parts of the degradation model, respectively.
Random forest (RF) is an ensemble learning (EL) method based on the concept of Bagging: it trains multiple decision trees and combines their predictions into a final prediction. In regression, each decision tree gives a prediction for the input data point, and the final prediction of the random forest is the average of the predictions of all decision trees. In classification, the fraction of trees voting for a category can be read as the probability of that category given the input data point. Hence, although RF is not inherently a probabilistic model, it is categorized within this subsection for the purpose of discussion. Combining many trees helps to reduce variance and improve the overall accuracy of the model. RF is effective for handling non-differentiable models and discrete features with limited values, and can handle missing and heterogeneous data. It is able to capture the relationships between inputs and outputs in order to produce a final estimate of the state of health of a battery. Li et al. utilized random forest regression to develop a model that learns the relationship between battery capacity and features extracted from charging voltage and capacity measurements [179]. The method does not require any smoothing and directly uses the charging curve for SOH estimation, resulting in strong robustness. Yang et al. used two convolutional neural networks to extract indicators for SOH and SOH variation between two consecutive charge/discharge cycles [180]. Then, by exploiting the indicators from the CNNs, the RF algorithm was adopted to generate the final SOH estimates. Lin et al. applied RF in a multi-model fusion strategy to obtain highly accurate and robust estimation [181]. Preliminary SOH predictions were generated using multiple linear regression, SVR, and GPR, respectively, and were then fused using the RF model.
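The averaging of many trees trained on resampled data can be sketched as follows. This is a deliberately reduced illustration: it bags one-split regression trees ("stumps") on a single feature, whereas a real random forest grows deeper trees and also samples a random subset of features at each split. All names and data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the maximum so the right side is never empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: fall back to the mean
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]


def bagged_stumps(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap resample of the data (Bagging)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict(models, x):
    """Average the predictions of all stumps."""
    preds = [lm if x <= t else rm for t, lm, rm in models]
    return sum(preds) / len(preds)


if __name__ == "__main__":
    cycles = list(range(20))
    soh = [1.0 - 0.01 * c for c in cycles]  # hypothetical linear fade
    forest = bagged_stumps(cycles, soh)
    print(predict(forest, 2), predict(forest, 17))
```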

Non-Probabilistic Algorithms
Typical probabilistic approaches focus on long-term forecasting of the battery's remaining useful life [186]. Researchers commonly utilize non-probabilistic approaches for battery SOH prediction. Related machine learning techniques can be divided into supervised and unsupervised learning based on the presence or absence of data labels.

Supervised Learning
The goal of supervised learning is to enable a model to learn the relationships between input and output variables in a training dataset so that it can make accurate predictions for new inputs without access to the corresponding outputs. To accomplish this, a supervised learning algorithm processes the training data, discovers the relationship between the input and output variables, and develops a model that embodies this relationship. The model is then applied to new data to generate predictions of the output variables. Some commonly employed supervised learning algorithms include support vector machines (SVMs), artificial neural networks (ANNs), and k-nearest neighbors (kNN).
Support vector (SV) techniques are commonly used for battery state estimation [187,188]. The SVM was originally proposed by Vladimir Vapnik [189]. SVMs utilize kernel functions to map low-dimensional feature vectors into higher dimensions in order to find the optimal hyperplane for separating input data in the higher-dimensional space. SVMs have two main components: the kernel function and the penalty factor C. The penalty factor C controls the complexity of the model, i.e., the degree of penalty for classification errors. Support vectors are samples in the training data used to define the decision boundary between classes in an SVM model. SVs reflect certain inherent characteristics of the cell. Feng et al. used charging curves at different SOH to construct support vector machine (SVM) models and identified the features of the different models using a support vector parameter identification algorithm [190]. The algorithm creates an SOH diagnosis model using SVM by comparing the features of the measured data with the stored SVM model. Klass et al. proposed a method of applying standard battery performance tests to an SVM-based battery model [191]. This method involves training a model with current, temperature, and SOC as inputs and voltage as the output using SVM. The resulting SVM model is then used as a lookup table for voltage based on hypothetical current/temperature/SOC inputs from the corresponding actual standard test measurements. The virtual test results can then be used to derive resistance and capacity as a basis for SOH estimation. Xing Shu et al. applied a hybrid kernel function-based support vector machine with fixed-size least squares to study the correlation between SOH and feature variables [192]. Using this arrangement, online SOH estimation can be achieved using extreme learning machine algorithms with only intermittent and randomly collected charging data.
Support vector regression (SVR) is an algorithm based on SV [193], which can predict continuous output variables. The performance of SVR is determined by the choice of kernel function and the method of parameter optimization. Li et al. compared SVR models with linear, polynomial, radial basis function (RBF), and sigmoid kernel functions and found that the RBF kernel function performed more reliably on small sample size datasets and in the mid-stage of battery degradation [194].
Other supervised learning methods, including k-nearest neighbors (kNN), have also been applied to battery state estimation. Data that are sufficiently close in the feature space are grouped together so that clusters of data can be identified. The method predicts the class associated with a new input by comparing it with the k nearest-neighbor points in the multidimensional feature space, provided that the different groups are determined by the training data [187]. kNN is generally introduced in the data processing phase. In Ref. [195], the linear statistical k-nearest neighbors (LSKNN) method was used to infer the unknown points in the measurements and suppress the noise; the regression model ultimately used for battery SOH estimation in that work is CSVGPR. kNN methods can also be applied to regression models. Hu et al. developed a non-linear kernel regression model based on kNN regression, which manages to capture the dependence of the capacity on the selected features [196].
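A minimal sketch of kNN used for regression is given below. It is a plain unweighted average over the k nearest training points, not the kernel-weighted model of [196]; the feature values and capacities are hypothetical.

```python
import math


def knn_regress(train_X, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training points."""
    # Pair each Euclidean distance with its target and keep the k closest.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


if __name__ == "__main__":
    # Hypothetical single health feature per sample and normalized capacity.
    features = [(0.0,), (1.0,), (2.0,), (10.0,)]
    capacity = [1.00, 0.98, 0.96, 0.80]
    print(knn_regress(features, capacity, (1.0,), k=3))
```

In practice the features would be scaled before computing distances, since Euclidean distance is sensitive to units.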
Artificial neural networks and deep learning are an important part of machine learning methods. The neural network methodology establishes a network architecture comprising numerous interconnected "neurons" that facilitate the mapping of input features (such as terminal voltage, current, temperature, and impedance) to corresponding outputs, specifically the battery capacity and SOH. Information is shared between nodes through "neuron" connections. An ANN consists of an input layer, one or more hidden layers, and an output layer. In addition, according to the network structure, ANN methods are divided into deep learning algorithms and traditional neural networks. The simplest feedforward neural network (FFNN) contains only one hidden layer, in which the input to each neuron is the sum of the inner product of the weights and inputs and the bias. Typical deep learning networks include recurrent neural networks (RNN) and convolutional neural networks (CNN). In contrast, traditional neural networks include FFNN and back-propagation neural networks (BPNN). All four of these models have been successfully applied to SOH prediction [197][198][199].
Neural network models are greatly influenced by the choice of dataset and feature selection.Thus, convolutional neural networks (CNNs) are commonly employed to gather local variables and extract features from the data.The convolutional layers employ filters (also known as convolutional kernels) that slide across local regions, capturing local patterns within the input data.Pooling layers are used to reduce the dimensionality of the feature maps while preserving the most salient features.CNNs can be suited for handling sequence prediction tasks by reshaping the data into a single dataset, where each row represents a time step.Sliding window operations along the time steps are employed to obtain sequences of the desired length as inputs.To achieve regression prediction, sometimes it is important to incorporate fully connected layers or other types of layers after the CNN model.Bao et al. proposed a CNN-VLSTM-DA model for estimating SOH [200].In their model, CNN is employed to integrate spatial and dimensional information, extracting structural features from the data.An attention mechanism is introduced to allocate different weights to different parts of the data, distinguishing crucial information from irrelevant information.The final output is generated by the VLSTM layer, with the RMSE achieving 0.9%.DCNN, as a specialized form of CNN, exhibits a greater proficiency in learning abstract and high-level feature representations compared to traditional CNNs.Shen et al. utilized a DCNN for cell-level capacity estimation based on measurements of voltage, current, and charge capacity within partial charge cycles [201].This DCNN consists of five convolutional layers and three fully connected layers.Each convolutional stage is composed of a convolutional layer, a pooling layer, and a ReLU activation function.The achieved result demonstrates an RMSE close to 0.36%.
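The sliding window operation along the time steps can be sketched as follows. The function name, the window length, and the sample series are illustrative assumptions; the sketch only shows how a one-dimensional series is cut into fixed-length inputs and their targets.

```python
def sliding_windows(series, window, horizon=1):
    """Cut a sequence into (input window, target) pairs for sequence models.

    Each input holds `window` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


if __name__ == "__main__":
    soh_history = [1.00, 0.99, 0.98, 0.97, 0.96]  # hypothetical per-cycle SOH
    X, y = sliding_windows(soh_history, window=3)
    for row, target in zip(X, y):
        print(row, "->", target)
```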
Battery aging is a gradual process, and sequential data provide crucial observations of battery performance as it evolves over time. The recurrent neural network (RNN) is a widely used neural network architecture for modeling sequential data. Compared to traditional feedforward neural networks, an RNN exhibits the characteristics of memory and recursion when processing sequential data. The core idea of the RNN is to introduce an internal state, also known as a hidden state, which can be propagated and updated across different time steps. At each time step, the RNN receives the current input and the hidden state from the previous time step, computes the output, and updates the hidden state using learned weight parameters. This recursive structure enables the RNN to learn patterns and trends in battery aging data and thus to predict the state of health (SOH) of batteries. Eddahech et al. were early adopters of RNNs for battery aging prediction [202]. They incorporated additional network inputs such as temperature (T), current pulse amplitude (I), and state of charge variation (∆SOC) to forecast the sequence data of capacity and equivalent series resistance. The mean squared errors (MSE) achieved were 0.462 and 0.296, respectively. RNN models have various variants. Among several network structures, long short-term memory (LSTM) networks have received the most attention [203]. The method stands out for its ability to retain previous information, mitigate the vanishing gradient problem, and relieve the exploding gradient problem when training on long sequences. LSTM implements the protection and control of information through three main structures: input gates, forget gates, and output gates. Kwon et al. developed a multilevel long short-term memory (LSTM) model for battery SOH and RUL prediction [204]. The second-level LSTM employed a sliding window for multi-step forecasting. As the sliding window moved, each prediction was fed back as the next input value until the predicted SOH reached the end of life (EOL). The RUL was then deduced from the remaining number of cycles. Zhang et al. pointed out that the LSTM RNN can capture long-term dependencies better than SVM and SimRNN [205]. Traditional RNNs may struggle to learn such long-term dependencies due to the vanishing gradient problem, but LSTMs are able to control the flow of information using gates, allowing them to selectively retain or forget past information as needed. This ability enables LSTMs to effectively analyze events that occurred over a long period of time.
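The recursive multi-step forecasting loop, in which each prediction is fed back as the next input until the end-of-life threshold is reached, can be sketched independently of the network itself. In the sketch below, `predict_next` is a hypothetical stand-in for a trained one-step model such as an LSTM, and `linear_step` is a toy extrapolator used only so that the example runs; neither comes from [204].

```python
def recursive_rul(history, predict_next, window=3, eol=0.8, max_steps=10000):
    """Roll a one-step predictor forward until the predicted SOH reaches EOL.

    Returns the number of predicted cycles until end of life (the RUL),
    or None if EOL is not reached within max_steps.
    """
    buffer = list(history[-window:])
    for step in range(1, max_steps + 1):
        next_soh = predict_next(buffer)
        if next_soh <= eol:
            return step
        # Slide the window: drop the oldest value, append the prediction.
        buffer = buffer[1:] + [next_soh]
    return None


def linear_step(window):
    """Toy one-step predictor (assumption): extend the window's average slope."""
    slope = (window[-1] - window[0]) / (len(window) - 1)
    return window[-1] + slope


if __name__ == "__main__":
    soh_history = [0.90, 0.89, 0.88]  # hypothetical measured SOH values
    print(recursive_rul(soh_history, linear_step, window=3, eol=0.805))
```

Because every prediction becomes an input to the next step, errors accumulate along the horizon, which is one reason long-horizon RUL estimates are usually reported with uncertainty bounds.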
Bidirectional long short-term memory (Bi-LSTM) has two hidden layers, one that processes the sequence in the forward direction and another that processes the sequence in the backward direction. Bi-LSTM is more powerful than LSTM in capturing long-term dependencies in the data, as it can use information from both the past and the future to make predictions. Wang et al. built a deep bidirectional long short-term memory (Bi-LSTM) model to learn CNN forward and reverse dependencies and emphasized the correlation between series data through the attention mechanism (AM) [206].
LSTM networks, like other intelligent algorithms, can be further developed and improved through the use of variations. Ma et al. developed a novel SOH estimation method based on LSTM [207]. Neighborhood component analysis (NCA) and the Pearson correlation coefficient were chosen to perform data screening, and differential evolution gray wolf optimization was proposed to optimize the hyperparameters. Ren Pu et al. proposed an SOH estimation strategy based on the combination of dynamic adaptive cuckoo search (DACS) and LSTM [208]. By combining this with a filtering algorithm, the effect of noise is reduced.

Un-Supervised Learning
Unsupervised learning is a type of machine learning in which the algorithm is not given labeled training data or any prior knowledge about the relationships between variables.Instead, it is tasked with discovering these relationships on its own by analyzing the patterns in the data.Commonly used unsupervised learning algorithms include autoencoders (AE) and principal component analysis (PCA).These algorithms are often used as data preprocessing techniques to remove noise or reduce feature correlation in data sets that will be used for supervised learning.However, evaluating the effectiveness of unsupervised learning can be difficult because the algorithm has difficulty predicting the exact output beforehand.
The autoencoder (AE) is a type of non-linear dimensionality reduction method that uses the input data as supervision to guide the neural network in learning the mapping relationships for data reconstruction. The model typically consists of two components: the encoder and the decoder. AE is a useful tool for recognizing and tracking characteristics of battery aging and is capable of handling the varied and inconsistent nature of battery decay and capacity loss [209,210]. Wu et al. developed a combined convolutional auto-encoder (CAE) and recursive auto-encoder (RAE) framework to extract health features from voltage and temperature profiles during charging and used AdaBoost, an ensemble learning method, to construct SOH estimation models [211]. Jiang et al. utilized a convolutional auto-encoder (CAE) to extract features from raw battery cycle data and introduced a self-attentive (SA) mechanism for SOH estimation [212]. An SA module allows the model to effectively learn the abstract relationships among data and to process features automatically, enhancing the interpretability of the algorithm. Xu et al. proposed a physics-informed dynamic deep autoencoder (PIDDA) for high-accuracy SOH prediction [213]. A Thevenin equivalent circuit model is used to develop state equations for the design of the AE loss functions, so that the autoencoder is guided by the physical process.
Principal component analysis (PCA) reduces the dimensionality of a data set by identifying the directions in which the data varies the most and then projecting the data onto those directions. PCA is a multivariate statistical technique that aims to transform multiple indicators into a few composite indicators called "principal components". There is some loss of accuracy in this process, but a smaller number of composite variables can still capture most of the information in the original variables. Lee et al. considered the effect of cell inconsistency and applied PCA to extract new features related to degradation and inconsistency, which were input to an LSTM-RNN to train regression models for predicting SOH [214]. Banguero et al. applied PCA to internal parameters of the battery energy storage system (capacity, internal resistance, and OCV). The model retained 80.25% 0.25% of the total variability and achieved rapid degradation diagnosis of the battery system [215]. Liu et al. investigated the aging mechanism of NCA batteries through ICA, DVA, and probability density function (PDF) analysis methods. The PCA algorithm was then employed to select features from the original factors that were extracted from the IC, DV, and PDF curves, in order to reduce multicollinearity and improve the accuracy of SOH assessment. The relationship between the reduced features and SOH was established using ordinary least squares regression [216].
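How a principal component and its share of the total variability are obtained can be illustrated with a small dependency-free sketch. It finds only the first component, using power iteration on the sample covariance matrix, and it does not standardize the features, which a practical analysis of quantities with different units normally would. The function name and data are hypothetical.

```python
def first_principal_component(rows, iters=200):
    """Return (direction, explained_variance_ratio) of the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; the uneven start vector is an arbitrary choice that
    # avoids the common case of starting orthogonal to the leading direction.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break  # degenerate case, e.g. zero covariance
        v = [x / norm for x in w]
    unit = sum(x * x for x in v) ** 0.5
    v = [x / unit for x in v]
    # Rayleigh quotient gives the variance captured along v.
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total = sum(cov[i][i] for i in range(d))
    return v, (eigval / total if total > 0 else 0.0)


if __name__ == "__main__":
    # Two hypothetical, strongly correlated health features.
    data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]
    direction, ratio = first_principal_component(data)
    print(direction, ratio)
```

The returned ratio is the fraction of total variance retained by the component, the same kind of quantity as the retained variability reported in PCA studies.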

Semi-Probabilistic Algorithms
Probabilistic models are capable of estimating probability distributions of unknown variables and can be utilized for probabilistic inference and handling uncertainty, but may not be as flexible as non-probabilistic models and may not capture complex relationships within the data. On the other hand, non-probabilistic models have the ability to automatically learn features from the data, allowing for a better fit for complex data, but are sensitive to the quality of the training data. A semi-probabilistic model is a statistical model that incorporates both probabilistic and non-probabilistic components, so that both uncertainty and complex relationships between data are adequately considered. Two examples of such models are generative adversarial networks (GANs) and Bayesian neural networks (BNNs).
Generative adversarial networks (GANs) are generative models trained through an adversarial process and are commonly used in unsupervised and semi-supervised learning. Inspired by the zero-sum game in game theory, a GAN treats generation as a contest between two networks, the generator and the discriminator: the generator produces synthetic data from a given noise input (typically drawn from a uniform or normal distribution), and the discriminator distinguishes the output of the generator from the real data. The former tries to generate data closer to the real data, and accordingly, the latter learns to better distinguish real from generated data. Thus, the two networks improve through this confrontation, and the data produced by the generator becomes increasingly close to the real data, so that the desired data can be generated. Table 1 presents the estimated accuracy levels of various machine learning methods, including GAN models. In Ref. [218], a shallow layer-sharing mechanism between discriminators and regressors is developed to capture aging knowledge for regularization, along with a regression generative adversarial network (RGAN) scheme for SOH estimation based on the optimal correlation between SOH and features. Kim et al. proposed a completely unsupervised approach to SOH monitoring using the information maximization generative adversarial network (InfoGAN) to extract features from EIS data and estimate battery capacity using Gaussian process regression [219]. The InfoGAN extracts interpretable and meaningful representations by introducing variational lower bounds to maximize the mutual information between latent variables and observations [224]. The combination of GAN and LSTM has received a lot of attention. Yang et al. used an improved GAN to process feature data and complete normalization. LSTM was then used to learn the mapping relationship between features and SOH, and the adaptability of the LSTM networks was enhanced using transfer learning (TL) [225].
A Bayesian network (BN) is a directed acyclic graph (DAG) that consists of nodes representing variables and directed edges connecting these nodes. Bayesian neural networks (BNNs) are a type of machine learning model that combines the principles of Bayesian statistics and artificial neural networks. BNNs incorporate uncertainty into their predictions by treating the model's parameters as random variables, each with a learned distribution. This means that the training process of a BNN not only determines the optimal values of the parameters but also estimates their uncertainty. As a result, BNNs are able to model both deterministic and uncertain relationships, which has been demonstrated to be effective in utilizing actual operational data for forecasting. Huo et al. proposed a method for estimating the SOH of a battery using a Bayesian network, which allows for the incorporation of the probabilistic nature of battery degradation [226]. They demonstrated the effectiveness of this approach using real-world data collected from the operation of electric taxis. The network is made up of four types of variables, for each of which an appropriate distribution type is selected. The BN model is then trained using complete training data to obtain a parameterized BN. The Markov Chain Monte Carlo (MCMC) method is used to generate posterior distribution samples for capacity estimation. Validation data is processed with sparsification for use in the Metropolis-Hastings algorithm. When the number of samples meets the mix state criterion, the capacity samples are collected and fitted with a Beta distribution, with the mode value serving as the estimation result. The results indicated that the model has the potential to follow the increasing SOH dispersion trend but failed to predict future changes in SOH. The dynamic Bayesian network (DBN) is a BN model for time series data that is able to model dynamic systems. Researchers have successfully utilized DBNs to estimate SOH and predict RUL in the absence of features [227,228].
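The Metropolis-Hastings step used to draw posterior samples of capacity can be illustrated on a toy problem. The sketch below assumes a flat prior on normalized capacity, Gaussian measurement noise with a known standard deviation, and a symmetric random-walk proposal; it reports the sample mean rather than fitting a Beta distribution, and none of the names or values are taken from [226].

```python
import math
import random


def log_posterior(capacity, observations, noise_std=0.02):
    """Unnormalized log posterior: flat prior on (0, 1.2) times Gaussian likelihood."""
    if not 0.0 < capacity < 1.2:
        return -math.inf
    return -sum((o - capacity) ** 2 for o in observations) / (2 * noise_std ** 2)


def metropolis_hastings(observations, n_samples=5000, step=0.01, seed=0):
    """Random-walk Metropolis-Hastings sampler for the capacity posterior."""
    rng = random.Random(seed)
    current = 1.0  # arbitrary starting point inside the prior support
    current_lp = log_posterior(current, observations)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, observations)
        # Symmetric proposal: accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples[n_samples // 5:]  # discard the first 20% as burn-in


if __name__ == "__main__":
    measured = [0.90, 0.91, 0.89, 0.90]  # hypothetical noisy capacity readings
    draws = metropolis_hastings(measured)
    print(sum(draws) / len(draws))
```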

Applications of Knowledge-Based AI and Knowledge Graphs
Connectionism is a theory that models the functioning of neurons, with deep neural networks being the most well-known application.Symbolicism, on the other hand, aims to create learned artificial intelligence and is exemplified by knowledge graphs.Knowledge graphs, as exemplars of symbolicism, are self-describing knowledge bases that provide a structured framework for storing and managing vast amounts of data.They enable the efficient transformation of information into knowledge through techniques such as association, fusion, and inference.By representing relationships and dependencies between entities, knowledge graphs facilitate advanced reasoning and support various applications, including information retrieval, knowledge discovery, and decision-making.The essence of a knowledge graph lies in its collection of interconnected terms, data features, and language from various domains.It stores and connects data in a semantic manner, where each distinct class, object, and relationship is represented by a unique Uniform Resource Identifier (URI).The mapping from URIs to classes or objects ensures explicitness and disambiguation of information, ultimately guaranteeing machine readability.Consequently, semantic knowledge graphs can mitigate the friction of cross-domain communication and information ambiguity.The structured nature of knowledge graphs allows for efficient querying and extraction of relevant information, contributing to the effective utilization of large-scale heterogeneous and dynamic data sources.Currently, knowledge graphs are used for the automatic exploration of battery materials [229] and optimization of energy storage systems [230].
Moreover, an intelligent data management solution such as a knowledge graph-based expert system offers a dependable approach for the integration and traceability of historical data from multiple sources in a BMS.In another work of ours, we also made an attempt to use knowledge graphs for online search and fault inference for batteries [231].Based on a cloud-based management platform's fault logs, the computation of target node entity features was performed using Bi-LSTM, leading to the construction of a knowledge graph in the domain of onboard power battery faults.Leveraging the knowledge graph enables the realization of an online battery system fault query interface, as well as functionalities such as cloud-based testing for fault cause inference and intelligent fault diagnosis.
The BMS continuously monitors the batteries and generates a large volume of data, and effective data management techniques are necessary to extract additional insights from this historical data.To address such a challenge, Kalaycı et al. developed a knowledge graph-based data integration framework for BMS to facilitate data access and analysis [232].Neo4j is used as the graph database in this framework.To validate the efficacy of this data management methodology, an experiment was conducted to monitor the abnormal temperature of batteries.The relational schema of the SQL database, which encompasses the raw measurement data, serves as the foundation for the knowledge graph (KG).Within this framework, the algorithm execution engine is activated, utilizing the time series data stored in the source relational database as input.The algorithm's task is to identify potential temperature outliers and their corresponding intervals, which are subsequently transformed into new nodes for seamless integration into the KG.
The application of knowledge graphs in battery management systems can provide a qualitative analysis of battery aging and SOH decrease. By utilizing knowledge graphs, a database can effectively store various types of information about a battery, including its type, material, capacity, voltage, temperature, common health features (such as EIS, ICA, DVA, etc.), internal sensor signals (such as ultrasonic, fiber optic, pressure, etc.), external environmental signals, and information related to abnormal states. These characteristics can be represented as nodes in the graph, while the correlations between the data can be represented as edges. The use of knowledge graphs allows for the creation of visual relational networks that assist in the identification of more reliable health features (HFs) and the development of aging evaluation models with higher accuracy. By applying graph knowledge and machine learning algorithms to graph data, it is possible to perform tasks such as relationship extraction, entity clustering, and relationship prediction.
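The node-and-edge representation described above can be sketched as a set of subject-predicate-object triples. The class, the entity names, and the predicates below are hypothetical and chosen only for illustration; a production system would typically use a graph database and URIs for each entity.

```python
class TinyKnowledgeGraph:
    """Minimal in-memory triple store: (subject, predicate, object)."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def objects(self, subject, predicate):
        """All objects linked from `subject` by `predicate`."""
        return sorted(o for s, p, o in self.triples if s == subject and p == predicate)

    def subjects(self, predicate, obj):
        """All subjects linked to `obj` by `predicate`."""
        return sorted(s for s, p, o in self.triples if p == predicate and o == obj)


def build_example_graph():
    """Populate a graph with hypothetical battery entities and relations."""
    kg = TinyKnowledgeGraph()
    kg.add("cell_001", "hasChemistry", "NMC")
    kg.add("cell_001", "hasHealthFeature", "ic_peak_height")
    kg.add("cell_001", "hasHealthFeature", "charge_time")
    kg.add("cell_001", "hasSensorSignal", "ultrasonic_amplitude")
    kg.add("ic_peak_height", "correlatesWith", "SOH")
    return kg


if __name__ == "__main__":
    kg = build_example_graph()
    features = kg.objects("cell_001", "hasHealthFeature")
    linked_to_soh = set(kg.subjects("correlatesWith", "SOH"))
    # Health features of the cell that are also linked to SOH in the graph.
    print([f for f in features if f in linked_to_soh])
```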
Graph neural networks (GNNs) are an active area of study in the field of machine learning. GNNs are able to incorporate information from the structure of the graph, such as the relationships between nodes and edges, in order to make predictions or decisions. This allows GNNs to learn from the connections within a graph, which can be particularly useful for tasks such as predicting the properties of individual nodes within the graph. There have been several studies on the use of GNNs to extract battery health features.
As a typical example, Yao et al. proposed a method for estimating SOH using a fusion of multiple sources of features. This method involves extracting 27 health indicators from various sensors as a feature matrix and constructing a connection matrix based on Pearson correlation coefficients [220]. These indicators are then classified into three categories based on their correlation coefficients and trained using different models. GraphSAGE (a type of graph neural network) is used to extract deeper information from these indicators and obtain updated results, while a fully connected neural network is used to construct a regression model for predicting SOH. The follow-up work [233] indicated that in the CL-GraphSAGE (CNN-LSTM-GraphSAGE) framework, LSTM is used to extract temporal features, and GraphSAGE is used to obtain spatial features. The graph convolutional network (GCN) is a type of GNN that performs convolutional operations on the graph structure. Wei et al. proposed a novel method for SOH and RUL estimation based on time-series data (e.g., time to the maximum voltage in charge cycles, time to the maximum temperature in discharge cycles, etc.). An undirected graph with optimal entropy was constructed to identify the correlation among features through the topological structure. Then, based on this graph, two GCNs with different attention mechanisms were built for predicting SOH and RUL. The experimental results demonstrated that the method exhibits improved accuracy compared to LSTM and GPR methods [221].
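Building a connection matrix from Pearson correlation coefficients can be sketched as follows. Turning the coefficients into edges by thresholding their absolute value is an assumption made here for illustration, as are the threshold, the feature names, and the data; the exact construction in [220] may differ.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # Note: a constant sequence has zero variance and would divide by zero.
    return cov / (var_a * var_b) ** 0.5


def connection_matrix(features, threshold=0.8):
    """Adjacency lists linking features whose |correlation| meets the threshold."""
    names = list(features)
    return {
        u: [v for v in names
            if v != u and abs(pearson(features[u], features[v])) >= threshold]
        for u in names
    }


if __name__ == "__main__":
    # Hypothetical health indicators sampled over four cycles.
    indicators = {
        "ic_peak": [1.0, 2.0, 3.0, 4.0],
        "charge_time": [2.0, 4.0, 6.0, 8.1],
        "ambient_temp": [5.0, 1.0, 4.0, 2.0],
    }
    print(connection_matrix(indicators))
```

The resulting adjacency lists define the graph on which a GNN such as GraphSAGE or a GCN would aggregate information between correlated indicators.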

Multi-Model Fusion and End-Cloud Collaborative Framework for SOH Estimation
Automobiles, being complex systems, are well-suited for the implementation of the Internet of Things (IoT). To improve the accuracy of SOH estimation models, intelligent end-to-cloud convergence solutions are considered. Cloud computing, aided by low-latency communication technologies such as 5G, can provide real-time model updates and data hosting for BMS. In the cloud-to-thing framework, two key requirements must be met: the accuracy of the model in the cloud and the real-time availability of data at the edge. Figure 6 illustrates a proposed approach utilizing an end-cloud fusion strategy, aiming to enhance both the accuracy of SOH estimation and the estimation speed.

Cloud-Side Highly Accurate Model
Cloud computing is a distributed computing architecture that provides computing resources and data storage services through networked servers, addressing the issue of insufficient computing power for edge computing.In the context of a cloud-converged BMS architecture, the cloud server can utilize its strong data storage capacity and computing power to perform analysis of battery usage data and train high-precision cloud-based battery models.The use of intelligent algorithms for SOH assessment has gained significant attention due to their ability to adapt and make variable adjustments based on inputs and outputs, as opposed to fixed mathematical formulas.The flexibility of intelligent algorithms, combined with multiple data sources and multi-model fusion, can make use of the flexible computing resources of cloud computing to reduce the risk of overfitting by leveraging the deviation of training data from different models.Additionally, the vast data storage capacity of cloud servers enables more opportunities for the development of algorithms.Intelligent algorithms with multi-model fusion show particular advantages in terms of the accuracy of SOH prediction [236].Common multi-model fusion methods include ensemble learning and multi-source data fusion.
Ensemble learning is a mainstream model fusion approach that includes both Bagging and Boosting strategies. In Refs. [222,223], the Bagging strategy was utilized to construct a multi-model fusion approach for battery SOH estimation, resulting in improved accuracy compared to single standalone models. The Bagging algorithm primarily aims to reduce variance and improve stability by training multiple models on resampled versions of the training data. On the other hand, the Boosting ensemble learning method focuses on reducing bias and involves training multiple weak learners in sequence, which are eventually combined through weighting to form a strong learner. Qin et al. designed a novel prediction framework based on Gradient Boosting (GB). The addition of a small number of classification and regression trees (CART) for relearning enhanced the accuracy of the individual algorithms [237].
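The sequential, bias-reducing character of Boosting can be illustrated with a minimal gradient boosting sketch for squared error, in which each new one-split regression tree is fitted to the residuals of the current ensemble. This is a generic textbook construction under assumed settings (learning rate, number of rounds, single feature), not the framework of [237].

```python
def fit_stump(xs, targets):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the maximum so the right side is never empty
        left = [y for x, y in zip(xs, targets) if x <= t]
        right = [y for x, y in zip(xs, targets) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: fall back to the mean
        mean = sum(targets) / len(targets)
        return (xs[0], mean, mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=100, learning_rate=0.1):
    """Fit stumps sequentially to the residuals of the running prediction."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of squared error
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def boosted_predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


if __name__ == "__main__":
    cycles = list(range(20))
    soh = [1.0 - 0.01 * c for c in cycles]  # hypothetical linear fade
    model = gradient_boost(cycles, soh)
    print(boosted_predict(model, 0), boosted_predict(model, 19))
```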
Another multi-model fusion strategy draws on multiple data sources. This approach uses a variety of advanced sensors and signal-processing algorithms to obtain diverse health factors for estimating the SOH of a battery. By fusing these data sources and reducing their dimensionality, the unique characteristics of each data set can be preserved while still being used effectively in the SOH evaluation. Yao et al. used health factors from 27 different data sources as features in an SOH prediction model and achieved good accuracy [220]. Hu et al. compared the computational efficiency and effectiveness of various strategies for extracting health factors [238] and found that the fusion strategy for selecting health factors was the most recommended.
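As a simple illustration of screening and fusing health factors, the sketch below ranks candidate factors by the absolute Pearson correlation with SOH, keeps the top k, and averages their sign-aligned z-scores into one fused indicator. This is only one of many possible schemes and is not the procedure of Refs. [220] or [238]; the factor names, the synthetic data, and the helper functions are hypothetical.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def zscore(a):
    """Standardize a sequence to zero mean and unit variance."""
    n = len(a)
    m = sum(a) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in a) / n)
    return [0.0] * n if sd == 0 else [(x - m) / sd for x in a]

def select_health_factors(factors, soh, k):
    """Keep the k factors most strongly correlated with SOH."""
    ranked = sorted(factors, key=lambda name: abs(pearson(factors[name], soh)),
                    reverse=True)
    return ranked[:k]

def fuse(factors, selected, soh):
    """Fuse the selected factors as the mean of their sign-aligned z-scores."""
    fused = [0.0] * len(soh)
    for name in selected:
        sign = 1.0 if pearson(factors[name], soh) >= 0 else -1.0
        z = zscore(factors[name])
        fused = [f + sign * v / len(selected) for f, v in zip(fused, z)]
    return fused

# Synthetic, hypothetical health factors from different "sources".
rng = random.Random(0)
soh = [1.0 - 0.005 * i for i in range(40)]
factors = {
    "ic_peak_height": [2.0 * s + rng.gauss(0.0, 0.01) for s in soh],
    "cc_charge_time_s": [3600.0 * s + rng.gauss(0.0, 20.0) for s in soh],
    "internal_resistance_mohm": [50.0 + 100.0 * (1.0 - s) + rng.gauss(0.0, 0.5)
                                 for s in soh],
    "ambient_noise": [rng.gauss(0.0, 1.0) for _ in soh],
}

selected = select_health_factors(factors, soh, k=3)
fused_indicator = fuse(factors, selected, soh)
print(selected, round(pearson(fused_indicator, soh), 3))
```

Standardizing before fusion keeps factors with large numerical ranges (such as a charge time in seconds) from dominating the fused indicator.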
The comparison in Table 1 suggests that applying a well-designed feature extraction network, such as a CNN or AE, before regression prediction significantly enhances estimation accuracy. Furthermore, models that capture hidden relationships among the data are effective in improving estimation precision. For instance, LSTM can capture long-term dependencies between data points; GNN can capture the connections between data fragments and the overall context; the attention mechanism suits problems that involve importance weights and local focus in the data; and ensemble learning can exploit the relationships among different models. Therefore, most studies combine multiple approaches to achieve optimal results. Currently, data-driven approaches are the most commonly used methods for estimating the SOH of a battery. With large amounts of complex operational data available in the cloud, big data technology can be used to extract valuable insights. In particular, big data analysis can be applied to historical data stored in the cloud, as well as to data on the battery's working environment and usage patterns, in order to understand trends in battery performance and usage. This can aid the development of more refined models and enable comprehensive life cycle management of the battery.

End-Side Highly Real-Time Model
A high-real-time model is implemented on the edge (or the "end-side") of a network, near the data source, rather than in the cloud or on a centralized server. This approach reduces latency and increases data processing speed by bringing computation closer to the data source, which is especially beneficial in applications that require real-time monitoring and control. Edge-based real-time models and parameter monitoring are the basis for ECU control decisions and cloud computing. Real-time monitoring is crucial for accurately estimating the SOH. The BMS provides high-precision parameters (voltage, current, temperature, etc.) for battery SOH estimation, which guide the real-time updates of complex models in the cloud and the parallel simulation of battery digital twins. Low-latency communication technologies such as 5G make it possible to build a cloud platform with real-time mapping of the BMS and to update the vehicle-side model and parameters based on the behavior of the battery at different stages. The end-cloud framework can combine the advantages of models with low floating-point operation (FLOP) counts on the edge and high-precision models in the cloud, using an adaptive algorithm to improve the accuracy and responsiveness of the BMS in estimating the battery SOH.
In an edge-based high-real-time model, Kalman filtering can be used to estimate the state of a vehicle or other system in real time by processing sensor data from the edge devices and predicting the system's future behavior. Figure 6 illustrates a common edge-cloud fusion framework in which data uploaded by the BMS undergoes complex algorithmic processing. Subsequently, battery aging characteristics are extracted to train high-accuracy AI models. However, data features can only provide limited information. To fully achieve vehicle-cloud collaborative battery management and perform real-time functions such as lithium plating detection, heat generation calculation, and state estimation, it is necessary to integrate mechanistic and AI approaches through electrochemical models. From the perspective of SOH estimation, electrochemical models can offer more comprehensive features and directly compute theoretical capacity degradation. However, complex computational models are only part of this architecture: onboard BMS are limited by cost and can only perform calculations of lower complexity, such as low-order equivalent circuit models and Coulomb counting. As open-loop methods, these are prone to cumulative errors. Kalman filters can therefore be employed to correct the onboard models, using the cloud model output as observations. In conclusion, through targeted integration of cloud-based machine learning or electrochemical models, it is possible to refine the estimation results of the onboard BMS, ensure real-time estimation at the vehicle end, and obtain more information on the internal battery response.
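The correction step described above can be sketched with a scalar Kalman filter. The sketch assumes, purely for illustration, that the onboard state is the state of charge (SOC) propagated by Coulomb counting and that the cloud model returns a noisy SOC estimate at every fifth time step; the function names, noise variances, and cell parameters are hypothetical and do not come from the cited works.

```python
import random

def coulomb_step(soc, current_a, dt_s, capacity_ah):
    """Open-loop Coulomb counting; discharge current is taken as positive."""
    return soc - current_a * dt_s / (3600.0 * capacity_ah)

def fuse_edge_cloud(soc0, p0, currents, cloud_obs, dt_s, capacity_ah, q, r):
    """Scalar Kalman filter: edge model predicts, cloud output corrects.

    q is the assumed process-noise variance of the onboard model and
    r is the assumed variance of the cloud estimate.
    """
    soc, p = soc0, p0
    history = []
    for current_a, z in zip(currents, cloud_obs):
        # Prediction with the low-complexity onboard model.
        soc = coulomb_step(soc, current_a, dt_s, capacity_ah)
        p = p + q
        # Correction only when a cloud estimate has arrived.
        if z is not None:
            gain = p / (p + r)
            soc = soc + gain * (z - soc)
            p = (1.0 - gain) * p
        history.append(soc)
    return history

# Hypothetical scenario: 2 Ah cell, constant 2 A discharge, 10 s sampling.
rng = random.Random(42)
DT_S, CAPACITY_AH, N_STEPS = 10.0, 2.0, 100
currents = [2.0] * N_STEPS

true_soc, s = [], 0.9
for i_k in currents:
    s = coulomb_step(s, i_k, DT_S, CAPACITY_AH)
    true_soc.append(s)

# The cloud estimate is modeled as truth plus noise, every fifth step.
cloud_obs = [true_soc[k] + rng.gauss(0.0, 0.01) if k % 5 == 0 else None
             for k in range(N_STEPS)]

# The onboard estimate starts with a deliberately wrong initial SOC of 0.7.
open_loop, s = [], 0.7
for i_k in currents:
    s = coulomb_step(s, i_k, DT_S, CAPACITY_AH)
    open_loop.append(s)

fused = fuse_edge_cloud(0.7, 0.05, currents, cloud_obs, DT_S, CAPACITY_AH,
                        q=1e-7, r=1e-4)

err_open = abs(open_loop[-1] - true_soc[-1])
err_fused = abs(fused[-1] - true_soc[-1])
print(err_open, err_fused)
```

Between cloud updates the filter runs on Coulomb counting alone, so the vehicle-side estimate stays real-time, while each cloud observation pulls the accumulated open-loop error back toward the cloud model's output.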
With breakthroughs in communication technology and the development of big data technology, the end-cloud collaborative architecture has become the solution for intelligent, informative, networked, and Internet-oriented BMS [239], aiming to enhance the real-time performance and decision-making support of the BMS. Digital twin technology, which connects the physical and digital realms, enables the real-time transfer of data between the BMS and cloud-based servers. In our previous work, CHAIN (Cyber Hierarchy and Interactional Network) was proposed to support the interconnection and coordination of devices, services, and systems through a multidimensional hierarchy [240]. The unique hierarchical network architecture of CHAIN integrates multi-scale battery material design [241], multi-dimensional battery system design, and multi-layer end-cloud collaboration. In addition, it can systematically gather and record production information, beginning with detailed material-level information for simulation and subsequently mapping it to the design process of battery packs. Within this framework, an initial forecast of battery performance can be obtained, serving as a foundation for further tailoring battery management strategies. Yang et al. applied the CHAIN architecture to integrate end sampling, edge computing, and cloud computing into a flexible and scalable BMS that enables battery state estimation, thermal management, and fault diagnosis [239]. The CHAIN architecture thus provides a solution for the intelligent management of battery systems.
In another study of ours, we devised a robust end-cloud fusion approach. This approach executes computationally intensive data-driven models (CNN-LSTM) in the cloud while employing Coulomb counting to construct state functions on the vehicle side [242]. The Kalman filter algorithm was used to integrate the results obtained from the vehicle side and the cloud side. The proposed methodology rapidly corrects the initial error and yields an RMSE below 1.5%, demonstrating high precision. This integration of two models through end-cloud fusion offers a useful reference. To enhance the implementation of the end-cloud collaborative architecture, future work could focus on a fusion approach that combines multi-scale mechanistic models with machine learning models. This integration has the potential to expand the predictive capabilities of the models, particularly under high-rate charging/discharging and low-temperature conditions.

Conclusions
The accurate prediction of the SOH is crucial for ensuring the reliable operation of battery systems in electric vehicles, given the complexity of battery degradation and the potential for safety issues arising from aging. This article examines various methods for estimating the SOH of lithium-ion batteries in EVs. The advantages and limitations of different experimental techniques used for analyzing battery aging mechanisms and estimating SOH are discussed. Then, techniques for SOH estimation based on models and machine learning algorithms are examined. Research in this field has yet to yield a single optimal solution for SOH estimation, and the appropriate method should be chosen based on the available data and computational cost. Current trends in battery state estimation include the use of multi-stage, multi-model fusion techniques for improved accuracy. With the rise of intelligent transportation and connected vehicle technologies, there is an opportunity to use artificial intelligence and cloud-based platforms for SOH and remaining useful life predictions. The use of complex models and knowledge mapping techniques can aid the accurate estimation of battery SOH and a thorough understanding of battery aging at various levels. This paper also highlights the potential use of battery management systems for monitoring operational parameters and creating digital twin models for real-time transmission and cloud-based computation.
For data-driven models, leveraging the relationships among data undoubtedly enhances the accuracy of SOH estimation. However, even with efficient estimation algorithms, vehicle-based SOH estimation still faces the following challenges. Firstly, calculating the SOH of a battery pack is challenging. Aggregating aging information from individual cells and determining the overall SOH estimation of the battery pack requires careful consideration of various factors such as battery heterogeneity, aging synchronization, and battery balancing techniques. Secondly, the availability and quality of real-world data may also serve as limiting factors. Appropriate data cleaning, filtering, fusion, feature extraction, and clustering approaches need to be employed to address this issue.
The end-cloud collaborative architecture still faces many challenges. One is the integration of first-principles-based material calculations with artificial intelligence methods. Since SOH is an indirectly measurable quantity, it can only be estimated. Therefore, machine learning models can utilize the outputs of electrochemical models as data to achieve real-time simulation and extract implicit knowledge. Furthermore, the aging process of batteries is influenced by various factors such as operating conditions, temperature variations, and usage patterns. Developing comprehensive and accurate aging models to capture the complex interactions among these factors in real time is a challenging task. Considering that first-principles-based models can simulate the failure or aging behavior of materials, digital twin simulations can provide insights into the underlying mechanisms of battery behavior under challenging conditions such as high-rate discharge or low temperature. This, in turn, can improve the accuracy of battery SOH estimation in extreme environments.
With the advancement of chip computing power and the implementation of integrated storage and computation architectures, BMS have the potential to integrate multidimensional state estimation, fault prediction, and charging strategy optimization. In this context, we believe that knowledge graph technology has wide-ranging applications. To effectively leverage heterogeneous and dynamically changing data and explore key information influencing battery aging, the cloud-based data management approach of knowledge graphs can be employed. By harnessing such complex systems, AI models driven by both knowledge and data can be developed to comprehensively analyze vari-

Figure 1. Different experimental analysis methods for SOH estimation.
Batteries 2023, 9, x FOR PEER REVIEW 11 of 39 fitting functions such as polynomial or exponential functions to approximate the cell's behavior based on known data (Figure 4a,b).

Figure 4. Model-based methods for SOH estimation. (a) The capacity degradation curves [135]. (b) RUL prediction based on SOH with EOL [136]. Copyright 2018, Elsevier. (c) Equivalent circuit model with two RC networks; (d) the statistical results of the voltage error under the DST test [137]. Copyright 2012, Elsevier. (e) Electrochemical model considering internal and external factors [138]. Copyright 2022, Elsevier. (f) Schematic diagram of the P2D model [139]. Copyright 2020, Elsevier.

While these models can be useful for making predictions, they are ultimately only approximate representations of the cell's true behavior [140]. Han et al. derived a power law battery aging model based on the Arrhenius equation, which includes temperature as a parameter and is a function of the number of cycles [141]. Zhang et al. developed a battery RUL prediction framework based on an exponential empirical model and a particle filter. The nonlinear least squares technique was used to fit an exponential function to the capacity degradation data of lithium-ion batteries [135]. The analysis showed that the exponential model was well suited to fitting the capacity degradation data. The parameters of the empirical resistance model depend on the SOH. Ecker et al. developed an empirical resistance model to describe the battery's aging process and to predict its lifespan [142,143]. Empirical models have relatively few parameters and are easy to implement for onboard SOH monitoring, but they tend to generalize poorly outside the scope of the test data used for model training. Empirical models, being open-loop models, tend to exhibit significant cumulative errors. Their primary objective is to quantify battery degradation information while minimizing computational demands. As a result, empirical models often exhibit limited

Figure 5. Typical machine learning methods used for SOH estimation.

Table 1. Simple comparison of SOH estimation methods.

Zhao et al. introduced a scheme that uses generators to generate auxiliary training samples and utilizes discriminators to learn real samples to monitor abnormal aging indicators.