A Review of Modern Machine Learning Techniques in the Prediction of Remaining Useful Life of Lithium-Ion Batteries

Abstract: The sharp rise in air pollution caused by vehicular emissions is one of the main causes of changing weather patterns and deteriorating health conditions. Furthermore, renewable energy sources, such as solar, wind, and biofuels, suffer from weather- and supply-chain-related uncertainties. Energy stored in the battery of an electric vehicle offers an attractive option for overcoming these emissions and uncertainties to a certain extent. The development and implementation of cutting-edge electric vehicles (EVs) with long driving ranges, safety, and higher reliability have been identified as critical to decarbonizing the transportation sector. Nonetheless, capacity fade with time and usage, environmental degradation factors, and end-of-life repurposing pose significant challenges to the use of lithium-ion batteries. In this respect, determining a battery's remaining useful life (RUL) establishes its efficacy. It also aids in the testing and development of various EV upgrades by identifying factors that will increase and improve their efficiency. The process involves several nonlinear and complicated parameters. Machine learning (ML) methodologies have proven to be a promising tool for modeling and optimizing engineering problems of this kind, which are marked by nonlinearity and complexity. In contrast to approaches limited by the scalability and time demands of battery degradation testing, ML techniques provide a non-invasive solution with high accuracy and minimal processing. Based on recent research, this study presents an objective and comprehensive evaluation of these challenges. RUL estimation is explained in detail, including examples of its approach and applicability. Furthermore, several ML techniques for RUL evaluation are studied thoroughly and individually. Finally, an application-focused overview is offered, emphasizing the advantages in terms of efficiency and accuracy.


Introduction
Electrification of transportation infrastructure is one of the most significant options for delivering environmentally friendly and sustainable solutions. It addresses both the ever-increasing need for ecologically friendly energy and the rising expense of transportation [1]. It is essential for the global continuance of sustainable development. EVs, both entirely battery-powered EVs and hybrid EVs (HEVs) that run on both conventional fuel and batteries, are expected to capture the low-emission market [2]. The most significant element of an EV is the energy storage device, i.e., the battery. The invention of the rechargeable lead-acid battery in the 19th century paved the way for the first electric cars to become accessible to the general public. Subsequently, electric cars saw a surge in popularity that persisted until the early decades of the 20th century [3][4][5].
The number of electric vehicles sold at the time was almost double that of fossil fuel-powered automobiles, a record that has yet to be exceeded. Lithium-ion (Li-ion) batteries are essential to making electric vehicles a reality today [6]. These batteries are intended to power modern electric cars adapted to human needs. As electric cars become more widespread in road transportation, this count is expected to decline [7]. This may not be the sole reason for reviving a century-old concept that has been dormant for a very long time; nevertheless, this time, it is a financially feasible and marketable product that can compete with fossil fuel-powered automobiles. Unlike traditional cars, EVs create little to no noise, require little effort to operate, and have lower fuel costs. This is a big advantage, particularly if oil prices continue to climb. The technology developed so far has the potential to be used in urban transportation to address concerns such as low-cost public transportation and traffic congestion. It does not use any of the stored energy and generates very little waste [8,9].
Batteries 2023, 9, 13
Li-ion batteries are prevalent in both the transportation and energy storage sectors. As one of the costliest components, and one that provides a crucial function, they must be properly managed and monitored [10]. A longer battery lifespan is crucial not only for the economic viability of EVs, but also for the infrastructure that combines renewable energy sources with smart grids. The deterioration of batteries during usage is one of the most critical and hardest challenges to overcome. This issue has become a limiting factor in battery lifetime [11]. Depending on how the battery is used, its lifespan may vary substantially due to the various degradation processes. A Li-ion battery is a time-varying, dynamic, and nonlinear electrochemical system with complicated internal mechanisms. These qualities make the battery harder to understand. When exposed to an increasing number of charge and discharge cycles, the performance and lifetime of a Li-ion battery decline precipitously [12,13].
There are several causes of battery deterioration, some of which are chemical in origin, while others are more physical, such as thermal stress or mechanical stress. Figure 1 depicts the battery degradation processes that occur most often. Several forms of degradation contribute to a battery's aging, and they may be divided into two basic categories: loss of lithium inventory, in which side reactions deplete the lithium reserves, and loss of active material, which reduces the storage capacity available [14,15]. Among these, exfoliation of graphite, binder deterioration, loss of electrical contact owing to corrosion of the current collectors, and electrode particle cracking are the primary reasons for active material loss. Degradation of the solid electrolyte (SE) interphase film, electrolyte decomposition, and lithium plating are the primary causes of lithium depletion. Importantly, these degradation processes are inextricably linked to the materials themselves. Because the working voltage of a graphite anode lies below the electrochemical window of popular electrolytes, an SE interphase layer forms [16,17]. In contrast, SE interphase film formation would not occur on an anode made of lithium titanium (LT) oxide, since the LT oxide's operating potential remains within the electrochemical window of the electrolyte. Another example is the smaller volume change of a lithium iron-based cathode compared with a cathode made of lithium manganese oxides, which results in less structural deformation [18,19]. In addition to variations in the materials, the degradation processes vary substantially depending on the battery's operating conditions and its design. For example, the risk of lithium plating is much greater during rapid charging than during discharge. In the design of the battery, smaller cathode particles result in reduced stresses, which in turn leads to less particle breakage; however, owing to the high specific surface area, this also increases cathode material dissolution [20][21][22][23].
Due to the complexity of the process through which batteries deteriorate, it is difficult to estimate with precision how long a battery will continue to perform. Nonetheless, such an estimate is important for thermal management of battery packs to maintain consistent operation, for timely maintenance, and for future applications involving battery reuse [24,25]. The primary aim of condition monitoring for batteries is to anticipate their end-of-life (EOL) cycle and evaluate the uncertainties associated with the predicted values. Meng et al. [26] employed NASA's dataset to forecast the early EOL of four battery cells using a hybrid technique integrating empirical mode decomposition (EMD) and a particle filter (PF). It was observed that the uncertainty of the EOL prediction clearly declines when prediction begins later in the operating cycle.
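To make the EOL and RUL notions concrete, the following is a minimal Python sketch. It assumes that EOL is reached when the measured capacity first falls below 80% of its initial value; that threshold is a common convention and is not stated in this text. The function name, the data layout, and the synthetic fade curve are all hypothetical.

```python
def remaining_useful_life(capacities, current_cycle, eol_fraction=0.8):
    """Cycles left until capacity first drops below eol_fraction * initial capacity.

    capacities[i] is the capacity measured at cycle i (hypothetical layout).
    Returns None when the series never crosses the threshold.
    """
    threshold = eol_fraction * capacities[0]  # assumed EOL criterion
    for cycle, capacity in enumerate(capacities):
        if capacity < threshold:
            # First cycle below the threshold is taken as the EOL cycle.
            return max(cycle - current_cycle, 0)
    return None


# Synthetic linear fade (assumption): 2000 mAh cell losing 7 mAh per cycle.
capacities = [2000 - 7 * k for k in range(200)]
rul = remaining_useful_life(capacities, current_cycle=30)
print(rul)
```

In practice the future part of the capacity curve is unknown, which is why the prediction methods discussed in this review are needed to extrapolate it.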
As mentioned in the previous paragraphs, the battery deterioration process is highly nonlinear, complex, and difficult to model. In recent times, numerous researchers have employed machine learning-based model prediction and optimization to address this issue. Generally, RUL prediction methods may be categorized as model-based methods, data-based approaches, and their hybrids. Model-based methods often consist of physical models, electrochemical models, etc. [27,28]. The electrochemical model employs intricate models to represent the chemical kinetics of the battery precisely. To achieve a high level of estimation precision, these approaches incur significant complexity and computational expense [14,29]. In addition, completing the parameterization phase of the electrochemical model often requires dismantling the battery, which greatly complicates the application procedure. A Li-ion battery is an extremely intricate electrochemical system. Model-based approaches are therefore generally complicated and difficult to apply to RUL prediction for Li-ion batteries, whereas data-based methods are well suited to the task, particularly when a significant amount of historical usage data is available [30]. Consequently, data-based forecasting systems have garnered considerable interest. Investigators have extensively used adaptive neuro-fuzzy inference systems (ANFIS) [31,32], regression trees (RTs) [33,34], artificial neural networks (ANN) [35,36], and response surface methodology (RSM) [37][38][39][40]. For optimization, several metaheuristics are used: genetic algorithms (GA) [41,42], ant colony optimization (ACO), particle swarm optimization (PSO) [43,44], simulated annealing, the bat algorithm, the spiral optimization algorithm, and artificial swarm optimization [45].
Besides conventional ML techniques, such as ANN, neuro-fuzzy, GEP, etc., there is another class of ML techniques known as ensemble ML (EML). To provide a single, best solution to a problem, the EML technique builds numerous instances of conventional ML methods and combines them. This can produce better prediction models than a single conventional method [46]. The main reasons to apply the EML approach include circumstances in which there are uncertainties in data representation, solution objectives, modeling methodologies, or the random initial seeds of a model [47,48].
These instances, or candidate models, are called base learners. As with standard ML methods, each base learner operates separately until the final results are combined into a single, reliable output [49,50]. Several EML techniques, such as XGBoost, random forest, and boosted and bagged regression trees, have been successfully used for RUL prediction [51].
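The idea of independent base learners whose outputs are combined can be sketched with bagging, one of the simplest ensemble schemes. The example below is an illustration under stated assumptions, not a method taken from the cited studies: each base learner is an ordinary least-squares line fitted to a bootstrap resample, the combination rule is a plain average, and the function names and the noise-free synthetic capacity data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0  # degenerate resample: flat line
    return my - b * mx, b


def bagged_predict(xs, ys, x_new, n_learners=25, seed=0):
    """Average the predictions of base learners trained on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(a + b * x_new)  # each learner predicts on its own
    return sum(predictions) / len(predictions)  # combine into one output


# Synthetic, noise-free capacity fade (assumption): 2000 mAh minus 4 mAh per cycle.
cycles = list(range(0, 100, 5))
capacity = [2000.0 - 4.0 * c for c in cycles]
estimate = bagged_predict(cycles, capacity, x_new=120)
```

With noise-free data every base learner recovers the same line; on noisy measurements the averaging step is what reduces the variance of the combined prediction.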
The most important factor in building good prediction models for RUL is the availability of robust datasets. Researchers have used several publicly available datasets to develop prediction models for the RUL of Li-ion batteries [52]. A battery dataset is available on NASA's website, "https://c3.nasa.gov/dashlink/resources/133/ (accessed on 24 September 2022)" [53]. Another publicly available dataset on battery life cycle analysis is available at the Center for Advanced Life Cycle Engineering website, "https://web.calce.umd.edu/batteries/data.htm (accessed on 24 September 2022)". Besides these sources, datasets on battery health, RUL, and chemistry are also available from Oxford's Battery Intelligence Lab, "https://howey.eng.ox.ac.uk/data-and-code/ (accessed on 24 September 2022)", and Toyota, in collaboration with MIT and Stanford, has made its data available at "https://data.matr.io/1/ (accessed on 24 September 2022)" [52,53].
Among artificial intelligence (AI)-based approaches, ML trains a machine to emulate human intuition in order to achieve previously inaccessible functionality and performance, and it fosters interaction between people and ML systems so that ML decisions are comprehensible to humans. When employed to predict the RUL of Li-ion batteries, ML offers both higher prognostic accuracy and high computational efficiency [54][55][56]. This study examines the functions, configuration, structure, precision, benefits, barriers, and drawbacks of intelligent algorithms in battery state estimation. The research focuses on the practicability of using data-based AI methodologies efficiently for RUL forecasting of Li-ion batteries.

Objectives of the Study
The precise prediction of RUL remains a challenge. Researchers working in this field have used several approaches, including conventional techniques such as ANN and ANFIS for prediction and RSM and PSO for optimization. In recent times, however, more modern approaches, such as Gaussian process regression, boosted regression trees, XGBoost, support vector regression, CatBoost, and AdaBoost, have been employed. Furthermore, methods for improving model training, such as Bayesian optimization, random and grid search, and unscented Kalman filters, are being employed. However, the review studies published in the last few years do not provide comprehensive information on these approaches. Thus, the present work is an endeavor to present the latest update on ML, hyperparameter optimization, and parametric optimization in the domain of remaining useful life modeling.
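For readers unfamiliar with the hyperparameter search strategies named above, the sketch below contrasts grid search with random search. It is only an illustration: `validation_error` is a hypothetical stand-in for the cross-validated error of an RUL model, and the two hyperparameters (a learning rate and a tree depth) and their ranges are assumptions chosen for the example.

```python
import itertools
import random


def validation_error(learning_rate, depth):
    """Hypothetical stand-in for a cross-validated RUL error (minimum at 0.1, 4)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 4) ** 2


def grid_search(rates, depths):
    """Evaluate every combination on a fixed grid and keep the best one."""
    return min(itertools.product(rates, depths),
               key=lambda params: validation_error(*params))


def random_search(n_trials, seed=0):
    """Evaluate randomly sampled combinations and keep the best one."""
    rng = random.Random(seed)
    trials = [(rng.uniform(0.01, 0.5), rng.randint(1, 8)) for _ in range(n_trials)]
    return min(trials, key=lambda params: validation_error(*params))


best_grid = grid_search([0.01, 0.05, 0.1, 0.3], [2, 4, 6])
best_random = random_search(50)
```

Grid search is exhaustive but grows quickly with the number of hyperparameters, whereas random search spends a fixed budget; Bayesian optimization goes further by using earlier evaluations to choose the next candidate.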

RUL Modeling with ML
The following modern ML techniques have been primarily used in recent times:

Gaussian Process Regression
Gaussian process regression (GPR) is a supervised ML method that can be used to solve probabilistic regression and classification problems [57][58][59]. Gaussian processes offer several benefits. The predictions interpolate the observations. Because the prediction is probabilistic (Gaussian), empirical confidence intervals can be computed and used to decide whether to refit the prediction adaptively in a particular region of interest [60][61][62]. Different kernels can be used, which makes GPR adaptable to different settings. Although standard kernels are provided, it is also possible to specify custom kernels. GPR does have some drawbacks. For example, it becomes less efficient when the dataset is very large. Additionally, GPR algorithms are not sparse, since they use all of the information available about the data sample for model prediction [63,64]. The typical schematic of GPR is depicted in Figure 2.
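The properties listed above (interpolation of the observations, a Gaussian predictive distribution, a user-chosen kernel, and use of all training points) can be seen in a minimal from-scratch sketch. It assumes a zero-mean prior, a squared-exponential (RBF) kernel, and a small fixed noise term; the function names and the toy data are hypothetical, and a library implementation would normally be preferred in practice.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)


def gp_predict(x_train, y_train, x_test, noise=1e-4, **kernel_args):
    """Posterior mean and standard deviation of a zero-mean GP."""
    # The full n x n kernel matrix is used: GPR is not sparse.
    K = rbf_kernel(x_train, x_train, **kernel_args) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, **kernel_args)
    K_ss = rbf_kernel(x_test, x_test, **kernel_args)
    L = np.linalg.cholesky(K)  # O(n^3) step that limits large datasets
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


# Toy data (assumption): five samples of a smooth curve.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.sin(x)
# One test point on a training input, one far away from all data.
mean, std = gp_predict(x, y, np.array([2.0, 10.0]), length_scale=1.0)
```

At the training input the predictive mean reproduces the observation with a very small standard deviation, while far from the data the standard deviation returns to the prior level. This growing uncertainty is the probabilistic output that the RUL studies below rely on.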
Liu and Chen [65] used GPR for the prediction of RUL. An indirect health indicator was combined with GPR to overcome the difficulty of measuring capacity directly. The model predicted the capacity, from which the RUL was obtained with respect to a given threshold. The proposed technique was verified on two separate life-cycle test datasets. The findings show that the suggested method can provide an accurate and dependable RUL prediction for Li-ion batteries. Pang et al. [66] also employed GPR, together with incremental capacity (IC) analysis, for the modeling of RUL for Li-ion batteries. To begin, the IC curve, which is more sensitive than the classic charge/discharge curve, was used to examine the degradation of the Li-ion battery's performance. It was concluded that the suggested technique offers several benefits, such as dependability, high precision, and a probabilistic output. In a similar study by Li et al. [67], GPR was employed to predict the state of health (SOH) as well as the RUL of Li-ion batteries. Using the feature variables, GPR was employed to model the battery's SOH estimate. Then, using the estimated SOH values and earlier outputs, an autoregressive long-term RUL estimation model was developed. Four battery datasets under varied cycle test settings were employed to demonstrate the prediction capabilities and usefulness of the two proposed models. Furthermore, the robustness of the proposed models was tested using four datasets with varying degrees of health. The experimental findings demonstrate that the suggested technique can accurately estimate the battery's state of health and remaining useful life.

XGBoost
Extreme Gradient Boosting (XGBoost) is a modern ML approach for feature selection and regression. It is a scalable machine-learning technique for tree boosting that has become a method of choice owing to its adaptability. The most important contribution that XGBoost makes to ML is the addition of a regularization term to the loss function [68,69]. This term penalizes the leaf predictions produced at each split in addition to the complexity of the resulting ensemble. In addition, XGBoost lets users reduce the likelihood of overfitting by adjusting a number of hyperparameters, including the complexity of individual trees, the learning rate, the complexity of the forest, the regularization terms, dropout, column subsampling, and so on. XGBoost also introduces new capabilities, such as handling missing data through default node directions, quickly enumerating candidate split thresholds during node splits, and interoperability with distributed computing frameworks [70,71]. The schematic of the XGBoost process is shown in Figure 3.
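The effect of the regularization term on tree construction can be illustrated with the standard second-order formulation of XGBoost, in which a leaf with summed gradients G and summed Hessians H receives the weight -G / (H + lambda), and a split is scored by the resulting reduction in the regularized objective minus a per-leaf penalty gamma. The sketch below evaluates these two expressions by hand; it does not call the XGBoost library, and the names `lam` and `gamma` as well as the four toy targets are assumptions made for the example.

```python
def leaf_weight(grads, hess, lam):
    """Optimal leaf value under L2 regularization: -G / (H + lambda)."""
    return -sum(grads) / (sum(hess) + lam)


def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Reduction in the regularized objective obtained by splitting a node."""
    def score(g, h):
        return sum(g) ** 2 / (sum(h) + lam)

    parent = score(g_left + g_right, h_left + h_right)
    children = score(g_left, h_left) + score(g_right, h_right)
    return 0.5 * (children - parent) - gamma  # gamma penalizes the extra leaf


# Squared-error loss with an initial prediction of 0 (assumption):
# gradient = prediction - target, Hessian = 1 for every sample.
targets = [1.0, 1.0, 3.0, 3.0]
grads = [0.0 - t for t in targets]
hess = [1.0] * len(targets)

gain_unregularized = split_gain(grads[:2], hess[:2], grads[2:], hess[2:], lam=0.0)
gain_regularized = split_gain(grads[:2], hess[:2], grads[2:], hess[2:], lam=1.0)
```

Increasing `lam` shrinks the leaf weights and lowers the gain of the same split, and a positive `gamma` rejects splits whose gain is too small, which is how the regularization term restrains tree complexity.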
Zhang et al. [72] used XGBoost to predict the battery's health parameters more thoroughly and accurately in order to identify the state of health of Li-ion batteries. The experimental results indicate that the coefficient of determination (R²) of the XGBoost models used in battery prediction research was greater than 0.97. At a signal-to-noise ratio of 10 dB, the XGBoost model has an absolute error (AE) of 0 and a Theil index coefficient (TIC) of less than 3%. In the same battery prediction experiment, the TIC of the proposed model was lower than 0.3%. Ma et al. [73] employed the XGBoost model to predict the RUL of Li-ion batteries. First, it was assumed that the fluctuations in the RUL series had inherent features. Next, key health indicators (HIs) were extracted from the voltage series, including changes comparable to those of the RUL series. The HIs were then selected using feature importance indicators. After that, the RUL predictions were obtained by training XGBoost on the selected HIs. Experiments conducted on the dataset supplied by the NASA Prognostics Center of Excellence indicate that the suggested XGBoost method has excellent prediction performance. Meng et al. [7] also used XGBoost for model prediction of RUL while comparing it with several ML techniques. A hybrid technique was proposed to predict the Li-ion battery's capacity precisely, taking capacity regeneration into account. To produce the final prediction results, the separate outputs of the ANFIS models were recombined. Applying the suggested approach to observed data from NASA lithium-ion batteries validated the method. The findings demonstrate that the suggested technique can achieve adequate prediction accuracy while mitigating the detrimental influence of capacity regeneration on forecast accuracy.

AdaBoost
AdaBoost (short for Adaptive Boosting) is a statistical classification meta-algorithm. To achieve improved overall performance, it may be used with a wide range of other learning methods. The final output of the boosted classifier is a weighted sum of the outputs of the other learning algorithms (also called "weak learners") [74][75][76][77]. Although AdaBoost is most often presented for binary classification, it can be generalized to multiple classes as well as to bounded intervals on the real line. AdaBoost is adaptive in the sense that successive weak learners are adjusted to prioritize examples that were misclassified by previous classifiers [78][79][80]. In certain circumstances, this approach is less prone to overfitting than other learning techniques. It can be shown that if each learner performs just marginally better than random guessing, the final model converges to a strong learner.
It is common practice to use AdaBoost to combine weak base learners, such as decision stumps. However, it has been shown that it can also effectively combine strong base learners, such as deep decision trees, which results in a model with improved accuracy. Compared with other ML algorithms, AdaBoost offers various benefits owing to its simplicity of use and the small number of parameters to tune. Furthermore, AdaBoost may be used in hybrid mode. AdaBoost implementations are not prone to overfitting, presumably because the parameters are not optimized jointly and the stage-wise estimation slows the learning process. AdaBoost uses a progressive, stage-wise method of training and boosting. As a consequence, AdaBoost requires high-quality data. Additionally, it is vulnerable to noise and outliers in the data, which should be removed before the data are used [79,81,82,83]. A schematic of the AdaBoost process is shown in Figure 4.
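The reweighting mechanism described above can be sketched with the classic binary AdaBoost procedure using one-dimensional decision stumps as weak learners. This is a toy illustration under stated assumptions: labels are +1 or -1, the data are a hypothetical six-point pattern that no single stump can classify, and the function names are invented for the example.

```python
import math


def stump_output(x, threshold, polarity):
    """Decision stump: predicts polarity at or above the threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_output(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, n_rounds):
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this weak learner
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples gain weight, correct ones lose weight.
        weights = [w * math.exp(-alpha * y * stump_output(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_output(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy pattern (assumption): positive, negative, positive along one axis.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, n_rounds=3)
predictions = [predict(model, x) for x in xs]
```

Each stump alone misclassifies at least two of the six points, yet the weighted vote of three stumps reproduces all labels, which illustrates how learners that are only slightly better than chance combine into a strong one. The same exponential reweighting is also why noisy or outlying samples can dominate later rounds.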
Zhu et al. [84] used a hybrid of AdaBoost and long short-term memory (LSTM) for RUL prognostics. In the process, the LSTM was first trained on the time-series correlations of the training data, and the trajectories of the test data were then extended. This was done to reduce the difference in trajectory length between the test and training datasets. After that, the extended time-series data were used with a modified AdaBoost regression approach to predict the RUL. The proposed technique was shown to be robust in comparison with modern methods, with its effectiveness demonstrated on two different degradation datasets. Li et al. [82] used AdaBoost to estimate the state of charge (SOC) of Li-ion batteries. The authors combined AdaBoost with a recurrent neural network. This strategy adapts to the spatio-temporal correlation of the sample data. According to the experimental and simulation results, the integrated approach can increase the precision of SOC prediction as well as the model's generalization performance. Liu et al. [85] conducted a comparative analysis of three ML methods, namely TotalBoost, LPBoost, and AdaBoost, to predict properties in the manufacturing of Li-ion battery electrodes. A quantitative study was conducted to determine the extent to which four important factors, comprising three slurry features and a coating process parameter, influence the porosity and mass loading of battery electrodes. According to the findings, the tree-based models can provide an effective quantitative analysis of the importance and correlation of the associated factors, in addition to giving satisfactory early predictions of battery electrode properties.

Boosted Regression Trees
The BRT is an ML approach that combines ensemble learning with decision trees [86]. To reduce the variation within a dataset, a decision tree breaks it up into more manageable subsets through a series of splits. Each split is made on the predictor variable that achieves the greatest reduction in the variance of the response variable [49]. In practice, a single complex decision tree can be trained to fit the training data very accurately, yet it generalizes poorly when making predictions on new data. To improve predictive accuracy, BRT and other ensemble learning techniques combine a large number of very small decision trees. These simple trees are often referred to as "weak learners" [87].
The BRT-based model differs from other ensemble-based techniques in that it builds trees sequentially, with each new tree fitted to the residuals of the earlier predictions. The BRT approach thus gradually shifts its emphasis to the observations that are the most difficult to predict [33]. The performance and complexity of a BRT model are both influenced by five hyperparameters: the number of trees, the interaction depth, the minimum number of observations per tree node, the learning rate, and the bag fraction [88,89]. A schematic of the BRT is shown in Figure 5.
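The residual-fitting loop described above can be sketched in a few lines. The example below is a simplified illustration rather than any reviewed author's implementation: it assumes squared-error loss and one-split trees (an interaction depth of 1), exposes only the number of trees and the learning rate, and omits the minimum node size and bag fraction. The capacity values are hypothetical toy data.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree that minimizes squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[1:]:        # skip the minimum so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def fit_brt(xs, ys, n_trees=50, learning_rate=0.1):
    """Build trees one at a time, each fitted to the residuals of the current model."""
    base = sum(ys) / len(ys)                     # initial prediction: the mean response
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        # Shrink each tree's contribution by the learning rate.
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, trees

def predict_brt(model, x):
    base, learning_rate, trees = model
    return base + learning_rate * sum(predict_stump(t, x) for t in trees)

# Hypothetical capacity-fade curve: normalized capacity versus cycle number.
cycles = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900]
capacity = [1.00, 0.98, 0.97, 0.95, 0.93, 0.90, 0.87, 0.84, 0.80, 0.75]
model = fit_brt(cycles, capacity, n_trees=50, learning_rate=0.1)
print(round(predict_brt(model, 450), 3))
```

A smaller learning rate generally calls for more trees; in this sketch the two are the only knobs, which makes that trade-off easy to explore.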
Several authors have employed BRT-based approaches to predict RUL and other aspects of battery behavior. Wang and Mamo [90] employed a tree-based ML approach to develop a model for RUL prediction. In a hybrid approach, BRT was combined with an ABC algorithm to investigate the capacity degradation of prismatic cells. The ABC method was used to find the optimal parameters of the gradient boosted regression (GBR) model. The mean absolute percentage errors (MAPE) of the proposed model on four unseen datasets were 0.46%, 0.70%, 0.87%, and 0.62%. These findings indicate that the proposed model can estimate capacity deterioration with high accuracy.
Eleftheroglou et al. [91] employed a BRT-based data-driven approach for the RUL prediction of Li-polymer batteries. In addition to the mean estimates, the uncertainty associated with the point predictions was assessed, and upper and lower confidence intervals were also explored. The RUL predictions for six different flights, all of which started with fully charged batteries, were presented and compared. The predictive algorithms were evaluated using several distinct metrics, and the proposed models were found to be highly precise. In another example of ML application in this domain, Chandran et al. [92] developed SOC prediction models for Li-ion battery systems. The study explored both boosted and bagged regression trees to assess the prognostic efficiency of this ML technique comprehensively. The mean squared error (MSE) was well within the 5% range for both approaches, establishing them as efficient forecasting approaches.

Support Vector Regression
The support vector machine (SVM) has been employed extensively as a classification approach in the last decade. However, it can also be used as a regression method while retaining all of the algorithm's important properties. When used for regression, the SVM is called support vector regression (SVR). SVR relies on the same principles as SVM [93,94]. It works backward from the given points to determine the shape of the curve. However, because it is a regression method, it does not use the curve as a decision boundary; instead, it uses the curve to find the best fit to the data points. The support vectors help locate the function that fits the data points most accurately [95,96]. A schematic of SVR is shown in Figure 6.
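To make the role of the support vectors concrete, the sketch below evaluates the standard linear epsilon-insensitive SVR objective and lists the points that lie on or outside the epsilon-tube. This textbook formulation is not spelled out in the review, so the function names, the values of `epsilon` and `C`, and the toy data are all assumptions for illustration only.

```python
def epsilon_insensitive_loss(residual, epsilon):
    """Errors smaller than epsilon cost nothing; larger errors cost linearly."""
    return max(0.0, abs(residual) - epsilon)

def svr_objective(w, b, xs, ys, epsilon=0.1, C=1.0):
    """Primal objective of linear SVR for f(x) = w * x + b."""
    regularizer = 0.5 * w * w                    # flatness term
    loss = sum(epsilon_insensitive_loss(y - (w * x + b), epsilon)
               for x, y in zip(xs, ys))
    return regularizer + C * loss

def support_vector_indices(w, b, xs, ys, epsilon=0.1):
    """Points on or outside the epsilon-tube are the ones that shape the fit."""
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if abs(y - (w * x + b)) >= epsilon]

# Toy data (hypothetical) scattered around the line y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.05, 4.7, 7.0, 9.3]
print(support_vector_indices(2.0, 1.0, xs, ys, epsilon=0.1))
print(svr_objective(2.0, 1.0, xs, ys, epsilon=0.1, C=1.0))
```

Points inside the tube contribute nothing to the loss, so moving them slightly would not change the fitted function; only the points returned by `support_vector_indices` constrain it.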
Dong et al. [97] used the SVR technique to predict the RUL of Li-ion batteries. The prediction was obtained with a hybrid of SVR and the particle filter method (SVR-PF). In addition, an RUL prediction model that supplies the RUL value while updating its probability distribution was provided for the end of the life cycle. The results showed that the proposed RUL prediction method works well and that SVR-PF outperforms the conventional particle filter in both prediction and monitoring. In a similar approach based on SVR-PF, Wei et al. [98] developed models for SOH as well as RUL to simulate the aging process of a battery. This approach used capacity as the state variable and features derived from a constant-current, constant-voltage protocol as input variables. Because of the relationship between capacity and the sum of the charge-transfer resistance and the electrolyte resistance, the estimated impedance parameters were used as the output. The results show that the proposed method was accurate and reliable, yielding precise RUL predictions. Patil et al. [99] used a novel multistage SVR-based ML approach to model and predict the RUL of Li-ion batteries. In this approach, a classification model provides a rough estimate of the RUL, and, if the battery is near the end of its life, the SVR model is used to predict the RUL accurately. Because the multistage approach is both accurate and fast to compute, a trained model may be used for onboard RUL estimation in EV battery packs.

CatBoost
Most ML approaches require numeric data. Therefore, before a model can be trained, categorical inputs must first be transformed into numeric data. CatBoost provides one such categorical encoding, converting categorical data to numerical data. Target encoding is a well-known approach for category encoding. It replaces each value of a categorical feature with the mean value of the target for that category in the training data, combined with the target probability over the whole dataset. However, because the target is used to predict the target, this causes target leakage [100,101]. Such models are overfitted and do not generalize well to unseen situations. CatBoost is a modern ML algorithm rooted in gradient-boosted decision trees (GBDT). CatBoost was proposed in 2017 by Yandex developers. Gradient boosting is a highly effective ML strategy for dealing with noisy data, heterogeneous features, and complicated relationships. CatBoost has been reported to outperform other contemporary GBDT-based ML algorithms, largely because it is adept at handling categorical features [102,103]. Other GBDT algorithms may replace categorical features with average label values, and the average label value is then used as the criterion for node splitting in a decision tree. This technique is known as greedy target-based statistics [104–106].
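The difference between greedy target statistics and a leakage-free alternative can be sketched as follows. In the greedy version each row's encoding includes its own label, whereas the ordered version (the idea behind CatBoost's ordered target statistics) encodes a row using only the rows that precede it. The smoothing form, the `prior` and `prior_weight` parameters, the use of the given row order in place of a random permutation, and the toy data are assumptions made for illustration; this is not the library's internal code.

```python
def greedy_target_stat(categories, targets, prior, prior_weight=1.0):
    """Encode each row with the smoothed mean target of its category over ALL rows.

    The row's own label is included, which is the source of target leakage."""
    sums, counts = {}, {}
    for c, t in zip(categories, targets):
        sums[c] = sums.get(c, 0.0) + t
        counts[c] = counts.get(c, 0) + 1
    return [(sums[c] + prior_weight * prior) / (counts[c] + prior_weight)
            for c in categories]

def ordered_target_stat(categories, targets, prior, prior_weight=1.0):
    """Encode each row using only the rows that come BEFORE it in the given order."""
    sums, counts = {}, {}
    encoded = []
    for c, t in zip(categories, targets):
        s = sums.get(c, 0.0)
        n = counts.get(c, 0)
        encoded.append((s + prior_weight * prior) / (n + prior_weight))
        # Only now does this row's label become visible to later rows.
        sums[c] = s + t
        counts[c] = n + 1
    return encoded

# Toy data (hypothetical): a categorical cell-chemistry feature and a binary label.
chemistry = ["NMC", "LFP", "NMC", "LFP", "NMC"]
label = [1, 0, 1, 0, 0]
print(greedy_target_stat(chemistry, label, prior=0.5))
print(ordered_target_stat(chemistry, label, prior=0.5))
```

In the ordered version the first occurrence of every category falls back to the prior, since no earlier labels are available for it.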
CatBoost also builds combinations of categorical features: it greedily combines the categorical features already used in the current tree with all categorical features in the dataset [107]. CatBoost is also capable of overcoming bias in the gradient approach. In GBDT, a weak learner is generated in every iteration, each learner is trained on the gradient of the preceding learners, and the sum of all learners' results produces the output [106]. This, however, yields biased pointwise gradient estimates, which cause the final learned model to overfit. CatBoost replaces the gradient estimation approach of the standard algorithm with ordered boosting. This method mitigates the prediction shift induced by gradient bias and improves the model's generalization capability [106,108]. A schematic flow chart of CatBoost is shown in Figure 7.
Zhang et al. [108] employed CatBoost for health monitoring-based prediction. The experimental findings demonstrated that the model could make predictions across various battery packs. The R² of the hybrid CatBoost prediction model was greater than 0.99, while the MSE was below one. The higher efficiency of the CatBoost strategy was confirmed by comparison with other state-of-the-art prediction models. In an advanced study on operating vehicles powered by Li-ion batteries, Gong et al. [109] developed a metamodel for RUL prediction. A hybrid of ML and the ampere-hour integral method was used for model development. Li et al. [110] proposed a novel approach using CatBoost for SOC prediction. One year of operating data from an EV was used, with each charging segment separated. Incremental capacity analysis was then employed to derive a general aging characteristic of the interval capacity. In addition, six ML methods were compared, and five main inputs (probe temperature, distance, electric current, and the start and stop states of charge) were identified based on the R-value. The findings demonstrate that the CatBoost-based prediction framework provides the greatest precision, with the RMSE and MAPE limited to 1.12% and 2.74%, respectively.
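The error metrics quoted throughout this section (MSE, RMSE, MAPE, and R²) follow their usual definitions, which can be written out as a short sketch. The numbers in the example are hypothetical and are not taken from any of the cited studies; MAPE as written here assumes that no true value is zero.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (true values must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical RUL values in cycles: true versus predicted.
rul_true = [100, 200, 300, 400]
rul_pred = [110, 190, 310, 390]
print(rmse(rul_true, rul_pred), mape(rul_true, rul_pred), r2(rul_true, rul_pred))
```

Because MAPE is relative, the same absolute error weighs more heavily near the end of life, where the true RUL is small.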

Traditional ML Methods
Among traditional ML methods, ANN, ANFIS, gene expression programming (GEP), and random forests (RFs) have been extensively employed by researchers in recent times. Several review studies that cover these methods exhaustively have already been published. Lv et al. [111] discuss artificial intelligence and ML applied to battery properties and design. Shal et al. [112] offered a comprehensive review of the estimation of SOC, RUL, and knee points using ML approaches such as neuro-fuzzy, ANN, and XGBoost. Jin et al. [113] published a concise review of RUL prediction employing ML techniques such as support vector methods, ANN, logistic regression, and GPR. Ng et al. [114] presented a review of the RUL of Li-ion batteries employing traditional ML techniques such as ANNs, RFs, regression, and the Kalman filter. Mao et al. [115] reviewed ML applications in battery health and state prediction, covering techniques such as different types of regression, decision trees, RFs, and RNN. The study by Rauf et al. [116] presented a comprehensive review of traditional as well as hybrid ML techniques. Overall, neural networks, neuro-fuzzy methods, support vector machines, and linear regression are the techniques most often employed by investigators for RUL forecasting of Li-ion batteries.

Discussion
Battery degradation modeling has become increasingly sophisticated as the dynamics and complexity of Li-ion battery storage systems have increased. The qualitative differences in the Li-ion battery deterioration process are known; however, identifying the quantitative variables associated with deterioration is difficult. Data-driven prognostic techniques tackle this problem by building the battery model from actual observations, without considering the physical mechanisms involved. Statistical ML techniques exploit the complex mathematical patterns linking inputs and outputs to correlate RUL with battery deterioration. Many ML approaches for modeling battery deterioration through RUL prognostics of Li-ion batteries have been proposed in the literature. Each of these ML methods for RUL prediction has its own set of applications and can achieve better outcomes in some circumstances. The newer ML techniques, however, are easy to implement, explainable, and able to overcome the barriers of missing, noisy, and outlier data.

Challenges and Future Scope
Several researchers are currently developing innovative battery materials and techniques. The research and manufacturing of safer and more energy-dense batteries is critical. In contrast, research in the related disciplines of status monitoring and health management is still in its infancy. The authors aim to sketch out a route for battery health management and to examine future concerns in order to spur further technological innovation and discovery.
Although ML-based techniques have made substantial contributions to trustworthy RUL prediction, they still face several challenges and have a long way to go before producing high-fidelity RUL estimates. Traditional ML approaches are essentially black boxes. Modern ML approaches are closer to gray boxes, but more explainable, ethical, and trustworthy AI and ML strategies that account for the physics of the problem and can handle massive data are still required.
Most RUL forecasting systems struggle to predict RUL accurately at an early stage. Developing an early prediction method is therefore crucial for avoiding battery failure and advancing battery technology. Several recommendations can be made to increase the potential and effectiveness of early RUL prediction. Sensitivity analysis can be used to identify the factors in the original data that are strongly associated with early capacity decline and to construct the corresponding health indicators. However, the use of such indirect health indicators under various working conditions must be validated in several ways. Another way to improve early prediction is to train the predictive algorithm offline using the available data and then retrain it with a limited quantity of live data. In addition, a classification model may be used to estimate the RUL roughly in the early stages, and a regression model can then be used to produce an accurate RUL prediction. Overall, additional research is needed on algorithms that provide early prediction with little data [117].
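The coarse-then-fine idea mentioned above can be sketched as a two-stage routine. Everything in this example is an assumption made for illustration: a fixed capacity threshold stands in for a trained classification model, an ordinary least-squares line fitted to the most recent cycles stands in for the regression model, and the end-of-life level of 80% of nominal capacity and the 85% switching threshold are hypothetical choices.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for 1-D data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def coarse_stage(capacity_fraction, near_eol_threshold=0.85):
    """Stage 1 (stand-in for a classifier): a rough health class."""
    return "near_eol" if capacity_fraction <= near_eol_threshold else "healthy"

def two_stage_rul(cycles, capacities, eol_capacity=0.80,
                  near_eol_threshold=0.85, window=5):
    """Return (stage, rul_in_cycles); rul is None when no fine estimate is made."""
    stage = coarse_stage(capacities[-1], near_eol_threshold)
    if stage == "healthy":
        return stage, None                   # the coarse class is all we report
    # Stage 2: regress capacity on cycle number over the most recent window
    # and extrapolate to the end-of-life capacity.
    slope, intercept = fit_line(cycles[-window:], capacities[-window:])
    if slope >= 0:                           # no measurable fade: cannot extrapolate
        return stage, None
    eol_cycle = (eol_capacity - intercept) / slope
    return stage, max(0.0, eol_cycle - cycles[-1])

# Hypothetical linear fade of 0.04 per 100 cycles.
cycles = [0, 100, 200, 300, 400]
capacities = [1.00, 0.96, 0.92, 0.88, 0.84]
print(two_stage_rul(cycles, capacities))
```

Restricting the regression to cells already flagged as near end of life keeps the more expensive estimate for the cases where a precise value matters most.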
Another major challenge is finding qualified people to perform data modeling with these ML methods. In this regard, the newer ML approaches are easier to deploy thanks to open-source libraries and online platforms such as Kaggle and Jupyter Notebook.

Conclusions
This review paper offers a ready-to-use perspective on cutting-edge ML approaches for battery deterioration modeling using RUL estimates. These modern ML techniques are newly developed, and their implementation in the domain of RUL prediction is still in its infancy. Hence, a primer with example studies was needed. Machine learning approaches, reinforced by data-sharing platforms and open-source tools, have the potential to transform battery health monitoring systems. More research into improving RUL estimation methodologies for Li-ion batteries will help achieve sustainability, especially in the EV industry. The growth of the EV market, which relies on Li-ion batteries, together with improved manufacturing and reprocessing techniques, will benefit the global environment by cutting GHG emissions. The authors hope that those interested in ML-based battery deterioration modeling will benefit from this study. Engineers can select appropriate ML algorithms for RUL prediction based on their specific requirements, and researchers can advance ideas for improving these approaches.