Comparative Analysis of Commonly Used Machine Learning Approaches for Li-Ion Battery Performance Prediction and Management in Electric Vehicles

Abstract: Li-ion batteries (LIBs) play a significant role in electric vehicles (EVs) owing to their high energy density, light weight, and environmental sustainability. Despite obstacles such as cost, safety concerns, and recycling challenges, LIBs are crucial to the growing popularity of EVs. The accurate prediction and management of LIB states in EVs are essential, and machine learning-based methods have been explored to estimate parameters such as the state of charge (SoC), the state of health (SoH), and the state of power (SoP). Various machine learning techniques, including support vector machines, decision trees, and deep learning, have been employed for predicting LIB states. This study proposes a methodology for comparative analysis, focusing on classical and deep learning approaches, and discusses enhancements to the LSTM (long short-term memory) and Bi-LSTM (bidirectional long short-term memory) methods. Evaluation metrics such as MSE, MAE, RMSE, and R-squared are applied to assess the performance of the proposed methods. The study aims to contribute to technological advancements in the electric vehicle industry by predicting the performance of LIBs. The rest of the study covers materials and methods, LIB data preparation, analysis, the proposed machine learning models, evaluations, and concluding remarks, with recommendations for future studies.


Introduction
Li-ion batteries (LIBs) play an important role as an energy source in electric vehicles (EVs). Their high energy density, light weight, low self-discharge rate, fast-charging capability, and low maintenance requirements have made LIBs a preferred energy source for EVs [1]. With electrification, LIBs have attracted attention as an environmentally friendly option. Vehicles powered by these batteries produce fewer greenhouse gas emissions than fossil fuel vehicles and thus have a smaller carbon footprint. In addition, the high energy storage capacity of LIBs gives EVs longer ranges. The main reasons why these batteries are increasingly preferred are their environmental sustainability, energy efficiency, low maintenance, and reliable performance. These advantages support the growing popularity of EVs and point to their more widespread use in the future [2].
LIBs are preferred in many application areas, including EVs. However, several obstacles have limited their rapid spread. One of the most important obstacles slowing the development of LIB technology is cost. The production of these batteries requires high-cost raw materials, and costs have remained high due to technical difficulties in the production process. Additionally, limitations in the energy storage capacity and performance of LIBs have constrained efforts to meet the requirements for higher energy density and longer-lasting batteries [3]. Safety concerns are another factor that has hindered the development of LIBs. Safety issues such as overheating, explosion, or fire risks show that battery technology needs further development. The recycling and waste management of these batteries are also important for reducing their environmental impact. Technological challenges and costs have made it difficult to manage and recycle waste batteries effectively [4]. These obstacles are seen as limiting factors for the wider acceptance of LIB technology. However, ongoing research and technological advances are enabling important steps toward overcoming them.
Appl. Sci. 2024, 14, 2306
To minimize the effects of the abovementioned limiting factors, the current status of LIBs used in EVs must be accurately predicted and managed. The accurate estimation of the basic battery parameters, such as the state of charge (SoC), the state of health (SoH), and the state of power (SoP), facilitates the safe and efficient use of these batteries [5,6]. However, the accuracy and reliability of these predictions still face challenges with existing technologies. In particular, the precise estimation of the SoC and SoH is necessary to fully understand and optimize the real-time state of LIBs. The complex structure of the battery, its sensitivity to external factors, and differences between cells make it difficult to predict these states accurately. In addition to the traditional methods, significant advances have been made in state prediction using model-based and data-driven techniques [7]. Nevertheless, more precise, reliable, and faster methods are needed to estimate battery behavior accurately under different operating conditions and over longer periods of time. Research in this area has focused on developing better algorithms, using more sensitive sensors, and building more comprehensive models that capture the complexity of the battery [8]. These developments constitute important steps toward the more reliable, efficient, and long-lasting use of LIBs.
Machine learning-based methods have been employed to determine the battery status. Machine learning refers to a set of algorithms that learn complex relationships from large datasets and make predictions based on what they have learned [9]. These methods analyze large amounts of data and identify trends using statistical models. The process is based on algorithms that recognize patterns in data and use these patterns to predict future events or outcomes. Machine learning methods are generally divided into two main categories: supervised and unsupervised learning. Supervised learning uses labeled data (data in which inputs and outputs are matched) during the training process. Tasks such as classification and regression fall into this category. Unsupervised learning, on the other hand, uses unlabeled datasets and is generally used to identify patterns or structures in the dataset. Clustering and dimensionality reduction are examples of unsupervised learning. These methods make it possible to learn the behavior of the battery from a large dataset, providing a highly scalable and flexible approach to determining the state of the battery [10].
Various machine learning techniques have been used in the literature to predict the states of LIBs. These techniques include support vector machines (SVMs) [11], decision trees (DTs) [12], random forests (RFs) [9], artificial neural networks (ANNs), k-means clustering [13], deep learning [14-16], genetic algorithms [14], gradient boosting machines (GBMs) [17], autoregression, and nonlinear regression models [11]. SVMs solve classification and regression problems by finding a boundary that separates the data. DTs and RFs identify patterns in datasets and make predictions, while ANNs model complex relationships using an architecture inspired by the neural networks of the human brain. K-means clustering groups data based on similar features, while deep learning uses multilayer artificial neural networks to learn complex structures and recognize complex patterns. Genetic algorithms are an optimization technique based on evolutionary principles, while GBMs make stronger predictions by combining weak learners. Autoregression and nonlinear regression models analyze the relationship between dependent and independent variables and make predictions. These machine learning techniques provide a wide range of tools for solving complex problems, such as predicting the state of health of LIBs.
In the present study, various statistical and machine learning approaches were employed to predict the states of LIBs in EVs. These approaches include linear regression (LR), DT, SVM, Gaussian process regression (GPR) [18], ensemble models, and neural networks. Basic statistical methods such as linear regression and stepwise linear regression are used to predict the key states of LIBs and analyze their performance, while tree-based approaches such as DT and ensemble models predict battery states and performance in more complex situations. GPR has been used to understand the dynamic structure of the battery and optimize its performance, while deep learning techniques such as neural networks and long short-term memory (LSTM) analyze large datasets to model complex relationships and optimize the use of LIBs in EVs. These approaches have been combined to improve the performance of LIBs in EVs, support their correct and efficient use, optimize their range in line with predictions, extend battery life, and ensure their safety.
Unlike the methods mentioned above from the literature, the methodology proposed in this study offers researchers a comparative analysis by outlining the feature utilization procedure in classical machine learning, regression, tree, ensemble, and deep learning approaches. It therefore helps determine which method yields more accurate and faster results. The improvements made to the model parameters and architectural structure are discussed in detail. Comparative results of the LSTM-based hybrid bidirectional long short-term memory (Bi-LSTM) methods are included to demonstrate the superiority of the proposed methods. To compare the observed and predicted discharge capacities, evaluation metrics such as mean squared error (MSE), mean absolute error (MAE), root mean square error (RMSE), and R-squared have been applied to the proposed methods [19]. Neural network fitting methods are applied, and their effects on model success are explained in detail.
This study compares traditional machine learning methods and deep learning techniques with classical models and discusses in detail the procedure of utilizing features to determine the state of health of LIBs and evaluate the remaining useful life (RUL). In this context, it offers a comparative analysis of the performance of various methods, including regression, decision trees, ensemble models, and deep learning algorithms. Additionally, the improvements made to the LSTM and Bi-LSTM methods for estimating the health status of LIBs and assessing the remaining useful life are explained. To evaluate the performance of the proposed methods, the differences between observed and predicted discharge capacities are examined using MSE, MAE, RMSE, and R-squared. These investigations provide a valuable knowledge base for developing new model-based technologies for EV management interfaces and systems, help predict the future performance of LIBs used in EVs, and contribute to their more efficient use. The study thus has the potential to contribute to technological developments in the electric vehicle industry by making predictions that help increase the performance of LIBs.
The rest of the study is organized as follows. The second section presents the materials and methods used. In the third section, the LIB data are prepared, presented, and analyzed from different perspectives. In the fourth section, based on the data analysis, various models for a LIB's SoH are proposed using machine learning and ANN techniques. In the following sections, the presented ML and ANN models are evaluated and compared in detail. Finally, the study concludes by relating the results to the literature and providing recommendations. The boundaries and trends of this study are also discussed, giving researchers pointers for future studies.

Materials and Methods
This research aims to examine the performances of different prediction methods to improve the prediction of LIB discharge capacities in EVs. This study includes various approaches, such as machine learning and deep learning, and attempts to shed light on the future development of LIB technology by discussing the advantages, disadvantages, and application areas of each method in detail.

System Description
Battery management systems (BMSs) in EVs aim to optimize the driving experience by performing a series of complex functions that improve the safety, durability, efficiency, and overall performance of the battery. When a load request comes from any unit of the electric vehicle, the current and the corresponding voltage are measured at the cell. The SoC is calculated with the appropriate functions built into the BMS. The SoH prediction algorithm is executed periodically, regardless of the data log of previous predictions. Data obtained through BMSs are given as input to many prediction models, from classical machine learning approaches to complex deep learning approaches. The system-level application architecture for SoH prediction is given in Figure 1.
As shown in Figure 1, the powertrain system in the EV controls the complex mechanism of the vehicle. This system includes a number of basic components such as the motor, mechanical drives, Li-ion battery packs, and charging unit. The motor enables the movement of the vehicle by converting electrical energy into mechanical energy. Mechanical drives transmit this energy to the wheels, allowing the vehicle to move forward. LIB packs are the vehicle's electrical power source and are usually located in a special battery compartment under the vehicle. The charging unit transfers electricity from external power sources to the battery. The BMS is a critical component that monitors and optimizes the performance of LIB packs. The BMS generally monitors data such as the cycle index, temperature, current, and voltage received from batteries. These data provide valuable information about the health and performance of LIBs.

Li-Ion Battery Behavior in Electric Vehicles
Based on data from the BMS, a systematic process should be used to estimate the discharge capacity. Discharge capacity is a measure of the amount of electricity a LIB stores and is an important metric of the LIB's performance. By analyzing past usage cycles with the data received from BMSs, we attempted to model the LIB's past performance and responses. Analyses of LIB charge/discharge behavior and temperature changes have been combined with mathematical and statistical models to predict the discharge capacity when assessing the current state of a LIB. This process helps predict future changes in LIB performance and indicates when the LIB may need maintenance or replacement. In this way, the BMS helps extend the LIB's life, ensure the overall safety of the EV, and optimize its overall performance.

Battery Tests
All the test results were taken from publicly accessible materials provided by Michael Pecht at the Centre for Advanced Life Cycle Engineering (CALCE), University of Maryland. The experiments used LiCoO2-graphite cells [21].
The charge/discharge tests were specified to occur within the operational range of 3.0-4.4 V, encompassing various C-rates and temperatures. The quantity of cells and test parameters are outlined in Table 1. Four distinct temperatures, three discharge C-rates, and two charge cut-off C-rates (C/5 and C/40) were applied under the constant current, constant voltage (CCCV) protocol, giving 4 × 3 × 2 = 24 test conditions (Test No. 1, No. 2, etc.). Each test condition comprised 8 cells, totaling 192 cells. The SoC varies from 20% to 100%. All the acquired charge and discharge datasets were used to train the machine learning models.
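The size of the test matrix follows directly from the factor counts quoted above. The short sketch below enumerates it; the level labels (T1, D1, and so on) are placeholders, since the actual temperatures and discharge C-rates are listed in Table 1 rather than in the text.

```python
from itertools import product

# Placeholder level labels: the real values are given in Table 1.
temperatures = ["T1", "T2", "T3", "T4"]   # four temperatures
discharge_rates = ["D1", "D2", "D3"]      # three discharge C-rates
charge_cutoffs = ["C/5", "C/40"]          # two charge cut-off C-rates
CELLS_PER_CONDITION = 8

# Every combination of the three factors is one test condition.
conditions = list(product(temperatures, discharge_rates, charge_cutoffs))
total_cells = len(conditions) * CELLS_PER_CONDITION

print(len(conditions), total_cells)  # 24 192
```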

SoH Prediction in Electric Vehicles
SoH is defined based on the LIB's nominal capacity, internal resistance, energy storage capacity, and overall performance. In EVs, LIBs are one of the most important energy sources, and the health of these batteries has a direct impact on the vehicle's performance, range, and safety. Therefore, it is important to estimate the SoH correctly. SoH prediction describes how the performance of a battery changes over its lifetime. Formulations and calculation methods are often based on indicators such as the ratio of the measured capacity of the battery to its nominal capacity or the increase in the internal resistance of the battery. Mathematically, the SoH has been modeled as given in Equation (1).
$$\mathrm{SoH} = \frac{Q_{\max}}{Q_{\mathrm{Rated}}} \times 100\% \qquad (1)$$
In Equation (1), $Q_{\max}$ represents the measured capacity, and $Q_{\mathrm{Rated}}$ represents the nominal capacity, that is, the designed or specified maximum capacity of the battery. The nominal capacity specified by the manufacturer is the capacity that the battery should deliver under ideal conditions. Equation (1) expresses the ratio of the measured capacity of the battery to the nominal capacity. This ratio is usually expressed as a percentage and shows the current capacity of the battery relative to the specified maximum capacity. It is an important metric of how much performance the battery loses or maintains over its life cycle. Therefore, the accurate monitoring and prediction of battery health in EVs are critical for the range, performance, and safety of vehicles.
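As a small worked example of the ratio described above (measured capacity divided by nominal capacity, expressed as a percentage), the sketch below computes the SoH for one cell. The function name and the capacity values are hypothetical and are not taken from the dataset.

```python
def state_of_health(q_max_ah: float, q_rated_ah: float) -> float:
    """Return the SoH in percent: measured capacity over nominal capacity."""
    if q_rated_ah <= 0:
        raise ValueError("nominal capacity must be positive")
    return 100.0 * q_max_ah / q_rated_ah

# Hypothetical cell: 1.35 Ah nominal capacity, 1.08 Ah measured after cycling.
soh = state_of_health(1.08, 1.35)
print(f"{soh:.1f}%")  # 80.0%
```

A cell that still delivers its full nominal capacity has an SoH of 100%, and the value falls as the measured capacity fades.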

Machine Learning Models Used for SoH Prediction
Various machine learning (ML) methods have been used for SoH prediction, and LR is one of them. Using LIB data, LR creates a linear model that predicts the SoH of the battery. For example, if there is a linear relationship between the SoH of the battery and the operating temperature, the model attempts to find this relationship and uses it to predict the SoH. During the training process, the model fits a linear equation to the measurements in the dataset. The learned model represents the linear relationship between data points and is then used to predict new data points. SVM, which is frequently used in SoH estimation, separates data points into groups by constructing a separating boundary. DT and RF models estimate the SoH of the battery by processing complex data structures; they partition the data in a tree structure and make predictions. Clustering algorithms divide the data into groups and identify batteries with similar features. They use different metrics to measure similarity in the dataset. For example, the Euclidean distance or the Manhattan distance can be used to calculate the distance between data points. Data points are divided into groups based on these metrics, and each group contains data points that are close to each other. Clustering algorithms can be a useful tool for understanding the behavior of batteries and identifying batteries with similar characteristics. However, when used alone, clustering does not directly predict battery health; rather, it reveals the relationship between batteries with similar characteristics and provides a basis for further analysis.
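To make the linear regression step concrete, the following is a minimal sketch of an ordinary least squares fit with a single feature. The temperature and SoH values are made up for illustration and do not come from the CALCE data; the function name `fit_line` is likewise hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data for illustration only: temperature (deg C) vs. SoH (%).
temps = [10.0, 25.0, 40.0, 60.0]
soh_values = [98.0, 95.0, 92.0, 88.0]

slope, intercept = fit_line(temps, soh_values)
predicted = slope * 50.0 + intercept  # SoH estimate at 50 deg C
print(round(slope, 3), round(intercept, 2), round(predicted, 2))
```

With several input features the same idea generalizes to a multivariate least squares problem, which is usually solved with a linear algebra library rather than by hand.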
Advanced ANNs predict the SoH of the battery by analyzing complex relationships in the data. This method is loosely inspired by the way the human brain works. The ANN identifies and learns complex relationships by analyzing large amounts of data. When battery data are given as input to the ANN, neurons within the network perform mathematical operations on these data. The ANN uses weights and activation functions to identify and learn patterns in the dataset. In this way, it can capture complex relationships in battery data and use them to predict important quantities such as the SoH.
One of the most frequently used methods in ML-based SoH prediction is the regression model. Regression models the relationship between one or more independent variables and a dependent variable. The main purpose of this modeling is to predict the dependent variable from the independent variables. In this study, models such as LR and robust regression were implemented. DT is an algorithm with a tree-like structure that can be used in regression problems. The uppermost part consists of the root and branches, and there are leaves at the ends of the branches. It is generally preferred for complex datasets and works well with datasets that do not require preprocessing. Different decision tree models, such as the fine tree, medium tree, and coarse tree, have been used to compare the prediction performance. In addition, ensemble decision trees, such as the ensemble boosted tree and ensemble bagged tree, have been used to reduce the variance of decision trees and strengthen their predictions [22]. SVM, on the other hand, can work with a large number of independent variables and is effective in modeling complex boundaries [23]. In this study, SVM models such as fine Gaussian, medium Gaussian, and coarse Gaussian were employed. The efficient linear model is a computationally efficient linear regression model [24]. Linear least squares (LLS) refers to the approximation of linear functions by the least squares method, while GPR is probability-based. The GPR models used in this study include rational quadratic GPR, Matern 5/2 GPR, and exponential GPR. A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution. Kernel SVM and kernel least squares regression were also used [25]. These models can handle nonlinear structures more effectively.
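The three GPR variants named above differ in their covariance (kernel) function. The sketch below writes out the standard textbook forms of these kernels as functions of the distance r between two inputs. The parameter names (`length`, `alpha`, `sigma`) and their default values are illustrative assumptions, not the hyperparameters used in the study.

```python
import math

def rational_quadratic(r, length=1.0, alpha=1.0, sigma=1.0):
    """Rational quadratic kernel: sigma^2 * (1 + r^2 / (2*alpha*l^2))^(-alpha)."""
    return sigma**2 * (1.0 + r**2 / (2.0 * alpha * length**2)) ** (-alpha)

def matern52(r, length=1.0, sigma=1.0):
    """Matern 5/2 kernel: sigma^2 * (1 + s + s^2/3) * exp(-s), s = sqrt(5)*r/l."""
    s = math.sqrt(5.0) * r / length
    return sigma**2 * (1.0 + s + s**2 / 3.0) * math.exp(-s)

def exponential(r, length=1.0, sigma=1.0):
    """Exponential kernel: sigma^2 * exp(-r / l)."""
    return sigma**2 * math.exp(-r / length)

# Covariance between two points decays as their distance r grows.
for r in (0.0, 0.5, 2.0):
    print(r, rational_quadratic(r), matern52(r), exponential(r))
```

All three kernels equal sigma squared at r = 0 and decrease with distance; they differ in how smooth the resulting regression function is assumed to be.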
The ANN is a significant model that has been used to analyze complex relationships in battery data. In this study, different ANN models, such as narrow, medium, wide, double-layer, and three-layer neural networks, were used. LSTM is a method that is used in areas such as sentiment analysis, natural language processing, and time series analysis [26]. Using the abovementioned ML and artificial intelligence techniques, different models were developed in this study, and the existing data were analyzed in detail. All models were tested on the dataset, and the results are presented in detail.
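LSTM suits time series because it carries a cell state from one time step to the next. The sketch below implements one step of the standard LSTM cell equations for a single unit with scalar input, which keeps the arithmetic readable. The weights and the short capacity sequence are arbitrary illustrative values, not trained parameters or data from the study, and the study's own architecture may differ.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM cell with scalar input and state."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # new cell state: keep part of the old, add new
    h = o * math.tanh(c)     # new hidden state
    return h, c

# Arbitrary illustrative weights, not trained values.
params = {k: 0.5 for k in ("wf", "uf", "wi", "ui", "wo", "uo", "wg", "ug")}
params.update({"bf": 1.0, "bi": 0.0, "bo": 0.0, "bg": 0.0})

h, c = 0.0, 0.0
for capacity in [1.10, 1.09, 1.08, 1.06]:  # made-up discharge capacities (Ah)
    h, c = lstm_step(capacity, h, c, params)
print(h, c)
```

A bidirectional LSTM runs one such recurrence forward in time and a second one backward, then combines the two hidden states at each step.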

Advantages and Disadvantages of Different ML Models
The advantages of linear regression include simplicity, interpretability, and fast training. However, this method may yield misleading results if the real relationship in the dataset is not linear, and it is limited in expressing very complex or nonlinear relationships. The advantages of clustering algorithms include the ability to discover structures in the dataset, better understand the data, identify batteries with similar properties, and reveal similarities between different batteries. However, these algorithms are sensitive to the data and metrics used. Additionally, in some cases it is difficult to make a clear distinction when separating data points into different groups.
The advantages of artificial neural networks include the ability to identify and learn complex relationships, flexibility, good performance on large datasets, and scalability. Additionally, owing to their learning ability, ANNs may reveal hidden patterns within data and analyze complex structures. However, artificial neural networks may encounter overfitting, in which the network fits the training dataset too closely and loses generalizability. Additionally, training and configuring ANNs may require considerable time and computational power. The advantages and disadvantages of the machine learning models used in this study are listed in Table 2.
Disadvantages: Interpretation and explanation of the model are difficult. As the feature size increases, extension becomes difficult and the model overfits new data.

Ensemble
Advantages: Good for complex and noisy problems in decision trees; adjusts variance.
Disadvantages: For multiple models, computational costs are high, and interpretation and explanation are difficult.

Neural Network
Advantages: It provides effective visual capability, can process data even if it is not edited, is adaptable, and offers a user interface.
Disadvantages: It has high hardware requirements, data-based operation, poor control capability, and may produce incomplete results.

Long Short-Term Memory
Advantages: It can capture long dependencies and handles sequential data quite well.
Disadvantages: Hardware costs are high; the dataset must be large.

Experimental Studies
In the present study, different models were developed to estimate the discharge capacities of LIBs. The first model was the LR model, based on machine learning. The other models were the following: DT, SVM, GPR, kernel SVM, kernel LSR, boosted tree, bagged tree, neural network, and hybrid structures of these models based on LSTM. The development of these models should be carried out in a certain order.
Data collection and preprocessing are the first and most important steps in model development. Historical data on the problem to be modeled (discharge capacity estimation) must be collected, preprocessed, and checked by real users for consistency. In this study, the first 300 cycles in the dataset were taken. This selection was checked by real users, and outliers and missing data were checked. In the second step, important features were selected from the existing data and analyzed. The selected features help determine the value of the target variable (the variable to be predicted in a regression problem). In the third step, the dataset is divided into training and test datasets. The training set was used to train the model, and the test set was used to evaluate its performance. The model development and training phase varies depending on the basic features of the ML or ANN algorithms used. Hence, it is described in detail under different headings in the Materials and Methods section. The performance of the developed model has been evaluated using the test dataset. Performance evaluation determines how accurately the model predicts the target variable. RMSE is a commonly used performance measure for regression problems. Finally, the developed model should be integrated into real-life applications. Once trained and validated, the model may be used to predict the value of the target variable on new data. With this method, predictions on real-world data can be carried out.
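The evaluation step can be illustrated with the four metrics named in the Introduction (MSE, MAE, RMSE, and R-squared). The sketch below uses their standard definitions; the function name and the observed and predicted capacity values are hypothetical.

```python
import math

def regression_metrics(observed, predicted):
    """Return MSE, MAE, RMSE, and R-squared for paired observations."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for constant observations")
    mse = ss_res / n
    return {
        "MSE": mse,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical observed vs. predicted discharge capacities (Ah).
observed = [1.30, 1.28, 1.25, 1.20]
predicted = [1.31, 1.27, 1.25, 1.22]
print(regression_metrics(observed, predicted))
```

Lower MSE, MAE, and RMSE indicate smaller errors, while an R-squared value close to 1 indicates that the model explains most of the variance in the observed capacities.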

Data Collection and Preprocessing
Within the scope of this study, the LIB samples had been tested for 750 cycles. In total, there were 192 samples obtained from 21 different LIBs in the dataset. In the preprocessing steps, the first 300 cycles in the dataset were retained, and cycles 301 to 750 were omitted. The dataset consisted of 5 features and 45,699 data rows, each including the cycle index, temperature, voltage (V), current (A), and discharge capacity (Ah). These data were divided into 80% training and 20% testing with the cross-validation method. Accordingly, 38,845 rows were employed for training, while 6854 rows were used for testing. During the training of the network, 15% was reserved for validation purposes. Details on the LIB data used in the study are presented in Table 3.
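The cycle-selection step described above can be sketched as a simple filter. The rows below are hypothetical and only mirror the five features named in the text; the field names and the missing-value check are assumptions for illustration.

```python
MAX_CYCLE = 300  # keep only the first 300 cycles, as described in the text

# Hypothetical rows with the five features named above.
rows = [
    {"cycle": 1, "temp_c": 25.0, "voltage": 4.20, "current": -1.0, "capacity_ah": 1.35},
    {"cycle": 120, "temp_c": None, "voltage": 4.19, "current": -1.0, "capacity_ah": 1.30},
    {"cycle": 300, "temp_c": 25.0, "voltage": 4.18, "current": -1.0, "capacity_ah": 1.21},
    {"cycle": 301, "temp_c": 25.0, "voltage": 4.18, "current": -1.0, "capacity_ah": 1.21},
    {"cycle": 750, "temp_c": 25.0, "voltage": 4.10, "current": -1.0, "capacity_ah": 0.95},
]

def is_complete(row):
    """True if no field of the row is missing."""
    return all(value is not None for value in row.values())

# Drop cycles beyond the cut-off and rows with missing measurements.
kept = [r for r in rows if r["cycle"] <= MAX_CYCLE and is_complete(r)]
print([r["cycle"] for r in kept])  # [1, 300]
```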

Data Partition
How the data prepared for discharge capacity estimation are divided depends on the type of data. Because time-dependent data on the discharge capacity have a sequential structure, this structure must be preserved during the training and testing of the model. The problem addressed in this study was to predict data while considering the previous data. If the order of the data were ignored during partitioning, the resulting model could predict past values using future information. The data obtained within the scope of the study were therefore divided on a time basis. With this partitioning, 80% of the dataset (38,845 rows) was reserved for training the model, while the remaining 20% (6854 rows) was reserved for testing it.
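A time-based partition of this kind can be sketched as a split without shuffling, so that every training sample precedes every test sample. The function name and the toy sequence are illustrative assumptions; the sketch shows the principle and is not the authors' exact procedure.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows without shuffling: the earliest part trains."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Toy sequence standing in for time-ordered measurements.
samples = list(range(1, 11))
train, test = chronological_split(samples)
print(train, test)  # [1, 2, 3, 4, 5, 6, 7, 8] [9, 10]
```

Because the test rows all come after the training rows, the model is never evaluated on data that precede what it was trained on, which avoids leaking future information into the past.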

Model Development
Many models, from classical ML methods to complex ANN algorithms, were tested within the scope of the study. In the ML approaches, certain parameters used in the model development phase determine the performance and flexibility of the algorithms. These parameters are often called "hyperparameters" and control aspects of the model such as how it learns, generalizes, and manages complexity. The regression-based models and the hyperparameters used for each of them are listed in detail in Table S1. Immediately after the regression models, DT-based ML models were developed. The hyperparameters and detailed information about these models are presented in Table S2. The other type of ML model analyzed in this study was the SVM-based approach. The SVM types and the hyperparameters used for them are given in Table S3.
Following the classical ML models, ANN-based models were included in the study. The hyperparameters of the ANN models analyzed within the scope of the study are given in Table S4.
While many models were used, a new model was created using curve fitting based on an ANN. It was designed to estimate the discharge capacity from 4 input values in the input layer. The inputs X1, X2, X3, and X4 feed a hidden layer, which is followed by the output layer. This neural network is shown in Figure 3. As seen in Figure 3, the discharge capacity in the output layer is estimated from the 4 inputs in the input layer. The Levenberg-Marquardt method was employed for the optimization of this neural network. This optimization method helps the network train faster, and the results obtained were found to be promising. Using cross-validation with 20% of the dataset held out in each round, the data were randomly selected, and the training and testing processes were carried out in 5 separate folds. This is an important factor that increases the reliability of training and testing. The training was carried out over 1000 epochs, and the results are shown in Figure 4.
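The random five-way partitioning described above can be illustrated with a small index generator. This is a sketch under assumptions: the function name `k_fold_indices`, the seed, and the sample count are hypothetical, and the study's own tooling may partition the data differently.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k randomly assigned folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(20, k=5))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

With 5 folds, each round holds out 20% of the samples for testing, and every sample is used for testing exactly once.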
The performance of the model during training is shown in Figure 5. The error of the model decreased gradually; Figure 5 plots the mean square error against the number of iterations. Table 4 provides information about the operations performed for the model: the dataset, the ratio allocated for training and testing, the number of iterations, the optimization method that was used, and the information required for the input and output. The hyperparameters given in Table 4 were chosen to give the best results in the developed neural network. Although different hidden-layer sizes were tried, the best result was obtained with 33: the network consisted of 4 input units, 33 hidden units, and 1 output unit, with the output layer serving as the prediction layer. The obtained values, such as MSE = 0.0281 and Result = 0.9147, are explained in detail in Section 4.
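To make the optimization step concrete, the following sketch fits a small network with a hand-written Levenberg-Marquardt loop. It is not the implementation used in the study. It assumes that the 33 refers to the neurons of a single hidden layer, uses a tanh activation, and trains on synthetic data. The damping schedule and all function names are illustrative choices.

```python
import numpy as np

N_IN, N_HIDDEN = 4, 33  # sizes from the text; a single hidden layer is assumed
N_PARAMS = N_IN * N_HIDDEN + N_HIDDEN + N_HIDDEN + 1


def unpack(theta):
    """Split the flat parameter vector into weights and biases."""
    i = N_IN * N_HIDDEN
    W1 = theta[:i].reshape(N_IN, N_HIDDEN)
    b1 = theta[i:i + N_HIDDEN]
    W2 = theta[i + N_HIDDEN:i + 2 * N_HIDDEN]
    return W1, b1, W2, theta[-1]


def predict(theta, X):
    W1, b1, W2, b2 = unpack(theta)
    return np.tanh(X @ W1 + b1) @ W2 + b2


def jacobian(theta, X):
    """Analytic Jacobian of the predictions with respect to theta."""
    W1, b1, W2, _ = unpack(theta)
    H = np.tanh(X @ W1 + b1)
    dH = (1.0 - H**2) * W2  # derivative w.r.t. hidden pre-activations
    J_W1 = (X[:, :, None] * dH[:, None, :]).reshape(len(X), -1)
    return np.hstack([J_W1, dH, H, np.ones((len(X), 1))])


def fit_lm(X, y, n_iter=50, mu=1e-2, seed=0):
    """Levenberg-Marquardt: damped Gauss-Newton steps with adaptive mu."""
    theta = np.random.default_rng(seed).normal(scale=0.5, size=N_PARAMS)
    r = predict(theta, X) - y
    history = [float(r @ r) / len(y)]
    for _ in range(n_iter):
        J = jacobian(theta, X)
        step = np.linalg.solve(J.T @ J + mu * np.eye(N_PARAMS), -J.T @ r)
        new_r = predict(theta + step, X) - y
        if new_r @ new_r < r @ r:      # accept the step, trust the model more
            theta, r = theta + step, new_r
            mu = max(mu * 0.1, 1e-8)
        else:                          # reject the step, increase damping
            mu *= 10.0
        history.append(float(r @ r) / len(y))
    return theta, history


# Synthetic regression data for illustration only (not battery measurements).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(300, N_IN))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] * X[:, 2] - 0.3 * X[:, 3]
theta, mse_history = fit_lm(X, y)
print(mse_history[0], mse_history[-1])
```

A step is accepted only if it lowers the squared error, so the recorded MSE never increases from one iteration to the next, which matches the gradually decreasing curve described for the training performance.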

Experimental Results
Using machine learning in a BMS may significantly improve the efficiency, performance, and lifespan of batteries. A battery management system is a crucial component in electric vehicles, renewable energy systems, and portable electronic devices. Its primary function is to monitor, control, and optimize the various parameters of a battery pack. Integrating machine learning methods may enhance the capabilities of a BMS in terms of the SoH. Predicting the state of health of a battery involves estimating its remaining capacity and overall condition. Machine learning models may analyze historical data, usage patterns, and environmental factors to make more accurate predictions about a battery's health. In this study, we attempted to predict the discharge capacities of Li-ion batteries by developing various models. The results obtained from tests conducted at four different temperatures, two different discharge capacities, and two different charge cut-off current values might be employed to train machine learning models to recognize data attributes, such as the capacity and voltage, for alternative charge/discharge currents and temperature values. In this way, the study aims to reduce the number of required measurements.

Evaluation Metrics
More than one metric was used to test the models prepared throughout the study. RMSE, a metric used to evaluate the accuracy of predictions, was applied in the initial testing processes [27]. RMSE is calculated as the square root of the mean of the squared differences between the predicted values and the actual values (Equation (2)).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \quad (2)$$

When RMSE is used to test a developed model, it is expected to be low. A high RMSE means that the predictions are very different from the actual values, which indicates poor model performance. A lower RMSE indicates that the predictions are closer to the actual values and, thus, that the model performs better. Additionally, the MSE results are provided; because the errors are squared, MSE penalizes larger errors more heavily.
The other metric used for model testing in the experimental studies is MAE [28]. Often used to measure the performance of regression models, MAE represents the mean absolute difference between the predicted values and the actual values. In other words, it measures how "wrong" each prediction is on average. MAE is calculated as given in Equation (3).

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \quad (3)$$

For each prediction, the difference between the predicted value and the actual value is calculated, and the absolute value of this difference is taken. Here, $n$ represents the number of data points, $y_i$ refers to the true value (observation), and $\hat{y}_i$ refers to the value predicted by the model. Essentially, MAE measures the magnitude of the forecast errors and gives each error the same weight. Additionally, R-squared was used as an evaluation metric in this study. R-squared measures the explanatory power of the model on the data and expresses how much of the variance of the dependent variable it explains. It was employed throughout the study to measure the effect of the independent variables on the dependent variable and to evaluate the explanatory ability of the model.
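The metrics described above can be computed with a few lines of standard-library Python. This is a minimal sketch; the function names and the toy values are illustrative and do not come from the study.

```python
import math
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Square root of the MSE, in the same unit as the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of the absolute prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Share of the target variance explained by the predictions."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred),
      mae(y_true, y_pred), r_squared(y_true, y_pred))
```

In this toy case, a single error of 2 gives an MSE of 1.0 but an MAE of only 0.5, which illustrates how the squared metrics weight large errors more heavily.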

Estimation of SoH with Regression-Based Approaches
The regression results, obtained with the models described in detail in the Model Development section, are presented in Table 5. The results obtained using decision trees are presented in Table 4. Decision trees appear to be more successful than the regression models. Among these models, the ensemble bagged tree yielded the best results, and the other decision tree models produced similar results. The results obtained from the SVM-based models, developed with the parameters explained in detail in the Model Development section, are presented in Table 6. The results indicated that the performance of the support vector machines was at an acceptable level: although better than that of the regression-based models, it was lower than the performance obtained from decision trees. The fine Gaussian SVM model performed better than the other SVM models.

Estimation of SoH with Neural Network-Based Approaches
The results obtained with the neural network models, together with the parameters of those models, are outlined in Table 7. Among the neural network-based models, one used 100 neurons in a single hidden layer. As highlighted in Table 7, this is more than the number of neurons in the narrow and medium presets. In the bilayered and trilayered networks, there were 10 neurons in each hidden layer. Lower results were obtained where the layers contained fewer neurons.
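One way to see the difference in capacity between these configurations is to count their trainable parameters. The sketch below assumes fully connected layers with biases, 4 inputs, and 1 output. These assumptions are illustrative and are not confirmed by Table 7.

```python
from typing import Dict, List, Sequence


def count_parameters(
    n_inputs: int, hidden_sizes: Sequence[int], n_outputs: int = 1
) -> int:
    """Weights plus biases of a fully connected feedforward network."""
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))


# Hidden-layer sizes mentioned in the text; 4 inputs is an assumption.
configs: Dict[str, List[int]] = {
    "single hidden layer (100)": [100],
    "bilayered (10, 10)": [10, 10],
    "trilayered (10, 10, 10)": [10, 10, 10],
}
counts = {name: count_parameters(4, hidden) for name, hidden in configs.items()}
for name, n in counts.items():
    print(f"{name}: {n} parameters")
```

Under these assumptions, the single wide layer has 601 parameters, compared with 171 and 281 for the two deeper but narrower networks.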

Estimation of SoH with Deep Learning-Based Approaches
The hyperparameters used with Bi-LSTM [20], one of the deep learning models, are listed in Table 8. The charge capacity data in the dataset were treated as a time series and were used for training and testing the deep learning model. In the Bi-LSTM algorithm, the data points were considered as consecutive steps of the series, and the model predicted the next charge capacity value, so the ordering of the data is important. The model yielded a successful result, with an MSE of 0.019618 and an RMSE of 0.14006. Bi-LSTM thus gave better results than the classical machine learning, regression, and ANN models.
Graphs of the results obtained from the Bi-LSTM deep learning algorithm are shown in Figure 6. They show the targets and forecasts, together with the metrics used to measure the model's performance.
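Treating the capacity values as a time series usually means turning them into (input window, next value) pairs before they are passed to a recurrent model. The sketch below shows one common way of doing this. The window length, the function name `make_windows`, and the sample values are assumptions, since the text does not state how the sequences were built.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], window: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each run of `window` consecutive values with the value that follows."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be at least 1 and shorter than the series")
    n_pairs = len(series) - window
    inputs = [list(series[i:i + window]) for i in range(n_pairs)]
    targets = [series[i + window] for i in range(n_pairs)]
    return inputs, targets


# Illustrative capacity values in Ah (not data from the study).
capacity = [1.10, 1.09, 1.08, 1.08, 1.07, 1.06]
X, y = make_windows(capacity, window=3)
for x_row, target in zip(X, y):
    print(x_row, "->", target)
```

Each target is the value immediately after its input window, so the temporal order of the series is preserved in every training pair.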

Discussion
In this study, we examined the challenges faced by EVs that use innovative technology and the impact of increasing complexity, as well as the BMSs and forecasting systems that play a crucial role in managing the SoH. We explored the advantages that ML and ANN approaches might offer in dealing with these challenges and highlighted the importance of these benefits. We discussed the advantages of ML-based prediction methods in making the LIBs used in EVs more efficient and adaptable in terms of management and sustainability. Compared to traditional statistical management, ML and ANNs offer a more dynamic BMS structure that may increase the operational efficiency of LIBs in complex and comprehensive EV infrastructures.
The main findings of this study may make a valuable contribution to the development of sustainability in the energy sector through the integration of the BMS structure implemented in EVs with ML- and ANN-based models. Previous studies in this field have gathered different data from similar LIBs. On these data, machine learning-based methods such as SVR, ELM, and SVM have generally been applied for SoH prediction. A comparison of the previous models and the results of this study is presented in Table 9.
ELM: extreme learning machine; MELM: metabolic extreme learning machine; PSO-LSSVR: particle swarm optimization-least square support vector regression; SVR/GA: support vector regression/genetic algorithm; FFNN: feedforward neural network.

Conclusions and Future Trends
This study presented a comprehensive review to evaluate the performances of different machine learning and deep learning approaches for SoH prediction. Various criteria were employed in the analysis to determine the impact and success of different models on SoH prediction. The advantages and limitations of each model were carefully examined, and the results were evaluated.
Regression-based approaches, especially models such as GPR, achieved significant success in SoH prediction. The DT and SVM models also showed acceptable results. Bi-LSTM achieved higher accuracy than the other models. These results demonstrate the potential of different machine learning and deep learning techniques in battery health prediction.
According to the previous literature, hybrid machine learning methods have not been applied to the SoH prediction of LIBs. The novelty of this study lies in showing that these hybrid methods yield better and more efficient results.
These results are well suited to integration into BMS software algorithms, especially for EVs.
Future studies may focus on various trends for the further advancement and development of this field. First, improving the data collection processes and using larger, more diverse datasets might be an important step toward improving the models' performance. Additionally, further studies may be conducted on improving the prediction accuracy by using different model combinations or ensemble methods.
Another important issue that can be examined in more detail is the performance of deep learning techniques on larger, more complex datasets. In particular, it is important to understand how techniques such as Bi-LSTM perform on such datasets in predicting the state of health of the battery. Additionally, investigating the potential advantages of new machine learning models that were not examined in this study, or that are less explored in the field of battery SoH prediction, could be an exciting area for further research. More comprehensive studies need to be conducted on the role and impact of new-generation artificial intelligence techniques and deep learning methods in SoH prediction.
In conclusion, this study presented an evaluation of the usability and effectiveness of machine learning techniques and deep learning approaches for battery SoH prediction. Future research might be expected to focus on further developing and optimizing the applications of these technologies in the battery industry.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app14062306/s1, Table S1. Parameters used in regression-based ML models; Table S2. Parameters of DT-based ML models; Table S3. Parameters of SVM-based ML models; Table S4. Hyperparameters of ANN-based models.
LIBs used in EVs play an important role in energy storage and are critical factors in determining the vehicle's performance and range. LIBs have a special structure that stores energy through electrochemical reactions and converts this energy into electrical energy. In the academic literature, different measurable and indirect indices are commonly used to understand LIB behavior. While measurable indices include factors such as the current, voltage, internal impedance, and temperature, which help determine the instantaneous state of the battery, the SoC and SoH have important places among the indirect indices [20]. While the SoC refers to the current energy level of LIBs, the SoH indicates the overall health of the battery and is related to its capacity and performance. Research on LIBs focuses on issues such as battery aging processes, energy storage capacity and efficiency, and changes in the internal structure of the battery. Studies of LIB behavior focus on the electrochemical degradation mechanisms, which are given in Figure 2. These mechanisms are among the factors that directly affect the longevity and performance of the battery. The combination of cracking, decomposition, chemical reactions, and additives occurring in the electrodes inside the battery causes the performance of the battery to change throughout its life. Therefore, a detailed consideration of the battery technology, improving battery performance, and the safe and effective use of LIBs are of great importance in the development of EVs. The accurate prediction of SoC and SoH values in EVs, and the measures taken as a result of these predictions, will ensure a longer life and efficient operation of LIBs.

Figure 3. The architectural structure of the proposed model (blue circles are input and output layer neurons; orange circles are hidden layer neurons).

Figure 4. Results obtained from the proposed model.

Figure 5. Training performance graph of the model.

Table 2. Advantages and disadvantages of ML and ANN models.

Table 3. Example LIB data.

Table 4. Training information of the model.

Table 5. Regression-based ML model results.

Table 6. Results obtained from SVM.

Table 7. Results obtained from the neural network.

Table 9. Comparison of previous studies in the literature.