Potential Failure Prediction of Lithium-ion Battery Energy Storage System by Isolation Density Method

Lithium-ion battery energy storage systems have developed rapidly and are a key part of the renewable energy transition and of China's 2030 “Carbon Peak” strategy. However, due to the complexity of this electrochemical equipment, the large-scale use of lithium-ion batteries brings severe challenges to the safety of the energy storage system. In this paper, a new method, based simultaneously on the concepts of statistics and density, is proposed for the potential failure prediction of lithium-ion batteries. As the method makes no strong assumptions about feature independence or sample distribution, and estimates the anomaly scores by integrating several trees on the isolation path, it is both adaptable and robust. For validation, the proposed method was first applied to two artificial datasets, and the results showed that the method was effective in dealing with different types of anomalies. Then, a comprehensive evaluation was carried out on six public datasets, and the proposed method performed better than the conventional algorithms under different criteria. Finally, the potential failure prediction of the lithium-ion batteries of a real energy storage system was conducted. In order to make full use of the time series characteristics, the voltage variation during a whole discharge cycle was taken as the representation of the operating condition of the lithium-ion batteries, and three different types of voltage deviation anomalies were successfully detected. The proposed method can be effectively used for the predictive maintenance of energy storage systems.


Introduction
The world's renewable energy (RE), represented by photovoltaic and wind power, has been developing rapidly in recent years and is becoming one of the most viable ways to meet soaring energy demands and address environmental concerns [1]. However, the inherent intermittency of RE brings potential instability to the power grid. Energy storage, which supports the consumption of renewable energy and the reliability of the grid, has consequently entered a period of leapfrog development [2][3][4]. On 15 July 2021, the China National Development and Reform Commission and the National Energy Administration jointly issued the "Guiding Opinions on Accelerating the Development of New Energy Storage" [5]. For the first time, the development goal of the energy storage industry was defined and quantified at the national level: the installed scale of new energy storage is expected to exceed 30 million kW by 2025, i.e., to grow from 3.28 GW at the end of 2020 to 30 GW by 2025. In the next five years, the scale of the new energy storage market should therefore expand to about 10 times the current level, with a compound annual growth rate of more than 55%. At the same time, various provinces and cities have issued relevant policies requiring that, in principle, the energy storage capacity of new energy projects should not be less than 10% to 20% of their installed capacity.
The lithium-ion battery (LIB) has become one of the most important energy storage technology routes [6,7], mainly due to its significant advantages over other battery types [8][9][10], such as a longer lifecycle, a faster response speed, a lower self-discharge rate, and higher energy conversion efficiency. In China, a batch of hundred-megawatt-scale demonstration energy storage systems (ESSs) were successively connected to the grid in 2021, and the important application scenarios of stable peak regulation and fast frequency regulation by large-scale LIB-ESSs were verified successfully. Even gigawatt-scale energy storage systems are on the agenda in many provinces. However, the large-scale use of LIBs also brings severe challenges to the safety of ESSs [11,12], and this has become one of the biggest obstacles to the development of LIB-ESSs. Some relevant research has been conducted [13][14][15][16], such as the estimation of the state of charge (SOC) [10,17], the estimation of the state of health (SOH) from capacity decline [18][19][20] or internal resistance increase [21], the prediction of the remaining useful life (RUL), etc. However, these calculations require long-term battery operation data and even full-lifecycle test data [22], and there is little research on the real-time potential failure prediction of LIBs.
In this paper, a new anomaly detection method is proposed for the real-time potential failure prediction of the LIBs of ESSs; this method integrates multiple binary trees and repeatedly estimates the density of the subset that a sample is in while it is on the isolation path. The approach is simultaneously related to statistics-based, density-based, and depth-based methods. Isolation density (iDensity) carries the notion of density itself, which is estimated in the isolation process of a sample; from the aspect of statistics, isolation density is in fact the conditional probability density of the instance with multidimensional features. Compared with former studies, the contributions of this paper are the following. (1) An unsupervised anomaly detection method is introduced for the diagnosis of the LIBs of ESSs; compared with the conventional SOH and RUL calculation methods, the real-time working condition of LIBs can be estimated without prior knowledge or long-running data. (2) A new LIB anomaly detection method is proposed; as there is no strong assumption about the data distribution, the method can adapt to datasets with different characteristics, whether the outliers are exposed, enclosed by normal instances, or clustered together. (3) The unaggregated voltage variations of a whole discharge cycle, instead of instantaneous values, are employed for anomaly detection, so that the anomalies of the LIBs on the scales of both the features and the time series can be detected effectively.
The paper is organized as follows: in Section 2, the related work on anomaly detection methods and their advantages and drawbacks are discussed. In Section 3, a new anomaly detection method is proposed, and the mathematical interpretation is discussed. In Section 4, experiments and qualitative discussion of the proposed method are conducted by artificial datasets with different anomalies. In Section 5, experiments and quantitative comparisons with conventional methods are conducted by public datasets. In Section 6, as a case study, the proposed method is used for the anomaly detection of a real ESS. Finally, the paper is concluded in Section 7.

Related Work
Anomaly detection is the process of identifying the few and the different samples with an unsupervised or a semi-supervised method; it has become a research hotspot in many application domains [23][24][25], such as fraud detection in the financial domain, intrusion detection in the cyber security domain, system fault detection in the industrial production domain, etc. However, several factors make this apparently simple and clear task very challenging. (1) Anomalies are usually sparse, separated, or exposed instances, but they cannot be well defined except by some basic and abstract assumptions. (2) Due to the difficulty of sample labeling, the most common scenario for anomaly detection is semi-supervised or unsupervised learning, i.e., identifying outliers based on their features.
A variety of algorithms have been proposed by previous scholars [26], and most of them are based on some explicit or implicit assumptions, which determine both the performance and the limitations of the algorithm. In general, the stronger the assumptions, the less adaptable the algorithm.
Statistics-based methods have been widely used for anomaly detection. The histogram-based outlier score (HBOS) [27] is an efficient method for anomaly detection in large datasets. However, its basic assumption is that the features are independent of each other; consequently, the algorithm fails to detect correlation anomalies. In distance-based methods, such as k-nearest neighbors (KNN) [28], instances are judged to be abnormal or not by a distance measure. In cluster-based methods, such as k-means [29], instances are judged to be abnormal or not by their degree of deviation from existing clusters. However, Euclidean distance is usually employed as the distance measure in KNN and k-means, which introduces the implicit assumption that the data conform to a spherical distribution. The local outlier factor (LOF) [30] is a density-based method for anomaly detection [31]. In the LOF method, the dissimilarity between the reachability distances of an instance and those of its k neighbors is used as the anomaly score. However, the use of Euclidean distance again introduces the implicit assumption of a spherical distribution. In addition, the scope of "local" is difficult to define. Depth-based methods, such as the isolation forest (iForest) [32][33][34], determine anomalies by partitioning the sample space. The basic assumption is that outliers usually have a smaller depth on the lookup path. Because this is a weak hypothesis, iForest shows better adaptability. However, iForest tends to capture the exposed points, while anomalies that are enclosed by normal instances are usually difficult to detect.
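The independence limitation of histogram-based scoring can be illustrated with a small sketch. The code below is a simplified, per-feature histogram score written for this illustration only (it is not the reference HBOS implementation); the function name `hbos_scores`, the bin count, and the toy data are assumptions.

```python
import math
from collections import Counter

def hbos_scores(points, n_bins=10):
    """Simplified histogram-based score: sum over features of
    -log(relative frequency of the bin the value falls into)."""
    n = len(points)
    scores = [0.0] * n
    for f in range(len(points[0])):
        values = [p[f] for p in points]
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features
        # Bin index of every value; the maximum is clamped into the last bin.
        bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
        counts = Counter(bins)
        for i, b in enumerate(bins):
            scores[i] += -math.log(counts[b] / n)
    return scores

# 100 normal points on the line y = x, plus one point that breaks the correlation.
normal = [(i / 99, i / 99) for i in range(100)]
anomaly = (0.2, 0.8)
scores = hbos_scores(normal + [anomaly])
anomaly_score = scores[-1]
lowest_normal_score = min(scores[:-1])
```

Because each feature is scored separately, the point (0.2, 0.8) falls into well-populated bins on both axes and does not receive a higher score than the normal points, although it clearly violates the correlation between the two features.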

Isolation Density
A two-dimensional dataset is shown in Figure 1, where the blue points are considered to be normal samples, and the red point x_i is an anomalous one. At first, a random value in a randomly selected dimension is generated, by which the samples are divided into two parts. Then, the densities of the subsets that contain and do not contain x_i, i.e., d and d', are calculated. This process is repeated until x_i is fully isolated or the maximum isolation depth is reached, as shown in Figure 1a-f. As d_i is the sequence of densities generated in the isolation process of x_i, it is called the isolation density. The central assumption of the proposed method is that d will always be greater than d' in the statistical sense.
Based on the idea of ensemble learning, x_i is isolated independently by several binary trees, and the final isolation density of x_i is obtained by combining, with weights, the densities estimated along the isolation path of every tree. Here, d is the isolation density of sample x, and n_t and n' denote the total number of binary trees and the depth of the isolation path of a binary tree, respectively; k is the corresponding weight; m is the number of remaining samples, including x_i, on the isolation path; V is the hyper-cube volume on the isolation path; L is the edge length of the hyper-cube; r is the ratio of the edge length to the original edge length of the hyper-cube on the isolation path; and V_0 is the original volume of the hyper-cube. Superscript t denotes the t-th tree, and subscripts i, j, and f denote the i-th sample, the j-th isolation operation, and the f-th feature, respectively. The final isolation density d_i is accumulated over the whole isolation path. That is, the algorithm's notion of an "outlier" covers not only the sample itself but also its neighborhood samples. Accordingly, the definition of "outlier" used in this paper is the following: a sample with a lower probability density, either by itself or in a subset including some neighborhood samples, is considered an outlier.
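The isolation procedure described above can be sketched in a few lines. This is only an illustrative reading of the text, not the authors' implementation: the weights k are assumed uniform, the volume V is taken as the bounding box of the samples that remain with x_i after each split, nodes with zero volume (including the fully isolated sample) are skipped, and the function name `isolation_density` and all parameter defaults are hypothetical. The sketch also assumes that no feature is constant over the whole dataset.

```python
import random

def isolation_density(data, index, n_trees=50, max_depth=8, seed=0):
    """Average relative density (m / n) / (V / V_0) of the subsets that contain
    data[index] along random isolation paths (uniform weights assumed)."""
    rng = random.Random(seed)
    n, dims = len(data), range(len(data[0]))
    x = data[index]
    # Edge lengths of the original hyper-cube (bounding box of all samples).
    span0 = [max(p[f] for p in data) - min(p[f] for p in data) for f in dims]
    tree_scores = []
    for _ in range(n_trees):
        subset = list(data)
        densities = [1.0]  # root node: all samples in the original volume
        for _ in range(max_depth):
            ranges = [(min(p[f] for p in subset), max(p[f] for p in subset))
                      for f in dims]
            candidates = [f for f in dims if ranges[f][0] < ranges[f][1]]
            if len(subset) <= 1 or not candidates:
                break  # x is isolated, or only duplicates remain
            f = rng.choice(candidates)
            lo, hi = ranges[f]
            split = lo + (hi - lo) * rng.random()
            keep_left = x[f] < split
            # Keep only the side of the split that contains x.
            subset = [p for p in subset if (p[f] < split) == keep_left]
            # Product of the edge-length ratios r of the remaining box.
            volume_ratio = 1.0
            for g in dims:
                edge = max(p[g] for p in subset) - min(p[g] for p in subset)
                volume_ratio *= edge / span0[g]
            if volume_ratio > 0.0:
                densities.append((len(subset) / n) / volume_ratio)
        tree_scores.append(sum(densities) / len(densities))
    return sum(tree_scores) / n_trees

# Toy data: a 7 x 7 grid of normal points and one isolated point far away.
data = [(i / 6, j / 6) for i in range(7) for j in range(7)] + [(10.0, 10.0)]
scores = [isolation_density(data, i) for i in range(len(data))]
```

In this toy example the isolated point at (10, 10) is expected to receive the lowest score, since every subset that still contains it spans a large, mostly empty volume.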

Algorithm Validation by Artificial Datasets
In this section, we create two complex artificial datasets, i.e., a ring dataset and a double-moon dataset, which are used for the validation of the proposed method.
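Datasets of this kind can be generated as in the following sketch. The sample counts, radii, noise level, and the function names `make_ring` and `make_double_moon` are assumptions made for illustration; the paper does not state how its artificial datasets were produced.

```python
import math
import random

def make_ring(n_normal=300, n_inner=10, n_outer=10, radius=1.0, noise=0.05, seed=0):
    """Normal points on a noisy circle; anomalies inside and outside it."""
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n_normal):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        r = radius + rng.gauss(0.0, noise)
        points.append((r * math.cos(angle), r * math.sin(angle)))
        labels.append(0)
    # (count, minimum radius, maximum radius) for inner and outer anomalies
    for count, r_lo, r_hi in ((n_inner, 0.0, 0.5 * radius),
                              (n_outer, 1.5 * radius, 2.0 * radius)):
        for _ in range(count):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = rng.uniform(r_lo, r_hi)
            points.append((r * math.cos(angle), r * math.sin(angle)))
            labels.append(1)
    return points, labels

def make_double_moon(n_per_moon=150, noise=0.05, seed=0):
    """Two interleaved half-circles (normal samples only)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_per_moon):
        t = rng.uniform(0.0, math.pi)
        points.append((math.cos(t) + rng.gauss(0.0, noise),
                       math.sin(t) + rng.gauss(0.0, noise)))
        points.append((1.0 - math.cos(t) + rng.gauss(0.0, noise),
                       0.5 - math.sin(t) + rng.gauss(0.0, noise)))
    return points

ring_points, ring_labels = make_ring()
moon_points = make_double_moon()
```

Label 1 marks the injected anomalies of the ring dataset; anomalies for the double-moon data could be added in the same way, for example by sampling points between the two moons.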

Circle Distribution Dataset
A circle dataset is shown in Figure 2. There is a strong nonlinear correlation between the two features, and some anomalous points are distributed outside and inside the circle. The detection results are shown in Figure 3. As described above, because HBOS assumes that the features are independent (the naive Bayesian assumption), it is difficult to apply to complex datasets in most cases, as is shown in Figure 3a. The iForest method locates each sample through binary trees and detects anomalies by the depth of the search path, because anomalies often have a smaller search path depth. As a consequence of this principle, the points exposed on the outside of the dataset are easily detected as anomalies, while the points inside the dataset are often difficult to find. As is shown in Figure 3b, some of the points at the center of the circle are incorrectly detected as normal by the iForest method. In the LOF method, anomalies are judged by the density of a point relative to its surrounding points; this is usually used for local anomaly detection. However, if the samples of a dataset are unevenly distributed, the LOF method often cannot provide the right results. As is shown in Figure 3c, although the points at the center of the dataset are sparse overall, the relatively dense ones among them are still identified as normal, which is obviously not the desired result. Finally, Figure 3d shows the result of the proposed method; the circle distribution is identified, and most of the points that do not fit the distribution are detected as anomalies.

Double-Moon Distribution Dataset
Figure 4 shows a more difficult dataset, i.e., a double-moon distribution dataset. This dataset combines several difficulties: strong nonlinear correlation, inside anomalies, and uneven distribution. The detection results are shown in Figure 5. As mentioned above, HBOS is not suitable for datasets with dependent variables; as is shown in Figure 5a, the points in regions A and B are incorrectly detected as anomalies. iForest is not suitable for datasets with enclosed anomalies; as is shown in region A of Figure 5b, the points enclosed by the adjacent ones are incorrectly detected as normal. In addition, the LOF is not suitable for datasets with uneven distribution; as is shown in Figure 5c, the points in the lower left and upper right regions are incorrectly detected as normal. Finally, the detection results of the proposed method are shown in Figure 5d; in the regions where the conventional methods fail, the proposed method can still separate the normal and anomalous samples effectively.

Algorithm Validation by Public Datasets
Public datasets are used to validate the proposed method in this section. The public datasets are from the University of California Irvine (UCI) machine learning repository [35], which is a collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.

Evaluation Criteria of Algorithm Performance
Six criteria are used in this section to evaluate the performance of the proposed method: accuracy a, recall r, precision p, F1 score f_1, the Matthews correlation coefficient MCC, and mean score f_m. All of them are computed from the detection results, where positive samples denote the abnormal samples and negative samples denote the normal samples.
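As an illustration, the criteria can be computed from the confusion matrix as in the sketch below. Accuracy, recall, precision, F1, and MCC follow their standard definitions; the mean score f_m is taken here as the arithmetic mean of the other five, which is an assumption, since its formula is not recoverable from the text. The function name `detection_metrics` and the toy labels are hypothetical.

```python
import math

def detection_metrics(y_true, y_pred):
    """Confusion-matrix criteria; label 1 = abnormal (positive), 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Assumption: f_m is the plain average of the five criteria above.
    mean_score = (accuracy + recall + precision + f1 + mcc) / 5
    return {"a": accuracy, "r": recall, "p": precision,
            "f1": f1, "MCC": mcc, "fm": mean_score}

# Toy example: 2 abnormal and 8 normal samples, one miss and one false alarm.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = detection_metrics(y_true, y_pred)
```

With strongly imbalanced data such as the datasets below, accuracy alone can look high even when most anomalies are missed, which is why recall, precision, and MCC are reported alongside it.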

Wisconsin Breast Cancer Dataset
The original Wisconsin breast cancer dataset contains 32-dimensional features with a sample size of 569. The features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass, and they describe the characteristics of the cell nuclei present in the image. The modified dataset used in this paper contains 367 samples with a feature dimension of 30, among which the proportion of abnormal samples is 2.72%. The detection results of each algorithm are shown in Table 1, where the bold characters are the optimal results under the corresponding criteria (the same applies below).

Pen Handwriting Dataset
The pen handwriting dataset is a digit database created by collecting 250 samples from 44 writers; it has a total of 10,992 samples with 16 features. The modified dataset contains 809 samples with a feature dimension of 16, among which the proportion of abnormal samples is 11.1%. The detection results of each algorithm are shown in Table 2.

Statlog (Shuttle) Dataset
The shuttle dataset contains 58,000 samples and 9 numerical attributes. The examples in the original dataset were in time order, and this time order could presumably be relevant in classification. However, this was not deemed relevant for the StatLog purposes, so the order of the examples in the original dataset was randomized. The modified dataset used in this paper contains 46,464 samples with a feature dimension of 9, among which the proportion of abnormal samples is 1.89%. The detection results of each algorithm are shown in Table 3.

KDD Cup 1999 Dataset
This is the dataset used for "The Third International Knowledge Discovery and Data Mining Tools Competition", which was held in conjunction with KDD-99. The task is to build a network intrusion detector, i.e., a predictive model that can distinguish between "bad" connections (known as intrusions or attacks) and "good" normal connections. The original dataset contains 4,000,000 samples with 42 features. The modified dataset used in this paper contains 620,089 samples with a feature dimension of 38, among which the proportion of abnormal samples is 0.17%. The detection results of each algorithm are shown in Table 4.

Banknote Authentication Dataset
The banknote authentication data were extracted from images that were taken from genuine and forged banknote-like specimens. For digitization, an industrial camera usually used for print inspection was used. The final images have 400 × 400 pixels. Due to the object lens and the distance to the investigated object, gray-scale pictures with a resolution of about 660 dpi were gained. The Wavelet Transform tool was used to extract features from the images. The dataset contains a total of 1372 samples with 4 features, 44.4% of which are anomalies. The detection results are shown in Table 5.

Multi-distribution Dataset
The multi-distribution dataset contains four normal distributions (one of which has low density), a micro cluster, and local anomalies. There is a total of 3000 samples with 2 features, 1.23% of which are anomalies. The detection results of the four methods are shown in Table 6.

Anomaly Detection of Lithium-ion Batteries
In this section, anomaly detection is conducted on the lithium-ion batteries of a real energy storage system. The ESS, which was built to absorb the renewable energy of a nearby wind-power plant, consists of 12 battery compartments in parallel. A battery compartment consists of four battery piles in parallel. A battery pile consists of five battery clusters in parallel. A battery cluster consists of 18 battery packs in series. Finally, a battery pack is composed of 12 batteries in series.
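The hierarchy above fixes the number of individual batteries in the system, which a short calculation makes explicit. The sketch assumes that the 12 compartments make up the whole ESS, as the description suggests; the variable names are illustrative.

```python
from math import prod

# (level description, units per parent) following the hierarchy in the text
levels = [
    ("compartments per ESS", 12),
    ("piles per compartment", 4),
    ("clusters per pile", 5),
    ("packs per cluster", 18),
    ("batteries per pack", 12),
]

total_batteries = prod(count for _, count in levels)
batteries_per_cluster = 18 * 12  # series string inside one cluster
clusters_in_ess = 12 * 4 * 5

print(total_batteries, batteries_per_cluster, clusters_in_ess)
```

Under this reading the ESS contains 51,840 batteries arranged in 240 clusters of 216 series-connected batteries each, which gives a sense of how many voltage curves a monitoring method may have to screen.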

Description of Lithium-ion Battery Dataset
A difficulty for the anomaly detection of the lithium-ion batteries of an ESS is that the number of measurement parameters is very small. For the studied ESS, there are only three independent measurement parameters for lithium-ion batteries, i.e., voltage, current, and temperature. In addition, for the larger-scale ESSs, even temperature measurements are inadequate. So, in this paper, in order to increase the number of features, the voltage variation of the battery discharge process is taken as its characteristic sequence and is used for series-based anomaly detection.
A discharge process of the ESS was started on 1 November 2020 at 14:11, and it ended on 1 November 2020 at 15:10; it lasted 60 min. The real voltage variations are shown in Table 7, and the parameter variations are shown in Figure 6, where the blue lines are the voltage variations of the lithium-ion batteries, and the red line is the current of the battery pile. As can be seen, the voltages of the batteries decrease gradually as the discharge process goes on. Because the battery adopts a constant power discharge strategy, its current increases gradually as the voltage decreases. Because none of the voltages exceeded the threshold, no threshold-based alarm was triggered; however, some voltage trends appear to be quite different from the overall trend. The exact reasons behind these phenomena may not be known at the time, but it is always helpful to identify them accurately when they occur.
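Taking the whole discharge curve as the feature vector of a battery can be sketched as follows. The voltage values are made up for illustration and are not taken from Table 7; the function name `voltage_features` is hypothetical, and centering each time step on the median across batteries is an optional assumption of this sketch rather than a step stated in the paper.

```python
from statistics import median

def voltage_features(series_by_battery):
    """Turn per-battery voltage series of equal length into feature rows.

    Returns the battery ids, the raw rows (one feature per time step), the
    deviation of each row from the per-time-step median, and the voltage
    spread (max - min across batteries) at each time step."""
    ids = sorted(series_by_battery)
    rows = [list(series_by_battery[b]) for b in ids]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all voltage series must have the same length")
    columns = list(zip(*rows))  # one column per time step
    medians = [median(col) for col in columns]
    deviations = [[v - m for v, m in zip(row, medians)] for row in rows]
    spread = [max(col) - min(col) for col in columns]
    return ids, rows, deviations, spread

# Made-up example: three batteries sampled at three time steps (volts).
example = {
    "B1": [3.30, 3.25, 3.20],
    "B2": [3.31, 3.26, 3.21],
    "B3": [3.29, 3.20, 3.05],
}
ids, rows, deviations, spread = voltage_features(example)
```

Each row can then be passed to an anomaly detector as one sample, so that a battery is judged by the shape of its whole discharge curve rather than by a single instantaneous reading.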

Anomaly Detection of Lithium-ion Batteries
In this section, the voltage variations of the batteries are used for anomaly detection. The results are shown in Figure 7, where the blue lines are the lithium-ion batteries detected as normal, and the red lines are the anomalous ones; in total, 11 batteries, which are significantly different from the other samples, were detected as anomalous by the proposed method. As described above, at the beginning of discharge the voltages of the batteries are relatively consistent, and the maximum voltage difference is about 0.04 V. However, as the discharge process progresses, the voltage difference gradually becomes larger, and at the end of the discharge, the maximum voltage difference is about 0.27 V.
As can be seen, the batteries detected as anomalous present very different characteristics, i.e., the voltages of some batteries fluctuate wildly in the discharge process, while the voltages of others vary as a sawtooth wave.

Result Analysis
Furthermore, all of the battery anomalies are shown separately in Figure 8. It can be seen that the 11 samples exhibit different abnormal forms. The blue lines indicate batteries whose voltages fluctuate wildly in the discharge process; at a certain stage of discharge, the battery voltage fluctuates violently and deviates significantly from a normal value. The red lines indicate batteries whose voltages fluctuate as a sawtooth wave; the voltage variation of these batteries shows a certain regularity. In addition, the green line changes more smoothly; however, on careful analysis, it is still different from the other batteries. At the beginning of discharge, its voltage is lower than the others, while at the end of discharge its voltage is higher than the others. Compared with the other curves, the green line is therefore considered to show anomalous discharge behavior.
These abnormal behaviors could be a precursory manifestation of some underlying failure; for instance, the sawtooth wave oscillations may be caused by a sensor fault; the violent changes may be caused by a sensor fault or some sort of potential battery failure, while the smooth deviations from the normal values are more likely to be caused by battery degradation. However, due to the complex battery fault characteristics, long laboratory testing cycles, and some practical factors, sufficiently effective prior knowledge of the faults of the batteries is often not available.
The potential failure prediction method proposed in this paper is in fact a fault prediction method based on unsupervised learning, which can identify abnormal batteries in advance, before unknown potential faults occur. However, the causes of the abnormalities of the identified batteries need to be carefully studied by experts or even by experiments; they may be caused by battery degradation or by simple sensor errors. We cannot easily draw conclusions about the causes of the anomalies at present; this is the focus of our subsequent research.

Conclusions
With the rapid development of renewable energy in China and the intensive introduction of energy-storage-related policies, the installation capacity of the energy storage system in China has increased rapidly. At the same time, the safety problem of the main equipment of the energy storage system, i.e., the lithium-ion battery, is gradually being exposed. So, in the absence of sufficient prior knowledge, it is very important for the safe operation of the energy storage station to detect the abnormal running state of the batteries in time through unsupervised learning methods.
In this paper, a new method based on isolation density was proposed for anomaly detection and was verified on datasets with different types of anomalies. Finally, anomaly detection was conducted on a real lithium-ion battery energy storage system, and 11 batteries with different kinds of anomalous conditions were detected as abnormal. The conclusions are as follows: (1) A new anomaly detection method based on isolation density was proposed in this paper and was verified on artificial datasets and public datasets containing different types of anomalies. (2) Isolation density can be viewed as the sparse degree or the probability density of the battery, from the aspects of density or statistics, respectively. As it inherently involves the idea of ensemble learning and makes no prior assumption about the data distribution, the method is highly adaptable and can effectively be used for the detection of many kinds of anomalies. (3) The voltage variations during a whole discharge process of the batteries are taken as the features of the working condition. Then, through the proposed method and the time series data of the voltages, the batteries with different abnormal discharge states are effectively detected. (4) The abnormal discharge states can be divided into three classes: violent changes, sawtooth wave oscillations, and smooth deviations from normal values.