Battery Grouping with Time Series Clustering Based on Affinity Propagation

Battery grouping is a technology widely used to improve the performance of battery packs. In this paper, we propose a time series clustering based battery grouping method. The proposed method utilizes the whole battery charge/discharge sequence for battery grouping. The time sequences are first denoised with a wavelet denoising technique. The similarity matrix is then computed with the dynamic time warping distance, and finally the time series are clustered with the affinity propagation algorithm according to the calculated similarity matrices. The silhouette index is utilized for assessing the performance of the proposed battery grouping method. Test results show that the proposed battery grouping method is effective.


Introduction
With economic development, people pay more and more attention to environmental protection, and new energy vehicles have become an increasingly attractive means of transportation. Power batteries, as the energy source of electric vehicles (EVs), play an important role in the whole EV system. However, the small capacity and low voltage of a single battery cell cannot meet the needs of an EV; thus, many new battery cells are generally grouped together to form a battery pack that provides enough power for the EV [1]. Because of differences in their characteristics, some of the batteries in a pack can easily be over-charged or over-discharged, or the capacity of most batteries in the pack cannot be fully used, which affects the performance and service life of the whole battery pack. Moreover, the inconsistency of the batteries makes monitoring and managing the states of the battery pack, such as estimating the state of charge (SOC) [2,3] and the state of health (SOH) [4,5], much more difficult. Therefore, it is very important to choose a suitable battery grouping method that assembles batteries with similar characteristics into a pack by a serial and/or parallel connection.
At present, there are three types of battery grouping methods as follows: single characteristic based battery grouping, multiple characteristics based battery grouping, and dynamic characteristics based battery grouping.
In general, the characteristics used for battery grouping include the battery terminal voltage, internal resistance, static capacity, self-discharge rate, etc. The single characteristic based battery grouping methods choose only one of these characteristics as the grouping criterion. For example, the cell capacity is usually used as the criterion for battery grouping, i.e., batteries with similar capacities are grouped together to form a battery pack. Internal resistance is another widely considered characteristic for building battery packs [6]. Thermal characteristics may also be considered for building a battery pack [7]. Unfortunately, a single characteristic is generally too limited to present the performance
Energies 2016, 9, 561
clustering problem. The only difference between the battery grouping problem and other common clustering problems is that every cluster in a battery grouping problem must contain the same number of batteries, but this is not a serious obstacle. One can first cluster the batteries with commonly used clustering algorithms, and then divide or merge the clusters whose number of batteries differs from the required number. This may not be an optimal solution, but it should be adequate. Therefore, in this paper, we regard the battery grouping problem as a common clustering problem.
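The divide-or-merge step mentioned above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `split_into_packs`, the cell identifiers and the pack size of four are hypothetical, and simply collecting the leftover cells for later merging is one possible policy, not necessarily the one used in this work.

```python
def split_into_packs(clusters, pack_size):
    """Cut each cluster into packs of exactly pack_size cells.

    Returns (packs, leftovers); the leftover cells did not fill a whole
    pack and would have to be merged or regrouped separately.
    """
    packs, leftovers = [], []
    for members in clusters:
        full = len(members) - len(members) % pack_size
        for start in range(0, full, pack_size):
            packs.append(members[start:start + pack_size])
        leftovers.extend(members[full:])
    return packs, leftovers


# Hypothetical clustering result: three clusters of battery IDs.
clusters = [[1, 2, 3, 4, 5], [6, 7, 8], [9, 10, 11, 12]]
packs, leftovers = split_into_packs(clusters, pack_size=4)
```

Cells inside one pack always come from the same cluster, so the similarity found by the clustering step is preserved; only the leftover cells lose that guarantee.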
As mentioned before, in this paper we use the full charge and discharge series for battery grouping. To obtain these charge and discharge series, we built an embedded system that can simultaneously measure the terminal voltages, temperatures and currents of 20 batteries, as shown in Figure 1.
Typical battery discharge curves at a discharge rate of 0.6C for LiMn2O4 batteries are shown in Figure 2. At first glance, batteries #2 and #4 are close to each other. Actually, the DTW distances between the different sequences, which will be further explained in the following section, are as follows:

AP Based Battery Grouping with the DTW Distance
As mentioned before, we utilize the whole battery charge/discharge curve for battery grouping. To obtain the battery charge curve, the battery is initially fully discharged, i.e., it has a state of charge of 0%; it is then charged with a constant current until it is fully charged. During this process, the battery terminal voltages are measured and saved to form the battery charge curve. The process for obtaining the battery discharge curve is similar. Because both the charge curve and the discharge curve are time-related, we regard battery grouping as a time series clustering problem. There are three important issues that should be considered before conducting time series clustering. First, noise may exist in the obtained battery charge/discharge data even if the battery is charged/discharged with a constant current. Second, the number of data points may differ from battery to battery; thus the lengths of the time series may not be the same. Third, the number of clusters is unknown beforehand. To solve the first issue, a data denoising process should be used. In this paper, we utilize a wavelet based signal denoising [19] technique to smooth the battery charge/discharge data. Because the data lengths vary, traditional distance metrics, such as the Euclidean distance, cannot be directly utilized. In this paper, we utilize the DTW distance to compute the similarities among different battery charge/discharge curves. Because the number of clusters is unknown, traditional data clustering methods, such as K-means and KNN, cannot be directly utilized. For this reason, we use the AP algorithm to perform the time series clustering. The flow chart of the proposed method is shown in Figure 3.



Wavelet Denoising
Suppose the underlying model for the noisy signal $x(n)$ is basically of the following form:

$$x(n) = s(n) + e(n) \quad (1)$$

where $s(n)$ is the noise-free signal and $e(n)$ is Gaussian white noise with a zero mean and a standard deviation of $\sigma$. The wavelet denoising mainly consists of the following three procedures:
(1) Decomposition. Decompose the noisy signal $x(n)$ into $L$ levels and compute the approximation coefficients of $x(n)$ at level $L$.
(2) Denoising by thresholding the detail coefficients. In this step, a soft thresholding process is applied to the detail coefficients at levels 1 to $L$.
(3) Reconstruction. Compute the reconstructed signal based on the original approximation coefficients of level $L$ and the thresholded detail coefficients of levels 1 to $L$.
In this paper, we use the symlet wavelet "sym8", which is a compactly supported wavelet with the least asymmetry and the highest number of vanishing moments, as the wavelet for signal decomposition. A total of 5 decomposition levels is used. Figure 4 shows an example of wavelet denoising. As observed from Figure 4, the severe fluctuation of the original noisy signal has been successfully removed after denoising.
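The three procedures (decomposition, soft thresholding of the detail coefficients, reconstruction) can be illustrated with a small dependency-free sketch. Note the substitutions: it uses the Haar wavelet instead of "sym8" so that no filter table or external library is needed, it assumes the signal length is a multiple of $2^L$, and the threshold value, the toy data and all function names are hypothetical choices made only for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal, levels):
    """Return the level-L approximation and the detail coefficients of levels 1..L."""
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t; magnitudes below t become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest level."""
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(approx, detail):
            rebuilt.extend([(a + d) / SQRT2, (a - d) / SQRT2])
        approx = rebuilt
    return approx


def denoise(signal, levels, threshold):
    approx, details = haar_decompose(signal, levels)
    details = [soft_threshold(d, threshold) for d in details]
    return haar_reconstruct(approx, details)


# Toy "discharge" ramp with an alternating +/-0.02 V disturbance (made-up data).
noisy = [4.0 - 0.01 * n + (0.02 if n % 2 else -0.02) for n in range(32)]
smooth = denoise(noisy, levels=2, threshold=0.1)
```

With a threshold of zero the reconstruction returns the input unchanged, which is a quick way to check that the decomposition and reconstruction steps are mutually consistent.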


DTW Distance
The similarity between two sequences can be measured by their distance. Suppose we have two sequences, $Q = \{q_1, q_2, \ldots, q_m\}$ and $C = \{c_1, c_2, \ldots, c_n\}$. There are several methods to compute the distance between $Q$ and $C$. For example, if $m = n$, we can use the Euclidean distance, i.e.,

$$d = \sqrt{\sum_{i=1}^{m} (q_i - c_i)^2} \quad (2)$$

If $m \neq n$, a simple way to calculate the distance is to rescale one of the sequences with interpolation so that the two sequences have the same length. Then, we can use the Euclidean distance to measure the similarity. However, this may not be a good strategy for time series with varying time and speed. As shown in Figure 5, two time series have similar shapes and the same length, but the feature points occur at different times. If we still use the Euclidean distance to measure the similarity between these two sequences, a large distance will be obtained, which may not be desired. For this situation, the DTW distance is a better choice to describe the similarity between them. In general, DTW is a method that calculates an optimal match between two sequences. The sequences are "time warped" non-linearly so that the key feature points can be properly aligned.

DTW utilizes dynamic programming to find the optimal mapping of points in two sequences and compute the distance between them. Detailed steps for computing the DTW distance can be found in several papers [20,21], and efficient algorithms can be found in [17].
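As an illustration of the dynamic programming idea, a basic DTW distance can be sketched as below. The sketch assumes the absolute difference as the local cost and applies no warping-window constraint; neither choice is stated in the text, and the function names and test sequences are made up. It is the textbook quadratic-time recursion, not one of the efficient algorithms referred to above.

```python
import math


def dtw_distance(q, c):
    """Classic O(m*n) dynamic-programming DTW with |q_i - c_j| as local cost."""
    m, n = len(q), len(c)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning q[:i] with c[:j]
    cost = [[inf] * (n + 1) for _ in range(m + 1)]
    cost[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            step = abs(q[i - 1] - c[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # q advances alone
                                    cost[i][j - 1],      # c advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[m][n]


def euclidean_distance(q, c):
    """Point-by-point distance; only defined for equal lengths."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


# Same bump, shifted by one sample: large Euclidean distance, zero DTW distance.
q = [0, 0, 1, 2, 1, 0, 0]
c = [0, 1, 2, 1, 0, 0, 0]
shifted_euclid = euclidean_distance(q, c)
shifted_dtw = dtw_distance(q, c)

# Sequences of different lengths can be compared directly.
unequal_dtw = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

The shifted pair shows the situation described for Figure 5: the shapes match, so the warped alignment costs nothing, while the point-by-point comparison penalizes the time offset.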
Suppose we have $N$ sequences. After the computation of the DTW distances, we can obtain a distance matrix $D = [d_{ij}]$, where $d_{ij}$ is the DTW distance between sequence $i$ and sequence $j$. Suppose the largest distance is $d_{\max}$ and the smallest distance is $d_{\min}$; we can then obtain the similarity matrix $S = [s_{ij}]$, where $s_{ij}$ is the similarity of sequence $i$ and sequence $j$, obtained from $d_{ij}$ through a normalization step based on $d_{\min}$ and $d_{\max}$.
According to the definition of distance, $d_{ii}$ is 0. We define $s_{ii}$ as the mean similarity of the data points that connect to data point $i$, i.e.,

$$s_{ii} = \frac{1}{N-1} \sum_{j \neq i} s_{ij} \quad (4)$$
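A possible way to turn a distance matrix into a similarity matrix is sketched below. The exact normalization formula is not reproduced in this text, so the min-max form $s_{ij} = (d_{\max} - d_{ij}) / (d_{\max} - d_{\min})$ used here is an assumption, as is taking $d_{\min}$ and $d_{\max}$ over the off-diagonal entries only; the function name and the example matrix are hypothetical. The diagonal is filled with the mean similarity of each point to the others, following the verbal definition of $s_{ii}$.

```python
def similarity_matrix(dist):
    """Map an N x N distance matrix (N >= 2) to similarities in [0, 1].

    ASSUMPTION: min-max normalization over the off-diagonal distances;
    larger values mean "more similar".
    """
    n = len(dist)
    off_diag = [dist[i][j] for i in range(n) for j in range(n) if i != j]
    d_min, d_max = min(off_diag), max(off_diag)
    span = (d_max - d_min) or 1.0  # guard against all distances being equal
    sim = [[(d_max - dist[i][j]) / span for j in range(n)] for i in range(n)]
    # Self-similarity: mean similarity of point i to all other points.
    for i in range(n):
        others = [sim[i][j] for j in range(n) if j != i]
        sim[i][i] = sum(others) / len(others)
    return sim


# Made-up DTW distances between three sequences.
example_dist = [[0.0, 1.0, 4.0],
                [1.0, 0.0, 3.0],
                [4.0, 3.0, 0.0]]
example_sim = similarity_matrix(example_dist)
```

Because the AP responsibility only involves differences of similarities, adding the same constant to every entry of the matrix (diagonal included) does not change the clustering, so the particular offset chosen in the normalization is not critical.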

AP Based Clustering
AP is a clustering algorithm based on the concept of "message passing" between data points. Unlike traditional clustering algorithms such as K-means and KNN, AP does not require the number of clusters to be determined or estimated beforehand. At the beginning, all data points (a data point in this paper is actually a sequence) are regarded as potential exemplars. Each data point competes to become a cluster center through two types of message passing, i.e., the responsibility and the availability. This mechanism avoids the influence of a random selection of initial cluster centers and of a pre-specified number of clusters. The responsibility $r_{ik}$ represents the message passed from data point $i$ to the candidate center $k$, which describes the appropriateness of data point $k$ to be the cluster center of data point $i$. Conversely, the availability $a_{ik}$ represents the message passed from the candidate cluster center $k$ to data point $i$, which describes the appropriateness of data point $i$ choosing data point $k$ as its cluster center. The bigger the sum of $r_{ik}$ and $a_{ik}$, the more likely it is for data point $k$ to be a cluster center. $r_{ik}$ and $a_{ik}$ are initialized to 0 at the beginning. After obtaining the similarity matrix $S$, $r_{ik}$ and $a_{ik}$ are computed as follows:

$$r_{ik} = s_{ik} - \max_{k' \neq k} \{ a_{ik'} + s_{ik'} \} \quad (5)$$

$$a_{ik} = \begin{cases} \min\{0,\; r_{kk} + \sum_{i' \notin \{i,k\}} \max\{0, r_{i'k}\}\}, & i \neq k \\ \sum_{i' \neq k} \max\{0, r_{i'k}\}, & i = k \end{cases} \quad (6)$$

The responsibility and the availability are then updated iteratively as follows:

$$r_{ik} = \lambda r_{ik}^{old} + (1 - \lambda) r_{ik} \quad (7)$$

$$a_{ik} = \lambda a_{ik}^{old} + (1 - \lambda) a_{ik} \quad (8)$$

where $\lambda$ is the damping coefficient, which has a value in the range of 0.5 to 1. The iteration terminates when a maximum number of steps is reached or the changes in the values of $r_{ik}$ and $a_{ik}$ are no longer significant.
If for some battery $k$ we have $e_{kk} = r_{kk} + a_{kk} > 0$, then battery $k$ is a cluster center, and the batteries that choose it as their center are marked as its members.
More details on the AP algorithm can be found in [18].
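The message passing can be sketched in a few lines of plain Python. This is a minimal illustration of the standard responsibility/availability updates with damping, not the implementation used for the experiments: it runs a fixed number of iterations instead of testing for convergence, it assigns each non-exemplar to its most similar exemplar, and the function name, the default parameters and the one-dimensional toy data are all assumptions.

```python
def affinity_propagation(sim, damping=0.5, max_iter=200):
    """Return (exemplars, labels) for an n x n similarity matrix (n >= 2).

    sim[k][k] holds the self-similarity (preference) of point k.
    """
    n = len(sim)
    r = [[0.0] * n for _ in range(n)]  # responsibilities r_ik
    a = [[0.0] * n for _ in range(n)]  # availabilities a_ik
    for _ in range(max_iter):
        # Responsibility: how well k suits i compared with its best rival.
        for i in range(n):
            for k in range(n):
                rival = max(a[i][j] + sim[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (sim[i][k] - rival)
        # Availability: accumulated positive support for k from other points.
        for k in range(n):
            positive = [max(0.0, r[i][k]) for i in range(n)]
            support = sum(positive) - positive[k]  # sum over i' != k
            for i in range(n):
                if i == k:
                    target = support
                else:
                    target = min(0.0, r[k][k] + support - positive[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * target
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    if not exemplars:
        return exemplars, []
    labels = [i if i in exemplars else max(exemplars, key=lambda k: sim[i][k])
              for i in range(n)]
    return exemplars, labels


# Toy data: two well-separated groups on a line (made-up values).
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
n_points = len(points)
toy_sim = [[-(p - q) ** 2 for q in points] for p in points]
for i in range(n_points):
    # Self-similarity = mean similarity to the other points.
    toy_sim[i][i] = sum(toy_sim[i][j] for j in range(n_points) if j != i) / (n_points - 1)
exemplars, labels = affinity_propagation(toy_sim)
```

Each label is the index of the exemplar that the point is assigned to, so points sharing a label belong to the same group. The number of groups is not passed in anywhere; it emerges from the self-similarities on the diagonal.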

Experimental Results
In this part, two types of experiments were performed. In the first experiment, the proposed battery grouping method was evaluated with real battery charge/discharge sequences. We obtained both the charge and discharge sequences for 94 new LiMn2O4 lithium-ion batteries with a nominal capacity of 2 Ah, though the proposed method can also be applied to other types of batteries. The charge curves were obtained when the batteries were charged with a constant current of 1.2 A until the charge cutoff voltage was reached, and then charged under a constant voltage until the current dropped to 0.02 A. The discharge curves were obtained when the batteries were discharged at a constant current of 1.2 A. We then utilized the proposed AP based method for battery grouping. For comparison, we also tested the K-means based spectral clustering method [22]. We used the silhouette index [23] to assess the performance of the methods.
The silhouette indexes for the charge sequences, for different numbers of clusters and different algorithms, are given in Table 1. According to the results in Table 1, for the charge curves, the best number of clusters is 6 for both the AP algorithm and the spectral clustering algorithm. The best silhouette index of the AP algorithm is larger than that of the spectral clustering algorithm. Figure 6 shows a visual display of the clustered similarity matrices when the charge curves are used for battery grouping. As observed from Figure 6, the clustered similarity matrix for the AP algorithm is much more diagonalized than that of the spectral clustering algorithm. Both Table 1 and Figure 6 show better performance with the proposed AP based algorithm.
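For readers unfamiliar with the silhouette index, the following sketch computes the mean silhouette value directly from a precomputed distance matrix (for example, DTW distances) and a list of cluster labels. It follows the usual definition, in which each point compares its mean distance to its own cluster with its mean distance to the nearest other cluster; the function name and the toy data are hypothetical, and at least two clusters are assumed.

```python
def silhouette_index(dist, labels):
    """Mean silhouette value over all points, from a distance matrix.

    Requires at least two clusters; points in singleton clusters score 0.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        total += (b - a) / max(a, b)
    return total / len(labels)


# Toy data: pairwise distances between six points on a line (made-up values).
positions = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(p - q) for q in positions] for p in positions]
good_score = silhouette_index(toy_dist, [0, 0, 0, 1, 1, 1])
bad_score = silhouette_index(toy_dist, [0, 1, 0, 1, 0, 1])
```

Values close to 1 indicate compact, well-separated clusters, which is why the grouping with the largest silhouette index is taken as the best one when comparing different numbers of clusters.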

The clustering result for the charge curves based on the proposed AP based algorithm is shown in Figure 7, which also shows the effectiveness of the proposed algorithm.
The silhouette indexes for the discharge sequences, for different numbers of clusters and different algorithms, are given in Table 2. According to the results in Table 2, for the discharge curves, the best number of clusters for the AP algorithm is 5, while the best number of clusters for the spectral clustering algorithm is 2. The best silhouette index of the AP algorithm is still larger than that of the spectral clustering algorithm. Figure 8 shows a visual display of the clustered similarity matrices when the discharge curves are used for battery grouping. As observed from Figure 8, the clustered similarity matrix for the AP algorithm is much more diagonalized than that of the spectral clustering algorithm. Both Table 2 and Figure 8 also show better performance with the proposed AP based algorithm.
The clustering result for the discharge curves based on the proposed AP based algorithm is shown in Figure 9, which also shows the effectiveness of the proposed algorithm.
In the second experiment, we built two battery packs, both containing four new battery cells. The four battery cells in the first battery pack were chosen from one of the clusters obtained with the proposed algorithm. The battery cells in the second battery pack were chosen to have similar discharge cutoff terminal voltages and charge static terminal voltages, a criterion utilized by many manufacturers. The two battery packs were then charged and discharged for many cycles under the same conditions. Figure 10 shows the constant charge along with the voltage drop curve [24] at the 300th cycle for the battery cells in the different packs. As observed from the figure, the characteristics of the battery cells in the first pack remain consistent, while the characteristics of the battery cells in the second pack are inconsistent. This shows that battery grouping with the whole charge series is superior to that with only two distinct voltage values.


Conclusions
Battery grouping is a widely used technology to improve the performance of battery packs. In this paper, we propose a time series clustering based battery grouping method. The proposed method utilizes the whole battery charge/discharge sequence for battery grouping. The time sequences are first denoised with a wavelet denoising technique, the similarity matrix is then computed with the DTW distance, and finally, the time series are clustered with the AP algorithm according to the calculated similarity matrices. The silhouette index is utilized for assessing the performance of the proposed battery grouping method. Test results show that the proposed battery grouping method is effective.
In this paper, we performed clustering based on either the charge or the discharge time series of the batteries, and an extension to clustering based on both the charge and discharge time series is straightforward. However, the results in Tables 1 and 2 show that different clusters are obtained for the same batteries when the charge and the discharge time series are utilized. Therefore, research on a strategy to combine the charge and discharge time series is worthwhile. On the other hand, the time series used for battery grouping in this paper were obtained under constant charge/discharge rates (0.6C for both charge and discharge). In the future, research on the influence of different charge/discharge rates on the performance of grouped battery packs may be valuable.

Figure 2. An illustration of the battery discharge curves.

Figure 3. Flow chart of the proposed method.

Figure 6. Clustered similarity matrices for the charge curves; (a) The AP algorithm; (b) The spectral clustering algorithm.

Figure 7. Clustering result for the charge curves with the proposed method.

Figure 8. Clustered similarity matrices for discharge curves; (a) The AP algorithm; (b) The spectral clustering algorithm.

Figure 9. Clustering result for discharge curves with the proposed method.

Figure 10. Constant charge along with the voltage drop characteristics of battery cells for two different battery packs; (a) cells grouped with similar charge curves; (b) cells grouped with similar terminal voltages.

Table 1. Silhouette indexes for charge curves.

Table 2. Silhouette indexes for the discharge curves.