A State of Health Estimation Framework for Lithium-Ion Batteries Using Transfer Components Analysis

Abstract: As different types of lithium batteries are increasingly employed in various devices, it is crucial to predict the state of health (SOH) of lithium batteries. There are plenty of methods for SOH estimation of a lithium-ion battery. However, existing technologies are often computationally complex. Furthermore, it is difficult to use only the first 30% of the battery degradation data to predict the SOH variation over the entire degradation process. To address this problem, in this paper the SOH of the target battery is estimated based on the transfer of different battery data sets. Firstly, according to importance sampling (IS), valid features are extracted from the charging-voltage cycles of both the source and target battery. Secondly, transfer component analysis (TCA) is used to map the source data set to the target data set. Moreover, an extreme learning machine (ELM) algorithm is employed to train a single hidden layer feed-forward neural network (SLFN) for its fast training speed and ease of setup. Finally, validation experiments and comparisons of the results are conducted. The results show that the proposed framework has a good capability of predicting the SOH of lithium batteries.


Introduction
In recent decades, lithium batteries, as the power units of numerous mobile terminals, have become increasingly widespread in various electronic devices, especially electric vehicles (EVs) and energy systems [1]. However, the capacity of lithium-ion batteries drops off with charge and discharge operations. In order to ensure the stability of devices, SOH estimation technologies are often adopted to assess the current degradation degree of the lithium-ion battery [1][2][3][4]. Predicting the SOH of a battery is therefore of essential significance. There are many kinds of valid SOH estimation methods for a lithium-ion battery [2][3][4]. However, in practical applications, it is still challenging to predict the long-term SOH of the battery when the battery has only been used for 20% to 50% of its life.
In this paper, the SOH of the lithium-ion battery refers to the ratio of the battery's discharge capacity to the rated capacity of a new battery under certain conditions, which is shown as follows:

SOH = C_current / C_rated × 100%,

where C_current is the currently measurable discharge capacity and C_rated is the rated capacity of the new battery.

In the past decades, extensive research has been conducted into the SOH of batteries. Battery SOH prediction can be divided into two categories: adaptive methods and data-driven methods [2][3][4]. Adaptive prognostic approaches, such as the Kalman filter (KF) and particle filter (PF), estimate the battery state recursively from measurement data. In the last few decades, many improved algorithms based on gradient descent (GD) have also been developed; however, a back propagation neural network (BPNN) with an improved, more efficient training algorithm cannot escape the limitations of BPNN: the model still requires a long time of iterative training. Thus, ELM is introduced instead of GD to train an FFNN because of its fast training speed and low calculation cost.
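As an illustration of the SOH ratio defined above, a minimal sketch (the capacity values are hypothetical):

```python
def soh(discharge_capacity_ah, rated_capacity_ah):
    """State of health: current discharge capacity over rated capacity, in percent."""
    return 100.0 * discharge_capacity_ah / rated_capacity_ah

# a cell rated at 2.0 Ah that now delivers 1.4 Ah is at roughly 70% SOH,
# a common end-of-life threshold
print(soh(1.4, 2.0))
```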
Transfer learning has been an emerging learning method in recent years. In traditional machine learning, the training data and the test data are required to be identically distributed; otherwise the prediction result may shift [17,18]. Transfer learning arose precisely to solve this drift problem. In transfer learning, there are two types of data with similar distributions. The data set that contains known knowledge is called the source domain and is the object to be transferred; the other data set, whose labels are unknown but which is similar to the source domain, is called the target domain and is the target of the transfer. The goal of transfer learning is to use knowledge the source domain gains in solving its task to improve the target task.
The remainder of this paper is structured as follows. In Section 2, a domain adaptation algorithm and a triple-layer FFNN training method are introduced, after which the composition of the framework, called transfer component analysis (TCA)-extreme learning machine (ELM), is presented. The experimental data set is introduced in Section 3. Experimental design ideas, experimental results, and corresponding analysis are presented in Section 4. Conclusions are drawn in Section 5.

Applied Approaches
This section is used to introduce the approaches that are applied to exploit the knowledge of the complete degradation data of a source battery for the prediction of target battery SOH; it is composed of two parts. The first part introduces usage of domain adaptation method. The second part introduces the neural network training algorithm used in this paper.

Domain Adaptation
Pan et al. [19] proposed TCA, which is a symmetric feature-based DA approach that aims to reduce the difference of feature data between the source domain and target domain [17,18].
These two domains have the same feature space and similar, but distinct, marginal distributions. As shown in Figure 1, the principle of TCA is to map the feature data of two differently distributed domains onto a general latent feature space based on a reproducing kernel Hilbert space (RKHS). The difference between the two domains is measured by the maximum mean discrepancy (MMD), which is defined as follows:

Dist(X_s, X_t) = ‖ (1/m_1) Σ_{i=1}^{m_1} f(x_si) − (1/m_2) Σ_{j=1}^{m_2} f(x_tj) ‖_H²,

where H represents the RKHS, x_si indicates the ith sample of the source domain, x_tj indicates the jth sample of the target domain, m_1 and m_2 are the numbers of source and target samples, and f(·) indicates the mapping function that maps feature data from the present feature space to the latent shared feature space. The main focal point of TCA is how to find f(·) quickly and efficiently; there could be numerous candidates for f(·). Therefore, by introducing the kernel method, Pan amended Equation (2) into the following:

Dist(X_s, X_t) = tr(KL),   K = [K_s,s, K_s,t; K_t,s, K_t,t],

where K_s,s is the kernel matrix of the source domain data obtained through the mapping of a kernel function k(·,·); K_s,t, K_t,s, and K_t,t are obtained by analogy with K_s,s; and L_ij = 1/m_1² if x_i, x_j ∈ X_s, L_ij = 1/m_2² if x_i, x_j ∈ X_t, and L_ij = −1/(m_1 m_2) otherwise.
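The kernelized MMD above can be sketched in a few lines of NumPy. The RBF kernel and its bandwidth here are illustrative assumptions, not prescribed by the paper:

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Gaussian (RBF) kernel matrix between row-sample matrices A and B
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def mmd_trace(Xs, Xt, sigma=1.0):
    """Squared MMD computed as tr(KL), with L built from the domain sizes."""
    m1, m2 = len(Xs), len(Xt)
    X = np.vstack([Xs, Xt])
    K = rbf_kernel(X, X, sigma)          # [[K_ss, K_st], [K_ts, K_tt]]
    L = np.empty((m1 + m2, m1 + m2))
    L[:m1, :m1] = 1.0 / m1**2
    L[m1:, m1:] = 1.0 / m2**2
    L[:m1, m1:] = L[m1:, :m1] = -1.0 / (m1 * m2)
    return np.trace(K @ L)

# identical domains give (near-)zero discrepancy; a shifted domain gives a larger one
rng = np.random.default_rng(0)
Xs = rng.normal(0, 1, (50, 5))
assert mmd_trace(Xs, Xs) < 1e-8
assert mmd_trace(Xs, Xs + 2.0) > mmd_trace(Xs, Xs)
```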


Figure 1. The feature data of the source domain and the target domain are mapped by a transformation function into a common latent feature space.
After this transformation, the problem has changed into a semi-definite programming (SDP) problem. As the SDP formulation is time-consuming to solve, Pan introduced a dimensionality reduction method to construct the desired result [19]. Schölkopf et al. [20] proposed a method for reducing the dimensionality of the kernel matrix for nonlinear principal component analysis (PCA). The kernel matrix can accordingly be decomposed as

K = (K K^{−1/2})(K^{−1/2} K).

The dimension of the reduced space is d ≪ m_1 + m_2. In that way, a (m_1 + m_2) × d matrix W̃, which maps the corresponding source and target domain feature vectors to the low-dimensional space, can be constructed, and the kernel matrix can be rewritten as

K̃ = (K K^{−1/2} W̃)(W̃^T K^{−1/2} K) = K W W^T K,   with W = K^{−1/2} W̃.

In particular, W = K^{−1/2} W̃ is a lower-dimensional solution than K. Equation (3) can then be mathematically expressed as

Dist(X_s, X_t) = tr((K W W^T K) L) = tr(W^T K L K W).

In order to optimize this objective and reduce the complexity of the matrix, the regularization term μ tr(W^T W) is added by Pan [19]. At this point, TCA can be summarized as follows:

min_W  tr(W^T K L K W) + μ tr(W^T W)   s.t.   W^T K H K W = I,

where I is a (m_1 + m_2) × (m_1 + m_2) unit matrix; H = I − (1/(m_1 + m_2)) A is a centering matrix, with A the (m_1 + m_2) × (m_1 + m_2) all-one matrix. The additional constraint W^T K H K W = I is used to maintain the diversity of the mapped data by limiting its divergence.
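The constrained objective above admits a closed-form solution: W is formed from the leading eigenvectors of (KLK + μI)^{−1}KHK [19]. A minimal NumPy sketch, using an illustrative linear kernel (the kernel choice, dimensions, and data are placeholders):

```python
import numpy as np

def tca_fit(Xs, Xt, dim=2, mu=1.0):
    """Return mapped source/target data; W = top-d eigenvectors of (KLK + mu*I)^-1 KHK.
    A linear kernel K = X X^T is used here for simplicity."""
    m1, m2 = len(Xs), len(Xt)
    n = m1 + m2
    X = np.vstack([Xs, Xt])
    K = X @ X.T                                   # kernel matrix
    L = np.empty((n, n))                          # MMD coefficient matrix
    L[:m1, :m1] = 1.0 / m1**2
    L[m1:, m1:] = 1.0 / m2**2
    L[:m1, m1:] = L[m1:, :m1] = -1.0 / (m1 * m2)
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix H = I - A/(m1+m2)
    M = np.linalg.solve(K @ L @ K + mu * np.eye(n), K @ H @ K)
    vals, vecs = np.linalg.eig(M)
    W = np.real(vecs[:, np.argsort(-np.real(vals))[:dim]])  # top-d eigenvectors
    Z = K @ W                                     # samples mapped into the shared space
    return Z[:m1], Z[m1:]

rng = np.random.default_rng(1)
Zs, Zt = tca_fit(rng.normal(0, 1, (30, 8)), rng.normal(1, 1, (40, 8)), dim=3)
print(Zs.shape, Zt.shape)   # (30, 3) (40, 3)
```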

Training Algorithm of Neural Network
ELM, which is an emerging neural network (NN) training algorithm for the single hidden layer feed-forward neural network (SLFN), was proposed by Huang et al. [21,22]. It differs from the traditional training algorithm of an SLFN: because there is no iteration in ELM-trained NNs, it achieves a fast training speed. In this paper, a distinct sample is denoted as (x_i, y_i), where x_i = [x_i1, . . . , x_ip, . . . , x_in] ∈ R^n and y_i is the corresponding output. The structure of ELM is shown in Figure 2.

The input layer has n input neurons, corresponding to the n-dimensional features of a sample. The hidden layer has L hidden neurons, with L ≤ N, where N denotes the number of samples. w_pj denotes the weight between the pth input layer neuron and the jth hidden layer neuron, b_j denotes the bias of the jth hidden layer neuron, and β_j denotes the weight between the jth hidden layer neuron and the output layer neuron. With activation function g(·), this model can be summarized as follows:

y_i = Σ_{j=1}^{L} β_j g( Σ_{p=1}^{n} w_pj x_ip + b_j ),   i = 1, . . . , N,

or, in matrix form, Hβ = T, where H is the hidden-layer output matrix, β the output-weight vector, and T the target vector. Training an SLFN is then equivalent to finding the least-squares solution β̂ = H† T of this problem, where H† is the Moore–Penrose pseudoinverse of the matrix H. The resulting β̂ is the global optimal solution.
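The closed-form ELM training described above amounts to a random hidden layer plus one pseudoinverse. A minimal sketch (sigmoid activation; the toy target function and sizes are illustrative):

```python
import numpy as np

def elm_train(X, T, L=25, seed=0):
    """Draw random input weights/biases, then solve beta = pinv(H) @ T (Moore-Penrose)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], L))      # input-to-hidden weights (never iterated on)
    b = rng.normal(size=L)                    # hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))    # sigmoid hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T              # closed-form least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# fit a smooth 1-D target in one shot -- no iterative training involved
X = np.linspace(0.0, 1.0, 100)[:, None]
T = np.sin(2 * np.pi * X[:, 0])
W, b, beta = elm_train(X, T)
pred = elm_predict(X, W, b, beta)
```

The least-squares solution guarantees the training error is no worse than that of the all-zero output weights, which is what makes the single pseudoinverse step sufficient.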

Framework for Prediction
In this section, the first part is used to discuss the feasibility of this work. In the second part, the proposed framework shown in Figure 6 will be illustrated step by step.
Wu et al. [16] applied IS as a feature selection scheme that extracts data from the charging voltage. Because the duration of the charging operation is relatively long, it is convenient to measure, and the curve is relatively stable; selecting the feature data from the charging-voltage data is therefore reliable and relevant. IS is thus chosen as the feature selection scheme in this paper. The specific selection scheme is shown in Figure 3. The voltage values at the ratios 1/3, 1/2, 2/3, 13/18, 7/9, 5/6, 8/9, 33/36, 17/18, and 35/36 of each charging voltage curve are selected as the feature data.
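The selection scheme above can be sketched as follows, under the simplifying assumption that each charging curve is available as a uniformly time-sampled voltage array:

```python
import numpy as np

# fractional positions along each charging-voltage curve, as listed above
IS_FRACTIONS = [1/3, 1/2, 2/3, 13/18, 7/9, 5/6, 8/9, 33/36, 17/18, 35/36]

def extract_features(voltage_curve):
    """Pick the voltage value at fixed fractional positions of one charging curve."""
    v = np.asarray(voltage_curve)
    idx = [min(int(round(f * (len(v) - 1))), len(v) - 1) for f in IS_FRACTIONS]
    return v[idx]

# a synthetic monotone charging curve: later fractions yield higher voltages
curve = np.linspace(3.0, 4.2, 600)
feats = extract_features(curve)
assert feats.shape == (10,)
assert np.all(np.diff(feats) > 0)
```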
Although each battery has a different capacity degradation curve, these curves follow a similar degradation pattern. Most importantly, the charging curves of different batteries are similar even when the number of cycles differs. Namely, the distribution of the feature data is similar, but the values are different, as shown in Figures 4 and 5. This satisfies the premise of feature-based DA [17]. Therefore, it is possible and valid to predict the SOH of the target battery by taking advantage of different types of batteries through the domain adaptation technique.
Figure 4. Charging voltage curves of battery B0005 at different cycles (e.g., cycles 7, 88, and 204).
Figure 5. Comparison of (a) the first and (b) the fifth feature. The first feature refers to the voltage value at 1/3 of each charging curve, and the fifth feature to the voltage at 5/9. This means that although the battery data sets are different, the charging voltage has a similar distribution during the charging process of the battery.
DA essentially corrects the distributions of the source and target domain data, keeping them consistent while still distinct [19]. After this processing, the source domain data can be regarded as identically distributed with the target domain data under the new feature space. Hence, these data can be used as a training set for a machine learning model. After that, the ELM algorithm is used as the neural network training algorithm to train a triple-layer FFNN because of its extremely fast training speed. Eventually, the source domain data are used as the training set, and the rest of the target domain data are used as the test set to prove the validity of the proposed framework through experiments.
As shown in the framework of Figure 6, the deployment of the algorithm will be explained step by step.

1.
To begin with, the complete degradation data of the source battery and the early portion of the target battery data are needed. Features are extracted from the raw charging-voltage data of the batteries according to the IS scheme. As shown in Figure 4, the data in the middle and posterior parts, especially the latter part, have relatively large differences.

2.
The employment of the TCA algorithm is shown in this step. The calculation process of TCA is described in Section 2.1. First, the inputs are the two feature matrices from the source and target domains. Then, the L and H matrices are calculated according to Equations (5) and (9). Next, a kernel function must be chosen to calculate the K matrix. In this paper, the Gaussian radial basis function (RBF) kernel is chosen for its lower computational cost and shorter computation time:
K(x, y) = exp(−‖x − y‖² / (2σ²)),   x, y ∈ {src, tar},

where σ is the bandwidth of the Gaussian kernel function. Then, the eigenvectors corresponding to the top d eigenvalues of (KLK + μI)^{−1} KHK form the transform that yields the mapped source and target domain data we need.
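The Gaussian kernel above, as a one-liner for two feature vectors (the sample vectors are illustrative):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return np.exp(-np.dot(d, d) / (2.0 * sigma**2))

x = np.array([1.0, 2.0])
assert gaussian_kernel(x, x) == 1.0                              # identical points map to 1
assert gaussian_kernel(x, x + 3.0) < gaussian_kernel(x, x + 1.0) # similarity decays with distance
```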

3.
After TCA processing, the mapped source domain data, the mapped target domain data, and the function used to map newly arriving target domain data are obtained. Among them, the labels of the mapped target domain data remain unknown, because the SOH of the target battery cannot be obtained at this moment. For this reason, this part of the data is not used. On the other hand, the source domain data are complete after being mapped and can therefore be used to train an effective model.

4.

In this step, an SLFN model based on the ELM training algorithm described in Section 2.2 is trained. In this paper, the activation function of the hidden layer is the sigmoid function g(x) = 1/(1 + e^{−x}). Because this is a regression problem, the output layer of the network has only one neuron, whose output is the predicted value. In the context of our application of ELM to predict lithium battery SOH, the output y_i for any mapped source domain sample x_i can be given as follows:

y_i = Σ_{j=1}^{M} β_j g(w_j · x_i + b_j),

where M is the number of hidden layer neurons in the SLFN and needs to be manually selected.

5.
In this step, newly arriving target domain data are predicted by the previously trained model. The trained model has a different number of input neurons than the raw target domain data. Therefore, newly arriving target domain data must first be mapped by TCA. In an experimental environment, the data sets in use integrate data of both the source and target domains, which can be mapped together by TCA at once. However, in the actual application process this is not feasible: the newly generated target domain data need to be mapped into the new space by the mapping function F from Step 3. After the new data are mapped to the previous latent space, the model can be used to obtain the desired SOH predictions.
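Step 5's out-of-sample mapping can be sketched as follows. The kernel and the stand-in matrices are illustrative; W stands for the transform learned when TCA was fitted on the combined source and target samples:

```python
import numpy as np

def map_new_samples(X_new, X_train, W, kernel):
    """Project newly arrived target-domain samples into the latent space learned by
    TCA: z_new = k(x_new, X_train) @ W. `kernel(A, B)` returns the cross-kernel matrix
    between row-sample matrices; W is the (m1+m2) x d transform from training."""
    K_new = kernel(X_new, X_train)   # kernel evaluations against ALL training samples
    return K_new @ W

# illustrative linear kernel and random stand-ins for the trained quantities
kernel = lambda A, B: A @ B.T
rng = np.random.default_rng(2)
X_train = rng.normal(size=(70, 8))   # source + target samples used to fit TCA
W = rng.normal(size=(70, 3))         # stand-in for the learned transform
Z_new = map_new_samples(rng.normal(size=(5, 8)), X_train, W, kernel)
assert Z_new.shape == (5, 3)
```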

Battery Datasets and Experimental Setup
In order to verify the validity of the proposed framework, experiments are conducted for simulation and verification in this section. The experimental platform is Matlab 2017a and the computer system is Windows 10. The data sets used in the experiments are the Battery Data Set in the PCoE Datasets, provided by the Prognostics Center of Excellence (PCoE) at NASA's Ames Research Center for fault prediction and diagnostic studies [23]. The Battery Data Set contains battery degradation data under various operating conditions. In order to ensure the similarity of battery data across datasets, data collected under similar charging and discharging conditions were chosen. Therefore, in the Battery Data Set, sub-datasets B0005 and B0007 were selected as the experimental data. The two sub-data sets were collected at room temperature for batteries #5 and #7, respectively. They were charged to 4.2 V in a 1.5 A constant current mode and then charged continuously at 20 mA. Discharge was implemented at a constant current of 2 A until the voltage fell to 2.7 V and 2.2 V, respectively [23].
Data are collected at each cycle. These two data sets contain various types of data, such as voltage, current, time data, and impedance data collected in a lithium-ion battery aging experiment. Among these parameters, only the voltage data during the charging process were used in this paper. In the Battery Data Set, the EOL of the batteries is at 70% capacity. Capacity degradation curves are shown in Figure 7.
On the other hand, Oxford Battery Degradation Dataset 1 is another battery data set in use. This data set was collected in long-term battery ageing tests of 8 Kokam (SLPB533459H4) 740 mAh lithium-ion pouch cells [24]. It contains eight sub-data sets, tested under the same charge and discharge conditions for the same eight lithium battery cells at 40 °C. Charging is carried out in a constant-current, constant-voltage mode; data are collected every 100 cycles. There are approximately 4500 or 8000 cycles in the sub-data sets.
In Oxford Battery Degradation Dataset 1, the end of life (EOL) of the battery is 75% of the initial capacity.
It can be distinctly seen in Figure 7 that the capacity degradation curves of Cell1, Cell3, Cell7, and Cell8 are similar. The same is true for the other data types of these batteries. They are therefore a better choice than the others.
Considering that the EOL of the batteries in the PCoE Datasets and Oxford Battery Degradation Dataset 1 differ, and the numbers of cycles are also different, it is necessary to unify the two data sets. Therefore, data are selected before the capacity falls to 75% in the PCoE Datasets.


Results and Corresponding Discussion
This section is divided into four parts. Section 4.1 demonstrates the effectiveness of ELM in this predictive task. Section 4.2 demonstrates the validity of the final proposed transfer framework. After proving the validity of the framework, Section 4.3 starts with the data and briefly explores how the data can improve the accuracy of the prediction. Section 4.4 explores the impact of the amount of target-domain battery data on the prediction results; more data means a deeper degradation of the target battery.
The criteria for evaluating the error are the mean absolute error (MAE) and the root mean squared error (RMSE). The expressions are as follows:

MAE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i|,   RMSE = √( (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² ),

where y_i is the measured SOH, ŷ_i is the predicted SOH, and n is the number of test samples.
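The two criteria can be computed directly (the sample values are illustrative):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
assert abs(mae(y_true, y_pred) - 1/3) < 1e-12          # mean of |0|, |0|, |1|
assert abs(rmse(y_true, y_pred) - (1/3) ** 0.5) < 1e-12  # sqrt of mean of 0, 0, 1
```

RMSE penalizes large deviations more heavily than MAE, which is why the paper reports both.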

ELM Effectiveness Experiments
There are four sets of repeated experimental sub-data sets in Oxford Battery Degradation Dataset 1. This data set meets the same data distribution requirements of traditional machine learning. Therefore, this data set is used to verify the validity of the ELM algorithm on this issue. The same is true for selected PCoE data sets.
An ELM model is trained using all stages of Cell1, and the latter parts of Cell3, Cell7, and Cell8 are then used as the test set. The experimental results are shown in Table 1 and Figure 8a. In Figure 8c,d, the same experiments are carried out on the PCoE dataset. The experimental results on these two data sets prove that ELM performs well on this prediction task. Figure 8b shows that if only the first half of the data is used for training, it is hard to obtain a network that can predict SOH; the testing set used there is also from Cell1.


Transfer Experiment
In this section, the PCoE dataset and Oxford Battery Degradation Dataset 1 serve as the source and target domains, respectively, to demonstrate the validity of the framework.
Because of the randomness of ELM, it is rational to take the average of 100 runs under the same settings as the prediction result. Even with 100 runs, the total computation time is less than 1 s.
The categorical results are shown in Tables 2 and 3 and Figure 9. Table 2 shows the prediction results with Battery #05 as the source domain and the others as the target domain. Similarly, Table 3 shows the prediction results when Battery #07 is used as the source domain. Furthermore, Figure 9 shows some of the results in Table 2. These experimental results prove that the proposed framework is effective. For fairness, in this experiment the TCA dimension was fixed at five and the number of neurons in the ELM hidden layer was fixed at four; these two parameters can be tuned in the neighborhood of these values. The following conclusions can be drawn by comparing Figure 8a with Figure 9a,b. As shown in Table 1, the MAE of the experiments in Figure 8a is 1.59% and the RMSE is 3.19%. As shown in Table 2, the MAE of the experiments in Figure 9a is 2.10% and the RMSE is 3.51%. This means that the knowledge transferred from other datasets can achieve an accurate SOH prediction for target batteries of different types or working conditions.


Discussion of PCoE Dataset
In Figure 5a,b, it is observed that for Battery #05 and #07, approximately the first 20% of the data distribution is significantly different. This experiment discusses whether that first 20% of the data results in a worse model. The results of using the reduced Battery #05 data for Cell1 SOH prediction are shown in Figure 10. A conclusion can be drawn from the comparison between the MAE and RMSE recorded in Figures 8c and 10. In the case of using Battery #05 to predict the SOH of Cell1, the MAE dropped by 0.45 percentage points, about 21.4%. The results on the other datasets also declined by about 20%. This means that data screening is advantageous in this work.

Percentage of Mapping Data
In this part, experiments show how the final prediction changes when different percentages of the battery-life data are used as the target domain. Battery #05 data are selected as the source domain and Cell1 data as the target domain. The results are shown in Table 4. As the target domain data become more abundant, the overall trend of the prediction results improves. However, the target domain data cannot be increased arbitrarily; predicting the battery's SOH becomes meaningless when 90% of the battery's useful life has already been used.



Conclusions
Because of the similarity of the feature-data distribution of the charging voltage in different types of batteries, TCA is introduced to take advantage of similar knowledge hidden in the source domain. Moreover, because the mapping function F used for mapping can also handle newly arriving data, the proposed framework supports online processing. Utilizing B0005 to predict the SOH of Cell1 achieves an MAE of 1.65% and an RMSE of 2.84%. This shows that it is possible to utilize battery data from a distinct domain to predict the target-domain battery SOH, and the prediction results are accurate. In conclusion, the main contributions of this paper are as follows: (1) The ELM training algorithm is used instead of BPNN. This replacement greatly reduces the computational complexity and improves the model training speed without sacrificing prediction accuracy.
(2) In combination with TCA, a framework is proposed that is able to predict the SOH of the target battery by using the degradation data of other batteries. Even when the battery has only been used for 30% of its life, its SOH can be predicted to some extent.

Author Contributions:
This work was carried out in collaboration between all authors. B.J. proposed and validated this model. Y.G. provided the necessary battery data and added the diversity of the experiment. L.W. provided the necessary help for the implementation of the experiment setup.