Source Diagnosis of Solid Oxide Fuel Cell System Oscillation Based on Data Driven

The solid oxide fuel cell (SOFC) is a new energy technology that has the advantages of low emissions and high efficiency. However, oscillation and propagation often occur during the power generation of the system, which causes system performance degradation and reduced service life. To determine the root cause of multi-loop oscillation in an SOFC system, a data-driven diagnostic method is proposed in this paper. In our method, kernel principal component analysis (KPCA) and transfer entropy were applied to the system oscillation fault location. First, based on the KPCA method and the Oscillation Significance Index (OSI) of the system process variable, the process variables that were most affected by the oscillations were selected. Then, transfer entropy was used to quantitatively analyze the causal relationship between the oscillation variables and the oscillation propagation path, which determined the root cause of the oscillation. Finally, Granger causality (GC) analysis was used to verify the correctness of our method. The experimental results show that the proposed method can accurately and effectively locate the root cause of the SOFC system’s oscillation.


Introduction
The solid oxide fuel cell (SOFC) offers low emissions, noise-free operation, and highly efficient power generation. These cells have broad application prospects and have attracted widespread attention at home and abroad [1,2]. However, SOFC system development in China is still at an early stage. The preliminary exploration of integration and control of an SOFC system has been completed in China, and a kW-class SOFC independent power generation system prototype has been successfully built. Because temperature, pressure, flow, and other factors are unstable in the actual power generation process, the corresponding electrochemical reaction is affected, which reduces the SOFC system's performance [3]. An SOFC system contains multiple subsystems, each of which contains multiple variables [4]. Therefore, when one part oscillates, it affects other parts and even the entire system, degrading the performance and life of the entire system [5]. To reduce the performance degradation caused by oscillation and extend the SOFC's service life, it is particularly important to diagnose the root cause of the SOFC system's oscillation.
Transfer entropy applies to both linear and nonlinear relationships between variables and can give more accurate results than Granger causality [26]. Transfer entropy has been successfully applied to oscillation diagnosis in chemical processes [27].
In this paper, a 1 kW steam reforming SOFC system was studied. To ensure good performance of the SOFC system, transfer entropy was used to diagnose the root cause of system oscillation, combining data-driven analysis with expert experience. Since some variables in the system were not related to the oscillation problem, they could be eliminated during the analysis. Because transfer entropy is computationally expensive, it was necessary to select the characteristic variables first. Selecting characteristic variables also makes the oscillation root causes easier to interpret. Therefore, the selection of characteristic variables is important in analyzing and diagnosing the root cause of an SOFC system's oscillation. PCA is a common data dimensionality reduction method, but it is only applicable to linear data that obey a Gaussian distribution. The nonlinear correlation between samples may be lost after linear dimensionality reduction using PCA. Therefore, a method combining kernel principal component analysis (KPCA) and the Oscillation Significance Index (OSI) was first applied to select characteristic variables [23]. Then, transfer entropy, combined with the system connectivity rules, was used to quantitatively analyze the causal relationship between the oscillation variables and to determine the root cause of the system oscillation. Finally, the Granger causality analysis method was used to verify the effectiveness of our method.

Description of the SOFC System
The schematic diagram of the 1 kW SOFC system used in the experiment is shown in Figure 1. The system consists of an SOFC stack, a reformer subsystem, a heat exchanger subsystem, a fuel supply subsystem, an air supply subsystem, a water tank, a combustion chamber, a power converter, and a control system. The distribution of the main sensors is also given. These sensors measure a total of 24 process variables, including the gas flow, pressure, and internal temperature of each component. The SOFC system uses methane as fuel, and water is provided by a water tank. The reforming reaction between fuel and deionized water in the reformer generates hydrogen and carbon monoxide. This process requires a certain amount of water and heat. On the air side, a blower provides enough air to the system, and the air is heated by the heat exchanger to the required inlet set temperature. Once the fuel and air flows reach the SOFC stack, an electrochemical reaction occurs, generating the required electrical energy. For the power to reach the end user, a power converter is required to convert DC power to AC power. The gas leaving the stack is burned in the combustion chamber, which reduces pollutant emissions. The heated gas then reaches the reformer, where it can be used to control the reforming reaction temperature. It then flows through the heat exchanger to heat the fresh air provided by the blower. In addition, the system has an automatic protective shutdown function, triggered by the upper and lower limits of temperature, flow, and pressure, and protective gas is introduced automatically after shutdown. Many variables are monitored throughout the power generation process, including the 24 process variables, some of which are described in Table 1. To evaluate the performance of the SOFC system and find the root cause of the system oscillation, it was necessary to obtain data for each variable of the SOFC system under all operating conditions.
The experimental phase included start, stop, standby, and long-term discharge phases. The 1 kW SOFC system with steam reforming included 36 measurement variables in the entire power generation process, and 24 variables were selected to form the main control loop of the SOFC system, including gas flow, pressure, and internal temperature measurement data for each component, as well as electrical characteristic data. We collected data for the 24 process variables from the system every 1 second for oscillation fault analysis. Figure 2 shows the normalized time trend curve of 20,000 sample data for 24 process variables. It can be seen from the figure that nearly half of the process variables had different degrees of oscillation.
Due to the large number of variables involved in the entire power generation process and the complex correlation of the control loop, the oscillation signal can be easily propagated in multiple control loops, so it is difficult to directly determine the root cause of the oscillation in the SOFC system.

Characteristic Variable Selection
Many of the process variables in the SOFC system have nothing to do with the oscillation problem. To improve the accuracy of the oscillation diagnosis and reduce the amount of transfer entropy calculation, it was necessary to first reduce the data dimension and select the characteristic variables. PCA is the most common data dimensionality reduction algorithm, but it can only handle linear data that obey a Gaussian distribution, and it often performs poorly on linearly inseparable data. Therefore, to handle the nonlinear correlation between data while reducing the data dimension, the KPCA method was used. The key to the KPCA method is to use a nonlinear mapping function to map the data set into a high-dimensional feature space, after which conventional PCA is carried out to select significant oscillation variables from the original process data [28,29].
Here we assume that the sample data are $X = \{x_i \mid x_i \in R^m\}$, $i = 1, 2, \ldots, n$, where n is the number of samples, m is the dimension of the samples, $\varphi$ is a nonlinear transformation, F is the mapping space, and the image of the original data in space F is $\varphi(x_i)$. When $\sum_{i=1}^{n} \varphi(x_i) = 0$, the covariance matrix of the mapped data is defined as follows:

$$C = \frac{1}{n}\sum_{i=1}^{n} \varphi(x_i)\,\varphi(x_i)^{T} \tag{1}$$

The eigendecomposition of the covariance matrix C is as follows:

$$\lambda V = C V \tag{2}$$

where $\lambda$ is an eigenvalue of the covariance matrix, $\lambda \geq 0$, and V is the corresponding eigenvector. The kernel function K is defined by the following:

$$K_{ij} = \varphi(x_i)\cdot\varphi(x_j) = \exp\left(-\frac{\lVert x_i - x_j\rVert^{2}}{2\sigma^{2}}\right) \tag{3}$$

In the equation, $K_{ij}$ is an element of the kernel matrix generated by the kernel function. There are many commonly used kernel functions. The kernel function used in this paper is the Gaussian radial basis kernel function; $\sigma$ is the width parameter of the function and controls its radial range of action.

Energies 2020, 13, 4069
The obtained eigenvalues are arranged in descending order, and the corresponding eigenvectors are orthonormalized. The first t components are extracted based on the principal component contribution rate, and finally the projection of the kernel matrix onto the retained eigenvectors is obtained. The cumulative variance contribution rate is the ratio of the sum of the variance contribution rates of the first t principal components to that of all n principal components, and it is directly proportional to the amount of information retained from the original data. Generally, when the cumulative variance contribution rate of the first t principal components (t < n) exceeds 85%, the original n-dimensional data can be characterized by the t-dimensional data, which achieves the purpose of dimensionality reduction [30]:

$$\frac{\sum_{i=1}^{t}\lambda_i}{\sum_{i=1}^{n}\lambda_i} > 85\%$$

Here $w_{ji}^{2}$ is a measure of the contribution of the original variable $x_j$ to the principal component $t_i$. According to the Oscillation Significance Index (OSI), process variables that contribute significantly to the principal components are selected to construct the subset used for oscillation fault analysis. In the OSI, $\lambda_i$ is the eigenvalue corresponding to a principal component in the oscillation source candidate set $O_{PC}$.
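The KPCA steps described above can be illustrated with a minimal NumPy sketch. It assumes the common Gaussian kernel form exp(-||x_i - x_j||^2 / (2 sigma^2)), centres the kernel matrix so that the mapped data have zero mean, and keeps the smallest number of components whose cumulative contribution rate exceeds a threshold. The function names, the default `sigma`, and the return layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_kernel_matrix(X, sigma):
    """Gaussian RBF kernel matrix; rows of X are samples (assumed kernel form)."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma**2))

def kpca(X, sigma=1.0, threshold=0.85):
    """Return (scores, eigenvalues, cumulative contribution rates, t)."""
    n = X.shape[0]
    K = rbf_kernel_matrix(X, sigma)
    # Centre the kernel matrix so the mapped data have zero mean in feature space.
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Symmetric eigendecomposition, then sort eigenvalues in descending order.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    # Smallest t whose cumulative contribution rate exceeds the threshold.
    ratio = np.cumsum(eigvals) / np.sum(eigvals)
    t = min(int(np.searchsorted(ratio, threshold, side="right")) + 1, n)
    # Scale eigenvectors so the feature-space directions have unit norm.
    alphas = eigvecs[:, :t] / np.sqrt(eigvals[:t])
    scores = Kc @ alphas  # projection of the kernel matrix on the kept eigenvectors
    return scores, eigvals, ratio, t
```

In this sketch the squared norm of each score column equals the corresponding eigenvalue, which gives a quick sanity check on the scaling. How the OSI is then computed from the retained components is not covered here.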

Transfer Entropy
Transfer entropy is an information-theoretic measure of the causal relationship between two variables. For two different time series X and Y, assuming that $x_{n+1}$ is influenced by the previous k states of the sequence X and the previous l states of the sequence Y, the transfer entropy from Y to X can be calculated using the following equation, which represents the reduction in the uncertainty of X when the past of Y is known [31]:

$$T_{Y\to X} = \sum p\left(x_{n+1}, x_n^{(k)}, y_n^{(l)}\right) \log \frac{p\left(x_{n+1}, x_n^{(k)}, y_n^{(l)}\right) p\left(x_n^{(k)}\right)}{p\left(x_{n+1}, x_n^{(k)}\right) p\left(x_n^{(k)}, y_n^{(l)}\right)} \tag{7}$$

In practice, k = 1 and l = 1 are the preferred choices, because they avoid the evaluation of complex high-dimensional probability density functions when calculating transfer entropy. This does not affect the use of transfer entropy to determine the directionality and correlation between variables [32]; therefore, Equation (7) can be simplified as follows:

$$T_{Y\to X} = \sum p\left(x_{n+1}, x_n, y_n\right) \log \frac{p\left(x_{n+1}, x_n, y_n\right) p\left(x_n\right)}{p\left(x_{n+1}, x_n\right) p\left(x_n, y_n\right)} \tag{8}$$

The reverse transfer entropy is as follows:

$$T_{X\to Y} = \sum p\left(y_{n+1}, x_n, y_n\right) \log \frac{p\left(y_{n+1}, x_n, y_n\right) p\left(y_n\right)}{p\left(y_{n+1}, y_n\right) p\left(x_n, y_n\right)} \tag{9}$$

Finally, the causality measure can be calculated by subtracting the effect of X on Y from the effect of Y on X, as in Equation (10):

$$t_{Y\to X} = T_{Y\to X} - T_{X\to Y} \tag{10}$$

$t_{Y\to X} > 0$ means that Y causes X, and $t_{Y\to X} < 0$ means that X causes Y. When there are multiple variables, the results can be collected in a causality matrix as follows:

$$\begin{bmatrix} 0 & t_{x_1\to x_2} & \cdots & t_{x_1\to x_R} \\ t_{x_2\to x_1} & 0 & \cdots & t_{x_2\to x_R} \\ \vdots & \vdots & \ddots & \vdots \\ t_{x_R\to x_1} & t_{x_R\to x_2} & \cdots & 0 \end{bmatrix} \tag{11}$$

The rows represent the cause variables, the columns represent the effect variables, and R is the number of variables [27].
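As an illustration of the k = l = 1 case, the following sketch estimates transfer entropy from binned data. It is a simple plug-in (histogram) estimator; the equal-width binning, the bin count, the base-2 logarithm, and the function names are all assumptions made for this example, and the paper does not state which density estimator was used.

```python
import numpy as np

def transfer_entropy(source, target, bins=8):
    """Plug-in estimate of T_{source -> target} with k = l = 1, in bits."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    # Discretise each series into `bins` equal-width bins (indices 0..bins-1).
    s = np.digitize(source, np.histogram_bin_edges(source, bins)[1:-1])
    x = np.digitize(target, np.histogram_bin_edges(target, bins)[1:-1])
    x_next, x_now, s_now = x[1:], x[:-1], s[:-1]
    # Joint histogram over (x_{n+1}, x_n, y_n).
    joint = np.zeros((bins, bins, bins))
    np.add.at(joint, (x_next, x_now, s_now), 1.0)
    p_xxy = joint / joint.sum()
    p_xx = p_xxy.sum(axis=2)       # p(x_{n+1}, x_n)
    p_xy = p_xxy.sum(axis=0)       # p(x_n, y_n)
    p_x = p_xxy.sum(axis=(0, 2))   # p(x_n)
    te = 0.0
    for i, j, k in zip(*np.nonzero(p_xxy)):
        num = p_xxy[i, j, k] * p_x[j]
        den = p_xx[i, j] * p_xy[j, k]
        te += p_xxy[i, j, k] * np.log2(num / den)
    return te

def causality_measure(x, y, bins=8):
    """t_{Y -> X} = T_{Y -> X} - T_{X -> Y}; positive values suggest Y drives X."""
    return transfer_entropy(y, x, bins) - transfer_entropy(x, y, bins)
```

Swapping the two arguments of `causality_measure` flips the sign of the result, which is why a matrix built from these values is antisymmetric. Plug-in estimates are biased upward for short records or many bins, so small values should be read with care.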

Granger Causality
The Granger causality (GC) test is a statistical hypothesis test used to determine whether one time series $x_2$ has a causal effect on another time series $x_1$. The basic principle is to establish an autoregressive model and, while controlling for the past values of $x_1$, to estimate how much the past values of $x_2$ improve the prediction of the current value of $x_1$ [33]. The GC method requires the input multivariate time series to be wide-sense stationary; otherwise, spurious regression may occur. To ensure that the oscillation remains the main information in the original data while meeting the wide-sense stationarity requirement of vector autoregressive (VAR) modeling, the original data must be preprocessed, for example by differencing, to eliminate the time series trend. Granger causality testing is divided into time domain Granger causality analysis and frequency domain Granger causality analysis. Time domain Granger causality analysis is used here to test the effectiveness of the proposed method.
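The differencing step mentioned above can be sketched in a few lines. The example below is only an illustration on a synthetic ramp plus sinusoid (both invented for this example); it shows that first differencing removes a linear trend while the oscillatory component survives. The helper name `difference` is hypothetical.

```python
import numpy as np

def difference(series, order=1):
    """Apply `order` rounds of first differencing along the time axis."""
    return np.diff(np.asarray(series, dtype=float), n=order, axis=0)

# Synthetic example: a slow ramp with a superimposed oscillation of period 20.
t = np.arange(200)
ramp_plus_wave = 0.05 * t + np.sin(2 * np.pi * t / 20)
detrended = difference(ramp_plus_wave)  # the ramp becomes a constant offset
```

Each round of differencing shortens the series by one sample, so all variables should be differenced the same number of times to keep them aligned.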
The historical operating data of the system are composed of time series of multiple process variables. For the time series $x_i(t)$ and $x_j(t)$ of any two process variables, $x_i(t)$ and $x_j(t)$ are regressed on all of their lag terms $x_i(t-k)$, $x_j(t-k)$, $k = 1, \ldots, l$, to establish a complete autoregressive (AR) model for predicting $x_i(t)$ and $x_j(t)$:

$$x_i(t) = \sum_{k=1}^{l} a_{ii,k}\, x_i(t-k) + \sum_{k=1}^{l} a_{ij,k}\, x_j(t-k) + e_i(t) \tag{12}$$

$$x_j(t) = \sum_{k=1}^{l} a_{ji,k}\, x_i(t-k) + \sum_{k=1}^{l} a_{jj,k}\, x_j(t-k) + e_j(t) \tag{13}$$

In the equations, $a_{ii,k}$, $a_{ij,k}$, $a_{ji,k}$, and $a_{jj,k}$ are the model coefficients, $e_i(t)$ and $e_j(t)$ are the model prediction errors, and l is the model order, which defines the number of lag terms included in the model.
Corresponding to the above complete model definition, the lag terms of the variable $x_j(t)$ are excluded and the regression is performed again to establish a restricted model for predicting $x_i(t)$:

$$x_i(t) = \sum_{k=1}^{l} b_{ii,k}\, x_i(t-k) + e_{i(j)}(t) \tag{14}$$

In the equation, $b_{ii,k}$ are the coefficients of the restricted model, and $e_{i(j)}(t)$ is the model prediction error obtained without using $x_j$ to predict $x_i$.
If the prediction error variance $\operatorname{var}(e_i)$ is significantly smaller than $\operatorname{var}(e_{i(j)})$, then including the past values of $x_j$ makes the prediction of $x_i$ more accurate, which means that $x_j$ Granger-causes $x_i$. Therefore, the Granger causality can be quantified by the following index:

$$F_{x_j \to x_i} = \ln\frac{\operatorname{var}\left(e_{i(j)}\right)}{\operatorname{var}\left(e_i\right)} \tag{15}$$

Before performing Granger causality inference, the F-test was used to verify statistical significance. If the F value exceeds the critical value $F_\alpha$ at the selected significance level $\alpha$ (set to 0.05), the p value is less than $\alpha$, indicating that $x_j$ is the cause of $x_i$:

$$F = \frac{\left(RSS_r - RSS_{ur}\right)/l_r}{RSS_{ur}/\left(m - l_{ur}\right)} \tag{16}$$

where m is the observation sample size; $RSS_r$ and $RSS_{ur}$ are the residual sums of squares of the restricted model and the complete model, respectively; $l_r$ is equal to the number of $x_j$ lag terms, that is, the number of parameters to be estimated in the restricted model; and $l_{ur}$ is the number of parameters to be estimated in the complete model, which satisfies $l_{ur} > l_r$.
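The comparison between the complete and restricted models can be sketched with ordinary least squares. The code below is a minimal illustration, not the authors' implementation: it assumes zero-mean (here, de-meaned) stationary series, no intercept term, the same lag order for both variables, and the usual log variance ratio as the causality index. The names `lagged_matrix` and `granger` are hypothetical.

```python
import numpy as np

def lagged_matrix(series, lags):
    """Columns are series(t-1), ..., series(t-lags) for t = lags, ..., N-1."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])

def granger(x_i, x_j, lags=2):
    """Return (index, F) for the hypothesis that x_j Granger-causes x_i."""
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    x_i = x_i - x_i.mean()  # the AR models here have no intercept term
    x_j = x_j - x_j.mean()
    y = x_i[lags:]
    own = lagged_matrix(x_i, lags)                   # restricted model regressors
    full = np.hstack([own, lagged_matrix(x_j, lags)])  # complete model regressors

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return float(resid @ resid)

    rss_r, rss_ur = rss(own), rss(full)
    m = len(y)
    # Log ratio of restricted to complete residual variance (same sample size).
    index = np.log(rss_r / rss_ur)
    # F statistic: `lags` restrictions, `full.shape[1]` parameters in the full model.
    f_stat = ((rss_r - rss_ur) / lags) / (rss_ur / (m - full.shape[1]))
    return index, f_stat
```

The returned F value would then be compared with the critical value of the F distribution with `lags` and `m - 2 * lags` degrees of freedom at the chosen significance level; that lookup is left out to keep the sketch dependent on NumPy only.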

Characteristic Variable Selection
Before locating the oscillation faults of the SOFC system, the process variables affected by the oscillating signals were first screened from the original process data to reduce the scope of causal analysis and improve the accuracy of the oscillating source identification.
For this study, we normalized the original data and used KPCA to reduce the dimensionality of the normalized data. The result is shown in Figure 3. The 24 process variables are reduced to eight-dimensional principal components, and the sum of their contribution rates reaches more than 95%. In other words, the first eight principal components can reflect almost all the characteristics of the original data. The OSI is calculated in the range of eight principal components, and variables with significant oscillations are screened out of all process variables on the condition that the percentage of OSI accounted for more than 5%. Figure 4 shows the result. The selected oscillating variables include combustion chamber inlet temperature, methane pressure, stack voltage, air bypass flow, and reformer fuel inlet temperature. It can be seen in Figure 2 that the filtering results of the oscillation variables are the five variables that are most affected by the oscillation among all process variables.

SOFC System Oscillation Diagnosis Results and Discussion
There are a large number of variables in the SOFC system, and the KPCA method was used to screen five characteristic variables with significant oscillations from the 24 original process variables. Subsequently, transfer entropy was used for further analysis. The causality matrix of the transfer entropy analysis is shown in Table 2. An arbitrary position (i, j) in the causality matrix represents the causality measure from the i-th row variable to the j-th column variable. It follows from Equation (10) that the causality measure from i to j and the causality measure from j to i are opposite numbers, so the causality matrix is antisymmetric about the diagonal. For example, position (2,3) indicates that the transfer entropy value of methane pressure to the stack voltage is 0.4390, which is the largest of all causality measures. This indicates that the methane pressure has the largest impact on the stack voltage oscillation. The transfer entropy value of methane pressure to the inlet temperature of the combustion chamber is 0.2748, which also shows a significant causal effect. At the same time, the transfer entropy values in the second row are all positive and those in the second column are all negative, which means that the methane pressure affects the other oscillating variables and that no other variable affects the methane pressure. This shows that methane pressure is the root cause of the system oscillation. As can be seen from the table, causal feedback exists among the remaining oscillation variables, reflecting the strong coupling characteristics of the system. However, it is notable that the causality measures between the air bypass flow and the other variables are very small. Based on process knowledge, a reasonable explanation is as follows: the purpose of adding a cold air bypass on the main air line is to pass excess cold air, which effectively controls the temperature of the air at the inlet of the stack, thereby ensuring that the stack is in a thermally safe state. Compared with the excess air supply, the fuel supply is the key factor causing abnormal fluctuations in stack discharge.
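The reading rule used above, where a root cause candidate has an all-positive row and an all-negative column in the causality matrix, can be written as a short check. The sketch below is illustrative only: the three-variable matrix and the variable names A, B, and C are invented for the example and are not the values of Table 2.

```python
import numpy as np

def is_antisymmetric(t_matrix, atol=1e-9):
    """True if t[i, j] == -t[j, i] for all i, j (within tolerance)."""
    t = np.asarray(t_matrix, dtype=float)
    return bool(np.allclose(t, -t.T, atol=atol))

def find_root_causes(t_matrix, names, tol=0.0):
    """Variables whose row is all positive and whose column is all negative."""
    t = np.asarray(t_matrix, dtype=float)
    roots = []
    for i, name in enumerate(names):
        others = np.delete(np.arange(len(names)), i)
        drives_all = np.all(t[i, others] > tol)       # variable i influences every other one
        driven_by_none = np.all(t[others, i] < -tol)  # no other variable influences variable i
        if drives_all and driven_by_none:
            roots.append(name)
    return roots

# Hypothetical causality matrix (rows are causes, columns are effects).
example = [[0.0, -0.3, 0.1],
           [0.3, 0.0, 0.4],
           [-0.1, -0.4, 0.0]]
print(find_root_causes(example, ["A", "B", "C"]))
```

Raising `tol` above zero is one way to ignore very small causality measures, such as those reported for the air bypass flow, before drawing conclusions.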
In addition, as can be seen in Figure 4, the air feedback flow and air pressure are also affected by the oscillation, just like the air bypass flow. However, because the effect on them is weaker, they are filtered out when the oscillation variables are selected. It can be concluded that the air volume fluctuation caused by the blower failure has no obvious relationship with the methane pressure.
In addition, it can be seen that the transfer entropy value of the burner chamber inlet temperature to the stack voltage is 0.0998. Combined with the structure flow chart of the SOFC system, a reasonable explanation can be made. The temperature of the burner chamber affects the temperature of the entire system and the stack. The temperature increase of the burner chamber will indirectly raise the temperature of the stack. In the case of constant current, the higher the temperature, the higher the voltage. Conversely, the lower the temperature, the lower the voltage [34].
According to the analysis of the transfer relationship between the variables in Table 2, it can be concluded that the methane pressure instability is the root cause of the oscillation fault, which directly affects the temperature fluctuation of the fuel inlet of the reformer, and then causes the voltage fluctuation of the stack, and finally causes the temperature fluctuation of the burner chamber inlet.
Combining the structure diagram of the system in Figure 1 and the results of the transfer entropy analysis, it can be concluded that the methane pressure is the source of oscillation in the SOFC system. As the fuel required for the operation of the SOFC system, methane flows along the pipeline to the reforming chamber, where it takes part in the reforming reaction; it then takes part in the electrochemical reaction in the anode flow channel of the stack and in the oxidation reaction of the exhaust gas. Therefore, the supply of methane plays a vital role in the heat, electricity, and gas stability of the system. The methane feedback flow does not oscillate significantly, which rules out unstable flow velocity at the fuel supply end as the cause. According to the system structure diagram and the connectivity rules of each component, it can be concluded that the methane pressure fluctuation is caused by the fluctuation of the deionized water vapor in the reforming reaction.

Granger Causality Verification
The causality matrix based on time domain GC analysis is shown in Table 3. Any position (i, j) in the causality matrix represents the causality measure from the j-th column variable to the i-th row variable. For example, position (3,2) indicates that the Granger causality measure of methane pressure on the stack voltage is 0.4458, which is the largest of all causality measures, indicating that the methane pressure has the largest impact on the voltage of the stack. Table 4 shows the p value of the F-test for each causal pair. When the p value is less than the significance level α (α = 0.05), the F-test result is valid and the causal inference is established. Such p values are marked in bold in Table 4, and the corresponding causality measures are shown in bold in Table 3. It can be seen from Table 3 that the methane pressure shows a clear causal relationship with both the inlet temperature of the burner chamber and the stack voltage. At the same time, the methane pressure has the greatest influence on the stack voltage, with a causality measure much higher than those of the other variables that affect the stack voltage. Therefore, it can be determined that methane pressure instability is the root cause of the oscillation fault. Similarly, the causality measures between the air bypass flow and the other variables are small. The F-test results show that the corresponding causal hypotheses are not established, which agrees with the transfer entropy result.
Combining the results of the two data analysis methods of transfer entropy and Granger causality analysis, it can be derived that the methane pressure is the source of the oscillation failure.
Since the reforming reaction water vapor line and the methane supply pipeline are connected, the fluctuation of the water vapor affects the pressure in the chamber and, in turn, the pressure of the methane. The fluctuation of water vapor in the evaporator cannot be monitored by the existing sensors. However, the set value of deionized water can be adjusted. Based on the results, we manually reduced the set value of deionized water in the later stages of the experiment. As a result, the number of experimental shutdowns and the fluctuations in methane pressure were greatly reduced. Therefore, the source of the oscillation fault is unstable methane pressure, which is the result closest to the root cause of the actual fault. Isolating or reducing the effect of water vapor on methane pressure can therefore help the SOFC system achieve good performance and maintain stable, long-term operation.

Conclusions
Based on the study of a 1 kW SOFC independent power generation system with steam reforming, a data-driven root cause diagnosis method is proposed for the oscillation fault problem in the system. KPCA and transfer entropy were employed to locate the root cause of system oscillations. First, variables unrelated to the oscillation of the SOFC system were eliminated, and five characteristic variables were selected from the 24 original process variables based on KPCA and OSI. Subsequently, transfer entropy was used to calculate the direct causality between the variables and, in combination with the system structure, to analyze the oscillation propagation path, which identified methane pressure as the source of the oscillation. Finally, the time domain Granger causality analysis method was used to verify the effectiveness and reliability of our method. The experimental results show that our method can diagnose the root cause of oscillation based on the process data and that the fault location results agree with the Granger causality analysis. In addition, by combining data analysis and expert experience, the proposed method provides an effective means of diagnosing the root cause of oscillations in an SOFC system.