Research on the Fault Diagnosis of a Polymer Electrolyte Membrane Fuel Cell System

Abstract: In this work, the possibilistic fuzzy C-means clustering artificial bee colony support vector machine (PFCM-ABC-SVM) method is proposed and applied to the fault diagnosis of a polymer electrolyte membrane (PEM) fuel cell system. The innovation of this method is that it can filter data with Gaussian noise and diagnose faults under dynamic conditions, with the amplitude of the characteristic parameters reduced to ±10%. Under dynamic conditions with Gaussian noise, the faults of the PEM fuel cell system are simulated and the original dataset is established. The possibilistic fuzzy C-means (PFCM) algorithm is used to filter out samples with membership and typicality less than 90% and thereby optimize the original dataset. The artificial bee colony (ABC) algorithm is used to optimize the penalty factor C and the kernel function parameter g. Finally, the optimized support vector machine (SVM) model is used to diagnose the faults of the PEM fuel cell system. To illustrate the results of the fault diagnosis, a nonlinear PEM fuel cell simulator model presented in the literature is used. In addition, the PFCM-ABC-SVM method is compared with other methods. The results show that the method can diagnose faults in a PEM fuel cell system effectively, and the accuracy on the testing set is up to 98.51%. When solving small-sized, nonlinear, high-dimensional problems, the PFCM-ABC-SVM method can improve the accuracy of fault diagnosis.


Introduction
Hydrogen energy is one of the most important green energy sources. The polymer electrolyte membrane (PEM) fuel cell system can directly convert hydrogen energy into electrical energy through an electrochemical reaction and generate water and heat with minimal pollution [1]. The PEM fuel cell system is a multi-input, multi-output nonlinear system with auxiliary elements such as compressors, supply manifolds, return manifolds, and valves. For this reason, the PEM fuel cell system is vulnerable to different sets of faults that can cause temporary or permanent damage [2]. Therefore, fault diagnosis methods are important to reduce this vulnerability as much as possible.
Depending on whether a model is necessary, the diagnosis methods can be classified into two general types, i.e., model-based and non-model-based methods [3,4]. The model-based method needs a model that simulates the behavior of the monitored system [4]; generally, it relies on residual evaluation followed by residual inference to detect possible faults [5]. Escobet and Feroldi et al. [6,7] proposed a model-based fault diagnosis methodology based on the relative fault sensitivity, and the methodology correctly diagnosed the simulated faults, in contrast with other methodologies that use a binary signature matrix of analytical residuals and faults. Rosich et al. [8] designed a subset of consistency relations and residual generators for a fuel cell system. Lira et al. [9]
In this paper, Gaussian noise is used to simulate the interference in the PEM fuel cell model, so that the fault diagnosis effect of the method can also be verified. The variance of the Gaussian noise is set to 1.0, 0.5, 0.2, and 0.1 in turn, and the amplitude of the characteristic parameters is reduced to ±10%. By simulating the fault scenarios of the PEM fuel cell system, the original dataset is established with eight diagnostic variables. The possibilistic fuzzy C-means clustering artificial bee colony support vector machine (PFCM-ABC-SVM) method is used to diagnose the faults in the PEM fuel cell system.

PFCM Algorithm
In the fuzzy C-means clustering (FCM) algorithm, the membership values of each sample point over all clusters must sum to 1.0; therefore, the algorithm is sensitive to noise points and the classification result is not accurate. The possibilistic C-means (PCM) algorithm is sensitive to the initial cluster centers, and its global optimal solution is obtained only when the cluster centers coincide, which causes cluster consistency problems [34]. To overcome the shortcomings of these two algorithms, Pal et al. proposed the possibilistic fuzzy C-means (PFCM) algorithm, which builds on both [35,36]. The PFCM algorithm overcomes the sensitivity of the FCM algorithm to noise and the sensitivity of the PCM algorithm to the initial cluster centers. Additionally, the PFCM algorithm improves the accuracy of the classification results. The objective function of the PFCM algorithm is as follows:

$$J_{m,p}(U,T,V) = \sum_{i=1}^{c}\sum_{j=1}^{n}\left(a\,u_{ij}^{m} + b\,t_{ij}^{p}\right)d_{ij}^{2} + \sum_{i=1}^{c}\eta_i\sum_{j=1}^{n}\left(1 - t_{ij}\right)^{p} \tag{1}$$

where $1 \le i \le c$, $1 \le j \le n$; $u_{ij}$ and $t_{ij}$ are the membership and typicality of sample point $x_j$ with respect to cluster $i$, with $\sum_{i=1}^{c} u_{ij} = 1$; $a$ and $b$ define the relative importance of the fuzzy membership and typicality values in the objective function, $a > 0$, $b > 0$; $m$ and $p$ are the fuzzy parameters; $d_{ij} = \lVert x_j - v_i \rVert$ is the Euclidean distance from sample point $x_j$ to cluster center $v_i$; $c$ is the number of cluster centers; and $n$ is the number of sample points.
The penalty coefficient of the PFCM algorithm is as follows:

$$\eta_i = K\,\frac{\sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}}{\sum_{j=1}^{n} u_{ij}^{m}} \tag{2}$$

where $\eta_i$ is the penalty coefficient and, generally, $K = 1$. From the optimal solution of Equation (1), Equations (3)-(5) are obtained:

$$u_{ij} = \left[\sum_{k=1}^{c}\left(\frac{d_{ij}}{d_{kj}}\right)^{\frac{2}{m-1}}\right]^{-1} \tag{3}$$

$$t_{ij} = \frac{1}{1 + \left(\dfrac{b\,d_{ij}^{2}}{\eta_i}\right)^{\frac{1}{p-1}}} \tag{4}$$

$$v_i = \frac{\sum_{j=1}^{n}\left(a\,u_{ij}^{m} + b\,t_{ij}^{p}\right)x_j}{\sum_{j=1}^{n}\left(a\,u_{ij}^{m} + b\,t_{ij}^{p}\right)} \tag{5}$$

The steps of the PFCM algorithm are as follows:
Step 1 Set the fuzzy parameters, the terminating threshold $\varepsilon$, and the maximum number of iterations $L$; set the initial iteration count $l$; initialize the cluster center matrix $V^{(0)}$, the membership matrix $U^{(0)}$, and the typicality matrix $T^{(0)}$.
Step 2 According to Formula (2), calculate the penalty coefficient $\eta_i$.
Step 3 According to Formula (3), calculate and update the membership matrix $U$.
Step 4 According to Formula (4), calculate and update the typicality matrix $T$.
Step 5 According to Formula (5), calculate and update the cluster center matrix $V$.
Step 6 If $\lVert V^{(l+1)} - V^{(l)} \rVert < \varepsilon$ or $l > L$, output the cluster centers, the membership matrix, and the typicality matrix; otherwise, set $l = l + 1$ and return to Step 2.
The flow chart of the PFCM algorithm is shown in Figure 1.
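The PFCM iteration described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `pfcm`, the explicit `init_centers` argument, the default weights `a = b = 1`, and the small floor on squared distances (to avoid division by zero) are all assumptions made for the sketch.

```python
import math


def pfcm(points, init_centers, a=1.0, b=1.0, m=2.0, p=2.0, K=1.0,
         eps=1e-6, max_iter=100):
    """Return (V, U, T): cluster centers, membership and typicality matrices."""
    c, n, dim = len(init_centers), len(points), len(points[0])
    V = [list(v) for v in init_centers]              # Step 1: initial centers

    def sq_dist(cs):
        # d_ij^2 from center i to sample j, floored to avoid division by zero
        return [[max(sum((xk - vk) ** 2 for xk, vk in zip(x, v)), 1e-12)
                 for x in points] for v in cs]

    def membership(d2):
        # Equation (3): memberships of each sample sum to 1 over the clusters
        return [[1.0 / sum((d2[i][j] / d2[k][j]) ** (1.0 / (m - 1.0))
                           for k in range(c))
                 for j in range(n)] for i in range(c)]

    d2 = sq_dist(V)
    U = membership(d2)                               # Step 1: initial U
    T = [[0.0] * n for _ in range(c)]                # overwritten in Step 4
    for _ in range(max_iter):
        # Step 2, Equation (2): penalty coefficient of each cluster
        eta = [K * sum(U[i][j] ** m * d2[i][j] for j in range(n))
               / sum(U[i][j] ** m for j in range(n)) for i in range(c)]
        U = membership(d2)                           # Step 3, Equation (3)
        T = [[1.0 / (1.0 + (b * d2[i][j] / eta[i]) ** (1.0 / (p - 1.0)))
              for j in range(n)] for i in range(c)]  # Step 4, Equation (4)
        new_V = []                                   # Step 5, Equation (5)
        for i in range(c):
            w = [a * U[i][j] ** m + b * T[i][j] ** p for j in range(n)]
            new_V.append([sum(w[j] * points[j][k] for j in range(n)) / sum(w)
                          for k in range(dim)])
        shift = math.sqrt(sum((nv - ov) ** 2 for new, old in zip(new_V, V)
                              for nv, ov in zip(new, old)))
        V, d2 = new_V, sq_dist(new_V)
        if shift < eps:                              # Step 6: convergence test
            break
    return V, U, T
```

Calling `pfcm(samples, init_centers=[...])` on a list of equal-length tuples returns matrices indexed as `U[i][j]` and `T[i][j]` (cluster `i`, sample `j`). The returned `U` and `T` belong to the centers of the last completed update, which differ from the returned `V` by less than `eps` once the loop has converged.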

Multi-Parameter Optimization of Support Vector Machine (SVM)
The support vector machine (SVM) is a general machine learning method presented by Vapnik [37]. Traditional statistical methods assume that sufficient, or even infinitely many, samples are available, whereas practical problems often provide only a few. Based on the principle of structural risk minimization and the Vapnik-Chervonenkis (VC) dimension theory, SVM balances the complexity of the model against its learning ability, finds the optimal solution, and obtains better generalization ability. The classification theory of SVM is developed from the linearly separable binary classification problem. In the process of classification, the optimal classification hyperplane is constructed. The training samples are classified correctly according to the principle of least empirical risk, and the maximum classification margin is required to ensure the minimum confidence range. SVM has advantages in solving small-sized, nonlinear, high-dimensional problems [38][39][40]. The objective function of the SVM optimization problem is as follows:

$$\max_{\alpha}\ \sum_{i=1}^{N}\alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j K(x_i, x_j), \qquad \text{s.t.}\ \sum_{i=1}^{N}\alpha_i y_i = 0,\ 0 \le \alpha_i \le C$$

where $\forall x_i, x_j \in R^{n}$; $y_i$ is the class label of sample $x_i$; $\alpha_i$ is the Lagrange multiplier; $C$ is the penalty factor; and $K(x_i, x_j)$ is the kernel function, which evaluates the inner product of the two low-dimensional vectors after they have been mapped into a high-dimensional space.
The corresponding optimal classification function is as follows:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{N}\alpha_i^{*} y_i K(x_i, x) + b^{*}\right)$$

where $\alpha^{*}$ is the optimal solution and $b^{*} = y_j - \sum_{i=1}^{N}\alpha_i^{*} y_i K(x_i, x_j)$.
In the above optimization problem, it is necessary to determine the kernel function $K(x_i, x_j)$. Four kinds of kernel functions are commonly used in SVM: the linear kernel function $K(x_i, x_j) = x_i \cdot x_j$; the polynomial kernel function $K(x_i, x_j) = [(x_i \cdot x_j) + b]^{d}$; the hyperbolic tangent kernel function $K(x_i, x_j) = \tanh[v(x_i \cdot x_j) + c]$; and the radial basis kernel function $K(x_i, x_j) = \exp(-g\lVert x_i - x_j \rVert^{2})$, where $g$ is the kernel parameter. Many studies show that the radial basis kernel function is a better choice when there is not enough prior knowledge [41]. Therefore, the radial basis kernel function is used as the kernel function of the SVM in this work. The kernel function parameter g and the penalty factor C must then be selected, since both strongly affect the quality of the resulting SVM model.
The artificial bee colony (ABC) algorithm is an intelligent optimization algorithm inspired by biological behavior, proposed by Karaboga [42,43]. It solves practical problems by simulating how bees collect honey: the global optimal solution is found through the local optimization behavior of individual bees. It is often used to solve multi-parameter optimization problems [44]. In this paper, the ABC algorithm is used to obtain the optimal penalty factor C and kernel parameter g. Compared with the genetic algorithm (GA) and the particle swarm optimization (PSO) algorithm, the ABC algorithm has the advantages of strong global optimization ability and few control parameters.
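The four kernel functions and the classification function discussed in this section can be written out directly. The sketch below is illustrative only: the function names, the default parameter values, and the toy support vectors in the usage lines are assumptions, and the radial basis kernel is taken in the form exp(-g * squared distance) with g as the kernel parameter.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, b=1.0, d=3):
    return (dot(x, y) + b) ** d


def tanh_kernel(x, y, v=0.1, c=0.0):
    return math.tanh(v * dot(x, y) + c)


def rbf_kernel(x, y, g=1.0):
    # exp(-g * ||x - y||^2); larger g gives a narrower kernel
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x, y)))


def classify(x, support_vectors, labels, alphas, bias, g=1.0):
    """Sign of the kernel expansion: +1 or -1 for a binary SVM."""
    s = sum(al * yl * rbf_kernel(sv, x, g)
            for sv, yl, al in zip(support_vectors, labels, alphas))
    return 1 if s + bias >= 0 else -1


# Toy usage with two hand-picked support vectors (hypothetical values)
svs, ys, als = [(0.0, 0.0), (2.0, 2.0)], [-1, 1], [1.0, 1.0]
near_positive = classify((2.0, 2.0), svs, ys, als, bias=0.0)
near_negative = classify((0.0, 0.0), svs, ys, als, bias=0.0)
```

In practice the multipliers and the bias come from solving the optimization problem; only C and g are left to be tuned, which is the role of the ABC search.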
The multi-parameter optimization of SVM is as follows:
Step 1 Initialize the parameters of the ABC algorithm and the SVM, i.e., the number of bee colonies, the number of honey sources, the maximum search number of honey sources (Limit), the current search number of honey sources, the maximum number of iterations (MaxIter), the search range of the penalty factor C, and the search range of the kernel function parameter g.
Step 2 Select the fitness function in the ABC algorithm. The purpose of optimizing the SVM parameters is to improve the accuracy of fault classification. Solving the optimization problem can be regarded as the process of a bee finding a honey source. The fitness function is as follows:

$$fitness_i = \begin{cases} \dfrac{1}{1 + f_i}, & f_i \ge 0 \\ 1 + \lvert f_i \rvert, & f_i < 0 \end{cases} \tag{11}$$

where $fitness_i$ is the fitness value of the i-th honey source, and $f_i$ is the objective function value of the i-th honey source.
Step 3 Employed bees search the neighborhood of the current honey source according to Formula (12) and calculate the fitness of the new honey source according to Formula (11). If the fitness value of the new honey source is better than that of the original honey source, the new honey source position replaces the original one; otherwise, the original honey source remains unchanged.

$$new\_x_{id} = x_{id} + R\,(x_{id} - x_{kd}) \tag{12}$$

where $new\_x_{id}$ is the value of the d-th dimension of the i-th new honey source; $x_{id}$ is the value of the d-th dimension of the i-th original honey source; $R$ is a random number in [-1, 1]; and $k$ is any honey source except the i-th honey source.
Step 4 After the employed bees complete the global search, onlooker bees select a honey source according to Formula (13) and then search its neighborhood for a new honey source according to Formula (12). If the fitness value of the new honey source is better than that of the original honey source, the new honey source position replaces the original one; otherwise, the original honey source remains unchanged.

$$P_i = \frac{fitness_i}{\sum_{n=1}^{N} fitness_n} \tag{13}$$

where $P_i$ is the probability that the i-th honey source is selected, $fitness_i$ is the fitness value of the i-th honey source, and $N$ is the total number of honey sources.
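Step 5 Judge whether the current search number of a honey source exceeds the maximum search number of honey sources (Limit). If it does, generate a new honey source according to Formula (14).

$$x_{ij} = x_{j}^{\min} + \operatorname{rand}(0,1)\left(x_{j}^{\max} - x_{j}^{\min}\right) \tag{14}$$

where $x_{ij}$ is the value of the j-th dimension of the i-th honey source, $j \in \{1, 2\}$, and $x_{j}^{\min}$ and $x_{j}^{\max}$ are the lower and upper bounds of the search range of the j-th parameter.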
Step 6 Record the current optimal honey source and judge whether the termination condition is met. If the termination condition is met, skip to Step 7, otherwise skip to Step 3.
Step 7 Get the global optimal honey sources, which are the penalty factor C and kernel parameter g, to establish the optimized SVM model.
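Steps 1-7 can be sketched as a small, dependency-free optimizer. This is a simplified illustration under stated assumptions, not the authors' code: the name `abc_minimize`, the default of 10 honey sources, the clamping of candidates to the search range, and the toy objective `toy_error` are hypothetical. In the setting of this paper the objective would be something like the classification error of an SVM trained with a candidate pair (C, g), which is not reproduced here.

```python
import math
import random


def abc_minimize(objective, bounds, n_sources=10, limit=100, max_iter=10, seed=0):
    """Minimize objective(x) inside bounds; n_sources must be at least 2."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        # Formula (14): uniform draw inside the search range of each parameter
        return [lo + rng.random() * (hi - lo) for lo, hi in bounds]

    def fitness(f):
        # Formula (11)
        return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

    sources = [random_source() for _ in range(n_sources)]
    values = [objective(x) for x in sources]
    trials = [0] * n_sources
    best = min(range(n_sources), key=values.__getitem__)
    best_x, best_f = list(sources[best]), values[best]

    def search_neighborhood(i):
        # Formula (12): perturb one dimension relative to another source k != i
        k = rng.choice([s for s in range(n_sources) if s != i])
        d = rng.randrange(dim)
        cand = list(sources[i])
        cand[d] += rng.uniform(-1.0, 1.0) * (cand[d] - sources[k][d])
        cand[d] = min(max(cand[d], bounds[d][0]), bounds[d][1])  # stay in range
        f = objective(cand)
        if fitness(f) > fitness(values[i]):          # greedy replacement
            sources[i], values[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    for _ in range(max_iter):
        for i in range(n_sources):                   # Step 3: employed bees
            search_neighborhood(i)
        weights = [fitness(f) for f in values]       # Step 4: Formula (13)
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            search_neighborhood(i)
        for i in range(n_sources):                   # Step 5: abandon stale sources
            if trials[i] > limit:
                sources[i] = random_source()
                values[i], trials[i] = objective(sources[i]), 0
        i = min(range(n_sources), key=values.__getitem__)
        if values[i] < best_f:                       # Step 6: record the best
            best_x, best_f = list(sources[i]), values[i]
    return best_x, best_f                            # Step 7


def toy_error(x):
    # Hypothetical stand-in for an SVM error surface, minimal at C = 10, g = 0.1
    return (math.log10(x[0]) - 1.0) ** 2 + (math.log10(x[1]) + 1.0) ** 2


(best_C, best_g), best_err = abc_minimize(
    toy_error, bounds=[(0.01, 100.0), (0.01, 100.0)], max_iter=50, seed=1)
```

The selection weights for the onlooker bees are computed once per iteration here, which is a simplification; recomputing them after every replacement would follow Formula (13) more closely at a small extra cost.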

Fault Simulation of the PEM Fuel Cell System
The PEM fuel cell system directly converts chemical energy into electricity through an electrochemical reaction and produces water and heat at the same time. The PEM fuel cell simulator model uses the control strategies and nonlinear models presented by Pukrushpan et al. [1], and some of its parameters are taken from [45][46][47] based on actual product parameters. It is assumed that the system is at a constant temperature; the influence of the double charge layer is ignored and is regarded as a rapid dynamic behavior near the electrode/electrolyte interface. Parameters commonly used in the PEM fuel cell simulator model are described in Table 1. The PEM fuel cell simulator model is widely used for the fault diagnosis of the PEM fuel cell system [7,8,12,14], and represents a 75 kW fuel cell system with 381 cells. It includes the fuel cell stack model, the compressor model, the supply manifold model, the return manifold model, the air cooler model, and the humidifier model. The PEM fuel cell system block diagram is shown in Figure 2. The five fault states are partly taken from the literature [1,14], and the amplitude of the characteristic parameters is reduced to ±10%. The faults in the PEM fuel cell simulator model are described in Table 2.

Table 2. Faults in the PEM fuel cell simulator model.
Fault | Description | Type | Magnitude
Fault2 | The compressor motor overheats | Parametric, abrupt |
Fault3 | The fluid resistance increases due to water blocking the channels or flooding in the diffusion layer | Parametric, abrupt | 10% reduction of the water flow
Fault4 | Air leak in the air supply manifold | Parametric, abrupt | 10% reduction of the air flow

In the Fault0 state, the characteristic parameters remain unchanged and the system is in its normal state. Equations (15)-(21) [1] are used to simulate Fault1-Fault4. According to the thermodynamic formula, the compressor torque $\tau_{cp}$ is expressed as:

$$\tau_{cp} = \frac{C_p}{\omega_{cp}}\,\frac{T_{atm}}{\eta_{cp}}\left[\left(\frac{p_{sm}}{p_{atm}}\right)^{\frac{\gamma - 1}{\gamma}} - 1\right] W_{cp} \tag{15}$$

where $\tau_{cp}$ is the torque needed to drive the compressor, $C_p$ is the specific heat capacity of air, $\omega_{cp}$ is the compressor speed, $\eta_{cp}$ is the compressor efficiency, $p_{sm}$ is the supply manifold pressure, $p_{atm}$ is the atmospheric pressure, $T_{atm}$ is the atmospheric temperature, $\gamma$ is the ratio of the specific heats of air, and $W_{cp}$ is the air mass flow of the compressor.
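A lumped rotational parameter model with inertia is used to represent the compressor speed:

$$J_{cp}\,\frac{d\omega_{cp}}{dt} = \tau_{cm} - \tau_{cp} \tag{16}$$

where $J_{cp}$ is the combined inertia of the compressor and the motor, and $\tau_{cm}$ is the compressor motor torque input. The Fault1 state is simulated with the increment $\Delta k_v$ in the motor electric constant $k_v$. The Fault2 state is simulated with the increment $\Delta R_{cm}$ in the compressor motor resistance $R_{cm}$:

$$\tau_{cm} = \eta_{cm}\,\frac{k_t}{R_{cm} + \Delta R_{cm}}\left[v_{cm} - (k_v + \Delta k_v)\,\omega_{cp}\right] \tag{17}$$

where $\eta_{cm}$ is the motor mechanical efficiency, $k_t$ is the motor torque constant, $R_{cm}$ is the compressor motor resistance, $\Delta R_{cm}$ is the increment in the compressor motor resistance, $v_{cm}$ is the compressor motor voltage, $k_v$ is the motor electric constant, and $\Delta k_v$ is the increment in the motor electric constant.
The maximum mass of vapor that the gas can hold is calculated from the vapor saturation pressure:

$$m_{v,max,ca} = \frac{p_{sat} V_{ca}}{R_v T_{st}} \tag{18}$$

where $m_{v,max,ca}$ is the maximum mass of the vapor, $p_{sat}$ is the saturation pressure of the vapor, $V_{ca}$ is the cathode volume, $R_v$ is the gas constant of the vapor, and $T_{st}$ is the temperature of the stack. If $m_{w,ca} \le m_{v,max,ca}$, then $m_{v,ca} = m_{w,ca}$ and $m_{l,ca} = 0$; if $m_{w,ca} > m_{v,max,ca}$, then $m_{v,ca} = m_{v,max,ca}$ and $m_{l,ca} = m_{w,ca} - m_{v,max,ca}$.
The total cathode pressure is the sum of the oxygen, nitrogen, and vapor partial pressures:

$$p_{ca} = p_{O_2,ca} + p_{N_2,ca} + p_{v,ca} = \frac{m_{O_2,ca} R_{O_2} T_{st}}{V_{ca}} + \frac{m_{N_2,ca} R_{N_2} T_{st}}{V_{ca}} + \frac{m_{v,ca} R_v T_{st}}{V_{ca}} \tag{19}$$

where $p_{ca}$ is the cathode pressure; $V_{ca}$ is the cathode volume; $p_{O_2,ca}$, $p_{N_2,ca}$, and $p_{v,ca}$ are the partial pressures of oxygen, nitrogen, and vapor; $m_{O_2,ca}$, $m_{N_2,ca}$, and $m_{v,ca}$ are the masses of oxygen, nitrogen, and vapor in the cathode; and $R_{O_2}$, $R_{N_2}$, and $R_v$ are the gas constants of oxygen, nitrogen, and vapor.
Fault3 is simulated with the increment $\Delta k_{ca,out}$ in the cathode outlet orifice constant $k_{ca,out}$:

$$W_{ca,out} = (k_{ca,out} + \Delta k_{ca,out})(p_{ca} - p_{rm}) \tag{20}$$

where $\Delta k_{ca,out}$ is the increment in the cathode outlet orifice constant, $k_{ca,out}$ is the cathode outlet orifice constant, $W_{ca,out}$ is the air flow at the cathode outlet, $p_{ca}$ is the cathode pressure, and $p_{rm}$ is the return manifold pressure.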
Fault4 is simulated with the increment $\Delta k_{sm,out}$ in the supply manifold outlet orifice constant $k_{sm,out}$:

$$W_{sm,out} = (k_{sm,out} + \Delta k_{sm,out})(p_{sm} - p_{ca}) \tag{21}$$

where $W_{sm,out}$ is the outlet mass flow, $\Delta k_{sm,out}$ is the increment in the supply manifold outlet orifice constant, and $k_{sm,out}$ is the supply manifold outlet orifice constant.
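The parametric faults of this section amount to adding an increment to one constant of the nominal model. The following sketch shows that idea for the motor torque and the linear orifice flows; the function names are hypothetical, and all numeric values in the usage lines are illustrative placeholders, not parameters of the simulator model.

```python
def motor_torque(v_cm, omega_cp, eta_cm, k_t, R_cm, k_v, dR_cm=0.0, dk_v=0.0):
    """Compressor motor torque; dk_v models Fault1 and dR_cm models Fault2."""
    return eta_cm * k_t / (R_cm + dR_cm) * (v_cm - (k_v + dk_v) * omega_cp)


def orifice_flow(k, p_up, p_down, dk=0.0):
    """Linear orifice flow; dk models Fault3 (cathode outlet) or Fault4 (supply manifold outlet)."""
    return (k + dk) * (p_up - p_down)


# Placeholder numbers chosen only to make the arithmetic easy to follow
nominal_flow = orifice_flow(k=2.0, p_up=3.0, p_down=1.0)
faulty_flow = orifice_flow(k=2.0, p_up=3.0, p_down=1.0, dk=-0.2)  # 10% smaller constant
flow_ratio = faulty_flow / nominal_flow

nominal_torque = motor_torque(v_cm=100.0, omega_cp=1000.0, eta_cm=0.9,
                              k_t=0.02, R_cm=1.0, k_v=0.02)
overheated_torque = motor_torque(v_cm=100.0, omega_cp=1000.0, eta_cm=0.9,
                                 k_t=0.02, R_cm=1.0, k_v=0.02, dR_cm=0.1)
```

With the pressures held fixed, a 10% decrease of the orifice constant yields a 10% decrease of the flow, which is how a ±10% change in a characteristic parameter translates into the fault magnitudes listed for Fault3 and Fault4. In the full simulator the pressures react to the fault as well, so the closed-loop change differs from this static ratio.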

Fault Diagnosis of the PEM Fuel Cell System
In this work, Gaussian noise with variances of 0.1, 0.2, 0.5, and 1.0 is added in turn to the PEM fuel cell simulator model. Under these conditions it is difficult to distinguish the Fault0 to Fault4 states in Table 2, because the signals of one fault state are coupled with the signals of the other fault states. Therefore, the traditional methods cannot diagnose the faults of the PEM fuel cell system effectively.
The Fault0-Fault4 states are simulated using the PEM fuel cell simulator model under dynamic conditions. Eight diagnostic variables are selected from the PEM fuel cell simulator model: fuel cell current ($I_{fc}$), fuel cell voltage ($V_{fc}$), compressor speed ($\omega_{cm}$), compressor outlet pressure ($P_{cm,out}$), compressor motor voltage ($V_{cm}$), compressor motor current ($I_{cm}$), hydrogen inlet pressure ($P_{H_2,in}$), and air inlet pressure ($P_{air,in}$). Taking the Fault4 state as an example, Gaussian noise with a variance of 1.0 is added to the PEM fuel cell simulator model. The resulting time histories of the fuel cell current, fuel cell voltage, compressor speed, compressor outlet pressure, compressor motor voltage, compressor motor current, hydrogen inlet pressure, and air inlet pressure are shown in Figures 3-10, respectively.
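Adding zero-mean Gaussian noise of a given variance to a sampled signal can be sketched as follows. The helper name `add_gaussian_noise` and the flat test signal are assumptions for illustration; where exactly the noise enters the simulator model is not specified here, so the sketch simply perturbs an output sequence.

```python
import math
import random


def add_gaussian_noise(signal, variance, seed=None):
    """Return a copy of signal with zero-mean Gaussian noise of the given variance."""
    rng = random.Random(seed)
    sigma = math.sqrt(variance)          # random.gauss expects a standard deviation
    return [x + rng.gauss(0.0, sigma) for x in signal]


# One noisy copy of a constant signal for each variance used in this work
clean = [100.0] * 1000
noisy_by_variance = {var: add_gaussian_noise(clean, var, seed=0)
                     for var in (0.1, 0.2, 0.5, 1.0)}
```

Note that the noise levels are stated as variances, so the standard deviation passed to the generator is the square root: a variance of 0.1 corresponds to a standard deviation of about 0.32.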
For each noise level, the PFCM algorithm is used to filter out samples with membership and typicality less than 90% and thereby optimize the original dataset. The filtered data are used as the sample dataset, which is divided into two groups: a training set of 670 samples and a testing set of 335 samples. The penalty parameter C and the kernel function parameter g of the SVM are optimized using the ABC algorithm, and the optimized SVM model is then established. The testing set is used to test the accuracy of the fault diagnosis method.
The fault diagnosis steps of the fuel cell system based on the PFCM-ABC-SVM method are as follows:
Step 1 Initialize the parameters in the PFCM-ABC-SVM method as follows: set the fuzzy parameters m = 2 and p = 2; set the terminating threshold $\varepsilon = 10^{-6}$; set the maximum number of iterations L = 100; set the initial iteration count l = 0; initialize the cluster center matrix $V^{(0)}$, the membership matrix $U^{(0)}$, and the typicality matrix $T^{(0)}$; set the number of bee colonies n = 20; set the maximum search number of honey sources Limit = 100; set the current search number of honey sources d = 0; set the maximum number of iterations maxIter = 10; set the search range of the penalty factor C to [0.01, 100]; and set the search range of the kernel function parameter g to [0.01, 100].
Step 2 Get the original data of the PEM fuel cell system and select eight diagnostic variables. The eight diagnostic variables are fuel cell current, fuel cell voltage, compressor speed, compressor outlet pressure, compressor motor voltage, compressor motor current, hydrogen inlet pressure, and air inlet pressure.
Step 3 Establish the original dataset with the eight diagnostic variables and normalize it using the mapminmax function in MATLAB (R2018b).
Step 4 Apply the PFCM algorithm to eliminate samples with membership and typicality less than 90%, thereby filtering the original dataset, and establish the sample dataset.
Step 5 Divide the sample dataset into the training set sample and the testing set sample.
Step 6 Optimize the penalty parameter C and kernel function parameter g of SVM using the ABC algorithm and establish the optimized SVM model.
Step 7 Diagnose faults with the optimized SVM model and obtain the diagnostic result.
The fault diagnosis flow chart of the fuel cell system based on the PFCM-ABC-SVM method is shown in Figure 11.
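The data preparation in Steps 3-5 can be sketched in Python as follows. Several points are assumptions rather than facts from this work: the target range [-1, 1] mirrors the default of mapminmax, the filter keeps a sample only when both its largest membership and its largest typicality reach the 90% threshold, and the 2:1 split (670 training and 335 testing samples out of 1005) is done by random shuffling. All function names are hypothetical.

```python
import random


def min_max_normalize(rows, lo=-1.0, hi=1.0):
    """Scale each feature (column) of a samples-by-features list to [lo, hi]."""
    cols = list(zip(*rows))
    mins = [min(col) for col in cols]
    maxs = [max(col) for col in cols]
    out = []
    for row in rows:
        scaled = []
        for x, mn, mx in zip(row, mins, maxs):
            if mx == mn:
                scaled.append((lo + hi) / 2.0)   # constant feature: use the midpoint
            else:
                scaled.append(lo + (hi - lo) * (x - mn) / (mx - mn))
        out.append(scaled)
    return out


def keep_indices(U, T, threshold=0.9):
    """Indices of samples whose best membership and best typicality reach the threshold."""
    n = len(U[0])
    return [j for j in range(n)
            if max(row[j] for row in U) >= threshold
            and max(row[j] for row in T) >= threshold]


def split_two_to_one(samples, seed=0):
    """Shuffle and split into (training, testing) with a 2:1 size ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = (2 * len(shuffled)) // 3
    return shuffled[:n_train], shuffled[n_train:]


train, test = split_two_to_one(range(1005))
```

Here `U` and `T` are indexed as `U[i][j]` for cluster `i` and sample `j`, the layout a PFCM run would produce. The normalization should be fitted on the data before filtering so that the distances used by the clustering are comparable across the eight variables.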
When the Gaussian noise variance is 1.0, the PFCM-ABC-SVM method is compared with the GA-SVM, PSO-SVM, and ABC-SVM methods. The comparison is shown in Table 3, and the classification results of the PFCM-ABC-SVM method are shown in Figure 12. For the Fault0-Fault4 states, the PSO-SVM method reaches an accuracy of 95.67% on the training set and 92.84% on the testing set; the ABC-SVM method reaches 95.82% and 94.03%; and the PFCM-ABC-SVM method reaches 97.46% and 97.31%. Therefore, the PFCM-ABC-SVM method can effectively improve the accuracy of fault diagnosis of the PEM fuel cell system. In Figure 12, the category labels "0" to "4" represent Fault0 to Fault4, respectively. There are 335 samples in the testing set.
When the Gaussian noise variance is 0.5, the PFCM-ABC-SVM method is compared with the GA-SVM, PSO-SVM, and ABC-SVM methods. The comparison is shown in Table 4, and the classification results of the PFCM-ABC-SVM method are shown in Figure 13. The PFCM-ABC-SVM method reaches an accuracy of 98.81% on the training set and 97.91% on the testing set.
When the Gaussian noise variance is 0.2, the PFCM-ABC-SVM method is compared with the GA-SVM, PSO-SVM, and ABC-SVM methods. The comparison is shown in Table 5, and the classification results of the PFCM-ABC-SVM method are shown in Figure 14. The PFCM-ABC-SVM method reaches an accuracy of 98.81% on the training set and 98.21% on the testing set.
When the Gaussian noise variance is 0.1, the PFCM-ABC-SVM method is compared with the GA-SVM, PSO-SVM, and ABC-SVM methods. The comparison is shown in Table 6, and the classification results of the PFCM-ABC-SVM method are shown in Figure 15. The PFCM-ABC-SVM method reaches an accuracy of 98.66% on the training set and 98.51% on the testing set.
To illustrate the advantages of the PFCM-ABC-SVM method, it is compared with the GA-SVM, PSO-SVM, and ABC-SVM methods in this work. The results of the fault diagnosis are shown in Tables 3-6. Under dynamic conditions, with the variance of the Gaussian noise decreasing from 1.0 to 0.1, the accuracy on the testing set reaches as high as 98.51%. Compared with the other methods, the PFCM-ABC-SVM method achieves better fault diagnosis of the PEM fuel cell system.
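The reported percentages can be translated back into sample counts, which makes the differences between methods easier to judge. The sketch below only uses figures stated in this section (670 training samples, 335 testing samples, and the quoted accuracies); the helper name `correct_count` is an assumption.

```python
def correct_count(accuracy_percent, n_samples):
    """Number of correctly classified samples implied by a rounded accuracy."""
    return round(accuracy_percent / 100.0 * n_samples)


N_TEST = 335

# Testing-set results at Gaussian noise variance 1.0
pso_svm = correct_count(92.84, N_TEST)        # 311
abc_svm = correct_count(94.03, N_TEST)        # 315
pfcm_abc_svm = correct_count(97.31, N_TEST)   # 326

# PFCM-ABC-SVM testing-set results for the remaining variances
by_variance = {0.5: correct_count(97.91, N_TEST),   # 328
               0.2: correct_count(98.21, N_TEST),   # 329
               0.1: correct_count(98.51, N_TEST)}   # 330

extra_vs_abc = pfcm_abc_svm - abc_svm          # 11 more samples classified correctly
```

At variance 1.0 the PFCM-ABC-SVM method therefore misclassifies 9 of the 335 testing samples, against 20 for ABC-SVM and 24 for PSO-SVM, and lowering the variance to 0.1 reduces the count to 5.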

Conclusions
In this work, the PFCM-ABC-SVM method is proposed and verified with the PEM fuel cell simulator model. Gaussian noise with variances of 0.1, 0.2, 0.5, and 1.0 is added in turn to the PEM fuel cell simulator model for fault diagnosis. The PFCM algorithm is used to filter out samples with membership and typicality less than 90% and thereby optimize the original dataset. The ABC algorithm is used to optimize the penalty factor C and the kernel function parameter g, and the optimized SVM model is used to diagnose the faults of the PEM fuel cell system. The results show that, under dynamic conditions with the variance of the Gaussian noise decreasing from 1.0 to 0.1, the accuracy on the training set increases from 97.46% to 98.66% (with a peak of 98.81% at variances of 0.5 and 0.2), and the accuracy on the testing set increases from 97.31% to 98.51%. The PFCM-ABC-SVM method diagnoses the faults in the PEM fuel cell system effectively and outperforms the GA-SVM, PSO-SVM, and ABC-SVM methods compared in this work. The PFCM-ABC-SVM method has an advantage in solving small-sized, nonlinear, high-dimensional problems and, furthermore, provides a reference for on-line fault diagnosis of a fuel cell system.

$\omega_{cm}$: compressor speed (rad/s)
$P_{cm,out}$: compressor outlet pressure (Pa)
$V_{cm}$: compressor motor voltage (V)
$I_{cm}$: compressor motor current (A)
$P_{H_2,in}$: hydrogen inlet pressure (Pa)
$P_{air,in}$: air inlet pressure (Pa)