An Artificial Intelligence Solution for Predicting Short-Term Degradation Behaviors of Proton Exchange Membrane Fuel Cell

Abstract: The dead-ended anode (DEA) and anode recirculation operations are commonly used to improve the hydrogen utilization of automotive proton exchange membrane (PEM) fuel cells. In both modes, the cell performance declines over time due to nitrogen crossover and liquid water accumulation in the anode, so efficient prediction of the short-term degradation behaviors of the PEM fuel cell is of great significance. In this paper, we propose a data-driven degradation prediction method based on multivariate polynomial regression (MPR) and an artificial neural network (ANN). The method first predicts the initial value of the cell performance and then predicts the variation of the cell performance over time, which together describe the degradation behavior of the PEM fuel cell. Two cases of degradation data, the PEM fuel cell in the DEA and anode recirculation modes, are employed to train the model and demonstrate the validity of the proposed method. The results show that the mean relative errors of the proposed method are much smaller than those obtained using only the ANN or only the MPR. The predictive performance of the two-hidden-layer ANN is significantly better than that of the one-hidden-layer ANN. The performance curves predicted using the sigmoid activation function are smoother and more realistic than those predicted using the rectified linear unit (ReLU) activation function.


Introduction
The proton exchange membrane (PEM) fuel cell is a high-efficiency, low-emission electrochemical device that can be applied in many fields, including automobiles, distributed generation and military applications [1]. In recent years, PEM fuel cell technology has made great progress in increasing performance, reducing costs and improving durability. However, the output performance and durability of PEM fuel cells are still insufficient for wide commercialization at this stage [2,3]. The performance degradation processes in PEM fuel cell operation generally fall into two categories: recoverable performance degradation in the short term, such as that occurring in cold start, dead-ended anode (DEA) and anode recirculation operation [4][5][6][7], and unrecoverable performance degradation in the long term [8,9]. The mechanisms causing performance degradation in these processes are various and complicated, but it is widely recognized that analysis and prediction of the degradation behaviors are critical [10,11].
Physical modeling is the main method in fuel cell studies [1,11]. A physical model solves governing equations with clear physical meaning and therefore generally predicts the cell performance and other physical parameters with high accuracy. The DEA and anode recirculation modes are commonly used to improve the hydrogen utilization, especially for the automotive PEM fuel cell [12][13][14]. In these modes, nitrogen crossover and liquid water accumulation occur in the anode and lead to performance degradation. To improve the durability and operation stability of the PEM fuel cell, it is necessary to understand the degradation process under various working conditions. Compared to experimental testing, predicting the performance degradation process of the PEM fuel cell by modeling can reduce experimental time and expense. Peng et al. [15] developed a three-dimensional single-channel computational fluid dynamics (CFD) model of the PEM fuel cell with the DEA and predicted the performance decay under various operating conditions. Rizvandi and Yesilyurt [16] developed a pseudo-three-dimensional transient model to analyze the dynamic and localized characteristics of the PEM fuel cell with the DEA. The numerical results of the performance decay were compared to experimental results to validate the model. Xu et al. [17] developed a control-oriented model of the PEM fuel cell with the DEA operation considering the internal transport mechanisms. The working processes under various purge intervals and purge durations were simulated and analyzed, and some critical indicators, such as pressure and nitrogen content, were monitored during the simulation. Furthermore, the calculated performance decay in the DEA operation and the performance recovery in the purge duration were validated by comparison with experimental results. Luo et al. [18] and Zhou et al. [19] developed PEM fuel cell stack cold start models to investigate the cold start characteristics. The failure processes of cold start were well simulated and compared with experimental data. Three start modes (constant power, constant current and constant voltage) were analyzed, and optimized cold start strategies were proposed based on the simulation results. Sultan et al. [20] used a bonobo optimizer (BO) algorithm to identify the unknown parameters of the PEM fuel cell model, and the simulation results matched the experimental results well. The BO algorithm also showed better performance than other optimization algorithms. Wang et al. [21] developed a quasi-two-dimensional transient model of PEM fuel cells with anode recirculation. The transient characteristics of the cell with anode recirculation were investigated, including the performance improvement in the initial period due to self-humidification and the performance decay due to nitrogen crossover.
Besides physical models, many studies have developed data-driven models to predict the transient behaviors of PEM fuel cells [22][23][24][25]. A data-driven model constructs the input-output relationship from the experimental dataset and does not require much prior knowledge. Furthermore, once the model is well trained, running it does not require complex calculation, and its computational efficiency is generally much higher than that of a physical model [26,27]. Xie et al. [28] proposed a fusion prognostic approach based on a particle filter and a long short-term memory (LSTM) network. This approach could effectively predict the remaining useful life and short-term degradation of fuel cells. Zuo et al. [29] reported an attention-based recurrent neural network (RNN) model that could accurately predict the output voltage degradation of PEM fuel cells based on original long-term dynamic loading cycle durability test data. The attention-based LSTM and attention-based gated recurrent unit (GRU) showed higher prediction accuracy than the plain LSTM and GRU. Vichard et al. [30] presented the results of a long-term durability test performed on an open-cathode fuel cell system operated for 5000 h under specific operating conditions. A degradation model based on an echo state network (ESN) was then used to predict the performance evolution. The results showed that the model could achieve accurate performance prediction for more than 2000 h. Chen et al. [31] proposed a new grey neural network model (GNNM) method, which combined particle swarm optimization (PSO) and the moving window method to predict the degradation of the PEM fuel cell under different operating conditions. The influences of load current, inlet temperature, input hydrogen pressure and inlet relative humidity were considered. Three PEM fuel cell aging experiments were used to demonstrate that the proposed method can accurately predict the degradation of the PEM fuel cell in different applications.
Based on the literature review, machine learning has mainly been used to predict the long-term degradation process of PEM fuel cells. The cell performance degradation in the initial period (generally hundreds of hours) is measured, and the data is used to predict the future degradation process through machine learning. However, to the best of the authors' knowledge, machine learning models that predict the full degradation process of the PEM fuel cell under different operating conditions are still lacking. In this study, we propose a data-based degradation prediction method to simulate the degradation behaviors of the PEM fuel cell in the DEA and anode recirculation modes. We first used an artificial neural network (ANN) alone to predict the degradation behaviors of the PEM fuel cell, but the results were not satisfactory. Considering that both the initial value and the subsequent evolution of the PEM fuel cell performance change with the operating conditions, we predict them separately to improve the accuracy, using the multivariate polynomial regression (MPR) method and the ANN, respectively. Some previous works also reported that a reasonable combination of algorithms can achieve a higher accuracy [32]. Therefore, we propose an approach that combines the MPR method and the ANN, denoted M-ANN. The main idea is to use the MPR method to predict the initial value under different conditions and to use the ANN to predict the variation over time. The workflow of M-ANN for predicting the PEM fuel cell degradation process in the DEA mode is shown in Figure 1. We use a verified physical model to simulate the degradation characteristics of the PEM fuel cell in the DEA and anode recirculation modes, and the simulation data is used instead of experimental data. The potential of using experimental data with the proposed modeling method in the future is also discussed.

Data Acquisition
In this study, a mechanism model is employed to simulate the degradation processes of the PEM fuel cell in the DEA mode and the anode recirculation mode [21]. The quasi-two-dimensional transient model fully considers the transport processes in the electrodes and channels, including gas and liquid transport, water phase change, heat transfer, nitrogen crossover and electrochemical reactions. The performance evolutions simulated by the model are used as the data. The parameters of the simulated PEM fuel cell are listed in Table 1. The governing equations of the employed model are briefly introduced as follows, and more details of the transport coefficients and source terms can be found in Ref. [21].
One-dimensional mass transport along the channel is calculated. For the DEA mode, the liquid water and nitrogen accumulation in the anode channel must be considered. Nitrogen accumulates first at the dead end of the anode channel, and meanwhile the nitrogen concentration gradient along the channel also increases over time. Although convection generally dominates gas species transport in channels, species diffusion in the dead-ended channel still cannot be ignored. Gas species concentrations (hydrogen, oxygen, water vapor and nitrogen) and the liquid fraction in the channels are calculated from the corresponding conservation equations (see Ref. [21]), where u (m·s⁻¹) is the velocity; A (m²) is the area; c (mol·m⁻³) is the molar concentration; φ (mol·m⁻²·s⁻¹) is the diffusion flux; S (mol·m⁻²·s⁻¹) is the source term; N is the number of control volumes along the channel; L (m) is the channel length; J (mol·m⁻²·s⁻¹) is the convection flux along the channel; ρ (kg·m⁻³) is the density; M (kg·mol⁻¹) is the molar weight; s is the liquid volume fraction; d (m) is the channel height. Superscripts and subscripts: in and out represent the inlet and outlet of the control volume or channel, respectively; k − 1 and k represent the time step; [n] represents the control volume along the channel; i represents the gas species; CH represents the channel; CH-GDL represents the interface between the channel and the gas diffusion layer (GDL); act represents the activation; lq represents liquid water; w represents water. For the anode recirculation mode, liquid water in the channel can be quickly blown away and removed by the liquid-vapor separator. We therefore assume that the liquid fraction is fixed at zero in the anode channel, and Equation (3) can be ignored.
Mass and heat transfer in the electrodes are diffusion-dominated along the through-plane direction. Gas species, liquid saturation, membrane water content and temperature are calculated from the corresponding conservation equations, where δ (m) is the thickness; ε is the porosity; EW (kg·mol⁻¹) is the equivalent weight of the membrane; ω is the ionomer volume fraction of the catalyst layer (CL); λ is the membrane water content; C_p (J·mol⁻¹·K⁻¹) is the specific heat capacity; T (K) is the temperature. Superscripts and subscripts: p represents the components of the electrode, including the GDLs, micro-porous layers (MPLs) and CLs of the anode and cathode; m represents the membrane; mw represents the membrane water; T represents energy; eff represents the effective coefficient.
Nitrogen crossover from the anode to the cathode is calculated from the permeation relation, where K (mol·m⁻¹·s⁻¹·Pa⁻¹) is the permeability and p (Pa) is the pressure. Superscripts and subscripts: cro represents crossover; a and c represent the anode and cathode, respectively.
The electrochemical model represents the relationship between the current density and the output voltage. Its symbols include the area resistance and the universal gas constant, together with the following: I (A·m⁻²) is the current density; n is the number of electrons transferred; α is the transfer coefficient; F (96,485 C·mol⁻¹) is the Faraday constant. Superscripts and subscripts: rev represents reversible; 0 represents the reference value.
The numerical solution is implemented in in-house C++ code. The simulated performance transients in the DEA and anode recirculation modes were compared with experimental results, and good agreement was achieved, as reported in our previous work [14,21]. The employed model reflects the degradation process of the PEM fuel cell in the DEA and anode recirculation modes well, and thus the simulated data can be used as a substitute for experimental data to develop the data-driven model.

ANN Model Development
When a multilayer perceptron (MLP) has an appropriate activation function and a sufficient number of hidden layers and neurons, it can approximate any continuous function to arbitrary accuracy. The ANN used in this work is implemented as an MLP. The ANN architecture can be described as follows: the first layer, which feeds the input variables into the network, is the input layer; the final layer is the output layer; and all layers between the input layer and the output layer are hidden layers (as shown in Figure 1). All neural networks have an input layer and an output layer, but the number of hidden layers may vary.
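The layer structure described above can be illustrated with a minimal, dependency-free forward pass. This is only a sketch: the hidden width of 8, the initialisation scheme and the linear output layer are assumptions made for illustration, and the study itself builds its networks with PyTorch.

```python
import math
import random


def relu(x):
    """Rectified linear unit applied to a single value."""
    return x if x > 0.0 else 0.0


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer; weights[j] holds the input weights of neuron j."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (initialisation is an assumption)."""
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_mlp(layer_sizes, seed=0):
    """Create one (weights, biases) pair per layer transition."""
    rng = random.Random(seed)
    return [init_layer(a, b, rng) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]


def forward(mlp, inputs, activation=relu):
    """Hidden layers use the activation; the output layer is kept linear."""
    x = list(inputs)
    for idx, (weights, biases) in enumerate(mlp):
        is_output = idx == len(mlp) - 1
        x = dense(x, weights, biases, None if is_output else activation)
    return x


# 3 inputs (two operating conditions + time), two hidden layers, 1 output
mlp = build_mlp([3, 8, 8, 1])
prediction = forward(mlp, [0.5, 0.2, 0.1])
```

Changing the list passed to `build_mlp` (for example `[3, 8, 1]`) gives the one-hidden-layer variant; the untrained weights here only demonstrate the data flow, not a fitted model.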
In this study, two different ANN structures are explored and their performances are tested: an ANN with one hidden layer and an ANN with two hidden layers. The initial number of neurons in each hidden layer is less than 20 [33]. The exact number of neurons, the learning rate and the number of training epochs are determined by a hyperparameter selection procedure. The data set has 12 groups of complete time-series data, of which 8 groups are randomly assigned to the training set, 2 groups to the validation set, and 2 groups to the test set. The training set is used to train the ANN model, the validation set is used to select the hyperparameters and the best model, and the test set is used to check the generalization ability of the selected ANN model. The candidate number of neurons in each hidden layer ranges from 3 to 20, the candidate learning rates are 0.001, 0.004, 0.007, 0.01 and 0.04, and the number of training epochs is within 200. The code automatically saves the model whenever a better model is obtained during validation. The models are evaluated by the mean and maximum relative errors. The ANN is built and trained using the PyTorch library in Python.
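The group-wise split and the hyperparameter search space described above can be sketched as follows. The candidate values are taken from the text; the random seed, the function names and the assumption that the two hidden layers are searched over independent widths (a full Cartesian grid) are illustrative choices, not details confirmed by the source.

```python
import itertools
import random

# Values taken from the text
N_GROUPS = 12
NEURON_CHOICES = range(3, 21)  # 3-20 neurons per hidden layer
LEARNING_RATES = [0.001, 0.004, 0.007, 0.01, 0.04]
MAX_EPOCHS = 200


def split_groups(n_groups=N_GROUPS, n_train=8, n_val=2, seed=0):
    """Randomly assign whole time-series groups to train / validation / test."""
    ids = list(range(n_groups))
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


def hyperparameter_grid(n_hidden_layers):
    """Yield (hidden_sizes, learning_rate) pairs of the assumed full grid."""
    widths = itertools.product(NEURON_CHOICES, repeat=n_hidden_layers)
    for hidden_sizes, lr in itertools.product(widths, LEARNING_RATES):
        yield hidden_sizes, lr


def select_best(candidates, validation_error):
    """Keep the candidate with the lowest validation error."""
    best, best_err = None, float("inf")
    for cand in candidates:
        err = validation_error(cand)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err


train_ids, val_ids, test_ids = split_groups()
one_layer = list(hyperparameter_grid(1))
two_layer = list(hyperparameter_grid(2))
```

Under this full-grid assumption the one-hidden-layer search has 90 candidates and the two-hidden-layer search has 1620, i.e. 18 times as many, which is of the same order as the reported search times of about 4 h and 70 h. In practice `validation_error` would train each candidate for up to `MAX_EPOCHS` epochs and return its error on the validation groups.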
In the DEA mode, the PEM fuel cell works at constant voltage, and the current density changes over time. In this study, 12 processes of the PEM fuel cell in the DEA mode with different operating conditions (voltage and inlet pressure) are simulated (as shown in Table 2). The duration of each DEA process is 200 s, and the sampling time is 1 s. The input layer of the ANN therefore has three neurons, which in the DEA mode receive the following three parameters:
1. Voltage.
2. Inlet pressure.
3. Time.
In the anode recirculation mode, the PEM fuel cell works at constant current, and the voltage changes over time. In this study, 12 processes of the PEM fuel cell in the anode recirculation mode with different operating conditions (current density and anode stoichiometry) are simulated (as shown in Table 3). The duration of each anode recirculation process is 600 s, and the sampling time is 1 s. In the anode recirculation mode, the three input neurons receive the following three parameters:
1. Current density.
2. Anode stoichiometry.
3. Time.
The input values are normalized before being fed to the network. Normalization improves the training speed and numerical behavior of the ANN, so that the training error decreases smoothly and the accuracy of gradient estimation is improved.
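The text does not state which normalization is applied, so the sketch below assumes min-max scaling to [0, 1], with bounds fitted on the training rows only. The sample values are hypothetical and are not taken from Table 2.

```python
def fit_min_max(rows):
    """Return per-column (min, max) computed from the given rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, bounds):
    """Scale each column to [0, 1] using previously fitted bounds."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, (lo, hi) in zip(row, bounds)
        ])
    return scaled


# Hypothetical DEA-mode samples: (voltage, inlet pressure, time in s)
train_rows = [
    [0.60, 1.0, 0.0],
    [0.70, 1.5, 100.0],
    [0.80, 2.0, 200.0],
]
bounds = fit_min_max(train_rows)
normalized = apply_min_max(train_rows, bounds)
```

Reusing the bounds fitted on the training set when scaling validation and test rows keeps the three inputs on a comparable scale without leaking information from the held-out groups.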

Multivariate Polynomial Regression (MPR) Method
The MPR method is a regression analysis method that describes the polynomial relationship between a dependent variable (target) and several independent variables (inputs). Its greatest advantage is that it can approach the measured points by adding higher-order terms of the independent variables until the results are satisfactory. In this paper, the first to fifth order equations are considered for the MPR method. For two independent variables, the regression equations used to predict the initial values take the following forms:
First order equation: $y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2$
Second order equation: $y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_1^2 + \alpha_4 x_1 x_2 + \alpha_5 x_2^2$
In the DEA mode, y is the predicted value of the initial current density, and x_1 and x_2 are the normalized voltage and inlet pressure, respectively. In the anode recirculation mode, y is the predicted value of the initial voltage, and x_1 and x_2 are the normalized current density and anode stoichiometry, respectively. α_i (i = 0, 1, 2, 3, 4, 5) is the corresponding regression coefficient, which is calculated using statistical methods. After comparing the accuracy, we choose the third order equation of the MPR method for modeling. The MPR model is developed using the scikit-learn library in Python.
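To make the structure of the regression equations concrete, the sketch below enumerates the monomials of a two-variable polynomial of a given order and evaluates a prediction from given coefficients. The term ordering and function names are assumptions chosen for illustration; the source only states that scikit-learn is used and does not say which classes.

```python
def polynomial_exponents(order):
    """All exponent pairs (i, j) with i + j <= order for the terms x1**i * x2**j."""
    return [(total - j, j) for total in range(order + 1) for j in range(total + 1)]


def polynomial_features(x1, x2, order):
    """Expand two normalized inputs into the monomials used by the regression."""
    return [x1 ** i * x2 ** j for i, j in polynomial_exponents(order)]


def predict_initial_value(coefficients, x1, x2, order):
    """Evaluate y = sum(alpha_k * term_k) for already fitted coefficients."""
    terms = polynomial_features(x1, x2, order)
    return sum(a * t for a, t in zip(coefficients, terms))


# Number of regression coefficients for the first to fifth order equations
n_terms = {order: len(polynomial_exponents(order)) for order in range(1, 6)}
```

For two inputs the number of coefficients is (d + 1)(d + 2)/2 for order d, so the first to fifth order equations have 3, 6, 10, 15 and 21 coefficients. With 12 simulated processes, the fourth and fifth order equations would have more coefficients than initial-value samples, which is one practical consideration when comparing orders.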

Performance Decay Prediction of the DEA Mode
In this section, the performance decay prediction of the PEM fuel cell in the DEA mode is evaluated. Firstly, the prediction performance of the ANN and M-ANN is compared to select the better modeling method. Then, the MPR method is employed as the benchmark model to demonstrate the necessity of the ANN in modeling. Lastly, the effect of the number of hidden layers and of the activation function on the prediction performance is discussed. We introduce two errors to evaluate the prediction performance, the mean relative error δ_mean and the maximum relative error δ_max:
$\delta_{mean} = \frac{1}{n} \sum_{i=1}^{n} \frac{|\hat{I}_i - I_i|}{I_i} \times 100\%$
$\delta_{max} = \max_{1 \le i \le n} \frac{|\hat{I}_i - I_i|}{I_i} \times 100\%$
where n is the number of data points; $I_i$ (A·cm⁻²) is the true value of the current density; $\hat{I}_i$ (A·cm⁻²) is the predicted value of the current density.
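The two error measures can be computed as in the following sketch, which assumes the usual definitions (absolute deviation divided by the true value, expressed in percent). The sample current densities are hypothetical.

```python
def relative_errors(true_values, predicted_values):
    """Point-wise relative error in percent."""
    return [
        abs(pred - true) / abs(true) * 100.0
        for true, pred in zip(true_values, predicted_values)
    ]


def mean_relative_error(true_values, predicted_values):
    """Average of the point-wise relative errors (percent)."""
    errors = relative_errors(true_values, predicted_values)
    return sum(errors) / len(errors)


def max_relative_error(true_values, predicted_values):
    """Largest point-wise relative error (percent)."""
    return max(relative_errors(true_values, predicted_values))


# Hypothetical current densities in A·cm^-2
true_current = [1.00, 1.02, 1.01, 0.98]
pred_current = [1.01, 1.02, 1.00, 0.98]

delta_mean = mean_relative_error(true_current, pred_current)
delta_max = max_relative_error(true_current, pred_current)
```

The mean error summarizes the overall fit of a predicted curve, while the maximum error exposes the worst single time step, which is why both are reported together.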

Comparison of the ANN and M-ANN
In the DEA mode, the PEM fuel cell performance first increases due to the self-humidification effect and then gradually decreases due to nitrogen crossover and accumulation in the anode, as shown by the data of the 12 processes. More physical explanation can be found in our previous work [21]. The predictive values of the one-hidden-layer ANN and the one-hidden-layer M-ANN are compared with the true values of the current density evolution of the PEM fuel cell in the DEA mode in Figures 2 and 3, respectively. The corresponding comparisons for the two-hidden-layer ANN and the two-hidden-layer M-ANN are shown in Figures 4 and 5. In these figures, panels (a)-(h), (i,j) and (k,l) represent the training set, validation set and test set, respectively. The trend of the current density first increasing and then decreasing is predicted by the ANN-based models for all 12 processes. It is worth mentioning that obtaining the best one-hidden-layer ANN model through grid search takes around 4 h of computation on a small workstation, while obtaining the best two-hidden-layer ANN model takes around 70 h. The prediction performance, including the mean and maximum relative errors of the different modeling methods for the DEA mode, is listed in Table 4. The mean and maximum relative errors of the one-hidden-layer ANN are 0.725% and 1.563%, much larger than those of the one-hidden-layer M-ANN, 0.198% and 0.803%. The mean and maximum relative errors of the two-hidden-layer ANN are 0.341% and 0.932%, much larger than those of the two-hidden-layer M-ANN, 0.158% and 0.534%. Although the degradation process predicted by M-ANN includes both the error of the MPR model in predicting the initial current density and the error of the ANN model in predicting the current density change, the results show that the combined error is tolerable.
It can be seen that the curves predicted by the ANN deviate more obviously from the true curves than those predicted by M-ANN. The reasons why M-ANN obtains better prediction results than the ANN may be as follows:
1. The MPR method solves the problem of predicting the initial current density under various working conditions. Each operating condition has a different starting point for its current density due to different initial conditions, which is the reason for the poor effectiveness of the ANN in predicting the overall degraded current density. An ANN could also be used to predict the initial current density at various operating conditions. However, the initial current density dataset has only 12 samples, and it is well known that an ANN model needs a large amount of training data to give good results. Therefore, when the amount of data is small, the MPR method is the better choice.
2. The ANN solves the problem of predicting the current density change. After eliminating the influence of the initial point, the current density changes under various operating conditions are similar. However, this change is relatively complex: the current density not only rises and falls, but the magnitude and timing of the rise and fall are also uncertain. Therefore, with a large number of samples, a relatively complex ANN is more effective in learning the current density change.
Figure 6 shows the result of using the MPR method to predict the current density change. MPR-MPR means that MPR is used to predict the initial current density and MPR is then also used to predict the current density change. Comparing Figures 5 and 6, the ANN is better than MPR at predicting the current density change. Overall, this study decomposes the prediction of the degradation process into two different types of problems and assigns each problem to the algorithm suited to it, which exploits the complementary advantages of the two algorithms and results in better performance than using a single algorithm for the whole problem.
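The decomposition into an initial value and a change over time can be sketched as below. It assumes that the change is defined additively, as the difference between the value at time t and the initial value; the source does not state the exact definition, and the two toy models are hypothetical stand-ins for the fitted MPR and ANN models.

```python
def decompose(series):
    """Split one time series into its initial value and the change relative to it."""
    initial = series[0]
    return initial, [value - initial for value in series]


def recombine(initial, changes):
    """Rebuild the full curve from the initial value and the changes."""
    return [initial + delta for delta in changes]


def m_ann_predict(initial_model, change_model, conditions, times):
    """initial_model(conditions) gives the start value; change_model gives the change at t."""
    initial = initial_model(conditions)
    return [initial + change_model(conditions, t) for t in times]


def toy_initial_model(conditions):
    """Hypothetical stand-in for the MPR model (not fitted to any data)."""
    voltage, pressure = conditions
    return 2.0 - 1.5 * voltage + 0.1 * pressure


def toy_change_model(conditions, t):
    """Hypothetical stand-in for the ANN: rises first, then decays."""
    return 0.001 * t - 0.00001 * t * t


initial, changes = decompose([1.00, 1.20, 0.90])
rebuilt = recombine(initial, changes)
curve = m_ann_predict(toy_initial_model, toy_change_model, (0.6, 1.0), [0, 50, 200])
```

During training, `decompose` would supply the 12 initial values as targets for the regression and the change curves as targets for the network, so that the network no longer has to learn the offset between operating conditions.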

Comparison of Hidden Layer Numbers and Activation Functions
The predictive values of the one-hidden-layer ANN and the two-hidden-layer ANN are compared with the true values of the current density evolution for the PEM fuel cell in the DEA mode in Figures 2 and 4. The mean and maximum relative errors of the one-hidden-layer ANN are 0.725% and 1.563%, much larger than those of the two-hidden-layer ANN, 0.341% and 0.932%. The corresponding comparisons for the one-hidden-layer M-ANN and the two-hidden-layer M-ANN are shown in Figures 3 and 5. The mean and maximum relative errors of the one-hidden-layer M-ANN are 0.198% and 0.803%, slightly larger than those of the two-hidden-layer M-ANN, 0.158% and 0.534%. From these two comparisons, the two-hidden-layer network clearly performs better than the one-hidden-layer network, which is an expected result: because the two-hidden-layer neural network is more complex than the one-hidden-layer network, it can learn more complex relationships from the data, which leads to superior prediction results.
Figure 7 shows the comparison between the predictive values of the two-hidden-layer M-ANN with the sigmoid activation function and the true values of the current density evolution in the DEA mode. The mean and maximum relative errors of the two-hidden-layer M-ANN with the sigmoid activation function are 0.157% and 0.617%. Comparing Figures 5 and 7, it can be seen that the prediction curves obtained using the sigmoid activation function are smoother than those obtained using the rectified linear unit (ReLU) activation function.
Sigmoid activation function: $\sigma(x) = \frac{1}{1 + e^{-x}}$
ReLU activation function: $\mathrm{ReLU}(x) = \max(0, x)$
As can be seen from Figure 8, the ReLU activation function outputs 0 at x < 0 and x at x > 0. The abrupt change of slope at x = 0 is the reason for the unsmooth time-series output. The sigmoid activation function is smooth over the whole range, so its time-series output is also smooth. The ReLU activation function has widely replaced the sigmoid activation function in the field of deep learning for three reasons:
1. The sigmoid activation function is computationally more expensive than the ReLU activation function when the gradient is computed by the backpropagation algorithm.
2. For deep neural networks with the sigmoid activation function, the vanishing gradient problem can easily occur when the gradient is computed by the backpropagation algorithm.
3. The ReLU activation function can make the output of some neurons zero, which makes the neural network sparse, reduces the interdependence of parameters and alleviates the overfitting problem.
It can be seen from the results in this paper that, with a small number of neurons, the ReLU activation function makes the time-series results unsmooth. A smooth prediction curve is more representative of reality, and in this respect the sigmoid activation function has an advantage over the ReLU activation function, although in terms of the mean relative error there is almost no difference in performance between the two in this study.
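The difference in smoothness between the two activation functions can be checked numerically. The sketch below uses the standard definitions of the sigmoid and ReLU functions and estimates the one-sided slopes at x = 0 by finite differences; the step size is an arbitrary choice.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, smooth for all x."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Rectified linear unit, with a kink at x = 0."""
    return max(0.0, x)


def one_sided_slope(f, x, side, h=1e-6):
    """Finite-difference slope approaching x from the left (-1) or right (+1)."""
    return (f(x + side * h) - f(x)) / (side * h)


left_relu = one_sided_slope(relu, 0.0, -1)
right_relu = one_sided_slope(relu, 0.0, +1)
left_sigmoid = one_sided_slope(sigmoid, 0.0, -1)
right_sigmoid = one_sided_slope(sigmoid, 0.0, +1)
```

The ReLU slopes on the two sides of zero differ (0 versus 1), whereas the sigmoid slopes agree (both close to 0.25). A network with few ReLU neurons is therefore piecewise linear in its inputs, including time, with visible kinks, which is consistent with the unsmooth curves discussed above.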

Performance Decay Prediction of the Anode Recirculation Mode
Unlike the DEA mode, the independent variables in the anode recirculation mode are the current density, anode stoichiometry and time, and the dependent variable is the voltage. The operating conditions of the 12 processes in the anode recirculation mode are shown in Table 3. The predictive values of the two-hidden-layer M-ANN with the ReLU activation function and with the sigmoid activation function are compared with the true values of the voltage evolution for the PEM fuel cell in the anode recirculation mode in Figures 9 and 10. The mean and maximum relative errors of the two-hidden-layer M-ANN with the ReLU activation function are 0.143% and 0.458%. The mean and maximum relative errors of the two-hidden-layer M-ANN with the sigmoid activation function are 0.155% and 0.359%.
The independent and dependent variables are changed to explore whether M-ANN continues to perform well when different variables are used. The comparison between Tables 3 and 5 shows that M-ANN has strong robustness. Compared with the DEA mode, the performance evolution in the anode recirculation mode is simpler: the slope of the rise or fall remains almost constant, and the voltage changes little. However, the mean relative error in the anode recirculation mode is similar to that in the DEA mode. We therefore extended the number of training epochs to within 2000, but this did not yield a superior model. The most likely reason is that the prediction performance of the two-hidden-layer M-ANN has reached its limit. A three-hidden-layer M-ANN model might give better prediction results. However, the three-hidden-layer M-ANN is more complex than the two-hidden-layer M-ANN, so it would need more data to train the model to get better results.

Conclusions
In this study, we propose the M-ANN method to simulate the short-term degradation behaviors of the PEM fuel cell. Two cases of degradation data, the PEM fuel cell in the DEA and anode recirculation modes, are employed to train the model. The results show that the predictive performance of M-ANN is significantly better than that of the ANN or MPR alone. Firstly, the MPR method solves the problem of predicting the initial values under various working conditions, a problem that cannot be easily solved by the ANN because the sample size is extremely small. Secondly, the ANN solves the problem of predicting the variation of the cell performance during the degradation process, for which the MPR performs much worse than the ANN. We then investigate the effect of the number of hidden layers on the predictive performance of M-ANN and find that two hidden layers perform better than one. In addition, we compare the effects of two activation functions, ReLU and sigmoid, on the predictive performance of M-ANN. Although neither shows an obvious advantage over the other in terms of errors, the curves predicted using the sigmoid activation function are smoother, and therefore more realistic, than those predicted using the ReLU activation function. Overall, the M-ANN model shows good predictive performance in both cases, which indicates that M-ANN is broadly applicable to the short-term degradation or dynamic behaviors of the PEM fuel cell.

Conflicts of Interest:
The authors declare no conflict of interest.