Model-Based Data Driven Approach for Fault Identification in Proton Exchange Membrane Fuel Cell

This paper develops a model-based data driven algorithm for fault classification in proton exchange membrane fuel cells (PEMFCs). The proposed approach overcomes the drawbacks of voltage and current density assumptions in conventional model-based fault identification methods and the data limitations of existing data driven approaches. This is achieved by developing a 3D model of the fuel cell (FC) based on a semi-empirical model, an analytical representation of the electrochemical model, a thermal model, and an impedance model. The developed model is simulated for membrane drying and flooding faults in the PEMFC, and their effects are identified under varying temperature, pressure, and relative humidity. The ohmic, concentration, activation and cell voltage losses for the simulated faults are observed and processed with wavelet transforms for feature extraction. Furthermore, the support vector machine learning algorithm is adapted to develop the proposed fault classification approach. The performance of the developed classifier is tested on unknown data and evaluated through classification accuracy. The results showed 95.5% training efficiency and 98.6%


Introduction
Increasing carbon emissions and environmental resource issues are drawing the attention of engineers to various aspects of environmental improvement, either by solving legacy issues or by preventing future damage. Fuel cell development falls into the category of "preventing future damage", since this technology has many environmentally promising features. Relatively high efficiency, reduced harmful emissions, and potentially reduced dependence on fossil fuels are among the reasons for studying this field [1]. Subsequent interest in fuel cells has resulted in the use of various construction materials, fuel sources, and operating conditions, which have been developed for many stationary and mobile applications [2,3]. Among the different fuel cells, proton exchange membrane fuel cells (PEMFCs) are being widely adopted, especially in transportation applications [4].
Despite numerous applications and advantages, barriers related to reliability and durability are still a major concern in the commercialization of PEMFCs [5,6]. These metrics mainly depend on various faults during the operation of the PEMFC that affect the performance of the system. Hence, to achieve efficient operation and improve reliability, faults related to the operating condition of the PEMFC should be detected, and rectified or isolated in time. To do so,
Table 1. Faults and their effects on the operation of proton exchange membrane fuel cells (PEMFC).

Fault/Degradation Effect and Diagnosis
Degradation due to ageing
Effect of loss of electrochemical surface area on membrane degradation, catalyst layer corrosion [32]
(1) Gradual decrease in performance.
(2) Caused by design and assembly, material quality, and operating conditions.
(4) Degradation is not reversible.
(5) Improved material durability can essentially mitigate the degradation.
Degradation due to system operation
Inside fuel cell [32] Failure of membranes, catalyst layers, gas diffusion layers, bipolar plates.
(1) Caused by failure mode operation or fabrication process of fuel cell.

Reactants supply
Contamination [33] Partially blocked reaction sites due to contamination of reactants.
(1) Caused by impurities and air pollutants at the anode and cathode side, respectively. (2) Permanent failure or degradation of performance of PEMFC.
(1) Caused while operating PEMFC with high reactant pressures for higher output power. (2) High power loss in air compression.
(3) Can be controlled by operating and monitoring the PEMFC for normal pressure ratings.

Improper gas flow rates
Loss of active surface area of the catalyst, carbon support corrosion [31], cathode water flooding [35], membrane drying.
(1) Caused by low hydrogen flow rate and high oxygen flow rate. (2) Starvation fault due to low hydrogen flow rate.
(3) Decrease in membrane conductivity due to high oxygen flow rate. (4) More power consumption to achieve high air stoichiometry and low efficiency [36].
Heat management [20,21] Lower conductivity of membrane due to membrane dehydration.
(1) Caused by low and high temperatures.
(2) Reduced voltage output and flooding inside the cell due to low temperatures (3) Overheating damage to the membrane due to high temperature.

Water management Membrane drying [37] Hindrance to the access of protons to the catalyst surface due to dry membrane.
(2) Occurs at anode side due to water logging at cathode side. (3) Increased activation and ohmic losses. (4) Severe drying may cause irreversible membrane damage.
Flooding [37] Degraded fuel cell stack due to blocked reactant pathways.
(1) Caused by water accumulated in gas porosities of gas diffusion and catalyst layers. (2) Occurs at the cathode side.
Electric circuit [38] Ageing related degradation, concentration voltage loss, and melting of electrodes.
(1) Caused due to too high or too low voltages and load currents.
Of these faults, the effects of water management and temperature are considered crucial for healthy and reliable operation of the PEMFC. The modelling and operation of the PEMFC under these faults are discussed in the following sections.

3D Model of a Fuel Cell
The 3D FC model is based on the semi-empirical model, the analytical representation of the electrochemical model, the thermal model, and the impedance model. Initially, multiple nodes at critical zones, such as the cell center, gas inlet/outlet and boundary zones of the cell, are identified. A total of nine nodes are identified in a cell (MES-DEA single cell [39]), and the physical phenomena at these nodes vary depending upon the position of the nodes and external conditions such as temperature, pressure and humidity. Furthermore, in the dynamic state, the electrical and thermal domains are combined by the 3D model to measure the variation in temperature, pressure, and humidity at the nine nodes. The geometrical representation of the 3D FC model is shown in Figure 1. To ensure potential difference at each node, 20 resistors are used. As the temperature distributions and current density are closely related to different phenomena, the 3D model has the advantage of predicting various phenomena that occur inside an operating fuel cell. Furthermore, the modelling hypotheses for the 3D model are as follows: the influence of the membrane, cathode, and anode is not illustrated; the pressure drop is negligible in both anode and cathode catalytic sites; and the voltage drop due to activation loss is negligible at the anode. To characterize the fuel cell fault, only impedance magnitudes are considered.

Electrical Formulation
The modelling of the electrical phenomena at stack level is carried out by developing a dynamic model in the MATLAB/Simulink environment, as shown in Figure 2.
The model is based on thermodynamic and electrochemical characteristics, and takes as inputs the temperature ($T$), the hydrogen pressure $P_{H_2}$ and oxygen pressure $P_{O_2}$, the Nernst voltage, and the other (activation, ohmic, and concentration) losses [7,40]. This is expressed mathematically by the Nernst equation in (1), which represents the relation between the ideal standard potential $E_0 = 1.22$ V and the ideal equilibrium potential at different temperatures and pressures of products and reactants. This is used to calculate the no-load voltage $E$.
In the activation loss expression (2), $R$ is the gas constant, $\alpha$ is the electron transfer coefficient, and $F$ is the Faraday constant. The activation losses in (2) are observed in the low current region, which assumes variable parameters for current density. Furthermore, because of the internal electrical resistance $R_m$, the ohmic overvoltage is given by (3), where $t_m$ is a function of membrane thickness and $\sigma_m$ is a function of membrane resistivity.
Here, the ohmic losses increase with the increase in current, since the fuel cell resistance is constant. The relationship between concentration polarization and voltage loss is given in (4). The concentration polarization varies proportionally with the current density but becomes prominent for high limiting currents. This makes the flow of gas reactants to the reaction sites difficult [41].
To achieve the dynamic operation of the model, a capacitor is added. The cell voltage is therefore calculated from the Nernst voltage less the activation, ohmic, and concentration losses. Furthermore, the action of the double layer capacitor impacts the transient values of concentration polarization and stack activation. This impact of the double layer capacitor on the polarization curves can be modelled as a first order system, and the corresponding results are depicted in Figure 3.
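The loss terms described above can be combined in a short numerical sketch of a static polarization curve. Equations (1)-(4) are not reproduced in this text, so the forms below are common textbook expressions rather than the authors' exact model, and every parameter except $E_0 = 1.22$ V (the transfer coefficient `alpha`, exchange current density `i0`, area-specific resistance `r_area`, and limiting current density `i_lim`) is a hypothetical placeholder.

```python
import math

R = 8.314       # gas constant, J/(mol K)
F = 96485.0     # Faraday constant, C/mol
E0 = 1.22       # ideal standard potential quoted in the text, V


def nernst_voltage(T, p_h2, p_o2):
    """No-load voltage; textbook Nernst form (assumed), pressures in bar."""
    return E0 + (R * T) / (2.0 * F) * math.log(p_h2 * math.sqrt(p_o2))


def activation_loss(i, T, alpha=0.5, i0=1e-3):
    """Tafel-type activation loss; alpha and i0 are hypothetical values."""
    return (R * T) / (2.0 * alpha * F) * math.log(i / i0)


def ohmic_loss(i, r_area=0.2):
    """Ohmic loss for a constant area-specific resistance (hypothetical, ohm cm^2)."""
    return i * r_area


def concentration_loss(i, T, i_lim=1.5):
    """Concentration loss that grows sharply as i approaches i_lim (hypothetical)."""
    return -(R * T) / (2.0 * F) * math.log(1.0 - i / i_lim)


def cell_voltage(i, T=333.15, p_h2=1.0, p_o2=1.0):
    """Nernst voltage minus the three losses; valid for i0 < i < i_lim (A/cm^2)."""
    return (nernst_voltage(T, p_h2, p_o2)
            - activation_loss(i, T)
            - ohmic_loss(i)
            - concentration_loss(i, T))


if __name__ == "__main__":
    for i in (0.1, 0.5, 1.0, 1.4):
        print(f"i = {i:.1f} A/cm^2 -> V = {cell_voltage(i):.3f} V")
```

With these placeholder values the voltage falls monotonically with current density, and raising the inlet pressures raises the no-load voltage, which is the qualitative behaviour the text describes.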
Apart from the electrical formulation, the modelling of the thermal domain is performed considering the temperature of the stack, which is obtained using the empirical method, as depicted in Figure 4. Furthermore, the stack temperature can be represented as a function of electric current [7], with fitted coefficients $a = 38.27$, $b = 0.01032$, $c = −11.93$ and $d = −0.7182$.
Furthermore, the analytical expressions given above (1-7) are coupled with the voltage and temperature measurements using the Newton-Raphson (N-R) algorithm. This assumes a function $f = E − V$ (where $E$ corresponds to the calculated cell voltage at each node and $V$ is the measured voltage at each node), and calculates the distribution of current density that satisfies $f(x) = 0$.
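The N-R coupling described above can be illustrated for a single node. This is a minimal sketch, not the authors' Simulink implementation: `model_voltage` and its coefficients are hypothetical stand-ins for expressions (1)-(7), the "measured" voltage is synthetic, and the derivative is approximated by a central difference.

```python
import math


def model_voltage(i):
    """Hypothetical node voltage model E(i) in volts, i in A/cm^2."""
    return 1.22 - 0.0287 * math.log(i / 1e-3) - 0.2 * i


def newton_raphson_current(v_measured, i_start=0.5, tol=1e-9, max_iter=50):
    """Solve f(i) = E(i) - V = 0 for the current density i."""
    i = i_start
    h = 1e-6  # step for the numerical derivative
    for _ in range(max_iter):
        f = model_voltage(i) - v_measured
        if abs(f) < tol:
            return i
        # Central-difference estimate of df/di.
        df = (model_voltage(i + h) - model_voltage(i - h)) / (2.0 * h)
        # Newton update, clamped so the logarithm stays defined.
        i = max(i - f / df, 1e-5)
    raise RuntimeError("Newton-Raphson did not converge")


if __name__ == "__main__":
    v_meas = model_voltage(0.8)  # synthetic measurement at a known current density
    i_est = newton_raphson_current(v_meas)
    print(f"estimated current density: {i_est:.6f} A/cm^2")
```

Repeating this solve at each of the nine nodes, each with its own measured voltage, would give a current density distribution in the sense used in the text.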

Calibration of the Model
To perform fault diagnosis, the developed fuel cell model must be capable of operating in both healthy and faulty modes. The Simulink representation of the 3D PEMFC model is shown in Figure 5. Temperature, current density, and gas pressure are the inputs, while voltage and current are the outputs of the model. The red junctions in the circuit deal with the open circuit, activation loss, ohmic and concentration voltages. The important elements of the model are the connection resistors, which are used to simulate the faults in the system. Hence, the calibration of these resistances is important. To achieve accurate fault diagnosis, the developed model needs to generate accurate data for the healthy and faulty operations of the PEMFC. This is achieved by dividing the fuel cell stack into elementary cells. Furthermore, the temperature of each elementary cell is measured using different equivalent circuits.
The voltage drop magnitude is associated with the change in model parameters (open circuit voltage, losses in anode $R_a$ and cathode $R_a$, membrane $R_0$ loss, and double layer capacitance in anode and cathode) of the fuel cell. In addition to the above, the involvement of thermocouples and voltage sensors increases the resistance. This is mainly due to the irregular pressures at the connecting points, as any pressure above the threshold value will block the gas channel. This scenario can be represented by adding a series impedance at each node, which increases the voltage drop in the cell. Considering the action of resistances and impedances, when the 3D model is applied to one stack, the distribution of voltage and temperature is considered in the X, Y, and Z directions. For calibrating the 3D model, the impedances are calculated by determining the impedance of each cell, and the connecting resistors are calculated based on the known current density. In addition, the fuel cell is partitioned into separate branches along the X, Y, and Z axes, where the electric mode and the impedance are associated with each other. To simplify the calculations due to impedance, the resistance behavior of the impedance is assumed. Furthermore, based on the domination of MES cell, the variation of voltage and temperature in the X direction can be neglected. Hence, the changes in resistance and impedance that affect the voltage and current density due to varying temperature, pressure, humidity, and ageing effects make it possible to simulate healthy and faulty operating conditions of the PEMFC.

Operating Modes and Data Preparation
Generally, the faulty or degrading operation of the PEMFC can be observed either due to natural ageing (long time operation), or due to operational incidents (reactant starvation, contamination of the membrane electrode assembly, etc.). These modes indicate an abnormality in the operation of the FC through various conditions and result in performance loss. In this research, the faults in the fuel cell are classified into two groups: drying faults, and flooding at the anode and cathode side [42]. The flooding at the anode side is caused by a recondition process where the anode compartment is filled with deionized water, developing a water film. Here, the diffusion of hydrogen to the negative electrode in the cell is blocked by the water film, resulting in decreased cell voltage. Similarly, the flooding in the cathode is caused by excess water, which forms a water film blocking oxygen diffusion to the positive electrode. This also results in decreased cell voltage. Furthermore, the drying faults correspond to the drying of the membrane, which is caused by high temperatures [40]. During this process, holes are developed in the polymeric structure of the membrane, resulting in a fast reduction of voltage. The relation between temperature and relative humidity, which determines the dry or wet state of the membrane, is shown in Figure 6. It is observed that the humidity should be kept above 60% and below 100% to prevent excessive drying and flooding, respectively.
Further, the effects of the above discussed conditions in the fuel cell are simulated by introducing different zones in the calibrated 3D model of the fuel cell. The resistances and impedances in these zones are dependent on the humidity, temperature, and other operating conditions. Initially, the faults are injected by defining the input conditions, as shown in Table 2. In addition, the impedances connected across the branches correspond to the loss of connection between various cells in the X, Y, and Z directions. Altering the value of the impedances affects the cell current distribution, which aids in simulating various other faults. Furthermore, a DC load with harmonics identical to the harmonics in a DC/DC boost converter is connected to the FC to identify the mean value of voltage variation and the harmonic distortion rate. This arrangement is created to reproduce the real time operating condition of the PEMFC. Furthermore, the action of the above discussed faults on various characteristics of the fuel cell, with respect to the working conditions, is measured, and the results are discussed as follows.
As the temperature increases, the activation losses decrease, due to the Tafel constant. This impacts the increasing current density of the cell. In this condition, the voltage drop is nonlinear. The activation overvoltage for temperature change is shown in Figure 7a. In Figure 7b, the ohmic overvoltage for temperature change is observed. These losses are due to the ohmic resistance caused by the electrolyte, cell interconnects, and bipolar plates. The effect of temperature on the concentration voltage losses is shown in Figure 7c. These losses are caused by the consumption of the reactant at the electrode. Here, the temperature and losses are inversely proportional. Furthermore, the cell voltage loss for varying temperature is shown in Figure 7d. Generally, rising temperature has a positive effect on the operation and performance of the fuel cell, with reduced activation and concentration losses. However, higher temperatures yield only a very limited further improvement in cell performance and cause cell degradation, which may lead to early cell failure.
For the flooding fault, the effect of the hydrogen and oxygen pressure at the fuel cell inlet is measured. As the pressure of hydrogen and oxygen increases, the activation losses decrease and reduce the rising current density, as shown in Figure 8a. Furthermore, the increasing pressure reduces the concentration losses and improves the current density, as shown in Figure 8b. Generally, the increase in pressure at the inlet improves the fuel cell voltage, as shown in Figure 8c. For the normal operation of the cell, this pressure is maintained at 1 bar. During the flooding condition, the pressure at the inlets automatically increases, resulting in cell failure. Hence, efficient monitoring of the cell is necessary for its reliable operation.
Furthermore, the effect of relative humidity (RH) on the cell is observed, to analyze the drying condition. The increasing RH highly impacts the proton transfer, increases ohmic resistance and reduces conductivity, resulting in decreased power generation and efficiency. The effect of RH is identified for ohmic losses and cell voltage loss in Figure 9a

Classifier Development
A detailed layout of the complete fault classification for the PEMFC, considering drying and flooding faults, is shown in Figure 10. The data obtained from the simulation in Section 4 are used to develop the fault classifier for the PEMFC. Using the sampled waveforms directly for classification results in poor classifier performance. Hence, characteristic features are extracted from the data to increase the performance of the classifier. The obtained fuel cell characteristics correspond to transient faults and are available for a very short duration. Hence, the developed fault identification process should be fast and accurate to detect the transients. This can be achieved by identifying the fault characteristics appropriately and finding a suitable classifier for efficient training. To select a suitable classifier, it is necessary to find any or all of the sources, properties, and features of the data of the sampled waveforms.

Feature Extraction
Feature extraction is the process of extracting the properties and features of data (signal or image). Generally, features of a data set are classified as time-domain [43] and frequency-domain [44]. As the data used in this project are synthetic and represent variations between multiple quantities, both time and frequency spectra are considered. This is achieved through the wavelet transform (WT). The wavelet transform is widely studied in the literature for signal and image processing applications [45,46]. The WT operates by finding a set of basis functions to decompose the signal and extract its properties. The prototype wavelets of these basis functions are called mother wavelets. The basic wavelet functions are contracted, extended, and shifted versions of the mother wavelets [47]. In general, WTs are categorized as continuous wavelet transforms (CWT) and discrete wavelet transforms (DWT) [48]. Detailed explanations of wavelets for feature extraction are widely available in the literature [49,50]. The generalized mathematical depiction of the WT, which corresponds to the CWT, is shown in (8) [51].
Here, $x(t)$ is the unprocessed signal data, $a$ is the dilation or scaling parameter, $b$ is the translation parameter, $*$ is the complex conjugate symbol, $\psi_{a,b}(t)$ is calculated from $\psi(t)$, and $\psi(t)$ is the wavelet that has been chosen as the mother wavelet.
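For reference, the continuous wavelet transform is commonly written as follows, using the symbols defined above. Equation (8) is not reproduced in this text, so this is the standard textbook form rather than a verbatim copy of the source, and the name $W_x(a,b)$ for the transform coefficient is an assumption.

```latex
W_x(a,b) = \int_{-\infty}^{\infty} x(t)\,\psi^{*}_{a,b}(t)\,dt,
\qquad
\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right)
```

A large $|a|$ stretches the mother wavelet and captures slow variations, while a small $|a|$ compresses it and captures short transients; $b$ slides the wavelet along the time axis.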

Feature Classification
In addition to the features, the major aspect of developing a fault classification approach is the classifier. A brief overview of classifiers as data driven approaches for fault classification is given in Section 1. In this research, a support vector machine is used as the data driven approach for developing the fault classification approach.

Support Vector Machine (SVM)
SVM is a supervised learning method widely adopted for classification and regression problems [52]. The basic elements of SVM are support vectors and separator planes for separating data, and margins for creating upper bounds in separating the data. In general, SVM fits the separator to the data by maximizing the margin. This is achieved by labelling the training data with their respective classes, which, when trained, helps in good generalization and performance on the classification of new data. During the training process, the SVM maps the linear data into a feature space, and this feature space holds the characteristic solution to develop a separator plane using support vectors [53]. In the case of nonlinear data, kernel functions [54] are used to map the data into a high dimensional feature space. The data are mapped into the higher dimensional space using a non-linear transformation function $\Phi$, and then, in the feature space, the data can be linearly separated. The non-linear transformation is handled through a function called the kernel function. SVM is a good example of kernel methods that use the kernel trick, in which the inner product of the mapping function is replaced by a kernel function. Conventionally, the non-linear soft-margin SVM solves the primal optimization problem in (10).

However, to use the kernel trick, it transforms the primal optimization into the Lagrange dual optimization (2.3).
Here, $w$ corresponds to the hyperplane weight vector, $x_i$ corresponds to an observation vector, $y_i$ represents the classes to be labelled, $b$ corresponds to the bias parameter, $\xi_i$ represents the positive slack variables that enter the primal optimization problem, $\alpha = \{\alpha_1, \ldots, \alpha_N\}$ is the vector of Lagrangian multipliers, $1^T$ corresponds to the ones vector and $Q$ corresponds to an $N \times N$ matrix with $Q_{ij} = y_i y_j \Phi(x_i)^T \Phi(x_j)$. Furthermore, $Q$ is computed as an inner product of $\Phi(x_i)$ and $\Phi(x_j)$ without the knowledge of the function $\Phi(x)$. This is achieved with the help of a pre-defined kernel function, $K(x_i, x_j) = \Phi(x_i)^T \Phi(x_j)$, which measures the distance between the vectors $x_i$ and $x_j$. An overview of different kernel functions for SVM is given in Table 3 [55].
Table 3. Different kernel functions for learning of support vector machine (SVM).

Kernel Function Inner Product Kernel Type
Linear kernel
Furthermore, the SVM solves multi-class classification either by embedding it in the optimization problem or by decomposing the multiclass problem into binary classifications. The second way is widely adopted and includes methods like one-versus-one (OVO), directed acyclic graphs (DAGs), and one-versus-all (OVA). In OVO classification, also known as one against one, all binary combinations of classes are created. This means that, if $N$ different classes are available for the classification, then $N(N−1)/2$ classifiers are built. Furthermore, the DAG combines the outcomes of the OVO classifiers. In OVA, the samples of a specific class are deemed to be positive and the remaining samples negative. This leads to the generation of $N$ different classifiers.
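The kernel matrix and the multiclass decompositions described above can be sketched in a few lines. This is an illustrative sketch only: the Gaussian RBF form, the value of `gamma`, and all function names are assumptions made for the example, and the class labels are the three operating conditions named in this paper.

```python
import math
from itertools import combinations


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2); gamma is hypothetical."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def q_matrix(xs, ys, kernel=rbf_kernel):
    """Q[i][j] = y_i * y_j * K(x_i, x_j) for binary labels y in {-1, +1}."""
    n = len(xs)
    return [[ys[i] * ys[j] * kernel(xs[i], xs[j]) for j in range(n)]
            for i in range(n)]


def ovo_pairs(classes):
    """One-versus-one: one binary problem per unordered pair, N(N-1)/2 in total."""
    return list(combinations(classes, 2))


def ova_splits(classes):
    """One-versus-all: each class against all the others, N problems in total."""
    return [(c, [o for o in classes if o != c]) for c in classes]


if __name__ == "__main__":
    classes = ["normal", "drying", "flooding"]
    print(len(ovo_pairs(classes)), "OVO classifiers:", ovo_pairs(classes))
    print(len(ova_splits(classes)), "OVA classifiers")
```

For three classes both schemes happen to need three binary classifiers; the counts diverge as the number of classes grows (for example, six OVO classifiers against four OVA classifiers for four classes).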

Classifier Development
To develop a fault classification algorithm, three operating conditions of the PEMFC (normal, drying, and flooding) are considered. One condition corresponds to normal operation, and the other two correspond to the drying and flooding faults. The losses and cell voltage for the different operating conditions are simulated as per the conditions in the table, and the corresponding results were obtained as discussed in the section. As nine elementary points are considered in the development of a cell, the output characteristics for all the conditions are captured and feature extraction is performed as discussed in the section. For each elementary cell, nine different characteristics are plotted with respect to the current density of the cell. Hence, 81 characteristics are obtained for all the elementary cells, which are then subjected to time and frequency domain analysis using the wavelet transform. As the faults considered fall under the transient fault category, each time-domain signal is divided into 100 samples, and four different features are extracted for each sample. Since the simulated outputs are non-deterministic, the energy, entropy, power spectral density, and peak features are used to see how the signal is distributed over different time and frequency scales [56–58]. The extracted features for 8100 samples (81 outputs divided into 100 samples each) form a feature matrix of size 4 × 8100, where 4 is the number of features and 8100 is the number of samples for each feature. These 8100 feature vectors are labelled as normal (2700 feature vectors), drying (2700 feature vectors), and flooding (2700 feature vectors). Finally, the SVM classifier is trained on the feature matrix, along with its assigned classes, to develop a fault classification mechanism for the PEMFC.
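The feature extraction step can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this work: the wavelet family is not specified in the text, so a single-level Haar transform is assumed; the exact feature definitions are also assumptions (energy and Shannon entropy of the wavelet coefficients, the largest periodogram bin as the power spectral density feature, and the largest absolute value as the peak); the segment values are made up.

```python
import cmath
import math

# Bookkeeping taken from the text: 81 simulated outputs, 100 samples each,
# 4 features per sample, giving a 4 x 8100 feature matrix.
N_OUTPUTS = 81
SAMPLES_PER_OUTPUT = 100
N_FEATURES = 4
N_VECTORS = N_OUTPUTS * SAMPLES_PER_OUTPUT  # 8100


def haar_dwt(signal):
    """One level of the Haar wavelet transform (assumed wavelet, even length)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)


def shannon_entropy(values):
    """Shannon entropy of the normalized energy distribution of the values."""
    total = energy(values)
    if total == 0.0:
        return 0.0
    shares = [(v * v) / total for v in values if v * v > 0.0]
    return -sum(p * math.log(p) for p in shares)


def peak_psd(values):
    """Largest periodogram bin |X_k|^2 / n, computed with a naive DFT."""
    n = len(values)
    best = 0.0
    for k in range(n):
        x_k = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(values))
        best = max(best, abs(x_k) ** 2 / n)
    return best


def peak(values):
    """Largest absolute value in the segment."""
    return max(abs(v) for v in values)


def extract_features(segment):
    """Return [energy, entropy, PSD feature, peak] for one signal segment."""
    approx, detail = haar_dwt(segment)
    coeffs = approx + detail
    return [energy(coeffs), shannon_entropy(coeffs), peak_psd(segment), peak(segment)]


segment = [0.70, 0.68, 0.65, 0.66, 0.61, 0.60, 0.58, 0.55]  # made-up values
features = extract_features(segment)
print(len(features), N_FEATURES, N_VECTORS)
```

Stacking one such four-element vector per sample as a column yields the 4 × 8100 layout described above.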
In this situation, the data in the feature matrix are not linearly separable; hence, kernel functions are used. In this research, a Gaussian radial basis function (RBF) kernel is used, based on its performance during tests and trials. The classification is carried out in MATLAB, and the corresponding results are discussed in the figures. A detailed overview of the classifier parameters is given in Table 4.

Figure 11 shows the classification accuracy of the trained model in the form of a confusion matrix, with sample counts and percentages of samples. The accuracy is the ratio of the number of correctly classified labels to the total number of classified labels. Using accuracy as an evaluation metric places equal emphasis on prediction errors for all the classes. Figure 11a shows the number of correctly classified samples with respect to the predicted class. Out of all the samples, 363 are misclassified across the classes. Furthermore, the positive predictive value and false discovery rate in Figure 11b show that 1% and 2% of the total drying and flooding data are misclassified, respectively, whereas a total of 10% misclassification is seen for the normal data. This misclassification is due to the transient nature of the fault while simulating with the 3D model. The overall classification accuracy is observed to be 95.5%. Figure 12 shows the receiver operating characteristic (ROC) curves and the area under the curve (AUC) for the different events during classification. The ROC curves and their AUC values in the figures characterize the accuracy of the current classifier.
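The accuracy and false discovery rate definitions used above can be sketched in a few lines. The confusion matrix below is hypothetical and chosen only for illustration (rows are true classes, columns are predicted classes); it is not the matrix of Figure 11. The last step uses the counts stated in the text, assuming that "all the samples" refers to the 8100 feature vectors.

```python
def accuracy(cm):
    """Correctly classified samples (diagonal) divided by all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def false_discovery_rates(cm):
    """Per predicted class: share of the column that lies off the diagonal."""
    n = len(cm)
    rates = []
    for j in range(n):
        column_total = sum(cm[i][j] for i in range(n))
        wrong = column_total - cm[j][j]
        rates.append(0.0 if column_total == 0 else wrong / column_total)
    return rates


labels = ["normal", "drying", "flooding"]

# Hypothetical confusion matrix: rows = true class, columns = predicted class.
cm = [
    [90, 6, 4],
    [1, 98, 1],
    [2, 0, 98],
]

print("accuracy:", round(accuracy(cm), 4))
for name, fdr in zip(labels, false_discovery_rates(cm)):
    print(name, "false discovery rate:", round(fdr, 4))

# Counts stated in the text (assumed to cover all 8100 feature vectors).
total_samples = 8100
misclassified = 363
print("accuracy from counts:", round((total_samples - misclassified) / total_samples, 4))
```

With 363 of 8100 samples misclassified, 7737 are correct, which corresponds to an accuracy of about 0.955.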
Furthermore, the trained classifier is tested on an unknown data set, on which it achieved 98.6% accuracy. From the classifier results, it is observed that the trained classifier can efficiently classify unknown water management faults in PEMFC. Based on the fault classified, electrochemical or physical/chemical diagnostic tools can be employed to clear the fault.

Conclusions
A model-based data driven approach for fault classification in proton exchange membrane fuel cells is developed in this paper. The approach builds a 3D model of the PEMFC based on the semi-empirical model, the analytical representation of the electrochemical model, the thermal model, and the impedance model. The developed model is calibrated for the healthy and failure mode operation of the PEMFC. To achieve failure mode operation for fault classifier development, membrane drying faults and flooding faults in the PEMFC are simulated. The simulated faults are identified for varying temperature, pressure, and humidity, and analyzed through the activation, concentration, ohmic, and voltage losses. These data are taken as the baseline for developing the fault classifier. Furthermore, the wavelet transform is used to extract the features of the identified fault effects, and the support vector machine classifier is trained on the extracted features. To achieve accurate classification, the SVM is operated with a Gaussian radial basis kernel function under the one-versus-one multiclass method. The trained classifier showed 95.5% training efficiency and 98.6% testing efficiency for unknown data.

Author Contributions: The conceptualization, methodology, software development, visualization, analysis, data curation, and writing of the original draft were carried out by K.V.S.B. and M.A.K. The validation, investigation, resources, administration, and review and editing of the manuscript were carried out by F.B. and A.H. All authors have read and agreed to the published version of the manuscript.