A Self-Validating Method via the Uniﬁcation of Multiple Models for Consistent Parameter Identiﬁcation in PEM Fuel Cells

: Mathematical models are used for simulating the electrochemical phenomena of proton-exchange-membrane (PEM) fuel cells. They differ in the scale, modeling variables, precision in speciﬁc features, and the required parameters. Often, the input parameters are not measurable and need to be estimated by minimizing the error between the model output and experimental data; however, the estimated parameters could differ from one model to another, hence provoking uncertainty about the correct values and the model’s suitability for simulating the real phenomenon. To address these issues, we introduced a self-validating methodology using three different mathematical models: The ﬁrst set of parameters was estimated with a semi-empirical (SE) model; then, it was used for computing several points of the polarization curve (PC). The SE parameters and points were used to estimate a second set of parameters and to compute a single point of the PC with a macro-homogeneous (MH) model. The parameters and concentration proﬁles from the MH solution were used to estimate the last set of parameters with the reaction–convection–diffusion (SP-RCD) model, increasing the detail of the simulation. The SP-RCD parameters were returned to the MH model to recover the complete PC. The proposed methodology requires a few data points to consistently recover the same PC from the three models through estimating parameters in one model and validating them in the others. As output, the method provides complete information about several variables and the physical properties of the catalysts. In addition to the consistent simulation, the numerical results are consistent with those reported in the literature, thus validating the proposed method.


Introduction
Since the 20th century and the beginning of the 21st century, the generation of alternative clean technologies has been increasing to counteract the increase in pollutants and energy consumption due to the increasing global population [1]. The PEM fuel cells are clean devices that directly generate electrical energy, water, and heat through electrochemical reactions in their cathodic and anodic compartments, as shown in Figure 1. The electrochemical reactions in a PEMFC are represented in Equations (1) to (3) [2,3]: Anode reaction: Global reaction: The mathematical modeling of fuel cells allows us to study the energy conversion processes inside of a PEM fuel cell; analyze designs, structures, or material composition; and improve electrical energy generation. Several mathematical models have been implemented and validated through laboratory experimentation in the literature. Among the models, the following stand out: (a) Empirical and semi-empirical models: These models are based on algebraic expressions obtained from physical relations present in the PEMFC. They model the behavior of the polarization curves using a set of empirical or semiempirical parameters through non-linear regression methods [2,4,5]. (b) Interface models: These models' diffusion processes occur through the anode, membrane, and cathode using diffusion differential equations, assuming that the catalytic layer is negligible [6][7][8].
(c) Macro-homogeneous models: These consist of a set of non-linear ordinary differential equations based on the assumption that the main reactions occur in the catalytic layer of the cathode and analyze the physical compositions of the materials and their properties [9][10][11][12][13][14]. (d) Agglomerated models: These consist of partial differential equations, model diffusion processes, and involve a description of the physical composition of the cathodic catalytic layer, its porosity, or catalyst conglomerates [15][16][17].
In the modeling and simulation of PEMFC, mathematical models require a set of parameters that are not obtained experimentally, such as diffusion coefficients; hence, they need to be estimated [2,12,13,15]. The parameter estimation problem is an optimization problem with an objective function defined by the error minimization of the fitting experimental data and the model output [5,[18][19][20]. Algorithms for solving the parameter estimation problem in PEMFC are particle swarm optimization (PSO) [21,22], the artificial immune system [23], simulated annealing [2,18], the genetic algorithm (GA) [5,24], the RNA-genetic algorithm [19], the differential evolution algorithm [25,26], the artificial bee swarm [27], fuzzy logic control [28], harmonic search in [29], JAYA-NM [30], the Grasshopper Optimization Algorithm (GOA) [31], the Salp Swarm Optimizer (SSO) [32], moth-flame optimization [33], and UMDA G [34], among others. In general, most of them report acceptable solutions; nevertheless, they vary in the consistency of their results, that is, in delivering similar estimated parameters for the same conditions in several independent executions. UMDA G [34] has shown a reliable balance between the precision of the approximation and the computational cost and consistency. Consequently, it was used in this work for the estimation task.
In this context, the contributions of this article are various. First, we unified three high-performing models to demonstrate how to simulate the same cell with all of them; from our point of view, the main advantage of these multiple simulations is that the models corroborate that the parameters are the correct ones and deliver information about the cell performance, which is not possible to measure but is obtained from the simulation. Second, we estimated parameters in one model using a few experimental data, and then we used the other models to increase the precision of the estimation and to corroborate that the parameters reproduce the cell performance. Third, we reproduced incomplete experimental information. Once the parameters had been estimated, we were able to reproduce the full polarization curves and concentration profiles. Fourth, we presented the assembly of numerical techniques as a complete method for parameter estimation and multimodel simulation of the PEMFC.
The paper is organized as follows: Section 2 presents an overview of the three selected mathematical models; we describe their advantages and disadvantages. In Section 3, we present the parameter estimation problem. Our proposed self-validating methodology is presented in Section 4. In Section 5, we show the numerical and comparative results of our proposal with those reported in the literature. Finally, a conclusion of this work is presented in Section 6.

Mathematical Models for PEMFC
In this section, we describe the three mathematical models used: semi-empirical, macro-homogeneous, and SP-RCD models; each of them requires a different set of parameters but shares some of them, and each delivers different kinds of solutions that, according to the methodology introduced in this work, are interrelated for the consistent simulation of the PEMFC and the consistent estimation of parameters. Table 1 shows the nomenclature of the principal parameters for the three models.

Semi-Empirical Model
The semi-empirical model is described by algebraic expressions as functions of operational parameters that, through the estimation of so-called semi-empirical parameters, allow the fitting of a simulated polarization curve to experimental data [30,32,34]. The model uses operational parameters, such as the pressures of gases in the cell, the active area of the catalyst, or the thickness. Table 2 shows the algebraic expressions that can be used to obtain the PEMFC voltage in terms of the Nernst equation and the activation, ohmic and concentration overpotential losses, as shown in Figure 2. Numerical simulations are used to analyze the behavior under different operational conditions, for instance, if the membrane or pressures are changed.  [35]. V * = Voltage loss caused by mixed potential and crossover.
Mass loading of Pt per unit area (mg cm 2 ) A active cell area (cm 2 ) m C Mass loading of C per unit area (mg cm −2 ) R M equivalent membrane resistance (Ω) f Platinum mass fraction in the Pt/C catalyst ψ parameter related to the water in the PEM ρ Pt Platinum density (g cm −3 ) L c membrane thickness (cm) R C equivalent contact resistance (Ω) ξ i semi-empirical coefficient (i = 1, .., 4) The results of this model show the general characteristics of the PEMFC but may not be able to provide information about microscopic characteristics, such as density and platinum content in the catalytic layers [36,37]. The most commonly estimated set of parameters is θ = {ξ 1 , ξ 2 , ξ 3 , ξ 4 , R c , B, ψ} [23,25,30,34]. Table 2. Semi-empirical mathematical model; details in [4].
Concentration overpotential The advantages of this model are its adaptability to fit different experimental data with adequate parameter estimation, and it is computationally inexpensive. The main disadvantages are: (1) the result varies in its precision with the number of selected parameters to estimate (physical or semi-empirical parameters); (2) the obtained information only concerns the polarization curve and operational parameters, for instance, reactant gas pressures or active area of the catalysts. The common techniques used in the literature to estimate parameters in this type of model are PSO, GA, and JAYA, among others [22,24,30]. This model is suitable to approximate the complete polarization curve, as shown in Figure 2. These results can be used to feed another model that complements the information for a PEMFC.

Macro-Homogeneous Model
The macro-homogeneous model is defined by a system of non-linear ordinary differential equations that describes the oxygen concentration, overpotential, and current density through the catalyst layer in the cathode (CL), where the most important reaction of a PEM cell occurs, as shown in Figure 3. The model describes the catalyst layer as a one-dimensional problem in steady-state; its principal advantage is including the characteristics or physical properties of the materials in the catalyst layer, such as platinum density, carbon density, and porosities. The peculiarity of this model is its applicability at low or intermediate current densities. It produces unstable numerical results at high current densities, associated with losses due to concentration, generating spurious solutions [10,12,13].  Table 3. Algebraic expressions of the ordinary differential equations; details in [12,13].
The standard mathematical model is described by the following nonlinear boundary value problem [10,[12][13][14]: on Ω = [0, l c ], where l c is the thickness of the CL, subject to the boundary conditions: Numerical instabilities are produced due to large gradients in a small subdomain, causing the system of differential equations to behave similarly to singularly perturbed ordinary equations or systems of differential equations with dominant convection [38,39]. Due to this issue, when it is solved by a numerical method, it is necessary to create a fine mesh to capture the high gradients in the solution profiles, which requires high execution computational times to obtain a solution. Therefore, the main disadvantage is the computational time required to deliver stable solutions.
The numerical methods must include strategies for stabilizing the solution; otherwise, the estimated parameters could be incorrect.
The numerical techniques for solving this model are implicit Runge-Kutta methods for lower current densities and implicit Lobatto techniques for intermediate current densities [10,13]. These types of numerical methods are coupled with adaptive meshes and shooting methods to reach the unfixed boundary conditions [40]. In contrast to the semiempirical model, the macro-homogeneous model recovers the concentration losses region of the polarization curve-with the limitation of requiring a finer discretization and more operations for solving the ODE system-and has greater computational complexity.

SP-RCD Model
The SP-RCD mathematical model studied in [41] consists of a system of second-order differential equations. It was derived from the formulation of the macro-homogeneous model using Fick's law for the oxygen concentration [10,12,13], Ohm's law for the overpotential [42], and the concept of a second-order formulation for the current density in [43]. These formulations result in the coupled non-linear second-order differential system given in the following: subject to the following boundary conditions: In Table 3, we present the algebraic equations required to define the model. The model describes the physical properties of the cathodic catalyst layer. It was solved by coupling the finite element method and the θ method or the adaptive Shishkin mesh as a function of the derivative of the current density [41]. This combination of methods showed similar results with a shorter computation time to those obtained by the macro-homogeneous model. The principal advantage of this model is that it reports numerically stable solutions with coarse meshes. The principal disadvantage is that, in its initial formulation, the SP-RCD model requires a priori information for the boundary conditions. To solve the parameter estimation problem, we need information about the oxygen concentration profiles, overpotential, and current density, which are difficult to obtain experimentally. However, these boundary conditions or profiles can be obtained or approximated by the results of the macro-homogeneous model for a given value of current density I δ .

Parameter Estimation Problem
In a parameter estimation problem, regardless of the algorithm to be used, it is necessary to define a function that is minimized or maximized. This function is called the objective function. For PEMFC, different researchers apply various objective functions; see [44]. In this work, given a set of unknown parameters θ, we defined the objective function, SSE : Θ → R + , as the sum of the squared error between the experimental data of the polarization curve and the output of the model: where N is the number of experimental data; V FC j and V z j ; θ are the j-th experimental datum and the output of the model, respectively, at point z j . The problem consists of determining the set of parameters θ ∈ Θ that returns a minimum value for SSE( θ).

Self-Validating Methodology
In Section 2, we explained the benefits of the three selected mathematical models. In this section, we propose a methodology to unify the models using the advantages of each model to solve the disadvantages of the others. An advantage of our proposal is that it works with a few experimental data, since the three models complement and provide information to one another.
The algorithm begins with an experimental data set of a polarization curve and returns a set of estimated parameters and the fitting of the polarization curve, following steps of the scheme in Figure 4. In this initial step, it is necessary to declare the set of parameters θ = { θ SE , θ MH , θ RCD } to be estimated. Using the semi-empirical model, we obtained a complete polarization curve, including the concentration losses region. The SE-parameter estimation fitted the model's output to the experimental data, providing the first set of estimated parameters θ SE . As we explain in Section 2.1, this model is incapable of describing the specific physical properties of the catalyst layer.
The macro-homogeneous model describes the missing characteristics that the semiempiric model is unable to describe; we propose to use a part of the polarization curve simulated by the semi-empiric model, close to the activation losses region, to sample a set of points, called S SE , which was used to estimate a second set of parameters θ MH from the macro-homogeneous model. It is important to note that in this region, the macrohomogeneous model is well-posed and provides correct solutions, as noted in [13]. With a set of solutions from the macro-homogeneous model corresponding to one point from S SE , the oxygen concentration profiles, overpotentials, and current densities were obtained. It is clear that the estimation of the macro-homogeneous model was correct for the points S SE , but it could not be correct for the complete polarization curve. To improve this, we used the SP-RCD model. Then, we generated a set of sample points named S MH from the obtained oxygen concentration, overpotential, and current density profiles and used them as data for the parameter estimation with the SP-RCD model. In this step, we re-estimated the previously estimated parameters, θ MH , and defined θ RCD as the set of boundary conditions found by the shooting method. Finally, with the new re-estimated parameters θ MH,RCD = { θ MH , θ RCD }, we reproduced the complete polarization curve with the macro-homogeneous model. This methodology is independent of the parameter estimation problem algorithm and is described as an algorithm that follows the steps below (graphs in Figures 5 and 6), which should be followed in the same way as the methodology steps: Step 1. We used the semi-empirical model to fit the experimental polarization curve, as shown in Figure 5-Step 1. With this fit, the first set of estimated parameters, θ SE , was obtained. More information on the detailed estimation method and its implementation can be found in [34]. • Step 2. We sampled a set of N 1 points, named S SE , of the polarization curve fitted by the semi-empirical model in the zone of activation losses, as shown by the green points in Figure 5-Step 2; the example uses N 1 = 6. • Step 3. We fit the macro-homogeneous model to the sampled points, S SE , as shown by the black line in Figure 5-Step 3; more information on numerical methods to solve the model can be found in [13]. A first approximation of the selected parameters of the MH model θ MH was obtained, as well as the oxygen concentration, overpotential, and current density profiles. • Step 4. We selected one point (green point in Figure 5-Step 4); used the associated set of profiles obtained by the macro-homogeneous model for oxygen concentration, overpotential, and current density; and sampled N 2 points from these profiles, S MH . The sampled points from the profiles are shown in Figure 6a,c,e. In this example, we used N 2 = 12.

•
Step 5. We fit the SP-RCD model to the sampled points S MH ; then, we obtained estimated parameters θ MH,RCD = { θ MH , θ RCD }. In this process, a second and more precise approximation of the oxygen concentration, overpotential, and current density profiles was obtained, as shown in Figure 6b,d,f. The model was solved using a non-linear finite element formulation with an adaptive Shishkin mesh. Details of the implementation of the SP-RCD model are presented in [41]. • Step 6. We generated the polarization curve by feeding the MH model with the set of estimated parameters θ MH,RCD = { θ MH , θ RCD } using the SP-RCD, as shown by the magenta continuous line in Figure 5-Step 6. • RETURN. The estimated set, θ, and the simulated polarization curves were obtained by the semi-empirical and macro-homogeneous models using the estimated parameters from the SP-RCD model. The proposed methodology is a self-validating strategy as the principal objective of the coupling of the three mathematical models is to generate simulated polarization curves to describe the experimental data of a PEMFC.

Results
To validate the proposal, we used two experimental datasets of PEMFC. General information about the reference initial experimental PEMFC can be found in [45]; the authors used Nafion ® 117 with humidified air and prototech-brand electrodes (20 wt. % Pt) onto which they sputtered platinum with an equivalent thickness of 50 nm, corresponding to a total Pt loading of 0.45 mg/cm 2 . Complementary base operating conditions used by [12,13] are presented in Table 4. Table 4. Parameter values used in the baseline calculation, taken from [12,13]. Table 3 A s = Eq. in Table 3 i 0 = Eq. in Table 3 The optimization algorithm to minimize SSE( θ) is an Estimation of the Distribution Algorithm (EDA) named the Univariate Marginal Distribution Algorithm with Gaussian models (UMDA G ) [46][47][48][49][50][51]. It is a stochastic algorithm that computes, in each iteration, the l-dimensional joint probability function, factored as a product of l univariate and independent probability functions. This probability function evolves until a stopping criterion is reached. Details of this algorithm and its implementation are shown in [34,51]. The UMDA G uses a population of 200 individuals and 1000 generations with a tolerance of 1 × 10 −7 , for the semi-empirical model. For the macro-homogeneous and the SP-RCD models, it uses a population of 100 individuals and 500 generations with a tolerance of 1 × 10 −7 . In the context of evolutionary algorithms, the number of individuals in the population is usually considered dependent on the number of parameters to estimate. Thus, for the semi-empirical model, we needed to estimate nine parameters; for the macrohomogeneous model, we needed to estimate three parameters; and for the SP-RCD model, we re-estimated six parameters used in the macro-homogeneous model. Then, the search space for the semi-empirical model should have included more individuals to reach an adequate approximation; meanwhile, the other two models required fewer individuals.

Dataset 1
We carried out simulations with Dataset 1 (Table 4) and selected 8 points of the polarization curve, which was necessary to obtain the profile of the curvature of the reported polarization curve in [12]. For the semi-empirical model, we selected the set of parameters θ SE = {ξ 1 , ξ 3 , ξ 4 , R C , B, ψ, A, L c , J max }. The search space to fit the semiempirical model with upper and lower bounds is presented in Table 5. These values are based on previous research [2,23,30,34,37].
With the UMDA G , we obtained the estimated parameters for the set θ SE , as presented in Table 6-Dataset 1, and we generated the simulated initial polarization curve Γ SE , as shown by the dashed blue line in Figure 7a. Next, we sampled three points on Γ SE to obtain S SE into the activation losses region, as shown by the green points in Figure 7a, and we used them to obtain the set of estimated parameters θ MH = {m pt , p pt , c } with the macro-homogeneous model using the search space presented in Table 7, based on [12][13][14]; the results are reported in Table 8-Dataset 1, in which we also present the estimated boundary conditions obtained by a shooting method. Table 5. Upper and lower bounds for parameter estimation for the semi-empirical model.   With the results of the macro-homogeneous model, we selected information from the oxygen concentration, overpotential, and current density profiles corresponding to I δ = 0.05 Acm −2 ; we sampled N 2 = 10 uniform points on the profiles and carried out a parameter estimation with the SP-RCD model using the set of parameters θ MH,RCD with θ RCD = {O 2 (0), η(0), η(l c )}; the search limits were obtained by perturbing 5% of the estimated values reached with the shooting method in the previous step using the macrohomogeneous model. In Table 8-Dataset 1, we show the estimated values compared to those reported in the literature and to the previously estimated values from the macrohomogeneous model. In Figure 7a, the continuous magenta line represents the simulated polarization curve generated with the macro-homogeneous model using the estimated parameters of the SP-RCD model; the simulated polarization curves are similar and closer together, producing self-validating results.

Dataset 2
A second dataset was used with the baseline parameters in Table 4-Dataset 2; we selected eight points of the experimental polarization curve, and the reported experimental data are presented in [13]. For the semi-empirical model, we used the same set of parameters and the search space in Table 2, except for I max ; we increased the interval to [0.7, 1.8] following the reported values in [18] to preserve the profile of the experimental polarization curve. The estimations are presented in Table 6-Dataset 2. In Figure 7b, the dashed blue line represents the simulated polarization curve Γ SE,2 obtained from the semi-empirical model. We sampled N 1 = 3 points on Γ SE,2 to obtain S SE , as shown by the green points in Figure 7b. For the macro-homogeneous model, we estimated the set of parameters θ MH = {m pt , p pt , c }, similar to that for Dataset 1. We report the estimated parameters in Table 8-Dataset 2. For parameter estimation with the SP-RCD model, we selected the profiles of oxygen concentration, overpotential, and current density corresponding to I δ = 0.21 Acm −2 ; we sampled N 2 = 10 uniform points on the profiles, overpotential, and current density. In the estimation process, we included θ RCD = {O 2 (0),η(0), η(l c )}. We compared the estimated values with those reported in [13], and with estimated values from the macro-homogeneous model in Table 8-Dataset 2. We present the simulated polarization curve generated by the macro-homogeneous model using the estimated values with SP-RCD in Figure 7b, as shown by the continuous magenta line.
The results of the numerical experiments in this work provide us with coherent and similar results to those obtained in previous works [12,13,34] (see Tables 6 and 8). The simulated polarization curves with the semi-empirical or macro-homogeneous models using the estimated parameters of the SP-RCD are similar and close together, showing that our methodology is adequate to describe PEMFC performances.

Conclusions
In this article, we described a proposal of a self-validating methodology using three different mathematical models: semi-empirical (SE), macro-homogeneous (MH), and sin-gularly perturbed reaction-convection-diffusion (SP-RCD). The set of linked parameter estimation problems, solved by the UMDA G , uses the benefits and advantages of each model to circumvent the disadvantages of the others. With the proposal in this work, it is possible to analyze the performance of a PEMFC, from the operational and construction parameters to the structural and composition parameters in the catalyst layer. The semi-empirical model describes the operational and construction parameters, such as temperature, pressure, and membrane properties; meanwhile, the macro-homogeneous and the SP-RCD models are useful to determine the composition parameters, such as platinum mass loading m Pt . The coupling of the models serves to take advantage of the polarization curve region that is better fitted by each one; for instance, the SE model reproduces the catalyst activation region; then, the MH takes the SE output to produce a gross approximation of the boundary conditions of the concentration profiles and a set of parameters. The actual boundary conditions and parameters are re-estimated with the SP-RCD model, producing more precise solutions. This last precise solution is inputted into the MH model to obtain a correction of the polarization curve.
The results are comparable with those reported in the literature; the simulated polarization curves of the semi-empirical and macro-homogeneous models obtained using the estimated parameters of the SP-RCD are similar and close together, validating the approximations between the models. The power of the method is to estimate parameters that are difficult or impossible to measure, starting from experimental information. Nevertheless, once we estimate the parameters that reproduce the experimental data, one can change a few parameters to describe the performance of a different design. Furthermore, one can use the model to optimize the performance of a PEMFC by testing parameters in a physically plausible interval of values. This strategy equips the practitioners with complementary information about parameters that are difficult to measure experimentally in a laboratory and increases the confidence of the estimated parameters that consistently reproduce the experimental data. Future work contemplates the optimization of the parameters. Notice that a first step to optimizing a PEMFC is to have a reliable numerical simulation that reproduces the actual polarization curve and concentration species profiles. Once the adequate parameters for the models have been determined by means of the proposed methodology, it is possible to vary any parameter, not only the estimated ones, to find an optimized value for improving the cell performance or cost. For example, it could be the platinum mass loading m Pt , for which optimization could reduce the cost while maintaining the performance of the original design, or the operational parameters to increase the electricity generation with the same design.