Evolutionary Design Optimization of an Alkaline Water Electrolysis Cell for Hydrogen Production

Featured Application: This article provides the strategy to ﬁnd the best cell design and operating condition of a diphasic electrochemical cell using an evolutionary algorithm. Abstract: Hydrogen is an excellent energy source for long-term storage and free of greenhouse gases. However, its high production cost remains an obstacle to its advancement. The two main parameters contributing to the high cost include the cost of electricity and the cost of initial ﬁnancial investment. It is possible to reduce the latter by the optimization of system design and operation conditions, allowing the reduction of the cell voltage. Because the CAPEX (initial cost divided by total hydrogen production of the electrolyzer) decreases according to current density but the OPEX (operating cost depending on the cell voltage) increases depending on the current density, there exists an optimal current density. In this paper, a genetic algorithm has been developed to ﬁnd the optimal evolution parameters and to determine an optimum electrolyzer design. The optimal current density has been increased by 10% and the hydrogen cost has been decreased by 1%.


Introduction
Mitigation of the climate change consequences is the main global challenge in the 21 century. The different IPCC (Intergovernmental Panel on Climate Change) reports stated that the climate change is mainly due to greenhouse gas emissions to the atmosphere produced by human activities. The IPCC recommends in their scenarios the decrease of energy consumption, the increase of renewable production, and sequestration of harmful CO 2 produced by the industrial activities including the production of cement and steel. The renewable energies generated from various systems such as wind turbines and photovoltaic panels do not produce greenhouse gases but suffer from time intermittency E rev is the reversible voltage in V, η act is the activation overvoltage in V, R and R memb respectively the electrolyte resistance and membrane resistance in Ω cm 2 and, j the current density in A m −2 .
The E rev voltage represents the minimum thermodynamic cell voltage. This value is around 1.23 V at atmospheric pressure at 25 • C. Increasing temperature and decreasing the pressure decrease this minimum cell voltage. The majority of previous studies have focused on finding a cheap, robust and electroactive electrode material or electrocatalyst to decrease the activation overpotential, which are the second and third terms in Equation (1). There is also another research trend aiming at increasing efficiency, which is to reduce an ohmic resistance and surface coverage through electrolyzer design by means of simulations or experiments. Previous studies modeled the void fraction and velocity distribution [1][2][3][4][5] or the cell voltage [6]. There exist only few studies about optimization of this process. Villagra et al. [7] showed that there exists an optimum current density for a given design of PEM electrolyzer. Bensmann et al. [8] demonstrated that there exists an optimum of pressure for a given PEM electrolyzer.
The goal of this study is to find the optimal design and the current density for an alkaline electrolysis cell. A meta-model has been developed to simplify the physics of the void fraction and a genetic algorithm is used to find the optimal design. The optimization problem uses a complex objective function involving multiple, coupled and non-linear variables. The optimization of such a function is difficult because there are no finite time resolution algorithms. The problem is defined to be NP-complete and cannot be solved with classical methods such as the gradient descent [9]. For this reason, artificial evolution techniques (evolutionary algorithms) have been explored in the design of the electrolyzer. This paper begins with the theory, physics and economy behind the alkaline water electrolysis cell. Then, the set-up, adjustment of parameters according to the problem features and validation of the genetic algorithm are highlighted. Finally, the algorithm is used to find the optimal design of the alkaline water electrolysis.

Theory
An alkaline water electrolysis cell produces hydrogen and oxygen using electric energy via an electrochemical process. The Figure 1 presents a classical water electrolysis cell. It consists of a positive pole called the anode (where the oxidation of hydroxyl ions occurs Equation (2)), a negative pole called the cathode (where the reduction of water occurs Equation (3)) and a membrane (brown part) that prevents the mixing of hydrogen and oxygen.

Theory
An alkaline water electrolysis cell produces hydrogen and oxygen using electric energy via an electrochemical process. The Figure 1 presents a classical water electrolysis cell. It consists of a positive pole called the anode (where the oxidation of hydroxyl ions occurs Equation (2)), a negative pole called the cathode (where the reduction of water occurs Equation (3)) and a membrane (brown part) that prevents the mixing of hydrogen and oxygen.
The reactions of Equations (2) and (3) occur at alkaline pH (>7). Thus, KOH or NaOH salts must be added. The classical operating conditions of alkaline electrolyzer are: Temperature around 80 °C, atmospheric pressure and 0.30 KOH. In the next paragraph, we will explain why those choices have been made. As explained in the introduction section, the minimum voltage that must be applied to a water electrolysis cell is called reversible voltage. Hammoudi et al. [10] developed a model (Appendix A) that simulates the temperature, pressure, and mass fraction sensitivity of the reversible voltage. The reversible voltage decreases with temperature while mass fraction and the pressure increase it.
The ohmic electrolyte resistance is depending on the electrolyte conductivity , the diphasic boundary thickness and the void fraction , see Equation (4) [11].  The reactions of Equations (2) and (3) occur at alkaline pH (>7). Thus, KOH or NaOH salts must be added. The classical operating conditions of alkaline electrolyzer are: Temperature around 80 • C, atmospheric pressure and 0.30 KOH. In the next paragraph, we will explain why those choices have been made.
As explained in the introduction section, the minimum voltage that must be applied to a water electrolysis cell is called reversible voltage. Hammoudi et al. [10] developed a model (Appendix A) that simulates the temperature, pressure, and mass fraction sensitivity of the reversible voltage. The reversible voltage decreases with temperature while mass fraction and the pressure increase it.
The ohmic electrolyte resistance R is depending on the electrolyte conductivity σ, the diphasic boundary thickness δ and the void fraction ε, see Equation (4) [11].

of 28
With h the thickness of the electrolyte (catholyte or anolyte) in m, σ the electrical conductivity in S m −1 .
Hine et al. [12] showed the electrical conductivity dependency on void fraction can be simulated using the Bruggeman's correlation Equation (5).
The Equation (6) can be simplified by assuming that the gas is dispersed throughout the electrolyte thickness.
Thus, ohmic resistance increases with void fraction. The quantity of gas injected in the electrolyte is calculated via Faraday's law Equation (7). According to the Equation (8), the pressure decreases the volumetric hydrogen flow and thus the void fraction. Moreover, Vogt et al. [13] and Jannsen et al. [14] showed that the pressure decreases the detachment bubble radius, thus decreasing the surface coverage. Gilliam et al. [15] and See et al. [16] measured the electrical conductivity depending on the KOH mass fraction and temperature. They showed that there exists an optimum KOH mass fraction between 0.2 and 0.4 that depends on the temperature. The different correlations for electrical conductivity, viscosity, density of KOH and NaOH have been summarized by Le Bideau et al. [17]. The temperature increases the efficiency of the process but a high temperature and an extremely alkaline (pH > 14) condition led to material deterioration and electrolyte boiling. Thus, the temperature should not exceed 100 • C.
Q vH 2 the volumetric hydrogen flow m 3 s −1 , S the electrode surface in m 2 , v mol the molar volume m 3 mol −1 , F = 96,500 C mol −1 the Faraday's constant in C mol −1 , R G = 8.314 J kg K −1 the ideal gas constant in J mol −1 K −1 , T the temperature in K, p the pressure in Pa.
Therefore, this analysis explained the choice of mass fraction and temperature but not pressure choice. Actually, high pressure increases the manufacturing cost. However, according to Grigoriev et al. [18], the aim of pressure level for the new alkaline electrolyzer is 30 bar.
According to Equation (6), it would be tempting to decrease as much as possible the electrolyte thickness. However, Nagai et al. [19][20][21] observed in multiple studies that there exists an optimum of electrolyte thickness. Indeed, in an extremely narrow electrolyte the bubbles cannot leave fast enough to prevent bubble accumulation.
The activation overpotential can be modeled with different mathematical formalism depending on the current density. For the industrial current density, the Tafel law is used, see Equation (9).
With a and b the Tafel parameter in respectively V and V dec −1 and θ the surface coverage ratio. The parameter a and b depend on the electrode material and electrocatalyst. The surface coverage depends on a lot of parameters (temperature, pressure, mass fraction, current density, presence or absence of electrolyte flow). However, there is no numerical model that can simulate the surface coverage ratio depending on all these parameters. Vogt et al. [22] developed a model that roughly estimates the surface coverage depending on current density Equation (10).
Appl. Sci. 2020, 10, 8425 5 of 28 Using the data from Jannsen et al. [14], the correlation that gives the bubble radius according to the pressure is given by Equation (11).
r 0 the mean bubble radius at atmospheric pressure in m.
The pressure, current density and temperature sensitivities of cell voltage have been given. One of the key parameters is the void fraction. This parameter depends on several parameters. The sensitivities will be explained in the next section.

Finite Volume Model (CFD)
The void fraction can be calculated using numerical simulations. A model has been developed and validated by Le Bideau et al. [1] in a previous study. The following hypothesis have been chosen:

•
The flow is Newtonian, viscous and incompressible; • the flow is considered isothermal; • ions distributions are neglected; • the flow is considered laminar; • bubble diameter is constant for a given operating condition; and • The current density is taken as constant.
For more explanations about the hypotheses, see Le Bideau et al. [1]. The model is a two-fluid Eulerian model Equations (12)- (16).
ε is the gas or liquid fraction, the subscript k can be either g (O 2 , H 2 ) or liq, ρ is the density in kg m −3 , V the velocity in m s −1 , and S k is the term source in kg m −3 s −1 p is the pressure in Pa, = τ is the stress tensor in Pa, → g the gravitational acceleration in m s −2 , and → F k is the exchange term in N m −3 The stress tensor is written as follows: with µ k the viscosity of the phase k in Pa s and I the unit tensor.
Equation (15) is the exchange term. It is the sum of the usual drag force and lift force. The study of Le Bideau et al. [1] showed that it was mandatory to take into account the bubble diffusion force Appl. Sci. 2020, 10, 8425 6 of 28 ( → F BD ) to reproduce the experimental data. The coefficient K g has been used to fit the experimental data of Boissonneau et al. [23].

Artificial Neural Network (ANN)
These four dimensionless numbers (Equations (21)-(24)) have been used to correlate the void fraction using an artificial neural network (ANN). Appendix B shows the design of experiments used to train the ANN. The numerical search field is summarized in Table 1. This numerical search field allows the finding an optimum design for an average current density j in [  The dimensionless numbers have been reduced using Equation (21).
With X the reduced dimensionless number, x the dimensionless number The weights of the ANN are obtained using the software NeurOne©. Because the evolutions of the void fraction according to the four dimensionless parameters are non-linear, the learning is done by a Levenberg-Marquardt (L-M) algorithm [24] with parameters initialized according to a normal law N(0, 0.1) over about 70 iterations. The L-M algorithm consists in reducing a Sum Square Error (SSE) defined by Equation (22). The SSE is reduced by a combination of two algorithm: the steepest descent algorithm and the Gauss-Newton algorithm [24]. Thus, the weight vector is updated following the Equation (23).
With J k the Jacobian matrix, the k subscript for the number of update and λ the combination coefficient.
If λ is close to 0 then Gauss-Newton algorithm is used whereas when λ is very large the L-M algorithm switches to the steepest descent algorithm.
The Figure 2 gives the architecture of the artificial neural network used to estimate the void fraction, which correspond to the Equations (25)-(29).   The parameters of Equations (25)-(29) are given in the Table 2.

Sensitivity Analysis of Void Fraction and Ohmic Resistance
In order to observe the effect of the different parameters on the electrolysis performance, the artificial neural network has been used to predict the void fraction and the electrolyte ohmic resistance . Therefore, we performed a fractional factorial design of experiment (DOE). The DOE used is a "one factor at a time" (OFAT), referred to the minimal parameter-values curve (red). In all other curves, a single parameter among (T, Y, j, H, and h) changes from min to max value to assess

Sensitivity Analysis of Void Fraction and Ohmic Resistance
In order to observe the effect of the different parameters on the electrolysis performance, the artificial neural network has been used to predict the void fraction ε and the electrolyte ohmic resistance R. Therefore, we performed a fractional factorial design of experiment (DOE). The DOE used is a "one factor at a time" (OFAT), referred to the minimal parameter-values curve (red). In all other curves, a single parameter among (T, Y, j, H, and h) changes from min to max value to assess the effect of each one on the void fraction and electrolyte ohmic resistance. Table 3 gives the minimum and maximum values of different parameters in this DOE. The bubble radius r is set at 25 µm. Because the ANN cannot evaluate the void fraction for r * above 1.5 × 10 −3 , the electrode height H elec has been set to 5 × 10 −2 . Table 3. Minimum and maximum of the explored parameters for the sensitivity analysis.

Variables
Min Max The results of the predictions are represented in Figures 4-7. In the Figure 4a,b, it can be observed that the pressure decreases for all the cases the void fraction ε at least until 5 bar and the ohmic resistance R. However, the artificial neural network predicts that the ohmic resistance reaches a plateau around 10 bar for most of the cases. In the case where H max = 10 cm, the minimum value is obtained near 5 bar but the void fraction rises again after this limit. This case, mass fraction and temperature seem to have little effect on the void fraction but have more important effects on the resistance.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 9 of 29 the effect of each one on the void fraction and electrolyte ohmic resistance. Table 3 gives the minimum and maximum values of different parameters in this DOE. The bubble radius is set at 25 µm. Because the ANN cannot evaluate the void fraction for * above 1.5 × 10 −3 , the electrode height has been set to 5 × 10 −2 . Table 3. Minimum and maximum of the explored parameters for the sensitivity analysis.

Variables Min
The results of the predictions are represented in Figures 4-7. In the Figure 4a,b, it can be observed that the pressure decreases for all the cases the void fraction at least until 5 bar and the ohmic resistance . However, the artificial neural network predicts that the ohmic resistance reaches a plateau around 10 bar for most of the cases. In the case where = 10 cm, the minimum value is obtained near 5 bar but the void fraction rises again after this limit. This case, mass fraction and temperature seem to have little effect on the void fraction but have more important effects on the resistance.  In Figure 5a, it can be clearly seen that the electrolyte thickness has a great influence on the void fraction. By decreasing the electrolyte thickness ℎ by few hundred micrometers, the void fraction increases abruptly from 0.1 to 0.6. Therefore, the existence of an optimum is explained. In the Figure  5b, the optimum of electrolyte thickness ℎ can be observed. This optimum mainly depends on the electrode heights and pressure . The pressure reduces the radius of the bubble and thus diminishes the optimal thickness of the electrolyte. Increasing the electrode height means increasing the injected gas flow so it increases the void fraction and the electrolyte ohmic resistance. In Figure 6a,b, it can be observed that the current density increases the void fraction until 1500 A m −2 and reached a plateau. The value of the plateau depends on the geometry of the cell especially the ratio electrolyte thickness/electrode height. However, the kinematic viscosity seems to play a role also.  In Figure 6a,b, it can be observed that the current density increases the void fraction until 1500 A m −2 and reached a plateau. The value of the plateau depends on the geometry of the cell especially the ratio electrolyte thickness/electrode height. However, the kinematic viscosity seems to play a role also. In Figure 7a,b, the electrode height sensitivity impact on the ohmic resistance and void fraction can be observed. As expected, the electrode height increases the void fraction and thus the ohmic resistance. However, for (30 bar), it seems that electrode height does not impact the void fraction values. This is a surprising result that must be taken with caution. In conclusion, the neural network gives roughly the sensitivity to all parameters even if for some case the results are surprising. Indeed, increasing the pressure must strictly decreases the void fraction and the void fraction should always increase according to electrode height. However, even if the sensitivities are strictly respected, the neural network approach save a lot of computational time. Indeed, to train the ANN, 61 CFD evaluation has been needed (corresponding to approximately 20 h of calculation on an Intel Core i7−6700HQ CPU @2.60 GHz). To make the Figures 4-7, at least 480 evaluations must be performed corresponding to approximately 30 s using the ANN and 160 CPU hours using the CFD code. Moreover, the genetic algorithm requires at the bare minimum 500 evaluations so the optimization would take 167 CPU hours if the CFD code is used against 30 CPU seconds using the ANN.

Economy
The hydrogen cost has two contribution: The that represents the cost of the investments and the OPEX that represents the cost during electrolyzer operations (Equation (30)). is estimated by using Faraday's law, see Equation (31). The investment cost is divided by the total amount of hydrogen produced during the electrolyzer lifetime. The depends on the cell voltage and electricity cost.
With and in € kg −1 , the cell voltage in V, Faraday's constant in A s mol −1 , the energy cost in € kWh −1 , the initial capital cost per unit area in € m −2 , the electrolyzer lifespan in s, is the molar mass in kg mol −1 . The life of the electrolyzer, the initial capital cost per unit area and the cost of electricity cannot be calculated using a numerical model, but estimates do exist or can be calculated. Grigoriev et al. [18] have indicated that the life of the electrolyzer is approximately 90,000 h. The cost of electricity varies greatly depending on the country or the means of power generation used to produce In Figure 5a, it can be clearly seen that the electrolyte thickness has a great influence on the void fraction. By decreasing the electrolyte thickness h by few hundred micrometers, the void fraction increases abruptly from 0.1 to 0.6. Therefore, the existence of an optimum is explained. In the Figure 5b, the optimum of electrolyte thickness h opt can be observed. This optimum mainly depends on the electrode heights H elec and pressure p. The pressure reduces the radius of the bubble r and thus diminishes the optimal thickness of the electrolyte. Increasing the electrode height means increasing the injected gas flow so it increases the void fraction and the electrolyte ohmic resistance.
In Figure 6a,b, it can be observed that the current density increases the void fraction until 1500 A m −2 and reached a plateau. The value of the plateau depends on the geometry of the cell especially the ratio electrolyte thickness/electrode height. However, the kinematic viscosity ν seems to play a role also.
In Figure 7a,b, the electrode height sensitivity impact on the ohmic resistance and void fraction can be observed. As expected, the electrode height increases the void fraction and thus the ohmic resistance. However, for p max (30 bar), it seems that electrode height does not impact the void fraction values. This is a surprising result that must be taken with caution.
In conclusion, the neural network gives roughly the sensitivity to all parameters even if for some case the results are surprising. Indeed, increasing the pressure must strictly decreases the void fraction and the void fraction should always increase according to electrode height. However, even if the sensitivities are strictly respected, the neural network approach save a lot of computational time. Indeed, to train the ANN, 61 CFD evaluation has been needed (corresponding to approximately 20 h of calculation on an Intel Core i7−6700HQ CPU @2.60 GHz). To make the Figures 4-7, at least 480 evaluations must be performed corresponding to approximately 30 s using the ANN and 160 CPU hours using the CFD code. Moreover, the genetic algorithm requires at the bare minimum 500 evaluations so the optimization would take 167 CPU hours if the CFD code is used against 30 CPU seconds using the ANN.

Economy
The hydrogen cost has two contribution: The CAPEX that represents the cost of the investments and the OPEX that represents the cost during electrolyzer operations (Equation (30)). CAPEX is estimated by using Faraday's law, see Equation (31). The investment cost is divided by the total amount of hydrogen produced during the electrolyzer lifetime. The OPEX depends on the cell voltage and electricity cost.
With OPEX and CAPEX in € kg −1 , U cell the cell voltage in V, F Faraday's constant in A s mol −1 , EC the energy cost in € kWh −1 , IC S the initial capital cost per unit area in € m −2 , t the electrolyzer lifespan in s, M H 2 is the molar mass in kg mol −1 .
The life of the electrolyzer, the initial capital cost per unit area and the cost of electricity cannot be calculated using a numerical model, but estimates do exist or can be calculated. Grigoriev et al. [18] have indicated that the life of the electrolyzer is approximately 90,000 h. The cost of electricity varies greatly depending on the country or the means of power generation used to produce electricity. In metropolitan France, the cost of electricity is about 0.17 € kWh −1 and in Germany about 0.30 € kWh −1 . The effect of these two costs will be studied in the optimization section.
To estimate the IC S , the cost per kWh given by Grigoriev et al. [18] can be used. They stated that the cost per kW is between 500 and 1400 today and will be between 400 and 850 in 2030 and will reach the interval of 200-700 € kW −1 . According to Grigoriev et al. [18], the electrolyzer power density is between 4 and 10 kW m −2 . Thus, IC S will be explored between 800 and 14,000 € m −2 . The Figure 8 presents the OPEX, CAPEX, and total hydrogen cost for an electrolyzer depending on the current density for a naïve design (h H 2 = h O 2 = 1.5 mm, H elec = 10 cm, p = 1 bar, T = 353 K, Y = 0.30, b an = 0.15 V dec −1 , b cath = 0.15 V dec −1 , IC = 14,000 € m −2 , and EC = 0.17 € kWh −1 ) optimum current density is 2923 A m −2 and the minimum cost is 12.48 € kg −1 .
Appl. Sci. 2020, 10, x FOR PEER REVIEW 12 of 29 electricity. In metropolitan France, the cost of electricity is about 0.17 € kWh −1 and in Germany about 0.30 € kWh −1 . The effect of these two costs will be studied in the optimization section.
To estimate the , the cost per kWh given by Grigoriev et al. [18] can be used. They stated that the cost per kW is between 500 and 1400 today and will be between 400 and 850 in 2030 and will reach the interval of 200-700 € kW −1 . According to Grigoriev et al. [18], the electrolyzer power density is between 4 and 10 kW m −2 . Thus, will be explored between 800 and 14,000 € m −2 . The Figure 8

Optimization Problem
We also define the cost function ( , ) representing the addition of investment cost (CAPEX) and operating cost (OPEX) in € kg −1 .

Optimization Problem
We also define the cost function F c (CAPEX, OPEX) representing the addition of investment cost (CAPEX) and operating cost (OPEX) in € kg −1 .
The set of design parameters used in the rest of the article will be referred as DP.
This type of optimization problem is classified as a NP-complete (non-polynomial solving time) problem that has not yet found an effective deterministic solving algorithm.
Given F c (DP), find DP * ∈ E ⊆ E such as : With DP * the set of design parameters that gives the minimal cost value (i.e., the best design), E the set of feasible solutions wrt g i and h j of constraint and E the set of general solutions.
Taking into account the dimensionality, the non-linearity and the couplings in this problem, classical methods such as gradient descent are not likely to produce a successful optimization. Therefore, we propose to rely on stochastic optimization such as evolutionary algorithms. This method allows to find good solution when the problem is very difficult, and the search space is huge.

Genetic Optimization Algorithm (GA)
Historically, genetic algorithms (GAs) are the first and the simplest ones to implement [25]. GAs are particularly well adapted for parametric optimization as is our design problem (see Equation (36)).
First of all, the design parameters (DP) must be put under genetic material (chromosome). The fitness function is based on the chromosome and cost function. To begin the genetic algorithm, an initial population of individuals (i.e., design candidates) is randomly generated and their associated fitnesses are calculated. Next, individuals are selected for pairing according to their fitness to produce offspring. During the pairing, chromosome may undergo crossover or mutation. Eventually, we look if the best solution is satisfying if not, we generate a new population using the same process (evaluation, selection, cross-over, mutation). There are different types of genetic algorithms, but they can all be summarized in one general flowchart (Figure 9). fitness function is based on the chromosome and cost function. To begin the genetic algorithm, an initial population of individuals (i.e., design candidates) is randomly generated and their associated fitnesses are calculated. Next, individuals are selected for pairing according to their fitness to produce offspring. During the pairing, chromosome may undergo crossover or mutation. Eventually, we look if the best solution is satisfying if not, we generate a new population using the same process (evaluation, selection, cross-over, mutation). There are different types of genetic algorithms, but they can all be summarized in one general flowchart (Figure 9). Figure 9. Flowchart of the genetic algorithm. Figure 9. Flowchart of the genetic algorithm.

Coding
The method of genetic algorithms takes its vocabulary from genetics. A design solution is called a genotype (composed of a single chromosome in the Simple Genetic Algorithm) and the parameters that make up this case are called genes. In our case, a chromosome containing all the genes would take the following form Equation (36).
All genes are normalized using Equation (37) and therefore their value is between 0 and 1.
The number of bits in a gene depends on the desired precision (or quantification) for a parameter. The input parameters for bits determination are summarized in Table 4. To determine the number of bits, Equations (38) and (39) are used.
We need a number of bits for parameter DP i : Finally, the total chromosome number of bits is determined via the Equation: Table 5 gives the number of bits, the wanted and obtained quantification for each design parameters. The total number of bits is 33.

Evaluation
In optimization, we need a normalized function to be maximized, so we will use an inverse function. In order to have a solution between 0 and 1, the inverse function is scaled using the theoretical minimum of the CAPEX and the OPEX. The theoretical minimum of OPEX is obtained by neglecting the diphasic effect. The calculated minimum and maximum of CAPEX and OPEX are given in the Table 6.
Inversing the cost function, having the lowest hydrogen cost (F c = F cmin ) corresponding to the highest fitness (F max = 1) gives: Unfortunately, F(F cmax ), corresponding to the highest cost (then the lowest fitness F min ) is not 0, as it should be: Thus, it is interesting to make a linear transformation to take advantage of the whole range 0 to 1. With: The value of these coefficients depends on the initial assumptions. Finally, the Equations (48) and (49) gives the form of the fitness function.

Operators
The Flowchart shown in the Figure 9 shows there are several operators: Initial random population generation, selection, crossing-over, mutation, and population replacement.

Initial Random Population
The first generation is created from scratch, randomly drawing bits values for all parameters of each individuals in the population. The population includes k pop candidate solutions (individuals). The following generations will undergo successively the genetic operators all along the evolution that include k gen generations.

Roulette Wheel Selection
To simulate the natural selection, the candidates from the population are selected using the roulette wheel selection [25]. Each candidate is given a portion of an imaginary wheel proportionally to its fitness (so the bigger fitness, the bigger portion). The wheel is spun k pop times (k pop being the total number of candidate solutions in the population). The selected candidates are paired to be mated. Each couple will be the parents producing offspring for the next generation.

Single Point Crossing-Over
After the selection, a second operator that mimics the biological cell meiosis is used: The crossing-over. This operator is used to combine the parents' genetic material in the chromosome (genotype). A crossing-over is carried out with a probability occurrence of p c . A crossing-over point (locus) is chosen randomly in the chromosome. Before this point, the chromosome is unchanged and beyond, the chromosome sequence is swapped between the parents. This produces two offspring that will be new candidates for the next generation. The artificial evolution if not very sensitive to the crossing-over probability [26]. It is why p c is generally set to 50%.

Bitwise Mutation
The mutation operator is used to maintain genetic diversity in the population over the generations. It can be compared to the real biological mutation. In a living organism, the mutation occurs due to external factors (radioactivity, environmental turmoil) or just due to DNA recopying error. In the bitwise mutation, a mutation probability is drawn for each bit of the chromosome, the bits are flipped from 1 to 0 or from 0 to 1. The probability of mutation p m must be set very low to avoid too much genotype instability in the population. A good solution would be disrupted easily and its chance to survive mutation will be null. If p m reaches 50%, the genetic algorithm is equivalent to a random search, which must be prevented.

Replacement Generator (Generational)
The children obtained from parents through selection, crossing-over and mutation replace totally the previous generation. A specific place in reserved for the best individual (in all previous generations). This requires scarifying the first newborn children or having an odd population size k pop .

Presentation of the Evolution Parameter
The evolution parameters have been introduced briefly in the previous subsection. In total, there are 4 of them: • k gen the number of generations • k pop the number of populations per generation • p m probability of mutation • p c probability of crossing-over is definitively set to 50% The values of these parameters can greatly impact the results of the optimization simulation, they must be chosen wisely. The choice of k gen and k pop is constrained first by the time resource (CPU time) allocated to an optimization and then by the respect of infinitesimality constraint. Indeed, the advantage of the genetic algorithm is that it evaluates only a tiny (infinitesimal) part of the possible solution. To comply with this constraint: card(E) = 2 n bt = 8.59 × 10 9 (51) To validate the algorithm, we use the ad hoc parameters k gen = 50 and k pop = 10, we define the two dimensionless parameters: According to Chocron [26], p m must be chosen in such a way that, in average, one bit mutates per generation while having an average proportion of mutant individuals per generation equal to 50% (Equations (54) and (55)). By this setting, the aim is that the mutation should have the same impact as the crossing-over.
With E(nbmp) and E(nbmi) the expected value that a mutation affects the entire population and a single individual.

Validation of the Algorithm
Hypothesis: The calculation has been performed using an Intel Core i7−6700HQ CPU @2.60 GHz. The optimization simulation time varies around 5 s. To validate the algorithm, the physics of the model presented in Section 2 will be simplified. The diphasic effect will not be taken into account. In doing so, we can analyze the equations to extract the sensitivities and find the optimal design if the process was not two-phase. The next step is to see if the algorithm finds a better candidate. By analyzing Equations (30) and (31), we can deduce that the current density must be around 1000. As temperature increases the electrical conductivity and decreases the reversible voltage, the algorithm must find the highest possible temperature. Since two-phase phenomena are not taken into account, the pressure proposed by the algorithm must be as low as possible (because it increases the reversible voltage). The same applies to the electrolyte thickness. Indeed, the latter decreases the ohmic resistance. Figure 10 shows the evolution of the fitnesses (a) and the cost function (b) according to generations. In (a) we observe that f best and f avg increase greatly in the first 10 generations, we call this phase the exploration. During this phase, the GA searches good locations in the DP space. It finds very quicly good regions in DP, which explains the quick fitness growing, thus the cost dwindling. solution, which explain the reduction of and difference. Usually, there exist a last phase where the stagnates below the that continues to slightly increase until reaching a plateau around the end of evolution. This phase is called the convergence one and pursues the search around the best solutions to find slightest increases for (eventually ending up with the global optimum), maintaining just below using the mutation as permanent genetic stirrer. Unfortunately, we cannot see this last phase that means the evolution is not completed. Between generations 10 and 50 both fitnesses continue to increase but moderately, this in the exploitation phase. During this phase the algorithms attracts most of the population towards the best solution, which explain the reduction of f best and f avg difference. Usually, there exist a last phase where the f avg stagnates below the f best that continues to slightly increase until reaching a plateau around the end of evolution. This phase is called the convergence one and pursues the search around the best solutions to find slightest increases for f best (eventually ending up with the global optimum), maintaining f avg just below f best using the mutation as permanent genetic stirrer. Unfortunately, we cannot see this last phase that means the evolution is not completed.
In Figures 11 and 12 are presented the design parameters evolution path from random (first population) to overall best (last population) values. In (a), both h parameters (H 2 and O 2 ) converge to the smallest available value. This can be explained by the Ohm's law in the monophasic model. In Figures 11 and 12 are presented the design parameters evolution path from random (first population) to overall best (last population) values. In (a), both ℎ parameters (H2 and O2) converge to the smallest available value. This can be explained by the Ohm's law in the monophasic model.  Figure 11. Design parameters evolution for the best set solution (in full space and for a single evolution) of the best candidate depending on the generation (a) for the anolyte and catholyte thickness ℎ , ℎ (b) for electrolyte temperature . Figure 11. Design parameters evolution for the best DP set solution (in full DP space and for a single evolution) of the best candidate depending on the generation (a) for the anolyte and catholyte thickness h H2 , h o2 (b) for electrolyte temperature T.
(a) (b) Figure 11. Design parameters evolution for the best set solution (in full space and for a single evolution) of the best candidate depending on the generation (a) for the anolyte and catholyte thickness ℎ , ℎ (b) for electrolyte temperature . In Figures 11 and 12, we note that almost all values converge during the evolution towards their best value, but the parameter (b). Actually, in this simplified model (monophasic), has no influence on the cost function (gaz effects neglected).
This first tentative of artificial evolution of alkaline electrolyzer is promising, but we know that it could be improved in two ways:

•
The simplified model does not take into account the two-phase phenomena, which can greatly influence the hydrogen costs ( ).

•
The evolution parameters used in this first try are not well adapted to the problem (missing convergence phase to achieve evolution).
In the next section we answer these two issues by using the more elaborate two-phase model and by performing a sensitivity analysis of the genetic optimization w.r.t. the evolution parameters.

Sensitivity Study of the Results to the Evolution Parameters
In this section, we analyze the evolution algorithm on the more realistic diphasic problem to tune the evolution parameters. The goal of this study is to adjust the evolution parameters ( , , , ) in order to have the best genetic optimization performances. The hypothesis are the same than previously but because we take into account the diphasic effect, the bubble radius must be set. Thus, the following hypothesis are used. Figure 12. Design parameters evolution for the best DP set (in full DP space and for a single evolution) of the best candidate depending on the generation (a) current density p, (b) electrode height H elec , (c) pressure p and (d) KOH mass fraction Y.
In Figures 11 and 12, we note that almost all DP values converge during the evolution towards their best value, but the parameter H elec (b). Actually, in this simplified model (monophasic), H elec has no influence on the cost function (gaz effects neglected).
This first tentative of artificial evolution of alkaline electrolyzer is promising, but we know that it could be improved in two ways:

•
The simplified model does not take into account the two-phase phenomena, which can greatly influence the hydrogen costs (F c ).

•
The evolution parameters used in this first try are not well adapted to the problem (missing convergence phase to achieve evolution).
In the next section we answer these two issues by using the more elaborate two-phase model and by performing a sensitivity analysis of the genetic optimization w.r.t. the evolution parameters.

Sensitivity Study of the Results to the Evolution Parameters
In this section, we analyze the evolution algorithm on the more realistic diphasic problem to tune the evolution parameters. The goal of this study is to adjust the evolution parameters (k gen , k pop , p m , p c ) in order to have the best genetic optimization performances. The hypothesis are the same than previously but because we take into account the diphasic effect, the bubble radius must be set. Thus, the following hypothesis are used.

Impact of k gen and k pop
The first study should make it possible to determine the value of k gen and k pop , the numerical experiments carried out are gathered in Table 7. R GP represents the ratio between k gen over k pop It allows to set (inversely) the level of "parallelism" of the genetic algorithm, to answer our high dimension and strongly coupled optimization problem. P GP is the product of k gen by k pop (i.e., the number of evaluations then of simulations), the higher this parameter is, the longer the evolution will take. To reduce the optimization computational costs, it is necessary to reach a compromise between fitness results and calculation time. The results that really interest us, the final best fitness, are summarized in Figure 13. It can be seen in Figure 13a that for any R GP , it is necessary to choose a P GP of at least 1500 but for R GP of 1. However, in Figure 13b, we really observe a stagnation of the fitness value from P GP = 2000 and R GP = 7. If P GP = 3000, the necessary R GP is around 3, and this R GP increases when P GP decreases. We can therefore conclude that the optimum setting is P GP = 2000 is necessary for a R GP = 7. If a larger P GP is available (if more CPU time is available), then we should decrease R GP to 3 for P GP = 3000. This amounts to increasing the number of individuals in the population, i.e., taking advantage of resources to increase parallelism, which can be enhanced materially by using multiprocessors. So we choose k gen = 118 and k pop = 17. = 7. If = 3000, the necessary is around 3, and this increases when decreases. We can therefore conclude that the optimum setting is = 2000 is necessary for a = 7. If a larger is available (if more CPU time is available), then we should decrease to 3 for = 3000. This amounts to increasing the number of individuals in the population, i.e., taking advantage of resources to increase parallelism, which can be enhanced materially by using multiprocessors. So we choose = 118 and = 17.

Impact of p m
The mutation probability p m is important for the initial population draw, the exploration phase and the genetic stiring all along the evolution. If p m is too low, no new genetic material is found, and if too high it is disruptive and prevent keeping good solutions. Here we study its impact upon the fitnesses.
Initially, p m = was set at 1.55% as recommended by Chocron [26]. In order to refute or affirm this choice, a sensitivity study based on p m was conducted. In order to observe only the operator effect of the mutation, elitism was deactivated (i.e., the fact that the best individual is always re-injected into the population). The results of the evolutions are shown in Figure 14. The maximum fitness increases from 0 to about 0.5%, then reaches a plateau of up to 2% and then gradually decreases. For p m = [0.5-2]%, there is a peak at p m = 0.7%. However, the difference between the fitness value at p m = 0.7% and 1.5% is very small. The choice of p m recommended by Chocron is therefore relevant as far as we focus on the best fitness.

Impact of pm
The mutation probability is important for the initial population draw, the exploration phase and the genetic stiring all along the evolution. If is too low, no new genetic material is found, and if too high it is disruptive and prevent keeping good solutions. Here we study its impact upon the fitnesses.
Initially, = was set at 1.55% as recommended by Chocron [26]. In order to refute or affirm this choice, a sensitivity study based on was conducted. In order to observe only the operator effect of the mutation, elitism was deactivated (i.e., the fact that the best individual is always reinjected into the population). The results of the evolutions are shown in Figure 14. The maximum fitness increases from 0 to about 0.5%, then reaches a plateau of up to 2% and then gradually decreases. For = [0.5-2]%, there is a peak at = 0.7%. However, the difference between the fitness value at = 0.7% and 1.5% is very small. The choice of recommended by Chocron is therefore relevant as far as we focus on the best fitness.
In Figure 15, the evolution parameters are the ones just defined. We see in this 100-averaged run that the exploration phase stops at gen = 10, followed by the exploitation phase ending for gen = 90. This time, we observe a convergence phase (gen = 90 to 117) where stagnates just below that increases slightly. The convergence gap between and is only 3% of the value, as expected in the canonical GA (<5%). In conclusion, we can consider these evolution parameters to be validated.
(a) (b) Figure 14. Result of the sensitivity study of the fitness to the mutation rate. In (a) the whole research space has been plotted and (b) only the mutation rate between 0 and 2% has been plotted. Figure 14. Result of the sensitivity study of the fitness to the mutation rate. In (a) the whole research space has been plotted and (b) only the mutation rate between 0 and 2% has been plotted.
In Figure 15, the evolution parameters are the ones just defined. We see in this 100-averaged run that the exploration phase stops at gen = 10, followed by the exploitation phase ending for gen = 90. This time, we observe a convergence phase (gen = 90 to 117) where f avg stagnates just below f best that increases slightly. The convergence gap between f best and f avg is only 3% of the f best value, as expected in the canonical GA (<5%). In conclusion, we can consider these evolution parameters to be validated. Eventually, the evolution parameter are summarized in Table 8. These evolution parameters allow to get the optimum DP set summarized in Table 9. The main results are the difference in catholyte and anolyte thickness and electrode heights. Due to the presence of gases, the minimum of electrolyte thickness cannot be chosen. The electrode height is set to the minimum value available. There is twice as much gas injected in the catholyte than in anolyte. Thus, the catholyte thickness ℎ is bigger than the anolyte thickness ℎ . The lower the electrode height, the lesser gases injected resulting in a lower void fraction. The pressure stays at 1 bar. It can be explained by the fact that the surface coverage sensitivity regarding the pressure has not been taken into account. Thus, at relatively low current density, the pressure effect increases the cell voltage.  Eventually, the evolution parameter are summarized in Table 8. These evolution parameters allow to get the optimum DP set summarized in Table 9. The main results are the difference in catholyte and anolyte thickness and electrode heights. Due to the presence of gases, the minimum of electrolyte thickness cannot be chosen. The electrode height is set to the minimum value available. There is twice as much gas injected in the catholyte than in anolyte. Thus, the catholyte thickness h H2 is bigger than the anolyte thickness h H2 . The lower the electrode height, the lesser gases injected resulting in a lower void fraction. The pressure stays at 1 bar. It can be explained by the fact that the surface coverage sensitivity regarding the pressure has not been taken into account. Thus, at relatively low current density, the pressure effect increases the cell voltage. Table 9. Optimum DP with their associated hydrogen cost for IC s = 800 € m −2 , EC = 17 c€ kWh −1 , r = 25 µm and R memb = 3.23 × 10 −5 Ohm m 2 .

Optimization of the Naïve Solution
In the Section 2.3, we defined a naïve solution. In this section, we will compare this naïve solution with a GA optimized solution with the same hypothesis. The Figure 16a presents the average and best fitness according to the generations. The evolution is correctly achieved around gen = 97. In the Figure 16b, it can be observed that the genetic optimization allows to decrease the random hydrogen cost from 12

Optimization of the Naïve Solution
In the Section 2.3, we defined a naïve solution. In this section, we will compare this naïve solution with a GA optimized solution with the same hypothesis. The Figure 16a presents the average and best fitness according to the generations. The evolution is correctly achieved around gen = 97. In the Figure 16b, it can be observed that the genetic optimization allows to decrease the random hydrogen cost from 12.70 € kg −1 to 12.30 € kg −1 . The Table 10 summarized the DP of the two solutions (the naïve and optimized solutions) and their corresponding hydrogen cost. The optimization allowed to find the optimal , ℎ , ℎ and . The electrode height is decreased at 5 cm and the anolyte and catholyte thickness are decreased at 400 µm. The KOH mass fraction is decreased to 0.23 (the loss of electrical conductivity is balanced by the decrease of void fraction allowed by a less viscous electrolyte). This optimization allows to decrease the price of 0.18 € kg −1 (1% of decrease) and an increase the current density of 290 A m −2 (10% of increase).  The Table 10 summarized the DP of the two solutions (the naïve and optimized solutions) and their corresponding hydrogen cost. The optimization allowed to find the optimal H elec , h O2 , h H2 and Y. The electrode height is decreased at 5 cm and the anolyte and catholyte thickness are decreased at 400 µm. The KOH mass fraction is decreased to 0.23 (the loss of electrical conductivity is balanced by the decrease of void fraction allowed by a less viscous electrolyte). This optimization allows to decrease the price of 0.18 € kg −1 (1% of decrease) and an increase the current density of 290 A m −2 (10% of increase). The Figure 17 shows this result. It can be seen that the difference of price increase with an increasing current density until reaching a difference of 0.72 € kg −1 at j = 10 4 A m −2 .
Appl. Sci. 2020, 10 The Figure 17 shows this result. It can be seen that the difference of price increase with an increasing current density until reaching a difference of 0.72 € kg −1 at = 10 4 A m −2 .  Figure 18a,b explains why there exist a difference of price. We can see that the ohmic overpotential is strongly decreased. In Figure 18b, it can be observed that the major cost of OPEX is mainly due to the activation overpotential and reversible voltage.  Figure 18a,b explains why there exist a difference of price. We can see that the ohmic overpotential is strongly decreased. In Figure 18b, it can be observed that the major cost of OPEX is mainly due to the activation overpotential and reversible voltage.  Figure 18a,b explains why there exist a difference of price. We can see that the ohmic overpotential is strongly decreased. In Figure 18b, it can be observed that the major cost of OPEX is mainly due to the activation overpotential and reversible voltage.  The other overpotential are the same because they only are depending on the hypothesis.

Conclusions
A CFD model has been used to create an artificial neural network (ANN). This ANN allows the estimation of the void fraction very quickly from 166 CPU hours to around 30 s for 500 evaluations. However, further experimental studies must be performed to increase the accuracy of the CFD model. Indeed, the CFD model is based on only 3 experimental cases. Thus, the bubble diffusivity coefficient is not well correlated to the dimensionless parameters. With this new model, it would be possible to predict with accuracy the diphasic boundary layer thickness. The genetic algorithm parameters have been determined by making sensitivity analyses. To ensure a good optimization, the number of evolutions P GP = 2000 and a R GP = 7 have been chosen. For the mutation rate p m , the only based on the best fitness mutation rate (proposed by Chocron [26]), has been replaced by a more suitable p m = 0.5% (using also the average fitness). For the electrolyzer design, the algorithm preconizes the lowest height possible and find an optimum anolyte and catholyte thickness between 200 µm and 400 µm, depending on the optimum current density. It has been noticed that increasing the surface cost IC s increases the optimal current density. The optimum pressure has been determined at 1 bar but the surface coverage dependency according to the pressure (or the other parameters) has not been taken into account. Moreover, Bensmann et al. [8] showed that the optimal pressure depends on also of the compression, cooling, purity needed. Two further studies should be performed. The first one must focus on the determination of the surface coverage w.r.t dimensionless parameter (Reynolds number Re V G , Froude number Fr V G , dimensionless electrolyte thickness and radius h * , r * ). Another one should address the influence of electrolyte and hydrogen temperature, pressure(s) of compression stage on the global hydrogen production system (cooling, compression, purification of hydrogen). Funding: This research was funded by the national association ADEME and Bretagne Region (France).

Acknowledgments:
We would like to deeply thank the national association ADEME for funding our research and the Bretagne region to support our research.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The model of Hammoudi et al. [10] used to calculate the reversible voltage in this written as Equations (A1)-(A4).

Appendix B
The following numerical experiments has been performed to train the ANN.