Full-Scale Digesters: An Online Model Parameter Identification Strategy

Abstract: This work presents a new approach to the modeling, identification, control, and monitoring of anaerobic reactors. A controller can only operate normally if the data it needs are measured periodically. Because the concentrations of acidogenic bacteria and methanogenic archaea cannot easily be obtained periodically with reliable commercial sensors, this paper presents an algorithm built around an asymptotic observer (which treats the reaction rates as unknown) to estimate these concentrations. The method offers a significant advantage because it saves resources: the concentrations are calculated analytically in natural environments from standard measurements, such as pH or alkalinity. Additionally, two yield parameters were added to the original anaerobic model two (AM2) to enable its use with a wide range of organic substrates. The static parameter identification was improved using a new method called step-ahead optimization, which fits the mathematical model to the data markedly better, with an efficiency increase of up to 78.7% compared with a traditional optimization method, the genetic algorithm. After the convergence period, the state observer shows a small error, with a maximum deviation of 2%. Finally, numerical simulations demonstrate the strengths of the structure, which constitutes a significant step toward feasible, cost-effective control and monitoring systems in industry.


Introduction
Anaerobic digestion (AD) is a complex biological process wherein anaerobic microorganisms break down the biodegradable fraction of the biomass into biogas (a mixture of gases, mainly 50-80% CH₄ and 30-50% CO₂), which can be used as a valuable energy source (5.5-7 kWh/m³ of biogas) [1]. In the EU, the development of the biogas industry was prompted by the introduction of various support schemes (feed-in tariffs, green certificates, and fiscal incentives/subsidies) and changes in energy and climate policies [2]. Nowadays, the EU is the world leader in biogas electricity production, with 10 GW of installed capacity and 17,667 biogas plants running on energy crops (39%), animal manures (39%), sewage sludge (5%), food and beverage waste (4%), municipal waste (4%), and others (9%) [3]. Despite the widespread application of AD technology, full-scale biogas plants are mainly operated manually [4]. Consequently, plant operators struggle to achieve efficient operation while contending with intrinsic difficulties of the AD process, such as: (i) highly nonlinear behavior; (ii) high sensitivity to uncontrollable input perturbations; (iii) limited online measurement of state variables [5][6][7]. This is why the control and online monitoring of the anaerobic digestion process is an active area of research and development [8].
In the last decade, several control methodologies for closed-loop feed control of AD have been proposed to achieve process stabilization and/or maximization of methane production. It is well known that classical control methods (e.g., proportional-integral-derivative (PID) controllers and PID controller cascades) yield unsatisfactory performance when the plant is subjected to significant set-point changes, as they cannot cope with the nonlinearities of the AD process. Gaida et al. [4], García-Diéguez et al. [6], and Méndez-Acosta et al. [8] have reported summaries of the advantages and drawbacks of several control schemes, pointing out that adaptive control strategies and rule-based expert approaches (e.g., fuzzy controls, artificial neural networks) account for nonlinearities of the AD process. However, these control strategies cannot be applied to full-scale plants because large amounts of data are needed to train the neural network, and very often, it is not possible to obtain data covering a broad range of operating conditions [9].
In recent years, non-linear model predictive control (NMPC) has become the principal advanced control methodology for highly non-linear bioprocesses [10]. NMPC stems from the idea of employing a representative model of the process to predict the future behavior of the plant. This prediction capability allows optimal control problems to be solved online, where tracking errors (namely the differences between the predicted output and the desired reference) are minimized over a future horizon [11]. To date, few studies are available in which NMPC has been used to control the AD process. For instance, Mauky et al. [12] proposed an NMPC to adapt biogas production to a fluctuating timetable of energy demand. Briefly, the set point (reference) of the system output (in this case, biogas flow) varies according to the energy requested by the power grid. The control input (feeding rate) is then computed by the NMPC so that the system output follows the reference scenario. This controller was validated by full-scale experiments and showed promising results [13]. Nevertheless, the accuracy of NMPC depends on the capacity of the phenomenological model to represent reality, which in turn depends on the accuracy of the model parameters [14].
A practical model for control is the so-called anaerobic model two (AM2) proposed by Bernard et al. [15], which consists of only two reaction steps (acidogenesis and methanogenesis) and six state variables: acidogenic bacteria (X1), methanogenic archaea (X2), organic substrate concentration (S1) expressed as total chemical oxygen demand (COD), volatile fatty acids (VFA) (S2), inorganic carbon (C), and total alkalinity (Z). The growth of acidogens is assumed to follow Monod kinetics and the growth of methanogens to follow Haldane kinetics [15]. The model can be tailored to include relevant AD phenomena, such as ammonia inhibition owing to the hydrolysis of proteins, while preserving a simple structure [16]. Although AM2 has already proved its usefulness for nonlinear control schemes of the AD process in several works [17,18], it often exhibits operational difficulties that must be addressed before it can be used effectively for NMPC. The parameter identification procedure followed by Bernard et al. [15] determines the main values from steady-state variables, which are then related through equations derived from the model assumptions. However, reliable steady states are difficult to reach in full-scale plants because of inevitable inlet flow disturbances and substrate composition fluctuations [10]. Additionally, X1 and X2 are state variables that cannot be measured in real time. In practice, measuring each microbial-population concentration is costly, and usually only qualitative information can be drawn from existing techniques [19]. These drawbacks cause identification problems when linear methods are applied, as well as inaccurate model predictions (since some kinetic parameters cannot be determined independently).
The phenomenology is often poorly understood because anaerobic digestion depends on living microorganisms. As a result, reproducing the same operating conditions mathematically is often not possible, owing to the uncertainty and variation in yield parameters caused by changes in metabolism. This paper provides a novel software approach that goes beyond traditional methods: its performance depends on measured data and on new software sensor strategies that capture reality with high reliability. In a long-term plan toward feasible control schemes for industrial purposes, the first step is to ensure the existence of a measurement layer that provides continuous data to the mathematical model [16]. Although observer strategies have been widely used for different types of microorganisms inside a reactor, there are still obstacles to defining a method that guarantees full knowledge of the data inside a reaction [20].
Asymptotic observers (a preferred alternative in the literature) estimate the state variables of a system that cannot be measured directly. The design of the observer depends mainly on the information available about two elements: the reaction kinetics and the yield parameters. This paper uses the AM2 mathematical model with additional terms that account for a wide range of organic matter at the inlet. With this model, it is still not possible to obtain the concentrations of acidogens and methanogens using feasible online sensors. Additionally, the process kinetics are not completely known. Therefore, based on this preliminary information, this paper focuses on developing an asymptotic observer under the following conditions: the state variables X1 and X2 are unknown, the reaction rates are unknown, and the yield parameters are known, having been calculated with an optimization-based parameter identification procedure. The main challenge lies in using real data from an industrial process to evaluate the proposed methodology over a wide range of conditions [21,22].
In order to demonstrate the performance indicators in the online model parameter identification strategy, the following sections are proposed in this paper. Section 2 presents a reduced model AM2 with additional terms for control purposes. Then, in Section 3, experimental results are used as a basis for setting the experimental conditions to test the methods proposed. Section 4 presents the parameter identification algorithm, where an optimization problem is solved to find the values that better fit the dynamics of the mathematical model and the experimental data. Section 5 proposes an asymptotic observer, considering the information from reaction rates is unknown. This algorithm aims to estimate the concentrations of acidogenic and methanogenic microorganisms. Finally, in the conclusions, some remarks are discussed from evaluating the performance of the proposed strategy.

The Anaerobic Digestion Mass-Balance Model
The reduced AM2 model initially selected, as described by Bernard et al. [15], considers the biological phase reactions mentioned above and divides the consortium into two homogeneous groups, acidogens and methanogens, which allows the destabilization phenomenon to be represented [23]. Equations (1) to (6) describe this model.
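The equations themselves are not reproduced in this text. For orientation, the AM2 balances as they are commonly written in the literature are sketched below. The ordering and numbering are an assumption inferred from the way the equations are referenced later in this paper, µ1(S1) and µ2(S2) denote the specific growth rates of the two reactions, and Equation (5) is shown in its original form, without the additional alkalinity terms introduced in this work.

```latex
\begin{align}
\frac{dX_1}{dt} &= \left[\mu_1(S_1) - \alpha D\right] X_1 \tag{1}\\
\frac{dX_2}{dt} &= \left[\mu_2(S_2) - \alpha D\right] X_2 \tag{2}\\
\frac{dS_1}{dt} &= D\,(S_{1in} - S_1) - k_1\,\mu_1(S_1)\,X_1 \tag{3}\\
\frac{dS_2}{dt} &= D\,(S_{2in} - S_2) + k_2\,\mu_1(S_1)\,X_1 - k_3\,\mu_2(S_2)\,X_2 \tag{4}\\
\frac{dZ}{dt} &= D\,(Z_{in} - Z) \tag{5}\\
\frac{dC}{dt} &= D\,(C_{in} - C) - q_C + k_4\,\mu_1(S_1)\,X_1 + k_5\,\mu_2(S_2)\,X_2 \tag{6}
\end{align}
```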
where X1 represents the concentration of acidogenic bacteria, X2 represents the concentration of methanogenic archaea, S1 represents the concentration of organic substrate characterized by its COD (kg/m³), S2 represents the concentration of volatile fatty acids (VFA) (kg acetic acid/m³), Z represents the total alkalinity (mol/m³), and C represents the total inorganic carbon concentration (mol/m³). The variables S1in, S2in, Cin, and Zin are, respectively, the influent concentrations of S1, S2, C, and Z. The variable α is the fraction of biomass that remains in the liquid phase; D is the dilution rate (d⁻¹), the inverse of the solid retention time (SRT). µ1max is the maximum acidogenic bacteria growth rate (d⁻¹), µ2max is the maximum methanogenic archaea growth rate (d⁻¹), K_S1 is the half-saturation constant of S1 (kg/m³), and K_S2 is the half-saturation constant of S2 (kg acetic acid/m³). The yield parameters are as follows: k1 is a constant for substrate degradation, k2 is a constant for VFA production (mol/kg), k3 is a constant for VFA consumption (mol/kg), k4 and k5 are constants for CO₂ production (mol/kg), and k6 is a constant for CH₄ production (mol/kg). The term q_C is given by Equation (7).
where P_C and Φ condense several variables, as explained by Bernard et al. [15]. In Equations (7)-(9), k_La is the liquid-gas transfer constant (d⁻¹), K_H is Henry's constant (mol/(m³ atm)), and P_T is the total pressure (atm). Bernard et al. [15] do not consider the influence of ammonium (a compound usually generated during fermentation and microbial growth metabolism) on alkalinity. While developing our model proposal, we considered it important to add a term that takes ammonium into account, so that the model can be applied to a broader spectrum of usable organic matter. Thus, two yield parameters, K_Z1 and K_Z2, were added to represent this effect. Based on the considerations of Kil et al. [11] and on Equation (5), a new term was added to the total alkalinity dynamics proposed by Bernard et al. [15], giving Equation (5').
The Monod-type Equations (10) and (11) characterize the dynamics of the two reaction rates considered.
Monod-type kinetics describe the growth of acidogenic bacteria (µ1) and methanogenic archaea (µ2) because, in this fermentation process, the biomass shows no sign of VFA accumulation and, consequently, of inhibition. Finally, the methane flow rate produced, q_M, is proportional to the reaction rate of methanogenesis, as shown in Equation (12).
The extended AM2 model assumes that the temperature and the enthalpy of formation, which are considered initially in the mass balance model for homogeneous reaction systems, are constant. This holds under the assumption that the total mass inside the reactor is constant (m(t) = m) in all the experiments. In addition, the effect of toxins inside the homogeneous system is not considered, because the reactor operates at high SRT values (which keeps all material inside the reactor diluted), and there is no evidence of inhibition. Therefore, the influence of these parameters is discarded [10,14].
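As a rough illustration of the mass-balance structure described in this section, the following Python sketch implements only the biological subsystem (the balances of X1, X2, S1, and S2) with Monod kinetics for both reactions, as AM2 is commonly written. The function names, the parameter dictionary `DEMO_PARAMS`, and every numerical value are illustrative assumptions, not the parameters identified in this work; the alkalinity and inorganic carbon balances are left out.

```python
# Illustrative values only (assumed for this sketch, not identified from data).
DEMO_PARAMS = {
    "mu1_max": 0.5, "K_S1": 5.0,   # acidogenesis: max growth rate, half-saturation
    "mu2_max": 0.3, "K_S2": 2.0,   # methanogenesis: max growth rate, half-saturation
    "k1": 20.0, "k2": 10.0, "k3": 30.0,  # yield coefficients
    "alpha": 1.0,                  # fraction of biomass in the liquid phase
}


def monod(mu_max, s, k_s):
    """Monod specific growth rate (1/d) for substrate concentration s."""
    return mu_max * s / (k_s + s)


def am2_subsystem_rhs(state, inputs, p):
    """Time derivatives of (X1, X2, S1, S2).

    state  : (X1, X2, S1, S2)
    inputs : (D, S1in, S2in)
    p      : parameter dictionary with the keys used in DEMO_PARAMS
    """
    x1, x2, s1, s2 = state
    d, s1_in, s2_in = inputs
    mu1 = monod(p["mu1_max"], s1, p["K_S1"])
    mu2 = monod(p["mu2_max"], s2, p["K_S2"])
    dx1 = (mu1 - p["alpha"] * d) * x1
    dx2 = (mu2 - p["alpha"] * d) * x2
    ds1 = d * (s1_in - s1) - p["k1"] * mu1 * x1
    ds2 = d * (s2_in - s2) + p["k2"] * mu1 * x1 - p["k3"] * mu2 * x2
    return (dx1, dx2, ds1, ds2)


def euler_step(state, inputs, p, dt):
    """Advance the subsystem by one explicit-Euler step of length dt (days)."""
    rhs = am2_subsystem_rhs(state, inputs, p)
    return tuple(x + dt * dx for x, dx in zip(state, rhs))
```

With no biomass and the substrates at their inlet values, all four derivatives vanish, so the washout state is an equilibrium of this subsystem.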

Experimental Results and Characteristics of the Reactor
Real data were used from a continuously stirred tank reactor (CSTR) with a working volume of 150 L (pilot plant digester) operating on the AD of sewage sludge (thickened combined primary and secondary waste sludge from the Guadalete Wastewater Treatment Plant, Jerez de la Frontera, Spain). The temperature, held at 55 °C (thermophilic range), was regulated via an internal coil (heat unit) using a PID 140 controller connected to a temperature sensor. The feedstock was added one to three times a day, including weekends, to determine the effect of solid retention time (SRT) on the reactor performance. More details of the reactor setup can be found in de la Rubia et al. [24].
Chemical oxygen demand (COD), total solids (TS), and volatile solids (VS) were determined in the influent and effluent, while pH, individual VFA, alkalinity, and ammoniacal nitrogen were measured in the effluent. Analytical determination of COD, TS, VS, pH, alkalinity, and ammoniacal nitrogen was performed according to APHA [25]. The concentration of VFA was determined by gas chromatography (Shimadzu GC-17 A) [24]. The volume of biogas produced in the reactor was measured directly with a mass flow sensor, while the biogas composition (methane and carbon dioxide) was analyzed by gas chromatography (Shimadzu GC-14 B) [24].
The study was conducted for 338 days, starting up with a SRT of 75 days (maintained 45 days), which gradually decreased to check the behavior of the system at SRT of 40 days (maintained 40 days), 27 days (maintained 85 days), 20 days (maintained 73 days), and finally 15 days (maintained 85 days). Only a specific range of data was used to discard unstable scenarios, such as a start-up (SRT of 75 days) and the last period (SRT of 15 days) when the reactor operated closer to boundaries. These data are unfavorable for modeling purposes due to deviations related to reactions [24]. Therefore, we selected the data obtained during 207 days of the experiment, starting on day 46, when the reactor operated at 40 days of SRT with an organic loading rate (OLR) of 1.5 kg COD/m 3 d, and moved forward until the end of the stage of SRT 20 days, aiming to work with standard patterns of microorganisms as much as possible.
The reactor was fed with raw sludge. The influent COD concentration (see Figure 1a) remained relatively constant during the first stage, at around 60 kg/m³. In the subsequent stages, the values oscillated around 70 kg/m³. Because real sewage sludge was used and conditions changed daily, a few values were scattered away from the mean. Figure 1b,c shows the values of COD and VFA in the effluent. The COD remained constant at 18 kg/m³ during the stage with an SRT of 40 days. Reducing the SRT (to 27 days and 20 days) implied an OLR increase, lowered the COD removal, and produced a growing, irregular trend, with a much larger dispersion than in the previous period. The VFA values, on the other hand, varied widely, except at an SRT of 40 days (see Figure 1b), when they remained constant at around 3 g/L. Figure 2 shows the consequences of the microbial dynamics derived from the current state of the reactor. The pH during the selected period was above 7.5, which is adequate for the proper development of the anaerobic microorganisms. The volume of CH₄, shown in Figure 2b, behaves differently at each stage. At an SRT of 40 days, the production was around 0.035 m³ CH₄/d, with a slow linear increase in the volume of CH₄ produced, related to the growth of methanogens [26]. After the increase observed when the SRT was reduced to 27 days, the volume showed no substantial variation around the mean until day 210, by which time the reactor had been operating at an SRT of 20 days for some weeks. A moderately stationary trend was observed from the middle of the SRT 27 stage until day 200. Beyond the data used for modeling, the volume of CH₄ in the SRT 15 days stage showed a nonstationary trend, possibly because the reactor was closer to its operational limits. Overall, reducing the SRT from 40 to 20 days was associated with an increase in methane volume production.

An Adaptive Modeling Identification Strategy for Anaerobic Reactors
The most common methods used for model identification in bioprocesses are linear regression and other strategies based on optimization [23,27]. The linear regression strategy proposed by Bernard et al. [15] uses the mathematical model AM2 as its core, this being the best option for control and monitoring purposes found in the literature so far. The aim is therefore to calculate (calibrate) the parameters so that the mathematical model closely tracks the experimental data at steady state. The validation procedure rests on the premise that the model has to be tuned at steady state and evaluated during convergence over transients [28].
The transient was tested step by step once the inlet conditions changed. This method has numerous advantages, such as guaranteed identifiability of the parameters, a rigorous identification procedure that covers a wide range of operational conditions, and the ability to validate the performance during transients, especially over unstable phases. Nevertheless, it exhibited several disadvantages, the most important being the assumption of linearity between independent and dependent variables, the sensitivity to noise, and the presence of outliers and overfitting [14]. As optimal parameter identification has shown better results [13,29], the following section first presents a well-known parameter optimization method [14,30]. Then, as an alternative to the methods found in the literature, a new parameter optimization method is proposed, which aims to increase performance considerably in comparison with traditional methods.

Parameter Identification Based on Optimization
Consider the mathematical model for AD reactors given by Equations (1)-(6). The parameters to be identified were p(k) = {µ1max, µ2max, K_S1, K_S2, k1, k2, k3, k4, k5, k6, K_Z1, K_Z2, Zin}, calculated by an algorithm that uses specific information measured from the reactor (see Figure 3a). The method solves an optimization problem to find the parameter values that minimize the difference between the measured data and the dynamics of the AM2 mathematical model [10,31]. Figure 3a details the architecture proposed for the optimization algorithm. The data used from the reactor are: F_out = {CH₄}, the volume of methane produced by the metabolism of methanogenic archaea; n_m = {S1, S2, Z}, the measured states considered by the AM2 model, namely the organic substrate concentration (COD), the VFA concentration, and the total alkalinity; u_in = {COD_in, VFA_in, D}, the COD, the VFA, and the dilution rate at the inlet; and u_out = {n_nm, n_m, pH}, the non-measured data, the measured data, and the pH, respectively [32,33]. Q_in is the energy used to keep the reactor within the thermophilic range (not considered by the mathematical model). Figure 3b shows the rules used by the parametric identification algorithm to calculate the optimal values of p(k). As shown, the parameters p(k) are considered constant during the experiment. Equation (13) states the optimization problem to be solved [34]:

$$
\min_{p(k),\ldots,p(k+N_F)} J\big(u(k), p(k), x(k)\big) \qquad (13)
$$
$$
\text{s.t.}\quad y_{\min} \le y(k) \le y_{\max},\ \forall k = 1,\ldots,N_p; \qquad p_{\min} \le p(k) \le p_{\max},\ \forall k = 1,\ldots,N_u
$$

The function J(u(k), p(k), x(k)) is the cost functional that contains the criteria to be minimized. It depends on the inputs u(k), the parameters p(k), and the state variables of the mathematical model x(k); g(·) represents the controlled input; y_min and y_max are the lower and upper operational constraints. Finally, p_min and p_max are the lower and upper limits of the parameters to be calculated, p(k) [34].
Equation (14) shows the cost function: the norm of the squared difference between the model variables x_mod(k) and the corresponding data x_e(k) obtained from the experiment. Finally, Equation (15) represents the mean square error between the experimental data (n_m^e(k), pH^e(k), and q_M^e(k)) and the corresponding data obtained from the mathematical model (n_m^mod(k), pH^mod(k), and q_M^mod(k)). In n_m^mod(k) and n_m^e(k), the subscript m denotes the known (measured) dynamics of the experiment.
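To make the idea of a mean-square-error criterion concrete, the sketch below computes one error term per measured variable and adds them up. It is only an illustration: the function names and the dictionary layout are hypothetical, and the unweighted sum over variables with different units is an assumption of this sketch rather than the weighting used in Equations (14) and (15).

```python
def mean_squared_error(model, data):
    """Mean of the squared differences between two equally long series."""
    if len(model) != len(data) or not data:
        raise ValueError("series must be non-empty and of equal length")
    return sum((m - e) ** 2 for m, e in zip(model, data)) / len(data)


def identification_cost(model_series, data_series):
    """Sum of the per-variable mean squared errors.

    Both arguments map a variable name (for example "S1", "S2", "Z",
    "pH", "qM") to a list of samples taken at the same time instants.
    """
    return sum(
        mean_squared_error(model_series[name], data_series[name])
        for name in data_series
    )
```

In practice the individual terms would usually be normalized or weighted so that a variable with large numerical values, such as COD, does not dominate the fit.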
The identifiability of these parameters is conditioned by isolated structures, grouped into subsystems, that depend on their natural interactions. Equations (1) and (3) can be grouped into one subsystem because the state variables S1 and X1, and hence the corresponding parameters, are not influenced by the others. Equation (5') is independent and can be run separately. The subsystems composed of Equations (1)-(4) can then be considered independently. Under normal operational scenarios, the reactor has three equilibrium operating points. The first occurs when the reactor operates in a stable region within the operational constraints. The second occurs when the system operates under instability scenarios, and the third occurs when the reactor experiences a washout, so that X1 ≅ 0 and S1 = S1in. When the subsystem composed of Equations (1) and (3) converges, Equations (2) and (4) also converge toward a stability region. In the same way, Equations (5') and (6) converge at the same time [19,35].

Static Parameter Identification Procedure: Genetic Algorithms
Genetic algorithms are inspired by the ability of nature to adapt and evolve, conditioned by the environment and by the genetic characteristics of previous generations. The algorithm is based on a rough random search over the solution space; thus, convergence depends on the number of iterations needed until the optimal solution is found. This option is used as a reference because it is considered one of the best alternatives found in the literature [36].
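For readers unfamiliar with this family of methods, a minimal genetic algorithm for bounded parameter vectors might look like the sketch below. It is not the implementation used in this work: truncation selection, blend crossover, Gaussian mutation, elitism, and all default settings are choices assumed here purely for illustration, and `genetic_minimize` is a hypothetical name.

```python
import random


def genetic_minimize(cost, bounds, pop_size=30, generations=60,
                     mutation_scale=0.1, seed=0):
    """Minimize cost(params) over the box given by bounds = [(lo, hi), ...].

    Returns the best parameter vector found and the best cost per generation.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable

    def clip(value, lo, hi):
        return max(lo, min(hi, value))

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=cost)
        history.append(cost(population[0]))
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0][:]]           # elitism: keep the best as is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            # Blend crossover followed by Gaussian mutation, clipped to bounds.
            child = [w * ga + (1.0 - w) * gb for ga, gb in zip(a, b)]
            child = [clip(g + rng.gauss(0.0, mutation_scale * (hi - lo)), lo, hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = children
    best = min(population, key=cost)
    history.append(cost(best))
    return best, history
```

Because the best individual is always carried over unchanged, the recorded best cost never increases from one generation to the next; how quickly it approaches the optimum depends on the random search, as noted above.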

Static Parameter Identification Procedure: Step-Ahead
Figure 4a shows the structure of the algorithm and the step-by-step design process that aims to estimate the parameters of the experiment. The technique, called step-ahead, places a mathematical model at its core (AM2 in our case) to predict the evolution of the system dynamics over each time step, taking X(0) as the starting point (represented by a black circle, •). At this time, the algorithm uses the mathematical model to calculate one feasible step forward, aiming to discover the evolution of the system in advance (represented by a black triangle, ▲). Then, at X(1), the new measured value (•) is compared with the previous prediction (▲).
Step by step, this procedure measures the difference between the values of • and ▲ (error_step# = •_step# − ▲_step#); these step-by-step errors are the quantities to be minimized throughout the experiment. To calculate the prediction of the system in advance, X(0) requires the initial conditions, the inputs u_s(0), and the value of the parameters to be calculated, p(0).

E = [error_1, error_2, ..., error_n] (16)

In the next step, in order to perform a new prediction (at X(1)), the algorithm updates the initial conditions, replacing the previous prediction with the newly measured values (•) as the new initial condition. This strategy allows the deviation (or error) to be corrected at each step. At the end of the simulation, the result is a vector that stores all the errors encountered at each step [37]. When the step-ahead algorithm is used, the objective of the optimization problem changes drastically: the algorithm works on minimizing the errors encountered at each step. Compared with Equation (13), the operational and physical boundaries are the same. The new optimization problem is shown in Equation (17).
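The step-ahead idea can be sketched in a few lines of Python. To keep the example small, it uses the original (unextended) alkalinity balance dZ/dt = D(Zin − Z), discretized with an explicit Euler step of one day, as a stand-in for the full model, with Zin as the only unknown parameter. The function names and the numerical values are illustrative assumptions, and the squared-error sum is one plausible reading of the objective, not a reproduction of Equation (17).

```python
def alkalinity_step(z, d, params, dt=1.0):
    """One explicit-Euler step of dZ/dt = D * (Zin - Z)."""
    return z + dt * d * (params["Z_in"] - z)


def simulate(step_model, x0, inputs, params):
    """Free-running simulation, used here only to create synthetic 'measurements'."""
    values = [x0]
    for u in inputs:
        values.append(step_model(values[-1], u, params))
    return values


def one_step_errors(step_model, measurements, inputs, params):
    """Vector E of Equation (16): measurement minus one-step-ahead prediction.

    Each prediction restarts from the measured value of the previous step,
    so model errors do not accumulate along the trajectory.
    """
    errors = []
    for k in range(len(measurements) - 1):
        predicted = step_model(measurements[k], inputs[k], params)  # triangle
        errors.append(measurements[k + 1] - predicted)              # circle - triangle
    return errors


def step_ahead_cost(step_model, measurements, inputs, params):
    """Sum of squared one-step-ahead errors for a candidate parameter set."""
    return sum(e * e for e in one_step_errors(step_model, measurements,
                                              inputs, params))
```

A candidate parameter set can then be scored with `step_ahead_cost` inside any optimizer; for data generated with Zin = 100, for instance, the cost is zero at Zin = 100 and positive for any other value.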
In the following Section 4.2, a specific algorithm is proposed to complete the information needed to establish the minimal set of measurements required from the reactor. The observer focuses on estimating the concentrations of acidogenic bacteria and methanogenic archaea, which cannot be obtained in situ with reliable commercial transducers or analytical protocols.

An Asymptotic Observer for State Estimation When Reaction Rates Are Unknown
Controlling and monitoring an anaerobic digester requires all measurements to be available. The absence of information caused by the lack of reliable sensors, together with the difficulty of constantly obtaining all measurements from laboratory analyses, opens up an opportunity to reduce this uncertainty by using online software sensors [13,29].
Online software state sensors, the so-called state observers [19], estimate the state variables of homogeneous reaction systems from other information that is relatively easy to obtain and is tied to the system. Depending on the data available beforehand (the reaction kinetics and the static parameters), different state observers have been proposed in the literature [38]. Here there is an additional, substantial restriction: the reaction kinetics are unknown. The observer is therefore designed to avoid the demanding requirement that the kinetic information be known [38]. These conditions lead to a particular category of observers, named asymptotic observers, which estimate the non-measurable states under two conditions: the system is not exponentially observable, and the reaction kinetics are unknown. In order to adapt the design of the state observer, we consider the general homogeneous reaction systems described by the general nonlinear state space model, i.e., [38].
where K is the matrix of yield coefficients, φ is the matrix of reaction rates, ξ contains the states of the process, Q is the rate of gaseous mass outflow for each component of ξ, and F is the feed of the homogeneous reaction mixture for each component of ξ. The design of the algorithm has to fulfill the following conditions: the matrix φ is unknown, the yield coefficients in K are fully known, and the number of measured state variables, q_state, is equal to or greater than the rank of the matrix K, i.e., q_state = dim(ξ1) ≥ rank(K). In Equation (18), dim(ξ) = dim(F) = dim(Q) = N, dim(φ) = M, and dim(K) = N × M. Finally, we suppose that there exists at least one partition, ξ_a and ξ_b, of the state vector ξ. Thus, the general nonlinear model of Equation (18) can be divided as: where the rank of K is p. The submatrix K_a is taken from K and has dimensions p × M, and the submatrix K_b contains the remaining rows of K. Finally, the pairs (ξ_a, ξ_b), (Q_a, Q_b), and (F_a, F_b) are the corresponding parts of ξ, Q, and F induced by K_a and K_b. This formulation has the following feature: there exists a transformation that defines Z_ob as a linear combination of ξ_a and ξ_b, thus.
derived from Equation (21), the dynamic is represented as follows: then, using Equation (19) on Equation (22) results in: solving the last equation: and then grouping the following expressions: finally, using Equation (22) on Equation (25) results in: In order to maintain the mass balance structure in the partitioned system, one condition has to be met: the last term of Equation (26) must be eliminated so that the basic structure of Equation (26) is preserved [38]. According to Equation (26), there are two conditions, φ = 0, thus: Finally, according to the previous equations, the state space model is equivalent to: If the expression F_a − Q_a = 0 holds, the partition made by Equations (19) and (20) is appropriate, because the new dynamics of Z_ob are independent of K and φ. Equation (28) shows the values of ξ_a, which are independent of φ (the information on the reaction kinetics).
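The displayed equations of this derivation are not reproduced in this text. As a reading aid, the standard asymptotic observer construction for reaction systems, which the steps above appear to follow, can be summarized as below. The dilution term −Dξ, the sign convention for F and Q, and the absence of equation numbers are assumptions of this summary and may differ in detail from Equations (18)-(29).

```latex
\begin{align}
\frac{d\xi}{dt} &= K\,\varphi(\xi) - D\,\xi + F - Q \\
\frac{d\xi_a}{dt} &= K_a\,\varphi - D\,\xi_a + F_a - Q_a, \qquad
\frac{d\xi_b}{dt} = K_b\,\varphi - D\,\xi_b + F_b - Q_b \\
Z_{ob} &= A_0\,\xi_a + \xi_b \\
\frac{dZ_{ob}}{dt} &= (A_0 K_a + K_b)\,\varphi - D\,Z_{ob} + A_0\,(F_a - Q_a) + (F_b - Q_b) \\
A_0 K_a + K_b = 0 \;&\Rightarrow\; \frac{dZ_{ob}}{dt} = -D\,Z_{ob} + A_0\,(F_a - Q_a) + (F_b - Q_b) \\
\hat{\xi}_b &= Z_{ob} - A_0\,\xi_a
\end{align}
```

Choosing A_0 so that A_0 K_a + K_b = 0 removes the unknown reaction rates from the dynamics of Z_ob; when K_a is square and invertible, this gives A_0 = −K_b K_a^{-1}.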

Observer Design Using the AM2 Extended Model
Using Equations (28) and (29) to design the state observer, and employing the general nonlinear dynamical model of Equations (1)-(6), the design is stated in Equations (30) and (31). The new process describes the decoupled subsystem formed by the state variables X1, X2, S1, and S2, then: Considering the previous subsystem, the subsequent state equations are structured as follows: Then, comparing the subsystem in Equation (31) with the equivalent generic Equations (28) and (29) results in the following specifications:

• The original nonlinear state space system is decoupled into two parts: the subsystem in Equation (31), and the other part, which includes the remaining state variables, inorganic carbon C and total alkalinity Z.
• The information usually contained in matrices Q_a and Q_b is located in the state variable C.
• The matrices Q1 and Q2 are the reaction rates r1 and r2.
Based on the previous requisites, ξ a and ξ b represent the information of measurable and non-measurable states, thus: therefore: using Z = A 1 ξ 1 + A 2 ξ 2 to compare with the previous structure in Equation (33) results in: as a consequence, in order to find A 0 : Using Equation (35) and employing the previous information leads to the following matrices of Equation (36).
The next step is to separate the non-measurable states in order to estimate the variables Z ob 1 and Z ob 2 , and the measurable states S 1 and S 2 . Using Equation (21) and solving for ξ b : At this point, we have the matrices of ξ a , A 0 , and Z ob .
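The construction can be checked numerically with a small Python sketch. It assumes that the measured states are ξ_a = (S1, S2), the unmeasured states are ξ_b = (X1, X2), and the yield matrices are K_a = [[−k1, 0], [k2, −k3]] and K_b = I, as in the standard AM2 biological subsystem; under these assumptions A_0 = −K_a⁻¹. All names and numerical values are illustrative, the "plant" is a simulated Monod model used only to generate measurements, and the observer never evaluates the kinetics.

```python
DEMO_PARAMS = {"mu1_max": 0.5, "K_S1": 5.0, "mu2_max": 0.3, "K_S2": 2.0,
               "k1": 20.0, "k2": 10.0, "k3": 30.0, "alpha": 1.0}  # assumed values


def plant_step(x, s, inputs, p, dt):
    """Euler step of the simulated 'true' subsystem (data generator only)."""
    (x1, x2), (s1, s2) = x, s
    d, s1_in, s2_in = inputs
    r1 = p["mu1_max"] * s1 / (p["K_S1"] + s1) * x1   # acidogenesis rate
    r2 = p["mu2_max"] * s2 / (p["K_S2"] + s2) * x2   # methanogenesis rate
    x_next = (x1 + dt * (r1 - p["alpha"] * d * x1),
              x2 + dt * (r2 - p["alpha"] * d * x2))
    s_next = (s1 + dt * (d * (s1_in - s1) - p["k1"] * r1),
              s2 + dt * (d * (s2_in - s2) + p["k2"] * r1 - p["k3"] * r2))
    return x_next, s_next


def a0_matrix(p):
    """A0 = -Ka^{-1}, so that A0*Ka + Kb = 0 with Kb = I."""
    return ((1.0 / p["k1"], 0.0),
            (p["k2"] / (p["k1"] * p["k3"]), 1.0 / p["k3"]))


def estimate_biomass(z_hat, s, p):
    """Biomass estimates: xi_b = Z_ob - A0 * xi_a."""
    a0 = a0_matrix(p)
    return (z_hat[0] - a0[0][0] * s[0] - a0[0][1] * s[1],
            z_hat[1] - a0[1][0] * s[0] - a0[1][1] * s[1])


def observer_step(z_hat, s, inputs, p, dt):
    """Euler step of Z_ob; uses only measured S1, S2 and the yields."""
    a0 = a0_matrix(p)
    d, s1_in, s2_in = inputs
    x1_hat, x2_hat = estimate_biomass(z_hat, s, p)
    f1, f2 = d * (s1_in - s[0]), d * (s2_in - s[1])  # transport terms of S1, S2
    dz1 = -p["alpha"] * d * x1_hat + a0[0][0] * f1 + a0[0][1] * f2
    dz2 = -p["alpha"] * d * x2_hat + a0[1][0] * f1 + a0[1][1] * f2
    return (z_hat[0] + dt * dz1, z_hat[1] + dt * dz2)


def run_demo(days=200.0, dt=0.1, d=0.05, p=DEMO_PARAMS):
    """Run plant and observer together; the observer starts from wrong guesses."""
    inputs = (d, 60.0, 3.0)
    x, s = (1.0, 0.5), (20.0, 3.0)
    a0 = a0_matrix(p)
    x_guess = (2.0, 1.5)  # deliberately off by 1.0 in each biomass
    z_hat = (x_guess[0] + a0[0][0] * s[0] + a0[0][1] * s[1],
             x_guess[1] + a0[1][0] * s[0] + a0[1][1] * s[1])
    for _ in range(int(round(days / dt))):
        z_hat = observer_step(z_hat, s, inputs, p, dt)
        x, s = plant_step(x, s, inputs, p, dt)
    return x, estimate_biomass(z_hat, s, p)
```

In this sketch the estimation error shrinks by a factor (1 − αD·dt) at every step, regardless of the kinetics, so the convergence speed is set by the dilution rate alone and cannot be tuned. This is the usual trade-off of asymptotic observers.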
The state variables X 1 and X 2 are unknown. The variables Z ob 1 and Z ob 2 represent the new dynamics independent from the reaction kinetics contained on φ. The matrix A 0 has the yield coefficients, and the state variables S 1 and S 2 are the estimation spaces. The values of F a and F b are: Finally, using the Equation (39), the expression of Z ob is stated in Equation (40): Table 1 shows the data on the dilution rates used to feed the reactor during the experiment. In Figure 5, the results of both parameter optimization algorithms, (genetic and stepahead) are compared to measure the improvements achieved. The discontinuous black line represents the experimental results. In contrast, the blue line is the result achieved by the genetic algorithm GA, and the red line is the result obtained using the step-ahead SA algorithm. Figure 5a shows the results around the adjustment of the AM2 model extended using the data and the two algorithms from VFA. The red line closely fits the data better than the blue line. Figure 5b shows an even better adjustment from the step-ahead algorithm around data from COD than those obtained for VFA. Figure 5c shows the time course of pH. The performances of both algorithms are similar, excluding the initial results, where the starting point from the genetic algorithm is in a lower position than the one-step-ahead. Figure 5d shows the results of the identification procedure running for total alkalinity Z. The dynamics of the two algorithms closely follow the discontinuous black line; however, the algorithm step-ahead actively captures the trajectories from data giving better results. Figure 5e shows the alkalinity characterization of the inlet parameter. Trying to replicate the measurements made on-site, it is assumed that the value is virtually-measured every four days. Figure 5f shows the parameter identification results over the dynamic CH 4 produced by the anaerobic process. 
There is a significant difference compared with the previous results. While the step-ahead algorithm closely follows the trends of the variables VFA, COD, and Z, for CH 4 it yields roughly an average value over the data. As in the previous results, the genetic algorithm captures only the average trend. Equation (41) was used to calculate the improvements achieved by the step-ahead algorithm over the genetic algorithm. The variables tested were VFA, COD, Z, pH, and CH 4 . The results are shown in Table 2. The largest improvements were obtained for VFA and COD, at 78.7% and 60.5%, respectively.
Figure 4b shows the proposed structure, which combines direct measurement (sensors) with the asymptotic observer that estimates the state variables X 1 and X 2 under the restriction that the reaction kinetics are unknown. This structure seeks to address the lack of measurements caused by the absence of reliable sensors. Figure 6 shows the results obtained by the asymptotic observer, which reconstructs the concentrations of acidogenic bacteria X 1 and methanogenic archaea X 2 using the parameters compiled in Table 3. These parameters were calculated after running both algorithms several times until the best fit was achieved. To test the observer's performance, the estimated values were compared with the dynamics obtained directly from the mathematical model. The observer and the mathematical model were started from different initial conditions to confirm convergence.
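The comparison between the two algorithms can be illustrated with a short sketch. Equation (41) is not reproduced here, so the definition below is an assumption: it takes the root-mean-square error (RMSE) of each fit against the data and reports the relative reduction in percent. The function names `rmse` and `improvement_percent` and the sample numbers are hypothetical and are not taken from Table 2.

```python
import math


def rmse(measured, simulated):
    """Root-mean-square error between two equally long sequences."""
    if len(measured) != len(simulated) or not measured:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(m - s) ** 2 for m, s in zip(measured, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_percent(error_reference, error_candidate):
    """Relative error reduction of the candidate over the reference, in percent.

    Assumed definition (not necessarily Equation (41) of the paper):
    100 * (E_ref - E_cand) / E_ref.
    """
    if error_reference <= 0:
        raise ValueError("reference error must be positive")
    return 100.0 * (error_reference - error_candidate) / error_reference


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    data = [1.0, 2.0, 3.0, 4.0]
    fit_ga = [1.4, 2.4, 3.4, 4.4]   # stands in for the genetic-algorithm fit
    fit_sa = [1.1, 2.1, 3.1, 4.1]   # stands in for the step-ahead fit
    e_ga = rmse(data, fit_ga)
    e_sa = rmse(data, fit_sa)
    print(f"improvement: {improvement_percent(e_ga, e_sa):.1f}%")
```

A relative measure of this kind makes variables with different units (for example pH and CH 4 flow) comparable on a single scale, which is presumably why the improvements are reported as percentages.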

Results
Figure 6. Test for asymptotic observers; a comparison between real and observed state variables. (a) Dynamic of state variable X 1 ; (b) dynamic of state variable X 2 .
Finally, after 60 days, both dynamics converged. Once the dynamics of the mathematical model and the estimator ran together, both reacted simultaneously to unexpected changes. On day 50, the operating point changed: the dilution rate changed from 0.03 to 0.07. The concentrations of acidogenic and methanogenic microorganisms in the reactor dropped sharply. Just as the dynamics were about to stabilize, a new change occurred on day 100: the dilution rate changed from 0.07 to 0.05. The dynamics of the mathematical model and of the estimation algorithm continued together toward their natural behavior. From that moment, the dynamics of the mathematical model and the observer remained together.
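The convergence behavior described above can be reproduced with a minimal sketch of the asymptotic-observer idea. It assumes a single-reaction simplification (one biomass X, one substrate S) rather than the full extended AM2 model. The Monod kinetics, the yield `k1`, the biomass retention factor `alpha`, the inlet concentration, and the initial conditions are illustrative assumptions, not values from Table 3; only the dilution-rate schedule (0.03, 0.07, 0.05) follows the text. The auxiliary variable z = S + k1·X has dynamics that do not contain the growth rate, so the observer never calls `monod`.

```python
def monod(s, mu_max=1.2, k_s=7.1):
    """Growth rate used only by the simulated 'true' plant (assumed kinetics)."""
    return mu_max * s / (k_s + s)


def dilution_rate(t):
    """Step changes on day 50 and day 100, as described in the text."""
    if t < 50.0:
        return 0.03
    if t < 100.0:
        return 0.07
    return 0.05


def simulate(t_end=300.0, dt=0.01, k1=42.14, alpha=0.5, s_in=10.0,
             x0=0.5, s0=2.0, x_hat0=1.0):
    """Explicit-Euler simulation of the plant and the observer side by side.

    Plant:     dX/dt = (mu(S) - alpha*D) * X
               dS/dt = D*(S_in - S) - k1*mu(S)*X
    Observer:  z_hat tracks z = S + k1*X through
               dz_hat/dt = D*S_in - (1 - alpha)*D*S - alpha*D*z_hat
               X_hat = (z_hat - S) / k1        (S is the measured state)
    Returns a list of (time, X, X_hat) tuples.
    """
    x, s = x0, s0
    z_hat = s0 + k1 * x_hat0        # observer starts from a wrong biomass guess
    history = []
    for i in range(int(round(t_end / dt))):
        d = dilution_rate(i * dt)
        mu = monod(s)               # unknown to the observer
        dx = (mu - alpha * d) * x
        ds = d * (s_in - s) - k1 * mu * x
        dz_hat = d * s_in - (1.0 - alpha) * d * s - alpha * d * z_hat
        x += dt * dx
        s += dt * ds
        z_hat += dt * dz_hat
        history.append(((i + 1) * dt, x, (z_hat - s) / k1))
    return history


if __name__ == "__main__":
    t, x_true, x_est = simulate()[-1]
    print(f"day {t:.0f}: X = {x_true:.4f}, X_hat = {x_est:.4f}")
```

In this sketch the estimation error in z decays at the rate alpha·D whatever the kinetics are, so the convergence speed is set by the dilution rate and cannot be tuned. This is the usual trade-off of an asymptotic observer: robustness to unknown reaction rates in exchange for a fixed convergence rate.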
This paper achieved the objective of developing, step by step, the layer that supports the design of control and monitoring strategies. The methodology demonstrates the reliability of using mathematical models, as sensitive as possible, that capture the biological and physicochemical behavior of the anaerobic process. In the future, the aim will be to analyze the performance of an optimal control scheme that uses mathematical models as a basis to represent AD in reactors.
Table 3. Parameters calculated by the genetic algorithm and the step-ahead algorithm.

Parameter; Value (Genetic Algorithm); Value (Step-Ahead); Unit

Conclusions
This paper designed a feasible methodology for implementing control and monitoring schemes for a wide range of substrates in anaerobic reactors. The AM2 model has been used extensively in the literature because it is 'insensitive' (as necessary) given the lack of phenomenological knowledge. The model circumvents this difficulty by confining the missing biological knowledge to specific terms, the reaction rates. The original model considers that the total alkalinity is not affected by process rates because it assumes that some substrates do not contain proteins or amino acids. However, substrates containing proteins or amino acids are prevalent in industry. Therefore, new variables were added to extend the range of substrates that can be modeled. The next step consisted of feeding the measured data into a parameter identification algorithm based on optimization. At this point, a novel step-ahead method considerably improved the fit achieved between the dynamics of the mathematical model and the data. Compared with the well-known genetic algorithm, the variables used to test the performance (S 1 , S 2 , Z, pH, and CH 4 ) showed improvements of 78.7%, 60.5%, 38.6%, 25.5%, and 7.7%, respectively. Once the mathematical model was parameterized to its real characteristics, one limitation remained: the concentrations of acidogenic bacteria and methanogenic archaea cannot be measured continuously. Because analytical methods cannot be used in real environments, an asymptotic observer for state estimation when reaction rates are unknown is proposed to supply the information required by the mathematical model. After a period of convergence, the error between the observer and the dynamics did not exceed 2%.

Funding:
The authors wish to express their gratitude to Fundación FIDETIA (G91045419), Universidad de Sevilla, Spain, for funding this work. The author Luis G. Cortés expresses special acknowledgements to Fundación Centro de Estudios Interdisciplinarios Básicos y Aplicados (CEIBA), Colombia, for funding. Laboratorio de Investigaciones en Catálisis y nuevos materiales LICATUC (Universidad de Cartagena).