A Two-Stage Method for Parameter Identification of a Nonlinear System in a Microbial Batch Process

This paper deals with the parameter identification of a microbial batch process of glycerol to 1,3-propanediol (1,3-PD). We first present a parameter identification model for the excess kinetics of a microbial batch process of glycerol to 1,3-PD. This model is a nonlinear dynamic optimization problem that minimizes the sum of the least-square and slope errors of biomass, glycerol, 1,3-PD, acetic acid, and ethanol. Then, a two-stage method is proposed to efficiently solve the presented dynamic optimization problem. In this method, two nonlinear programming problems are required to be solved by a genetic algorithm. To calculate the slope of the experimental concentration data, an integral equation of the first kind is solved by using the Tikhonov regularization. The proposed two-stage method could not only optimally identify the model parameters of the biological process, but could also yield a smaller error between the measured and computed concentrations than the single-stage method could, with a decrease of about 52.79%. A comparative study showed that the proposed two-stage method could obtain better identification results than the single-stage method could.


Introduction
There are widespread applications for 1,3-propanediol (1,3-PD) [1]. In the microbial production of 1,3-PD, the bio-dissimilation of glycerol to 1,3-PD has attracted the interest of researchers since the 1980s because it possesses a relatively high yield and productivity [2]. In recent years, much research has been conducted to improve the dissimilation process of glycerol from mathematical modeling, biochemical analysis, process optimization, etc. . For example, to describe the cell growth kinetics of multiple-inhibitions and product formation in a continuous bio-dissimilation process, a quantitative description has been given by Xiu et al. [3], Zeng and Deckwer [4], and Zeng and Biebl [5]. Lama et al. [6] addressed the metabolic engineering of Klebsiella pneumoniae J2B for the production of 1,3-PD from glucose. Lee et al. [7] and Sun et al. [8] reviewed the advances in biological and chemical methods for the conversion of glycerol into 1,3-PD. Vivek et al. [9] carried out a comparative evaluation of the metabolite fluxes in 1,3-PD production of cell recycling, simple batch, and continuous fermentation processes by using the Lactobacillus brevis N1E9.3.3 strain. Rodriguez et al. [10] addressed the kinetic modeling of raw glycerol to 1,3-PD production by Shimwellia blattae. This kinetic model can describe the influence of the initial glycerol concentration on 1,3-PD production. Liu et al. [11] presented the bi-objective dynamic optimization of a nonlinear time-delay system to optimize 1,3-PD production in a microbial batch process. Yuan et al. [12] gave an optimal control strategy of a nonlinear batch system with a time delay to maximize 1,3-PD production. Hirokawa et al. [13] used the engineered cyanobacterium Synechococcus elongatus to improve the production of 1,3-PD by optimizing the gene expression level of a metabolic pathway and operation conditions. Narisetty et al. [14] improved 1,3-PD production by maintaining physical conditions and optimizing media composition. To address the multi-objective optimization of the continuous culture process of glycerol to 1,3-PD, Xu et al. [15] proposed four optimization models and solved them with the normal-boundary intersection (NBI) and weighted-sum (WS) methods. Xu and Li [16] proposed a multilevel programming method to infer the common metabolic objective function for glycerol bio-dissimilation to 1,3-PD by Klebsiella pneumoniae. This approach is more feasible and has a better prediction performance than the existing method [16]. Xu and Wang [17] presented three parameter identification models of a microbial batch process of glycerol to 1,3-PD to identify the parameter values of the nonlinear biological system by considering three different error criteria of biomass, glycerol, 1,3-PD, acetic acid, and ethanol. These parameter identification models are dynamic optimization problems. They can be solved by transforming the original dynamic optimization problems into corresponding nonlinear programming problems. However, the transformed nonlinear programming problems are difficult to solve for global optimality. To deal with this difficulty, in this study, a two-stage method is proposed to efficiently handle the parameter identification of the microbial batch process of glycerol to 1,3-PD.
In the following, we first present a parameter identification model for the excess kinetics of the microbial batch process of glycerol to 1,3-PD by Klebsiella pneumoniae. Then, a two-stage method is proposed to efficiently solve the presented dynamic optimization problem. Section 4 presents the parameter identification results obtained by the proposed two-stage method. A comparative study is also given to show that the proposed two-stage method can obtain better identification results than the single-stage method. Finally, some brief conclusions of the present work are given in Section 5.

Microbial Batch Process
Based on previous work [2], the material balance relationships of the microbial batch process of glycerol to 1,3-PD by Klebsiella pneumoniae can be expressed as follows: dC HAc dt = q HAc X, q HAc = m HAc + µY m HAc + ∆q m where t denotes the culture time (h); t 0 ≥ 0 and t N > 0 are the initial and terminal times of the culture process, respectively (h); X denotes the biomass (g/L); C S denotes the substrate (glycerol) concentration (mmol/L); C PD denotes the concentration of product 1,3-PD (mmol/L); C HAc denotes the concentration of product acetic acid (mmol/L); C EtOH denotes the concentration of product ethanol (mmol/L); X 0 , C S0 , C PD0 , C HAc0 , and C EtOH0 are the initial values of X, C S , C PD , C HAc , and C EtOH , respectively; µ denotes the specific growth rate of cells (1/h); q S denotes the specific consumption rate of substrate glycerol (mmol/(g·h)); q PD , q HAc , q EtOH denote the specific formation rates of products 1,3-PD, acetic acid, and ethanol, respectively (mmol/(g·h)); m S is the maintenance term of glycerol consumption under substrate-limited conditions (mmol/(g·h)); m PD , m HAc , and m EtOH denote the maintenance terms of corresponding product formation under substrate-limited conditions, respectively (mmol/(g·h)); Y m S denotes the maximum growth yield (g/mmol); Y m PD , Y m HAc , and Y m EtOH are the product yield of the corresponding products (mmol/g); ∆q m S is the maximum increment of the glycerol consumption rate under substrate-sufficient conditions (mmol/(g·h)); ∆q m PD , ∆q m HAc , and ∆q m EtOH are the maximum increments of the corresponding product formation rate under substrate-sufficient conditions (mmol/(g·h)); and K * S , K * PD , K * HAc , and K * EtOH denote the saturation constants of glycerol and corresponding products in the excess terms, respectively (mmol/L).

Parameter Identification Model
To identify the values of the parameters m S , m PD , m HAc , m EtOH , Y m S , Y m PD , Y m HAc , Y m EtOH , ∆q m S , ∆q m PD , ∆q m HAc , ∆q m EtOH , K * S , K * PD , K * HAc , and K * EtOH in Equations (1)-(12), we propose the following parameter identification model for the microbial batch process of glycerol to 1,3-PD: EtOH max (13) subject to satisfying dC HAc dt dC EtOH dt In Equations (13)-(41), the objective function F is the sum of the least-square and slope errors of biomass (X), glycerol (C S ), 1,3-PD (C PD ), acetic acid (C HAc ), and ethanol (C EtOH ); X e (t j ), C Se (t j ), C PDe (t j ), C HAce (t j ), and C EtOHe (t j ) are the measured data for biomass, glycerol, 1,3-PD, acetic acid, and ethanol at the sampling time t j (j = 1, 2, · · · , N), respectively; X(t j ), C S (t j ), C PD (t j ), C HAc (t j ), and C EtOH (t j ) denote the computed concentrations for biomass, glycerol, 1,3-PD, acetic acid, and ethanol at t j , respectively; X max , C Smax , C PDmax , C HAcmax , and C EtOHmax are the maximum measured concentrations of biomass, glycerol, 1,3-PD, acetic acid, and ethanol in [t 0 , t N ], respectively; N is the number of sampled data points; . X e (t j ), C HAce (t j ), and . C EtOHe (t j ) are the approximate experimental slopes for biomass, glycerol, 1,3-PD, acetic acid, and ethanol at t j , respectively; .
C HAc (t j ), and . C EtOH (t j ) are the computed slopes for biomass, glycerol, 1,3-PD, acetic acid, and ethanol at t j , respectively; .
The inequality constraints in Equations (21)-(25) keep the specific growth rate µ, specific consumption rate q S , and specific formation rates q PD , q HAc , and q EtOH within certain physically and chemically feasible limits; 2.
Obviously, the parameter identification model in Equations (13)-(41) is a nonlinear dynamic optimization problem with complex constraints. Therefore, it is difficult to solve for global optimality.

Two-Stage Method for the Parameter Identification Model
In this work, we propose a two-stage method to efficiently solve the presented parameter identification model in Equations (13)-(41). This method addresses the parameter identification problem in Equations (13)-(41) through a two-stage procedure instead of directly solving it.

Two-Stage Method
We first rewrite the parameter identification problem in Equations (13)-(41) as subject to satisfying 16 , p l ∈ R 16 , and p u ∈ R 16 have the following formulations: The general method for solving a dynamic optimization problem depends on the discretization of dynamic system equations within the framework of the direct transcription [25][26][27]. By using a certain discretization technique, one can convert an infinite-dimensional dynamic optimization problem into a finite-dimensional, large-scale, nonlinear programming problem. To discretize the dynamic system equations, a collocation method should be used. For example, we can use the first-order Runge-Kutta (RK) method and higher-order RK methods, such as the implicit trapezoidal method, to discretize the dynamic system equations.
The implicit trapezoidal method is applied here to discretize the dynamic system in Equations (43)-(47). The discretized equations are written as where η j = t j − t j−1 , j = 1, 2, · · · , N. The implicit trapezoidal equations (56)-(60) help reduce the dimension of the dynamic optimization problem in Equations (42)-(55), but they still involve a large-scale nonlinear programming problem. To reduce the computation time, we use the following modified formulations of the implicit trapezoidal method by applying experimental data to Equations (56)-(60): where j = 1, 2, · · · , N. Replacing Equations (43)-(47) with Equations (61)-(65), we can convert the dynamic optimization problem in Equations (42)-(55) into the following nonlinear programming problem: subject to satisfying µ(x, p) = 0.67x 2 0. 28 To efficiently solve the nonlinear programming problem in Equations (66)-(78), we first regard µ(t j )(j = 0, 1, · · · , N), q S (t j ), q PD (t j ), q HAc (t j ), and q EtOH (t j ) as the identified parameters and rewrite it as subject to satisfying where µ l j , q l Sj , q l PDj , q l HAcj , and q l EtOHj are the lower bounds of µ(t j ), q S (t j ), q PD (t j ), q HAc (t j ), and q EtOH (t j ), respectively; and µ u j , q u Sj , q u PDj , q u HAcj , and q u EtOHj are the upper bounds of µ(t j ), q S (t j ), q PD (t j ), q HAc (t j ), and q EtOH (t j ), respectively.
For the parameter identification problem in Equations (79)-(90), we have the following remarks: 1.
In the computation of the optimization problem in Equations (79)-(90), the slopes .
The slopes .
The optimization problem in Equations (79)-(90) is a relatively simple quadratic programming problem compared to the nonlinear programming problem in Equations (66)-(78). Thus, it is easy to obtain the globally optimal solution of the problem in Equations (79)-(90) with the available quadratic programming algorithms.
Let the optimal values of µ(t j )(j = 0, 1, · · · , N), q S (t j ), q PD (t j ), q HAc (t j ), and q EtOH (t j ) be µ , p), and q EtOH (x, p), respectively, we have the following 4(N + 1) nonlinear equations: After solving the nonlinear Equations (91)-(94), we can obtain the values of the identified parameters p k (j = 1, 2, · · · , 16). This goal can be achieved by solving the following optimization problem: subject to satisfying In this problem, F 1 (t j ), F 2 (t j ), F 3 (t j ), and F 4 (t j ) have the following formulations The optimal values of the identified parameters p k (k = 1, 2, · · · , 16) can be obtained by solving the problem in Equations (95) and (96) with the available optimization solver.

Computing the Slopes of Experimental Data
As stated in the optimization problem in Equations (79)-(90), the objective of Equation (79) involves the slopes .
x ie (t j ) of 5N experimental data. These experimental slopes can be estimated by some numerical methods, such as the artificial neural network method [28] and the cubic spline interpolation algorithm [29]. In this work, based on the Tikhonov regularization method [30], we present the following procedures to estimate the experimental slopes .

Optimization Results and Discussion
In this work, we applied the Tikhonov regularization method given in Section 3.2 to estimate the experimental slopes .
x ie (t j ) and used the genetic algorithm in MATLAB TM software to solve both the optimization problems of Equations (79)-(90) and of Equations (95) and (96). In the implementation of the genetic algorithm, we set all the algorithm parameters to be the default values. Experimental concentration data x ie (t j )(i = 1, 2, · · · , 5) were drawn from Feng and Xiu [31] and Xu and Wang [17]. The initial point x 0 of the studied system was selected as (0.1905, 400.043, 0, 0, 0) T . The parameter N K used in the Tikhonov regularization method was set to be 502. The lower and upper bounds for the optimized variables µ(t j )(j = 0, 1, · · · , N), q S (t j ), q PD (t j ), q HAc (t j ), and q EtOH (t j ) and the identified parameters p k (k = 1, 2, · · · , 16) are given in Tables 1 and 2, respectively.

Parameter
Lower Bound Upper Bound 0 100 q S (t 4 ) 0 100 q PD (t 0 ) 0 100 q PD (t 1 ) 0 100 q PD (t 2 ) 0 100 q PD (t 3 ) 0 100 q PD (t 4 ) 0 100 q HAc (t 0 ) 0 100 q HAc (t 1 ) 0 100 q HAc (t 2 ) 0 100 q HAc (t 3 ) 0 100 q EtOH (t 0 ) 0 100 q EtOH (t 1 ) 0 100 q EtOH (t 2 ) 0 100 q EtOH (t 3 ) 0 100 q EtOH (t 4 ) 0 100 Figure 1 presents the optimal values of the variables µ(t j )(j = 0, 1, · · · , N), q S (t j ), q PD (t j ), q HAc (t j ), and q EtOH (t j ), obtained by solving the optimization problem in Equations (79)-(90) in the first stage of the two-stage method. Applying these optimal values to the optimization problem in Equations (95) and (96) and then solving it in the second stage of the two-stage method, we could obtain the optimal values of 16 identified parameters p k (k = 1, 2, · · · , 16). These results are shown in Table 2. Table 2 also gives the lower and upper bounds for identified parameters p k (k = 1, 2, · · · , 16). Table 3 shows the comparison between the proposed two-stage method and the single-stage method applied in Reference [17]. As can be seen in the total error function F value, the proposed two-stage method in this work could yield a smaller error between the measured and computed concentrations than the single-stage method applied in Reference [17], with a decrease of about 52.79%. This conclusion clearly shows the tractability and effectiveness of the proposed two-stage method in handling the parameter identification of the microbial batch process of glycerol to 1,3-PD.  Figure 1 presents the optimal values of the variables ) ). These results are shown in Table 2. ). Table 3 shows the comparison between the proposed two-stage method and the single-stage method applied in Reference [17]. As can be seen in the total error function F value, the proposed two-stage method in this work could yield a smaller error between the measured and computed concentrations than the single-stage method applied in Reference [17], with a decrease of about 52.79%. This conclusion clearly shows the tractability and effectiveness of the proposed two-stage method in handling the parameter identification of the microbial batch process of glycerol to 1,3-PD.  Table 3. Comparison between the proposed two-stage method and the single-stage approach from Reference [17].  Figures 2 and 3, we can see that the calculated concentrations with the proposed two-stage method were better fitted to the experimental data than with the single-stage method. To further measure the goodness-of-fit of the model, the coefficient of determination in statistical analysis was used as a metric. The coefficient of determination of the presented study was 0.9994 (0.9994 is closer to 1), and was larger than the coefficient of determination (0.9243) obtained by the single-stage method, with an improvement of about 8.13%. This concludes that the biological model obtained by the two-stage method provided better goodness-of-fit for the experimental concentration data than the single-stage method did.

Method Proposed Two-Stage Method Single-Stage Method from Reference [17]
(a) (b) Figure 1. The optimal values µ * (t j ), q * s (t j ), q * PD (t j ), q * HAC (t j ), and q * EtOH (t j ) of the variables µ(t j ), q S (t j ), q PD (t j ), q HAc (t j ), and q EtOH (t j ), obtained by solving the optimization problem in Equations (79)-(90) in the first stage of the two-stage method: (a) µ * (t j ); (b) q * S (t j ); (c) q * PD (t j ); (d) q * HAc (t j ); (e) q * EtOH (t j ).  Figures 2 and 3, we can see that the calculated concentrations with the proposed two-stage method were better fitted to the experimental data than with the single-stage method. To further measure the goodness-of-fit of the model, the coefficient of determination in statistical analysis was used as a metric. The coefficient of determination of the presented study was 0.9994 (0.9994 is closer to 1), and was larger than the coefficient of determination (0.9243) obtained by the single-stage method, with an improvement of about 8.13%. This concludes that the biological model obtained by the two-stage method provided better goodness-of-fit for the experimental concentration data than the single-stage method did.   Figures 2 and 3, we can see that the calculated concentrations with the proposed two-stage method were better fitted to the experimental data than with the single-stage method. To further measure the goodness-of-fit of the model, the coefficient of determination in statistical analysis was used as a metric. The coefficient of determination of the presented study was 0.9994 (0.9994 is closer to 1), and was larger than the coefficient of determination (0.9243) obtained by the single-stage method, with an improvement of about 8.13%. This concludes that the biological model obtained by the two-stage method provided better goodness-of-fit for the experimental concentration data than the single-stage method did.

Conclusions
This paper addressed the problem of parameter identification for a microbial batch process of glycerol to 1,3-PD. A two-stage method was proposed to efficiently solve the presented parameter identification problem. In the first stage of this method, a simple quadratic programming problem is first required to be solved. The optimized variables of this quadratic programming problem are the specific growth rate μ , specific consumption rate S q , and specific formation rates PD q , HAc q , and EtOH q at time j t ( N j , , 1 , 0  = ). Applying the optimization results of the quadratic programming problem to the second stage of the proposed two-stage method, we can obtain the values of the 16 identified parameters. A comparative study was conducted and showed that the proposed two-stage method could obtain better identification results than the single-stage method could. Although the two-stage method was proposed here to identify the parameters of a microbial batch process of glycerol to 1,3-PD, it can also be applied to the parameter identification of other biological systems.

Conclusions
This paper addressed the problem of parameter identification for a microbial batch process of glycerol to 1,3-PD. A two-stage method was proposed to efficiently solve the presented parameter identification problem. In the first stage of this method, a simple quadratic programming problem is first required to be solved. The optimized variables of this quadratic programming problem are the specific growth rate µ, specific consumption rate q S , and specific formation rates q PD , q HAc , and q EtOH at time t j (j = 0, 1, · · · , N). Applying the optimization results of the quadratic programming problem to the second stage of the proposed two-stage method, we can obtain the values of the 16 identified parameters. A comparative study was conducted and showed that the proposed two-stage method could obtain better identification results than the single-stage method could. Although the two-stage method was proposed here to identify the parameters of a microbial batch process of glycerol to 1,3-PD, it can also be applied to the parameter identification of other biological systems.