Multi-Rate Data Fusion for State and Parameter Estimation in (Bio-)Chemical Process Engineering

Abstract: For efficient operation, modern control approaches for biochemical process engineering require information on the states of the process such as temperature, humidity or chemical composition. These measurements are gathered from a set of sensors which differ with respect to sampling rates and measurement quality. Furthermore, for biochemical processes in particular, analysis of physical samples is necessary, e.g., to infer cellular composition, resulting in delayed information. As an alternative to using these delayed measurements directly for control, so-called soft-sensor approaches can be used to fuse delayed multi-rate measurements with the help of a mathematical process model and provide information on the current state of the process. In this manuscript we present a complete methodology based on cascaded unscented Kalman filters for state estimation from delayed and multi-rate measurements. The approach is demonstrated for two examples, an exothermic chemical reactor and a recently developed model for biopolymer production. The results indicate that the current state of the systems can be accurately reconstructed. The approach therefore represents a promising tool for further application in advanced model-based control, not only of the considered processes but also of related processes.


Introduction
In recent years, application of automatic control to bio-chemical manufacturing processes has become increasingly important to keep the required product quality in close bounds, guarantee process safety and also decrease the corresponding environmental impact. Examples are found in a wide range of industries including pharmaceutical and food manufacturing.
For efficient operation, these control approaches require information on the current values of important process states and parameters. Often, those are either corrupted by noise or not directly measurable. Furthermore, measurements of different process quantities are gained from various sensor types which may differ in sampling rate, accuracy and lag. In fact, many online sensors can provide instantaneous but lumped (e.g., mean particle size instead of the full particle size distribution for fluidized bed granulation) or indirect/inferential measurements gained from auxiliary variables (e.g., for drying processes, indirect measurement of product moisture content from moisture content of drying gas at the outlet) [1,2]. In contrast, offline measurements are generally more accurate but accompanied by significant measurement lags, e.g., resulting from sample drawing, preparation and analysis for biochemical processes.
From the control point of view, the previously described points lead to a dilemma: On the one hand, fast but rather unreliable data are no suitable basis for a reliable automatic controller. On the other hand, it is also not an option to "wait" for reliable measurement data, as corresponding controllers would not be able to act quickly on changes in the process. As an alternative to advanced (and probably too expensive) sensors that allow the generation of accurate and quasi-instantaneous online data, model-based approaches, so-called soft sensors, can be applied to merge different measurements and provide reliable estimates of the current process state.
In the literature, this problem has drawn some attention and thus different approaches have been studied. In [3], the most significant estimation algorithms have been presented and compared. Here, Bayesian estimators represent a specific category [4][5][6][7][8] that is an alternative to optimization-based techniques such as moving horizon estimators [9][10][11]. An important class of approximate Bayesian estimators are Kalman filters (KF) and particle filters [6,12,13]. In [14] and the references therein, different KF-based approaches to the previously stated problem are discussed. Additionally, practical examples of different approaches for (bio-)chemical processes are found in the literature, e.g., Kager et al. [15] developed an estimator for penicillin production that combines online and delayed offline data using particle filtering. The approach showed convincing performance in estimating states and parameters for real measurement data. Furthermore, multi-rate estimators were developed and evaluated for pilot-scale polymerization reactors in [16][17][18].
In this contribution, we present a new model-based methodology, which is able to merge measurements of various sampling rates, lag horizons and accuracies. The approach is based on cascaded unscented Kalman filters (UKF). Application to two example processes, a nonlinear exothermic reactor and microbial biopolymer production, is demonstrated. The results are discussed and indicate that the presented concept is able to provide reliable online-estimates and is easily adaptable to a wide range of processes.

Unscented Kalman Filtering
Assume that the process model is given as

$$\dot{x} = f(x, u), \quad x(0) = x_0, \qquad y = h(x, u),$$

with $x \in \mathbb{R}^{n_x}$, $y \in \mathbb{R}^{n_y}$ and $u \in \mathbb{R}^{n_u}$ representing the system's state, output and input vectors. Furthermore, the system's initial state is denoted as $x_0 \in \mathbb{R}^{n_x}$. In general, the state dynamics $f$ as well as the measurement equation $h$ are nonlinear functions of the states and the inputs. In many practical cases, not all dynamic states can be measured directly, i.e., $n_x > n_y$, and furthermore, measurements may be corrupted by noise processes $w(t)$. Moreover, the process dynamics itself may only be known up to a certain degree or be affected by stochastic processes summarized in the signal $v(t) \in \mathbb{R}^{n_v}$. For state estimation, i.e., the reconstruction of states $x$ from available measurements $y$, the framework of Kalman filtering can be applied to explicitly account for the underlying noises and uncertainties. In particular for nonlinear dynamics and measurement functions, the so-called unscented Kalman filter (UKF) can be applied. The basic formulation, as presented in [12,19], is described in a discrete-time framework, with system dynamics and measurement function given by

$$x_{k+1} = f(x_k, u_k) + w_k, \qquad y_k = h(x_k, u_k) + v_k.$$

For the sake of simplicity, it is assumed that the process and the measurement noise, $w_k$ and $v_k$, result from random stochastic processes. Without loss of generality, we assume that both can be drawn from zero-mean Gaussian distributions with covariances $Q_k$ and $R_k$. Furthermore, we assume that both affect the states and measurements in an additive sense.
In principle, the traditional structure of the classical Kalman filter [20] for linear dynamics is kept. However, in contrast to the prominent extended Kalman filter (EKF), the UKF applies a sample-based linearization using sigma points (SP). The prediction steps are given as:
1. Compute the SPs from the previous state estimate and its covariance.
2. Propagate each SP using the model equations.
3. Using the weights, compute the predicted state estimate and its covariance as well as $P_{xy}$, $P_{yy}$ and $\hat{y}_k^{(-)}$ by averaging over the weighted SPs.
Suitable tuning of $\kappa$ is discussed in [12] and the references therein. In the second step, the predicted states and covariance are updated with the current measurement. It is worth mentioning that different extensions and variants have been introduced in recent decades, e.g., the square-root version [12]. Further discussion is provided in [21].
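The prediction and update steps described above can be illustrated with a minimal sketch. It assumes the basic sigma-point set with a single scaling parameter kappa and additive noise; the function names (`sigma_points`, `unscented_moments`, `ukf_step`) and the choice to redraw the sigma points before evaluating the measurement function are assumptions of this sketch, not necessarily the formulation used in [12,19].

```python
import numpy as np


def sigma_points(x, P, kappa):
    """Return the 2n+1 sigma points (rows) and their weights for mean x and covariance P."""
    n = x.size
    S = np.linalg.cholesky((n + kappa) * P)      # S @ S.T == (n + kappa) * P
    pts = np.vstack([x, x + S.T, x - S.T])       # rows of S.T are the columns of S
    w = np.full(2 * n + 1, 0.5 / (n + kappa))
    w[0] = kappa / (n + kappa)
    return pts, w


def unscented_moments(pts_in, w, fun):
    """Propagate sigma points through fun; return outputs, weighted mean and covariance."""
    out = np.array([fun(p) for p in pts_in])
    mean = w @ out
    dev = out - mean
    return out, mean, dev.T @ (w[:, None] * dev)


def ukf_step(x, P, y, f, h, Q, R, kappa=1.0):
    """One prediction/update cycle with additive noise covariances Q and R."""
    # Prediction: sigma points -> model -> weighted mean and covariance
    X, w = sigma_points(x, P, kappa)
    _, x_pred, P_pred = unscented_moments(X, w, f)
    P_pred = P_pred + Q
    # Predicted output from sigma points redrawn around the prediction
    X2, w = sigma_points(x_pred, P_pred, kappa)
    Y, y_pred, P_yy = unscented_moments(X2, w, h)
    P_yy = P_yy + R
    P_xy = (X2 - x_pred).T @ (w[:, None] * (Y - y_pred))
    # Update with the current measurement y
    K = P_xy @ np.linalg.inv(P_yy)
    return x_pred + K @ (y - y_pred), P_pred - K @ P_yy @ K.T
```

Here `f` and `h` take and return one-dimensional arrays. For linear `f` and `h`, this step reproduces the classical Kalman filter update, which is a convenient sanity check before applying it to a nonlinear model.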

UKF for Delayed and Multi-Rate Measurements
As described in the introduction, in practice, measurement data is acquired from different sensors that differ not only in accuracy but potentially also in measurement delays and sampling rates. The classical formulation presented above must be adapted to such cases in order to allow accurate estimation of states.
In the case of delayed measurements, the UKF algorithm is adapted such that only states up to $(t - \tau)$ can be reconstructed at the current sample time $t$. For the interval $(t - \tau, t]$, states (and also covariances representing uncertainties) can be predicted using the unscented transformation, i.e., the prediction step of the UKF. Further adaptation is necessary if measurements differ in their respective sampling rates. In the case of two different sensors, a "fast" and a "slow" sensor, the first can be assumed to have sampling interval $\Delta t$ without loss of generality, while the sampling interval of the second is given as an integer multiple $N \Delta t$. This means that the slow measurement is only available every $N$-th time step, and the measurement equation is therefore given as

$$y_k = \begin{cases} [y_{k,fast},\; y_{k,slow}]^T & \text{if } k \bmod N = 0, \\ y_{k,fast} & \text{otherwise.} \end{cases}$$

Obviously, for further sensors, this representation can become rather confusing. Alternatively, in particular in the Kalman filtering context, it is convenient to keep the standard measurement equation, i.e.,

$$y_k = [y_{k,fast},\; y_{k,slow}]^T \quad (19)$$

and to weight the measurements in the update step by setting high values for the measurement noise of currently unavailable measurements.
This alternative solution is easier to implement for more than two sensors because the native KF structure can be kept.
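An example is given in the following. Assume that the fast sensor has a sampling interval of $\Delta t$ and the slow sensor a sampling interval of $2\Delta t$. This means that $y_{k,slow}$ can only be measured in every second time step and $y_{k,fast}$ in every time step. Thus, from step to step, the reconstruction of states alternates between using the measurement $y_{k,fast}$ alone and using the measurements from both sensors, $y_{k,slow}$ and $y_{k,fast}$. As a result, the measurement equation switches between the output $y_{k,fast}$ and the stacked output $[y_{k,fast},\; y_{k,slow}]^T$. Alternatively, the standard measurement equation with the stacked output is kept in every step, with a specific weighting: the measurement noise variance assigned to $y_{k,slow}$ is set to a high value in every step in which no slow measurement is available.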
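The weighting idea for the two-sensor example can be sketched as follows. The function names, the placeholder variance `r_missing`, the numerical values and the convention that the slow sample arrives whenever `k` is a multiple of `n_slow` are assumptions made for illustration only. The linear gain is used merely to show the effect of the inflated variance; in the UKF the same matrix would enter as $R_k$.

```python
import numpy as np


def measurement_noise(k, r_fast, r_slow, n_slow=2, r_missing=1e12):
    """Return R_k for the stacked output y_k = [y_fast, y_slow]^T.

    Assumption: a slow sample exists whenever k is a multiple of n_slow.
    """
    slow_available = (k % n_slow == 0)
    return np.diag([r_fast, r_slow if slow_available else r_missing])


def kalman_gain(P, H, R):
    """Gain of a linear update, used here only to show the effect of R_k."""
    return P @ H.T @ np.linalg.inv(H @ P @ H.T + R)


# Toy setup (hypothetical numbers): one state observed by both sensors.
P = np.array([[1.0]])
H = np.array([[1.0], [1.0]])
K_both = kalman_gain(P, H, measurement_noise(0, r_fast=0.1, r_slow=0.01))
K_fast_only = kalman_gain(P, H, measurement_noise(1, r_fast=0.1, r_slow=0.01))
```

In the step without a slow sample, the gain entry belonging to the slow channel is practically zero, so whatever placeholder value is stored in the slow entry of the measurement vector has no noticeable influence on the update.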

Results and Discussion
This section evaluates the performance of the presented methods in inferring the system's state from (potentially) delayed measurements with different sampling rates. To this end, two examples are analyzed: In the first case, the UKF is applied to an exothermic reactor described by a system of two nonlinear ODEs [22]. In addition to the reconstruction of unmeasurable states, the UKF's potential for online parameter estimation is assessed. The second example is concerned with state estimation for microbial production of biopolymers in a lab-scale setup. The underlying model was recently developed in our group on the basis of our own experimental data [23].

Model Formulation
The dimensionless model suggested by Schaffner and Zeitz [22] describes the nonlinear dynamics of an exothermic chemical reaction in terms of conversion $x_1$ and temperature $x_2$. The control input $u$ represents the heating/cooling of the reactor. Depending on $u$, the reactor dynamics may exhibit persistent oscillations. Additionally, this model has frequently served as a benchmark in engineering education for the design of nonlinear observers and controllers, where one usually assumes that the temperature is measurable while the conversion is not. However, reconstruction of the conversion from temperature measurements is possible with suitable approaches, e.g., the UKF.
In the following, we will assume that measurements of $x_1$ and $x_2$ are available at different sampling rates: While the temperature can be measured rather easily, e.g., with a standard industrial thermometer, direct measurement of conversion would be based on physical samples. Sample analysis may take a certain time span $\tau = k_{del} \Delta t$. The corresponding measurement is thus not only available at a lower rate but also delayed. In terms of a measurement equation, the output $y_k = y(t = k\Delta t)$ therefore consists of the current temperature and the conversion delayed by $k_{del}$ sampling steps. Following the convention introduced in the previous section, the noise variance of the conversion measurement depends on the sampling time: it is set to a high value whenever no conversion measurement is available. All model parameters can be found in Table 1. For the simulations, an additional unknown stochastic part in the process dynamics was assumed. The deterministic part was solved with the MATLAB function ode15s, and for the stochastic part the Euler-Maruyama method [24] was implemented. For the shown scenarios, artificial measurements are used.
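To illustrate how a stochastic part can be added to the process dynamics, the following sketch applies the Euler-Maruyama scheme to a generic model with additive noise. It is a simplification under stated assumptions: the drift is integrated with the explicit Euler step of the scheme itself rather than with a stiff solver such as ode15s, and the function name `euler_maruyama`, the noise intensity `q` and the linear test drift are hypothetical.

```python
import numpy as np


def euler_maruyama(f, x0, u, dt, n_steps, q, rng):
    """Simulate dx = f(x, u) dt + sqrt(q) dW with additive noise intensity q."""
    x = np.empty((n_steps + 1, len(x0)))
    x[0] = x0
    for k in range(n_steps):
        # Increments of a Wiener process have variance dt
        dW = rng.normal(0.0, np.sqrt(dt), size=len(x0))
        x[k + 1] = x[k] + f(x[k], u) * dt + np.sqrt(q) * dW
    return x


# Hypothetical linear drift, used only to exercise the integrator.
def drift(x, u):
    return -x + u


rng = np.random.default_rng(0)
trajectory = euler_maruyama(drift, np.array([1.0, 2.0]), 0.0, 0.1, 50, 0.01, rng)
```

Artificial measurements could then be generated by sampling such a trajectory at the sensor instants and adding measurement noise.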


Scenario I: State Estimation
In Figures 1 and 2, the performance of the UKF in reconstructing the states is presented for different time points. UKF parameters and initial conditions are summarized in Table 2.

Table 2. Parameters and initial conditions for exothermic reactor scenario I.

In the figures, the so-called delay window is marked with green lines: the right green line denotes the current sampling instant $t$ while the left one corresponds to $(t - \tau)$. Within the delay window, only the online measurement of the temperature $x_2$ is available, and no information is provided by the slow conversion sensor. Thus, state estimates between the two green lines, i.e., in $(t - \tau, t]$, are based on temperature measurements only. In contrast, state estimates outside of the delay window, i.e., for times up to $(t - \tau)$ (left of the left green line), are based on measurements of both temperature and conversion. In Figure 1, estimation results are shown for two different time points. A general effect is observed: if a conversion measurement becomes available (black diamond), the estimation accuracy improves significantly not only for $x_1$ itself but also for $x_2$, as the estimation quality of the latter in the delay window crucially depends on the quality of the estimates at the left border of the delay window. The update based on the slow conversion sensor thereby improves estimation accuracy for the temperature within the delay window. In Figure 2, the overall estimation results are depicted. It can be seen that both states are estimated quite accurately in the face of the stochastic dynamics and measurements. However, reconstruction of $x_1$ generally becomes worse the longer the time since the last slow sensor measurement, and also in the delay window, where estimation is based on measurements of $x_2$ alone.

Scenario II: Simultaneous State and Parameter Estimation
In addition to states that cannot be directly measured at each time instance, in practice it frequently occurs that model parameters are either not fully known or even change during process operation, e.g., as a result of aging or fouling processes. In this case, the state estimation problem can be adapted to enable simultaneous estimation of states and parameters.
For the nonlinear reactor, the state vector is augmented by the unknown parameters $a_1$ and $b_2$, and the UKF approach is implemented for the augmented state vector $[x_1,\; x_2,\; a_1,\; b_2]^T$. Initial conditions and further parameters are summarized in Table 3.

Table 3. Parameters and initial conditions for exothermic reactor scenario II.

While the estimation of $x_1$ is not sufficiently accurate because of inaccurate parameter estimates, slow sensor measurements improve the estimation for a certain time span. However, the minima of the oscillations are not well captured for $t < 40$. Afterwards, the parameter estimates seem to be reasonably close to the process values such that the reconstruction improves. In particular for $b_2$, significant adaptation is seen when slow sensor measurements become available. It was also generally observed that estimation of $b_2$ within the delay window is rather inaccurate, while for $a_1$ correct trends were seen. The major reason is potentially related to practical observability: Within the delay window, the approach aims at reconstructing the augmented states from noisy measurement information on $x_2$ alone. In contrast, when a slow sensor measurement becomes available, the additional accurate information on $x_1$ obviously improves the method's potential for estimation of all four variables of interest.
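The state augmentation used in this scenario can be sketched generically as follows. The sketch assumes that the unknown parameters are modelled as constants driven by a small artificial process noise (a random-walk model), which is a common choice but is not stated explicitly in the text. The helper names and the toy dynamics are hypothetical and do not reproduce the reactor model.

```python
import numpy as np


def augment_dynamics(f, n_x):
    """Wrap f(x, theta) so that it acts on z = [x, theta] and keeps theta constant."""
    def f_aug(z):
        x, theta = z[:n_x], z[n_x:]
        return np.concatenate([f(x, theta), theta])
    return f_aug


def augment_process_noise(q_x, q_theta):
    """Diagonal Q for z; q_theta > 0 lets the filter adapt the parameters (random walk)."""
    return np.diag(np.concatenate([q_x, q_theta]))


# Hypothetical toy dynamics with two states and two parameters (not the reactor model).
def toy_dynamics(x, theta):
    return theta[0] * x + theta[1]


f_aug = augment_dynamics(toy_dynamics, n_x=2)
z_next = f_aug(np.array([1.0, 2.0, 0.5, 0.1]))
Q_aug = augment_process_noise([1e-4, 1e-4], [1e-6, 1e-6])
```

The augmented function and noise covariance can be passed to a standard UKF without further changes; the size of the parameter block of the noise covariance then controls how quickly the parameter estimates are allowed to move.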

Microbial PHA Production
Bioplastic is a very promising alternative to conventional plastic raw materials, e.g., polypropylene and polyethylene. Polyhydroxyalkanoates (PHAs) represent a very prominent bio-based and bio-degradable group of bioplastics that can be produced by a wide variety of microorganisms. Besides the well-known representative poly-3-hydroxybutyrate (PHB), the co-polymer poly(3-hydroxybutyrate-co-3-hydroxyvalerate) (PHBV) is more competitive in comparison to PHB because of its higher elongation-to-break values, lower melting points and higher biocompatibility [25]. Because of the high production costs for bioplastics such as PHBV, they have a low market share in the plastics industry.
To make bioplastics more competitive, model-based approaches can be used to analyze, optimize and finally control the production in order to reduce the costs. In the following section, our recently developed PHBV production model, which uses online CO$_2$ exhaust gas measurements to capture biomass growth, is described [23].

Model Formulation
The nonlinear model was presented in [23] and describes the dynamics of a fed-batch process using fructose ($c_{fru}$) and propionic acid ($c_p$) as carbon sources. Furthermore, the dynamics of the nitrogen source $c_n$, residual biomass $c_{res}$, HB content $c_{hb}$ and HV content $c_{hv}$ are accounted for. The resulting model therefore comprises a system of six nonlinear ordinary differential equations, as reported in Appendix A. The parameter values and a detailed description can be found in the original publication [23].
In contrast to the previous example, the process dynamics are assumed to be completely deterministic. In the following, it is assumed that measurements are based on physical samples. Furthermore, it is assumed that substrate concentrations can be determined within $t_{del,1}$ = 30 min and that these measurements are accompanied by medium measurement noise. In contrast, analysis of biomass content takes $t_{del,2}$ = 1 h, while assessment of HV and HB content takes $t_{del,3}$ = 2 h with low noise. The approach described previously was adapted accordingly to deal with this situation. In contrast to the original publication, only artificial data are used in this simulation study.
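The bookkeeping needed for several sensors with different delays and sampling rates can be sketched as follows. The delays correspond to the values given above, expressed in filter steps under the assumption of a step size of 0.5 h; the step size, the sampling periods and the function name `arriving_samples` are hypothetical and chosen for illustration only.

```python
def arriving_samples(k, sensors):
    """Return {sensor name: step at which the sample arriving at step k was drawn}.

    sensors maps a name to (sampling period, analysis delay), both in filter steps.
    """
    arrived = {}
    for name, (period, delay) in sensors.items():
        k_sample = k - delay
        if k_sample >= 0 and k_sample % period == 0:
            arrived[name] = k_sample
    return arrived


# Assumed filter step of 0.5 h: 30 min -> 1 step, 1 h -> 2 steps, 2 h -> 4 steps.
# The sampling periods (2, 4 and 8 steps) are hypothetical.
SENSORS = {
    "substrates": (2, 1),
    "biomass": (4, 2),
    "polymer": (8, 4),
}

schedule = {k: arriving_samples(k, SENSORS) for k in range(9)}
```

At each step, a filter built on this schedule would correct the stored estimate at the step at which the sample was drawn and then re-predict up to the current time, using only those measurements that are already available inside the delay window.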

State Estimation
In Figure 4, the simulation results are shown. It can be seen that the filter is able to smooth out the substrate measurements after a short period of adaptation. However, propionate estimates deteriorate between 15 h and 25 h. After the ammonium shot, estimation accuracy improves rapidly. This may also hint that, in future experiments, additional substrate shots could improve the general reconstruction quality. Furthermore, the HB and HV contents are accurately reconstructed. Estimation of the non-polymeric biomass shows less smooth behavior in comparison, i.e., the reconstruction of residual biomass is more sensitive to measurement noise.

Simultaneous State and Parameter Estimation
In Figure 5, simulation results are shown for simultaneous estimation of the states and the parameter $c_{p,inh}$. It can be seen that the approach is able to accurately reconstruct the states and also the unknown constant parameter. However, during design it was observed that the performance of the parameter estimation is rather sensitive to the UKF's tuning parameters, i.e., $\kappa$ and $Q$, which may require more advanced approaches in the future when additional parameters have to be estimated simultaneously.

Conclusions and Outlook
In this article, a UKF-based state estimation concept was presented which is able to incorporate delayed and multi-rate measurements from different sensors. Performance has been evaluated for two different examples: The first example was an exothermic chemical reactor described by nonlinear dynamic equations. It was shown that states as well as unknown parameters can be reconstructed with sufficient accuracy within and outside the corresponding delay window. In the authors' opinion, the increased computational effort resulting from the more involved implementation is more than compensated by the improved estimation accuracy. In a second example, the method was applied to state estimation of a microbial biopolymer production process. Here, three different sensor delays and sampling rates were assumed, which required adaptation of the previously presented method. Compounds with a slow sampling rate could be reconstructed accurately. Furthermore, the approach was adapted to allow for simultaneous reconstruction of an unknown model parameter. The results showed sufficient performance despite being rather sensitive to the UKF's tuning.
The overall results indicate that the presented method is a promising approach for online state and parameter estimation in (bio-)chemical processes. The UKF implementation is able to accurately reconstruct both from noisy and multi-rate measurements, which are often found in practice due to manifold measurement devices and complex analysis techniques based on physical samples. It thereby represents a valuable tool for advanced automatic control concepts which rely on accurate information on the corresponding process despite limited and delayed measurement information. The authors would also like to emphasize that the presented concept for multi-rate data fusion does not involve complex computations such as the solution of inverse problems by nonlinear optimization, as found in alternative approaches. Therefore, real-world and real-time realization does not require excessive computational power but could also be accomplished using rather simple platforms.
In the future, the focus will be on application of the approach to the control of (bio-)chemical processes. Examples include biopolymer production from agricultural and food industrial waste streams [26][27][28] as well as baker's yeast drying [29][30][31]. For more complex cases with multiple unknown parameters, stronger nonlinearities or non-Gaussian process noise, performance of the UKF approach may deteriorate. Here, particle filters [6,32] can represent a promising alternative. Furthermore, the outlined approach could be used to extend existing estimators for distributed parameter systems, as found in the description of particle formation [33][34][35] and biotechnological processes [36][37][38], to reconstruct the system's states and parameters in the presence of measurement delays or multi-rate measurements.

Funding: This work is partly funded by the project DIGIPOL from the European Regional Development Fund (ERDF) and the Center of Dynamic Systems (CDS) Magdeburg.

Conflicts of Interest:
The authors declare no conflict of interest.


Appendix A. Reactor Model for Biopolymer Production
The dynamics of the carbon sources are given by Equations (A1) and (A2). Inhibition effects are described by terms such as

$$\mathrm{inh}_2 = \max\left(0,\; 1 - \frac{c_n}{c_n + c_{n,sw}}\right).$$

The equations of fructose and propionic acid consumption, (A1) and (A2), each consist of four terms that describe the dynamic behaviour. The first term gives the substrate uptake for biomass production. The second term describes the consumption of the respective carbon source to produce the monomer units 3-hydroxybutyrate (HB) and 3-hydroxyvalerate (HV) of the polymer chain. The third term of each equation describes the conversion from carbon source to CO$_2$. Finally, the fourth term accounts for the dilution caused by the fed-batch mode.
The dilution factor in fed-batch mode is described as $D = F_{in}/V$, where $F_{in}$ denotes the feed flow rate and $V$ the reactor volume.
A volume balance is necessary in fed-batch mode. To describe the metabolic activity $b_{CO_2}(t)$, the relative CO$_2$ proportion is used: the metabolic activity is calculated from CO$_{2,out}$ in the exhaust gas and CO$_{2,in}$ in the fresh inlet air. The inhibitory steric variable used in the description of $\mathrm{inh}_3$ in Equation (A3) is given as the ratio between total polymer concentration and total biomass concentration. Besides a carbon source as substrate, the organism needs ammonium for non-PHA growth. In the ammonium dynamics, Equation (A8), the first term describes the residual growth of the bacteria, while the second term includes the conversion of biopolymer (HB and HV) to residual biomass.
The dynamics of residual (non-PHA, catalytically active) biomass are accounted for as well. Residual biomass can be produced by consumption of the carbon sources fructose and propionic acid (parameters $k_1$ and $k_2$). Furthermore, residual biomass can be produced by the conversion of HB and HV from the polymer chains in the presence of ammonium.
The product dynamics of the monomers HB and HV in the polymer chains are described by corresponding balance equations; for HV,

$$\frac{dc_{hv}}{dt} = k_6 \cdot c_{res} \cdot c_p \cdot \mathrm{inh}_2 \cdot \mathrm{inh}_3 - k_3 \cdot c_{res} \cdot c_n \cdot c_{hv} - D \cdot c_{hv}. \quad \text{(A11)}$$

HB can be produced from fructose and propionic acid with the parameters $k_4$ and $k_5$. HV can only be accumulated if propionic acid is metabolized, with parameter $k_6$. Both monomers of the chain (HB and HV) can be converted to residual biomass with the parameter $k_3$ in the presence of ammonium.
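As a small illustration, the HV balance (A11) and the inhibition term inh 2 can be written as plain functions. The function and argument names are assumptions of this sketch, the numerical values in the example call are arbitrary and not the parameters of [23], and inh 3 and the dilution factor D are passed in as given values because their defining equations are not reproduced here.

```python
def inh_2(c_n, c_n_sw):
    """Nitrogen-related inhibition term: max(0, 1 - c_n / (c_n + c_n_sw))."""
    return max(0.0, 1.0 - c_n / (c_n + c_n_sw))


def dc_hv_dt(c_hv, c_res, c_p, c_n, inh2, inh3, D, k3, k6):
    """Right-hand side of the HV balance (A11)."""
    production = k6 * c_res * c_p * inh2 * inh3   # HV formation from propionic acid
    conversion = k3 * c_res * c_n * c_hv          # HV converted to residual biomass
    dilution = D * c_hv                           # dilution by the feed
    return production - conversion - dilution


# Arbitrary illustrative values (not the parameters of the original model).
rate = dc_hv_dt(c_hv=1.0, c_res=2.0, c_p=3.0, c_n=0.0,
                inh2=inh_2(0.0, 1.0), inh3=0.5, D=0.1, k3=0.4, k6=0.2)
```

With the nitrogen source depleted (`c_n = 0`), the inhibition term equals one and the conversion term vanishes, so only production and dilution contribute to the rate.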
For the sake of simplicity, it is assumed that the dynamics of CO$_2$ and reactor volume are known with high precision from direct online measurements in real time. Therefore, both are treated as known time-variant parameters, which reduces the complexity of the model implementation. However, the applied soft sensor concept could easily be extended to include reconstruction of both states from available measurements, as depicted in Figure A1. Further information on the model itself is found in [23], and model parameters are summarized in Table A1.