Next Article in Journal
Increasing Salt Rejection of Polybenzimidazole Nanofiltration Membranes via the Addition of Immobilized and Aligned Aquaporins
Next Article in Special Issue
Data-Mining for Processes in Chemistry, Materials, and Engineering
Previous Article in Journal
Effect of Temperature and Microwave Power Levels on Microwave Drying Kinetics of Zhaotong Lignite
Previous Article in Special Issue
Numerical Simulation of Water Absorption and Swelling in Dehulled Barley Grains during Canned Porridge Cooking
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Incremental Parameter Estimation under Rank-Deficient Measurement Conditions

1
Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland
2
ETH Zürich, Institute of Environmental Engineering, 8093 Zürich, Switzerland
3
Laboratoire d’Automatique, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
*
Author to whom correspondence should be addressed.
Processes 2019, 7(2), 75; https://doi.org/10.3390/pr7020075
Submission received: 13 December 2018 / Revised: 25 January 2019 / Accepted: 27 January 2019 / Published: 2 February 2019
(This article belongs to the Special Issue Process Modelling and Simulation)

Abstract

:
The computation and modeling of extents has been proposed to handle the complexity of large-scale model identification tasks. Unfortunately, the existing extent-based framework only applies when certain conditions apply. Most typically, it is required that a unique value for each extent can be computed. This severely limits the applicability of this approach. In this work, we propose a novel procedure for parameter estimation inspired by the existing extent-based framework. A key difference with prior work is that the proposed procedure combines structural observability labeling, matrix factorization, and graph-based system partitioning to split the original model parameter estimation problem into parameter estimation problems with the least number of parameters. The value of the proposed method is demonstrated with an extensive simulation study and a study based on a historical data set collected to characterize the isomerization of α -pinene. Most importantly, the obtained results indicate that an important barrier to the application of extent-based frameworks for process modeling and monitoring tasks has been lifted.

Graphical Abstract

1. Introduction

Despite advances in model identification theory, parameter estimation can still be very challenging in practice. Such challenges include the lack of identifiability, large computational cost, the need to formulate appropriate experimental designs, and the fact that many methods, such as those for uncertainty analysis, are still being investigated and therefore not standardized. In this work, we focus on a novel method to tackle the computational challenge associated with the identification of kinetic parameters in large dynamic models. Hence, the other challenges associated with parameter identifiability, experimental design, and uncertainty analysis, though relevant, are not investigated here.
To handle the computational challenge, it is typical to devise a protocol for model fitting and model validation. Such protocols can be divided into two broad classes. In the first class, the protocols are domain specific. For instance, several protocols for the identification of activated sludge models are discussed in [1]. Similarly, a protocol for environmental system models is proposed in [2]. These protocols incorporate significant expertise specific to the particular application domain, thereby leading to protocols that are fine-tuned for that domain. While they tend to be similar on a conceptual level, it is rather difficult to apply these protocols universally.
The second class includes protocols that are more general and thus—in principle—broadly applicable. The incremental model identification framework studied in [3,4,5,6,7] is a good example. This method is grounded on the computation of extents, which are—loosely speaking—linear combinations of the original model states and capture the progress of individual dynamic phenomena, such as chemical reactions. A more precise definition will be given below.
The applicability of the extent-based framework is limited to cases where all extents can be computed based on measurements and invariant relationships. However, since this is rarely the case for biological process models, the extent-based framework has been extended in [8] to the case where each extent is either observable or non-sensed—see below for precise definitions. While this recent work provides a meaningful improvement, extent-based incremental model identification remains inapplicable for a wide range of biological scenarios found in practice. For example, this method cannot deal with the frequent case where an extent is sensed but unobservable.
The goal of this study is to present a novel method for incremental parameter identification that is more universally applicable. This method is based on the formulation of a generalized framework for extent computation and the use of a graph-based clustering algorithm. In what follows, we present the method and demonstrate its applicability to cases that could not be handled in an extent-based incremental model identification framework before. The expected impact of the newly developed tools is discussed at the end of this study.

2. Notation and Symbols

The matrix composed of the columns c of the matrix M is denoted M , c , while the matrix composed of the rows r of the matrix M is denoted M r , . All vectors are column vectors unless mentioned otherwise. Table 1 lists all symbols used in this study.

3. Methods

3.1. System Representation and Extents

3.1.1. Dynamic Model in Terms of Numbers of Moles

We study batch process systems whose dynamic behavior is described by a set of differential equations of the following form:
n ˙ ( t ) = N T V r ( t ) ,       n ( 0 ) = n 0 .
In the present work, the above equations represent a single-phase reaction system in a vessel with constant volume V. n ( t ) is the S-dimensional vector of numbers of moles; n 0 specifies the initial conditions; r ( t ) is the R-dimensional vector of reaction rates; N is the ( R × S )-dimensional stoichiometric matrix.
We assume that M noisy measurements, y ˜ h , are available at the H sampling times t h ( h = 1 , , H ), according to the following equations:
y ˜ h : = y ( t h ) + ϵ h = 1 V       M n ( t h ) + ϵ h , ϵ h N 0 , Σ ϵ ,
where y ( t h ) is the M-dimensional vector of noise-free measurements, ϵ h the M-dimensional vector of measurement errors, Σ ϵ the M-dimensional measurement error variance-covariance matrix, and M is the ( M × S ) -dimensional species-based measurement matrix.
We further assume that kinetic laws for the reaction rates are available. These laws are functions of the component masses, n ( t ) , and a T-dimensional vector of kinetic parameters, θ :
r ( t ) : = f n ( t ) , θ .
In the present study, we assume that the rate laws are known, except for the values of the parameters θ that need to be estimated.

3.1.2. Dynamic Model in Terms of Extents

We now adopt the definition of the extents of reaction, hereafter simply called extents, as the number of times each reaction has occurred since t = 0 . These extents are measured in moles and defined mathematically as:
x ( t ) : = V 0 t r τ d τ .
It follows that
n ( t ) = n 0 + N T x ( t ) .
Accordingly, (1)–(3) can be represented equivalently in the following form:
x ˙ ( t ) = V f n ( t ) , θ ,       x ( 0 ) = 0
y ˜ h = 1 V M n 0 + N T x ( t h ) + ϵ h
= y 0 + 1 V G x ( t h ) + ϵ h
n ( t ) = n 0 + N T x ( t ) ,
where
y 0 : = 1 V M n 0
G : = M N T ,
with the ( M × R ) -dimensional matrix G being labeled the extent-based measurement matrix.
Let A denote the rank of G , that is, A : = rank G min M , R .

3.2. Labeling Extents

3.2.1. Definitions

To label the extents, the following definitions are proposed:
  • The rth extent x r ( t ) is labeled sensed if the measurements y ˜ h are affected by x r ( t ) through the measurement Equation (7), that is, if G , r has at least one non-zero element.
  • The rth extent is labeled observable if (i) it is sensed ( G , r not null); and (ii) the change in the measurements y ˜ h caused by a change in x r ( t ) can be unambiguously attributed to the change in that extent, that is, G , r is independent of all other column vectors in G .
The above labeling is based on the structure of G and does not depend on the temporal resolution or quality of the recorded measurements ( y ˜ h ). As such, this means only structural observability is considered. This terminology is similar to the one used in data reconciliation [9,10] as the labels are produced without a dynamic model.

3.2.2. Labeling Procedure

To label the extents, one first computes the ( M × R ) -dimensional reduced row echelon form of G denoted B :
B : = rref G .
This ( M × R ) -dimensional matrix is composed of A non-zero rows and M A zero rows. Using B , the extents are labeled with the following procedure:
(a)
Label the R n extents corresponding to zero columns in B as non-sensed and use the vector n to identify their positions in x . Label the R s remaining extents as sensed and use the vector s to identify their positions in x .
(b)
Find all rows in B with a single non-zero element and find the column positions of these non-zero elements. Label the R o extents corresponding to these column positions as observable and use the vector o to identify their positions in x . These extents are observable because one can compute a unique value based on the available information (measurements, extent-based measurement matrix, and initial conditions).
(c)
Label the R a extents that are sensed but not observable as ambiguous. These extents are ambiguous because one cannot compute a unique value based on the available information. The vector a is used to identify their positions in x .
Based on this labeling, the vectors x s ( t ) , x n ( t ) , x o ( t ) , and x a ( t ) are defined to represent (i) the R s sensed extents, (ii) the R n = R R s non-sensed extents; (iii) the R o observable extents; and (iv) the R a = R s R o ambiguous extents. This is illustrated in Figure 1.
With the above definitions, (5) can be reformulated as:
n ( t ) = n 0 + N s , T x s ( t ) + N n , T x n ( t ) = n 0 + N o , T x o ( t ) + N a , T x a ( t ) + N n , T x n ( t ) .
Furthermore, it follows from G , n = 0 M × R n that
y ˜ h = y 0 + 1 V G , s x s ( t h ) + ϵ h = y 0 + 1 V G , o x o ( t h ) + 1 V G , a x a ( t h ) + ϵ h .

3.2.3. Practical Cases

Depending on the values of A, M, and R, one can distinguish the following cases:
  • Full-rank extent-based measurement matrix ( A = R ). This occurs when there are at least as many measurements as reactions ( M R ) and the matrix G is full column rank. As a result, R n = 0 , R o = R , and R a = 0 . This is the most frequently studied case, e.g., in [5,6,7]. This case enables computing unique values for the R extents by means of a linear transformation [5], thus allowing the estimation of kinetic parameters for each reaction rate model individually.
  • Rank-deficient measurement matrix ( A < R ). If A < R , for example because there are fewer measurements than reactions ( M < R ), it is no longer possible to compute all R extents from M measurements without additional information such as an established kinetic model. One can distinguish two situations within this case:
    (a)
    No ambiguity ( R a = 0 ). In this case, R o = A and R o + R n = R . As shown in [8], it is possible in this case to implement efficient parameter estimation by identifying subsystems of the complete model that include a subset of the kinetic rate laws and their parameters.
    (b)
    Ambiguity present ( 0 < R a A ). This situation results in R o < A . For this case, no generally applicable method for incremental parameter estimation is available until now.
The method proposed in this work is developed with the aim of handling all the above cases in a single framework for extent-based kinetic parameter estimation.

3.3. Observable and Unobservable Extent Directions

The ambiguous extents are now investigated in more detail to determine observable directions among them.

3.3.1. Factorization of G , a

Let ρ o be the rank of the ( M × R a ) -dimensional measurement matrix G , a . Then, G , a can be factorized into the ( M × ρ o ) -dimensional measurement matrix G o and the ( R a × ρ o ) -dimensional matrix V o :
G , a = G o V o T .
Using the reduced row echelon form B , a , the matrices G o and V o are computed as follows:
  • V o T is obtained by selecting the ρ o non-zero rows in B , a ,
  • G o is the matrix composed of the columns of G , a corresponding to the column positions of the first non-zero elements in the rows of V o T .

3.3.2. Definition of Observable and Unobservable Extent Directions

The term G , a x a ( t h ) in (13) can be expressed as:
G , a x a ( t ) = G o V o T x a ( t ) = G o χ o ( t ) ,
with χ o ( t ) the ρ o -dimensional vector of observable extent directions among the ambiguous extents:
χ o ( t ) : = V o T x a ( t ) = V o T I a , x ( t ) ,
where the ( R a × R ) -dimensional matrix I a , includes the rows a of the identity matrix I R such that
x a ( t ) = I a , x ( t ) .
Equation (15) indicates that while the extents x a ( t ) cannot be observed individually, their combined effects on the measurements can be observed as the linear combinations χ o ( t ) .
Remark 1.
The unobservable extent directions span the null space of G , a , which is also the null space of V o T . Denoting this null space by the ( R a × ρ u ) -dimensional matrix V u ,
V u = null V o T ,
with ρ u = ( R a ρ o ) , one can define the ρ u -dimensional vector of unobservable extent directions χ u ( t ) as:
χ u ( t ) : = V u T x a ( t ) = V u T I a , x ( t ) .

3.4. Observable Extents and Extent Directions

We further define the vector χ ¯ o ( t ) consisting of the A = R o + ρ o observable extents and extent directions as follows:
χ ¯ o ( t ) : = x o ( t ) χ o ( t ) = x o ( t ) V o T x a ( t ) .
With this definition, the measurement Equation (13) can be rewritten as:
y ˜ h = y 0 + 1 V G , o x o ( t h ) + 1 V G o χ o ( t ) + ϵ h = y 0 + 1 V G ¯ o χ ¯ o ( t h ) + ϵ h ,
with G ¯ o , an ( M × A ) -dimensional matrix, constructed as
G ¯ o : = G , o G o .
Since the ( R a × A ) -dimensional matrix G ¯ o has full column rank, (21) can be used to compute the maximum-likelihood estimates χ ¯ ˜ o , h of the observable extents and extent directions directly from the measurements:
χ ¯ ˜ o , h = P y ˜ h y 0 ,
with the ( A × M ) -dimensional matrix P given by:
P = V G ¯ o T Σ ϵ 1 G ¯ o 1 G ¯ o T Σ ϵ 1 .
The associated expected variance-covariance matrix of the estimation errors becomes:
Σ χ ¯ = P Σ ϵ P T = G ¯ o T Σ ϵ 1 G ¯ o 1 .

3.5. Unobservable Extents and Extent Directions

We finally define the vector χ ¯ u ( t ) consisting of the R A = R n + ρ u unobservable extents and non-sensed extent directions as follows:
χ ¯ u ( t ) : = x n ( t ) χ u ( t ) = x n ( t ) V u T x a ( t ) = I n , V u T I a , x ( t ) .
With this definition, the expression for the number of moles (12) can be rewritten as:
n ( t ) = n 0 + N o , T x o ( t ) + N a , T V o + T χ o ( t ) + V u + T χ u ( t ) + N n , T x n ( t ) = n 0 + N ¯ o T χ ¯ o ( t ) + N ¯ u T χ ¯ u ( t )
with:
N ¯ o : = N o , V o + N a ,
N ¯ u : = N n , V u + N a ,

3.6. System Partitioning

Incremental parameter estimation is based on the possibility of separating the parameter estimation problem into smaller problems. Ideally, if all extents of reaction could be computed from measurements, each reaction could be identified individually, that is, independently of the other reactions. This way, the parameter estimation problem reduces to the solution of R smaller problems. As discussed in the previous section, some extents may not be observable. For these situations, it would still be nice to be able to separate the parameter estimation problem into J smaller problems ( J R ). The objective is therefore to partition the reaction system effectively—with as many small groups of reactions as possible—to simplify the parameter estimation task. The first step to achieve this consists of system partitioning.
An algorithm is developed to group the kinetic parameters into J parameter subsets ( j = 1 , , J ), each represented as a T ( j ) -dimensional vector θ ( j ) satisfying the following properties:
  • The size T ( j ) of each parameter subset should be as small as possible.
  • The estimates θ ^ ( j ) in the jth parameter subset can be computed without consideration of any other parameter subset θ ( i ) , i j .
  • Each parameter in θ appears in at most one of the parameter subsets θ ( j ) .
This objective is achieved by means of model reformulation and graph-based system partitioning. The graph-based procedure can be interpreted as a symbolic manipulation of the process model. It does not require symbolic differentiation, however.

3.6.1. Step 1—Model Reformulation

An extended model is first defined to describe the dynamics of all extents and all observable directions among the ambiguous extents. To this end, the following procedure is applied:
(a) 
Express x ˙ ( t ) as a function of χ ¯ o ( t ) and x ( t )
The dynamic model (6) is modified by replacing n ( t ) with the right-hand side of (27):
x ˙ ( t ) = V f χ ¯ o ( t ) , χ ¯ u ( t ) , θ ,       x ( 0 ) = 0 .
The vector χ ¯ u ( t ) is now replaced with the right-hand side of (26). As a result, the above system becomes:
x ˙ ( t ) = V f χ ¯ o ( t ) , x ( t ) , θ ,       x ( 0 ) = 0 .
(b) 
State augmentation
Define the ρ aug -dimensional vector χ aug ( t ) : = x ( t ) χ o ( t ) = I R V o T I a , x ( t ) that includes all extents and extent directions, with ρ aug = R + ρ o . The dynamic behavior of χ aug ( t ) can be described by a differential-algebraic system including the R differential Equation (31) and the ρ o algebraic expressions (16):
x ˙ ( t ) = V f χ ¯ o ( t ) , x ( t ) , θ ,       x ( 0 ) = 0
χ o ( t ) = V o T I a , x ( t ) .
(c) 
Interpolation of the observable extents and extent directions
To increase the efficiency of system partitioning, it is useful to account for the fact that the observable extents and extent directions can be expressed in terms of measurements. However, since the observable extents and extent directions are only known at H discrete measurement points, their values always need to be obtained via interpolation. In this work, we apply piece-wise linear interpolation as follows:
χ ¯ ˜ o , i ( t ) : = χ ¯ ˜ o , h + t t h t h + 1 t h χ ¯ ˜ o , h + 1 χ ¯ ˜ o , h ,       t h t < t h + 1 ,       h = 1 , , H ,
with which the system (32) and (33) becomes:
x ˙ ( t ) = V f χ ¯ ˜ o , i ( t ) , x ( t ) , θ ,       x ( 0 ) = 0
χ o ( t ) = V o T I a , x ( t ) .

3.6.2. Step 2—Graph-Based System Partitioning

The equation system (35) and (36) is now analyzed by means of a graph partitioning procedure to determine the smallest groups of kinetic parameters that can be estimated separately. To this end, the following steps are performed:
(a) 
Create a graph
One creates a directed graph F with a vertex for every state variable in χ aug ( t ) and every parameter in θ . Hence, this graph has R + ρ o + T vertices. A directed arc is added from vertex v to vertex w if the vth element of χ aug ( t ) θ appears in the right-hand side of the wth equation in (35) and (36) ( v = 1 , , R + ρ o + T , w = 1 , , R + ρ o ). This graph represents the information flow for simulating (35) and (36). Additional arcs and vertices may be added to describe the influence of known inputs and the links between extents and measured variables. For system partitioning, this is however unnecessary and omitted for clarity.
(b) 
Extents predicted from measurements or simulation
The simultaneous approach uses a complete model of the reaction system to predict the extents (or concentrations) via simulation. If one wants to partition the reaction system into small groups of reactions, only the extents belonging to a given group can be generated via the simulation of that group. The other extents that enter the rate laws must be provided by the user as quantities known from measurements.
That information can be included in the graph F by annotating the various arcs. The arcs that originate at a vertex corresponding to an observable extent or an observable extent direction are labeled observation arcs, considering that observable extent or an observable extent direction can be replaced with their measured values (34). They are visualized as dashed-line arrows. The remaining arcs are labeled simulation arcs and visualized as solid-line arrows. The observation arcs represent the idea that the elements of χ ¯ ˜ o , i ( t ) can be regarded as known inputs for simulating (35) and (36).
(c) 
Subgraph selection
Identify the J subgraphs F ( j ) consisting of arcs and vertices in F on directed paths that (i) lead to a vertex representing an observable extent or an observable extent direction; and (ii) consist of simulation arcs only. The selected vertices represent an R ( j ) -dimensional vector of extents x ( j ) ( t ) , a ρ o ( j ) -dimensional vector of directions χ o ( j ) ( t ) , and a T ( j ) -dimensional vector of parameters θ ( j ) . The positions of x ( j ) ( t ) in x ( t ) are given by the vector j so that:
x ( j ) ( t ) : = I j , x ( t )
f ( j ) : = I j , f ,
and the selection matrices Λ o ( j ) and Λ θ ( j ) are defined so that:
χ o ( j ) ( t ) : = Λ o ( j ) χ o ( t )
θ ( j ) : = Λ θ ( j ) θ .
This means that each subgraph F ( j ) represents a subset of Equations (35) and (36) that describes the dynamics of x ( j ) ( t ) and χ o ( j ) ( t ) without reference to any other state variable:
x ˙ ( j ) ( t ) = V f ( j ) χ ¯ ˜ o , i ( t ) , x ( j ) ( t ) , θ ( j ) ,       x ( j ) ( 0 ) = x o
χ o ( j ) ( t ) = U ( j ) x ( j ) ( t ) ,
with U ( j ) : = Λ o ( j ) V o T I a , I j , T .
(d) 
Add observation arcs and vertices
For every graph F ( j ) , add (i) the observation arcs that have a target vertex belonging to F ( j ) and (ii) the source vertices of the added observation arcs. These added source vertices represent the minimal subset of interpolants in χ ¯ ˜ o , i ( t ) (34) that are required to simulate x ( j ) ( t ) and χ o ( j ) ( t ) and are referred to as χ ¯ ˜ o , i ( j ) ( t ) . This means that the graph F ( j ) now represents all information required to simulate the observable extents x ( j ) ( t ) and the observable extent directions χ o ( j ) ( t ) . Accordingly, one can rewrite the jth equation subsystem as:
x ˙ ( j ) ( t ) = V f ( j ) χ ¯ ˜ o , i ( j ) ( t ) , χ o ( j ) ( t ) , x ( j ) ( t ) , θ ( j ) ,       x ( j ) ( 0 ) = x o
χ o ( j ) ( t ) = U ( j ) x ( j ) ( t ) .
At the end of this procedure, the equation system (35) and (36) is approximated by J smaller equation subsystems, each including a subset of the kinetic parameters. As intended, every kinetic parameter appears in at most one of the J subsystems. In addition, the identified subsystems do not share any of the observable extents or extent directions as state variables, that is, each observable extent and extent direction is simulated in only one of the identified subsystems.

3.7. Parameter Estimation Methods

In this work, we solve the parameter estimation problem in two distinct ways. The first way consists in a simultaneous estimation of all parameters in the maximum-likelihood sense. The second way consists in an extent-based incremental parameter estimation. The next paragraphs describe how incremental parameter estimation approximates the simultaneous estimation procedure to minimize the number of parameters that are estimated together.

3.7.1. Simultaneous Parameter Estimation

Maximum-likelihood estimation of the kinetic parameters can be obtained by solving minimization of the weighted mean squared error:
Problem P 0
θ ^ = arg min θ Q 0 : = 1 H · M h H y ˜ h y ( t h ) T Σ ϵ 1 y ˜ h y ( t h ) ,
subject to (1)–(3). This estimation problem can be equivalently formulated in terms of the computed extents as follows:
Problem P 0 *
θ ^ = arg min θ Q 0 * : = h H χ ¯ ˜ o , h χ ¯ o ( t h ) T W χ ¯ ˜ o , h χ ¯ o ( t h ) ,
subject to (6)–(8) and (20) and with W : = Σ χ ¯ 1 .

3.7.2. Incremental Parameter Estimation

The incremental parameter estimation procedure is obtained by applying two modifications, A and B, to problem P 0 * .
(a) 
Modification A: Removing correlation terms. Let A ( j ) = R o ( j ) + ρ o ( j ) and define the A ( j ) -dimensional vector of observable extents and extent directions χ ¯ o ( j ) ( t ) . This vector includes all observable extents x o ( j ) ( t ) , whose positions in x ( j ) ( t ) are given by the vector o ( j ) , and all observable extent directions χ o ( j ) ( t ) in Subsystem j. We further define the matrix U ¯ ( j ) so that:
x o ( j ) ( t ) : = I o ( j ) , x ( j ) ( t )
χ ¯ o ( j ) ( t ) = x o ( j ) ( t ) χ o ( j ) ( t ) = U ¯ ( j ) x ( j ) ( t )
U ¯ ( j ) : = I o ( j ) , U ( j ) .
The A ( j ) -dimensional vector χ ¯ ˜ o , h ( j ) is then defined by selecting the elements of χ ¯ ˜ o , h as:
χ ¯ ˜ o , h ( j ) : = I e ( j ) , χ ¯ ˜ o , h ,
where the vector e ( j ) gives the positions of χ ¯ ˜ o , h ( j ) in χ ¯ ˜ o , h . Now define the residuals d h ( j ) associated with subsystem j:
d h ( j ) : = χ ¯ ˜ o , h ( j ) χ ¯ o ( j ) ( t h )
This way, the objective function Q 0 * defined above can be reformulated as:
Q 0 * = h H i J j J d h ( i ) T I e ( i ) , W I e ( j ) , T d h ( j ) = h H j J d h ( j ) T W ( j ) d h ( j ) + i , j : i j d h ( i ) T I e ( i ) , W I e ( j ) , T d h ( j )
and can subsequently approximated with Q 1 :
Q 0 * Q 1 : = j J h H d h ( j ) T W ( j ) , d h ( j )
with W ( j ) : = I e ( j ) , W I e ( j ) , T . This first modification results in:
Problem P 1
θ ^ = arg min θ Q 1 : = j J h H d h ( j ) T W ( j ) d h ( j ) ,
subject to (6)–(8) and (20). This error due to this approximation is referred to as Type A approximation error. This error is zero if all matrices I e ( i ) , W I e ( j ) , T , i j , are zero matrices. This is true when the correlation between the estimation errors of any element in χ ¯ ˜ o , h ( i ) and any element of χ ¯ ˜ o , h ( j ) ( i j ) is zero.
(b) 
Modification B: Separation of problem P 1 into J smaller problems P 1 ( j ) . An approximation to problem P 1 is now obtained by optimizing each of the J terms in (54) separately and simulating the values of χ ¯ o ( j ) ( t h ) with (43) and (44). We refer to each problem as P 1 ( j ) :
Problem P 1 ( j )
θ ^ ( j ) = arg min θ Q 1 ( j ) : = h H d h ( j ) T W ( j ) d h ( j )
subject to
x ˙ ( j ) ( t ) = V f ( j ) χ ¯ ˜ o , i ( j ) ( t ) , χ o ( j ) ( t ) , x ( j ) ( t ) , θ ( j ) ,       x ( j ) ( 0 ) = 0
χ o ( j ) ( t ) = U ( j ) x ( j ) ( t )
d h ( j ) = χ ¯ ˜ o , h ( j ) χ ¯ o ( j ) ( t h ) = χ ¯ ˜ o , h ( j ) U ¯ ( j ) x ( j ) ( t h ) .
The most important feature of problems P 1 ( j ) , j = 1 , , J , is that each of them involves only a small number of parameters θ ( j ) . Please note that the approximation of problem P 1 by this set of problems P 1 ( j ) is perfect in the special case where the right-hand sides (56) do not involve interpolated extents, that is, if the vectors χ ¯ ˜ o , i ( j ) ( t ) ( j = 1 , , J ) are empty. Another error, named Type B approximation error, is introduced when this is not the case.
Thanks to two modifications of problem P 0 , one obtains J smaller problems P 1 ( j ) , each of which includes only a fraction of the original set of parameters. The price to pay for such a simplification is a deviation from maximum likelihood due to the introduction of approximation errors (type A & B). These errors are often marginal as demonstrated below.
As in previous work [5,6,7], the parameter estimates obtained by solving problem P 1 ( j ) ( j = 1 , , J ) can serve as reliable initial guesses to initiate the solution to problem P 0 . The problem P 0 when solved with these initial guesses is named problem P 1 + 0 .

3.8. Implementation

All results can be reproduced with the open-source Efficient Model Identification (EMI) MATLAB package for efficient model identification [6] that includes all methods and simulations used in this study. This package is added in the Supplementary Information (Section A).

4. Results

We first explain the results obtained within an extensive simulation study to demonstrate the method. After that, results obtained with an experimental data set are used to demonstrate real-world applicability.

4.1. Simulation Study

The developed methods are demonstrated via a batch reaction system and investigating 5 different measurement scenarios (A–E). The reaction system is described first. Then, scenario A and its results are discussed in detail. The results for scenarios B to E are only summarized, with the details described in the Supplementary Information (Section B).

4.1.1. Reaction System

This reaction system has R = 5 reactions involving S = 6 species (A to F) with the following reaction scheme:
R 1 : A + B C R 2 : 2 A D R 3 : 2 C B + D R 4 : D E R 5 : 2 D E + F
with
N : = 1 1 + 1 0 0 0 2 0 0 + 1 0 0 0 + 1 2 + 1 0 0 0 0 0 1 + 1 0 0 0 0 2 + 1 + 1 .

4.1.2. Dynamic Model in Terms of Numbers of Moles

The simulated kinetic rate expressions are:
f n ( t ) , θ : = k 1 n 1 ( t ) V n 2 ( t ) V K 1 n 3 ( t ) V k 2 n 1 ( t ) V 2 k 3 n 3 ( t ) V k 4 n 4 ( t ) V k 5 n 4 ( t ) V 2
with θ : = k 1 k 2 k 3 k 4 k 5 K 1 T . The ground truth values of the kinetic parameters are given in Table 2.
The initial conditions are n 0 : = 0.73 0.42 0 0 0 0 T mol, and the reactor volume is V = 1 L . In all investigated scenarios, measurements are taken at intervals of 5 min during 10 h ( H = 121 , t h = 0 , 1 / 12 , 2 / 12 , , 10 ).

4.1.3. Scenario A

The first scenario considers the case where the concentrations of B and C and the sum of the concentrations of E and F are measured, that is,
M : = 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 .
The measurement error-covariance matrix is Σ ϵ = diag 1 1 2 T 10 4 mol 2 L 2 . Figure 2 shows the simulated noise-free and noisy measurements.
Extent labeling
To specify the measurement model in terms of extents (7), one needs to specify y 0 and G :
y 0 = M n 0 = 0 . 42 0 0 T
G = M N T = 1 0 + 1 0 0 + 1 0 2 0 0 0 0 0 + 1 + 2 .
The reduced row echelon form of G is
B = + 1 0 0 0 0 0 0 + 1 0 0 0 0 0 + 1 + 2 ,
which indicates that there are:
  • R n = 1 non-sensed extent, x n ( t ) = x 2 ( t ) ,
  • R o = 2 observable extents, x o ( t ) = x 1 ( t ) x 3 ( t ) T ,
  • R a = 2 ambiguous extents, x a ( t ) = x 4 ( t ) x 5 ( t ) T ,
which gives:
G , a = 0 0 0 0 + 1 + 2 .
Observable extents and extent directions
The factorization of G , a gives:
G o = 0 0 + 1 , V o T = + 1 + 2 ,
so that
χ o ( t ) = V o T I a , x ( t ) = 0 0 0 1 2 x ( t ) = x 4 ( t ) + 2 x 5 ( t )
χ ¯ o ( t ) = x 1 ( t ) x 3 ( t ) χ o ( t ) = x 1 ( t ) x 3 ( t ) x 4 ( t ) + 2 x 5 ( t ) .
The values of the observable extents and extent directions χ ¯ ˜ o , h : = x ˜ 1 , h x ˜ 3 , h χ ˜ o , h T can be computed from (23) with
P = 2 1 0 1 1 0 0 0 + 1 .
The expected estimation error variance-covariance matrix Σ χ ¯ is:
Σ χ ¯ = + 5 + 3 0 + 3 + 2 0 0 0 + 2 10 4 .
This demonstrates that the estimation errors in the first and second observable extents are uncorrelated with the estimation error in the computed observable extent direction. Please note that the correlation between the estimation errors of the first and second observable extents is fairly high ( 3 5 · 2 = 0 . 95 ). Figure 3 shows the computed values of the observable extents and extent direction.
Expression (27) is then fully specified with the following vectors and matrices:
n 0 = 1 1 0 0 0 0
N ¯ o : = 1 1 + 1 0 0 0 0 + 1 2 + 1 0 0 0 0 0 1 3 5 2 5 .
N ¯ u : = 0 0 0 0 0 0 2 0 0 + 1 0 0 0 0 0 0 0 0 0 0 0 0 + 2 5 2 5 0 0 0 0 1 5 + 1 5
Since the 2nd and 3rd column of N ¯ u are null vectors, it follows that values of n ˜ 2 ( t ) and n ˜ 3 ( t ) can be computed based on the interpolated estimates χ ¯ ˜ o , i ( t ) without the need to simulate x ( t ) . A further exploration of this idea remains out of the scope of this paper, however.
System partitioning
For simplicity of notation, we omit the time dependence in what follows. For example, the interpolants χ ¯ ˜ o , i ( t ) are given as χ ¯ ˜ o , i : = x ˜ 1 x ˜ 3 χ ˜ o T . Following Steps 1(a)–(b) in Section 3.6.1, the augmented equation system becomes:
x ˙ = k 1 V 2 n 0 , 1 x ˜ 1 2 x 2 n 0 , 2 x ˜ 1 + x ˜ 3 K 1 n 0 , 3 + x ˜ 1 2 x ˜ 3 k 2 V 2 n 0 , 1 x ˜ 1 2 x 2 2 k 3 V n 0 , 3 + x ˜ 1 2 x ˜ 3 k 4 V n 0 , 4 + x ˜ 3 χ ˜ o + x 2 k 5 V 2 n 0 , 4 + x ˜ 3 χ ˜ o + x 2 2 ,    x 0 = 0
χ o = V o T I a , x = 0 0 0 1 2 x = x 4 + 2 x 5 .
Figure 4 shows the graph corresponding to the above equation system. The vertices corresponding to the observable extents and extent direction are shaded, while the other vertices are white. The simulation arcs are shown with full-line arrows, while the observation arcs are shown as dashed-line arrows. To identify possible subsystems, one removes all the observation arcs, which results in two subgraphs. The first subgraph includes the parameters k 1 , k 2 , k 4 , k 5 , and K 1 , which affect the observable quantities x 1 and χ o via a network that also involves the unobservable extents x 2 , x 4 , and x 5 . The second subgraph is much smaller and includes the parameter k 3 that influences the observable extent x 3 .
Accordingly, the J = 2 subsystems are:
x ˙ ( 1 ) = x ˙ 1 x ˙ 2 x ˙ 4 x ˙ 5 = k 1 V 2 [ n 0 , 1 x 1 2 x 2 n 0 , 2 x 1 + x ˜ 3 K 1 n 0 , 3 + x 1 2 x ˜ 3 ] k 2 V 2 n 0 , 1 x 1 2 x 2 2 k 4 V n 0 , 4 + x ˜ 3 χ o + x 2 k 5 V 2 n 0 , 4 + x ˜ 3 χ o + x 2 2 , x ( 1 ) 0 = x 0 , 1 x 0 , 2 x 0 , 4 x 0 , 5 = 0
χ o ( 1 ) = χ o = x 4 + 2 x 5 ,
and
x ˙ ( 2 ) = x ˙ 3 = k 3 V n 0 , 3 + x ˜ 1 2 x 3 ,       x ( 2 ) 0 = x 0 , 3 = 0 .
The information flows in these subsystems are shown in the inset of Figure 4.
Incremental parameter estimation
The resulting partitioning means that estimates of the first set of parameters θ ( 1 ) : = k 1 , k 2 , k 4 , k 5 , K 1 can be obtained by minimizing a weighted least-squares deviation between the predicted x 1 ( t h ) and χ o ( t h ) and their measured counterparts x ˜ 1 , h and χ ˜ o , h , that is, by solving problem P 1 ( 1 ) :
θ ^ ( 1 ) = arg min θ ( 1 ) h H d h ( 1 ) T W ( 1 ) d h ( 1 )
subject to
x ˙ 1 ( t ) x ˙ 2 ( t ) x ˙ 4 ( t ) x ˙ 5 ( t ) = k 1 V 2 [ n 0 , 1 x 1 ( t ) 2 x 2 ( t ) n 0 , 2 x 1 ( t ) + x ˜ 3 ( t ) K 1 n 0 , 3 + x 1 ( t ) 2 x ˜ 3 ( t ) ] k 2 V 2 n 0 , 1 x 1 ( t ) 2 x 2 ( t ) 2 k 4 V n 0 , 4 + x ˜ 3 ( t ) χ o ( t ) + x 2 ( t ) k 5 V 2 n 0 , 4 + x ˜ 3 ( t ) χ o ( t ) + x 2 ( t ) 2 , x 1 ( 0 ) x 2 ( 0 ) x 4 ( 0 ) x 5 ( 0 ) = 0
χ o ( t ) = x 4 ( t ) + 2 x 5 ( t )
d h ( 1 ) = x ˜ 1 , h χ ˜ o , h x 1 ( t h ) χ o ( t h ) ,
where
W ( 1 ) : = L e ( 1 ) Σ χ ¯ 1 L e ( 1 ) T = + 1 0 0 0 0 + 1 2 3 0 3 5 0 0 0 1 / 2 10 4 1 0 0 0 0 1 T
= 2 0 0 1 / 2 10 4 .
The second set of parameters consists of k 3 only, θ ( 2 ) : = k 3 . Its value can be obtained by minimizing the least-squares deviation between x 3 ( t h ) and its measured counterparts x ˜ 3 , h , that is, by solving problem P 1 ( 2 ) :
θ ^ ( 2 ) = arg min θ ( 2 ) h H d h ( 2 ) 2
subject to
x ˙ 3 ( t ) = k 3 V n 0 , 3 + x ˜ 1 ( t ) 2 x 3 ( t ) ,       x 3 ( 0 ) = 0
d h ( 2 ) = x ˜ 3 , h x 3 ( t h ) .
Figure 5 shows the measured concentrations and the simulated profiles obtained with (i) the true parameters; (ii) simultaneous parameter estimation ( P 0 ); and (iii) incremental parameter estimation ( P 1 ( 1 ) and P 1 ( 2 ) ). All simulated profiles describe the experiment well and are practically indistinguishable. The parameter values obtained by solving P 0 , P 1 ( 1 ) , P 1 ( 2 ) , and P 1 + 0 are shown in Table 2. All parameter estimates are in close agreement to each other. It is worth noting that the parameter k 2 can be estimated even if the corresponding extent x 2 is labeled non-sensed, i.e., our results suggest that this parameter is both structurally and practically identifiable.

4.1.4. Scenario B

Scenario B assumes concentration measurements for the species B and C only:
M : = 0 1 0 0 0 0 0 0 1 0 0 0 .
The measurement error-covariance matrix is Σ ϵ = diag 1 1 T 10 4 mol 2 L 2 . The reduced row echelon form of G is:
B = + 1 0 0 0 0 0 0 + 1 0 0 ,
which leads to the following extent labeling:
  • R n = 3 non-sensed extents, x n ( t ) = x 2 ( t ) x 4 ( t ) x 5 ( t ) T ,
  • R o = 2 observable extents, x o ( t ) = x 1 ( t ) x 3 ( t ) T ,
  • R a = 0 ambiguous extents.
Please note that the extent-based method proposed in [8] applies in this case since R a = 0 .
Figure 6 shows the corresponding graph F , which can be partitioned into J = 2 subsystems and a leftover part. The first subgraph includes the parameters k 1 , k 2 , and K 1 since they all have some effect on the observable extent x 1 via a network that also involves the unobservable extent x 2 . The second subgraph includes the parameter k 3 that influences the observable extent x 3 . The information flows in these subsystems are shown in the inset of Figure 6. The leftover part includes the parameters k 4 and k 5 and the extents x 4 and x 5 . These parameters and variables are not part of any of the identified subsystems because there are no directed paths composed of simulation arcs from the corresponding vertices to any of the observable extents. Hence, it follows that these parameters are unidentifiable. This is consistent with the method proposed in [8] (not shown). Clearly, the graph-based partitioning procedure shows potential for parameter identifiability analysis, which is discussed as an opportunity below.

4.1.5. Scenario C

Scenario C assumes concentration measurements for the species B, C, E, and F:
M : = 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 .
The measurement error-covariance matrix is Σ ϵ = diag 1 1 1 1 T 10 4 mol 2 L 2 . The reduced row echelon form of G is:
B = + 1 0 0 0 0 0 0 + 1 0 0 0 0 0 + 1 0 0 0 0 0 + 1 ,
which leads to the following extent labeling:
  • R n = 1 non-sensed extents, x n ( t ) = x 2 ( t ) .
  • R o = 4 observable extents, x o ( t ) = x 1 ( t ) x 3 ( t ) x 4 ( t ) x 6 ( t ) T .
  • R a = 0 ambiguous extents.
Figure 7 shows the corresponding graph F , which can be partitioned into 2 subsystems. The first subgraph includes the parameters k 1 , k 2 , k 4 , k 5 , and K 1 since they all have some effect on the observable extents x 1 , x 4 , and x 5 via a network that also involves the unobservable extents x 2 . The second subgraph includes the parameter k 3 that influences the observable extent x 3 . The information flows in these subsystems are shown in the inset of Figure 7. Although the assignment of the parameters is the same as in scenario A, the parameter estimation problems are different since the fit of subsystem 1 is determined with respect to x ˜ 1 , x ˜ 4 , and x ˜ 5 in scenario C, compared to only x ˜ 1 and χ ˜ o in scenario A.

4.1.6. Scenario D

Scenario D assumes concentration measurements for the species A, C, and E:
M : = 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 .
The measurement error-covariance matrix is Σ ϵ = diag 1 1 1 T 10 4 mol 2 L 2 . The reduced row echelon form of G is:
B = + 1 0 2 0 0 0 + 1 + 1 0 0 0 0 0 + 1 + 1 0 0 0 0 0 ,
which leads to the following extent labeling:
  • R n = 0 non-sensed extents,
  • R o = 0 observable extents,
  • R a = 5 ambiguous extents, x a ( t ) = x 1 ( t ) x 2 ( t ) x 3 ( t ) x 4 ( t ) x 6 ( t ) T .
From A = rank B = 3 and R o = 0 , one sees that there are ρ o = A R o = 3 observable extent directions among the 5 ambiguous extents. The first two are linear combinations of x 1 , x 2 , and x 3 , while the third one is a linear combination of x 4 and x 5 . This particular appearance of extents in subsets of the observable directions stems from the subspace clustering property of the reduced row echelon form and will be discussed in some more detail below.
Figure 8 shows the corresponding graph F , which can be partitioned into 2 subsystems. The first subgraph includes the parameters k 1 , k 2 , k 3 , and K 1 since they all have some effect on the observable extent directions χ o , 1 and χ o , 2 via a network that also involves the unobservable extents x 1 , x 2 , and x 3 . The second subgraph includes the parameters k 4 and k 5 that influence the observable extent direction χ o , 3 via a network that also involves the unobservable extents x 4 and x 5 . Hence, this leads to two parameter estimation problems involving 4 and 2 parameters, respectively. The information flows to simulate each of the corresponding subsystems are shown in the inset of Figure 8.

4.1.7. Scenario E

Scenario E assumes concentration measurements of all species:
M : = I 6 .
The measurement error-covariance matrix is Σ ϵ = diag 1 1 1 1 1 1 T 10 4 mol 2 L 2 . The reduced row echelon form of G is:
B = + 1 0 0 0 0 0 + 1 0 0 0 0 0 + 1 0 0 0 0 0 + 1 0 0 0 0 0 + 1 0 0 0 0 0 ,
which leads to the following extent labeling:
  • R n = 0 non-sensed extents,
  • R o = 5 observable extents, x o ( t ) = x 1 ( t ) x 2 ( t ) x 3 ( t ) x 4 ( t ) x 6 ( t ) T ,
  • R a = 0 ambiguous extents.
This means that the original framework for extent computation, which assumes A = R (Section 3.2.3), applies.
Figure 9 shows the graph F , which can be partitioned into 5 subsystems that include the parameters as follows: (i) k 1 and K 1 ; (ii) k 2 ; (iii) k 3 ; (iv) k 4 ; and (v) k 5 . Hence, one obtains 5 entirely decoupled parameter estimation problems. This set of optimization problems is the same as those obtained with the original extent-based model identification method as is also clear from the inset of Figure 9.

4.2. Experimental Study

To explore the applicability of the novel incremental parameter estimation method with realistic laboratory data, we execute parameter estimation for the α -pinene batch experiment first reported in [11] and later studied in [12,13,14,15,16]. The reaction network consists of 5 species. These species are labeled A , B , C , D , and E and correspond to α -pinene, dipentene, allo-ocimene, pyronene, and dimer. The reaction network is:
R 1 : A B R 2 : A C R 3 : C D R 4 : C E R 5 : E C
As in [12], the model is defined with the following stoichiometric matrix and process rates:
N : = 1 + 1 0 0 0 1 0 + 1 0 0 0 0 1 + 1 0 0 0 1 0 + 1 0 0 + 1 0 1
r : = k 1 n 1 ( t ) V k 2 n 1 ( t ) V k 3 n 3 ( t ) V k 4 n 3 ( t ) V k 5 n 5 ( t ) V .
At the start of the batch experiment, only A is present. As explained in [12], the concentrations of all species but D are measured. The concentrations of D are obtained by assuming that 3% of the transformed A is present as D at all times. Finally, all concentrations are normalized to 1 (100%). Figure 10 shows the experimental data reported in [12]. As the initial numbers of moles and the volume are unknown, we express both the concentrations and the extents as dimensionless fractions of the numbers of moles relative to the initial numbers of moles of A . Hence, we can write n 0 : = 1 0 0 0 0 T .
Extent labeling
For extent labeling, we assume that all species are measured with independent zero-mean Gaussian measurement errors, similar to the least-squares approach of [12]:
M : = I 5 .
It follows that
G = M N T = 1 1 0 0 0 1 0 0 0 0 0 1 1 1 1 0 0 1 0 0 0 0 0 1 1 .
The reduced row echelon form of G is
B = + 1 0 0 0 0 0 + 1 0 0 0 0 0 + 1 0 0 0 0 0 + 1 1 ,
which indicates that there are:
  • R n = 0 non-sensed extents,
  • R o = 3 observable extents, x o ( t ) = x 1 ( t ) x 2 ( t ) x 3 ( t ) T ,
  • R a = 2 ambiguous extents, x a ( t ) = x 4 ( t ) x 5 ( t ) T ,
which delivers:
G , a = 0 0 0 0 1 + 1 0 0 + 1 1 .
Observable extents and extent directions
The factorization of G , a gives:
G o = 0 0 0 + 1 , V o T = + 1 1 ,
so that
χ o ( t ) = V o T I a , x ( t ) = 0 0 0 1 1 x ( t ) = x 4 ( t ) x 5 ( t )
χ ¯ o ( t ) = x 1 ( t ) x 2 ( t ) x 3 ( t ) χ o ( t ) = x 1 ( t ) x 2 ( t ) x 3 ( t ) x 4 ( t ) x 5 ( t ) .
Figure 11 shows the computed values of the observable extents and extent direction assuming these definitions. The corresponding error variance-covariance matrix is:
Σ χ ¯ = + 0.8 0.6 0.2 0.2 0.6 + 1.2 + 0.4 + 0.4 0.2 + 0.4 + 0.8 0.2 0.2 + 0.4 0.2 + 0.8 σ
with σ the (unknown) measurement error variance.
System partitioning
As above, we omit the time dependence in what follows. Following Steps 1(a)–(b) in Section 3.6.1, the augmented equation system becomes:
x ˙ = k 1 n 0 , 1 x ˜ 1 2 x ˜ 2 k 2 n 0 , 1 x ˜ 1 2 x ˜ 2 k 3 n 0 , 3 + x ˜ 2 x ˜ 3 χ ˜ o k 4 n 0 , 4 + x ˜ 3 k 5 n 0 , 5 + χ ˜ o ,       x 0 = 0 ,
χ o = x 4 x 5 .
Figure 12 shows the graph corresponding to the above equation system. Removing all the observation arcs generates 4 subgraphs. The first three subgraphs represent the first three reactions and include: k 1 and x 1 ( j = 1 ) ; k 2 and x 2 ( j = 2 ) ; and k 3 and x 3 ( j = 3 ) . The fourth graph includes the parameters k 4 and k 5 via a network that also involves the observable extent direction χ o and the unobservable extents x 4 and x 5 ( j = 4 ) .
Accordingly, the J = 4 subsystems are:
x ˙ ( 1 ) = x ˙ 1 = k 1 n 0 , 1 x 1 2 x ˜ 2 ,       x ( 1 ) 0 = x 0 , 1 = 0 ,
x ˙ ( 2 ) = x ˙ 2 = k 2 n 0 , 1 x ˜ 1 2 x 2 ,       x ( 2 ) 0 = x 0 , 2 = 0 ,
x ˙ ( 3 ) = x ˙ 3 = k 3 n 0 , 3 + x ˜ 2 x 3 χ ˜ o ,       x ( 3 ) 0 = x 0 , 3 = 0 ,
x ˙ ( 4 ) = x ˙ 4 x ˙ 5 = k 4 n 0 , 3 + x ˜ 2 x ˜ 3 χ ˜ o k 5 n 0 , 5 + χ ˜ o ,      x ( 4 ) 0 = x 0 , 4 x 0 , 5 = 0 ,
χ o ( 4 ) = χ o = x 4 x 5 ,
The information flows in these subsystems are shown in the inset of Figure 12.
Incremental parameter estimation
The resulting partitioning means that estimates of the k 1 , k 2 , and k 3 can be obtained by solving the following three single-parameter estimation problems ( P 1 ( 1 ) , P 1 ( 2 ) , P 1 ( 3 ) ):
P 1 ( 1 ) : k ^ 1 = arg min k 1 h H x ˜ 1 , h x 1 ( t h ) T W ( 1 ) x ˜ 1 , h x 1 ( t h )
subject to
x ˙ ( 1 ) = x ˙ 1 = k 1 n 0 , 1 x 1 2 x ˜ 2 ,       x ( 1 ) 0 = x 0 , 1 = 0
where
W ( 1 ) : = 5 / 4
P 1 ( 2 ) : k ^ 2 = arg min k 2 h H x ˜ 2 , h x 2 ( t h ) T W ( 2 ) x ˜ 2 , h x 2 ( t h )
subject to
x ˙ ( 2 ) = x ˙ 2 = k 2 n 0 , 1 x ˜ 1 2 x 2 ,       x ( 2 ) 0 = x 0 , 2 = 0
where
W ( 2 ) : = 5 / 3
P 1 ( 3 ) : k ^ 3 = arg min k 3 h H x ˜ 3 , h x 3 ( t h ) T W ( 3 ) x ˜ 3 , h x 3 ( t h )
subject to
x ˙ ( 3 ) = x ˙ 3 = k 3 n 0 , 3 + x ˜ 2 x 3 χ ˜ o ,       x ( 3 ) 0 = x 0 , 3 = 0
where
W ( 3 ) : = 5 / 4
The fourth set of parameters, k 4 and k 5 , are estimated by solving the following problem:
P 1 ( 4 ) : k ^ 3 , k ^ 4 = arg min k 3 , k 4 h H χ ˜ o , h χ o ( t h ) T W ( 4 ) χ ˜ o , h χ o ( t h )
subject to
x ˙ ( 4 ) = x ˙ 4 x ˙ 5 = k 4 n 0 , 3 + x ˜ 2 x ˜ 3 χ ˜ o k 5 n 0 , 5 + χ ˜ o ,       x ( 4 ) 0 = x 0 , 4 x 0 , 5 = 0
χ o ( 4 ) = χ o = x 4 x 5 ,
where
W ( 4 ) : = 5 / 4
Please note that the (unknown) measurement error variance σ can arbitrarily be set equal to one during parameter estimation without affecting the parameter estimates. This, however, means that parameter uncertainties cannot be quantified in this case without estimating σ as well. This estimation and subsequent uncertainty analysis is considered out of scope for this study. Figure 11 shows the simulated extent profiles obtained after solving the optimization problems ( P 1 ( j ) , j = 1 , , 4 ). There is a reasonable fit in all cases, except for the third extent of reaction ( x 3 ). This, in the mind of the authors, demonstrates a tangible benefit of the extent-based model framework. Indeed, one can identify which parts of the model could be improved, e.g., by reconsidering the reaction rate expression corresponding to the computed extents that are approximated poorly. However, this is not explored further in this work.
Figure 10 shows the measured concentrations and the simulated profiles obtained with (i) incremental parameter estimation ( P 1 ( j ) , j = 1 , , 4 ), followed by (ii) simultaneous parameter estimation with estimates obtained through incremental parameter estimation ( P 1 + 0 ). The parameter values obtained by solving P 0 , P 1 ( 1 ) , P 1 ( 2 ) , and P 1 + 0 are shown in Table 3. Note the final values obtained with both P 0 and P 1 + 0 are equal within numerical precision.

5. Discussion

A novel procedure for model parameter estimation is proposed. The procedure combines two important features: (i) a new framework for extent computation based on the computation of observable directions among the set of sensed but ambiguous extents; and (ii) a graph-based system partitioning procedure that identifies the groups of equations and kinetic parameters that can be simulated independently from the remaining part of the dynamic model. The latter is possible by approximating the original equation system via interpolation of observed extents and extent directions. The proposed procedure enables splitting the parameter estimation problem into smaller problems in cases that could so far only be handled with simultaneous model identification methods (e.g., scenario A, C, and D in Section 4.1). It also subsumes preexisting methods under special conditions such as complete observability (e.g., scenario E in Section 4.1,5,6,7]) or absence of ambiguity (e.g., scenario B in Section 4.1,8]). Thus, this procedure is applicable to parameter estimation for any multivariate differential equation model under a large range of structural observability conditions.
This work removes an important bottleneck for the implementation of incremental model identification to biological systems. Indeed, by providing a method that can be applied to any known stoichiometric matrix N and measurement matrix M , incremental model identification is now applicable to biological processes where the number of measurements M is lower than the number of modeled reactions R. Computationally speaking, the extent computation and graph-based partitioning are expected to scale well with the number of modeled reactions. The effects on the computational efficiency of the parameter estimation step will however depend primarily on the lengths of the state vectors in the identified subsystems. In turn, these lengths depend strongly on model structure and the available measured variables and less so on the number of states in the complete system.
A remaining obstacle is the fact that both M and N are assumed to be known. While several techniques exist to handle this, including methods based on extents [17], some coefficients in these matrices may be unidentifiable based on atomic and stoichiometric balances alone. For this reason, dealing with unknown stoichiometry ( M ) or ill-defined measurements ( N ) deserves continued attention.
Another aspect open for exploration is that no global optimization method has been tested so far to solve the smaller parameter estimation problems P 1 ( j ) that result from system partitioning. Inspiration can be drawn from existing parameter estimation methods even if these methods (i) might be complex yet without publicly available implementation [18]; or (ii) remain limited to cases where all extents are observable and therefore modeled as univariate processes for parameter estimation [19,20].

5.1. Generalized Framework for Extent Computation

The addition of new concepts, such as ambiguous extents and observable extent directions expands the framework for extent computation beyond its original range of applicability. In this study, the focus has been given to the problem of parameter estimation. However, the same generalized framework is likely to be applicable and useful for challenges that have been handled with earlier extent-based methods such as data reconciliation [7], model structure identification [6], and process control [21]. Moreover, there are no obvious limitations that hinder applications involving multi-phase or distributed system models [22,23] or models including algebraic constraints [6,22].

5.2. Optimal System Partitioning

According to our understanding, the proposed procedure delivers the most efficient system partitioning given the available measurements and model information M , f · , N , and n 0 . In other words, the procedure generates the subsystems with the smallest number of parameters that can be estimated independently from parameters in other subsystems. However, this is stated here without formal proof. The factorization of G , a based on the reduced row echelon form of G , as described in Section 3.3.1, plays a crucial role. Indeed, it follows from work on subspace clustering methods for noise-free data [24,25] that this factorization minimizes the presence of simulation arcs from the ambiguous extents to the observable directions, thereby leading to the best possible system partitioning.
Please note that the obtained generalized extent framework is optimal only for the purpose of parameter estimation. For instance, one could consider a factorization of G , a which identifies directions that exhibit uncorrelated estimation errors, thereby improving the statistical quality of the computed observable directions and possibly avoiding Type A approximation error. This could be achieved via singular value decomposition of G , a . However, singular value decomposition cannot guarantee optimal system partitioning [26]. Consequently, it follows that the optimal factorization of G , a depends on the objective of this factorization. This stands in contrast to the existing body of work concerning extents, which do not involve any such purpose-dependent factorization steps.
Note also that the proposed procedure is designed for the definition of optimality used here. For instance, if one relaxes the constraint that parameters can only appear in one subsystem (Section 3.6), then the proposed graph partitioning procedure is not optimal. For example, in scenario A, one may be able to estimate the parameters in 3 subsets rather than 2, namely, (i) k 1 , k 2 , and K 1 by simulation of x 1 ( t ) and x 2 ( t ) ; (ii) k 2 , k 3 , k 4 , and k 5 by simulation of x 2 ( t ) , x 4 ( t ) , x 5 ( t ) , and χ o , 1 ( t ) ; and (iii) k 3 by simulation of x 3 ( t ) . This leads to 3 parameter estimation problems with 3, 4, and 1 parameters, which may be easier to solve than the 2 parameter estimation problems with 5 and 1 parameters as obtained with the proposed procedure. It is however non-trivial to identify a procedure, including both factorization of G , a and system partitioning, that guarantees the best partitioning when applying this relaxed definition of optimality, nor is it clear whether such a procedure exists.

5.3. New Opportunities

This work also hints at several new applications of the generalized framework for extent computation:
(a) 
Identifiability analysis. While not a core objective of this work, it has been shown that the graph partitioning method can help identify unidentifiable parameters. Given that this labeling is based on a model reformulation and does not depend on the temporal resolution or quality of the collected measurements, it follows that the method identifies structurally unidentifiable parameters. Unlike other methods [27], the proposed approach does not require symbolic differentiation. It remains to be explored whether this can also be used to positively identify structurally identifiable parameters. For a discussion on the evaluation and use of indicators of structural and practical parameter identifiability we refer to [28].
(b) 
Soft-sensing. The appearance of observable directions among the ambiguous extents is closely related to the concepts of observability and detectability in the context of state estimation [29]. Note however that the observability labels in this work are based on the stoichiometric balances and measurement equations alone, thus excluding the dynamic model. At the same time, it is suspected that the extents corresponding to vertices that are not on directed paths to vertices representing observable extents or extent directions can be labeled as structurally unobservable, again observing that the exact timing and quality of the measurements does not play any role in this labeling. Similarly to the identifiability analysis discussed above, such an approach would not rely on symbolic differentiation. Whether this can be used to unambiguously determine observability and detectability for all model states remains to be studied.
(c) 
Experimental design. The labeling of extents and directions as observable or unobservable suggests that experimental design may be used to optimize the selection of measured variables. A method to do so has been applied in [30] to enumerate all Pareto-optimal flow sensor layouts in wastewater treatment plants. In [31], symbolic computation enabled the identification of optimal experimental designs. Similar approaches could be applied as a measurement selection method for metabolic flux analysis and the monitoring of complex systems.

6. Conclusions

In this work, an incremental parameter identification procedure has been developed and tested based on a generalized extent-based framework. This generalized framework enables incremental parameter estimation in cases where the previous methods based on the computation of extents did not permit this. Importantly, through study of simulated and experimental medium-sized examples, the generality of the developed procedure has been demonstrated and new opportunities offered by this framework have been identified.

Supplementary Materials

The following are available online at https://www.mdpi.com/2227-9717/7/2/75/s1: The current version of the Efficient Model Identification toolbox (EMI), including all code to reproduce our results in MATLAB R2017b, Figure S1: Scenario B—Simulation. Noise-free and noisy measurements as a function of time, Figure S2: Scenario B—Parameter estimation. Observable extents ( x 1 , x 3 ) and their computed equivalents ( x ˜ 1 , x ˜ 3 ), Figure S3: Scenario B—Parameter estimation. Measured concentrations and simulated profiles obtained with (i) true parameters; (ii) simultaneous parameter estimation P 0 ; and (iii) incremental parameter estimation P 1 ( 1 ) and P 1 ( 2 ) , Figure S4: Scenario C—Simulation. Noise-free and noisy measurements as a function of time, Figure S5: Scenario C—Extent computation. Observable extents ( x 1 , x 3 , x 4 , x 5 ) and their computed equivalents ( x ˜ 1 , x ˜ 3 , x ˜ 4 , x ˜ 5 ), Figure S6: Scenario C—Parameter estimation. Measured concentrations and simulated profiles obtained with (i) true parameters; (ii) simultaneous parameter estimation P 0 ; and (iii) incremental parameter estimation P 1 ( 1 ) and P 1 ( 2 ) , Figure S7: Scenario D—Simulation. Noise-free and noisy measurements as a function of time, Figure S8: Scenario D—Extent computation. Observable extent directions ( χ o , 1 , χ o , 2 , χ o , 3 ) and their computed equivalents ( χ ˜ o , 1 , χ ˜ o , 2 , χ ˜ o , 3 ), Figure S9: Scenario D—Parameter estimation. Measured concentrations and simulated profiles obtained with (i) true parameters; (ii) simultaneous parameter estimation P 0 ; and (iii) incremental parameter estimation P 1 ( 1 ) and P 1 ( 2 ) , Figure S10: Scenario E—Simulation. Noise-free and noisy measurements as a function of time, Figure S11: Scenario E—Extent computation. Observable extents ( x 1 , x 3 ) and observable extent direction ( χ o ) and their computed equivalents ( x ˜ 1 , x ˜ 3 , χ ˜ o ), Figure S12: Scenario E—Parameter estimation. Measured concentrations and simulated profiles obtained with (i) true parameters; (ii) simultaneous parameter estimation P 0 ; and (iii) incremental parameter estimation P 1 ( 1 ) and P 1 ( 2 )

Author Contributions

Conceptualization, K.V., J.B. and D.B.; Methodology, K.V., J.B. and D.B.; Results and software implementation, K.V.; Validation, K.V., J.B. and D.B.; Writing—Original Draft Preparation, K.V.; Writing—Review & Editing, K.V., J.B. and D.B.; Funding Acquisition, K.V.

Funding

This study was made possible by Eawag Discretionary Funds (grant No.: 5221.00492.009.03, project: DF2015/EMISSUN).

Acknowledgments

The authors thank Alma Ma s ˇ ić for inputs during initiation of this work and Ivan Miletic for suggesting the study of the α -pinene case.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rieger, L.; Gillot, S.; Langergraber, G.; Ohtsuki, T.; Shaw, A.; Takács, I.; Winkler, S. Guidelines for Using Activated Sludge Models. IWA Task Group on Good Modelling Practice. IWA Scientific and Technical Report; IWA Publishing: London, UK, 2012. [Google Scholar]
  2. Jakeman, A.J.; Letcher, R.A.; Norton, J.P. Ten iterative steps in development and evaluation of environmental models. Environ. Model. Softw. 2006, 21, 602–614. [Google Scholar] [CrossRef]
  3. Bhatt, N.; Amrhein, M.; Bonvin, D. Incremental identification of reaction and mass-transfer kinetics using the concept of extents. Ind. Eng. Chem. Res. 2011, 50, 12960–12974. [Google Scholar] [CrossRef]
  4. Bhatt, N.; Kerimoglu, N.; Amrhein, M.; Marquardt, W.; Bonvin, D. Incremental identification of reaction systems—A comparison between rate-based and extent-based approaches. Chem. Eng. Sci. 2012, 83, 24–38. [Google Scholar] [CrossRef]
  5. Rodrigues, D.; Srinivasan, S.; Billeter, J.; Bonvin, D. Variant and invariant states for chemical reaction systems. Comput. Chem. Eng. 2015, 73, 23–33. [Google Scholar] [CrossRef] [Green Version]
  6. Masˇić, A.; Srinivasan, S.; Billeter, J.; Bonvin, D.; Villez, K. Identification of biokinetic models using the concept of extents. Environ. Sci. Technol. 2017, 51, 7520–7531. [Google Scholar] [CrossRef] [PubMed]
  7. Srinivasan, S.; Billeter, J.; Narasimhan, S.; Bonvin, D. Data reconciliation for chemical reaction systems using vessel extents and shape constraints. Comput. Chem. Eng. 2017, 101, 44–58. [Google Scholar] [CrossRef]
  8. Masˇić, A.; Billeter, J.; Bonvin, D.; Villez, K. Extent computation under rank-deficient conditions. IFAC-PapersOnLine 2017, 50, 3929–3934. [Google Scholar] [CrossRef] [Green Version]
  9. Kretsovalis, A.; Mah, R.S.H. Observability and redundancy classification in multicomponent process networks. AIChE J. 1987, 33, 70–82. [Google Scholar] [CrossRef]
  10. Crowe, C.M. Observability and redundancy of process data for steady state reconciliation. Chem. Eng. Sci. 1989, 44, 2909–2917. [Google Scholar] [CrossRef]
  11. Fuguitt, R.E.; Hawkins, J.E. Rate of the thermal isomerization of α-Pinene in the liquid phase1. J. Am. Chem. Soc. 1947, 69, 319–322. [Google Scholar] [CrossRef]
  12. Box, G.E.P.; Hunter, W.G.; MacGregor, J.F.; Erjavec, J. Some problems associated with the analysis of multiresponse data. Technometrics 1973, 15, 33–51. [Google Scholar] [CrossRef]
  13. Tjoa, I.B.; Biegler, L.T. Simultaneous solution and optimization strategies for parameter estimation of differential-algebraic equation systems. Ind. Eng. Chem. Res. 1991, 30, 376–385. [Google Scholar] [CrossRef]
  14. Rodriguez-Fernandez, M.; Egea, J.A.; Banga, J.R. Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems. BMC Bioinform. 2006, 2006, 483. [Google Scholar]
  15. Brunel, N.J.; Clairon, Q. A tracking approach to parameter estimation in linear ordinary differential equations. Electr. J. Stat. 2015, 9, 2903–2949. [Google Scholar] [CrossRef] [Green Version]
  16. Dattner, I.; Gugushvili, S. Application of one-step method to parameter estimation in ODE models. Stat. Neerl. 2018, 72, 126–156. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  17. Bonvin, D.; Rippin, D.W.T. Target factor analysis for the identification of stoichiometric models. Chem. Eng. Sci. 1990, 45, 3417–3426. [Google Scholar] [CrossRef]
  18. Sahlodin, A.M.; Chachuat, B. Convex/concave relaxations of parametric ODEs using Taylor models. Comput. Chem. Eng. 2011, 35, 844–857. [Google Scholar] [CrossRef]
  19. Masˇić, A.; Udert, K.; Villez, K. Global parameter optimization for biokinetic modeling of simple batch experiments. Environ. Model. Softw. 2016, 85, 356–373. [Google Scholar] [CrossRef]
  20. Rodrigues, D.; Billeter, J.; Bonvin, D. Maximum-likelihood estimation of kinetic parameters via the extent-based incremental approach. Comput. Chem. Eng. 2018. [Google Scholar] [CrossRef]
  21. Billeter, J.; Rodrigues, D.; Srinivasan, S.; Amrhein, M.; Bonvin, D. On decoupling rate processes in chemical reaction systems—Methods and applications. Comput. Chem. Eng. 2017, 114, 296–305. [Google Scholar] [CrossRef]
  22. Srinivasan, S.; Billeter, J.; Bonvin, D. Identification of multiphase reaction systems with instantaneous equilibria. Ind. Eng. Chem. Res. 2016, 29, 8034–8045. [Google Scholar] [CrossRef]
  23. Rodrigues, D.; Billeter, J.; Bonvin, D. Generalization of the concept of extents to distributed reaction systems. Chem. Eng. Sci. 2017, 171, 558–575. [Google Scholar] [CrossRef]
  24. Aldroubi, A.; Sekmen, A. Reduced row echelon form and non-linear approximation for subspace segmentation and high-dimensional data clustering. Appl. Comput. Harmon. Anal. 2014, 37, 271–287. [Google Scholar] [CrossRef]
  25. Vidal, R. Subspace clustering. IEEE Signal Process. Mag. 2011, 28, 52–68. [Google Scholar] [CrossRef]
  26. Billeter, J.; Bonvin, D.; Villez, K. Extent-Based Model Identication under Incomplete Observability Conditions; Technical Report No. 6, v3.0; Eawag: Dübendorf, Switzerland, 2018. [Google Scholar]
  27. Petersen, B.; Gernaey, K.; Devisscher, M.; Dochain, D.; Vanrolleghem, P.A. A simplified method to assess structurally identifiable parameters in Monod-based activated sludge models. Water Res. 2003, 37, 2893–2904. [Google Scholar] [CrossRef]
  28. Bonvin, D.; Georgakis, C.; Pantelides, C.C.; Barolo, M.; Grover, M.A.; Rodrigues, D.; Schneider, R.; Dochain, D. Linking models and experiments. Ind. Eng. Chem. Res. 2016, 55, 6891–6903. [Google Scholar] [CrossRef]
  29. Sontag, E.D. Mathematical Control Theory: Deterministic Finite Dimensional Systems; Springer Science & Business Media: Berlin, Germany, 2013; Volume 6. [Google Scholar]
  30. Villez, K.; Vanrolleghem, P.A.; Corominas, L. Optimal flow sensor placement on wastewater treatment plants. Water Res. 2016, 101, 75–83. [Google Scholar] [CrossRef]
  31. Billeter, J.; Neuhold, Y.M.; Hungerbuehler, K. Systematic prediction of linear dependencies in the concentration profiles and implications on the kinetic hard-modelling of spectroscopic data. Chemom. Intell. Lab. Syst. 2009, 95, 170–187. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Labeling of the R extents as sensed, non-sensed, observable, and ambiguous. The ambiguous extents are spanned by ρ o observable and ρ u unobservable extent directions. This way, the extent space can be represented by A = R o + ρ o observable extents and extent directions, and R A = R n + ρ u unobservable extents and extent directions.
Figure 1. Labeling of the R extents as sensed, non-sensed, observable, and ambiguous. The ambiguous extents are spanned by ρ o observable and ρ u unobservable extent directions. This way, the extent space can be represented by A = R o + ρ o observable extents and extent directions, and R A = R n + ρ u unobservable extents and extent directions.
Processes 07 00075 g001
Figure 2. Scenario A—Simulation. Noise-free and noisy measurements as a function of time.
Figure 2. Scenario A—Simulation. Noise-free and noisy measurements as a function of time.
Processes 07 00075 g002
Figure 3. Scenario A—Extent computation. Observable extents ( x 1 , x 3 ) and observable extent direction ( χ o ) and their computed equivalents ( x ˜ 1 , x ˜ 3 , χ ˜ o ).
Figure 3. Scenario A—Extent computation. Observable extents ( x 1 , x 3 ) and observable extent direction ( χ o ) and their computed equivalents ( x ˜ 1 , x ˜ 3 , χ ˜ o ).
Processes 07 00075 g003
Figure 4. Scenario A—Graph F . There are three shaded vertices corresponding the observable extents ( x 1 , x 3 ) and extent direction ( χ o ). The remaining vertices represent the unobservable extents ( x 2 , x 4 , x 5 ) and the parameters ( k 1 , k 2 , k 3 , k 4 , k 5 , K 1 ). The simulation and observation arcs are shown as solid-line and dashed-line arrows, respectively. Removing the observation arcs and graph partitioning results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 4 , k 5 , and K 1 are connected to the observable quantities x 1 and χ o , and another graph in which k 3 is connected to the observable extent x 3 .
Figure 4. Scenario A—Graph F . There are three shaded vertices corresponding the observable extents ( x 1 , x 3 ) and extent direction ( χ o ). The remaining vertices represent the unobservable extents ( x 2 , x 4 , x 5 ) and the parameters ( k 1 , k 2 , k 3 , k 4 , k 5 , K 1 ). The simulation and observation arcs are shown as solid-line and dashed-line arrows, respectively. Removing the observation arcs and graph partitioning results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 4 , k 5 , and K 1 are connected to the observable quantities x 1 and χ o , and another graph in which k 3 is connected to the observable extent x 3 .
Processes 07 00075 g004
Figure 5. Scenario A—Parameter estimation. Measured concentrations and simulated profiles obtained with (i) true parameters; (ii) simultaneous parameter estimation ( P 0 ); and (iii) incremental parameter estimation ( P 1 ( 1 ) and P 1 ( 2 ) ). The produced simulation results exhibit strong overlap, meaning that the identified models closely approximate the ground truth model.
Figure 5. Scenario A—Parameter estimation. Measured concentrations and simulated profiles obtained with (i) true parameters; (ii) simultaneous parameter estimation ( P 0 ); and (iii) incremental parameter estimation ( P 1 ( 1 ) and P 1 ( 2 ) ). The produced simulation results exhibit strong overlap, meaning that the identified models closely approximate the ground truth model.
Processes 07 00075 g005
Figure 6. Scenario B—Graph F . There are two shaded vertices corresponding the observable extents ( x 1 , x 3 ). Removing the observation arcs and graph partitioning results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , and K 1 affect the observable extent x 1 , a second graph in which k 3 affects the observable extent x 3 . The vertices k 4 and k 5 have no effect on the observable extents x 1 and x 3 .
Figure 6. Scenario B—Graph F . There are two shaded vertices corresponding the observable extents ( x 1 , x 3 ). Removing the observation arcs and graph partitioning results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , and K 1 affect the observable extent x 1 , a second graph in which k 3 affects the observable extent x 3 . The vertices k 4 and k 5 have no effect on the observable extents x 1 and x 3 .
Processes 07 00075 g006
Figure 7. Scenario C—Graph F . There are 4 shaded vertices corresponding to the observable extents x 1 , x 3 , x 4 , and x 5 . Removing the observation arcs results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 4 , k 5 , and K 1 affect the observable extents x 1 , x 4 , and x 5 , and another graph in which k 3 affect the observable extent x 3 .
Figure 7. Scenario C—Graph F . There are 4 shaded vertices corresponding to the observable extents x 1 , x 3 , x 4 , and x 5 . Removing the observation arcs results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 4 , k 5 , and K 1 affect the observable extents x 1 , x 4 , and x 5 , and another graph in which k 3 affect the observable extent x 3 .
Processes 07 00075 g007
Figure 8. Scenario D—Graph F . There are 3 shaded vertices corresponding the observable extent directions χ o , 1 , χ o , 2 , and χ o , 3 . Removing the observation arcs results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 3 , and K 1 affect the observable extent directions χ o , 1 and χ o , 2 , and another graph in which k 4 and k 5 affect the observable extent direction χ o , 3 .
Figure 8. Scenario D—Graph F . There are 3 shaded vertices corresponding the observable extent directions χ o , 1 , χ o , 2 , and χ o , 3 . Removing the observation arcs results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 3 , and K 1 affect the observable extent directions χ o , 1 and χ o , 2 , and another graph in which k 4 and k 5 affect the observable extent direction χ o , 3 .
Processes 07 00075 g008
Figure 9. Scenario E—Graph F . All extent vertices are observable and thus shaded. Removing the observation arcs results in 5 subgraphs for the parameters as shown in the inset: k 1 and K 1 are estimated from x 1 , k 2 from x 2 , k 3 from x 3 , k 4 from x 4 , and k 5 from x 5 .
Figure 9. Scenario E—Graph F . All extent vertices are observable and thus shaded. Removing the observation arcs results in 5 subgraphs for the parameters as shown in the inset: k 1 and K 1 are estimated from x 1 , k 2 from x 2 , k 3 from x 3 , k 4 from x 4 , and k 5 from x 5 .
Processes 07 00075 g009
Figure 10. α -pinene—Parameter estimation. Measured concentrations and simulated profiles obtained with (i) incremental parameter estimation ( P 1 ( j ) , j = 1 , , 4 ) followed by (ii) simultaneous parameter estimation ( P 1 + 0 ). The produced simulation results exhibit strong overlap, meaning that both models produce very similar results.
Figure 10. α -pinene—Parameter estimation. Measured concentrations and simulated profiles obtained with (i) incremental parameter estimation ( P 1 ( j ) , j = 1 , , 4 ) followed by (ii) simultaneous parameter estimation ( P 1 + 0 ). The produced simulation results exhibit strong overlap, meaning that both models produce very similar results.
Processes 07 00075 g010
Figure 11. α -pinene—Extents. Computed extents and simulated profiles obtained following incremental parameter estimation ( P 1 ( j ) , j = 1 , , 4 ).
Figure 11. α -pinene—Extents. Computed extents and simulated profiles obtained following incremental parameter estimation ( P 1 ( j ) , j = 1 , , 4 ).
Processes 07 00075 g011
Figure 12. α -pinene—Graph F . There are four shaded vertices corresponding the observable extents ( x 1 , x 2 , x 3 ) and extent direction ( χ o ). The remaining vertices represent the unobservable extents ( x 4 , x 5 ) and the parameters ( k 1 , k 2 , k 3 , k 4 , k 5 ). The simulation and observation arcs are shown as solid-line and dashed-line arrows, respectively. Removing the observation arcs and graph partitioning results in 4 subgraphs for the parameters as shown in the inset: (i) one with vertices for k 1 and x 1 ; (ii) one with vertices for k 2 and x 2 ; (iii) one with vertices for k 3 and x 3 ; and (iv) one with vertices k 4 , k 5 , x 4 , x 5 , and χ o .
Figure 12. α -pinene—Graph F . There are four shaded vertices corresponding the observable extents ( x 1 , x 2 , x 3 ) and extent direction ( χ o ). The remaining vertices represent the unobservable extents ( x 4 , x 5 ) and the parameters ( k 1 , k 2 , k 3 , k 4 , k 5 ). The simulation and observation arcs are shown as solid-line and dashed-line arrows, respectively. Removing the observation arcs and graph partitioning results in 4 subgraphs for the parameters as shown in the inset: (i) one with vertices for k 1 and x 1 ; (ii) one with vertices for k 2 and x 2 ; (iii) one with vertices for k 3 and x 3 ; and (iv) one with vertices k 4 , k 5 , x 4 , x 5 , and χ o .
Processes 07 00075 g012
Table 1. List of symbols.
Table 1. List of symbols.
SymbolDescriptionDimensions
ϵ h Measurement error at time t = t h M × 1
θ θ ( j ) Kinetic parameters (in subsystem j) T × 1 T ( j ) × 1
Λ o ( j ) , Λ θ ( j ) Selection matrix ρ o ( j ) × ρ o , T ( j ) × T
ρ o ρ o ( j ) , ρ o , i ( j ) Number of observable extent directions (in system j, interpolated in system j) 1 × 1
ρ u Number of unobservable extent directions 1 × 1
ρ aug Number of extents and observable extent directions 1 × 1
Σ ϵ Measurement error variance-covariance matrix M × M
τ Time (integrand) 1 × 1
Σ χ ¯ Estimation error variance-covariance matrix A × A
χ aug Extents and observable extent directions ρ aug × 1
χ o χ o ( j ) Observable extent directions (in subsystem j) ρ o × 1 ρ o ( j ) × 1
χ ¯ o χ ¯ o ( j ) Observable extents and extent directions (in subsystem j) A × 1 A ( j ) × 1
χ ¯ ˜ o , h χ ¯ ˜ o , h ( j ) Computed observable extents and extent directions (in subsystem j) A × 1 A ( j ) × 1
χ ¯ ˜ o , i χ ¯ ˜ o , i ( j ) Interpolated observable extents and extent directions (in subsystem j) A × 1 A ( j ) × 1
χ u Unobservable extent directions ρ u × 1
χ ¯ u Unobservable extents and extent directions R A × 1
A A ( j ) Number of observable extents and extent directions (in subsystem j) 1 × 1
a Indices of ambiguous extents R a × 1
B Reduced row echelon form of G M × R
d h d h ( j ) Model prediction residuals at t h (in subsystem j) A × 1 A ( j ) × 1
e ( j ) Indices of computed extents and extent directions simulated by subsystem j A ( j ) × 1
F F ( j ) Information flow graph (for subsystem j)
f f ( j ) Rate expressions (in subsystem j) R × 1 R ( j ) × 1
G Extent-based measurement matrix M × R
G o , G ¯ o Measurement matrix for the observable extent directions/observable extents and extent directions M × ρ o / A
HNumber of measurements 1 × 1
hMeasurement sample index 1 × 1
iSubsystem index 1 × 1
JNumber of subsystems 1 × 1
j Indices of extents in subsystem j R ( j ) × 1
jSubsystem index 1 × 1
M Species-based measurement matrix M × S
MNumber of measurement samples 1 × 1
N Stoichiometric matrix R × S
N ¯ o , N ¯ u Extent and extent direction-based stoichiometric matricesA/ R A × S
n Indices of non-sensed extents R n × 1
n n 0 Number of moles (at time t = 0 ) S × 1
o o ( j ) Indices of observable extents (in subsystem j) R o × 1 R o ( j ) × 1
P Projection matrix A × M
Q 0 , Q 0 * , Q 1 ( j ) Objective function 1 × 1
R R ( j ) , R a , R n R o , R o ( j ) , R o , i ( j ) , R s Number of reactions (in subsystem j, ambiguous/non-sensed/ observable/observable in subsystem j/interpolated in subsystem j/sensed) 1 × 1
r Reaction rates R × 1
rReaction index 1 × 1
SNumber of species 1 × 1
s Indices of sensed extents R s × 1
T ( T ( j ) )Number of parameters (in subsystem j) 1 × 1
t t h Time (of measurement sample h) 1 × 1
U ( j ) , U ¯ ( j ) Mixing matrix ρ o ( j ) / A ( j ) × R ( j )
V o , V u Direction matrix for observable/unobservable direction directions R × ρ o , R × ρ u
VReactor volume 1 × 1
W W ( j ) Weight matrix (for subsystem j) A × A A ( j ) × A ( j )
x x ( j ) , x r Extents (in subsystem j, of reaction r) R × 1 R ( j ) × 1 , 1 × 1
y y 0 , y ( t h ) Noise-free measurements (at time t = 0 , t = h ) M × 1
y ˜ h Measurements in sample h M × 1
Table 2. Scenario A—Parameter estimates. Model simulation and parameter estimation results: kinetic parameters and model fit. All parameter estimates are reported with their standard deviation based on the Laplacian approximation of the likelihood function (i.e., e x p Q 0 , e x p Q 1 ( j ) ).
Table 2. Scenario A—Parameter estimates. Model simulation and parameter estimation results: kinetic parameters and model fit. All parameter estimates are reported with their standard deviation based on the Laplacian approximation of the likelihood function (i.e., e x p Q 0 , e x p Q 1 ( j ) ).
 NameUnitTrue Value P 0 P 1 ( j ) P 1 + 0
θ 1 k 1 L mol 1 h 1 2.0 1.9490 ( ± 0.0003 ) 2.0010 ( ± 3.3625 ) 1.9857 ( ± 0.0001 )
θ 2 k 2 L mol 1 h 1 0.5 0.4929 ( ± 0.0002 ) 0.5006 ( ± 2.4090 ) 0.4771 ( ± 0.0001 )
θ 3 k 3 h 1 1.0 1.0056 ( ± 0.0827 ) 0.8955 ( ± 0.0002 ) 0.9974 ( ± 0.0805 )
θ 4 k 4 h 1 0.4 0.4342 ( ± 0.4275 ) 0.4003 ( ± 2.6400 ) 0.4606 ( ± 0.1309 )
θ 5 k 5 L mol 1 h 1 1.6 1.2285 ( ± 2.5160 ) 1.6009 ( ± 2.9587 ) 1.2111 ( ± 0.8018 )
θ 6 K 1 mol L 1 1.4 1.3435 ( ± 0.1016 ) 1.4001 ( ± 2.9081 ) 1.3986 ( ± 0.1135 )
Q 0 H · M WRMSR  0.44731 0.47934 0.44771
Table 3. α -pinene—Parameter estimates. The parameters obtained with simultaneous estimation ( P 0 ) and incremental estimation ( P 1 + 0 ) are practically the same.
Table 3. α -pinene—Parameter estimates. The parameters obtained with simultaneous estimation ( P 0 ) and incremental estimation ( P 1 + 0 ) are practically the same.
 NameUnit P 0 P 1 ( j ) P 1 + 0
θ 1 k 1 % h 1 0.2130.2140.213
θ 2 k 2 % h 1 0.1070.1060.107
θ 3 k 3 % h 1 0.0740.0740.074
θ 4 k 4 % h 1 0.9891.0370.989
θ 5 k 5 % h 1 0.1440.1480.144
Q 0 H · M WRMSR 0.66 0.67 0.66

Share and Cite

MDPI and ACS Style

Villez, K.; Billeter, J.; Bonvin, D. Incremental Parameter Estimation under Rank-Deficient Measurement Conditions. Processes 2019, 7, 75. https://doi.org/10.3390/pr7020075

AMA Style

Villez K, Billeter J, Bonvin D. Incremental Parameter Estimation under Rank-Deficient Measurement Conditions. Processes. 2019; 7(2):75. https://doi.org/10.3390/pr7020075

Chicago/Turabian Style

Villez, Kris, Julien Billeter, and Dominique Bonvin. 2019. "Incremental Parameter Estimation under Rank-Deficient Measurement Conditions" Processes 7, no. 2: 75. https://doi.org/10.3390/pr7020075

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop