Incremental Parameter Estimation under Rank-Deﬁcient Measurement Conditions

: The computation and modeling of extents has been proposed to handle the complexity of large-scale model identiﬁcation tasks. Unfortunately, the existing extent-based framework only applies when certain conditions apply. Most typically, it is required that a unique value for each extent can be computed. This severely limits the applicability of this approach. In this work, we propose a novel procedure for parameter estimation inspired by the existing extent-based framework. A key difference with prior work is that the proposed procedure combines structural observability labeling, matrix factorization, and graph-based system partitioning to split the original model parameter estimation problem into parameter estimation problems with the least number of parameters. The value of the proposed method is demonstrated with an extensive simulation study and a study based on a historical data set collected to characterize the isomerization of α -pinene. Most importantly, the obtained results indicate that an important barrier to the application of extent-based frameworks for process modeling and monitoring tasks has been lifted.


Introduction
Despite advances in model identification theory, parameter estimation can still be very challenging in practice.Such challenges include the lack of identifiability, large computational cost, the need to formulate appropriate experimental designs, and the fact that many methods, such as those for uncertainty analysis, are still being investigated and therefore not standardized.In this work, we focus on a novel method to tackle the computational challenge associated with the identification of kinetic parameters in large dynamic models.Hence, the other challenges associated with parameter identifiability, experimental design, and uncertainty analysis, though relevant, are not investigated here.
To handle the computational challenge, it is typical to devise a protocol for model fitting and model validation.Such protocols can be divided into two broad classes.In the first class, the protocols are domain specific.For instance, several protocols for the identification of activated sludge models are discussed in [1].Similarly, a protocol for environmental system models is proposed in [2].These protocols incorporate significant expertise specific to the particular application domain, thereby leading to protocols that are fine-tuned for that domain.While they tend to be similar on a conceptual level, it is rather difficult to apply these protocols universally.
The second class includes protocols that are more general and thus-in principle-broadly applicable.The incremental model identification framework studied in [3][4][5][6][7] is a good example.This method is grounded on the computation of extents, which are-loosely speaking-linear combinations of the original model states and capture the progress of individual dynamic phenomena, such as chemical reactions.A more precise definition will be given below.
The applicability of the extent-based framework is limited to cases where all extents can be computed based on measurements and invariant relationships.However, since this is rarely the case for biological process models, the extent-based framework has been extended in [8] to the case where each extent is either observable or non-sensed-see below for precise definitions.While this recent work provides a meaningful improvement, extent-based incremental model identification remains inapplicable for a wide range of biological scenarios found in practice.For example, this method cannot deal with the frequent case where an extent is sensed but unobservable.
The goal of this study is to present a novel method for incremental parameter identification that is more universally applicable.This method is based on the formulation of a generalized framework for extent computation and the use of a graph-based clustering algorithm.In what follows, we present the method and demonstrate its applicability to cases that could not be handled in an extent-based incremental model identification framework before.The expected impact of the newly developed tools is discussed at the end of this study.

Notation and Symbols
The matrix composed of the columns c of the matrix M is denoted M •,c , while the matrix composed of the rows r of the matrix M is denoted M r,• .All vectors are column vectors unless mentioned otherwise.Table 1 lists all symbols used in this study.

Symbol Description Dimensions
h Measurement error at time t = t h M × 1 θ θ (j)  Kinetic parameters (in subsystem j) T × 1 T (j) × 1 Observable extent directions (in subsystem j) Observable extents and extent directions (in subsystem j) Computed observable extents and extent directions (in subsystem j) A × 1 A (j) × 1 χo,i χ(j) Interpolated observable extents and extent directions (in subsystem j) A × 1 A (j) × 1 Unobservable extents and extent directions (R − A) × 1 A A (j) Number of observable extents and extent directions (in subsystem j) Model prediction residuals at t h (in subsystem j) A × 1 A (j) × 1 e (j)  Indices of computed extents and extent directions simulated by subsystem j A (j) × 1 F F (j)  Information flow graph (for subsystem j) −− f f (j)  Rate expressions (in subsystem j) R × 1 R (j) × 1 Extent and extent direction-based stoichiometric matrices Indices of observable extents (in subsystem j) o , R o,i , R s Number of reactions (in subsystem j, ambiguous/non-sensed/ observable/observable in subsystem j/interpolated in subsystem j/sensed) Mixing matrix ρ Weight matrix (for subsystem j) A × A A (j) × A (j)   x x (j) , x r Extents (in subsystem j, of reaction r) R × 1 R (j) × 1, 1 × 1 y (y 0 , y(t h )) Noise-free measurements (at time t = 0, t = h) M × 1 ỹh Measurements in sample h M × 1

Dynamic Model in Terms of Numbers of Moles
We study batch process systems whose dynamic behavior is described by a set of differential equations of the following form: In the present work, the above equations represent a single-phase reaction system in a vessel with constant volume V. n(t) is the S-dimensional vector of numbers of moles; n 0 specifies the initial conditions; r(t) is the R-dimensional vector of reaction rates; N is the (R × S)-dimensional stoichiometric matrix.
We assume that M noisy measurements, ỹh , are available at the H sampling times t h (h = 1, . . ., H), according to the following equations: where y(t h ) is the M-dimensional vector of noise-free measurements, h the M-dimensional vector of measurement errors, Σ the M-dimensional measurement error variance-covariance matrix, and M is the (M × S)-dimensional species-based measurement matrix.
We further assume that kinetic laws for the reaction rates are available.These laws are functions of the component masses, n(t), and a T-dimensional vector of kinetic parameters, θ: (3) In the present study, we assume that the rate laws are known, except for the values of the parameters θ that need to be estimated.

Dynamic Model in Terms of Extents
We now adopt the definition of the extents of reaction, hereafter simply called extents, as the number of times each reaction has occurred since t = 0.These extents are measured in moles and defined mathematically as: It follows that Accordingly, ( 1)-( 3) can be represented equivalently in the following form: where with the (M × R)-dimensional matrix G being labeled the extent-based measurement matrix.
Let A denote the rank of G, that is, A := rank (G) ≤ min (M, R).

Definitions
To label the extents, the following definitions are proposed:

•
The rth extent x r (t) is labeled sensed if the measurements ỹh are affected by x r (t) through the measurement Equation (7), that is, if G •,r has at least one non-zero element.

•
The rth extent is labeled observable if (i) it is sensed (G •,r not null); and (ii) the change in the measurements ỹh caused by a change in x r (t) can be unambiguously attributed to the change in that extent, that is, G •,r is independent of all other column vectors in G.
The above labeling is based on the structure of G and does not depend on the temporal resolution or quality of the recorded measurements ( ỹh ).As such, this means only structural observability is considered.This terminology is similar to the one used in data reconciliation [9,10] as the labels are produced without a dynamic model.

Labeling Procedure
To label the extents, one first computes the (M × R)-dimensional reduced row echelon form of G denoted B: This (M × R)-dimensional matrix is composed of A non-zero rows and M − A zero rows.Using B, the extents are labeled with the following procedure: Label the R n extents corresponding to zero columns in B as non-sensed and use the vector n to identify their positions in x.Label the R s remaining extents as sensed and use the vector s to identify their positions in x.

(b)
Find all rows in B with a single non-zero element and find the column positions of these non-zero elements.Label the R o extents corresponding to these column positions as observable and use the vector o to identify their positions in x.These extents are observable because one can compute a unique value based on the available information (measurements, extent-based measurement matrix, and initial conditions).(c) Label the R a extents that are sensed but not observable as ambiguous.These extents are ambiguous because one cannot compute a unique value based on the available information.
The vector a is used to identify their positions in x.
Based on this labeling, the vectors x s (t), x n (t), x o (t), and x a (t) are defined to represent (i) the R s sensed extents, (ii) the R n = R − R s non-sensed extents; (iii) the R o observable extents; and (iv) the R a = R s − R o ambiguous extents.This is illustrated in Figure 1.With the above definitions, (5) can be reformulated as: Furthermore, it follows from

Practical Cases
Depending on the values of A, M, and R, one can distinguish the following cases: I. Full-rank extent-based measurement matrix (A = R).This occurs when there are at least as many measurements as reactions (M ≥ R) and the matrix G is full column rank.As a result, R n = 0, R o = R, and R a = 0.This is the most frequently studied case, e.g., in [5][6][7].This case enables computing unique values for the R extents by means of a linear transformation [5], thus allowing the estimation of kinetic parameters for each reaction rate model individually.II.
Rank-deficient measurement matrix (A < R).If A < R, for example because there are fewer measurements than reactions (M < R), it is no longer possible to compute all R extents from M measurements without additional information such as an established kinetic model.One can distinguish two situations within this case: (a) No ambiguity (R a = 0).In this case, R o = A and R o + R n = R.As shown in [8], it is possible in this case to implement efficient parameter estimation by identifying subsystems of the complete model that include a subset of the kinetic rate laws and their parameters.(b) Ambiguity present (0 < R a ≤ A).This situation results in R o < A. For this case, no generally applicable method for incremental parameter estimation is available until now.
The method proposed in this work is developed with the aim of handling all the above cases in a single framework for extent-based kinetic parameter estimation.

Observable and Unobservable Extent Directions
The ambiguous extents are now investigated in more detail to determine observable directions among them.

Factorization of G •,a
Let ρ o be the rank of the (M × R a )-dimensional measurement matrix G •,a .Then, G •,a can be factorized into the (M Using the reduced row echelon form B •,a , the matrices G o and V o are computed as follows: • G o is the matrix composed of the columns of G •,a corresponding to the column positions of the first non-zero elements in the rows of V o T .

Definition of Observable and Unobservable Extent Directions
The term G •,a x a (t h ) in ( 13) can be expressed as: with χ o (t) the ρ o -dimensional vector of observable extent directions among the ambiguous extents: where the (R a × R)-dimensional matrix I a,• includes the rows a of the identity matrix I R such that Equation (15) indicates that while the extents x a (t) cannot be observed individually, their combined effects on the measurements can be observed as the linear combinations χ o (t).
Remark 1.The unobservable extent directions span the null space of G •,a , which is also the null space of V o T .
Denoting this null space by the (R a × ρ u )-dimensional matrix V u , with ρ u = (R a − ρ o ), one can define the ρ u -dimensional vector of unobservable extent directions χ u (t) as:

Observable Extents and Extent Directions
We further define the vector χ o (t) consisting of the A = R o + ρ o observable extents and extent directions as follows: With this definition, the measurement Equation ( 13) can be rewritten as: with G o , an (M × A)-dimensional matrix, constructed as Since the (R a × A)-dimensional matrix G o has full column rank, (21) can be used to compute the maximum-likelihood estimates χo,h of the observable extents and extent directions directly from the measurements: with the (A × M)-dimensional matrix P given by: The associated expected variance-covariance matrix of the estimation errors becomes:

Unobservable Extents and Extent Directions
We finally define the vector χ u (t) consisting of the R − A = R n + ρ u unobservable extents and non-sensed extent directions as follows: With this definition, the expression for the number of moles ( 12) can be rewritten as: with:

System Partitioning
Incremental parameter estimation is based on the possibility of separating the parameter estimation problem into smaller problems.Ideally, if all extents of reaction could be computed from measurements, each reaction could be identified individually, that is, independently of the other reactions.This way, the parameter estimation problem reduces to the solution of R smaller problems.As discussed in the previous section, some extents may not be observable.For these situations, it would still be nice to be able to separate the parameter estimation problem into J smaller problems (J ≤ R).The objective is therefore to partition the reaction system effectively-with as many small groups of reactions as possible-to simplify the parameter estimation task.The first step to achieve this consists of system partitioning.
An algorithm is developed to group the kinetic parameters into J parameter subsets (j = 1, . . ., J), each represented as a T (j) -dimensional vector θ (j) satisfying the following properties:

•
The size T (j) of each parameter subset should be as small as possible.

•
The estimates θ(j) in the jth parameter subset can be computed without consideration of any other parameter subset θ (i) , i = j.

•
Each parameter in θ appears in at most one of the parameter subsets θ (j) .
This objective is achieved by means of model reformulation and graph-based system partitioning.The graph-based procedure can be interpreted as a symbolic manipulation of the process model.It does not require symbolic differentiation, however.

Step 1-Model Reformulation
An extended model is first defined to describe the dynamics of all extents and all observable directions among the ambiguous extents.To this end, the following procedure is applied: Express ẋ(t) as a function of χ o (t) and x(t) The dynamic model ( 6) is modified by replacing n(t) with the right-hand side of ( 27): The vector χ u (t) is now replaced with the right-hand side of (26).As a result, the above system becomes: Define the ρ aug -dimensional vector χ aug (t) := x(t) x(t) that includes all extents and extent directions, with ρ aug = R + ρ o .The dynamic behavior of χ aug (t) can be described by a differential-algebraic system including the R differential Equation ( 31) and the ρ o algebraic expressions (16):

(c) Interpolation of the observable extents and extent directions
To increase the efficiency of system partitioning, it is useful to account for the fact that the observable extents and extent directions can be expressed in terms of measurements.However, since the observable extents and extent directions are only known at H discrete measurement points, their values always need to be obtained via interpolation.In this work, we apply piece-wise linear interpolation as follows: with which the system (32) and (33) becomes: Step 2-Graph-Based System Partitioning The equation system (35) and ( 36) is now analyzed by means of a graph partitioning procedure to determine the smallest groups of kinetic parameters that can be estimated separately.To this end, the following steps are performed: Create a graph One creates a directed graph F with a vertex for every state variable in χ aug (t) and every parameter in θ.Hence, this graph has θ appears in the right-hand side of the wth equation in ( 35) and (36 . This graph represents the information flow for simulating (35) and (36).Additional arcs and vertices may be added to describe the influence of known inputs and the links between extents and measured variables.For system partitioning, this is however unnecessary and omitted for clarity.

(b)
Extents predicted from measurements or simulation The simultaneous approach uses a complete model of the reaction system to predict the extents (or concentrations) via simulation.If one wants to partition the reaction system into small groups of reactions, only the extents belonging to a given group can be generated via the simulation of that group.The other extents that enter the rate laws must be provided by the user as quantities known from measurements.
That information can be included in the graph F by annotating the various arcs.The arcs that originate at a vertex corresponding to an observable extent or an observable extent direction are labeled observation arcs, considering that observable extent or an observable extent direction can be replaced with their measured values (34).They are visualized as dashed-line arrows.The remaining arcs are labeled simulation arcs and visualized as solid-line arrows.The observation arcs represent the idea that the elements of χo,i (t) can be regarded as known inputs for simulating (35) and ( 36).

(c)
Subgraph selection Identify the J subgraphs F (j) consisting of arcs and vertices in F on directed paths that (i) lead to a vertex representing an observable extent or an observable extent direction; and (ii) consist of simulation arcs only.The selected vertices represent an R (j) -dimensional vector of extents x (j) (t), a ρ o (t), and a T (j) -dimensional vector of parameters θ (j) .The positions of x (j) (t) in x(t) are given by the vector j so that: and the selection matrices θ are defined so that: This means that each subgraph F (j) represents a subset of Equations ( 35) and (36) that describes the dynamics of x (j) (t) and χ (j) o (t) without reference to any other state variable: with

(d) Add observation arcs and vertices
For every graph F (j) , add (i) the observation arcs that have a target vertex belonging to F (j) and (ii) the source vertices of the added observation arcs.These added source vertices represent the minimal subset of interpolants in χo,i (t) (34) that are required to simulate x (j) (t) and χ (j) o (t) and are referred to as χ(j) o,i (t).This means that the graph F (j) now represents all information required to simulate the observable extents x (j) (t) and the observable extent directions χ (j) o (t).Accordingly, one can rewrite the jth equation subsystem as: At the end of this procedure, the equation system (35) and ( 36) is approximated by J smaller equation subsystems, each including a subset of the kinetic parameters.As intended, every kinetic parameter appears in at most one of the J subsystems.In addition, the identified subsystems do not share any of the observable extents or extent directions as state variables, that is, each observable extent and extent direction is simulated in only one of the identified subsystems.

Parameter Estimation Methods
In this work, we solve the parameter estimation problem in two distinct ways.The first way consists in a simultaneous estimation of all parameters in the maximum-likelihood sense.The second way consists in an extent-based incremental parameter estimation.The next paragraphs describe how incremental parameter estimation approximates the simultaneous estimation procedure to minimize the number of parameters that are estimated together.

Simultaneous Parameter Estimation
Maximum-likelihood estimation of the kinetic parameters can be obtained by solving minimization of the weighted mean squared error: subject to (1)-( 3).This estimation problem can be equivalently formulated in terms of the computed extents as follows: subject to ( 6)-( 8) and (20) and with W := Σ −1 χ .

Incremental Parameter Estimation
The incremental parameter estimation procedure is obtained by applying two modifications, A and B, to problem P * 0 .
(a) Modification A: Removing correlation terms.Let A o and define the A (j) -dimensional vector of observable extents and extent directions χ (j) o (t).This vector includes all observable extents x (j) o (t), whose positions in x (j) (t) are given by the vector o (j) , and all observable extent directions χ (j) o (t) in Subsystem j.We further define the matrix U (j) so that: The A (j) -dimensional vector χ(j) o,h is then defined by selecting the elements of χo,h as: where the vector e (j) gives the positions of χ(j) o,h in χo,h .Now define the residuals d h associated with subsystem j: This way, the objective function Q * 0 defined above can be reformulated as: and can subsequently approximated with Q 1 : with W (j) := I e (j) ,• W I e (j) ,• T .This first modification results in: subject to ( 6)-( 8) and (20).This error due to this approximation is referred to as Type A approximation error.This error is zero if all matrices I e (i) ,• W I e (j) ,• T , i = j, are zero matrices.This is true when the correlation between the estimation errors of any element in χ(i) o,h and any element of χ(j) o,h (i = j) is zero.
(b) Modification B: Separation of problem P 1 into J smaller problems P (j) 1 .An approximation to problem P 1 is now obtained by optimizing each of the J terms in (54) separately and simulating the values of χ (j) o (t h ) with ( 43) and (44).We refer to each problem as P (j) The most important feature of problems P (j) 1 , j = 1, . . ., J, is that each of them involves only a small number of parameters θ (j) .Please note that the approximation of problem P 1 by this set of problems P (j) 1 is perfect in the special case where the right-hand sides (56) do not involve interpolated extents, that is, if the vectors χ(j) o,i (t) (j = 1, . . ., J) are empty.Another error, named Type B approximation error, is introduced when this is not the case.
Thanks to two modifications of problem P 0 , one obtains J smaller problems P (j) 1 , each of which includes only a fraction of the original set of parameters.The price to pay for such a simplification is a deviation from maximum likelihood due to the introduction of approximation errors (type A & B).These errors are often marginal as demonstrated below.
As in previous work [5][6][7], the parameter estimates obtained by solving problem P (j) 1 (j = 1, . . ., J) can serve as reliable initial guesses to initiate the solution to problem P 0 .The problem P 0 when solved with these initial guesses is named problem P 1+0 .

Implementation
All results can be reproduced with the open-source Efficient Model Identification (EMI) MATLAB package for efficient model identification [6] that includes all methods and simulations used in this study.This package is added in the Supplementary Information (Section A).

Results
We first explain the results obtained within an extensive simulation study to demonstrate the method.After that, results obtained with an experimental data set are used to demonstrate real-world applicability.

Simulation Study
The developed methods are demonstrated via a batch reaction system and investigating 5 different measurement scenarios (A-E).The reaction system is described first.Then, scenario A and its results are discussed in detail.The results for scenarios B to E are only summarized, with the details described in the Supplementary Information (Section B).

Reaction System
This reaction system has R = 5 reactions involving S = 6 species (A to F) with the following reaction scheme:

Dynamic Model in Terms of Numbers of Moles
The simulated kinetic rate expressions are: with The ground truth values of the kinetic parameters are given in Table 2. ).

Scenario A
The first scenario considers the case where the concentrations of B and C and the sum of the concentrations of E and F are measured, that is, The measurement error-covariance matrix is Σ = diag 1 1 2 T 10 −4 mol 2 L −2 .Figure 2 shows the simulated noise-free and noisy measurements.

Extent labeling
To specify the measurement model in terms of extents (7), one needs to specify y 0 and G: The reduced row echelon form of G is which indicates that there are: which gives: Observable extents and extent directions The factorization of G •,a gives: so that The values of the observable extents and extent directions χo,h := x1,h x3,h χo,h T can be computed from (23) with The expected estimation error variance-covariance matrix Σ χ is: This demonstrates that the estimation errors in the first and second observable extents are uncorrelated with the estimation error in the computed observable extent direction.Please note that the correlation between the estimation errors of the first and second observable extents is fairly high ( 3 = 0.95).Figure 3 shows the computed values of the observable extents and extent direction.Expression ( 27) is then fully specified with the following vectors and matrices: Since the 2nd and 3rd column of N u are null vectors, it follows that values of ñ2 (t) and ñ3 (t) can be computed based on the interpolated estimates χo,i (t) without the need to simulate x(t).A further exploration of this idea remains out of the scope of this paper, however.

System partitioning
For simplicity of notation, we omit the time dependence in what follows.For example, the interpolants χo,i (t) are given as χo,i := x1 x3 χo T .Following Steps 1(a)-(b) in Section 3.6.1, the augmented equation system becomes: = 0 (74) Figure 4 shows the graph corresponding to the above equation system.The vertices corresponding to the observable extents and extent direction are shaded, while the other vertices are white.The simulation arcs are shown with full-line arrows, while the observation arcs are shown as dashed-line arrows.To identify possible subsystems, one removes all the observation arcs, which results in two subgraphs.The first subgraph includes the parameters k 1 , k 2 , k 4 , k 5 , and K 1 , which affect the observable quantities x 1 and χ o via a network that also involves the unobservable extents x 2 , x 4 , and x 5 .The second subgraph is much smaller and includes the parameter k 3 that influences the observable extent x 3 .Accordingly, the J = 2 subsystems are: and The information flows in these subsystems are shown in the inset of Figure 4.

Incremental parameter estimation
The resulting partitioning means that estimates of the first set of parameters θ (1) θ(1) = arg min θ (1)   H x 1 (0) x 2 (0) x 4 (0) where The second set of parameters consists of k 3 only, θ (2) := k 3 .Its value can be obtained by minimizing the least-squares deviation between x 3 (t h ) and its measured counterparts x3,h , that is, by solving problem P (2) Figure 5 shows the measured concentrations and the simulated profiles obtained with (i) the true parameters; (ii) simultaneous parameter estimation (P 0 ); and (iii) incremental parameter estimation (P (1) 1 ).All simulated profiles describe the experiment well and are practically indistinguishable.The parameter values obtained by solving P 0 , P (1) 1 , and P 1+0 are shown in Table 2.All parameter estimates are in close agreement to each other.It is worth noting that the parameter k 2 can be estimated even if the corresponding extent x 2 is labeled non-sensed, i.e., our results suggest that this parameter is both structurally and practically identifiable.
1 and P (2) 1 ).The produced simulation results exhibit strong overlap, meaning that the identified models closely approximate the ground truth model.

Scenario B
Scenario B assumes concentration measurements for the species B and C only: The measurement error-covariance matrix is Σ = diag 1 1 T 10 −4 mol 2 L −2 .The reduced row echelon form of G is: which leads to the following extent labeling: • R a = 0 ambiguous extents.
Please note that the extent-based method proposed in [8] applies in this case since R a = 0. Figure 6 shows the corresponding graph F , which can be partitioned into J = 2 subsystems and a leftover part.The first subgraph includes the parameters k 1 , k 2 , and K 1 since they all have some effect on the observable extent x 1 via a network that also involves the unobservable extent x 2 .The second subgraph includes the parameter k 3 that influences the observable extent x 3 .The information flows in these subsystems are shown in the inset of Figure 6.The leftover part includes the parameters k 4 and k 5 and the extents x 4 and x 5 .These parameters and variables are not part of any of the identified subsystems because there are no directed paths composed of simulation arcs from the corresponding vertices to any of the observable extents.Hence, it follows that these parameters are unidentifiable.This is consistent with the method proposed in [8] (not shown).Clearly, the graph-based partitioning procedure shows potential for parameter identifiability analysis, which is discussed as an opportunity below.
The measurement error-covariance matrix is The reduced row echelon form of G is: which leads to the following extent labeling: T .
Figure 7 shows the corresponding graph F , which can be partitioned into 2 subsystems.The first subgraph includes the parameters k 1 , k 2 , k 4 , k 5 , and K 1 since they all have some effect on the observable extents x 1 , x 4 , and x 5 via a network that also involves the unobservable extents x 2 .The second subgraph includes the parameter k 3 that influences the observable extent x 3 .The information flows in these subsystems are shown in the inset of Figure 7.Although the assignment of the parameters is the same as in scenario A, the parameter estimation problems are different since the fit of subsystem 1 is determined with respect to x1 , x4 , and x5 in scenario C, compared to only x1 and χo in scenario A.
The measurement error-covariance matrix is The reduced row echelon form of G is: which leads to the following extent labeling: From A = rank (B) = 3 and R o = 0, one sees that there are ρ o = A − R o = 3 observable extent directions among the 5 ambiguous extents.The first two are linear combinations of x 1 , x 2 , and x 3 , while the third one is a linear combination of x 4 and x 5 .This particular appearance of extents in subsets of the observable directions stems from the subspace clustering property of the reduced row echelon form and will be discussed in some more detail below.
Figure 8 shows the corresponding graph F , which can be partitioned into 2 subsystems.The first subgraph includes the parameters k 1 , k 2 , k 3 , and K 1 since they all have some effect on the observable extent directions χ o,1 and χ o,2 via a network that also involves the unobservable extents x 1 , x 2 , and x 3 .The second subgraph includes the parameters k 4 and k 5 that influence the observable extent direction χ o,3 via a network that also involves the unobservable extents x 4 and x 5 .Hence, this leads to two parameter estimation problems involving 4 and 2 parameters, respectively.The information flows to simulate each of the corresponding subsystems are shown in the inset of Figure 8.
The measurement error-covariance matrix is The reduced row echelon form of G is: which leads to the following extent labeling: This means that the original framework for extent computation, which assumes A = R (Section 3.2.3),applies.
Figure 9 shows the graph F , which can be partitioned into 5 subsystems that include the parameters as follows: (i) k 1 and K 1 ; (ii) k 2 ; (iii) k 3 ; (iv) k 4 ; and (v) k 5 .Hence, one obtains 5 entirely decoupled parameter estimation problems.This set of optimization problems is the same as those obtained with the original extent-based model identification method as is also clear from the inset of Figure 9.

Experimental Study
To explore the applicability of the novel incremental parameter estimation method with realistic laboratory data, we execute parameter estimation for the α-pinene batch experiment first reported in [11] and later studied in [12][13][14][15][16].The reaction network consists of 5 species.These species are labeled A, B, C, D, and E and correspond to α-pinene, dipentene, allo-ocimene, pyronene, and dimer.The reaction network is: As in [12], the model is defined with the following stoichiometric matrix and process rates: At the start of the batch experiment, only A is present.As explained in [12], the concentrations of all species but D are measured.The concentrations of D are obtained by assuming that 3% of the transformed A is present as D at all times.Finally, all concentrations are normalized to 1 (100%).Figure 10 shows the experimental data reported in [12].As the initial numbers of moles and the volume are unknown, we express both the concentrations and the extents as dimensionless fractions of the numbers of moles relative to the initial numbers of moles of A. Hence, we can write n 0 := 1 0 0 0 0 T .

Extent labeling
For extent labeling, we assume that all species are measured with independent zero-mean Gaussian measurement errors, similar to the least-squares approach of [12]: It follows that The reduced row echelon form of G is which indicates that there are: which delivers: Observable extents and extent directions The factorization of G •,a gives: so that Figure 11 shows the computed values of the observable extents and extent direction assuming these definitions.The corresponding error variance-covariance matrix is: with σ the (unknown) measurement error variance.1 , j = 1, . . ., 4) followed by (ii) simultaneous parameter estimation (P 1+0 ).The produced simulation results exhibit strong overlap, meaning that both models produce very similar results.
Figure 12 shows the graph corresponding to the above equation system.Removing all the observation arcs generates 4 subgraphs.The first three subgraphs represent the first three reactions and include: k 1 and x 1 (j = 1) ; k 2 and x 2 (j = 2); and k 3 and x 3 (j = 3).The fourth graph includes the parameters k 4 and k 5 via a network that also involves the observable extent direction χ o and the unobservable extents x 4 and x 5 (j = 4).Accordingly, the J = 4 subsystems are: The information flows in these subsystems are shown in the inset of Figure 12.

Incremental parameter estimation
The resulting partitioning means that estimates of the k 1 , k 2 , and k 3 can be obtained by solving the following three single-parameter estimation problems (P 1 , P 1 , P 1 ): where W (1) := 5/4 (115)

P
(2) where where The fourth set of parameters, k 4 and k 5 , are estimated by solving the following problem: where Please note that the (unknown) measurement error variance σ can arbitrarily be set equal to one during parameter estimation without affecting the parameter estimates.This, however, means that parameter uncertainties cannot be quantified in this case without estimating σ as well.This estimation and subsequent uncertainty analysis is considered out of scope for this study.Figure 11 shows the simulated extent profiles obtained after solving the optimization problems (P (j) 1 , j = 1, . . ., 4).There is a reasonable fit in all cases, except for the third extent of reaction (x 3 ).This, in the mind of the authors, demonstrates a tangible benefit of the extent-based model framework.Indeed, one can identify which parts of the model could be improved, e.g., by reconsidering the reaction rate expression corresponding to the computed extents that are approximated poorly.However, this is not explored further in this work.
Figure 10 shows the measured concentrations and the simulated profiles obtained with (i) incremental parameter (P (j) 1 , j = 1, . . ., 4), followed by (ii) simultaneous parameter estimation with estimates obtained through incremental parameter estimation (P 1+0 ).The parameter values obtained by solving P 0 , P 1 , P 1 , and P 1+0 are shown in Table 3.Note the final values obtained with both P 0 and P 1+0 are equal within numerical precision.

Discussion
A novel procedure for model parameter estimation is proposed.The procedure combines two important features: (i) a new framework for extent computation based on the computation of observable directions among the set of sensed but ambiguous extents; and (ii) a graph-based system partitioning procedure that identifies the groups of equations and kinetic parameters that can be simulated independently from the remaining part of the dynamic model.The latter is possible by approximating the original equation system via interpolation of observed extents and extent directions.The proposed procedure enables splitting the parameter estimation problem into smaller problems in cases that could so far only be handled with simultaneous model identification methods (e.g., scenario A, C, and D in Section 4.1).It also subsumes preexisting methods under special conditions such as complete observability (e.g., scenario E in Section 4.1, [5][6][7]) or absence of ambiguity (e.g., scenario B in Section 4.1, [8]).Thus, this procedure is applicable to parameter estimation for any multivariate differential equation model under a large range of structural observability conditions.This work removes an important bottleneck for the implementation of incremental model identification to biological systems.Indeed, by providing a method that can be applied to any known stoichiometric matrix N and measurement matrix M, incremental model identification is now applicable to biological processes where the number of measurements M is lower than the number of modeled reactions R. Computationally speaking, the extent computation and graph-based partitioning are expected to scale well with the number of modeled reactions.The effects on the computational efficiency of the parameter estimation step will however depend primarily on the lengths of the state vectors in the identified subsystems.In turn, these lengths depend strongly on model structure and the available measured variables and less so on the number of states in the complete system.
A remaining obstacle is the fact that both M and N are assumed to be known.While several techniques exist to handle this, including methods based on extents [17], some coefficients in these matrices may be unidentifiable based on atomic and stoichiometric balances alone.For this reason, dealing with unknown stoichiometry (M) or ill-defined measurements (N) deserves continued attention.
Another aspect open for exploration is that no global optimization method has been tested so far to solve the smaller parameter estimation problems P (j) 1 that result from system partitioning.Inspiration can be drawn from existing parameter estimation methods even if these methods (i) might be complex yet without publicly available implementation [18]; or (ii) remain limited to cases where all extents are observable and therefore modeled as univariate processes for parameter estimation [19,20].

Generalized Framework for Extent Computation
The addition of new concepts, such as ambiguous extents and observable extent directions expands the framework for extent computation beyond its original range of applicability.In this study, the focus has been given to the problem of parameter estimation.However, the same generalized framework is likely to be applicable and useful for challenges that have been handled with earlier extent-based methods such as data reconciliation [7], model structure identification [6], and process control [21].Moreover, there are no obvious limitations that hinder applications involving multi-phase or distributed system models [22,23] or models including algebraic constraints [6,22].

Optimal System Partitioning
According to our understanding, the proposed procedure delivers the most efficient system partitioning given the available measurements and model information M, f (•), N, and n 0 .In other words, the procedure generates the subsystems with the smallest number of parameters that can be estimated independently from parameters in other subsystems.However, this is stated here without formal proof.The factorization of G •,a based on the reduced row echelon form of G, as described in Section 3.3.1,plays a crucial role.Indeed, it follows from work on subspace clustering methods for noise-free data [24,25] that this factorization minimizes the presence of simulation arcs from the ambiguous extents to the observable directions, thereby leading to the best possible system partitioning.
Please note that the obtained generalized extent framework is optimal only for the purpose of parameter estimation.For instance, one could consider a factorization of G •,a which identifies directions that exhibit uncorrelated estimation errors, thereby improving the statistical quality of the computed observable directions and possibly avoiding Type A approximation error.This could be achieved via singular value decomposition of G •,a .However, singular value decomposition cannot guarantee optimal system partitioning [26].Consequently, it follows that the optimal factorization of G •,a depends on the objective of this factorization.This stands in contrast to the existing body of work concerning extents, which do not involve any such purpose-dependent factorization steps.
Note also that the proposed procedure is designed for the definition of optimality used here.For instance, if one relaxes the constraint that parameters can only appear in one subsystem (Section 3.6), then the proposed graph partitioning procedure is not optimal.For example, in scenario A, one may be able to estimate the parameters in 3 subsets rather than 2, namely, (i) k 1 , k 2 , and K 1 by simulation of x 1 (t) and x 2 (t); (ii) k 2 , k 3 , k 4 , and k 5 by simulation of x 2 (t), x 4 (t), x 5 (t), and χ o,1 (t); and (iii) k 3 by simulation of x 3 (t).This leads to 3 parameter estimation problems with 3, 4, and 1 parameters, which may be easier to solve than the 2 parameter estimation problems with 5 and 1 parameters as obtained with the proposed procedure.It is however non-trivial to identify a procedure, including both factorization of G •,a and system partitioning, that guarantees the best partitioning when applying this relaxed definition of optimality, nor is it clear whether such a procedure exists.

New Opportunities
This work also hints at several new applications of the generalized framework for extent computation: (a) Identifiability analysis.While not a core objective of this work, it has been shown that the graph partitioning method can help identify unidentifiable parameters.Given that this labeling is based on a model reformulation and does not depend on the temporal resolution or quality of the collected measurements, it follows that the method identifies structurally unidentifiable parameters.Unlike other methods [27], the proposed approach does not require symbolic differentiation.It remains to be explored whether this can also be used to positively identify

Figure 2 .Figure 3 .
Figure 2. Scenario A-Simulation.Noise-free and noisy measurements as a function of time.

1 Figure 4 .
Figure 4.Scenario A-Graph F .There are three shaded vertices corresponding the observable extents (x 1 , x 3 ) and extent direction (χ o ).The remaining vertices represent the unobservable extents (x 2 , x 4 , x 5 ) and the parameters (k 1 , k 2 , k 3 , k 4 , k 5 , K 1 ).The simulation and observation arcs are shown as solid-line and dashed-line arrows, respectively.Removing the observation arcs and graph partitioning results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 4 , k 5 , and K 1 are connected to the observable quantities x 1 and χ o , and another graph in which k 3 is connected to the observable extent x 3 .

Figure 5 .
Figure 5. Scenario A-Parameter estimation.Measured concentrations and simulated profiles obtained with (i) true parameters; (ii) simultaneous parameter estimation (P 0 ); and (iii) incremental parameter estimation (P

1 Figure 6 .
Figure 6.Scenario B-Graph F .There are two shaded vertices corresponding the observable extents (x 1 , x 3 ).Removing the observation arcs and graph partitioning results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , and K 1 affect the observable extent x 1 , a second graph in which k 3 affects the observable extent x 3 .The vertices k 4 and k 5 have no effect on the observable extents x 1 and x 3 .4.1.5.Scenario C Scenario C assumes concentration measurements for the species B, C, E, and F:

1 Figure 7 .
Figure 7. Scenario C-Graph F .There are 4 shaded vertices corresponding to the observable extents x 1 , x 3 , x 4 , and x 5 .Removing the observation arcs results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 4 , k 5 , and K 1 affect the observable extents x 1 , x 4 , and x 5 , and another graph in which k 3 affect the observable extent x 3 .4.1.6.Scenario D Scenario D assumes concentration measurements for the species A, C, and E:

1 Figure 8 .
Figure 8. Scenario D-Graph F .There are 3 shaded vertices corresponding the observable extent directions χ o,1 , χ o,2 , and χ o,3 .Removing the observation arcs results in 2 subgraphs for the parameters as shown in the inset: one graph in which k 1 , k 2 , k 3 , and K 1 affect the observable extent directions χ o,1 and χ o,2 , and another graph in which k 4 and k 5 affect the observable extent direction χ o,3 .

1 Figure 9 .
Figure9.Scenario E-Graph F .All extent vertices are observable and thus shaded.Removing the observation arcs results in 5 subgraphs for the parameters as shown in the inset: k 1 and K 1 are estimated from x 1 , k 2 from x 2 , k 3 from x 3 , k 4 from x 4 , and k 5 from x 5 .

5 Figure 12 .
Figure12.α-pinene-Graph F .There are four shaded vertices corresponding the observable extents (x 1 , x 2 , x 3 ) and extent direction (χ o ).The remaining vertices represent the unobservable extents (x 4 , x 5 ) and the parameters (k 1 , k 2 , k 3 , k 4 , k 5 ).The simulation and observation arcs are shown as solid-line and dashed-line arrows, respectively.Removing the observation arcs and graph partitioning results in 4 subgraphs for the parameters as shown in the inset: (i) one with vertices for k 1 and x 1 ; (ii) one with vertices for k 2 and x 2 ; (iii) one with vertices for k 3 and x 3 ; and (iv) one with vertices k 4 , k 5 , x 4 , 5 , and χ o .

Table 1 .
List of symbols.

extents 𝒙𝒙 𝑅𝑅 sensed 𝒙𝒙 𝐬𝐬 𝑅𝑅 s ambiguous 𝒙𝒙 𝐚𝐚 𝑅𝑅 a observable 𝒙𝒙 𝐨𝐨 𝑅𝑅 o non−sensed 𝒙𝒙 𝐧𝐧 𝑅𝑅 n non−sensed 𝒙𝒙 𝐧𝐧 𝑅𝑅 n observable 𝒙𝒙 𝐨𝐨 𝑅𝑅 o non−sensed 𝒙𝒙 𝐧𝐧 𝑅𝑅 n observable dir. 𝝌𝝌 o 𝜌𝜌 o unobservable dir. 𝝌𝝌 u 𝜌𝜌 u
Labeling of the R extents as sensed, non-sensed, observable, and ambiguous.The ambiguous extents are spanned by ρ o observable and ρ u unobservable extent directions.This way, the extent space can be represented by A = R o + ρ o observable extents and extent directions, and R − A = R n + ρ u unobservable extents and extent directions.

Table 2 .
Scenario A-Parameter estimates.Model simulation and parameter estimation results: kinetic parameters and model fit.All parameter estimates are reported with their standard deviation based on the Laplacian approximation of the likelihood function (i.e., exp K 1 can be obtained by minimizing a weighted least-squares deviation between the predicted x 1 (t h ) and χ o (t h ) and their measured counterparts x1,h and χo,h , that is, by solving problem P