Open Access Article

Finding and Breaking Lie Symmetries: Implications for Structural Identifiability and Observability in Biological Modelling

BioProcess Engineering Group, IIM-CSIC, 36208 Vigo, Galicia, Spain
*
Author to whom correspondence should be addressed.
Symmetry 2020, 12(3), 469; https://doi.org/10.3390/sym12030469
Received: 27 February 2020 / Revised: 13 March 2020 / Accepted: 14 March 2020 / Published: 16 March 2020
(This article belongs to the Special Issue Lie Symmetries at Work in Biology and Medicine)

Abstract

A dynamic model is structurally identifiable (respectively, observable) if it is theoretically possible to infer its unknown parameters (respectively, states) by observing its output over time. The two properties, structural identifiability and observability, are completely determined by the model equations. Their analysis is of interest for modellers because it informs about the possibility of gaining insight into a model’s unmeasured variables. Here we cast the problem of analysing structural identifiability and observability as that of finding Lie symmetries. We build on previous results that showed that structural unidentifiability amounts to the existence of Lie symmetries. We consider nonlinear models described by ordinary differential equations and restrict ourselves to rational functions. We revisit a method for finding symmetries by transforming rational expressions into linear systems. We extend the method by enabling it to provide symmetry-breaking transformations, which allows for a semi-automatic model reformulation that renders a non-observable model observable. We provide a MATLAB implementation of the methodology as part of the STRIKE-GOLDD toolbox for observability and identifiability analysis. We illustrate the use of the methodology in the context of biological modelling by applying it to a set of problems taken from the literature.
Keywords: dynamic modelling; nonlinear systems; observability; structural identifiability; Lie symmetries

1. Introduction

The present work is motivated mainly by problems arising in identification and modelling of biological systems, although its results are applicable in other fields. We consider nonlinear dynamic models defined by ordinary differential equations. This framework is sufficiently powerful to model a wide range of biological processes, from intracellular networks to whole ecosystems, with the appropriate level of detail.
When compared to other applications, biological models often pose specific challenges due to the combination of nonlinearity, uncertainty about the underlying system, and experimental limitations regarding the possibility of perturbations and of measurements [1]. These features make the identification of biological models particularly challenging, and call for new methodological developments and computational tools. Indeed, many theoretical advances in nonlinear systems identification have been motivated by biological problems, even though the types of problems considered are often common to other scientific areas, which makes the resulting methodologies generally applicable [2,3]. An example is the study of structural identifiability, a property that was introduced in the context of physiological modelling [4] and has since been adopted in many scientific and technological areas; since its study is of particular interest for biological models, many related theoretical developments have been motivated by biological problems [5,6,7]. Structural identifiability refers to the possibility of inferring the values of the unknown (constant) parameters in the model equations from observations (measurements) of the model output. It is closely related to another property, observability, which describes the possibility of determining the (time-varying) state variables of the model [8]; in fact, it can be considered a particular case of it. By considering the parameters as constant state variables, structural (local) identifiability amounts to observability.
A note on terminology: in accordance with common usage, we speak of structural identifiability to distinguish it from practical identifiability. The latter quantifies the uncertainty in parameter values taking into account the information content of the available data, which may be limited due to experimental errors or low sampling rates. We also specify that it is structural local identifiability to distinguish it from structural global identifiability. Although observability is also a structural local property, for historical reasons we do not add those adjectives to its name, and refer to it simply as observability.
The cause of lack of structural identifiability or observability can be traced back to the existence of symmetries in the model equations. In the present work we use Lie’s theory on symmetry analysis of differential equations [9,10,11]. It has been shown [12] that the existence of Lie symmetries is equivalent to the existence of similarity transformations, i.e., transformations of parameters and variables that leave the output(s) unchanged [13,14]. This means that the existence of symmetries amounts to lack of structural identifiability and/or observability.
Here we revisit a method for finding symmetries by transforming rational expressions into linear systems [15]. Determining the existence of symmetries in a model is a way of analysing its structural identifiability and observability. Furthermore, if symmetries exist, their mathematical expressions provide information about the relationships between model variables that cause loss of identifiability and/or observability. One way of exploiting these insights is by fixing one or more parameters involved in a symmetry, in order to render the remaining ones identifiable. Another way is by using the symmetry-breaking transformations to reformulate the model, applying the transformations so that the symmetries disappear and the new model is identifiable and observable. In this paper, we extend the method by enabling it to provide symmetry-breaking transformations, which allows for a semi-automatic model reformulation that renders a non-observable model observable. Thus, the approach can be used not only for characterizing the identifiability and observability of a model, but also for suggesting alternative reformulations if the original model does not possess those properties. We illustrate the usefulness and applicability of this approach in biological applications with four models of biochemical processes. Furthermore, we provide an implementation of the methodology as part of a new version of the STRIKE-GOLDD software https://github.com/afvillaverde/strike-goldd_2.1. STRIKE-GOLDD is a MATLAB toolbox that analyses structural identifiability and observability using a differential geometry approach [16,17,18].
The organization of this paper is as follows: we begin by providing an overview of the methodological aspects in Section 2. Then we illustrate its application to a number of modelling problems in Section 3. Finally, we discuss the implications and provide some conclusions in Section 4.

2. Methods

2.1. Structural Identifiability and Observability

For the study of observability and identifiability we consider the following type of models:
$$M_{NL} := \begin{cases} \dot{x}(t) = f(x(t), \theta, u(t), w(t)), \\ y(t) = g(x(t), \theta, u(t), w(t)), \\ x(t_0) = x_0(\theta), \end{cases} \tag{1}$$
with a parameter vector $\theta \in \mathbb{R}^q$, known input functions $u(t) \in \mathbb{R}^{m_u}$, unknown input functions $w(t) \in \mathbb{R}^{m_w}$, state vector $x(t) \in \mathbb{R}^m$, output vector $y(t) \in \mathbb{R}^n$, and $f$ and $g$ vectors of analytic functions. From this point on, the time dependency is omitted to simplify the notation.
Definition 1.
A parameter $\theta_i$, the $i$th component of the parameter vector of the model $M_{NL}$, is structurally locally identifiable (SLI) if for almost any parameter vector $\theta \in \mathbb{R}^q$ there is a neighbourhood $N(\theta)$ in which the following relation holds:
$$\hat{\theta} \in N(\theta) \ \text{and} \ y(t, \hat{\theta}) = y(t, \theta) \ \Rightarrow \ \hat{\theta}_i = \theta_i, \quad i = 1, \ldots, q. \tag{2}$$
If this relation does not hold in any neighbourhood of $\theta$, $\theta_i$ is structurally unidentifiable (SU). If all model parameters are SLI, the model is SLI too. Otherwise, the model is SU.
Similarly, the observability of a state can be defined as:
Definition 2.
Given an output vector $y(t)$ and a known input vector $u(t)$, the state $x_i(\tau)$ is observable if it can be determined from $y(t)$ and $u(t)$ in the interval $t_0 \le \tau \le t \le t_f$, where $t_f$ is a finite time. Otherwise, it is unobservable.
A model is observable if all its states are observable. The same definition applies to the vector of unknown inputs $w(t)$, which is then called reconstructible rather than observable.
The term Full Input-State-Parameter Observability (FISPO) has been recently proposed [8] to refer to a model that is observable, identifiable and reconstructible. Its formal definition is given by:
Definition 3.
Given the model (1), the vector of unknown quantities is $z(t) = [x(t), \theta, w(t)] \in \mathbb{R}^{m+q+m_w}$. Denote each component of $z$ at time $\tau$ by $z_i(\tau)$, with $t_0 \le \tau \le t \le t_f$ for a finite $t_f$. The model $M_{NL}$ is FISPO if the following condition holds:
$$\hat{z}(\tau) \in N(z(\tau)) \ \text{and} \ y(t, \hat{z}(\tau)) = y(t, z(\tau)) \ \Rightarrow \ \hat{z}_i(\tau) = z_i(\tau). \tag{3}$$
Observability and identifiability can be studied jointly if the parameters are considered as constant states, so that the observability of these states is equivalent to local structural identifiability. The augmented state vector and its dynamics are then [18]:
$$\tilde{x}(t) = \begin{pmatrix} x(t) \\ \theta \\ w(t) \end{pmatrix}, \qquad \dot{\tilde{x}}(t) = \begin{pmatrix} f(\tilde{x}(t), u(t)) \\ 0 \\ \dot{w}(t) \end{pmatrix}. \tag{4}$$
In this way, the model (1) is transformed into:
$$M_{NL} := \begin{cases} \dot{\tilde{x}}(t) = f(\tilde{x}(t), u(t)), \\ y(t) = g(\tilde{x}(t), u(t)). \end{cases} \tag{5}$$
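To determine the observability of a model it is necessary to calculate the nonlinear observability matrix $O_{NL}$ using Lie derivatives. In the case of time-dependent inputs, i.e., $u(t)$, Lie derivatives are computed as follows:
Definition 4.
The first order Lie derivative of $g(\tilde{x})$ with respect to $f(\tilde{x})$ is:
$$L_f g(\tilde{x}) = \frac{\partial g(\tilde{x})}{\partial \tilde{x}} f(\tilde{x}, u) + \sum_{j=0}^{\infty} \frac{\partial g(\tilde{x})}{\partial u^{(j)}} u^{(j+1)}, \tag{6}$$
where $u^{(j)}$ is the $j$th derivative of $u$.
For higher orders, $i \ge 2$, the calculation is done through a recursive procedure:
$$L_f^i g(\tilde{x}) = \frac{\partial L_f^{i-1} g(\tilde{x})}{\partial \tilde{x}} f(\tilde{x}, u) + \sum_{j=0}^{\infty} \frac{\partial L_f^{i-1} g(\tilde{x})}{\partial u^{(j)}} u^{(j+1)}. \tag{7}$$
The observability-identifiability matrix of the previous model is defined as:
$$O_I(\tilde{x}, u) = \begin{pmatrix} \frac{\partial}{\partial \tilde{x}} g(\tilde{x}, u) \\ \frac{\partial}{\partial \tilde{x}} \left( L_f g(\tilde{x}, u) \right) \\ \frac{\partial}{\partial \tilde{x}} \left( L_f^2 g(\tilde{x}, u) \right) \\ \vdots \\ \frac{\partial}{\partial \tilde{x}} \left( L_f^{n_{\tilde{x}} - 1} g(\tilde{x}, u) \right) \end{pmatrix}. \tag{8}$$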
Theorem 1.
Nonlinear Observability-Identifiability Condition (OIC). If the model (5) satisfies $\operatorname{rank}(O_I(\tilde{x}_0, u)) = m + q + m_w$, with $O_I(\tilde{x}_0, u)$ given by (8), where $\tilde{x}_0$ is a point in the augmented state space, then the model is locally observable and locally structurally identifiable in a neighbourhood $N(\tilde{x}_0)$ of $\tilde{x}_0$ [18].
If the OIC is not fulfilled, there is at least one unobservable (respectively, unidentifiable) state (respectively, parameter). In that case, full observability may sometimes be achieved by measuring more variables or functions. However, sometimes it is not possible to perform more measurements due to the characteristics of the experiments.
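As a minimal worked illustration of the OIC, consider a toy model chosen here purely for exposition (it is not one of the case studies of this paper): $\dot{x}_1 = -\theta x_1$, $y = s x_1$, with unknown parameters $\theta$ and $s$. The augmented state is $\tilde{x} = (x_1, \theta, s)$, and the output and its first two Lie derivatives are $g = s x_1$, $L_f g = -\theta s x_1$ and $L_f^2 g = \theta^2 s x_1$, so that

$$O_I(\tilde{x}) = \begin{pmatrix} s & 0 & x_1 \\ -\theta s & -s x_1 & -\theta x_1 \\ \theta^2 s & 2 \theta s x_1 & \theta^2 x_1 \end{pmatrix}.$$

The first and third columns are both multiples of $(1, -\theta, \theta^2)^T$, hence $\operatorname{rank}(O_I) = 2 < 3$ and the OIC is not fulfilled. The rank deficiency reflects the fact that $s$ and $x_1$ enter the output only through their product, which is the kind of structure that a scaling symmetry captures.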

2.2. Lie Symmetries and Structural Unidentifiability

Yates et al. [12] showed that the existence of Lie symmetries entails the existence of similarity transformations, and therefore denotes lack of structural identifiability [13,14]. Similarity transformations are transformations of parameters and variables that leave the output(s) unchanged.
In the language of Lie theory, similarity transformations are one-parameter Lie group morphisms that map solutions of the differential equations onto other solutions, expressed in terms of the state variables. There are infinitely many ways to represent this morphism; however, the representation is unique once the independent variables are fixed. This uniqueness of representation is a key property for the purpose of implementing the algorithm.
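The expression of the one-parameter Lie group of transformations is:
$$x^* = X(x; \varepsilon). \tag{9}$$
Expanding the above expression in some neighbourhood of $\varepsilon = 0$:
$$x^* = x + \varepsilon \left. \frac{\partial X(x; \varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = 0} + \frac{1}{2} \varepsilon^2 \left. \frac{\partial^2 X(x; \varepsilon)}{\partial \varepsilon^2} \right|_{\varepsilon = 0} + \cdots = x + \varepsilon \left. \frac{\partial X(x; \varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = 0} + O(\varepsilon^2), \tag{10}$$
where
$$\eta(x) = \left. \frac{\partial X(x; \varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = 0} \tag{11}$$
is the infinitesimal of (9) and $x + \varepsilon \eta(x)$ is the infinitesimal transformation of the Lie group of transformations.
Definition 5
([9]). The infinitesimal generator of the one-parameter Lie group of transformations is the differential operator:
$$X = X(x) = \eta(x) \cdot \nabla = \sum_{i=1}^{n} \eta_i(x) \frac{\partial}{\partial x_i}, \tag{12}$$
where $\nabla$ denotes the gradient
$$\nabla = \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \ldots, \frac{\partial}{\partial x_n} \right).$$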
Theorem 2.
First Fundamental Theorem of Lie [9]. Consider the initial value problem (IVP) for a system of first-order ODEs:
$$\frac{d x^*}{d \tau} = \eta(x^*), \qquad x^* = x \ \text{when} \ \tau = 0.$$
The Lie group of transformations (9) is equivalent to the solution of the above IVP with the parametrization $\tau(\varepsilon)$:
$$\tau(\varepsilon) = \int_0^{\varepsilon} \Gamma(\varepsilon') \, d\varepsilon',$$
where
$$\Gamma(\varepsilon) = \left. \frac{\partial \phi(a, b)}{\partial b} \right|_{(a, b) = (\varepsilon^{-1}, \varepsilon)}, \qquad \Gamma(0) = 1.$$
Here $\phi(a, b)$ is the law of composition and $\varepsilon^{-1}$ denotes the inverse element to $\varepsilon$.
The above theorem shows how infinitesimal transformations contain the essential information that determines the one-parameter Lie group of transformations. From Theorem 2 and without loss of generality, it is assumed that the Lie group of transformations has as its law of composition $\phi(a, b) = a + b$, with $\varepsilon^{-1} = -\varepsilon$ and $\Gamma(\varepsilon) = 1$ [9]. Therefore, the one-parameter Lie group of transformations (9) is rewritten, in terms of its infinitesimals $\eta(x)$, as:
$$\frac{d x^*}{d \varepsilon} = \eta(x^*), \qquad x^* = x \ \text{at} \ \varepsilon = 0. \tag{16}$$
The above expression defines an initial value problem (IVP) for the Lie group of transformations in terms of its infinitesimals.
The exponential parametrization of the Lie group around the identity is:
Theorem 3
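([9]). The one-parameter Lie group of transformations (9) is equivalent to:
$$x^* = \exp[\varepsilon X] \, x = x + \varepsilon X x + \frac{1}{2} \varepsilon^2 X^2 x + \cdots = \left( 1 + \varepsilon X + \frac{1}{2} \varepsilon^2 X^2 + \cdots \right) x = \sum_{k=0}^{\infty} \frac{\varepsilon^k}{k!} X^k x,$$
where $X$ is given by (12) and $X^k = X X^{k-1}$, $k = 1, 2, \ldots$, with $X^0 x = x$.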
For a more detailed background, we refer the reader to [9,11].
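The Lie series of Theorem 3 can be tried out on single-variable generators with a few lines of code. The following is a minimal sketch in Python using exact rational arithmetic; it is not taken from the STRIKE-GOLDD implementation, and the function names (`apply_generator`, `evaluate`, `lie_series`) are hypothetical. Polynomials are stored as coefficient lists, lowest degree first, and the generator is assumed to have the form $X = \eta(x)\,\partial/\partial x$ with polynomial $\eta$.

```python
from fractions import Fraction
from math import factorial


def apply_generator(poly, eta):
    """Apply X = eta(x) d/dx to a polynomial (coefficient lists, lowest degree first)."""
    deriv = [i * c for i, c in enumerate(poly)][1:]  # d(poly)/dx
    out = [Fraction(0)] * max(len(deriv) + len(eta) - 1, 1)
    for i, a in enumerate(eta):
        for j, b in enumerate(deriv):
            out[i + j] += a * b  # product eta(x) * poly'(x)
    return out


def evaluate(poly, x):
    """Evaluate a coefficient-list polynomial at x."""
    return sum(c * x**i for i, c in enumerate(poly))


def lie_series(eta, x, eps, n_terms):
    """Truncated Lie series: sum_{k < n_terms} eps^k / k! * (X^k x)."""
    term = [Fraction(0), Fraction(1)]  # X^0 x = x
    total = Fraction(0)
    for k in range(n_terms):
        total += Fraction(eps) ** k / factorial(k) * evaluate(term, Fraction(x))
        term = apply_generator(term, eta)  # X^(k+1) x
    return total


# Scaling generator X = x d/dx: the series approaches exp(eps) * x.
scaling = lie_series([0, 1], Fraction(1, 2), Fraction(1, 2), 30)
# Generator X = x^2 d/dx: the series approaches x / (1 - eps * x) when |eps * x| < 1.
second_order = lie_series([0, 0, 1], Fraction(1, 2), Fraction(1, 2), 30)
```

For $X = x^2\,\partial/\partial x$ one finds $X^k x = k!\,x^{k+1}$, so the series is geometric and only converges for $|\varepsilon x| < 1$; this is one reason a truncated series should be read as a local approximation of the group action.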

2.2.1. Computing Symmetries

Let us consider the same ODE system as in (1). The state vector is augmented with the parameters and unknown inputs, as mentioned earlier:
$$\begin{aligned} \dot{x}_i(t) &= f_i(x(t), u(t)), & i &= 1, \ldots, m, \\ x_i(t) &= \theta_{i-m}, & i &= m+1, \ldots, m+q, \\ x_i(t) &= w_{i-m-q}(t), & i &= m+q+1, \ldots, n, \end{aligned}$$
where $n = m + q + m_w$.
We study three different types of infinitesimals. Taking $d_{max} \in \mathbb{N}$ as the maximum degree of the polynomials and $r_{i,d}$ as the unknown constants to be determined, the expressions of the infinitesimals are:
  • Univariate:
    $$\eta_i(x) = \sum_{d=0}^{d_{max}} r_{i,d} \, x_i^{d}, \quad i = 1, \ldots, n.$$
  • Partially variate:
    $$\begin{aligned} \eta_i(x) &= \sum_{d_i, d_{m+1}, \ldots, d_{m+q} = 0}^{|d| = d_{max}} r_{i,d} \, x_i^{d_i} x_{m+1}^{d_{m+1}} \cdots x_{m+q}^{d_{m+q}}, & i &= 1, \ldots, m, \\ \eta_i(x) &= \sum_{d_{m+1}, \ldots, d_{m+q} = 0}^{|d| = d_{max}} r_{i,d} \, x_{m+1}^{d_{m+1}} \cdots x_{m+q}^{d_{m+q}}, & i &= m+1, \ldots, m+q, \\ \eta_i(x) &= \sum_{d_i, d_{m+1}, \ldots, d_{m+q} = 0}^{|d| = d_{max}} r_{i,d} \, x_i^{d_i} x_{m+1}^{d_{m+1}} \cdots x_{m+q}^{d_{m+q}}, & i &= m+q+1, \ldots, n. \end{aligned}$$
  • Multivariate:
    $$\begin{aligned} \eta_i(x) &= \sum_{d_1, \ldots, d_{m+q} = 0}^{|d| = d_{max}} r_{i,d} \, x_1^{d_1} \cdots x_{m+q}^{d_{m+q}}, & i &= 1, \ldots, m, \\ \eta_i(x) &= \sum_{d_{m+1}, \ldots, d_{m+q} = 0}^{|d| = d_{max}} r_{i,d} \, x_{m+1}^{d_{m+1}} \cdots x_{m+q}^{d_{m+q}}, & i &= m+1, \ldots, m+q, \\ \eta_i(x) &= \sum_{d_1, \ldots, d_n = 0}^{|d| = d_{max}} r_{i,d} \, x_1^{d_1} \cdots x_n^{d_n}, & i &= m+q+1, \ldots, n. \end{aligned}$$
The derivative of the infinitesimal generator, $X'$, is also defined, so that it can act on $\dot{x}(t)$:
$$X' = \sum_{i=1}^{n} \eta_i(x) \frac{\partial}{\partial x_i} + \sum_{i=1}^{n} \eta_i'(x) \frac{\partial}{\partial \dot{x}_i},$$
where
$$\eta_i'(x) = \sum_{j=1}^{n} \dot{x}_j \frac{\partial \eta_i}{\partial x_j}.$$
Using the above formulation for infinitesimal generators, the following explicit criterion for admittance of a Lie group of transformations is obtained:
Theorem 4
([9,10]). The system of ordinary differential equations admits a one-parameter Lie group of transformations defined by the infinitesimal generator (12) if and only if:
$$\begin{aligned} X' \cdot (\dot{x}_k - f_k(x)) &= 0, & k &= 1, \ldots, m, \\ X \cdot (y_l - g_l(x)) &= 0, & l &= 1, \ldots, n. \end{aligned}$$
Applying the previous theorem to the initial system (1), we obtain an explicit criterion:
$$\begin{aligned} \sum_{j=1}^{n} \dot{x}_j \frac{\partial \eta_k}{\partial x_j}(x) - \sum_{i=1}^{n} \eta_i(x) \frac{\partial f_k}{\partial x_i}(x) &= 0, & k &= 1, \ldots, m, \\ \sum_{i=1}^{n} \eta_i(x) \frac{\partial g_l}{\partial x_i}(x) &= 0, & l &= 1, \ldots, n. \end{aligned}$$
An admitted Lie symmetry is a continuous group of transformations such that the observation (observed datum) is unchanged:
$$g(x^*(t), u(t)) = g(x(t), u(t)).$$
That is, the output map must not be modified. The explicit criterion above defines a system of partial differential equations (PDEs) for the infinitesimals. If we consider the rational form of $f_k(x)$ and $g_l(x)$:
$$\dot{x}_k = f_k(x) = \frac{P^k(x)}{Q^k(x)}, \quad k = 1, \ldots, m, \qquad y_l = g_l(x) = \frac{R^l(x)}{S^l(x)}, \quad l = 1, \ldots, n,$$
then, the system of PDEs (24) is converted into a system of ODEs.
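There are different expressions of (24) depending on the infinitesimal generator. In what follows, a subscript $x_i$ on a polynomial denotes its partial derivative with respect to $x_i$:
  • Univariate and partial:
    $$\begin{aligned} P^k Q^k \frac{\partial \eta_k}{\partial x_k} - \sum_{i=1}^{n} \eta_i \left[ P^k_{x_i} Q^k - P^k Q^k_{x_i} \right] &= 0, & k &= 1, \ldots, m, \\ \sum_{i=1}^{n} \eta_i \left[ R^l_{x_i} S^l - R^l S^l_{x_i} \right] &= 0, & l &= 1, \ldots, n. \end{aligned} \tag{26}$$
  • Multivariate:
    $$\begin{aligned} \sum_{j=1}^{m} P^j Q^k \prod_{b \neq j} Q^b \, \frac{\partial \eta_k}{\partial x_j} - \sum_{i=1}^{n} \eta_i \prod_{b \neq k} Q^b \left[ P^k_{x_i} Q^k - P^k Q^k_{x_i} \right] &= 0, & k &= 1, \ldots, m, \\ \sum_{i=1}^{n} \eta_i \left[ R^l_{x_i} S^l - R^l S^l_{x_i} \right] &= 0, & l &= 1, \ldots, n. \end{aligned} \tag{27}$$
Each of these equations can be rearranged by collecting the monomials in the components of $x$. Let $r$ be a vector containing all $r_{i,d}$:
$$\sum_{i_1, \ldots, i_n} c_{i_1, \ldots, i_n}(r) \, x_1^{i_1} \cdots x_n^{i_n} = 0.$$
Since this identity must hold for all $x$, every coefficient $c_{i_1, \ldots, i_n}(r)$ must vanish. The coefficients are linear in $r$, so conditions (26) and (27) can be written in matrix form as:
$$C \cdot r = 0.$$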
The problem of finding symmetries is equivalent to solving a linear system of equations with numeric entries. However, C is a non-square matrix and, in order to find all the solutions, it is necessary to compute its kernel. It is possible that the obtained solutions are not independent of each other, as a result of linear combinations or multiplication by x i ; for this reason, we will only consider solutions that are independent of each other and in their greatest degree of simplification.
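To make the kernel computation concrete, the following is a minimal sketch in Python with exact rational arithmetic. It is not the MATLAB code of the toolbox, and the function name `nullspace` is hypothetical. The matrix used as an example was derived by hand for a toy model chosen here for exposition, $\dot{x}_1 = -\theta x_1$, $y = s x_1$, with the simplified ansatz $\eta_{x_1} = r_1 x_1$, $\eta_{\theta} = r_2 \theta$, $\eta_s = r_3 s$ (constant and quadratic terms omitted). The state condition then reduces to $r_2 \theta x_1 = 0$ and the output condition to $(r_1 + r_3) s x_1 = 0$.

```python
from fractions import Fraction


def nullspace(matrix):
    """Return a basis of the kernel of a (possibly non-square) matrix, using exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    # Gauss-Jordan elimination to reduced row echelon form.
    for c in range(n_cols):
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot = rows[r][c]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    # One basis vector per free (non-pivot) column.
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][free]
        basis.append(vec)
    return basis


# Rows: coefficient of theta*x1 (state condition) and of s*x1 (output condition).
# Columns: r1, r2, r3.
C = [[0, 1, 0],
     [1, 0, 1]]
kernel = nullspace(C)
```

In this toy case the kernel is one-dimensional and spanned by $(r_1, r_2, r_3) = (-1, 0, 1)$, i.e., up to sign the scaling generator $X = x_1\,\partial/\partial x_1 - s\,\partial/\partial s$. Exact fractions are used instead of floating point so that rank decisions do not depend on a numerical tolerance.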
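The next step is to build the expression of $x^*$ from the infinitesimal generators, as in Theorem 3. When the infinitesimal is given by a power of the variable, the exact transformation is known and it is classified as “elementary”. Some examples are:
$$\begin{array}{lll} x_i^* = x_i + \varepsilon, & X = \dfrac{\partial}{\partial x_i} & \text{(translation)} \\ x_i^* = \exp(\varepsilon) \, x_i, & X = x_i \dfrac{\partial}{\partial x_i} & \text{(scaling)} \\ x_i^* = \dfrac{x_i}{1 - \varepsilon x_i}, & X = x_i^2 \dfrac{\partial}{\partial x_i} & \text{(M\"obius)} \\ x_i^* = \dfrac{x_i}{\left[ 1 - (p-1) \varepsilon x_i^{p-1} \right]^{\frac{1}{p-1}}}, & X = x_i^p \dfrac{\partial}{\partial x_i} & \text{(higher order)} \end{array}$$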
The most common types of symmetries in biochemical models are translation and scaling. However, those elementary transformations cover only a part of the possible symmetries that a model can contain. The others must be calculated using Lie series or solving the IVP (16).
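When no closed form is available, the IVP (16) can also be integrated numerically in $\varepsilon$. The sketch below, in Python, is only an illustration under stated assumptions: it uses a classical fourth-order Runge-Kutta scheme, the function names (`integrate_ivp`, `higher_order_closed_form`) are hypothetical, and the infinitesimal is restricted to the single-variable case $\eta(x) = x^p$ so that the result can be compared with the elementary transformations listed above.

```python
def integrate_ivp(eta, x0, eps, steps=2000):
    """Integrate dx*/d(eps) = eta(x*), x*(0) = x0, with classical RK4."""
    h = eps / steps
    x = x0
    for _ in range(steps):
        k1 = eta(x)
        k2 = eta(x + h * k1 / 2)
        k3 = eta(x + h * k2 / 2)
        k4 = eta(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x


def higher_order_closed_form(x, eps, p):
    """Elementary transformation generated by X = x^p d/dx, for p >= 2."""
    return x / (1 - (p - 1) * eps * x ** (p - 1)) ** (1 / (p - 1))


x0, eps = 0.5, 1.0
# p = 2 is the Mobius case x / (1 - eps * x).
numeric_p2 = integrate_ivp(lambda x: x**2, x0, eps)
exact_p2 = higher_order_closed_form(x0, eps, 2)
numeric_p3 = integrate_ivp(lambda x: x**3, x0, eps)
exact_p3 = higher_order_closed_form(x0, eps, 3)
```

The comparison is only meaningful while the closed form is finite, i.e., for $(p-1)\,\varepsilon\,x^{p-1} < 1$; beyond that value of $\varepsilon$ the flow blows up and a fixed-step integrator is not reliable.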
It is possible to maximize the number of elementary transformations given by an infinitesimal generator before applying Lie series. This process starts by searching for all parameter combinations that forbid an elementary transformation, and dividing the initial infinitesimal generator by them. Each new infinitesimal generator provides a number of elementary transformations that can be greater than, lower than or equal to the initial one. Maximizing the total amount, we will obtain an infinitesimal generator that provides the largest possible number of elementary transformations.

2.2.2. Initial Conditions

The definition of a dynamic model may include specific initial conditions (ICS), which can be numeric, parametric (known or unknown) or a combination of both [19,20]. Perturbations in ICS must produce changes in the output $y(t)$ for the model to be observable.
If the symmetries are studied without taking into account the ICS, it may happen that the generators do not fulfil them. It is important to consider only the generators that satisfy the ICS [19]; by including them as output vectors, a symmetry is admitted by the system if and only if:
$$X \cdot \left( x_k(t_0) - x_0(\theta) \right) \big|_{x = x_0(\theta)} = 0, \quad k = 1, \ldots, m,$$
where $x_0(\theta)$ are the ICS (parametric, numerical or a combination of both). Expanding the above equation:
$$\sum_{i=1}^{n} \eta_i \frac{\partial}{\partial x_i} x_k(t_0) \Big|_{x = x_0(\theta)} - \sum_{i=1}^{n} \eta_i \frac{\partial}{\partial x_i} x_0(\theta) \Big|_{x = x_0(\theta)} = 0, \quad k = 1, \ldots, m. \tag{31}$$
Considering the rational form of $x_0(\theta)$:
$$x_0(\theta) = \frac{V^k(x)}{W^k(x)}, \quad k = 1, \ldots, m.$$
Expression (31) is reformulated as follows, using that $\partial x_k / \partial x_i$ is nonzero only for $i = k$:
$$\eta_k(x_0(\theta)) - \sum_{i=1}^{n} \eta_i \, \frac{V^k_{x_i} W^k - V^k W^k_{x_i}}{(W^k)^2} \bigg|_{x = x_0(\theta)} = 0, \quad k = 1, \ldots, m.$$
This new restriction allows us to consider only those symmetries that fulfil the initial conditions, and provides a tool for studying their influence.
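As a small worked illustration of how ICS can break or preserve a symmetry, consider a toy model chosen here for exposition (not one of the case studies): $\dot{x}_1 = -\theta x_1$, $y = s x_1$. Ignoring ICS, it admits the scaling generator

$$X = x_1 \frac{\partial}{\partial x_1} - s \frac{\partial}{\partial s}.$$

If the initial condition is a known number, $x_1(t_0) = c$, its derivatives with respect to all variables vanish and the condition above reduces to

$$\eta_{x_1} \big|_{x_1 = c} = c = 0,$$

which fails for any $c \neq 0$; the generator is therefore discarded. If instead the initial condition is an unknown parameter, $x_1(t_0) = x_1^0$, the condition reads

$$\eta_{x_1} \big|_{x_1 = x_1^0} - \eta_{x_1^0} = x_1^0 - \eta_{x_1^0} = 0,$$

which is satisfied by $\eta_{x_1^0} = x_1^0$. The symmetry then persists in the extended form $X = x_1 \, \partial/\partial x_1 - s \, \partial/\partial s + x_1^0 \, \partial/\partial x_1^0$, i.e., the unknown initial condition scales together with the state.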

2.3. Implementation

An overview of the algorithm described in the preceding subsections is shown in Figure 1. We have made a MATLAB implementation available at https://github.com/GemmaMasFes6/Lie-Symmetries, and as part of a new version (v2.1.6) of the STRIKE-GOLDD toolbox (https://github.com/afvillaverde/strike-goldd_2.1); it will also be included in future releases of said toolbox. Our code represents an addition to the set of existing tools for studying differential equation symmetries, which include symmetryDetection [15] in Python, MinimalOutputSets [19] in Mathematica, and SADE [21] in Maple. Our software is open source and, to the best of our knowledge, it is the first tool of its characteristics available in MATLAB.
The code consists of a main MATLAB script, ‘Lie_Symmetry’, and ten auxiliary functions defined in separate files. Each of these functions performs one of the stages outlined above: calculation of infinitesimal polynomials (univariate, partially variate and multivariate); computation of polynomials for states (depending on the type of polynomial there are two possibilities), observations, and ICS (if they are specified); and obtaining the transformations. The last step in turn incorporates two other functions, corresponding to maximizing the number of elementary transformations and, if necessary, calculating the transformations using Lie series.
The algorithm has a number of options that can be specified by the user: the type of infinitesimal polynomial, its degree, and the number of terms in Lie series (in case it has to be used).
The input of the programme must include the following vectors of symbolic variables, declared through the MATLAB sym command: parameters, states, initial conditions, ODEs, observations, and inputs (known or unknown). For initial conditions, two vectors must be provided: a vector called known_ics with entries equal to 1 for a known initial condition and 0 otherwise, and a vector ics with the values of the known ICS, either numeric or (known or unknown) parametric value.
The output of the programme first reports whether there is any symmetry. If a symmetry exists, the programme prints the infinitesimal generator and the transformations on the screen.

3. Results

To illustrate the application of the method in biological and biomedical modelling, we use it to analyse a set of models taken from the literature. The models are listed in Table 1 and their schematic representations are shown in Figure 2.

3.1. Simple Chemical Reaction

This model represents a bimolecular reaction described by one ODE and one observation [15]:
$$\dot{A} = -2 k A^2, \qquad A_{obs} = \frac{s_1 A}{1 + s_2 A}. \tag{33}$$
Owing to its simplicity, it is used to provide a basic illustration of the methodology. Without considering initial conditions, and using a univariate polynomial of second order, the programme finds two infinitesimal generators:
$$X_1 = A \frac{\partial}{\partial A} - k \frac{\partial}{\partial k} - s_1 \frac{\partial}{\partial s_1} - s_2 \frac{\partial}{\partial s_2}, \qquad X_2 = A^2 \frac{\partial}{\partial A} + \frac{\partial}{\partial s_2}.$$
All the transformations are elementary:
$$\begin{aligned} X_1: &\quad A^* = e^{\varepsilon} A, \quad k^* = e^{-\varepsilon} k, \quad s_1^* = e^{-\varepsilon} s_1, \quad s_2^* = e^{-\varepsilon} s_2, \\ X_2: &\quad A^* = \frac{A}{1 - \varepsilon A}, \quad s_2^* = s_2 + \varepsilon. \end{aligned}$$
Our results coincide with those reported in [15]. It is possible to include ICS in order to study their influence on the model.
Because of its simplicity, this example allows us to check the results manually. Once the transformations are computed, it is easy to see that the second group of transformations satisfies the same ODE. The time derivative of $A^*$ is:
$$\dot{A}^* = \frac{\dot{A}}{(1 - \varepsilon A)^2}.$$
Substituting this expression and $A^*$ into (33), the ODE is still fulfilled: $\dot{A}^* = -2 k A^2 / (1 - \varepsilon A)^2 = -2 k (A^*)^2$.
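The manual check can be complemented by a numerical one. The following minimal sketch in Python is not part of the toolbox; the function names are hypothetical, and the model is taken as $\dot{A} = -2kA^2$, $A_{obs} = s_1 A/(1 + s_2 A)$. It simulates the model with a classical fourth-order Runge-Kutta scheme, once with the original initial condition and parameters and once with the second group of transformations applied ($A^* = A/(1 - \varepsilon A)$, $s_2^* = s_2 + \varepsilon$), and compares the outputs.

```python
def rk4(f, y0, t_end, steps):
    """Integrate the scalar ODE dy/dt = f(y) from t = 0 to t_end with classical RK4."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + h * k1 / 2)
        k3 = f(y + h * k2 / 2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y


def output(A, s1, s2):
    """Observation A_obs = s1 * A / (1 + s2 * A)."""
    return s1 * A / (1 + s2 * A)


def simulate_output(A0, k, s1, s2, t_end=1.0, steps=2000):
    """Output at time t_end for initial condition A0 and parameters k, s1, s2."""
    A_end = rk4(lambda A: -2 * k * A**2, A0, t_end, steps)
    return output(A_end, s1, s2)


def transformed(A0, s2, eps):
    """Second group of transformations: Mobius in A, translation in s2."""
    return A0 / (1 - eps * A0), s2 + eps


A0, k, s1, s2, eps = 1.0, 0.5, 2.0, 0.3, 0.2
y_original = simulate_output(A0, k, s1, s2)
A0_star, s2_star = transformed(A0, s2, eps)
y_transformed = simulate_output(A0_star, k, s1, s2_star)
# The two outputs agree up to integration error, although A0 and s2 differ.
```

Two different pairs $(A(0), s_2)$ thus produce the same observation, which is exactly what lack of identifiability and observability means for this model.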
Below are two screenshots of the results of the observability and identifiability analysis obtained with STRIKE-GOLDD. In the first panel, Figure 3 (Page 1), corresponding to the initial model, all states and parameters are unobservable; in the second one, Figure 3 (Page 2), corresponding to the model with Lie transformations given by the second generator, states and parameters are observable.

3.2. Pharmacokinetic Model

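A pharmacokinetic (PK) model describes the time course of the concentrations of a drug in different compartments, after entering an organism. This model includes one input, four ODEs, and two outputs [22,25]:
$$\begin{aligned} \dot{x}_1 &= u - (k_1 + k_2) x_1, \\ \dot{x}_2 &= k_1 x_1 - (k_3 + k_6 + k_7) x_2 + k_5 x_4, \\ \dot{x}_3 &= k_2 x_1 + k_3 x_2 - k_4 x_3, \\ \dot{x}_4 &= k_6 x_2 - k_5 x_4, \\ x_2^{obs} &= s_2 x_2, \\ x_3^{obs} &= s_3 x_3. \end{aligned}$$
We use partially variate polynomials of second order, without ICS. Maximizing the number of elementary transformations leads to four of them, and the procedure yields the following infinitesimal generator:
$$X = k_1 \left( \frac{\partial}{\partial k_1} - \frac{\partial}{\partial k_2} \right) - \frac{k_3 (k_1 + k_2)}{k_2} \left( \frac{\partial}{\partial k_3} - \frac{\partial}{\partial k_7} \right) - s_2 \frac{\partial}{\partial s_2} + \frac{k_1 s_3}{k_2} \frac{\partial}{\partial s_3} + x_2 \frac{\partial}{\partial x_2} - \frac{k_1 x_3}{k_2} \frac{\partial}{\partial x_3} + x_4 \frac{\partial}{\partial x_4}.$$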
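The formulation of the IVP (16) for this infinitesimal generator, considering only the non-elementary transformations and denoting by a dot the derivative with respect to $\varepsilon$, is:
$$\begin{aligned} \dot{k}_2^* &= -k_1^*, & k_2^*(0) &= k_2, \\ \dot{k}_3^* &= -\frac{k_3^* (k_1^* + k_2^*)}{k_2^*}, & k_3^*(0) &= k_3, \\ \dot{k}_7^* &= \frac{k_3^* (k_1^* + k_2^*)}{k_2^*}, & k_7^*(0) &= k_7, \\ \dot{s}_3^* &= \frac{k_1^* s_3^*}{k_2^*}, & s_3^*(0) &= s_3, \\ \dot{x}_3^* &= -\frac{k_1^* x_3^*}{k_2^*}, & x_3^*(0) &= x_3. \end{aligned}$$
The solution of the ODE system, replacing $k_1^*$ with its transformation $k_1^* = k_1 e^{\varepsilon}$, is:
$$k_2^* = k_1 + k_2 - k_1 e^{\varepsilon}, \tag{36}$$
$$k_3^* = \frac{k_3 e^{-\varepsilon} (k_1 + k_2 - k_1 e^{\varepsilon})}{k_2}, \tag{37}$$
$$k_7^* = k_7 + \frac{k_3 (k_1 + k_2)}{k_2} - \frac{k_3 e^{-\varepsilon} (k_1 + k_2)}{k_2}, \tag{38}$$
$$x_3^* = \frac{x_3 (k_1 + k_2 - k_1 e^{\varepsilon})}{k_2}, \tag{39}$$
$$s_3^* = \frac{k_2 s_3}{k_1 + k_2 - k_1 e^{\varepsilon}}. \tag{40}$$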
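These new transformations coincide with those presented in [25]. Using Lie series with the infinitesimal generator and truncating at fourth order in $\varepsilon$, the new expressions are:
$$x_2^* = x_2 e^{\varepsilon}, \quad x_4^* = x_4 e^{\varepsilon}, \quad k_1^* = k_1 e^{\varepsilon}, \quad s_2^* = s_2 e^{-\varepsilon}, \tag{41}$$
$$x_3^* = x_3 - \varepsilon \frac{k_1 x_3}{k_2} - \varepsilon^2 \frac{k_1 x_3}{2 k_2} - \varepsilon^3 \frac{k_1 x_3}{6 k_2} - \varepsilon^4 \frac{k_1 x_3}{24 k_2}, \tag{42}$$
$$k_2^* = k_2 - \varepsilon k_1 - \varepsilon^2 \frac{k_1}{2} - \varepsilon^3 \frac{k_1}{6} - \varepsilon^4 \frac{k_1}{24}, \tag{43}$$
$$k_3^* = k_3 - \frac{k_3 (k_1 + k_2) \varepsilon}{k_2} + \varepsilon^2 \frac{k_3 (k_1 + k_2)}{2 k_2} - \varepsilon^3 \frac{k_3 (k_1 + k_2)}{6 k_2} + \varepsilon^4 \frac{k_3 (k_1 + k_2)}{24 k_2}, \tag{44}$$
$$k_7^* = k_7 + \frac{k_3 (k_1 + k_2) \varepsilon}{k_2} - \varepsilon^2 \frac{k_3 (k_1 + k_2)}{2 k_2} + \varepsilon^3 \frac{k_3 (k_1 + k_2)}{6 k_2} - \varepsilon^4 \frac{k_3 (k_1 + k_2)}{24 k_2}, \tag{45}$$
$$s_3^* = s_3 + \varepsilon \frac{k_1 s_3}{k_2} + \varepsilon^2 \frac{k_1 s_3 (2 k_1 + k_2)}{2 k_2^2} + \varepsilon^3 \frac{k_1 s_3 (6 k_1^2 + 6 k_1 k_2 + k_2^2)}{6 k_2^3} + \varepsilon^4 \frac{k_1 s_3 (24 k_1^3 + 36 k_1^2 k_2 + 14 k_1 k_2^2 + k_2^3)}{24 k_2^4}. \tag{46}$$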
These expressions appear to differ from the closed form; however, they can be rearranged to recover it. Taking $x_3^*$ as an example:
$$x_3^* = x_3 - \frac{k_1 x_3}{k_2} \left( \varepsilon + \frac{\varepsilon^2}{2} + \frac{\varepsilon^3}{6} + \frac{\varepsilon^4}{24} \right). \tag{47}$$
The last factor of the previous expression consists of the first terms of the following series:
$$\sum_{n=0}^{\infty} \frac{\varepsilon^n}{n!} - 1 = e^{\varepsilon} - 1. \tag{48}$$
Substituting (48) in (47), the transformation of $x_3$ is the same as the one obtained from the IVP:
$$x_3^* = x_3 - \frac{k_1 x_3}{k_2} (e^{\varepsilon} - 1) = \frac{x_3 k_2 - k_1 x_3 (e^{\varepsilon} - 1)}{k_2} = \frac{x_3 \left( k_2 - k_1 (e^{\varepsilon} - 1) \right)}{k_2}.$$
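The most complicated case appears to be $s_3^*$. The expression obtained from the IVP (40) can be rearranged as:
$$s_3^* = \frac{k_2 s_3}{k_2 + k_1 (1 - e^{\varepsilon})} = \frac{s_3}{1 - \frac{k_1}{k_2} (e^{\varepsilon} - 1)} = s_3 \sum_{n=0}^{\infty} \left( \frac{k_1}{k_2} (e^{\varepsilon} - 1) \right)^n. \tag{49}$$
It is only necessary to prove that (46) is the same as (49) to show that both transformations are equal. Equation (46) can be reordered in terms of the powers of $k_1 / k_2$:
$$s_3^* = s_3 \left[ 1 + \frac{k_1}{k_2} \left( \varepsilon + \frac{\varepsilon^2}{2} + \frac{\varepsilon^3}{6} + \frac{\varepsilon^4}{24} \right) + \frac{k_1^2}{k_2^2} \left( \varepsilon^2 + \varepsilon^3 + \frac{7 \varepsilon^4}{12} \right) + \frac{k_1^3}{k_2^3} \left( \varepsilon^3 + \frac{3 \varepsilon^4}{2} \right) + \frac{k_1^4 \varepsilon^4}{k_2^4} \right]. \tag{50}$$
The coefficient of $k_1 / k_2$ is the truncation of $e^{\varepsilon} - 1$, as was shown before in (47) and (48).
The coefficient of the third term must be equal to
$$(e^{\varepsilon} - 1)^2 = e^{2 \varepsilon} + 1 - 2 e^{\varepsilon}.$$
Considering the Taylor series of the previous expression up to fourth order in $\varepsilon$,
$$1 + 2 \varepsilon + 2 \varepsilon^2 + \frac{8 \varepsilon^3}{6} + \frac{16 \varepsilon^4}{24} + 1 - 2 - 2 \varepsilon - \varepsilon^2 - \frac{\varepsilon^3}{3} - \frac{2 \varepsilon^4}{24} = \varepsilon^2 + \varepsilon^3 + \frac{7 \varepsilon^4}{12},$$
we obtain that the final expression is the same as that presented in (50).
The coefficient of $k_1^3 / k_2^3$ is the result of considering the first terms of the Taylor series of $(e^{\varepsilon} - 1)^3 = e^{3 \varepsilon} - 3 e^{2 \varepsilon} + 3 e^{\varepsilon} - 1$:
$$3 \left( 1 + \varepsilon + \frac{\varepsilon^2}{2} + \frac{\varepsilon^3}{6} + \frac{\varepsilon^4}{24} \right) - 3 \left( 1 + 2 \varepsilon + 2 \varepsilon^2 + \frac{8 \varepsilon^3}{6} + \frac{16 \varepsilon^4}{24} \right) + \left( 1 + 3 \varepsilon + \frac{9 \varepsilon^2}{2} + \frac{27 \varepsilon^3}{6} + \frac{81 \varepsilon^4}{24} \right) - 1 = \varepsilon^3 + \frac{36 \varepsilon^4}{24} = \varepsilon^3 + \frac{3 \varepsilon^4}{2}.$$
It is necessary to consider more terms of the Lie series to prove that the coefficient of $k_1^4 / k_2^4$ is $(e^{\varepsilon} - 1)^4$. In any case, regardless of the model under consideration, the transformations obtained with Lie series (that is, the output of the programme) are the expanded form of the solution of the IVP. In order to obtain a good approximation of the closed form (36)–(40), the programme needs to consider sufficient terms of the Lie series. For this case, two terms are sufficient to achieve full observability (i.e., FISPO).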
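The coefficient matching carried out above by hand can also be checked mechanically. The sketch below, in Python with exact fractions, is an illustration only (the function names are hypothetical and it is not part of the toolbox). It expands $\sum_n \left( \frac{k_1}{k_2} (e^{\varepsilon} - 1) \right)^n$ as a power series in $\varepsilon$ truncated at fourth order, for numeric values of $k_1$ and $k_2$, and compares the result with the fourth-order Lie series coefficients of $s_3^*/s_3$.

```python
from fractions import Fraction
from math import factorial

ORDER = 4  # truncation order in eps


def multiply(a, b):
    """Product of two power series in eps, truncated at ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += ai * bj
    return out


def closed_form_coefficients(k1, k2):
    """Coefficients of sum_n (k1/k2 * (exp(eps) - 1))^n up to eps^ORDER."""
    ratio = Fraction(k1, k2)
    exp_minus_one = [Fraction(0)] + [Fraction(1, factorial(n)) for n in range(1, ORDER + 1)]
    total = [Fraction(0)] * (ORDER + 1)
    power = [Fraction(1)] + [Fraction(0)] * ORDER  # (exp(eps) - 1)^0
    # (exp(eps) - 1)^n starts at eps^n, so n <= ORDER is enough.
    for n in range(ORDER + 1):
        for d in range(ORDER + 1):
            total[d] += ratio**n * power[d]
        power = multiply(power, exp_minus_one)
    return total


def lie_series_coefficients(k1, k2):
    """Coefficients of s3*/s3 from the fourth-order Lie series expression."""
    k1, k2 = Fraction(k1), Fraction(k2)
    return [
        Fraction(1),
        k1 / k2,
        k1 * (2 * k1 + k2) / (2 * k2**2),
        k1 * (6 * k1**2 + 6 * k1 * k2 + k2**2) / (6 * k2**3),
        k1 * (24 * k1**3 + 36 * k1**2 * k2 + 14 * k1 * k2**2 + k2**3) / (24 * k2**4),
    ]


same = closed_form_coefficients(2, 3) == lie_series_coefficients(2, 3)
```

Because the comparison uses exact rational numbers, agreement for a few choices of $k_1$ and $k_2$ is a strong consistency check, although it does not replace the symbolic argument.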
This model had been previously analysed in [15,25]. In [15] a different infinitesimal generator was obtained; due to the lack of a method to compute non-elementary transformations, it was not possible to obtain all transformations. In [25] the same infinitesimal generator as reported here was obtained, and transformations were computed using Hermite-Padé polynomials. The resulting transformations in [25] are the same as those obtained from the IVP, as well as those obtained by our programme using Lie series.
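A direct way of checking the closed-form transformations of this model is to verify numerically that they leave the outputs unchanged and map the right-hand side of the ODEs consistently. The following Python sketch does this under stated assumptions: the model is taken as $\dot{x}_1 = u - (k_1 + k_2) x_1$, $\dot{x}_2 = k_1 x_1 - (k_3 + k_6 + k_7) x_2 + k_5 x_4$, $\dot{x}_3 = k_2 x_1 + k_3 x_2 - k_4 x_3$, $\dot{x}_4 = k_6 x_2 - k_5 x_4$, with outputs $s_2 x_2$ and $s_3 x_3$; the function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def transform(p, x, eps):
    """Apply the one-parameter transformation (closed form plus elementary scalings)."""
    k1, k2, k3, k7, s2, s3 = (p[n] for n in ("k1", "k2", "k3", "k7", "s2", "s3"))
    e = math.exp(eps)
    mu = (k1 + k2 - k1 * e) / k2  # scaling factor of x3, equal to k2*/k2
    p_new = dict(
        p,
        k1=k1 * e,
        k2=k1 + k2 - k1 * e,
        k3=k3 * mu / e,
        k7=k7 + k3 * (k1 + k2) / k2 * (1 - 1 / e),
        s2=s2 / e,
        s3=s3 / mu,
    )
    x_new = [x[0], x[1] * e, x[2] * mu, x[3] * e]
    return p_new, x_new, e, mu


def rhs(p, x, u):
    """Right-hand side of the four ODEs."""
    x1, x2, x3, x4 = x
    return [
        u - (p["k1"] + p["k2"]) * x1,
        p["k1"] * x1 - (p["k3"] + p["k6"] + p["k7"]) * x2 + p["k5"] * x4,
        p["k2"] * x1 + p["k3"] * x2 - p["k4"] * x3,
        p["k6"] * x2 - p["k5"] * x4,
    ]


def outputs(p, x):
    """The two measured outputs."""
    return [p["s2"] * x[1], p["s3"] * x[2]]


params = {"k1": 0.4, "k2": 1.0, "k3": 0.7, "k4": 0.2, "k5": 0.5,
          "k6": 0.3, "k7": 0.9, "s2": 1.5, "s3": 2.0}
state, u = [1.0, 0.8, 0.6, 0.4], 1.2
params_star, state_star, e, mu = transform(params, state, 0.3)
f, f_star = rhs(params, state, u), rhs(params_star, state_star, u)
# Invariance requires f_star == [f[0], e * f[1], mu * f[2], e * f[3]] and equal outputs.
```

If each transformed state is a constant multiple of the original one, its time derivative must be the same multiple of the original derivative; this is what the comparison of `f_star` with the scaled entries of `f` tests. The check is only valid while $k_1 + k_2 - k_1 e^{\varepsilon} > 0$.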

3.3. NF- κ B Signalling Pathway

The model studied in this example was described in [23,26]. It represents a cellular signalling pathway found in most animal cells, corresponding to the NF-$\kappa$B transcription factor. It is depicted in Figure 2C, where black arrows indicate exit routes.
$$\begin{aligned} \dot{x}_1 &= k_{11} x_{10} - \left( \frac{k_1 u}{1 + k_0 u} + k_{1p} \right) x_1, \\ \dot{x}_2 &= \left( \frac{k_1 u}{1 + k_0 u} + k_{1p} \right) x_1 - k_2 x_2, \\ \dot{x}_3 &= k_2 x_2 - k_3 x_3, \\ \dot{x}_4 &= k_2 x_2 - k_4 x_4, \\ \dot{x}_5 &= k_3 \rho_{vol} x_3 - k_5 x_5, \\ \dot{x}_6 &= k_5 x_5 - k_{10} x_9 x_6, \\ \dot{x}_7 &= k_6 x_6 - k_7 x_7, \\ \dot{x}_8 &= k_8 x_7 - k_9 x_8, \\ \dot{x}_9 &= k_9 \rho_{vol} x_8 - k_{10} x_9 x_6, \\ \dot{x}_{10} &= k_{10} x_9 x_6 - k_{11} \rho_{vol} x_{10}, \\ y_1^{obs} &= s_1 (x_1 + x_2 + x_3) + I_0^{cyt}, \\ y_2^{obs} &= s_2 (x_{10} + x_5 + x_6) + I_0^{nuc}, \\ y_3^{obs} &= s_3 (x_2 + x_3), \\ y_4^{obs} &= s_4 (x_2 + x_4). \end{aligned}$$
This model is the only one studied with ICS. In this case, they are parametric and include an additional parameter, $x_1^0$, that does not appear in the model equations:
$$\begin{aligned} x_1(0) &= x_1^0, & x_2(0) &= \frac{k_{1p} x_1}{k_2}, & x_3(0) &= \frac{k_{1p} x_1}{k_3}, & x_4(0) &= \frac{k_{1p} x_1}{k_4}, & x_5(0) &= \frac{k_3 \rho_{vol} x_3}{k_5}, \\ x_6(0) &= \frac{k_7 x_7}{k_6}, & x_7(0) &= \frac{k_9 x_8}{k_8}, & x_8(0) &= \frac{k_3 x_3}{k_9}, & x_9(0) &= \frac{k_5 x_5}{x_6 k_{10}}, & x_{10}(0) &= \frac{k_{1p} x_1}{k_{11}}. \end{aligned}$$
Using a second order univariate polynomial, the following transformations were found:
  • A scaling symmetry was found for the known input function and two parameters:
    $$u^* = u e^{\varepsilon}; \quad k_i^* = k_i e^{-\varepsilon}, \ i = 0, 1.$$
  • The second symmetry also involves the input function and one parameter; in this case, a Möbius and a translation symmetry, respectively:
    $$u^* = \frac{u}{1 - \varepsilon u}; \quad k_0^* = k_0 + \varepsilon.$$
  • Another scaling symmetry involves one state and two parameters:
    $$x_7^* = x_7 e^{\varepsilon}; \quad k_6^* = k_6 e^{\varepsilon}; \quad k_8^* = k_8 e^{-\varepsilon}.$$
  • One scaling type symmetry is admitted using the parameter $\rho_{vol}$. All the nucleus states, as well as four parameters, take part in the symmetry:
    $$x_i^* = x_i e^{\varepsilon}, \ i = 5, 6, 9, 10; \quad s_2^* = s_2 e^{-\varepsilon};$$
    $$k_i^* = k_i e^{-\varepsilon}, \ i = 6, 10, 11; \quad \rho_{vol}^* = \rho_{vol} e^{\varepsilon}.$$
  • The last symmetry is the only one that involves the initial condition parameter, $x_1^0$. All of the states have a scaling type symmetry, compensated by the scaling of $s_i$ and $k_{10}$:
    $$x_i^* = x_i e^{\varepsilon}, \ i = 1, \ldots, 10; \quad s_i^* = s_i e^{-\varepsilon}, \ i = 1, \ldots, 4;$$
    $$k_{10}^* = k_{10} e^{-\varepsilon}; \quad x_1^{0*} = x_1^0 e^{\varepsilon}.$$
All of the symmetries are elementary transformations and it was not necessary to use Lie series. If ICS had not been considered, the symmetries would be elementary too.
This model was studied in [15]. The results of the first four transformations presented above coincide with those found in [15]; however, the last transformation includes the ICS parameter, unlike in the aforementioned article.
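The two input-related symmetries can be checked numerically, since both must leave the stimulus term $k_1 u / (1 + k_0 u)$ unchanged. The following is a minimal sketch in Python rather than the MATLAB code of the toolbox; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def input_term(u, k0, k1):
    """Saturating stimulus term k1*u / (1 + k0*u) of the NF-kB model."""
    return k1 * u / (1.0 + k0 * u)

def scaling(u, k0, k1, eps):
    """Scaling symmetry: u grows by e^eps while k0 and k1 shrink by e^-eps."""
    return u * math.exp(eps), k0 * math.exp(-eps), k1 * math.exp(-eps)

def moebius(u, k0, k1, eps):
    """Moebius map on u combined with a translation of k0 (needs eps*u != 1)."""
    return u / (1.0 - eps * u), k0 + eps, k1

# Arbitrary illustrative values (assumptions, not taken from the article).
u, k0, k1, eps = 0.7, 1.3, 2.1, 0.4

reference = input_term(u, k0, k1)
after_scaling = input_term(*scaling(u, k0, k1, eps))
after_moebius = input_term(*moebius(u, k0, k1, eps))
print(reference, after_scaling, after_moebius)
```

The three printed values should agree up to floating-point round-off, which is what makes the transformed input and parameters indistinguishable from the original ones in the model equations.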

3.4. Glucose-Insulin Regulation

This model describes the regulation of blood glucose and insulin [24]. It has two states (glucose and insulin), one output (a glucose measurement), and a known input (the glucose entering from the digestive system):
$$
\dot{q}_1 = u + p_1 q_1 - p_2 q_2, \qquad \dot{q}_2 = p_3 q_2 + p_4 q_1, \qquad y = \frac{q_1}{V_p}.
$$
Here we analyse its symmetries without considering ICS. The model has two infinitesimal generators, which can be found using multivariate polynomials of second order:
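$$
\begin{aligned}
X &= q_2 \frac{\partial}{\partial q_2} - p_2 \frac{\partial}{\partial p_2} + p_4 \frac{\partial}{\partial p_4}, \\
X &= -q_1 \frac{\partial}{\partial q_1} + \frac{u + p_3 q_1}{p_2} \frac{\partial}{\partial q_2} + p_3 \frac{\partial}{\partial p_1} - p_2 \frac{\partial}{\partial p_2} - p_3 \frac{\partial}{\partial p_3} \\
&\quad + \frac{p_1 p_3 - p_3^2 + p_2 p_4}{p_2} \frac{\partial}{\partial p_4} - V_p \frac{\partial}{\partial V_p}.
\end{aligned}
$$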
The first infinitesimal generator includes only elementary transformations:
$$
q_2^* = q_2\,e^{\varepsilon}, \qquad p_2^* = p_2\,e^{-\varepsilon}, \qquad p_4^* = p_4\,e^{\varepsilon}.
$$
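That this one-parameter group leaves the output unchanged can be illustrated by simulating the model before and after the transformation. The sketch below is a hedged illustration in plain Python; the parameter values, the constant input $u$, the initial state, and the fixed-step Runge-Kutta integrator are all assumptions made for the example, not part of the article.

```python
import math

def simulate_output(q1, q2, p1, p2, p3, p4, Vp, u=1.0, dt=0.01, steps=500):
    """Integrate the glucose-insulin model with classical RK4; return samples of y = q1/Vp."""
    def rhs(a, b):
        # dq1/dt = u + p1*q1 - p2*q2,  dq2/dt = p3*q2 + p4*q1
        return u + p1 * a - p2 * b, p3 * b + p4 * a

    ys = [q1 / Vp]
    for _ in range(steps):
        a1, b1 = rhs(q1, q2)
        a2, b2 = rhs(q1 + 0.5 * dt * a1, q2 + 0.5 * dt * b1)
        a3, b3 = rhs(q1 + 0.5 * dt * a2, q2 + 0.5 * dt * b2)
        a4, b4 = rhs(q1 + dt * a3, q2 + dt * b3)
        q1 += dt * (a1 + 2 * a2 + 2 * a3 + a4) / 6
        q2 += dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6
        ys.append(q1 / Vp)
    return ys

# Hypothetical values chosen only for illustration (not taken from the article).
base = dict(q1=1.0, q2=0.5, p1=-0.5, p2=0.3, p3=-0.8, p4=0.2, Vp=2.0)
eps = 0.7

# Apply the scaling group to q2, p2 and p4; everything else is left untouched.
moved = dict(base)
moved["q2"] = base["q2"] * math.exp(eps)
moved["p2"] = base["p2"] * math.exp(-eps)
moved["p4"] = base["p4"] * math.exp(eps)

y_base = simulate_output(**base)
y_moved = simulate_output(**moved)
max_gap = max(abs(a - b) for a, b in zip(y_base, y_moved))
print(f"largest output difference: {max_gap:.2e}")
```

The two output trajectories should coincide up to round-off even though $q_2$, $p_2$ and $p_4$ differ, so measurements of $y$ alone cannot distinguish between the two parameter sets.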
The second infinitesimal generator has seven transformations, four of which are elementary:
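$$
\begin{aligned}
q_1^* &= q_1\,e^{-\varepsilon}, \qquad p_2^* = p_2\,e^{-\varepsilon}, \qquad p_3^* = p_3\,e^{-\varepsilon}, \qquad V_p^* = V_p\,e^{-\varepsilon}, \\
q_2^* &= \frac{(u - p_3 q_1)\,\varepsilon^4}{24\,p_2} + \frac{(u + p_3 q_1)\,\varepsilon^3}{6\,p_2} + \frac{(u - p_3 q_1)\,\varepsilon^2}{2\,p_2} + \frac{(u + p_3 q_1)\,\varepsilon}{p_2} + q_2, \\
p_1^* &= -\frac{p_3\,\varepsilon^4}{24} + \frac{p_3\,\varepsilon^3}{6} - \frac{p_3\,\varepsilon^2}{2} + p_3\,\varepsilon + p_1, \\
p_4^* &= \frac{(p_3^2 + p_1 p_3 + p_2 p_4)\,\varepsilon^4}{24\,p_2} + \frac{(-p_3^2 + p_1 p_3 + p_2 p_4)\,\varepsilon^3}{6\,p_2} \\
&\quad + \frac{(p_3^2 + p_1 p_3 + p_2 p_4)\,\varepsilon^2}{2\,p_2} + \frac{(-p_3^2 + p_1 p_3 + p_2 p_4)\,\varepsilon}{p_2} + p_4.
\end{aligned}
$$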
Considering two terms of Lie series, the reparameterized model is FISPO, as classified by STRIKE-GOLDD.
The formulation of the IVP for the second infinitesimal generator is:
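$$
\begin{aligned}
\frac{dq_1^*}{d\varepsilon} &= -q_1^*, & q_1^*(0) &= q_1, \\
\frac{dq_2^*}{d\varepsilon} &= \frac{u + p_3^* q_1^*}{p_2^*}, & q_2^*(0) &= q_2, \\
\frac{dp_1^*}{d\varepsilon} &= p_3^*, & p_1^*(0) &= p_1, \\
\frac{dp_2^*}{d\varepsilon} &= -p_2^*, & p_2^*(0) &= p_2, \\
\frac{dp_3^*}{d\varepsilon} &= -p_3^*, & p_3^*(0) &= p_3, \\
\frac{dp_4^*}{d\varepsilon} &= \frac{p_1^* p_3^* - (p_3^*)^2 + p_2^* p_4^*}{p_2^*}, & p_4^*(0) &= p_4, \\
\frac{dV_p^*}{d\varepsilon} &= -V_p^*, & V_p^*(0) &= V_p.
\end{aligned}
$$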
The solution of the ODE system is:
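$$
\begin{aligned}
q_1^* &= q_1\,e^{-\varepsilon}, \\
q_2^* &= q_2 + \frac{u\,(e^{\varepsilon} - 1) + p_3 q_1\,(1 - e^{-\varepsilon})}{p_2}, \\
p_1^* &= p_1 + p_3 - p_3\,e^{-\varepsilon}, \\
p_2^* &= p_2\,e^{-\varepsilon}, \qquad p_3^* = p_3\,e^{-\varepsilon}, \\
p_4^* &= \frac{p_3^2\,e^{-\varepsilon}}{p_2} - \frac{p_3^2 + p_1 p_3}{p_2} + \frac{e^{\varepsilon}\,(p_1 p_3 + p_2 p_4)}{p_2}, \\
V_p^* &= V_p\,e^{-\varepsilon}.
\end{aligned}
$$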
It is possible, using the procedure described for the pharmacokinetic model, to verify that the IVP solutions are the closed form of the solutions obtained through Lie series provided by the programme.
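As a rough numerical counterpart of that verification, the fourth-order Lie series of the three non-elementary components can be compared with the exponential closed forms obtained by integrating the IVP with a constant input $u$. The sketch below is written in plain Python under that assumption; the function names and all numerical values are hypothetical.

```python
import math

def closed_form(q2, p1, p4, q1, p2, p3, u, eps):
    """Closed-form IVP solutions for the three non-elementary components."""
    e_pos, e_neg = math.exp(eps), math.exp(-eps)
    q2_new = q2 + (u * (e_pos - 1.0) + p3 * q1 * (1.0 - e_neg)) / p2
    p1_new = p1 + p3 - p3 * e_neg
    p4_new = (p3**2 * e_neg - (p3**2 + p1 * p3) + e_pos * (p1 * p3 + p2 * p4)) / p2
    return q2_new, p1_new, p4_new

def lie_series4(q2, p1, p4, q1, p2, p3, u, eps):
    """Lie series of the same components, truncated after the fourth-order term."""
    w = [eps, eps**2 / 2, eps**3 / 6, eps**4 / 24]
    odd_q, even_q = (u + p3 * q1) / p2, (u - p3 * q1) / p2
    odd_p4 = (-p3**2 + p1 * p3 + p2 * p4) / p2
    even_p4 = (p3**2 + p1 * p3 + p2 * p4) / p2
    q2_new = q2 + w[0] * odd_q + w[1] * even_q + w[2] * odd_q + w[3] * even_q
    p1_new = p1 + p3 * (w[0] - w[1] + w[2] - w[3])
    p4_new = p4 + w[0] * odd_p4 + w[1] * even_p4 + w[2] * odd_p4 + w[3] * even_p4
    return q2_new, p1_new, p4_new

# Hypothetical values chosen only for illustration (not taken from the article).
values = dict(q2=0.5, p1=-0.5, p4=0.2, q1=1.0, p2=0.3, p3=-0.8, u=1.0)

errors = {}
for eps in (0.1, 0.05):
    exact = closed_form(eps=eps, **values)
    approx = lie_series4(eps=eps, **values)
    errors[eps] = max(abs(a - b) for a, b in zip(exact, approx))
    print(eps, errors[eps])
```

Because the first neglected term is of fifth order in $\varepsilon$, halving $\varepsilon$ should reduce the discrepancy by roughly a factor of 32, which is a quick sanity check that the series and the closed forms describe the same transformation.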
This model was proposed by Bolie [24] and its structural identifiability was analysed in [27] for the first time. However, its symmetries had not been studied until now.

4. Discussion

This article has addressed the relationship between non-observability and Lie algebra. Its main contribution is a computational method that searches for infinitesimal transformations in models composed of rational functions, in order to undo the symmetries that these may present. The procedure is based on expressing each transformation admitted by the ODE system according to its infinitesimal generator in polynomial form. In this way, the search for symmetries is equivalent to solving a system of linear equations, whose solution yields a transformation of the parameters that makes the model observable while leaving the observations invariant.
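To illustrate how the search reduces to linear algebra, consider the glucose-insulin model of Section 3.4 with a deliberately restricted ansatz. The derivation below is a sketch that assumes the standard infinitesimal criterion for a system $\dot{x} = f(x, p, u)$, $y = h(x, p)$, namely $\frac{d}{dt}\eta_{x_i} = X(f_i)$ and $X(h) = 0$; the coefficients $a$, $b$, $c$ are introduced here only for illustration, and the actual implementation uses more general polynomial ansätze.

$$
\begin{aligned}
&\text{Ansatz:} && \eta_{q_2} = a\,q_2, \quad \eta_{p_2} = b\,p_2, \quad \eta_{p_4} = c\,p_4, \quad \text{all other infinitesimals zero}. \\
&\text{Output:} && X\!\left(\tfrac{q_1}{V_p}\right) = 0 \quad \text{(satisfied trivially)}. \\
&\text{Equation for } q_1\text{:} && 0 = X(u + p_1 q_1 - p_2 q_2) = -(a + b)\,p_2 q_2 \;\Rightarrow\; a + b = 0. \\
&\text{Equation for } q_2\text{:} && a\,(p_3 q_2 + p_4 q_1) = X(p_3 q_2 + p_4 q_1) = a\,p_3 q_2 + c\,p_4 q_1 \;\Rightarrow\; a - c = 0. \\
&\text{Linear system:} && \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & -1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0 \;\Rightarrow\; (a, b, c) = a\,(1, -1, 1).
\end{aligned}
$$

The one-dimensional null space corresponds to $X = q_2 \frac{\partial}{\partial q_2} - p_2 \frac{\partial}{\partial p_2} + p_4 \frac{\partial}{\partial p_4}$, that is, the first generator reported for this model.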
Our method builds on previous work [12,19,25], and especially on the procedure presented by Merkt et al. [15], with the addition of two features. The first one is the a priori maximization of the number of explicit transformations that can be obtained from the infinitesimal generator. The second one is the calculation of non-elementary transformations by means of Lie series. Increasing the number of explicit transformations is beneficial for two reasons: it reduces the number of Lie series terms that must be considered, and it simplifies the calculation of the IVP solutions, whose complexity decreases as the number of explicit transformations grows. The pharmacokinetic model (PK) analysed here illustrates this point: without the maximization of explicit transformations, MATLAB’s symbolic math toolbox did not manage to solve the IVP.
The algorithm makes it possible to study the influence of the initial conditions on the model. The type of ICS (parametric, numeric or both) and the states that incorporate them may affect the number and type of symmetries of the models, changing them from explicit to non-elementary transformations and reducing the number of infinitesimal generators.
We have implemented the method as a MATLAB programme that automates both the search for symmetries and the reconstruction of the model from the infinitesimal generators found. The programme has been integrated in the STRIKE-GOLDD toolbox for observability and identifiability analysis. The software has been tested with four previously published biomedical models, one of which—Bolie’s glucose-insulin regulation model—had not been tested for symmetries before. In the other cases our diagnoses mostly agree with those previously reported in the literature. An exception is the NF- κ B model, for which we found an infinitesimal generator that includes the parameter introduced by the initial conditions and that was not found in a previous analysis [15]. We observed another difference between our software and the one provided with [15]: when analysing the chemical reaction (CR) and pharmacokinetic (PK) models, the generators obtained with our code remained the same when varying the type of polynomial and degree; in contrast, the generators obtained with the programme of [15] changed when using partially varied and multivariate polynomials of order three or higher. These discrepancies may be due to implementation issues.
Our symmetry-detecting algorithm can be directly used to analyse structural identifiability and observability, providing an alternative to the OIC-checking algorithm already included in STRIKE-GOLDD for that purpose. More importantly, however, the new code provides additional information about the relationships between model variables that cause loss of identifiability and/or observability. These insights can be exploited in two ways: (i) by fixing one or more parameters involved in a symmetry, in order to render the remaining ones identifiable, and (ii) by using the symmetry-breaking transformations to reformulate the model, yielding a modified model that is identifiable and observable. To facilitate the latter procedure, our programme implements the semi-automatic transformation of a non-observable (respectively, non-identifiable) model into an observable (respectively, identifiable) one. It should be noted that, while this transformation may render a model fully observable, it also modifies the expressions of the variables involved in its equations, which lose their original mechanistic meaning. Thus, while the results of the procedure can offer valuable insight into the model structure, they should be applied with care for the purpose of model reformulation.
Our programme has some known limitations. First, while it considers high order generators and it can uncover a wide range of possible symmetries, it lacks procedures for determining a priori the type and total number of symmetries present in a model. Second, it does not provide a bound on the number of terms of the Lie series needed to obtain the infinitesimal transformations, when these are not given by explicit transformations. To the best of our knowledge, these limitations are shared with other existing methodologies. The possibility of overcoming them will be considered in future work.

Author Contributions

Conceptualization, A.F.V.; methodology, G.M.; software, G.M.; investigation, G.M. and A.F.V.; writing—original draft preparation, G.M. and A.F.V.; writing—review and editing, G.M. and A.F.V.; supervision, A.F.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Spanish Ministry of Science, Innovation and Universities through the project SYNBIOCONTROL (ref. DPI2017-82896-C2-2-R).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
CR: Chemical reaction
FISPO: Full input, state, and parameter observability
ICS: Initial conditions
IVP: Initial value problem
ODE: Ordinary differential equation
OIC: Observability-identifiability condition
PDE: Partial differential equation
PK: Pharmacokinetic
SLI: Structurally locally identifiable
SU: Structurally unidentifiable

References

  1. DiStefano, J., III. Dynamic Systems Biology Modeling and Simulation; Academic Press: Amsterdam, The Netherlands, 2015.
  2. Sontag, E.D. Some new directions in control theory inspired by systems biology. IET Syst. Biol. 2004, 1, 9–18.
  3. Åström, K.J.; Kumar, P.R. Control: A perspective. Automatica 2014, 50, 3–43.
  4. Bellman, R.; Åström, K.J. On structural identifiability. Math. Biosci. 1970, 7, 329–339.
  5. Miao, H.; Xia, X.; Perelson, A.S.; Wu, H. On identifiability of nonlinear ODE models and applications in viral dynamics. SIAM Rev. 2011, 53, 3–39.
  6. Chiş, O.T.; Banga, J.R.; Balsa-Canto, E. Structural identifiability of systems biology models: A critical comparison of methods. PLoS ONE 2011, 6, e27755.
  7. Raue, A.; Karlsson, J.; Saccomani, M.P.; Jirstrand, M.; Timmer, J. Comparison of approaches for parameter identifiability analysis of biological systems. Bioinformatics 2014, 30, 1440–1448.
  8. Villaverde, A.F. Observability and Structural Identifiability of Nonlinear Biological Systems. Complexity 2019, 2019, 8497093.
  9. Bluman, G.; Anco, S. Symmetry and Integration Methods for Differential Equations; Springer Science & Business Media: New York, NY, USA, 2008.
  10. Oliveri, F. Lie symmetries of differential equations: Classical results and recent contributions. Symmetry 2010, 2, 658–706.
  11. Arrigo, D.J. Symmetry Analysis of Differential Equations: An Introduction; John Wiley & Sons: New York, NY, USA, 2015.
  12. Yates, J.W.; Evans, N.D.; Chappell, M.J. Structural identifiability analysis via symmetries of differential equations. Automatica 2009, 45, 2585–2591.
  13. Vajda, S.; Godfrey, K.R.; Rabitz, H. Similarity transformation approach to identifiability analysis of nonlinear compartmental models. Math. Biosci. 1989, 93, 217–248.
  14. Evans, N.D.; Chapman, M.J.; Chappell, M.J.; Godfrey, K.R. Identifiability of uncontrolled nonlinear rational systems. Automatica 2002, 38, 1799–1805.
  15. Merkt, B.; Timmer, J.; Kaschek, D. Higher-order Lie symmetries in identifiability and predictability analysis of dynamic models. Phys. Rev. E 2015, 92, 012920.
  16. Villaverde, A.F.; Barreiro, A.; Papachristodoulou, A. Structural identifiability of dynamic systems biology models. PLoS Comput. Biol. 2016, 12, e1005153.
  17. Villaverde, A.F.; Evans, N.D.; Chappell, M.J.; Banga, J.R. Input-dependent structural identifiability of nonlinear systems. IEEE Control Syst. Lett. 2019, 3, 272–277.
  18. Villaverde, A.F.; Tsiantis, N.; Banga, J.R. Full observability and estimation of unknown inputs, states, and parameters of nonlinear biological models. J. R. Soc. Interface 2019. in review.
  19. Anguelova, M.; Karlsson, J.; Jirstrand, M. Minimal output sets for identifiability. Math. Biosci. 2012, 239, 139–153.
  20. Saccomani, M.P.; Audoly, S.; D’Angiò, L. Parameter identifiability of nonlinear systems: The role of initial conditions. Automatica 2003, 39, 619–632.
  21. Rocha Filho, T.M.; Figueiredo, A. [SADE] a Maple package for the symmetry analysis of differential equations. Comput. Phys. Commun. 2011, 182, 467–476.
  22. Raksanyi, A. Utilisation du calcul formel pour l’étude des systèmes d’équations polynomiales applications en modélisation. Ph.D. Thesis, Université de Paris-Dauphine, Paris, France, 1986.
  23. Lipniacki, T.; Paszek, P.; Brasier, A.R.; Luxon, B.; Kimmel, M. Mathematical model of NF-kB regulatory module. J. Theor. Biol. 2004, 228, 195–215.
  24. Bolie, V.W. Coefficients of normal blood glucose regulation. J. Appl. Physiol. 1961, 16, 783–788.
  25. Sedoglavic, A. A probabilistic algorithm to test local algebraic observability in polynomial time. J. Symb. Comput. 2002, 33, 735–755.
  26. Cheong, R.; Hoffmann, A.; Levchenko, A. Understanding NF-kB signaling via mathematical modeling. Mol. Syst. Biol. 2008, 4, 192.
  27. Cobelli, C.; DiStefano, J. Parameter and structural identifiability concepts and ambiguities: A critical review and analysis. Am. J. Physiol.-Regulat. Integr. Comp. Physiol. 1980, 239, R7–R24.
Figure 1. Diagram of the algorithm.
Figure 2. Diagrams of the models analysed in this article. (A) Simple chemical reaction. (B) Pharmacokinetic model. (C) NF-κB signalling pathway. (D) Glucose-insulin regulation system.
Figure 3. Output of STRIKE-GOLDD for the initial model (Page 1) and the model with one-parameter Lie transformations (Page 2).
Table 1. List of models analysed in this paper and summary of their features.

| Model Name (and Acronym) | Reference | States | Parameters | Outputs |
|---|---|---|---|---|
| Simple chemical reaction (CR) | [15] | $A$ | $k, s_1, s_2$ | $A^{obs}$ |
| Pharmacokinetic model (PK) | [22] | $x_1, x_2, x_3, x_4$ | $k_1, k_2, k_3, k_5, k_6, k_7, s_2, s_3$ | $x_2^{obs}, x_3^{obs}$ |
| NF-κB signalling pathway (NFKB) | [23] | $x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, x_{10}$ | $k_0, k_1, k_{1p}, k_2, k_3, k_4, k_5, k_6, k_7, k_8, k_9, k_{10}, k_{11}, s_1, s_2, s_3, s_4, \rho_{vol}, I_0^{cyt}, I_0^{nuc}$ | $y_1, y_2, y_3, y_4$ |
| Glucose-insulin regulation (Bolie) | [24] | $q_1, q_2$ | $p_1, p_2, p_3, p_4, V_p$ | $y$ |