Modeling Delayed Dynamics in Biological Regulatory Networks from Time Series Data †

Background: The modeling of Biological Regulatory Networks (BRNs) relies on background knowledge, deriving either from literature and/or the analysis of biological observations. However, with the development of high-throughput data, there is a growing need for methods that automatically generate admissible models. Methods: Our research aim is to provide a logical approach to infer BRNs based on given time series data and known influences among genes. Results: We propose a new methodology for models expressed through a timed extension of the automata networks (well suited for biological systems). The main purpose is to have a resulting network as consistent as possible with the observed datasets. Conclusion: The originality of our work is three-fold: (i) identifying the sign of the interaction; (ii) the direct integration of quantitative time delays in the learning approach; and (iii) the identification of the qualitative discrete levels that lead to the systems’ dynamics. We show the benefits of such an automatic approach on dynamical biological models, the DREAM4(in silico) and DREAM8 (breast cancer) datasets, popular reverse-engineering challenges, in order to discuss the precision and the computational performances of our modeling method.


Introduction
With both the spread of numerical tools in every part of daily life and the development of NGS methods (Next Generation Sequencing methods), like DNA microarrays in biology, a very large amount of time series data is now produced [1][2][3].This means that the produced data from the experiments on biological systems grows drastically.The newly-produced data, as long as the associated noise does not raise an issue with regard to the precision and relevance of the corresponding information, can give us some new insights into the behavior of a system.This justifies the urge to design efficient methods for inference.
Reverse engineering of gene regulatory networks from expression data have been handled by various approaches [4][5][6][7][8].Most of them are only static.However, other researchers are rather focusing on incorporating temporal aspects in inference algorithms.The relevance of these various algorithms has been recently assessed in [9].The authors of [10] tackled the inference problem of time-delayed gene regulatory networks through Bayesian networks, and in [11], the authors infer a temporal Boolean network.As this is a complex problem, in [12], the authors propose a time-window-extension technique based on time series segmentation in different successive phases.These approaches take gene expression data into account as the input and generate the associated regulations.However, the discrete approaches that simplify this problem by abstractions need to determine the relevant thresholds of each gene to define its active and inactive state.Various approaches have been designed to tackle the discretization problem.We can cite for example [12], in which the authors have proposed an alternative methodology that considers not a concentration level, but the way the concentration changes (in other words: the derivative of the function giving the concentration w.r.t.time) in the presence/absence of one regulator.On the other hand, the major problem for modeling lies in the quality of the expression data.Indeed, noisy data may be the main origin of the errors in the inference process.Thus, the pre-processing of the biological data is crucial for the pertinence of the inferred relations between components.In our work, we assume that the input data are pre-processed into reliable discretized time series data and focus on the dynamical inference problematics.
In this paper, we aim to provide a logical approach to tackle the learning of qualitative models of biological dynamical systems, like gene regulatory networks.In our context, we assume the set of interacting components as fixed, and we consider the learning of the interactions between those components.As in [13], in which the authors targeted the completion of stationary Boolean networks, we suppose that the topology of the network is given, providing us the influences among each gene as the background knowledge.From time series data of the evolution of the system, given its topology, we learn the dynamics of the system.The main originality of our work is that we address this problem in a timed setting, with quantitative delays potentially occurring between the moment an interaction is activated and the moment its effect is visible.
During the past decade, there has been a growing interest for the hybrid modeling of gene regulatory networks with delays.These hybrid approaches consider various modeling frameworks.In [14], the authors hybridize Petri nets: the advantage of the hybrid approach with regard to discrete modeling lies in the possibility of capturing biological factors, e.g., the delay for the transcription of RNA polymerase.The merits of other hybrid formalisms in biology have been studied, for instance timed automata [15], hybrid automata [16], the hybrid model of a neural oscillator [17] and Boolean representation [18,19].In [20], the authors also propose algorithms to learn the delayed dynamics of systems from time series data.However, they focus only on synchronous dynamics, and the delays they consider are different from the duration of the reaction that we model here.Other works based on approximate bisimulations [21] present a procedure to compare neural structures among them, which is verified on continuous time recurrent neural networks.In addition, an application of the framework of approximate bisimulations to an approximation of nonlinear systems was proposed in [22].Here, the authors use an over-approximation of the observational distance between nondeterministic nonlinear systems.Finally, in [23], the authors investigate a direct extension of the discrete René Thomasmodeling approach by introducing quantitative delays.These delays represent the compulsory time for a gene to turn from a discrete qualitative level to the next (or previous) one.They exhibit the advantage of such a framework for the analysis of mucus production in the bacterium Pseudomonas aeruginosa.The approach we propose in this paper inherits from this idea that some models need to capture these timing features.
Similar concerns are presents in the work of [24], where the authors model the dynamics of the system using the S[B] paradigm [25] via delayed discrete automaton and persistent entropy.However, where they focus on the global dynamics of the system, we aim to capture the local interactions of each components.Their methods can reproduce the global evolution of the system among steady states, but what we target is the precise evolution of the system every fixed amount of time.
The first theoretical steps behind our method have been introduced in [26].In the current paper, we have deepened our work to widen its applicability: after introducing a range of filters to curate the models resulting from our algorithms, we show its efficiency on new benchmarks coming from DREAM Challenges.
We briefly introduce Automata Networks (AN) in Section 2, and all theoretical and practical notions are then settled to introduce our timed extension of AN in Section 3.Then, in Section 4, we present the related learning algorithm.We propose in Section 5 new refinement methods applied to the generated model.We demonstrate the performance of our algorithm by a running case study example in Section 6.Then, we show in Section 7 the implementation of the modeling method and of all filters in Answer Set Programming (ASP).Finally, we illustrate the merits of our approach in Section 8 by applying it on popular reverse-engineering challenges: the DREAM4 in silico dataset, then on the real data of the breast cancer network from DREAM8 dataset.Additionally, we summarize our contribution in Section 9 and give some perspectives for future works.

Automata Networks
We present in this section the definition and the semantics of automata networks.The enrichment of the automata networks with delays and the corresponding new semantics are presented in Section 3.
Definition 1 introduces the Automata Network (AN) [27][28][29] as a model of finite-state machines having transitions between their local state conditioned by the state of other automata in the network.A local state of an automaton is noted by a i , where a is the automaton identifier, and i is the expression level within a.At any time, each automaton has exactly one active local state, and the set of active local states is called the global state of the network.
The concurrent interactions between automata are defined by a set of local transitions.Each local transition has this form τ = a i →a j , with a i , a j being local states of an automaton a called, respectively, the origin and destination of τ, and is a (possibly empty) set of local states of automata other than a (with at most one local state per automaton).Notation: Given a finite state A, ℘(A) is the power set of A. Given a network N, state(N, t) is the state of N at a time step t ∈ N.

Definition 1 (Automata Network (AN)
).An automata network is a triple (Σ, S, T ) where: • Σ = {a, b, . . .} is the finite set of automata identifiers; • For each a ∈ Σ, S(a) = {a i , . . ., a j }, is the finite set of local states of automaton a; S = ∏ a∈Σ S(a) is the finite set of global states and LS = ∪ a∈Σ S(a) denotes the set of all of the local states.Definition 2 (Playable local transition).Let AN = (Σ, S, T ) be an automata network and ζ ∈ S, with ζ = state(AN , t).We denote P t the set of playable local transitions in AN at time step t by: P The dynamics of the AN is performed thanks to the global transitions.Indeed, the transition from one state ζ 1 to its successor ζ 2 is satisfied by a set of the playable local transitions (Definition 2) at ζ 1 .

Definition 3 (Global transitions)
. Let AN = (Σ, S, T ) be an automata network and ζ 1 , ζ 2 ∈ S, with ζ 1 = state(AN , t) and ζ 2 = state(AN , t + 1).Let P t be the set of playable local transitions at t.We denote G t the power set of global transitions at t: G t := ℘(P t ) In the semantics that we base our work on, the parallel application of local transitions in different automata is permitted, but it is not enforced.Thus, the set of global transitions is a power set of all of the playable local transitions (also the empty set).

Timed Automata Networks
In some dynamics, it is crucial to have information about the delays between two events (two states of an AN).The discrete transitions, described above, cannot exhibit this information: we just process chronological information that the state ζ 2 will be after ζ 1 in the next step, but it is not possible to know chronometry, i.e., how much time this transition takes to occur and whether it blocks some transitions during this time.In fact some local transitions could not be played any more because of conflict about shared resources (necessary components to play a transition) between them.Thus, we need to restrain the general dynamics to capture more realistic behavior w.r.t.biology.Therefore, we propose in this section to add the delays in the local transitions' attributes: timed local transitions.Later, we give the associated semantics that we base our work on to learn biological networks.

Definition 4 (Timed Automata Network (T-AN)).
A timed automata network is a triple (Σ, S, T ) where: For each a ∈ Σ, S(a) = {a i , . . ., a j }, is the finite set of local states of automaton a; S = ∏ a∈Σ S(a) is the finite set of global states; LS = ∪ a∈Σ S(a) denotes the set of all of the local states.• T = {a → T a | a ∈ Σ}, where ∀a ∈ Σ, T a ⊂ S(a) × ℘(LS \ S(a)) × N × S(a) with (a i , , δ, a j ) ∈ T a ⇒ a i = a j , is the mapping from automata to their finite set of timed local transitions.
To model biological networks where quantitative time plays a major role, we will use T-AN.
This formalism enriches AN model with timed local transitions: a i → δ a j .In the latter, δ is called a delay and represents the time needed for the transition to be performed.When modeling a regulation phenomenon, this allows one to capture the delay between the activation order of the production of the protein and its effective synthesis, as well as the synthesis of the product.
According to Lemma 1, Definition 2 is then also applicable to timed local transitions.
Considering that delays in the evolution of timed automata networks create conflict between the timed local transitions, this conflict is mainly justified by the shared resources between the timed local transitions (Definition 5).Indeed, transitions that have the same origins and/or destinations could not be fired synchronously.Besides, during the delay of the execution of a transition τ 1 , it is possible that another transition τ 2 is activated.Then, we need to take care of the following possible conflicts between resources: transition τ 2 may change the local states of the automata participating in τ 1 .We make the following assumption, which is similar to that adopted in [30]: we consider that τ 2 needs to be blocked until the current transition τ 1 finishes.Nevertheless, we allow the resources of τ 1 to participate in other transitions.In addition, we do not forbid the process involved in orig(τ 1 ) to participate in another transition τ 2 if and only if that the remaining delay(τ 1 ) is greater than delay(τ 2 ) (see Definition 6).Those considerations lead to the followings definitions.
In Definition 6, if P is the set of currently ongoing timed local transitions, it allows us to prevent the execution of transitions that would alternate the resources currently being used or that would rely on resources that will be modified before the end of those transitions.Definition 6 (Blocked timed local transition).Let AN = (Σ, S, T ) be a T-AN and t ∈ N. Let P be a set of pairs T × N. The set of blocked timed local transitions of AN by P at t is defined as follows: Let τ 1 be a transition, such that τ 1 = a i → δ a j is fired at time step t.Therefore, t + δ is the ending time of τ 1 , and (t − (t + δ)) is the interval of time between the ending of the transition and the beginning of transition τ 1 with t > t.According to Definition 6, τ 2 is blocked if a i (respectively b k ) is a necessary resource for τ 2 (respectively τ 1 ) and the τ 1 (respectively τ 2 ) finishes before τ 2 (respectively τ 1 ): δ > t − (t + δ) (respectively δ < t − (t + δ)), i.e., a i (respectively b k ) is not available to participate in the transition τ 2 (respectively τ 1 ) during δ (respectively δ).Lemma 2. If τ 2 is in conflict with an ongoing transition τ 1 at a time step t then τ 2 is blocked by τ 1 .
Lemma 2 is completely coherent with semantics dynamics.It is not possible to play a transition that it is in conflict with another ongoing one.Thus, the set of fireable transitions depends at each step on the set of the ongoing transitions, the blocked ones and the new playable ones.Definition 7 (Fireable timed local transition).Let AN = (Σ, S, T ) be a T-AN and ζ ∈ S the state of AN at t ∈ N. Let P be a set of pairs T × N and B(AN , P, t) be the set of blocked timed local transitions of AN by P at t.The set of fireable timed local transitions of AN in ζ w.r.t.P at t is defined as follows: Definition 8 prevents the execution of two transitions that would affect the same automaton.

Definition 9 (Active timed local transitions)
. Let AN = (Σ, S, T ) be a T-AN and ζ ∈ S the state of AN at t ∈ N. Let SFS(AN , ζ, P, t) be the set of fireable sets of timed local transition.The set of active timed local transitions of AN at t is: Definition 9 provides us the evolution of the possible set of ongoing transitions.Supposing that in an initial state of a trajectory (at t = 0), no transition is blocked and all playable timed transitions are fireable, then, when t > 0, at each time step, it should be verified that a playable timed transition is also fireable; in other words, that it is not blocked by the active timed local transitions fired in previous steps.Furthermore, the fired timed local transitions at the same time should not block each other.
In Figure 2 we detail the distribution of the timed local transitions of an T-AN at any step.We suppose that all timed local transitions T in the network are presented by the green set, the blocked ones B are presented by the red set, the fireable one F are presented by the gray set and finally the fireable set FS at a step t is presented by the blue set.We remind that F ∩ B = ∅ and FS ⊆ F. Delays of local transitions can now be represented in an automata network thanks to timed local transitions.Note that if all delays of local transitions are set to zero, this is equivalent to an AN without delays (original AN).The way these new local transitions should be used is described as follows.
At any time, each automaton has one and only one local state, forming the global state of the network.Choosing arbitrary ordering between automata identifiers, the set of global states of the network is referred to as S with S = ∏ a∈Σ S(a).Definition 10 (Semantics of timed automata network ).Let AN = (Σ, S, T ) be a T-AN and t ∈ N. The set of timed local transitions fired at t is: The state of AN at t + 1 is denoted ζ t+1 = state(AN , t + 1) and defined according to the set of timed local transitions that finished at t + 1: We note that at any time step t, such that ζ = state(AN , t) and P the set of ongoing transitions, we have: FS ∈ F(AN , ζ, P, t) ∈ T \ B(AN , P, t).
Where synchronous biological regulatory networks have been studied, little has been done on the asynchronous counterpart [31], although there is evidence that most living systems are governed by synchronous and asynchronous updating.According to Harvey and Bossomaier [32], asynchronous systems are biologically more plausible for many phenomena than their synchronous counterpart, and observed global synchronous behavior in nature usually simply arises from the local asynchronous behavior.In this paper, we defend these assumptions, and we consider an asynchronous behavior for each automata, on the one hand, and a synchronous behavior in the global network.
In [20], the authors also propose algorithms to learn the delayed dynamic of systems from time series data.However, they focus only on synchronous dynamics, and changes take only one step to occur.The delays that they consider are in the conditions of the change (body of there rules), which can be at different previous time step, where in our method the conditions, are in the same state, and the delay is in the change (head of there rules).However, the assumptions in the synchronous model that all components could change at the same time and take an equivalent amount of time in changing their expression levels are biologically unrealistic.However, there is seldom enough information to be able to discern the precise order and duration of state transitions.The timed extension of AN that we propose in this paper allows both asynchronous and synchronous behavior by proposing a non-deterministic application of the timed local transitions.Table 1 shows a trajectory of a T-AN when we choose to apply timed local transitions in a synchronous manner.
We presented above the semantics of the T-AN that we base our work on to model BRNs from experimental data.Even a few hybrid formalisms already exist, like time Petri nets, hybrid automata, etc., we propose this extension of the AN framework for several reasons.First, AN is a general framework that, although it was mainly used for biological networks [28,33], allows one to represent any kind of dynamical models, and converters to several other representations are available.Indeed, a T-AN is a subclass of timed Petri nets [34].Finally, the particular form of the timed local transition in the T-AN model allows one to easily represent them in ASP, with one fact per timed local transition, as described in this work [35].Later, we propose a new approach to resolve the generation problem of T-AN models from time series data.
Taking the following T-AN as an example, we generate a possible trajectory of the network starting from a known initial state.
Example 2. Let AN = (Σ, S, T )) be a timed automata extended with delays from Example 1, such that We give in the following Table 1 an example of a trajectory of the T-AN of Example 2 starting from an initial state until reaching a stable state.

Learning Timed Automata Networks
This algorithm takes as input a model expressed as a T-AN, whose set of timed local transitions is empty, and time series data capturing the dynamics of the studied system.Given the influences between the components (or assuming all possible influences if no background knowledge is available), this algorithm generates the timed local transitions that could result in the same changes of the model as the observed ones through the observation data.

Algorithm
In this section, we propose an algorithm to build T-AN from time series data.We assume that the latter data observations are provided as a chronogram of size T, with T the maximum number of time points: the value of each variable is given for each time point t ∈ N + , 0 ≤ t ≤ T, through a time interval discretization (see Definition 11  We denote Γ a a chronogram of the time series data for a component a. Algorithm 1, MoT-AN (Modeling Timed Automata networks), shows the pseudocode of our implemented algorithm.It will generate all possible timed local transitions that can realize each observed change.A change corresponds to a modification of a component expression level.It happens because of an activation or an inhibition of the changed component by a timed local transition.Thanks to MoT-AN, we compute the discrete levels of the origin and the destination of timed local transitions and the discrete levels of all automata involved in its condition.Thus, finding this timed local transition responsible for this change is also finding the sign of the interaction responsible for this change.

INPUT:
-Timed automata network AN = (Σ, S, T ) with T = ∅; -a chronogram Γ = a∈Σ Γ a ; -the regulation influences χ = a∈Σ χ a and -a maximal in-degree i ∈ N * OUTPUT: φ a set of T-AN that realize the time series data.
• Let ϕ := ∅ • Step 1: According to Γ, for each time step t where a component a changes its value from a i to a j , with a i , a j ∈ S a : for all ∈ ℘(χ a ), | | ≤ i, generate exhaustively all timed local transitions: -Add all timed local transition τ in ϕ Step 2: Create in Φ as much as possible T-AN AN = (Σ, S, ϕ ) with ϕ ⊆ Φ a set of timed local transitions that can realize Γ, such that ϕ is minimal: Because of the delays and the non-deterministic semantics, it is not possible to decide whether a timed local transition is absolutely correct or not.However, we can output the minimal sets of timed local transitions, which are necessary to realize all of the dynamics changes.However, since observations are not perfect, we cannot safely remove non-consistent transitions.Thus, we refine the output model by filters taking into account the different possible perturbations.These filters are detailed more later in Section 5.
For the rest of the article, we denote by Π x t the last change of each component x comparing to t ( i.e., ∀x ∈ Σ Π x t ≤ t).
Theorem 1 (Completeness).Let AN = (Σ, S, T ) be a T-AN, Γ be a chronogram of the components of AN , i the indegree with i ∈ N and R ∈ T the set of timed local transitions that realized the chronogram Γ, such that (a i , l, a j , δ) ∈ R =⇒ |l| ≤ i.Let χ be the regulation influences of all a ∈ Σ.Let AN = (Σ, S, ∅) be a T-AN.Given AN , Γ, χ and i as input, Algorithm 1 is complete: it will output a set of T-AN φ, such that ∃AN = (Σ, S, ϕ ) ∈ φ with R ⊆ ϕ .
Proof.Let us suppose that the algorithm is not complete, then there is a timed local transition h ∈ R that realized Γ and h ∈ ϕ .In Algorithm 1, after Step 1, ϕ contains all timed local transitions that can realize each change of the chronogram Γ.Here, there is no timed local transition h ∈ R that realizes Γ, which is not generated by the algorithm, so h ∈ ϕ.Then, it implies that at Step 2, ∀ϕ , h ∈ ϕ .However, since h realizes one of the changes of Γ and h is generated at Step 1, then it will be present in one of the minimal subsets of timed local transitions; such that h will be in one of the networks outputted by the algorithm.
If AN = (Σ, S, T ) is a T-AN learned model regarding the chronogram Γ and the interaction graph χ of the automata in Σ, such that ∀Π x t ∈ Γ ∃τ ∈ T , which realize Π x t .
Theorem 2 (Complexity).Let AN = (Σ, S, T ) be a T-AN, |Σ| be the number of automata of AN , η the total number of local states of an automaton of AN and i the maximal in-degree of all transitions with 0 ≤ i ≤ |Σ|.Let Γ be a chronogram of the components of AN over T units of time, such that c is the number of changes in Γ, 0 ≤ c ≤ T. The memory use of Algorithm The complexity of this algorithm belongs to O(c Each set has to be compared with the others to keep only the minimal ones, which costs In practice, we can fix the maximal indegree of the generated timed local transitions and consider binary automaton (maximum of two local levels per automaton: η = 2) to make the computation tractable.Considering an indegree i = 2, the complexity falls down to O(c ).If at each time step, there is a change, so it is bounded by Indeed, in biological systems, the number of interactions between components is not too large, so it is not the source of complexity, but the fact that we generate all possible minimal models.Besides, behavior can be captured with a simple combination consisting of two components, like in Section 3.2 of [4].If we output only one model, we find that the complexity falls down to O(c • 16) = O(c).

Refining
In Section 4, we create as many models as possible that satisfy our semantics and the given dynamics (time series data).However, in these models, there may exist some contradictions.Thus, in this section, we introduce a set of refinements to perform model curation and keep the ones that are the most consistent with some given assumptions.

Synchronous Behavior
As detailed in Section 2, in our semantics, we impose the synchronous semantics behavior as long as there is no conflict between transitions.In other words, for any step, if there is at least one transition τ that is not blocked by another one and τ can be played, then τ should be fired.Definition 12 (Synchronous and not in conflict transitions).Let AN = (Σ, S, T ) be a learned T-AN model regarding a chronogram Γ and an interaction graph χ of all automata in Σ.The model is said to be coherent with the synchronous and non-conflict behavior in Γ if and only if ∀τ ∈ T if τ = (a i , , δ, a j ) is playable at a time step t in Γ and τ ∈ T , such that τ is in conflict with τ, then ∃Π a t = t + δ in Γ with t + δ > t.
This filter checks whether each model is consistent with the synchronous and not in conflict transitions (Theorem 12).If this is not the case, then the model will be discarded from the output of Algorithm 1.

More Frequent Timed Automata Networks
All outputted models have one execution, which exactly reproduces the given observations (see Theorem 1); thus, it is interesting to compare all of these models and to find the transitions that are more commonly performed.

Definition 13 (Transition frequency).
Let AN 1 = (Σ, S, T 1 )...AN n = (Σ, S, T n ) be the set of found models by Algorithm 1.Let Freq(τ) be the frequency of appearance of a transition τ ∈ T i , 1 ≤ i ≤ n in all of these models: Giving a minimal frequency to this filter, minFreq ∈ [0, 1], it maintains only transitions τ, such that Freq(τ) ≥ minFreq.It can happen that some transitions are lost that are important, but it keeps only the transitions that are more relevant.

Deterministic Influence
In a T-AN model, we do not want a component to inhibit and activate another with the same level of expression because biologically, this is generally not the case.In Definition 14, we say that the component b with a level of expression k cannot be a part of the conditions that activate (τ 1 ) and inhibit (τ 2 ) a.

Definition 14 (Distinct influence)
Therefore, this filter eliminates all learned models in which there exists transitions that do not satisfy the deterministic influences between components (Definition 14).

Several Delays
Usually in the same model for the same transition, there is only one delay.However, when learning models from real-time series data (e.g., data coming from cell cancer lineages, circadian clock, . ..), we may not be guaranteed that the data have been normalized, even after discretization.Indeed, there is often noise in data.Thus, after computing the delays, Algorithm 1 can find the same transition with different delays.Here, the user (biologist) can choose whether to keep all of these transitions or we propose to him/her to merge all of them and keep only one transition with an interval of delays by computing the maximum value and the minimum one.
Definition 15 (Interval of delays).∀τ 1 , τ 2 , ..., τ n ∈ T t.q τ 1 = a i −→ with a i , a j ∈ Σ a , ∈ ℘(LS \ S(a)) and δ 1 = δ 2 = ... = δ n then: merge all transitions τ 1 , τ 2 , ..., τ n in one transition τ: On the other hand, we can compute a delay equal to the average value of all of these transitions that differ only by delays (see Definition 16).The intuition is that, in practice, if there are enough observations, the delay of those actions should tend to the real value.

Definition 16 (Average of delays
a j with a i , a j ∈ Σ a , ∈ ℘(LS \ S(a)) and δ 1 = δ 2 = ... = δ n then: merge all transitions τ 1 , τ 2 , ..., τ k in one transition τ: τ = a i −→ δ avg a j such that: In Definition 17, we want each learned transition to have only one delay in one model.

Case Study
In this section, we show how this method generates a T-AN model consistent with the set of biological regulatory time series data.First, the method uses discretized observations as an input (i.e., chronogram), thus, it is necessary to process at the beginning the time series data with another method (as shown in Figure 3) in order to discretize it.
Our method may be summarized as follows: -Detect biological components' changes; -Compute the candidate timed local transitions responsible for the network changes; -Generate the minimal subset of candidate timed local transitions that can realize all changes; -Filter the timed candidate actions.
We apply Algorithm 1 on learning the timed local transition τ ∈ T of the example of a network AN = (Σ, S, T ) with 3 components (|Σ| = 3) whose chronogram is detailed in Figure 3  Let χ = {b → z, a → z, a → a} be the set of regulation influences among the components of the network.According to χ, the set of genes having influence on z is χ z = {a, b}.This means that = {a ?, b ?}, or = {a ?}, or = {b ?}.The expression level of the genes of χ z when the researched candidate timed local transition (τ i ) is ongoing, i.e., during the partial steady state between two successive changes (t i and t i−1 ).This level is computed from the chronograms as follows: Thus, ={a 0 , b 1 } or ={a 0 } or ={b 1 }, and the set of candidate timed local transitions is: Since it is the first change, the delay of each timed local transition is the same: The second change occurs at t 2 = 3 and is denoted by change(3).Here, it is the gene a whose state changes from a 0 to a 1 ; thus, the timed local transition that realizes this change has this form τ = a 0 → δ a 1 where can be any combination of the regulators value at t 1 of z.According to χ, the genes influencing a are χ a = {a}.This means that = {a ?}, and the expression level of a between t 1 and t 2 is a 0 .Therefore, = {a 0 }.Thus, there is only one candidate timed local transition: The fourth change occurs at t 4 = 5, change (5).Here, it is a whose expression decreases and changes from a 1 to a 0 ; thus, the candidate timed local transition that could realize this change has this form, τ = a 1 → δ a 0 where can be any combination of the regulators value at t 4 − 1 of a.According to χ, the set of genes having influences on a is χ a = {a}.Again, = {a ?} as the expression level of a since its last change is a 1 .We have A = {a 1 }, and there is only one candidate timed local transition: The fifth change occurs at t 5 = 6, change (6).Here, it is z whose value changes from z 1 to z 0 ; thus, the time local transition that has realized this change has the form of: τ = z 1 → δ z 0 where can be any combination of the regulators' value at t 3 − 1 of b.Since χ z = {a, b}, this means that = {a ?, b ?} or = {a ?} or = {b ?} The expression level of a and b at t 5 − 1 is respectively a 0 and b 0 .Thus, = {a 0 , b 1 } or = {a 0 } or = {b 0 }.The candidate timed local states are: The last change of a is at t 4 = 5, and the last change of b is at After processing all changes, the set of timed local transitions that could realize the chronograms is: All learned timed local transitions are consistent with all observed time series data and the regulation influences given as input.The used method ensures completeness; we have the full set of timed local transitions that can explain the observations.By generating all minimal subsets of this set of timed local transitions, one of those subsets (as in Figure 4) define the model that realized the observations.The labels of each local transition stand for the local states of the automata, which make the transition playable, and its delay (time needed for the transition to be performed).

ASP Syntax
In this section, we briefly recapitulate the basic elements of ASP [36], a declarative language that proved efficient to address highly computational problems.An answer set program is a finite set of rules of the form: a 0 ← a 1 , . . ., a m , not a m+1 , . . ., not a n .
where n ≥ m ≥ 0, a 0 is a propositional atom or ⊥; all a 1 , . . ., a n are propositional atoms, and the symbol "not" denotes negation as failure.The intuitive reading of such a rule is that whenever a 1 , . . ., a m are known to be true and there is no evidence for any of the negated atoms a m+1 , . . ., a n to be true, then a 0 has to be true, as well.If a 0 = ⊥, then the rule becomes a constraint (in which case the symbol ⊥ is usually omitted).As ⊥ can never become true, if the right-hand side of a constraint is validated, it invalidates the whole solution.
In the ASP paradigm, the search of solutions consists of computing the answer sets of a given program.An answer set for a program is defined by Gelfond and Lifschitz [37] as follows.An interpretation I is a finite set of propositional atoms.A rule r as given in ( 1) is true under I if and only if: {a 1 , . . ., a m } ⊆ I ∧ {a m+1 , . . ., a n } ∩ I = ∅ ⇒ a 0 ∈ I .
An interpretation I is a model of a program P if each rule r ∈ P is true under I. Finally, I is an answer set of P if I is a minimal (in terms of inclusion) model of P I , where P I is defined as the program that results from P by deleting all rules that contain a negated atom that appears in I and deleting all negated atoms from the remaining rules.Programs can yield no answer set, one answer set or several answer sets.To compute the answer sets of a given program, one needs a grounder (to remove free variables from the rules) and a solver.For the present work, we used CLINGO(we used CLINGO Version 5.0: http://potassco.sourceforge.net/)[38], which is a combination of both.
In the rest of this section, we use ASP to tackle the problems' learning models from the time series data (i.e., implementing Algorithm 1 and all of the refinements in Section 5).

All Models
Encoding of the case study in Section 6, Figure 3 on page 13: We encode the observations as a set of predicates obs(X,Val,T), where X is an automaton identifier, Val a level of this automaton and T is a time point, such that the automaton X has the value Val starting from T.
We detailed in Section 3 how to compute the delays.Therefore, the delay is the difference between the time where the change happens (changeState) and the last change of the components involved in the local transition: the influenced one, X, and the influencing one(s) (here, we detail only the transition with indegree = 1).Find the last component that has changed, the influenced component X, the influencing component Y or one of the components influenced by X, such that there exists a transition in conflict with the computing one.This last time step of the change H comparing to the time step T2 for the component X having as an influence Y is computed in the predicate lastChange(X,Y,H,T2) (lines [29][30][31][32][33][34][35][36][37][38][39].In other words, H=Π T2 X . 27 % Find the time step when the transition has started playing 28 % The last change of X such that W is influencing by X and X is influencing by Y 29 lastchange(X,Y,W,Max,T2) ← Max=#max{ T : changeState(Y,T;X,T;W,T), T<T2}, changeState(X, T2), existInfluence(X,Y), existInfluence(W,X), Max>=0.
We propose below Figure 5 where we simplify the explication of the encoding part.It illustrates which resources are necessary to compute the origin, destination, conditions and delay of each timed local transition.

Refinement on the Delays
The several refinements of the generated models we propose can be seen as parameters that can be combined or not.For example, the user can specify that for the same transition, if we find different delays, we can take the average value or even specify an interval in which we define the minimum value and the maximum value.
According to Definition 17, in one model, a transition cannot have different delays; so that such models can be eliminated by the following constraint in line 12.Otherwise, we can merge all transitions that only differ by their delay in one transition, but whose delay is equal to the average delay of these transitions.To compute the average delays for a transition in the same model according to Definition 16, we do as follows in ASP.First, we compute the total number of these transitions in Tot by nbreTotTrans(Y,Valy,X,Val1,Val2,Tot) (lines 77-77), where the shared part between all transitions is Y,Valy,X,Val1,Val2; then, the sum of all of these delays in S by the predicate sumDelays (lines 80-81).To find the average, we divide S by Tot (Davg=S/Tot).The new transition after merging is then transAvgDelay(Y,Valy,X,Val1,Val2,Davg) (lines 83-84).According to the Definition 13, we want to find the more frequent transitions in the models.In ASP, the option "--cautious" computes the cautious consequences (intersection of all answer sets) of a logic program (algorithm implementation in ASP).Therefore, we use this option while executing.

Evaluation
In this section, we provide two assessments of Algorithm 1.We evaluate the capacity of our algorithm ( all programs, described in this article for timed automata network generation are implemented in ASP and are available online at: http://www.irccyn.ec-nantes.fr/~benabdal/modeling-biological-regulatory-networks.zip ) to build models for prediction and the impact of the quantity of observations on run time.Here, we process chronograms obtained from time series data of the DREAM4 challenge [39] and of the DREAM8 challenge [40].

DREAM4
In this section, we assess the efficiency of our algorithm through case studies coming from the DREAM4 challenge.DREAM challenges are annual reverse engineering challenges that provide biological case studies.In this section, we focus on the datasets coming from DREAM4 (available online at https://www.synapse.org/#!Synapse:syn3049712/wiki/74630).It provides data for systems of different size (10 genes on the one hand; 100 genes on the other), allowing us to assess the scalability of our approach.The input data that we tackle here consist of the following: five different systems each composed of 100 genes, all coming from Escherichia coli and yeast networks.For every such system, the available data are the following: (i) 10 time series data with 21 time points, and 1000 is the duration of each time series; (ii) steady state for the wild-type; (iii) steady states after knocking out each gene; (iv) steady states after knocking down each gene (i.e., forcing its transcription rate at 50%); (v) steady states after some random multifactorial perturbations.We processed all of the data.Here, we focus on the management of time series data.
Each time series includes different perturbations that are maintained all of the time during the first 10 time points and applied to at most 30% of the genes.In this setting, a perturbation means a significant increase or decrease of the gene expression.In the raw data of the time series, gene expression values are given as real numbers between zero and one.
The discretization is a crucial part for the abstraction of the data.However, in this article, we do not develop the method of the discretization, and we suppose this as already chosen.There are many ways to discretize those data; using biological databases and statistical methods, we could identify gene expression thresholds.Clustering techniques could be used to group data points into discrete levels.On the other hand, rather than considering thresholds, we could focus on components' speed evolution to discretize the data.It is indeed a perspective of optimization to regularize the model through different levels and methods of discretization.However, in this article, we focus on the learning algorithm and consider discretization as more or less given.
To apply our approach, we chose to discretize those data into two to six qualitative values.Increasing the number of qualitative values from two to four improves the precision, but then the score decreases from five, most likely because of over-fitting: the relations learned become too precise and cannot be applied on something else other than the training data.The best score we obtained was with four qualitative values and is reported in Table 2.Each gene is discretized in an independent manner, with respect to the following procedure: we compute the average value of the gene expression among all data of a time series, then the values between the average and the maximal/minimal value are divided into as many levels.Discretizing the data according to the average value of expression is expected to reduce the impact of perturbation on the discretization and, thus, on the model learned.The DREAM4 challenge offers two different problems, which consist of predicting: (i) the structure of the gene interactions (in terms of an unsigned directed graph); (ii) attractors in some given conditions.Our method is not designed to tackle the first issue; indeed, we need to know those influences.However, the models we learn can be applied to predict trajectories and thus attractors.Here, we use the influences graphs expected in the first problem as the background knowledge (inferred by the Gene Network Weaver [41]) to tackle the attractor prediction part of the challenge.

Results
For this evaluation, we are given an initial state and five different dual gene knockout conditions.The goal is to predict the attractor in which the system will fall from the initial state for each dual knockout.Here, we just choose the first model that our algorithm outputs and use the biggest set of fireable timed local transitions at each time step to produce a trajectory until a cycle is detected.The first state of this cycle is reverse discretized and proposed as the predicted state.In the challenge, the quality of the prediction is evaluated by computing the mean square error (MSE) between the predicted state and the expected one.As shown in Table 2, the precision we achieved in those experiments is quite good considering the results of the competitors of the DREAM4 challenge [39].Their results range between 0.010 and 0.075 for the same evaluation settings, which are comparable to (0.033 to 0.086), giving us encouraging results.Regarding run time, learning and predicting the trajectories of the benchmarks of 10 genes took less than 30 s, and the same experiments for the benchmarks of 100 genes took about 3 h and 20 min on a one-processor Intel Core2 Duo (P8400, 2.26 GHz).
To achieve this score, we had to perform several tests by varying the discretization precision and the complexity of the dynamics learned.Figure 6 shows the score we got on predicting the training data of DREAM4 benchmarks of size 10 starting with two levels of discretization until 20.Here, we take the series of each benchmark as the input, and the first state of each series is used to start a prediction.MSE is computed over the 21 points of the original series against the predicted ones.We can see that increasing the discretization precision improves the prediction precision until five levels of discretization.Then, the model prediction quality decreases as it tends to overfit the data.Those tests also allow us to assess the scalability of our approach in practice.Figure 7 shows the impact of both timed local transition indegree and discretization level on run time when learning the DREAM4 benchmarks of size 100.
In the results obtained from the experimentation of our algorithm on the time series data of the DREAM4, we can see the exponential influence on the run time of the indegree per local transition considered, as well as the level of discretization chosen for all five different networks.However, it also shows that in practice, our approach can tackle big networks; here, 100 genes.

DREAM8
We recently started to test our method on the DREAM8 [40]: Heritage-DREAM breast cancer network inference challenge (the data set is available online at https://www.synapse.org/#!Synapse: syn1720047/wiki/55342).The challenge is about the inference of causal signaling networks and prediction of protein phosphorylation dynamics.As for DREAM4, we are focusing on the prediction part.The aim is to build dynamical models that can predict trajectories of phosphoproteins.An important emphasis is on the ability of models to generalize beyond the training data by predicting trajectories under perturbations not seen in the training data.This sub-challenge is split into two independent parts: A, breast cancer proteomic data; B, in silico data; and we choose to start with the first one.
Training data come from experiments on four breast cancer cell lines stimulated with various ligands.The data comprise protein abundance time courses under inhibitor perturbations.Training data are provided for each of the 32 biological contexts defined by the combination of cell line and growth condition (stimulus).These data comprise time courses for about 45 phosphoproteins under a vehicle control (DMSO) and under inhibitor perturbations of network nodes, as shown in Figure 8 (Figure 8 is from https://www.synapse.org/#!Synapse:syn1720047/wiki/56061).Using these training data, participants are asked to build dynamical models that can predict phosphoprotein trajectories specific to each of the 32 biological contexts (32 cell line/stimulus pairs) and under the inhibition of phosphoproteins that were not perturbed in the training data.Here, we deal with about 48 variables per cell line, which is tractable for our algorithm.The quantity of time series is also more generous than in DREAM4.The difficulty for us lies in the small amount of data point per series: seven points (0, 5, 15, 30, 60 min and 2, 4 h).Even though, we succeed at producing models that can be used for the time course prediction.

Results
To evaluate their precision, predicted trajectories are compared with experimental test data obtained in each of the 32 cell line/stimulus contexts and following the inhibition of phosphoprotein nodes by test inhibitors.These experiments are over the same time period as the training data (up to 4 h).To get our precision score, we used dreamtools [43], an open-source Python package for evaluating DREAM challenge scoring metrics.The scoring metric of this sub-challenge is based on multiple metrics; so far, we focused on the mean RMSE (Root Mean Squared Error) provided by dreamtools.In those experiments, we chose to abstract the data into a fixed number of discrete levels as for the DREAM4 experiment (see Section 8.1).By varying the number of level of discretization (see Table 3), our best score was obtained with five levels of discretization.Considering less levels results in a prediction that is too general, while considering more levels results in a tendency to overfit the modelw.r.t. the observed data.The best score we could get so far was 0.5528 mean RMSE, which would rank our method around the ninth and 11th leader board method (the DREAM8 Subchallenge 2A leader board is available at https://www.synapse.org/#!Synapse:syn1720047/wiki/56831), which got scored between 0.5139 and 0.5564 on this metric.On this challenge, the best performing methods are based on statistical analysis (see https://www.synapse.org//#!Synapse:syn2343141).The interest of our method is that the model we learn is human readable, as well as the trajectory prediction.Indeed, we explain the dynamical behavior of the network by providing detailed interactions between components to explain each change step by step.These transitions are enriched by signs (activation/inhibition), the level of expression (thresholds) and delays.The final model can be understood statically and can be used to predict dynamical behaviors with respect to a known semantics.Like for DREAM4, to achieve this score, we had to perform several tests by varying the discretization precision and the complexity of the dynamics learned.Table 4 shows the score we

c 2 Figure 1 .
Figure 1.Example of an automata network with four automata: a, b, c and d presented by labeled boxes; and their local states are presented by circles (for instance, a is either at Level 0 or 1).A local transition is a labeled directed edge between two local states within the same automaton: its label stands for the set of necessary conditions for local states of the automata to play the transition.The grayed circles stand for the global state: a 1 , b 0 , c 2 , d 2 .

Definition 7 Definition 8 (
extends the notion of playable transition by considering concurrences with the currently ongoing transition of P. Set of fireable sets of timed local transition).Let AN = (Σ, S, T ) be a T-AN and ζ ∈ S the state of AN at t ∈ N. Let P be a set of pairs T × N and F(AN , ζ, P, t) the set of fireable local transitions of AN in ζ w.r.t.P at t.The set of fireable sets of timed local transitions of AN in ζ w.r.t.P at t is defined as:

Figure 2 .
Figure 2. Distribution of timed local transitions T at any time step t of Definitions 4-9.
Given a global state ζ ∈ S, ζ(a) is the local state of automaton a in ζ, i.e., the a-th coordinate of ζ.We write also a i ∈ ζ ⇔ ζ(a) = a i ; and for any ls ∈ LS, ls ⊂ ζ ⇔ ∀a i ∈ ζ, ζ(a) = a i .In this paper, we allow, but do not force, applying parallel transitions in different automata, such in Definition 3, but adding delays in the local transitions and considering concurrency between transitions require further study of the semantics of the model (Definition 10).
below and Example in Figure 3 on page 13).Definition 11 (Chronogram).A chronogram is a discretization of the time series data for each component of a biological regulatory network.It is presented by the following function Γ, Γ : [0, T] ⊂ N + −→ {0, ..., n} t −→ i with T the maximum time point regarding the time series data called the size of the chronogram and n the maximum level of discretization.
Furthermore, each set of timed local transitions has to realize each change of Γ; it requires to check that c changes, and it costs O(c • 2 T•(|Σ|•η) |i| ).Finally, the total complexity of learning AN by generating timed local transitions from the observations of Γ belongs to

Figure 3 .
Figure 3. Examples of the discretization of continuous time series data into bi-valued chronograms.The abscissa (respectively ordinate) represents time (respectively gene expression levels).In this example, the expression level is discretized according to a threshold fixed to half of the maximum gene expression value.change(t) indicates that the expression level of a biological component, here a gene, changes its value at a time point t.

The third change occurs at t 3 = 4 ,
change(4).Here, it is the gene b whose value changes from b 1 to b 0 ; thus, the timed local transition that realizes this change is of this form, τ = b 1 → δ b 0 where can be any combination of the regulators value at t 3 − 1 of b.According to χ, there is no gene that can influence b; thus, no timed local transition can realize this change.

Figure 4 .
Figure 4. (Left) Influence graph modeling of the case study example (Figure 3); (right) one of the T-AN generated by Algorithm 1.The labels of each local transition stand for the local states of the automata, which make the transition playable, and its delay (time needed for the transition to be performed).

Figure 5 .
Figure 5.An illustration of how the ASP encoding computes the timed local transitions "transition" from the observations "obs".The names in lozenges correspond to the predicates in the ASP code.

Figure 6 .
Figure 6.Evaluation of the discretization impact on model precision over training data of DREAM4 benchmarks of a size of 10.

Figure 7 .
Figure 7. Evolution of run time on processing different models inferred from time series data of DREAM4 (100 variables benchmarks), varying the indegree of local timed transitions and discretization levels.These tests were performed on a processor of Intel Core i7 (4700, 3 GHz) with 16 GB of RAM.
A global state of a given AN consists of a set of all active local states of each automaton in the network.The active local state of a given automaton a ∈ Σ in a state ζ ∈ S is denoted ζ[a].For any given local state a i , we also note a i ∈ ζ if and only if ζ[a] = a i .For each automaton, it cannot have more than one active local state at one global state.

Table 1 .
Example of a trajectory of the T-AN of Example 2 starting from an initial state < a 1 , b 0 , c 2 , d 2 > (at t = 0) to a stable state < a 0 , b 0 , c 1 , d 1 > (at t = 10); with P = A(AN , t − 1) as in Definition 9, and B, F, SFS, FS, A(AN , t), state(AN , t) are computed as defined in Definitions 4-8.
t B(AN , P, t) F(AN , ζ, P, t) SFS FS A(AN , t) state(AN , t) The complexity of learning AN by generating timed local transitions from the observations of Γ with Algorithm 1 belongs to O(c Let p be an automaton local state of AN , then |Σ| is the maximal number of automaton that can influence p.There is (|Σ| • η) |i| possible combinations of those regulators that can influence p at the same time forming a timed local transition.There is at most T possible delays, so that there are Proof.T • |Σ| • (Σ • η) |i| possibles timed local transitions; thus, in Algorithm 1 at Step 1, the memory is bounded by O(T • (|Σ| • η) |i| ), which belongs to O(T • |Σ| |Σ| ) since 0 ≤ i ≤ |Σ|.Generating all minimal subsets of timed local transitions ϕ of AN that can realize Γ can require one to generate at most 2

Table 2 .
[39]uation of our method on the learning and prediction of the evolution of gene regulatory network benchmarks from the DREAM4 challenge[39]through the Mean Square Error (MSE): 10 variables benchmarks (left); and 100 variables benchmarks (right).

Table 3 .
Evaluation results of our methods on DREAM8 Subchallenge 2A data.