Robustness in Regulatory Interaction Networks. A Generic Approach with Applications at Different Levels: Physiologic, Metabolic and Genetic

Regulatory interaction networks are often studied on their dynamical side (existence of attractors, study of their stability). We focus here also on their robustness, that is their ability to offer the same spatiotemporal patterns and to resist to external perturbations such as losses of nodes or edges in the networks interactions architecture, changes in their environmental boundary conditions as well as changes in the update schedule (or updating mode) of the states of their elements (e.g., if these elements are genes, their synchronous coexpression mode versus their sequential expression). We define the generic notions of boundary, core, and critical vertex or edge of the underlying interaction graph of the regulatory network, whose disappearance causes dramatic changes in the number and nature of attractors (e.g., passage from a bistable behaviour to a unique periodic regime) or in the range of their basins of stability. The dynamic transition of states will be presented in the framework of threshold Boolean automata rules. A panorama of applications at different levels will be given: brain and plant morphogenesis, bulbar cardio-respiratory regulation, glycolytic/oxidative metabolic coupling, and eventually cell cycle and feather morphogenesis genetic control.


Introduction
The cell metabolism is regulated by interaction networks [1,2] bringing together elementary macromolecules like genes and their expression products, proteins, in a complex organization made of numerous weak interactions (due to physicochemical forces like electrostatic or van der Waals forces). The aim of this paper is to show how mathematical networks theories like graphs and dynamical systems theories are necessary to give a mechanistic description of how a cell, a tissue or an organ work from the emergent properties of their constitutive interacting metabolisms. Regulatory interaction networks are made of elements (e.g., genes or proteins) in interaction and they control important cell or tissue functions like proliferation and differentiation. Their dynamics depends highly on the relationships and delays between the kinetics of creation and transformation of the networks elements. More generally, a system is a set of elements in interaction and the cell (resp. tissue) organisation is a biological system, considered as a pyramid of components made of interacting macromolecules (resp. cells). Their observed spatiotemporal behaviours (phenotypes) are explained in the framework of Systems Biology, which studies biological systems in a circuit of complexity from data acquisition to reconstruction of regulatory interaction network (inverse problem) allowing direct predictions by modeling and simulating it in silico ( Figure 1). The complexity of living systems deals with common rules prescribing how macromolecules, cells and tissues are connected into integrated regulatory networks with architectural similarities both inside the cell and between the cellular organizations. Besides, we notice that "network motifs" are crucial to the complexity paradigm. They are common patterns of interconnections between elements as well as super-structures, modeled by deterministic kinetic non-linear rules coming from the classical enzymology and from the genetic networks theory [3][4][5][6][7][8][9][10][11], e.g., from Henri-Michaelis-Menten, Hill, Monod-Wyman-Changeux and Thomas equations (Figure 2). Probabilistic versions of these rules exist accounting for the variability of the systems trajectories around their average behaviour [12,13]. The mathematical tools from the theories of dynamical systems [14][15][16][17][18][19][20] come from numerous contributors from H. Poincaré to R. Thom (Figure 2). The interaction graphs associated to the regulatory interaction networks are inferred from the experimental data and from the literature [21][22][23]. For instance, gene co-expression data may be used for building correlation networks from directional correlations or logical considerations about the observed fixed configurations and then completing the graphs obtained this way by orienting, signing and valuating their edges, hence creating new connected components (like C 1 and C 2 , from C, on Figure 3 top). Interaction graphs architecture contains simple motifs (as those from Alon's work in genetic networks [24] and from metabolic and physiologic networks [25,26]) with two to five nodes. Some examples of motifs are given in Figure 3 and systematically studied in [27]. Among these motifs, we will use, in the following, the negative (respectively positive) circuits, i.e., closed paths between nodes having an odd (respectively even) number of inhibitions like in the triple negative and quintuple positive circuits, in the 3-switch (three nodes fully connected with only inhibitions between them) and in the negative and positive regulons, the most simple motifs having both one positive and one negative circuit. When there is no circuit inside the motifs, we call coherent (resp. incoherent) feed-forward double path a couple of paths with same initial and final nodes, and same (resp. different) global signs.   (1) and directed interaction (2) network, coherent feed-forward double path (3), triple negative and quintuple positive circuits (4), 3-switch (5), negative (6) and positive regulons (7).

Definition of an Attractor and of Its Basin
The definitions of attractors are numerous. We have chosen here a version available both for continuous and discrete cases and the only consistent with all encountered practical situations [28,29]. We define first the Birkhoff limit set L(x) of an initial condition x in the state space E as the set of accumulation points of a trajectory T, where T(x,s) is the state reached at time s from the initial condition x and B(y,ε) is the set of states at distance of y less than ε (reduced to {y} in the discrete case): The stability basin of a subset A of E is the set of initial conditions x not in A, but such as L(x) ⊂ A. Let us denote by ⎯A the set A completed by possible shadow trajectories [30]. An attractor A verifies: (i) A is a fixed set for the composed set operator LoB: A = L(B(A)), An attractor A is invariant in the dual operations consisting firstly in considering all the trajectories of its basin (from all initial conditions not in A, but finishing their life in A) and secondly to restrict them to their ends of life ( Figure 4).

Figure 4.
Definition of an attractor and its basin in the case of a Boolean regulatory network; the initial condition is indicated in blue disks and the final attracting conditions for the majority rule after synchronous iterations (the state of a node is 1 if activating neighbours number is equal to or more than those of inhibiting ones) are given outside the disks (left); attractor A and its basin B(A) (right).
The simplest case of attractors is the fixed point called node [14][15][16][17], to which converge all trajectories of its stability basin ( Figure 5); among the other fixed points, the focus is also an attractor, but neither the centre nor the saddle. The attractors fixed points can bifurcate toward a limit cycle when the parameters values of the dynamical system change ( Figure 5). The epigenetic differences including those involved in cell differentiation can be understood in the framework of the regulatory networks in terms of existence of multiple attractors [8], and the presence of a positive circuit in their interaction graph is a necessary condition for observing multiple fixed points. The second part of the sentence has been conjectured in [31] and proved both in continuous [32][33][34][35][36][37] and discrete cases [38][39][40], where attractors depend in general on the updating mode of the network: sequential if each node is updated in a deterministic or random ergodic order, parallel if all nodes are updated synchronously and block-sequential if the synchrony is available only into blocks updated sequentially. Figure 5. Different categories of attractors: trajectories landscapes for (a) a node, (b) a saddle, (c) a centre, (d) a focus and (e) a limit cycle, in the Poincaré's terminology.

Undirected Graph
An undirected graph G is a pair (V,E), where V is a set whose elements are called vertices (or nodes), and E is a set of non ordered pairs of vertices, called edges. The vertices of an edge are called its endvertices. Two vertices of an undirected graph belonging to the same edge are called incident with this edge. Any edge is called incident with its endvertices. Two edges (resp. vertices) are adjacent if they are incident with one common endvertex (resp. edge). A path (or m-path) is an ordered sequence of m edges in which two consecutive edges are adjacent, by sharing one endvertex not incident with any previous edge of the sequence, except possibly, for the last edge, with the first edge. In this case, the path is called a cycle (or m-cycle). In an undirected graph G, two vertices u and v are called connected, if G contains a path from u to v. Otherwise, they are called disconnected. A graph is called connected if every pair of distinct vertices in the graph can be connected through some path. A connected component is a maximal connected subgraph of G. Each vertex belongs to exactly one connected component, as does each edge. If u and v are vertices of G, then a collection of paths between u and v is called vertex-independent if no two of them share a vertex (other than u and v themselves). Similarly, the collection is edge-independent if no two paths in it share an edge. The greatest number of vertex-independent paths between u and v is called the local vertex-connectivity κ(u,v) and the greatest number of edge-independent paths between u and v, the local edge-connectivity λ (u,v). G is called r-vertex-connected (resp. r-edge-connected) or has a vertex-connectivity κ(G) (or an edge-connectivity λ(G)) equal to r, if each pair of vertices of G has the same vertex-connectivity (resp. edge-connectivity) r. The degree (or valency) of a vertex is the number of edges that connect to it, where an edge that connects to the vertex at both ends, a 1-cycle, called also loop, is counted twice. The average degree of G is equal to d = 2⏐E⏐/⏐V⏐, where ⏐E ⏐(resp. ⏐V⏐) denotes the number of edges (resp. vertices) of the graph. The vertex-connectivity of a graph is less than or equal to its edge-connectivity: κ(G) ≤ λ(G). Both are less than or equal to the minimum degree of G, equal to the minimum value of the degree of its vertices.

Regular Graph
A regular graph is a graph where each vertex has the same number of neighbours, i.e., every vertex has the same degree. A regular graph with vertices of degree r is called a r-regular graph. For a random r-regular graph G (chosen uniformly in the set of r-regular graphs), the following result holds, relating degree and edge-connectivity [55]: assume 3 ≤ r ≤ c 0 n, for some small positive constant c 0 , then with probability tending to 1 as n tends to infinity, G is r-edge-connected.

Weighted and Signed Graph
A graph is a weighted graph if a real number, called weight, is assigned to each edge. Such weights represent in regulatory networks the intensity of the influence (repression/inhibition or induction/activation) of an element on the others, depending on the problem. The weight of the graph is the sum of weights of all edges. A signed graph is a weighted graph with weights valued in {-1,0,1}.

Directed Graph
A directed graph or digraph G is a pair (V,E), where V is the set of vertices, and E is a set of ordered pairs of vertices, called directed or oriented edges. In a digraph G, a path is oriented. Thus, the concept of circuit replaces that of cycle. A digraph G is connected if the undirected underlying graph obtained by replacing all directed edges of G with undirected edges is a connected graph. A digraph is strongly connected if it contains a directed path from u to v and a directed path from v to u, for every pair of vertices (u,v). The strongly connected components (scc) are the maximal strongly connected subgraphs. These strongly connected components (scc) are less numerous in the oriented graphs than in their non oriented versions like in the correlation graphs ( Figure 3). Passing from regulatory networks of biology to regular networks of physics (e.g., Ising model graph), we have a drastic reduction of the scc's ( Figure 6).

Indegree and Outdegree
For a vertex of a digraph, the number of head endvertices (resp. tail endvertices) of edges adjacent to this vertex is called the indegree (resp. outdegree) of the vertex. The indegree (resp. outdegree) is the number of incoming (resp. outcoming) edges in a vertex v and is denoted deg

Connectedness and Connectivity in Graphs
The connectedness (or connectance) [56] in a directed graph is equal to C = ⏐E⏐/⏐V⏐ 2 and, in an undirected graph without loops, to C = 2⏐E⏐/[⏐V⏐(⏐V⏐ -1)]. Gardner and Ashby [4] have studied the relation between the probability of stability and the connectedness of large linear dynamical systems observed in biology [57,58]. We can notice that the connectivity in the Kauffman's sense [41] is the average indegree, which is also equal in directed graphs to c = C⏐V⏐; c is set to a constant, however, it is possible to let it be random, chosen under various distributions [50,51], with average connectivity c.

Kauffman Boolean Networks
Let us denote by R a genetic regulatory network whose number of genes equals n. This network, whose size is defined by n, is modeled by a graph of n vertices representing the n genes. Each gene has two possible activity states (expression and silence) such that the state x i (t) of the gene i equals 0 (resp. 1) if the gene is inactive (resp. active) at time t. We denote the whole configuration of the network at time t by: where Ω is the set of all possible Boolean configurations on R. A Kauffman Boolean network consists of n interacting elements whose states x i (t) (i from 1 to n) are binary variables [41][42][43][44][45][46][47][48][49][50][51]. Each gene i is connected to k other genes (i 1 ,...,i k ) and its state is updated according to a specific rule where F i is a Boolean function, and the x ij (t) are the states of the units connected to i, which may or may not include i itself. The Boolean function F i is represented by a truth table that lists its outputs for each set of inputs values. For each F i with k variables, there are K = 2 k possible sets of inputs values, yielding 2 K different possible functions.

Threshold Boolean Automata Networks
A threshold Boolean automata network with n genes can be represented by an oriented graph in which the n vertices correspond to genes and directed edges to the interactions between these genes. Each interaction from gene j to gene i is characterised by an interaction weight w ij which gives the intensity of the influence that gene j has on gene j and, consequently, of the role played by gene j through the protein it expresses, which can repress or induce the expression of gene i. Being given an arbitrary regulatory network, we associate to it an interaction matrix W of order n-n, whose general coefficient w ij corresponds to the interaction weight with which gene j acts on gene i. More precisely, the coefficient w ij can be positive or negative depending on the fact that gene j tends to respectively activate or inhibit gene i, and is null if j has no influence on i. Let V i be the neighbourhood of gene i, i.e., the set of genes j having an influence on i (j∈V i ⇔ w ij ≠ 0) and H the interaction potential: where x(t) ∈ Ω = {0,1} n and θ i is the i th activation threshold, i.e., the interaction potential to be overtaken for activating i. The threshold Boolean automata rule [52][53][54] is: where G is the Heaviside (or sign-step) function: G(y) = 0, if y ≤ 0 and 1 otherwise. In certain applications, the state 0 can be replaced by the state -1; in this case, G(y) = -1, if y ≤ 0 and 1 otherwise.

Attractors in Kauffman Boolean Networks and Threshold Boolean Automata Networks
In a random Kauffman Boolean network of size n with connectivity c (i.e., with an average indegree equal to c), we choose randomly among the 2 K , where K = 2 c possible random Boolean functions determined by the c (on average) neighbours of any node. For c = 1/2 and for the Boolean functions identity and negation chosen with the same probability, there are N = O(√(n)) states belonging to the attractors [41][42][43][44][45] and the number of possible limit cycles of length L has been recently proven to be in general exponential according to L [44][45][46][47][48][49]. In a random threshold Boolean automata network, with c =2 (i.e., with on average 2 neighbours acting on the same gene with weights randomly chosen at values 1 or -1), we conjecture that the number N of attractors in the sequential updating mode (each vertex being sequentially updated following a deterministic or random ergodic order) verifies [53,54]: where S is the number of the strongly connected components with a positive circuit and S' the number of the positive circuits (counted only 1 time if connected).

Boundary and Core
The notions of boundary and core of a regulatory interaction network come from graph theory. The notions of critical node and edge are more specific in biological applications. For defining correctly these notions, we consider a threshold Boolean automata network: its core is defined by the set of nodes k whose eccentricity e(k) (i.e., the maximal length of the shortest paths to reach from node k all the others) is minimal and the boundary by the set of sources, i.e., nodes whose indegree is null ( Figure 6). The diameter of a graph is equal to the maximum of the eccentricities of its vertices.

Critical Node and Critical Edge
An edge in a regulatory network is critical if its suppression causes the loss of at least one attractor. It is the case in the example of the flowering regulatory network of Arabidopsis thaliana (cf. Figure 7(a) and [23,59]). A simplified version [ Figure 7(c)] of the primitive interaction graph keeping the same attractors, has eight boundary nodes (of eccentricity 2), two strong connected components in its core, four positive and one negative not connected circuits. It has six fixed points and seven limit cycles of order 2 in the parallel (co-expressed) updating mode (which is a consequence of a theorem of [60]) and respects the inequality conjectured in Section 2, because the number N = 6 of attractors in the sequential updating mode verifies: The flowering regulatory network of Arabidopsis thaliana is sensitive to the change of updating mode, because it loses all the seven limit cycles observed in the parallel mode when we let it evolve with the sequential mode. The flowering network is also sensitive to the knockout of the gene EMF1 by a micro-RNA, which causes the loss of two fixed points among six [61][62][63].

Potential Regulatory Networks
A continuous potential differential equation on R n is defined by: where the potential P is a real continuously differentiable function on R n . In the same way, a potential regulatory network on the discrete state space E is defined by [64,65]: where P is a real function (e.g., a polynomial with real coefficients) on E and h a function from R to E, with boundary conditions ensuring that the flow remains in E. For example, in the Boolean case, we choose for h the Heaviside function G. In the integer case (E ⊂ Z n ), h is the identity, P a polynomial with integer coefficients and ∀i = 1,...,n, Δx i ∈ {-1,0,1}. Then (5) is the discrete equivalent of (3): is an interaction tensor, with A k = (a ij ) k as marginal matrices and a iii = 0, W = (w ij ) an interaction matrix and Θ = (θ i ) a threshold vector, we have for the partial derivatives of P: Then the potential regulatory network associated to P is defined by: where G is the Heaviside function. From (6) we derive: In the Boolean case, let suppose that A = 0, P(x ) = t xWx + Bx, with w ii = 1, and each sub-matrix on any subset J of indices in {1,...,n} of W is non positive. Then P decreases on the trajectories of the potential automaton defined by x i (t + 1) = G(-ΔP/Δx i + x i (t)) for any mode of implementation of the dynamics (sequential, block sequential and parallel). This network is a threshold Boolean automata neural network whose stable fixed configurations ( Figure 8) correspond to the minima of P [64].

Hamiltonian Networks
A Hamiltonian energy function H can be defined in a regulatory network of size n, from a kinetic energy term minus a potential energy term, both energies becoming constant at the dynamic equilibrium. The definition of H requires the new following variables: can be interpreted as the rate at which the system changes its state x i (t) at node i and time t and T i (t) as the mean local state over the time interval [t,t + 1]. Then let us define the Hamiltonian energy H equal to the kinetic energy: role of an acceleration. We have: Hence: A discrete system verifying the equations (9) for a Hamiltonian energy function H is called Hamiltonian and is conservative for H [66], because it keeps H constant.  Let us consider now a r-regular network with only identity and negation functions. In such a network, a positive (resp. negative) circuit is conservative for the average of the kinetic part of the Hamiltonian energy H because E c (t + n) = E c (t) (resp. E c (t + 2n) = E c (t)), and hence, the quantity: we call average kinetic energy over the time interval [t,t + n-1] (resp. [t,t + 2n-1]), is constant (which is not the case in general before the system reaches its attracting behaviour). As direct consequence, all trajectories on the circuits are fixed configurations or limit cycles with constant average kinetic energy.

Relationships between Kauffman Boolean and Threshold Boolean Automata Networks
A Kauffman Boolean network with only identity and negation functions [45,67], as well as states x(t)∈{0,1} n , can be considered as a threshold Boolean automata network with states y(t)∈{-1,1} n by considering variables 1}. It suffices to define thresholds as 0 and weights as follows: for identity function, w ii+1 > w ii > 0 and for the negation function, -w ii+1 > w ii > 0.
Reciprocally, each threshold Boolean automata network, because its rule corresponds to a Boolean function from Ω = {0,1} n to {0,1}, can be expressed in terms of Kauffman Boolean networks.

Relationships between Undirected and Directed Graphs
Lemma 1 proves a relationship existing between the number of non-oriented edges defined on an undirected graph G and the mean number of directed edges chosen on the directed versions of G.
Lemma 1: for any undirected graph G having m non oriented edges, the mean number of oriented edges we can define on G from the non oriented configuration is equal to 4m/3.
Proof: Let us note <O> the mean number of oriented edges we can construct from a configuration of m non oriented edges; then, if exactly k from the m non oriented edges are decomposed into two oriented opposite connections, we have C m m-k 2 m-k different ways to dispatch the not double connections into the (m-k) other non oriented edges; we can write : From Lemma 1, the average indegree c (or connectivity in Kauffman's sense) of the directed graphs we can randomly construct from their undirected version of degree d is such as c=4m/3n, where m=nd/2; then 2nd/3=nc and a directed average indegree c corresponds to an undirected degree d=3c/2.

Circuits
Let us define now X r , the number of non-oriented circuits of length r in a random undirected d-regular graph of order n. In [68][69][70], it is shown that the random variables X r , for 3 ≤ r ≤ g, are asymptotically distributed as independent Poisson variables with means equal to (d-1) r /2r=(3c-2) r /r2 r+1 (2 r-1 /r, if c=2), with d = d(n) and g = g(n) allowed to increase with n, provided that: There are C nd/2 n(n-1)/2 random undirected d-regular graphs of order n, each bringing on average 2 r-1 /r non-oriented circuits of length r, if the average in-degree c = 2, and each undirected edge of a non-oriented circuit of length r gives birth to 1 (counted twice in case of a loop) or 2 signed oriented out-coming edges, hence giving at most two oriented circuits. The circuit direction is supposed to be imposed by the circuit nodes controlled by a boundary node: in the human genome, about 2,600 sources plus about 700 micro-RNAs (706 presently known after http://www.microrna.org and 800 expected [71]) are controlling 19,000 not boundary nor isolated genes [72], corresponding to one control among 4 not boundary nor isolated genes (if each micro-RNA is supposed to have on average three specific target genes), hence there is, on average, about 1 control per circuit of length 5, 4 or 3 (it is the case for the circuits in Figures 17 and 18).  The circuit direction is chosen to give the priority to the first forward post-control interaction over the backward one. If the control is exerted on a 3-circuit, whose undirected graph is given in Figure 9 (a), each of its genes being auto-catalysed, and if we respect the graph constraints: c = 2; p + = 4/7 (probability to have a sign +) and p -= 3/7, the oriented graph is given in (b) if the control by a boundary gene is positive, and in (c) and (d) if the control by a micro-RNA or a boundary gene is negative (depending on the sign of the first post-control forward interaction).

Attractors Counting in Real Regulatory Networks
In 1949, M. Delbrück [8] conjectured that the presence of positive circuits (i.e., paths from a gene i to itself having an even number of inhibitions) in the interaction graph of a genetic regulatory network was a necessary condition for the cell differentiation. This conjecture has been precisely written in a mathematical context by R. Thomas in 1981: he identified precisely in simple genetic regulatory networks each differentiated state with a specific attractor and he conjectured that the presence of at least one positive circuit in the interaction graph was a necessary condition for the existence of multiple fixed points [31].
In 1993, S. Kauffman [43] conjectured that the mean number of attractors for a Boolean genetic network with n genes and with connectivity c = 2, was of the same order of magnitude as √(n). This conjecture was supported by real observations: there are about 22000 genes in the human genome and about 200 different tissues, which can be considered as different attractors of the same cell dynamics.
Let us consider now directed random networks and try to evaluate the number of attractors brought by its circuits. By multiplying the numbers of attractors of the disjoint circuits, we can get a lower bound of the total number of attractors in the whole random network. For example, if n = 22.10 3 , then the mean number of oriented circuits of length r (for 3 ≤ r ≤ 5) is equal to 2 r-1 /r ≈ 16/5 ≈ 3, if r = 5, each bringing on average at most 6 different attractors in parallel updating mode [67]: 8 for positive and 4 for negative circuits ( Figure 10) and 1.5 in sequential mode (2 fixed configurations for positive and 1 limit cycle for negative circuits). For r = 4 (resp. 3), the mean number of oriented circuits is equal to 2 (resp. 1) and the mean number of attractors is at most (6 + 2)/2 = 4 (resp. (4 + 2)/2 = 3). For r = 3, the attractors are given in Figure 9. For r = 2, the mean number of oriented circuits is equal to n(4/n) = 4, that is the number of the nodes of the network times the probability to have a reverse edge from the successor nodes of each node (whose mean number is equal to 2 in the case of connectivity 2). The number of attractors is 3 in the parallel mode of updating in the case of a positive circuit and 1 in the case of a negative circuit ( Figure 10). If the random oriented network contains n nodes and has a connectivity c (in the Kauffman's sense) equal to 2 like in the Arabidopsis network [21] where c = 25/12 (Figure 7), then we can calculate the mean number S of sources; let us consider a network with n nodes and 2n interactions; for each node, the probability of having one input from any of the n nodes interacting possibly with it (auto-interactions are allowed) and respecting the connectivity 2, is equal to 2/n ; after n 2 independent choices -n for each of the n nodes -the mean value of the total number of interactions is equal to n 2 (2/n) = 2n, that is the expected number of interactions; hence, the probability of having no input at is given by p = 1-2/n, and the probability of having no input for a node after n independent choices is p n = (1-2/n) n . Then the mean number of sources is: Because (1-2/n) n = e nLog(1-2/n) ≈ e -2 ≈ 1/7.4, then S is approximatively equal to n/7.4, if n is sufficiently large. By using the same argument, the mean number of sinks L is equal to S, and the number I of isolated nodes is equal to about n/(7.4) 2 [81]. To conclude, the sources, if their 2 S states are possible, bring 2 S .A n,2 attractors, if A n,2 is the number of attractors brought by the n-(2S-I) nodes having at least one in and one out interaction. These nodes belong to a random network having a connectivity (in the Kauffman's sense) equal to c' = 2n/[n-(2S-I)] ≈ 2.8n. Then, for a fixed initial configuration of the sources, the calculation of attractors due to circuits inside these n-(2S-I) nodes can be made by using the previous Sections. The mean number of tree structures can also be calculated following [61], but these structures do not bring new attractors, except those brought by the sources and circuits connected to these trees.
These results are consistent with the fact that about 3,000 boundary genes, acting as sources, control the network of 22,000 genes of the human genome: the proportion of boundary genes is about 1/7 in some large regulatory networks such as the cell cycle control network [82], then the order of magnitude predicted in random digraphs is respected, if we are adding to the 22,000/7.4-22,000/(7.4) 2 ≈ 2,973-402 = 2,571 sources or boundary genes, an amount of about 700 human micro-RNAs, which corresponds to the real size of the repositories (http://microrna.sanger.ac.uk; http://www.microrna.org) and to their expected size [79]. They control about 19,000 not boundary nor isolated genes [80], which corresponds to one control for every four not boundary nor isolated genes (each micro-RNA being supposed to have about three specific targets and hence to inhibit about 245 boundary genes), hence there is, on average, about 1 control per circuits of length 5, 4 or 3 (it is the case for the circuits in the Figures 17 and 18). A micro-RNA can hybridize a mRNA stopping the ribosomal elongation, then constituting a post-transcriptional control we can assimilate as coming from a new inhibitory source, i.e., a boundary node acting negatively (rarely positively [81]) on the gene transcribed in this mRNA. The n-(G + L-I) nodes having at least one in and one out interaction are about 22,000-5,544 = 16,456 and they form about three circuits of length 5, 2 of length 4 and one of length 3. They bring 2 3 attractors in the sequential mode and 8 1.5 .4 1.5 6 1 .2 1 .4 0.5 .2 0.5 = 6.10 3 attractors in the parallel mode. These numbers have been obtained by multiplying the number of the attractors brought by each positive or negative circuits, considered as disjoint and in equal average number.
If we consider that the boundary genes and micro-RNAs are always in state 1 of expression, the only possible changes in the attractors can come from an inhibition of the genes in state 1 of the positive circuits by the 700 micro-RNAs and the half of the 2,525 boundary genes non inhibited by the micro-RNAs: that corresponds to about 3,360 genes inhibited among the 19,000 not boundary nor isolated genes, that is 1/6 of these genes, i.e., about 2 circuits of length 5, 1 circuit of length 4 and 0.5 of length 3. Then the control leaves free on average 1 circuit of length 5, 1 of length 4, 0.5 of length 3 and 1/3 of length 2 and there is on average in these free circuits at most 69 attractors in parallel updating, this number resulting from the calculation: 8 0.5 .4 0.5 .6 0.5 .2 0.5 .4 0.25 .2 0.25 .3 2/3 = 3.6 × 2 4.25 ≈ 69; this number 69 is in fact a majorant due to the Jensen's inequality applied to the concave function f(x)=x p , where p < 1, the numbers of attractors, that is 8, 4, 6, 2 and 3, coming from the Figure 10 and the only non negligible circuits being of length less than 5, because of the Bollobas results [68]. The mean number of attractors allowed for the controlled circuits is in the same way equal at most to 8 0.5 .4 0.25 .6 0.25 ≈ 4 × 1.56 ≈ 6.24, because the negative control acts efficiently only on genes in state 1, these genes concerning only the half of the positive controlled circuits. Then we can observe on average at most 69 × 6.24 ≈ 430 attractors, which is in qualitative agreement with the number of observed cell attractors in human (the number of the different types of cell differentiation in human is between 178 [82] and 411 [83]), if we suppose that the Boolean rules used in circuits of regulatory genetic networks are only made of identity and negation functions and that the genes are expressed synchronously (parallel updating). The pure sequential mode (8 attractors) is unable to explain the richness observed, hence we must propose a certain degree of synchrony in the gene expresssion, as well as new rules more sophisticated than the threshold Boolean automata one (e.g., rules in which the interaction potential is non-linear in the state variables [61]).

Robustness
The robustness of a regulatory network is defined by its ability to resist to exogenous or endogenous perturbations, e.g., to keep the same number and type of attractors (fixed or periodic), with about the same stability basin size. Exogenous perturbations from the environment can be caused by fluctuations (due for example to the nutrition) in the concentration of a protein active in the network (as repressor or inducer) or of a micro-RNA from viral origin acting on the mRNA transcribed from a gene of the network, or by a synchronisation of gene expression (due to chromosome rearrangements or to chromatin dynamics), provoking a change in attraction basins size or in type (nature or number) of attractors (e.g., appearance of new limit cycles in Figure 7). An example of endogenous perturbation is a perturbation in the genetic co-expression scheme (due to the chromatin dynamics [84]) caused by a change in the updating rule of the network. For instance, the network may start with a sequential updating mode (all genes expressing their protein consecutively) which may turn into a block-sequential one (in each block the gene expression is synchronised) and eventually into the parallel updating mode (one sole block of genes all expressed simultaneously).
A threshold Boolean automata network rule can be implemented in 3 different updating protocols. Consider the example of a 3-switch (Figure 3 and 5), in which any gene inhibits the others with the weight -1 and is auto-catalysed with the weight + 1, the threshold being equal to -ε, where ε > 0: -if the nodes are sequentially visited by the updating process, the system has 6 fixed configurations, with state 1 (resp. 0) at one node and state 0 (resp. 1) at the others. Such a system having only fixed configurations is potential in the sense of the Section 4.1. [64], because the discrete velocity of the dynamics is equal to the gradient of a Lyapunov function (it is for example more generally the case in a n-switch when the interaction weights are symmetrical), -if the nodes are synchronously updated, we have one limit-cycle of order 2 (made of the full 0 and full 1 configurations) and 6 fixed configurations (corresponding to those of the sequential updating). Such a discrete system is Hamiltonian in the sense of Section 4.2., -in the intermediary case, called block-sequential, in which we update first a node, and then synchronously the two others, we have the same attractors as in the sequential case.
From this particular network of size 3, the 3-switch, we conjecture more generally that the cyclic behaviour can appear when the updating rule goes from the sequential to the parallel mode and not in the inverse way. The statistics done on the set of the all possible (188,968) networks of size 3 shows that this conjecture is almost everywhere true, but it fails for 0.3% of them [25,26,62]. Certain other empirical observations can be generalized: Lemma 2: When a limit cycle occurs in the dynamics of a block, then the whole network dynamics has only limit cycles as attractors. Reciprocally, if the global network dynamics has a limit cycle, there is necessarily at least one block having at least one limit cycle as attractor.

Lemma 3:
If the global network dynamics has a limit cycle of length m, then m ≤ Π i∈C m i , where C is the set of the blocks having at least a limit cycle as attractor, m i being the maximal length of limit cycles of the block i.
The proofs of Lemmas 2 and 3 are obvious. The statistical behaviour of networks of size 3 needs a complete simulation of all their dynamics; we are only interested by networks of size 3 having some specific dynamical features: for certain updating modes, they have both fixed configurations and at least one limit cycle, the limit cycles disappearing completely for other iterations rules, where there are only fixed configurations. Among the 188,968 possible networks of size 3, corresponding to all the configurations of interactions and thresholds, only 34,947 (18.5%) have this dynamical behaviour. If we try to understand how limit cycles are disappearing when the iteration mode changes, frequently this disappearance occurs by passing from synchronous to sequential updating, in a hierarchy of updating modes, defined by the hierarchy of blocks based on the pre-order inclusion. We observe in the simulations three different possible behaviours for these 34947 size 3 networks: -those for which the cycles disappear when we are going down in the hierarchy from the synchronous to the sequential modes (behaviour "Down"), -those for which the cycles disappear when we are going up in the hierarchy from the sequential modes to the synchronous one (behaviour "Up"), -those not corresponding to any previous behaviour, for which the cycles occur and disappear inside the hierarchy without clear rule (behaviour "None").
Among the 34,947 simulated networks of size 3 having limit cycles, the dispatching into the 3 possible behaviours follows the repartition below. Table 1. Repartition of the 34,947 networks of size three having limit cycles into three classes: Down (resp. Up) for which cycles disappear when the updating mode loses its synchrony (resp. sequentiality) and None for which cycles occur and disappear without clear rule. This repartition confirms that there is practically no network (only 0.31%) for which the cycles are present in the sequential updating modes and disappear in the synchronous one.
Let us now consider the 108 networks with an "Up" behaviour, in order to try to find inside them a differentiation character. For that, let us calculate the maximal length of their limit cycles. For the 34,947 studied networks, this maximal length is equal to 6. The table below gives the repartition of the networks of size 3 as function of their maximal limit cycle length (the column "Total" corresponds to the repartition of the maximal lengths for the 34,947 simulated networks), showing the diversity in the response of networks of size 3 to the changes of the updating mode (cf. [27] for sizes from 4 to 7).

Neuron and Plant Morphogenesis
Both brain and plants are built during their morphogenesis under the control of regulatory systems called n-switches [35,87], negatively fully connected, with an exception for positive auto-catalysis [ Figure 12(b) left]. By considering a n-switch [88] available for the plant morphogenesis modeling [ Figure 12(c) right], if X i denotes the concentration of the morphogen i, we suppose that all the inhibitions of the n-switch are expressed through a Hill competitive term in the following differential equations [83], where the cooperativity c is supposed to be strictly greater than 1, k is a catabolic constant and σ an enzymatic Vmax (Figure 13 left): By doing the change of variables Y i =(X i ) 1/2 , we can check that if we define the potential P by: ..,n kY j 2 /4 -(σ/4n)Log(1 + Σ j=1,...,n Y j 2c ), then new differential equations can be written as: The new differential system is then a gradient system with P as associated potential, whose minima are just located at the fixed points of the n-switch (Figure 12 b right and c left). This result is still available if the kinetics is allosteric, i.e., if the inhibition is expressed through concerted sites in an oligomeric enzyme, that is, for every i between 1 and n: with L i = ∏ k=1,n;k≠i (1 + c k X k ) n /(1 + e k X k ) n and e k << c k < 1, which corresponds to an allosteric inhibition of X i by the X k 's, for k ≠ i. If e k = 0, by changing the variables Y i = (X i ) 1/2 and considering the potential P defined by: P(Y 1 ,...Y n ) = Σ j=1,...,n k j Y j 2 /4 -(σ/4n)Log(Σ j=1,...,n (1+Y j 2 ) n +Π j=1,...,n (1+c j Y j 2 ) n ), then the new differential equations can be written like (13) as: The applications of the n-switches concern the morphogenesis of the brain in which a 2-switch (made of transduction peptides in [89]) controls the interactions between early brain cells [ Figure 12(a) left], the double inhibition leading to a spatial segregation between cell populations [ Figure 12(a) middle] which differentiate in a second stage [ Figure 12(a)right] giving for example separated (rods and cones in retina [90]) or intricate (neurons and astrocytes) complementary tissues. Many factors that control cells fate are themselves targets of such inhibitory proteins: neural induction in vertebrates [91,92] as well as plant growth [88] with inhibition between meristem and buds may be mediated through such a mechanism [ Figure 12(c) right] very sensitive to the initial conditions; each attractor corresponds to the local dominance of a node of the n-switch over the others [ Figure 12(b) left].

Cardio-Respiratory Physiologic Regulation
The bulbar vegetative regulation of the cardio-respiratory system is made of three main neuronal populations [93], firstly the inspiratory neurons I activating the expiratory neurons E, with an inhibitory feed-back on I, and secondly the cardio-moderator CM activating the peripheral cardiac pace-maker made of the excitable cells of the sinusal node S, with an inhibitory feed-back on CM (Figure 14 left). In the healthy state, the cardiac activity is ruled by these two negative regulons and in presence of a weak coupling between I and C, the cardiac rhythm is just the 3-harmonic component of the respiratory one ( Figure 14 right a), presenting an acceleration during inspiration, called respiratory sinusal arrhythmia, visible in presence of noise ( Figure 14 right b), and destroyed in case of degenerative neural disease (like in Parkinson's or Alzheimer's disease, where there is no more coupling). A too strong coupling between I and CM transforms the 3-harmonic cardiac rhythm in a pathological signal having same frequency as the respiration (Figure 14 right c and d). The presence of a negative regulon causes the occurrence of an attractor, which is a limit cycle, as usually [93][94][95].  The glycolysis [ Figure 15(a)] has been modeled multiple times by several authors [96][97][98][99][100][101][102], especially the central allosteric step of the phosphofructokinase (PFK), which is the key glycolytic/oxidative enzyme, because it presents a highly non-linear allosteric kinetics (with cooperativity equal to six) and has an ATP/ADP negative regulon in its regulatory interaction network, causing, for critical values of the constant fructose entry flow σ 1 , oscillations of all the metabolites of the glycolysis, with a period of several minutes. Let us apply it to neurons and astrocytes: the mitochondrial shuttles for NADH are less active in astrocytes than in neurons: that causes a flow of lactate from astrocytes to neurons [101,102] and this difference in NADH transportation efficiency is provoked by a weaker efficacy of the translocase (ANT) and ATPase enzymes less optimally located inside the inner mitochondrial membrane [103].
Let us define now by x 1 , x 2 , x 3 and x 4 the concentrations of respectively the successive main metabolites of the glycolysis: glucose, glyceraldehyde-3-P, 1,3-biphospho-glycerate and phospho-enol-pyruvate. We assume that steps E2 and E3 of the glycolysis (summarised in Figure 15 a) are Michaelian and reversible, the enzymatic complex E1 includes the allosteric irreversible kinetics of the phospho-fructo-kinase PFK with a cooperativity n (see [104][105][106][107][108] for its complex kinetics), and both pyruvatekinase (E4) and dehydrogenases of the complex E5 are irreversible. Then, consider the differential system (S) ruling the glycolysis and the pentose pathway until the ribulose-5-P: Figure 15. (a) The glycolysis and the pentose pathway. E1 denotes the four enzymes of the high glycolysis (hexokinase HK, phosphoglucose-isomerase PGI, phosphofructo-kinase PFK and aldolase ALDO), E2 denotes the glyceraldehyde-3P-dehydrogenase, E3 denotes the four enzymes of low glycolysis (phosphoglycerate-kinase, phosphoglycerate-mutase, enolase ENO and pyruvate-kinase PK), E4 denotes the pyruvate-kinase and E5 the 3 enzymes of the oxydative part of pentose pathway, glucose-6P-dehydrogenase G6PDH, 6P-glucono-lactonase and phosphogluconate-dehydrogenase PGDH, alternative to the phospho-transferase system PTS

+L'V 3 )) and K'=J 1 V 4 V 3 /J 2 LV 2 (V 4 +L'V 3 ))
When all the fluxes of a metabolic system have reached their stable stationary value, then we can define the notion of control strength C ik exerted by the metabolite x i on the flux Φ k of the k th step by: [105,111,112] and we have: The control molecules can be enzymes [ Figure 11(b)] or metabolites and the equation (14) can be used to prove that the most regulating molecules in glycolysis are the energetic molecules ATP and ADP, which conversely are mainly produced by glycolysis: for example, 84% of the ATP production in yeast is provided and controlled by glycolysis [113].
When oscillations occur, we can use the variables T ik (resp. A ik ) to quantify the control of the period τ k (resp. amplitude Amp k ) of the k t h step flux by x i [112][113][114]: T i k =∂Logτ k /∂Logx i and A ik = ∂LogAmp k /∂Logx i . If ξ is the eigenvalue of the Jacobian matrix of the differential system for which the stationary state has bifurcated in a limit cycle (Hopf bifurcation), then τ k = 2π/Imξ, and we have in the 2-dimensional case, if for example dx 1  where ΔP = ∂ 2 P/∂x 1 2 + ∂ 2 P/∂x 2 2 is the Laplacian of P and C(P) = ∂ 2 P/∂x 1 2 ∂ 2 P/∂x 2 2 -(∂ 2 P/∂x 1 ∂x 2 ) 2 is the mean Gaussian curvature of the surface P, both taken at the stationary state of the differential system. When the apparent V max of the PFK is diminished in astrocytes due to a lack of ATP, the oscillatory behaviour is less frequent (Figure 16 left) and the production rate of lactate from pyruvate is more important than in neuron, creating a flux of lactate to neurons (Figure 17 top). The neurons consume the lactate coming from the extracellular space, partially replenished by the astrocytes production. That gives to neurons an ATP level higher than in astrocytes, an extra-pyruvate production from lactate, and an extra-oxygen and glucose consumption theoretically predicted and experimentally observed [101,102]. ANT and ATPase concentrations are in human under the negative control of two micro-RNAs, micro-RNA 151 and micro-RNA 34 acting as boundary controller nodes (http://microrna.sanger.ac.uk) by diminishing the glycolytic/oxidative system efficiency. Figure 17. Left. Connection between neurons and astrocytes glycolysis with lactate flux, controlled by ANT (green) and ATPase (red) inside the mitochondrial inner membrane (top). Right. PFK regulatory interaction network with the effectors (activators and inhibitors) of the PFK inside the enzymatic complex E 1 .

Cell Cycle Control
The cell cycle is sensitive to the control by two micro-RNAs whose one (micro-RNA 34) has p53 as positive transcription factor, involved in selective cytotoxicity of intracellular amyloid [115][116][117]. Fixing to state 1 micro-RNAs causes the occurrence of limit cycles in the parallel mode ( Figure 18). Some attractors (bottom left) in an arbitrary state space and list of all attractors of the cell cycle network, with fixed boundary conditions (micro-RNAs in state 1) and parallel updating mode (bottom right). ABRS denotes the percentage of initial states in an attractor basin. The involved genes are: p27, Cdk2, pCyCE_Cdk2, CyCE_Cdk2, micro-RNA 159, pCycA_Cdk2, CycA_Cdk2, Rbp-E2F, Rb-E2F, E2F, Rbp and Rb. Arbitrary representation of a part of the attraction basins with only 2 fixed points, A 1 and A 3 , and 2 limit cycles, A 2 and A 4 (bottom left).

Feather Morphogenesis
The feather morphogenesis network has been well studied in chicken [118][119][120][121][122][123][124][125] and is described in Figure 19. One boundary node, the micro-RNA 141 inhibits the protein p53 involved in the positive control of the Cyclin D1 and in the negative control of E2F via the micro-RNA 34, whose p53 is a transcription factor (Figure 18 top left). Hence the micro-RNA 141 can have a double action on the proliferation (modeled by using the Ross-Volterra logistic term [126,127]), by breaking the p53 influence. If the boundary state of micro-RNA 141 is equal to 1, we notice a dramatic change in the number and nature of attractors, because we lose both in the parallel and sequential cases one fixed attractor by changing the state of micro-RNA 141 from 0 to 1 and we transform the limit cycle of the parallel updating case in a fixed point. In these new attractors, there are still two types of behaviour, one with Wint (Wnt) and β-catenin silenced causing the absence of the activator BMP-7, hence the absence of feathers, and another with Wint and β-catenin expressed, allowing the feather morphogenesis. Hence, the node micro-RNA 141 is critical and the feather morphogenesis network is sensitive to its influence, but anyway even in its absence the possibility to start or not the feather morphogenesis by silencing or enhancing Wint and β-catenin remains possible, e.g., through the gene Smad3 (Figure 19 top right).

Perspectives and Conclusions
We have studied in this paper the robustness of regulatory interaction networks, especially the influence of boundaries in threshold Boolean automata networks. The sensitivity (or absence of robustness) in real biological networks appears dominant in the case of inhibitory actions exerted by micro-RNAs and we have noticed it in both cases of the cell cycle control and the feather morphogenesis regulation networks. We have also remarked the dependence on the updating mode [128] in the simulations and the role of positive circuits on the number and nature of attractors (especially in the Arabidopsis flowering network and in the n-switches), the influence of negative circuits (e.g., coming from simple negative regulons) on the existence of periodic behaviours (namely in the cases of the cardio-respiratory control network and of the glycolytic-oxidative coupling). More systematic studies as in [129,130] have to be performed in order to confirm the dominant influence of boundary negative interactions, for which the threshold Boolean automata networks seem to be less robust than for the positive ones, and also to make more precise the conjectured inequality about the number of attractors (verified here in each discrete threshold Boolean automata network studied in the examples given here as application of the systems biology approach). Many problems remain open: for instance, we could consider the regulatory networks in an evolutionary perspective, in order to determine phylogenetic trees of networks (by using classical distances between networks, like the tree distances [131][132][133][134][135][136]). We could indeed show that the known regulatory networks have evolved following a complexification such as the number of nodes at step k, n(k), verifies: n(k) = C.2 Log(r(k))Log(Log(r(k))) , where C is a positive constant and the directed graph diameter is equal to 2r(k) at step k. Then, if the indegree of the interaction graph comes from the initial value 2, it loses this indegree not too rapidly and goes to a small-world structure [137], like those observed in the metabolic regulatory networks [1].
More, let us consider the notion of r-tree, i.e., an undirected connected graph without circuits, but with a root and with each vertex of any level i from this root (except sinks) having r adjacent vertices at level (i + 1). If the number m(k) of the edges of the undirected graph associated to a random regulatory network at step k verifies:

m(k)=O(K n(k) (r-2)/r [n(k)-1]/2)
where K and r are positive constants, then the number of r-trees of size r + 1 is on average equal to K r(r+1)/2 /(r + 1)! [68,[138][139][140][141][142]. All these trees controlled by their root bring each only one attractor. Following [143], if we consider for example the network of the copper biolixiviation by Thiobacillus ferrooxidans [144,145], we have 354 genes and 534 interactions: the prediction for the numbers of isolated genes and 2-trees are 354/(7.4) 2 = 6.5 and 3 3 /3! = 4.5, and the observations show 8 isolated genes and 6 2-trees, in qualitative agreement with the hypothesis of randomness. This result emphasizes the role of circuits in the differentiation, because positive ones are alone to bring multiple fixed points and be present since the origin of life [131]. A last open problem consists in making explicit the dynamical consequences of the transcription regulation of micro-RNAs, by boundary genes like p53 [117]: this regulation enlarges the classical boundary reduced to sources, by introducing 2-circuits in it.
To conclude, all the predictions from the theoretical models have to be falsifiable by the empirical observations, reinforcing the interest of the methodological approach, summarised here shortly in the framework of the systems biology by using the tools of the dynamical systems and complex systems theories, and applied in various biological fields, from genetic and metabolic to physiologic control.