Construction Method of Probabilistic Boolean Networks Based on Imperfect Information

Abstract: A probabilistic Boolean network (PBN) is well known as one of the mathematical models of gene regulatory networks. In a Boolean network, the expression of a gene is approximated by a binary value, and its time evolution is expressed by Boolean functions. In a PBN, a Boolean function is probabilistically chosen from a set of candidate Boolean functions. One of the authors has proposed a method to construct a PBN from imperfect information. However, this method has the weakness that the number of candidate Boolean functions may be redundant. In this paper, the construction method is improved to utilize the given information efficiently. To derive the Boolean functions and their selection probabilities, a linear programming problem is solved. Here, we introduce an objective function to reduce the number of candidates. The proposed method is demonstrated by a numerical example.


Introduction
One of the aims in systems biology is to develop a method for modeling, analysis, and control of gene regulatory networks. Control of gene regulatory networks is closely related to therapeutic interventions, and is important toward developing gene therapy technologies [1] in the future.
In this paper, we consider the problem of finding a probabilistic Boolean network (PBN) based on imperfect information such as the network structure, the sample mean, and prescribed Boolean functions. For a Boolean network (BN), such a problem has been studied in [24,25]. For a PBN, such a problem has been studied in [26][27][28]. Compared with the methods in [26,28], the method in [27] has the advantage that the matrices manipulated in the procedure are smaller. Hence, we focus on the method in [27]. In this method, a linear programming (LP) problem with no objective function is solved. However, it is more appropriate to set an objective function based on the imperfect information. In this paper, a new objective function, which can be set from the prescribed Boolean functions, is proposed. The effectiveness of the proposed method is demonstrated by a numerical example.

Preliminaries
As preliminaries, we explain a probabilistic Boolean network [3] and a matrix-based representation for a PBN [27].
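Next, we explain a probabilistic Boolean network (see [3] for further details). In a PBN, the candidates of $f^{(i)}$ are given, and for each $x_i$, one Boolean function is selected probabilistically and independently at each time. Let $c^{(i)}_l$ denote the probability that the candidate $f^{(i)}_l$, $l = 1, 2, \dots, q(i)$, is selected, where $q(i)$ is the number of candidates of $f^{(i)}$. Then, the relation $\sum_{l=1}^{q(i)} c^{(i)}_l = 1$ must be satisfied. The probability distributions are derived from experimental results. Finally, $N_i$, $i = 1, 2, \dots, n$ are defined by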

Matrix-Based Representation for PBNs
In this subsection, we explain the outline of the matrix-based representation for PBNs [27]. Instead of this representation, we may use the matrix-based representation proposed in [5,6,14], where the semi-tensor product (STP) of matrices is used. In this paper, we do not need manipulation of matrices using the STP. Hence, we use a simple matrix-based representation based on truth tables.
First, we define the notation. Binary variables x 0 i (k) and , consider transforming the BN (1) into a matrix-based representation. Define Then, the matrix-based representation for x i (k + 1) is given by where The matrix A (i) can be derived from the following procedure [24].

Procedure for deriving A (i) in (3):
Step 1: Derive a truth table for $x_i(k + 1)$.
Step 2: Based on the obtained truth table, assign
Step 3: Express the assignment obtained in Step 2 by a row vector. Denote the obtained row vector by
Step 4: Derive $A^{(i)}$ as
Next, consider extending the matrix-based representation of BNs to that of PBNs. Noting that the probability distribution of each $x_i$ is independent, $E[x_i(k + 1)]$ can be obtained as where $A^{(i)}_l$ can be derived from the above procedure.
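As a minimal sketch of the procedure above, the following Python code builds a $2 \times 2^{|N_i|}$ matrix from the truth table of a Boolean function and mixes several such matrices using selection probabilities. Several points are assumptions made for illustration and are not taken from the paper: the helper names `truth_table_matrix` and `expected_matrix`, the column ordering (input combinations enumerated from all zeros to all ones), the convention that the second row holds the truth-table output and the first row its complement, the reading of the expected-value representation as a probability-weighted sum of candidate matrices, and the two example functions.

```python
from fractions import Fraction
from itertools import product


def truth_table_matrix(f, num_inputs):
    """Return a 2 x 2**num_inputs matrix (two rows) for the Boolean function f.

    Assumed convention: column j corresponds to the j-th input combination in
    the order (0,...,0) to (1,...,1); row 1 holds the truth-table output and
    row 0 its complement, so every column sums to 1.
    """
    outputs = [int(bool(f(*bits))) for bits in product((0, 1), repeat=num_inputs)]
    return [[1 - y for y in outputs], outputs]


def expected_matrix(probabilities, matrices):
    """Probability-weighted sum of the candidate matrices (assumed form)."""
    if sum(probabilities) != 1:
        raise ValueError("selection probabilities must sum to 1")
    cols = len(matrices[0][0])
    return [
        [sum(c * m[r][j] for c, m in zip(probabilities, matrices)) for j in range(cols)]
        for r in range(2)
    ]


# Hypothetical two-input candidates: AND chosen with 4/5, OR chosen with 1/5.
a_and = truth_table_matrix(lambda u, v: u and v, 2)
a_or = truth_table_matrix(lambda u, v: u or v, 2)
mixed = expected_matrix([Fraction(4, 5), Fraction(1, 5)], [a_and, a_or])
```

Exact fractions are used so that the column sums of `mixed` can be checked without rounding error.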
We present a simple example.

Example 1.
Consider the PBN with two states. Boolean functions and those selection probabilities are given as follows: where N 1 = N 2 = {1, 2}, q(1) = 2, and q(2) = 3. Using the truth table, we derive the matrix-based representation for each Boolean function. As an example, consider the Boolean function f  0). Then, we can obtain the following matrix representation: The second row in this matrix corresponds to the truth table. In a similar way, we can obtain other matrices as follows: Thus, we can obtain the matrix-based representation for the PBN as follows:

Construction Problem and Its Solution Method
Using the matrix-based representation, the problem of finding a PBN is reduced to the problem of finding matrices. In this section, we formulate the construction problem of PBNs and propose its solution method.
First, let $s^{k+1}_{i,p}$, $s^{k}_{i,p}$, $p = 1, 2, \dots, \bar{p}$ denote the sample means of the states, which are measured multiple times at certain times $k$ and $k + 1$ under a certain pattern $p$ of the experiment, where $\bar{p}$ is the number of patterns. Each experiment can be characterized by its initial state. However, in experiments on biological systems, there are cases where the initial state cannot be set precisely. In this paper, we suppose that the initial state for each pattern may be unknown. Next, we give the construction problem of a PBN as follows. We remark here that in this problem, the candidates of Boolean functions ($f^{(i)}_l$), their selection probabilities ($c^{(i)}_l$), and the number of candidates of Boolean functions ($q(i)$) are not given. Problem 1 can be equivalently rewritten as the following problem using the matrix-based representation.

Problem 2.
For a PBN, suppose that the index sets $N_i$, $i = 1, 2, \dots, n$ and the sample means $s^{k+1}_{i,p}$, $s^{k}_{i,p}$, $i = 1, 2, \dots, n$, $p = 1, 2, \dots, \bar{p}$ are given. Then, find the probabilities $c^{(i)}_l$, $l = 1, 2, \dots, q(i)$ and the matrices $A^{(i)}_l$, $l = 1, 2, \dots, q(i)$ satisfying the following condition: where $s^{k}_{i,p}$ :
It is difficult to directly reduce Problem 2 to a certain optimization problem. Here, we propose a solution method consisting of two steps. First, matrices that combine the selection probabilities and the matrices expressing the Boolean functions are calculated by solving a linear programming (LP) problem. Next, the solution of Problem 2 is derived from the matrices obtained.
First, (5) can be rewritten as where each element of $\tilde{B}^{(i)}$ must be a real value in [0, 1]. In addition, the elements of each column must sum to 1.
Here, consider the following problem.
By a simple calculation, Problem 3 can be reduced to n LP problems with no objective function.
Using $\tilde{B}^{(i)}$ obtained by solving Problem 3, we can obtain the probabilities $c^{(i)}_l$, $l = 1, 2, \dots, q(i)$, the matrices $\tilde{A}^{(i)}_l$, $l = 1, 2, \dots, q(i)$, and $q(i)$. See, e.g., [27] for further details.
Now we consider the case where only one candidate Boolean function is given in advance, based on theoretical biology and past experiments. In this case, it is desirable that the selection probability of the Boolean function given in advance is higher than the other selection probabilities. To this end, we introduce a new objective function for Problem 3.
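One possible way to read selection probabilities and candidate Boolean functions off a matrix obtained from the LP, as mentioned above, is a level-set (threshold) decomposition of its truth-table row. The sketch below is an assumption made for illustration and is not necessarily the procedure of [27]. The function name `decompose_row` and the sample values are hypothetical, and the row is assumed to hold, for each input combination, the probability that $x_i(k + 1) = 1$.

```python
from fractions import Fraction


def decompose_row(row):
    """Split a row with entries in [0, 1] into weighted 0/1 rows.

    Returns (probability, binary_row) pairs whose probabilities sum to 1 and
    whose weighted sum reproduces `row`. Each binary row is the truth-table
    output of one candidate Boolean function.
    """
    if any(a < 0 or a > 1 for a in row):
        raise ValueError("entries must lie in [0, 1]")
    levels = sorted({a for a in row if a > 0})
    candidates = []
    previous = 0
    for level in levels:
        # Input combinations whose entry reaches this level are mapped to 1.
        binary = [1 if a >= level else 0 for a in row]
        candidates.append((level - previous, binary))
        previous = level
    if previous < 1:
        # The remaining probability mass goes to the constant-0 function.
        candidates.append((1 - previous, [0] * len(row)))
    return candidates


# Hypothetical row: probability of output 1 for each of four input combinations.
row = [Fraction(0), Fraction(3, 10), Fraction(1), Fraction(3, 10)]
parts = decompose_row(row)
```

For this hypothetical row the sketch returns two candidates, with probabilities 3/10 and 7/10. Such a decomposition is generally not unique, so other choices can lead to a different set of candidates.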
The matrix $\tilde{B}^{(i)}$ is represented by where $m^{(i)} := 2^{|N_i|}$ and $a^{(i)}_q$, $q = 1, 2, \dots, m^{(i)}$ must satisfy $0 \le a^{(i)}_q \le 1$. Assume that the matrix-based representation of the Boolean function given in advance is given by where $b^{(i)}_q$, $q = 1, 2, \dots, m^{(i)}$ are given binary values. Then, consider the following problem.

Problem 4.
For a PBN, suppose that the index sets $N_i$, $i = 1, 2, \dots, n$ and the sample means $s^{k+1}_{i,p}$, $s^{k}_{i,p}$, $i = 1, 2, \dots, n$, $p = 1, 2, \dots, \bar{p}$ are given. Then, find the matrices $\tilde{B}^{(i)}$, $i = 1, 2, \dots, n$ maximizing the following objective function $J^{(i)}$:
Problem 4 can also be reduced to $n$ LP problems, each with an objective function, and hence can be solved easily. By solving Problem 4, it is expected that each element of the obtained $\tilde{B}^{(i)}$ is close to that of $\tilde{B}^{(i)}_d$.

Remark 1.
In the existing methods in [26,28], matrices of size $2^n \times 2^n$ must be manipulated. On the other hand, in the method in [27] and in the proposed method, matrices of size $2 \times 2^{|N_i|}$ are manipulated. Hence, using the latter methods, a PBN can be derived by manipulating smaller matrices.
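To make this size comparison concrete, the following sketch counts matrix entries for both representations. The function names are hypothetical, and the index sets describe an illustrative three-node network with $N_1 = \{3\}$, $N_2 = \{1, 3\}$, and $N_3 = \{1, 2\}$.

```python
def global_matrix_entries(n):
    """Number of entries of one 2**n x 2**n matrix over the whole state space."""
    return (2 ** n) ** 2


def local_matrix_entries(index_sets):
    """Number of entries of the 2 x 2**|N_i| matrix of each node."""
    return [2 * 2 ** len(neighbors) for neighbors in index_sets]


index_sets = [{3}, {1, 3}, {1, 2}]
global_size = global_matrix_entries(len(index_sets))  # 64 entries
local_sizes = local_matrix_entries(index_sets)        # [4, 8, 8] entries
```

For $n = 3$ the difference is modest, but the first count grows as $4^n$, whereas the second depends only on the number of inputs $|N_i|$ of each node.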

Example
In this example, we consider a PBN with three states. Suppose that the index set for each $i$ is given by $N_1 = \{3\}$, $N_2 = \{1, 3\}$, and $N_3 = \{1, 2\}$, and the sample mean for each $i$ is given by s k+1 . Suppose also that one of the candidates of Boolean functions is given by
We demonstrate the proposed method in the case of $x_2$. The matrix $\tilde{B}^{(2)}$ can be obtained as where $a^{(2)}_1$, $a^{(2)}_2$, $a^{(2)}_3$, and $a^{(2)}_4$ are decision variables. From the given Boolean function $x_2(k + 1) = x_1(k) \wedge \neg x_3(k)$, the matrix $\tilde{B}^{(2)}_d$ can be obtained as Then, the objective function $J^{(2)}$ in the LP problem for $x_2$ is given by The constraint condition (5) is given by By solving the LP problem in which $J^{(2)}$ is maximized under the above constraint condition, we can obtain $\tilde{B}^{(2)}$. Hence, we can obtain two Boolean functions and their selection probabilities as follows: where $q(2) = 2$. In a similar way, we can obtain two Boolean functions and their selection probabilities for $x_1$ and $x_3$ as follows: Thus, we can obtain the PBN.
In the case of Problem 3 (i.e., the case where there is no objective function in Problem 4), we can obtain the PBN as follows: From these results, we see that introducing the objective function increases the selection probability of the Boolean function given in advance. Thus, if a dominant Boolean function is known, then the proposed method is effective.

Conclusions
In this paper, we studied a construction method of PBNs based on the network structure and the sample mean. As an extension of the method in [27], we proposed a method for giving an appropriate objective function in the LP problem. Using it, the probability that the Boolean function given in advance is chosen becomes higher. By a simple example, we demonstrated the effectiveness of the proposed method.
One direction for future work is to develop a construction method that uses only the network structure. Applying the proposed method to large-scale gene regulatory networks is also important. To realize efficient computation, we will consider utilizing optPBN, a MATLAB-based toolbox [29].

Conflicts of Interest:
The authors declare no conflict of interest.