An Integer Linear Programming Formulation for the Minimum Cardinality Segmentation Problem

In this article, we investigate the Minimum Cardinality Segmentation Problem (MCSP), an NP-hard combinatorial optimization problem arising in intensity-modulated radiation therapy. The problem consists in decomposing a given nonnegative integer matrix into a nonnegative integer linear combination of a minimum cardinality set of binary matrices satisfying the consecutive ones property. We show how to transform the MCSP into a combinatorial optimization problem on a weighted directed network and we exploit this result to develop an integer linear programming formulation to exactly solve it. Computational experiments show that the lower bounds obtained by the linear relaxation of the considered formulation improve upon those currently described in the literature and suggest, at the same time, new directions for the development of future exact solution approaches to the problem.


Introduction
Let S = {s ij } be a m × n binary matrix.We say that S is a segment if for each row i = 1, . . ., m, the following consecutive ones property holds [1]: Given an m × n nonnegative integer matrix A = {a ij }, the Minimum Cardinality Segmentation Problem (MCSP) consists in finding a decomposition A = K t=1 u t S t such that u t ∈ Z + , S t is a segment, for all t ∈ 1, . . ., K, and K is minimum.
For example, consider the following nonnegative integer matrix: then a possible decomposition of A into segments is [2]: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 The MCSP arises in intensity modulated radiation therapy, currently considered one of the most powerful tools to treat solid tumors (see [3][4][5][6][7]).In this application, the matrix A usually encodes the intensity of the particle beam that has to be emitted by a linear accelerator at each instant of a radiation therapy session (see Figure 1).As the linear accelerator can only emit, instant per instant, a particle beam having a fixed intensity value rather than those encoded in A, the intensity matrix has to be decomposed into a set of segments, each encoding those elementary quantities of radiation, in order to deliver entry per entry the requested amount of radiation (see [7] for further details).
The MCSP is known to be N P-hard even if the matrix A has only one row [1] or one column [8].It is worth noting that the two restrictions mainly differ from each other for the type of segments considered.Specifically, in the one row case the segments are binary vector rows satisfying the consecutive ones property; in the one column case segments are just binary column vectors.
The N P-hardness of the MCSP has justified the development of exact and approximate solution approaches aiming at solving larger and larger instances of the problem.These approaches have been recently reviewed in [7,9,10], and we refer the interested reader to these articles for further information.
Here, we just focus on the exact solution approaches for the MCSP.
The literature on the MCSP reports on a number of studies focused on a specific restriction of the problem characterized by having the highest entry value of the matrix A, denoted as A max = max{a ij }, bounded by a positive constant H.This assumption is generally exploited e.g., in [1,[11][12][13] to develop pseudo-polynomial exact solution algorithms for the considered restriction.A recent survey of these algorithms can be found in [7].Surprisingly enough, however, in the last decade only a limited number of exact solution approaches have been proposed in the literature for the general problem.These approaches are restricted to the pseudo-polynomial solution algorithm described in [14], the Constraint Programming (CP) approaches described in [9,15], and the Integer Linear Programming (ILP) approaches described in [16][17][18][19].Specifically, the algorithm proposed in [14] is based on an iterative process that exploits the pseudo-polynomial solution algorithm of [13] to solve at each step an instance of the MCSP characterized by having 1 ≤ A max = max{a ij } ≤ H.The author shows that the algorithm is able to solve an instance of the general problem with an overal computational complexity O((mn) 2H+3 ).However, as shown in the computational experiments performed in [9], the algorithm proves to be very slow in practice.
The CP approach described in [15] was not initially conceived to solve the MCSP.In fact, the authors consider an objective function that minimizes at the same time a linear weighted combination of the number of segments involved in the decomposition and the sum of the coefficients used in the decomposition.As shown in [10], this approach can be adapted to solve the MCSP.However, the performances (in terms of solution times) so obtained are poorer than those relative to the CP approach described in [9].Specifically, the authors of [9] first use an heuristic to find an initial feasible solution to the problem.Then, they attempt to find either a solution that uses less segments or to prove that the current solution is optimal.The certificate of optimality of the proposed algorithm is based on the use of an exhaustive search on the space of segments and coefficients compatible with the decomposition.As far as we are aware, this approach currently constitutes the fastest exact solution algorithm for the MCSP.
An example of discretization of the particle beam emitted by a linear accelerator (from [2]).The matrix on the right encodes the discretized intensity values of the radiation field shown on the left.
The ILP formulations described in the literature are usually characterized by worse performances than the constraint programming approach presented in [9].The earliest ILP formulation was presented in [16] and is characterized by a polynomial number of variables and constraints.Specifically, provided an upper bound on the overall number of segments is used, the authors introduce a variable for each entry of each segment in the decomposition of the matrix A. This choice gives rise to multiple drawbacks: it may lead to very large formulations, it does not cut out the numerous equivalent (symmetric) solutions to the problem, and it is characterized by very poor performances in terms of solution times and size of the instances of the MCSP that can be analyzed [9,10].The mixed integer linear programming formulation presented in [19] arises from an adaptation of the formulation used for a version Cutting Stock Problem [10].As for [16], the formulation contains a polynomial number of variables and constraints and requires an upper bound on the overall number of segments used.As shown in [10], the formulation is characterized by better performances than the one presented in [16].However, it proves unable to solve instances of the MCSP containing more than seven rows and seven columns.The polynomial-sized formulation presented in [18] has a strength point the fact that does not explicitly attempt at reconstructing the segments of the decomposition.This fact allows for both the reduction of the number of involved variables and a break in the symmetry introduced by the search of equivalent sets of segments.However, computational experiments carried out in [10] have shown that the linear relaxation of this formulation is usually very poor.This fact in turn leads to very long solution times [10].A similar idea was used in [17].In particular, the proposed polynomial-sized formulation minimizes the number of segments required in the decomposition of the matrix A, without explicitly computing them.This last step is possible in a subsequent moment via a post-processing of the optimal solution.However, also in this case, the formulation is characterized by poor computational performances when compared to the constraint programming approach described in [9], mainly due to the poor lower bounds provided by the linear relaxation.
Starting from the results described in C. Engelbeen's Ph.D. thesis [20], in this article we investigate the problem of providing tighter lower bounds to the MCSP.Specifically, by using some results related to the one row restriction of the MCSP (see [2,20]), we transform the MCSP into a combinatorial optimization problem on a particular class of directed weighted networks and we present an exponential-sized ILP formulation to exactly solve it.Computational experiments show that the performances (in terms of solution times) of the formulation are still far from the CP approach described in [9].However, the lower bounds obtained by the linear relaxation of the considered formulation generally improve upon those currently described in the literature.Thus, the theoretical and computational results discussed in this article may suggest new directions for the development of future exact solution approaches to the problem.
The article is organized as follows.After introducing some notation and definitions, in Section 2 we investigate the one row restriction of the MCSP.In particular, we transform the restriction into a combinatorial optimization problem on a weighted directed network and we investigate the optimality conditions that correlate both problems.In Section 3, we show how to generalize this transformation for generic intensity matrices and in Section 4 we present an ILP formulation for the MCSP based on this transformation.Finally, in Section 5 we present the computational results of the formulation, by providing some perspectives on future exact solution approaches to the problem.

The One Row Restriction of the MCSP as a Network Optimization Problem
In this section, we consider a restriction of the MCSP to the case in which the matrix A is a row vector.By following an approach similar to [15,20], we transform this restriction into a combinatorial optimization problem consisting of finding a shortest path in a particular weighted directed network.The insights provided by this transformation will prove useful both to extend the transformation to general matrices and to develop an ILP formulation for the general case of the MCSP.Before proceeding, we first introduce some notation and definitions similar to those already used in [2,9,20] and that will prove useful throughout the article.
Let q be a positive integer.We define a partition of q as a possible way of writing it as a sum of positive integers, i.e., for some r, β j ∈ Z + (see [21]).For example, if we consider q = 4 and we set r = 3 and β 1 = β 2 = 1 and β 3 = 2 then a possible partition of 4 is 1 + 1 + 2. We call Equation ( 4) the extended form of this particular partition of q.Interestingly, we also observe that an alternative way to partition q consists in writing it as a nonnegative integer linear combination of nonnegative integers, i.e., for some s, β j , γ j ∈ Z + .For example, if we set s = 2, , then an alternative way to encode the considered partition is (2 • 1) + (1 • 2).We call Equation ( 5) the compact form of this particular partition of q.The definition of a partition of a positive integer proves useful to investigate the one row restriction of the MCSP.In particular, it is worth noting that any decomposition of a row vector A into a sum of positive integers implicitly implies partitioning the entries of A. Then, a possible approach to solve the considered restriction consists in finding from among all of the possible partitions of each entry of A the ones that, appropriately combined, lead to the searched optimal decomposition.To this end, we denote as the set {1, . . ., K}, for some positive integer K, and we associate to the row vector A a particular weighted directed network D = (V, A), called partition network, built as follows.Given an entry a j of A and a generic partition of a j , let q w be the number of terms equal to w in the extended form of the considered partition or, equivalently, the coefficient associated to the term w in its compact form.Let V be the set of vertices including a source vertex s, a sink vertex t and a vertex for each partition of each entry of A: Let A be the set of arcs of D defined as follows: ∪ {((j; q 1 , . . ., q H ), (j + 1; q 1 , . . ., q H )) : (j; q 1 , . . ., q H ), (j + 1; q 1 , . . ., q H ) ∈ V } (10) ∪{((n; q 1 , . . ., q H ), t) : (n; q 1 , . . ., q H ) ∈ V } (11) Note that the construction of the partition network D is such that a decomposition of A corresponds to a path from the source to the sink in D. In particular, traversing a vertex (j, q 1 , . . ., q H ) means that in the decomposition there are exactly q w segments with a coefficient equal to w and a 1 in position j, for all w ∈ [H].Traversing an arc ((j; q 1 , . . ., q H ), (j + 1; q 1 , . . ., q H )) in the network means that in the corresponding decomposition of A there are exactly q w segments with a coefficient equal to w and a 1 in position j and exactly q w segments with a coefficient equal to w and a 1 in position j +1, for all w ∈ [H].
As an example, Figure 2 shows the network corresponding to the row vector It is worth noting that, as we want to minimize the number of segments in the decomposition, whenever we cross arc ((j; q 1 , . . ., q H ), (j + 1; q 1 , . . ., q H )) we use max{0, q w − q w } segments with a coefficient equal to w, a 1 in position j and a zero in position j + 1; min{q w , q w } segments with a coefficient equal to w and a 1 in both positions j and j + 1 and finally max{0, q w − q w } segments with a coefficient equal to w, a zero in position j and a 1 in position j + 1.Hence, passing from the vertex (j; q 1 , . . ., q H ) to the vertex (j + 1; q 1 , . . ., q H ) implies to add max{0, q w − q w } new segments with a coefficient equal to w.In the light of these observations, consider a length function : A −→ Z + defined as follows: if α = (s, (1; q 1 , . . ., q H )), (1; q 1 , . . ., q H ) ∈ V H w=1 max{0, q w − q w } if α = ((j; q 1 , . . ., q H ), (j + 1; q 1 , . . ., q H )) ((j; q 1 , . . ., q H ), (j + 1; q 1 , . . ., q H )) Then, finding a decomposition of the row vector A into the minimum number of segments is equivalent to compute the length of any shortest s − t path, denoted as p st , in the corresponding weighted directed network D. Note that, due to the particular structure of the partition network D, such a problem is fixed-parameter tractable in H [2,20,22].Once computed p st , we can build the segments and the corresponding coefficients involved in the decomposition by iteratively applying the following approach.For all integers and r such that < r, let [ , r) = { , + 1, . . ., r − 1}.Set r := min{j > : q w = 0 in (j ; q 1 , . . ., q H ) ∈ p st }.
and build a segment S having the generic entry s j equal to 1 if j ∈ [ , r) and 0 otherwise.Associate to S a coefficient equal to w. Remove one unit in component q w from each vertex of the path that corresponds to an entry j ∈ [ , r) and iterate the procedure.As an example, the application of this procedure to the row vector Equation ( 12) leads to the following optimal solution: A := 2 1 1 1 0 + 0 1 0 0 + 3 0 0 0 1 In the next section, we show how to generalize this transformation to generic intensity matrices.This transformation will prove useful to develop an ILP formulation for the general version of the problem.

The MCSP as a Network Optimization Problem
Let A be a generic m × n nonnegative integer matrix.For each row i ∈ [m] we build a partition network D i = (V i , A i ) in a way similar to the procedure described in Section 2. Specifically, for each row i ∈ [m], the set V i includes a source vertex s i and a vertex for each partition of each entry of the i-th row of A: ∪{(i, j; q 1 , . . ., q H ) : j ∈ [n], (q 1 , . . ., q H ) ∈ Z H + : Similarly, the set A i includes an arc for each pair of vertices corresponding to consecutive entries on the same row: ∪ {((i, j; q 1 , . . ., q H ) , (i, j + 1; q 1 , . . ., q H )) : (i, j; q 1 , . . ., q H ) , (i, j + 1; q 1 , . . ., q Consider the combined partition network D = (V, A) having as vertexset and as arcset D contains a vertex corresponding to each possible partition of each entry a ij of A. Each of these vertices is denoted by (i, j; q 1 , . . ., q H ), where (i, j) denotes the position of a ij in A, and q w denotes the number of terms equal to w in the extended form of the corresponding partition of a ij or, equivalently, the coefficient of w in its compact form, meaning a ij = H w=1 q w • w.As an example, Figure 3 shows the combined partition network corresponding to the intensity matrix Figure 3.The combined partition network for matrix Equation (23).The dotted arcs represent a flow in the network that encodes a possible decomposition of A.
It is worth noting that, as for the one row restriction, a decomposition of the i-th row of A corresponds by construction to a path in the network D i .In particular, traversing a vertex (i, j; q 1 , . . ., q H ) means that in the decomposition, there are exactly q w matrices with a coefficient equal to w and a 1 in position (i, j), for all w ∈ [H].
In particular, we use the convention to associate the length vector (0, . . ., 0) to both the arcs leaving the source vertex s and the arcs entering the sink vertex t.For all the remaining arcs, the w-th entry of the length vector, denoted as w (α), represents the coefficient of w in the compact form of the corresponding partition of a i,j+1 = H w=1 q w • w minus the coefficient of w in the compact form of the corresponding partition of a ij = H w=1 q w • w, i.e., w (α) := max {0, q w − q w }.As an example, Figure 4 provides the length vector of (certain arcs of) the partition network corresponding to the first row of matrix Equation (23), namely (2 3).It is easy to realize that any integral s − t flow of value m in the combined partition network D associated to a given intensity matrix A, i.e., a s − t flow in D composed by a family of s i − t flows, each lying in the subnetwork D i , corresponds to a decomposition of the intensity matrix A. Any such flow is a combination of m paths p 1 , . . ., p m such that each p i is a s − t path in the network D i , for all i ∈ [m].The length vector associated to this s − t flow is Note that the w-th component of vector Equation (25) provides the overall number of segments with a coefficient w in the decomposition of A. Hence, solving the MCSP is equivalent to find a s − t flow of value m (that is, paths p 1 , . . ., p m ) such that the sum of all components of Equation ( 25) is minimized, or equivalently over all s − t paths p 1 , . . ., p m in D 1 , . . ., D m .By referring to matrix Equation ( 23) and to the corresponding combined partition network shown in Figure 3, the length vector of the top-most dotted path p 1 equals (2, 0, 0) + (0, 1, 0) = (2, 1, 0) and the length vector of the bottom-most dotted path p 2 equals (1, 1, 0) + (0, 0, 0) = (1, 1, 0).The length vector of the whole flow equals (2, 1, 0).Given paths p 1 and p 2 , we can build the corresponding decomposition of the matrix A in the following way.For each component w, consider a nonnegative integer m × n matrix A w having the generic entry (i, j) equal to the component q w in vertex (i, j; q 1 , • • • , q H ) belonging to p i .Then, the matrix A can be decomposed as As an example, by referring to the matrix Equation ( 23) and to the corresponding combined partition network shown in Figure 3, this procedure leads to the following decomposition of A: It is worth noting that the matrices A w in general are not segments.However, it is possible to prove that each matrix A w can be in turn decomposed into a sum of segments, in such a way that the resulting decomposition is optimal for the MCSP.To this end, we introduce the following auxiliary problem: The Beam-On Time Problem (BOTP).Given a nonnegative integer matrix A, find a decomposition A = K t=1 u t S t such that u t ∈ Z + , S t is a segment, for all t ∈ [K], and K t=1 u t is minimum.
The BOTP is polynomially solvable via the algorithms described in [1,20] and differs from the MCSP for the fact that it searches from among all the possible decompositions of the matrix A the one for which the sum of the coefficients is minimal.
Given a decomposition of A as in Equation ( 27), let us decompose A w , for each w ∈ [H], into a nonnegative integer linear combination of segments minimizing the BOTP.Then, Equation ( 27) can be rewritten as where K = H w=1 |{t : u t = w}|.The minimality of K in this new decomposition (i.e., the optimality of Equation (29) for the MCSP) is then proved by the following proposition: Proposition 1.Let K be the optimal value to the MCSP.Then, the following equality holds: where BOT (A w ) denotes the minimum sum of K t=1 u t in the decomposition of A w .
Proof.Let β w denote |{t : u t = w}|, i.e., the sum of the coefficients in the decomposition of A w = t:ut=w S t , for all w ∈ [H].Then, it holds that β w ≥ BOT (A w ), for all w ∈ [H].Hence, we have that To prove that in Equation (30) the strict equality holds, assume by contradiction that for some ŵ ∈ [H], β ŵ > BOT (A ŵ).Then, it is possible to obtain a smaller value of β ŵ by replacing t:ut= ŵ S t with the decomposition provided by the optimal solution of the BOTP with input A ŵ.However, β ŵ is defined as the optimal solution to the BOTP with input A ŵ, hence this would contradict its optimality.Thus, the statement follows.
As an example, by using the decomposition algorithm for the BOTP described in [1,20] on the matrices A w in Equation (28), we obtain the following decomposition for Equation ( 23) which is optimal due to Proposition 1.The transformation of the MCSP into a network optimization problem and the result provided by Proposition 1 constitute the fundation for the ILP formulation that will be presented in the next section.

An Integer Linear Programming Formulation for the MCSP
In this section, we develop an exponential-sized integer linear programming formulation for the MCSP.Unless not stated otherwise, throughout this section we will assume that A is a generic m × n nonnegative integer matrix.
Let D = (V, A) be the combined partition network associated to A and let P i be the set of all of the s − t paths whose internal vertices belong to the subnetwork D i = (V i , A i ), for all i ∈ [m].We associate to each path p ∈ P i , i ∈ [m], a length vector (p) of dimension H = ||A|| max = max{a ij } obtained by summing over the lengths of the arcs belonging to p: where { (α)} are the length vectors defined in Section 3. Let λ i p be a decision variable equal to 1 if path p ∈ P i , i ∈ [m], is considered in the optimal solution to the problem and 0 otherwise.Finally, let d w be an integer variable equal to the number of segments with a coefficient w needed to decompose the matrix A. Then, a possible integer linear programming formulation for the MCSP is: The objective function Equation (33a) accounts for the number of segments involved in the decomposition of the matrix A. Constraints Equation (33b) impose that for each row i ∈ [m], the number of segments with coefficient w has to be greater than or equal to max i∈[m] { α∈p i w (α)}, which corresponds to the w-th component of the length vector associated to the s − t flow in D. Constraints Equation (33c) impose to choose exactly one path in each partition network D i .Finally, constraints Equations (33d) and (33e) impose the integrality constraint on the considered variables.The validity of Formulation 1 is guaranteed by the transformation described in Section 3. It is worth noting that we need not to impose the integrality constraint on variables d w .In fact, it is easy to see that in an optimal solution to IMP, the integrality of variables λ i p together with Equation (33b) suffice to guarantee the integrality of variables d w .Formulation 1 includes an exponential number of (path) variables and a number of constraints that grows pseudo-polynomially in function of H.A possible approach to solve it consists of using column generation and branch-and-price techniques [23].
In order to study the pricing oracle, consider the following formulation: where P i is a strict subset of P i , i ∈ [m], and let π i,w and µ i be the dual variables associated to constraints Equations (33b) and (33c), respectively.Then, the dual problem associated to the IMP is: A variable with negative reduced cost in the RLPM corresponds to a dual constraint violated by the current dual solution.As variables d w are always present in the RLPM, constraints Equation (35c) will never be violated.To check the existence of violated constraints, Equation (35b) means searching for the existence of a row î ∈ [m] and a path p ∈ P î such that Let π * i,w and µ * i be the values of variables π i,w and µ i in the current dual solution.Consider a new length function * : and let D * î be the partition network D î with lengths provided by Equation (37).By definition, the length Then, determining the existence of a path violating Equation (35b) implies to check whether the shortest s − t path in D * î has an overall length shorter than µ * i , i.e., to check if it holds that Since all length values are nonnegative, this task can be performed by using a standard implementation of Dijkstra's algorithm [22].

Computational Experiments
In this section, we analyze the performance of Formulation 1 in solving instances of the MCSP.Our experiments were motivated by a twofold goal: to compare the lower bounds provided by the linear relaxation of Formulation 1 vs. the lower bounds provided by the linear relaxations of the ILP formulations described in [17][18][19]; and to measure the runtime performances of Formulation 1.We emphasize that our experiments neither attempt to investigate the use of Formulation 1 in IMRT nor to compare Formulation 1 to other algorithms that use an objective function that is different from the one used in the MCSP.The reader interested in a systematic discussion about such issues is referred to [7,10].
In order to measure the quality of the lower bounds provided by the linear relaxation of Formulation 1, we considered the instances of the MCSP provided in Chapter 2 of L. R. Mason's Ph.D. thesis [10].One of the advantage of considering Mason's instances derives from the fact that the author first compared the linear relaxations of the ILP formulations described in [17][18][19], by creating benchmarks that can be used for further comparisons.The author considered a dataset constituted by 25 squared matrices having an order ranging in {6, . . ., 10}.Fixed an order, the dataset provides five random instances of the MCSP having H = ||A|| max ≤ 15 if the order is smaller than or equal to eight, and H = ||A|| max = 10 otherwise.We refer the interested reader to Chapter 2 of [10] for further information concerning the performances of the considered ILP formulations and to the appendix of the same work to download the instances used in this work.

Implementing Formulation 1
One of the main difficulties in designing a solution approach for Formulation 1 consists of devising effective branching rules when the binary variables are priced out at run time.In particular, a typical problem is that there may be no easy way to forbid that a variable that was fixed at 0 by branching could still be a feasible solution for the pricing problem.However, if the value of H is relatively small, it is possible to overcome this issue by branching on "arc variables" rather than on "path variables".Specifically, let us add in Formulation 1 a new decision (arc) variable x α equal to 1 if arc α ∈ A is used and 0 otherwise.Then, an alternative formulation for the MCSP is: x α ∈ {0, 1} ∀α ∈ A (40g) In particular, Formulation 4 has the same constraints that Formulation 1 plus constraints Equation (40d) that impose that only selected arcs can be used in a path p ∈ P i , i ∈ [m] and constraints Equation (40e) that impose that in each path in D i is made up of exactly n − 1 arcs.Proposition 2. In any feasible solution to Formulation 4, the integrality of variables x α together with constraints Equations (40c) and (40d) suffice to guarantee the integrality of variables λ i p .
Proof.Assume by contradiction that there exists a feasible solution to the PA-IMP in which there are both a row ĩ ∈ [m] and a path p ĩ such that 0 < λ ĩ p ĩ < 1.Then, due to constraint Equation (40c), there also exists at least another nonzero fractional path variable encoding an alternative path for row ĩ.These two paths must differ for at least one arc.We denote α 1 as the arc used by path λ ĩ p ĩ and α 2 as the arc used by the other fractional variable.Because of constraint Equations (40d) and (40g), this fact implies that both variables x α 1 and x α 2 are equal to one, which implies that constraint Equation (40e) must be violated for ĩ.This, in turn, contradicts the feasibility of the solution.
An immediate consequence of the above proposition is that Proposition 3. The integrality constraint on variables λ i p in Formulation 4 can be relaxed.
It is worth noting that the number of arcs in the combined partition network associated with the matrix A grows exponentially in function of H. Hence, we stress that the introduction of the arc variables is practical only for small values of H.
As for Formulation 1, the linear programming relaxation of Formulation 4, denoted as PA-RLMP, can be solved via column generation techniques [23].To this end, we define the following dual variables associated to constraints Equations (40b)-(40e), namely π i,w , for all i ∈ [m] and w ∈ [H]; µ i , for all i ∈ [m]; η iα , for all α ∈ [A i ], i ∈ [m]; and ω i , for all i ∈ [m].The dual of the PA-RLMP has the following constraints: As variables d w are always present in the RLPM, constraints Equation (42) will never be violated in the current dual solution.Assuming that |A| arc variables are always present in the PA-RLMP, constraints Equation (43) will never be violated.To check the existence of violated constraints Equation (41) means searching for the existence of a row î ∈ [m] and a path p ∈ P î such that It is easy to see that, by denoting π * i,w , µ * i and η * iα as the values of variables π i,w , µ i and η iα in the current dual solution and by using argumentations similar to those used in Section 3, determining the existence of violated constraints Equation ( 41 Once again, this task can be performed by using a standard implementation of Dijkstra's algorithm [22].

Implementation Details
We implemented Formulation 4 in FICO Xpress Mosel, version 3.8.0,Optimizer libraries v27.01.02 (64-bit, Hyper capacity).We run the algorithm on an IMac Core i7, 3.50 GHz, equipped with 16 GByte RAM and operating system Mac Os X, Darwin version 10.10.3, gcc version 4.2.1 (LLVM version 6.1.0,clang-602.0.53).We used the default settings for Fico Xpress Optimizer and we set 1 h as maximum runtime for each instance of the problem as in [9,10].The source code of the algorithm can be downloaded from [24].

Primal Bound
We constructed the primal bound of the problem by solving the one row restriction of the MCSP for each row of the matrix A. This task can be performed in polynomial time by using the algorithm described in Section 2. We computed the length of the corresponding s − t flow by using Equation (25) and set the value of this solution equal to the sum of the corresponding components.

Setting Initial Columns
We set the initial columns of the RLPM by choosing at random a path in D i , for each i ∈ [m].We solved the linear programming relaxation at each node of the search tree by implementing the pricing problem described in the previous section.In particular, we used a strategy similar to the one described in [25], i.e., we added to Pi , i ∈ [m], all of the paths violating Equation (41).

Branching Rules
We explored the search tree by branching on the arc variables x α .In particular, we first constructed a s − t flow in D. Subsequently, we used a depth-first approach in which we backtracked first on the arcs belonging to the partition network D m ; if the backtracking returns to s m then we backtracked on the arcs belonging to the partition network D m−1 and subsequently we returned to those belonging to D m .We recursively iterated this approach for all of the partition networks in D.

Performance Analysis
Table 1 shows the comparison between the linear relaxation of Formulation 4 and the linear relaxations of the formulations described in [17][18][19] when considering Mason's instances.Some of the values presented in the table refer to the computational tests performed in Mason's Ph.D. thesis (namely those reported in Tables 2.7 ad 2.8 of [10]).Specifically, the first column provides the name of the instances analyzed.The second column provides the corresponding optimal values.The third column provides the linear programming relaxations of the formulations described in [17][18][19], denoted as M C , M W , and M M , respectively.The fourth column provides the values of the gap of formulations M C , M W , and M M expressed in percentage, i.e., the difference between the optimal value to a given instance of the MCSP and the objective function value of the linear programming relaxation at the root node of the respective search tree, divided by the optimal value.The fifth column provides the linear programming relaxations of the formulations M C and M M when considering the constraints (2.16) and (2.45) in Mason's Ph.D. thesis (denoted as M + C and M + M , respectively) and the sixth column provides the corresponding gaps.The seventh and the eighth columns provide the lower bounds and the gaps of the shortest path formulation [15] as modified in [10], hereafter denoted as "SR".This formulation consider a relaxation of the MCSP obtained when partitioning the MCSP into a set of independent sub-problems, one for each row of the matrix A. Finally, the ninth column provides the linear programming relaxations of Formulation 4, the tenth column provides the corresponding gaps, and the eleventh column provides the gaps when ceiling the values of the corresponding linear relaxation of Formulation 4.
A drawback of current implementation of Formulation 4 is represented by the solution times which usually are longer than those described in [9,10].In particular, computational experiments have shown that the branch-and-price approach for Formulation 4 is unable to solve any of the considered instances within the time limit.This fact appears to be due to a number of technical details related to the implementation of Formulation 4, included the choice of the primal bound, the branching rules, the poor current ability to handle the combined partition network and the use of an interpreted language such as Mosel.In order to obtain better insights about the computational limits of current implementation, we considered 36 supplementary datasets of the MCSP, each containing 10 random instances numbered from 0 to 9 and characterized by having a fixed number of rows and columns for the matrix A. Specifically, we considered a number of rows m ranging in {4, . . ., 9} and a number of columns n ranging in {5, . . ., 10}.Fixed the values of m and n, a generic instance in a dataset is generated by creating a m × n nonnegative integer matrix A whose generic entries are random integers in the discrete interval {1, . . ., H}, where H is randomly generated in the set {3, . . ., 7}.A generic dataset is identified by the dimension of the intensity matrices encoded in its instances.For example, dataset 4 × 5 includes instances of the MCSP encoding intensity matrices having dimension 4 × 5. We used the Mersenne Twister libraries [26,27] as pseudorandom integer generator and we obtained, at the end of the generation process, 360 instances of the MCSP downloadable from [24].Table 2 shows the computational performances obtained on the datasets so generated.As a general trend, we have observed that the performances of the branch-and-price approach for Formulation 4 decrease in function of both the size of the input matrix and the value of H, by reaching a maximum of 10 solved instances within 1h computing time for matrices having dimension 4 × 5 and a minimum of 3 solved instances for matrices having dimension 9 × 10.In current implementation, high values for H directly influences the size of each partition network as the number of edges in A partitions grow pseudo-polynomially in function of H. Investigating possible approaches to overcome current limitations and improve the overall performances warrants additional research effort.

Conclusions
In this article, we investigated the Minimum Cardinality Segmentation Problem (MCSP), a N P-hard combinatorial optimization problem arising in intensity-modulated radiation therapy.The problem consists in decomposing a given nonnegative integer matrix into a nonnegative integer linear combination of a minimum cardinality set of binary matrices satisfying the consecutive ones property.We showed how to transform the MCSP into a combinatorial optimization problem on a weighted directed network, and we exploited this result to develop an exponential-sized integer linear programming formulation to exactly solve it.Computational experiments showed that the lower bounds obtained by the linear relaxation of the considered formulation improve upon those currently described in the literature.However, current solution times are still far from competing with the current state-of-the-art approach to solution of the MCSP.Investigating strategies to overcome current limitations warrant additional research effort.

Figure 2 .
Figure2.The partition network corresponding to the row (2 ; 3 , 2 , 3) with some lengths on arcs.The first two vertices on the right of s correspond to the two possible partitions of 2 (i.e., 2 = 1 + 1 and 2 = 2).The subsequent three vertices on the right correspond to the three possible partitions of 3 (i.e., 3 = 1 + 1 + 1, 3 = 1 + 2, 3 = 3), and so on.Note that the entries of two subsequent columns, say j and j + 1, in the row vector A induce a bipartite subnetwork in D. The dotted arcs represent the shortest s − t path in D.

Formulation 4 .
-Path-Arc Integer Master Problem (PA-IMP) Formulation 2. -Restricted Linear Program Master (RLPM) p∈ Pi ) is equivalent to check whether the shortest s − t path in D * î has an overall length shorter than µ * i − α∈p η * îα , i.e., to check if it holds that