Maximum Disjoint Paths on Edge-Colored Graphs: Approximability and Tractability

Abstract: The problem of finding the maximum number of vertex-disjoint uni-color paths in an edge-colored graph has recently been introduced in the literature, motivated by applications in social network analysis. In this paper we investigate the approximation and parameterized complexity of the problem. First, we show that, for any constant ε > 0, the problem is not approximable within factor c^{1−ε}, where c is the number of colors, and that the corresponding decision problem is W[1]-hard when parameterized by the number of disjoint paths. Then, we present a fixed-parameter algorithm for the problem parameterized by the number and the length of the disjoint paths.


Introduction
Social networks are usually represented and studied as graphs. Vertices represent the elements analyzed (e.g., individuals), while edges represent a binary relation between the considered elements. Among the different properties considered to study such graphs, one of the most relevant is the vertex connectivity of two given vertices. Vertex connectivity is a measure of the information flowing from one vertex to the other, and it has many applications. For example, it is used for the identification of important structural properties of a social network, like group cohesiveness and centrality [1,2]. A classical result of graph theory, known as Menger's theorem, states that vertex connectivity is equal to the maximum number of disjoint paths between two given vertices.
While much attention has been devoted to networks that represent a single type of relation, a natural extension recently introduced in the literature [3] is to consider multi-relational social networks, that is, social networks where more than one kind of relation between elements of the network is considered. In order to investigate vertex connectivity in multi-relational social networks, the combinatorial problem known as Maximum Colored Disjoint Paths (MAX CDP) has been introduced in [3]. MAX CDP asks for the maximum number of vertex-disjoint uni-color paths in an edge-colored graph, where the different edge colors represent different kinds of relation.
The computational and approximation complexity of MAX CDP has been investigated in [3]. When the input graph contains exactly one color, MAX CDP is polynomial-time solvable (it can be reduced to the maximum flow problem), while it has been shown to be NP-hard in general, when the edges of the graph may have different colors. Moreover, MAX CDP is shown to be approximable within factor c, where c is the number of colors of the edges of the input graph, but not approximable within factor 2 − ε, for any ε > 0, even when c is a fixed constant.
In [3], a variant of the problem, denoted as ℓ-LCDP, is also investigated, where the length of the paths in the solution is (upper) bounded by an integer ℓ ≥ 1. The ℓ-LCDP problem is NP-hard for ℓ ≥ 4, while it admits a polynomial-time algorithm when ℓ ≤ 3. This variant of the problem can be approximated in polynomial time within factor (ℓ − 1)/2 + ε.
In this paper we investigate the approximation and parameterized complexity of MAX CDP and ℓ-LCDP. First, we show in Section 3 that MAX CDP is not approximable within factor c^{1−ε}, for any constant ε > 0, and that the corresponding decision problem (CDP) is W[1]-hard when parameterized by the number p of disjoint uni-color paths. Then, in Section 4, we give a fixed-parameter algorithm for ℓ-LCDP, when ℓ and the number of disjoint uni-color paths are considered as parameters. Table 1 summarizes the results known about the complexities of these problems along with the new results presented in this work.

Definitions
In this section we give some preliminary definitions that will be useful in the rest of the paper. First, we consider only undirected graphs. Consider a set of colors C = {1, …, c}; throughout the paper, c denotes the cardinality of C. A C-edge-colored graph (or simply an edge-colored graph when the set of colors is clear from the context) is defined as G = (V, E), where V denotes the set of vertices of G and E = {E_1, …, E_c} denotes a collection of edge sets, where the set E_i, with i ∈ C, represents the set of edges colored with color i. Notice that, for a given pair of vertices v_i, v_j, there may exist more than one edge between v_i and v_j (each of these edges is associated with a distinct color of C).
A path π in G is called a uni-color path if all the edges of π have the same color, that is, they belong to the same set E_i (for some i ∈ C). Given two vertices x, y ∈ V, an xy-path is a path between vertices x and y. Two paths π and π′ are internally disjoint (or, simply, disjoint) if they do not share any internal vertex, while a set of paths is internally disjoint if its paths are pairwise internally disjoint.
Next, we introduce the formal definitions of the problems we deal with in this paper, namely the optimization problem MAX CDP, the decision problem (CDP) naturally associated with MAX CDP, and the corresponding length-bounded variants ℓ-LCDP and ℓ-LCDP_p.

Problem 1. MAXIMUM COLORED DISJOINT PATHS (MAX CDP).
Input: a set C of colors, a C-edge-colored graph G = (V, E), and two vertices s, t ∈ V.
Output: the maximum number of disjoint uni-color st-paths.

Problem 2. COLORED DISJOINT PATHS (CDP).
Input: a set C of colors, a C-edge-colored graph G = (V, E), a non-negative integer p, and two vertices s, t ∈ V.
Output: Do there exist at least p disjoint uni-color st-paths in G?

The ℓ-LENGTH COLORED DISJOINT PATHS (ℓ-LCDP) problem is a variant of MAX CDP where the length of the paths in the solution is bounded by an integer ℓ ≥ 1. The ℓ-LCDP_p problem is the decision version of ℓ-LCDP, which asks if there exists a solution of ℓ-LCDP with cardinality at least p.
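As an illustration of these definitions, the following Python sketch checks whether a given collection of paths is a feasible solution. The representation (a dictionary `edges_by_color` mapping each color to a set of two-element frozensets) and all function names are assumptions made for this example; they are not notation from the paper.

```python
from itertools import combinations

def path_color(edges_by_color, path):
    """Return a color whose edge set contains every edge of `path`, or None."""
    steps = [frozenset(step) for step in zip(path, path[1:])]
    for color, edges in edges_by_color.items():
        if all(step in edges for step in steps):
            return color
    return None

def internally_disjoint(paths):
    """True if no two paths share an internal (non-endpoint) vertex."""
    return all(set(a[1:-1]).isdisjoint(b[1:-1])
               for a, b in combinations(paths, 2))

def is_solution(edges_by_color, s, t, paths, max_len=None):
    """Check that `paths` are simple, uni-color, internally disjoint st-paths.

    If `max_len` is given, each path may have at most `max_len` edges
    (the length-bounded variant).
    """
    for path in paths:
        if path[0] != s or path[-1] != t or len(set(path)) != len(path):
            return False
        if path_color(edges_by_color, path) is None:
            return False
        if max_len is not None and len(path) - 1 > max_len:
            return False
    return internally_disjoint(paths)

# Hypothetical instance with two colors.
example = {
    1: {frozenset(("s", "a")), frozenset(("a", "t"))},
    2: {frozenset(("s", "b")), frozenset(("b", "t")), frozenset(("a", "b"))},
}
```

Here `is_solution(example, "s", "t", [["s", "a", "t"], ["s", "b", "t"]])` accepts the two paths, whereas the path `["s", "a", "b", "t"]` is rejected because its edges do not all belong to one color.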

Approximation and Parameterized Complexity of MAX CDP
In this section, we present a reduction from MAXIMUM INDEPENDENT SET to MAX CDP. Since the reduction preserves the solution cost, it implies that MAX CDP is not approximable within factor c^{1−ε}, for any ε > 0, and that CDP is W[1]-hard when the parameter is the size p of the solution.
Given an undirected graph G_I = (V_I, E_I), the MAXIMUM INDEPENDENT SET (MAX INDSET) problem asks for an independent set I ⊆ V_I of maximum cardinality, i.e., a maximum-cardinality set I such that if v′, v″ ∈ I then {v′, v″} ∉ E_I. In the following, starting from a graph G_I, we construct a gadget (an edge-colored graph) G_C, such that finding an independent set I of cardinality k in G_I is equivalent to finding k disjoint uni-color st-paths in G_C. First, we describe the edge-colored graph G_C associated with a generic graph G_I, then we prove some properties of the computed gadget.
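Description of the gadget. Let G_I = (V_I, E_I) be an undirected graph, with V_I = {v_1, …, v_n} and E_I = {e_1, …, e_m}. Without loss of generality, we assume that G_I is connected, since a maximum independent set of a non-connected graph is the union of the maximum independent sets of its connected components. Let Π_{E_I} be an ordered list of the edges of G_I, based on some ordering. We construct an edge-colored graph G_C = (V_C, E_1, …, E_n) associated with G_I as follows. Informally, the vertex set V_C is composed of two distinguished vertices s and t and a vertex for each edge of G_I, while each set E_i, 1 ≤ i ≤ c, is obtained by connecting the vertices associated with the edges of G_I incident to v_i in the same order as they appear in Π_{E_I}. Formally, the set of colors is C = {1, …, n}, where color i is associated with vertex v_i ∈ V_I (hence c = n). The vertex set V_C consists of s, t, and one vertex u_{i,j} for each edge {v_i, v_j} ∈ E_I. For each color i, the set E_i consists of the edges of the path that starts at s, visits the vertices associated with the edges incident to v_i in the order given by Π_{E_I}, and ends at t.

Properties of the gadget. First, we introduce the following property of the gadget.

Remark 1. For each color i, with 1 ≤ i ≤ c, the edges of E_i induce a uni-color st-path that contains every vertex of G_C associated with an edge of G_I incident to v_i.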
Proof. The proof follows by construction, since the edges of color i, with 1 ≤ i ≤ c, induce an st-path that contains each vertex u_{i,x} of G_C associated with an edge incident to v_i ∈ V_I, ordered as in list Π_{E_I}.
Next, we prove the two main results of the reduction from MAX INDSET to MAX CDP. The first lemma is easily proved by showing that the uni-color st-paths associated with the vertices of the independent set I are pairwise disjoint. Conversely, the second lemma can be proved by showing that the vertices of G_I associated with the k uni-color st-paths of G_C form an independent set for G_I.
Proof of Lemma 2. By construction, in G_C there exists a uni-color st-path associated with each vertex v of the original graph G_I. We will show that the paths in the set P of paths of G_C associated with the vertices v ∈ I are internally disjoint. Let π_i and π_j be two paths of P associated with vertices v_i and v_j, respectively, of I. Notice that the two paths π_i and π_j connect the vertices which represent the edges of G_I incident to v_i and v_j, respectively. Since I is an independent set in G_I, no edge e ∈ E_I is incident to both v_i and v_j (i.e., {v_i, v_j} ∉ E_I, so G_C contains no vertex u_{i,j}), thus π_i and π_j are (internally) disjoint.
Proof of Lemma 3. Let P be the set of k disjoint uni-color st-paths of G_C. Since each color is bijectively associated with a single path in G_C which, in turn, is bijectively associated with a single vertex of G_I, we can define a set I ⊆ V_I that consists of the vertices of G_I associated with a path of P. Clearly, |I| = |P| = k. We claim that I is an independent vertex set for G_I. Suppose that I is not an independent set, thus there exist two vertices v_i, v_j ∈ I such that {v_i, v_j} ∈ E_I. Let u_{i,j} be the vertex of G_C representing edge {v_i, v_j}. Since v_i, v_j ∈ I, there exist two paths π_i, π_j in P associated with v_i and v_j. By Remark 1, both paths must contain vertex u_{i,j} as an internal vertex, since edge {v_i, v_j} is incident to both v_i and v_j. Hence paths π_i and π_j are not internally disjoint, which contradicts the assumption that the paths of P are disjoint, and thus I is an independent set for G_I.
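The correspondence established by the two lemmas can be checked by brute force on a small instance. The sketch below builds, for each vertex i of G_I, the st-path of color i as described informally above; the function names, the vertex encoding `("u", e)`, and the 4-cycle example are assumptions introduced for illustration only.

```python
from itertools import combinations

def build_gadget(n, edges):
    """Return {color i: st-path of color i} for the gadget built from G_I.

    G_I has vertices 0..n-1; `edges` is a list of pairs whose order plays the
    role of the edge ordering. The gadget vertex for edge e is ("u", e).
    """
    return {
        i: ["s"] + [("u", e) for e in edges if i in e] + ["t"]
        for i in range(n)
    }

def internally_disjoint(p, q):
    """True if the two paths share no internal vertex."""
    return set(p[1:-1]).isdisjoint(q[1:-1])

def is_independent(vertices, edges):
    """True if no two of the given vertices are joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) not in edge_set
               for pair in combinations(vertices, 2))

def largest(n, ok):
    """Size of the largest subset of range(n) accepted by `ok` (brute force)."""
    return max(k for k in range(n + 1)
               for subset in combinations(range(n), k) if ok(subset))

# Hypothetical example: the 4-cycle 0-1-2-3-0.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
paths = build_gadget(4, cycle_edges)

max_independent = largest(4, lambda sub: is_independent(sub, cycle_edges))
max_disjoint = largest(4, lambda sub: all(
    internally_disjoint(paths[i], paths[j])
    for i, j in combinations(sub, 2)))
```

Because each color induces exactly one st-path in the gadget, choosing disjoint uni-color paths amounts to choosing colors, so the two maxima computed above coincide on this example.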
Consequences. Lemmas 2 and 3 prove the existence of an L-reduction [4] from MAX INDSET to MAX CDP, which yields the inapproximability bound of Theorem 4. This result greatly improves the previous inapproximability factor 2 − ε for MAX CDP [3] and, given the c-approximation algorithm presented in [3], it is the asymptotically optimal inapproximability ratio for MAX CDP. However, notice that the inapproximability factor 2 − ε for MAX CDP given in [3] holds even if c is a fixed constant, while in our reduction c is not fixed.
From the parameterized complexity point of view, the reduction also implies the W[1]-hardness of the decision problem CDP, as stated in the following theorem.
Theorem 5. CDP is W[1]-hard when parameterized by the number p of disjoint uni-color st-paths.
Proof. The reduction presented by Lemmas 2 and 3 is also a parameterized reduction [6] from INDEPENDENT SET (the decision problem naturally associated with MAX INDSET) to CDP (indeed, the size of an independent set of G_I is identical to the number of disjoint uni-color st-paths in G_C). Since INDEPENDENT SET is W[1]-hard when the parameter is the size of the required independent set [7], CDP is also W[1]-hard when parameterized by the number p of disjoint uni-color st-paths.

A Fixed-Parameter Algorithm for ℓ-LCDP_p
In this section, we study ℓ-LCDP_p, the length-bounded (decision) version of MAX CDP, which asks if there exist p uni-color disjoint st-paths of length at most ℓ. We show that ℓ-LCDP_p is fixed-parameter tractable when the parameters are ℓ and p by presenting a parameterized algorithm based on the color coding technique [8]. For an introduction to parameterized complexity see [6]. Notice that ℓ-LCDP_p is unlikely to admit fixed-parameter tractable algorithms when parameterized only by p or only by ℓ. Indeed, in the latter case, ℓ-LCDP_p is already NP-hard when ℓ = 4 [3]. In the former case, we have proved in the previous section that CDP (hence ℓ-LCDP_p, when ℓ = n) is W[1]-hard when parameterized by p.
Color coding is a technique initially introduced to design fixed-parameter algorithms for various restrictions of the subgraph isomorphism problem. It then gained popularity and has been successfully applied to tackle the computational hardness of various problems on networks and graphs [9][10][11], on strings [12,13], and problems of subset selection [14,15]. The basic idea of the color coding technique applied to graph problems is, first, to "color" the vertices of the graph from a set of k colors (for an appropriate choice of the number k of colors), and, then, to find a solution of the given problem with the additional constraint that the vertices of the solution are colored with distinct colors (called a "colorful" or "color coded" solution), if such a solution exists. The process is repeated with a different coloring if a colorful solution is not found.
The key theoretical result, which makes it possible to obtain deterministic algorithms based on the color coding technique, is the deterministic construction of k-perfect families of hash functions. A family F of hash functions from a set U (the vertex set in the traditional applications of color coding) to the set {1, …, k} of colors is k-perfect if, for each subset U′ of U such that |U′| = k, there exists a hash function f in F such that U′ is colorful w.r.t. f, i.e., f assigns a distinct label to each element of U′. In fact, if the given problem has a solution S of size k, then there exists a hash function in F such that solution S is colorful. Hence, it suffices to test if there exists a colorful solution for one of the colorings given by the hash functions of the k-perfect family in order to find a solution of the original problem, if such a solution exists. Crucial to the overall running time is the size of a k-perfect family and the time required to enumerate and evaluate the hash functions of the family. Currently, the best bounds (such as [8,16,17]) are, in general, explicit constructions of families of size 2^{O(k)} log^{O(1)} |U| in time proportional to their size.
The description of the parameterized algorithm for the ℓ-LCDP_p problem is divided into two parts. First, we present a procedure that, given an edge-colored graph G_C and a vertex-coloring function λ, verifies if in G_C there exist p disjoint uni-color st-paths of length at most ℓ with the additional constraint that the inner vertices of the p paths are colored with distinct colors. Then, we show that, by exploiting well-known properties of families of perfect hash functions, the previous procedure can be used to solve the ℓ-LCDP_p problem in fixed-parameter tractable time with respect to p and ℓ. In the following, to avoid ambiguities between vertex colors and edge colors, function λ will be called a vertex-labelling function (or, simply, a labelling function) instead of the traditional term coloring function.
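The definition of a k-perfect family can be made concrete with a brute-force check. The sketch below is only meant to illustrate the definition on a toy universe; the function name, the representation of hash functions as dictionaries, and the example families are assumptions, and the explicit constructions cited above are far more economical than exhaustive testing.

```python
from itertools import combinations

def is_k_perfect(family, universe, k):
    """True if every k-subset of `universe` is colorful under some f in `family`.

    Each hash function f is a dict from elements of `universe` to colors 1..k.
    """
    return all(
        any(len({f[x] for x in subset}) == k for f in family)
        for subset in combinations(universe, k)
    )

universe = ["a", "b", "c"]
f1 = {"a": 1, "b": 2, "c": 1}
f2 = {"a": 1, "b": 1, "c": 2}
```

With k = 2, the single function `f1` is not enough (the subset {a, c} receives one color only), while the family `[f1, f2]` makes every 2-subset colorful.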
A dynamic-programming procedure for the ℒ-labelled ℓ-LCDP_p problem. Let G_C = (V, E) be a C-edge-colored graph with two distinguished vertices s and t, and let λ be a labelling function which maps each vertex v of V \ {s, t} to a label λ(v) belonging to a set ℒ (we assume that λ assigns a distinct label to each vertex of a solution of ℓ-LCDP_p). Let L ⊆ ℒ be a fixed set of labels. A simple path π in G_C is L-labelled if and only if the labels of its vertices (with the exclusion of s and t) are contained in L and are pairwise distinct. A set {π_1, …, π_k} of simple paths is L-labelled if and only if there exists a partition {L_1, …, L_k} of L such that each π_i is L_i-labelled. We say that a path π is g-colored, with g ∈ C, if all of its edges belong to set E_g. The ℒ-labelled ℓ-LCDP_p problem, given G_C and λ : V → ℒ with |ℒ| = (ℓ − 1)p, asks if there exists an ℒ-labelled solution for the ℓ-LCDP_p problem on G_C.

We solve the ℒ-labelled ℓ-LCDP_p problem by combining two dynamic-programming recurrences. The first one, M[L, v, g], tests if, for a set of labels L ⊆ ℒ, there exists an L-labelled g-colored path from vertex s to a vertex v different from t. The second one, P[L], tests if, for a set of labels L ⊆ ℒ such that |L| = (ℓ − 1)q for some integer q ∈ [0, p], there exists a partition {L_1, …, L_q} of L into q subsets such that each set L_i labels a g_i-colored st-path of length at most ℓ.
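The recurrence for M[L, v, g] is defined as follows (where ⊎ represents the disjoint union operator):

$$
M[L, v, g] =
\begin{cases}
\text{true} & \text{if } v = s,\\
\text{false} & \text{if } v \neq s \text{ and } L = \emptyset,\\
\displaystyle\bigvee_{u \,:\, \{u, v\} \in E_g,\; L = L' \uplus \{\lambda(v)\}} M[L', u, g] & \text{otherwise.}
\end{cases}
\tag{4.1}
$$

Correctness of the previous recurrence is proved by the following lemma.
Lemma 6. M[L, v, g] is true if and only if there exists an L-labelled g-colored path from s to v.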
Proof. The proof is by induction on the cardinality of L. If |L| = 0, the base cases apply and a path which does not use any label exists if and only if v = s. Now, assume that M[L, v, g] is correct for any L such that |L| ≤ k (for some k); we prove the correctness of M[L′, v, g] for all L′ such that |L′| = k + 1. Moreover, assume that v ≠ s, since, otherwise, the first base case applies, which is clearly correct. Then, an L′-labelled g-colored path from s to v ≠ s exists if and only if (i) λ(v) ∈ L′ and (ii) there exists an L″-labelled g-colored path π from s to a vertex u such that {u, v} ∈ E_g and L′ = L″ ⊎ {λ(v)}. […]

Clearly, M can be used to test if there exists an L-labelled g-colored st-path, as illustrated by the following corollary.

[…] conditions. Notice that |L″| = |L| − (ℓ − 1) = (ℓ − 1)(k + 1) − (ℓ − 1) = (ℓ − 1)k. Hence, by induction hypothesis, since P[L″] is true, there exists an L″-labelled set S′ of k disjoint uni-color st-paths. The other conditions, as shown in the proof of Corollary 7, test the existence of an L′-labelled g-colored st-path π for some color g ∈ C. Thus, if there exists a bi-partition which satisfies all the conditions, then there exists an L-labelled set S = S′ ∪ {π} of k + 1 uni-color st-paths. Moreover, since L′ and L″ are disjoint, by Property 9, path π and any path of S′ are disjoint. Furthermore, since |L′| = ℓ − 1, by Property 10, the length of π is at most ℓ (in particular, at most ℓ − 1 from s to vertex v, plus 1 from v to t). As a consequence, S is an L-labelled set of p = k + 1 disjoint uni-color st-paths of length at most ℓ.

Now, we prove that if there exists an L-labelled set S of k + 1 disjoint uni-color st-paths, then P[L] is true. For each path π_i ∈ S, let L′_i be the set of labels labelling π_i. Since the paths in S are disjoint, the sets L′_1, …, L′_{k+1} are also disjoint. Moreover, since the length of each path is at most ℓ, we have that |L′_i| ≤ ℓ − 1. Notice that |L| is (ℓ − 1)(k + 1), thus it is possible to find a partition of L into p = k + 1 sets L_1, …, L_{k+1} of cardinality ℓ − 1 such that each L′_i is a subset of L_i. Let us consider a generic path π_i. Since π_i is a uni-color st-path (of length at most ℓ), there exist a vertex v and a color g such that M[L_i, v, g] is true and {v, t} ∈ E_g (Corollary 7). Finally, since L̄_i = L \ L_i is a set of labels of cardinality (ℓ − 1)k and S \ {π_i} is an L̄_i-labelled set of k disjoint uni-color st-paths of length at most ℓ, by induction hypothesis we have that P[L̄_i] is true. Therefore, the bi-partition {L_i, L̄_i} satisfies all the conditions of Equation 4.2 and P[L] is true.
An immediate consequence is that the ℒ-labelled ℓ-LCDP_p problem can be solved in fixed-parameter tractable time with respect to the parameters ℓ and p.
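The two tables can be sketched in Python with memoized recursion. The sketch follows the recurrences as they can be read off the proofs in this section: M is true at s, and otherwise consumes the label of the current vertex, while P splits off a block of ℓ − 1 labels for one path and recurses on the rest. This reading, the function and variable names, the use of frozensets for label sets, and the two restrictions ℓ ≥ 2 and "s and t are not adjacent" are all assumptions of the example rather than statements from the paper.

```python
from functools import lru_cache
from itertools import combinations

def labelled_lcdp(edges_by_color, s, t, label, ell, p):
    """Decide the labelled problem: are there p internally disjoint uni-color
    st-paths of length <= ell whose inner vertices carry pairwise distinct
    labels from {0, ..., (ell - 1) * p - 1}?"""
    if ell < 2:
        raise ValueError("sketch assumes ell >= 2")
    adj = {g: {} for g in edges_by_color}  # color -> vertex -> neighbours
    for g, edges in edges_by_color.items():
        for edge in edges:
            u, v = tuple(edge)
            adj[g].setdefault(u, set()).add(v)
            adj[g].setdefault(v, set()).add(u)
    if any(t in adj[g].get(s, ()) for g in adj):
        raise ValueError("sketch assumes s and t are not adjacent")

    @lru_cache(maxsize=None)
    def M(L, v, g):
        # Is there a g-colored path from s to v (v != t) with labels inside L?
        if v == s:
            return True
        if v == t or label[v] not in L:
            return False
        rest = L - {label[v]}
        return any(M(rest, u, g) for u in adj[g].get(v, ()))

    @lru_cache(maxsize=None)
    def P(L):
        # Can L be split into blocks of ell - 1 labels, one st-path per block?
        if not L:
            return True
        for block in combinations(sorted(L), ell - 1):
            block = frozenset(block)
            if P(L - block) and any(
                M(block, v, g) for g in adj for v in adj[g].get(t, ())
            ):
                return True
        return False

    return P(frozenset(range((ell - 1) * p)))

# Hypothetical instance: a color-1 path s-a-t and a color-2 path s-b-c-t.
example = {
    1: {frozenset(("s", "a")), frozenset(("a", "t"))},
    2: {frozenset(("s", "b")), frozenset(("b", "c")), frozenset(("c", "t"))},
}
colorful = {"a": 0, "b": 1, "c": 2}
clashing = {"a": 0, "b": 0, "c": 2}
```

With ℓ = 3 and p = 2 the labelling `colorful` lets the procedure find both paths, while `clashing` gives the same label to a and b and therefore hides the solution; this is why the procedure has to be run over a whole family of labellings.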
The algorithm for ℓ-LCDP_p. As explained before, it is possible to explicitly construct a k-perfect family F of hash functions, that is, a set F of hash functions from a universal set U to the set of integers {1, …, k} such that for each U′ ⊆ U of cardinality k there exists a hash function f ∈ F which assigns distinct integers to the elements of U′. It has been shown (see, for example, [8,16,17]) that a k-perfect family of hash functions of size 2^{O(k)} log^{O(1)} |U| can be explicitly constructed in time proportional to its size. As a consequence, the ℓ-LCDP_p problem can be solved by solving the ℒ-labelled ℓ-LCDP_p problem for all the labelling functions given by the hash functions of an (ℓ − 1)p-perfect family (whose size is 2^{O(ℓp)} log^{O(1)} |V_C|). We remark that this algorithm is mainly of theoretical interest, since the running times are impractical even with modest choices of the parameters ℓ and p. However, as formalized by the following theorem, it settles the parameterized complexity of the ℓ-LCDP_p problem for the parameters ℓ and p.
Theorem 12. The ℓ-LCDP_p problem parameterized by the bound ℓ on the path length and the number p of disjoint uni-color st-paths is in FPT.
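A randomized alternative, which is the original randomized form of color coding [8] and not the deterministic construction used in this paper, replaces the perfect family with independent uniformly random labellings. The following rough bound, with k = (ℓ − 1)p and a failure probability δ introduced here only for illustration, indicates how many random labellings would suffice to make a fixed solution colorful at least once:

```latex
\begin{align*}
\Pr\bigl[\lambda \text{ is injective on } S\bigr]
  &\;\ge\; \frac{k!}{k^{k}} \;\ge\; e^{-k}
  && \text{for every fixed set } S \text{ of at most } k \text{ inner vertices},\\
\Pr\bigl[\text{no colorful labelling among } T \text{ trials}\bigr]
  &\;\le\; \bigl(1 - e^{-k}\bigr)^{T} \;\le\; e^{-T e^{-k}} \;\le\; \delta
  && \text{whenever } T \ge e^{k} \ln(1/\delta).
\end{align*}
```

The first inequality uses the fact that a solution has at most (ℓ − 1)p inner vertices, and the second uses k! ≥ (k/e)^k. The price of this simplification is a one-sided error: a negative answer after T trials is only correct with probability at least 1 − δ.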

Conclusions
In this paper we have considered the MAX CDP problem, a combinatorial problem motivated by applications in social network analysis that, given an edge-colored graph G_C, asks for the maximum number of disjoint uni-color paths in G_C. We have shown that the problem is not approximable within factor c^{1−ε}, for any constant ε > 0, and that the corresponding decision problem (CDP) is W[1]-hard when parameterized by the number p of disjoint uni-color paths. Then, we have given a fixed-parameter algorithm for ℓ-LCDP_p, a restriction of the problem where the length of the disjoint paths is bounded by a parameter ℓ. An interesting open problem is to improve the time complexity of the fixed-parameter algorithm for ℓ-LCDP_p. Moreover, kernelization complexity issues are still completely unexplored.

Figure 1
Figure 1 represents an example of an undirected graph G_I and of the edge-colored graph G_C associated with it.

Figure 1. An example of a graph G_I and the edge-colored graph G_C associated with it. For convenience, we labelled the edges of G_I so that they correspond to the vertices in G_C. The colors of the edges in G_C are indicated by numbers placed near the edges, while the two distinguished vertices s and t are highlighted in grey. The order Π_{E_I} of the edges of G_I is simply the lexicographic order of their labels.

Given a graph G_I with n vertices and m edges, the associated edge-colored graph G_C has m + 2 vertices, O(mn) edges, and n colors, i.e., c = n.

Lemma 2. Let G_I = (V_I, E_I) be an undirected graph and I ⊆ V_I be an independent (vertex) set for G_I. Then, we can compute in polynomial time (at least) |I| disjoint uni-color st-paths in the edge-colored graph G_C associated with G_I.

Lemma 3. Let G_I = (V_I, E_I) be an undirected graph and G_C the edge-colored graph associated with G_I. If there exist k disjoint uni-color st-paths in G_C, then we can compute in polynomial time an independent set I ⊆ V_I for G_I, with |I| = k.

The L-reduction [4] from MAX INDSET to MAX CDP established by Lemmas 2 and 3 has constants β = γ = 1. Hence, considering that, unless P = NP, MAX INDSET cannot be approximated in polynomial time within factor |V_I|^{1−ε} for any constant ε > 0 [5], and that |V_I| = c, the following theorem holds.

Theorem 4. For any constant ε > 0, MAX CDP cannot be approximated within factor c^{1−ε} in polynomial time unless P = NP.

Corollary 11. The ℒ-labelled ℓ-LCDP_p problem can be solved in time O(2^{2ℓp} m), where m = Σ_{g∈C} |E_g|.

Proof. The evaluation of M needs O(2^{|ℒ|} m) time. For a fixed L ⊆ ℒ, the evaluation of P[L] requires O(2^{|L|} m) time and, since there are 2^{|ℒ|} possible subsets of ℒ, the time needed to evaluate P is O(2^{2|ℒ|} m), which is O(2^{2ℓp} m) since |ℒ| = (ℓ − 1)p.

Table 1. Complexity status of MAX CDP.

[…] path π exists if and only if M[L″, u, g] is true. The inductive case of Equation 4.1 tests the above-mentioned conditions, hence M[L′, v, g] is correct also for sets of labels L′ such that |L′| = k + 1, concluding the proof.