# New Bipartite Graph Techniques for Irregular Data Redistribution Scheduling


## Abstract


## 1. Introduction

$G_{BC(\alpha, \beta, P, Q)} = (X, Y, E)$, where $X = \{x_0, x_1, \ldots, x_{|X|-1}\}$ and $Y = \{y_0, y_1, \ldots, y_{|Y|-1}\}$ represent source processor set $\{p_0, p_1, \ldots, p_{|X|-1}\}$ and destination processor set $\{p_0, p_1, \ldots, p_{|Y|-1}\}$, respectively. In $G_{BC(\alpha, \beta, P, Q)}$, if the source processor $p_i$ wants to send c elements of the array to the destination processor $p_j$, we add an edge $(x_i, y_j)$ to E with a weight c. For simplicity, in this work, a shorter symbol BC(α, β, P) is used to represent BC(α, β, P, P).

$G_{BC(1, 4, 4)}$ is also shown in Figure 2. In Figure 1, array elements A(1:16) mapped onto four processors (P) are to be redistributed to four (P) processors. In the source distribution, every block of α (= 1) elements is mapped onto one of the processors, which are ordered in a cyclic sequence. As a result, A(1), A(5), A(9), A(13) are mapped onto processor $P_0$ in Figure 1. On the other hand, in the target distribution, every block of β (= 4) elements is mapped onto one of the processors, again in cyclic order. Consequently, A(1), A(2), A(3), A(4) are mapped onto processor $Q_0$ in the new distribution. Since A(1), A(2), A(3), A(4) are sent to $Q_0$ by $P_0$, $P_1$, $P_2$, $P_3$, respectively, there are four edges connecting $P_0$, $P_1$, $P_2$, $P_3$ to $Q_0$ in the distribution graph (Figure 2).
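The block-cyclic example can be reproduced with a short sketch. The function name `block_cyclic_graph` and the 0-based element index are assumptions made for illustration; the ownership rule (block number modulo processor count) follows the description of the cyclic mapping in the text.

```python
from collections import Counter

def block_cyclic_graph(n_elems, alpha, beta, p, q):
    """Return {(i, j): weight}: source processor i sends `weight` elements
    to destination processor j when moving from Block-Cyclic(alpha) on p
    processors to Block-Cyclic(beta) on q processors."""
    weights = Counter()
    for idx in range(n_elems):        # idx 0 corresponds to A(1)
        src = (idx // alpha) % p      # owner under the source distribution
        dst = (idx // beta) % q       # owner under the target distribution
        weights[(src, dst)] += 1
    return dict(weights)

# BC(1, 4, 4) on A(1:16), as in Figures 1 and 2.
graph = block_cyclic_graph(16, alpha=1, beta=4, p=4, q=4)
into_q0 = sorted(i for (i, j) in graph if j == 0)
print(into_q0)
```

For this instance every source processor sends exactly one element to every destination processor, so the four edges into $Q_0$ each carry weight 1.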

$G_{GB(P, Q)} = (S, T, E)$. For instance, in Figure 3, array elements A(1:100) are to be redistributed. Originally, array elements A(1:24), A(25:50), A(51:64), A(65:100) are mapped onto $P_0$, $P_1$, $P_2$, and $P_3$, respectively. After redistribution, a new distribution is formed, with A(1:15), A(16:49), A(50:89), A(90:100) mapped onto $Q_0$, $Q_1$, $Q_2$, and $Q_3$, respectively. Its distribution graph, $G_{GB(4, 4)}$, is presented in Figure 4. Note that one processor may need to send array elements to multiple consecutive processors in an irregular distribution. For example, $P_0$ holds 24 elements A(1:24), of which A(1:15) are sent to $Q_0$ and A(16:24) are sent to $Q_1$. As a result, the distribution graph in Figure 4 contains an edge with weight 15 and another edge with weight 9 to represent these two communications.
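A GEN_BLOCK distribution graph can be derived directly from the two lists of block sizes. The sketch below is a minimal illustration, assuming positive block sizes; the helper name `gen_block_graph` is hypothetical, and the quadratic overlap scan is chosen for clarity rather than speed.

```python
from itertools import accumulate

def gen_block_graph(src_sizes, dst_sizes):
    """Return (i, j, weight) edges: source block i and destination block j
    share `weight` consecutive array elements."""
    assert sum(src_sizes) == sum(dst_sizes)
    src_ends = list(accumulate(src_sizes))
    dst_ends = list(accumulate(dst_sizes))
    edges = []
    for i, s_end in enumerate(src_ends):
        s_start = s_end - src_sizes[i]
        for j, d_end in enumerate(dst_ends):
            d_start = d_end - dst_sizes[j]
            overlap = min(s_end, d_end) - max(s_start, d_start)
            if overlap > 0:
                edges.append((i, j, overlap))
    return edges

# Block sizes read from the ranges quoted in the text:
# A(1:24), A(25:50), A(51:64), A(65:100) -> A(1:15), A(16:49), A(50:89), A(90:100)
edges = gen_block_graph([24, 26, 14, 36], [15, 34, 40, 11])
print(edges[:2])
```

The first two edges are the ones described for $P_0$: 15 elements to $Q_0$ and 9 elements to $Q_1$.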

**Definition 1.** $\{F_1, F_2, F_3, \ldots, F_z\}$ of F to minimize $\sum_{i=1}^{z}\max\{w(e) \mid e\in F_i\}$, where w(e) is the weight of edge e.

$\max\{w(e) \mid e\in F_i\}$ indicates the longest time required to complete all communications allocated to the ith step. Accordingly, the sum of $\max\{w(e) \mid e\in F_i\}$ (for 1 ≤ i ≤ z), denoted $\sum_{i=1}^{z}\max\{w(e) \mid e\in F_i\}$, indicates the overall communication time for the data redistribution represented by G. In the next problem, we try to shorten not only the overall communication time but also the number of required communication steps. Additional variants of MECP use the minimum number of edge colors and are defined as follows.

**Definition 2.** $\{F_1, F_2, F_3, \ldots, F_{\Delta(G)}\}$ of G to minimize $\sum_{i=1}^{\Delta(G)}\max\{w(e) \mid e\in F_i\}$, where w(e) denotes the weight of e and Δ(G) denotes the maximum degree of G.

$\{\{P_0, Q_0\}, \{P_1, Q_1\}, \{P_3, Q_2\}\}$, $\{\{P_0, Q_1\}, \{P_2, Q_2\}, \{P_3, Q_3\}\}$, and $\{\{P_1, Q_2\}\}$ with the minimum value max{15, 16, 25} + max{9, 15, 11} + max{9} = 25 + 15 + 9 = 49, as shown in Figure 5. Since the number of colors used equals the maximum degree of G, the edge partition is also a solution to the MDECP.
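The objective value of an edge coloring is easy to evaluate mechanically. The sketch below is illustrative only: the edge weights are read off the max{...} terms quoted in the text in the order the edges are listed, and the third color class is assumed to hold the single edge ($P_1$, $Q_2$), the one edge the first two classes leave uncovered. The helper names are hypothetical.

```python
def coloring_cost(color_classes):
    """Sum, over color classes, of the heaviest edge weight in the class."""
    return sum(max(w for (_, _, w) in cls) for cls in color_classes)

def is_proper(color_classes):
    """In a proper edge coloring no two edges of one class share an endpoint."""
    for cls in color_classes:
        sources = [s for (s, _, _) in cls]
        dests = [d for (_, d, _) in cls]
        if len(set(sources)) < len(sources) or len(set(dests)) < len(dests):
            return False
    return True

classes = [
    [("P0", "Q0", 15), ("P1", "Q1", 16), ("P3", "Q2", 25)],
    [("P0", "Q1", 9), ("P2", "Q2", 15), ("P3", "Q3", 11)],
    [("P1", "Q2", 9)],  # assumption: the edge not covered by the classes above
]
print(is_proper(classes), coloring_cost(classes))
```

Each class corresponds to one communication step, and its cost is the slowest message in that step.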

$V_1, V_2, \ldots, V_k$ minimize $\sum_{i=1}^{k}\max\{w(e) \mid e\in V_i\}$, where w denotes the weight function and k denotes the number of colors used. However, the maximum coloring problem is NP-hard even when its input is an interval graph [11]. Due to its intractability, many papers aim to devise approximation algorithms for it on special graphs [11,12]. Note that the above problem is a vertex coloring problem, which differs from the edge coloring problems defined in this work.

- Define three new bipartite graph models including MECP, MDECP, and CSMECP.
- Design the first approximation algorithm with a ratio bound of two for MECP when the input is a biplanar graph.
- Give a formal proof that CSMECP is NP-complete even when the input is a biplanar graph.

## 2. Definitions and Notations

The degree $d_G(v)$ of a vertex v in a loopless graph G is the number of incident edges. The maximum degree of vertices of G is denoted by Δ(G).

$K_{m,n}$. The line graph of a graph G, written L(G), is the graph whose vertices are the edges of G, with (e, f) in the edge set of L(G) when e = (u, v) and f = (v, w), that is, when e and f share an endpoint. In Figure 7, the right graph is the line graph L(G) of the left graph G.
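The line graph definition translates into a pairwise endpoint test. The following sketch uses a hypothetical helper `line_graph` and a three-edge path as input; it compares every pair of edges, which matches the quadratic construction cost discussed later in the paper.

```python
from itertools import combinations

def line_graph(edges):
    """edges: list of (u, v) pairs. Return the set of index pairs (a, b),
    a < b, such that edges[a] and edges[b] share an endpoint."""
    return {
        (a, b)
        for a, b in combinations(range(len(edges)), 2)
        if set(edges[a]) & set(edges[b])
    }

# A path x0 - y0 - x1 - y1: consecutive edges are adjacent in L(G).
path = [("x0", "y0"), ("y0", "x1"), ("x1", "y1")]
lg = line_graph(path)
print(sorted(lg))
```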

$I_u$ such that $I_u \cap I_v$ is nonempty precisely when (u, v) ∈ E. For example, in Figure 8, the graph is an interval graph because there is a set of intervals representing its vertex set.

## 3. Related Work

## 4. Two-Approximation Algorithm for Irregular Data Redistribution Scheduling

**Theorem 1.**

**Proof.** $G_{GB(P, Q)} = (S, T, E)$ is bipartite. It remains to prove that it is planar. We can find a biplanar embedding for $G_{GB(P, Q)}$ (see Figure 3 and Figure 4 for an example) by placing all processors on two lines and numbering them from left to right (i.e., $\{P_0, P_1, \ldots, P_{|P|-1}\}$ and $\{Q_0, Q_1, \ldots, Q_{|Q|-1}\}$). In the GEN-BLOCK format GB(P, Q), any set of consecutive array elements is originally allocated to consecutive source processors with respect to $\{P_0, P_1, \ldots, P_{|P|-1}\}$. Moreover, any set of consecutive array elements needs to be reallocated to consecutive destination processors with respect to $\{Q_j, Q_{j+1}, \ldots, Q_k\}$. As a result, when there is a message sent and reallocated from $P_i$ to $Q_{o(i)}$, then for any j such that $i \le j \le |P|-1$ and any message sent from $P_j$ to $Q_{o(j)}$, we have $o(j) \ge o(i)$. Suppose the corresponding redistribution graph is not biplanar. Then there is at least one pair of crossing edges $(P_j, Q_{o(j)})$ and $(P_{j+\Delta}, Q_{o(j+\Delta)})$ where Δ > 0 and $o(j+\Delta) < o(j)$, which contradicts $o(j+\Delta) \ge o(j)$. Therefore, the corresponding redistribution bipartite graph is a biplanar graph. □

**Theorem 2.**

- (1) A graph G is a biplanar graph.
- (2) After removing all leaves in G, the remainder is an acyclic graph and contains no vertices with a degree more than two.
- (3) A graph G contains a set of caterpillars which are disjoint graphs.

$O(n^2)$ edges. Yet, if the input is planar, the number of edges in G is at most 3n − 6 [18]. Moreover, because biplanar graphs are a subclass of planar graphs, the number of edges in a biplanar graph is evidently less than 3n − 6. Additionally, by combining Theorem 1 with Theorem 2, we conclude in the following corollary that the graph generated from the GEN-BLOCK method is a forest. The next corollary further indicates that the number of edges in G is less than or equal to n − 1.

**Corollary 1.**

**Theorem 3.**

**Proof.** $T_1, T_2, \ldots, T_{\omega(G)}$ (by Corollary 1). Since the size of the edge set of a tree is one less than the size of its vertex set and G consists of ω(G) trees, we have |E| = |X| + |Y| − ω(G). □
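The edge-count identity can be checked numerically on a small forest. The sketch below counts components with a union-find structure; the seven-edge graph is an assumed stand-in for a four-by-four GEN_BLOCK distribution graph, and `count_components` is a hypothetical helper name.

```python
def count_components(vertices, edges):
    """Number of connected components, computed with union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

X = ["P0", "P1", "P2", "P3"]
Y = ["Q0", "Q1", "Q2", "Q3"]
E = [("P0", "Q0"), ("P0", "Q1"), ("P1", "Q1"), ("P1", "Q2"),
     ("P2", "Q2"), ("P3", "Q2"), ("P3", "Q3")]
omega = count_components(X + Y, E)
print(len(E), len(X) + len(Y) - omega)
```

Both printed numbers agree, as the identity |E| = |X| + |Y| − ω(G) requires for a forest.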

**Theorem 4.**

**Proof.**

- Every vertex v in S∪T can be assigned and labeled with a distinct integer χ(v) from left to right, where 1 ≤ χ(v) ≤ |S| + |T|. Moreover, the x-coordinate of vertex v on the two horizontal lines is exactly χ(v).
- Vertex set S∪T can be grouped into two disjoint subsets: spine set B and leaf set L. Here B contains all vertices in the spine, and L contains the rest of the vertices, that is, all leaf nodes (vertices with degree one) in G. Since every vertex v in L is a leaf, there is a unique edge $(v, b_k)$ connecting v to another vertex in B by Theorem 2.
- Suppose $B = \{v_{b_0}, v_{b_1}, \ldots, v_{b_{|B|-1}}\}$ is sorted in ascending order by x-coordinate. Every vertex v in B is assigned a unique integer χ(v) in the interval [1, |S| + |T|]. Due to biplanar graph properties, the space between the two parallel lines is partitioned into |B| + 1 closed subspaces. Similarly, the interval [1, |S| + |T|] can be partitioned into at most |B| + 1 subintervals $[1, \chi(v_{b_0})]$, $[\chi(v_{b_0}), \chi(v_{b_1})]$, …, $[\chi(v_{b_{|B|-1}}), |S| + |T|]$ with respect to the x-coordinates of vertices in B. Since G is a connected graph, there is an edge connecting every pair of neighboring vertices in spine set B.
- Due to the properties of bipartite planar graphs, L can be divided into R ≤ |B| disjoint leaf sets $L_1, L_2, \ldots, L_{|B|}$ such that each vertex in the same leaf set is adjacent to the same vertex in B. We can rescale the biplanar drawing so that every vertex v in the same leaf set is given an x-coordinate χ(v) contained in the same subinterval selected from $[1, \chi(v_{b_0}) - 1]$, $[\chi(v_{b_0}) + 1, \chi(v_{b_1}) - 1]$, …, $[\chi(v_{b_{|B|-1}}) + 1, |S| + |T|]$. Moreover, when two leaf nodes are in different leaf sets, their x-coordinates are contained in different subintervals selected from the same list (e.g., the blue dashed rectangles in Figure 9). Also notice that for each edge (u, v) in this sequence, the corresponding x-coordinates χ(u) and χ(v) are located in the same subinterval selected from $[1, \chi(v_{b_0})]$, $[\chi(v_{b_0}), \chi(v_{b_1})]$, …, $[\chi(v_{b_{|B|-1}}), |S| + |T|]$.

$[\chi(v_1) - 0.1, \chi(v_2) + 0.1]$ is constructed for every $(v_1, v_2)$ in the edge set (Figure 9). The rest shows that the intervals obtained represent L(G).
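The interval construction can be tested on a small caterpillar. In the sketch below, the labeling `chi` is hand-chosen for an assumed seven-edge graph (it respects the left-to-right order on both processor lines) and is not taken from Figure 9. The check confirms that two edge intervals overlap exactly when the edges share an endpoint, which is the defining property of an interval representation of L(G).

```python
from itertools import combinations

edges = [("P0", "Q0"), ("P0", "Q1"), ("P1", "Q1"), ("P1", "Q2"),
         ("P2", "Q2"), ("P3", "Q2"), ("P3", "Q3")]
# Assumed labeling: distinct integers, increasing along each of the two lines.
chi = {"Q0": 1, "P0": 2, "Q1": 3, "P1": 4, "Q2": 5, "P2": 6, "P3": 7, "Q3": 8}

def edge_interval(edge, pad=0.1):
    """Interval [min chi - pad, max chi + pad] for the edge's two endpoints."""
    lo, hi = sorted(chi[v] for v in edge)
    return (lo - pad, hi + pad)

def overlaps(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

intervals = [edge_interval(e) for e in edges]
mismatches = [
    (e, f)
    for (e, ie), (f, jf) in combinations(list(zip(edges, intervals)), 2)
    if overlaps(ie, jf) != bool(set(e) & set(f))
]
print(mismatches)
```

An empty list of mismatches means the intervals represent L(G) for this instance. The padding of 0.1 keeps intervals of edges that share only a label value touching, while distinct integer labels keep unrelated intervals at least 0.8 apart.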

$e_1 = (u, v_1)$ and $e_2 = (u, v_2)$. Without loss of generality, let $\chi(v_1) < \chi(v_2)$. When $\chi(u) < \chi(v_1) < \chi(v_2)$, the interval $[\chi(u) - 0.1, \chi(v_1) + 0.1]$ (representing $e_1$) intersects the interval $[\chi(u) - 0.1, \chi(v_2) + 0.1]$ (representing $e_2$). On the other hand, when $\chi(v_1) < \chi(v_2) < \chi(u)$, the interval $[\chi(v_1) - 0.1, \chi(u) + 0.1]$ (representing $e_1$) also intersects the interval $[\chi(v_2) - 0.1, \chi(u) + 0.1]$ (representing $e_2$). When $\chi(v_1) < \chi(u) < \chi(v_2)$, the interval $[\chi(v_1) - 0.1, \chi(u) + 0.1]$ (representing $e_1$) intersects the interval $[\chi(u) - 0.1, \chi(v_2) + 0.1]$ (representing $e_2$).

$\chi(u_1) \le \chi(u_2) \le \chi(v_1) \le \chi(v_2)$ and $\{u_1, u_2\} \subseteq S$ and $\{v_1, v_2\} \subseteq T$, we have a pair of overlapping intervals $[\chi(u_1) - 0.1, \chi(v_1) + 0.1]$ and $[\chi(u_2) - 0.1, \chi(v_2) + 0.1]$. We claim that the corresponding two edges $(u_1, v_1)$ and $(u_2, v_2)$ are incident with the same vertex (either $u_1 = u_2$ or $v_1 = v_2$). Otherwise, we have $u_1 \ne u_2$ and $v_1 \ne v_2$ (similarly $\chi(u_1) \ne \chi(u_2)$ and $\chi(v_1) \ne \chi(v_2)$), and the constructed intervals for the edges must be $[\chi(u_1) - 0.1, \chi(v_1) + 0.1]$ and $[\chi(u_2) - 0.1, \chi(v_2) + 0.1]$ where $\chi(u_1) < \chi(u_2) < \chi(v_1) < \chi(v_2)$. Since $\chi(u_1)$ and $\chi(v_1)$ (similarly $\chi(u_2)$ and $\chi(v_2)$) are in the same subinterval selected from $[1, \chi(v_{b_0})]$, $[\chi(v_{b_0}), \chi(v_{b_1})]$, …, $[\chi(v_{b_{|B|-1}}), |S| + |T|]$, the values $\chi(u_1)$, $\chi(u_2)$, $\chi(v_1)$, $\chi(v_2)$ are all in the same subinterval. As a result, at most one vertex of $\{u_1, u_2, v_1, v_2\}$ is in B. Otherwise, if $u_1$ and $v_2$ are in B, then $v_1$ and $u_2$ should be adjacent to the same vertex. A contradiction occurs. Suppose only one vertex of $\{u_1, u_2, v_1, v_2\}$ is in B; then the remaining two leaf vertices should be adjacent to the same node in B. Another contradiction occurs. When $\chi(u_2) < \chi(u_1) < \chi(v_1) < \chi(v_2)$, without loss of generality, we have $\{u_1, u_2\} \subseteq S$ and $\{v_1, v_2\} \subseteq T$. Then the two edges $(u_1, v_1)$ and $(u_2, v_2)$ cross in the biplanar graph drawing. A contradiction occurs.

Algorithm AMECP: |
---|
Input: G = (X, Y, E) of GEN-BLOCK. |
Output: an edge coloring $\{F_1, F_2, F_3, \ldots, F_z\}$ of G and $\sum_{i=1}^{z}\max\{w(e) \mid e\in F_i\}$. |
1: Construct L(G) from G. |
2: Find an interval representation for L(G) by executing the recognition algorithm of interval graphs based on Theorem 4. |
3: Construct a vertex coloring $\{V_1, V_2, V_3, \ldots, V_z\}$ of L(G) by executing an algorithm to solve the maximum coloring problem for interval graphs. |
4: Construct an edge coloring $\{F_1, F_2, F_3, \ldots, F_z\}$ of G from the vertex coloring $\{V_1, V_2, V_3, \ldots, V_z\}$, and calculate $\sum_{i=1}^{z}\max\{w(e) \mid e\in F_i\}$. |
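The overall shape of the pipeline can be illustrated with a much simpler stand-in for step 3. The sketch below is not the cited two-approximation algorithm and carries no ratio guarantee: it swaps in a plain first-fit heuristic that processes edges in decreasing weight order. Coloring the edges of G this way is equivalent to first-fit vertex coloring on L(G), so the explicit line graph is skipped. The edge weights are assumed from the max{...} terms of the earlier example.

```python
def first_fit_edge_coloring(edges):
    """edges: list of (source, dest, weight). Return a list of color classes.
    Heuristic stand-in only: heaviest edges are colored first, each with the
    smallest color unused at both of its endpoints."""
    classes = []   # classes[c] holds the edges that received color c
    used = {}      # (side, vertex) -> set of colors already used there
    for s, d, w in sorted(edges, key=lambda e: -e[2]):
        taken = used.setdefault(("S", s), set()) | used.setdefault(("T", d), set())
        c = 0
        while c in taken:
            c += 1
        if c == len(classes):
            classes.append([])
        classes[c].append((s, d, w))
        used[("S", s)].add(c)
        used[("T", d)].add(c)
    return classes

edges = [("P0", "Q0", 15), ("P0", "Q1", 9), ("P1", "Q1", 16), ("P1", "Q2", 9),
         ("P2", "Q2", 15), ("P3", "Q2", 25), ("P3", "Q3", 11)]
classes = first_fit_edge_coloring(edges)
cost = sum(max(w for _, _, w in cls) for cls in classes)
print(len(classes), cost)
```

On this small instance the heuristic uses three colors and reaches a cost of 49, the same value quoted for the Figure 5 solution; that agreement is specific to this input.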

**Theorem 5.**

**Proof.**

**Theorem 6.** $O(n^2)$ time, where n = |X| + |Y|.

**Proof.** The first step takes $O(n^2)$ time because implementing it requires comparing every pair of edges in E of G to create the edge set of L(G). By Theorem 3, |E| = |X| + |Y| − ω(G). That means |E| < n, because ω(G) ≥ 1. As a result, this step takes at most (n − 1) × (n − 2)/2 = $O(n^2)$ computation time. The second step can be implemented in $O(n^2)$ time if we apply Booth and Lueker's interval graph recognition algorithm, which uses the well-known PQ-trees to test the consecutive ones property [34]. The third step needs $O(n \log n)$ time if we apply Pemmaraju et al.'s two-approximation algorithm [11]. The last step can be implemented in linear time. In total, AMECP takes $O(n^2)$ time. □

## 5. New Graph Model for Applying Partitioning Message Techniques

**Property 1.**

1. Additional edge (s, t) is allowed to be added in edge set E′ only if these two end points, s and t, are originally adjacent by an edge (s, t) in E.
2. Δ(G′) = Δ(G). That means after adding some multiple edges, the maximum degree of the new graph G′ remains unchanged when compared to that of the original one G.
3. In G′ = (S, T, E∪E′), $\sum_{i=1}^{\Delta(G')}\{w(e) \mid e=(s, t) \text{ in } E'\cup E\}$ = $\sum_{i=1}^{\Delta(G')}\{w(e) \mid e=(s, t) \text{ in } E\}$ in G = (S, T, E), where w(e) is the weight of e.

$\{F_1, F_2, F_3, \ldots, F_{\Delta}\}$ of G′ = (V, E∪E′) such that $\sum_{i=1}^{\Delta}\max\{w(e) \mid e\in F_i\} \le K$.

**Property 2.** $Z^{+}$ for each k ∈ A are given as inputs, the partition problem is to answer whether there exists a subset $A^*$ in A such that $\sum_{k\in A^*}s(k) = \sum_{k\in A-A^*}s(k)$.
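For small inputs the partition question can be answered by a subset-sum search. The sketch below is a pseudo-polynomial illustration with a hypothetical helper `find_partition`; it does not contradict the hardness of the problem, since its running time grows with the numeric size of the input.

```python
def find_partition(sizes):
    """Return one sublist of `sizes` summing to half the total, or None."""
    total = sum(sizes)
    if total % 2:
        return None
    target = total // 2
    reachable = {0: []}  # reachable sum -> indices of one subset achieving it
    for idx, s in enumerate(sizes):
        # Snapshot the dict so each element is used at most once per subset.
        for acc, chosen in list(reachable.items()):
            new = acc + s
            if new <= target and new not in reachable:
                reachable[new] = chosen + [idx]
    chosen = reachable.get(target)
    return None if chosen is None else [sizes[i] for i in chosen]

subset = find_partition([2, 3, 4, 6, 8, 9])
print(subset, sum(subset))
```

The returned subset depends on iteration order; only its sum, half of the total, is determined.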

**Theorem 7.**

**Proof.** $A = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ with a weight function s(α) for each $\alpha_i \in A$ are given as inputs, we generate a graph G = (S, T, E) so that

- vertex set $S = \{s_x, s_y, s_1, s_2, \ldots, s_n\}$ and $T = \{t_1, t_2\}$,
- edge set $E = \{(s_x, t_1), (s_y, t_1), (s_1, t_2), (s_2, t_2), \ldots, (s_n, t_2)\}$, and
- the weights of $(s_x, t_1)$ and $(s_y, t_1)$ are $\sum_{\alpha\in A}s(\alpha)/2$, and the weight of $(s_i, t_2)$ is $s(\alpha_i)$ for all $\alpha_i$ in A.

$S = \{s_x, s_y, s_1, s_2, \ldots, s_6\}$, $T = \{t_1, t_2\}$, and $E = \{(s_x, t_1), (s_y, t_1), (s_1, t_2), (s_2, t_2), \ldots, (s_6, t_2)\}$. The associated weights of $(s_x, t_1)$ and $(s_y, t_1)$ are 16 = (2+3+4+6+8+9)/2, and each of the remaining edges is assigned a distinct element (value) in A.
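The instance used in the reduction can be generated mechanically from the partition input. This is a minimal sketch: the vertex names are plain strings chosen for illustration, `Fraction` keeps the half-sum exact when the total is odd, and the helper names are hypothetical.

```python
from fractions import Fraction

def build_instance(sizes):
    """Return {(s, t): weight} for the graph G = (S, T, E) of the reduction."""
    half = Fraction(sum(sizes), 2)
    weights = {("sx", "t1"): half, ("sy", "t1"): half}
    for i, size in enumerate(sizes, start=1):
        weights[(f"s{i}", "t2")] = size
    return weights

def max_degree(weights):
    degree = {}
    for u, v in weights:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return max(degree.values())

instance = build_instance([2, 3, 4, 6, 8, 9])
print(instance[("sx", "t1")], max_degree(instance))
```

For A = {2, 3, 4, 6, 8, 9} the two edges at $t_1$ carry weight 16, and the maximum degree is n = 6, reached at $t_2$.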

$O(n^k)$ time where k is a constant. The rest is to find out that there is a subset $A^*$ in A so that $\sum_{k\in A^*}s(k) = \sum_{k\in A-A^*}s(k)$ if there is an edge coloring $\{E_1, E_2, E_3, \ldots, E_{\Delta(G')}\}$ of G′ = (S, T, E∪E′) and $\sum_{i=1}^{\Delta(G')}\max\{w(e) \mid e\in E_i\} \le K = \sum_{k\in A}s(k)$.

$A^* = \{\alpha_{r(1)}, \alpha_{r(2)}, \ldots, \alpha_{r(k)}\} \subseteq A = \{\alpha_{r(1)}, \alpha_{r(2)}, \ldots, \alpha_{r(n)}\}$ such that $\sum_{a\in A^*}s(a) = \sum_{a\in A-A^*}s(a)$, obviously $\sum_{1\le i\le k}s(\alpha_{r(i)}) = (\sum_{a\in A}s(a))/2 = K/2$. We can then construct a weighted biplanar graph G = (S, T, E) accordingly. Assume that the extra added edges E′ consist of k − 1 copies of $(s_x, t_1)$ and n − k − 1 copies of $(s_y, t_1)$. As a result, Δ(G′) and Δ(G) both equal n. Because $\sum_{1\le i\le k}s(\alpha_{r(i)}) = (\sum_{a\in A}s(a))/2 = K/2$ and $\sum_{a\in A^*}s(a) = \sum_{a\in A-A^*}s(a)$, we distribute $\{s(\alpha_{r(1)}), s(\alpha_{r(2)}), \ldots, s(\alpha_{r(k)})\}$ to the k copies of $(s_x, t_1)$ and $\{s(\alpha_{r(k+1)}), s(\alpha_{r(k+2)}), \ldots, s(\alpha_{r(n)})\}$ to the n − k copies of $(s_y, t_1)$. There exists an edge coloring $\{E_1, E_2, E_3, \ldots, E_n\}$ of G′ = (S, T, E∪E′) where $E_i = \{(s_x, t_1), (s_{r(i)}, t_2)\}$ for 1 ≤ i ≤ k and $E_j = \{(s_y, t_1), (s_{r(j)}, t_2)\}$ for k + 1 ≤ j ≤ n. Evidently, $\sum_{i=1}^{n}\max\{w(e) \mid e\in E_i\} = \sum_{1\le j\le n}s(\alpha_{r(j)}) = K$.

$A^* = \{2, 6, 8\} \subseteq A = \{2, 3, 4, 6, 8, 9\}$ such that 2 + 6 + 8 = 3 + 4 + 9 = 16. We distribute {2, 6, 8} to three copies of $(s_x, t_1)$ and {3, 4, 9} to three copies of $(s_y, t_1)$.
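The coloring built from a balanced partition can be written out and costed directly. In this sketch each color class pairs one copy of $(s_x, t_1)$ or $(s_y, t_1)$ with the $t_2$ edge of equal weight; the vertex labels of the form `s_<size>` are hypothetical and rely on the element values being distinct, as they are in this example.

```python
def coloring_from_partition(a_star, rest):
    """One color class per element: a copy of (sx, t1) or (sy, t1) carrying
    the element's size, paired with the t2 edge of the same weight."""
    classes = []
    for side, part in (("sx", a_star), ("sy", rest)):
        for size in part:
            classes.append([(side, "t1", size), (f"s_{size}", "t2", size)])
    return classes

def coloring_cost(classes):
    return sum(max(w for _, _, w in cls) for cls in classes)

classes = coloring_from_partition([2, 6, 8], [3, 4, 9])
K = sum([2, 3, 4, 6, 8, 9])
print(len(classes), coloring_cost(classes), K)
```

Six color classes are used, matching Δ(G′) = n = 6, and the total cost equals K = 32 because both edges in every class have the same weight.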

$\{E_1, E_2, E_3, \ldots, E_{\Delta(G')}\}$ of G′ so that $\sum_{i=1}^{\Delta(G')}\max\{w(e) \mid e\in E_i\} \le K = \sum_{a\in A}s(a)$. According to the previous construction of G, the weights of $(s_x, t_1)$ and $(s_y, t_1)$ equal $\sum_{a\in A}s(a)/2$. Let $A^* \subseteq A$ contain the elements corresponding to the edges that are adjacent to $t_2$ and scheduled with the copies of $(s_x, t_1)$. Similarly, $A - A^*$ contains the elements corresponding to the edges that are adjacent to $t_2$ and scheduled with the copies of $(s_y, t_1)$. We claim that $\sum_{a\in A^*}s(a) = \sum_{a\in A-A^*}s(a) = K/2$. Otherwise, for any edge coloring, we always have either $\sum_{a\in A^*}s(a) > K/2$ or $\sum_{a\in A-A^*}s(a) > K/2$. Without loss of generality, assume that $\sum_{a\in A^*}s(a) > K/2$, so the cost to schedule $A^*$ is greater than K/2. On the other hand, the cost to schedule $A - A^*$ is not less than K/2 because the cost distributed to the copies of $(s_y, t_1)$ is K/2. As a result, $\sum_{i=1}^{\Delta(G')}\max\{w(e) \mid e\in E_i\} > K$. A contradiction occurs. □

## 6. Conclusion and Future Work

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Park, N.; Prasanna, V.K.; Raghavendra, C.S. Efficient Algorithms for Block-Cyclic Data Redistribution Between Processor Sets. IEEE Trans. Parallel Distrib. Syst. **1999**, 10, 1217–1240.
2. Petitet, A.P.; Dongarra, J.J. Algorithmic Redistribution Methods for Block-Cyclic Decompositions. IEEE Trans. PDS **1999**, 10, 1201–1216.
3. Wang, H.; Guo, M.; Wei, D. Divide-and-conquer Algorithm for Irregular Redistributions in Parallelizing Compilers. J. Supercomput. **2004**, 29, 157–170.
4. Wang, H.; Guo, M.; Chen, W. An Efficient Algorithm for Irregular Redistribution in Parallelizing Compilers. In Lecture Notes in Computer Science, Proceedings of 2003 International Symposium on Parallel and Distributed Processing with Applications, Aizu-Wakamatsu, Japan, 2–4 July, 2003; Springer: Berlin/Heidelberg, Germany, 2003; Volume 2745, pp. 76–87.
5. Tang, B.; Jaggi, N.; Wu, H.; Kurkal, R. Energy-efficient data redistribution in sensor network. In Proceedings of the 7th IEEE International Conference on Mobile Ad-hoc and Sensor Systems, San Francisco, CA, USA, 8–12 November 2010; pp. 352–361.
6. Yi, Q.; Wang, J.; Liu, C. Energy-efficient data storage solutions under sink failures. In Proceedings of the 10th International Conference on Communications and Networking in China, Shanghai, China, 15–17 August 2015; pp. 349–354.
7. Cheng, L.; Li, T. Efficient data redistribution to speed up big data analytics in large systems. In Proceedings of the 2016 IEEE 23rd International Conference on High Performance Computing, Hyderabad, India, 19–22 December 2016; pp. 91–100.
8. Dreher, M.; Peterka, T. Bredala: Semantic data redistribution for in situ applications. In Proceedings of the 2016 IEEE International Conference on Cluster Computing, Taipei, Taiwan, 12–16 September 2016; pp. 279–288.
9. Marrinan, T.; Insley, J.A.; Rizzi, S.; Tessier, F.; Papka, M.E. Automated dynamic data redistribution. In Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, Lake Buena Vista, FL, USA, 29 May–2 June 2017; pp. 1208–1214.
10. Cheng, L.; Wang, Y.; Pei, Y.; Epema, D. A Coflow-Based Co-Optimization Framework for High-Performance Data Analytics. In Proceedings of the 2017 46th International Conference on Parallel Processing (ICPP), Bristol, UK, 14–17 August 2017; pp. 392–401.
11. Pemmaraju, S.V.; Raman, R.; Varadarajan, K.R. Buffer minimization using max-coloring. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, Philadelphia, PA, USA, 11–14 January 2004; pp. 562–571.
12. Pemmaraju, S.V.; Raman, R. Approximation algorithms for the max-coloring problem. In Lecture Notes in Computer Science, Proceedings of the International Colloquium on Automata, Languages, and Programming, Lisbon, Portugal, 11–15 July 2005; Springer: Berlin/Heidelberg, Germany, 2015; Volume 3580, pp. 1064–1075.
13. Gonzales, T.; Sahni, S. Open shop scheduling to minimize finish time. J. ACM **1976**, 23, 665–679.
14. Yook, H.G.; Park, M.-S. Scheduling GEN_BLOCK Array Redistribution. J. Supercomput. **2002**, 2, 251–267.
15. Yu, C.W.; Hsu, C.-H.; Yu, K.-M.; Lian, C.K.; Chen, C.-I. Irregular Redistribution Scheduling by partitioning Messages. In Lecture Notes in Computer Science, Proceedings of the 10th Asia-Pacific Computer Systems Architecture Conference, Singapore, 24–26 October 2005; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3740, pp. 295–309.
16. Yu, C.W. On the complexity of the maximum biplanar subgraph problem. Inf. Sci. **2000**, 129, 239–250.
17. Yu, C.W.; Chen, G.H. Efficient parallel algorithms for doubly convex-bipartite graphs. Theor. Computer Sci. **1995**, 147, 249–265.
18. Bondy, J.A.; Murty, U.S.R. Graph Theory with Applications; Macmillan: London, UK, 1976.
19. Ramaswamy, S.; Simons, B.; Banerjee, P. Optimization for Efficient Data redistribution on Distributed Memory Multicomputers. J. Parallel Distrib. Comput. **1996**, 38, 217–228.
20. Prylli, L.; Touranchean, B. Fast runtime block cyclic data redistribution on multiprocessors. J. Parallel Distrib. Comput. **1997**, 45, 63–72.
21. Bandera, G.; Zapata, E.L. Sparse Matrix Block-Cyclic Redistribution. In Proceedings of the 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing, San Juan, PR, USA, 12–16 April 1999.
22. Hsu, C.-H.; Bai, S.-W.; Chung, Y.-C.; Yang, C.-S. A Generalized Basic-Cycle Calculation Method for Efficient Array Redistribution. IEEE Trans. Parallel Distrib. Syst. **2000**, 11, 1201–1216.
23. Hsu, C.-H.; Yang, D.-L.; Chung, Y.-C.; Dow, C.-R. A Generalized Processor Mapping Technique for Array Redistribution. IEEE Trans. Parallel Distrib. Syst. **2001**, 12, 743–757.
24. Kalns, E.T.; Ni, L.M. Processor Mapping Technique Toward Efficient Data Redistribution. IEEE Trans. Parallel Distrib. Syst. **1995**, 6, 1234–1247.
25. Lee, S.; Yook, H.; Koo, M.; Park, M. Processor reordering algorithms toward efficient GEN_BLOCK redistribution. In Proceedings of the ACM Symposium on Applied Computing, Las Vegas, NV, USA, March 2001; pp. 539–543.
26. Kaushik, S.D.; Huang, C.H.; Ramanujam, J.; Sadayappan, P. Multiphase data redistribution: Modeling and evaluation. In Proceedings of the 9th International Parallel Processing Symposium, Santa Barbara, CA, USA, 25–28 April 1995; pp. 441–445.
27. Desprez, F.; Dongarra, J.; Petitet, A. Scheduling Block-Cyclic Data redistribution. IEEE Trans. Parallel Distrib. Syst. **1998**, 9, 192–205.
28. Guo, M.; Nakata, I.; Yamashita, Y. Contention-Free Communication Scheduling for Array Redistribution. Parallel Comput. **2000**, 26, 1325–1343.
29. Lim, Y.W.; Bhat, P.B.; Prasanna, V.K. Efficient Algorithms for Block-Cyclic Redistribution of Arrays. Algorithmica **1999**, 24, 298–330.
30. Wakatani, A.; Wolfe, M. Optimization of Data redistribution for Distributed Memory Multicomputers. short communication. Parallel Comput. **1995**, 21, 1485–1490.
31. Guo, M.; Pan, Y.; Liu, Z. Symbolic Communication Set Generation for Irregular Parallel Applications. J. Supercomput. **2003**, 25, 199–214.
32. Eades, P.; McKay, B.D.; Wormald, N.C. On an edge crossing problem. Available online: http://users.cecs.anu.edu.au/~bdm/papers/EdgeCrossing.pdf (accessed on 15 May 2019).
33. Tomii, N.; Kambayashi, Y.; Shuzo, Y. On planarization algorithms of 2-level graphs. Tech. Group. Elect. Comp. IECEJ **1977**, 38, 1–12.
34. Booth, K.S.; Lueker, G.S. Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms. J. Comput. Syst. Sci. **1976**, 13, 335–379.
35. Garey, M.R.; Johnson, D.S. Computers and Intractability: A guide to the Theory of NP-completeness; Freeman: San Francisco, CA, USA, 1978.

**Figure 5.** A solution to the maximum edge coloring problem (MECP) (and maximum degree edge coloring problem (MDECP)) when the input is the distribution graph in Figure 4.

**Figure 10.** An example of cost-sharing maximum edge coloring problem (CSMECP) constructed from an instance of the partition problem A = {2, 3, 4, 6, 8, 9} with n = 6.

Notations | Descriptions |
---|---|
n | the number of vertices in the input graph G |
$d_G(v)$ | the number of edges incident to vertex v in a graph G |
L(G) | the line graph of G |
N(v) | the set of vertices adjacent to vertex v |
w(e) | the weight of edge e |
Δ(G) | the maximum degree of vertices of G |
ω(G) | the number of components of G |
χ′(G) | the edge chromatic number of G |
z | the number of used colors |
$G_{BC(\alpha, \beta, P, Q)}$ | the distribution graph of BLOCK-CYCLIC array redistribution using Block-Cyclic(α) to Block-Cyclic(β) (from P processors to Q processors) |
$G_{GB(P, Q)}$ | the distribution graph of GEN_BLOCK redistribution from P processors to Q processors |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Li, Q.; Yu, C.W.
New Bipartite Graph Techniques for Irregular Data Redistribution Scheduling. *Algorithms* **2019**, *12*, 142.
https://doi.org/10.3390/a12070142
