Distributed Balanced Partitioning via Linear Embedding †
Abstract: Balanced partitioning is often a crucial first step in solving large-scale graph optimization problems. In some cases, a big graph can be chopped into pieces that each fit on one machine and are processed independently before the results are stitched together, which leads to some suboptimality from the interaction among different pieces. In other cases, links between different parts show up in the running time and/or network communication cost, hence the desire for a small cut size. We study a distributed balanced-partitioning problem where the goal is to partition the vertices of a given graph into k pieces so as to minimize the total cut size. Our algorithm is composed of a few steps that are easily implementable in distributed computation frameworks such as MapReduce. The algorithm first embeds the nodes of the graph onto a line, and then processes the nodes in a distributed manner guided by the linear embedding order. We examine various ways to find the initial embedding, for example, via hierarchical clustering or Hilbert curves. We then apply four different techniques: local swaps, minimum cuts on the boundaries of partitions, contraction, and dynamic programming. In our empirical study, we compare these techniques with each other and with previous work on distributed graph algorithms, for example, a label-propagation method, FENNEL, and Spinner. We report results on both a private map graph and several public social networks, and show that our results beat previous distributed algorithms: for instance, compared to the label-propagation algorithm, we report an improvement of 15–25% in the cut value. We also observe that our algorithms admit scalable distributed implementation for any number of partitions.
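To make the role of the linear embedding concrete, the following is a minimal single-machine sketch, not the authors' distributed implementation. It assumes an embedding order is already given, cuts that order into k contiguous near-equal parts, and counts the edges crossing between parts. The function names `split_along_line` and `cut_size` and the toy graph are hypothetical, and the initial embedding step and the post-processing techniques (local swaps, minimum cuts, contraction, dynamic programming) are omitted.

```python
def split_along_line(order, k):
    """Assign nodes, listed in embedding order, to k contiguous near-equal parts.

    `order` is a list of node ids sorted by position on the line (assumed given).
    Returns a dict mapping each node to a part index in range(k).
    """
    n = len(order)
    return {node: pos * k // n for pos, node in enumerate(order)}


def cut_size(edges, part):
    """Count edges whose endpoints fall in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


# Toy graph (illustrative only): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# An order that keeps each triangle contiguous on the line.
good_order = [0, 1, 2, 3, 4, 5]
# An order that interleaves the two triangles.
bad_order = [0, 3, 1, 4, 2, 5]

good_cut = cut_size(edges, split_along_line(good_order, 2))
bad_cut = cut_size(edges, split_along_line(bad_order, 2))
print(good_cut, bad_cut)
```

In this toy case both splits are perfectly balanced, but the order that keeps tightly connected nodes adjacent on the line yields a much smaller cut, which is why the quality of the initial embedding matters.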
Finally, we explain three applications of this work at Google: (1) Balanced partitioning is used to route multi-term queries to different replicas in Google Search backend in a way that reduces the cache miss rates by ≈
Aydin, K.; Bateni, M.; Mirrokni, V. Distributed Balanced Partitioning via Linear Embedding †. Algorithms 2019, 12, 162.