Open Access Article

Distributed Balanced Partitioning via Linear Embedding

Google Research, 76 Ninth Ave, New York, NY 10011, USA
Author to whom correspondence should be addressed.
This article is an extended version of our paper published in Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, San Francisco, CA, USA, 22–25 February 2016.
Authors contributed equally to this work.
Algorithms 2019, 12(8), 162
Received: 18 July 2019 / Revised: 5 August 2019 / Accepted: 7 August 2019 / Published: 10 August 2019
(This article belongs to the Special Issue Graph Partitioning: Theory, Engineering, and Applications)
Balanced partitioning is often a crucial first step in solving large-scale graph optimization problems. In some cases, a big graph can be chopped into pieces that each fit on one machine and are processed independently before the results are stitched together, which leads to some suboptimality from the interaction among different pieces. In other cases, links between different parts show up in the running time and/or network communication cost, hence the desire for a small cut size. We study a distributed balanced-partitioning problem where the goal is to partition the vertices of a given graph into k pieces so as to minimize the total cut size. Our algorithm is composed of a few steps that are easily implementable in distributed computation frameworks such as MapReduce. The algorithm first embeds the nodes of the graph onto a line, and then processes the nodes in a distributed manner guided by the linear embedding order. We examine various ways to find the initial embedding, for example, via hierarchical clustering or Hilbert curves. We then apply four different techniques: local swaps, minimum cuts on the boundaries of partitions, contraction, and dynamic programming. In our empirical study, we compare these techniques with each other and with previous work in distributed graph algorithms, for example, a label-propagation method, FENNEL, and Spinner. We report our results both on a private map graph and on several public social networks, and show that our results beat previous distributed algorithms: for instance, compared to the label-propagation algorithm, we report an improvement of 15–25% in the cut value. We also observe that our algorithms admit a scalable distributed implementation for any number of partitions.
Finally, we explain three applications of this work at Google: (1) Balanced partitioning is used to route multi-term queries to different replicas in the Google Search backend in a way that reduces the cache miss rates by ≈0.5%, which leads to a double-digit gain in throughput of production clusters. (2) Applied to Google Maps Driving Directions, balanced partitioning minimizes the number of cross-shard queries with the goal of saving CPU usage. This system achieves load balancing by dividing the world graph into several “shards”. Live experiments demonstrate an ≈40% drop in the number of cross-shard queries when compared to a standard geography-based method. (3) In a job scheduling problem for our data centers, we use balanced partitioning to evenly distribute the work while minimizing the amount of communication across geographically distant servers. In fact, the hierarchical nature of our solution goes well with the layering of data center servers, where certain machines are closer to each other and have faster links to one another.
Keywords: cut minimization; embedding to line; imbalance; local improvement; MapReduce; maps; partitioning; social networks
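
The pipeline described in the abstract (place the vertices on a line, cut the line into k contiguous blocks of nearly equal size, then improve the cut with local swaps) can be illustrated on a toy graph. The following is only a minimal single-machine sketch and not the authors' distributed implementation: the function names `split_by_order`, `cut_size`, and `local_swaps` are hypothetical, the linear order is supplied by hand rather than computed by hierarchical clustering or a Hilbert curve, and the exhaustive pairwise swap loop is a simple stand-in for the local-swap technique named in the abstract.

```python
from itertools import combinations


def split_by_order(order, k):
    """Cut a linear order of vertices into k contiguous, near-equal blocks.

    Returns a dict mapping each vertex to its block index in 0..k-1.
    """
    n = len(order)
    return {v: pos * k // n for pos, v in enumerate(order)}


def cut_size(edges, part):
    """Count the edges whose endpoints lie in different blocks."""
    return sum(1 for u, v in edges if part[u] != part[v])


def local_swaps(edges, part):
    """Toy local improvement (an assumption, not the paper's procedure).

    Repeatedly swaps two vertices from different blocks whenever the swap
    strictly reduces the cut. A swap never changes block sizes, so the
    balance of the initial partition is preserved. The loop terminates
    because the cut size strictly decreases with every accepted swap.
    """
    improved = True
    while improved:
        improved = False
        best = cut_size(edges, part)
        for u, v in combinations(list(part), 2):
            if part[u] == part[v]:
                continue
            part[u], part[v] = part[v], part[u]      # tentative swap
            new = cut_size(edges, part)
            if new < best:
                best = new
                improved = True
            else:
                part[u], part[v] = part[v], part[u]  # undo the swap
    return part


# Toy input: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

# Hand-made, deliberately imperfect linear embedding (vertices 2 and 3 are
# out of place), standing in for the output of an embedding step.
order = [0, 1, 3, 2, 4, 5]

part = split_by_order(order, k=2)
print("cut after splitting the order:", cut_size(edges, part))

part = local_swaps(edges, part)
print("cut after local swaps:", cut_size(edges, part))
```

Restricting the improvement step to swaps is a deliberate choice in this sketch: it keeps the block sizes fixed, so balance never has to be rechecked. The greedy loop can still stop at a local optimum on larger graphs, and it recomputes the cut from scratch on every trial, which is only acceptable at toy scale.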

MDPI and ACS Style

Aydin, K.; Bateni, M.; Mirrokni, V. Distributed Balanced Partitioning via Linear Embedding. Algorithms 2019, 12, 162.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
